Installation and running instructions for the ESGF search services. INSTALLING AND BUILDING FROM THE COMMAND LINE Retrieve the software from the ESGF git repository with one of the following commands: - git clone git@esg-repo.llnl.gov:esg-search.git (if you have an ssh git account) - git clone git://esg-repo.llnl.gov/esg-search.git (if you don't have an ssh git account) The bulding process is based on Apache Ant (to execute tasks) and Ivy (to manage and download dependencies). After downloading the software, execute the following commands: - cd esg-search - ant make_dist (to build the distribution jars) - ant test (to run the tests) INSTALLING AND CONFIGURING APACHE SOLR FOR ESG o Download the latest version of Apache Solr from the the web site: http://lucene.apache.org/solr/ (at the time of this writing, the latest distribution is apache-solr-1.4.1.tar) - tar xvf apache-solr-1.4.1.tar - cd apache-solr-1.4.1.tar o Customize the example directory to use the ESGF specific configuration: - cd examples - cp /src/java/test/solr/conf/schema.xml solr/conf/. - cp /src/java/test/solr/conf/solrconfig.xml solr/conf/. o start the server: - java -jar start.jar o test the server is running by accessing the URL: http://localhost:8983/solr/ in a web browser RUNNING THE COMMAND LINE PUBLISHING TOOL The application comes with a command-line program that can be used to experiment with publishing/unpublishing metadata records into/from a Solr engine. o Start an Apache Solr engine configured with the specific ESGF schema as described above. o Edit the file src/java/main/esg/search/config/application.properties and insert the URL of the Solr server you just started. By default, the application is configured to publish/unpublish records from a Solr server at http://localhost:8983/solr o Run the Java class PublishingServiceMain supplying it with the URL of a metadata repository, the type of that repository, and a switch flag for publishing or unpublishing. Note that at this time the application can parse THREDDS catalogs, records coming from a OAI/DIF repository, and from a CAS/RDF metadata catalog. Some example XML documents of these kinds are provided in the "resources/" subdirectory. Examples: a) Publishing a single THREDDS catalog o cd o cd build (this directory contains the .class files after you have run "ant make_dist") o java -Dlog4j.configuration=./log4j.xml -Djava.ext.dirs=../lib/:../lib/fetched esg.search.publish.impl.PublishingServiceMain file:///Users/cinquini/Documents/workspace/esg-search/resources/pcmdi.ipcc4.GFDL.gfdl_cm2_0.picntrl.mon.land.run1.v1.xml THREDDS true o a search for "snow" in the Solr admin interface available at http://localhost:8983/solr/admin/ will return one record b) Unpublishing a single THREDDS catalog o java -Dlog4j.configuration=./log4j.xml -Djava.ext.dirs=../lib/:../lib/fetched esg.search.publish.impl.PublishingServiceMain file:///Users/cinquini/Documents/workspace/esg-search/resources/pcmdi.ipcc4.GFDL.gfdl_cm2_0.picntrl.mon.land.run1.v1.xml THREDDS false c) Publish all XML records contained in a serialized OAI feed: o cd /build o java -Dlog4j.configuration=./log4j.xml -Djava.ext.dirs=../lib/:../lib/fetched esg.search.publish.impl.PublishingServiceMain file:///Users/cinquini/Documents/workspace/esg-search/resources/ORNL-oai_dif.xml OAI true o a search for "eosdis" in the Solr admin interface available at http://localhost:8983/solr/admin/ will return 293 records d) Unpublish a single XML record: o java -Dlog4j.configuration=./log4j.xml -Djava.ext.dirs=../lib/:../lib/fetched esg.search.publish.impl.PublishingServiceMain OTTER_LICOR d) Unpublish all the XML records from the OAI feed: o java -Dlog4j.configuration=./log4j.xml -Djava.ext.dirs=../lib/:../lib/fetched esg.search.publish.impl.PublishingServiceMain file:///Users/cinquini/Documents/workspace/esg-search/resources/ORNL-oai_dif.xml OAI false RUNNING THE SEARCH TOOL The application contains a command-line tool that can be used to execute some canned queries towards the Solr engine, both free text queries and faceted queries. Note that the result of these queries will depend on how the Solr engine has been populated in the harvesting phase. The Java program can be edited and recompiled to execute different queries that better reflect the content of the XML records that were inserted. To run the search command line tool, do the following: o Start the Apache Solr engine. o Edit the file src/java/main/esg/search/config/application.properties and insert the URL of the Solr server you want to query. o cd /build o java -Dlog4j.configuration=./log4j.xml -Djava.ext.dirs=.:../lib/:../lib/fetched esg.search.query.impl.solr.SearchServiceMain SPRING CONTEXT DEPLOYMENT The ESG search module is already configured to be easily included in any application that runs within the Spring Framework container (IMPORTANT NOTE: Spring Framework version 3 or later is required). Specifically, the module beans are auto-wired via Spring annotations, including injecting specific parameters defined in a property file into the beans configuration. To bootstrap the ESGF search services within a Spring container, modify the application Spring XML configuration as follows: o In the XML file header, make sure to include the Spring namespace for context support: o Insert support for Spring annotations: o Load a Java property file containing the application parameters: Currently the only required entry in the application.properties file is the URL of the Solr server: esg.search.solr.url=http://localhost:8983/solr o To use the publishing service and related beans, include the instruction to scan the publish sub-package: o To use the search service and related beans, include the instruction to scan the query sub-package: Finally, make sure that the esg-search.X.X.X.X.jar is available to the application. For a web application, this means copying the jar file to the WEB-INF/lib directory of the servlet container. DEPLOYING THE SEARCH MODULE WEB SERVICES The Ant target "make_war" allows to build the esg-search-XXXX.war artifact, which is a web application that exposes the ESGF search publishing services as (hessian) web services that can be invoked by remoting clients. The application deployment descriptor web.xml loads the Spring application context from the file esg/search/config/web-application-context.xml, which in turns combines the configuration from two files: application-context.xml, which contains the core search services, and ws-context.xml, which contains the search web services. Example web services clients can be found in the package esg.search.ws.hessian.client. To run the web services application within Eclipse: o execute the Ant target "make_web_dir", which builds the web application directory o refresh the project o add the project to an embedded instance of Tomcat To execute a secure Hessian call with client authentication: o Server side: - configure Tomcat secure port to use SSL cert and request client ceertificate: - start Tomcat with trustore that contains client CA: -Djavax.net.ssl.trustStore=/Users/cinquini/myApplications/apache-tomcat/esg-truststore.ts -Djavax.net.ssl.trustStorePassword=changeit o Client side: - use a certificate issued by some CA: CertUtils.setKeystore("esg/search/ws/hessian/client/client-cert.ks"); - trust the server certificate: CertUtils.setTruststore("esg/search/ws/hessian/client/localhost-client-trustore.ks");