Federated search engine and the (PAPIzed) TF-emc2Wiki
TF-EMC2

The Searchy Architecture
● Each source incorporates an agent, available through a SOAP interface
● Uses RDF as internal representation
● Agents for LDAP, SQL, the Google API, and Searchy itself

Searchy test installation
● To evaluate federated data access using Searchy
● Build a directory of middleware resources
  ● Using each organization's data sources
  ● Installing a Searchy agent in your systems
● Initially, RedIRIS runs the main search interface:
  http://www.rediris.es/busquedas/searchy/middleware/index.en.phtml
● Prepare a report with your feedback as a deliverable

Installing your Searchy agent
● Download and unpack the latest Searchy distribution
  ● http://jsearchy.sourceforge.net/
  ● You only need J2SE >= 1.4
● Select your data sources (backends)
  ● SQL
  ● LDAP
  ● Web servers (Google API for a restricted search)
● Configure your agent
  ● Use the sample agent configuration file in the conf directory
  ● Or the simplified configuration to be distributed in the list
  ● Support at http://lists.sourceforge.net/lists/listinfo/jsearchy-users
● Register your agent
  ● Host and port
  ● [email protected]

Configuring your Searchy agent
● Searchy configuration is contained in an XML file
  ● conf/agent.xml
● Three main elements
  ● <transport>
    ● General parameters of the agent
  ● <provider>
    ● Access parameters to the different data sources
    ● More than one provider can be used for an agent
  ● <map>
    ● Takes care of the data transformations
    ● Queries received by the agent into queries to the provider(s)
    ● Responses from the providers into metadata to be sent by the agent

The <transport> element
● Basic configuration parameters
  ● Identifier for the agent
  ● Providers to be used
  ● Port to listen at and maximum number of connections
  ● Log configuration (using log4j)
● Vocabulary to be used by the metadata
  ● A subset of Dublin Core is going to be used:
    ● dc:title, dc:subject and dc:description for queries
    ● dc:title, dc:subject, dc:description, dc:creator (and URL!) for responses
● ACLs to be applied when receiving connections
  ● Simple rules based on hostname or IP addresses
  ● Pilot config only accepts connections from certain RedIRIS hosts

The <provider> element
● Identifier, type and applicable map
● The rest of the parameters depend on the type
● Three types included in the pilot config
  ● Google
    ● The account key to be used when connecting to the WS interface
  ● SQL
    ● A valid JDBC driver class name
    ● Connection data: URL using the jdbc method, hostname, port, database, username, password
  ● LDAP
    ● URL for the LDAP server
    ● Root and search scope
    ● Other LDAP parameters: follow referrals, timeout, ...

The <map> element
● Map name and applicable vocabulary
● Elements describing input/output transformations
  ● <URL>: Do not fiddle with it unless you know what you're doing!
  ● One element per input term (type="query")
    ● How the query term is translated into the backend query language
      <dc:title filter="query">
        SELECT titleDB, subjectDB, creatorDB, descriptionDB
        FROM table
        WHERE (titleDB="%query%")
      </dc:title>
  ● One element per output term (type="response")
    ● How result fields (enclosed between %) are transformed to build the term contents in the response
      <dc:description type="response"> %snippet% </dc:description>

The (PAPIzed) TF-emc2Wiki
● Available at http://www.rediris.es/wiki/tf-emc2/
● Protected by PAPI
  ● Possibility of full and read-only access
  ● We'll be happy to run interoperability tests with other AAIs
● We'll include all the users in the mailing list
  ● Username: your e-mail address
  ● Password: you'll receive one that you can (should) change
  ● Those already with access to the JRA5Wiki will be automatically enabled
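
Appendix: a sketch of the <map> placeholder idea

The <map> element describes two substitutions: the incoming query term replaces %query% in a backend query template, and backend result fields replace %field% markers in a response template. The Python below is only an illustration of that idea, not Searchy's code (Searchy itself is written in Java). The function name fill_template and the sample values are hypothetical; only the two templates come from the slides.

```python
import re

# A placeholder is a field name enclosed between % signs, e.g. %query%.
PLACEHOLDER = re.compile(r"%(\w+)%")


def fill_template(template, values):
    """Replace each %name% with values[name]; unknown names are left as-is."""
    def substitute(match):
        name = match.group(1)
        return values.get(name, match.group(0))
    return PLACEHOLDER.sub(substitute, template)


# Input side: the query template shown for dc:title in the <map> slide.
query_template = (
    "SELECT titleDB, subjectDB, creatorDB, descriptionDB "
    'FROM table WHERE (titleDB="%query%")'
)
backend_query = fill_template(query_template, {"query": "middleware"})

# Output side: the response template shown for dc:description.
# The snippet text is made-up sample data.
response_template = "%snippet%"
description = fill_template(
    response_template, {"snippet": "A directory of middleware resources"}
)

print(backend_query)
print(description)
```

Plain text substitution into SQL is shown only to mirror the template syntax on the slide. Anything that passes untrusted query terms to a database should use bound parameters (for example a JDBC PreparedStatement) or escape the term first.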