Xper2: training and example of a management system for descriptions and free access identification keys

Hélène Fradin
Elise Kuntzelmann / [email protected]
Régine Vignes / [email protected]

EDIT
National Museum of Natural History of Paris
Laboratoire Informatique et Systématique
UMR 5143 Paleobiodiversity (CNRS, MNHN, Paris 6)
Université Paris 6 – Pierre et Marie Curie
FRANCE

Location (building): Bâtiment de Géologie (MNHN), 43, rue Buffon, 75005 Paris
Tel 01 40 79 80 61
Postal address: MNHN CP48, 57 rue Cuvier, 75231 Paris Cedex 05 - France
http://lis.snv.jussieu.fr

Abstract: Xper2 is a platform dedicated to taxonomic descriptions and computer-aided identification. It includes an editor for standardized taxonomic descriptions and several functions to identify specimens, construct diagnoses, compare taxa, compute morphological dissimilarities, etc. To read more: http://lis.snv.jussieu.fr/newlis/?q=en/about
During this session, Xper2 and good practices for using it will be presented.

I. Taxonomic identification

Taxonomic descriptions can lead to several applications (Fig. 1). Here we focus on Computer-Assisted Identification (C.A.I.), a discipline on which taxonomists will rely more and more in the future.

1. What is an identification key?

An identification key (or determination key, or dichotomous key) is a practical tool used to identify the taxon to which a given specimen belongs. In traditional keys (or paper keys), the user answers questions about the specimen by selecting one of the predefined answers: it is an imposed step-by-step approach. Each answer leads to a new question, and so on until the name of the sought taxon is obtained. It is a step-by-step elimination of the taxa whose descriptions do not match the answers given by the user.
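The step-by-step elimination described above can be sketched in a few lines of Python. This is only an illustration of how a traditional key works, not code taken from Xper2: the `Question` class, the `identify` function and the three taxa are hypothetical names and data introduced for the example.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Question:
    """An internal step of the key: a question and its predefined answers."""
    text: str
    # Each answer leads either to another question or to a taxon name.
    answers: Dict[str, Union["Question", str]]


# Hypothetical two-step key for three imaginary taxa (illustrative data only).
KEY = Question(
    "Type of leaves?",
    {
        "entire": "Taxon A",
        "compound": Question(
            "Number of leaflets?",
            {"3": "Taxon B", "5 or more": "Taxon C"},
        ),
    },
)


def identify(node, answers):
    """Follow the imposed path; return a taxon name, or None if the path is blocked."""
    while isinstance(node, Question):
        answer = answers.get(node.text)
        if answer not in node.answers:
            # The user cannot answer this question: the determination stops here.
            return None
        node = node.answers[answer]
    return node


print(identify(KEY, {"Type of leaves?": "compound", "Number of leaflets?": "3"}))
```

Note that `identify` returns `None` as soon as one imposed question cannot be answered, even if the user knows the answer to a later one.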
Such a discriminant route is a kind of graph: the questions are represented by nodes; the branches represent the different answers to a given question and make up the « decision road » when the specimen is determined successfully; and the leaves, which are the terminal nodes of the graph, represent the taxa taken into account by the key (Fig. 2).

Computing has made it possible to implement algorithms that generate such traditional keys, but also to develop new identification methods, which we call Computer-Assisted Identification, or free access keys. Computer-Assisted Identification systems are used either locally, by installing the identification software on the computer, or online through the Internet.

2. Taxonomic identification on the Internet

The Internet gives access to a large number of taxonomic identification keys, but a high percentage of these applications are classical keys adapted for the web (more or less dynamic HTML documents) that do not use software managing taxonomic descriptions (e.g. Fishbase [1]). These online resources are very interesting for a large public but do not provide tools that could be included as components of an Internet taxonomy platform on which taxonomists could work collaboratively. Other web sites offer an online atlas of a taxonomic group (e.g. the Online atlas of Russian beetles [2]) but do not offer tools usable by other taxonomists to create new applications.

Free access keys are also available for various taxonomic groups. See the DELTA website [3] for a list of applications and the TDWG website on the SDD format [4] to learn more about the knowledge representation standard. The table in Fig. 3 summarizes some of these numerous comparable systems. See also the EDIT report entitled « List of identified and to be tested descriptive tools » [5] and the BdTracker website [6], a collection of links to software and tools useful to taxonomists. This last website is developed and populated with the help of WP5.

3.
Computer-Assisted Identification (C.A.I.)

As said previously, Computer-Assisted Identification offers a free access key: the user chooses the order in which to answer the questions. The discriminant path is not preconceived but constructed dynamically, based on the user's choices. The advantage of such a system is that it adapts to various identification contexts: in a traditional key, if the user is in doubt, or if the descriptor is not visible on the specimen, progress through the graph is blocked and the determination is compromised (this can happen especially with plants, where the shape of the reproductive and vegetative organs changes over the seasons). C.A.I. can also help the user to answer first the questions that will lead quickly to the sought taxon.

However, C.A.I. does not rely on the same degree of strategy as a traditional key. Traditional keys follow a tactic based on the studied group and defined by the expert who designed the key. With free access keys, the strategy is chosen by the user or only advised by the system: the user can thus select first the safest and/or easiest descriptors, according to his or her level of taxonomic skill.

Finally, to conclude on the advantages and disadvantages of these systems, it is important to underline that free access keys do not necessarily guarantee a quick identification of a specimen: the user can spend time going back over previous choices, whereas traditional keys guide the user step by step. Moreover, interactive keys of this kind can sometimes be safer for inexperienced users.

II. Xper2 - a free access taxonomic software

1. What is Xper2?

Xper2 [7] (Ung et al. 2008, in prep.) is based on a previous system, Xper (Lebbe et al. 1988; CIPA group 1993; Lebbe & Vignes 1998). It is a taxonomic management system for the storage, editing and online distribution of descriptions. It allows interactive identification of specimens and the creation of keys.
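The interactive, free access identification discussed in the C.A.I. section can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the Xper2 implementation: the knowledge base `KB`, the function names and the ranking measure (number of taxon pairs that a descriptor separates) are all hypothetical and chosen for clarity only.

```python
from itertools import combinations

# Hypothetical knowledge base: taxon -> descriptor -> set of possible states.
KB = {
    "Taxon A": {"Type of leaves": {"entire"}, "Colour of petals": {"white"}},
    "Taxon B": {"Type of leaves": {"compound"}, "Colour of petals": {"white", "yellow"}},
    "Taxon C": {"Type of leaves": {"compound"}, "Colour of petals": {"yellow"}},
}


def remaining_taxa(kb, observations):
    """Keep the taxa compatible with every observation, in whatever order they were made.

    A descriptor missing from a taxon is treated as unknown and never eliminates it.
    """
    return [
        taxon
        for taxon, description in kb.items()
        if all(d not in description or description[d] & states
               for d, states in observations.items())
    ]


def rank_descriptors(kb, taxa):
    """Sort descriptors by the number of taxon pairs they separate (assumed measure)."""
    descriptors = {d for t in taxa for d in kb[t]}

    def power(d):
        return sum(
            1
            for a, b in combinations(taxa, 2)
            if d in kb[a] and d in kb[b] and not (kb[a][d] & kb[b][d])
        )

    # Highest discriminating power first; ties are broken alphabetically.
    return sorted(descriptors, key=lambda d: (-power(d), d))


# The user may start with any descriptor and may select several states when unsure.
print(remaining_taxa(KB, {"Colour of petals": {"white"}}))
print(rank_descriptors(KB, list(KB)))
```

Because `observations` is an unordered mapping, the result does not depend on the order in which the user answered, which is the essential difference from an imposed path.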
Although Xper2 is a powerful program for the professional taxonomist, it does not require any special computer skills. It is also user-friendly for the novice naturalist who just wants to identify a specimen with a ready-made application. Xper2 is written in Java and runs on current operating systems (Windows, Mac OS X or Linux). Its interface can be displayed in three languages: English, French and Spanish.

2. Structure of an Xper2 Knowledge Base (KB)

A Knowledge Base is structured into four main types of objects:
⁃ the described entities (e.g. taxa, specimens, phyto-associations);
⁃ the descriptors, i.e. the properties used to describe the taxa (e.g. « Type of leaves », « Colour of petals », etc.);
⁃ the descriptor states, or domain values, of each descriptor (e.g. « Entire leaves », « Compound leaves », etc.);
⁃ optionally, a list of groups that structures and enriches the content (e.g. « Flower », « Petals », « Fruit », etc.).

Each type of object may be documented with text and illustrated with images. This knowledge representation is rich and flexible enough to represent complex descriptions and taxon polymorphism.

Other properties associated with the knowledge base itself include authors, external links, comments on the taxonomic or geographic limits, and the context. It can also include a legal information section about the copyright of the application, i.e. the knowledge base, and of the illustrations and pictures used.

III. Xper2 functionalities

1. Construction of a Knowledge Base with Xper2

The construction of a knowledge base with Xper2 can be divided into 5 main steps:

Step 1: Download and install Xper2
LIS website: lis.snv.jussieu.fr/apps/xper2 |-> Tab "Téléchargement" (Download)
A Java Runtime Environment is required to use Xper2.

Step 2: Edit a knowledge base.
It is possible either to open an existing base or to create a new base.
Open an existing base: => File => New base: select the file in .xpd format, then => Open
Create a new base: => File => New base: Name of the base
Nb: the name of the base (title) is different from the file name (e.g. Pinus knowledge base and Pinus.xpd)

Step 3: Descriptors edition:
For each descriptor, one or more states can be linked. Descriptors and descriptor states can be documented with text (plain text or HTML) and images.
Dependency between descriptors: if some descriptors depend on others, creating a hierarchy of descriptors, this information can be expressed as rules by defining « exception states ». For example, the « Number of leaflets » depends on the « Type of leaves » (parent descriptor) and cannot be described if the leaves are « entire » (exception state).
Numerical descriptors: it is important to find the most relevant intervals in terms of taxa discrimination.
Groups: creating groups makes it possible to organize and structure the list of descriptors (often based on anatomy, but also on criteria such as with/without microscope, field, ...). One descriptor can belong to several groups.

Step 4: Taxa edition:
To describe a taxon, tick the corresponding states (one, or several in case of polymorphism), or « unknown description », because if nothing is ticked it means that no state matches. A comment can be added to each description unit to store additional information, bibliography, etc., and to maintain the traceability of the data.
Nb: it is possible to describe several taxa at the same time when their description is the same for one descriptor, and it is easy to search for « unknown descriptions » in order to complete an existing knowledge base. Selecting a list of taxa also makes it possible to compare their descriptions.

Step 5: Checking and testing the knowledge base:
The menu « Check the base » checks the consistency of the descriptions and computes the discrimination level of the taxa.
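To make the idea of checking a base more concrete, here is a minimal Python sketch of two checks in the spirit of those mentioned above: one on the « exception state » rules and one on taxon discrimination. The data layout, the function names and the definition of "undiscriminated" used here are assumptions made for this illustration; they are not a description of what the « Check the base » menu of Xper2 actually computes.

```python
from itertools import combinations

# Hypothetical rule: child descriptor -> (parent descriptor, exception state of the parent).
RULES = {"Number of leaflets": ("Type of leaves", "entire")}

# Hypothetical descriptions: taxon -> descriptor -> set of ticked states.
# In this sketch an absent descriptor stands for an unknown or inapplicable description.
KB = {
    "Taxon A": {"Type of leaves": {"entire"}},
    "Taxon B": {"Type of leaves": {"compound"}, "Number of leaflets": {"3"}},
    "Taxon C": {"Type of leaves": {"compound"}, "Number of leaflets": {"3", "5"}},
    "Taxon D": {"Type of leaves": {"entire"}, "Number of leaflets": {"5"}},
}


def rule_violations(kb, rules):
    """List (taxon, child) pairs where the child descriptor is described although
    the parent descriptor takes only its exception state."""
    found = []
    for taxon, description in kb.items():
        for child, (parent, exception) in rules.items():
            if child in description and description.get(parent) == {exception}:
                found.append((taxon, child))
    return found


def undiscriminated_pairs(kb):
    """List the taxon pairs that no descriptor separates.

    Two taxa are separated by a descriptor when both are described for it
    and their sets of states do not overlap (assumed definition).
    """
    pairs = []
    for a, b in combinations(kb, 2):
        separated = any(
            d in kb[b] and not (kb[a][d] & kb[b][d]) for d in kb[a]
        )
        if not separated:
            pairs.append((a, b))
    return pairs


print(rule_violations(KB, RULES))
print(undiscriminated_pairs(KB))
```

In this toy base, Taxon D breaks the leaflet rule, and two pairs of taxa would need an additional descriptor to be told apart.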
Printable forms (lists, complete taxon forms, matrix) and import/export functions (CSV format for spreadsheets, HTML format) may be very useful to check or to publish the knowledge base.

2. Identification process with Xper2

It is a free access key: the user chooses the descriptors, and their order, to describe the unknown specimen (Fig. 4 and Fig. 5). The system can also give advice by computing and sorting the descriptors according to their discriminant power at each step of the identification process. Illustrations and texts are available to guide the user and to prevent misunderstanding. Uncertainty is managed by selecting several possible character states. The lists of remaining and eliminated taxa are updated at each step of the key.

To check the result, a complete form describing each taxon is available if required. In these forms, the differences between the specimen description and the description of the eliminated taxon are highlighted in red.

The same identification system is available online or locally. The online version provides additional tools (see the following section) to rewrite descriptions in natural language, to compute diagnoses, to focus on the most similar taxa, etc., and thus to obtain a more sophisticated form describing each taxon (see an example on Pinus [8] or on Phlebotomine sandflies [9]).

3. Additional use of the knowledge base

Additional tools are available to analyse the knowledge base, especially:
– to construct printable keys (for the whole knowledge base or a subset of it)
– to construct taxonomic diagnoses automatically
– to compute taxonomic dissimilarities
– to rewrite the descriptions automatically as readable texts
These tools are not yet included in Xper2, and using them requires exporting the knowledge base in the old Xper format. The possibilities of these tools will be demonstrated during the course. A tutorial (in French only) is available on the LIS website [10].

IV.
References:

1. http://fishbase.org/identification/classlist.cfm
2. http://www.zin.ru/Animalia/coleoptera/eng/index.htm
3. http://delta-intkey.com
4. http://wiki.tdwg.org/twiki/bin/view/SDD/WebHome
5. http://wp5.e-taxonomy.eu/blog/2007/05/16/new-deliverable-d-547
6. http://www.bdtracker.net
7. http://lis.snv.jussieu.fr/apps/xper2/
8. http://lis.snv.jussieu.fr/cgi-bin/viewxper.cgi?base=pins
9. http://lis.snv.jussieu.fr/cgi-bin/viewxper.cgi?base=cipa_en
10. http://lis.snv.jussieu.fr/apps/xper/

Ung, V., Dubus, G., Zaragüeta-Bagils, R., Vignes-Lebbe, R. (2008, in preparation) Xper2: a powerful tool for managing taxonomic descriptions in knowledge databases.
Vignes-Lebbe, R. (2004) Biodiversity information management. Advanced Geographic Information Systems, in Encyclopedia of Life Support Systems (EOLSS), developed under the auspices of UNESCO, Eolss Publishers, Oxford, UK [http://www.eolss.net].
Dettai, A., Bailly, N., Vignes-Lebbe, R., Lecointre, G. (2004) Metacanthomorpha: Essay on a Phylogeny-Oriented Database for Morphology — The Acanthomorph (Teleostei) Example. Syst. Biol. 53(5), 14–26.
Gerard, D., Vignes-Lebbe, R., Dubois, A. (2006) Ziusudra, de la nomenclature à l’informatique : l’exemple des Amphibiens. Alytes, 24(1-4): 117-132.
Cao, N., Zaragüeta Bagils, R., Vignes-Lebbe, R. (2007) Hierarchical representation of the hypotheses of homology. Geodiversitas, 29(1): 5-15.
Lebbe, J., Nilsson, S., Praglowski, J., Vignes, R., Hideux, M. (1988) The morphology of airborne pollen grains and spores from northern Europe in relation to allergenic function: a microcomputer-aided identification. Grana, 26: 223-229.
CIPA group (Bermudez H., Dedet J.P., Falcao A.L., Feliciangeli D., Ferreira Rangel E., Ferro C., Galati E.A.B., Gomez E.L., Herrero M.V., Hervas D., Lebbe J., Morales A., Ogusuku E., Perez E., Sherlock I., Torrez M., Vignes R. and Wolff M.) (1993)
A programme for computer-aided identification of the phlebotomine sandflies of the Americas (CIPA), presentation and check-list of American species. Memórias do Instituto Oswaldo Cruz, 88: 221-230.
Lebbe, J., Vignes, R. (1998) Modelling taxonomic description for identification. In: Information Technology, Plant Pathology and Biodiversity (P. Bridges, P. Jeffries, D.R. Morse & P.R. Scott eds.): 37-46.

Examples of knowledge bases built with Xper2:
http://lully.snv.jussieu.fr/xperbotanica/
http://lis.snv.jussieu.fr/apps/xper2/identification/
http://lis.snv.jussieu.fr/apps/xper/data/varanID/

V. Figures and Tables:

Figure 1: Uses of taxonomic descriptions

Figure 2: Course of a key

Delta IntKey: Free, Delta format, runs locally
Navikey: Free, Delta format, runs online
Diversity Description: Free, database format, runs locally
Frida (Dryades): Free, runs online
IKBS: Free, specific format
Xper2: Free, specific format, runs locally and online
Linnaeus: Commercial, specific format, runs locally
Lucid: Commercial, XML format, runs locally and online
ActKey: Free, MySQL database, runs online
SLIKS: Free, Delta format, runs online
Figure 3: Many comparable systems...

Figure 4: Edition mode of Xper2

Figure 5: Identification mode of Xper2