PART 1: What is a thesaurus? Concept and samples
Christine Laaboudi-Spoiden
Publications Office of the European Communities, EUR-Lex Unit – Documentary section
Cape Town, June 2006

EUR-Lex – Searching information
EUR-Lex (http://eur-lex.europa.eu/en/index.htm): direct free access to European Union law
• The treaties, legislation, case-law and legislative proposals
– Official Journal of the European Union
– Official Journal L – Legislation
– Official Journal C – Information and notices
– Official Journal – Special editions
– European Court Reports
– Documents of the institutions
– Consolidated texts

EUR-Lex – Searching information
COMPUTER CRIME
• Title and text: computer crime (40 hits)
COMPUTER RELATED CRIME
• Title and text: computer crime (58 hits)
CYBERCRIME
• Title and text: cybercrime (55 hits)
CYBER CRIME
• Title and text: cyber crime (48 hits)
COMPUTER CRIME, CYBERCRIME, CYBER CRIME (Boolean OR)
• Title and text: computer crime, cybercrime (129 hits)
USE OF SYNONYMS OR EQUIVALENT TERMS

EUR-Lex sample – Bibliographic notice
• EUROVOC DESCRIPTORS
• INDEXING TERMS or DESCRIPTORS
• PREFERRED TERMS
• CLASSIFICATION SCHEME
• SUBJECT HEADINGS

Indexing process
Indexing = identifying the concepts represented in a document
• EUROVOC descriptors: information society, computer crime, personal data, electronic mail, confidentiality
For information retrieval (information request)
• Title and text: computer crime, cybercrime (129 hits)
Content indexing = a single process, done only once!
Searching = start again if the results are not relevant to the question.

Search results
Relevance = relationship between a document and a request.
– The document is relevant to the topic
– It answers the user's request
Pertinence = relationship between a document and an information need.
• Relevant and useful for the user
• Relevant, but the user does not find it useful (language, level of comprehensibility, type)
Irrelevant results retrieved = NOISE
Relevant results not retrieved = SILENCE

Causes of search failures
• No two words mean exactly the same thing
• Enormous range of choices of words and expressions
• No true synonyms, although words are often close in meaning
• Words are not clearly understood
• Inconsistent use of words
• Users are unlikely to choose all the relevant terms
• The user might choose the terms used by the indexer, but with a different understanding of their meaning

Need for a controlled vocabulary
A controlled vocabulary = a consistent set of words/expressions, along with rules of usage, to be followed when indexing and searching
Nature of an indexing language
• A list of terms acceptable to users
• Mechanisms for structuring and using those terms
• Minimises the ambiguity of isolated vocabulary that may be out of context

Out-of-context information
What does SENSITIVE AREA mean?
• urban, military, environmental, sensitive epidermis, …
• A sensitive area protected by special measures to preserve a highly vulnerable habitat (Eurovoc thesaurus)

Types of vocabulary – Authority list
Simple list or index enumerating the terms available for indexing a collection of documents
• Author names, organisation names, countries
E.g.
• Library of Congress Authorities
• ISO country codes

Vocabulary control – Classification scheme
• Heading / caption
• Notation
• From the upper level (class) to the lower level (sub-classes)
• Example: EUR-Lex directory codes

Vocabulary control – Classification scheme
• Systematic arrangement of entities/concepts into classes (groups or categories)
• Class = group of concepts whose members share a common feature
• Vertical arrangement by level of specificity
• Words may appear in several classes

Vocabulary control – Classification scheme
Classes are identified by
• a heading/caption
• a notation (alphabetical and/or numerical code)
– Key for arranging items in physical libraries
– Expressiveness (reflects the structure of the scheme)
Example: 11.60.30.20 External relations / Commercial policy / Trade arrangements / Common import arrangements

EUR-Lex directory codes
• Numerical classification of the "Directory of Community legislation in force", used to index legislation and preparatory acts (http://eur-lex.europa.eu/RECH_repertoire.do)
• 20 principal chapters, each covering a specific area of European Union activity
• Each descriptor is composed of eight digits (principal chapter heading and up to three subsequent subdivisions, each represented by two digits)

EUR-Lex – Subject headings
• One to a maximum of five descriptors based on the subject-matter list of terms
• The alphabetically structured list of over 200 keywords is based on the subdivisions of the treaties and the areas of activity of the institutions
• The descriptors are less specific than those of the directory code but provide a general overview of the content of the document

Thesaurus – Definition, ISO 2788 (1984)
A structured list of expressions intended to represent in an unambiguous way the conceptual content of the documents in a documentary system and of the queries addressed to the system.
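The eight-digit notation of the EUR-Lex directory codes described above (for example 11.60.30.20) can be split into its levels mechanically. The following is only a minimal sketch: the function names are hypothetical, and treating a trailing "00" pair as an unused subdivision is an assumption inferred from the example 03.60.55.00 (Agriculture / Products subject to market organisation / Wine), not a documented EUR-Lex rule.

```python
from typing import List


def split_directory_code(code: str) -> List[str]:
    """Split an eight-digit directory code such as '11.60.30.20' into
    its two-digit levels (principal chapter, then subdivisions)."""
    digits = code.replace(".", "")
    if len(digits) != 8 or not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"expected eight digits, got {code!r}")
    levels = [digits[i:i + 2] for i in range(0, 8, 2)]
    # Assumption: a trailing "00" pair marks a subdivision that is not used.
    while len(levels) > 1 and levels[-1] == "00":
        levels.pop()
    return levels


def class_path(code: str) -> List[str]:
    """Return the chain of classes from the principal chapter down to the
    most specific subdivision, showing how the notation mirrors the scheme."""
    levels = split_directory_code(code)
    return [".".join(levels[:n]) for n in range(1, len(levels) + 1)]


print(class_path("11.60.30.20"))
# ['11', '11.60', '11.60.30', '11.60.30.20']
```

Each prefix in the result corresponds to one heading of the caption (External relations / Commercial policy / Trade arrangements / Common import arrangements), which is what "expressiveness" of a notation means in practice.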
Key points of the ISO 2788 definition:
= NOUN, NOUN PHRASE
= INDEXING PROCESS
= ONE SINGLE INTERPRETATION

Thesaurus – Definition, BSI 8723 (2006)
A controlled vocabulary in which concepts are represented by descriptors, formally organized so that paradigmatic relationships between the concepts are made explicit, and the descriptors are accompanied by lead-in entries for synonyms and quasi-synonyms.
= MUTUALLY EXCLUSIVE RELATIONSHIPS
= EQUIVALENCE
The purpose of a thesaurus is
• to guide both the indexer and the searcher to select the same descriptor or combination of descriptors to represent a given subject.
= INDEXING PROCESS

Eurovoc – Scope
Eurovoc
• A multilingual thesaurus (hierarchical list of terms)
• Multidisciplinary vocabulary
– Community and national point of view
– Parliamentary activities
• Definition of concepts
• Samples from Eurovoc

Eurovoc – Coverage
21 FIELDS = HEADINGS
04 POLITICS
08 INTERNATIONAL RELATIONS
10 EUROPEAN COMMUNITIES
12 LAW
16 ECONOMICS
20 TRADE
24 FINANCE
28 SOCIAL QUESTIONS
32 EDUCATION AND COMMUNICATIONS
36 SCIENCE
40 BUSINESS AND COMPETITION
44 EMPLOYMENT AND WORKING CONDITIONS
48 TRANSPORT
52 ENVIRONMENT
56 AGRICULTURE, FORESTRY AND FISHERIES
60 AGRI-FOODSTUFFS
64 PRODUCTION, TECHNOLOGY AND RESEARCH
66 ENERGY
68 INDUSTRY
72 GEOGRAPHY
76 INTERNATIONAL ORGANISATIONS
127 MICROTHESAURI = CLASSES, for example in field 08:
0806 international affairs
0811 cooperation policy
0816 international balance
0821 defence

Eurovoc – Equivalence
NON-DESCRIPTOR  USE  DESCRIPTOR

Eurovoc – Contextual information
DESCRIPTOR
• MT – MICROTHESAURUS (MAIN CLASS)
• UF (USED FOR) – NON-DESCRIPTOR: this descriptor is USED FOR a non-descriptor
• BT – BROADER TERM / GENERIC TERM
• NT – NARROWER TERM / SPECIFIC TERM
• RT – RELATED TERM

Eurovoc – Relationships
• TOP TERM = highest term in the hierarchy
• SCOPE NOTE (SN) = usage or definition note
• NT1, NT3 = levels of narrower terms
• Equivalence relationship (USE, UF)
• Hierarchical relationship (MT, BT, NT)
• Associative relationship (RT)

Vocabulary control – Thesaurus
The scope of a descriptor is limited to a single meaning (unambiguous)
• Nouns or noun phrases
• Pre-coordination of concepts
The context is provided by:
• The hierarchical relationships (MT, BT, NT)
• The scope note (SN)
– states the chosen meaning or indicates other meanings excluded for indexing purposes
A concept may be represented by two or more synonyms
• One term is selected as the descriptor (indexing term)
• Equivalents = non-descriptors
– lead-in entries or references to the descriptor (USE, UF)

Vocabulary control – Targets
• Represents the general conceptual structure of a subject area and presents a guide to the user of an index
• Closely reflects the vocabulary of the literature and the user's own technical usage
• Employs pre-coordinated phrases to reduce false drops to a minimum
– Venetian blind
• Controls synonyms and near-synonyms in order to increase consistency
• Only one term from a list of similar terms is used in indexing
• Horizontal and vertical relationships among terms (cross-references)

Classification & Thesaurus – Difference
Classification
• Single preferred location (physical libraries)
– Directory code: 03.60.55.00 Agriculture / Products subject to market organisation / Wine
• Post-coordination of concepts
Eurovoc
• Admits relationships such as hierarchy
wine
MT 6021 beverages and sugar
BT1 alcoholic beverage
BT2 beverage
NT1 bottled wine
NT1 champagne
NT1 flavoured wine
NT1 fortified wine

Indexing systems – Types
Derived-term system
• All descriptors are taken from the text itself
• Natural language or free-text indexing
• Automatic indexing
Assigned-term system
• Subject heading list, thesaurus, classification, taxonomy
• Intellectual effort
• The indexer determines the scope of the document and assigns descriptors from a controlled vocabulary
• Descriptors identify the concepts expressed by the documents
• Greater time and effort
• Cost is significant

PART 2: EUROVOC THESAURUS
Christine Laaboudi-Spoiden
Publications Office of the European Communities, EUR-Lex Unit – Documentary section

Eurovoc 4.2 – Languages
http://europa.eu/eurovoc/
• Official EU languages: ES, CS, DA, DE, EL, ET(*), EN, FR, IT, LV, LT, HU, NL, PL, PT, SI, SK, FI, SV
• Acceding countries: BG – Bulgarian, RO – Romanian
• Candidate country: HR – Croatian
• Local sites, other languages: Albanian, Ukrainian, Russian, Georgian, Serbian
• Regional languages: Basque, Catalan

Eurovoc 4.2 in figures
                            Eurovoc 4.1   Eurovoc 4.2
DOMAINS                     21            21
MICROTHESAURI               127           127
DESCRIPTORS                 6501          6645
GENERIC RELATIONSHIPS       6510          6669
ASSOCIATIVE RELATIONSHIPS   3542          3636

Eurovoc – Fields most frequently used
[Bar chart: number of users per field (axis 0 to 20), in ascending order]
76 INTERNATIONAL ORGANISATIONS: 1
68 INDUSTRY: 1
40 BUSINESS AND COMPETITION: 1
36 SCIENCE: 1
66 ENERGY: 2
56 AGRICULTURE, FORESTRY AND FISHERIES: 2
48 TRANSPORT: 2
72 GEOGRAPHY: 3
44 EMPLOYMENT AND WORKING CONDITIONS: 4
32 EDUCATION AND COMMUNICATIONS: 4
24 FINANCE: 4
20 TRADE: 4
52 ENVIRONMENT: 5
16 ECONOMICS: 7
08 INTERNATIONAL RELATIONS: 9
28 SOCIAL QUESTIONS: 10
10 EUROPEAN COMMUNITIES: 11
12 LAW: 17
04 POLITICS: 18

Eurovoc – Polyhierarchical relationship
• Main rule: descriptors belong to one category (1 BT, 1 MT)
• Exception: descriptors from domains 72 and 76
– Field 72: Geography
– Field 76: International organisations

Eurovoc – Advantages
• Multilingualism
– Indexing in the documentalist's language
– Searching in the user's language
• Update: 18 months
• Cooperation
– National parliaments
– Candidate descriptors
• Standardisation: ISO 2788 & 5964

Eurovoc – Limits
• Generic vocabulary, not specific
• Does not cover national specificities

Eurovoc – Display formats
• Printed – paper version
• Web site: http://europa.eu/eurovoc/
• XML
  files (provided to licensees)
• PDF files to download
Types of display
• Alphabetical
• Thematic
– Alphabetical listing by field/domain

Eurovoc – Thematic display
[Screenshot: navigating by languages, field/domain and microthesauri]

Eurovoc – Thematic display
[Screenshot: microthesauri; top term / broader term; related terms; specific terms (NT1 – NT2); alphabetical index of descriptors/non-descriptors of the current field]

Eurovoc – Terminology of the field
Alphabetical index of descriptors and non-descriptors

Eurovoc – Searching for concept
[Screenshot]

Eurovoc – Alphabetical display
[Screenshots, including PT and FR versions]

Eurovoc – Translations
A descriptor = an equivalent concept in every language

Eurovoc – History
• 1982: comparative study of the existing documentary languages at the European Commission and the European Parliament
• 1984: first edition, seven languages (DA, DE, EN, FR, EL, IT, NL)
• 1987: 2nd edition, + ES, PT
• 1995: 3rd edition; 1999: edition 3.1, + SV, FI
• 2002: edition 4.0; 2004: edition 4.1
• 2005: edition 4.2, 17 languages
• 2006: edition 4.3, 21 languages

Eurovoc – Users
• National parliaments
• European institutions (European Parliament, Publications Office, Court of Justice)
• Private users = Eurovoc licence holders

Eurovoc – Users
[Bar chart: number of users by type: national parliaments, national administrations, EU institutions, consultants, universities, private users, research institutes]

Eurovoc – Users
[Pie chart: users by profession: translators, IT specialists, terminologists, linguists, librarians, documentalists, researchers, other; shares shown: 56%, 20%, 14%, 6%, 3%, 1%]

Eurovoc – Licences (1)
[Bar chart: number of licences per year, 2003 to 2005; values shown: 15, 25, 44]

Eurovoc – Licences (2)
[Bar chart: number of licences by type (academic, commercial, translation, indexing), 2004 to 2006]

PART 3: EUROVOC MAINTENANCE
Christine Laaboudi-Spoiden
Publications Office of the European Communities, EUR-Lex Unit – Documentary section

Eurovoc – Maintenance
• Two interinstitutional committees
– Maintenance Committee: Commission, Council, Parliament, Court of Justice, Court of Auditors
– Steering Committee: Commission, Council, Parliament, Court of Justice, Court of Auditors
• Eurovoc Maintenance Team: Publications Office

Eurovoc – The Steering Committee
• Supervises the Eurovoc project
– Objectives, priorities, overall timetable
– Resources and budget
• Officially adopts each new version
• Chaired by a representative of the European Parliament

Eurovoc – The Maintenance Committee
• Examines and votes on the proposals for updating the thesaurus
• Decides on the amendments to be made
• Chaired by the Publications Office
• Meets twice a year

Eurovoc – The Maintenance Team
• Location: Publications Office
• Collects and examines the proposals made by all users
• Coordinates the work of the Maintenance Committee
• Responsible for IT developments, translation monitoring and the web site
• Works through a maintenance interface

Eurovoc – Maintenance process
The European Parliament
– Collects, examines and filters the proposals from the national parliaments
The Maintenance Team
– Collects the proposals made by all users (EP, licensees, OPOCE)
– Manages the proposals through the maintenance system
The Maintenance Committee
– Votes on the various proposals
– Decides on the final amendments
The Maintenance Team
– Sends the new descriptors and amendments to EC translation
The Maintenance Committee
– Reviews the multilingual draft version
The Steering Committee
– Officially adopts the new version

EUROVOC – The maintenance interface
https://webgate.cec.eu.int/eurovoc/maint
Users
• EU institutions: members of the Maintenance Committee,
  translators
• National parliaments
Features
• Propose candidate descriptors, amendments
• Translation module
• A dedicated layer for each user

EUROVOC – Maintenance: candidate descriptors
How to propose new concepts / amendments
• Eurovoc maintenance form (web site)
• Email to [email protected]

EUROVOC – Maintenance
Criteria for acceptance / non-acceptance of candidate descriptors
Acceptance:
• Creation necessary
– European Food Safety Authority (new European body)
– Greater Poland province in Regions of Poland in MT 7211 (new regions to incorporate)
• New concept, interesting and useful
– Access to healthcare
– Self-regulation
Non-acceptance:
• Descriptor already exists in another form
– Second home / secondary home
– Community Customs Code exists as a non-descriptor of "customs regulations"
• Concept that can be obtained by combining two or three existing descriptors
– European Refugee Fund = EC fund + aid to refugees
• Term too specific (not used enough)
– Arctic agriculture
• Term too national (not useful for the other users)
– Popular school (in SV)
• Term too vague
– Right to peace
– Small states

PART 4: INDEXING AND SEARCHING WITH EUROVOC & the EP Library
Christine Laaboudi-Spoiden
Publications Office of the European Communities, EUR-Lex Unit – Documentary section
Isabelle Gautier, European Parliament – Library

INDEXING AND SEARCHING WITH EUROVOC
1. Content analysis and subject determination:
• Example from the EUR-Lex database (Directive 50/2006)
• Example from the EUR-Lex database (Regulation 802/2006)

INDEXING AND SEARCHING WITH EUROVOC
2. Term selection in Eurovoc
• Check the relationships (hierarchy and semantic environment of a descriptor)
• Definition of horizontal or vertical specificity
• Translation of concepts into indexing terms: cases of generic terms, compound terms, lack of precision, proper names
3. Depth of indexing:
• Exhaustivity and selectivity
4. Making choices: indexing policy

EUROVOC at the EP Library
• 1999: change of the data processing system of our catalogue, which required a new indexing policy for the library;
• new catalogue => need to develop new consistency in indexing;
• to obtain this consistency, training was organised for all indexers;
• creation of a Working Group in charge of indexing coordination within the library.

EUROVOC at the EP Library – The Indexing Coordination Group
Working group formed by indexers, information specialists (of different nationalities and languages), in charge of:
• writing an internal guide to the practical rules for indexing, for use by the department;
• creating updated lists (descriptors studied and descriptors created for the Library) and templates (to propose a creation or a modification) useful to colleagues;
• regularly organising meetings on the indexing policy and its implementation;
• training new colleagues.

EUROVOC at the EP Library – The Indexing Guide
Target: to obtain better consistency of indexing in the catalogue and a good knowledge of the new data processing system.
Contents, in three parts:
• definition and basic rules for indexing;
• the indexing policy in the library;
• practical application in our catalogue.
Supplemented by advice sheets for indexing where necessary.

EUROVOC at the EP Library – Indexing meetings
Target:
• the group studies the proposals for new descriptors or modifications sent by colleagues;
• it answers specific questions asked by colleagues;
• it writes advice sheets if necessary;
• questions are analysed by the group in its meetings and presented in meetings at department level;
• advisory and support role.

EUROVOC at the EP Library – Examples of proposals received by the Group
• Candidate descriptor created (library level): Community law-international law
MT 1231 international law
BT international law
SN influence of Community law on international law and vice versa
• Candidate descriptor rejected: environmental damage principle
Advice: index with environment impact + risk prevention
• Modification of a descriptor: polluter pays principle
Proposal to change the English term (in place of polluter pays policy).

EUROVOC at the EP Library – Training
Training organised for new colleagues:
• internal, with a presentation of the thesaurus, the indexing guide, the indexing policy of the department, indexing in our catalogue, and short practical exercises;
• internal but with an external trainer, to review indexing or, if necessary, to train a group of people to index;
• external: as requested by indexers and if training is available in the different countries.

EUROVOC at the EP Library – European Parliament's role as member of the Maintenance Committee
• Represents both the EP and the national parliaments on the Maintenance Committee;
• Receives, as their representative, the proposals of the national parliaments that use the thesaurus;
• Filters the proposals (rejection criteria: concept too national, too specific or too vague);
• Forwards the proposals of the department and of the national parliaments to the Committee;
• Regularly organises seminars with national parliaments.
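
The thesaurus relationships used throughout this presentation (USE/UF, MT, BT, NT, RT, SN) can be sketched as a small data structure. This is a minimal illustration, not the Eurovoc data model or any Eurovoc software: the class and method names are hypothetical, the wine hierarchy and the Community Customs Code example are taken from the slides, and the non-descriptors listed for "computer crime" are assumed purely to show how a Boolean OR query could be built from equivalent terms.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Descriptor:
    label: str
    microthesaurus: str = ""                            # MT
    broader: Optional[str] = None                       # BT (main rule: one BT)
    narrower: List[str] = field(default_factory=list)   # NT
    related: List[str] = field(default_factory=list)    # RT
    used_for: List[str] = field(default_factory=list)   # UF (non-descriptors)
    scope_note: str = ""                                # SN


class Thesaurus:
    def __init__(self) -> None:
        self.descriptors: Dict[str, Descriptor] = {}
        self.use: Dict[str, str] = {}  # non-descriptor -> descriptor (USE)

    def add(self, d: Descriptor) -> None:
        self.descriptors[d.label.lower()] = d
        for non_descriptor in d.used_for:
            self.use[non_descriptor.lower()] = d.label.lower()

    def resolve(self, term: str) -> Optional[Descriptor]:
        """Follow a USE reference so indexer and searcher reach the same descriptor."""
        key = term.lower()
        return self.descriptors.get(self.use.get(key, key))

    def broader_chain(self, term: str) -> List[str]:
        """List BT1, BT2, ... by walking up the hierarchy."""
        chain: List[str] = []
        d = self.resolve(term)
        while d is not None and d.broader is not None and d.broader not in chain:
            chain.append(d.broader)
            d = self.descriptors.get(d.broader.lower())
        return chain

    def or_query(self, term: str) -> str:
        """Build a Boolean OR query from a descriptor and its equivalents."""
        d = self.resolve(term)
        return term if d is None else " OR ".join([d.label] + d.used_for)


thesaurus = Thesaurus()
thesaurus.add(Descriptor("beverage", narrower=["alcoholic beverage"]))
thesaurus.add(Descriptor("alcoholic beverage", broader="beverage", narrower=["wine"]))
thesaurus.add(Descriptor(
    "wine", microthesaurus="6021 beverages and sugar", broader="alcoholic beverage",
    narrower=["bottled wine", "champagne", "flavoured wine", "fortified wine"]))
thesaurus.add(Descriptor("customs regulations", used_for=["Community Customs Code"]))
# Assumed equivalents, for illustration only (not checked against Eurovoc).
thesaurus.add(Descriptor("computer crime", used_for=["cybercrime", "cyber crime"]))

print(thesaurus.resolve("Community Customs Code").label)
print(thesaurus.broader_chain("wine"))
print(thesaurus.or_query("cybercrime"))
```

With this sketch, a searcher who types a non-descriptor is led to the descriptor chosen by the indexer, and the OR query gathers the synonyms that had to be combined by hand in the EUR-Lex full-text search shown in Part 1.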

IN CONCLUSION: USEFUL LINKS
• EUROVOC: http://eurovoc.europa.eu
• EUR-Lex: http://eur-lex.europa.eu
• European Parliament: http://www.europarl.europa.eu