Controlled Terminology: its use in (Static) Information Models and in Applications
© Blue Wave Informatics LLP, 2012

Introduction
What is “controlled terminology”?
Some definitions that I use...

Vocabulary:
• A vocabulary is “Words used by language, book, branch of science or author; (a) list of these (words)”
• In the informatics context, a vocabulary is a list of words or phrases (“concepts”) that describe concepts in a particular domain

Dictionary:
• A dictionary is a listing (usually alphabetic) of concepts accompanied by explanatory text for those concepts (explaining the meaning of the concept)
• It may also give corresponding words in another language

Classification (Taxonomy):
• Classification (verb and noun)
  – Classification is the action or process of classifying concepts; arranging concepts in classes or categories according to shared qualities or characteristics
  – Classification is a means of giving order to a group of disconnected facts
• Taxonomy (noun)
  – Taxonomy is the practice and science of classification. Taxonomies, or taxonomic schemes, are composed of taxonomic units known as taxa, or kinds of things that are arranged in a hierarchical structure, related by subtype-supertype relationships.
  – A taxonomy is a systematic classification of concepts within a domain.

Hierarchy:
• A hierarchy is a system or organisation in which concepts are ranked one above the other according to rules (in the wider world, according to status or authority)
• A hierarchy is therefore a type of classification based on “ranking*”
* “ranking” is “putting something in its place within a system”

Terminology:
• Terminology is the study of terms and their use: of words and compound words (phrases) that are used in specific contexts (domains)
• Terminology also describes a formal discipline which systematically studies or practises the description and organisation of concepts in a domain (subject area) for one or more uses
  – the product of this is sometimes called a “systematic terminology”

Terminology Principles:
• Analysis and description of the concepts (“units of thought”) within the particular domain
• Analysis and description of the relationships between the concepts (the organisation between the terms)
• Identification of all the terms (words or phrases) associated with a concept (synonyms); each concept has one unambiguous preferred term and any number of synonyms (note: synonyms may be shared with other concepts). This produces the definition of the concepts
• A terminology has aspects of a vocabulary (the set of concepts used in a domain), a dictionary (explaining concepts) and a taxonomy/classification (organising the concepts: structure, relationships between concepts).

Controlled Terminology:
• The product of applying the principles of terminology to produce “a body of terms* used with a particular technical application in a subject of study, theory, profession etc.”
• It is authored (“controlled”) in such a way as to adhere to the “Cimino Desiderata” for healthcare practice
• It is (therefore) specifically designed to support robust semantics in machine processing as well as human use
• May be called a “systematic terminology” because the principles are applied in an organised way
* Because the “terms” are words or phrases, this is often also referred to as “controlled vocabulary”

Interface v. Reference Terminology:
• An interface terminology is one designed for use directly in applications
• May have
  – navigation concepts
  – many synonyms
  – short keys (and short cuts!)
• A reference terminology is one designed primarily for robust definition of concepts
  – used in analysis
  – decision support
  – may be very large or complex, so may not be easily used (without modification) by end users/end user systems

What about the “o” word?:
• Traditionally, “ontology” is a “branch of metaphysics concerned with the nature of being”
• An ontology (as opposed to “ontology”) is a representation of a set of concepts within a domain and the relationships between those concepts
• How therefore does an ontology differ from a terminology (if at all!)?

Ontology:
• An ontology is used in information science to “reason” about the properties of the domain, and may be used to “define” the domain.
  An ontology moves from the purely informational representation into the area of assertional knowledge but, like all boundaries, it is not easy to draw an exact line, and many terminologies become more “ontological” over time
• Ontologies generally have a rich set of relationships
  – synonymy and sometimes antonymy (although that’s difficult)
  – hyponymy/hypernymy (the is_a subsumption relationship)
  – meronymy and holonymy (partitive relationships)
  – negation (although this is really difficult)

Ontology (II):
• Ontologies will generally include “classification(s)” for the concepts within them
  – Note that this is different from the subsumption relationship because of the contextualisation of classification
• Ontologies generally have individual instances of concepts, classes of concepts (also sometimes known as “types”, “sorts”, “categories” or “kinds”; they are collections of concepts), relationships between concepts and attributes of concepts
  – As such, ontologies have an underlying information model that they instantiate with the concepts themselves, rather than using concepts from terminology(ies) to instantiate them...

Starting to Join Terminology to Information Models
Using Terminology in Information Models
Introducing Concept Domains

3 + 1 Pillars of CSI
• Following Charlie Mead, US NCI
Necessary, but not necessarily sufficient...
1. A common information model
2. Information model uses robust standard datatypes
3. Information model uses domain specific attribute semantics from concept-based terminologies
+ 1. Specification for information exchange
2012-03-05

Information Models
• A static information model (a class model) has classes: representations of types of things in relationship to each other
• At the logical model level, the things will have attributes: properties of the thing that describe the thing (and may or may not be definitional to the thing)

Concept Domains
• Each class and attribute in a model has a name, a label, that describes what it represents
• It should also have a definition, a description of the semantic space that it encompasses
• For those classes and attributes that will be instantiated, that is, used to represent real things (instances) using vocabulary (and so using the CD datatype), that description of the semantic space (semantic type) is the concept domain
• A Concept Domain defines the Semantic Space for a “thing” attribute in an Information Model

Concept Domains (II)
• Therefore, a concept domain has a definition, usually identical to the definition of the attribute in the model that it supports
• It may also have a description or usage notes
• If it can have some examples of instances of things in that semantic space, even better
  – Make: the manufacturer of the car [Examples: Ford, General Motors, Ferrari]
  – Model: the version of the car, usually defined by a particular chassis [Examples: Mondeo, Mustang, F450]
  – Colour: the hue of the paint on the body of the car [Examples: Red, Racing Green, Blue]

Value of Concept Domains
• Using concept domains is useful because it supports:
  – the selection of vocabulary occurring at a separate time from the information model design
  – gaining consensus on the set of concepts to be used in the concept domain
  – different implementations to use their own vocabulary but still share semantic foundations
  – management of changing instances over time
• BUT no information model, or application built upon such an information model, is implementable or useable until the concept domain is “bound” to a code system or value set
  – this is sometimes known as “making a vocabulary declaration”

An Example – SiteStatusCode in BRIDG

An Example – in an Application

Key pieces in the vocabulary machinery to instantiate concept
• Concepts, Codes and Designations – Concept Representation
• Code Systems and Concept Identifiers
• Value Sets
Using the definitions and principles from ISO 21090 and ISO 17583

Concepts, Codes and Designations – Concept Representation
X79Q8
Apple, Pomme, Manzana

A concept is a unit of thought
With thanks to: David Robinson, NHSIA

Concept Definition
• A Concept is a unitary mental representation of a real or abstract thing: an atomic unit of thought.
• Concepts, as abstract, language- and context-independent representations of meaning, are important for the design and interpretation of static information models. They constitute the smallest semantic entities with which models are built. The authors and the readers of a model use concepts and their relationships to build and understand the models; these are what matter to the human user of models.
• The vocabulary machinery exists to permit software manipulation of these units of thought
As models are layered and developed, the size and description of the smallest semantic entity may change, to best meet the use case(s) and requirements, and to show different views on reality

A concept can be labelled with a code
X79Q8
With thanks to: David Robinson, NHSIA

Code Definition
• A Code is a machine-processable Concept Representation published by the author of a Code System as part of the Code System
• It is the preferred unique identifier for that concept in that Code System for the purpose of communication (the preferred machine-readable identifier), and is used in the 'code' property of an ISO 21090 CD data type
• Codes are sometimes meaningless identifiers, and sometimes they are mnemonics that imply the represented concept to a human reader.
  – MedDRA codes are meaningless identifiers: “10040589” (Shoplifting)
  – ISO (2 letter) country codes are mnemonic: GB = United Kingdom (of Great Britain and Northern Ireland)
• Meaningless identifiers are advised (see the Cimino Desiderata), particularly in larger vocabulary systems

A concept can be labelled with a designation
Apple, Pomme, Manzana
With thanks to: David Robinson, NHSIA

Designation Definition
• A Designation is a language symbol for a concept that is intended to convey the concept meaning to a human being
• A Designation may also be known as an appellation, symbol, or term
• A Designation is typically used to populate the 'displayName' property of an ISO 21090 CD data type

Concept Representation
• Putting together a code and a designation gives a concept representation for a concept, a single unit of thought
• This is something that is both machine-readable and human-readable
Concept Representation: X79Q8: Apple

Concept Representation (II)
• A Concept Representation is a vocabulary object that enables the description and manipulation of a Concept in systems and applications (such as information models, XML schema)
• A Concept Representation exists in some form that is computable, and can be used in information models and specifications
• Concept Representations can take on a number of different roles in the structure and processing of vocabulary in information models

Code Systems – collections of concepts and Concept Identifiers

Code System
• A Code System is a managed collection of Concept Representations, including codes and/or designations, but sometimes with more complex sets of rules, references (definitions), and relationships
• A Code System may be described as “a collection of uniquely identifiable concepts with associated representations, designations, associations, and meanings”
• A Concept should be unique in a given Code System
  – A concept may have synonyms
  – A concept may be a singleton, or may be constructed of other concepts (i.e. post-coordinated concepts)
• Although these things may be differentially referred to as terminologies, vocabularies, or coding schemes, or even classifications, the ISO 21090 CD datatype considers all such collections ‘Code Systems’
  – Examples include ICD-9-CM, SNOMED CT, LOINC, and MedDRA
Hence a “terminology model”

Code System Properties
• Code systems should have:
  – an identifier that uniquely identifies the Code System.
    For ISO 21090 conformant model instances, this SHALL be in the form of an ISO OID
  – a description consisting of prose that describes the Code System, and may include the Code System uses, maintenance strategy, intent and other information of interest
    • when using a code system to support instantiation of a model, it is this description that should match or be compatible with the relevant concept domain
  – administrative information proper to the Code System, independent of any specific version of the Code System, such as ownership, source URL, and copyright information

Managing Change in Code Systems
• Code Systems should evolve over time
• Changes occur because
  – corrections and clarifications are made
  – the understanding of the concepts being described evolves (e.g., new genes and proteins are discovered)
  – the concepts being described change (e.g., new countries emerge; old countries are absorbed)
  – the assessment of the relevance of particular concepts within the knowledge resource changes (e.g., the addition of new parent-child relationships)

Code System Versions
• Depending upon how well the Code System adheres to Good Vocabulary Practices (the “Cimino Desiderata”), changes could be significant
• Changes in concept meaning, although discouraged, can occur and can cause issues which could themselves be significant
• Therefore it can be important to know which version of a given Code System was used in
  – the creation of a system record or message instance
  – (in some cases) the creation of an information model/schema
• Hence “Code System Version” is a property of the CD datatype

The Concept Identifier
• A Concept Identifier is a vocabulary object that unambiguously and globally uniquely represents a concept within the context of a Code System in a machine-readable way
• A Concept Identifier consists of: the OID for Code System + Code (+ Designation/Display name)
• To make a Concept Identifier human-readable, add the “display name” (the designation) thus: the OID for Code System + Code (+ Designation/Display name)
  – note that the designation (display name) is not mandatory for the concept identifier, but it is considered good practice to always have the designation for safety reasons (data unscrambling etc.)

Value Sets – making concepts and code systems work in information models and applications

Value Sets
• A Value Set represents a uniquely identifiable set of valid concept identifiers, where any concept identifier used within the CD datatype can be tested to determine whether it is a member of the Value Set at a specific point in time
  – it is this that makes a particular attribute “conformance testable”
• Value Sets exist to constrain the permissible content of a concept domain for a particular use
  – in an information model vocabulary binding
  – in analysis
  – in UI data collection, in a pick list (drop-down box), etc.
• A Value Set may have a description, but this is not intended to describe the semantics of the Value Set; a Value Set has no intrinsic semantics separate from the coded concepts contained in its expansion
  – a value set is useful only in context, not as a stand-alone object

Looking at the ISO 21090 CD datatype

ISO 21090 Concept Descriptor Datatype
code
codeSystem (identified using an OID)
codeSystemName
codeSystemVersion
codingRationale
displayName
originalText
source
translation
valueSet (identified using an OID)
valueSetVersion

Concept Descriptor Attributes
• code – the (machine-readable) concept representation
  – Note: the cardinality is 0..1 to allow for NULL FLAVOURS
• codeSystem (OID) – uniquely and machine-readably identifies the code system that the code comes from
• codeSystemName – the human-readable name of the code system (e.g. “MedDRA”)
• codeSystemVersion – the version of the code system that the code comes from
• codingRationale – information about how/why the code was selected, the reason the concept has been provided; rarely if ever used
• displayName – the human-readable description of the concept as it exists in the code system: the Term Name
• originalText – the piece of text in a document or report that the concept has been selected to represent (it shows the meaning the user intended to communicate); this might be used in an ICSR, for example
• source – if any translation (mapping) has occurred, this gives the source code
• translation – a set of other concept descriptors, each representing a translation of this code into equivalent codes within the same code system or into corresponding concepts from other code systems (could be used for synonyms, or could be used to describe mapped concepts)
• valueSet – the value set that applied when this instance of information was created
• valueSetVersion – the version of the value set that applied when this instance of information was created
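
A worked sketch

The pieces described in these slides (a Concept Descriptor, a Concept Identifier made of code system OID + code, and a Value Set membership test) can be sketched in a few lines of Python. This is an illustration only, not the ISO 21090 specification: the class names, the snake_case field names, the OIDs under the 2.999 example arc, and the "fruit" code system are all assumptions made for the example. Only the code X79Q8 and the designations Apple and Pomme come from the slides.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class ConceptDescriptor:
    """Hypothetical, simplified stand-in for the ISO 21090 CD datatype."""
    code: Optional[str] = None              # 0..1: may be absent (null flavour)
    code_system: Optional[str] = None       # OID identifying the code system
    code_system_name: Optional[str] = None  # human-readable code system name
    code_system_version: Optional[str] = None
    coding_rationale: Optional[str] = None
    display_name: Optional[str] = None      # the designation
    original_text: Optional[str] = None
    source: Optional["ConceptDescriptor"] = None
    translation: Tuple["ConceptDescriptor", ...] = ()
    value_set: Optional[str] = None         # OID identifying the value set
    value_set_version: Optional[str] = None

    def concept_identifier(self) -> Tuple[str, str]:
        """Return (code system OID, code); the display name is not part of it."""
        if self.code is None or self.code_system is None:
            raise ValueError("no code or code system present (null flavour?)")
        return (self.code_system, self.code)


@dataclass(frozen=True)
class ValueSet:
    """Hypothetical value set: an identified, versioned set of concept identifiers."""
    oid: str
    version: str
    members: FrozenSet[Tuple[str, str]]

    def contains(self, cd: ConceptDescriptor) -> bool:
        """Conformance test: is this concept identifier a member of the value set?"""
        return cd.concept_identifier() in self.members


# Made-up OIDs under 2.999, the OID arc reserved for examples.
FRUIT_CODE_SYSTEM = "2.999.1"
FRUIT_VALUE_SET = "2.999.2"

apple = ConceptDescriptor(code="X79Q8", code_system=FRUIT_CODE_SYSTEM,
                          code_system_version="1.0", display_name="Apple")
pomme = ConceptDescriptor(code="X79Q8", code_system=FRUIT_CODE_SYSTEM,
                          code_system_version="1.0", display_name="Pomme")
unknown = ConceptDescriptor(code="ZZZZZ", code_system=FRUIT_CODE_SYSTEM)

fruit_vs = ValueSet(oid=FRUIT_VALUE_SET, version="1",
                    members=frozenset({(FRUIT_CODE_SYSTEM, "X79Q8")}))

# Different designations, same concept identifier.
print(apple.concept_identifier() == pomme.concept_identifier())
print(fruit_vs.contains(apple), fruit_vs.contains(unknown))
```

In this sketch the two descriptors carry different designations but resolve to the same concept identifier, and only that identifier is used for the membership test. That mirrors the point made above that the designation is optional for identification but worth carrying for human readers.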