The ISO 25964 data model for the structure
of an information retrieval thesaurus
Leonard Will
ISO 25964-1:2011
Thesauri for information
retrieval
[UML class diagram: overview of the complete ISO 25964-1 data model]
Classes shown: Thesaurus, VersionHistory, ConceptGroup, ConceptGroupLabel, ThesaurusArray, NodeLabel, ThesaurusConcept, ThesaurusTerm, PreferredTerm, SimpleNonPreferredTerm, SplitNonPreferredTerm, Equivalence, CompoundEquivalence, HierarchicalRelationship, AssociativeRelationship, TopLevelRelationship, CustomConceptAttribute, CustomTermAttribute, Note, ScopeNote, HistoryNote, CustomNote, Definition, EditorialNote.
Elements that appear only in this overview:
AssociativeRelationship
    +role: String[0..1]
    role names: +hasRelatedConcept / +isRelatedConcept
TopLevelRelationship
    role names: +hasTopConcept / +isTopConceptOf
CustomTermAttribute
    +lexicalValue: String[1]
    +customAttributeType: String[1]
    +lang: language[0..1]
    role names: +hasCustomTermAttribute / +isCustomTermAttributeOf
Notes and attributes
[UML class diagram: notes and custom attributes attached to concepts and terms]
ThesaurusConcept
    +identifier: String[1]
    +created: date[0..1]
    +modified: date[0..1]
    +status: String[0..1]
    +notation: String[0..*]
    +topConcept: Boolean[0..1]
ThesaurusTerm
    +lexicalValue: String[1]
    +identifier: String[1]
    +created: date[0..1]
    +modified: date[0..1]
    +source: String[0..1]
    +status: String[0..1]
    +lang: language[0..1]
CustomConceptAttribute
    +lexicalValue: String[1]
    +customAttributeType: String[1]
    +lang: language[0..1]
Note
    +lexicalValue: String[1]
    +created: date[0..1]
    +modified: date[0..1]
    +lang: language[0..1]
Kinds of Note: ScopeNote, HistoryNote, CustomNote (+noteType: String[0..1]), Definition (+source: String[0..1]), EditorialNote
Associations:
    ThesaurusConcept (+isCustomConceptAttributeOf, 1) -- (+hasCustomConceptAttribute, 0..*) CustomConceptAttribute
    ThesaurusConcept (+definesScopeOf, 1) -- (+hasScopeNote, 0..*) ScopeNote
    ThesaurusConcept (+annotatesHistory, 1) -- (+hasHistoryNote, 0..*) HistoryNote
    ThesaurusConcept (+isCustomNoteOf, 1) -- (+hasCustomNote, 0..*) CustomNote
    ThesaurusTerm (+annotatesHistory, 1) -- (+hasHistoryNote, 0..*) HistoryNote
    ThesaurusTerm (+isDefinitionOf, 1) -- (+hasDefinit…, 0..*) Definition
    ThesaurusTerm (+isEditorialNoteOn, 1) -- (+hasEditorialNote, 0..*) EditorialNote
    +refersTo (0..*) / +isReferredToIn (0..*): between notes and ThesaurusTerm
Terms and concepts
● A concept is defined by its scope note and its relationships, not by the term chosen to label it.
● A concept has one preferred term per language.
● A concept may have one or many non-preferred terms.
● Simple non-preferred terms are linked to the preferred term by a USE/UF relationship, e.g.
automobiles
USE cars
cars
UF automobiles
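The points above can be sketched in code. This is a minimal illustration, not part of the standard: the class name `Concept`, its methods and the identifier "c1" are assumptions made for the example. It shows a concept identified independently of its labels, with one preferred term per language, and the reciprocal USE/UF entries derived from the non-preferred terms.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """Hypothetical concept record; the identifier, not a label, names it."""
    identifier: str
    preferred: dict[str, str] = field(default_factory=dict)            # lang -> preferred term
    non_preferred: dict[str, list[str]] = field(default_factory=dict)  # lang -> other terms

    def set_preferred(self, lang: str, term: str) -> None:
        # Enforce "one preferred term per language".
        if lang in self.preferred:
            raise ValueError(f"concept already has a preferred term for {lang!r}")
        self.preferred[lang] = term

    def add_non_preferred(self, lang: str, term: str) -> None:
        self.non_preferred.setdefault(lang, []).append(term)

    def entries(self, lang: str) -> list[str]:
        """Return the reciprocal USE/UF display lines for one language."""
        pt = self.preferred[lang]
        lines = []
        for npt in self.non_preferred.get(lang, []):
            lines.append(f"{npt} USE {pt}")
            lines.append(f"{pt} UF {npt}")
        return lines


cars = Concept("c1")
cars.set_preferred("en", "cars")
cars.add_non_preferred("en", "automobiles")
print("\n".join(cars.entries("en")))
```

Storing the USE/UF link once and generating both directions keeps the two entries from drifting apart.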
Terms and concepts
[UML class diagram: concepts and their preferred and non-preferred terms]
ThesaurusConcept
    +identifier: String[1]
    +created: date[0..1]
    +modified: date[0..1]
    +status: String[0..1]
    +notation: String[0..*]
    +topConcept: Boolean[0..1]
ThesaurusTerm
    +lexicalValue: String[1]
    +identifier: String[1]
    +created: date[0..1]
    +modified: date[0..1]
    +source: String[0..1]
    +status: String[0..1]
    +lang: language[0..1]
Kinds of ThesaurusTerm: PreferredTerm, SimpleNonPreferredTerm (+hidden: Boolean[0..1])
Associations:
    ThesaurusConcept (+isPreferredLabelFor, 1) -- (+hasPreferredLabel, 1..*) PreferredTerm
        Note on diagram: one preferred term per language
    ThesaurusConcept (+isNonPreferredLabelFor, 1) -- (+hasNonPreferredLabel, 0..*) SimpleNonPreferredTerm
Links between terms
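[UML class diagram: equivalence links between terms]
ThesaurusConcept
    +identifier: String[1]
    +created: date[0..1]
    +modified: date[0..1]
    +status: String[0..1]
    +notation: String[0..*]
    +topConcept: Boolean[0..1]
ThesaurusTerm
    +lexicalValue: String[1]
    +identifier: String[1]
    +created: date[0..1]
    +modified: date[0..1]
    +source: String[0..1]
    +status: String[0..1]
    +lang: language[0..1]
Kinds of ThesaurusTerm: PreferredTerm, SimpleNonPreferredTerm (+hidden: Boolean[0..1]), SplitNonPreferredTerm
Equivalence
    +role: String[0..1]
CompoundEquivalence
Associations:
    ThesaurusConcept (+isPreferredLabelFor, 1) -- (+hasPreferredLabel, 1..*) PreferredTerm
    ThesaurusConcept (+isNonPreferredLabelFor, 1) -- (+hasNonPreferredLabel, 0..*) SimpleNonPreferredTerm
    Equivalence: PreferredTerm (+USE, 1) -- (+UF, 0..*) SimpleNonPreferredTerm
    CompoundEquivalence: PreferredTerm (+USE+, 2..*) -- (+UF+, 0..*) SplitNonPreferredTerm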
Compound equivalence 1
Split non-preferred terms are linked to two or more preferred terms by a
relationship tagged as USE+/UF+ when they imply an intersection, e.g.
coal mining
USE+ coal
USE+ mining
coal
UF+ coal mining
mining
UF+ coal mining
[Diagram: overlapping sets labelled coal and mining, with the overlap labelled coal mining]
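As a rough sketch of how a retrieval system might act on a compound equivalence, the following resolves a split non-preferred term to the intersection of the postings of its preferred terms. The `index` contents, the document ids and the function names are hypothetical, chosen only for illustration.

```python
# Hypothetical postings: preferred term -> ids of documents indexed with it.
index = {
    "coal": {1, 2, 3},
    "mining": {2, 3, 4},
}

# Compound equivalence: split non-preferred term -> two or more preferred terms (USE+).
use_plus = {"coal mining": ["coal", "mining"]}


def uf_plus(preferred: str) -> list[str]:
    """Reciprocal UF+ entries for a preferred term."""
    return sorted(npt for npt, pts in use_plus.items() if preferred in pts)


def search(term: str) -> set[int]:
    """Resolve a term; a split non-preferred term becomes an intersection."""
    if term in use_plus:
        postings = [index.get(pt, set()) for pt in use_plus[term]]
        return set.intersection(*postings)
    return set(index.get(term, set()))


print(search("coal mining"))  # documents indexed with both "coal" and "mining"
print(uf_plus("coal"))
```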
Compound equivalence 2
When a compound concept expresses the union of concepts, it is
better to treat it as a broader concept, e.g.
not
fossil fuels
USE+ coal
USE+ natural gas
USE+ petroleum
but
fossil fuels
NT coal
NT natural gas
NT petroleum
[Diagram labels: fossil fuels, coal, natural gas, petroleum]
(ISO 25964 does not specifically deal with this case)
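One way to see why the NT form suits a union is that a search on the broader concept can be expanded to include its narrower concepts, giving the union of their postings. The sketch below assumes hypothetical postings and a hypothetical function name, and is only one possible search strategy.

```python
# Hypothetical postings: preferred term -> ids of documents indexed with it.
postings = {
    "fossil fuels": {10},
    "coal": {1, 2},
    "natural gas": {3},
    "petroleum": {2, 4},
}

# NT links taken from the example above.
narrower = {"fossil fuels": ["coal", "natural gas", "petroleum"]}


def search_with_narrower(term: str) -> set[int]:
    """Union of the postings of a term and of all its narrower terms."""
    result = set(postings.get(term, set()))
    for nt in narrower.get(term, []):
        result |= search_with_narrower(nt)
    return result


print(sorted(search_with_narrower("fossil fuels")))
```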
Hierarchical relationships
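[UML class diagram: hierarchical relationships between concepts]
ThesaurusConcept
    +identifier: String[1]
    +created: date[0..1]
    +modified: date[0..1]
    +status: String[0..1]
    +notation: String[0..*]
    +topConcept: Boolean[0..1]
HierarchicalRelationship
    +role: String[1]
Association:
    HierarchicalRelationship: ThesaurusConcept (+hasHierRelConcept, 0..*) -- (+isHierRelConcept, 0..*) ThesaurusConcept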
“role” may be used to distinguish
generic (BTG/NTG),
partitive (BTP/NTP), and
instantive (BTI/NTI) relationships.
This is important when considering “transitivity”, which fails when these
types are mixed. E.g. given the relationships
countries
NTI India
India
NTP Karnataka
you cannot conclude that Karnataka is a country.
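The transitivity point can be illustrated with a small sketch that follows narrower-term links of a single role only. The tuple layout and the function name are assumptions for the example, and the vehicles/cars rows are hypothetical data added to show a chain where transitivity does hold.

```python
from collections import defaultdict

# (broader, role, narrower); role codes follow the slide: NTG, NTP, NTI.
relationships = [
    ("countries", "NTI", "India"),
    ("India", "NTP", "Karnataka"),
    ("vehicles", "NTG", "cars"),        # hypothetical generic chain
    ("cars", "NTG", "sports cars"),
]


def narrower_closure(start: str, role: str) -> set[str]:
    """All concepts reachable from `start` by following only `role` links."""
    children = defaultdict(list)
    for broader, r, narrower in relationships:
        if r == role:
            children[broader].append(narrower)
    seen: set[str] = set()
    stack = [start]
    while stack:
        for child in children[stack.pop()]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


print(narrower_closure("countries", "NTI"))  # Karnataka is not reached
print(narrower_closure("vehicles", "NTG"))   # generic links chain transitively
```

Because the traversal never switches role, the instantive link to India is not combined with the partitive link to Karnataka.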
Arrays and node labels
nails
– <nails by form>
– – cut nails
– – helical nails
– – hook nails
– – <nails by form: head type>
– – – double-headed nails
– – – flat-head nails
– – – headless nails
– – <nails by form: point type>
– – – barbed nails
– – – blunt nails
– – – chisel nails
Arrays may be
● nested
● ordered
● with node labels
Node labels
● show a characteristic of division
● do not label thesaurus concepts
● do not have hierarchical or associative relationships
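The nails display above can be reproduced from a nested structure in which node labels belong to arrays rather than to concepts. This is a minimal sketch: the `Array` class and `render` function are hypothetical names, and the indentation convention simply copies the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Array:
    """Hypothetical array: an optional node label and ordered members."""
    node_label: str | None = None
    members: list[str | Array] = field(default_factory=list)  # concepts or nested arrays


def render(concept: str, arrays: list[Array]) -> list[str]:
    """Display a concept followed by its subordinate arrays."""
    lines = [concept]

    def walk(array: Array, depth: int) -> None:
        if array.node_label is not None:
            # Node labels are shown in angle brackets; they are not concepts.
            lines.append("– " * depth + f"<{array.node_label}>")
        for member in array.members:
            if isinstance(member, Array):
                walk(member, depth + 1)
            else:
                lines.append("– " * (depth + 1) + member)

    for array in arrays:
        walk(array, 1)
    return lines


nails = Array("nails by form", [
    "cut nails", "helical nails", "hook nails",
    Array("nails by form: head type",
          ["double-headed nails", "flat-head nails", "headless nails"]),
    Array("nails by form: point type",
          ["barbed nails", "blunt nails", "chisel nails"]),
])
print("\n".join(render("nails", [nails])))
```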
Arrays and node labels
[UML class diagram: arrays and node labels]
Thesaurus
    +identifier: String[1..*]
    +contributor: String[0..*]
    +coverage: String[0..*]
    +creator: String[0..*]
    +date: date[0..*]
    +created: date[0..1]
    +description: String[0..*]
    +format: String[0..*]
    +lang: language[1..*]
    +publisher: String[0..*]
    +relation: String[0..*]
    +rights: String[0..*]
    +source: String[0..*]
    +subject: String[0..*]
    +title: String[0..*]
    +type: String[0..*]
NodeLabel
    +lexicalValue: String[1]
    +created: date[0..1]
    +modified: date[0..1]
    +lang: language[0..1]
ThesaurusArray
    +identifier: String[1]
    +ordered: Boolean = false[1]
    +notation: String[0..*]
ThesaurusConcept
    +identifier: String[1]
    +created: date[0..1]
    +modified: date[0..1]
    +status: String[0..1]
    +notation: String[0..*]
    +topConcept: Boolean[0..1]
Associations:
    Thesaurus (+isPartOf, 1) -- (+contains, 0..*) ThesaurusArray
    Thesaurus (+isPartOf, 1) -- (+contains, 1..*) ThesaurusConcept
    ThesaurusArray (+isNodeLabelOf, 1) -- (+hasNodeLabel, 0..*) NodeLabel
    ThesaurusArray (+hasSuperOrdinateArray, 0..1) -- (+hasMemberArray <ordered>, 0..*) ThesaurusArray
    ThesaurusConcept (+hasSuperOrdinateConcept, 0..1) -- (+hasSubordinateArray, 0..*) ThesaurusArray
    ThesaurusArray (+isMemberOfArray) -- (+hasMemberConcept <ordered>) ThesaurusConcept
Concept groups
Many thesauri group concepts in additional ways to supplement the hierarchical BT/NT structure.
These may be based on subject areas, and called:
● microthesauri
● themes
● domains
These groups may have a hierarchical structure and a notation, forming a type of classification scheme, each group containing concepts from several facets (e.g. people, objects, materials, abstract concepts, activities, space and time).
Example of a domain in the Eurovoc thesaurus:
28 SOCIAL QUESTIONS
2806 family
2811 migration
2816 demography and population
2821 social framework
2826 social affairs
2831 culture and religion
2836 social protection
2841 health
2846 construction and town planning
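A concept group hierarchy of this kind can be sketched as a small tree with notations, where concepts are attached as members independently of their BT/NT position. The class and function names, the group type strings and the concept identifier "c42" are assumptions for illustration; only three of the subgroups listed above are included.

```python
from dataclasses import dataclass, field


@dataclass
class ConceptGroup:
    """Hypothetical concept group with a notation, a label and subgroups."""
    notation: str
    label: str
    group_type: str                                    # e.g. "domain", "microthesaurus"
    subgroups: list["ConceptGroup"] = field(default_factory=list)
    members: set[str] = field(default_factory=set)     # concept identifiers


def groups_of(concept_id: str, group: ConceptGroup) -> list[str]:
    """Notations of every group, at any depth, that lists the concept."""
    found = [group.notation] if concept_id in group.members else []
    for sub in group.subgroups:
        found.extend(groups_of(concept_id, sub))
    return found


social = ConceptGroup("28", "SOCIAL QUESTIONS", "domain", subgroups=[
    ConceptGroup("2806", "family", "microthesaurus"),
    ConceptGroup("2811", "migration", "microthesaurus"),
    ConceptGroup("2841", "health", "microthesaurus"),
])

# Hypothetical concept placed in two groups: membership is many-to-many.
social.subgroups[0].members.add("c42")
social.subgroups[2].members.add("c42")
print(groups_of("c42", social))
```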
Concept groups
[UML class diagram: concept groups]
Thesaurus
    +identifier: String[1..*]
    +contributor: String[0..*]
    +coverage: String[0..*]
    +creator: String[0..*]
    +date: date[0..*]
    +created: date[0..1]
    +description: String[0..*]
    +format: String[0..*]
    +lang: language[1..*]
    +publisher: String[0..*]
    +relation: String[0..*]
    +rights: String[0..*]
    +source: String[0..*]
    +subject: String[0..*]
    +title: String[0..*]
    +type: String[0..*]
ConceptGroup
    +identifier: String[1]
    +conceptGroupType: String[1]
    +notation: String[0..*]
ConceptGroupLabel
    +lexicalValue: String[1]
    +created: date[0..1]
    +modified: date[0..1]
    +lang: language[0..1]
ThesaurusConcept
    +identifier: String[1]
    +created: date[0..1]
    +modified: date[0..1]
    +status: String[0..1]
    +notation: String[0..*]
    +topConcept: Boolean[0..1]
Associations:
    Thesaurus (+isPartOf, 1) -- (+contains, 0..*) ConceptGroup
    Thesaurus (+isPartOf, 1) -- (+contains, 1..*) ThesaurusConcept
    ConceptGroup (+isConceptGroupLabelOf, 1) -- (+hasConceptGroupLabel, 1..*) ConceptGroupLabel
    ConceptGroup (+hasSupergroup, 0..*) -- (+hasSubgroup, 0..*) ConceptGroup
    ConceptGroup (+isMemberOfGroup, 0..*) -- (+hasAsMember, 0..*) ThesaurusConcept
Version history
[UML class diagram: version history]
Thesaurus
    +identifier: String[1..*]
    +contributor: String[0..*]
    +coverage: String[0..*]
    +creator: String[0..*]
    +date: date[0..*]
    +created: date[0..1]
    +description: String[0..*]
    +format: String[0..*]
    +lang: language[1..*]
    +publisher: String[0..*]
    +relation: String[0..*]
    +rights: String[0..*]
    +source: String[0..*]
    +subject: String[0..*]
    +title: String[0..*]
    +type: String[0..*]
VersionHistory
    +identifier: String[0..1]
    +date: date[0..1]
    +versionNote: String[0..1]
    +currentVersion: Boolean[0..1]
    +thisVersion: Boolean[1]
Association:
    Thesaurus (+isVersionOf, 1) -- (+hasVersion, 0..*) VersionHistory
Version history provides a list of all the versions of this thesaurus that exist, identifying which versions are still current and which version this is.
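A version history list could be read along the following lines. This is a sketch under stated assumptions: the Python field names mirror the VersionHistory attributes, the version identifiers and dates are invented sample data, and the check that exactly one entry is flagged as thisVersion is an assumption of this example rather than a rule quoted from the standard.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VersionHistory:
    this_version: bool                        # thisVersion: Boolean[1]
    identifier: Optional[str] = None          # identifier: String[0..1]
    date: Optional[str] = None                # date: date[0..1]
    version_note: Optional[str] = None        # versionNote: String[0..1]
    current_version: Optional[bool] = None    # currentVersion: Boolean[0..1]


def describe(history: list[VersionHistory]) -> tuple[Optional[str], list[Optional[str]]]:
    """Return (identifier of this version, identifiers of versions still current)."""
    this = [v.identifier for v in history if v.this_version]
    if len(this) != 1:
        # Assumption of this sketch, not a quoted rule.
        raise ValueError("expected exactly one entry flagged as thisVersion")
    current = [v.identifier for v in history if v.current_version]
    return this[0], current


# Invented sample data.
history = [
    VersionHistory(False, "1.0", "2009-01-01", current_version=False),
    VersionHistory(False, "2.0", "2010-06-01", current_version=True),
    VersionHistory(True, "2.1", "2011-03-01", current_version=True),
]
print(describe(history))
```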
Online model and XML schema
The data model in diagrammatic form is publicly available on the web site
for the ISO 25964 project, at
<http://www.niso.org/schemas/iso25964/>
An XML schema embodying the structure of the model is also available there.
It can be used for data interchange, or for generating various output
formats with standard XML transformation software.