On the Identification of Inference Rules for Automatic Metadata Generation
Merkourios Margaritopoulos, Athanasios Manitsaris, Ioannis Mavridis
Department of Applied Informatics, University of Macedonia,
156 Egnatia Street 54006, Thessaloniki, Greece
{mermar, manits, mavridis}@uom.gr
Abstract. Learning objects metadata (LOM) facilitate search, evaluation, acquisition, and use of learning objects, which may contain any kind of multimedia information, by learners, instructors or automated software processes. As a
result, learning objects reuse is promoted and thus, development cost for new
ones is reduced. However, manual indexing of learning resources according to
metadata standards is a laborious task. This hinders the full adoption of standards such as IEEE LOM and reduces their intended advantages. The introduction
of automatic metadata generation processes is a developing research field with
diverse approaches. In this paper, we present a step-by-step methodology for
automatic generation of metadata that exploits relations between the resources
to be described. The methodology comprises the execution of consecutive steps
of actions aiming at identifying inference rules for automatic generation of a resource’s metadata based on pre-existing metadata of its related resources.
Keywords: Learning objects metadata, automatic metadata generation, inference rules.
1 Introduction
Document retrieval in our modern digital world is mainly implemented by search engines which comprise large databases containing indexes of documents one can find
in the World Wide Web. Search engines populate their databases using tools like spiders and robots which traverse the web from link to link and index web pages and
other documents automatically, based on the documents’ content, by means of content
analysis techniques (mainly content in text format). However, documents in formats other than text (containing multimedia information such as images, videos,
sounds, etc.) cannot be fully described this way. Full description of these documents
can be obtained by using metadata. Metadata are usually defined as “data about data”
([7], [15], [25]). IEEE in [16] refines this definition as “information about an object”.
Metadata have the advantage of remaining independent from the objects they describe
and store information not present in the described object (such as usage rights, or third
party annotations). The objects used in the learning process (learning objects are defined in [9], [15], [26]), are currently available in various multimedia formats. However, as the population of objects is increasing exponentially, while particular learning
needs are developing equally rapidly, possible lack of information or metadata about
the objects is restricting the ability to locate, use and manage them. The adoption of
standard structures for the interoperable description of learning objects has been a major step towards the definition of a common indexing background. The LOM standard in
[16] ambitiously defines almost 80 elements for the description and management of
learning objects. Yet not only the number but also the diversity of the elements in
this metadata specification has created implementation difficulties. Such a metadata
set of almost 80 elements is by itself a source of trouble to potential indexers,
since using the metadata set in its entirety is a complex and resource-demanding
task. In [18] this problem is called the “metadata bottleneck”. Several studies on the use
of LOM metadata have been carried out by researchers, providing interesting conclusions on the way LOM metadata are used by indexers (see, for example, [11], [20],
[21]).
A solution to the problem of manual indexing is the involvement of automatic
metadata generation techniques. Automatic metadata generation is a result of machine
processing and is often defined by distinguishing it from metadata generated by a person. Most automatic metadata generation operations only require a human to initiate
the process. Many of these operations manipulate metadata previously produced by
humans ([13]). According to [12], automatic metadata generation is more efficient,
less costly, and more consistent than human-oriented processing. Moreover, the authors concur that the most effective results can be achieved by integrating both human
and automatic methods in order to cover the whole range of metadata, which sometimes require human intellectual discretion.
Automatic metadata generation is based on exploiting several sources from which
metadata values can emerge. These sources of information (retrieved from [3]) are:
• Document content analysis: Metadata are generated by the document itself, regardless of any specific usage. Typical content analysis applications are keyword extractors, language analyzers for text documents or pattern recognizers for images.
• Document context analysis: When an object is used in a specific context and data
about that context are available, one can rely on the context to obtain information
about the object itself. One single learning object typically can be deployed in several contexts. This fact results in additional metadata about the object. A special
case of document context analysis is usage context analysis where metadata can
emerge from the real use of objects; e.g. tracking and storing the time spent reading
a document.
• Composite documents structure: In some cases, even if learning objects are parts of
a whole, they are stored separately. In such a case, the metadata available for the
whole is an interesting source for metadata about a component. Not only is the enclosing object a source of metadata, but also the sibling components can provide
relevant metadata.
Among the above three sources of information for automatic metadata generation,
“composite documents structure” is a highly interesting approach with an efficient
application to collections of objects. Objects, related to each other with some kind of
relation, create together a whole and therefore, it is possible that several of their
metadata elements are influenced by each other. The exact kind of influence specifying the interrelation of the metadata elements is obtained by applying inference rules
referring to the semantics of metadata and the relations between the objects. As [8]
notes, metadata generation through “related content” is a method of metadata propagation parallel to basic object-oriented modelling concepts like inheritance; hence the
authors strongly encourage the research community to tackle this issue.
This paper is organised as follows: In Section 2, a brief description of related work
on the issue of automatic metadata generation based on relations between resources is
presented, along with the motivation for the present proposal, which is mainly focusing on educational resources. In Section 3, a solid methodology for the identification
of inference rules that automatically generate metadata values of a resource based on
pre-existing metadata of its related resources is introduced. The methodology consists
of four consecutive steps each of which is elaborated in its respective subsection. Section 4 provides examples of how the methodology can be applied to the LOM metadata schema. Finally, in Section 5, a conclusion is drawn and plans for future work
are presented.
2 Motivation – Related Work
In [1] a semi-automatic tool is presented which proposes default values of LOM
metadata based on the user profile for some descriptors and uses relations between
documents to infer other descriptors. For the indexation of a new learning object, the
tool asks the user to enter the relation descriptor. For each relation entered, the tool
performs a specific task. For example, for an “IsVersionOf” relation, the tool changes
the version descriptor. If an “IsPartOf” relation is entered, the tool changes the granularity of the object. If the learning object has several “References” relations, the tool
computes the union of the keywords of the related learning objects. Furthermore, the
tool can infer new relations from existing ones, e.g. if the learning object currently being
indexed has an “IsPartOf” relation with an already indexed one, the tool adds a “HasPart” relation to it. In [6] these ideas are extended and applied to a pedagogical ontology which is being constructed based on LOM. The ontology exploits relations between learning objects to extract inferences that further enrich the descriptions of the
educational resources (represented in RDF) by introducing several levels of description of concepts, using the Web Ontology Language (OWL) [27]. An extensive list of
semantic relations between the learning objects is used in the ontology. Furthermore,
additional knowledge can be extracted through inference processes that exploit various mathematical properties of the relations (e.g. symmetry, transitivity, etc.).
Another relevant approach in [2] proposes the use of additional inference rules to
be developed for the description of lessons with LOM metadata based on pre-existing
metadata of related resources. An RDF representation is used for the metadata descriptions. In addition, the use of the inference language TRIPLE [22], which was exclusively designed for RDF to represent metadata and rules, is introduced. This work
emphasizes the necessity of having a set of logic rules to be processed by an inference
engine in order to create all implicit metadata elements from pre-existing ones. Thus,
a set of inference rules is defined with each one applying to certain LOM elements
based on the relation connecting the resource being described to resources already indexed. Some of the rules defined are: inversion and transitivity (mathematical properties of
relations); “inheritance along”, according to which certain elements are inherited by a
resource from its parts (connected with the “HasPart” relation); and “summation along”,
which uses the sum of the values of a metadata element of the parts of a resource to compute the value of the same field of the resource. Additionally, the use of the inference
rules as integrity constraints to check for the semantic integrity of the descriptions,
besides the automatic generation of descriptions, is suggested in [2].
In [14] metadata generation is approached by extracting suggestions for the metadata values using a combination of four methods: inheritance, aggregation, content
similarity and ontological – semantic similarity. Instead of directly aiming at automatic generation of metadata values, a process was developed that suggests to the user
the most relevant values for every metadata element. Inheritance and aggregation are
applied to learning objects that are part of an assembly (e.g. a SCORM collection) or an
aggregation (e.g. a web page containing an image and an animation). Such objects together
form a whole and, therefore, it is possible for them to share several element values. Although the objects in the assembly or the aggregation and their metadata records are distinct, a value set for one metadata element in one object can
propagate as a suggested value to other objects in the assembly or the aggregation. If the assembly is organized hierarchically, some of the values can be inherited
from the ancestor nodes or aggregated from the child nodes. The content similarity
method makes use of all accessible metadata records in a repository. A set of suggested values for the elements exhibiting this property is calculated as the union of
element values from metadata records of objects that have contents similar to the object under consideration. The method can employ different algorithms for calculating
content similarity. The fourth method of ontological – semantic similarity is based on
already filled values in the current record that can be mapped to the concepts of an
ontology. These values are fed to a set of inference rules that calculate the values for a
set of metadata fields characterizing similar records. The similar records are retrieved
and a set of suggested values for another field(s) is generated as a union of values
from similar records.
Another approach for LOM metadata generation based on the exploitation of related content is presented in [19]. Relations between learning objects decisively influence their metadata values. This influence can take the form of acquisition, suggestion
and restriction, which are applied as heuristic rules. With acquisition, a metadata element collects values from the same elements of related objects and appends them to
its (pre-existing) values. With suggestion, the value of a metadata element is generated as a result of an expression containing metadata element values of related objects. Finally, with restriction, the range of values of a metadata element is reduced
based on the value of the same metadata element of a related object. A framework
(called diffusion framework due to its recursive nature) for the processing of acquisition, suggestion and restriction, which is expressed as a set of mathematical functions,
is also presented in this work.
In [10] a dynamic approach to relations connecting Hypermedia Learning Objects
(HyLOs) described by the LOM standard is presented. The authors claim that a well
defined mesh of connected learning objects, a semantic learning net, may be presented to the learner for navigation and knowledge exploration, as well as to the author or instructional designer. For this reason, on the one hand, LOM relations are extended by redefining their semantics or including new relations. On the other hand, an
ontological evaluation layer is used, which encodes the semantics of the relations in an OWL
ontology and provides an initial net of relations to a rule engine for further enrichment
with other implicit (heuristically generated) relations. The presented implementation is based on the JENA framework for the inference engine. For defining
logical interrelations between related properties, some 50 logic rules have been identified, which are fed to the inference engine and enrich the initial relations manually
inserted by the user.
The study of these efforts, which take advantage of already described related resources to produce metadata descriptions of a resource, reveals that the set of inference rules used in every proposed approach is a core element, since it is the means for
determining the metadata values. Every one of the above described research efforts
incorporates a set of logic rules the application of which (usually by means of using a
rule or inference engine) produces implicit propositions for the values of metadata of
related resources. Moreover, defining logic rules is an intellectual task, which has to
take into account the semantics of the relations and the metadata. The inference rules proposed by the researchers usually coincide (there is a relative uniformity in defining
the kinds of rules – inheritance, aggregation, etc.). Sometimes, however, they diverge
due to differences in the perceived semantics of the relations and the metadata elements. In this regard, the majority of the rules used by the researchers are considered
to be heuristic rules, since they are not mathematical propositions that hold universally,
but results of experience. For example, in [19] the educational context of an object is considered to be the same as that of an object that is part of it,
while in [2] such an inference is not adopted.
Concluding the above discussion, the need for a generic, common framework
methodology to define a guided process for the identification of the complete set of
inference rules suggesting the metadata of a resource based on the metadata of its related resources becomes obvious. Our work stems from the observation that the process of defining such rules must follow a well-formed theoretical construction based on
the semantics of the relations connecting the resources. For this reason, we propose a
generic methodology for identifying inference rules, as a result of existing relations
between resources, so as, initially, to enrich the existing relations by identifying new
(implicit) ones, and, finally, to propagate existing metadata to the metadata of related
resources. Such a methodology can be applied regardless of the metadata schema and
the semantics of the relations used.
3 The Proposed Methodology
The identification of inference rules that generate metadata values of a resource by exploiting a net of existing relations between resources is a process that can be followed in four consecutive steps. First, the interrelated properties of the resources connected with a certain relation are located and presented in tabular form. Next, inference rules for generating new implicit relations between the resources are created.
The third step is the mapping of the located interrelated properties of the resources to metadata elements of
the metadata schema in use. The process concludes with specifying the
influence type of the metadata values in order to derive the exact inference rules for metadata generation.
3.1 Locating Connection Features of Related Resources
In most cases, the semantics of a relation connecting two resources is defined in free
text describing the concept of relating two resources with this relation, or is implied
from the meaning of the verb or the noun used to specify the relation with no further
explanations. Thus, the adoption of logic inferences regarding interconnected properties of the resources is a highly demanding intellectual task. In order to come up with
such inferences, one has to locate the interrelated properties of the resources connected with a relation that specify this connection on the basis of similarities or differences. In the rest of this paper these properties are called “connection features”.
Connection features may be stated explicitly in the definition of the semantics of a
relation. However, in other cases, connection features may be implied. For example,
the definition of the semantics of the relation “IsVersionOf” of Dublin Core ([5],
[24]) clearly highlights the connection features “Format” and “Creator” (the related
resources have the same format and the same creator), whereas one can presume the
connection feature “Topic area” (different versions of a resource belong to the same
topic area). The interrelation of the connection features of a relation does not, necessarily, take the form of equality. It is possible that a relation defines a certain type of
differentiation, e.g. the relation “IsLessSpecificThan” in [10] defines the connection
feature “Level of details” and requires that the value of this property of a resource is
lower than the same property’s value of its related resource. Connection features are
eventually becoming the means to extract rules specifying the value of metadata of a
resource based on the values of metadata of its related resources.
Locating the connection features requires human
intelligence. The semantics of a relation, as well as the kind of resources it connects, may allow wide variation in the range of possible interrelated
properties of the related resources, so that every different interpretation falls into the category
of heuristics.
Apart from relations referring to semantic characteristics of the resources they connect, structural relations (part – whole relations) connecting the related resources are
also included in the definition of connection features. For such
relations, the connection features “superset” (for whole relations) and “subset” (for part relations) are defined. This definition is based on the observation that a resource containing another
(whole relation) is connected to it with the property of superset (it is a superset of its
related resource), while a resource being part of another (part relation) is connected to
it with the property of subset (it is a subset of its related resource).
It is obvious that some relations have such semantics that it may be impossible to
locate even one connection feature. For example, the relation “References” of Dublin
Core is defined with no restrictions (or any other kind of interrelation) for the resources it connects – a resource may reference any other resource. Of course, any different implementation may impose its own limitations on the resources and, thus,
make the location of connection features possible, on the basis of the similarities or differences it sets. For example, it may be considered that referenced resources share
certain interrelated properties with the resources that reference them, e.g. a common
topic area.
For the visual representation of a set of relations and the connection features of
each, in order to extract logic inferences, the use of a 2-dimensional table for assisting
the identification of the inference rules is proposed. Such a table is depicted in Table
1. The rows of Table 1 consist of selected relations from Dublin Core, [4], [10] and
[17], while its columns contain some of their connection features. Common properties
of the two related resources, which are stated in the definition of the semantics of
each relation, as well as properties influenced by each other in a specific way are included in the connection features. Hence, for the relations “IsFormatOf” of Dublin
Core, “IsTranslationOf” in [4] and “HasTextAlternative” in [17], the common connection feature “Intellectual content” is marked. For the relation “IsLessSpecificThan” in [10] the connection feature “Level of details” is marked, since the values of
this property of the two connected resources with this relation are influenced by each
other in a specific way (one lower than the other). Moreover, for the relation “IsNarrowerThan” in [10] the connection feature “Taxonomic system” is marked, in the
sense that resources connected with this relation have their position in a certain taxonomic system influenced by each other in a specific way (one at a higher level than
the other). The inclusion of the connection features of a relation in Table 1 facilitates
checking for possible connections between every stated connection feature and all the
other relations of the set of relations under consideration. In this regard, for the relation “HasTextAlternative” in [17], the connection features “Topic area”, “Taxonomic
system” and “Level of details” are also marked, since equality of “Intellectual content” means equality of these properties, as well. (Equality of “Taxonomic system”
implies the equality of the positions the two related resources hold in a certain taxonomic system that classifies resources based on properties of their content).
Table 1. Relations and connection features
Relation σ (rows): IsVersionOf, IsFormatOf, IsReferencedBy, References, HasPart, IsPartOf, IsNarrowerThan, IsLessSpecificThan, IsTranslationOf, HasTextAlternative
Connection feature (columns): Intellectual content, Superset, Subset, Format, Topic area, Taxonomic system, Level of details, Creator
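A table of this kind can also be held as a simple data structure. The following is a minimal sketch in Python, assuming a dictionary keyed by relation name. It encodes only the marks that are spelled out in the text of this subsection, not the full table, and the names CONNECTION_FEATURES and relations_using are illustrative.

```python
# Hypothetical encoding of a relation / connection-feature table.
# Only the marks explicitly stated in the running text are included;
# the remaining cells of the table are not reproduced here.
CONNECTION_FEATURES = {
    "IsVersionOf": {"Format", "Creator", "Topic area"},
    "IsFormatOf": {"Intellectual content"},
    "IsTranslationOf": {"Intellectual content"},
    "HasTextAlternative": {
        "Intellectual content",
        "Topic area",
        "Taxonomic system",
        "Level of details",
    },
    "IsLessSpecificThan": {"Level of details"},
    "IsNarrowerThan": {"Taxonomic system"},
    "HasPart": {"Superset"},
    "IsPartOf": {"Subset"},
    "References": set(),  # no connection feature under the Dublin Core definition
}


def relations_using(feature, table=CONNECTION_FEATURES):
    """Return the relations (sorted by name) whose row marks the given feature."""
    return sorted(rel for rel, feats in table.items() if feature in feats)


if __name__ == "__main__":
    print(relations_using("Intellectual content"))
```

Reading the structure by column, as relations_using does, is what the later steps rely on: a connection feature is first looked up, and the relations that mark it become candidates for rules.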
3.2 Creation of inference rules for generating new implicit relations
The relations connecting the members of a set of resources to each
other can possibly be enriched by applying certain inference rules which generate
new implicit relations from the already existing ones. This is a process of two substeps:
Firstly, a number of rules for the generation of new relations are created by exploiting the mathematical properties of the relations. Since the relations in question are all
binary relations (having two arguments), they are equipped, as the case may be, with
common properties of binary relations (concepts of set theory in Mathematics). A relation σ, according to its semantics, may be reflexive (for all resources x, it holds, x σ
x), symmetric (for all resources x, y: if x σ y then y σ x), transitive (for all resources
x, y, z: if x σ y and y σ z then x σ z). Two relations σ1 and σ2 are inverse if for all resources x, y: if x σ1 y then y σ2 x. For example, transitivity in the relation “IsPartOf”
of Dublin Core leads us to the rule: «If resource a “IsPartOf” resource b and resource
b “IsPartOf” resource c, then resource a “IsPartOf” resource c».
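The first substep can be sketched as a closure computation over a set of (resource, relation, resource) triples. The sketch below is a minimal Python illustration under our own assumptions: the function name infer_relations, the triple representation, and the way properties are declared (a dictionary of inverse pairs and sets of symmetric and transitive relation names) are hypothetical and do not reproduce any particular inference engine.

```python
def infer_relations(triples, inverse=None, symmetric=(), transitive=()):
    """Return the closure of `triples` under the declared relation properties.

    triples    -- iterable of (subject, relation, object) tuples
    inverse    -- dict mapping a relation name to the name of its inverse
    symmetric  -- collection of relation names declared symmetric
    transitive -- collection of relation names declared transitive
    """
    inverse = inverse or {}
    facts = set(triples)
    changed = True
    while changed:  # repeat until no new relation is produced
        new = set()
        for (x, rel, y) in facts:
            if rel in inverse:  # x rel y  =>  y inverse(rel) x
                new.add((y, inverse[rel], x))
            if rel in symmetric:  # x rel y  =>  y rel x
                new.add((y, rel, x))
            if rel in transitive:  # x rel y and y rel z  =>  x rel z
                for (y2, rel2, z) in facts:
                    if rel2 == rel and y2 == y:
                        new.add((x, rel, z))
        changed = not new <= facts
        facts |= new
    return facts


if __name__ == "__main__":
    known = {("a", "IsPartOf", "b"), ("b", "IsPartOf", "c")}
    closed = infer_relations(
        known,
        inverse={"IsPartOf": "HasPart", "HasPart": "IsPartOf"},
        transitive={"IsPartOf", "HasPart"},
    )
    for fact in sorted(closed - known):
        print(fact)
```

For the two explicit “IsPartOf” relations of the example, the closure adds the implicit relation a “IsPartOf” c together with the three corresponding “HasPart” relations.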
Secondly, one takes advantage of the connection features defined in the previous
step to create rules for getting new relations by means of “relation transfer”. Relation
transfer states that if resource a is connected to resource b with relation σ, then it is
also connected with the same relation to all other resources connected to b through a
certain connection feature (the relation is transferred to them). For example (using
Dublin Core / LOM relations), the relation “IsReferencedBy” connecting resource a
to resource b can be transferred to all resources sharing the same “Intellectual content” with b (such as all resources related to each other with the relation “IsFormatOf”). The resulting rule is: «If a “IsReferencedBy” b and b “IsFormatOf” c Then a
“IsReferencedBy” c». Propositions of this sort can be depicted in a table similar to
Table 1 with the symbol “v”. Symbol “v” in Table 2 depicts the proposition: «Resource a is connected to resource b with relation σ, as well as to any other resource connected to b through the connection feature appearing in the respective column of the
table».
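Relation transfer can be illustrated with a small Python sketch. It assumes, as a simplification of ours, that each transferable relation σ is paired with the set of "carrier" relations that realise the connection feature in question (here “IsFormatOf” for “Intellectual content”). The names transfer_relations and transfer_rules are hypothetical, and the function performs a single pass rather than iterating to a fixed point.

```python
def transfer_relations(triples, transfer_rules):
    """Derive new relations by relation transfer (single pass).

    triples        -- iterable of (subject, relation, object) tuples
    transfer_rules -- dict mapping a transferable relation sigma to the set of
                      carrier relations through which sigma is transferred
    Returns only the newly derived triples.
    """
    facts = set(triples)
    derived = set()
    for (a, sigma, b) in facts:
        carriers = transfer_rules.get(sigma, ())
        for (b2, carrier, c) in facts:
            # a sigma b and b carrier c  =>  a sigma c
            # (assumption: a resource is not related to itself this way)
            if b2 == b and carrier in carriers and c != a:
                derived.add((a, sigma, c))
    return derived - facts


if __name__ == "__main__":
    facts = {("a", "IsReferencedBy", "b"), ("b", "IsFormatOf", "c")}
    rules = {"IsReferencedBy": {"IsFormatOf"}}
    print(transfer_relations(facts, rules))
```

With the two explicit relations of the example, the sketch yields the single implicit relation a “IsReferencedBy” c.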
Table 2. Relations and connection features enriched with propositions for relation transfer.
Relation σ (rows): IsVersionOf, IsFormatOf, IsReferencedBy, References, HasPart, IsPartOf, IsNarrowerThan, IsLessSpecificThan, IsTranslationOf, HasTextAlternative
Connection feature (columns): Intellectual content, Superset, Subset, Format, Topic area, Taxonomic system, Level of details, Creator
3.3 Mapping Connection Features to Metadata Elements
The connection features, regarded as properties of resources, can be mapped to certain
metadata elements of the schema used for describing the resources. The interrelation
of the connection features of two resources (through the relation they are connected
with) is translated into the interrelation of their respective metadata elements. Thus,
considering the LOM metadata schema, the connection feature “Intellectual content” is
mapped to metadata elements which are affected by the content of the resources (such
as “1.4 General.Description” and “1.5 General.Keyword”). Consequently, resources
connected to each other with a relation that uses this connection feature (such as “IsFormatOf” which prescribes that the connected resources have the same intellectual
content), will have their respective metadata elements interrelated the same way as
their connection features (i.e. equal). In this regard, a metadata element which can be
mapped to the connection feature “superset” is “1.3 General.Language”, in the sense
that the value of this metadata element of a resource which is superset of another
(they are connected with the relation “HasPart”) can be affected by the value of this
element of the related resource.
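The mapping step can be pictured as a lookup from connection features to metadata elements, chained with the lookup from relations to connection features. The following Python sketch is a hypothetical illustration that contains only the mappings named in this subsection; the dictionary and function names are our own.

```python
# Illustrative fragments only: the entries below are the ones named in the text.
RELATION_FEATURES = {
    "IsFormatOf": ["Intellectual content"],
    "HasPart": ["Superset"],
}

FEATURE_ELEMENTS = {
    "Intellectual content": ["1.4 General.Description", "1.5 General.Keyword"],
    "Superset": ["1.3 General.Language"],
}


def interrelated_elements(relation):
    """Return the metadata elements interrelated through the given relation."""
    elements = []
    for feature in RELATION_FEATURES.get(relation, []):
        for element in FEATURE_ELEMENTS.get(feature, []):
            if element not in elements:  # keep order, avoid duplicates
                elements.append(element)
    return elements


if __name__ == "__main__":
    print(interrelated_elements("IsFormatOf"))
    print(interrelated_elements("HasPart"))
```

The chained lookup reflects the reasoning of the text: a relation uses a connection feature, the feature is mapped to elements, and therefore those elements of the two related resources are interrelated.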
3.4 Specifying the influence type of the metadata elements’ values
An inference rule generating the value of a metadata element exploits the relations
that use the connection features corresponding to the element, as its conditions part.
For each and every one of these relations a single rule is created. For example, for the
element “1.4 General.Description” of LOM three rules are created having as their
conditions part the relations: “HasFormat” and “IsFormatOf” (due to the mapping to
the connection feature “Intellectual content”) and “HasPart” (due to the mapping to
the connection feature “Superset”). The actions part of every rule deals with specifying the exact type of influence for the metadata element values. The value of a metadata element may be influenced in one of the following three ways:
• Inclusion of metadata values from related resources, according to which a resource’s metadata element values (with cardinality greater than 1) are added (included) to the values of the same metadata element of a related resource. A resource can include metadata values from its subsets as a result of a whole relation that connects it to its parts (the inclusion relations of the resources are transferred to their metadata elements).
• Computation of metadata value from metadata values of related resources. The
metadata element value of a resource is the result of a mathematical or logic expression (which has to be specified) of metadata element values of related resources.
• Restriction of the range of values of a metadata element, according to which the
range of values of a metadata element of a resource is not the complete value
space defined by the specification of the standard, but a proper subset of it computed from the values of the same metadata element of related resources (the
exact expression has to be specified).
The first two types of influence automatically generate metadata values, while the
third one facilitates the task of manual indexing. Thus, the rules are formatted as: «If
resource a is related to resource b with relation σ, Then the value of the metadata element m of a is determined by the value of metadata element m of b according to one
of (the above) three defined types of influence».
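The three types of influence can be sketched as three small functions. This is a minimal Python illustration under stated assumptions: the function names are hypothetical, and the concrete expressions used in the usage lines (a sum of durations in minutes, an upper bound on a five-point scale) are placeholders for the expressions that, as noted above, have to be specified for each rule.

```python
def include_values(own_values, related_values):
    """Inclusion: add the values of a related resource to a multi-valued element."""
    result = list(own_values)
    for value in related_values:
        if value not in result:  # keep existing order, avoid duplicates
            result.append(value)
    return result


def compute_value(related_values, expression):
    """Computation: apply a caller-supplied expression to the related values."""
    return expression(related_values)


def restrict_range(value_space, is_allowed):
    """Restriction: keep only the part of the value space that is still allowed."""
    return [value for value in value_space if is_allowed(value)]


if __name__ == "__main__":
    # Inclusion: a whole collects the languages of one of its parts.
    print(include_values(["en"], ["en", "el"]))
    # Computation: hypothetical expression, a sum over the values of the parts.
    print(compute_value([30, 45], sum))
    # Restriction: hypothetical five-point scale limited to values below 4.
    print(restrict_range(range(1, 6), lambda v: v < 4))
```

The first two functions return values that can be stored directly, whereas the third only narrows the choices offered to a human indexer, which matches the distinction drawn above between automatic generation and assisted manual indexing.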
4 Application of the proposed methodology to LOM
In order to apply the proposed methodology to a certain metadata schema, the first
thing one has to take into account is the set of relations defined by the schema. The
relations defined by LOM are directly adopted from the Dublin Core metadata set ([5],
[24]). However, as noted in [10] and [23], the semantics of the relations of Dublin
Core suggest a certain semantic perspective for the relations between general resources and documents, which mainly serves the administrative needs of librarians.
The way Dublin Core relations are defined cannot serve the needs of an educational
environment where learning objects described with the LOM standard will be used.
Thus, a slight modification to the semantics of these relations is a first necessary step
before proceeding to the application of the proposed methodology. Our approach has
been greatly influenced by the definition of the semantics for the six pairs of LOM
relations proposed by [10].
The first step of the methodology requires locating connection features for the relations and visually representing the set of relations and the connection features of each
using a 2-dimensional table. Considering the semantics of the relations in [10] (preserving, though, the relation “HasFormat”), we come up with Table 3.
Table 3. Relations and connection features for LOM
Relation σ (rows): HasVersion, IsVersionOf, HasFormat, IsFormatOf, IsReferencedBy, References, Requires, IsRequiredBy, IsBasedOn, IsBasisFor, HasPart, IsPartOf
Connection feature (columns): Intellectual content, Superset, Subset, Format, Topic area, Content base, Understanding
In step No 2 of the methodology, for the generation of new implicit relations, the
mathematical properties of the relations are exploited in the first substep. Thus, we
come up with inference rules like «If learning object a “Requires” learning object b,
then learning object b “IsRequiredBy” learning object a» (inverse relations), or «If a
“HasPart” b and b “HasPart” c, then a “HasPart” c» (transitivity). In the second substep we enrich Table 3 with propositions for “relation transfer”, coming up with Table
4.
Table 4. Relations and connection features enriched with propositions for relation transfer
Relation σ (rows): HasVersion, IsVersionOf, HasFormat, IsFormatOf, IsReferencedBy, References, Requires, IsRequiredBy, IsBasedOn, IsBasisFor, HasPart, IsPartOf
Connection feature (columns): Intellectual content, Superset, Subset, Format, Topic area, Content base, Understanding
Interpreting the symbol “v” as defined in subsection 3.2, we obtain several inference
rules like «If a “IsRequiredBy” b and b “IsFormatOf” c, then a “IsRequiredBy” c».
Table 4 shows that the connection features that can be used to transfer a relation are
“Intellectual content” and “Subset”. This is because a relation can be transferred to
resources having the same intellectual content, or to resources containing the connected
resource with the transferable relation (as their common subset).
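A relation-transfer rule of the form «If a R b and b S c, then a R c» can be sketched in Python as follows. The encoding is hypothetical: TRANSFER_PAIRS and transfer_relations are names introduced for illustration, and only the single pair given as an example in the text is listed, since the full set of “v” marks would have to be read from Table 4.

```python
# Hypothetical encoding of the "v" marks: a pair (R, S) states that relation R
# can be transferred across relation S. Only the example pair from the text is
# included; the complete set is an open assumption.
TRANSFER_PAIRS = {("IsRequiredBy", "IsFormatOf")}


def transfer_relations(facts, transfer_pairs=TRANSFER_PAIRS):
    """Return the new triples (a, R, c) derived from a R b and b S c.

    `facts` is an iterable of (subject, relation, object) triples. A single
    pass is made; call the function again on the enlarged set to chain rules.
    """
    facts = set(facts)
    derived = set()
    for (a, rel, b) in facts:
        for (b2, via, c) in facts:
            if b2 == b and (rel, via) in transfer_pairs and c != a:
                derived.add((a, rel, c))
    return derived - facts
```

Keeping the transferable pairs in a data structure, rather than hard-coding one function per rule, mirrors the way the table drives rule generation in the methodology.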
In step No 3 of the methodology, we map the connection features to the metadata
elements of the LOM schema. For example, the connection feature “Superset” can be
mapped to LOM metadata elements expressing properties of learning objects which
can influence the same properties of an object that contains them (as their superset),
like “1.3 General.Language”, “1.5 General.Keyword”, “4.1 Technical.Format”, “5.9
Educational.Typical learning time”, “6.1 Rights.Cost”, etc. Similarly, the connection
feature “Understanding” can be mapped to the LOM metadata elements that can
be influenced by the users' understanding of the learning objects,
which are “5.6 Educational.Context”, “5.7 Educational.Typical age range” and “5.11
Educational.Language”.
Next, in step No 4, in order to get the conditions part of a rule suggesting the value
of a metadata element, we create a single rule for each of the relations
that use the connection feature corresponding to the element. For example, for the
metadata element “1.3 General.Language” we create two rules having as their conditions
part the relations “HasPart” and “IsPartOf”, since we mapped the connection features
“Superset” and “Subset” to this metadata element: the value of “1.3 General.Language”
of a learning object can influence the value of this metadata element of a learning
object that either contains or is part of the first one.
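Steps 3 and 4 amount to two lookups: from a relation to the connection features it uses, and from a connection feature to the metadata elements mapped to it. The Python sketch below illustrates this under stated assumptions. The dictionary and function names are hypothetical, the element lists contain only the examples given in the text, and the pairing of relations with features (for instance “HasPart” with “Superset”) is an assumed reading, since the authoritative marks are those of Table 3.

```python
# Assumed relation -> connection feature pairing (Table 3 is authoritative).
FEATURES_OF_RELATION = {
    "HasPart": {"Superset"},
    "IsPartOf": {"Subset"},
    "Requires": {"Understanding"},
    "IsRequiredBy": {"Understanding"},
}

# Connection feature -> LOM elements, limited to the examples in the text.
ELEMENTS_OF_FEATURE = {
    "Superset": {
        "1.3 General.Language",
        "1.5 General.Keyword",
        "4.1 Technical.Format",
        "5.9 Educational.Typical learning time",
        "6.1 Rights.Cost",
    },
    "Subset": {"1.3 General.Language"},
    "Understanding": {
        "5.6 Educational.Context",
        "5.7 Educational.Typical age range",
        "5.11 Educational.Language",
    },
}


def rule_conditions(element):
    """Return the relations that form the conditions part of rules for `element`.

    One rule is created per returned relation.
    """
    return sorted(
        relation
        for relation, features in FEATURES_OF_RELATION.items()
        if any(element in ELEMENTS_OF_FEATURE.get(f, set()) for f in features)
    )
```

For “1.3 General.Language” the lookup yields the two relations “HasPart” and “IsPartOf”, matching the two rules described in the text.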
Finally, to complete a rule, we have to specify the exact type of influence on the
metadata element values. Thus, considering the above examples, if learning object a
contains (“HasPart”) learning object b, the value of the metadata element “1.3
General.Language” of b will be included (“inclusion of metadata values” type of
influence) in the values of the same metadata element of a. Following
the same logic, if a learning object “IsPartOf” two or more learning objects, then the
range of values of its language will be restricted (“restriction of the range of values”
type of influence) to the intersection of the values of the languages of its supersets.
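The two types of influence described for “1.3 General.Language” correspond to set union and set intersection. A minimal Python sketch, in which the function names and the representation of languages as sets of language codes are assumptions made for illustration:

```python
def include_values(own_values, part_values):
    """'Inclusion of metadata values': a container's languages include
    every language of the objects it contains ("HasPart")."""
    result = set(own_values)
    for values in part_values:
        result |= set(values)
    return result


def restrict_values(container_values):
    """'Restriction of the range of values': the languages of an object that
    "IsPartOf" several containers lie within the intersection of the
    containers' languages. Returns None when there is no container,
    meaning no restriction applies."""
    sets = [set(values) for values in container_values]
    if not sets:
        return None
    return set.intersection(*sets)
```

Inclusion yields values that are certain to be present, whereas restriction only narrows the candidates, so a rule engine would treat the second result as a range to choose from rather than a final value.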
Some more examples:
• If a learning object “Requires” others, then the typical age range of its intended
user (i.e. “5.7” metadata element) will be greater than the maximum typical age
range of the objects it requires (“restriction of the range of values” type of influence).
• If a learning object “IsRequiredBy” another one, then the human language used by
the typical intended user of this object (i.e. “5.11” metadata element) will be the
same as the corresponding language of the object that requires it (“computation
of metadata value” type of influence).
• If a learning object “IsPartOf” other ones, then the value of its cost (i.e. the
“6.1” metadata element) is “No” if at least one of the container objects
has no cost (“computation of metadata value” type of influence).
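Two of the example rules above (cost and user language) can be sketched in Python as follows. The age-range rule is left out because it depends on how age ranges are represented. The function name, the dictionary layout of the metadata, and the lowercase value "no" are assumptions made for illustration; where several requiring objects disagree on the language, the sketch makes no suggestion, which is also an assumption.

```python
def apply_example_rules(obj_id, metadata, facts):
    """Suggest values for elements "6.1" and "5.11" of `obj_id`.

    `metadata` maps object ids to {element number: value} dictionaries and
    `facts` is an iterable of (subject, relation, object) triples.
    """
    suggestions = {}

    # Cost rule: the object is free if at least one of its containers is free.
    containers = [b for (a, rel, b) in facts if a == obj_id and rel == "IsPartOf"]
    if any(metadata.get(c, {}).get("6.1") == "no" for c in containers):
        suggestions["6.1"] = "no"

    # Language rule: adopt the user language of the requiring object.
    requirers = [b for (a, rel, b) in facts if a == obj_id and rel == "IsRequiredBy"]
    languages = {
        metadata[r]["5.11"] for r in requirers if "5.11" in metadata.get(r, {})
    }
    if len(languages) == 1:  # assumption: only suggest when unambiguous
        suggestions["5.11"] = next(iter(languages))

    return suggestions
```

The function returns suggestions instead of writing into the metadata record, so that generated values can be reviewed before they are stored.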
5 Conclusion – Future work
In this paper, we presented a step-by-step methodology that can guide the process of
identifying the complete set of inference rules for generating metadata of a resource,
based on the metadata of its related resources. This process is integrated in a
well-defined theoretical construction, the foundation of which is the semantics of the
relations connecting the resources to be indexed. At present, work is ongoing to
apply the methodology to produce a complete set of rules for the LOM metadata
schema, for indexing learning resources to be used for creating e-learning courses at
the Department of Applied Informatics of the University of Macedonia, Greece.
Furthermore, we are working on developing an Integrated Metadata Management System
(IMMS) that incorporates the abilities for editing, storing, retrieving and
automatically generating new metadata. Metadata generation is supported by the use of a
cooperating rule engine.
References
1. Bourda, Y., Doan, B-L., Kekhia, W. A semi-automatic tool for the indexation of learning objects. In Proceedings of World Conference on Educational Multimedia, Hypermedia and Telecommunications 2002 (pp. 190-191). Chesapeake, VA: AACE.
2. Brase, J., Painter, M., Nejdl, W. Completion Axioms for learning object metadata - Towards a formal description of LOM. In 3rd international conference on advanced learning technologies (ICALT). Athens, Greece, July 2003.
3. Cardinaels, K., Meire, M., Duval, E. Automating metadata generation: the simple indexing interface. Proceedings of the 14th international conference on World Wide Web, May 10-14, 2005, Chiba, Japan.
4. CEN/ISSS WS/LT Learning Technologies Workshop. SC 36 Doc No 36N0789: CWA 14645:2003. Availability of alternative language versions of a learning resource in IEEE LOM.
5. DCMI Metadata Terms. [On-line]. Available from http://dublincore.org/documents/dcmi-terms/. Accessed 15 March 2007.
6. Doan, B-L., Bourda, Y. Defining several ontologies to enhance the expressive power of queries. In volume 143 of CEUR, workshop Proceedings, on Interoperability of web-based Educational Systems, held in conjunction with WWW’05 conference, Chiba, Japan, May 2005. Technical University of Aachen (RWTH).
7. Duval, E. Metadata Standards: What, Who & Why? Journal of Universal Computer Science, Vol. 7, no 7, 2001, pp. 591-601.
8. Duval, E., Hodgins, W. Making metadata go away - hiding everything but the benefits. In DC-2004: Proceedings of the International Conference on Dublin Core and Metadata Applications, pp. 29-35.
9. e-Learning Consortium. Making sense of learning specifications and standards: A decision maker’s guide to their adoption (2e). The Masie Centre. [On-line]. Available from http://www.masie.com/standards/s3-2nd-edition.pdf. Accessed 15 March 2007.
10. Engelhardt, M., Hildebrand, A., Lange, D., Schmidt, T. C. Reasoning about eLearning Multimedia Objects. In J. Van Ossenbruggen, G. Stamou, R. Troncy, V. Tzouvaras (Ed.) Proc. of WWW 2006, Intern. Workshop on Semantic Web Annotations for Multimedia (SWAMM).
11. Friesen, N. (2004). International LOM Survey: Report (Draft). Available from http://dlist.sir.arizona.edu/403/01/LOM_Survey_Report2.doc. Accessed 15 March 2007.
12. Greenberg, J., Spurgin, K., Crystal, A. AMeGA (Automatic Metadata Generation Applications) Project, University of North Carolina, 2005.
13. Greenberg, J., Spurgin, K., Crystal, A. Functionalities for Automatic-Metadata Generation Applications: A Survey of Metadata Experts’ Opinions. International Journal of Metadata, Semantics and Ontologies. Vol. 1, No. 1, 2006.
14. Hatala, M., Richards, G. Value-added metatagging: Ontology and rule based methods for smarter metadata. In RuleML, pp. 65-80, 2003.
15. Horton, W., Horton, K. E-learning Tools and Technologies. Indianapolis: Wiley Publishing, 2003.
16. IEEE. 1484.12.1 (2002). Draft Standard for Learning Object Metadata. Learning Technology Standards Committee of the IEEE. [On-line]. Available from http://ltsc.ieee.org/wg12/files/LOM_1484_12_1_v1_Final_Draft.pdf. Accessed 15 March 2007.
17. Karampiperis, P., Sampson, D. Learning object metadata for learning content accessibility. In: World Conference on Educational Multimedia, Hypermedia and Telecommunications (EDMEDIA). Volume 2004, Lugano, Switzerland (2004), pp. 5204-5211.
18. Liddy, E. D., Allen, E., Harwell, S., Corieri, S., Yilmazel, O., Ozgencil, N. E., Diekema, A., McCracken, N. J., Silverstein, J., Sutton, S. A. Automatic metadata generation & evaluation. Proceedings of the 25th Annual International ACM SIGIR Conference on Research, 2002.
19. Motelet, O. Relation-based heuristic diffusion framework for LOM generation. Proc. of 12th International Conference on Artificial Intelligence in Education AIED 2005 - Young Researcher Track, Amsterdam, Holland.
20. Najjar, J., Ternier, S., Duval, E. (2003). The Actual Use of Metadata in ARIADNE: an Empirical Analysis. In Proceedings of ARIADNE Conference 2003.
21. Sicilia, M.A., García-Barriocanal, E., Pagés, C., Martínez, J.J., Gutiérrez, J.M. (2005). Complete metadata records in learning object repositories: some evidence and requirements. International Journal of Learning Technology, 1(4), pp. 411-424.
22. Sintek, M., Decker, S. TRIPLE - An RDF Query, Inference, and Transformation Language, DDLP, Japan, October 2001. [On-line]. Available from http://triple.semanticweb.org/doc/ddlp2001/TripleReport.pdf. Accessed 15 March 2007.
23. Steinmetz, R., Seeberg, C. (2003). Meta-Information for Multimedia eLearning. Computer science in perspective, Springer-Verlag New York, Inc.
24. Using Dublin Core - The Elements. [On-line]. Available from http://dublincore.org/documents/usageguide/elements.shtml. Accessed 15 March 2007.
25. Wason, T. Dr. Tom’s Meta-Data Guide. [On-line]. Available from http://www.twason.com/drtommeta.html. Accessed 15 March 2007.
26. Wiley, D. The instructional use of learning objects. Agency for Instructional Technology and the Association for Educational Communications and Technology, 2000.
27. World Wide Web Consortium. OWL Web Ontology Language Reference. [On-line]. Available from http://www.w3.org/TR/2004/REC-owl-ref-20040210/. Accessed 15 March 2007.
