PERICLES - Promoting and Enhancing Reuse of Information throughout the Content Lifecycle taking account of Evolving Semantics
[Digital Preservation]

DELIVERABLE 3.2
LINKED RESOURCE MODEL

GRANT AGREEMENT: 601138
SCHEME FP7 ICT 2011.4.3
Start date of project: 1 February 2013
Duration: 48 months

Project co-funded by the European Commission within the Seventh Framework Programme (2007-2013)

Dissemination level
PU    PUBLIC
PP    Restricted to other PROGRAMME PARTICIPANTS (including the Commission Services)
RE    RESTRICTED to a group specified by the consortium (including the Commission Services)
CO    CONFIDENTIAL only for members of the consortium (including the Commission Services)
X

Revision History
V#       Date          Description / Reason of change                                            Author
V0.9     03/06/14      Initial Draft                                                             Xerox
v1.0a    25/06/2014    alpha release. Integrates feedbacks and contributions by KCL and CERTH    Xerox
v1.0     30/06/2014    Final draft                                                               Xerox
v1.1     22/07/2014    Version integrating feedback from PERICLES internal reviewers             Xerox CERTH
v1.2     31/07/2014    Final version                                                             Xerox

Authors and Contributors

Authors
Partner    Name
Xerox      Jean-Yves Vion-Dury
Xerox      Nikolaos Lagos
Xerox      Jean-Pierre Chanod

Contributors
Partner    Name
KCL        Simon Waddington
CERTH      Efstratios Kontopoulos

© PERICLES Consortium

Contents

Glossary
1 Executive Summary
2 Introduction & Rationale
2.1 Context of this Deliverable Production
2.2 What to expect from this Document
3 Rationales and Guiding Principles
4 State of The Art
4.1 Aims and objectives
4.2 Generic properties
4.3 Preservation
4.4 Systems and software
4.5 Probabilistic notions
4.6 Policy
4.7 Discussion
5 Detailed Description of the LRM
5.1 Ontology Preamble, Namespaces
5.2 Digital Resource and associated Descriptors
5.3 Basic Metadata and Properties associated with PERICLES Digital resources
5.4 Dependencies
5.5 Giving semantics to dependencies
5.6 Operators
5.7 Ontology Metrics
6 LRM Primer
6.1 Creating Digital Resources
6.2 Attaching Descriptions to Digital Resources
6.3 Creating Dependencies
6.4 Creating Plans
6.5 Representing Operators
6.6 Deploying PROV Constructs
6.6.1 Activities
6.6.2 Activity Roles
6.7 Domain-specific LRM Example
6.7.1 Extending the LRM
6.7.2 Navigating the Ontology
7 Conclusion and Future Work

Figures

Figure 1 Relationship of the two core LRM classes (Digital-resource and Dependency) with the prov:Entity class.
Figure 2 Dual notions of dependency and change
Figure 3 Visual representation of a digital resource.
Figure 4 Visual representation of a digital resource’s description.
Figure 5 Visual representation of a dependency
Figure 6 Visual representation of a dependency
Figure 7 Visual representation of Tate items hierarchy (currently includes only Software-based artworks).
Figure 8 Visual representation of the specialization of the location descriptor and the identity for the Tate domain.
Figure 9 Hierarchy of prov:Entity subclasses
Figure 10 Visualization of the software-based artwork sample instance.
Figure 11 The “Brutalism” software-based artwork viewed with the “Ontology Browser”.

Tables

Table 1 Ontology metrics generated by Protégé.
Table 2 Class axioms metrics
Table 3 Object property axioms metrics.
Table 4 Data property axioms metrics.

Glossary

Abbreviation / Acronym    Meaning
LRM     Linked Resource Model
PROV    W3C Recommendation (OWL ontology). “It provides a set of classes, properties, and restrictions that can be used to represent and interchange provenance information generated in different systems and under different contexts. It can also be specialized to create new classes and properties to model provenance information for different applications and domains.” [1]
OWL     Web Ontology Language (W3C)
W3C     World Wide Web Consortium

1 Executive Summary

The current document introduces PERICLES deliverable 3.2: the Linked Resource Model. The Linked Resource Model (LRM) is an OWL ontology that can be used to model dependencies between digital resources handled by the PERICLES tools. This document is a companion to the ontology: it explains the context and the guiding principles behind the LRM, and gives indications about its usage.

The LRM views digital ecosystem entities (data, metadata, policies, processes) as a set of evolving linked resources, where typed semantics enable one to describe the dependencies among heterogeneous resources. The main objective of the current LRM is to provide a principled way of modelling digital resources and their dependencies in PERICLES, which in turn should contribute to describing evolving digital ecosystems.

To this end, the LRM formally requires that each digital resource have a physical extension (i.e. be physically located somewhere) and be represented in the model through a unique id.
There can be a number of links among digital resources, representing different types of connection (e.g. simple provenance information, but also causality). The aim of the LRM is to allow such links to be modelled as dependencies among the digital resources when required, e.g. when they enable us to represent change within the preservation environments. In that sense the LRM is developed as a domain-independent meta-model. The LRM will be used to provide fundamental, well-defined notions to the domain-specific models developed in WP2 and WP4, which in turn will represent specific application and domain needs.

Dependencies in the LRM can be complex constructs, departing from the simple view of directed links adopted in other models. First of all, we discovered that what distinguishes a dependency semantically from a simple link is that its semantics are tightly connected to the underlying usage intention, so the LRM provides specific classes to describe such information. Secondly, dependencies should not only convey information related to the past (e.g. a file was produced by a specific piece of software) but also model use of the data in the future, which may or may not require use of the application that created it. Finally, dependencies should describe information related to the dynamics of digital resources, including the preconditions (when must the propagation of a change be triggered?) and the impact (how will dependent resources be affected?) of a dependency. The LRM provides concepts and mechanisms that can be used to model the above, as explained in the main body of the document.

To illustrate how the LRM can be used as the basis for domain-specific extensions, an LRM primer is provided in this document as well as an example related to one of the project use cases. Future deliverables (i.e.
D2.3.2 “Data survey and domain ontologies for case studies” due M32 and D3.5 “Modelling contextualised semantics” due M30) will include much more detailed domain-specific ontologies extending the LRM. Furthermore, initial extensions to the LRM meta-model presented here are being developed to satisfy the needs of the diverse approaches that may be adopted to calculate the impact of changes (i.e. a preliminary model of weighted dependencies based on the LRM meta-model is introduced in the D4.1 and D5.1 deliverables).

The source of the LRM is listed in extenso inside this document (the LRM is coded using the Turtle language), and can be downloaded separately as a zip archive (see [2]).

2 Introduction & Rationale

2.1 Context of this Deliverable Production

This deliverable is the first one defined for WP3, “Modelling Resource Dependencies in Evolving Ecosystems”, and as such addresses its first objective (DoW): Establish unifying models to describe heterogeneous resources and their dependencies (Linked Resources Model). This includes defining a Link Semantics, in order to discriminate, type and classify links based on their impact on the ecosystem.

As such, this deliverable focuses on a static view of the resources and their dependencies and does not yet address change in the digital ecosystem, which is planned for later stages of the project. Nevertheless, the LRM, as it is introduced in this document, has been developed with the objective of serving as a principled foundation to describe and manage change over evolving resources. Describing and managing change over evolving linked resources will be the focus of subsequent WP3 work and deliverables.

2.2 What to expect from this Document

Formally speaking, Deliverable 3.2 is an ontology. The source code is available at [2] and can be used via appropriate tools to model digital preservation systems.
LRM instantiations can be checked for well-formedness and consistency, thanks to the inherent properties of the Web Ontology Language (OWL), on which the Linked Resource Model is based. The current document per se is a companion to this ontology: it explains the context and the guiding principles behind the LRM, and gives indications about its usage as a model and meta-model.

3 Rationales and Guiding Principles

The LRM views digital ecosystem entities (digital objects, policies, processes) as a set of evolving linked resources, where typed semantics enable one to describe the dependencies among heterogeneous resources.

The LRM is foremost a model that should function as a fundamental unified view that can be used to describe dependencies in different applications related to a preservation context. The LRM introduces a link-focused view of such digital ecosystems: the change of resources is tightly connected to the links that exist between these resources, while the properties of such links are also subject to evolution.

The LRM should be understood as a domain-independent meta-model, to be eventually associated with domain-specific models that will provide the more detailed concepts needed by specific application domains (of course, the needs are quite different for modelling, say, the Space and the Art & Media ecosystems explored in WP2, even if we expect both to rely on the same fundamental notion of dependency). Examples of domain-specific extensions that use the same LRM meta-model will be included in future deliverables (i.e. D2.3.2 “Data survey and domain ontologies for case studies” due M32 and D3.5 “Modelling contextualised semantics” due M30). Nevertheless, in this report we include a domain-specific example that illustrates how the LRM could be extended for a specific domain (see LRM primer).

The LRM should be interoperable with other models that are relevant to the digital preservation area (for instance, we linked the PROV ontology (http://www.w3.org/TR/prov-o/) with the LRM to record provenance information). To this end, we decided to minimize the design assumptions and constraints, and we put significant effort and thought into reducing the core LRM classes to the essential minimum. As will be presented in detail in Section 5, these include, in addition to the Dependency class (cf. Dependencies), classes defining the entities linked via a dependency (cf. Digital Resources), as well as entities that allow creating, reading or deleting digital resources in the ecosystem (cf. Operators). We have also defined a number of properties that allow us to semantically define different dependency types (cf. Giving semantics to dependencies).

The LRM should be extensible. This is an obvious requirement, as the planned usage of the LRM is that of a fundamental ontology that should be further extended to represent specific domains and applications. A guide explaining how the LRM can be extended is included in this deliverable along with an example (cf. LRM primer).

Dependencies in the LRM should be able to capture usage intention. As we discovered during our exploratory work, one of the main semantic differences between a dependency and a link is that a dependency is always related to a usage intention; LRM dependencies therefore always convey a description, be it abstract or concrete, of the intended processing of the digital resources. Furthermore, they should be able to express n-ary oriented relations (as one resource can be dependent on several other resources). Dependencies in the LRM can therefore be complex constructs, departing from the view of dependencies as simple binary links between resources.

Another important point captured by the LRM relates to time: two very different descriptive mechanisms must coexist in order to describe either dependencies induced by past operations or dependencies involving future actions over resources of particular types, which then represent potentials rather than traces.

● For the first set of dependencies (talking about past actions), we decided to apply the minimization principles explained above, and to reuse concepts from the PROV ontology (http://www.w3.org/TR/prov-o/). PROV is a W3C recommendation for modelling provenance information. It offers a precise and rich description of resource dependencies as the result of past activities. To this end we designed the LRM digital resources as subclasses of the prov:Entity class, so that the whole PROV vocabulary can also be applied to LRM instances. Adopting the PROV constructs allows describing the provenance of any LRM resource through the standardized PROV vocabulary, while, at the same time, the provenance of dependencies can also be efficiently represented. Figure 1 illustrates the two core LRM classes (Digital-resource and Dependency) and their relationship with the PROV Entity class. The rest of the classes in the figure correspond to additional LRM constructs, which are more thoroughly described in Section 5. Note that extending the PROV ontology for deployment within the LRM was not mandatory; it was a design choice, and the adherence of the LRM to PROV can be reconsidered, if required, at some point. In particular, as part of its future work, PERICLES will explore entity models which could further enrich the LRM, esp. the Continuous Record Keeping model and its related RKMS metadata schema [3, 4].

Figure 1 Relationship of the two core LRM classes (Digital-resource and Dependency) with the prov:Entity class.

● For the second set of dependencies (talking about future actions, and therefore about potential change propagation), we decided to provide specific descriptive means, ranging from informal, text-based explanations down to formal, computer-oriented descriptions of how potential changes should be interpreted and propagated. Again, domain-specific models will play a major role at this point, and this will be part of our future investigations in PERICLES.

Last, but not least, the LRM should aid in exploring further the fascinating problem of preserving preservation systems, a concept coined as reflexive digital preservation. We consider that this issue is central to digital preservation at large: how could we preserve digital materials for the long haul, if the functionalities of the preservation system itself cannot be preserved? As a first step, we designed the LRM having in mind that it could be used to model the future instances of PERICLES (hence, particular descriptions of preservation systems) as a particular collection of digital resources, thus leading to a form of reflexivity. In so doing, we expect that any significant progress in capturing key aspects of the digital ecosystem’s dynamics will benefit the long-term life of the infrastructure.

4 State of The Art

4.1 Aims and objectives

In this section we discuss notions of dependency that could be relevant to modelling the relationships between entities in the context of digital preservation and content lifecycle management. We also discuss some relevant approaches to ecosystem modelling using the various notions of dependency found in the literature, and some of the information that can be derived from such models.

The primary aim of expressing dependencies within PERICLES is to enable modelling of change within the preservation environments.
We realised at an early stage that dependency and change in the context of PERICLES could be regarded as essentially dual notions.

Figure 2 Dual notions of dependency and change

Thus, “entity A depends on entity B” is reflected by the opposite notion that a change in B would necessarily cause some change in A.

In modelling dependencies, we particularly wanted to understand how dependencies could be combined to derive further dependencies (e.g. higher-order dependencies). More generally, we were interested in understanding, for a given notion of dependency, what statements can be made about the properties of entities from the structure of their dependency graph.

As described in Section 3, in order to model a digital ecosystem we need to consider dependencies relating to past events, which can be captured at ingest. However, we were also interested in dependencies related to future reuse of entities, in particular to support access to digital objects stored in a repository. For example, a past dependency models the relationship between an output data file and the piece of software that produced it. On the other hand, a future dependency may model use of the data in the future, which may or may not require use of the application that created it.

4.2 Generic properties

A number of generic properties of dependency were determined during our study, details of which are presented in this section.

A causal dependency [5] is the relation between an entity (the cause) and a second entity (the effect), where the second entity is understood as a consequence of the first. Such a concept enables the representation of events and change. Causal graphical models or directed graphical models are also referred to as Bayesian Networks (BNs) and are used extensively for modelling causal processes.

Transitivity is a property of dependencies that is often applied in database theory.
A dependency is transitive if, whenever A is dependent on B and B is dependent on C, A is also dependent on C. This property enables chaining of dependencies and inferences to be made on dependency graphs.

A dependency may be the conjunction or the disjunction of two dependencies. This enables logical structures to be modelled. A conjunctive dependency requires all dependent entities to be present, whereas a disjunctive dependency requires at least one of a set of entities to be present.

4.3 Preservation

The PREMIS Data Dictionary for Preservation Metadata is the international standard for metadata to support the preservation of digital objects and ensure their long-term usability (http://www.loc.gov/standards/premis/). The PREMIS Data Dictionary [6] defines preservation metadata as the information a repository uses to support the digital preservation process. Preservation metadata spans a number of the categories typically used to differentiate types of metadata: administrative (including rights and permissions), technical, and structural. PREMIS metadata is typically created at ingest into a repository or archive.

PREMIS defines five types of entity, namely Intellectual Entities, Objects, Events, Rights, and Agents, and a simple data model to relate them. Three types of relationship are defined between objects: structural relationships, derivation relationships and dependency relationships. From the PERICLES perspective, derivation and dependency relationships are the most relevant. A derivation relationship results from the replication or transformation of an object, where the intellectual content remains the same but the instantiation is different, such as a format conversion. A dependency relationship exists when one object requires another to support its function, delivery, or coherence. Examples would include a font, style sheet, DTD or schema that are not part of the file itself.
Objects can also be related to events through user-defined dictionaries of terms, and events can in turn be linked to agents that performed those events, which can be either references to user roles or software applications. An event represents an action that involves or impacts at least one object or agent, such as a format transformation or migration.

The Open Provenance Model (OPM) [7], [8] introduces the concept of a provenance graph that aims to capture the causal dependencies between entities. Three types of entities are defined in the model:

● Artefacts represent an immutable piece of state, which may be embodied as a physical object, or have a purely digital representation.
● Processes represent actions performed on or caused by artefacts, and resulting in new artefacts.
● Agents represent contextual entities acting as a catalyst of a process, enabling, facilitating, controlling, or affecting its execution.

Therefore, nodes, whether artefacts, processes or agents, can be connected by directed edges that belong to one of the categories defined above, for instance to represent that an artefact was generated by a process.

In a preservation context, [9] defines notions of module, dependency and profile to model notions of use by a community of users. A module is defined to be a software/hardware component or knowledge base that is to be preserved, and a profile is the set of modules that are assumed to be known (available or intelligible) by a user (or community of users). A dependency relation is then defined by the statement that module A depends on module B if A cannot function without the existence of B. For example, a README.txt file written in English depends on the availability of a suitable text editor (e.g. Notepad). The paper demonstrates chaining of such use dependencies using conjunctive and disjunctive relationships.
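
The chaining of use dependencies over a profile can be illustrated with a minimal Python sketch. It is not taken from [9]: the Requirement class, the is_usable function and the module names gedit, Windows and English are hypothetical, and the rule that a module outside the profile is usable only if it declares dependencies that are all satisfied is a simplifying assumption made for this illustration.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Set, Tuple


@dataclass(frozen=True)
class Requirement:
    """One dependency of a module on a group of other modules (hypothetical type)."""
    modules: Tuple[str, ...]
    conjunctive: bool = True  # True: all modules needed; False: at least one


def is_usable(module: str,
              requirements: Dict[str, List[Requirement]],
              profile: Set[str],
              _visiting: FrozenSet[str] = frozenset()) -> bool:
    """Return True if `module` can function for a user with this profile."""
    if module in profile:
        return True  # assumed known (available or intelligible)
    groups = requirements.get(module, [])
    if not groups or module in _visiting:
        # Assumption: an unknown leaf module, or a cycle, counts as unusable.
        return False
    visiting = _visiting | {module}
    for group in groups:
        checks = [is_usable(m, requirements, profile, visiting)
                  for m in group.modules]
        satisfied = all(checks) if group.conjunctive else any(checks)
        if not satisfied:
            return False
    return True


# Hypothetical dependency data built around the README.txt example.
REQUIREMENTS = {
    "README.txt": [
        Requirement(("Notepad", "gedit"), conjunctive=False),  # any text editor
        Requirement(("English",)),                             # and the language
    ],
    "Notepad": [Requirement(("Windows",))],
}

print(is_usable("README.txt", REQUIREMENTS, {"English", "Windows"}))  # True
print(is_usable("README.txt", REQUIREMENTS, {"English"}))             # False
```

In the first call the disjunctive requirement is met through Notepad, whose own dependency on Windows is covered by the profile; in the second call no editor can be reached, so the file is reported as unusable for that profile.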

[10] also define the more specific notion of task-based dependencies, which are expressed as Datalog rules and facts. For instance, Compile(HelloWorld.java) denotes the task of compiling HelloWorld.java. Since the compilability of HelloWorld.java depends on the availability of a compiler (specifically a compiler for the Java language), this dependency can be expressed using a rule of the form:

Compile(X) :- Compilable(X,Y)

where the binary predicate Compilable(X,Y) is used for expressing the appropriateness of Y for compiling X. For example, Compilable(HelloWorld.java, javac_1.6) expresses that HelloWorld.java is compilable by javac 1.6. This more formal approach enables various tasks to be performed, such as risk and gap analysis for specific tasks, possibly considering contextual information such as user profiles.

A Preservation Network Model [11] is a formal model for conceptualising the relationships between resources within the scenario of a preservation objective. The preservation network model consists of two types of components: digital objects and the relationships between them. A relationship captures how two objects are related to one another in order to fulfil a specified preservation objective whilst being utilised by a member of the designated user community (in the sense of OAIS). Relationships can possess the attributes Function, Risks and Dependencies, Tolerance, and Quality Assurance and Testing. A relationship may be the conjunction or the disjunction of two relationships.

4.4 Systems and software

In the Unified Modeling Language (UML) [12], a dependency is a relationship that shows that an element, or set of elements, requires other model elements for their specification or implementation. In UML there is a notion of a link, which is a relationship between instances of classifiers. In contrast, a dependency is a modelling relationship between definitions.
UML provides a conceptual modelling approach for representing relationships between entities. For practical use in PERICLES, such a wide-ranging definition would need to be constrained in order for meaningful information to be extracted from a dependency graph.

The Conceptual Dependency Graph technique is introduced in [13]. The notion of dependency defined there relates to change in the linked entities. The dependencies have a set of attributes that reflect defined properties of the dependencies.

Notions of dependency have been explored extensively in software engineering. A software dependency is a directed relation between two pieces of code (such as expressions or methods). There exist different kinds of dependencies: data dependencies between the definition and use of values, and call dependencies between the declaration of functions and the sites where they are called. Dependency analysis is related to parallelism, i.e. whether sections of a program need to be executed sequentially or can be run concurrently. Zimmerman [14] demonstrates that dependency graph complexity can be a useful predictor for failures in software subsystems. The IEEE definition of failure¹ is the inability of a system or component to perform its required functions within specified performance requirements.

Dependency graphs can also be applied to bottleneck analysis [15]. The maximal throughput of a system may be limited by the amount of available resources (e.g. the number or speed of processors, the size of memory, the bandwidth of a bus). Dependency graphs labelled with resource descriptions such as channel capacities can be applied to this problem.

Coupling is a term from software engineering to describe the degree of linkage between entities, in this case software modules [16]. It is an important consideration in the design and maintainability of software systems.
Two modules are independent if each can function completely without the presence of the other, i.e. they are decoupled or uncoupled. Highly coupled modules are joined by many interconnections, whereas loosely coupled modules are joined by few interconnections. Here, an interconnection can be considered as a compilation or runtime linkage between the modules. Common-environment coupling refers to the situation where a module writes into global data and a different module reads from it (data or, worse, control).

Software change impact analysis is defined as “the determination of potential effects to a subject system resulting from a proposed software change” [17]. The basic principle underlying the need for impact analysis is that a small change in a software system may affect many other parts of the system. A direct impact occurs when the object affected is related by one of the dependencies that fan-in/out directly to/from the Software Lifecycle Object (SLO). This type of impact is also called a first level impact and can be obtained from the connectivity graph. An indirect impact occurs when the object affected is related by the set of dependencies representing an acyclic path between the SLO and the affected object. This type of impact is also referred to as an N-level impact, where N is the number of intermediate relationships between the SLO and the affected object.

4.5 Probabilistic notions

Extending the concept of a Bayesian network, an influence diagram [18] (also called a relevance diagram, decision diagram or decision network) is a compact mathematical representation of a decision situation as a directed acyclic graph. Such diagrams can be used to visualise the probabilistic dependencies in decision analysis and to specify the states of information for which independence can be assumed to exist. Nodes are classified into decision nodes, uncertainty nodes, deterministic nodes, and value nodes (corresponding to a separable utility function).
Functional arcs end in a value node, and are used to model parameters of the utility function. Conditional arcs indicate probabilistic relationships between the head and tail nodes of the arcs, and information arcs (ending in a decision node) indicate a decision made when all the inputs are determined beforehand. Such diagrams may be relevant to PERICLES, for instance for deriving decisions on preservation actions from a dependency graph, although some dependencies may also require cyclic graphs, e.g. to represent two documents that cannot be understood if they are not provided together.

A further probabilistic approach to dependency is through dependency networks [19], based on the notion of partial correlation. The approach extracts causal topological relations between the nodes of a directed network and provides an important step in the inference of causal activity relations. The partial (or residual) correlation [20] is a measure of the effect (or contribution) of a given node on the correlations between another pair of nodes. Using this concept, the dependency of one node on another node can be calculated for the entire network.

Footnote 1: IEEE Std 610.12-1990

4.6 Policy

Dependencies on and between policies are an important subject for PERICLES. Change-impact analysis has been applied extensively in the area of access-control policies. The paper [21] considers access policy change-impact assessment methods based on XACML access policies. The analysis consumes two policies that span a set of changes and summarises the differences between the two policies. Users can not only examine the summary, but also query it and verify properties of the change. This verification can happen even in the absence of formal properties about the system as a whole (indeed, these properties may not even hold for the entire system). Attributes describe subjects, actions, and resources.
The approach uses a change-analysis decision diagram, termed MTBDD (multi-terminal binary decision diagram), as the underlying representation of access-control policies. An MTBDD is a form of decision diagram that maps bit vectors over a set of variables to a finite set of results.

4.7 Discussion

This survey uncovered a rich set of definitions for dependency relevant for PERICLES, depending on the needs of the topic considered. The concepts defining a dependency range from ones that use abstract notions and properties to ones that require a concrete realisation of relevant entities. We believe that a good meta-model should allow space for both views. In the LRM we define specific classes and metadata that allow abstract descriptions to co-exist with concrete realisations representing the digital objects handled by the preservation system (as described in Section 5). For instance, the Entity class can represent instances that have no concrete materialised form in the real world, while the Digital-resource subclass is defined in the LRM as an entity that must have a digital extension somewhere. Both of them can be related by instances of the Dependency class. Similarly, we proposed a few dedicated metadata classes to capture additional semantics, ranging from textual annotations up to more formalized descriptions (with, possibly, computer-based interpretations). As the LRM is a meta-model, we expect that domain-specific ontologies will enrich the semantics of LRM classes in order to address domain-specific modelling needs.

Another important point is the distinction expressed in the literature between conjunctive and disjunctive dependencies, denoting an intrinsic feature of the dependency semantics. Therefore, we decided to capture these two categories in the LRM by introducing the notion of co-dependency. This notion is based on our choice to model dependency types as classes rather than properties (see Section 5.4, Dependencies).
This means that we can also use standard logical constructs corresponding to class disjunction and conjunction for the dependencies. Other intrinsic properties of dependencies are inherited from standard relations (i.e. transitivity, symmetry), and will be expressed when we address the semantics of change, our next step in PERICLES.

Also of interest are the various graphical techniques for modelling (probabilistic) relationships. These methods are interesting in the context of PERICLES and will more likely be explored in specific frameworks adapted to this kind of mathematical treatment, e.g. based on linear algebra and matrix transformation (see Section 7, Conclusion and Future Work).

Interestingly, we have not found approaches in the state of the art that specifically address reflexivity (as defined in Section 3, Rationales and Guiding Principles). We believe this is a fruitful and promising space to be explored in PERICLES through the LRM, and we paid particular attention to leaving this possibility open through our current design choices.

5 Detailed Description of the LRM

This section provides an extensive and detailed description of the Linked Resource Model. The source code of the LRM is presented using the Turtle syntax (see [22]) and is accessible through a zip archive [2]. For ease of reading, comments are stripped from the following excerpts, but they remain present in the associated code.
5.1 Ontology Preamble, Namespaces

The current release of the LRM imports only the PROV ontology [1]; thus, the namespaces included refer to the latter (namespace prov) and to the LRM ontology itself (namespace pk).

5.2 Digital Resource and associated Descriptors

The concept of a digital resource in the LRM specialises the notion of entity as defined in PROV (“An entity is a physical, digital, conceptual, or other kind of thing with some fixed aspects; entities may be real or imaginary”; [1]) by defining additional constraints. All digital resources that are considered as objects to be represented in a PERICLES ecosystem model:

1. Must be physically located somewhere. This is a mandatory condition: the digital realisation, or bitstream, of the resource must be accessible through one or more location descriptor(s).
2. Must be associated with exactly one LRM identifier that uniquely designates this object inside the LRM instance, irrespective of other external identification mechanisms.

Those constraints are captured through the owl:Restriction mechanism in the LRM source [2].

As mentioned in Section 4.7, this modelling mechanism allows us to represent instances that have no concrete materialised form in the real world through the Entity class, while the Digital-resource subclass is defined in the LRM as an entity that must have a digital extension somewhere. Both of them can be related by instances of the Dependency class.

5.3 Basic Metadata and Properties associated with PERICLES Digital resources

We expect that location descriptors and identifiers will be further constrained, if required, by domain-specific ontologies built on top of the LRM to provide the precise descriptions that are relevant to the application domain. However, the pk:Description class is more detailed with respect to the information that can be associated with it.
The pk:intention property relates a description to a PROV entity that expresses the intended usage of the resource (there can be many of them; a user manual is one example). The pk:specification property is structurally similar, but expresses information on the resource itself, for instance its internal structure or the convention it follows. We expect that these will be further specialized and/or instantiated to respond to domain-specific needs.

Further properties relate digital resource instances to their descriptors (location, identification, intent description and specification).

5.4 Dependencies

A dependency instance may relate one or many entities to one or more others. To achieve this using RDF, a binary predicate based model, we model Dependency as a class. We refer to the resulting topology as co-dependency in the case that there is more than one entity linked to more than one other entity (see an example of two entities being dependent on two other entities in the Figure below). The pk:from and pk:to properties give an orientation to those co-dependencies.
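The co-dependency topology described above can be sketched in Turtle as follows. This is an illustration only and not an excerpt from the released LRM source: the pk namespace IRI is a placeholder, the individuals ex:a0, ex:a1, ex:b0, ex:b1 and ex:d are hypothetical, and the orientation (pk:from for the dependent entities, pk:to for the entities depended upon) is an assumption.

```turtle
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix pk:   <http://example.org/lrm#> .    # placeholder IRI, not the real LRM namespace
@prefix ex:   <http://example.org/demo#> .   # hypothetical individuals

ex:a0 rdf:type prov:Entity .
ex:a1 rdf:type prov:Entity .
ex:b0 rdf:type prov:Entity .
ex:b1 rdf:type prov:Entity .

# A single dependency node links two entities to two others.
ex:d  rdf:type pk:Dependency ;
      pk:from  ex:b0 , ex:b1 ;    # assumed: the dependent entities
      pk:to    ex:a0 , ex:a1 .    # assumed: the entities depended upon
```

Because the dependency is itself a resource rather than a bare predicate, one node can carry the whole many-to-many relation, and further statements can be made about ex:d itself.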
Note that the “standard” symbolic schema of a co-dependency d between entities Ai and Bi (B0 and B1 simultaneously depend on A0 and A1):

[Diagram: entities A0, A1, B0 and B1, with each of the four links between an Ai and a Bj labelled d]

will be expressed this way using the LRM:

[Diagram: a single dependency node d, connected to A0 and A1 by links labelled “to” and to B0 and B1 by links labelled “from”]

The above modelling mechanism is important for a number of reasons:

a) it allows us to cover the cases of both conjunctive and disjunctive dependencies (for instance via specialised classes and/or logical constructs such as owl:unionOf) that have been found to be important in the state-of-the-art review (see Section 4.7);
b) it allows us to express n-ary oriented relations using RDF, a binary predicate based model, one of the requirements mentioned in Section 3;
c) as pk:Dependency is defined as a subclass of pk:Entity, it inherits the pk:intention and pk:specification properties (explained in the section above). This allows us to model one of the most important points highlighted in Section 3, namely that “Dependencies in the LRM should be able to capture usage intention”.

5.5 Giving semantics to dependencies

Instances of pk:Plan allow detailed definition of the semantics of dependencies. This corresponds to the fundamental intention behind any notion of dependency, as discussed in Section 3 of this document (Rationales and Guiding Principles). pk:Plan is defined as a specialisation of the pk:Description and prov:Plan classes (the PROV ontology proposes a class prov:Plan to describe activities, although its semantics are not very precisely defined). An instance of pk:Plan is characterized through the property pk:how and its sub-property pk:implementedBy, which specify its organization. Whereas pk:how is an informal description, pk:implementedBy is a computer-oriented description (it associates an operator to realize the plan).
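The definitions above can be paraphrased in Turtle roughly as follows. This is a hedged sketch written from the prose, not the released LRM source: the pk namespace IRI is a placeholder, all ex: individuals are hypothetical, and the choice of a prov:Entity carrying prov:value as the target of pk:how is an assumption (the text states only that pk:how is informal).

```turtle
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix pk:   <http://example.org/lrm#> .    # placeholder IRI, not the real LRM namespace
@prefix ex:   <http://example.org/demo#> .   # hypothetical individuals

# Schema-level statements, as described in the text
pk:Plan           rdf:type owl:Class ;
                  rdfs:subClassOf pk:Description , prov:Plan .
pk:implementedBy  rdfs:subPropertyOf pk:how .

# A hypothetical plan characterised by both properties
ex:convert-plan   rdf:type pk:Plan ;
                  pk:how ex:conversion-notes ;       # informal description
                  pk:implementedBy ex:converter .    # operator realising the plan

ex:conversion-notes rdf:type prov:Entity ;
                  prov:value "Re-run the conversion whenever the source file changes." .

ex:converter      rdf:type pk:Operator .
```

Since pk:implementedBy is a sub-property of pk:how, an RDFS-aware reasoner would also derive that ex:convert-plan is related to ex:converter through pk:how.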
Note that nothing prevents one from using an unbounded combination of both properties to characterize a plan.

When associated with a dependency, plans allow defining the two fundamental dimensions we identified as important to model the dynamics of digital resources: the preconditions (when is it required to trigger the propagation of a change?) and the impact (how will dependent resources be impacted?).

The descriptive means introduced in this subsection allow us to link dependencies to change propagation related notions. This should allow us to compute potential impact in an evolving digital ecosystem (in accordance with Section 3).

5.6 Operators

An operator is an executable digital resource that can create, read or delete digital resources in the ecosystem. The class pk:Operator is a subclass of both prov:SoftwareAgent and pk:Digital-resource. As such, an operator must be physically located somewhere and its digital extension can be retrieved; an operator can be modelled and handled homogeneously as an intrinsic part of the digital ecosystem, with dependencies and relevant metadata (this illustrates the claimed reflexivity of the LRM).

We chose to categorize three families of operators based on their impact on the ecosystem. A concrete operator must be specified by a combination of those (which is always possible, as they are not declared as disjoint classes). A pk:Creator instance will create new digital resources and, on the other hand, a pk:Destructor will delete resources. A pk:Reader instance will use resources and may or may not change the ecosystem.
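A minimal Turtle sketch of the operator classes described above might look as follows. It is written from the prose and is not the released LRM source: the pk namespace IRI is a placeholder, declaring the three families as subclasses of pk:Operator is an assumption, and the ex:tmp-cleaner operator with its identifier and location values is purely hypothetical.

```turtle
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix pk:   <http://example.org/lrm#> .    # placeholder IRI, not the real LRM namespace
@prefix ex:   <http://example.org/demo#> .   # hypothetical individuals

pk:Operator   rdf:type owl:Class ;
              rdfs:subClassOf prov:SoftwareAgent , pk:Digital-resource .

# Assumed hierarchy; no disjointness axiom, so the families can be combined.
pk:Creator    rdfs:subClassOf pk:Operator .
pk:Reader     rdfs:subClassOf pk:Operator .
pk:Destructor rdfs:subClassOf pk:Operator .

# A hypothetical clean-up tool that reads resources and deletes some of them.
# As a digital resource it carries an identifier and a location descriptor.
ex:tmp-cleaner      rdf:type pk:Reader , pk:Destructor ;
                    pk:identification ex:tmp-cleaner-id ;
                    pk:location ex:tmp-cleaner-loc .

ex:tmp-cleaner-id   rdf:type pk:Identity ;
                    prov:value "OP-001" .

ex:tmp-cleaner-loc  rdf:type pk:Location-descriptor ;
                    prov:value "/opt/tools/tmp-cleaner" .
```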
As an illustration, the class of XML validators will be a combination of pk:Reader (read the schema, and the input document to be validated against it) and of pk:Creator, if it is configured to write a validation report to be preserved as well (otherwise, the reporting can be ephemeral, as through a computer screen, and it will just be a pk:Reader instance).

In order to model the information needed by an operator to perform, the LRM introduces three properties, respectively for defining the input and output parameters, and for the configuration parameters (for this one, the range of the property is not specialized at this stage; this should/can be done in domain-specific ontologies).

5.7 Ontology Metrics

This subsection presents some detailed metrics about the current version of the LRM ontology, generated by the well-established Protégé ontology editor (footnote 2). Table 1 presents the summary of these metrics, both for the core LRM as well as the LRM extension of PROV.

Table 1 Ontology metrics generated by Protégé.

The “DL expressivity” metric refers to the Description Logics (DL) variant adopted by the model. Description Logics [23] are a family of knowledge representation formalisms characterised by logically grounded semantics and well-defined reasoning services. DL constitutes the underlying formalism of ontologies and can appear in variants, depending on the adopted features. Indicatively, ALCRIQ(D) encompasses the following features:

● The base language (AL) with complement of any concept allowed (C), not just atomic concepts.
● Limited complex role inclusion axioms, reflexivity and irreflexivity, role disjointness (R).
● Inverse properties (I).
● Qualified cardinality restrictions (Q).
● Use of datatype properties, data values or data types (D).

Table 2 shows a list of metrics regarding the class axioms currently defined in the ontology.
As illustrated, excluding subclass axioms, the LRM ontology is not particularly rich at the moment, which is reasonable, since the primary objective at this stage was to provide the static conceptualisation (classes, properties, individuals) necessary to represent the LRM-related dependencies. However, most of the complexity will be introduced in domain modelling activities.

Table 2 Class axioms metrics

As shown in Table 1, the LRM ontology contains a set of object and data properties for making assertions about the individuals described in the ontology. Further statistics about the ontology properties are shown in Tables 3 and 4, where a number of domain and range axioms have already been used to capture the property semantics precisely.

Footnote 2: Protégé ontology editor: http://protege.stanford.edu/

Table 3 Object property axioms metrics.

Table 4 Data property axioms metrics.

6 LRM Primer

One of the main guiding principles of the LRM was that it should be extensible (see Section 3). This section presents a selection of examples demonstrating how the LRM can be deployed for domain modelling. However, the content of this section is strictly for demonstration purposes, as it is strongly recommended to avoid using the core LRM for domain modelling purposes; instead, one should first create a domain-specific ontology, by extending the LRM and specializing its core constructs.

6.1 Creating Digital Resources

As described in Section 5.2, all LRM digital resources (i.e. objects of type pk:Digital-resource) must have: (a) exactly one identifier, and (b) one or more location descriptors. These requirements are satisfied via two LRM-specific properties: pk:identification and pk:location, respectively.
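As a hedged illustration of how requirements (a) and (b) could be written with owl:Restriction, consider the fragment below. It is a sketch derived from the prose rather than a copy of the released LRM axioms; the pk namespace IRI is a placeholder, and the actual ontology may state the constraints differently.

```turtle
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix pk:   <http://example.org/lrm#> .    # placeholder IRI, not the real LRM namespace

pk:Digital-resource
    rdf:type owl:Class ;
    rdfs:subClassOf
        # (a) exactly one identifier
        [ rdf:type owl:Restriction ;
          owl:onProperty pk:identification ;
          owl:onClass pk:Identity ;
          owl:qualifiedCardinality "1"^^xsd:nonNegativeInteger ] ,
        # (b) one or more location descriptors
        [ rdf:type owl:Restriction ;
          owl:onProperty pk:location ;
          owl:someValuesFrom pk:Location-descriptor ] .
```

Under OWL's open-world semantics such axioms do not behave like database constraints: a digital resource without a stated location is not rejected, since a reasoner simply infers that some location exists, and two stated identifiers lead to an inconsistency only if they are known to be different individuals.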
These two properties are “object properties”, meaning that their values will be objects of type pk:Identity and pk:Location-descriptor, respectively. The following is a Turtle fragment describing a digital resource “digres-1”:

    digres-1  rdf:type           pk:Digital-resource ;
              pk:identification  id-1 ;
              pk:location        loc-1 .

    id-1      rdf:type           pk:Identity ;
              prov:value         "ID001"^^rdfs:Literal .

    loc-1     rdf:type           pk:Location-descriptor ;
              prov:value         "C:\\repository"^^rdfs:Literal .

The property prov:value provides a literal value that is a direct representation of an entity (the domain of the property is prov:Entity). Figure 3 illustrates a visual representation of the above digital resource, generated with the help of the Protégé OntoGraf plugin (footnote 3).

Figure 3 Visual representation of a digital resource.

Footnote 3: Protégé OntoGraf plugin: http://protegewiki.stanford.edu/wiki/OntoGraf

6.2 Attaching Descriptions to Digital Resources

Digital resources can optionally be associated with descriptions (i.e. objects of type pk:Description) that give information about a digital resource (or an entity in general): why it exists and what it is. Two optional object properties are defined for this: pk:intention and pk:specification (see Section 5.3). Similarly to the previous example, two new objects have to be created to serve as the values of pk:intention and pk:specification, respectively, as illustrated in the following Turtle fragment:

    desc-1  rdf:type          pk:Description ;
            pk:describes      digres-1 ;
            pk:intention      int-1 ;
            pk:specification  spec-1 .

    int-1   rdf:type          prov:Entity ;
            prov:value        "This digital resource was created for ..."^^rdfs:Literal .

    spec-1  rdf:type          pk:Entity ;
            prov:value        "The specifications for this digital resource are ..."^^rdfs:Literal .

Descriptions are attached to digital resources through the pk:describes property (which is the inverse of pk:describedBy). Figure 4 illustrates a visual representation of a digital resource’s description.
Figure 4 Visual representation of a digital resource’s description

6.3 Creating Dependencies

Dependencies are created via the pk:Dependency class (or an appropriate domain-specific specialization/subclass). Since dependencies in the LRM are oriented, their two most important elements are the object properties pk:from and pk:to, which relate instances of prov:Entity to each other (see Section 5.4). For instance, the dependency of the digital resources “digres-1” and “digres-2” on another digital resource “digres-3” would be represented as:

    dep-1  rdf:type  pk:Dependency ;
           pk:from   digres-1 , digres-2 ;
           pk:to     digres-3 .

This dependency reads as: “Resources digres-1 and digres-2 depend on digres-3” and is visually represented as illustrated in Figure 5.

Figure 5 Visual representation of a dependency

Note that all three sample digital resources (“digres-1”, “digres-2”, “digres-3”) should have respective identifier and location descriptors, which are, however, omitted from the figure in order to reduce complexity.

A more concrete (i.e. domain-dependent) example of a dependency would be “a piece of compiled Java bytecode depends on the respective Java source code in the case one wants to modify the bytecode object accordingly”, which could be represented in Turtle as follows:

    java-src   rdf:type  pk:Digital-resource .    # source code
    java-byte  rdf:type  pk:Digital-resource .    # bytecode

    java-dep   rdf:type  pk:Compilation-Dependency ;
               pk:from   java-src ;
               pk:to     java-byte .

Note that both the Java source code as well as the bytecode are registered as digital resources.

6.4 Creating Plans

As already stated (see Section 5.5), plans offer the means for giving semantics to dependencies.
Plans are used for representing the preconditions and impact of a dependency (see Section 5.5). This is achieved by “attaching” to each dependency a pair of pk:Plan instances via the object properties pk:precondition and pk:impact, respectively. For instance, suppose that the dependency “java-dep” introduced in the previous example has the following precondition and impact:

● precondition: The compilation of the Java source code depends on the version of the Java compiler on the host machine.
● impact: The code may no longer compile.

The following Turtle fragment represents the precondition and impact of dependency “java-dep”:

    java-dep       pk:precondition   java-dep-prec ;
                   pk:impact         java-dep-imp .

    java-dep-prec  rdf:type          pk:Plan ;                   # precondition
                   pk:implementedBy  jc .

    java-dep-imp   rdf:type          pk:Plan ;                   # impact
                   pk:specification  "code may not compile" .

    jc             rdf:type          prov:SoftwareAgent .        # compiler

6.5 Representing Operators

As already stated, plans are implemented by operators. The core LRM features three (3) types of operators: creators, readers and destructors (see Section 5.6). For instance, the Java compiler “jc” from the previous example and a Java runtime environment “jre” could both be specified as readers and creators:

● The Java compiler “jc” reads a piece of Java source code (input) and creates a corresponding piece of Java bytecode (output).
● The JRE “jre” reads a piece of Java bytecode (input), the execution of which may lead to the creation of additional digital resources (output) within the ecosystem (e.g. creating a new text file).

In Turtle syntax, this could be represented by the following fragment:

    jc           rdf:type            pk:Creator , pk:Reader ;
                 pk:inputParameter   java-src ;
                 pk:outputParameter  java-byte .

    jre          rdf:type            pk:Creator , pk:Reader ;
                 pk:inputParameter   java-byte ;
                 pk:outputParameter  text-file-1 .

    text-file-1  rdf:type            pk:Digital-resource .

The above fragment is visually represented as illustrated in Figure 6.
Figure 6 Visual representation of a dependency

6.6 Deploying PROV Constructs

Since the LRM in its current implementation is an extension of PROV, several constructs of the latter can be deployed in parallel with LRM constructs. This sub-section briefly introduces how some of the key PROV constructs can be used in practice. It should be noted that the core of the LRM (identified by the pk: prefix) could be made independent from PROV, enabling one to extend the core LRM to a pre-existing ontology of choice.

6.6.1 Activities

PROV activities occur over a period of time and act upon or with entities; activity-related operations may include consuming, processing, transforming, modifying, relocating, using, or generating entities. The following fragment describes a sample activity representing an XSLT transformation of an XML file into an HTML file.

    xslt-transformation  rdf:type                prov:Activity ;
                         prov:used               xml-file ;
                         prov:generated          html-file ;
                         prov:startedAtTime      "2014-06-15T13:00:00"^^xsd:dateTime ;
                         prov:endedAtTime        "2014-06-15T13:10:00"^^xsd:dateTime ;
                         prov:wasAssociatedWith  xslt-transformer ;
                         prov:wasStartedBy       researcher ;
                         prov:wasInformedBy      xml-generation .

    xml-file             rdf:type  pk:Digital-resource .
    html-file            rdf:type  pk:Digital-resource .
    researcher           rdf:type  prov:Agent .
    xslt-transformer     rdf:type  prov:SoftwareAgent .
    xml-generation       rdf:type  prov:Activity .

Some of the key properties associated with PROV activities are:

● Usage of an entity by the activity (property prov:used).
● Generation of a new entity by the activity (property prov:generated).
● Start and end time of the activity (properties prov:startedAtTime and prov:endedAtTime).
● Association with relevant (software) agents (property prov:wasAssociatedWith).
● Association of the activity’s beginning with a “triggering” agent (property prov:wasStartedBy).
● Association with the outcome from other activities (property prov:wasInformedBy).
The remaining activity-related properties are described in the W3C PROV specification (footnote 4).

Footnote 4: PROV activities: http://www.w3.org/TR/2013/REC-prov-o-20130430/#Activity

6.6.2 Activity Roles

With respect to an activity, PROV allows defining roles for the involved entities and agents. For example, taking into consideration the “xslt-transformation” activity defined in the previous subsection, suppose that a software agent responsible for publishing the HTML output of the XSLT transformation should be associated with the activity. First of all, the activity should be specialized as a prov:Publish activity (subclass of prov:Activity) as follows:

    xslt-transformation  rdf:type                   prov:Publish ;
                         prov:qualifiedAssociation  assoc-1 .

The rest of the activity-related information included in the previous sub-section is omitted here. An association (“assoc-1”, an instance of prov:Association) is attached to the activity through the prov:qualifiedAssociation property. It represents the assignment of responsibility to an agent for this activity, indicating that the agent had a role in the activity. This association is defined as follows:

    assoc-1    rdf:type      prov:Association ;
               prov:agent    publisher ;
               prov:hadRole  role-1 .

    publisher  rdf:type      prov:SoftwareAgent .

    role-1     rdf:type      prov:Publisher .

Here, a “publisher” software agent is defined and assigned a specific role called “role-1”, which is an instance of prov:Publisher (subclass of prov:Role) that designates exactly this type of contribution.

For demonstration purposes, the following additional example represents the “xml-generation” activity briefly mentioned in the previous sub-section.

    xml-generation  rdf:type                   prov:Creation ;
                    prov:qualifiedAssociation  assoc-2 .

    assoc-2         rdf:type      prov:Association ;
                    prov:agent    generator ;
                    prov:hadRole  role-2 .

    generator       rdf:type      prov:SoftwareAgent .

    role-2          rdf:type      prov:Creator .
6.7 Domain-specific LRM Example

For demonstration purposes, this subsection features an example of extending the LRM for representing domain-specific constructs from the Art & Media domain and, specifically, from the Software-based Artworks subdomain. However, as already stated in this document, extending the LRM for domain-specific representations is out of the scope of this Deliverable and will constitute the topic of future Deliverables (D2.3.2 “Data survey and domain ontologies for case studies”, due M32, and D3.5 “Modelling contextualised semantics”, due M30).

6.7.1 Extending the LRM

As a running example, a software-based artwork called “Brutalism” will be modelled. In order to keep the core LRM classes unaffected by the more specific domain knowledge, the domain ontology that extends the LRM will in essence specialise the LRM’s core classes and properties. To this end, the following extensions are implemented:

6.7.1.1 HIERARCHIES OF CLASSES REPRESENTING DOMAIN-RELATED CONSTRUCTS

The LRM’s pk:Digital-resource class should be further specialized into a sub-hierarchy of Tate-related classes. Tate-item is the parent class and there should be further subclasses, representing the sub-domains within the Tate domain. Currently, only the Software-based subclass is included, as illustrated by the following Turtle fragment:

    Software-based  rdf:type         owl:Class ;
                    rdfs:subClassOf  Tate-item .

    Tate-item       rdf:type         owl:Class ;
                    rdfs:subClassOf  pk:Digital-resource .

The above description is also illustrated in Figure 7.

Figure 7 Visual representation of Tate items hierarchy (currently includes only Software-based artworks).

Also, according to the LRM, digital resources are characterized by two mandatory properties: identifier and locator.
These are extended in the domain ontology, in order to express more domain-specific notions:

    Tate-location-descriptor  rdf:type         owl:Class ;
                              rdfs:subClassOf  pk:Location-descriptor .

    Tate-identity             rdf:type         owl:Class ;
                              rdfs:subClassOf  pk:Identity .

The visual representation of the above fragment is illustrated in Figure 8.

Figure 8 Visual representation of the specialization of the location descriptor and the identity for the Tate domain.

Dependencies are also further specialized and additional utility classes are added as subclasses of prov:Entity, as shown by Figure 9 below.

Figure 9 Hierarchy of prov:Entity subclasses

Notice that the bold-face classes are the ones created within the domain-specific ontology, while the rest of the classes belong to the directly imported LRM ontology or the indirectly imported PROV ontology. As demonstrated in the figure, we have created some indicative explicit specializations of software dependencies (for development platforms, operating systems and programming languages), while more specializations will be added as the ontology is further developed. Also, these dependencies typically involve respective types of software entities, as observed in the Software-entity sub-tree.

6.7.1.2 A SAMPLE SOFTWARE-BASED ARTWORK

The following Turtle fragment demonstrates the modelling of a sample software-based artwork called “Brutalism”:

    brutalism-2007  rdf:type              Software-based ;
                    rdfs:label            "Artwork: Brutalism (2007)"^^rdfs:Literal ;
                    date-acquired         "2011-01-01T00:00:00"^^xsd:dateTime ;
                    dc:title              "Brutalism"^^rdfs:Literal ;
                    dc:description        "Jose Carlos Martinat's 'Brutalism'"^^rdfs:Literal ;
                    foaf:depiction        <https://.../brutalism-2007-img-1.png> ;
                    prov:wasAttributedTo  artist-2 ;
                    pk:identification     brutalism-2007-id ;
                    pk:location           brutalism-2007-loc .

    artist-2        rdf:type              prov:Person ;
                    foaf:name             "Jose Carlos Martinat"^^rdfs:Literal .
    brutalism-2007-id   rdf:type    Tate-identity ;
                        prov:value  "sba-002"^^rdfs:Literal .

    brutalism-2007-loc  rdf:type    Tate-location-descriptor ;
                        prov:value  "c:\\artworks"^^rdfs:Literal .

Figure 10 below illustrates a visualization of the software-based artwork described above.

Figure 10 Visualization of the software-based artwork sample instance.

A sample dependency that involves the artwork would be the following:

    sw-dependency-2            rdf:type              OS-dependency ;
                               pk:from               brutalism-2007-executable ;
                               pk:to                 software-3 .

    brutalism-2007-executable  rdf:type              Software-entity ;
                               prov:wasAttributedTo  developer-2 .

    developer-2                rdf:type              prov:Person ;
                               foaf:name             "Arturo Diaz"^^rdfs:Literal .

    software-3                 rdf:type              OS ;
                               rdfs:label            "Linux Ubuntu"^^rdfs:Literal ;
                               dc:description        "Linux Ubuntu operating system."^^rdfs:Literal .

6.7.2 Navigating the Ontology

The ontology developed thus far, along with the created instances, can be viewed/navigated with the help of the “Ontology Browser”, a third-party web application that allows navigating around ontologies online. The “Ontology Browser” was developed by the University of Manchester, within the CO-ODE project (footnote 5); more information regarding the tool, along with related documentation, is available in [24]. For the purposes of the project, the “Ontology Browser” has been temporarily deployed on a private server [25] (footnote 6). Figure 11 displays a screenshot of the tool when viewing the sample artwork described above.

Footnote 5: CO-ODE project: http://www.co-ode.org/ and http://owl.cs.manchester.ac.uk/research/co-ode/

Footnote 6: Notice that, since the server is private, there will be occasions when it will be temporarily unavailable. In these cases, please try using the service at a later time.

Figure 11 The “Brutalism” software-based artwork viewed with the “Ontology Browser”.
7 Conclusion and Future Work

This document describes the first release of the LRM which, as planned, focuses on the notions of digital resource and dependency, seen from a static point of view. This static view introduces descriptive mechanisms that enable defining semantic classes for dependencies. This will serve as a foundation for domain-specific LRM-based models, for instance ones that are being developed in relation with the Art & Media and Space Science case studies. Research on the dynamic aspects of dependencies will follow, as planned, and will be based on this initial static model.

Specialised LRM models are being implemented in PERICLES (WP2 and WP4). Illustrative examples are also included in this deliverable: one instance from Space Science in the source archive [2], and a more developed instance for Art & Media in Section 6.7 (also available independently through an external server [25]). We expect to incrementally develop these instances as progress is made towards modelling the dynamics of change.

So far, the LRM view of entities and associated metadata is minimal, on purpose, and deliberately linked to the PROV model. However, we plan to investigate the interest of separating the current model into two parts: a core with its own notion of Entity, and another part able to make links to provenance-oriented models, à la PROV. This could be a step towards other intermediate-level layers (non-core, but more abstract than the domain-specific level) able to capture other conceptual approaches to model entities and metadata, such as the one adopted in continuous record keeping [4].

Dependencies, especially when they are more abstract (as opposed for instance to task-based dependencies such as those used in software systems [10]), may require the flexibility to represent the strength of the dependency, for instance via weights.
In association with (possibly context-sensitive) thresholds, it should be possible to define smart heuristics able to capture and address the inherent complexity of the weighted dependency graph (see for instance a preliminary proposal of how weights could be added to the LRM in WP4 [26]). Other techniques, typically based on linear algebra and graphical operators, could therefore be explored and applied to this numerical model of dependencies in order to extract useful information for risk assessment or predictive analysis of changes.

Another important aspect of future work relates to the WP5 descriptions of ecosystems. How the LRM intersects with those formal descriptions will be further explored as part of the cooperation between WP3 and WP5.

To foster collaboration with WP7 around a more operational view of the LRM, we proposed an initial description of possible LRM-based services, in the form of a technical description of a REST-based API (see [27]). This draft API will evolve depending on the impact of our future work (especially accounting for the semantics of change) and on the feedback from our partners.

Bibliography

[1] Yolanda Gil, Simon Miles; eds, “PROV Model primer”, W3C Working Group Note, 30 April 2013. http://www.w3.org/TR/prov-primer/
[2] LRM source code package (zip archive).
[3] McKemmish, Sue, Glenda Acland, Nigel Ward, and Barbara Reed. “Describing Records in Context in the Continuum: The Australian Recordkeeping Metadata Schema.” Archivaria 1, no. 48 (2006). http://journals.sfu.ca/archivar/index.php/archivaria/article/viewArticle/12715
[4] “Recordkeeping Metadata Schema (RKMS) (Information Technology).” Accessed June 24, 2014. www.infotech.monash.edu.au/research/groups/rcrg/projects/spirt/deliverables/austrkmsschemes.html
[5] Pearl J. “Causality”. New York: Cambridge, 2000.
http://en.wikipedia.org/wiki/Causality
[6] PREMIS Data Dictionary for Preservation Metadata (Official Web Site), The Library of Congress, USA. http://www.loc.gov/standards/premis/
[7] The Open Provenance Model Core Specification (v1.1). http://eprints.soton.ac.uk/271449/1/opm.pdf
[8] The OPM Provenance Model (web site). http://openprovenance.org/
[9] Tzitzikas, Yannis. “Dependency Management for the Preservation of Digital Information.” In Database and Expert Systems Applications, 582–92. Springer, 2007. http://users.ics.forth.gr/~tzitzik/publications/Tzitzikas_2007_DEXA.pdf
[10] Tzitzikas, Yannis, Yannis Marketakis, and Grigoris Antoniou. “Task-Based Dependency Management for the Preservation of Digital Objects Using Rules.” In Artificial Intelligence: Theories, Models and Applications, 265–74. Springer, 2010. http://users.ics.forth.gr/~tzitzik/publications/Tzitzikas_2010_SETN.pdf
[11] Conway, Esther, Brian Matthews, David Giaretta, Simon Lambert, Michael Wilson, and Nick Draper. “Managing Risks in the Preservation of Research Data with Preservation Networks.” International Journal of Digital Curation 7, no. 1 (March 12, 2012): 3–15. doi:10.2218/ijdc.v7i1.210. http://www.ijdc.net/index.php/ijdc/article/view/200/269
[12] Kirill Fakhroutdinov, 2013, “UML 2.5 diagrams (Dependency)”. http://www.uml-diagrams.org/dependency.html?context=class-diagrams
[13] Cox, Lisa, Harry S. Delugach, and David Skipper. “Dependency Analysis Using Conceptual Graphs.” In Proceedings of the 9th International Conference on Conceptual Structures, ICCS, 117–30, 2001. http://ceur-ws.org/Vol-41/Cox.pdf
[14] Zimmermann, Thomas, and Nachiappan Nagappan. “Predicting Subsystem Failures Using Dependency Graph Complexities.” In Proceedings of the 18th IEEE International Symposium on Software Reliability, 227–36. IEEE, 2007.
http://thomas-zimmermann.com/publications/files/zimmermann-issre-2007.pdf
[15] Yang, Yang, Marc Geilen, Twan Basten, Sander Stuijk, and Henk Corporaal. “Automated Bottleneck-Driven Design-Space Exploration of Media Processing Systems.” In Proceedings of the Conference on Design, Automation and Test in Europe, 1041–46. European Design and Automation Association, 2010. http://www.ics.ele.tue.nl/~tbasten/papers/date2010yy_final.pdf
[16] Penny, David A. “Structured Design.” Software Engineering Courses, Toronto. Accessed June 23, 2014. http://www.cs.toronto.edu/~penny/teaching/csc407-02s/lectures/04structured-design.pdf
[17] Bohner, Shawn A. “Extending Software Change Impact Analysis into Cots Components.” In Software Engineering Workshop, 2002. Proceedings. 27th Annual NASA Goddard/IEEE, 175–82. IEEE, 2002. http://www.computer.org/csdl/proceedings/sew/2002/1855/00/18550175.pdf
[18] Howard, Ronald A., and James E. Matheson. “Influence Diagrams.” Decision Analysis 2, no. 3 (September 2005): 127–43. doi:10.1287/deca.1050.0020. http://cs.ru.nl/~peterl/BN/influencediagrams05.pdf
[19] Wikipedia contributors, "Dependency network," Wikipedia, The Free Encyclopedia (accessed June 23, 2014). http://en.wikipedia.org/wiki/Dependency_network
[20] Dror Y. Kenett, Michele Tumminello, Asaf Madi, Gitit Gur-Gershgoren, Rosario N. Mantegna, and Eshel Ben-Jacob (2010), “Dominating clasp of the financial sector revealed by partial correlation analysis of the stock market”, PLoS ONE 5(12), e15032.
[21] Fisler, Kathi, Shriram Krishnamurthi, Leo A. Meyerovich, and Michael Carl Tschantz. “Verification and Change-Impact Analysis of Access-Control Policies.” In Proceedings of the 27th International Conference on Software Engineering, 196–205. ACM, 2005. http://web.cs.wpi.edu/~kfisler/Pubs/icse05.pdf
[22] Eric Prud'hommeaux, Gavin Carothers; eds. “TURTLE: Terse RDF Triple Language”, W3C Recommendation, 25 February 2014.
Accessed June 24, 2014. http://www.w3.org/TR/turtle/#sec-intro
[23] F. Baader, D. Calvanese, D. L. McGuinness, D. Nardi, and P. F. Patel-Schneider, editors, “The Description Logic Handbook: Theory, Implementation, and Applications”, Cambridge University Press, 2003.
[24] Ontology-browser: An OWL Ontology and RDF (Linked Open Data) Browser, available at: https://code.google.com/p/ontology-browser/, last accessed July 2014.
[25] PERICLES deployment of draft software-based-art domain ontology: http://kerveroc.gr:8888/browser/
[26] PERICLES, “Initial version of Environment Extraction Tools”, deliverable D4.1, July 2014.
[27] Xerox, “Draft proposal on a REST API to LRM based services”, March 2014.