
Horová, Chvála: Netextové objekty jako součást databáze VŠKP
Non-text objects as a component part of
the ETDs database of AMU
Iva Horová
Radim Chvála
Brno 21. 10. 2009: Systémy pro zpřístupňování eVŠKP 2009
1. Procedure of creation of documents at AMU
2. Building of the repository
3. Modifications of the repository
4. Links of the repository to the environment
5. Practical demonstration
6. And what to do next?
1.1 The initial state at AMU, similar to other places
Production of text as well as non-text materials:
• Bachelor projects
• Master's theses
• Dissertations
• Seminar papers
• Year-end projects
• Semester projects
• And other works (teaching materials)
1.2 Comparison of the situation at AMU with other universities
Common situation at other universities
Graduation from the studies – ONE ETD
• Text part (obligatory)
• Various appendices
• Title, supervisor, opponents, annotation, grading, …
1.2 Comparison of the situation at AMU with other universities
Situation at AMU
Graduation – “qualification performance”, i.e. MORE THAN ONE work
• Text part (obligatory), with various appendices: title, supervisor, opponents, annotation, grading, …
• “Qualification performance” 1, with various appendices: another title, another supervisor, another opponent, another annotation, different grading, other performers, and so on
• “Qualification performance” 2, with various appendices: another title, another supervisor, another opponent, another annotation, different grading, other performers, and so on
1.2 Comparison of the situation at AMU with other universities
Specifics of the final works at AMU
EXAMPLES:
• theoretical work + script of a play (text)
• theoretical work + film
• theoretical work + sets of photographs
• theoretical work + roles in theatre plays
• theoretical work + interpretation performance
• theoretical work + pedagogical output
• theoretical work + stage design documentation
differing technical quality
large volumes of data
1.2 Comparison of the situation at AMU with other universities
Working classification of ETDs at AMU
KOS: basic types of ETDs:
• Theoretical, i.e. text-type “main” work – type A
• Play, script (text-type, but not the “main” one) – type B
• Film, video – type C
• Interpretation performance – type D
• Composition – type E
For each type:
• a separate form
• a SEPARATE metadata record is created
ASSIGNMENT:
To build for AMU an institutional repository providing
access to the works and having some archiving functions.
The aim is to create a tool to be used for
quick search of the documents and easy
assessment of their attractiveness and
availability.
2. Building of the repository
• Internal and external legislation
• Selection of types of files to be made
accessible
• Selection of SW for the repository and its
modification
• Workflow
2.1 Building of the repository – legislation
External legislation
Act No. 111/1998, § 47b – the amendment prescribes
obligatory publishing of ETDs:
• AMU Rector’s Decree No. 2/2006 - On publishing final works at
AMU
• AMU Rector’s Decree No. 3/2006 – Methodology of processing,
storage and making the ETDs accessible
• AMU Rector’s Decree No. 4/2006 – Directive on creation and
formal layout of ETDs
• Aspect of copyright
• Descriptive metadata – standard MS-EVSKP (electronic theses)
• Standards of the bibliographic description (library)
2.1 Building of the repository – legislation
Internal legislation
Aspect of copyright:
• AMU enters into licence agreements with authors
• There are various degrees defined.
• The author provides rights for a specific work
• The rights are provided at the moment of submitting the
work to the information system (KOS)
• The author has the right to refuse access to the work –
then the work will only be archived
• Specific rights provided are also displayed in the
repository
2.1 Building of the repository – legislation
Other aspects – internal regulations of AMU
- Library catalogue is the entry point for users
- The system must offer:
• exports to the library catalogue as well
• links from the catalogue to the repository
• search of information about related documents
• comfort for the „non-standard“ users
- The text work is “superior” to other works although it is
not decisive for the qualification
- Hierarchy of records (mother, daughter)
- Completion of the metadata and the bibliographic
description
2.2 Selection of suitable SW
In 2008 the DSpace system was adopted
Advantages:
• Not costly (open source)
• Easy installation and administration, modifications, localisation
• Support of standards (XML, DC, METS...)
• Support of interoperability – OAI-PMH server
• It supports free as well as secured access (LDAP, …)
• Efficient search mechanism, as well as full text
• AMU is not the only university involved; many other
universities also take part (web, meeting at VŠB-TUO, ...)
Persistent identifier - Handle
2.3 Workflow
Workflow metadata
• Starting point for the collection – Studies information system
– KOS
• Export of the metadata, creation of a record in Dspace
• Assignment of the persistent identifier Handle
• Export to Tinlib
• Completion of the subject description in Tinlib (subject
categories, key words, …) - librarians
• Adding (import) of the subject description to Dspace
• Making it accessible for harvesting (currently for theses.cz MU)
2.3 Workflow – selection of formats
Formats of digitised documents
“VIEW” FORMATS:
• Text, static image and combined documents: PDF/A-1a
• Sound documents: mp3
• Video records: flv, 720 x 576 px D1-PAL, 1500 kbps
Full versions of the non-text works will be available at the departments
The selection of the formats changes – i.e. Decree of the Government of
the Czech Republic No. 1338, of 3 November 2008
2.3 Workflow
Workflow of full texts (in cooperation with
the Czech Technical University in Prague)
• Conversion of full texts to the defined formats:
• texts, static images – PDF/A-1a - (standardisation necessary
for full text search)
• tool: print2pdf – S602
• Audio – mp3 – not a problem
• Video – FLV – problematic in general; AMU is still considering whether to
use it, and FAMU does not want to accept it
“YouTube” - we follow the trends
• Upload to Dspace – currently manually
• Making the works accessible in accordance with the licence
agreement in Dspace
Full versions are not provided outside AMU
3. Modification of the Dspace repository:
• Structure of the metadata
• Links between the existing records
• Extraction for the full text search (pdf)
• Other modifications (layout, ...)
3.1 Modification of Dspace – structure of metadata
The metadata set can be extended during operation
1. NameSpace: Dublin Core from the basic installation
2. NameSpace: AMU – elements required by MS-EVSKP that Dublin Core lacks:
a. Author ID
b. Author's date of birth
c. Code of the department
d. Name of the department
e. ID of studies to which the work belongs
f. Type of work (forms A, B, C)
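A record that combines the two namespaces might be modelled roughly as follows. This is only a sketch: the `amu.*` field names and all values are hypothetical placeholders for the six added elements (a to f), not the actual names registered in the AMU repository.

```python
# Sketch of one metadata record that uses two namespaces.
# The "amu.*" names and all values are hypothetical; only the list of
# added elements (a-f) comes from the slide.
record = {
    "dc.title": "Example theoretical work",
    "dc.contributor.author": "Example Author",
    "amu.author.id": "12345",
    "amu.author.birthdate": "1985-01-01",
    "amu.department.code": "101",
    "amu.department.name": "Example department",
    "amu.studies.id": "S-0001",
    "amu.worktype": "A",
}


def split_by_namespace(rec):
    """Group fields by the namespace prefix before the first dot."""
    grouped = {}
    for key, value in rec.items():
        namespace, _, element = key.partition(".")
        grouped.setdefault(namespace, {})[element] = value
    return grouped


namespaces = split_by_namespace(record)
```

Keeping the local elements in their own namespace leaves the Dublin Core part of the record untouched for harvesters that only understand DC.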
[Diagram: DC, AMU, MS eVŠKP]
3.1 Modification of DSpace – structure of metadata – additional
component parts
3.2 Modification of DSpace – links among related records
There are several possibilities:
- To create a virtual object – “final part of the studies”, a
fictitious record with a URI, and to link the related objects to it
- To use the relations “superior” / “subordinate” work:
“has a part” / “is a part of”
3.2 Modification of Dspace – links among related records
The dc.relation element is used, with the qualifiers
hasPart / isPartOf
Text part (A) – SUPERIOR RECORD
– dc.relation.hasPart – „Has a part“
Other types (B, C) – SUBORDINATE RECORDS
- dc.relation.isPartOf - „Is a part of“
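The superior/subordinate linking can be illustrated with a small sketch. It assumes records are plain dictionaries and that each record is identified by its handle; the handle values and the `link_records` helper are made up for illustration and are not part of DSpace.

```python
# Sketch: link a superior text record (type A) to its subordinate
# records (types B, C) through dc.relation.hasPart / dc.relation.isPartOf.
# Handles and the helper name are hypothetical.
def link_records(superior, subordinates):
    """Add reciprocal hasPart / isPartOf relations in place."""
    for sub in subordinates:
        superior.setdefault("dc.relation.hasPart", []).append(sub["handle"])
        sub.setdefault("dc.relation.isPartOf", []).append(superior["handle"])


text_work = {"handle": "hdl:0000/1", "worktype": "A"}
script = {"handle": "hdl:0000/2", "worktype": "B"}
film = {"handle": "hdl:0000/3", "worktype": "C"}

link_records(text_work, [script, film])
```

After the call, the text work lists both subordinate handles under `dc.relation.hasPart`, and each subordinate record points back to the text work under `dc.relation.isPartOf`.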
3.2 Modification of Dspace – links among related records
Text work (A) – superior record
3.2 Modification of Dspace – links among related records
Other works (B, C) – subordinate record
3.2 Modification of Dspace – links among related records
Other works (B, C) – subordinate record
Dspace – browse:
3.3 Modification of DSpace – extraction of text for full-text search
Mediafilter: pdfBox pdfToText
3.3 Modification of DSpace – other modifications, English version
4. Connection of the repository to its environment
Cooperation with other systems
• Interoperability – OAI-PMH
• Modifications for the Tinlib library system
• Making the metadata accessible for other
harvesters
4.1 Interoperability - OAI-PMH
Harvesting (currently) for “theses.cz” (MU)
DSpace has its own OAI server (supporting the OAI-PMH
protocol), which exposes the metadata in Dublin Core
• The Java plugin was modified to process the
added metadata (MS eVSKP)
• The modification is in the permanent part of the code,
so it will not be affected by later upgrades
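For orientation, a harvester such as theses.cz would typically start from an OAI-PMH `ListRecords` request. The sketch below only builds such a request URL; the base URL is a placeholder, not the real AMU endpoint, and `oai_dc` is the Dublin Core prefix that every OAI-PMH server must support.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        # Optional selective harvesting of one set (collection).
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Placeholder base URL, assumed for illustration only.
url = list_records_url("https://repository.example.org/oai/request")
```

A harvester that understands the extended MS eVSKP metadata would pass a different `metadata_prefix`; the prefix name used at AMU is not given in the slides.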
4.2 Modifications for Tinlib
The XML file exported from DSpace is converted by
means of XML/XSLT technology (with the SAXON processor)
into an import file for Tinlib
Based on the value of the element worktype
<dcvalue element="worktype" qualifier="none">A</dcvalue>
Text work (A) – SUPERIOR RECORD -> Monograph
Non-text work (B, C, …) – SUBORDINATE RECORD -> Article
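The mapping rule itself is simple and can be sketched in a few lines. Note that the slides describe an XSLT transformation run with SAXON; the Python below is a stand-in that only illustrates the decision on the `worktype` value, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def tinlib_record_type(dcvalue_xml):
    """Return the Tinlib record type for a worktype <dcvalue> element."""
    elem = ET.fromstring(dcvalue_xml)
    if elem.get("element") != "worktype":
        raise ValueError("not a worktype element")
    worktype = (elem.text or "").strip()
    # Type A (text work) is the superior record; all other types
    # are treated as subordinate records.
    return "Monograph" if worktype == "A" else "Article"


sample = '<dcvalue element="worktype" qualifier="none">A</dcvalue>'
record_type = tinlib_record_type(sample)
```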
4.3 Making the metadata accessible for other repositories
(Charles University)
DSpace contains a module to display the metadata in the
METS/MODS format, wrapping related records in a container
A test with University Computer Centre of Charles University - DigiTool
And now the
practical
part …
UPLOAD OF RECORDS ABOUT ETDs
Manual processing
• Assignment of work – department
• Details about the work – student
• Full text – student
• Study Information System KOS
• PDF/A file
• Repository of AMU -> DSpace
• Harvest OAI-PMH
• Library system: librarians -> Tinlib
• Library system: readers -> Tinweb
• National registry of ETDs: “theses” MU Brno -> the public
SEARCH RECORDS
User
• National registry of ETDs – THESES: everything from the universities in the Czech Republic
• Repository AMU – DSpace
• Library system – Tinweb
• Everything from AMU
• Full version – text/view
Examples
on line…
Next…
And what to do next:
In cooperation with the ETDs Working Group and the
DSpace community:
- Terminology
- Archiving – the technical part
- Relations to be incorporated into the metadata
standard MS-EVSKP
DSpace community:
- Access rights – structure
- Displaying the hierarchy of records
Terminology
Issues to be discussed
For NON-TEXTS – FULL VERSIONS?:
• Creative artistic activity
• Work of art
• Practical part
and so on.
For the WHOLE:
• Qualification performance
• Assignment for the final work (ETDs)
Will these records be of interest to
theses.cz?
Thank you for your
attention
Any questions?