Chapter 14: DARE also means dare

The institutional repository status in the Netherlands as of early 2006
Leo Waaijers
The context
Emotions are the catalyst, technology the enabler and SURF the stimulator of the DARE
Programme in the Netherlands.
During the decades before the DARE Programme, the way in which publishers used their
copyright-based monopoly caused emotions to run high, particularly among librarians. An
annual price rise for subscriptions averaging 11%, combined with strict limitations on how they
were used, was a constant source of irritation. Even though price rises have been cut in half in
the first years of this century, they are still a good deal higher than inflation. Moreover, this
figure is conditional upon there being no cancellations. The restrictions on use are still in place.
The DARE Programme was a welcome answer to the frustrations. Efforts could be directed
towards creating a new situation, one which would give back to the institutions control of their
own intellectual products while also ensuring better access to them. The grumbling at
publishers faded into the background, making room for an ‘open access’ motivation based on
the idea that unrestricted use of academic knowledge is of the utmost importance for progress
in teaching and research and is absolutely indispensable in a knowledge-intensive society. This
viewpoint is a prime source of inspiration for the DARE participants.
[Insert Figure 14.1]
Figure 14.1 Data layer and Services layer in OAI-PMH
(Figure labels. Services layer: Research (subject repositories, refereed portals, databases, collaboratories, (Open Access) journals, ...); Society (institutional windows, expertise, professional journals, personal Web sites, national windows, ...); Education (Virtual Learning Environments, Course Ware, Readers, ...). Harvesters. Data layer: institutional repositories (Leiden, DFG, Oxford, NIH, CNRS, MIT, ...).)
Without the Open Archives Initiative (OAI), DARE would have been inconceivable. This
design, published in Santa Fe in October 1999, goes beyond the use of the internet, web and
XML to practically reinvent academic communication. It makes a crucial distinction between an
open data layer that can be harvested anywhere in the world and a services layer based upon
it (see Awre, this volume). The data layer is in the public domain; the services can be
developed in accordance with a variety of business models (see Figure 14.1).
© Leo Waaijers 2006
After just over two years of testing and experimentation, version 2.0 of the OAI-PMH was
released on 14 June 2002. As it happens, on that same date in the Netherlands, the Board
of SURF Foundation approved the Action Plan for DARE, which is based on the OAI-PMH
technology, taking Dublin Core as the metadata format. Seldom can an application have been
adopted more quickly. Since then, version 2.0 has shown itself to be very robust and
is still in use all over the world.
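To make the protocol concrete: a harvester retrieves records from a repository by issuing plain HTTP requests that carry an OAI-PMH verb. The sketch below builds a ListRecords request asking for unqualified Dublin Core (metadata prefix oai_dc), the format DARE chose. The base URL and the helper name list_records_url are illustrative assumptions; the chapter names no actual endpoint or code.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the chapter does not name any repository base URL.
BASE_URL = "https://repository.example.org/oai"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH 2.0 ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        # 'set' is the optional OAI-PMH argument for selective harvesting.
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


print(list_records_url(BASE_URL))
```

A service provider would fetch this URL and parse the XML response; the data provider needs to know nothing about the services built on top of it, which is the separation of layers described above.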
A third crucial factor for the success of DARE is that selfsame SURF Foundation. SURF is an
independent organisation founded in 1987 by the joint Dutch universities to give shape to their
collaboration in academic computer centres. The organisation grew successfully and nowadays
also includes the major research institutes and all universities of applied sciences, giving a total
of nearly 60 participants. Its work is certainly no longer limited to networking and
supercomputing, but now covers the entire field of ICT in research, education and
management. An organisation that closely resembles it, and one with which SURF has regular
and intensive collaboration, is JISC in the UK; JISC is a committee of the Higher Education
Funding Council for England (HEFCE).
Once SURF had given its support to the Action Plan for DARE, additional financing was found
fairly quickly, and the programme could be launched on 1 January 2003. The DARE
Programme, whose name stands for Digital Academic Repositories, will run for a period of
four years with a budget of €5.9m (€2.2m from SURF, €2.0m from the Ministry of Education
and €1.7m from Pica).
The DARE Programme has seven areas of responsibility. Alongside general management and
communication, an important focal area was specifying the way in which Dublin Core would be
used within DARE. But the largest portion of the funding was reserved to develop decentralised
services, based on the requirement that these services would demonstrate the potential of the
OAI model and would thus help to populate the repositories. Another area of work was focused
on constructing a link with the e-Depot of the National Library of the Netherlands, so that the
material in the repositories could be assured of sustainable storage. Finally, some funds were
reserved for applications used at repositories in education.
The first year
When DARE first started, a number of universities had e-archives: digital storage places for
material such as dissertations or scanned documents from the university’s mine of information.
Some e-archives were left over from earlier projects, and they were not always properly
maintained. None of the archives was OAI-compliant. Most universities had nothing, or at least
no collection of digital documents that was publicly accessible.
The first activity was community building. All universities were asked to appoint a
representative (‘anchorperson’) who would be locally responsible for the implementation of the
DARE Programme. The joint anchorpersons advised SURF at a strategic level. Other aspects
of the consultative structure related to organisation, communication and technology. A
community site was set up at the outset and maintained well; as the programme
progressed, it played an increasingly important role as a platform for the exchange of
practices, news and opinions.
An important result early on was the document entitled “DARE use of Dublin Core metadata”
(Domingus and Feijen, 2004), in which the DARE partners reached agreement on the use of
the Dublin Core (DC) format. They decided to start with simple DC, but an optional DARE-qualified DC was defined as well.
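As an illustration of what a simple DC record looks like to a harvester, the following minimal sketch parses an oai_dc payload into a dictionary. The sample record (title, creator, identifier) is invented for the example, and the helper parse_simple_dc is hypothetical; only the two XML namespaces are the standard ones for Dublin Core and oai_dc.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"

# Invented sample record, for illustration only.
SAMPLE = f"""<oai_dc:dc xmlns:oai_dc="{OAI_DC_NS}" xmlns:dc="{DC_NS}">
  <dc:title>An example thesis</dc:title>
  <dc:creator>Jansen, A.</dc:creator>
  <dc:type>doctoral thesis</dc:type>
  <dc:identifier>https://repository.example.org/record/1</dc:identifier>
</oai_dc:dc>"""


def parse_simple_dc(xml_text):
    """Return a dict mapping each DC element name to a list of its values."""
    root = ET.fromstring(xml_text)
    record = {}
    for child in root:
        # ElementTree tags look like '{namespace}name'.
        namespace, _, name = child.tag[1:].partition("}")
        if namespace == DC_NS:
            record.setdefault(name, []).append((child.text or "").strip())
    return record


record = parse_simple_dc(SAMPLE)
print(record["title"], record["type"])
```

Values are kept as lists because every simple DC element is repeatable; a record may carry several creators or identifiers.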
The collaboration became much more material when the participants took on a collective
challenge: all partners would have an operational repository by 1 January 2004, one year after
the start of the DARE Programme. Operational was to mean harvestable, and the proof would
be furnished by a national site showing the joint result. A demonstrator was built for this
purpose, although it would only harvest metadata that were linked to an openly accessible full-text document. And indeed, DAREnet was officially and festively opened on 27 January 2004.
This remarkable result – the Netherlands was the first and for a long time the only country that
could boast a nationwide network of repositories – came through genuine teamwork, in which
the front runners took pride and pleasure in helping their more ‘needy’ colleagues, who in their
turn acknowledged that they would not have managed without this help.
The start-up phase of DARE has since been described in several articles (van der Vaart, 2004;
van der Kuil and Feijen, 2005).
The second phase
Although some smaller-scale service projects were started while DAREnet was still being built,
this process could only come to full fruition when the data infrastructure was operational. A
broad call for tenders went out in early January 2004 (total amount €2.4m) and 18 projects
were awarded in April. The yield reflected creativity and enthusiasm, but not always
experience. Opinions on the sustainability of the projects submitted were mild. Nevertheless,
new DARE services are emerging.
Alongside all these separate projects, the collective success of DAREnet left the participants
wanting more, and so another ambitious joint project was started in September 2004. Each DARE participant
would collect the complete works of 10 top researchers of its institute, would digitise the
material if necessary by means of scanning, and would place the complete result in the
institutional repository. Within DAREnet, this special collection would be shown as a separate
view under the name of Cream of Science. In this case – exceptionally – DAREnet would also
harvest metadata that were not linked to an open full-text document. This project was also a
success and on 10 May 2005, during a two-day international leadership conference on ‘Making
the Strategic Case for Institutional Repositories’ (Kircz, 2005b), the new site was opened by the
president of the Royal Netherlands Academy of Arts and Sciences, Prof. Frits van Oosterom
(van Oosterom, 2005). During the conference five DARE participants signed the Berlin
Declaration (2003), preceded earlier in the year by two and afterwards followed by two more.
This project taught us important lessons, the most interesting of which was undoubtedly the
enthusiasm of the researchers. We were not at all sure of this before we started, and we used
role-play to practise the counterarguments we would put forward against possible objections. It
turned out not to be necessary. Not only were almost all the researchers who were invited
happy to lend their cooperation; spontaneous registrations also started to flow in. The target of
150 participants was easily surpassed with 207 plus a waiting list of around 30. Recently the
University of Tilburg has decided to add 70 new authors and the University of Utrecht is adding
28 more of their top researchers.
Naturally, copyright was a tricky problem. An important mental breakthrough came when we
adopted the standpoint – with the exception of one category – not to develop a central policy,
but to rely on the party that is in fact the most important in this matter: the authors themselves.
This choice was prompted by the insight that the transfer of copyright to publishers is currently
undergoing rapid development. It is no longer necessary for OA journals, while for the
traditional subscription journals, policy varies per publisher (see the Sherpa/RoMEO list), with
many exceptions being made to this on an ad hoc basis. Surprisingly, Springer even gave
permission for open access to all Springer articles within the Cream of Science. The only
category for which we did define a central policy concerned material that had been obtained by
scanning paper articles from before 1997. Before then, the copyright in the articles had been
transferred exclusively for publication in a printed journal. The copyright in the digital version
therefore still rested with the author. Many authors were not aware of this. This view was
therefore given wide and fairly emphatic publicity by SURF, naturally among the authors, but
also among the publishers (who have never disputed the standpoint) and the libraries. The final
and remarkable outcome of this agile approach to copyright was that 60% of the complete
works of the Cream of Science could be presented as open full-text documents.
A third crucial lesson was about the need for optimisation, both locally and nationally. The OAI
protocol, in combination with the agreement to use simple DC as the bibliographic format, was
an inadequate foundation on which to build a robust, scalable and efficient service. Locally, a
workflow had to be set up that was compatible with the institutional environment, so that
documents ‘automatically’ find their way into the repository. An extra complication with Cream
of Science was the separate workflow required to deal with the scanned material. In harvesting
on a national scale, the variety of repository software used by the DARE partners (not only
DSpace but also ARNO, i-Tor and a number of local solutions), differences in architecture
(such as the use of sets) and the loose use of DC caused the central DARE team a good
deal of head-scratching (Feijen and van der Kuil, 2005).
Nevertheless, the site turned out to be a huge success – so huge that the day after the
opening, the large number of visitors (50,000!) brought the site down. A review of the
repository situation in thirteen countries (11 European countries, the US and Australia) showed that by mid-2005, DARE was still in a leading position internationally (van Westrienen and Lynch, 2005).
The final phase
The final phase of the DARE programme was defined in September 2005. Based on previous
experiences, it was decided to demonstrate that when the DARE Programme reached its
conclusion, the Netherlands would have an operational production environment of well-filled
institutional repositories. Concrete decisions taken for this included the following: the DOI
(Digital Object Identifier) will serve as identifier for the digital objects; a national system of DAIs
(Digital Author Identifiers) will be introduced. In order to show specific sub-collections within
DAREnet – such as Cream of Science – emphasis will be shifted from constructing dedicated
sets within the local repositories to the use of generic OAI filters. This means concrete uniform
agreements must be made about the use of DC within the DARE community. For instance,
dc:type will distinguish between ‘bachelor thesis’, ‘masters thesis’ and ‘doctoral thesis’. Metis
will play a central role in the entry of metadata at the universities. Metis is an application for the
bookkeeping in research projects. Its data are used to generate the annual report on scientific
research and to record progress of projects or production figures in research. Developed at a
single university, Metis has gradually come to be adopted by all universities in the Netherlands.
Just as in DARE, in Metis the metadata of academic publications are an essential part of the
system. For this reason a link from Metis to DARE was realised in 2005. In its final phase, the
DARE Programme will further attune the two systems.
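The shift from dedicated sets to generic filters mentioned above can be pictured as follows: rather than each repository maintaining a special set, the service selects records by an agreed metadata value. This is only a sketch under stated assumptions; the record dictionaries, the value 'article' and the helper filter_by_type are hypothetical, and only the three thesis values of dc:type come from the text.

```python
# Hypothetical harvested records, already reduced to plain dicts.
RECORDS = [
    {"title": "Thesis A", "type": "doctoral thesis"},
    {"title": "Thesis B", "type": "masters thesis"},
    {"title": "Article C", "type": "article"},  # 'article' is an assumed value
]


def filter_by_type(records, wanted_type):
    """Select records whose dc:type equals the agreed vocabulary term."""
    return [r for r in records if r.get("type") == wanted_type]


# A view restricted to doctoral theses, built by filtering rather than by a set.
doctoral = filter_by_type(RECORDS, "doctoral thesis")
print([r["title"] for r in doctoral])
```

Such a filter only works if every partner uses exactly the same terms, which is why uniform agreements on the use of DC are a precondition.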
But the final phase of DARE takes its name, HunDAREd Thousand, from a quantitative
challenge: by the end of 2006 the number of accessible full-text publications in DAREnet will
have risen by 100,000 to a total of 150,000. A related goal is for the number of doctoral theses
to grow from 6,000 to 10,000. These doctoral theses will be shown as a separate view within
DAREnet under the name of Promise of Science. To put these figures in perspective: the annual
scientific production at Dutch universities is 51,000 publications, 2,500 of which are doctoral
theses.
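The figures above imply a few further numbers that are not stated explicitly. The short calculation below derives them; the variable names are illustrative, and all inputs are taken from the text.

```python
# Figures quoted in the text.
target_total = 150_000        # full-text publications by the end of 2006
planned_increase = 100_000    # growth aimed for in the final phase
theses_now, theses_target = 6_000, 10_000
annual_publications = 51_000  # yearly output of Dutch universities
annual_theses = 2_500         # of which doctoral theses

# Implied starting point of the HunDAREd Thousand phase.
baseline = target_total - planned_increase

# The growth targets expressed in years of national output.
years_of_output = planned_increase / annual_publications
extra_theses = theses_target - theses_now
years_of_theses = extra_theses / annual_theses

print(baseline, round(years_of_output, 2), extra_theses, round(years_of_theses, 2))
```

In other words, the target amounts to adding roughly two years' worth of national publication output, and somewhat over one and a half years' worth of doctoral theses, within a single year.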
Future
When the DARE Programme is concluded, the Netherlands will have a robust but elementary
infrastructure of institutional repositories. At that time, there will no longer be any organisational
or technical obstacles to the inclusion of the complete annual academic production of the
Netherlands in the repositories and thus to making them available to numerous services in the
fields of research (journals, refereed portals) and education (learning environment) or for
society (practitioners, the general public). Promising spin-offs of the Programme are the
network of educational repositories LOREnet and the European DRIVER project (see Vogel
and Enserink, 2005).
[Insert Figure 14.2]
Figure 14.2 The research-publishing-funding cycle
A DARE follow-up programme will address the development of enhanced publishing: not only
the article itself, containing the research results, will be brought into circulation, but also the
underlying research data, models and visual elements. The metadata needed for this are still
being developed; not only will they relate to the contents, but also the structure, the rights and
the technology of the digital objects. Thanks to the development of the repositories,
researchers and management will be better and better able to live up to the responsibility of the
institutions with respect to making accessible the results of scientific research (paid for by public
money). The steps in this new publication process are shown in Figure 14.2, along with the
possible actors, such as libraries (Waaijers, 2005), in each step.