Running head: DIGITAL REPOSITORY

Introduction

Advances in technology have made obsolete the dreaded walk across campus to the library to scour books and periodicals in the hope of finding relevant information for a research topic. Now a trip to the computer lab, or logging on from home, can give you access to digital repositories holding information from around the world. Many repositories are “open,” enabling nearly unrestricted access to information; others require membership. Either way, digital repositories have enabled libraries and historical societies to preserve the past in a format that will not deteriorate from exposure. Rare artifacts can now be stored safely in vaults while the information they hold remains available through digital media. In fact, that information is now available globally without anyone stepping on a plane. Digital repositories are not without potential faults, however, as they try to keep up with demand and technology.

In this chapter, we give an overview of digital repositories associated with open education environments. The chapter covers basic descriptions of digital repositories, current uses, benefits, challenges and issues, and solutions. We also discuss how repositories are employed in open access environments.

Background Information

Institutions of higher education have produced numerous intellectual outputs, and the volume of these outputs is now increasing much more rapidly than in the past (Heery & Anderson, 2005; McCord, 2003). Large collections of deposited content are stored in information environments called “repositories.” Repositories are therefore important to institutions as a means of collecting and managing intellectual assets as part of their information strategy (Heery & Anderson, 2005; Mark, 2008).

Digital and Nondigital Repository

Traditionally, libraries have been the typical solution for data preservation (Mark, 2008).
This type of repository contains diverse types of data such as books, publications, journals, audio/visual materials, and other instructional materials. Libraries store these data and share them with people who can visit the library physically (Duncan, 2003; Hayes, 2005). Libraries have historically offered repositories of content in paper form. Physical repositories, however, have limitations: they must store large amounts of material, they spend much of their budget on maintenance, and they allow only limited access to their holdings (Mark, 2008; Saleh, Adly, & Nagi, 2005). One solution to these limitations is digitalization, which preserves data more efficiently and gives users easier access.

Basic Description of Digital Repository

Digitalized data require new storage space, so institutions must take responsibility for maintaining digital repositories. A digital repository can be defined as a digital storage space where digital contents and resources are saved (Hayes, 2005). Beyond storage, a digital repository is used to search for and retrieve data for further use, because it offers a convenient digital database environment in which digital assets can be stored, searched, recalled, and reused (Hayes, 2005; Zainab, 2010). Thus, the digital repository can be used by many different communities, such as institutions and organizations, that need to store data in a digital infrastructure for diverse purposes. Stored contents vary with each community’s purposes. In institutions, the repository typically includes digital textbooks, journal articles, research reports, theses and dissertations, and instructional materials (see Figure 1 for a representative structure of a digital repository and for how digital resources are accessed and managed).

Figure 1.
The structure of digital repository

Characteristics and Examples of a Repository

The advent of digital technology and high-speed networks has led to widespread use of different types of data storage (Saleh, Adly, & Nagi, 2005). What distinguishes repositories from other collections, such as databases, digital libraries, institutional repositories, and digital archives? The digital repository can be differentiated from other digital collections by the following significant characteristics (Heery & Anderson, 2005):

Digitalization: The contents must be digitalized within the arranged data format.
Comprehensive: The repository contains different types of digitalized contents and metadata.
Manageable: The repository must offer a set of basic services: access, search, use, store, and manage.
Preservation: The repository must be sustainable for preservation.
Open access: The repository must provide its contents to the public as open access.

However, not every digital repository satisfies all of these conditions. Each repository has different goals, purposes, functions, and software, as shown by the examples of repositories in Table 1 below (Green, 2005).

Table 1. Digital repository examples

Massachusetts Institute of Technology
Features: Contents of the repository are identified and stored by communities such as departments, laboratories, and research centers. The primary goal is to “capture, distribute and preserve the intellectual output of MIT and to offer the opportunity to provide access to all the research of the institution through one interface.”
Software/Application used: DSpace
Contents: Materials must be education-oriented, in digital format, and produced by an MIT faculty member.
Access URL: http://dspace.mit.edu/

California Digital Library: eScholarship Repository
Features: Faculty, staff, and students at the University of California can deposit contents. People who have permission from the publisher can also place data in the repository. The guidelines explain the strategy to “influence scholarly communication and provide a publishing platform for electronic journals, and leverage library buying power.”
Software/Application used: Berkeley Electronic Press software
Contents: Research or scholarly output such as journals, books, working papers, conference proceedings, and seminar/paper series.
Access URL: http://escholarship.org/

Georgia Institute of Technology: SMARTech Repository
Features: The primary purpose of the SMARTech repository is to preserve and provide access to intellectual output in order to support the research and educational endeavors of the Georgia Tech community and of other scholars all around the world. Contents in SMARTech are open to the public, but requests for items are reviewed in some cases.
Software/Application used: DSpace
Contents: The SMARTech repository contains more than 30,000 items, including theses and dissertations from Georgia Tech.
Access URL: http://smartech.gatech.edu/

University of Michigan: Deep Blue
Features: Faculty, staff, and students at the University of Michigan can deposit their works and scholarly outputs. Guests who want to access data need to request a “Friend” account and must be approved before gaining access.
Software/Application used: DSpace
Contents: Contents in Deep Blue must be educational, artistic, or research-oriented. The items consist of formal and informal publications, books, theses/dissertations, data sets, computer programs, audiovisual materials, and learning objects.
Access URL: http://deepblue.lib.umich.edu/about/index.html

Harvard University: Digital Repository Service
Features: Any Harvard organizational unit, and individuals in the Harvard community with an organizational sponsor, are able to deposit objects. The repository is not intended “to function as a record management system or an institutional repository” (i.e., the repository does not capture all of the research outputs).
Software/Application used: The DRS batch loader and Batch Builder application
Contents: The DRS allows the deposit of “Library-like” objects (intended to support research and pedagogy, or materials with persistent value).
Access URL: http://hul.harvard.edu/ois/systems/drs/

Digital Repository and Open Access

Benefits of Digital Repositories

Digital repositories offer various benefits to researchers, instructors, students, and institutional communities, and they have great potential for data preservation. Their practical benefits include the following:

Retaining and managing intellectual assets: The most important function of digital repositories is to retain and manage digital resources efficiently.
Ease of access: Digital resources in a repository can be reached through client applications as well as web browsers, which provide quick, easy, and remote access.
Open access: Digital repositories offer access to everyone who may be interested in their resources. Because the resources are open to service providers such as Google Scholar, they can be reached by all Web users.
Use, re-create, and reuse: Data in the repositories can be used, modified, and reposted. This facilitates knowledge construction and contributes to new research and education.
Wide range of content: Digital repositories contain various types of materials (e.g., audio, video, and images) as well as text-based files (journals, theses, dissertations, magazines, etc.).
Permanent data: Digital repositories offer permanent URL naming, which minimizes broken-link problems.

Digital Repository in Open Access

Higher-education institutions are employing digital repositories in order to manage and share their educational resources for teaching and learning (Margaryan & Littlejohn, 2008). The advantage of using repositories is that they help institutions store intellectual assets and share them with others.
One of the more important purposes of employing a digital repository in an institution is to share intellectual products with the public in support of the current worldwide open-access movement. Institutional repositories may increase opportunities for efficient use of such data and encourage collaboration with other groups. Thus, repositories can extend the possibilities for content use in open education. Here are some possible uses of digital repositories.

Example 1. Assume that a teacher is looking for a geology lesson plan about how the Richter scale is determined, how it is classified, and how frequently earthquakes happen. The teacher finds a lesson plan in a digital repository and skims through it. She then realizes that the lesson plan should include recent data on how frequently earthquakes happen. She obtains recent data from resources in another repository that she can access without permission, adds them to the lesson plan, and uploads the file to the digital repository for further use.

Example 2. Assume that a student is looking for a simulation illustrating the different sidereal periods of the planets in the solar system. The student finds that BBB science university has abundant materials about astronomy and tries to access the digital repository at that university. Although he is not a student at BBB university, he can reach the information without permission because the repository at the science university is shared with the university where he is enrolled.

These examples of using digital repositories in teaching and learning illustrate only a small portion of their potential. Repositories can be used for sharing information and encouraging collaboration in diverse communities. Thus, digital repositories will enable open-education environments for teaching and learning.
Users of digital repositories include academic institutions, researchers, the scientific community, governments, businesses, educators, historians, and any other group that wants to preserve information, new or old, from around the globe. These groups, as defined by the OAIS Reference Model, are the “designated community” for a digital repository: the identified group of potential Consumers who should be able to understand a particular set of information (Consultative Committee for Space Data Systems, 2002). Each community can bring together, from a global audience, individuals with the same interests, skills, or even hobbies. Because each community’s needs are unique, the motivation for development and the governing policies of their repositories can differ. The key services provided by a repository may vary functionally to include: “Enhanced access to resources; New modes of publication and peer review; Corporate information management (records management and content management systems); Data sharing (re-use of research data, re-use of learning objects); Preservation of digital resources” (Heery & Anderson, 2005).

There are many repositories, and their main purposes and missions vary. The two websites below provide vast searchable listings of open repositories.

Registry of Open Access Repositories (http://roar.eprints.org). This website provides a searchable list of open repositories. Search filters include content, country, type, and software. “The aim of ROAR is to promote the development of open access by providing timely information about the growth and status of repositories throughout the world. Open access to research maximizes research access and thereby also research impact, making research more productive and effective” (eprints, 2011).

SPARC Collected Repositories. The SPARC collection lists several sites that provide directories of repositories.
Repository 66 (Repository66.org). “This site is a mashup of data from ROAR and OpenDOAR overlayed onto Google maps” (Lewis, 2007). The site displays a Google map showing the location of each repository, which gives an interesting global view.

Key Issues

Institutional repositories have emerged over the last decade as a new strategy that allows universities to expand their capabilities and increase their value to the educational community. Today’s challenges include advancing digital repositories from a point where needs are recognized and defined but technical approaches are, at a minimum, only superficially mapped out (Lynch, 2003).

As institutional interest in forming digital libraries has increased, so has interest in defining best practices and guidelines for digital repositories. A review of The National Archives’ Standard for Record Repositories; the Consultative Committee for Space Data Systems’ Reference Model for an Open Archival Information System; and the Research Libraries Group’s Trusted Digital Repositories: Attributes and Responsibilities reveals common elements that suggest uniform criteria for developing a successful digital repository: integrity of the digital repository, controlling who has access and circulating the use of the repository, remaining current with technology and digital formatting, and navigating copyright laws.

Integrity of Digital Repositories

A successful digital repository must have integrity; its users must be able to rely upon it to retain the information submitted and to provide usable material when queried. Academic institutions, national archives, and museums, whose mission includes preserving our history, art, scientific data, and much more, as well as the educators, students, and casual information seekers who deposit, use, and resubmit information, all count on digital information remaining uncorrupted over time.
As books and artifacts are placed in deep storage, and as new digital information and history are created and held in digital repositories, how do managers ensure that the information is not corrupted? Even more critical are “born-digital” materials, whose value rests solely on their existence as information artifacts, because being digital is not only their method of access but also their identity (Research Libraries Group (RLG), 2002). The existence of these “born-digital” materials lies in some memory chip in a server. If the original data are lost or corrupted, that piece of history or data is essentially gone.

In addition to protecting data from corruption, digital repositories have the responsibility to ensure that the data are retrievable in some usable format. Information stored in a repository can be saved in many formats. Each format has different characteristics that affect how the information can be used and how much of it users of the repository can retrieve. This may vary depending on the community’s needs. For example, a repository may house a report on climate trends and also store the data that accompany the report. However, depending on the format chosen to store the report and data, only the report may be retrievable by users. Decisions about the level of preservation must meet the needs of both the repository and the user.

Digital repositories use metadata, which is usually defined as “data about data.” Repository managers must decide the level of access users will have to the metadata and to the original source information. They must also decide how to ensure that proper metadata is used to promote circulation of the digital materials.
There are many different standards for metadata, and most standards and best practices provide a placeholder in the metadata for authentication information; unfortunately, not all of them require its use (Research Libraries Group (RLG), 2002). Digital information can be fragile, and many fears about digital material therefore focus on how easily the information can be changed, duplicated, or corrupted. The use of any kind of authentication information in digital material would enable such changes to be identified. In addition to metadata fields, it is possible to digitally fingerprint an item to verify its origin, or to embed check numbers that trigger notification of changes.

LeFurgy (2002), of the U.S. National Archives and Records Administration, noted the following in his article on levels of service for digital repositories:

Current research indicates that digital materials can be managed independent of specific technology. “Persistence” is the term used to indicate the degree to which this is possible. For complete persistence, materials must adhere to strict conditions regarding their construction and description. These conditions make it possible to use technology to dynamically recreate a digital object based on explicit and consistent rules defining the object's content, context, and structure. But in a world where few standards govern the technical construction of a digital item (a report can exist in any one of a dozen common file formats) and fewer still govern how an item is described (the report may or may not identify an author, date of issue, or other descriptors), it is realistic to expect that many materials will not fully meet the rigorous conditions for persistence (LeFurgy, 2002).

Keeping up with Technology

Technology is continually evolving. PDF is a commonly used document format, but will that change?
In the early days of PDF there were several competing formats. What if a new format replaces it, such as DjVu, which is promoted as an open alternative to PDF? DjVu was created in 1996 at AT&T Labs as a file format for capturing printed material as high-quality digital copies, and it was targeted at libraries and archives for preserving their books and documents (DjVu). This standard has evolved into the .djv/.djvu format, which has had growing success and penetration in the online world for eBooks, catalogs, and image sharing (DjVu, 2011). A search on Wikipedia for electronic book formats shows tables of 18 file extensions and just as many readers. Which one will win the battle, and which will become obsolete?

The OAIS model recognizes that no matter how well an OAIS maintains its current holdings, it will eventually need to migrate much of its holdings to different media and/or to a different hardware or software environment to keep them accessible (Consultative Committee for Space Data Systems, 2002). The challenge is deciding which digital formats a repository should use to achieve the longest life and reduce the need for migrations. In designing the repository, and as the community manages and collects materials, file formats that allow for the persistence of the repository need to be used. One of the benefits of bringing digital assets into a managed repository framework is the promise of future-proofing against technology obsolescence. However, current repository software does not fully support the preservation process, and staff with the necessary skills to undertake this work are few and far between (Heery & Anderson, 2005).
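
A first step toward the format decisions and migrations discussed above is knowing which file formats a repository actually holds. The following Python sketch is illustrative only and is not taken from any repository software: the function names and the contents of WATCH_LIST are hypothetical, and which formats deserve review is a local policy decision rather than a standard.

```python
from collections import Counter
from pathlib import PurePath

# Hypothetical watch list: extensions this repository has chosen to review.
# The entries are placeholders for illustration, not a judgment on any format.
WATCH_LIST = {".djvu", ".djv", ".wpd"}


def format_inventory(filenames):
    """Count holdings by lowercase file extension ("(none)" if there is no extension)."""
    return Counter(PurePath(name).suffix.lower() or "(none)" for name in filenames)


def migration_candidates(filenames, watch_list=WATCH_LIST):
    """Return, in input order, the files whose extension is on the watch list."""
    return [name for name in filenames if PurePath(name).suffix.lower() in watch_list]


if __name__ == "__main__":
    holdings = ["thesis.PDF", "map.djvu", "notes.txt", "scan.DjVu", "README"]
    print(format_inventory(holdings))
    print(migration_candidates(holdings))
```

An inventory of this kind only identifies candidates by file extension; deciding whether and how to migrate them still depends on the repository's preservation policy and on inspecting the files themselves.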
Even with strict rules and guidelines, as technology advances the repository needs to incorporate a “technology watch” to manage risk as technology evolves, to protect preserved materials, and to provide continuing access and updated methods of access (Research Libraries Group (RLG), 2002). Thus, at some point in the life of a digital repository, its managers will need to consider migrating information in the repository to keep pace with technology and remain a viable tool for users. At these points they must also keep in mind that migration puts their information at risk of corruption. Additionally, what information do they migrate? All of it? Is it all still used? Do they migrate only the parts in use and retain the rest in some format for preservation?

Ownership

Ownership and the rights connected to intellectual property can become an issue of concern in the long-term preservation of material. Digital content can be put at risk both by the legal restrictions on the information and by the restrictions on the software itself. Intellectual property includes both the information and the software, and both are covered under copyright. Copyright can limit access to information and the ability to change or reuse it. Additionally, because digital information is so easy to access, it is easier to copy and possibly to infringe copyright restrictions. For many digital materials that are considered “integral” to research collections and archives, access is provided through licensing arrangements. An organization may own the right to access material or use the software for a specified period, but there is often no guarantee of rights beyond the terms of the license (Research Libraries Group (RLG), 2002). If a repository loses the right to use the software needed for digital content in its holdings, the longevity of that content is unclear.
Gathering rights metadata and including it in an institutional information system or database will allow users with some basic copyright understanding to make thoughtful judgments about how the law may affect use of the work in accordance with a legal exception (Whalen, 2008).

Solutions – Governing policies, cloud computing and cooperation

Repositories are amazing resources, literally placing a world of information and knowledge at one’s fingertips. In spite of the innovation and progress institutions have made in developing digital repositories, new hurdles appear on the horizon as technology advances and these digital collections grow. As a repository defines policies about its “designated community,” it should also define the materials to be collected. Knowing that the repository will grow from its inaugural deposit, managers need to define the levels at which information will be stored. To provide effective resource discovery and preservation across distributed repositories, there must be agreement on an overall technical architecture: metadata standards, an agreed method of linking to digital resources, and common resource discovery protocols (Heery & Anderson, 2005).

Institutions will need to decide where their actual digital holdings will reside. Some may choose to use on-site servers; however, those are limited in data storage space and are exposed to the risk of fire or disaster in the space holding the physical server. As another option, with the further development of “cloud” computing, repositories can have nearly unlimited digital storage space, as well as the peace of mind of knowing that their information is secure. There are now third-party vendors, such as DuraCloud, which released its service in 2011.
“The service builds on the pure storage from expert storage providers by overlaying the access functionality and preservation support tools that are essential to ensuring long-term access and durability” (DuraSpace). The service uses other cloud storage services and maintains the information on each of those clouds, which gives the customer layers of backup for their information. When an item is updated, the service propagates the change to all the other storage locations. Every cloud storage device has a physical location at some point, and that physical location is vulnerable. By using multiple cloud storage services, which DuraSpace states are both commercial and noncommercial, the service duplicates the information in such a way that a catastrophic loss of one storage device would not wipe out the information. This, of course, relies on the unlikelihood of a single event taking out all servers.

LeFurgy (2002) sums up the future needs of digital repositories: “Archivists, librarians, and others with an interest in preserving and making available digital information face an impending paradox.” This stems from the prospect of developing solutions for long-term management of digital records, publications, and other objects (LeFurgy, 2002). A time will come when not all digital information can or will be saved in easily accessible formats. Space and the organization of information need to be considered, and schemes need to be developed and put in place to prevent redundancy of information. Collaboration among the institutions managing these repositories, to share information and share the workload, will aid continued success and development (Heery & Anderson, 2005). This coordination, along with the interoperability of systems, will facilitate innovative work and the future sharing of information.
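
The combination of replicated storage described above and the digital fingerprinting mentioned under Integrity of Digital Repositories can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of how DuraCloud or any other service is implemented: local directories stand in for independent storage providers, and the function names (sha256_of, replicate, audit, repair) are hypothetical.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def replicate(source: Path, stores: list[Path]) -> str:
    """Copy source into every store; return the fingerprint recorded at deposit."""
    fingerprint = sha256_of(source)
    for store in stores:
        store.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, store / source.name)
    return fingerprint


def audit(name: str, fingerprint: str, stores: list[Path]) -> dict[str, bool]:
    """Report, per store, whether the copy exists and still matches the fingerprint."""
    report = {}
    for store in stores:
        copy = store / name
        report[store.name] = copy.exists() and sha256_of(copy) == fingerprint
    return report


def repair(name: str, fingerprint: str, stores: list[Path]) -> bool:
    """Overwrite damaged copies from an intact one; return False if none is intact."""
    status = audit(name, fingerprint, stores)
    good = next((store for store in stores if status[store.name]), None)
    if good is None:
        return False
    for store in stores:
        if not status[store.name]:
            shutil.copy2(good / name, store / name)
    return True


if __name__ == "__main__":
    # Directories under a temporary folder play the role of separate providers.
    root = Path(tempfile.mkdtemp())
    item = root / "report.txt"
    item.write_text("climate trends")
    providers = [root / "store_a", root / "store_b", root / "store_c"]
    recorded = replicate(item, providers)
    print(audit(item.name, recorded, providers))
```

Storing the fingerprint at deposit time is what makes later corruption detectable; keeping more than one copy is what makes it repairable. The sketch cannot recover an item if every copy is damaged, which mirrors the caveat above about a single event taking out all servers.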
Summary

As more information is digitized and collections grow, digital repositories will need to develop further to meet increased needs. Cloud computing is taking digital repositories to another level, providing effective backup of information and enabling larger collections of data files. When will those systems become bogged down? What will tomorrow’s needs be for continued success? Digital repositories have enabled global sharing of knowledge.

References

Consultative Committee for Space Data Systems. (2002, January). Reference Model for an Open Archival Information System (OAIS). Retrieved June 1, 2011, from CCSDS.org: http://public.ccsds.org/publications/archive/650x0b1.PDF

Crow, R. (2002). The case for institutional repositories: A SPARC position paper. Retrieved from http://www.arl.org/sparc/bm~doc/ir_final_release_102.pdf

DjVu. (n.d.). What is DjVu. Retrieved June 8, 2011, from http://djvu.org/resources/whatisdjvu.php

DjVu. (2011, April 27). In Wikipedia, The Free Encyclopedia. Retrieved June 10, 2011, from http://en.wikipedia.org/w/index.php?title=DjVu&oldid=426261999

Duncan, C. (2003). Digital repositories: E-Learning for everyone. Paper presented at the Elearn International Conference, Edinburgh, Scotland, UK.

DuraSpace. (n.d.). DuraCloud. Retrieved June 6, 2011, from DuraSpace.org: http://www/duraspace.org/duracloud.php

eprints. (2011). Registry of Open Access Repositories. Retrieved June 2, 2011, from http://roar.eprints.org/

Green, A. (2005). Review of digital repositories (Project Report). Retrieved from Yale University, Integrated Library Technology Services, Research and Planning: http://www.library.yale.edu/iac/documents/DR_Review_final_27Sept05.pdf

Hayes, H. (2005, August). Digital repositories: Helping universities and colleges. Community Contributions. Retrieved from http://www.jisc.ac.uk/uploaded_documents/JISC-BPRepository%28HE%29-v1-final.pdf

Heery, R., & Anderson, S. (2005). Digital repositories review (Project Report). Retrieved from Joint Information Systems Committee on JISC website: http://www.jisc.ac.uk/uploaded_documents/digital-repositories-review-2005.pdf

LeFurgy, W. G. (2002, May). Levels of Service for Digital Repositories. D-Lib Magazine, 8(5). Retrieved from http://www.dlib.org/dlib/may02/lefurgy/05lefurgy.html

Lewis, S. (2007). Repository66.org About. Retrieved June 3, 2011, from Repository66: http://maps.repository66.org/blog/about/

Lowry, C. B. (2006). ETDs and digital repositories: A disciplinary challenge to open access? Libraries and the Academy, 6(4), 387-393.

Lynch, C. A. (2003). Institutional Repositories: Essential Infrastructure for Scholarship in the Digital Age. ARL (226), 1-7. Retrieved from http://www.arl.org/bm~doc/br226ir.pdf

Margaryan, A., & Littlejohn, A. (2008). Repositories and communities at cross-purposes: Issues in sharing and reuse of digital learning resources. Journal of Computer Assisted Learning, 24, 333-347.

Mark, J. (2008, May 7). Everybody’s repositories. Retrieved from http://everybodyslibraries.com/2008/05/07/everybodys-repositories-first-of-a-series/

McCord, A. (2003). Institutional repositories: Enhancing teaching, learning, and research. Retrieved from Lawrence Technological University and University of Michigan, EDUCAUSE Evolving Technologies Committee: http://net.educause.edu/ir/library/pdf/DEC0303.pdf

Research Libraries Group (RLG). (2002). Trusted Digital Repositories: Attributes and Responsibilities. Mountain View, CA: RLG. Retrieved from http://www.oclc.org/research/activities/past/rlg/trustedrep/repositories.pdf

Saleh, I., Adly, N., & Nagi, M. (2005). DAR: A digital assets repository for library collections. Research and Advanced Technology for Digital Libraries, 116-127.

The National Archives. (2004). Standard for Record Repositories (First edition). Retrieved from http://www.nationalarchivers.gov.uk/documents/informaitonmanagement/standard2005.pdf

Whalen, M. (2008). Rights Metadata Made Simple. In T. Gill, A. J. Gilliland, M. Whalen, & M. S. Woodley, Introduction to Metadata (Online edition, Version 3.0). Los Angeles: Getty Publications. Retrieved from http://www.getty.edu/research/publications/electronic_publications/index.html

Zainab, A. N. (2010). Open access repositories and journals for visibility: Implications for Malaysian libraries. Malaysian Journal of Library & Information Science, 15(3), 97-119.