[slide 1]
What is a Data Management Plan? - and why write one?
Myriam Mertens | Ghent University Library

[slide 2] Data management plan (DMP)
✓ document outlining how data will be handled during and after a project
✓ increasingly required by research funders/institutions
✓ good practice even if not required, because…

[slide 3] First step towards good research data management
“[RDM] ensure[s] that data are of a high quality, are well organized, documented, preserved, sustainable, accessible and reusable.” (Corti et al. 2014)

[slide 4] ‘Research data’ can mean a lot…
any information collected/created for the purposes of analysis to generate scientific claims
For example:
• content: numerical, textual, audiovisual, multimedia... data
• data format/object: spreadsheets/tabular data, field notes, databases, images, audio recordings, marked up texts, surveys, instrument readings…
• mode of data collection: experimental, observational, simulation, derived/compiled… data
• digital or non-digital data
• primary or secondary data
• raw, processed or analyzed data

[slide 5] Problem 1: navigating & using data is a challenge

[slide 6] Problem 2: digital data are inherently fragile

[slide 7] Problem 3: research data are undervalued & neglected
(Vines et al. 2013)

[slide 8] Problem 3: research data are undervalued & neglected
• Poor verifiability of science (The Economist, 19 Oct 2013)
• Missed opportunities for data reuse (Yozwiak et al. 2015)

[slide 9] The solution: RDM
“[…] the compilation of many small practices that make your data easier to understand, less likely to be lost, and more likely to be useable during a project or ten years later.” (Briney 2015)

[slide 10] RDM is part of good research practice
Expectations include:
• secure preservation for a reasonable period
• access: as open as possible, as closed as necessary
• FAIR principles (Findable, Accessible, Interoperable & Reusable data)
• data = legitimate & citable products of research

[slide 11] Traditional view of research process
Project planning → Data collection → Data processing & analysis → Publication of findings
(Adapted from Briney 2015)

[slide 12] Research data lifecycle
• Project & RDM planning: check data policies, write a DMP, plan consent for sharing…
• Data collection: capture, organize, document, store & back up data…
• Data processing & analysis: anonymize, document, store & back up data…
• Publication of findings & data sharing: release/publish, register, license data, regulate access…
• Data preservation: select data to keep, prepare data for preservation, archive data…
• Data discovery & reuse: search for data, use data for further research/teaching, cite data…
(Adapted from Briney 2015 & Corti et al. 2014)

[slide 13] DMPs – a mere administrative burden?
- takes time and effort upfront, but…
+ saves time and problems later on
+ helps consider whole range of RDM activities/issues
+ makes expectations, tasks, procedures explicit
+ leads to more informed decisions about data
+ helps identify resources required (& obtain funding)

[slide 14] Common DMP themes
1. description of data to be collected/created (e.g. content, type, format, volume…)
2. methodologies, standards for collecting/creating data & data documentation
3. ethics & intellectual property issues (e.g. informed consent, anonymisation, any restrictions on sharing – e.g. confidentiality, copyright, embargoes –, usage licenses…)
4. plans for data sharing & access (e.g. how, when, with whom…)
5. strategy for long-term preservation (e.g. what to keep, for how long, where…)
“[…] plans typically state what data will be created and how, and outline the plans for sharing and preservation, noting what is appropriate given the nature of the data and any restrictions that may need to be applied.” (DCC website)
Also see: http://www.dcc.ac.uk/resources/how-guides/develop-data-plan

[slide 15] Example – data description
“This project will produce qualitative observational data from interviews and fieldwork conducted at various locations across Finland between January and June 2017. Raw data will comprise digital audio recordings of interviews (stored in .flac format), digital images (.tiff format) and hand-written field notes. Audio files will be transcribed into digital text documents (.xml files) and notes will be digitized (via manual transcription to .txt files) to prepare them for analysis.”

[slide 16] Example – data documentation, metadata
“Descriptive metadata of data items will be captured in XML files in accordance with the Darwin Core schema, which is an international metadata standard for biodiversity data. In addition, datasets will be accompanied by a separate readme.txt file providing study-level documentation including the field methods used for data collection.”
More on metadata standards: http://rd-alliance.github.io/metadata-directory

[slide 17] Example – data sharing
“My dataset will be made available upon publication of the associated journal article via the Zenodo repository. The data in Zenodo will be made open access and licensed under CC-BY. As required by my institution’s research data policy, and to further aid data discovery, the dataset will also be registered in my institutional repository (with basic descriptive metadata, including a link to the data record in Zenodo, made public).”
Registry of Research Data Repositories: http://www.re3data.org

[slide 18] Example – restrictions on sharing
“Because my research data are part of a potentially patentable invention, sharing will be delayed to investigate patent protection first. Before any disclosure can take place, my research results will be reported to my university’s TechTransfer office, which will determine whether releasing data will need to be embargoed until after a patent application has been filed.”
Also see EUDAT legal guide: http://wiz.eudat.eu

[slide 19] Example – data preservation
“Raw data and documentation files will be offered for deposit to 4TU.ResearchData, which is a DSA-certified data repository accepting research data in the field of engineering and preserves them for a minimum of 15 years. Files will be offered in the repository’s preferred formats (.txt, .xml and JCAMP), and as the volume of data does not exceed 10GB the repository will not charge for the deposit.”
More on how to select a data repository: https://www.openaire.eu/opendatapilot-repository

[slide 20] Tips for writing a DMP
• consider it a ‘living’ document
• use a template
  • e.g. Horizon 2020 FAIR DMP
  • list of public DMP templates: https://wiki.surfnet.nl/display/RD/Datamanagementplannen
• use an online planning tool
• have a look at example DMPs

[slide 21] Example plans
• examples on the Digital Curation Centre (DCC) website: http://www.dcc.ac.uk/resources/data-management-plans/guidance-examples
• examples in the Zenodo repository: https://zenodo.org/search?page=1&size=20&q=data management plans
• public DMPs on the DMPTool website: https://dmptool.org/public_dmps
• DMPs published in RIO (Research Ideas and Outcomes OA journal): http://riojournal.com/browse_user_collection_documents?collection_id=3

[slide 22] Tips for writing a DMP
• check applicable data policies
• keep it simple, but be as specific as possible
• justify your decisions
• familiarize yourself with RDM terminology & best practices (for your field)

[slide 23] Online RDM training resources
• FOSTER training portal
• OpenAIRE webinars
• EUDAT training materials
• Digital Curation Centre How-to Guides & Checklists
• UK Data Archive ‘Create & Manage Data’ webpages
• MANTRA – Research Data Management Training
• ‘Research Data Management and Sharing’ MOOC on Coursera
• Data Management Training Clearinghouse

[slide 24] Thank you for listening!

[slide 25] Credits
• slides adapted from S. Jones (2016), ‘What is a Data Management Plan?’, licensed under CC-BY 4.0
• images
  • [slide 2]: ‘Writing’ by Aiconica, licensed under CC0 1.0
  • [slide 3]: ‘Database’ by Jørgen Stamp, attribution: digitalbevaring.dk, licensed under CC-BY 2.5 DK
  • [slide 5]: From ‘Analyzing DMPs to inform Research Data Services’ by A. L. Whitmire, licensed under CC-BY 4.0
  • [slide 6]: ‘Day 10: Lost’ by Dave Hill, licensed under CC-BY-NC-SA 2.0; ‘A Domesday system at the Vintage Computer Festival 2010, Bletchley UK’ by Regregex, licensed under CC-BY 3.0
  • [slide 7]: ‘Publications and Data’ by Auke Herrema, licensed under CC-BY 4.0; T. H. Vines et al., ‘The availability of Research Data Declines Rapidly with Article Age’, Current Biology 24 (2014) 1: 94-97. http://doi.org/10.1016/j.cub.2013.11.014
  • [slide 8]: ‘How Science goes wrong’, The Economist, 19 Oct 2013. http://www.economist.com/printedition/2013-10-19; N.L. Yozwiak, ‘Data sharing: Make outbreak research open access’, Nature 518 (2015) 7540: 477-479. https://doi.org/10.1038/518477a
  • [slide 9]: From ‘RDM: An Overview’ by Research Support Team, IT Services (University of Oxford), licensed under CC-BY-NC-SA 4.0
  • [slide 10]: The European Code of Conduct for Research Integrity. Revised Edition (ALLEA – ALL European Academies, 2017).
  • [slide 13]: ‘Planning’ by Jørgen Stamp, attribution: digitalbevaring.dk, licensed under CC-BY 2.5 DK
  • [slide 16]: ‘Metadata’ by Jørgen Stamp, attribution: digitalbevaring.dk, licensed under CC-BY 2.5 DK
  • [slide 20]: ‘Preservation plan’ by Jørgen Stamp, attribution: digitalbevaring.dk, licensed under CC-BY 2.5 DK
  • [slide 22]: V. Van den Eynden et al., Managing and Sharing Data. Best practice for Researchers (UK Data Archive, 2009), licensed under CC-BY-NC-SA 3.0
  • [slide 24]: ‘Knowledge’ by Jørgen Stamp, attribution: digitalbevaring.dk, licensed under CC-BY 2.5 DK