Collaborative Trends in Research Data Management

Data management planning
at the DCC
Martin Donnelly
Digital Curation Centre
University of Edinburgh
WRRIF/RoaDMaP event
University of York
24 May 2012
Running order
1. Policies, Principles, Expectations
2. The DCC and DMP
3. DMP Online
4. Group Exercise
1. Policies, Principles, Expectations
http://www.dcc.ac.uk/resources/policy-and-legal/overview-funders-data-policies
RCUK Common Principles
• Publicly funded research data are a public good, produced in the public interest, which should be made openly available with as few restrictions as possible in a timely and responsible manner that does not harm intellectual property.
• Institutional and project specific data management policies and plans should be in accordance with relevant standards and community best practice. Data with acknowledged long-term value should be preserved and remain accessible and usable for future research.
• To enable research data to be discoverable and effectively re-used by others, sufficient metadata should be recorded and made openly available ....
7 principles agreed by all the UK
research councils in May 2011
http://www.rcuk.ac.uk/research/Pages/DataPolicy.aspx
UK research funder expectations
• timely release of data
– once patents are filed or on (acceptance for) publication
• open data sharing
– minimal or no restrictions
– deposit in data centres, structured databases, data enclave
• preservation of data
– most funders expect 5-10+ years
• submission of data management and sharing plans…
Data-related policies
http://www.dcc.ac.uk/resources/policy-and-legal/overview-funders-data-policies
2. The DCC and DMP
We’ve responded to requirements by offering support…
Analysed requirements
Developed a Checklist
Provided tools & guidance
Links to all DMP resources via http://www.dcc.ac.uk/resources/data-management-plans
What is a DMP?
UK research funders typically ask for:
• A short statement/plan submitted in grant applications
• An outline of what you will create/collect, methods,
standards, data management and long-term plans
• How and why – justify your decisions and any limits
Common DMP questions
• What data will be created (format, types) and how?
• How will the data be documented and described?
• How will you manage ethics and Intellectual Property?
• What are the plans for data sharing and access?
• What is the strategy for long-term preservation?
DCC Checklist Coverage
§1: Introduction and Context
§2: Data Types, Formats, Standards and
Capture Methods
§3: Ethics and Intellectual Property
§4: Access, Data Sharing and Re-use
§5: Short-Term Storage and Data Management
§6: Deposit and Long-Term Preservation
§7: Resourcing
§8: Adherence and Review
§9: Agreement/Ratification by Stakeholders
§10: Annexes
Checklist for a Data Management Plan v3.0 (Donnelly and Jones, March 2011)
http://www.dcc.ac.uk/resources/data-management-plans
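The checklist headings above can be treated as a simple data structure for checking how much of a draft plan has been filled in. The following is a minimal sketch, not part of any DCC tool: the `missing_sections` helper, the dict-based plan format and the sample answers are all assumptions made for illustration.

```python
# Section headings copied from the DCC checklist slide above.
CHECKLIST_SECTIONS = [
    "Introduction and Context",
    "Data Types, Formats, Standards and Capture Methods",
    "Ethics and Intellectual Property",
    "Access, Data Sharing and Re-use",
    "Short-Term Storage and Data Management",
    "Deposit and Long-Term Preservation",
    "Resourcing",
    "Adherence and Review",
    "Agreement/Ratification by Stakeholders",
    "Annexes",
]


def missing_sections(plan):
    """Return checklist headings with no (or blank) text in the plan.

    `plan` is a hypothetical dict mapping heading -> free-text answer.
    """
    return [
        heading
        for heading in CHECKLIST_SECTIONS
        if not plan.get(heading, "").strip()
    ]


# Made-up draft answers, for illustration only.
draft = {
    "Introduction and Context": "Three-year oral history project.",
    "Ethics and Intellectual Property": "Consent forms cover archiving.",
    "Resourcing": "   ",  # whitespace only counts as unanswered
}

unanswered = missing_sections(draft)
for number, heading in enumerate(CHECKLIST_SECTIONS, start=1):
    status = "missing" if heading in unanswered else "ok"
    print(f"§{number}: {heading} [{status}]")
```

A check like this only shows which headings are empty; it says nothing about the quality of what has been written.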
[Diagram: "Unruly data" and "Data management …plan?", with the parties involved: Researcher; Research Support Office; Data Library / Repository; Computing Support; Faculty Ethics Committee; etc.]
DMP-related resources
– “Dealing with Data” (Lyon, 2008)
– Analysis of Funder Policies (Jones, 2009)
– Checklist for a Data Management Plan (Donnelly and Jones, 2009)
– “How to Develop a Data Management and Sharing Plan” (Jones, 2011) Edinburgh: Digital Curation Centre
– “Data Management Plans and Planning” (Donnelly, 2012) in Pryor (ed.) Managing Research Data, London: Facet
Links to all DCC resources via http://www.dcc.ac.uk/resources/data-management-plans
Key things to remember
All research projects are different
The DMP will depend upon the nature of
the research AND the context (funder,
domain, institution(s) etc)
DMPs are useful communication tools
3. www.dcc.ac.uk/dmponline
What does DMP Online do?
A web-based tool that enables users to...
i. Create, store and update multiple versions of Data Management Plans across the research lifecycle
ii. Meet a variety of specific data-related requirements (from funders, institutions, publishers, etc.) in a single place
iii. Get tailored guidance on best practice and helpful contacts, at the point of need
iv. Customise, export and share DMPs in a variety of formats in order to facilitate communication within and beyond research projects
Technologies involved (v3.0)
– Ruby on Rails (v3.1.3)
– JavaScript (jQuery v1.7.1)
– MySQL database (v5+)
– Hosting: University of Edinburgh Information Services Virtual Hosting (13 managed servers across 2 sites)
– Authentication: registered users with passwords encrypted in DB (we are also testing Shibboleth for integration with UK Access Management Federation for Education and Research)
– Various export formats (PDF, DOCX, XLSX, CSV, XML etc)
DMP Online v3.0: May 2012
- Improved user interface, inc. customisable institutional versions
- New features
  - Overlaying multiple templates for ‘hybrid’ DMPs
  - Template phases (e.g. pre- / during / post-project)
  - Granular read / write / share permissions
  - Shibboleth authentication
  - Multilingual support / boilerplate text
  - API for systems interoperability (coming soon)
- Endorsement from funders
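One way to picture the "overlaying multiple templates" feature is as a merge of question sets from several sources. The sketch below is one possible reading, not a description of how DMP Online implements it: the `overlay_templates` function and the dict-of-lists template format are assumptions, and the sample questions are partly taken from the "Common DMP questions" slide and partly invented.

```python
def overlay_templates(*templates):
    """Merge templates into one 'hybrid' template.

    Each template is a hypothetical dict: section heading -> list of questions.
    Sections and questions keep first-seen order; exact duplicates are dropped.
    """
    hybrid = {}
    for template in templates:
        for section, questions in template.items():
            merged = hybrid.setdefault(section, [])
            for question in questions:
                if question not in merged:
                    merged.append(question)
    return hybrid


# Illustrative templates: the first two questions come from the
# "Common DMP questions" slide, the other two are invented.
funder = {
    "Access, Data Sharing and Re-use": [
        "What are the plans for data sharing and access?",
    ],
    "Deposit and Long-Term Preservation": [
        "What is the strategy for long-term preservation?",
    ],
}
institution = {
    "Access, Data Sharing and Re-use": [
        "What are the plans for data sharing and access?",
        "Which institutional repository will hold the data?",
    ],
    "Resourcing": ["Who is responsible for data management costs?"],
}

hybrid = overlay_templates(funder, institution)
for section, questions in hybrid.items():
    print(section)
    for question in questions:
        print("  -", question)
```

Dropping exact duplicates means a researcher answers a question shared by funder and institution only once; near-duplicates with different wording would still appear twice in this sketch.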
Collaborations
- Generic data management guidance (… in conjunction with …)
- Funder-specific guidance developed in collaboration
with the funders themselves
- Institution-specific guidance developed with key
institutional contacts
- Discipline-specific guidance developed and deployed
with JISC MRD projects (e.g. DMT Psych at York)
- Joint training programmes organised and delivered
by DCC and UKDA
- Provided advice to US consortium
Templates: Stakeholder Liaison (i)
RCUK funders (with status)
• Arts and Humanities Research Council (AHRC)
– Discussions beginning
• Biotechnology and Biological Sciences Research Council (BBSRC)
– Discussions ongoing
• Engineering and Physical Sciences Research Council (EPSRC)
– No explicit data management plan requirements: DCC referenced in roadmap requirements
• Economic and Social Research Council (ESRC)
– Template and guidance developed in collaboration with ESRC and ESDS. Funder’s online guidance points applicants towards tool.
• Medical Research Council (MRC)
– Endorsed by funder and mentioned in their guidance
• Natural Environment Research Council (NERC)
– Discussions ongoing
• Science and Technology Facilities Council (STFC)
– DCC resources referenced in data requirements
Other funders (with status)
• The Wellcome Trust
– Template and guidance endorsed by funder
• National Science Foundation (US)
– Template developed by Sherry Lake, University of Virginia
Templates: Stakeholder Liaison (ii)
Disciplinary templates (with status)
• History
– Developed in conjunction with University of Hull and University of Hertfordshire
• Psychology
– Developed by DMT Psych project, led by University of York
• Mechanical Engineering
– Developed as part of REDm-MED project, led by University of Bath
• Health sciences
– Developed by DATUM for Health project, led by University of Northumbria
• Spatial information (INSPIRE)
– Developed in conjunction with EDINA (UK national data centre) and trialled with Freshwater Biological Association
Institutional templates (with status)
• University of Northampton
– Developed in collaboration with Information Services department
Many more institutional and subject-based templates
are being developed through the JISC RDM projects
and UMF institutional engagements…
Institutional Engagements:
Putting it into practice
- Working with eighteen institutions over
approximately 18 months to improve data
management capabilities
- A broad variety of institutional types and sizes, from
research intensive ancient universities, to new
universities and specialist institutions (e.g. art
schools)
- Institutions select from a ‘menu’ of tools and
services, e.g. (next slide)
The Menu
Components of a Data Management Strategy (Research and Admin)
• Policy
• Planning
• Advocacy
• Tools
• Training
DCC Tools
• Data Asset Framework (DAF)
• DMP Online
• CARDIO
• DRAMBORA
DCC Services
• Policy development
• Strategy development
• Training
• Workflow assessment
• Costing
• Institutional data catalogues (discovery)
Workflow connections
DMP Online can also be used in conjunction
with other tools that support the data
management/curation lifecycle, e.g.…
- DAF (Data Asset Framework)
- DRAMBORA (Digital Repository Audit Method
Based On Risk Assessment)
- CARDIO (Collaborative Assessment of
Research Data Infrastructure and Objectives)
Also non-DCC tools:
- LIFE
- Planets tools
- and more
How to connect: six export formats
For human readership…
- Pleasant formatting
- Editable. Can be used in conjunction with (e.g. MS Sharepoint)
For machine readership…
- Facilitates quick public sharing
- Compatible with API for linking with other systems
- Minimal formatting
- Removes all formatting
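To make the human-readable versus machine-readable distinction concrete, the sketch below writes the same plan content as CSV and as XML using only the Python standard library. It is an illustration under stated assumptions: the plan content is made up, and the column names and XML element names (`dmp`, `section`, `title`) are hypothetical rather than DMP Online's actual export schema.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Made-up plan content for illustration only.
plan = [
    ("Data Types, Formats, Standards and Capture Methods", "Interview audio in WAV."),
    ("Access, Data Sharing and Re-use", "Shared after publication, no restrictions."),
]


def to_csv(rows):
    """Serialise (section, answer) pairs as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["section", "answer"])
    writer.writerows(rows)
    return buffer.getvalue()


def to_xml(rows):
    """Serialise the same pairs as XML; element names are hypothetical."""
    root = ET.Element("dmp")
    for section, answer in rows:
        node = ET.SubElement(root, "section", title=section)
        node.text = answer
    return ET.tostring(root, encoding="unicode")


print(to_csv(plan))
print(to_xml(plan))
```

Both outputs carry the same content with no presentation formatting, which is what lets another system parse them without knowing how the plan looked on screen.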
External connections
Systems
– CRIS / admin systems
– RCUK Je-S system
– Institutional Repositories
– DDI repository
– DMP Tool (US)
– Other instances of DMP Online via federated model (? - TBC)
Standards / protocols
– CERIF*
– SWORD2
– DDI*
– RDF (? - TBC)
* via RESTful API
4. Group Exercise
In institutional groups:
1. For each of the DMP Checklist headings, brainstorm all the stakeholders you think might be involved (and how/why) – be specific!
2. Remember to consider the different stages of research – pre-award, in-project, post-project – and think about how the stakeholders change…
3. How does data management planning fit into existing workflows? What would you change?
SECTIONS
§1: Introduction and Context
§2: Data Types, Formats, Standards and Capture Methods
§3: Ethics and Intellectual Property
§4: Access, Data Sharing and Re-use
§5: Short-Term Storage and Data Management
§6: Deposit and Long-Term Preservation
§7: Resourcing
§8: Adherence and Review
§9: Agreement/Ratification by Stakeholders
§10: Annexes
Any questions?
Martin Donnelly
Digital Curation Centre
University of Edinburgh
[email protected]
Twitter: @mkdDCC
www.dcc.ac.uk/resources/data-management-plans
For other DCC services see www.dcc.ac.uk or follow us on twitter @digitalcuration and #ukdcc
This work is licensed under the Creative Commons
Attribution 2.5 UK: Scotland License.
Image credit:
Slide 1 - http://upload.wikimedia.org/wikipedia/commons/8/88/LernaeanHydraRephael.jpg