EMODnet - Ingestion and safe-keeping of marine data
Intro and progress
By
Dick M.A. Schaap – Coordinator
Brussels - Belgium, 8 December 2016, EU DG MARE Meeting
Contract, Partnership and General objective
• Contract officially from 19 May 2016 to 19 May 2019
• Contract budget: 4 million euro
• MARIS = Coordinator; HCMR = Scientific coordinator
• 44 partners: mostly NODCs, plus the coordinators of the EMODnet thematic lots
• Consortium Agreement established with all 32 partners; bilateral
  subcontracts with 11 out of 12 subcontractors
• General objective: to develop and operate a new EMODnet portal with services
  that make it easy for data holders from the public and private sectors to
  submit marine data sets for publishing, further processing and safekeeping
  by data centres, and subsequent distribution through the EMODnet thematic
  portals
• Kick-off meeting: 26 – 27 May 2016, Amsterdam, The Netherlands
• TWG meeting: 7 – 8 November 2016, Athens, Greece
Considerations for EMODnet Ingestion
• The primary focus is on data providers and data sets that are not yet
  handled by, or part of, the mainstream processes of the EMODnet data
  centres:
  • EMODnet data centres are NODCs, Hydrographic Offices, Geological Services,
    Biological institutes, etc., involved in the EMODnet thematic portals
    (Chemistry, Geology, Bathymetry, Biology, Physical Oceanography and Human
    Activities) and contributing to European infrastructures (SeaDataNet,
    EurOBIS, EGDI, ICES);
  • potential data providers are thus marine data holders that are not yet
    routinely submitting data sets to national data centres.
• They must be convinced and supported to submit their data packages for
  open access and use in national data centres and EMODnet.
• They are not (yet) used to the practices and standards of the
  international marine data management community.
• Submitted data packages must be routed to capable data centres for further
  processing, which should result in publishing through the EMODnet thematic
  portals.
Principle data flow
[Figure: data flow diagram]
Planned components and services over time
• Data Ingestion portal – M6
• Data Submission service with logon (User Management) for any data
  provider – M6
• Guidance for suggested formats for specific data types and general
  instructions – M6
• Help desk service – M6
• Data Tracking service for submitters – M12
• Submission Summary Records service for any user – M12
• Data Wanted service for any user – M12
• Optimised pathways – M18
• Operation and maintenance of ingestion processes – M7 to M36
Workflow from data submission to publishing
A distinction is made between 2 phases in the life cycle of a data submission:
• Phase I: from data submission to publishing of the submitted datasets
  package as is;
• Phase II: further elaboration of the datasets package and integration (of
  subsets) in national, European and EMODnet thematic portals.
To keep the threshold for submission relatively low, completion of the
submission form (ISO 19115, INSPIRE-compliant model) is split into 2 parts:
• Part 1 of the submission form: a number of key fields to be completed by
  the Data Submitter, including uploading of a zip file with the datasets and
  related documentation;
• Part 2 of the submission form: review of the received datasets package and
  part 1 metadata, and subsequent completion of the additional metadata
  fields of the submission form by the appointed Data Centre.
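The two-part form can be pictured as a small data structure. The following is only a sketch to illustrate the split: all class and field names are hypothetical and are not taken from the actual ISO 19115 based submission form.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FormPart1:
    """Key fields completed by the Data Submitter (hypothetical names)."""
    submitter_id: str       # e.g. the submitter's MarineID account
    title: str
    theme: str
    country: str
    package_filename: str   # the uploaded zip file with datasets and documents


@dataclass
class FormPart2:
    """Additional metadata completed by the appointed Data Centre."""
    abstract: str
    keywords: List[str] = field(default_factory=list)


@dataclass
class SubmissionForm:
    part1: FormPart1
    part2: Optional[FormPart2] = None   # empty until the Data Centre review

    def is_complete(self) -> bool:
        # The form counts as complete only once part 2 has been added.
        return self.part2 is not None
```

A submitter would create the form with part 1 only; the Data Centre later attaches part 2, after which `is_complete()` returns `True`.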
PHASE I: from submission to publishing
• Data Submitter logs on to the Submission service using MarineID and gets
  access to their account (possibly after first registering with MarineID)
• Data Submitter completes and submits the 1st part of the data submission
  form and uploads the related data package incl. relevant documents
• Data package and data submission form are routed by the Master to a
  selected Data Centre that will be in charge
  • Selection by theme and country *
• Data Centre reviews the data package and submission form and, where needed
  in communication with the Data Submitter, takes the following actions:
  • reviews the 1st part and completes the 2nd part of the data submission
    form
  • releases the full form + original data package for publication
  • maintains a log in conjunction with the data submission form, documenting
    processing steps; the log is accessible for Data Submitters
* Possible complication: a data package can be multi-disciplinary. Work with a
leading Data Centre that might have to split the data package into multiple
subsets to be handled by multiple Data Centres!
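The routing step, selection by theme and country with a leading Data Centre for multi-disciplinary packages, could look roughly like the sketch below. The registry contents, the centre names and the rule that the first matching theme picks the leading centre are assumptions made for illustration; the slides do not specify how the Master makes the selection.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical registry: (theme, country) -> Data Centre name.
REGISTRY: Dict[Tuple[str, str], str] = {
    ("Chemistry", "NL"): "Centre A",
    ("Bathymetry", "NL"): "Centre B",
    ("Chemistry", "GR"): "Centre C",
}


def route_package(
    themes: List[str],
    country: str,
    registry: Dict[Tuple[str, str], str] = REGISTRY,
) -> Tuple[Optional[str], List[str]]:
    """Return (leading centre, other centres) for a data package.

    The first theme with a registered centre picks the leading Data Centre
    (an assumption). If nothing matches, (None, []) is returned so that the
    package can be assigned by hand.
    """
    centres: List[str] = []
    for theme in themes:
        centre = registry.get((theme, country))
        if centre is not None and centre not in centres:
            centres.append(centre)
    if not centres:
        return None, []
    return centres[0], centres[1:]
```

For a Dutch package covering both Chemistry and Bathymetry, this returns "Centre A" as the leading centre and "Centre B" as an additional centre for the bathymetry subset.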
Work flow from submission to publishing – phase 1
[Figure: Phase 1 workflow diagram. Elements shown:
 - MarineID service: authentication + authorisation
 - Data Submission service
 - Data provider: completes 1st part of the submission form + uploads data
   package incl. documents; Ready + Submit
 - Data repository: reviews copy of data package and 1st part of the form
   and completes 2nd part of the submission form + processing log;
   Ready + Publish
 - HCMR cloud: URL and unique IDs
 - Tracking & Tracing service for data providers + repositories, using
   MarineID and form IDs
 - Public Discovery and Access service: metadata and downloading of original
   data packages]
Pathways and tuning with EMODnet thematic portals
• One of the objectives is that the submitted datasets are worked up to Data
  Centre standards and are also included in national, European and EMODnet
  portals
• In principle, the Data Centres will be found in the EMODnet thematic
  networks
[Figure: the Ingestion Portal linked to the EMODnet thematic portals:
Bathymetry, Geology, Biology, Chemistry, Human Activities, Physics]
Work flow from submission to publishing – phase 2
• Data Centre undertakes steps for further processing and inclusion of the
  data package, possibly in further contact with the Data Submitter:
  • further processing at dataset level and working up (parts of) the
    received datasets to the level of inclusion in the data management system
    of the Data Centre, including detailed metadata
  • maintaining an extra log in the Submission service, documenting
    processing steps; the log is accessible for data providers through the
    Tracking and Tracing service from the start of processing
  • the finalised datasets (possibly parts of the original data package) are
    published at the Data Centre portals and are taken up for long-term
    stewardship
• As a next step, the Data Centre populates the finalised datasets and
  related metadata into the appropriate European infrastructures
• The finalised datasets and related metadata are pushed forward from the
  European infrastructures towards inclusion and publishing in EMODnet
  thematic portals
• Data Centre complements the data submission form with details of EMODnet
  URLs, which are also included in the Public Discovery and Access service.
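The processing log kept across both phases can be sketched as a list of time-stamped entries tied to one submission. This is a minimal illustration only: the class names, the submission ID format and the restriction to phases 1 and 2 are assumptions, not a description of the actual Submission service.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class LogEntry:
    phase: int            # 1 = submission to publishing, 2 = further processing
    message: str
    timestamp: datetime


@dataclass
class ProcessingLog:
    submission_id: str    # hypothetical unique ID of the data submission form
    entries: List[LogEntry] = field(default_factory=list)

    def add(self, phase: int, message: str) -> None:
        """Append a processing step for the given phase."""
        if phase not in (1, 2):
            raise ValueError("phase must be 1 or 2")
        self.entries.append(LogEntry(phase, message, datetime.now(timezone.utc)))

    def for_phase(self, phase: int) -> List[str]:
        """Messages for one phase, in the order they were logged."""
        return [e.message for e in self.entries if e.phase == phase]
```

A tracking view for the data provider could then list `for_phase(1)` and `for_phase(2)` separately, matching the two logs shown for Phase 1 and Phase 2.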
Work flow from submission to publishing – phase 2
[Figure: Phase 2 workflow diagram. Elements shown:
 - MarineID service: authentication + authorisation
 - Tracking & Tracing service for data providers + repositories, using
   MarineID and form IDs
 - Data Submission service, next steps
 - Data repository: analysis and processing at dataset level; maintaining
   extra processing log; inclusion in DM systems of data repositories;
   populating European systems
 - HCMR cloud: submission form part 1 + 2, copy of data package incl.
   documents, Processing Log – Phase 1, Processing Log – Phase 2
 - DM systems repository: analysis and processing at dataset level; storage
   and documentation incl. metadata
 - Links to couple processed datasets to the original data submission
 - Subsets of data incl. metadata
 - Ingestion Discovery and Access service
 - Population in European infrastructures: SDN, EurOBIS, ICES, EGDI, COGEA
 - Population in EMODnet thematic portals]
Progress status of planned services
• Data Ingestion portal – set-up with CMS ready; content filling underway
• Data Submission service – almost ready for Phase 1 workflow; testing
  underway; includes User Management service in connection with the
  MarineID service
• Guidance documents – collection discussed at TWG; text underway
• Help desk service – ready for launch
• Data Tracking service for submitters, data centres and master – key fields
  implemented in Data Submission service; formulation of key indicators and
  user interface development underway
• Submission Summary Records service = Public Discovery and Access
  service – will be specified and implemented in the coming 3 months
• Data Wanted service – draft specifications for discussion; implementation
  by May 2017