GeosMeta: a prototype metadata and provenance service
Mike Mineter¹, C. Palansuriya², J. Nowell², M. Hagdorn¹, T.M. Sloan², C.J. Place¹, M. Jackson²
¹School of GeoSciences, ²EPCC
[email protected]
Background….
• 3 workshops
– Metadata
– Research data plans
– UK and Univ. Policy
– Archiving
Shoot the
messenger….
Goals
• “Capture” research activity as it’s done
• Gain advantage during the project
• Simplify eventual archiving
• Enable response to future queries
• Support diverse groups
Initial focus….
For researchers who…
• Develop own software
• Use existing programs/scripts
Trial 3 possibilities
• e-Notebook
• Database
– API to write from scripts
• Workflow
– with provenance, Python
Initial focus….
For researchers who…
• Develop own software
• Use existing programs/scripts
Trial 3 possibilities
• e-Notebook: LabTrove
• Database: develop GeosMeta
• Workflow: VisTrails
GeosMeta
• Near end of phase one development by EPCC
• Testing has begun
• Trials continuing in September
• Expect to conclude that we have a foundation
– projects can add/extend
– prototype close to offering a service capability
Overview
[Architecture diagram]
• Client: Python API and CLI
• REST interface between client and server
• Server: EVE frontend, MongoDB database
• AA:
– with HMAC and UUN
– roles and research groups
HMAC: Hash-based message authentication code
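As a rough illustration of how HMAC can authenticate a REST request, the sketch below signs a request with a shared secret and verifies it on the receiving side. The slides do not say which hash, which fields, or which headers GeosMeta uses, so the SHA-256 choice, the message layout, the example path and the function names are all assumptions made for the example.

```python
import hashlib
import hmac


def sign_request(secret_key: bytes, method: str, path: str, body: bytes) -> str:
    """Return a hex HMAC-SHA256 signature over method, path and body.

    The message layout (fields joined by newlines) is an assumption
    for this sketch, not the GeosMeta wire format.
    """
    message = method.upper().encode() + b"\n" + path.encode() + b"\n" + body
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()


def verify_request(secret_key: bytes, method: str, path: str,
                   body: bytes, signature: str) -> bool:
    """Recompute the signature server-side and compare in constant time."""
    expected = sign_request(secret_key, method, path, body)
    return hmac.compare_digest(expected, signature)


# Hypothetical shared secret and request, for illustration only.
key = b"per-user-secret"
payload = b'{"name": "prog1"}'
sig = sign_request(key, "POST", "/activities", payload)

print(verify_request(key, "POST", "/activities", payload, sig))
print(verify_request(key, "POST", "/activities", b'{"name": "other"}', sig))
```

The second check fails because the body was altered after signing, which is the property that lets the server trust that a request came from a holder of the secret and was not modified in transit.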
Why use NoSQL?
• What we do is document-oriented
– Projects need to easily define metadata fields
– Researchers impose/extend any schema
• Common foundation of AA
• API can serve many research groups
• Why MongoDB?
– Prior experience in EPCC
– Strong support for Python
• Growth area for us
• Options for architecture… Django?,… EVE
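To make the document-oriented argument concrete, here is a minimal sketch of what an Eve settings module for an activity resource might look like. The database name, resource name and field names are hypothetical, not taken from GeosMeta; the point is only that a small fixed core can be validated while `allow_unknown` lets each research group add its own metadata fields.

```python
# settings.py-style sketch for an Eve application.
# All names and values below are illustrative assumptions.
MONGO_HOST = "localhost"
MONGO_PORT = 27017
MONGO_DBNAME = "geosmeta_demo"  # hypothetical database name

# A small validated core: every activity has a name and lists of files.
activity = {
    "schema": {
        "name": {"type": "string", "required": True},
        "inputs": {"type": "list", "schema": {"type": "string"}},
        "outputs": {"type": "list", "schema": {"type": "string"}},
    },
    # Accept fields beyond the core schema, so projects can extend it.
    "allow_unknown": True,
    "resource_methods": ["GET", "POST"],
}

DOMAIN = {"activities": activity}
```

With a relational schema, each new metadata field would normally mean a migration; here a group can simply start sending extra fields alongside the validated ones.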
Types of documents
• “Activity”: has input files and output files, used for, e.g.:
– “How was file a.dat made?”
– “Where was b.nc used?”
– Update a program, rerun all analyses downstream
• Others are up to the research group
– Fieldwork site
– Sample description
– Simulation experiment
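The two provenance questions above can be sketched with plain Python dictionaries standing in for “Activity” documents. The field names (`name`, `inputs`, `outputs`) and every file name other than a.dat and b.nc are assumptions for the example, not the GeosMeta schema.

```python
# In-memory stand-ins for "Activity" documents (hypothetical fields).
activities = [
    {"name": "prog1", "inputs": ["raw.dat"], "outputs": ["a.dat"]},
    {"name": "prog2", "inputs": ["a.dat"], "outputs": ["b.nc"]},
    {"name": "prog3", "inputs": ["b.nc"], "outputs": ["summary.csv"]},
]


def made_by(docs, filename):
    """Answer 'how was this file made?': activities listing it as an output."""
    return [d["name"] for d in docs if filename in d["outputs"]]


def used_by(docs, filename):
    """Answer 'where was this file used?': activities listing it as an input."""
    return [d["name"] for d in docs if filename in d["inputs"]]


print(made_by(activities, "a.dat"))
print(used_by(activities, "b.nc"))
```

In MongoDB the same lookups would be equality filters on the array fields, for example `{"outputs": "a.dat"}`, since an equality match on an array field matches any element.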
Example – X-ray tomography
[Diagram labels: expt, prog1, prog2, prog3]
• “Activities” with
– Input data
– Output data
• Metadata
• Statistical summaries
– Parameter values
– Software version
– …
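A chain of activities like this is what makes “rerun all analyses downstream” answerable. The sketch below assumes the diagram shows a linear chain from expt through prog1, prog2 and prog3; that ordering, the file names and the field names are all assumptions for illustration.

```python
from collections import deque

# Hypothetical tomography pipeline; file names are invented for the example.
activities = [
    {"name": "expt", "inputs": [], "outputs": ["scan.raw"]},
    {"name": "prog1", "inputs": ["scan.raw"], "outputs": ["recon.vol"]},
    {"name": "prog2", "inputs": ["recon.vol"], "outputs": ["segmented.vol"]},
    {"name": "prog3", "inputs": ["segmented.vol"], "outputs": ["stats.csv"]},
]


def downstream(docs, changed):
    """Return the activities that depend, directly or indirectly, on the
    outputs of `changed`, in breadth-first order."""
    by_name = {d["name"]: d for d in docs}
    queue = deque([changed])
    seen = {changed}
    order = []
    while queue:
        produced = set(by_name[queue.popleft()]["outputs"])
        for d in docs:
            # An activity is downstream if it consumes any produced file.
            if d["name"] not in seen and produced & set(d["inputs"]):
                seen.add(d["name"])
                order.append(d["name"])
                queue.append(d["name"])
    return order


print(downstream(activities, "prog1"))
```

If prog1 is updated, the result lists the activities whose outputs are now stale and need rerunning.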
Plans
• Continue testing
• Move to service provision
• Stretch range of research
• …with
– richer client-side tools
– higher level functions in EVE/MongoDB?
Thank you!
[email protected]