Keith - RDA and e-RIs 20161211

WG/IG Collaboration Meeting 6
Dec 12-13, NIST, Gaithersburg
‘Assembling the Pieces: Connecting Outputs with Each Other and with Domain Adoption’
e-Research Infrastructures: A Focus for RDA?
Keith G Jeffery
RDA Principle / Ethos
Let 100 flowers blossom
(Mao Zedong 1957)
(usually misquoted as let 1000 flowers bloom!)
Great for groups generating ideas
Is it good for products?
… and products we wish to be ‘joined-up’?
… and adopted within and across domains?
Already Moving Away from the Principle
• Top-down views
  • Data Fabric
• Some clustering
  • TAB graphic diagrams clustering groups (Beth Plale)
• Grouping of ‘flowers’ into ‘bouquets’
  • Repositories groups
  • Metadata groups
  • ……..
• Should we do this more consistently?
  • And if so on what basis?
e-Research Infrastructures
• Although individual researchers still exist
• Much research is done in teams using e-RIs
• Across all disciplines
e-Research Infrastructures
• Assets from research using e-RIs commonly made available openly for re-use with curation and provenance
• e-RIs increasingly connecting together (by domain), e.g. environment, social science, humanities, materials science
• Exactly in line with RDA objectives of making data available within and across domains
e-RI Dimensions
• Topology: centralised (CERN) vs distributed (EPOS)
• Domains: particle physics to arts & humanities
• Utilisation of e-Infrastructures (e-Is: grid and cloud computing, supercomputing, network, detectors)
• Which condition the requirements of the communities
e-RIs: Requirements for RDA activities/products?
Example for discussion
1. Common requirements across all e-RIs (and hence their users)
  • Support for interoperation
  • Support for provenance and curation
  • Support for workflow construction (even simple query)
  • Support for deployment to e-Is
  • Support for citation
2. Requirements of specific (groups of?) e-RIs
  • Support for instrumentation/detector control and data stream validation
  • Support for particular analytics / simulation / visualisation
Matrix of Requirements by Domains
Rows (requirements):
• Support for interoperation
• Support for workflow construction (even simple query)
• Support for deployment to e-Is
• Support for provenance and curation
• Support for citation
• Support for instrumentation/detector control and data stream validation
• Support for particular analytics / simulation / visualisation
Columns (domains): Particle Physics to Arts & Humanities
Let us just unpack one of these as an example:
Provenance
• Need to understand the relationship between datasets
  • In temporal dimension
    • Versions
  • In intent dimension
    • Purpose (why a derived dataset?)
  • In process dimension
    • Commands/script (reflecting intent?)
    • Software involved
    • Operating environment involved
• Note relationship to curation
• Involves:
  • PIDs
  • Metadata
  • Repositories
  • Data fabric/workflow
  • Data fabric/deployment
  • (and more)
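As a rough illustration of the three dimensions above (not an RDA product or any group's schema), a provenance record linking a derived dataset to its parent might be sketched as follows. The class name, the field names and the pid:example/... identifiers are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ProvenanceRecord:
    """Hypothetical record relating a dataset to the dataset it derives from."""
    dataset_pid: str                                    # placeholder persistent identifier
    derived_from: Optional[str] = None                  # PID of the parent dataset, if any
    version: int = 1                                    # temporal dimension
    purpose: str = ""                                   # intent dimension
    commands: List[str] = field(default_factory=list)   # process: commands/script
    software: List[str] = field(default_factory=list)   # process: software involved
    environment: str = ""                               # process: operating environment


def lineage(pid: Optional[str], records: Dict[str, ProvenanceRecord]) -> List[str]:
    """Follow derived_from links from pid back to the earliest known dataset."""
    chain: List[str] = []
    seen = set()
    while pid is not None and pid not in seen:  # 'seen' guards against cycles
        seen.add(pid)
        chain.append(pid)
        rec = records.get(pid)
        pid = rec.derived_from if rec else None
    return chain


raw = ProvenanceRecord(dataset_pid="pid:example/raw")
cleaned = ProvenanceRecord(
    dataset_pid="pid:example/cleaned",
    derived_from="pid:example/raw",
    version=2,
    purpose="remove invalid readings",
    commands=["clean.py --drop-invalid"],  # invented command, for illustration only
    software=["python 3"],
    environment="linux",
)
records = {r.dataset_pid: r for r in (raw, cleaned)}
print(lineage("pid:example/cleaned", records))
```

The parent link is stored as a PID rather than an object reference, which is one way the record could span several repositories; that choice is an assumption of this sketch.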
Leads to:
• Weighted by domain priorities
• Weighted by ‘size’ of domain
• Noting particularly groups that are represented maximally in both requirement for product and domains
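One way the weighting could work, sketched under assumptions: each cell of the requirements/domains matrix holds a strength of requirement, and each domain's column is scaled by its priority and its ‘size’ before summing per requirement. Every number below, and the multiplicative weighting rule itself, is an invented placeholder rather than a value or method from these slides.

```python
# Strength of each requirement per domain (0 = none, 3 = strong); placeholder values.
matrix = {
    "Particle Physics": {
        "Support for interoperation": 2,
        "Support for provenance and curation": 2,
        "Support for instrumentation/detector control and data stream validation": 3,
    },
    "Arts & Humanities": {
        "Support for interoperation": 3,
        "Support for provenance and curation": 2,
        "Support for instrumentation/detector control and data stream validation": 0,
    },
}

# Placeholder weights: how much a domain prioritises RDA products, and its 'size'.
priority = {"Particle Physics": 1, "Arts & Humanities": 1}
size = {"Particle Physics": 2, "Arts & Humanities": 3}


def weighted_totals(matrix, priority, size):
    """Sum requirement strengths over domains, scaling each domain by priority * size."""
    totals = {}
    for domain, row in matrix.items():
        weight = priority[domain] * size[domain]
        for requirement, strength in row.items():
            totals[requirement] = totals.get(requirement, 0) + weight * strength
    return totals


def rank(totals):
    """Order requirements by descending total; ties are broken alphabetically."""
    return sorted(totals, key=lambda r: (-totals[r], r))


totals = weighted_totals(matrix, priority, size)
ranking = rank(totals)
for requirement in ranking:
    print(totals[requirement], requirement)
```

Keeping the weights separate from the matrix means the same community input can be re-ranked under different weighting choices, which fits the aim of a transparent and reproducible decision-making process.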
Which RDA groups are concerned with:
Common requirements across all e-RIs
• Support for interoperation
• Support for workflow construction (even simple query)
• Support for deployment to e-Is
• Support for provenance and curation
• Support for citation
Requirements of specific or groups of e-RIs
• Support for instrumentation/detector control and data stream validation
• Support for particular analytics / simulation / visualisation
Cluster RDA Groups
Based on the requirements/domains matrix
• Reduces management and coordination load
• Encourages joint thinking and concentration of expertise
• Provides centres of excellence related to:
  • e-RIs (especially those with shared concerns working together)
  • Domains
• A stronger basis for products that are (a) joined-up; (b) adopted by the communities
Move progressively to…
• IG for each cluster
  • Long-lived, strategic, steering, foresight
• WGs ‘spun out’
  • for specific pieces of work
  • of short (18-24 month) duration
  • yielding products
    • Of general (all domains) use
    • Specific to particular domains
• Some WGs could be ‘owned’ by >1 IG
This suggestion should produce:
• Prioritised products
• Developed by WGs
• Supported longer-term by IGs
• Developers in new rôle
• Based on strength of requirements from communities
• Encourages adoption
• A transparent and reproducible decision-making process
• More concentration of expertise
• Joined-up solutions
• Best solutions/products
• Sustainability