UseCase2-ELI

Challenges Posed by Processing Scientific Data at Extreme Light Infrastructures
Tamás Gaizer
September 27th, 2016
ELI in a Nutshell
ELI = Extreme Light Infrastructure:
• pan-European research institution
Three pillars (cutting-edge research institutions):
• ELI-ALPS (Szeged, Hungary):
• ELI-BL (Dolní Břežany, Czech Republic): PW-power laser with high repetition rate
• ELI-NP (Măgurele, Romania): Ultrahigh intensity laser beam
Transition from construction to operation:
• All pillars are close to completion
• Operation is expected to start in 2017-2018
• In Szeged, 10 endpoints (“SeSo”) will be available for experiments
• Construction is run independently at the three pillars, coordinated by ELI-DC
• ELITRANS: transformation from ERDF-funded distributed implementation towards ERIC-governed unified operation
ELITRANS H2020 Project on the ELI roadmap
Transition from distributed implementations towards integrated and unified operation
[Figure: ELI roadmap timeline with year markers 2008, 2011, 2013 and 2017; labels: initiation, PP, MoU, parallel implementation, joint operation, ELI Delivery Consortium, ELI-ERIC, ELITrans]
Main ELITRANS objectives
Developing concepts for ELI-ERIC’s business model:
essential elements of the future ELI-ERIC’s organisation, legal constitution, financial sustainability,
governance, user relations, and international integration
Preparing ELI-ERIC’s “business plan”, adapted to the operation as the world’s first
international laser user facility
corporate-wide concepts for a VRME, definition and standardisation of user-facility interfaces, health and
safety regulations, specialized experiment preparation techniques, computing and big-data management,
innovation and technology transfer aspects
Preparing and undertaking the merger of formally independent national construction
projects towards one unified research infrastructure of pan-European importance.
This includes steps towards the transformation and unification of ELI’s internal structures and
organisational procedures, the creation of an internal corporate identity, harmonisation and unification of
international relations, creation of a common scientific profile and competitive user research
opportunities.
ELITRANS Facts and Figures
Key objectives:
• Developing concepts for ELI-ERIC’s business model
• Preparing ELI-ERIC’s “business plan”
• Managing the merger of independent developments into one unified RI
Timescale, budget:
• September 2015 – August 2018 (36 months)
• 11 work packages, one devoted to “Data and computing”
• EC funding of EUR 3.4 million
Consortium members:
• Coordinator: Extreme Light Delivery Consortium ASBL
• Pillars: ELI-ALPS, ELI-BL, ELI-NP
• E-infrastructures specialising in big data handling: PRACE, KIT, EGI
• Strategic partners: DESY, STFC, Elettra
Global View of ELI Research Process
The model is partially based on: J. Bicarregui: Building an Open Data Infrastructure for Science: Turning Policy into
Practice. Franco-British Workshop on Big Data in Science, November 2012
Data and Computing WP: Goal and Challenges
Goal of Data and Computing:
• Prepare the implementation of a common, ELI-wide data management service layer
• Define interfaces, integrate with e-infrastructure, manage big data, provide unified access to users, recommend data models, conduct pilot projects
Challenges:
• State-of-the-art research tools open new perspectives for acquiring raw data, so data quantity and complexity increase
• Exact needs are still being assessed; making predictions is challenging
• Expected quantity: 1-5 PB of scientific data per year per pillar
• Different endpoints require different acquisition, management and computation tools and technologies
• Need for online processing (during the experiment)
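To give a feel for the expected quantity, the yearly volume can be converted into an average sustained data rate. This is only a rough sketch under assumptions that are not in the slides: decimal units (1 PB = 10^15 bytes), a 365-day year, and data produced evenly over the year. The function name is illustrative.

```python
# Back-of-the-envelope conversion of a yearly data volume into an average rate.
# Assumptions (not from the slides): 1 PB = 10**15 bytes, a 365-day year,
# and data produced evenly over the year. Real experiments are bursty.

SECONDS_PER_YEAR = 365 * 24 * 3600
BYTES_PER_PB = 10**15
PILLARS = 3  # ELI-ALPS, ELI-BL, ELI-NP


def average_rate_mb_per_s(petabytes_per_year: float) -> float:
    """Return the average sustained rate in MB/s (1 MB = 10**6 bytes)."""
    return petabytes_per_year * BYTES_PER_PB / SECONDS_PER_YEAR / 10**6


if __name__ == "__main__":
    for pb in (1, 5):
        per_pillar = average_rate_mb_per_s(pb)
        print(f"{pb} PB/year/pillar: about {per_pillar:.0f} MB/s per pillar, "
              f"{pb * PILLARS} PB/year across {PILLARS} pillars")
```

Under these assumptions, 1-5 PB per year corresponds to an average of roughly 32-159 MB/s per pillar. Because acquisition is concentrated in experiment runs, peak rates during online processing would likely be considerably higher than this average.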
How to Achieve
Tasks within Data and Computing WP:
• Task 1: Develop common concepts for data management, identify requirements
• Task 2: Survey and identify e-infrastructure solutions
• Task 3: Define the common ELI-wide data management service layer
Envisaged User Scenarios (1): Generic Data Management Workflow
Envisaged User Scenarios (2)
• Would-be users: scientists and researchers from the fields of physics, chemistry, biology and nanotechnology
• General public: data might be made available some time after the experiment
• Expected number of users: 500-1000 / year
• Access to the system: on-site during the experiment, remote access in certain cases
• “System”: different systems for different endpoints! Needs are changing from “traditional” scientific computation to HPC
Current Status
• Every pillar is still under development
• Step-by-step installation of experiment endpoints during 2017
• First “friendly user tests” are expected to start in 2017
• Live operation: during 2018
• Components related to data and computing are still in the design phase. Equipment supporting the DAQ process will definitely be kept in-house
• “Offline” processing and mid- and long-term storage might be implemented at the pillar, ERIC, or e-infrastructure level
Closing Remarks: Benefits for Scientific Community
• A unified, ELI-wide data management framework will be applied at every pillar
• Controlled data management processes, transparent service rules
• Experimental data and related metadata will be maintained according to international standards
• Access to a wide range of state-of-the-art data management and computation tools
• Availability of e-infrastructure solutions to support processing and curation of scientific data
THANK YOU FOR YOUR ATTENTION!