Presentation

CEOS - WGISS
BRINGING PROCESSING
CLOSE TO THE DATA
Richard MORENO
15 March 2016
SUMMARY

CONTEXT

DATA DOWNLOADING IMPROVEMENT

PROCESSING CLOSE TO THE DATA

BIG DATA AND DISTRIBUTED ARCHITECTURE

IS CLOUD COMPUTING THE SOLUTION ?
CONTEXT
• Big data
Big volume
Big processing
• Need to change the data usage
Limitations
» Cost of storage of several PB
» Bandwidth resources
No more downloading the full archive or bulk extraction
Bring the processing close to the data
• Copernicus / Datacube
• Integration of the Copernicus mirrors
» In Europe
» Worldwide ???
• Need / possibility to federate datacubes : cube of cubes
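
The bandwidth limitation can be made concrete with a back-of-envelope calculation. The sketch below is only illustrative: the archive sizes, link speeds and the 80 % sustained efficiency are assumptions chosen for the example, not figures from this presentation.

```python
PB = 10**15    # bytes in one petabyte (decimal convention)
GBPS = 10**9   # bits per second in one Gbit/s


def transfer_days(volume_bytes: float, link_bits_per_s: float,
                  efficiency: float = 0.8) -> float:
    """Days needed to move volume_bytes over a link.

    efficiency is the assumed fraction of the nominal link rate
    that is actually sustained (hypothetical default: 0.8).
    """
    seconds = volume_bytes * 8 / (link_bits_per_s * efficiency)
    return seconds / 86_400


if __name__ == "__main__":
    # Assumed archive sizes and link speeds, for illustration only.
    for volume_pb in (1, 5):
        for link_gbps in (1, 10):
            days = transfer_days(volume_pb * PB, link_gbps * GBPS)
            print(f"{volume_pb} PB over {link_gbps} Gbit/s: {days:.0f} days")
```

Under these assumptions, a single petabyte over a 1 Gbit/s link takes roughly 116 days, which is the kind of figure that motivates moving the processing instead of the data.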
3
DATA DOWNLOADING IMPROVEMENT
• Tools / Standards / Interoperability – French Copernicus CollGS
• Natural language for searching data of interest
• Web services access - OpenSearch
• Bulk extraction
» Metalink
» JDownloader
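
As an illustration of OpenSearch-style access, the sketch below fills an OpenSearch URL template, where `{name}` marks a required parameter and `{name?}` an optional one. The catalogue URL and the search terms are hypothetical placeholders, and `fill_template` is a helper written for this example, not part of any CollGS tool.

```python
import re
from urllib.parse import quote

# Matches {name} (required) and {name?} (optional) template parameters.
_PARAM = re.compile(r"\{([^{}?]+)(\?)?\}")


def fill_template(template: str, values: dict) -> str:
    """Substitute OpenSearch template parameters with URL-encoded values."""
    def substitute(match: re.Match) -> str:
        name, optional = match.group(1), match.group(2)
        if name in values:
            return quote(str(values[name]), safe="")
        if optional:
            return ""  # optional parameters may be left empty
        raise KeyError(f"missing required OpenSearch parameter: {name}")

    return _PARAM.sub(substitute, template)


# Hypothetical template, as it might appear in a description document.
TEMPLATE = ("https://catalogue.example.org/search"
            "?q={searchTerms}&count={count?}&startIndex={startIndex?}")

if __name__ == "__main__":
    print(fill_template(TEMPLATE, {"searchTerms": "Sentinel-2 Toulouse 2016",
                                   "count": 10}))
```

In practice a client would read the template from the catalogue's OpenSearch description document rather than hard-coding it.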

These tools do not solve
• Bandwidth resource
• Duplication of storage
PROCESSING CLOSE TO THE DATA
Different types of processing
Interactive processing via web services : WPS
Interactive processing via MMI
» Google engine
» GA Analytics Expression language
» Notebook (e.g. Jupyter)
Mass processing on HPC / Cloud
SandBox for algorithms / processing tuning
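
To give an idea of what interactive processing via WPS looks like from the client side, the sketch below builds a WPS 1.0.0 Execute request in key-value (GET) encoding. The endpoint, the process identifier `ndvi` and its inputs are hypothetical; a real server advertises its processes through GetCapabilities and DescribeProcess.

```python
from urllib.parse import urlencode


def wps_execute_url(endpoint: str, identifier: str, inputs: dict) -> str:
    """Build a WPS 1.0.0 Execute URL using key-value-pair encoding.

    Inputs are joined as "name=value;name=value" in the DataInputs
    parameter, which is then URL-encoded as a whole.
    """
    data_inputs = ";".join(f"{name}={value}" for name, value in inputs.items())
    query = urlencode({
        "service": "WPS",
        "version": "1.0.0",
        "request": "Execute",
        "identifier": identifier,
        "DataInputs": data_inputs,
    })
    return f"{endpoint}?{query}"


if __name__ == "__main__":
    # Hypothetical endpoint, process name and inputs.
    print(wps_execute_url("https://wps.example.org/wps", "ndvi",
                          {"tile": "31TCJ", "date": "2016-03-15"}))
```

Only the request and the result cross the network; the product named in the inputs stays on the server that hosts the archive.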
BIG DATA AND DISTRIBUTED ARCHITECTURE
Is big data compatible with distributed architecture ?
Examples
» OGC OWS-10
» ESA and European agencies Federated pilot
» EUCLID project
» Can it be generalized ?
» Is a centralized platform / cloud the only solution ?
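
One way to picture a distributed alternative is a federation in which each site runs the processing on its own holdings and returns only a small partial result. The sketch below is a toy model under that assumption: `Site`, `local_stats` and `federated_mean` are hypothetical names, and in-memory lists stand in for the archives.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Site:
    """A data centre holding part of the archive (toy stand-in)."""
    name: str
    values: list

    def run(self, process: Callable):
        # The process executes where the data lives; only its
        # (small) result is returned to the caller.
        return process(self.values)


def local_stats(values: list) -> tuple:
    """Partial result computed at a site: (sum, count)."""
    return (sum(values), len(values))


def federated_mean(sites: list) -> float:
    """Combine per-site partial results into a global mean."""
    partials = [site.run(local_stats) for site in sites]
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    if count == 0:
        raise ValueError("no data available at any site")
    return total / count


if __name__ == "__main__":
    sites = [Site("mirror-a", [1.0, 2.0, 3.0]), Site("mirror-b", [5.0])]
    print(federated_mean(sites))
```

This pattern only works directly for computations that decompose into mergeable partial results; algorithms that need all the data in one place are the harder case behind the questions above.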
IS CLOUD COMPUTING THE SOLUTION ?
Advantages and disadvantages of a cloud-computing-based architecture ?