GAIA Data Analysis: Modelling and Data
Reduction
J. Torra, F. Figueras, C. Jordi, X. Luri,
C. Fabricius, E. Masana, B. López-Martí, P. Llimona
University of Barcelona
Sept 13, 2004
JENAM 2004
Key Science Objective:
To provide the first statistically significant census of
our Galaxy
Origin, Formation and Evolution of the Galaxy
• Structure and kinematics of our Galaxy
• Stellar populations
• Tests of galaxy formation
We want:
• An unbiased, on-board selected catalogue of
about 10⁹ objects, containing:
– Positions, parallaxes (ε ≈ 10 µas)
– Proper motions (ε ≈ 10 µas yr⁻¹)
– Radial velocities (ε ≈ 1 - 10 km s⁻¹)
– Photometry (wide (5) and intermediate (12) bands)
GAIA Data Analysis
Data Analysis: Concept and Requirements
Data Reduction processing characteristics
• Global Iterative Solution
• Run on a subset of about 100 million (GAIA) “well-behaved”
astronomical objects
• Process applied to:
– Raw data
– Calibration data
– Attitude data
– Science data
• Instrument calibrations, satellite attitude and scientific results are
simultaneously determined
• GAIA data analysis understood to be a complex task
– Data volume: ≈500 TB of data over 5 yrs
– Computation: ≈10²⁰ flop
– Numerical: 0.1 microarcsec ≈ 10⁻¹³ of a circle
– Complexity: data ‘mixed’ in time and space due to the scanning motion; very different types of data associated with a given object
• Hipparcos approach (flat files/sequential processing) inappropriate
• Major software engineering infrastructure required
GAIA Data Analysis and Access System
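The orders of magnitude quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the sustained processing rate is an assumption made for the example, not a figure from the study.

```python
import sys

# Constants taken from the slide; everything marked ASSUMPTION is hypothetical.
ARCSEC_PER_CIRCLE = 360 * 3600           # arcseconds in a full circle
target_arcsec = 0.1 * 1e-6               # 0.1 microarcsec expressed in arcsec

# Fraction of a full circle represented by the numerical target.
fraction_of_circle = target_arcsec / ARCSEC_PER_CIRCLE

# Headroom relative to IEEE double precision (machine epsilon ~2.2e-16).
headroom = fraction_of_circle / sys.float_info.epsilon

total_flop = 1e20                        # estimated operation count
assumed_rate = 1e12                      # ASSUMPTION: sustained 1 Tflop/s
seconds_per_year = 365.25 * 86400
years_needed = total_flop / assumed_rate / seconds_per_year

print(f"0.1 uas = {fraction_of_circle:.2e} of a circle")
print(f"headroom over double-precision epsilon: factor {headroom:.0f}")
print(f"1e20 flop at the assumed rate: {years_needed:.1f} years")
```

The small headroom over double-precision rounding is one way to see why the numerical side of the problem is listed next to data volume and complexity.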
GAIA Data Analysis and Access Study (GDAAS)
Objective: To define an efficient, scalable, maintainable
and usable system for populating the GAIA database
from the satellite data stream, allowing not only data
storage but also the processing of scan data
Challenge: Establish the technical baseline concepts for
the system on a realistic basis and prove the feasibility of
the approach chosen for the reduction of the mission data.
GDAAS1 (Jun 2000 - Jun 2002): First design. Ingestion and cross-matching (XM). Rough design of the Global Iterative Solution (GIS).
GDAAS2 (Aug 2002 - Jan 2005): Implementation of new and more complex algorithms. Convergence of GIS. Running some shell algorithms.
GDAAS3 (Jun 2005 - 2007): A deeper scientific validation plus a technical design of the operational concept.
Consortium
Project team organisation
ESA/ESTEC
GMV (Prime Contractor)
GMV Team, UB Team, CESCA Team
• GMV: Management and software development
• UB: Scientific support and customisation of the GAIA simulator
• CESCA: Hardware infrastructure, processing power and on-line support
Prototype Development
• Design
– Design of the GAIA database
• Data Model Refinement
• Data Manipulation Layer
– Design of the Processing Framework
• Implementation
– Database Data Model and Database Manipulation Layer
– Processing algorithms
– Processing Framework
• Testing
– Integration and Validation at CESCA premises
Data processing structure: prototype
• GASS simulator: Telemetry Stream
• Ingestion & Initial data treatment: Raw Observations, Obs2Elem., Centroiding, Cross-Matching
• The GAIA Database
• Global Iterative Solution: Sources (Astrometric updating), Attitude updating, Calibration, Global
Data retrieved from the GAIA Database by each updating process:
• Attitude updating: all the sources and observations in a given time
• Calibration updating: all the sources and observations in a cal. unit
• Source updating: all the observations of a given source
• Global updating: all the sources in the mission time
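The block structure of the Global Iterative Solution can be illustrated with a toy alternating update. This is only a sketch under strong assumptions: each observation is modelled as a source value plus an attitude offset, without noise, calibration or global parameters, and all names are hypothetical. It is not the GDAAS algorithm itself.

```python
import math

# Toy model (ASSUMPTION): observation = source value + attitude offset.
n_sources, n_times = 6, 9
true_source = [0.5 * i for i in range(n_sources)]
true_attitude = [0.1 * math.sin(j) for j in range(n_times)]

# Sparse scanning pattern: source i is seen at time j unless (i + j) % 3 == 0.
obs = {(i, j): true_source[i] + true_attitude[j]
       for i in range(n_sources) for j in range(n_times)
       if (i + j) % 3 != 0}

source = [0.0] * n_sources       # starting values for both blocks
attitude = [0.0] * n_times


def rms_residual():
    """RMS of observed minus modelled values over all observations."""
    res = [o - source[i] - attitude[j] for (i, j), o in obs.items()]
    return math.sqrt(sum(r * r for r in res) / len(res))


history = [rms_residual()]
for _ in range(30):
    # Source updating: uses all the observations of a given source.
    for i in range(n_sources):
        vals = [o - attitude[j] for (k, j), o in obs.items() if k == i]
        source[i] = sum(vals) / len(vals)
    # Attitude updating: uses all the observations in a given time.
    for j in range(n_times):
        vals = [o - source[i] for (i, k), o in obs.items() if k == j]
        attitude[j] = sum(vals) / len(vals)
    history.append(rms_residual())

print(f"RMS residual: first {history[0]:.3e}, last {history[-1]:.3e}")
```

Each block is solved while the other is held fixed, which is why the database has to serve the same observations grouped either by source or by time. In this toy model a constant can be shifted between the two blocks without changing the residuals, so only the residuals, not the individual parameters, are expected to converge.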
Model
Optics: LSF, no chromaticity
Astrometry: α, δ, µα, µδ, π
Calibration: CCD units
Geometric: 2 variables
Photometric: average
Global Parameter: γ, Sun
Orbit: L2
Attitude: Nom. Scan + Nom. Rot + noise pointing
No improvement of observables.
GDAAS Phase I Conclusions
• The approach chosen has proved quite successful
• O-O + UML tools demonstrated their advantages in the implementation of
this complex system
• Java has proved to be ideal for the problems posed by the system
• The choice of the DBMS has proved to be a key element
• Achieving good concurrent performance on ingestion and cross-matching (CM) was an expensive task
• GIS complexity was increased by the use of wrappers
Test Results
• A 4 year mission (up to 13th magnitude) would generate a
DB of about 1.2 TB. Assuming a scaling factor of 380 from
13th magnitude to 20th magnitude (ratio of total number of
sources), the final database size would be around 460 TB
(not including Spectro data).
• The average ingestion and cross-matching time for a single
processor is about 1.5 hours of processing per day of
observation. This can easily be reduced using distributed
processing.
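The extrapolation above is plain arithmetic and can be reproduced as follows. The figures are those quoted on the slide; the linear speed-up with the number of processors is an idealising assumption added for illustration.

```python
# Figures from the slide.
db_13mag_tb = 1.2            # DB size, 4 year mission, up to 13th magnitude
scaling_factor = 380         # ratio of source counts, 13th to 20th magnitude
db_20mag_tb = db_13mag_tb * scaling_factor   # quoted as "around 460 TB"

hours_per_obs_day = 1.5      # single-processor ingestion + cross-matching
mission_days = 4 * 365.25    # 4 year mission
single_cpu_hours = hours_per_obs_day * mission_days


def wall_clock_days(n_processors):
    """Wall-clock days for ingestion and cross-matching of the test data.

    ASSUMPTION: ideal linear speed-up with no communication overhead.
    """
    return single_cpu_hours / n_processors / 24


print(f"Extrapolated DB size: {db_20mag_tb:.0f} TB")
print(f"Single processor: {wall_clock_days(1):.1f} days")
print(f"16 processors (ideal): {wall_clock_days(16):.1f} days")
```

The processing-time figure refers to the test configuration only; the sketch does not attempt to scale it to 20th magnitude.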
GDAAS Phase II
Objectives
• The objective of the Phase II study is to provide complete
confidence in the overall GAIA data processing approach,
identifying interfaces with all foreseen data reduction steps,
implementing and testing an agreed package of algorithms
provided by the wider GAIA community, and demonstrating
scalability to a final data processing system.
Ingestion and Initial Data Treatment
• Input: Raw (Telemetry) Data
• Decode, Cross-match, Timing, Ingestion
First-look Tasks
• Assess payload health
• Science Alerts
Core Tasks (GIS)
• Calib., attitude, astrometry, global par.
GAIA Database
Shell Tasks: Shell Task 1, Shell Task 2, ..., Shell Task n
Off-line tasks: Pec. objects
Model
Optics: LSF(t,x,y,l), PSF, Chromaticity
Astrometry: α, δ, µα, µδ, π
Calibration: CCD units, Pixel columns
Geometric & Photometric at large and short scale: millions of parameters
LSF, PSF, Chrom. determined
Global Parameter: γ, Sun and Planets
Orbit: Lissajous L2
Attitude: Nom. Scan + Nom. Rot + noise pointing
Improvement of observables. Raw data stored.
Ingestion and Initial Data Treatment:
• Telemetry decoding, streams separation
• Initial centroiding and flux estimation
• Cross-matching and source creation in the DB
Core Tasks:
• Provisional classification of objects
• Global Iterative Solution: attitude, astrometric solution, global parameters and Astro geometric calibration
• Photometric raw data treatment and calibration (Astro & Spectro)
• Radial velocity raw data treatment and calibration (Spectro)
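The "cross-matching and source creation" step can be sketched as a nearest-neighbour search with a matching radius. This is a minimal illustration under assumptions: flat two-dimensional coordinates in arbitrary units, a brute-force search, and a hypothetical function name. The GDAAS implementation is not described on the slides.

```python
import math


def cross_match(observations, sources, radius):
    """Match each (x, y) observation to the nearest source within `radius`.

    When no source is close enough, a new source is created at the
    observed position. Returns the list of assigned source indices.
    `sources` is extended in place.
    """
    assigned = []
    for ox, oy in observations:
        best, best_d = None, radius
        for idx, (sx, sy) in enumerate(sources):
            d = math.hypot(ox - sx, oy - sy)
            if d <= best_d:
                best, best_d = idx, d
        if best is None:
            sources.append((ox, oy))     # source creation in the "DB"
            best = len(sources) - 1
        assigned.append(best)
    return assigned


catalogue = [(0.0, 0.0), (10.0, 0.0)]
new_obs = [(0.1, -0.1), (9.8, 0.2), (5.0, 5.0), (5.1, 4.9)]
matches = cross_match(new_obs, catalogue, radius=0.5)
print(matches, len(catalogue))
```

The first two observations fall on existing sources, the third creates a new source, and the fourth matches that newly created source. A production system would need spherical geometry, source motion and a spatial index rather than a brute-force loop.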
Shell
• Double star analysis (visual, astrometric, ...)
• Variability analysis
• Exoplanet detection
• Minor planets treatment (identification and orbit det.)
• Derivation of astrophysical parameters for stars and QSOs
• Radial velocity analysis
• Other...
First-look
• Astrometric first-look analysis (great circle reduction)
• Science alerts (supernova, microlensing, ...)
Recent GIS Test Results
Second campaign of GIS testing
Test-2
(June-September 2004+)
Data: 18 months of mission data
Model for telemetry data:
measurement errors + nominal values for attitude, calibration, global and
source parameters
Processes: IDT, Source (α, δ, π & µ), Attitude, Calibration & Global
Parameters for GIS initialisation:
• Raw attitude data: Gaussian random scatter (σ = 1 mas)
• Geometric calibration: Gaussian random scatter (σ = 1 mas)
• Global: γ = 1.1 (10 % error, translates into about 1 mas)
• 10 % of primary sources (initial π = 0, absolute parallax):
– Gaussian scatter: (α, δ) = 10 mas, (µα, µδ, µr) = 1 mas/yr
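The initialisation described above can be sketched by drawing the starting errors from Gaussian distributions. This is one possible reading of the slide, not the actual test set-up: the array sizes, the seed and the choice of a random 10 % subset of sources as primaries are assumptions made for illustration.

```python
import random
import statistics

rng = random.Random(2004)      # fixed seed so the sketch is repeatable

# ASSUMPTION: sample sizes are arbitrary; units are mas and mas/yr.
n_attitude, n_calibration, n_sources = 10000, 10000, 10000

# Raw attitude and geometric calibration: Gaussian scatter, sigma = 1 mas.
attitude_error = [rng.gauss(0.0, 1.0) for _ in range(n_attitude)]
calibration_error = [rng.gauss(0.0, 1.0) for _ in range(n_calibration)]

# Global parameter: nominal gamma = 1, started with a 10 % error.
gamma_initial = 1.0 * 1.10

# ASSUMPTION: the primaries are a random 10 % subset of all sources.
primaries = rng.sample(range(n_sources), n_sources // 10)
initial = {
    i: {
        "d_alpha": rng.gauss(0.0, 10.0),    # position scatter, 10 mas
        "d_delta": rng.gauss(0.0, 10.0),
        "d_mu_alpha": rng.gauss(0.0, 1.0),  # proper-motion scatter, 1 mas/yr
        "d_mu_delta": rng.gauss(0.0, 1.0),
        "d_mu_r": rng.gauss(0.0, 1.0),
        "parallax": 0.0,                    # initial parallax set to zero
    }
    for i in primaries
}

print(f"attitude scatter: {statistics.pstdev(attitude_error):.3f} mas")
print(f"primary sources initialised: {len(initial)}")
```

Starting from deliberately wrong values and checking whether the iterations recover the nominal ones is the kind of test whose outcome the following slides report.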
Test-2: Preliminary Results
Value of the global
parameter γ after each GIS
iteration.
Test-2: Preliminary Results
Mean difference between the
updated and the theoretical
value for the parallax after each
GIS iteration.
Blue symbols: primary sources.
Red symbols: cross-matching
sources.
Test-2: First Results
Conclusions:
GIS is converging
GDAAS3
Objectives for GDAAS3
1. To prove the GIS approach at a level sufficient to extrapolate
(and validate the approach) to the full mission
2. To implement the shell algorithm GIS at a level sufficient to
extrapolate (and validate the approach) to the full mission
3. To provide a design of the GAIA operational system
General structure
• Simulator
• Data Base and GDAAS system
• Core Algorithms
• MBP photometry
• Radial velocities
• Verification
• GRID
• SW-HW asses.
• Shell 1, Shell 2, Shell 3, Shell 4, Shell 5, ..., Shell n
TASKS
• Verification: TBD
• MBP photometry: TBD
• Radial velocities: MEUDON
• Simulator: UB and others
• SW-HW asses.: ESAC
[Diagram: the remaining boxes are Data Base and GDAAS system, Core Algorithms, GRID and Shell 1, 2, 3, 4, 5, ..., n; the remaining labels are "UB, GMV", "UB, GMV, CESCA", "CESCA" and "CNES". Which label belongs to which box is not recoverable from the extracted layout.]