v5-01-Release

v5-02-Release
Peter Hristov
27/02/2012
v5-02-Rev-01
• Corresponds to the trunk rev. 54887
• Works with Root v5-32-01 and Geant3 v1-12
• All tests are OK
• No significant difference in the memory
consumption with respect to v5-01-Release
• Known issues
– Creation of AliMDC RPM with shared libraries.
Ongoing
– Merging of raw tags. Under investigation
v5-02-Release: Requests
• #91977 Tracking cosmic muons in the TPC and
momentum calculation
• #92135 Port TPC event filtering task to AliRoot
release v5-02
• #92162 Porting request - AliTPCtrackerMI.cxx
v5-01-Release: One more tag needed
for the LHC11h MC production
• #87875 Big memory leak in AOD
• #91510 Reconstruction able to deal with different triggers
• #91718 port AliTOFTenderSupply to release
• #91883 Request to port to release (5.01 and 5.02) the
AliQAThresholds related classes
• #91889 porting CDB snapshot commits to v5-01-Release
• #91931 port to Release AliT0QAChecker
• #91932 commit to trunk and port to Release AliAODTZERO
• #91935 Request to port changes in the
AliAnalysisTaskPHOSTriggerQA.* to the Release
• #91954 VZERO: Changes related to DQM/QA checker
v5-01-Release: Requests
• #91962 Porting into the release of
TOF/AliTOFTrigger update
• #91976 Wrong attribute in the
AliAnalysisTaskESDfilter
• #91977 Tracking cosmic muons in the TPC and
momentum calculation
• #92135 Port TPC event filtering task to AliRoot
release v5-02
• #92162 Porting request - AliTPCtrackerMI.cxx
Old slides
v5-02-Release
• Modifications from rev. 54391 to 54845 tested
– main tests are OK, no need for manual selection
– this makes the new release suitable for AOD filtering with
tenders
– all porting requests will be solved “in one go”
• Issues
– Creation of AliMDC RPM with shared libraries. In progress
– Creation of PAR files.
– Merging of raw tags. Under investigation, not yet
reproduced
• Next tag: tomorrow, 28/02
– Moving to the usual Savannah porting requests after it
Changes: v5-01-Rev-26
• Adding protection against not initialized
corrected ZDCTDC values from reconstruction.
From rev. 54116
• #91648 To be ported to the release: Fix needed
for simulation of strange baryons in
HIJING+Signals. From rev. 54646
• #91695 Request to port AliQAThresholds.h to the
Release. From rev. 54254
• #91322 Request to port bugfix in the
AliPHOSReconstructor.cxx to the v5-01-Release.
From rev. 54496
Requests: v5-02-Release
• #91482 EMCAL: Port fix for new 2 SM 1/3 to release
5.02
• #91465 Moved calibration object for ESD initialization
• #91421 Request to port PHOS trigger QA task to the
Release
• #91322 Request to port bugfix in the
AliPHOSReconstructor.cxx to the v5-02-Release
• #91253 Request to port bugfix in the
AliAnalysisTaskPHOSPbPbQA.cxx to the v5-02-Release
• #91071 Running on AODs very slow on AAF
Other reports
• #91699 Request on more detailed ITS cluster
information in the ESDs
• #91685 ITS QA porting request
• #91592 ZDC: new geometry to be implemented
• #91512 Trigger aliases for reconstruction
• #91510 Reconstruction able to deal with different
triggers
• #91500 stored TPC only information in AOD049 LHC10h
v5-02-Release
• Branch v5-02-Release is created on 03/02
• All tests OK (Root v5-30-00-patches)
• Tests with Root v5-32-00-patches: ongoing
• Several modifications to be ported
• Issues
– Creation of AliMDC RPM with shared libraries
– Creation of PAR files
– Merging of raw tags
Changes: v5-01-Rev-24
• #24646 Re-produce AODs for cascades in pass2
PbPb 2010 (data & MC). Change in the
configuration of vertexingHF from rev.
54533,54561
• #90456: Request for modification of
AliSimulation::ConvertRaw2Sdigits to select
events for embedding. From rev. 54508
• Technical fixes for Root 5-32. From rev. 51946
• Technical fixes for Coverity 11390, 11389, 11388.
From rev. 54588
v5-01-Rev-25
• Technical fix from rev. 50674: Patch for recent
TRefArray->TObjArray usage. Needed for the
MC AOD filtering
v5-01-Rev-23
• #91061: AOD production very slow. From rev.
54460
Changes: v5-01-Rev-22
• #88827: Request for porting updates to TOF QA task into
release. From rev. 54242
• #90320: Request to port additional consistency checks in
HLT TPC cluster decoding to v5-01-Release. From rev.
54055,54056
• #90546: Filtering crashes when processing 2010 MC data.
From rev. 54115
• #90738 Request to port a fix to the release in
AliZDCDigitizer. From rev. 54035
• #90749 ESD Porting Request: GetTPCClusterInfo with
additional switch. From rev. 54081
• #90812 Porting overlap fixes to the release. From rev.
54092
Changes: v5-01-Rev-22
• #90817 Please commit PHOS trigger part in the
AliAnalysisTaskESDfilter.cxx. From rev. 54439
• #90870 Request to port ZDC code to the release. From rev.
54133
• #90916 Request: porting to v5-01-Release of the new ESD->AOD filter. From rev. 52540,53853,54188,54210
• #91005 Fix in AliCTPRawStream.cxx. From rev. 54234
• #91126: Request to commit a patch for AMPT. From rev.
54424
• #91159: Vertex generation in AliGenCorrHF. From rev.
54417
• #91030: 2.76 TeV LHC11a pass3 dca problem. From rev.
54304
Other reports
• #90615 Problems in the material budget,
eta<0.9 and 0.9<eta<1.4
• #90944 adding alignment objects for the
additional supermodules in the geometry
setup 2012
• #90939 Request to include (anti)hypertriton in
ALICE GEANT3
Requests/Additional fixes
• #90625 Memory problem in AliTPCtrackerMI
• #90622 Logic flaw in AliTPCseed. From rev.
53997
• #90616 Worrying message from TPC
reconstruction. From rev. 54237
• Changes in RAW (TClonesArray usage)
v5-02-Release
• Coverity: 129 defects to be fixed
• AliRoot tests: mostly OK
• Root v5-32-00-patches: needs tests
• PWGs transition: PWG0 and PWG2 still
ongoing
• One library per subdirectory: not yet ready
• Savannah bug reports: many old bugs are still
open
GDB on Grid
• Some potential problems detected and fixed
(ITS, TPC, HLT)
• Some jobs fail in the beginning (event 0-10),
~4%
– Not reproducible locally, even if we run many
reconstruction jobs in parallel
– Always caused by std::bad_alloc in different places
• Other jobs are killed by the system (memory)
~20%
Changes: v5-01-Rev-21
• #90324: Exception in
AliITStrackerMI::FollowProlongationTree. From
rev. 53978
• #90549: Request to port r53948 to the release
(MUON small leak fix)
• #90658: For v5-01: Option to isolate heavy flavor
part of a Pythia event. From rev. 53959
• #84578: Request to extend AliGenBox for using
Yrange. From rev. 53996
• Optional RB/PX 24 shielding and scoring. From
rev. 53955,53956
Changes: v5-01-Rev-21
• #90461: Request to port a new feature for ZDC
to the release. From rev. 53705
• #90504: EVE muon_init.C update r53875
• #25142: Commit and porting to Release of the
new ESD->AOD filter. From rev. 54021
• #90540: Port 53910,53911 and 53912 to the
Release (Full MC Header in the AOD)
Changes: OCDB
• #90756 Request to port object in RAW OCDB
(for realistic MUON simulations)
• #90736 Calibration of the TRD cosmics of
May,Jun and August
Reconstruction of RAW (LHC11h)
• Back trace problem solved
• Clean-up of the PATH and LD_LIBRARY_PATH
on the GRID
• Clean-up of the AliEn libraries
• Deterministic splitting of the failed jobs (in
preparation)
• New tests in parallel with the Grid production
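A minimal sketch of what a deterministic split could look like, assuming it means that the input files of a failed job are always divided into the same sub-jobs on every resubmission. The function name split_job, the chunk size and the file names are hypothetical and are not taken from the production scripts.

```python
def split_job(files, chunk_size):
    """Split the input files of a failed job into fixed-size sub-jobs.

    The files are de-duplicated and sorted first, so the same input
    always yields the same sub-jobs, whatever order it arrives in.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    ordered = sorted(set(files))
    return [ordered[i:i + chunk_size] for i in range(0, len(ordered), chunk_size)]


# Hypothetical raw-file names, listed in arbitrary order.
failed_job = ["raw_003.root", "raw_001.root", "raw_004.root", "raw_002.root", "raw_005.root"]
for index, sub_job in enumerate(split_job(failed_job, 2)):
    print(index, sub_job)
```

Sorting before slicing is what makes the split reproducible: a chunk that fails again can then be narrowed down further without the file-to-sub-job assignment changing between attempts.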
Changes: v5-01-Rev-20
• #90319: Segmentation violation in
AliPHOSRawFitterv1::~AliPHOSRawFitterv1. From rev. 53869
• #90053: Request: Port bug fix TRD calibration code to release. From
rev. 53734
• #90292: Add line ConvertZDC() in
AliAnalysisTaskESDfilter::ConvertESDtoAOD(). From rev. 53895
• #90307: ZDC QA update. From rev. 52738,53081,53271
• #90309: ZDC request to port code to the release. From rev. 52616
• #90024: port changes in PYTHIA6 for pyquen production (pyquen1.5.F, CMakelib6.4.21.pkg updated). From rev. 53645
• #90359: Request: fix cached values in ESD. From rev. 53900
• #90013: Vertexing task crashing in trunk. From rev. 53793
• Additional protection. From rev. 53904
LHC11h Pass2 – reconstruction details
• Use v5-01-Rev-19 in the production
• Start in inverse time order (last runs first, “LIFO”): OK
• Use MB trigger for CPass0: OK
• Exercise the full production setup on runs from “grey
area”: special “gdb” production, run 170593: OK
• Run with TPC pools: OK
• Work on a local raw file: OK
• Use OCDB snapshot: OK
• Keep only the rec. points for the current event: OK
• Switch off QA: OK
• Switch off MUON, if the memory consumption is still
too high
Results
• CPass0: 185 jobs, 523,509 out of 539,890 raw
files successfully reconstructed => 97% efficiency
– All runs with mag. field configuration (+ +) ready
(170593-169628)
– Details on losses follow
• Pass2 current status: 131 jobs, 225,568 out of
362,790 files successfully reconstructed => 62.2%
efficiency
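The quoted efficiencies follow directly from the file counts. A short check in Python (the helper name efficiency is only illustrative):

```python
def efficiency(reconstructed, total):
    """Return the percentage of raw files that were successfully reconstructed."""
    return 100.0 * reconstructed / total


cpass0 = efficiency(523_509, 539_890)
pass2 = efficiency(225_568, 362_790)

print(f"CPass0: {cpass0:.1f}%")  # CPass0: 97.0%
print(f"Pass2:  {pass2:.1f}%")   # Pass2:  62.2%
```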
Losses – Pass2
• G__exception – average 6.5%
• Strong run dependency
[Figure: G__exception (%) per run, runs 169835 to 170593; vertical axis from 0% to 35%]
Losses – Pass2 (2)
• Memory overrun – average 16.8%
• Strong run dependency
• Function of the number of events per chunk and of the data taking
configuration
[Figure: Memory overrun (%) per run, runs 169835 to 170593; vertical axis from 0% to 40%]
Losses
• G__exception
– Debugging is hard as there is no traceback
– Seems to be random (from syswatch.log)
– Irreproducible in local tests
– No related issues shown by Valgrind
– Appears in the first events of the chunks
– Working with ROOT experts, at least to get the
exception in the logs => special “gdb” run
• Memory overrun
– Additional profiling ongoing
– All external sources are ruled out: a gain is only possible
through changes in reconstruction
Special “gdb” run
• “catch throw” mode
• Several problems discovered, to be submitted
to Savannah. Most probably uninitialized
memory is used as index in an array
– TClonesArray new with placement, where the
index comes from GetEntriesFast
– corrupted (?) raw data
– deletion of arrays
Plans
• Continue the investigation of G__exception on
the GRID
• Understand the difference between CPass0
and Pass2 (MB trigger, V0s, cascades?)
• Try to reproduce completely the GRID
execution flow on a local machine
• Resubmit the failed jobs in “split” mode
v5-02-Release
• Complete the transition of the analysis code to
the new modules
• Move every library to a sub-directory and get rid
of *.pkg (native CMake)
• Fix the Coverity defects and compilation warnings
• Solve as many Savannah issues as possible
• Create the branch at the end of January
• First stable tag in February