The ATLAS Computing Model: Status, Plans and Future Possibilities
Shawn McKee, University of Michigan
CCP 2006, Gyeongju, Korea, August 29th, 2006

Overview
- The ATLAS collaboration has only a year before it must manage large amounts of "real" ATLAS data for its globally distributed collaboration.
- ATLAS physicists need the software and physical infrastructure required to:
  - calibrate and align detector subsystems to produce well understood data
  - realistically simulate the ATLAS detector and its underlying physics
  - provide access to ATLAS data globally
  - define, manage, search and analyze data-sets of interest
- I will cover the current status, plans and some of the relevant research in this area, and indicate how it might benefit ATLAS in augmenting and extending its infrastructure.

The ATLAS Computing Model
- The Computing Model is fairly well evolved, documented in the Computing TDR:
  http://doc.cern.ch//archive/electronic/cern/preprints/lhcc/public/lhcc-2005022.pdf
- There are many areas with significant questions/issues to be resolved:
  - The calibration and alignment strategy is still evolving.
  - Physics data access patterns MAY be exercised (SC4, since June), but we are unlikely to know the real patterns until 2007/2008!
  - There are still uncertainties on the event sizes and the reconstruction time.
  - How best to integrate ongoing "infrastructure" improvements from research efforts into our operating model?
- Lesson from the previous round of experiments at CERN (LEP, 1989-2000): reviews in 1988 underestimated the computing requirements by an order of magnitude!

ATLAS Computing Model Overview
- We have a hierarchical model (EF-T0-T1-T2) with specific roles and responsibilities.
- Data will be processed in stages: RAW -> ESD -> AOD -> TAG.
- Data "production" is well defined and scheduled, and roles and responsibilities are assigned within the hierarchy.
- Users will send jobs to the data and extract relevant data, typically ntuples or similar.
- The goal is a production and analysis system with seamless access to all ATLAS grid resources.
- All resources need to be managed effectively to ensure ATLAS goals are met and resource providers' policies are enforced; grid middleware must provide this.
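To make the staged reduction concrete, the sketch below (Python; the nominal per-event sizes are the ones quoted later in this talk, and the TAG record size is not quoted, so it is left out) computes how much each derived format shrinks the per-event footprint relative to RAW.

    # Staged data reduction using the nominal per-event sizes quoted in this
    # talk: RAW ~1.6 MB, ESD ~500 kB, AOD ~100 kB (TAG size not quoted here).
    NOMINAL_EVENT_SIZE_MB = {"RAW": 1.6, "ESD": 0.5, "AOD": 0.1}

    def reduction_factors(sizes):
        """Size reduction of each derived format relative to RAW."""
        raw = sizes["RAW"]
        return {stage: raw / size for stage, size in sizes.items()}

    for stage, factor in reduction_factors(NOMINAL_EVENT_SIZE_MB).items():
        print(f"{stage}: {NOMINAL_EVENT_SIZE_MB[stage]} MB/event, {factor:.0f}x smaller than RAW")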
ATLAS Facilities and Roles
- Event Filter farm at CERN: assembles data (at CERN) into a stream to the Tier-0 center.
- Tier-0 center at CERN:
  - Data archiving: raw data to mass storage at CERN and to Tier-1 centers.
  - Production: fast production of Event Summary Data (ESD) and Analysis Object Data (AOD).
  - Distribution: ESD and AOD to Tier-1 centers and mass storage at CERN.
- Tier-1 centers distributed worldwide (10 centers):
  - Data stewardship: re-reconstruction of the raw data they archive, producing new ESD and AOD.
  - Coordinated access to full ESD and AOD (all AOD, 20-100% of ESD depending upon the site).
- Tier-2 centers distributed worldwide (approximately 30 centers):
  - Monte Carlo simulation, producing ESD and AOD, which are sent to Tier-1 centers.
  - On-demand user physics analysis of shared datasets.
- Tier-3 centers distributed worldwide: physics analysis.
- A CERN Analysis Facility: analysis, plus enhanced access to ESD and RAW/calibration data on demand.

Computing Model: event data flow from the Event Filter
- Events are written in "ByteStream" format by the Event Filter farm in 2 GB files, ~1000 events/file (nominal size is 1.6 MB/event).
- 200 Hz trigger rate (independent of luminosity).
- Currently 4+ streams are foreseen:
  - an express stream with the "most interesting" events
  - calibration events (including some physics streams, such as inclusive leptons)
  - "trouble maker" events (for debugging)
  - the full (undivided) event stream
- One 2 GB file every 5 seconds will be available from the Event Filter.
- Data will be transferred to the Tier-0 input buffer at 320 MB/s (average).
- The Tier-0 input buffer will have to hold raw data waiting for processing and also cope with possible backlogs; ~125 TB will be sufficient to hold 5 days of raw data on disk.

ATLAS Data Processing
- Tier-0: prompt first-pass processing on express/calibration and physics streams.
  - Within 24-48 hours, process the full physics streams with reasonable calibrations.
  - Implies large data movement from T0 -> T1s, and some T0 <-> T2 (calibration).
- Tier-1: reprocess 1-2 months after arrival with better calibrations.
  - Reprocess all local RAW at year end with improved calibration and software.
  - Implies large data movement T1 <-> T1 and T1 -> T2.

ATLAS partial & "average" Tier-1 data flow (2008) [slide from D. Barberis]
[Figure: data flows between the Tier-0 disk buffer and a Tier-1's tape, CPU farm and disk storage, the other Tier-1s and the associated Tier-2s, with per-stream file sizes, rates, files/day and bandwidths for RAW (1.6 GB/file, 0.02 Hz, ~1.7K files/day, 32 MB/s, 2.7 TB/day), ESD, AOD and merged AOD (AODm) products. There are a significant number of flows to be managed and optimized, plus the simulation and analysis data flows.]
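The file and bandwidth figures quoted above follow directly from the nominal event size and trigger rate; the short check below (Python; decimal units and an 86400-second day are assumed) reproduces the Event Filter numbers and, with the same file-size-times-rate arithmetic, a per-stream figure from the Tier-1 flow slide.

    # Event Filter -> Tier-0 figures from the nominal event size and trigger rate.
    event_size_mb = 1.6
    events_per_file = 1000
    trigger_rate_hz = 200

    file_size_gb = event_size_mb * events_per_file / 1000    # 1.6 GB, written as ~2 GB files
    seconds_per_file = events_per_file / trigger_rate_hz      # one file every 5 s
    buffer_rate_mb_s = event_size_mb * trigger_rate_hz        # 320 MB/s into the Tier-0 buffer

    # The same arithmetic gives the per-stream Tier-1 figures, e.g. RAW at
    # 1.6 GB/file and 0.02 Hz:
    raw_mb_s = 1.6e3 * 0.02                                    # 32 MB/s
    raw_tb_day = raw_mb_s * 86400 / 1e6                        # ~2.7 TB/day, as quoted
    raw_files_day = 0.02 * 86400                               # ~1.7K files/day

    print(file_size_gb, seconds_per_file, buffer_rate_mb_s)
    print(raw_mb_s, round(raw_tb_day, 2), raw_files_day)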
ATLAS Event Data Model
- RAW: "ByteStream" format, ~1.6 MB/event.
- ESD (Event Summary Data): full output of reconstruction in object (POOL/ROOT) format: tracks (and their hits), calorimeter clusters, calorimeter cells, combined reconstruction objects, etc. Nominal size 500 kB/event, currently 2.5 times larger; contents and technology under revision.
- AOD (Analysis Object Data): summary of event reconstruction with "physics" (POOL/ROOT) objects: electrons, muons, jets, etc. Nominal size 100 kB/event, currently 70% of that; contents and technology under revision.
- TAG: database used to quickly select events in AOD and/or ESD files.

ATLAS Data Streaming
- The ATLAS Computing TDR had 4 streams from the Event Filter: primary physics, calibration, express and problem events. The calibration stream has split at least once since!
- Discussions are focused upon optimisation of data access.
- At the AOD level we envisage ~10 streams; TAGs are useful for event selection and data-set definition.
- We are now planning ESD and RAW streaming:
  - Straw-man streaming schemes (trigger based) are being agreed.
  - We will explore the access improvements in large-scale exercises.
  - We are also looking at overlaps, bookkeeping, etc.

HEP Data Analysis
- Raw data: hits, pulse heights.
- Reconstructed data (ESD): tracks, clusters, ...
- Analysis Objects (AOD): physics objects, summarized and organized by physics topic.
- Ntuples, histograms, statistical data.

Production Data Processing
[Figure: the production processing chain. On the real-data side, the trigger system and data acquisition (with run conditions and the Level-3 trigger) deliver raw data and trigger tags, which reconstruction turns into Event Summary Data (ESD) and event tags using calibration data. On the simulation side, physics models drive Monte Carlo truth and detector simulation to give MC raw data, which is reconstructed into MC ESD and MC event tags. Coordination is required at the collaboration and group levels.]

Physics Analysis
[Figure: the analysis chain. Collaboration-wide event selection over ESD and event tags (together with calibration and raw data) runs at Tiers 0/1; analysis-group processing of analysis objects runs at Tier 2; individual physicists work with physics objects and statistical objects at Tiers 3/4.]

ATLAS Resource Requirements in 2008
- From the Computing TDR; recent (July 2006) updates have reduced the expected contributions.

                     CPU (MSI2k)   Tape (PB)   Disk (PB)
    Tier-0               3.7          2.1         0.2
    CERN AF              2.1          0.3         1.0
    Sum of Tier-1s      16.7          6.0         7.6
    Sum of Tier-2s      18.9          0.0         6.1
    Total               41.4          8.4        14.9

ATLAS Grid Infrastructure
- Plan "A" is "the Grid" ... there is no plan "B".
- ATLAS plans to use grid technology to meet its resource needs and to manage those resources.
- Three grids: LCG, NorduGrid and OSG.
  - Significant resources, but different middleware.
  - Teams working on solutions are typically associated with one grid and its middleware.
- In principle all ATLAS resources are available to all ATLAS users; this works out to O(1) CPU per user.
- There is interest by ATLAS users in using their local systems with priority: not only a central system, but flexibility concerning middleware.
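The Total row of the 2008 resource table above can be cross-checked in a few lines (Python; the numbers are copied from the table):

    # Consistency check of the 2008 resource-requirement table
    # (CPU in MSI2k, tape and disk in PB).
    REQUIREMENTS_2008 = {
        "Tier-0":         {"cpu": 3.7,  "tape": 2.1, "disk": 0.2},
        "CERN AF":        {"cpu": 2.1,  "tape": 0.3, "disk": 1.0},
        "Sum of Tier-1s": {"cpu": 16.7, "tape": 6.0, "disk": 7.6},
        "Sum of Tier-2s": {"cpu": 18.9, "tape": 0.0, "disk": 6.1},
    }

    totals = {
        key: round(sum(site[key] for site in REQUIREMENTS_2008.values()), 1)
        for key in ("cpu", "tape", "disk")
    }
    print(totals)  # {'cpu': 41.4, 'tape': 8.4, 'disk': 14.9}, matching the Total row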
ATLAS Virtual Organization
- Until recently the Grid has been a "free for all":
  - no CPU or storage accounting (new, in a prototyping/testing phase)
  - no or limited priorities (roles mapped to a small number of accounts: atlas01-04)
  - no storage space reservation
- Last year ATLAS saw competition for resources between the "official" Rome productions and "unofficial", but organized, productions (B-physics, flavour tagging, ...).
- The latest release of the VOMS (Virtual Organisation Management Service) middleware package allows the definition of user groups and roles within the ATLAS Virtual Organisation and is used by all ATLAS grid flavors!
- Relative priorities are easy to enforce IF all jobs go through the same system.
- For a distributed submission system, it is up to the resource providers to:
  - agree the policies of each site with ATLAS
  - publish and enforce the agreed policies

Calibrating and Aligning ATLAS
- Calibrating and aligning detector subsystems is a critical process: without well understood detectors we will have no meaningful physics data.
- The default option for offline prompt calibration is processing at Tier-0 or at the CERN Analysis Facility; however, the TDR states that:
  - "Tier-2 centres will provide analysis facilities, and some will provide the capacity to produce calibrations based on processing raw data."
  - "Tier-2 facilities may take a range of significant roles in ATLAS such as providing calibration constants, simulation and analysis."
  - "Some Tier-2s may take significant role in calibration following the local detector interests and involvements."
- ATLAS will have some subsystems utilizing Tier-2 centers as calibration and alignment sites.
  - We must ensure we can support the data flow without disrupting other planned flows.
  - The real-time aspect is critical: the system must account for "deadlines".

Proposed ATLAS Muon Calibration System (quoted bandwidths are for a 10 kHz muon rate)
[Figure: proposed muon-calibration data path. Calibration fragments produced by Level-2 processing unit (L2PU) threads are queued in server memory and sent over the control network (TCP/IP, UDP, etc.) to roughly 20 local servers at ~500 kB/s each; the local servers spool to disk, and a gatherer forwards the combined stream at ~10 MB/s to the calibration server and calibration farm. A short arithmetic check of these rates follows the analysis-model slide below.]

ATLAS Simulations
- Within ATLAS the Tier-2 centers will be responsible for the bulk of the simulation effort.
- Current planning assumes ATLAS will simulate approximately 20% of the real data volume. This number is dictated by resources; ATLAS may need to find a way to increase this fraction.
- The event generator framework interfaces multiple packages, including the Genser distribution provided by LCG-AA.
- Simulation with Geant4 since early 2004:
  - automatic geometry build from GeoModel
  - >25M events fully simulated up to now; since mid-2004 only a handful of crashes!
- Digitization has been tested and tuned with test-beam data.

ATLAS Analysis Computing Model
- The ATLAS analysis model is broken into two components:
  - Scheduled central production of augmented AOD, tuples and TAG collections from ESD; derived files are moved to other Tier-1s and to Tier-2s.
  - Chaotic user analysis of augmented AOD streams, tuples, new selections, etc., plus individual user simulation and CPU-bound tasks matching the official MC production.
- Modest to large(?) job traffic between Tier-2s (and Tier-1s, Tier-3s).
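As a quick sanity check on the muon-calibration figures above: roughly 20 local servers at ~500 kB/s each do aggregate to the ~10 MB/s quoted at the gatherer, and sustained around the clock this is of the same order as the ~0.5 TB/day high-pT muon calibration volume estimated later in this talk. A minimal sketch (Python; decimal units and the round numbers from the figure are assumed):

    # Aggregate muon-calibration bandwidth from the figure's round numbers
    # (assumptions: ~20 local servers at ~500 kB/s each, 1 MB = 1e6 B).
    n_local_servers = 20
    per_server_kb_s = 500

    gatherer_mb_s = n_local_servers * per_server_kb_s / 1000   # ~10 MB/s, as quoted
    daily_volume_tb = gatherer_mb_s * 86400 / 1e6               # ~0.86 TB/day if sustained

    print(gatherer_mb_s, round(daily_volume_tb, 2))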
Distributed Analysis
- At this point the emphasis is on a batch model to implement the ATLAS Computing Model; interactive solutions are difficult to realize on top of the current middleware layer.
- We expect ATLAS users to send large batches of short jobs to optimize their turnaround.
- Key issues: scalability, data access, analysis in parallel with production, and job priorities.
- Distributed-analysis effectiveness depends strongly upon the hardware and software infrastructure.
- Analysis is divided into "group" and "on demand" types.

ATLAS Group Analysis
- Group analysis is characterised by access to full ESD and perhaps RAW data:
  - This is resource intensive, so it must be a scheduled activity.
  - It can back-navigate from AOD to ESD at the same site.
  - It can harvest small samples of ESD (and some RAW) to be sent to Tier-2s.
  - It must be agreed by the physics and detector groups.
- Group analysis will produce deep copies of subsets, dataset definitions and TAG selections.
- Big trains:
  - Access is most efficient if analyses are blocked into a "big train".
  - The idea has been around for a while and is already used in e.g. heavy ions.
  - Each wagon (group) has a wagon master (= production manager), who must ensure it will not derail the train.
  - The train must run often enough (every ~2 weeks?).

ATLAS On-demand Analysis
- Restricted to Tier-2s and the CAF:
  - Some Tier-2s could be specialized for some groups, but ALL Tier-2s are for ATLAS-wide usage.
  - Role- and group-based quotas are essential; quotas are to be determined per group, not per user.
- Data selection:
  - over small samples, with Tier-2 file-based TAG and the AMI dataset selector
  - TAG queries over larger samples by batch job to the database TAG at Tier-1s / large Tier-2s
- What data? Group-derived EventViews, ROOT trees, and subsets of ESD and RAW, pre-selected or selected via a big train run by the working group.
- Each user needs 14.5 kSI2k (about 12 current boxes), with 2.1 TB "associated" with each user on average.

ATLAS Data Management
- Based on datasets.
- The PoolFileCatalog API is used to hide grid differences; on LCG, LFC acts as the local replica catalog. The aim is to provide uniform access to data on all grids.
- FTS is used to transfer data between the sites.
  - To date FTS has tried to manage data flow by restricting the allowed endpoints ("channel" definitions).
  - Interesting possibilities exist to incorporate network-related research advances to improve performance, efficiency and reliability.
- Data management is a central aspect of Distributed Analysis.
  - PANDA is closely integrated with DDM and operational.
  - The LCG instance was closely coupled with SC3; right now we run a smaller instance for test purposes.
  - The final production version will be based on new middleware for SC4 (FPS).
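To illustrate the dataset-centred bookkeeping the data-management slides describe, here is a toy sketch; the names and structures are hypothetical illustrations of the idea, not the real DDM/DQ2 or PoolFileCatalog APIs.

    # Toy illustration of dataset-based data management: a central catalogue maps
    # dataset names to their files, and sites "subscribe" to datasets they want
    # replicated locally. Names are hypothetical, not the real DDM/DQ2 API.
    from collections import defaultdict

    central_catalogue = {
        "csc11.zmumu.AOD": ["zmumu._0001.pool.root", "zmumu._0002.pool.root"],
    }

    subscriptions = defaultdict(set)      # site -> datasets it wants
    local_replicas = defaultdict(set)     # site -> files already present

    def subscribe(site, dataset):
        subscriptions[site].add(dataset)

    def files_to_transfer(site):
        """Files still missing at a site, to be handed to a transfer service (e.g. FTS)."""
        wanted = {f for ds in subscriptions[site] for f in central_catalogue[ds]}
        return sorted(wanted - local_replicas[site])

    subscribe("UMICH_TIER2", "csc11.zmumu.AOD")
    print(files_to_transfer("UMICH_TIER2"))   # both files still need transferring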
Distributed Data Management
- Accessing distributed data on the Grid is not a simple task.
- Several databases are needed centrally to hold dataset information, while "local" catalogues hold information on local data storage.
- The new DDM system is under test this summer; it will be used for all ATLAS data from October on (LCG Service Challenge 3).

ATLAS Plans for Using FTS
- Tier-0 FTS server:
  - Channel from Tier-0 to all Tier-1s: used to move "Tier-0" data (raw and first-pass reconstruction data).
  - Channel from the Tier-1s to Tier-0/CAF: to move e.g. AOD (the CAF also acts as a "Tier-2" for analysis).
- Tier-1 FTS servers:
  - Channel from all other Tier-1s to this Tier-1 (pulling data): used for DQ2 dataset subscriptions (e.g. reprocessing, or massive "organized" movement when doing Distributed Production).
  - Channels to and from this Tier-1 and all its associated Tier-2s; the association is defined by ATLAS management (along with LCG).
  - A "star" channel for all remaining traffic [new: low-traffic].
[Figure: FTS servers and VO boxes at the Tier-0 and at each Tier-1, with an LFC local within each "cloud" of a Tier-1 and its associated Tier-2s; all storage elements are accessed via SRM.]

ATLAS and Related Research
- Up to now I have focused on the ATLAS computing model. Implicit in this model and central to its success are:
  - high-performance, ubiquitous and robust networks
  - grid middleware to securely find, prioritize and manage resources
- Without either of these capabilities the model risks melting down or failing to deliver the required capabilities.
- Efforts to date have (necessarily) focused on building the most basic capabilities and demonstrating that they can work.
- To be truly effective will require updating and extending this model to include the best results of ongoing networking and resource-management research projects.
- A quick overview of some selected (US) projects follows...

The UltraLight Project
- UltraLight is:
  - a four-year, $2M NSF ITR funded by MPS (2005-08)
  - application-driven network R&D
  - a collaboration of BNL, Buffalo, Caltech, CERN, Florida, FIU, FNAL, Internet2, Michigan, MIT, SLAC and Vanderbilt, with significant international participation: Brazil, Japan and Korea amongst many others.
- Goal: enable the network as a managed resource.
- Meta-goal: enable physics analysis and discoveries which could not otherwise be achieved.

ATLAS and UltraLight Disk-to-Disk Research
- Muon calibration work has presented an opportunity to couple research efforts into production.
- The ATLAS MDT subsystems need a very fast calibration turn-around time (< 24 hours).
- Initial estimates plan for as much as 0.5 TB/day of high-pT muon data for calibration.
- UltraLight could enable us to quickly transport (~1/4 hour) the needed events to Tier-2 sites for calibration.
- Michigan is an ATLAS muon alignment and calibration center, a Tier-2 and an UltraLight site.

Networking at KNU (Korea)
- Uses the 10 Gbps GLORIAD link from Korea to the US, called BIG-GLORIAD, also part of UltraLight.
- The aim is to saturate this BIG-GLORIAD link with servers and cluster storage.
[Figure: Korea connected to the U.S. at 10 Gbps via BIG-GLORIAD.]
- Korea is planning to be a Tier-1 site for the LHC experiments.
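The "~1/4 hour" claim above is simple arithmetic: moving ~0.5 TB in about 15 minutes needs roughly 4-5 Gbit/s, comfortably within a 10 Gbps path like BIG-GLORIAD. A one-line check (Python, decimal units assumed):

    # Bandwidth needed to move ~0.5 TB of calibration data in ~15 minutes.
    required_gbit_s = 0.5e12 * 8 / (15 * 60) / 1e9
    print(round(required_gbit_s, 1))   # ~4.4 Gbit/s, well under a 10 Gbps link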
VINCI: Virtual Intelligent Networks for Computing Infrastructures
- A network global scheduler implemented as a set of collaborating agents running on distributed MonALISA services.
- Each agent uses policy-based priority queues and negotiates an end-to-end connection using a set of cost functions.
- A lease mechanism is implemented for each offer an agent makes to its peers.
- Periodic lease renewal is used for all agents; this results in a flexible response to task completion, as well as to application failure or network errors.
- If network errors are detected, supervising agents cause all segments along a path to be released. An alternative path may then be set up rapidly enough to avoid a TCP timeout, allowing the transfer to continue uninterrupted.

Lambda Station [slide credit: D. Petravick, P. DeMar]
- A network path-forwarding service to interface production facilities with advanced research networks:
  - The goal is selective forwarding on a per-flow basis.
  - Alternate network paths for high-impact data movement.
  - Dynamic path modification, with graceful cutover and fallback.
- The current implementation is based on policy-based routing and DSCP marking.
- Lambda Station interacts with: host applications and systems, the LAN infrastructure, the site border infrastructure, advanced-technology WANs, and remote Lambda Stations.

TeraPaths (LAN QoS Integration)
- The TeraPaths project investigates the integration and use of LAN QoS and MPLS/GMPLS-based differentiated network services in the ATLAS data-intensive distributed computing environment, in order to manage the network as a critical resource.
- TeraPaths includes BNL and Michigan.
[Figure: TeraPaths architecture. Users submit QoS requests through a web page, command line or APIs to per-site web services (scheduler, user manager, site monitor, router manager and hardware drivers), which coordinate across the WAN with ESnet (OSCARS), FNAL (Lambda Station) and SLAC (DWMI) WAN monitoring to set up paths between Site A and Site B.]
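The lease-and-renewal idea behind VINCI (and, more generally, the advance reservations used by Lambda Station and TeraPaths) can be illustrated with a toy sketch. Everything below is a simplified illustration of the concept, not code from any of these projects.

    # Toy illustration of leased path segments with periodic renewal: while a
    # transfer is healthy its agent keeps renewing the lease; if renewals stop
    # (task finished, application failure, network error) the supervising agent
    # releases the segment so it can be offered to other requests.
    import time

    class SegmentLease:
        def __init__(self, segment, owner, lease_s=30.0):
            self.segment, self.owner, self.lease_s = segment, owner, lease_s
            self.expires = time.time() + lease_s
            self.active = True

        def renew(self):
            """Called periodically by the owning agent while its transfer runs."""
            self.expires = time.time() + self.lease_s

        def expired(self):
            return time.time() > self.expires

    def supervise(leases):
        """Release every segment whose lease was not renewed in time."""
        for lease in leases:
            if lease.active and lease.expired():
                lease.active = False
                print(f"releasing {lease.segment} (held by {lease.owner})")

    # Tiny demo: a lease that is never renewed gets released by the supervisor.
    demo = SegmentLease("siteA-siteB-segment", "transfer-42", lease_s=0.1)
    time.sleep(0.2)
    supervise([demo])

In VINCI the released segments would then be re-negotiated quickly enough for the transfer to fail over to an alternative path before a TCP timeout.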
Integrating Research into Production
- As you can see there are many efforts, even just within the US, to help integrate a managed network into our infrastructure.
- There are also many similar efforts in computing, storage, grid middleware and applications (EGEE, OSG, LCG, ...).
- The challenge will be to harvest these efforts and integrate them into a robust system for LHC physicists.
- I will close with an "example" vision of what could result from such integration...

An Example: UltraLight/ATLAS Application (2008)

    Node1> fts -vvv -in mercury.ultralight.org:/data01/big/zmumu05687.root -out venus.ultralight.org:/mstore/events/data -prio 3 -deadline +2:50 -xsum
    FTS: Initiating file transfer setup...
    FTS: Remote host responds ready
    FTS: Contacting path discovery service
    PDS: Path discovery in progress...
    PDS: Path RTT 128.4 ms, best effort path bottleneck is 10 GE
    PDS: Path options found:
    PDS:   Lightpath option exists end-to-end
    PDS:   Virtual pipe option exists (partial)
    PDS:   High-performance protocol capable end-systems exist
    FTS: Requested 1.2 TB file transfer within 2 hours 50 minutes, priority 3
    FTS: Remote host confirms available space for [email protected]
    FTS: End-host agent contacted... parameters transferred
    EHA: Priority 3 request allowed for [email protected]
    EHA: Requesting scheduling details
    EHA:   Lightpath prior scheduling (higher/same priority) precludes use
    EHA:   Virtual pipe sizeable to 3 Gbps available for 1 hour starting in 52.4 minutes
    EHA: Requesting monitoring prediction along path
    EHA:   FAST-UL transfer expected to deliver 1.2 Gbps (+0.8/-0.4) averaged over next 2 hours 50 minutes

ATLAS FTS 2008 Example (cont.)

    EHA:   Virtual pipe (partial) expected to deliver 3 Gbps (+0/-0.3) during reservation; variance from unprotected section < 0.3 Gbps at 95% CL
    EHA: Recommendation: begin transfer with FAST-UL using network identifier #5A-3C1. Connection will migrate to MPLS/QoS tunnel in 52.3 minutes. Estimated completion in 1 hour 22.78 minutes.
    FTS: Initiating transfer between mercury.ultralight.org and venus.ultralight.org using #5A-3C1
    EHA: Transfer initiated... tracking at URL: fts://localhost/FTS/AE13FF132-FAFE39A-44-5A-3C1
    EHA: Reservation placed for MPLS/QoS connection along partial path: 3 Gbps beginning in 52.2 minutes; duration 60 minutes
    EHA: Reservation confirmed, rescode #9FA-39AF2E, note: unprotected network section included
    <...lots of status messages...>
    FTS: Transfer proceeding, average 1.1 Gbps, 431.3 GB transferred
    EHA: Connecting to reservation: tunnel complete, traffic marking initiated
    EHA: Virtual pipe active: current rate 2.98 Gbps, estimated completion in 34.35 minutes
    FTS: Transfer complete, signaling EHA on #5A-3C1
    EHA: Transfer complete received... hold for xsum confirmation
    FTS: Remote checksum processing initiated...
    FTS: Checksum verified, closing connection
    EHA: Connection #5A-3C1 completed... closing virtual pipe with 12.3 minutes remaining on reservation
    EHA: Resources freed. Transfer details uploading to monitoring node
    EHA: Request successfully completed, transferred 1.2 TB in 1 hour 41.3 minutes (transfer 1 hour 34.4 minutes)
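The numbers in the mock transcript are internally consistent; the short check below (Python; decimal units, 1 TB = 1e12 bytes) reproduces the two intermediate figures quoted in the log.

    # Phase 1: ~52.3 minutes of FAST-UL transfer at the reported 1.1 Gbps average.
    phase1_gb = 1.1e9 / 8 * 52.3 * 60 / 1e9            # ~431 GB ("431.3 GB transferred")

    # Phase 2: the remainder of the 1.2 TB through the virtual pipe at 2.98 Gbps.
    remaining_gb = 1.2 * 1000 - phase1_gb
    phase2_min = remaining_gb * 1e9 * 8 / 2.98e9 / 60   # ~34 min ("completion in 34.35 minutes")

    print(round(phase1_gb, 1), round(phase2_min, 1))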
Conclusions
- ATLAS is quickly approaching "real" data, and our computing model has been successfully validated (as far as we have been able to take it).
- Some major uncertainties exist, especially around "user analysis" and the resource implications it may have.
- There are many R&D programs active in areas of special importance to ATLAS (and the LHC) which could significantly strengthen the core model.
- The challenge will be to select, integrate, prototype and test the R&D developments in time to have a meaningful impact upon the ATLAS (or LHC) program.

Questions?