Clouds at other sites
T2-type computing
Randall Sobie
University of Victoria
Randall Sobie IPP/Victoria

Overview
• Clouds are used in a variety of ways for Tier-2 type computing
  – MC simulation, production and analysis
  – Commercial/private, in-house/distributed
• Motivation for using clouds
  – Ease of use, reduced manpower costs, resource sharing
  – Separation of application and system administration
  – Leverage software development by commercial world
• How are clouds being used?
  – VM provisioning, job management, benchmarks, storage, networking, monitoring

Cloud computing in HEP
Cloud computing in HEP is typically providing 5-20% of the processing of current projects
• “Dedicated” clouds (owned by HEP)
• “Opportunistic” clouds (private and commercial)
[Figure labels: Dedicated, Virtual cluster, Opportunistic]

Cloud deployments
Traditional bare-metal
Specific purpose cloud (e.g. LTDA BaBar, HLT clouds)
Standalone/private cloud (e.g. PNNL, NorduGrid)
Distributed clouds (e.g. UK, Canada, Australia, INFN clouds)
Bare-metal or in-house cloud with external cloud (e.g. CERN, BNL)

Examples of cloud deployments
(meant to illustrate our use of clouds)

Australian Belle II Grid Site
Single CREAM CE services ATLAS Tier-2 (Torque) and Belle II site (Dynamic Torque)
[Diagram labels: CREAM CE (distributes jobs via SSH); Australia-ATLAS Tier 2, TORQUE + Maui, 14,000 HEPSpec (~1400 cores); Belle II (LCG.Melbourne.au), TORQUE + Maui; Dynamic Torque (controls VMs); Research Cloud (currently 700 cores)]

Why private cloud?
• Chosen for flexibility, efficient use of compute resources for services
• Provides easy load-balancing and availability features
• Provides templating features
  – Easy re-use of templates to test and instantiate new server instances
• Non-systems staff can provision their own instances of services
• Software Defined Networking is more malleable than physical networking and encourages better networking practices, including security

Lessons learned
• VMs and/or containers provide the flexibility needed to support multiple collaborations and different user needs
• Ceph storage is very robust and flexible
• VMs impose a 15%-20% performance penalty on HEP compute workloads without careful tuning
  – Move to containers on bare metal planned
• OpenStack features do not help us make sure a certain number of instances are up, healthy and consistent
  – Kubernetes looks appealing in this respect

GridPP (P. Love / A. McNab)
University OpenStack instances
• Clouds at HEP institutions (Oxford/Imperial).
• ECDF cloud in Edinburgh has recently been made available to HEP.
UK Vacuum deployments
• Key to our light-weight Tier-2 strategy, where we operate with minimal manpower at the site (<1000 cores).
DataCentred commercial OpenStack
• Scale of a Tier-2 facility.
• Free access to their system (ATLAS) whilst they were commissioning things; paid-for access when funds available.
• Network connectivity to the UK academic network is only 1 Gbit but they have plans to upgrade.

Italy (INFN; Massimo Sgaravatto et al.)
Private OpenStack cloud (Padova-Legnaro) called Cloud Area Padovana
Used by ~25 user groups/projects that contributed financially to the resources
Batch processing
• Relying on the elastiq framework, HTCondor batch clusters are instantiated.
• These batch clusters are 'dynamic': new worker nodes are automatically added or removed depending on load.
• The CMS cloud project is integrated with the local Tier-2.
• E.g. CMS VMs can access the T2 storage (dCache) using the same local protocol (dCAP) used by the T2 WNs.
• Plans to deploy the Synergy service, which manages resource allocation using a fair-share approach, without a static partitioning of the resources among the relevant user communities.

NorduGrid

Bern, Switzerland
SWITCHengines – Swiss NREN commercial cloud (OpenStack)
(free during development phase)

Canada
Distributed cloud system for ATLAS and Belle II
• Integrated into PanDA/DIRAC
• In production for 3-4 years
• Also used by Canadian astronomy

• uCernVM, CVMFS, Squid-discovery (Shoal)
• Distributed VM image repository
• Data written to local storage and transferred
• Benchmarks run at VM boot
• VM time measurements for accounting
• Reasonable monitoring

• Updating system for OpenNebula
• Studying data federations (e.g. Dynafed)
• Context-awareness

[Diagram labels: user submits job script to HTCondor; CloudScheduler starts VMs on the compute cloud; VM image repository; HTCondor starts the job on a VM]

10-15 clouds managed by HTCondor/CloudScheduler (4000-5000 cores)
800-1000 cores (each) on EC2/Azure (egress fees waived)
• Challenges include managing resources across many administrative domains

Canadian WLCG “cloud” – includes Australian T2
[Monitoring snapshot, Friday October 6: cloud resources, 10 clouds, 4300 cores]

Job scheduling/VM provisioning
• Variety of methods for running HEP workloads on clouds
  – VM-DIRAC (LHCb and Belle II)
  – VAC/Vcycle (UK)
  – HTCondor/CloudScheduler (Canada)
  – HTC/GlideinWMS (FNAL), HTC/VM (PNNL), HTC/APR (BNL)
  – Dynamic-Torque (Australia)
  – Cloud Area Padovana (INFN)
  – ARC (NorduGrid)
• Each method has its own merits and was often designed to integrate clouds into an existing infrastructure (e.g. local, WLCG and experiment)

Commercial and private clouds
• Commercial cloud use
  – Primarily Amazon EC2 and Microsoft Azure (with grants)
  – ATLAS discussing use of GCE
  – Other commercial OpenStack clouds
    • DataCentred (UK), SWITCHengines (Switzerland)
  – CERN commercial cloud procurement
• Private clouds
  – OpenStack and OpenNebula clouds that are research-funded but not involved in HEP
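
The Canadian deployment is described as a flow in which users submit jobs to HTCondor and CloudScheduler starts VMs on the available clouds. The sketch below illustrates that kind of demand-driven decision in Python. It is not CloudScheduler's actual algorithm: the `Cloud` record, the `plan_vm_starts` function, the cloud names and the fill-in-order policy are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Cloud:
    """Hypothetical description of one cloud; not a CloudScheduler type."""
    name: str
    max_vms: int      # VM quota on this cloud
    running_vms: int  # VMs already booted


def plan_vm_starts(idle_jobs, clouds, jobs_per_vm=1):
    """Return {cloud name: number of VMs to start} for the queued jobs.

    Assumed policy: start just enough VMs to cover the idle jobs,
    filling clouds in the order given until each reaches its quota.
    """
    # Ceiling division: VMs needed to give every idle job a slot.
    needed = -(-idle_jobs // jobs_per_vm)
    plan = {}
    for cloud in clouds:
        if needed <= 0:
            break
        free = max(cloud.max_vms - cloud.running_vms, 0)
        start = min(free, needed)
        if start:
            plan[cloud.name] = start
            needed -= start
    return plan


# Example with made-up clouds and quotas.
clouds = [Cloud("cloud-a", max_vms=10, running_vms=8),
          Cloud("cloud-b", max_vms=5, running_vms=0)]
print(plan_vm_starts(idle_jobs=6, clouds=clouds))  # {'cloud-a': 2, 'cloud-b': 4}
```

A production scheduler would also retire idle VMs and weigh cost, image availability and network location, which the challenges listed for managing many administrative domains suggest is where most of the effort goes.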
Network connectivity
• Amazon and Microsoft clouds are connected to the research networks in North America (probably GCE as well)
  – Egress charges can be waived upon request
• Trans-border or trans-ocean traffic can be an issue
  – Has become an important discussion topic in the LHCONE meetings
• Private opportunistic clouds
  – Traffic flows over research network but not LHCONE network

CPU Benchmarks
New suite of “fast” benchmarks
  – HEPiX Benchmark Working Group
  – Suite available includes “fastHS” (LHCb) and Whetstone benchmarks
    • Write to ElasticSearch DB
  – Run benchmarks in the pilot job or during the boot of the VM
Data storage
  – Data written to local storage on node and then transferred to selected SE
  – UK group has done some work integrating their object store with ATLAS
  – BNL using S3 storage on EC2 for T2-SE

Monitoring
• Cloud: cloud or site monitor
• System: system monitor (Sensu, Munin, RabbitMQ, Mongo-DB, Ganglia)
• Application: benchmarks and accounting (ElasticSearch DB); application monitor (PanDA monitoring)

Summary
• Clouds at HEP sites
  – Typically integrated into an existing infrastructure
  – Seen as a way to better manage multi-user resources
  – Cloud R&D funding opportunities
• Opportunistic research clouds
  – Easy way to utilize clouds at non-HEP research computing facilities
  – No requirement for on-site application specialists or complex software
• Commercial clouds
  – EC2/Azure/GCE dominate, but other OpenStack clouds are also used
  – Grant and some contracted resources
  – Trans-border network connectivity being addressed
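
The talk mentions benchmarks run at VM boot and VM time measurements for accounting. One way these two could be combined into a normalised usage figure is sketched below, assuming usage is counted as cores × wall-clock hours × per-core benchmark score. The `VMRecord` type and `hs06_hours` function are hypothetical, and the default of 10 HS06 per core is only the ratio implied by the Australian figures (14,000 HEPSpec for about 1400 cores), not a measured value.

```python
from dataclasses import dataclass


@dataclass
class VMRecord:
    """Hypothetical accounting record for one VM lifetime."""
    cores: int
    wall_hours: float            # measured VM up-time in hours
    hs06_per_core: float = 10.0  # assumed score, e.g. from a boot-time benchmark


def hs06_hours(records):
    """Sum cores x hours x per-core score over all VM records."""
    return sum(r.cores * r.wall_hours * r.hs06_per_core for r in records)


# Example with made-up VM lifetimes.
records = [VMRecord(cores=8, wall_hours=24.0),
           VMRecord(cores=4, wall_hours=12.0, hs06_per_core=12.5)]
print(hs06_hours(records))  # 2520.0
```

Using a per-VM score rather than a single site-wide constant matters on opportunistic clouds, where the underlying hardware can differ from one VM to the next.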