Yu Feng: First Galaxies and Quasars in the BlueTides Simulation

Blue Waters Symposium 2016, Sunriver, OR
BlueTides Simulation
Modeling the First Quasars and Galaxies on Blue Waters
With the simulation software MP-Gadget
Yu Feng
Berkeley Center for Cosmological Physics
Berkeley Institute for Data Science
Science Team:
Nicholas Battaglia, Princeton University
Simeon Bird, Johns Hopkins University
Rupert A. Croft, Carnegie Mellon University
Tiziana Di Matteo, Carnegie Mellon University
Ananth Tenneti, Carnegie Mellon University
Dacen Waters, Carnegie Mellon University
Steven Wilkins, University of Sussex
Introduction of BlueTides on the public website:
http://bluetides-project.org/
The stage of the first galaxies and quasars: Epoch of Reionization
Interplay of multiple physics:
dark energy and dark matter, neutrinos, radiation,
hydrodynamics, atomic physics, star formation, black holes.
A rapid buildup of observational data in the next decade.
[Figure: timeline of the Universe. Labels: JWST (2018); first quasars; age of Universe 400 Myr to 800 Myr; z = 8; z = 12; today.]
Current observations:
- A handful of galaxies, irregular and clumpy
- A single quasar
A 1/200,000th tip of the iceberg:
HUDF: a 23 Mpc/h cube
BOSS DR12: 2.2 cubic Gpc/h
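The "1/200,000th" figure can be checked with a few lines of arithmetic. The sketch below assumes the fraction compares the two survey volumes listed on the slide (the HUDF cube against the BOSS DR12 volume); the variable names are illustrative only.

```python
# Volumes in (Mpc/h)^3, using the two numbers quoted on the slide.
hudf_side = 23.0                      # HUDF: a 23 Mpc/h cube
hudf_volume = hudf_side ** 3          # 12167 (Mpc/h)^3
boss_volume = 2.2 * 1000.0 ** 3       # 2.2 (Gpc/h)^3 expressed in (Mpc/h)^3

# How many HUDF-sized cubes fit into the BOSS DR12 volume.
volume_ratio = boss_volume / hudf_volume
print(f"BOSS DR12 / HUDF volume ratio: {volume_ratio:.3g}")
```

Under that assumption the ratio comes out at roughly 1.8 x 10^5, consistent with the rounded 1/200,000 on the slide.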
New major space-based programs will bring 1000 times more data, covering a significantly
larger volume than current data.
Challenges to theoretical modelling:
Need large-volume and high-resolution computer simulations.
BlueTides is the only full cosmological simulation
capable of matching the power
of next-generation instruments.
Scientific Achievements
A unique position: large volume and high resolution.
- 11 papers submitted/accepted/published
Made possible by supercomputers as big and as accessible as Blue Waters:
Lots of memory, lots of CPUs and large IO bandwidth.
Example 1:
The existence of massive disky high-redshift galaxies;
(Feng et al. 2016)
Example 2:
The earliest galaxy known,
discovered in March 2016 (Oesch et al.),
compared to BlueTides predictions.
(Waters et al. 2016)
Example 3:
BlueTides predicts a measurement of
galaxy clustering from WFIRST galaxies; (Waters et al. 2016)
First dark energy measurement from the first half of the Universe!
(Baryon Acoustic Oscillation)
[Figure labels: WFIRST GRISM, current design]
Example 4:
Co-evolution of quasars and their hosting galaxies;
(Di Matteo et al. 2016, in prep)
[Figure: tidal tensor and black hole growth. Panel labels: t1 = 0.7 and t1 = 0.2; M_BH = 4x10^8 M_sun and M_BH = 1x10^7 M_sun; disc and spheroid.]
Weak tidal field: thin filaments, radial motions, cold accretion.
Large tidal field: large filaments, accretion perpendicular to t1.
Software Product: Continued Developments on MP-Gadget
MP-Gadget:
- software behind BlueTides;
- a hydrodynamics and gravity solver for cosmological simulations;
- multiple physics models for baryon physics;
- scales from a single laptop to Blue Waters full capability.
Pushing the envelope in cosmology and astrophysics
involves computing at the full machine capability of facilities.
Access to Blue Waters resources (computing and support) is crucial.
MP-Gadget Architecture
Gadget Domain Decomposition:
- Hilbert-Peano Curve and Global Index Tree
PetaPM: PM Solver
- PFFT: Parallel FFT
- FFTW
PetaIO: IO interface
- bigfile library
- POSIX IO
Short-range solvers:
- Improved multi-thread Gadget Tree code
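To illustrate why a Hilbert-type curve is useful for domain decomposition, here is a minimal sketch of the standard iterative mapping from a 2D grid cell to its position along a Hilbert curve. It is a simplification for illustration: Gadget works in 3D, and the function name `hilbert_key` is hypothetical, not taken from the MP-Gadget source.

```python
def hilbert_key(n, x, y):
    """Position of cell (x, y) along a 2D Hilbert curve on an n x n grid.

    n must be a power of two. Illustrative 2D version only.
    """
    key = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        key += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the right orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return key


# Sorting cells by key gives a 1D ordering in which consecutive cells are
# spatial neighbours, so cutting the ordering into contiguous segments
# yields spatially compact domains.
grid = 4
cells = [(x, y) for x in range(grid) for y in range(grid)]
ordered_cells = sorted(cells, key=lambda c: hilbert_key(grid, c[0], c[1]))
```

Cutting `ordered_cells` into equal-length runs is the simplest way to hand out compact regions to ranks; a production code would weight the cuts by work rather than by cell count.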
The Family Tree of GADGET
Nearly 20 years of GADGET.
There is now a collection of different versions of GADGET.
[Figure: family tree of GADGET versions. Version labels: S-Gadget 1.0, P-Gadget 1.0, S-Gadget 1.1, P-Gadget 1.1 (public versions); Gadget2 (public version); Gadget2-asynchronous; Gadget2-multidomain; L-Gadget2 (Millennium run); P-Gadget2; P-Gadget3; BG/P-Gadget3; Cray/P-Gadget3; MP-Gadget/master (major rewrite; Feng, Bird, et al. 2016). Contributor labels: V. Springel, N. Khandai, Y. Feng. Year labels: circa 2000, 2011. Code-size labels: 8,000; 12,000; 23,000; 51,000; 18,000; 20,000; 120,000; 10,000; 65,000; 42,000; 19,000 lines.]
Preparing the MP-Gadget codebase for a sustainable future.
With a much leaner codebase:
Cray/P-Gadget3: 120,000 lines
MP-Gadget/BlueTides-I: 72,000 lines
Current MP-Gadget: 22,000 lines
Improving existing capabilities:
- Improved PM scaling
- Improved threading efficiency
A better foundation for new features.
Specifically, since MP-Gadget/BlueTides-I:
Removed 18,000 lines of unused, deprecated, or duplicated code from the codebase:
- rewrote and cleaned up unlicensed code;
- eliminated nested #ifdef blocks from the source code;
- the Zel'dovich initial condition solver now uses the main PM solver;
- the black hole, star formation and halo finder modules use the upgraded Tree Evaluator.
Improved build system:
- off-tree compilation;
- a flat list of Makefile flags.
A self-describing parameter file parser:
- demands that modules be self-documenting.
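One way to picture a self-describing parameter parser is a schema in which every parameter is declared together with its type, default and help text, so the same declaration drives both parsing and documentation. The sketch below is only an illustration of that idea: the class `ParamSchema`, its methods, and the parameter names `BoxSize` and `Nmesh` are hypothetical and are not MP-Gadget's actual interface.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional


@dataclass
class Param:
    name: str
    convert: Callable[[str], Any]
    default: Optional[Any]  # None marks a required parameter
    help: str


class ParamSchema:
    """Hypothetical schema: declarations double as documentation."""

    def __init__(self):
        self._params: Dict[str, Param] = {}

    def declare(self, name, convert, default, help):
        self._params[name] = Param(name, convert, default, help)

    def parse(self, text):
        values = {p.name: p.default for p in self._params.values()}
        for lineno, raw in enumerate(text.splitlines(), 1):
            line = raw.split("#", 1)[0].strip()  # drop comments
            if not line:
                continue
            key, sep, value = line.partition("=")
            key = key.strip()
            if not sep or key not in self._params:
                raise ValueError(f"line {lineno}: bad entry {raw!r}")
            values[key] = self._params[key].convert(value.strip())
        missing = [k for k, v in values.items() if v is None]
        if missing:
            raise ValueError(f"missing required parameters: {missing}")
        return values

    def describe(self):
        rows = []
        for p in self._params.values():
            tag = "required" if p.default is None else f"default {p.default!r}"
            rows.append(f"{p.name} ({tag}): {p.help}")
        return "\n".join(rows)


schema = ParamSchema()
schema.declare("BoxSize", float, None, "Comoving box size.")
schema.declare("Nmesh", int, 128, "Particle-mesh cells per side.")
config = schema.parse("BoxSize = 400.0  # required\n")
```

Because a module cannot register a parameter without supplying its help string, `schema.describe()` always yields a complete, current listing of every option.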
For longer-term sustainability:
Encouraging a flatter, low-redundancy, self-contained coding style.
Additional features since MP-Gadget/BlueTides-I
on the new codebase
Implemented:
- Modeling radiation and neutrino components in the Hubble expansion;
- A quick galaxy formation model for intergalactic medium physics;
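As a rough illustration of what including radiation and neutrinos in the Hubble expansion means, the sketch below evaluates H(a) for a flat universe with matter, photons, neutrinos and a cosmological constant. It makes several assumptions that are not from the slides: neutrinos are treated as massless (so they scale like radiation), the default parameter values are placeholders, and the function name `hubble_rate` is hypothetical. It is not a description of how MP-Gadget implements this.

```python
import math


def hubble_rate(a, h=0.7, omega_m=0.3, omega_gamma_h2=2.47e-5, n_eff=3.046):
    """H(a) in km/s/Mpc for a flat universe (illustrative defaults).

    Neutrinos are assumed massless, so they add to the radiation density
    through the standard factor N_eff * 7/8 * (4/11)^(4/3).
    """
    omega_gamma = omega_gamma_h2 / h ** 2
    omega_nu = n_eff * (7.0 / 8.0) * (4.0 / 11.0) ** (4.0 / 3.0) * omega_gamma
    omega_r = omega_gamma + omega_nu
    omega_lambda = 1.0 - omega_m - omega_r  # flatness fixes the remainder
    e_squared = omega_r * a ** -4 + omega_m * a ** -3 + omega_lambda
    return 100.0 * h * math.sqrt(e_squared)


# Compare with a matter + Lambda model at z = 8 (a = 1/9).
a_z8 = 1.0 / 9.0
with_radiation = hubble_rate(a_z8)
without_radiation = hubble_rate(a_z8, omega_gamma_h2=0.0)
radiation_boost = with_radiation / without_radiation
```

With these placeholder values the radiation terms raise H only slightly at z = 8; the relative correction grows toward earlier times because the radiation term scales as a^-4 while matter scales as a^-3.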
In progress:
- A low-resolution feedback model for intracluster medium physics;
Planned:
- A new domain decomposition scheme that uses less memory, to improve
over-decomposition and reduce load imbalance in short-range solvers.
PAID: Improving the IO performance of MP-Gadget
IME Team: Parallel IO (Darren Adams, William Gropp, Luu Huong, Edward Karrels)
Science Team: BlueTides team + Markus Scheucher (CMU student)
Peak IO throughput of the BlueTides simulation:
554.8 GB/s.
Can we do better?
The bigfile benchmark tool
http://github.com/bluetides-project/IO_simulator
- Simulating the IO pattern of BlueTides without running the simulation;
- General-purpose benchmarking;
IO of MP-Gadget: bigfile
bigfile is the IO library of MP-Gadget.
Exposed data avoids container overhead;
Splicing avoids Lustre magic;
Plain-text metadata: schema and attributes;
High availability:
- Reusable by linking or including source code.
http://github.com/rainwoodman/bigfile
A factor of 10 faster than the compressed HDF5
used in previous benchmarks of BlueTides.
[Figure labels: io.c, petaio.c; 5200 lines, 820 lines]
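The design points above (raw data exposed as ordinary files, a column spread over several physical files, metadata kept in plain text) can be sketched with the Python standard library. The layout below is hypothetical and chosen for illustration; it is not bigfile's actual on-disk format, and `write_block` and `read_block` are not bigfile functions.

```python
import array
import os
import tempfile


def write_block(dirname, name, values, typecode="f", nfile=2):
    """Store one column as raw binary files plus a plain-text header."""
    block_dir = os.path.join(dirname, name)
    os.makedirs(block_dir, exist_ok=True)
    data = array.array(typecode, values)
    per_file = -(-len(data) // nfile)  # ceiling division
    sizes = []
    for i in range(nfile):
        chunk = data[i * per_file:(i + 1) * per_file]
        with open(os.path.join(block_dir, f"{i:06X}"), "wb") as f:
            chunk.tofile(f)  # raw bytes, no container around them
        sizes.append(len(chunk))
    with open(os.path.join(block_dir, "header"), "w") as f:
        f.write(f"TYPECODE: {typecode}\nNFILE: {nfile}\n")
        for i, n in enumerate(sizes):
            f.write(f"{i:06X}: {n}\n")


def read_block(dirname, name):
    """Read the column back using only the plain-text header."""
    block_dir = os.path.join(dirname, name)
    with open(os.path.join(block_dir, "header")) as f:
        rows = [line.strip().split(": ") for line in f if line.strip()]
    header = dict(rows[:2])
    data = array.array(header["TYPECODE"])
    for fname, n in rows[2:]:
        with open(os.path.join(block_dir, fname), "rb") as f:
            data.fromfile(f, int(n))
    return data


with tempfile.TemporaryDirectory() as tmp:
    write_block(tmp, "Mass", [1.0, 2.0, 3.0, 4.0, 5.0])
    restored = read_block(tmp, "Mass").tolist()
```

Since each physical file holds nothing but the array bytes, any tool that knows the element type can read it directly, and the header can be inspected or edited with a text editor.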
Initial benchmark runs identified three hot spots in the pre-IO stages:
- Sequential truncation of files;
- Updating the checksums;
- Casting data types;
Need to investigate further.
Reusable software components in MP-Gadget
PFFT, bigfile and mpsort are reusable components of MP-Gadget;
updated with full Python bindings and C++ compatible headers.
FastPM: fast particle mesh N-body gravity solver/library:
- PFFT, bigfile
(Feng et al. 2016, MNRAS)
nbodykit: massively parallel N-body data analysis toolkit with Python
- PFFT, bigfile, mpsort
(Hand et al. 2016, in prep)
Software tested and developed on Blue Waters directly improves
the downstream applications.
Summary
A wide spectrum of impact in science and scientific computing, enabled by Blue Waters.
Interesting science about the first galaxies and quasars from the BlueTides simulation:
- Predictions for the WFIRST and JWST satellite programs
- Physics of the first quasars and galaxies
Support from Blue Waters allows us to continue the development of MP-Gadget:
- leaner codebase for sustainable development;
- new models and directions;
- understanding and improving I/O performance through the PAID program;
- component reuse enables shared improvements.
Plans ahead: Hydrodynamics and Domain decomposition
Short-range interaction:
- Shorter time steps
- Particle-particle interaction
Highly clustered distribution of computation in the simulation volume.
Over-decomposition improves the efficiency of short-range interaction
by splitting local hot spots into chunks that can be scheduled to other ranks,
improving balance and CPU utilization.
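The effect of over-decomposition on load balance can be shown with a toy scheduler. The sketch below uses a simple greedy assignment (most expensive chunk first, onto the least loaded rank) and assumes, as an idealization, that each domain splits into equal-cost chunks. The costs are made up and the function names are hypothetical; this is not MP-Gadget's scheduler.

```python
import heapq


def assign_chunks(costs, nranks):
    """Greedy balance: largest chunk first, onto the least loaded rank."""
    heap = [(0.0, rank) for rank in range(nranks)]
    loads = [0.0] * nranks
    for cost in sorted(costs, reverse=True):
        load, rank = heapq.heappop(heap)
        loads[rank] = load + cost
        heapq.heappush(heap, (loads[rank], rank))
    return loads


def over_decompose(costs, factor):
    """Idealized split of every domain into `factor` equal-cost chunks."""
    return [c / factor for c in costs for _ in range(factor)]


def imbalance(loads):
    """Maximum load divided by mean load; 1.0 is perfect balance."""
    return max(loads) / (sum(loads) / len(loads))


# One hot spot (a highly clustered region) and three quiet domains.
domain_costs = [10.0, 1.0, 1.0, 1.0]
coarse = imbalance(assign_chunks(domain_costs, nranks=4))
fine = imbalance(assign_chunks(over_decompose(domain_costs, 4), nranks=4))
print(f"one domain per rank: {coarse:.2f}, four chunks per domain: {fine:.2f}")
```

With one domain per rank, the rank that owns the hot spot carries about three times the mean load; once the hot spot is split, its pieces can be spread over the other ranks and the loads even out.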
[Figure labels: cpu0, cpu1, cpu2, cpu3]
But a domain in Gadget is expensive, costing about 1000 bytes per domain due to the
Hilbert-Peano decomposition and BH-tree coverage.
At Blue Waters scale, this becomes prohibitively expensive. BlueTides uses
81,000 MPI ranks and 4 sub-domains per rank, totalling about 300 MB on each rank for domain
storage, which competes for memory with the state vector, tree, and particle mesh.
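The memory figure follows from simple arithmetic, sketched below. It assumes that every rank keeps an entry for every domain in the run, which is what the per-rank total on the slide implies; the function name `domain_table_mb` and the 16-chunk case are illustrative only.

```python
def domain_table_mb(mpi_ranks, subdomains_per_rank, bytes_per_domain=1000):
    """Per-rank memory (in MB) needed to describe every domain in the run."""
    total_domains = mpi_ranks * subdomains_per_rank
    return total_domains * bytes_per_domain / 1e6


# Numbers quoted on the slide: 81,000 ranks, 4 sub-domains per rank,
# about 1000 bytes per domain.
bluetides_mb = domain_table_mb(81_000, 4)

# Hypothetical finer over-decomposition, to show how the cost scales.
finer_mb = domain_table_mb(81_000, 16)
print(f"4 per rank: {bluetides_mb:.0f} MB, 16 per rank: {finer_mb:.0f} MB")
```

The first case gives 324 MB, matching the "about 300 MB" on the slide. The cost grows linearly with the over-decomposition factor, which is why reducing the bytes per domain or simplifying the scheme is needed before the decomposition can be made finer.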
We are looking into two possible solutions:
1. Compress the domain data structure, reducing memory usage per domain.
2. Switch to a simpler domain decomposition scheme.