Laboratoire d'Astrophysique Ecole Polytechnique Fédérale de Lausanne Switzerland http://lastro.epfl.ch Measuring Galaxy Shapes for Cosmology Impact of Big Data & Machine Learning Marc Gentile, Sept. 29, 2016 Standard cosmology: successes and challenges standard model is very successful in reproducing the • The dynamics and observed structures of the Universe • But it is confronted to the evidence for a “dark” Universe Planck Collaboration Credit: NASA, ESA and R. Massey 2 Weak Gravitational Lensing and the Cosmic Shear from distant sources is slightly deflected by the • Light gravitational field of intervening matter along the line of sight main applications • TwoStudy dark matter on various scales - Constrain cosmology: using “Cosmic Shear” Shear”: weak lensing of background galaxies by • “Cosmic foreground large-scale structures NASA/ESA ...measure the slight modification of observed galaxy shapes for ~2 billions of galaxies Unknown Original galaxy Magnification Shear 3 Measuring the Cosmic Shear: difficult! Measuring gravitational shear fr From H. Hoekstra Cartesian components of galaxy ellipticity Observed galaxies: small, blurred, noisy, undersampled! • 'noise bias' in ellipticity estimates → need to calibrate on simulations 4 Measuring the Cosmic Shear: challenges shear measurement is difficult • Accurate Billions of galaxies / large sky areas to reduce statistical errors - Very weak signal, easily corrupted by many systematic effect => requires extremely tight control on systematic errors lowand S/N ratio - Observed galaxies are faint, under-sampled, 4.1. Principles challenges - An inverse problem: no unique solution Intrinsic Galaxy (shape unknown) Gravitational Lensing causes a shear g Image is convolved with the PSF Image produced by detector is pixelated Various sources of noise degrade the image Bridle et al., 2008 5 Shear measurement methods: main approaches • Estimation from second-brightness moments - KSB/KSB+ • Analytical decomposition of galaxy images/shapes - Shapelets, Reglens • Fit parametric model to galaxy profile - im3shape, lensfit, StackFit, gFit community: STEP, GREAT initiatives to help • Lensing improve algorithms through blind simulations 6 Typical shear measurement issues Mandelbaum, Rowe, et al. • Shear measurement biases • Applying the method on real data , Heymans et al. 2012; Hamana et al. 2013). In cont, the PSF due to et theal.optics varies relatively slowly ndelbaum, Rowe, time. The optical PSF is commonly described as mbination of di↵raction plus aberrations (possibly n conoslowly quite high order). Both the atmospheric and opPSF bed ashave some spatial coherence, qualitatively like ng shear, though the scaling with separation is not ossibly tical. nd ophe ly e↵ect like of the PSF on the galaxy shapes that we to measure isMandelbaum twofold: first, et al., applying a roughly ciris not r blurring kernel tends to dilute the galaxy shapes, ing them hat we appear rounder by an amount that depends he ratio Fig. 3.— Real galaxies from the HST as observed by the Adhly cir- of galaxy and PSF sizes. Correction for this tion can easily be a factor of 2 for typical galaxies, Camera for Surveys (ACS) in the COSMOS survey (KoekeKacprzak vanced et al, 2013 hapes, moer et al. 2007; Scoville et al. 2007b,a). The top left shows a which we wish to measure shears to 1%. Second, epends Figure 1. A demonstration of the concept of noise and modelis biaswell-fit interaction. by Left-hand side postage stamps show the images which are going to be fit with galaxy that a simple parametric model from Lackner small PSF anisotropiesFigure can1. Aleak intoofRight-hand demonstration the conceptside of noise andstamps modelare biastheinteraction. postage stamps show the imagesthe which areof going to be fit with a parametric galaxy model. postage images ofLeft-hand best fittingside models. In this work wegalaxy compare results two simulations. & Gunn (2012). The bottom left shows a that is reasonor thisbut coherent Fig. 3.— Real galaxies from the HST as observed by the AdaSimulation parametric1 galaxy model. Right-hand postage stamps are the bias, images of best models. In this jointly work we the results of two simulations. uses the real galaxy imageside directly to measure noise model biasfitting and their interaction in compare a single step (blue dot-dashed ellipse). galaxy shapes if not removed, mimicking a lensing ably well-fit but with additional substructure evident. The right Simulation uses the real image directly (Koeketo measure bias, model and theirgalaxy interaction jointly a single step (blue dot-dashed ellipse). laxies, vanced Camera for Surveys (ACS) in 12the COSMOS survey Simulation first finds the galaxy best fitting parametric model to a noise real galaxy image bias (black solid cartoon), and in create it’s model image (dashed magenta side aaareal true “irregular” galaxy that iscreate not by al. Simulation 2 first findsintroduces the best fitting parametric model image (blackbias solid galaxy cartoon), it’swell-fit model image (dashed magenta ellipse). This process the model bias.shows The next to step is togalaxy measure the noise using this best fittingand image as the true image. The fitsimple to the noisy moer et al. 2007; Scoville et al. 2007b,a). The top left shows econd, ellipse). process by introduces theellipse. model bias. next step is toand measure the noise bias thisare best fittingThe image as the true image. The to the noisy image sources isThis represented red dotted In thisThe simulation noise model bias interaction terms absent. difference of the results of fit Simulations 1 parametric models with ⇠ 10 using parameters. ars in the images areise↵ectively begalaxy that well-fit by apoint simple parametric model from Lackner image is represented by red dotted In thisbias simulation noise and model bias interaction terms are absent. The difference of the results of Simulations 1 and 2 measures the strength of noiseellipse. and model interaction terms. k into Gunn (2012). The bottom left shows a the galaxy that and 2 measures theof strength of noise and modelisbiasreasoninteraction terms. blurring by &the PSF, and hence are measures ensing ably well-fit but with additional substructure evident. The right . However, the PSF must be estimatedIt from them shear couples moments to interaction the higher-order sions for the noise bias, including terms with the model is worth that some shear measurement methodsthe ap- second side shows a true “irregular” galaxy thatto note is not well-fit by simple sions for the noise bias, including interaction terms with the model is worth toFor note that methods apthen interpolated to the positions of galaxies. a some bias. ply aItfully Bayesian formalism andshear use ameasurement full posterior(Massey shear probamoments et al. 2007a; Bernstein 2010; Zhang & parametric models with ⇠ 10 parameters. es be-of some common methods of PSF bias. ply a fully Bayesian formalism 2013) and use full posterior shear probability (Bernstein & Armstrong in asubsequent analyses. These mary estimation 2011). bility (Bernstein & Armstrong 2013) in subsequent analyses. TheseThis issue is particularly pressing given methods are considered to be free ofKomatsu noise and model related biases of the interpolation, see Kitching et al. (2013). methods are aconsidered be free ofthat noise and model related biases and present promisingtoalternative approach to problems studied several state-of-the-art shape measurement methods them present approach to problems studied 2.1 Shear and ellipticity inmoments this paper.a promising shear couples the secondand to alternative the higher-order (see Appendix A) are fitting relatively simple 2.1 based Shear andon ellipticity in thisThis paper. 2.5. Summary of e↵ects paper is organised as follows. Section 2 contains the prinFor a Cosmic shear as cosmological observable can be related to the moments (Massey et al. 2007a; Bernstein Zhang &thethe paper isshear organised as2010; follows. Section 2models contains pringalaxy or are based on a decomposition into ciplesThis of cosmic analysis. In Section 3 we present analytic Cosmic shearpotential as cosmological observable can be relatedbato the gravitational between the distant source galaxy and an mation ciples ofparticularly cosmic In interaction, Section 3 weasgiven present the analytic g. 1 summarizes the main e↵ects goisinto a shear weak Komatsu 2011). Thisthat issue pressing formulae for noise andanalysis. model bias a generalisation of gravitational potential between the distant source galaxy and an observernecessarily (see Bernstein & Jarvis 2002, for reviews). Ellipticity is sis functions that cannot describe galaxy proformulae for and model bias as a generalisation of the noise biasnoise equations derived ininteraction, R12. We present a toy model for observeras(see Bernstein & Jarvis 2002, for reviews). Ellipticity is ng observation. galaxy image is distorted as it is defined a complex number that The several state-of-the-art shape measurement methods files in detail (Voigt & Bridle 2010; Melchior et al. 2010). the noise bias equations derived in R12. We present a toy model for problem in section 4. In Section 5 we use the COSMOS sample defined as a complex number ected by mass along the line-of-sight from thethe galaxy the problem infitting section 4. Inmodel Section 5 weand use simple the COSMOS sample (see Appendix A) are based on relatively to evaluate noise and biases, show the significance a − b functions 2iφ More complex decompositions intoe =basis often(1) to the noise andWe model biases, show the significance a+ − bb e 2iφ, of evaluate theon interaction terms. conclude in and Section 6.baa s. This is the desired signal. It isbased then further disgalaxy models or are a decomposition into e = but eat, the expense(1) can describe more complex galaxies, of the interaction terms. We conclude in Section 6. a+b aedweak by the atmosphere (forthat a ground-based telescope), where a and b are the semi-major and semi-minor axes, respec- 7 sis functions cannot necessarily describe of galaxy prointroducing many tens or > 100 parameters, making where and a and areangle the semi-major and semi-minorbetween axes, respectively, φ isb the (measured anticlockwise) the xas it isoptics,files cope andin pixelation on the these efdetail (Voigt & detector; Bridle 2010; Melchior et al. 2010). • - Images are never perfectly clean, sensitivity to artefacts Account for color, eliminate galaxy blends, etc. Processing time per galaxy - About 1sec / galaxy, too large even on a parallel cluster Next-generation Weak Lensing Surveys • • In a few years time massive WL surveys will begin From Space: - ESA Euclid, 1.2m telescope (~2020) NASA WFIRST, 2.4m telescope (~2020) - LSST 8.4m Telescope (~2023) SKA (Square Kilometre Array) Radio Telescope (~2020) • From the ground: • • Goal: measure cosmological parameters within 1% error Will shear measurement methods be ready? 8 Bias Requirements for Euclid-like surveys true • and that the error on the shear, i i , and the true obeys a linear relationship of the form Additive and Multiplicative Biases i • • true i = mi true i + ci (i = 1, 2) ESA Euclid: Bias requirements to this estimate cosmological Although approximate, metric is generally parameters ~1% literaturewithin to quantify the accuracy of shear measure ods. Table 1 gives approximate upper limits of al for current, upcoming and planned weak lensing sur on the calculations done in e.g. Amara & Réfrég Kitching et al. (2009); Cropper et al. (2013); Ma (Euclid) (2013). Examples of surveys in the “current” ca A tenfold improvement in bias required till the launch of (CFHTLenS) (Heymans et al. 2012), the Sloan D Euclid telescope in ~2020… Survey (SDSS) (Lin et al. 2012; Mandelbaum et9 al. Big Data in Weak Lensing • • • • Volume - Euclid imaging survey: 15’000 deg2 ~4 times in mult. bands ~10 Petabytes raw space data Velocity - Large Synoptic Telecope (LSST) 20’000 deg2 in 6 bands, ~15 Terabytes / day raw data rate SKA radio survey: ~3 Terabytes / second Variety - Heterogeneous datasets, ground+space, variable depth, multiple bands, different cameras, etc. Veracity - Low S/N, missing/incomplete/abnormal data, etc. 10 Big Data challenges for shear measurement • • • Algorithms currently too slow / high volumes - Typical method processes ~1 galaxy per second Either speed-up algorithms or apply massively distributed computing techniques (e.g. map-reduce) Algorithms must cope with data variety - Handle missing/incomplete/abnormal information Exploit value in data from different source/nature But, Big Data also provides opportunities - Accessing more data can potentially yield better accuracy More high S/N deep data, better calibration => improved bias correction Applying new technologies/paradigms - Lessons from industry 11 Machine Learning (ML) can help • • • All methods rely on calibrations for correcting biases - Requires expert knowledge in astrophysics Requires expert knowledge of the method itself Potential for improvements poorly explored so far Possibly the use of Machine Learning can help - Entirely new approaches for shape measurement Help improve existing methods: speed, accuracy Machine Learning for calibration shown to work Mapping Dark 12 Machine Learning (ML) can help • Example: “Pure” ML approach based on simulations Machine Learning as Direct Measurement 13 Machine Learning (ML) can help • Example: “Hybrid” ML approach: automated bias calibration Machine Learning as Calibration Scheme 14 Summary potential of weak lensing and shear measurement for • High cosmology • But accurate shear measurement is extremely difficult algorithms are still insufficiently accurate to help • Current discriminate cosmological model within 1% precision are too slow, especially with upcoming massive • Algorithms surveys (Big Data) • One can learn lessons from the industry or other fields Learning techniques can help improving algorithms • Machine accuracy and speed 15
© Copyright 2025 Paperzz