Lecture 5: source detection.

• Test the null hypothesis (NH). Cutoff at 5% level.
  – The NH says: let’s suppose there is no source there, ie, the model is just background.
  – Calculate χ² for this null model.
  – Calculate the probability of χ² exceeding the value obtained. (Sometimes called a P-value.)

NASSP Masters 5003F - Computational Astronomy - 2009

Source detection.
  – If this probability (the P-value) is smaller than a previously chosen cutoff, call this a positive detection.
• BUT! Note that there is no certainty.
  – Sometimes the null model will by chance give a large χ² => ‘false positives.’ For given data, background and cutoff, there will be a fixed number of false positives expected in the source list.
    • => ‘reliability’. More on this later.
  – Sometimes a real source will give a small null-hypothesis χ² => ‘false negatives’, real sources which are missed.
    • => ‘completeness’. More on this later.

Problems with the NH approach:
• We don’t have exact knowledge of the background.
  – Have to estimate it either from
    • separate data, in which case we need separate data!
    • or from the same data… but this may be dominated by the source...
  – Or our background model may be wrong.
• Same issues as other model fitting. In particular:
  – χ² has to be used with care when the noise is Poisson.

But where are the sources?
• A low probability for the null hypothesis tells us, at best, that there is a source somewhere.
• Finding the source(s) consists rather of looking for peaks in a random signal.
• The simplest example is when the noise is uncorrelated and the source peaks have width=0.

A generic source-detection algorithm
• We shall assume that:
  – The data is ‘binned’ (eg CCD data).
  – We have a good independent estimate of the background.
  – The sources are sparsely distributed, such that we can deal with them one at a time.
  – The shape of the source profile is known.
  – The source position is unknown.
  – The source amplitude is unknown (but >0).

Generic source-detection algorithm:
The algorithm has 3 steps:
1: Calculate a sliding-window map.
2: Find the peaks in this map.
3: For each peak, calculate the probability that it could arise by chance from the background (the null-hypothesis P-value).
Choose a P_cutoff. Is P < P_cutoff? Yes: the peak goes into the list of sources. No: the peak is rejected.

1: The sliding window.
[Figure: three panels, each showing the data y and the window output U.]
• For each position of the sliding window, a single number U is calculated from the values falling within the window.
• The output is a map of the U values.
• The intent is to:
  – Raise the signal-to-noise
  – Improve sensitivity
  – Amplify the sources at the expense of the noise.
• Sliding-window processing only has value when the source has a width > 1 pixel.
• Edges need special treatment.

1: Window functions
• A weighted sum (= a convolution).
  – Simplest with all weights = 1: “sliding box”.
  – Optimum weights: a “matched filter”:
    • For uniform Gaussian noise, w_opt = s.
    • Trickier to optimize for Poisson noise.
• Per-window null-hypothesis χ².
  – With either an independent value of bkg (in which case degrees of freedom = number of pixels N_w in the window), or…
  – …one fitted from the data (deg free = N_w-1).
• Likelihood (same bkg provisions as χ²).
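The weighted-sum window functions listed above can be sketched in one dimension as follows. This is only an illustration under stated assumptions: the function names (`gaussian_profile`, `sliding_window`), the Gaussian source profile, the noise-free toy data and the choice to skip pixels that fall off the edges are all made up for the example and are not taken from the lecture.

```python
import math

def gaussian_profile(half_width, sigma):
    """Assumed source profile s, sampled at offsets -half_width..half_width."""
    return [math.exp(-0.5 * (k / sigma) ** 2)
            for k in range(-half_width, half_width + 1)]

def sliding_window(y, background, weights):
    """Return U[i] = sum_k w[k] * (y[i+k] - b[i+k]), window centred on pixel i.

    Pixels that fall off either edge are skipped. This is one possible
    edge treatment among several, chosen here for brevity.
    """
    half = len(weights) // 2
    n = len(y)
    u = []
    for i in range(n):
        total = 0.0
        for k, w in enumerate(weights):
            j = i + k - half
            if 0 <= j < n:
                total += w * (y[j] - background[j])
        u.append(total)
    return u

# Noise-free toy data: flat background plus one source centred on pixel 10.
background = [10.0] * 21
profile = gaussian_profile(3, 1.5)
y = list(background)
for k, s in enumerate(profile):
    y[10 + k - 3] += 5.0 * s

box = sliding_window(y, background, [1.0] * 7)   # "sliding box": all weights = 1
matched = sliding_window(y, background, profile)  # matched filter: weights = s

peak_box = max(range(len(box)), key=box.__getitem__)
peak_matched = max(range(len(matched)), key=matched.__getitem__)
print(peak_box, peak_matched)
```

With real data the input would carry noise, and the U map from either window would then be passed to the peak-finding step.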

1: Window functions
[Figure: parent function and data.]
[Figure: parent function compared with three window outputs. Chi squared, size=100; matched filter, size=10; log-likelihood, size=100.]

2: Peak finding
Gaussian noise, convolved with a gaussian filter. …don’t get the gaussians mixed up!

2: Peak finding
• No single neat prescription.
• Naive prescription:
  – Pixel i is a peak pixel if y_i > any other y within a patch of pixels from i-j to i+j.
• But what value to choose for j?
• Things to avoid are:
  – j too small: results in more than 1 peak per source;
  – j too large: misses a close adjacent source.
[Figure: peak finding with the box too small and with the box too large.]

3: Decision time: is it a source or not?
• To calculate a P-value we need the probability distribution of peaks in the post-window map of U values (given the null hypothesis).
• This is not the same as the probability distribution of the original data values…
• …nor is it even the same as the probability distribution of U values.
• In fact, little work seems to have been done on p_peaks. (Though there is quite a lot on the distribution of extrema, which is not quite the same thing.)

3: The decision
[Figure: ‘map’ vs ‘peak’ distributions for Gaussian noise. Black: all pixels. Red: peaks.]

3: Cash to the rescue
• First of all, remember that our model m has p parameters θ = [θ_1, θ_2, … θ_p].
• Cash theory: form a ratio between 2 likelihoods:
  – The numerator is calculated with all p parameters fixed at their ‘null hypothesis’ values.
  – For the denominator, a subset, q in number, of the parameters are adjusted to give the highest likelihood value.
• -2log(this ratio) behaves like χ² with q degrees of freedom.

3: Cash to the rescue
• A practical recipe for applying Cash to source detection goes as follows:
  – Choose a window area surrounding each peak.
  – Within this window, calculate L_null with model m_i = b_i (the background map values).
  – Calculate L_best by fitting a model m_i = b_i + θ_1 s(r_i – θ_r)
    • Degrees of freedom ν = 1 (the amplitude) + d (the dimensions of the spatial fit).
  – The Cash statistic 2(L_best - L_null), where L here denotes the log-likelihood, behaves like χ² with 1+d deg. free.

3: Cash to the rescue
• The only difficult point (which is a problem for every method) is to calculate the fraction of pixels which are peaks.
  – Monte Carlo
  – Possibly a Fourier technique?
• Also, don’t want to use the fit for final parameter values. A Mighell fit is better.

Useful references:
• W Press et al, “Numerical Recipes in Fortran”
• P Bevington, “Data reduction and error analysis for the physical sciences”
• W Cash, Ap J 228, 939 (1979)
• K J Mighell, Ap J 518, 380 (1999)
• I M Stewart, A&A 454, 997 (2006)
• I M Stewart, A&A, in print (2009)
• Wikipedia
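Worked sketch for the “Cash to the rescue” recipe, in one dimension with Poisson data. Everything specific here is an assumption made for illustration: the function names, the Gaussian source profile, the toy counts, fitting the position on an integer pixel grid, and solving for the amplitude by bisection. For d = 1 the statistic has ν = 2 degrees of freedom, and the χ² tail probability for ν = 2 has the closed form exp(-C/2), which is used below. The resulting number is a per-peak P-value only; it does not address the fraction of pixels that are peaks.

```python
import math

def poisson_loglike(y, m):
    """Poisson log-likelihood without the ln(y!) term, which cancels in the ratio."""
    return sum(yi * math.log(mi) - mi for yi, mi in zip(y, m))

def best_amplitude(y, b, s):
    """Maximum-likelihood amplitude A >= 0 for the model m = b + A*s (bisection)."""
    def slope(a):
        # Derivative of the log-likelihood with respect to the amplitude.
        return sum(yi * si / (bi + a * si) for yi, bi, si in zip(y, b, s)) - sum(s)
    if slope(0.0) <= 0.0:
        return 0.0              # likelihood already decreasing: amplitude pinned at 0
    lo, hi = 0.0, 1.0
    while slope(hi) > 0.0:      # expand the bracket until it contains the maximum
        hi *= 2.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if slope(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def cash_detection(y, b, sigma):
    """Return (Cash statistic, P-value, fitted amplitude, fitted centre pixel)."""
    n = len(y)
    l_null = poisson_loglike(y, b)          # model m_i = b_i
    best = (l_null, 0.0, None)
    for centre in range(n):                 # grid search over the source position
        s = [math.exp(-0.5 * ((i - centre) / sigma) ** 2) for i in range(n)]
        a = best_amplitude(y, b, s)
        l_fit = poisson_loglike(y, [bi + a * si for bi, si in zip(b, s)])
        if l_fit > best[0]:
            best = (l_fit, a, centre)
    cash = 2.0 * (best[0] - l_null)
    p_value = math.exp(-0.5 * cash)         # chi-squared tail probability, nu = 2
    return cash, p_value, best[1], best[2]

# Made-up counts in a 15-pixel window around a peak, with a flat background of 10.
background = [10.0] * 15
counts = [9, 11, 10, 8, 12, 10, 14, 25, 31, 24, 13, 10, 9, 11, 10]
cash, p_value, amplitude, centre = cash_detection(counts, background, 1.0)
print(cash, p_value, amplitude, centre)
```

Comparing `p_value` with a previously chosen P_cutoff then gives the accept/reject decision for this peak.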