Lecture 7 – chi squared and all that
• Testing for goodness-of-fit continued.
• Uncertainties in the fitted parameters.
• Confidence intervals.
• The Null Hypothesis.
NASSP Masters 5003F - Computational Astronomy - 2010
Hypothesis testing continued.
[Figure: survival function.]
• Procedure:
1. “Suppose the model is a perfect fit.”
2. Calculate the survival function (SF) for χ² of pure noise with N-M degrees of freedom.
3. Draw a vertical line at the measured χ².
4. The y value where this vertical intercepts the SF is the probability that a perfect model would have a χ² at least this large by random fluctuation.
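The procedure above can be sketched in a few lines. This is a minimal illustration, assuming SciPy is available; the function name `chi2_pvalue` and the numbers are hypothetical.

```python
from scipy.stats import chi2

def chi2_pvalue(chi2_measured, n_data, n_params):
    """Probability that a perfect model gives chi-squared >= chi2_measured."""
    dof = n_data - n_params              # N - M degrees of freedom
    return chi2.sf(chi2_measured, dof)   # survival function = 1 - CDF

# Hypothetical numbers: 50 data points, 3 fitted parameters, measured chi2 = 60.
p = chi2_pvalue(60.0, 50, 3)
print(f"P(chi2 >= 60 | dof = 47) = {p:.3f}")
```

Reading the value off the SF replaces the graphical "draw a vertical" step.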
Questions answered so far:
• In fitting a model, we want:
1. The best-fit values of the parameters;
2. Then we want to know if these values are good enough, i.e. if the model is a good fit to the data.
3. If the model passes, we want uncertainties in the best-fit parameters.
• Number 1 is accomplished. √
• Number 2 is accomplished. √
Uncertainties in the best-fit parameters
• Usually what one gets is a covariance matrix (mentioned in lecture 4):
$$E = \begin{pmatrix} \sigma_1^2 & \sigma_{12}^2 & \cdots \\ \sigma_{21}^2 & \sigma_2^2 & \\ \vdots & & \ddots \end{pmatrix}$$
• This is a symmetric matrix: $\sigma_{ij}^2 = \sigma_{ji}^2$ for all i,j.
• For U=χ², $E = 2(H_{bestfit})^{-1}$, where $H_{bestfit}$ is the Hessian, evaluated at the best-fit values of the θi.
• For U=-L, $E = F^{-1}$, where F is the “Fisher Information Matrix”:
$$\hat F_{i,j} = \frac{\partial^2 (-L_{bestfit})}{\partial\theta_i\,\partial\theta_j}$$
• These definitions are equivalent!
– For Gaussian data, identical.
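As a rough sketch of the relation between E and H for U=χ², consider a model that is linear in its parameters, so the Hessian has a closed form. The function name `chi2_covariance`, the straight-line model and the numbers are assumptions made for illustration.

```python
import numpy as np

def chi2_covariance(design, sigma):
    """Covariance E = 2 H^-1 for a model that is linear in its parameters.

    design: (N, M) matrix A with model = A @ theta; sigma: (N,) uncertainties.
    For U = chi^2 the Hessian is H = 2 A^T W A with W = diag(1/sigma^2).
    """
    weights = 1.0 / np.asarray(sigma, dtype=float) ** 2
    hessian = 2.0 * design.T @ (weights[:, None] * design)
    return 2.0 * np.linalg.inv(hessian)

# Hypothetical straight-line model y = theta_1 + theta_2 * x.
x = np.linspace(0.0, 10.0, 21)
A = np.column_stack([np.ones_like(x), x])
E = chi2_covariance(A, sigma=np.full(x.size, 0.5))
print(E)   # diagonal: parameter variances; off-diagonal: covariances
```

For a model that is not linear in θ, the Hessian would have to be evaluated numerically at the best fit instead.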
The Hessian or curvature matrix
[Figure: contours of U; arrows show the eigenvectors.]
• The contours are ellipses in the limit as the minimum is approached.
– Ellipsoidal hypercontours in the general case that M>2.
• Semiaxes aligned with the eigenvectors of H.
• Small semiaxis: large curvature; small uncertainty in that direction.
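The link between curvature and uncertainty can be illustrated with an eigen-decomposition. This is only a sketch: the 2x2 Hessian below is a made-up example, and the factor 2 assumes U=χ², so that E=2H⁻¹.

```python
import numpy as np

# Hypothetical 2x2 Hessian of U at the minimum (symmetric, positive definite).
H = np.array([[8.0, 6.0],
              [6.0, 8.0]])

# eigh is intended for symmetric matrices; eigenvalues are returned in ascending order.
curvatures, directions = np.linalg.eigh(H)

# E = 2 H^-1 shares the eigenvectors of H, and the variance along each
# eigenvector is 2 / eigenvalue: large curvature means small uncertainty.
sigmas = np.sqrt(2.0 / curvatures)
for lam, vec, sig in zip(curvatures, directions.T, sigmas):
    print(f"curvature {lam:5.2f} along {vec}: sigma = {sig:.3f}")
```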
1-parameter example
1) Gaussian data, U=χ².
$$U = \sum_{i=1}^{N} \frac{(y_i - \theta)^2}{\sigma_i^2}$$
– For this simple model, we can find the best-fit θ without numerical minimization:
$$\frac{\partial U}{\partial\theta} = -2\sum_{i=1}^{N} \frac{(y_i - \theta)}{\sigma_i^2}$$
– Setting this to zero gives:
$$\hat\theta = \frac{\sum_{i=1}^{N} y_i/\sigma_i^2}{\sum_{i=1}^{N} 1/\sigma_i^2}$$
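A quick numerical check of the closed-form result, assuming SciPy's `minimize_scalar` as the generic minimizer; the data values are invented for illustration.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Hypothetical measurements of one quantity with unequal uncertainties.
y = np.array([10.0, 12.0, 11.0])
sigma = np.array([1.0, 2.0, 1.0])

def chi2(theta):
    """U = chi^2 for the constant model."""
    return np.sum((y - theta) ** 2 / sigma ** 2)

w = 1.0 / sigma ** 2
theta_closed = np.sum(w * y) / np.sum(w)   # closed-form best fit
theta_numeric = minimize_scalar(chi2).x    # numerical minimization of U

print(theta_closed, theta_numeric)
```

Both routes should agree to within the optimizer's tolerance.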
Sidebar – optimum weighted average
• A weighted average is:
$$\hat\mu = \frac{\sum_i w_i y_i}{\sum_i w_i}$$
• Since the yi are random variables, so is µ^.
• Therefore it will have a PDF and an uncertainty σµ.
• The smallest uncertainty is given for $w_i = 1/\sigma_i^2$
– Exactly what we have from the χ² fit.
Back to the 1-parameter example.
– Again, because this model is so simple, we can calculate $\sigma_\theta$ by direct propagation of uncertainties.
• θ^ is a function of N uncorrelated random variables yi, so
$$\hat\sigma_\theta^2 = \sum_{i=1}^{N} \left(\frac{\partial\hat\theta}{\partial y_i}\right)^2 \sigma_i^2$$
• It is fairly easy to show that:
$$\hat\sigma_\theta^2 = \frac{1}{\sum_{i=1}^{N} 1/\sigma_i^2}$$
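For completeness, the intermediate step can be written out as follows, using the closed-form expression for θ^ as the weighted mean with weights 1/σi²:

```latex
\frac{\partial\hat\theta}{\partial y_i}
  = \frac{1/\sigma_i^2}{\sum_{j=1}^{N} 1/\sigma_j^2}
\quad\Longrightarrow\quad
\hat\sigma_\theta^2
  = \sum_{i=1}^{N} \frac{1/\sigma_i^4}{\left(\sum_{j} 1/\sigma_j^2\right)^2}\,\sigma_i^2
  = \frac{\sum_{i} 1/\sigma_i^2}{\left(\sum_{j} 1/\sigma_j^2\right)^2}
  = \frac{1}{\sum_{i=1}^{N} 1/\sigma_i^2}
```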
What does the standard approach give?
• Hessian is a 1-element matrix:
$$H_{1,1} = \left.\frac{\partial^2 U}{\partial\theta^2}\right|_{bestfit} = 2\sum_i \frac{1}{\sigma_i^2}$$
• Hence
$$\frac{2}{H_{1,1}} = \frac{1}{\sum_i 1/\sigma_i^2}$$
• QED.
1-parameter example continued
2) Poisson data, U=-L. (No point in using -L for Gaussian data; it's then mathematically the same as chi squared.)
$$U = -\sum_{i=1}^{N} \left[\, y_i \ln\theta - \theta - \ln(y_i!) \,\right]$$
– Again it is simple to calculate the position of the minimum directly:
$$\frac{\partial U}{\partial\theta} = N - \frac{\sum_i y_i}{\theta}$$
– Setting this to zero gives
$$\hat\theta = \frac{1}{N}\sum_{i=1}^{N} y_i$$
i.e., the average of the ys.
Uncertainties in the Poisson/L case.
– With our present simple model it is very easy by propagation of uncertainties to show that
$$\hat\sigma_\theta^2 = \frac{\hat\theta}{N}$$
– Following the formal procedure for comparison:
$$\hat F_{1,1} = -\left.\frac{\partial^2 L}{\partial\theta^2}\right|_{bestfit} = \frac{\sum_i y_i}{\hat\theta^2} = \frac{N}{\hat\theta}$$
– Inverting this gives the same result.
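The Poisson/-L results for the constant model can be checked on simulated counts. This is a sketch only; the true mean, sample size and random seed are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)
theta_true = 4.0                        # hypothetical true Poisson mean
y = rng.poisson(theta_true, size=1000)

theta_hat = y.mean()                    # minimum of U = -L
fisher = y.sum() / theta_hat ** 2       # F_11 evaluated at the best fit
sigma_theta = np.sqrt(1.0 / fisher)     # E = F^-1, a 1x1 matrix here

print(f"theta_hat = {theta_hat:.3f} +/- {sigma_theta:.3f}")
```

By construction `sigma_theta**2` equals `theta_hat / N`, matching the propagation-of-uncertainties result.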
1-parameter example continued
3) Poisson data, U=“chi squared”.
– There are two flavours of “chi squared” for Poisson data!
$$U_{Pearson} = \sum_{i=1}^{N} \frac{(y_i - \theta)^2}{\theta}$$
$$U_{Mighell} = \sum_{i=1}^{N} \frac{\left(y_i + \min(y_i, 1) - \theta\right)^2}{y_i + 1}$$
– Note that the following is simply incorrect:
$$U = \sum_{i=1}^{N} \frac{(y_i - \theta)^2}{y_i}$$
Don’t use Pearson’s for fitting.
• It is not hard to prove it is biased.
– E.g., keeping our simple model,
$$\frac{\partial U}{\partial\theta} = N - \frac{\sum_i y_i^2}{\theta^2}$$
– Setting this to zero gives
$$\hat\theta_{Pearson} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} y_i^2} \;\neq\; \frac{1}{N}\sum_{i=1}^{N} y_i$$
– In his paper, Mighell calculates the limiting value of θ^Pearson as N→∞ and shows it is not θ.
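The bias is easy to see in a simulation. The sketch below assumes a constant model with an arbitrary true mean of 2; it relies on the standard Poisson result E[y²] = θ + θ², so the Pearson estimate should tend to sqrt(θ + θ²) rather than θ.

```python
import numpy as np

rng = np.random.default_rng(0)
theta_true = 2.0                              # hypothetical true Poisson mean
y = rng.poisson(theta_true, size=100_000).astype(float)

theta_mean = y.mean()                         # estimate from U = -L
theta_pearson = np.sqrt(np.mean(y ** 2))      # root of dU_Pearson/dtheta = 0

print(f"mean of y:        {theta_mean:.3f}")
print(f"Pearson estimate: {theta_pearson:.3f}")
print(f"sqrt(theta + theta^2) = {np.sqrt(theta_true + theta_true ** 2):.3f}")
```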
The Mighell formula is unbiased.
– For this statistic,
$$\frac{\partial U}{\partial\theta} = 2\left[\theta \times \sum_{i=1}^{N} \frac{1}{y_i + 1} - \sum_{i=1}^{N} \frac{y_i + \min(y_i, 1)}{y_i + 1}\right]$$
– Setting this to zero gives
$$\hat\theta_{Mighell} = \frac{\displaystyle\sum_{i=1}^{N} \frac{y_i + \min(y_i, 1)}{y_i + 1}}{\displaystyle\sum_{i=1}^{N} \frac{1}{y_i + 1}}$$
– Some not-too-hairy algebra shows that the limiting value of θ^Mighell as N→∞ is equal to θ.
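A corresponding numerical check for the Mighell estimator, again as a sketch with an assumed true mean and seed; `theta_mighell` is a hypothetical helper name.

```python
import numpy as np

def theta_mighell(y):
    """Closed-form minimum of U_Mighell for the constant model."""
    y = np.asarray(y, dtype=float)
    numer = np.sum((y + np.minimum(y, 1.0)) / (y + 1.0))
    denom = np.sum(1.0 / (y + 1.0))
    return numer / denom

rng = np.random.default_rng(1)
theta_true = 2.0                         # hypothetical true Poisson mean
y = rng.poisson(theta_true, size=100_000)
estimate = theta_mighell(y)

print(f"Mighell estimate: {estimate:.3f} (true value {theta_true})")
```

For large N the estimate should settle close to the true mean, unlike the Pearson estimate.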
Goodness-of-fit:
1. The Gaussian/χ² case has been covered already.
2. The Poisson/L case is a problem, because no general PDF for L is known for this noise distribution.
– If we insist on using this, we have to estimate the SF via a Monte Carlo. Messy, time-consuming.
3. For the Poisson/“chi squared” case, where we have 2 competing formulae, we should:
– Use Mighell to fit;
– Use Mighell for uncertainties;
– But use Pearson (with the best-fit values of θi) for goodness-of-fit hypothesis testing.
• Because it has the same PDF (thus also SF) as χ².
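The recipe in item 3 might look like this for the constant model. It is a sketch under stated assumptions: simulated counts with an arbitrary mean of 20, one fitted parameter (M=1), and SciPy's χ² survival function.

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(2)
y = rng.poisson(20.0, size=200).astype(float)   # hypothetical counts

# Fit with the Mighell statistic (closed form for the constant model).
theta_fit = np.sum((y + np.minimum(y, 1.0)) / (y + 1.0)) / np.sum(1.0 / (y + 1.0))

# Goodness-of-fit with the Pearson statistic at the best-fit value.
u_pearson = np.sum((y - theta_fit) ** 2 / theta_fit)
dof = y.size - 1                                # N - M with M = 1
p_value = chi2.sf(u_pearson, dof)

print(f"theta_fit = {theta_fit:.2f}, U_Pearson = {u_pearson:.1f}, P = {p_value:.3f}")
```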
Confidence intervals
• There is a hidden assumption behind
frequentist model fitting: namely that it is
meaningful to talk about p(θi^).
[Figure: sketch of p(θ^i) plotted against θ^i.]
Confidence intervals
• We already have some hints about its
shape… and a Monte Carlo seems to offer
a way to map it as accurately as we want.
[Figure: p(θ^i) plotted against θ^i, centred on θ^i,bestfit with a width of 2σ^ marked.]
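A minimal Monte Carlo of this kind, assuming Gaussian noise and the constant model from the 1-parameter example; the uncertainties, best-fit value and number of trials are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
sigma = np.array([1.0, 2.0, 1.0, 0.5])   # hypothetical per-point uncertainties
theta_bestfit = 11.0                      # hypothetical best-fit value
w = 1.0 / sigma ** 2

# Treat theta_bestfit as the truth and generate many synthetic data sets.
fake_y = rng.normal(theta_bestfit, sigma, size=(20_000, sigma.size))
theta_samples = fake_y @ w / w.sum()      # weighted-mean fit of each data set

mc_sigma = theta_samples.std(ddof=1)      # width of the mapped distribution
formula_sigma = np.sqrt(1.0 / w.sum())    # analytic result for this model

print(f"MC sigma = {mc_sigma:.4f}, formula sigma = {formula_sigma:.4f}")
```

A histogram of `theta_samples` is the Monte Carlo estimate of the distribution of the fitted value.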
Bayesians think this is nonsense.
• Such a MC is like pretending that θ^ is the
‘true’ value, and then generating lots of
hypothetical experimental data.
• But all we really know is the single set of data
which we measure in the real experiment.
– Plus possibly some ‘prior knowledge’.
• We don’t want p(θ^), we want p(θ).
• But we’ll continue with the frequentist way for
the time being.
Confidence intervals
• We also assume that p(θi) is approximately Gaussian (which may be entirely unwarranted!!)
$$\frac{1}{\sigma\sqrt{2\pi}} \int_{-\sigma}^{\sigma} dx\, \exp\left(-\frac{x^2}{2\sigma^2}\right) \approx 0.68$$
– We interpret this to mean that there is a 68% chance that the interval $[\hat\theta - \hat\sigma,\ \hat\theta + \hat\sigma]$ contains the truth value θ.
Confidence intervals
• Note that this is not the only interval which contains 68% of the probability. We can move the interval up and down the θ axis as we please. The –σ to +σ version is just a convention.
• FYI
$$\frac{1}{\sigma\sqrt{2\pi}} \int_{0}^{a} dx\, \exp\left(-\frac{x^2}{2\sigma^2}\right) = \frac{1}{2}\,\mathrm{erf}\left(\frac{a}{\sigma\sqrt{2}}\right)$$
erf() is called the error function.
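The error function is available in Python's standard library, so the familiar coverage figures can be reproduced directly. The helper name `gaussian_coverage` is hypothetical; k is the half-width of a symmetric interval in units of σ.

```python
import math

def gaussian_coverage(k):
    """Probability that a Gaussian variable lies within +/- k sigma of its mean."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    print(f"+/-{k} sigma: {gaussian_coverage(k):.4f}")
```

The symmetric interval is twice the 0-to-a integral, which is why the factor 1/2 drops out.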
Confidence intervals
• For more than 1 parameter the q% confidence interval is the (hyper)contour within which the probability of the truth value occurring = q.
• Again, by convention, symmetrical contours are used.
When m=s+b (which is not always appropriate)
• It is of interest to ask (probably before we attempt to fit the parameters of s!):
– Is there any signal present at all?
• In frequentist statistics this is again done via hypothesis testing. The hypothesis now is called the null hypothesis (‘null’ from Latin for ‘nothing’):
– “Suppose there is no signal at all.”
– and test what follows from this.
Testing the Null Hypothesis - details
1) Gaussian data, U=χ²:
– Construct the survival function (SF).
• Degrees of freedom?
– Depends whether we fit the background or not.
– Suppose we have Mb background parameters and Ms signal parameters.
– If background fitted, ν=N-Mb.
– If not (in this case need to know the background from other information), ν=N.
– From the set of measurements yi, calculate
$$U_{meas} = \sum_{i=1}^{N} \frac{(y_i - b_i)^2}{\sigma_i^2}$$
Note: ONLY include the background!
– From the SF read off that value of probability which corresponds to Umeas.
• That is the probability that background alone would generate >=Umeas.
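The Gaussian null-hypothesis test can be sketched as below. The function name `null_hypothesis_pvalue`, its `n_bkg_fitted` argument and the toy data are assumptions for illustration.

```python
import numpy as np
from scipy.stats import chi2

def null_hypothesis_pvalue(y, background, sigma, n_bkg_fitted=0):
    """Return (U_meas, probability that background alone gives >= U_meas)."""
    y, background, sigma = (np.asarray(a, dtype=float) for a in (y, background, sigma))
    u_meas = np.sum((y - background) ** 2 / sigma ** 2)   # background only, no signal term
    dof = y.size - n_bkg_fitted      # N if the background is known, N - Mb if fitted
    return u_meas, chi2.sf(u_meas, dof)

# Hypothetical data: flat background of 5, sigma = 1, with a bump in three channels.
b = np.full(20, 5.0)
y = b.copy()
y[9:12] += 6.0
u, p = null_hypothesis_pvalue(y, b, np.ones(20))
print(f"U_meas = {u:.1f}, P(background alone) = {p:.2e}")
```

A small probability means the background-only hypothesis is disfavoured.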
Testing the Null Hypothesis – details cont.
2) Poisson data, U=“χ²”:
– The PDF, and therefore the SF, are not known for the Mighell statistic.
– However the PDF and SF for the Pearson statistic are identical to χ².
– Use the Pearson statistic for Poisson hypothesis testing.
3) Poisson data, U=-L:
– PDF and SF not known.
– But one can compare two models via the Cash statistic (Cash W, Ap J 228, 939 (1979)).
The Cash statistic
$$C = 2\,(L_{bestfit} - L_{null})$$
• This is only valid provided the null model can be obtained by some combination of signal parameters.
– This implies that one of the signal parameters will be an amplitude (i.e., a scalar multiplying the whole signal function).
– It also ensures that $L_{bestfit} \geq L_{null}$, hence $C \geq 0$.
The Cash statistic
• Cash showed that the PDF of C was the same shape as that of χ², but with ν=Mfitted.
• Note that this is rather different from the usual p(χ²), for which ν is approx. equal to the number of data values N.
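A toy use of the Cash statistic is sketched below, under several assumptions: a known flat background, a signal of known Gaussian shape whose only fitted parameter is its amplitude (so ν=1), and SciPy's bounded scalar minimizer. Setting the amplitude to zero recovers the null model, as the statistic requires.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import chi2

def log_like(y, model):
    """Poisson log-likelihood L, dropping the ln(y!) term, which cancels in C."""
    return np.sum(y * np.log(model) - model)

rng = np.random.default_rng(3)
x = np.arange(50)
shape = np.exp(-0.5 * ((x - 25) / 3.0) ** 2)   # hypothetical signal profile
bkg = 10.0                                      # hypothetical known background
y = rng.poisson(bkg + 8.0 * shape)

# One fitted signal parameter: the amplitude. The lower bound keeps the model positive.
fit = minimize_scalar(lambda a: -log_like(y, bkg + a * shape),
                      bounds=(-5.0, 100.0), method="bounded")
cash = 2.0 * (log_like(y, bkg + fit.x * shape) - log_like(y, np.full(x.size, bkg)))
p_null = chi2.sf(cash, 1)                       # nu = number of fitted signal parameters

print(f"amplitude = {fit.x:.2f}, C = {cash:.1f}, P(null) = {p_null:.2e}")
```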
Incomplete gamma functions - advice
• Recall the survival function for χ² is
$$P(>U, \nu) = 1 - \frac{\Gamma(\nu/2,\ U/2)}{\Gamma(\nu/2)}$$
where Γ(ν/2, U/2) here denotes the lower incomplete gamma function.
– The incomplete gamma function can be calculated via scipy.special.gammainc.
• It is very small values of P that we are interested in, however, i.e. where Γ(ν/2,U/2)/Γ(ν/2) becomes close to 1.
• In this regime it is better to use the complementary (meaning 1 minus) incomplete gamma function:
– scipy.special.gammaincc <– note 2 cs.
– But NOTE the definition carefully: both SciPy functions are regularized, i.e. already divided by Γ(ν/2).
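The loss of precision is easy to demonstrate. In this sketch the degrees of freedom and χ² value are arbitrary, chosen so that P is far below double-precision rounding error; `scipy.stats.chi2.sf` is included only as a cross-check.

```python
from scipy.special import gammainc, gammaincc
from scipy.stats import chi2

nu, u = 10, 200.0   # hypothetical degrees of freedom and measured chi^2

# SciPy's gammainc and gammaincc are regularized (already divided by Gamma(a)).
p_subtract = 1.0 - gammainc(nu / 2, u / 2)   # cancellation: the tiny tail is lost
p_direct = gammaincc(nu / 2, u / 2)          # upper tail computed directly

print(f"1 - gammainc : {p_subtract:.3e}")
print(f"gammaincc    : {p_direct:.3e}")
print(f"chi2.sf      : {chi2.sf(u, nu):.3e}")
```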
General problems with fitting:
• When some of the θs are ‘near degenerate’.
– Solution: avoid this.
• When several different models fit equally well (or poorly).
– Solution: F-test (sometimes). Supposedly restricted to the case in which 2 models differ by an additive component.
Degenerate θs
[Figure: data, with the contours of U for the model below.]
Model: two close gaussians – 2 parameters: the amp of each gaussian.
Valley in U is long and narrow. Many combinations of θ1 and θ2 give about as good a fit; parameters strongly correlated.
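The strong correlation can be quantified from the covariance matrix. The sketch below assumes two unit-width Gaussians separated by half a width, uniform noise, and U=χ², so that E=2H⁻¹; all numbers are illustrative.

```python
import numpy as np

x = np.linspace(-5.0, 5.0, 101)
sep = 0.5                                    # hypothetical separation, small vs. width 1
g1 = np.exp(-0.5 * (x - sep / 2) ** 2)
g2 = np.exp(-0.5 * (x + sep / 2) ** 2)
A = np.column_stack([g1, g2])                # model = theta_1 * g1 + theta_2 * g2

sigma = 0.1                                  # hypothetical uniform noise level
H = 2.0 * A.T @ A / sigma ** 2               # Hessian of chi^2 (model linear in theta)
E = 2.0 * np.linalg.inv(H)                   # covariance matrix
corr = E[0, 1] / np.sqrt(E[0, 0] * E[1, 1])  # correlation coefficient of the amplitudes

print(f"correlation between amplitudes: {corr:.3f}")
print(f"eigenvalue ratio of H: {np.divide(*np.linalg.eigvalsh(H)[::-1]):.1f}")
```

A correlation close to -1 and a large eigenvalue ratio both signal the long, narrow valley.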
Overview of the grand plan
Frequentist:

           Gaussian (χ², -L)    Poisson χ²            Poisson -L
Null H     χ², ν=N              U_Pearson, ν=N        Cash, ν=M
Fit        Minimize χ²          Minimize U_Mighell    Minimize -L
Uncert     E=2H^-1              E=2H^-1               E=F^-1
GOF        χ², ν=N-M            U_Pearson, ν=N-M      No formula (MC)

Bayesian: T B D…
Flowchart to disentangle the uses of χ2:
Branch 1: Is there any signal at all?
1. Test the Null Hypothesis:
2. Decide on a cutoff probability Pcut.
3. Calculate χ² for θ = bkg values.
4. Compare to theoretical χ² survival function (num deg free = N).
5. P<Pcut?
– Yes – there is a signal.
– No – no signal.

Branch 2: Is the model an accurate description?
1. Minimize χ² to get best-fit θ.
2. Test the hypothesis that it is.
3. Decide on a cutoff probability Pcut.
4. Calculate χ² for the best-fit θ.
5. Compare to theoretical χ² survival function (num deg free = N-M).
6. P<Pcut?
– Yes – model is bad.
– No – model is good.
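The two branches of the flowchart differ only in where χ² is evaluated, the number of degrees of freedom, and how P<Pcut is interpreted. A compact sketch follows; the function names and the default cutoff of 0.01 are assumptions.

```python
from scipy.stats import chi2

def signal_present(chi2_bkg, n_data, p_cut=0.01):
    """Null-hypothesis branch: chi^2 evaluated with theta = background values."""
    return bool(chi2.sf(chi2_bkg, n_data) < p_cut)

def model_is_good(chi2_bestfit, n_data, n_params, p_cut=0.01):
    """Goodness-of-fit branch: chi^2 evaluated at the best-fit theta."""
    return bool(chi2.sf(chi2_bestfit, n_data - n_params) >= p_cut)

# Hypothetical example with N = 50 data points and M = 3 fitted parameters.
print(signal_present(200.0, 50))     # large chi^2 against background alone
print(model_is_good(48.0, 50, 3))    # chi^2 close to N - M
```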