Direct Maximum Likelihood Methods
David Szwer
Hannes Busche; Dan Maxwell; David Paredes;
Charles Adams; Matt Jones
Group meeting 12/03/2012
Talk Overview
• Uncertainty in Arcsine
• Introduction to Maximum Likelihood
• Maximum Likelihood for Arcsine
Arcsine
• Measure r ± αr.
• Calculate θ = asin(r); what is αθ?
Arcsine
• θ = asin(r).
Arcsine
• Calculus method:
αθ = (dθ/dr) αr = αr / √(1 − r²)
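A minimal sketch of this propagation in Python (numpy assumed; the values of r and αr are illustrative only):

```python
import numpy as np

def arcsine_error_derivative(r, alpha_r):
    """Calculus method: propagate alpha_r through theta = asin(r),
    using alpha_theta = (dtheta/dr) * alpha_r = alpha_r / sqrt(1 - r^2)."""
    theta = np.arcsin(r)
    alpha_theta = alpha_r / np.sqrt(1.0 - r**2)
    return theta, alpha_theta

print(arcsine_error_derivative(0.5, 0.05))    # well behaved for small |r|
print(arcsine_error_derivative(0.999, 0.05))  # alpha_theta blows up as r -> 1
```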
Arcsine, Derivative Method
• θ = asin(r).
• αθ → ∞ as r → ±1.
Arcsine, Functional Method
• θ = asin(r).
• θ+αθ = asin(r+αr)
θ–αθ = asin(r–αr).
Arcsine, Functional Method
• When r ≈ ±1, only one of the errors is well
defined.
Arcsine, Functional Method
• When |r| > 1, θ can’t be calculated. Must we
just throw away that data?
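A sketch of the functional method under the same assumptions (Python with numpy; values illustrative). It shows the breakdown described above: one error becomes undefined when r ± αr crosses ±1, and θ itself is undefined for |r| > 1:

```python
import numpy as np

def arcsine_error_functional(r, alpha_r):
    """Functional method: theta +/- alpha_theta = asin(r +/- alpha_r).
    Returns (theta, upper error, lower error); NaN where asin is undefined."""
    with np.errstate(invalid="ignore"):
        theta = np.arcsin(r)
        upper = np.arcsin(r + alpha_r) - theta
        lower = theta - np.arcsin(r - alpha_r)
    return theta, upper, lower

print(arcsine_error_functional(0.50, 0.05))  # both errors well defined
print(arcsine_error_functional(0.98, 0.05))  # upper error is NaN (r + alpha_r > 1)
print(arcsine_error_functional(1.02, 0.05))  # theta itself is NaN (|r| > 1)
```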
Introduction to Maximum Likelihood
• “Likelihood” is probability of Data given
Hypothesis and background Information
L = P(D|H,I)
• Used to derive χ2 least-squares fitting.
– Least-squares fitting adjusts parameters of H to
maximise L.
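– (Standard result: for data yi with model predictions fi and independent Gaussian uncertainties αi, −2 ln L = Σi [(yi − fi)/αi]² + constant = χ² + constant, so maximising L is the same as minimising χ²; see Hughes and Hase.)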
Notation from Toussaint, “Bayesian Inference in Physics”, Rev. Mod. Phys. 83, 943-999 (2011)
Arcsine, Maximum Likelihood
L = P(D|H,I)
• Data D is just “r”.
• Hypothesis H is “θtrue=θ0”.
• Assumptions I include “Uncertainty in r is
Gaussian” and “θ = asin(r)”.
L(r | θ0) = 1 / (√(2π) αr) · exp[ −½ ((r − sin θ0) / αr)² ]
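A sketch of this likelihood on a grid of hypotheses θ0 (Python with numpy; r, αr and the grid are illustrative):

```python
import numpy as np

def likelihood(r, alpha_r, theta0):
    """L(r | theta0): Gaussian probability of measuring r when the true
    value is sin(theta0), with Gaussian uncertainty alpha_r in r."""
    return (np.exp(-0.5 * ((r - np.sin(theta0)) / alpha_r) ** 2)
            / (np.sqrt(2 * np.pi) * alpha_r))

theta0 = np.linspace(-np.pi / 2, np.pi / 2, 1001)  # grid of hypotheses theta_0
L = likelihood(r=0.7, alpha_r=0.1, theta0=theta0)  # illustrative measurement
print(theta0[np.argmax(L)], np.arcsin(0.7))        # peak lies at asin(r) when |r| <= 1
```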
Arcsine, Maximum Likelihood
• L = P(D|H,I) = P(r|θ0)
Arcsine, Maximum Likelihood
• Bayes’ Theorem
PH | I  PD | H , I 
PH | D, I  
PD | I 
• Posterior: P(H|D,I) = P(θ0|r)
• Likelihood: L = P(D|H,I) = P(r|θ0)
• Prior: P(H|I) = P(θ0)
– Assume uniform prior –π/2 ≤ θ0 < π/2.
• Evidence: P(D|I) = P(r)
– Normalisation constant
• P(θ0|r) ∝ P(r|θ0) for fixed r.
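A sketch of the corresponding posterior on the same grid (Python with numpy; values illustrative). With the uniform prior, the evidence is just a normalisation constant, so the posterior is the likelihood rescaled to unit area over −π/2 ≤ θ0 < π/2; note that |r| > 1 causes no problem:

```python
import numpy as np

def posterior(r, alpha_r, theta0):
    """P(theta0 | r) on a grid: Gaussian likelihood in r times a uniform
    prior on [-pi/2, pi/2), normalised numerically (the evidence P(r))."""
    L = np.exp(-0.5 * ((r - np.sin(theta0)) / alpha_r) ** 2)
    dtheta = theta0[1] - theta0[0]
    return L / (L.sum() * dtheta)

theta0 = np.linspace(-np.pi / 2, np.pi / 2, 1001)
post = posterior(r=1.05, alpha_r=0.1, theta0=theta0)  # a measurement with |r| > 1
print(theta0[np.argmax(post)])                        # posterior peaks at +pi/2
```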
Arcsine, Maximum Likelihood
• Renormalize for r.
• P(H|D,I) = P(θ0|r)
Arcsine, Maximum Likelihood
• Small |r|: approximately Gaussian.
Arcsine, Maximum Likelihood
• Distorted for larger |r|.
Arcsine, Maximum Likelihood
• r=±1 handled easily...
Arcsine, Maximum Likelihood
• ...as is |r|>1: gives θ0=±π/2 and sensible
uncertainty.
Arcsine, Summary
• Max. likelihood error estimated as
αθ² = ∫[−π/2, π/2] P(θ0 | r) (θ0 − asin r)² dθ0
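A sketch of this estimate on a grid (Python with numpy; values illustrative, and it assumes |r| ≤ 1 so that asin(r) is defined):

```python
import numpy as np

# Posterior for an illustrative measurement, as in the earlier sketch:
# Gaussian likelihood in r, uniform prior on [-pi/2, pi/2).
r, alpha_r = 0.7, 0.1
theta0 = np.linspace(-np.pi / 2, np.pi / 2, 1001)
dtheta = theta0[1] - theta0[0]
L = np.exp(-0.5 * ((r - np.sin(theta0)) / alpha_r) ** 2)
post = L / (L.sum() * dtheta)

# alpha_theta^2 = integral of P(theta0 | r) * (theta0 - asin(r))^2 dtheta0
alpha_theta = np.sqrt(np.sum(post * (theta0 - np.arcsin(r)) ** 2) * dtheta)
print(alpha_theta)  # compare with the calculus method: 0.1 / sqrt(1 - 0.49) ~ 0.14
```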
Cons and Pros
• Direct maximum likelihood:
Cons:
• Does not give a single uncertainty figure.
– Does not “plug in” to other frequentist methods.
• Computation-intensive.
Pros:
• Works for badly-behaved functions.
• Gives the full Posterior distribution.
• Extends easily to non-uniform priors via Bayes’ theorem.
References
• Hughes and Hase, “Measurements and their Uncertainties”, OUP (2010).
• Toussaint, “Bayesian Inference in Physics”, Rev. Mod. Phys. 83, 943-999 (2011).
• MacKay, “Information Theory, Inference and Learning Algorithms”, CUP (2003). Available online: http://www.inference.phy.cam.ac.uk/mackay/itila/
– See especially sections 3.1, 22.1 and 24.1.