Zech - Oxford Physics

Reduction of Variables in Parameter Inference
Günter Zech, Universität Siegen
Motivation: Parameter fitting from multidimensional
histograms often suffers from statistical difficulties due
to low numbers of events per bin. (Relevant if data have
to be compared to a Monte Carlo simulation and
therefore a simple likelihood fit is not possible.)
Goal: Reduce the dimensionality without loss of
information
Phystat2005, Oxford
G. Zech, Universitaet Siegen
Historical example
Determination of the V/A coupling in $\tau$ decay at PETRA
reaction:
$e^+ e^- \to \tau^+ \tau^-$, with one observed decay product per $\tau$ (momenta $\vec p_+$, $\vec p_-$)
distribution:
$f(\vec p_+, \vec p_- \mid \theta) = (1 - \theta)\, f_1(\vec p_+, \vec p_-) + \theta\, f_2(\vec p_+, \vec p_-)$
1 parameter, 6 variables, about 30 events
with 3 bins per variable we get about 2 events / bin
(A simple likelihood fit was not applicable due to acceptance
corrections by Monte Carlo simulation.)


Some groups fitted the $p_+ p_-$ distribution.
Simple case: 2 random variables, 1 linear parameter
$f(x, y \mid \theta) = f_1(x, y) + \theta\, f_2(x, y)$
Define new variables:
$v(x, y) = f_1$
$u(x, y) = f_2 / f_1$
We get
$g(u, v \mid \theta) = v\,(1 + \theta u)\,\left|\dfrac{\partial(x, y)}{\partial(u, v)}\right|$
$\ln L(\theta) = \sum_i \ln(1 + \theta u_i) + \text{const.}$
The only relevant variable is $u$.
(The analytic expression of $g(u, v \mid \theta)$ is not required!)
The generalization to more than 2 variables is trivial
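A minimal sketch of the resulting one-dimensional fit, assuming the values $u_i = f_2/f_1$ have already been computed for each event. The function names and the toy numbers are illustrative only and do not come from the talk. Since $\ln L(\theta) = \sum_i \ln(1 + \theta u_i)$ is concave, a bisection on its derivative is enough.

```python
import math

def log_likelihood(theta, u):
    """ln L(theta) = sum_i ln(1 + theta * u_i), up to a constant."""
    return sum(math.log(1.0 + theta * ui) for ui in u)

def score(theta, u):
    """Derivative of ln L with respect to theta."""
    return sum(ui / (1.0 + theta * ui) for ui in u)

def fit_theta(u, lo, hi, tol=1e-10):
    """Bisection on the score, which decreases monotonically in theta.

    [lo, hi] must lie inside the region where 1 + theta * u_i > 0 for all i.
    """
    if score(lo, u) <= 0.0:
        return lo
    if score(hi, u) >= 0.0:
        return hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid, u) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# toy values of u = f2 / f1 (made up for illustration)
u_values = [0.8, -0.7, 0.3, 0.9, -0.6, 0.1, -0.4, 0.5]
theta_hat = fit_theta(u_values, -1.0, 1.0)
```

Only the list of $u_i$ enters the fit; the original variables $x$, $y$ are no longer needed once $u$ has been evaluated.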
Example:
$f(x, y, z \mid \theta) = \dfrac{1}{\pi}\left[(x^2 + y^2 + z^2)^{1/2} + \theta\,(x + y^3)\right], \quad (x^2 + y^2 + z^2)^{1/2} \le 1$
$u = \dfrac{x + y^3}{(x^2 + y^2 + z^2)^{1/2}}$
Experimental data: $x_i, y_i, z_i \to u_i$
MC: generate $x, y, z \to u$
Perform a likelihood fit
to a superposition of the two
MC distributions of u
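The procedure of this example can be sketched as follows. This is a toy under stated assumptions, not the author's implementation: no detector simulation is included, events are generated from $f_1 \propto r$ inside the unit sphere, and the second template is obtained by weighting the same events with $u$ (because $f_2 = u f_1$). The sample sizes, the binning, the grid scan and all function names are illustrative choices.

```python
import math
import random

def sample_f1(rng):
    """Draw (x, y, z) from f1 ∝ r inside the unit sphere (theta = 0)."""
    r = (1.0 - rng.random()) ** 0.25        # radial density ∝ r * r^2 = r^3
    cos_t = rng.uniform(-1.0, 1.0)          # isotropic direction
    sin_t = math.sqrt(1.0 - cos_t * cos_t)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    return r * sin_t * math.cos(phi), r * sin_t * math.sin(phi), r * cos_t

def u_of(x, y, z):
    """Reduced variable u = f2 / f1 = (x + y^3) / r."""
    return (x + y ** 3) / math.sqrt(x * x + y * y + z * z)

def sample_data(rng, theta, n):
    """Accept-reject on top of f1; valid for |theta| <= 0.5 because |u| < 2."""
    events = []
    while len(events) < n:
        x, y, z = sample_f1(rng)
        if rng.random() * (1.0 + 2.0 * abs(theta)) < 1.0 + theta * u_of(x, y, z):
            events.append((x, y, z))
    return events

def bin_index(u, nbins, lo=-1.2, hi=1.2):
    k = int((u - lo) / (hi - lo) * nbins)
    return min(max(k, 0), nbins - 1)

def fit_theta(u_data, u_mc, nbins=12):
    """Binned likelihood fit of the data to the templates m1 + theta * m2."""
    d = [0] * nbins        # data histogram of u
    m1 = [0.0] * nbins     # MC histogram of u (events from f1)
    m2 = [0.0] * nbins     # same MC events weighted by u (stands in for f2)
    for u in u_data:
        d[bin_index(u, nbins)] += 1
    for u in u_mc:
        k = bin_index(u, nbins)
        m1[k] += 1.0
        m2[k] += u
    s1, s2 = sum(m1), sum(m2)

    def log_l(theta):
        norm = s1 + theta * s2
        # bins without MC events are skipped (a simplification of this sketch)
        return sum(dk * math.log((a + theta * b) / norm)
                   for dk, a, b in zip(d, m1, m2) if dk > 0 and a > 0)

    grid = [i / 1000.0 for i in range(-500, 501)]   # scan theta in [-0.5, 0.5]
    return max(grid, key=log_l)

rng = random.Random(1)
u_mc = [u_of(*sample_f1(rng)) for _ in range(20000)]
u_data = [u_of(*e) for e in sample_data(rng, 0.3, 2000)]
theta_hat = fit_theta(u_data, u_mc)
```

The three-dimensional histogram is replaced by a single histogram in $u$, so the number of events per bin stays comfortable even for small samples.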
Nonlinear parameter dependence
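Linearize: approximate by a Taylor expansion at a first estimate $\theta_0$ of $\theta$, and fit $\Delta\theta$.
$f(x, \theta) = f(x, \theta_0) + \dfrac{d}{d\theta} f(x, \theta_0)\,\Delta\theta + \dots$
$u(x) = \dfrac{d}{d\theta} f(x, \theta_0) \,/\, f(x, \theta_0)$
Several parameters
$f(x, \theta) = f(x, \theta_0) + \dfrac{\partial}{\partial\theta_1} f(x, \theta_0)\,\Delta\theta_1 + \dfrac{\partial}{\partial\theta_2} f(x, \theta_0)\,\Delta\theta_2 + \dots$
$u_i(x) = \dfrac{\partial}{\partial\theta_i} f(x, \theta_0) \,/\, f(x, \theta_0)$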
We need one variable per parameter
(this only makes sense if the number of variables is
initially larger than the number of parameters)
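The linearization can be illustrated with a short sketch. The exponential p.d.f., the numerical derivative and the repeated re-linearization are assumptions made for this illustration; the talk does not prescribe them. The step uses the first-order solution of $\sum_i u_i/(1 + \Delta\theta\, u_i) = 0$, which is $\Delta\theta \approx \sum_i u_i / \sum_i u_i^2$.

```python
import math
import random

def pdf(x, theta):
    """Hypothetical example p.d.f.: exponential with rate theta."""
    return theta * math.exp(-theta * x)

def u_variable(x, theta0, h=1e-5):
    """u(x) = [df/dtheta at theta0] / f(x, theta0), by central differences."""
    dfdtheta = (pdf(x, theta0 + h) - pdf(x, theta0 - h)) / (2.0 * h)
    return dfdtheta / pdf(x, theta0)

def linearized_step(xs, theta0):
    """First-order estimate of delta theta from the reduced variable u."""
    u = [u_variable(x, theta0) for x in xs]
    return sum(u) / sum(ui * ui for ui in u)

rng = random.Random(5)
xs = [rng.expovariate(1.2) for _ in range(2000)]   # toy data, true rate 1.2
theta = 1.0                                        # first estimate theta_0
for _ in range(10):                                # re-linearize around each update
    theta += linearized_step(xs, theta)
```

For this p.d.f. the variable is $u(x) = 1/\theta_0 - x$, so the iteration should settle at the exact likelihood estimate, the inverse of the sample mean.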
Can we do any better?
Approximate a sufficient statistic
Example: distorted lifetime distribution (exponential)
The mean value of the experimental data is still approximately sufficient.
Compute the relation between the observed and the true value by Monte Carlo simulation.
[Full detector simulation for $\tau_0 \to \tau_0'$; re-weight the MC events $\to \tau(\tau')$]
[Figure: distorted lifetime distribution, number of events versus lifetime, with the true and measured mean values marked]
[Figure: $\tau_{\text{observed}}$ versus $\tau_{\text{true}}$]
Monte Carlo → curve
Data → $\tau_{\text{observed}}$ + error → estimated $\tau$ + error
Approximate likelihood estimate
pdf: $f(x \mid \theta)$
($x$, $\theta$ could be multidimensional)
• ignore acceptance and resolution effects and determine parameters + errors $\theta_{\text{exp}} \pm \delta_{\text{exp}}$ from a likelihood fit to the observed data
• generate Monte Carlo events for $\theta_0$; the same fit gives $\theta_{0,\text{observed}}$
• loop over $\theta$: re-weight the events by $f(x \mid \theta) / f(x \mid \theta_0)$ and perform the likelihood fit $\to \theta_{\text{observed}}(\theta)$, $\theta(\theta_{\text{observed}})$
• correct the experimental value: $\hat\theta = \theta(\theta_{\text{exp}})$
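The steps above can be sketched for the lifetime example. Everything specific here is an assumption for illustration: the toy detector (Gaussian smearing and an acceptance window), the sample sizes, and the use of the sample mean as the uncorrected likelihood estimate of an exponential lifetime.

```python
import math
import random

SIGMA, T_MIN, T_MAX = 0.3, 0.2, 4.0   # assumed toy detector: smearing, acceptance window

def simulate(rng, tau, n):
    """Return (t_true, t_observed) pairs that pass the toy detector."""
    events = []
    while len(events) < n:
        t_true = rng.expovariate(1.0 / tau)
        t_obs = rng.gauss(t_true, SIGMA)
        if T_MIN < t_obs < T_MAX:
            events.append((t_true, t_obs))
    return events

def tau_observed(mc, tau, tau0):
    """Mean observed time after re-weighting MC events from tau0 to tau."""
    num = den = 0.0
    for t_true, t_obs in mc:
        # weight = f(t_true | tau) / f(t_true | tau0)
        w = (tau0 / tau) * math.exp(-t_true * (1.0 / tau - 1.0 / tau0))
        num += w * t_obs
        den += w
    return num / den

def corrected_estimate(tau_exp, mc, tau0):
    """Invert the monotonic relation tau_observed(tau) by bisection."""
    lo, hi = 0.6 * tau0, 1.5 * tau0
    for _ in range(40):
        mid = 0.5 * (lo + hi)
        if tau_observed(mc, mid, tau0) < tau_exp:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

rng = random.Random(7)
data = [t_obs for _, t_obs in simulate(rng, 1.0, 5000)]   # toy data, true tau = 1
tau_exp = sum(data) / len(data)        # uncorrected estimate (sample mean)
mc = simulate(rng, tau_exp, 20000)     # one simulation at tau0 = tau_exp
tau_hat = corrected_estimate(tau_exp, mc, tau_exp)
```

A single simulated sample is reused for every trial value of $\tau$ through the weights, so the expensive detector simulation runs only once. The scan range is kept well below $2\tau_0$, where the weights would acquire an infinite variance.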
Remarks:
• The fit of the experimental data to the uncorrected pdf
provides an approximate estimate for the parameters.
• Other sufficient statistics may be used, which do not
require a likelihood fit.
• In some cases where the resolution is bad the pdf may
be undefined for some experimental values of x.
Shifting or scaling of data helps.
• For more than 2 parameters it is tedious to determine the
relation between true and observed parameter values.
• In case acceptance and resolution effects are very large,
we may have to take them into account. How?
Acceptance effects
Acceptance effects do not necessarily spoil the method.
Example: The mean value of lifetimes remains a
sufficient statistic when the exponential is truncated at
large times.
$f(t \mid \gamma) = \dfrac{\gamma\, e^{-\gamma t}}{1 - e^{-\gamma t_{\max}}}$
$\ln L = N\left[\ln\gamma - \ln\left(1 - e^{-\gamma t_{\max}}\right)\right] - \gamma \sum_{i=1}^{N} t_i$
$\bar t = \dfrac{1}{N} \sum_{i=1}^{N} t_i$
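A short sketch shows why the mean is all that is needed. Setting $d\ln L/d\gamma = 0$ for an exponential with rate $\gamma$ truncated at $t_{\max}$ gives $1/\gamma - t_{\max}/(e^{\gamma t_{\max}} - 1) = \bar t$, which depends on the data only through $\bar t$. The function names, the toy times and the bracketing interval are illustrative assumptions.

```python
import math

def truncated_mean(gamma, t_max):
    """Expected value of t for an exponential with rate gamma truncated at t_max."""
    x = gamma * t_max
    return 1.0 / gamma - t_max * math.exp(-x) / (-math.expm1(-x))

def log_likelihood(gamma, ts, t_max):
    """ln L = N [ln gamma - ln(1 - exp(-gamma t_max))] - gamma * sum(t_i)."""
    n = len(ts)
    return n * (math.log(gamma) - math.log(-math.expm1(-gamma * t_max))) - gamma * sum(ts)

def fit_rate(t_bar, t_max, lo=1e-6, hi=1e3):
    """Solve truncated_mean(gamma) = t_bar; requires 0 < t_bar < t_max / 2."""
    for _ in range(200):
        mid = math.sqrt(lo * hi)       # bisection in log(gamma)
        if truncated_mean(mid, t_max) > t_bar:
            lo = mid                   # predicted mean too large: increase the rate
        else:
            hi = mid
    return math.sqrt(lo * hi)

ts = [0.3, 1.2, 0.5, 2.1, 0.1, 0.9, 0.4, 1.7]   # toy times, made up
t_max = 3.0
t_bar = sum(ts) / len(ts)
gamma_hat = fit_rate(t_bar, t_max)
```

Any two samples with the same size and the same mean give the same estimate, which is the practical meaning of sufficiency here.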
General case (only losses, no resolution effects):
$h(x \mid \theta) = \dfrac{a(x)\, f(x \mid \theta)}{\int a(x)\, f(x \mid \theta)\, dx}$
$a(x)$ = acceptance
Likelihood:
$\ln L = \sum_i \ln f(x_i \mid \theta) - N \ln A(\theta) + \sum_i \ln a(x_i)$
$A(\theta) = \int a(x)\, f(x \mid \theta)\, dx$
The last term is a constant and can be discarded. The integrated
acceptance $A(\theta)$ has to be estimated by a Monte Carlo simulation
(as a table, or approximated by an analytic expression).
The acceptance estimate may be crude. Approximations reduce the
precision but do not bias the result. The simulation ($\theta_{\text{observed}}$) takes
care of everything.
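A hedged sketch of such a fit follows. The exponential p.d.f. with rate $\gamma$, the acceptance function $a(t) = 1 - e^{-2t}$, the sample sizes and the grid scan are all assumptions chosen for illustration. $A(\gamma)$ is estimated from one Monte Carlo sample generated at $\gamma_0$ and re-weighted to each trial value.

```python
import math
import random

def acceptance(t):
    """Assumed toy acceptance: losses at small times."""
    return 1.0 - math.exp(-2.0 * t)

def integrated_acceptance(gamma, mc_times, gamma0):
    """A(gamma) = ∫ a(t) f(t|gamma) dt from MC generated at gamma0 and re-weighted."""
    total = 0.0
    for t in mc_times:
        weight = (gamma / gamma0) * math.exp(-(gamma - gamma0) * t)
        total += acceptance(t) * weight
    return total / len(mc_times)

def log_likelihood(gamma, data, mc_times, gamma0):
    """sum_i ln f(t_i|gamma) - N ln A(gamma); the constant sum of ln a(t_i) is dropped."""
    n = len(data)
    return (n * math.log(gamma) - gamma * sum(data)
            - n * math.log(integrated_acceptance(gamma, mc_times, gamma0)))

rng = random.Random(3)
gamma_true, gamma0 = 1.0, 1.2
data = []
while len(data) < 3000:                        # toy data with acceptance losses
    t = rng.expovariate(gamma_true)
    if rng.random() < acceptance(t):
        data.append(t)
mc_times = [rng.expovariate(gamma0) for _ in range(10000)]
grid = [0.7 + 0.005 * i for i in range(161)]   # scan gamma from 0.7 to 1.5
gamma_hat = max(grid, key=lambda g: log_likelihood(g, data, mc_times, gamma0))
```

For this particular acceptance the integral is known analytically, $A(\gamma) = 2/(\gamma + 2)$, which gives a simple cross-check of the Monte Carlo estimate.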
Resolution effects
Can normally be neglected (remember: approximations do not
bias the result)
When non-negligible:
• Perform binning-free unfolding (see my SLAC contribution)
• Do a likelihood fit with the unfolded data
• Simulate the complete procedure with MC (may require some
CPU power)
Approximate estimators for linear and quadratic pdfs
(in case acceptance and resolution effects are small)
1

f ( x | a, b)  a  bx  3  a  x 2
2

1  x  1
p.d.f.:
Asume a=a0+a, b=b0+b, f f0(x)=f(x |a0,b0)
a, b small
Neglect quadratic terms in a, b
x / f (x )

ˆ
b
 x / f (x )
i
2
i
0
i
0
i
(1  3x ) / f (x )

aˆ 
 (1  3x ) / f (x )
2
i
2
i
0
i
0
i
(very fast, could be used online)
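A minimal sketch of these estimators, assuming the shifts $\alpha$, $\beta$ are estimated independently from $u_a = (1 - 3x^2)/f_0$ and $u_b = x/f_0$ with the first-order formula $\sum u / \sum u^2$. The starting point $a_0 = 1/2$, $b_0 = 0$, the true values and the sample size are toy choices made here.

```python
import random

def f(x, a, b):
    """Quadratic p.d.f. on [-1, 1], normalized for any a."""
    return a + b * x + 3.0 * (0.5 - a) * x * x

def approximate_estimators(xs, a0, b0):
    """Return (alpha_hat, beta_hat), the first-order shifts relative to (a0, b0)."""
    sa = saa = sb = sbb = 0.0
    for x in xs:
        f0 = f(x, a0, b0)
        ua = (1.0 - 3.0 * x * x) / f0    # derivative of f with respect to a, over f0
        ub = x / f0                      # derivative of f with respect to b, over f0
        sa += ua
        saa += ua * ua
        sb += ub
        sbb += ub * ub
    return sa / saa, sb / sbb

def sample(rng, a, b, n, f_max=0.7):
    """Accept-reject sampling; f_max must bound f on [-1, 1]."""
    xs = []
    while len(xs) < n:
        x = rng.uniform(-1.0, 1.0)
        if rng.random() * f_max < f(x, a, b):
            xs.append(x)
    return xs

rng = random.Random(11)
xs = sample(rng, 0.45, 0.1, 20000)      # toy truth: alpha = -0.05, beta = 0.1
alpha_hat, beta_hat = approximate_estimators(xs, 0.5, 0.0)
```

Only four running sums are needed, with no iteration and no stored events, which is consistent with the remark that the estimators could be used online.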
Summary
Method 1: Reduction of variables
• The number of variables can be reduced to the number of
parameters. This simplifies a likelihood inference of the
parameters if the number of parameters is less than the
number of variables.
• Goodness-of-fit tests can be applied to the new variable(s)
(simplifies g.o.f.).
• Acceptance and resolution effects can be taken into account
in a similar way as in the second method (this has not been
demonstrated).
Method 2: Use of an approximately sufficient statistic or
likelihood estimate
a) No large resolution and acceptance effects:
Perform fit with uncorrected data and undistorted likelihood
function.
b) Acceptance losses but small distortions:
Compute global acceptance by MC and include in the
likelihood function.
c) Strong resolution effects:
Perform crude unfolding.
All approximations are corrected by the Monte Carlo
simulation.
The loss in precision introduced by the approximations is
usually completely negligible.