Line Fit with Errors on x,y

We now consider the line fit in the case where we have an uncertainty in both x and y. An example of such an application is the search for right-handed weak currents at HERA. Here, cross sections are measured as a function of the polarization of the lepton beam. Weak interactions are thought to be purely left-handed; i.e., the charged current cross section for right-handed electrons (and for left-handed positrons) should vanish. This is to be tested experimentally:

$$ \sigma_{CC}(P) = (1 + P)\,\sigma_{CC}(P = 0) $$

where P = +1 for right-handed positrons and left-handed electrons, and P = -1 for left-handed positrons and right-handed electrons. For example, a beam with P = -0.3 should show a 30% reduction of the CC cross section relative to the unpolarized case. Both $\sigma_{CC}$ and P are measured with some uncertainty.
HERA II (2003-2007)

Spin rotators were successfully implemented. This opened new physics possibilities due to the big increase in the data sets and the polarization of the lepton beam.

Proton structure studies:
- Charged Current: for e⁻, a W⁻ is exchanged, which scatters on u, c, d̄, s̄; for e⁺, a W⁺ is exchanged, which scatters on d, s, ū, c̄.
- Neutral Current

The weak interaction is purely left-handed, so polarization of the lepton beam should show dramatic effects: the CC interaction should disappear for right-handed electrons and left-handed positrons. We therefore measure CC cross sections as a function of the lepton charge and polarization, a classic test of the electroweak interaction. (Error bars on the polarization, typically 2%, are not shown in the plots.)
Pearson's Data

Here is a standard data set used to test fitting programs. It is rather pathological, with errors on the coordinates changing rapidly from point to point.

The theory is that the data comes from a straight line. We try our different approaches and see what happens.
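The numerical sketches below use this data set. The numbers here are the values commonly quoted for Pearson's (1901) data with York's (1966) weights; since this section does not tabulate them, treat them as an assumption and check against the lecture's own table.

```python
import numpy as np

# Pearson's data with York's weights (commonly quoted values; verify
# against the lecture's own table). Weights w = 1/sigma^2 -- note how
# rapidly they change from point to point.
x  = np.array([0.0, 0.9, 1.8, 2.6, 3.3, 4.4, 5.2, 6.1, 6.5, 7.4])
y  = np.array([5.9, 5.4, 4.4, 4.6, 3.5, 3.7, 2.8, 2.8, 2.4, 1.5])
wx = np.array([1000., 1000., 500., 800., 200., 80., 60., 20., 1.8, 1.0])
wy = np.array([1., 1.8, 4., 8., 20., 20., 70., 70., 100., 500.])
sx = 1.0 / np.sqrt(wx)   # sigma_x,i
sy = 1.0 / np.sqrt(wy)   # sigma_y,i
```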
Formulation - 1

We consider the general line fit where we have uncertainty on both x and y. We assume the x and y measurements are independent.

Formulation 1: consider that all points along the line can be the source of the data, and form the joint probability as follows (for a single point):

$$ f(x_i, y_i \mid m, c) = \int f(x_i \mid m, c)\, f(y_i \mid m, c)\, ds $$

where $ds = \sqrt{1 + (df/dx)^2}\,dx$ is the path length along the line $y = f(x) = mx + c$, so that

$$ f(x_i, y_i \mid m, c) = \int \frac{1}{2\pi\,\sigma_x \sigma_y}\, \exp\!\left(-\frac{(x_i - x)^2}{2\sigma_x^2}\right) \exp\!\left(-\frac{(y_i - f(x))^2}{2\sigma_y^2}\right) \sqrt{1 + \left(\frac{df}{dx}\right)^{\!2}}\; dx $$

$$ = \frac{\sqrt{1 + m^2}}{2\pi\,\sigma_x \sigma_y} \int \exp\!\left(-\frac{(x_i - x)^2}{2\sigma_x^2}\right) \exp\!\left(-\frac{(y_i - mx - c)^2}{2\sigma_y^2}\right) dx $$

Likelihood

$$ L(m, c) = \prod_i f(x_i, y_i \mid m, c) \propto \prod_i \frac{\sqrt{1 + m^2}}{2\pi\,\sigma_{x,i}\sigma_{y,i}} \int \exp\!\left(-\frac{(x_i - x)^2}{2\sigma_{x,i}^2}\right) \exp\!\left(-\frac{(y_i - mx - c)^2}{2\sigma_{y,i}^2}\right) dx $$
Line Fit – cont.

If we assume the integral runs to ±∞ (in practice ±4σ is fine), then we can use our trick of 'completing the square' to solve the integral. We factor out the parts that don't have any x dependence. After some messy algebra, we find:

$$ L(m, c) \propto \prod_i \sqrt{\frac{1 + m^2}{\sigma_{y,i}^2 + m^2\sigma_{x,i}^2}}\; \exp\!\left(-\frac{(y_i - m x_i - c)^2}{2\,(\sigma_{y,i}^2 + m^2\sigma_{x,i}^2)}\right) $$
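As a minimal numerical sketch (assuming numpy/scipy are available; x, y, sx, sy are the Pearson arrays from the earlier sketch, repeated here so the block runs on its own), one can minimize the negative log of this likelihood directly:

```python
import numpy as np
from scipy.optimize import minimize

x  = np.array([0.0, 0.9, 1.8, 2.6, 3.3, 4.4, 5.2, 6.1, 6.5, 7.4])
y  = np.array([5.9, 5.4, 4.4, 4.6, 3.5, 3.7, 2.8, 2.8, 2.4, 1.5])
sx = 1.0 / np.sqrt([1000., 1000., 500., 800., 200., 80., 60., 20., 1.8, 1.0])
sy = 1.0 / np.sqrt([1., 1.8, 4., 8., 20., 20., 70., 70., 100., 500.])

def neg_log_L(params):
    """-ln L(m,c), keeping the sqrt((1+m^2)/var) prefactor."""
    m, c = params
    var = sy**2 + m**2 * sx**2              # effective variance per point
    return np.sum(0.5 * (y - m*x - c)**2 / var
                  - 0.5 * np.log((1.0 + m**2) / var))

res = minimize(neg_log_L, x0=[-0.5, 5.5], method="Nelder-Mead")
print(res.x)   # compare with the (m, c) quoted on the next slide
```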
Likelihood
Likelihood contours in the (m, c) plane. Note the large correlation between the parameters.

$$ m = -0.47411 \pm 0.05756, \qquad c = 5.4499 \pm 0.29268 $$

$$ \rho_{ij} = \begin{pmatrix} 1.000 & -0.963 \\ -0.963 & 1.000 \end{pmatrix} $$

Likelihood - cont.

$$ c = 5.4499\,^{+0.2995}_{-0.2851}, \qquad m = -0.47411\,^{+0.05522}_{-0.05975} $$

The log(likelihood) looks reasonably Gaussian, so the error prescription and χ² fits are expected to work.

Likelihood - cont.

The ±1σ contours.
Line Fit - cont.

If we take $\sigma_{x,i} = \sigma_{y,i} = \sigma_i$, the prefactor collapses and

$$ L(m, c) \propto \prod_i \frac{1}{\sigma_i} \exp\!\left(-\frac{(y_i - m x_i - c)^2}{2\sigma_i^2\,(1 + m^2)}\right) $$

With $d_i$ the vertical distance and $h_i$ the perpendicular distance of point i from the line,

$$ h_i = d_i \cos\theta = \frac{y_i - m x_i - c}{\sqrt{1 + m^2}} $$

so

$$ L(m, c) \propto \prod_i \frac{1}{\sigma_i} \exp\!\left(-\frac{h_i^2}{2\sigma_i^2}\right) $$

What matters is the perpendicular distance to the line.
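A quick numerical check of this geometric identity, with an arbitrary example line and point (not from the data set):

```python
import numpy as np

m, c = 0.75, 1.0        # an arbitrary example line y = m*x + c
px, py = 2.0, 4.0       # an arbitrary example point

# Signed perpendicular distance from the formula in the text
h = (py - m*px - c) / np.sqrt(1 + m**2)

# Same distance by brute-force minimization over points on the line
t = np.linspace(-10.0, 10.0, 200001)
d = np.sqrt((t - px)**2 + (m*t + c - py)**2)
print(h, d.min())       # |h| agrees with d.min() to grid precision
```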
χ² Approach

$y = mx + c$. For a single point, let $d_i$ be the vertical distance and $h_i$ the perpendicular distance to the line. With $m = \tan\theta$:

$$ h_i = d_i \cos\theta = \frac{d_i}{\sqrt{1 + m^2}} $$

Assume for now equal errors on x and y. Define the χ² using the perpendicular distance to the line:

$$ \chi^2 = \sum_i \frac{h_i^2}{\sigma_i^2} = \sum_i \frac{d_i^2}{\sigma_i^2\,(1 + m^2)} = \sum_i \frac{(y_i - m x_i - c)^2}{\sigma_i^2\,(1 + m^2)} $$
χ² Approach - cont.

Suppose now that $\sigma_{x,i} \ne \sigma_{y,i}$. Change variables:

$$ \tilde{x}_i = \frac{x_i}{\sigma_{x,i}}, \qquad \tilde{y}_i = \frac{y_i}{\sigma_{y,i}} $$

In this coordinate system, the errors are equal (and 1) on each variable. So

$$ \chi^2 = \sum_i \frac{(\tilde{y}_i - \tilde{m}\,\tilde{x}_i - \tilde{c})^2}{1 + \tilde{m}^2} $$

where $\tilde{y} = \tilde{m}\tilde{x} + \tilde{c}$ corresponds to $y = mx + c$ with

$$ \tilde{m} = m\,\frac{\sigma_{x,i}}{\sigma_{y,i}}, \qquad \tilde{c} = \frac{c}{\sigma_{y,i}} $$
χ² Approach - cont.

Substituting back in terms of the original variables:

$$ \chi^2 = \sum_i \frac{(y_i - m x_i - c)^2}{\sigma_{y,i}^2 + m^2\sigma_{x,i}^2} $$

Note: this looks like the maximum likelihood formula, except that we are missing the prefactor. This is consistent with the χ² definition, in which we take only the Gaussian part of the likelihood, but it is clearly an approximation. Note also that this formula violates one of the rules of χ² fits: the error depends on the fit parameter (m) and is not fixed.
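Because the weights change with m, a standard fixed-weight least-squares routine does not apply directly; a simple sketch (scipy assumed, Pearson arrays as before) just minimizes this χ² numerically:

```python
import numpy as np
from scipy.optimize import minimize

x  = np.array([0.0, 0.9, 1.8, 2.6, 3.3, 4.4, 5.2, 6.1, 6.5, 7.4])
y  = np.array([5.9, 5.4, 4.4, 4.6, 3.5, 3.7, 2.8, 2.8, 2.4, 1.5])
sx = 1.0 / np.sqrt([1000., 1000., 500., 800., 200., 80., 60., 20., 1.8, 1.0])
sy = 1.0 / np.sqrt([1., 1.8, 4., 8., 20., 20., 70., 70., 100., 500.])

def chi2(params):
    """Effective-variance chi^2; the weights depend on the slope m."""
    m, c = params
    return np.sum((y - m*x - c)**2 / (sy**2 + m**2 * sx**2))

res = minimize(chi2, x0=[-0.5, 5.5], method="Nelder-Mead")
print(res.x, chi2(res.x))   # compare with the (m, c) quoted below
```

Equivalently, one can iterate: freeze the effective variances at the current m, do an ordinary weighted least-squares step, and repeat until m stabilizes.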
χ² Approach - cont.

$$ m = -0.47951 \pm 0.05707, \qquad c = 5.4763 \pm 0.28971 $$

$$ \rho_{ij} = \begin{pmatrix} 1.000 & -0.962 \\ -0.962 & 1.000 \end{pmatrix} $$

True Points
In the previous approach, we assumed that every point along the line was equally likely as the source of the measured point, and generated a likelihood density. We can also try a Bayes approach where we assume there are true points which are the source of the measured points. Then

$$ P(\vec{x}_T, m, c \mid \vec{x}_m, \vec{y}_m)\, P(\vec{x}_m, \vec{y}_m) = P(\vec{x}_m, \vec{y}_m \mid \vec{x}_T, m, c)\, P(\vec{x}_T, m, c) $$

Note that we don't need $\vec{y}_T$, since we have the relation $y_T = m x_T + c$.

The measured points are related to the true points via:

$$ P(\vec{x}_m, \vec{y}_m \mid \vec{x}_T, m, c) = \prod_i \frac{1}{2\pi\,\sigma_{x,i}\sigma_{y,i}} \exp\!\left(-\frac{(x_i - x_{T,i})^2}{2\sigma_{x,i}^2}\right) \exp\!\left(-\frac{(y_i - m x_{T,i} - c)^2}{2\sigma_{y,i}^2}\right) $$
Bayes - cont.

For $P(\vec{x}_m, \vec{y}_m)$ we use the law of total probability:

$$ P(\vec{x}_m, \vec{y}_m) = \int d\vec{x}_T\, P(\vec{x}_T) \int dm\, P(m) \int dc\, P(c)\; P(\vec{x}_m, \vec{y}_m \mid \vec{x}_T, m, c) $$

Let's take flat distributions for all the true variables. Then

$$ P(\vec{x}_T, m, c \mid \vec{x}_m, \vec{y}_m) = \frac{P(\vec{x}_m, \vec{y}_m \mid \vec{x}_T, m, c)}{\int d\vec{x}_T\, dm\, dc\; P(\vec{x}_m, \vec{y}_m \mid \vec{x}_T, m, c)} $$

To get the probability in (m, c) space, we marginalize:

$$ P(m, c \mid \vec{x}_m, \vec{y}_m) = \int P(\vec{x}_T, m, c \mid \vec{x}_m, \vec{y}_m)\, d\vec{x}_T = \frac{\int P(\vec{x}_m, \vec{y}_m \mid \vec{x}_T, m, c)\, d\vec{x}_T}{\int d\vec{x}_T\, dm\, dc\; P(\vec{x}_m, \vec{y}_m \mid \vec{x}_T, m, c)} $$
Bayes - cont.

For a single point, the $x_{T,i}$ integral is a Gaussian convolution, and we find

$$ P(m, c \mid x_i, y_i) = \frac{\dfrac{1}{\sqrt{2\pi\,(\sigma_{y,i}^2 + m^2\sigma_{x,i}^2)}}\, \exp\!\left(-\dfrac{(y_i - m x_i - c)^2}{2\,(\sigma_{y,i}^2 + m^2\sigma_{x,i}^2)}\right)}{\displaystyle\int dm\, dc\; \frac{1}{\sqrt{2\pi\,(\sigma_{y,i}^2 + m^2\sigma_{x,i}^2)}}\, \exp\!\left(-\frac{(y_i - m x_i - c)^2}{2\,(\sigma_{y,i}^2 + m^2\sigma_{x,i}^2)}\right)} $$
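A quick numerical check of that convolution step, with arbitrary example numbers (not from the data set):

```python
import numpy as np
from scipy.integrate import quad

m, c = -0.5, 5.5                 # example line parameters
xi, yi = 3.0, 4.2                # one example measured point
sx_i, sy_i = 0.3, 0.5            # its coordinate errors

def integrand(xT):
    gx = np.exp(-(xi - xT)**2 / (2*sx_i**2)) / (np.sqrt(2*np.pi) * sx_i)
    gy = np.exp(-(yi - m*xT - c)**2 / (2*sy_i**2)) / (np.sqrt(2*np.pi) * sy_i)
    return gx * gy

numeric, _ = quad(integrand, -np.inf, np.inf)

var = sy_i**2 + m**2 * sx_i**2   # effective variance
closed = np.exp(-(yi - m*xi - c)**2 / (2*var)) / np.sqrt(2*np.pi * var)
print(numeric, closed)           # the two numbers agree
```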
Multiplying the single-point terms, for our set of points:

$$ P(m, c \mid \vec{x}_m, \vec{y}_m) = \frac{\prod_i \dfrac{1}{\sqrt{2\pi\,(\sigma_{y,i}^2 + m^2\sigma_{x,i}^2)}}\, \exp\!\left(-\dfrac{(y_i - m x_i - c)^2}{2\,(\sigma_{y,i}^2 + m^2\sigma_{x,i}^2)}\right)}{\displaystyle\int dm\, dc\; \prod_i \frac{1}{\sqrt{2\pi\,(\sigma_{y,i}^2 + m^2\sigma_{x,i}^2)}}\, \exp\!\left(-\frac{(y_i - m x_i - c)^2}{2\,(\sigma_{y,i}^2 + m^2\sigma_{x,i}^2)}\right)} $$

Note that this gives a different result from the likelihood approach, which has an extra $\sqrt{1+m^2}$ in the numerator of each factor, and from the χ² approach, which has no prefactor at all.
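The three objective functions differ only in those prefactors, so they are easy to compare side by side; a sketch (scipy assumed, Pearson arrays as before):

```python
import numpy as np
from scipy.optimize import minimize

x  = np.array([0.0, 0.9, 1.8, 2.6, 3.3, 4.4, 5.2, 6.1, 6.5, 7.4])
y  = np.array([5.9, 5.4, 4.4, 4.6, 3.5, 3.7, 2.8, 2.8, 2.4, 1.5])
sx = 1.0 / np.sqrt([1000., 1000., 500., 800., 200., 80., 60., 20., 1.8, 1.0])
sy = 1.0 / np.sqrt([1., 1.8, 4., 8., 20., 20., 70., 70., 100., 500.])

def objective(kind):
    def f(params):
        m, c = params
        var = sy**2 + m**2 * sx**2
        core = np.sum((y - m*x - c)**2 / var)       # common Gaussian part
        if kind == "chi2":
            return core                             # no prefactor
        if kind == "bayes":
            return core + np.sum(np.log(var))       # 1/sqrt(var) prefactor
        return core + np.sum(np.log(var / (1 + m**2)))  # full likelihood
    return f

for kind in ("likelihood", "chi2", "bayes"):
    res = minimize(objective(kind), x0=[-0.5, 5.5], method="Nelder-Mead")
    print(kind, res.x)
```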
Pearson's Data – cont.

Bayes: posterior contours of $P(m, c \mid \vec{x}_m, \vec{y}_m)$ in the (m, c) plane, computed from the formula above.
Pearson's Data – cont.

The marginalized Bayes probabilities:

$$ P(m \mid \vec{x}_m, \vec{y}_m) = \int P(m, c \mid \vec{x}_m, \vec{y}_m)\, dc $$

$$ P(c \mid \vec{x}_m, \vec{y}_m) = \int P(m, c \mid \vec{x}_m, \vec{y}_m)\, dm $$
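In practice these marginals can be obtained by evaluating the posterior on a grid and summing; a minimal sketch (Pearson arrays as before, grid ranges chosen by hand):

```python
import numpy as np

x  = np.array([0.0, 0.9, 1.8, 2.6, 3.3, 4.4, 5.2, 6.1, 6.5, 7.4])
y  = np.array([5.9, 5.4, 4.4, 4.6, 3.5, 3.7, 2.8, 2.8, 2.4, 1.5])
sx = 1.0 / np.sqrt([1000., 1000., 500., 800., 200., 80., 60., 20., 1.8, 1.0])
sy = 1.0 / np.sqrt([1., 1.8, 4., 8., 20., 20., 70., 70., 100., 500.])

m_grid = np.linspace(-0.8, -0.2, 401)
c_grid = np.linspace(4.0, 7.0, 401)
M, C = np.meshgrid(m_grid, c_grid, indexing="ij")

# Log posterior with flat priors; 1/sqrt(var) Bayes prefactor per point.
logp = np.zeros_like(M)
for xi, yi, sxi, syi in zip(x, y, sx, sy):
    var = syi**2 + M**2 * sxi**2
    logp += -(yi - M*xi - C)**2 / (2.0*var) - 0.5*np.log(var)

dm, dc = m_grid[1] - m_grid[0], c_grid[1] - c_grid[0]
post = np.exp(logp - logp.max())
post /= post.sum() * dm * dc                  # normalize on the grid

p_m = post.sum(axis=1) * dc                   # P(m | data)
p_c = post.sum(axis=0) * dm                   # P(c | data)
print(m_grid[np.argmax(p_m)], c_grid[np.argmax(p_c)])
```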
Pearson's Data - cont.

68% CL contours for y(x) from the Bayes approach:

$$ P(y \mid x) = \int P(m, c \mid \vec{x}_m, \vec{y}_m)\; \delta\big(c - (y - m x)\big)\; dm\, dc $$

with the band edges $y_{\min}$, $y_{\max}$ defined by

$$ \int_{-\infty}^{y_{\min}} P(y \mid x)\, dy \;=\; 0.16 \;=\; \int_{y_{\max}}^{\infty} P(y \mid x)\, dy $$
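On a grid, the delta function just means evaluating y = m x + c at every node and taking weighted quantiles; a small sketch reusing M, C, post from the previous sketch:

```python
import numpy as np

def band_68(x0, M, C, post):
    """16% / 84% posterior quantiles of y = m*x0 + c on the (m, c) grid."""
    yv = (M * x0 + C).ravel()
    w = post.ravel()
    order = np.argsort(yv)
    cdf = np.cumsum(w[order]) / w.sum()
    y_lo = yv[order][np.searchsorted(cdf, 0.16)]
    y_hi = yv[order][np.searchsorted(cdf, 0.84)]
    return y_lo, y_hi

# M, C, post: the posterior grid from the previous sketch
for x0 in (0.0, 4.0, 7.4):
    print(x0, band_68(x0, M, C, post))
```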
Comparison

Only very small differences between the three approaches!