VII RTN Workshop
INPN, Charles University, Prague
8th-10th February 2006
Study of ttbar production
in the tau + jets channel at CDF II
using neural networks
Silvia Amerio, Ambra Gresele, Ignazio Lazzizzera
University of Trento and INFN
ttbar → tau + jets channel
p pbar → t tbar → W b W b
[Decay diagram: each top quark gives a b-jet; one W decays to 2 jets, the other to tau + neutrino. Tau decay modes shown: leptonic, with neutrinos (BR ≈ 36%); hadronic 1-prong, one charged pion plus ≥ 0 neutral pions (BR ≈ 50%); hadronic 3-prong, three charged pions plus ≥ 0 neutral pions (BR ≈ 14%).]
Tau reconstruction at CDF
[Diagram: hadronic tau decays to charged pions plus ≥ 0 neutral pions, seen as a hadronic cluster and an EM cluster.]
tau from W decay is highly boosted
Charged pions → isolated tracks pointing at narrow clusters in calorimeters
Neutral pions → 2 photons → showers in the Central EM ShowerMax chamber (π0)
Tau ID procedure:
• finding a narrow cluster in the calorimeter
• finding a seed track (a track pointing to the cluster; if several such tracks are found, the one with the highest Pt is chosen)
• other tracks within a 3-D angle θiso with respect to the seed track are associated with the tau candidate
• tracks with θiso < θ < θsig are used to veto tau candidates
• in the same way, π0s are associated with tau candidates
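The tau ID steps above can be sketched in a few lines. This is only an illustration under stated assumptions, not the CDF tau finder: tracks are plain (px, py, pz) tuples, and the function name, `match_angle`, `theta_inner` and `theta_outer` are hypothetical. Tracks inside the inner cone around the seed are associated with the candidate, tracks in the annulus between the two cones veto it.

```python
import math

def angle_between(a, b):
    """3-D opening angle (radians) between two momentum vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return math.acos(max(-1.0, min(1.0, dot / norm)))

def transverse_momentum(p):
    return math.hypot(p[0], p[1])

def build_tau_candidate(tracks, cluster_dir, match_angle, theta_inner, theta_outer):
    """Return a candidate dict, or None if no track points to the cluster."""
    # Seed track: highest-Pt track among those pointing to the cluster.
    pointing = [t for t in tracks if angle_between(t, cluster_dir) < match_angle]
    if not pointing:
        return None
    seed = max(pointing, key=transverse_momentum)
    associated, veto_tracks = [], []
    for t in tracks:
        if t is seed:
            continue
        theta = angle_between(t, seed)
        if theta < theta_inner:      # inner cone: part of the candidate
            associated.append(t)
        elif theta < theta_outer:    # annulus: vetoes the candidate
            veto_tracks.append(t)
    return {"seed": seed, "associated": associated, "vetoed": bool(veto_tracks)}
```

The cone sizes are left as parameters because the values used in the analysis are not stated on the slide.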
ttbar → tau + jets selection
ttbar → tau + jets events are searched for in 311 pb-1 of data collected with the TOP_MULTIJET TRIGGER:
Level 1 → at least one calorimetric tower with Et ≥ 10 GeV
Level 2 → at least 4 calorimetric clusters with Etclu ≥ 15 GeV and total transverse energy ΣEt ≥ 125 GeV
Level 3 → at least 4 jets with |η| ≤ 2.0 and transverse energy Etjet ≥ 10 GeV
Huge background from QCD events: after some preliminary cuts (prerequisites), S/N = 1/32000
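As a rough illustration, the three trigger levels quoted on this slide can be written as one predicate. The event layout (a dict with `tower_et`, `cluster_et`, `sum_et` and `jets` keys) is a hypothetical stand-in for the real trigger banks; only the thresholds come from the slide.

```python
def passes_top_multijet(event):
    """Apply the Level 1/2/3 requirements of the multijet trigger."""
    # Level 1: at least one calorimetric tower with Et >= 10 GeV.
    level1 = any(et >= 10.0 for et in event["tower_et"])
    # Level 2: at least 4 clusters with Et >= 15 GeV and sum Et >= 125 GeV.
    hard_clusters = [et for et in event["cluster_et"] if et >= 15.0]
    level2 = len(hard_clusters) >= 4 and event["sum_et"] >= 125.0
    # Level 3: at least 4 jets with |eta| <= 2.0 and Et >= 10 GeV.
    good_jets = [j for j in event["jets"]
                 if abs(j["eta"]) <= 2.0 and j["et"] >= 10.0]
    level3 = len(good_jets) >= 4
    return level1 and level2 and level3
```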
What can we do to increase S/N?
a tau+jets event has 2 jets from b quarks → we require that at least one jet is b-tagged
a tau+jets event has one jet from a tau → we require that at least one tau-like jet is present
S/N = 1/6000
ttbar → tau + jets selection
2 neural networks applied in cascade
Study of discriminating variables
GLOBAL: number of jets, aplanarity, sphericity, centrality, missing Et significance
TAU SPECIFIC: number of tracks in the signal cone, isolation, narrowness, mass, number of muon hits in correspondence of tau candidate
NN1 → trained with mostly global variables
NN2 → trained with tau specific variables
NN analysis:
• training on a sample with 50% signal (MC Herwig, mixture of different masses) and 50% bgnd (data) (3000+3000 evts)
• validation on a sample with realistic (S/N = 1/6000) proportions of signal and background
ttbar → tau + jets selection
Training results
Eff(cut) = (signal events with NNout > cut) / (total signal events)
Pur(cut) = (signal events with NNout > cut) / (total events with NNout > cut)
[Plots for NN1 and NN2: NNout distribution, Efficiency, Purity, Eff vs Purity]
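The efficiency and purity defined on this slide are simple counting ratios. A minimal sketch, assuming the NN outputs are available as two plain lists (one for signal, one for background) and that an event passes when its output is strictly above the cut; the function name is hypothetical.

```python
def efficiency_and_purity(signal_out, background_out, cut):
    """Return (Eff, Pur) for a cut on the network output."""
    sig_pass = sum(1 for out in signal_out if out > cut)
    bkg_pass = sum(1 for out in background_out if out > cut)
    efficiency = sig_pass / len(signal_out)
    selected = sig_pass + bkg_pass
    # Purity is undefined when nothing passes; report 0.0 in that case.
    purity = sig_pass / selected if selected else 0.0
    return efficiency, purity
```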
ttbar → tau + jets selection
NN1 validation results
Validation on a sample different from the training one and with S/N = 1/6000
To increase S/N, NN2 is applied only to those events having NN1 output greater than a certain value, BestCut
ttbar → tau + jets selection
Best Cut
BestCut is the NN1 output cut maximizing the signal significance S/√(S+B)
Signal significance as a function of the cut on NN1 output for different top masses; maximum at BestCut = 960, independently of the top mass
For each mass used in training, samples with realistic proportions of signal and bgnd (S/N = 1/6000). Signal events picked randomly from Herwig MC. Bgnd events are data events with MetSigf ≤ 3 GeV^1/2.
100 measurements, varying the signal events
Signal significance increases with decreasing top mass (cross section increases)
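The BestCut scan can be sketched as a loop over candidate cuts. This example takes the significance to be S/√(S+B), which is an assumption about the intended formula, and uses hypothetical names; `w_sig` and `w_bkg` are optional per-event weights for rescaling a sample to realistic signal and background proportions.

```python
import math

def best_cut(signal_out, background_out, cuts, w_sig=1.0, w_bkg=1.0):
    """Return (cut, significance) maximising S / sqrt(S + B)."""
    best = (None, float("-inf"))
    for cut in cuts:
        s = w_sig * sum(1 for out in signal_out if out > cut)
        b = w_bkg * sum(1 for out in background_out if out > cut)
        if s + b == 0:
            continue  # nothing selected: significance undefined
        significance = s / math.sqrt(s + b)
        if significance > best[1]:
            best = (cut, significance)
    return best
```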
ttbar → tau + jets selection
NN2 validation results
Validation on those events having NN1 output greater than BestCut (S/N = 1/13)
Higher purity due to higher S/N in the sample submitted to NN2
ttbar production cross section
The method
• Made different NN1 and NN2 trainings, selected the best (on validation)
• Apply NN1
• Select events whose NN1 output > BestCut
• Apply NN2
• Binned likelihood fit on NN2 output distribution to extract ns and nb,
number of signal and background events in the sample
$\mathcal{L}(n_s, n_b) = \frac{(n_s + n_b)^{N} \, e^{-(n_s + n_b)}}{N!} \prod_{i=1}^{\#bins} \left[ \frac{n_s f_s(i) + n_b f_b(i)}{n_s + n_b} \right]^{k_i}$
ns = signal events
nb = background events
N = total data events
fs(i) = fraction of signal events in ith bin
fb(i) = fraction of background events in ith bin
ki = data events in the ith bin
fs and fb distributions obtained from NN2 training
$\sigma = \frac{n_s}{\epsilon_{preCut} \cdot \epsilon_{NN1} \cdot L}$, with L the integrated luminosity
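A minimal sketch of the binned likelihood fit on this slide, written in pure Python. The maximisation strategy is my own choice, not necessarily the one used in the analysis: with fixed templates the maximum of the extended likelihood has ns + nb = N, so the signal yield can be found with a simple fixed-point (EM) iteration. Function names are hypothetical, and the templates are assumed to be non-zero in every populated bin.

```python
import math

def log_likelihood(ns, nb, k, fs, fb):
    """Log of the extended binned likelihood for yields ns, nb."""
    total, n_obs = ns + nb, sum(k)
    ll = n_obs * math.log(total) - total - math.lgamma(n_obs + 1)
    for ki, fsi, fbi in zip(k, fs, fb):
        if ki:
            ll += ki * math.log((ns * fsi + nb * fbi) / total)
    return ll

def fit_yields(k, fs, fb, tol=1e-10, max_iter=100000):
    """Return (ns, nb) maximising the likelihood via EM iterations."""
    n_obs = sum(k)
    ns = nb = n_obs / 2.0
    for _ in range(max_iter):
        # Expected number of data events attributed to signal in each bin.
        ns_new = sum(ki * ns * fsi / (ns * fsi + nb * fbi)
                     for ki, fsi, fbi in zip(k, fs, fb) if ki)
        converged = abs(ns_new - ns) < tol
        ns, nb = ns_new, n_obs - ns_new
        if converged:
            break
    return ns, nb

def cross_section(ns, eff_precut, eff_nn1, luminosity):
    """sigma = ns / (eff_preCut * eff_NN1 * L)."""
    return ns / (eff_precut * eff_nn1 * luminosity)
```

With data built exactly from 30 signal and 70 background events, `fit_yields` returns those yields to within the iteration tolerance.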
ttbar production cross section
Validation
• Validation is made on samples with realistic proportions of signal and
background (initial S/N = 1/6000, Signal from MC, bgnd from data)
• measurement of cross section as a function of cut on NN1 output → a plateau in correspondence of the correct value is expected
[Plots: measured cross section vs cut on NN1 output, one panel per top mass. Each point is the mean over 100 measurements on samples differing in their signal events.]
Mtop (GeV)   σth (pb)
167.55       8.5
170          7.9
172.5        7.3
175          6.7
177.55       6.2
180          5.7
ttbar production cross section
Validation
To check the method, measurement of cross section as a function of top mass for BestCut = 960
Measurement made on samples with S/N = 1/6000 (signal picked randomly from MC ttbar, bgnd from data with Met significance ≤ 3 GeV^1/2)
[Plot: measured cross section vs top mass. Each red point is the mean over 100 measurements on samples differing in their signal events. Legend: standard deviation; theoretical error. Theory: Cacciari et al., JHEP 0404:068 (2004).]
Conclusion and plans for the future
tau + jets selection → two NNs in cascade give S/N ≈ 1/1 with 36% efficiency on signal
cross section measurement method → results on validation samples in good agreement with theory
we are working to reduce the statistical error
soon a measurement on data!
Backup slides
The accelerator complex
Cockcroft-Walton accelerator: H- up to 750 keV
Linac, linear accelerator: H- up to 400 MeV
Booster, synchrotron: 400 MeV p up to 8 GeV
Main Injector, synchrotron: p/pbar up to 150 GeV
Tevatron, synchrotron: p/pbar up to 980 GeV
CDF detector
[Detector diagram labels: Silicon Tracking Detectors, Central Drift Chambers (COT), Solenoid Coil, EM Calorimeter, Hadronic Calorimeter, Muon Drift Chambers, Muon Scintillator Counters, Steel Shielding; axes x, y, z]
azimuthal angle φ, polar angle θ
η = -log(tan(θ/2))
z = distance along beamline; +z = p direction (east), -z = pbar direction (west); z = 0 interaction point
ttbar → tau + jets selection
Input variables:
• most discriminating between signal and background
• no redundancy
• divided in global variables and tau specific
Input variables → global: a tau+jets event has...
number of jets: 5 jets
aplanarity, sphericity: jets isotropically distributed, i.e. greater values of aplanarity and sphericity
A = 3/2 Q1
S = 3/2 (Q1 + Q2)
Q1 ≤ Q2 ≤ Q3 eigenvalues of normalized momentum tensor
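To make the definitions of A and S concrete, here is a small pure-Python sketch. It assumes the normalized momentum tensor is M_ab = Σ p_a p_b / Σ |p|², summed over the jets, and uses the standard closed-form eigenvalue formula for a symmetric 3x3 matrix. Function names are hypothetical.

```python
import math

def momentum_tensor(momenta):
    """Normalized momentum tensor of a list of (px, py, pz) tuples."""
    norm = sum(px * px + py * py + pz * pz for px, py, pz in momenta)
    return [[sum(p[a] * p[b] for p in momenta) / norm for b in range(3)]
            for a in range(3)]

def symmetric_eigenvalues(m):
    """Eigenvalues of a symmetric 3x3 matrix, in ascending order."""
    off = m[0][1] ** 2 + m[0][2] ** 2 + m[1][2] ** 2
    if off == 0.0:                       # already diagonal
        return sorted(m[i][i] for i in range(3))
    q = (m[0][0] + m[1][1] + m[2][2]) / 3.0
    p = math.sqrt((sum((m[i][i] - q) ** 2 for i in range(3)) + 2.0 * off) / 6.0)
    b = [[(m[i][j] - (q if i == j else 0.0)) / p for j in range(3)]
         for i in range(3)]
    det = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
           - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
           + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    phi = math.acos(max(-1.0, min(1.0, det / 2.0))) / 3.0
    largest = q + 2.0 * p * math.cos(phi)
    smallest = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    return sorted([smallest, 3.0 * q - smallest - largest, largest])

def aplanarity_sphericity(momenta):
    """Return (A, S) with A = 3/2 Q1 and S = 3/2 (Q1 + Q2)."""
    q1, q2, _ = symmetric_eigenvalues(momentum_tensor(momenta))
    return 1.5 * q1, 1.5 * (q1 + q2)
```

Two back-to-back jets give A = S = 0, while momenta spread evenly along the three axes give the maximal values A = 1/2 and S = 1.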
ttbar → tau + jets selection
Input variables → global
centrality: jets are emitted preferably in the transverse plane
$C = \frac{\sum_{jets} E_T}{\sqrt{\hat{s}}}$
sumet3: in a ttbar event up to 6 jets can be produced by hard processes, while in QCD events the least energetic ones are produced by gluon bremsstrahlung
missing Et significance: two neutrinos in the final state
$METsigf = \frac{MET}{\sqrt{\sum E_T}}$
ttbar → tau + jets selection
Input variables → tau specific
a jet coming from a real tau...
has greater boost: tau boost (P/m)
has one/three charged tracks: ntrks10deg
is isolated: nisopi0s (neutral pions in the isolation cone), trackiso (sum pt of the tracks in the isolation cone), caloetiso (cal energy in the isolation cone)
ttbar → tau + jets selection
Input variables → tau specific
is narrow: phiphi (φi-<φ>)², etaeta (ηi-<η>)²
has no muon hits in correspondence: nmuhits
has mass nearer to the tau mass (1.777 GeV): tau visible mass (calculated with tracks and neutral pions)
has small angle between seed track and cluster barycentre: angleseedtocluster
ttbar → tau + jets selection
Prerequisites
• Good run list version 7
• |priVertZ| < 60 cm && |jetZV − priVertZ| < 5 cm && Nv12 ≥ 1
• nTightLepton = 0
b-tagged jet request:
• 92% background rejected
• 35% signal lost
tau jet request:
• 30% background rejected
• 40% signal lost
S/N ≈ 1/6000
[Cut-flow tables: Background (all data events); Signal (MC Herwig events, Mtop = 175 GeV)]
ttbar → tau + jets selection
Training sample
Herwig mass samples used to build training samples
Current top mass estimate (combination of CDF and D0 results) → 172.7 ± 2.9 GeV/c²
[Table: effects of prerequisites and preliminary cuts on MC samples used to build training samples]
ttbar → tau + jets selection
Training sample
Tau candidate – Hepg tau matching
A tau candidate is a true tau if
$\Delta r = \sqrt{\Delta\phi^2 + \Delta\eta^2} < 0.2$
where
$\Delta\phi = \phi_{HepgTau} - \phi_{TauCand}$
$\Delta\eta = \eta_{HepgTau} - \eta_{TauCand}$
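The matching criterion can be sketched as below. The wrap of Δφ into [-π, π] is an added assumption (the slide does not mention it) so that candidates on either side of φ = ±π still match; function names and the (eta, phi) tuple layout are hypothetical.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Distance in the (eta, phi) plane, with phi wrapped to [-pi, pi]."""
    dphi = math.remainder(phi1 - phi2, 2.0 * math.pi)
    deta = eta1 - eta2
    return math.hypot(deta, dphi)

def is_true_tau(tau_cand, hepg_tau, max_dr=0.2):
    """Both arguments are (eta, phi) tuples; 0.2 is the cut from the slide."""
    return delta_r(hepg_tau[0], hepg_tau[1], tau_cand[0], tau_cand[1]) < max_dr
```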
ttbar → tau + jets selection
Number of training events
The more examples we submit to the net, the better the results, BUT the greater the number of events, the slower the net → compromise between a statistically significant number of events and net speed
For a fixed net architecture, trainings on samples with different cardinalities
[Plots: signal distribution, background distribution, efficiency on signal as a function of cut on NNout]
ttbar production cross section
Efficiencies on signal events
[Plot: efficiencies on signal events for prerequisites, b-tagging, tau finder, NN1 output cut]
ttbar → tau + jets selection
Why don’t we use a single NN technique trained with all variables?
18 signal events expected
             S/N      NN output
Single NN    1/3      995
Double NN    1/1.3    450
For a fixed efficiency on signal, two NNs in cascade give better S/N and allow a further increase of S/N (relaxing the request on signal efficiency)
Reactive Tabu Search
Make a chain of the words (up to 8 bits → 0 ÷ 255) representing the weights: w1 w2 ... wn
Search space ⇒ S = {0,1}^Ntot, with B * n = Ntot (B bits per weight, n weights)
RTS basic idea:
• starting configuration (position in the hypersurface) = all weights are equal;
• make d explorative moves in the neighbourhood and choose the best, i.e. the one that produces the minimum value of f(w); 1 move = 1 bit flip
To increase the probability that a bit flip keeps us in the neighbourhood, we can use
• GRAY CODE instead of BINARY CODE;
• picking a toggle bit according to a suitable distribution (linear, decreasing with the bit significance)
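The advantage of Gray code over plain binary can be checked directly: in Gray code, adjacent integer values always differ by exactly one bit, so a neighbouring weight value is always reachable with a single bit flip. The sketch below uses the standard reflected Gray code; the function names are illustrative.

```python
def to_gray(n):
    """Reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)

def from_gray(g):
    """Inverse of to_gray."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

def bits_changed(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")
```

For an 8-bit weight, going from 127 to 128 flips all 8 bits in plain binary but only one bit in Gray code.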
Reactive Tabu Search
TABOO MOVE
As an elementary move is applied, its inverse is temporarily prohibited
for a period of T steps (prohibition period).
T must be:
• large, to avoid cycles (larger than (R/2 -1) to make cycles of length
R impossible);
• small, to avoid over-constraining (anyway smaller than 2Ntot-2)
REACTIVE TABOO MOVE
Initially T is set to 1; then increases when there is evidence that more
diversification is needed, while it decreases when such evidence
disappears.
Reactive Tabu Search
1. Choose a starting configuration (position in the error surface)
2. Make admissible (i.e. not taboo) only the moves that have not been
executed in the most recent part of the search (prohibition period)
3. Make “some” explorative moves (using only admissible moves) in the
neighbourhood of the current configuration
4. Choose the configuration that produces the minimal value of the
error function;
5. Check if the chosen configuration has been already visited in the
past and, in this case, react adjusting the prohibition period or
escaping
6. Goto 1
• If “too often” already visited, escape;
• If “not too often” visited, but among “too many” already visited different configurations, escape as well;
• Otherwise
   - if no repetitions for a long time, decrease the prohibition period;
   - otherwise increase it.
• Store history until needed.
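The loop above can be sketched in a simplified form. This is an illustration under stated assumptions, not the implementation used for the NN training: it evaluates every admissible single-bit flip instead of d sampled moves, escapes with a few random flips, and the thresholds (`max_repeats`, the 2·n_bits waiting time, the cap T ≤ n_bits − 2) are hypothetical choices.

```python
import random

def reactive_tabu_search(f, n_bits, n_steps=200, max_repeats=3, seed=0):
    """Minimise f over lists of n_bits bits (simplified sketch, n_bits >= 3)."""
    rng = random.Random(seed)
    x = [0] * n_bits                      # start: all bits (weights) equal
    best_x, best_f = x[:], f(x)
    T = 1                                 # prohibition period, initially 1
    last_flip = [-(n_steps + n_bits)] * n_bits  # step of each bit's last flip
    visits = {}                           # configuration -> number of visits
    last_change = 0                       # step at which T last changed
    for step in range(n_steps):
        key = tuple(x)
        visits[key] = visits.get(key, 0) + 1
        if visits[key] > max_repeats:
            # Visited "too often": escape with a few random bit flips.
            for i in rng.sample(range(n_bits), max(1, n_bits // 4)):
                x[i] ^= 1
                last_flip[i] = step
        else:
            if visits[key] > 1:           # repetition: diversify more
                T, last_change = min(T + 1, n_bits - 2), step
            elif step - last_change > 2 * n_bits:   # no repetition for long
                T, last_change = max(T - 1, 1), step
            admissible = [i for i in range(n_bits) if step - last_flip[i] > T]
            if not admissible:            # fallback: free the oldest move
                admissible = [min(range(n_bits), key=lambda i: last_flip[i])]

            def value_after_flip(i):
                x[i] ^= 1
                value = f(x)
                x[i] ^= 1
                return value

            i_best = min(admissible, key=value_after_flip)
            x[i_best] ^= 1
            last_flip[i_best] = step
        fx = f(x)
        if fx < best_f:
            best_x, best_f = x[:], fx
    return best_x, best_f
```

In the NN training the bit string would encode the network weights and f would be the error function; here f can be any function of a bit list.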
ttbar production cross section
Systematics
Systematic error introduced by neural networks due to
• finite number of training events
• training procedure: each move on the error surface to localize an optimal minimum consists of a random bit flip in the strings of weights.
To account for these effects we made multiple trainings on samples with different cardinalities
[Diagram: training samples of 2000 evts, 6000 evts, ..., 8000 evts, ..., 12000 evts; 4 trainings per sample (x4); 100 applications per training]
application: NN1 → BestCut → NN2 → fit
ttbar production cross section
Systematics
For each cardinality:
• mean over all pseudoexperiments and all trainings (overall mean)
• mean over the pseudoexperiments (one value per training)
Systematic error: standard deviation of the means over pseudoexperiments with respect to the overall mean
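The systematic error described on this slide can be computed as below. This is a sketch under assumptions: `results[t][e]` is a hypothetical layout holding the cross section measured in pseudoexperiment e with training t, and the spread is taken as a population standard deviation (dividing by the number of trainings), which the slide does not specify.

```python
import math
from statistics import mean

def nn_systematic(results):
    """Return (overall mean, spread of per-training means around it)."""
    training_means = [mean(r) for r in results]      # one value per training
    overall = mean(x for r in results for x in r)    # all trainings, all PEs
    variance = mean((m - overall) ** 2 for m in training_means)
    return overall, math.sqrt(variance)
```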
ttbar production cross section
Systematics
Other sources of systematic errors
• MC generator → Herwig VS Pythia
• Incomplete knowledge of Initial and Final State Radiation
• Jet energy scale
  For the three sources above: compare efficiencies on samples used for training with those obtained on samples with different MC, ISR/FSR, jet corrections.
• Parton Density Functions → reweighting method
• Luminosity estimate
  - 4.2% from Cerenkov Luminosity Counters (MC simulation, beam position, calibrations, beam losses)
  - 4% from inelastic cross section normalization
ttbar production cross section
1. application of prerequisites
2. selection of events with at least one b-tagged jet
3. selection of events with at least one tau candidate
4. NN1 application
5. BestCut
6. NN2 application
Systematic error due to jet energy scale:
• b-tagged jets are searched among those with corrected transverse energy greater than 15 GeV
• NN1 is trained with variables strongly related to jet corrections
[Formula not recoverable from the extraction; surviving fragments: subscripts "def" and "JES", and the value 2.6%]