Parton Distribution Functions
Goals: describe in detail how global PDF analyses are carried out, the strengths and weaknesses
of LO/LO* (modified LO)/NLO and NNLO fits, how PDF uncertainties are calculated and how PDF
re-weighting can be easily carried out through LHAPDF and other PDF tools, and the use of correlations
to constrain PDFs and cross sections
1. Introduction
As mentioned in Chapter ??, the calculation of the production cross sections at hadron colliders for both
interesting physics processes and their backgrounds relies upon a knowledge of the distribution of the
momentum fraction x of the partons (quarks and gluons) in a proton in the relevant kinematic range.
These parton distribution functions (pdfs) can not be calculated perturbatively; ultimately, it may be
possible to calculate these parton distribution functions non-perturbatively, using lattice gauge theory
(reference for LGT). For the foreseeable future, though, pdf’s will be determined by global fits to
data from deep inelastic scattering (DIS), Drell-Yan (DY), and jet production at current energy ranges.
There are a number of global pdf fitting groups that are currently active [], and which provide
semi-regular updates to the parton distributions when new data and/or theoretical developments become
available. The resulting pdf’s are available at leading order (LO), next-to-leading order (NLO) and
next-to-next-to-leading order (NNLO) in the strong coupling constant (αs ), depending on the order(s) at
which the global pdf fits have been carried out. Some pdf’s have also been produced at what has been
termed modified leading order, in an attempt to reduce some of the problems that result from the use
of LO pdfs in parton shower Monte Carlo programs. PDFs of each of these orders will be discussed in this chapter.
There are two different classes of technique for pdf determination, those based on the Hessian
approach, and those using a Monte Carlo approach. Both classes will be discussed in this chapter.
In addition, the PDF4LHC working group [] has carried out benchmark comparisons [] of the NLO
predictions at the LHC (7 TeV) for six PDF groups. These comparisons will be discussed in Chapter X.
2. Processes involved in global analysis fits
Measurements of deep-inelastic scattering (DIS) structure functions (F2 , F3 ), or of the related cross sections, in lepton-hadron scattering and of lepton pair production cross sections in hadron-hadron collisions
provide the main source of information on quark distributions fq/p (x, Q2 ) inside hadrons. At leading
order, the gluon distribution function fg/p (x, Q2 ) enters directly in hadron-hadron scattering processes
with jet final states. Modern global parton distribution fits are carried out to NLO and NNLO, which
allows αS (Q2 ), fq/p (x, Q2 ) and fg/p (x, Q2 ) to all mix and contribute in the theoretical formulae for
all processes. Nevertheless, the broad picture described above still holds to some degree in global pdf
analyses.
An NLO (NNLO) global pdf fit requires thousands of iterations and thus thousands of evaluations of
NLO (NNLO) matrix elements. The NLO (NNLO) matrix elements require too much time for evaluation
to be used directly in global fits. Either a K-factor (NLO/LO or NNLO/LO) can be calculated for each
data point used in the global fit, and the LO matrix element (which can be calculated very quickly) can be
changed in the global fit (multiplied by the K-factor), or a routine such as fastNLO [] or Applgrid [] can
be used for fast evaluation of the NLO matrix element with the new iterated pdf. Practically speaking,
both provide the same order of accuracy. (An argument has been made that the K-factor approach does not work, for example for inclusive jet production, since different subprocesses contribute to that production and the NLO corrections may be different for each subprocess. In practice, however, the K-factors change very slowly in the course of a global fit, and occasional updating is sufficient to preserve the needed accuracy.) Even when fastNLO or Applgrid is used at NLO, a K-factor approach is still needed at NNLO.
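To make the bookkeeping concrete, the short sketch below (plain Python with invented function names and toy numbers, not code from any fitting package) precomputes K = NLO/LO once per data point and then rescales the fast LO evaluation inside the fit loop; the K-factors are refreshed only occasionally.

    # Minimal sketch of the K-factor technique; all names and numbers are toy placeholders.
    def lo_prediction(point, pdf_params):
        # Fast LO matrix element convoluted with the current pdfs (toy stand-in).
        return point["lo_norm"] * (1.0 + 0.1 * pdf_params["gluon_slope"])

    def nlo_prediction(point, pdf_params):
        # Slow NLO calculation; far too expensive to call at every fit iteration.
        return 1.2 * lo_prediction(point, pdf_params)

    def precompute_k_factors(data, pdf_params):
        # Done once (and refreshed occasionally), not at every iteration.
        return [nlo_prediction(p, pdf_params) / lo_prediction(p, pdf_params) for p in data]

    def theory_in_fit_loop(data, k_factors, pdf_params):
        # Inside the chi^2 loop: quick LO evaluation rescaled by the frozen K-factors.
        return [k * lo_prediction(p, pdf_params) for p, k in zip(data, k_factors)]

    data = [{"lo_norm": 100.0}, {"lo_norm": 50.0}]
    k_factors = precompute_k_factors(data, {"gluon_slope": 0.50})
    theory = theory_in_fit_loop(data, k_factors, {"gluon_slope": 0.55})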
The data from DIS, DY and jet processes utilized in pdf fits cover a wide range in x and Q2 .
HERA data (H1 [?]+ZEUS [?]) are predominantly at low x, while the fixed target DIS [?, ?, ?, ?, ?]
and DY [?, ?] data are at higher x. Collider jet data at both the Tevatron and LHC [?, ?, ?, ?] cover a
broad range in x and Q2 by themselves and are particularly important in the determination of the high x
gluon distribution. To date, no jet data from the LHC has been used in global pdf fits, although that will
change as high statistics data, and their detailed systematic error information, are published. In addition,
jet production data from HERA have been used in the HERAPDF global pdf fits [].
show plot of all points in x and Q2 in CT10 fit
There is a tradeoff between the size and the consistency of a data set, in that a wider data set
contains more information, but information coming from different experiments may be partially inconsistent. Most of the fixed target data have been taken on nuclear targets and suffer from uncertainties in
the nuclear corrections that must be made []. This is unfortunate as it is the neutrino fixed target data that
provide most of the quark flavor differentiation, for example between up, down and strange quarks. As
LHC collider data become more copious, it may be possible to reduce the reliance on fixed target nuclear
data. For example, the rapidity distributions for W + , W − and Z production at the LHC (as well as the
Tevatron) are proving to be very useful in constraining u and d valence and sea quarks, as described in
Chapter X.
There is considerable overlap, however, in the kinematic coverage of the datasets, with the degree of overlap increasing with time as the full statistics of the HERA experiments are being published.
Parton distributions determined at a given x and Q2 ‘feed-down’ or evolve to lower x values at higher
Q2 values. reference to discussion in Chapter 2? DGLAP-based NLO and NNLO pQCD should
provide an accurate description of the data (and of the evolution of the parton distributions) over the
entire kinematic range present in current global fits. At very low x and Q2 , DGLAP evolution is believed
to be no longer applicable and a BFKL [?, ?, ?, ?] description must be used. No clear evidence of
BFKL physics is seen in the current range of data (reference for this?); thus all global analyses use
conventional DGLAP evolution of pdfs.
There is a remarkable consistency between the data in the pdf fits and the perturbative QCD theory
fit to them. Both the CTEQ and MSTW groups use over 3000 data points (check exact numbers) in
their global pdf analyses and the χ2 /DOF for the fit of theory to data is on the order of unity, for both the
NLO and NNLO analyses. (The NNLO χ2 values tend to be slightly larger than those for NLO.) For most
of the data points, the statistical errors are smaller than the systematic errors, so a proper treatment of
the systematic errors and their bin-to-bin correlations is important. All modern day experiments provide
the needed correlated systematic error information. The H1 and ZEUS experiments have combined the
data from the two experiments in such a way as to reduce both the systematic and statistical errors,
providing errors of both types of the order of a percent or less over much of the HERA kinematic range. In the
combination, 1402 data points are combined to form 742 cross-section measurements (including both
neutral current and charged current cross sections). The combined data set, with its small statistical and
systematic errors, forms a very strong constraint for all modern global pdf fits. The manner of using the
systematic errors in a global fit will be discussed later in this chapter.
The accuracy of the extrapolation to higher Q2 depends on the accuracy of the original measurement, any uncertainty on αS (Q2 ) and the accuracy of the evolution code. Most global pdf analyses
are carried out at NLO and NNLO. The NLO and NNLO evolution codes have now been benchmarked
against each other and found to be consistent (need ref to les houches). Many processes have been
calculated to NLO and there is the possibility of including data from these processes in global fits. Fewer
processes have been calculated at NNLO [?]. These processes include DIS and DY, but not for example inclusive jet production. (Progress towards the calculation of inclusive jet production at NNLO is
described in Chapter ref:sec.) Typically, jet production is included in global pdf fits using NLO matrix
elements, perhaps supplemented by threshold corrections to make an approximate NNLO prediction.
do we talk about threshold corrections? Thus, any current NNLO global pdf analyses are still approximate for this reason, but in practice the approximation should work reasonably well. Full NNLO precision awaits the completion of the NNLO inclusive jet cross section, though. Current evolution programs in use should be able to carry out the evolution using NLO DGLAP to an accuracy of a few percent over the hadron collider kinematic range, except perhaps at very large x and very small x.

Fig. 1: A plot showing the x and Q2 values needed for the colliding partons to produce a final state with mass M and rapidity y at the LHC (14 TeV). The figure also indicates the relations x_{1,2} = (M/14 TeV) e^{±y} and Q = M.
The kinematics appropriate for the production of a state of mass M and rapidity y at the LHC was
shown in Figure 1 in Section ??.
For example, to produce a state of mass 100 GeV and rapidity 2 requires partons of x values 0.05
and 0.001 at a Q2 value of 1 × 104 GeV2 . Compare this figure to the scatterplot of the x and Q2 range
included in the recent CT10 fit and it is clear that an extrapolation to higher Q2 (M 2 ) is required for
predictions for many of the LHC processes of interest.
3. Parameterizations and schemes
A global pdf analysis carried out at NLO or NNLO needs to be performed in a specific renormalization and factorization scheme. The evolution kernels are calculated in a specific scheme and to maintain consistency, any hard scattering cross section calculations used for the input processes or utilizing the resulting pdfs need to have been implemented in that same renormalization scheme. As we
saw earlier in Chapter ??, one needs to specify a scheme or convention in subtracting the divergent
terms from the pdfs; basically the scheme specifies how much of the finite corrections to subtract along
with the divergent pieces. Almost universally, the \overline{MS} scheme is used; using dimensional regularization, in this scheme the pole terms and the accompanying log 4π and Euler constant terms are subtracted.
Fig. 2: CTEQ6.5 up and down quark distributions normalized to those of CTEQ6.1, showing the impact of the heavy quark
mass corrections.
should we explicitly show the subtraction terms? PDFs are also available in the DIS scheme
(where the full order αs corrections for F2 are absorbed into the quark pdfs), a fixed flavour scheme
(see, for example, GRV [?]) and several schemes that differ in their specific treatment of the charm quark
mass.
Basically all modern pdfs now incorporate a treatment of heavy quark effects in their fits, either via the ACOT general-mass (GM) variable flavor number scheme [] (supplemented by a unified treatment of both kinematical and dynamical effects using the S-ACOT [] and ACOT-χ [] concepts), used by CTEQ; the Thorne-Roberts scheme [], used by both MSTW and HERAPDF; or the FONLL scheme, used by NNPDF [].
What about GJR and ABKM?
Incorporation of the full heavy-quark mass effects in the general-mass formalism suppresses the
heavy flavor contributions to the DIS structure functions, especially at low x and Q2 . In order for the
theoretical calculations in the global fits to agree with the data in these kinematic regions, the
contributions of the light quark and anti-quark pdfs must increase accordingly. This has a noticeable
impact, especially on predictions for W and Z cross sections at the LHC.
Figure 2 shows the impact of the heavy quark mass corrections on the up and down quark distributions for CTEQ6.5, at a Q value of 2 GeV. The CTEQ6.5 up and down quark distributions are normalized
to the corresponding ones from CTEQ6.1 (which does not have the heavy quark mass corrections). The
shaded areas indicate the CTEQ6.1 pdf uncertainty. The dashed curves represent slightly different parameterizations for the CTEQ6.5 pdfs. The heavy quark mass corrections have a strong effect (larger
than the pdf uncertainty for CTEQ6.1) at low x, in a region sensitive to W and Z production at the LHC.
The impact of general-mass variable flavour number schemes (GM-VFNS) lies mostly in the low
x and Q2 regions. Aside from modifications to the fits to the HERA data, and the commensurate change
in the fitted pdfs, there is basically no modification for predictions at high Q2 at the LHC. Thus, it is fully
consistent to use GM-VFNS pdfs with matrix elements calculated in the \overline{MS} scheme.
should this be discussed in more detail/more elegantly
It is also possible to use only leading-order matrix element calculations in the global fits, which results in leading-order parton distribution functions; these have been made available by both the CTEQ
and MRST groups. For many hard matrix elements for processes used in the global analysis, there exist
K factors significantly different from unity. Thus, one expects there to be noticeable differences between
the LO and NLO parton distributions (and indeed this is often the case).
All global analyses use a generic form for the parameterization of both the quark and gluon distributions at some reference value Q0 :
F(x, Q_0) = A_0\, x^{A_1} (1 - x)^{A_2}\, P(x; A_3, A_4, \ldots)     (1)
The reference value Q0 is usually chosen in the range of 1–2 GeV. The parameter A1 is associated with
small-x Regge behaviour while A2 is associated with large-x valence counting rules. We expect A1 to
be approximately -1 for gluons and anti-quarks, and of the order of 1/2 for valence quarks, from the
Regge arguments mentioned in Chapter X. Counting rule arguments tell us that the A2 parameter should
be related to 2ns − 1, where ns is the minimum number of spectator quarks. So, for valence quarks in
a proton, there are two spectator quarks, and we expect A2 = 3. For a gluon, there are three spectator
quarks, and A2 = 5; for anti-quarks in a proton, there are four spectator quarks, and thus A2 = 7. Such
arguments are useful, for example in telling us that the gluon distribution should fall more rapidly with
x than quark distributions, but it is not clear exactly at what value of Q the arguments made above are valid.
The first two factors, in general, are not sufficient to completely describe either quark or gluon
distributions. The term P (x; A3 , ...) is a suitably chosen smooth function, depending on one or more
parameters, that adds more flexibility to the pdf parameterization. P (x; A3 , ...) is chosen so as to tend
towards a constant for x approaching either 0 or 1.
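As an illustration of Eq. (1), the lines below evaluate the generic form with a simple polynomial choice for P(x); the parameter values are purely illustrative and are not taken from any published fit.

    import numpy as np

    def pdf_input_form(x, A0, A1, A2, A3=0.0, A4=0.0):
        # Generic starting-scale parameterization of Eq. (1); P(x) is a smooth function
        # that tends to a constant as x -> 0 and x -> 1.
        P = 1.0 + A3 * np.sqrt(x) + A4 * x
        return A0 * x**A1 * (1.0 - x)**A2 * P

    x = np.logspace(-4, np.log10(0.9), 50)
    # Illustrative valence-like shape: A1 ~ 0.5 (Regge) and A2 ~ 3 (counting rules).
    x_uv = x * pdf_input_form(x, A0=1.0, A1=0.5, A2=3.0, A3=1.5)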
In general, both the number of free parameters and the functional form can have an influence on
the global fit. A too-limited parameterization not only can lead to a worse description of the data, but
also to pdfs in different kinematic regions being tied together not by the physics, but by the limitations
of the parameterization. Note that the parameterization forms shown here imply that pdfs are positive-definite. As they are not physical objects by themselves, it is possible for them to be negative, especially
at low Q2 . Some pdf groups (such as CTEQ) use a positive-definite form for the parameterization; others
do not. For example, the MSTW2008 gluon distribution is negative for x <?, Q2 =?GeV 2 . Evolution
quickly brings the gluon into positive territory.
The NNPDF approach attempts to minimize the parameterization bias by exploring global fits
using a large number of free parameters in a Monte Carlo approach. The general form for NNPDF
can be written as f_i(x, Q_0) = c_i(x) NN_i(x), where NN_i(x) is a neural network and c_i(x) is a “preprocessing function”. The preprocessing function is not fitted, but rather chosen randomly in a space of functions of the general form of Equation X. more detail here? The CT10 NLO fit uses 26 free
parameters (many of the pdf parameters are either fixed at reasonable values, or are constrained by sum
rules), while MSTW08 uses 20 free parameters, and NNPDF effectively has 259 free parameters.
The pdfs made available to the world from the global analysis groups can either be in a form where
the x and Q2 dependence is parameterized, or the pdfs for a given x and Q2 range can be interpolated
from a grid that is provided, or the grid can be generated given the starting parameters for the pdfs (see
the discussion on LHAPDF in Section 8.). All techniques should provide an accuracy on the output pdf
distributions on the order of a few percent.
The parton distributions from the CT10 NLO pdfs release are plotted in Figure 3 at a Q value
of 10 GeV. The gluon distribution is dominant at x values of less than 0.01 with the valence quark
distributions dominant at higher x. One of the major influences of the HERA data has been to steepen
the gluon distribution at low x. The CT10 up quark, up-bar quark, b quark and gluon distributions
are shown as a function of Q2 for x values of 0.001, 0.01, 0.1 and 0.3 in Figures ??. At low x, the pdfs increase with Q2, while at higher x, the pdfs decrease with Q2. Both effects are due to DGLAP evolution.

Fig. 3: The CT10 parton distribution functions evaluated at a Q of 10 GeV. have to scale gluon by 0.1
4. Uncertainties on pdfs
In addition to having the best estimates for the values of the pdfs in a given kinematic range, it is also important to understand the allowed range of variation of the pdfs, i.e. their uncertainties. A conventional
method of estimating parton distribution uncertainties has been to compare different published parton
distributions. This is unreliable since most published sets of parton distributions adopt similar assumptions and the differences between the sets do not fully explore the uncertainties that actually exist.
The sum of the quark distributions (\sum_q [f_{q/p}(x, Q^2) + f_{\bar{q}/p}(x, Q^2)]) is, in general, well-determined
over a wide range of x and Q2 . As stated above, the quark distributions are predominantly determined
by the DIS and DY data sets which have large statistics, and systematic errors in the few percent range
(±3% for 10−4 < x < 0.75). Thus the sum of the quark distributions is basically known to a similar
accuracy. The individual quark flavours, though, may have a greater uncertainty than the sum. This can
be important, for example, in predicting distributions that depend on specific quark flavours, like the W
asymmetry distribution [?] and the W and Z rapidity distributions.
The largest uncertainty of any parton distribution, however, is that on the gluon distribution. The
gluon distribution can be determined indirectly at low x by measuring the scaling violations in the quark distributions, but a direct measurement is necessary at moderate to high x.

Fig. 4: The CT10 up quark, up-bar quark, b quark and gluon parton distribution functions evaluated as a function of Q2 at x values of 0.001 (left) and 0.01 (right).

Fig. 5: The CT10 up quark, up-bar quark, b quark and gluon parton distribution functions evaluated as a function of Q2 at an x value of 0.1 (left) and 0.3 (right).

About 43% of the momentum
of the proton is carried by gluons, and most of that momentum is at relatively small x (16% of the
momentum of the proton, for example, is carried by gluons in the x range from 0.01 to 0.1.) The best
direct information on the gluon distribution at moderate to high x comes from jet production at the
Tevatron (and the LHC).
There has been a great deal of activity on the subject of pdf uncertainties. Two techniques in
particular, the Lagrange Multiplier and Hessian techniques, have been used by CTEQ and MSTW to
estimate pdf uncertainties [?, ?, ?]. The Lagrange Multiplier technique is useful for probing the pdf
uncertainty of a given process, such as the W cross section, while the Hessian technique provides a more
general framework for estimating the pdf uncertainty for any cross section. In addition, the Hessian
technique results in tools more accessible to the general user.
4.1 The Hessian method
The Hessian method for pdf determination involves minimizing a suitable log-likelihood function. The
χ2 function may contain the full set of correlated errors, or only a partial set. The correlated systematic
errors may be accounted for using a covariance matrix, or as a shift to the data, with a χ2 penalty given by the square of each shift in units of its systematic error.
have to rewrite; taken from CT10 paper
As an example, consider the CT10 fits to the combined HERA Run 1 neutral current (e+ p) cross
sections shown in Figures ??. When comparing each experimental value Dk with the respective theory
value Tk ({a}) (dependent on PDF parameters {a}), we account for possible systematic shifts in the
data, estimated by the correlation matrix βkα . There are Nλ =114 independent sources of experimental
systematic uncertainties, quantified by parameters λα that should obey the standard normal distribution.
The contribution of the combined HERA-1 set to the log-likelihood function χ2 is given by
\chi^2(\{a\},\{\lambda\}) = \sum_{k=1}^{N} \frac{1}{s_k^2} \left( D_k - T_k(\{a\}) - \sum_{\alpha=1}^{N_\lambda} \lambda_\alpha \beta_{k\alpha} \right)^2 + \sum_{\alpha=1}^{N_\lambda} \lambda_\alpha^2 ,     (2)
where N is the total number of points and s_k = \sqrt{s_{k,stat}^2 + s_{k,uncor\ sys}^2} is the total uncorrelated error on the measurement D_k, obtained by summing the statistical and uncorrelated systematic errors on D_k in
quadrature. Minimization of χ2 with respect to the systematic parameters λα is realized algebraically,
by the procedure explained in Refs. [?, ?].
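A compact numerical transcription of Eq. (2) is given below (NumPy, with invented toy inputs; this is not the CT10 fitting code). It also shows the standard analytic profiling of the nuisance parameters for fixed theory, which is what is meant by minimizing over the λ_α algebraically.

    import numpy as np

    def chi2_with_shifts(D, T, s, beta, lam):
        # Eq. (2): data D, theory T, uncorrelated errors s, correlation matrix beta[k, alpha],
        # systematic parameters lam[alpha].
        shifted_residual = D - T - beta @ lam
        return np.sum((shifted_residual / s) ** 2) + np.sum(lam ** 2)

    def profile_systematics(D, T, s, beta):
        # Analytic minimum of Eq. (2) over lam for fixed theory:
        # solve (1 + A) lam = b, with A = beta^T diag(1/s^2) beta and b = beta^T (D - T)/s^2.
        w = 1.0 / s**2
        A = beta.T @ (beta * w[:, None])
        b = beta.T @ ((D - T) * w)
        return np.linalg.solve(np.eye(beta.shape[1]) + A, b)

    # Toy example: 5 data points and 2 correlated systematic sources (numbers invented).
    rng = np.random.default_rng(0)
    T = np.array([10.0, 12.0, 9.0, 11.0, 8.0])
    beta = rng.normal(0.0, 0.3, size=(5, 2))
    s = np.full(5, 0.5)
    D = T + beta @ np.array([1.0, -0.5]) + rng.normal(0.0, 0.5, 5)
    lam = profile_systematics(D, T, s, beta)
    print(lam, chi2_with_shifts(D, T, s, beta, lam))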
The plot on the left shows a comparison of the unshifted data and the shifted data. The plot on the
right shows the CT10 NLO predictions compared to the shifted data. The CT10 predictions show good
agreement with the combined H1/ZEUS set of reduced DIS cross sections. A χ2 ≈ 680 is obtained for
the N = 579 data points of the combined HERA-1 sample that pass the standard CTEQ kinematical cuts
for the included DIS data, Q > 2 GeV and W > 3.5 GeV. Apart from some excessive scatter of the NC
e± p data around theory predictions, which results in a slightly higher-than-ideal value of χ2 /N = 1.18,
NLO theory describes well the overall data, without systematic discrepancies.
The data points shown in the figure include systematic shifts bringing theoretical and experimental
values closer to one another, by allowing the systematic parameters λα to vary within the bounds allowed
by the experimental correlation matrix β_{kα}. As expected, the best-fit values of λ_α are distributed consistently with the standard normal distribution. Their contribution \sum_\alpha \lambda_\alpha^2 \approx 65 to χ2 in Eq. 2 is smaller than the value of 114 expected on statistical grounds.
The histogram of λα values obtained in the CT10 fit is shown in Fig. 7. The histogram is clearly
compatible with its stated Gaussian behavior. In each fit, one observes 1-2 values at (±)2-3σ, but those
tend to have a large PDF uncertainty (up to 3σ) and are not persistent in all fits.
Fig. 6: A comparison of the unshifted and shifted HERA1 combined neutral current data (left) and the comparison of the NLO
CT10 predictions to the shifted data (right).
The overall agreement with the combined HERA-1 data is slightly worse than with the separate
data sets, as a consequence of some increase in χ2 /N for the NC data at x < 0.001 and x > 0.1.
In this case, the absolute size of the systematic error shifts needed is small. This need not be the
case, for example, for cross sections with larger systematic errors, such as inclusive jet production.
The Hessian method results in the production of a central (best fit) pdf, and a set of error pdfs. In
this method a large matrix (26 × 26 for CTEQ, 20 × 20 for MSTW), with dimension equal to the number
of free parameters in the fit, has to be diagonalized. The result is 26 (20) orthonormal eigenvector directions for CTEQ (MSTW) which provide the basis for the determination of the pdf error for any
cross section. Thus, there are 52 error pdfs for the CT10 error set and 40 for the MSTW08 error set.
This process is shown schematically in Figure 8. The eigenvectors are now admixtures of the 26 pdf
parameters left free in the global fit. There is a broad range for the eigenvalues, over a factor of one
million. The eigenvalues are distributed roughly linearly in log ε_i, where ε_i is the eigenvalue for the i-th direction. The larger eigenvalues correspond to directions which are well-determined; for example,
eigenvectors 1 and 2 are sensitive primarily to the valence quark distributions at moderate x, a region
where they are well-constrained. The theoretical uncertainty on the determination of the W mass at
both the Tevatron and the LHC depends primarily on these 2 eigenvector directions, as W production at
the Tevatron proceeds primarily through collisions of valence quarks. The most significant eigenvector
directions for determination of the W mass at the LHC correspond to larger eigenvector numbers, which
are primarily determined by sea quark distributions. In most cases, the eigenvector can not be directly
tied to the behaviour of a particular pdf in a specific kinematic region. There are exceptions, such as
eigenvector 15 in the CTEQ6.1 fit, discussed below.
Fig. 7: Distribution of systematic parameters λα of the combined HERA-1 data set [?] in the CT10 best fit (CT10.00).
Fig. 8: A schematic representation of the transformation from the pdf parameter basis to the orthonormal eigenvector basis.
Perhaps the most controversial aspect of pdf uncertainties is the determination of the ∆χ2 excursion from the central fit that is representative of a reasonable error; this excursion is usually written as ∆χ2 = T², where T is the tolerance. Nominally, a ∆χ2 of 1 would correspond to a 1σ (68% CL) error. PDF fits performed with a limited number of experiments may be able to maintain that criterion. For example, HERAPDF uses a χ2 excursion of 1 for a 1σ error.
For general global fits, such as from CTEQ and MSTW, however, a χ2 excursion of 1 (for a 1σ error) is too small. These global fits use data sets arising from a number of different
processes and different experiments; there is a non-negligible tension between some of the different data
sets. A larger variation in ∆χ2 is required for a 68% CL. For example, CT10 uses a tolerance T=10 for
a 90% CL error, corresponding to T=6.1 for a 68% CL error, while MSTW uses a dynamical tolerance
(varying from 1 to 6.5) for each eigenvector.
The uncertainties for all predictions should be linearly dependent on the tolerance parameter used;
thus, it should be reasonable to scale the uncertainty for an observable from the 90% CL limit provided
by the CTEQ/MSTW error pdfs to a one-sigma error by dividing by a factor of 1.6 (MSTW also provides separate 68%CL error pdfs). Such a scaling will be a better approximation for observables more
dependent on the low number eigenvectors, where the χ2 function is closer to a quadratic form.
Even though the data sets and definitions of tolerance are different among the different pdf groups, we will see in Chapter X that the pdf uncertainties at the LHC are fairly similar. Note that relying on the
errors determined from a single pdf group may be an underestimate of the true pdf uncertainty, as the
central results among the pdf groups often differ by an amount similar to this one-sigma error. (See the
discussion in Chapter ?? regarding the PDF4LHC comparisons of predictions and uncertainties for the
LHC.)
Each error pdf results from an excursion along the “+” and “−” directions for each eigenvector.
Consider a variable X; its value using the central pdf for an error set (say CT10) is given by X0 . Xi+ is
the value of that variable using the pdf corresponding to the “+” direction for eigenvector i and Xi− the
value for the variable using the pdf corresponding to the “−” direction. The excursions are symmetric
for the larger eigenvalues, but may be asymmetric for the more poorly determined directions. In order to
calculate the pdf error for an observable, a Master Equation should be used:
\Delta X^{+}_{\max} = \sqrt{\sum_{i=1}^{N} \left[\max\left(X_i^{+} - X_0,\; X_i^{-} - X_0,\; 0\right)\right]^2}

\Delta X^{-}_{\max} = \sqrt{\sum_{i=1}^{N} \left[\max\left(X_0 - X_i^{+},\; X_0 - X_i^{-},\; 0\right)\right]^2}     (3)
∆X + adds in quadrature the pdf error contributions that lead to an increase in the observable X
and ∆X − the pdf error contributions that lead to a decrease. The addition in quadrature is justified by
the eigenvectors forming an orthonormal basis. The sum is over all N eigenvector directions, or 20 in
the case of CTEQ6.1. Ordinarily, Xi+ − X0 will be positive and Xi− − X0 will be negative, and thus it
is trivial as to which term is to be included in each quadratic sum. For the higher number eigenvectors,
however, the “+” and “−” contributions may be in the same direction (see for example eigenvector 17
in Figure 9). In this case, only the most positive term will be included in the calculation of ∆X + and
the most negative in the calculation of ∆X − . Thus, there may be less than N terms for either the “+”
or “−” directions. There are other versions of the Master Equation in current use but the version listed
above is the “official” recommendation of the authors.
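A minimal NumPy transcription of Eq. (3) is shown below; X0 is the prediction obtained with the central pdf and the arrays hold the predictions from the “+” and “−” error pdfs (the numbers are invented for illustration).

    import numpy as np

    def hessian_pdf_uncertainty(X0, X_plus, X_minus):
        # Asymmetric Master Equation, Eq. (3): add in quadrature only the excursions
        # that increase (decrease) the observable.
        Xp, Xm = np.asarray(X_plus), np.asarray(X_minus)
        up = np.maximum.reduce([Xp - X0, Xm - X0, np.zeros_like(Xp)])
        dn = np.maximum.reduce([X0 - Xp, X0 - Xm, np.zeros_like(Xp)])
        return np.sqrt(np.sum(up**2)), np.sqrt(np.sum(dn**2))

    # Toy example with N = 3 eigenvector directions; note that for the third direction
    # both excursions move the observable in the same (downward) direction.
    X0 = 100.0
    X_plus  = [103.0, 101.0, 99.5]
    X_minus = [ 97.0,  99.2, 99.0]
    dX_up, dX_dn = hessian_pdf_uncertainty(X0, X_plus, X_minus)
    # For a 90% CL error set (CTEQ convention), dividing by ~1.6 approximates a 68% CL error.
    print(f"X = {X0} +{dX_up:.2f} -{dX_dn:.2f}")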
There are two things that can happen when new pdfs (eigenvector directions) are added: a new
direction in parameter space can be opened to which some cross sections will be sensitive (an example
of this is eigenvector 15 in the CTEQ6.1 error pdf set, which is sensitive to the high x gluon behaviour
and thus influences the high pT jet cross section at the Tevatron and LHC). This particular eigenvector
direction happens to be dominated by a parameter which affects mostly the large x behavior of the gluon
distribution.
In this case, a smaller parameter space is an underestimate of the true pdf error since it did not
sample a direction important for some physics. In the second case, adding new eigenvectors does not
appreciably open new parameter space and the new parameters should not contribute much pdf error
to most physics processes (although the error may be redistributed somewhat among the new and old
eigenvectors). The high x gluon uncertainty did not decrease significantly in the CTEQ pdfs produced
after the CTEQ6.1 set (CTEQ6.6, CT09, CT10), but in these latter fits, there is no single eigenvector
similar to eigenvector 15 in the CTEQ6.1 pdf set that encompasses most of the high x gluon uncertainty.
Instead this uncertainty is spread among several different eigenvectors.
In Figure 9, the pdf errors are shown in the “+” and “−” directions for the 20 CTEQ eigenvector directions for predictions for inclusive jet production at the Tevatron from the CTEQ6.1 pdfs.
(I'll try to get the mathematica plots for CT10.) The excursions are symmetric for the first 10 eigenvectors but can be asymmetric for the last 10, as they correspond to poorly determined directions.
Either X0 and Xi± can be calculated separately in a matrix element/Monte Carlo program (requiring the program to be run 2N + 1 times) or X0 can be calculated with the program and at the same time
the ratio of the pdf luminosities (the product of the two pdfs at the x values used in the generation of
the event) for eigenvector i (±) to that of the central fit can be calculated and stored. This results in an
effective sample with 2N + 1 weights, but identical kinematics, requiring a substantially reduced amount of time to generate.

Fig. 9: The pdf errors for the CDF inclusive jet cross section in Run 1 for the 20 different eigenvector directions contained in the CTEQ6.1 pdf error set. The vertical axes show the fractional deviation from the central prediction and the horizontal axes the jet transverse momentum in GeV.
As an example of pdf uncertainties using the Hessian method, the CT10 and MSTW2008 uncertainties for the up quark and gluon distributions are shown in Figures ?? and ??. While the CT10 and
MSTW2008 pdf distributions and uncertainties are reasonably close to each other, some differences are
evident, especially at low and high x.
discuss re-diagonalization here?
4.2 The NNPDF approach
4.3 Pdf uncertainties and Sudakov form factors
As discussed in the above section, it is often useful to use the error pdf sets with parton shower Monte
Carlos. The caveat still remains that a true test of the acceptances would require an NLO MC. Similar to
their use with matrix element calculations, events can be generated once using the central pdf and the
pdf weights stored for the error pdfs. These pdf weights then can be used to construct the pdf uncertainty
for any observable. Some sample code for PYTHIA is given on the benchmark website. One additional
complication with respect to their use in matrix element programs is that the parton distributions are
used to construct the initial state parton showers through the backward evolution process. The space-like
evolution of the initial state partons is guided by the ratio of parton distribution functions at different x
and Q2 values, c.f. ??. Thus the Sudakov form factors in parton shower Monte Carlos will be constructed
using only the central pdf and not with any of the individual error pdfs and this may lead to some
errors for the calculation of the pdf uncertainties of some observables. However, it was demonstrated in
Reference [?] that the pdf uncertainty for Sudakov form factors in the kinematic region relevant for the
LHC is minimal, and the weighting technique can be used just as well with parton shower Monte Carlos
as with matrix element programs.
Fig. 10: A comparison of the CT10 and MSTW2008 up quark (left) and gluon (right) pdf uncertainty bands at a Q2 value of
100 GeV 2 .
5. Choice of αs (mZ ) and related uncertainties
Global pdf fits are sensitive to the value of the strong coupling constant αs , explicitly through the QCD
cross sections used in the fits, and implicitly through the scaling violations observed in DIS. In fact, a
global fit can be used to determine the value of αs (mZ ), albeit less accurately than provided by the world
average. Some pdf groups use the world average value of αs (mZ ) as a fixed constant in the global fits,
while other groups allow αs (mZ ) to be a free parameter in the fit. It is also possible to explore the effects
of the variation of αs (mZ ) by producing pdfs at different αs (mZ ) values. The values of αs (mZ ) at NLO
and their uncertainties are shown in Figure X for the six pdf groups.
to be developed further show gluon distribution for different alphas values and discuss compensati
It is expected that the LO value of αs (mZ ) is considerably larger than the NLO value. There is
some tendency for the NNLO value of αs to be slightly smaller than the value at NLO due to the (mostly)
positive NNLO corrections for most processes.
talk about re-diagonalization and orthogonality of pdf and alphas uncert
6. NLO and LO pdfs
Global pdf fitting groups have traditionally produced sets of pdfs in which leading order rather than
next-to-leading order matrix elements, along with the 1-loop αS rather than the 2-loop αS , have been
used to fit the input datasets. The resultant leading order pdfs have most often been used in conjunction
with leading order matrix element programs or parton shower Monte Carlos. However, the leading order
pdfs of a given set will tend to differ from the central pdfs in the NLO fit, and in fact will most often
lie outside the pdf error band. Such is the case for the up quark distribution shown in Figure ?? and
the gluon distribution shown in Figure ??, where the LO pdfs are plotted along with the NLO pdf error
Fig. 11: The CTEQ6L1 up quark and gluon pdfs compared to the CT10 NLO pdf error bands for the same.
bands. The LO up quark distribution is considerably larger than its NLO counterpart at both small x and
large x. This is due to (1) the larger gluon distribution at small x for the LO pdf and (2) the influence of
missing log(1 − x) terms in the LO DIS matrix element. The gluon distribution is outside of the NLO
error band basically for all x. It is higher than the NLO gluon distribution at small x due to missing
log(1/x) terms in the LO DIS matrix element. It is smaller than the NLO gluon distribution at large x
basically due to the momentum sum rule and the lack of constraints at high x.
The global pdf fits are dominated by the high statistics, low systematic error deep inelastic scattering data and the differences between the LO and NLO pdfs are determined most often by the differences
between the LO and NLO matrix elements for deep inelastic scattering. This is especially true at low x
and at high x, due to missing terms that first arise in the hard matrix elements for DIS at NLO. As the
NLO corrections for most processes of interest at the LHC are reasonably small, the use of NLO pdfs in
conjunction with LO matrix elements will most often give a closer approximation of the full NLO result
(although the result remains formally LO). In many cases in which a relatively large K-factor results
from a calculation of collider processes, the primary cause is the difference between LO and NLO pdfs,
rather than the differences between LO and NLO matrix elements.
In most cases, LO pdfs will be used not in fixed order calculations, but in programs where the
LO matrix elements have been embedded in a parton shower framework. In the initial state radiation
algorithms in these frameworks, shower partons are emitted at non-zero angles with finite transverse momentum, and not with the zero kT implicit in the collinear approximation. It might be argued that the resulting kinematic suppression due to parton showering should be taken into account when deriving pdfs for explicit use in Monte Carlo programs. Indeed, there is substantial kinematic suppression for
production of a low-mass (10 GeV) object at forward rapidities due to this effect, but the suppression
becomes minimal once the mass rises to the order of 100 GeV [].
Fig. 12: A comparison of NLO predictions for SM boson rapidity distributions to LO predictions for the same, using CTEQ6.6
and CTEQ6L1 pdfs, respectively.
6.1 Modified LO PDFs
Due to the inherent differences between LO and NLO pdfs, and the relatively small differences between
LO and NLO matrix elements for processes of interest at the LHC, LO calculations at the LHC using LO
pdfs often lead to erroneous predictions. This is true not only of the normalization of the cross sections,
but also for the kinematic shapes. This can be seen for example in the predictions for the W +/− /Z
and Higgs rapidity distributions seen in Figure 12, where the wrong shapes for the vector boson rapidity
distributions result from the deficiencies of the LO DIS matrix elements used in the fit. This can have an
impact, for example, if the LO predictions are used to calculate final-state acceptances.
In an attempt to reduce the size of the errors obtained using LO pdfs with LO predictions, modified
LO pdfs have been produced. The techniques used to produce these modified pdfs include (1) relaxing
the momentum sum rule in the global fit and (2) using NLO pseudo-data in order to try to steer the
fit towards the desired NLO behavior. Both the CTEQ [] and MRST [] modified LO pdfs use the first
technique, while the CTEQ pdfs use the second technique as well.
A comparison of the full NLO and modified LO pdf predictions for W/Z and Higgs production
at the LHC is shown in Figure 13 for three different LHC center-of-mass energies. It can be seen that the
modified LO pdfs lead to substantially better agreement with the NLO predictions than observed in the
figure above which used LO pdfs.
Of course, the desired behavior can also be obtained (in most cases) by the use of NLO pdfs in the
LO calculation. Here, care must be taken that only positive-definite NLO pdfs be used.
Increasingly, most processes of interest have been included in NLO parton shower Monte Carlos.
Here, the issue of LO pdfs becomes moot, as NLO pdfs must be used in such programs for consistency
with the matrix elements.
Fig. 13: A comparison of LO, NLO and modified LO predictions for the W + and Higgs rapidity distributions at the LHC.
6.2 NLO and NNLO pdfs
The transition from NLO to NNLO results in much smaller changes to the pdfs as can be observed in
Figure ??.
have to update these figures to CT10 NNLO. also, discuss new features at NNLO?
7. PDF correlations
The uncertainty analysis may be extended to define a correlation between the uncertainties of two variables, say X(~a) and Y (~a). As for the case of PDFs, the physical concept of PDF correlations can be
determined both from PDF determinations based on the Hessian approach and on the Monte Carlo approach.
7.1 PDF correlations in the Hessian approach
Consider the projection of the tolerance hypersphere onto a circle of radius 1 in the plane of the gradients \vec{\nabla}X and \vec{\nabla}Y in the parton parameter space [?, ?]. The circle maps onto an ellipse in the XY plane. This “tolerance ellipse” is described by Lissajous-style parametric equations,

X = X_0 + \Delta X \cos\theta,     (4)
Y = Y_0 + \Delta Y \cos(\theta + \varphi),     (5)

where the parameter θ varies between 0 and 2π, X_0 \equiv X(\vec{a}_0), and Y_0 \equiv Y(\vec{a}_0). ∆X and ∆Y are the maximal variations δX ≡ X − X_0 and δY ≡ Y − Y_0 evaluated according to the Master Equation, and ϕ is the angle between \vec{\nabla}X and \vec{\nabla}Y in the \{a_i\} space, with

\cos\varphi = \frac{\vec{\nabla}X \cdot \vec{\nabla}Y}{\Delta X\, \Delta Y} = \frac{1}{4\,\Delta X\, \Delta Y} \sum_{i=1}^{N} \left( X_i^{(+)} - X_i^{(-)} \right)\left( Y_i^{(+)} - Y_i^{(-)} \right).     (6)
The quantity cos ϕ characterizes whether the PDF degrees of freedom of X and Y are correlated
(cos ϕ ≈ 1), anti-correlated (cos ϕ ≈ −1), or uncorrelated (cos ϕ ≈ 0). If units for X and Y are rescaled
so that ∆X = ∆Y (e.g., ∆X = ∆Y = 1), the semimajor axis of the tolerance ellipse is directed at an angle π/4 (or 3π/4) with respect to the ∆X axis for cos ϕ > 0 (or cos ϕ < 0). In these units, the ellipse reduces to a line for cos ϕ = ±1 and becomes a circle for cos ϕ = 0, as illustrated by Fig. 15. These properties can be found by diagonalizing the equation for the correlation ellipse. Its semiminor and semimajor axes (normalized to ∆X = ∆Y) are

\{a_{minor},\, a_{major}\} = \frac{\sin\varphi}{\sqrt{1 \pm \cos\varphi}}.     (7)

The eccentricity \epsilon \equiv \sqrt{1 - (a_{minor}/a_{major})^2} is therefore approximately equal to \sqrt{|\cos\varphi|} as |\cos\varphi| \to 1. In these rescaled units the correlation ellipse itself is described by

\left(\frac{\delta X}{\Delta X}\right)^2 + \left(\frac{\delta Y}{\Delta Y}\right)^2 - 2\,\frac{\delta X}{\Delta X}\,\frac{\delta Y}{\Delta Y}\,\cos\varphi = \sin^2\varphi.     (8)

Fig. 14: A comparison of LO, NLO and NNLO up quark and gluon pdfs.
A magnitude of | cos ϕ| close to unity suggests that a precise measurement of X (constraining δX
to be along the dashed line in Fig. 15) is likely to constrain tangibly the uncertainty δY in Y , as the value
of Y shall lie within the needle-shaped error ellipse. Conversely, cos ϕ ≈ 0 implies that the measurement
of X is not likely to constrain δY strongly.1
The values of ∆X, ∆Y, and cos ϕ are also sufficient to estimate the PDF uncertainty of any
function f (X, Y ) of X and Y by relating the gradient of f (X, Y ) to ∂X f ≡ ∂f /∂X and ∂Y f ≡ ∂f /∂Y
via the chain rule:
\Delta f = |\vec{\nabla} f| = \sqrt{(\Delta X\, \partial_X f)^2 + 2\,\Delta X\, \Delta Y \cos\varphi\; \partial_X f\, \partial_Y f + (\Delta Y\, \partial_Y f)^2}.     (9)

¹The allowed range of δY/∆Y for a given δ ≡ δX/∆X is r_Y^{(−)} ≤ δY/∆Y ≤ r_Y^{(+)}, where r_Y^{(±)} ≡ δ cos ϕ ± \sqrt{1 − δ^2}\, sin ϕ.
Fig. 15: Correlations ellipses for a strong correlation (left), no correlation (center) and a strong anti-correlation(right) [?].
Of particular interest is the case of a rational function f (X, Y ) = X m /Y n , pertinent to computations
of various cross section ratios, cross section asymmetries, and statistical significance for finding signal
events over background processes [?]. For rational functions Eq. (9) takes the form
\frac{\Delta f}{f_0} = \sqrt{\left(m\,\frac{\Delta X}{X_0}\right)^2 - 2mn\,\frac{\Delta X\,\Delta Y}{X_0\,Y_0}\cos\varphi + \left(n\,\frac{\Delta Y}{Y_0}\right)^2}.     (10)
For example, consider a simple ratio, f = X/Y . Then ∆f /f0 is suppressed (∆f /f0 ≈ |∆X/X0 − ∆Y /Y0 |)
if X and Y are strongly correlated, and it is enhanced (∆f /f0 ≈ ∆X/X0 + ∆Y /Y0 ) if X and Y are
strongly anticorrelated.
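In code, Eq. (10) amounts to a one-line combination of the two fractional uncertainties and the correlation cosine; the numbers below are purely illustrative.

    import math

    def ratio_pdf_uncertainty(m, n, X0, dX, Y0, dY, cos_phi):
        # Eq. (10): fractional pdf uncertainty of f = X^m / Y^n.
        rx, ry = m * dX / X0, n * dY / Y0
        return math.sqrt(rx**2 - 2.0 * rx * ry * cos_phi + ry**2)

    # Illustrative only: an anti-correlated pair (cos_phi < 0) gives a ratio uncertainty
    # larger than either individual uncertainty, a correlated pair a smaller one.
    print(ratio_pdf_uncertainty(1, 1, X0=1.0, dX=0.04, Y0=1.0, dY=0.035, cos_phi=-0.6))
    print(ratio_pdf_uncertainty(1, 1, X0=1.0, dX=0.04, Y0=1.0, dY=0.035, cos_phi=+0.9))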
As would be true for any estimate provided by the Hessian method, the correlation angle is inherently approximate. Eq. (6) is derived under a number of simplifying assumptions, notably in the
quadratic approximation for the χ2 function within the tolerance hypersphere, and by using a symmetric finite-difference formula for {∂i X} that may fail if X is not monotonic. With these limitations in
mind, we find the correlation angle to be a convenient measure of interdependence between quantities of
diverse nature, such as physical cross sections and parton distributions themselves.
We can calculate the correlations between two pdfs, fa1 (x1 , µ1 ) and fa2 (x2 , µ2 ) at a scale µ1 =
µ2 =85 GeV. In the figure below, we show self-correlations for the up quark (left) and the gluon (right).
Light (dark) shades of gray correspond to cosφ close to 1 (-1). Each self-correlation includes a trivial
correlation (cosφ = 1) when x1 and x2 are approximately the same (along the x1 = x2 diagonals). For
the up quark, this trivial correlation is the only pattern present. The gluon distribution, however, also
shows a strong anti-correlation when one of the x values is large and the other small. This arises as a
consequence of the momentum sum rule. A fairly complete set of correlation patterns, connected with
the momentum sum rule, perturbative evolution, and constraints from experimental data, can be found in
Ref. [].
In Chapter , the correlations for certain benchmark cross sections are given with respect to that
for Z production. As expected, the W + and W − cross sections are very correlated with that for the
Z, while the Higgs cross sections are uncorrelated (mHiggs =120 GeV) or anti-correlated (mHiggs =240
GeV). Thus, the PDF uncertainty for the ratio of the cross section for a 240 GeV Higgs boson to that of
the cross section for Z boson production is larger than the PDF uncertainty for Higgs boson production
by itself. Correlations among various physics processes, especially those important for Higgs production,
are discussed in Chapter ??.
A simple C code (corr.C) is available from the PDF4LHC website that calculates the correlation
cosine between any two observables given two text files that present the cross sections for each observable
as a function of the error PDFs.
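The same calculation is easy to reproduce directly; the NumPy sketch below evaluates Eq. (6) from two lists of predictions over the “+” and “−” error pdfs (an independent illustration of what the corr.C program does, not the PDF4LHC code itself).

    import numpy as np

    def symmetric_delta(Xp, Xm):
        # Symmetric Hessian uncertainty: Delta X = (1/2) sqrt( sum_i (X_i^+ - X_i^-)^2 ).
        Xp, Xm = np.asarray(Xp), np.asarray(Xm)
        return 0.5 * np.sqrt(np.sum((Xp - Xm) ** 2))

    def correlation_cosine(Xp, Xm, Yp, Ym):
        # Eq. (6): cos(phi) between two observables evaluated over the same error pdf set.
        Xp, Xm, Yp, Ym = map(np.asarray, (Xp, Xm, Yp, Ym))
        dX, dY = symmetric_delta(Xp, Xm), symmetric_delta(Yp, Ym)
        return np.sum((Xp - Xm) * (Yp - Ym)) / (4.0 * dX * dY)

    # Toy input: two observables on 3 eigenvector pairs (illustrative numbers only).
    cos_phi = correlation_cosine([10.2, 9.9, 10.4], [9.8, 10.1, 9.6],
                                 [5.1, 4.9, 5.2], [4.9, 5.1, 4.8])
    print(cos_phi)   # close to +1: the two observables probe the same pdf directions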
Fig. 16: Contour plots of the correlation cosine between two CTEQ6.6 pdfs at Q = 85 GeV, for the up quark (left) and the gluon (right).
7.2 Correlations within the Monte Carlo approach
General correlations between PDFs and physical observables can be computed within the Monte Carlo
approach used by NNPDF using standard textbook methods. To illustrate this point, let us compute the correlation coefficient ρ[A, B] for two observables A and B which depend on PDFs (or are PDFs
themselves). This correlation coefficient in the Monte Carlo approach is given by
\rho[A,B] = \frac{N_{rep}}{(N_{rep}-1)}\, \frac{\langle A B\rangle_{rep} - \langle A\rangle_{rep}\,\langle B\rangle_{rep}}{\sigma_A\,\sigma_B},     (11)
where the averages are taken over the ensemble of the Nrep values of the observables computed with the different replicas in the NNPDF2.0 set, and σA,B are the standard deviations of the ensembles. The
quantity ρ characterizes whether two observables (or PDFs) are correlated (ρ ≈ 1), anti-correlated (ρ ≈
−1) or uncorrelated (ρ ≈ 0).
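A direct transcription of Eq. (11) is shown below, with randomly generated toy replicas standing in for actual NNPDF output.

    import numpy as np

    def replica_correlation(A, B):
        # Eq. (11): correlation of two observables over N_rep Monte Carlo replicas.
        A, B = np.asarray(A, float), np.asarray(B, float)
        n = len(A)
        cov = (np.mean(A * B) - np.mean(A) * np.mean(B)) * n / (n - 1)
        return cov / (np.std(A, ddof=1) * np.std(B, ddof=1))

    # Toy replicas: B is constructed to be partially correlated with A (illustration only).
    rng = np.random.default_rng(1)
    A = rng.normal(1.0, 0.05, 100)              # e.g. a pdf value over 100 replicas
    B = 0.7 * A + rng.normal(0.0, 0.03, 100)    # e.g. a cross section over the same replicas
    print(replica_correlation(A, B))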
This correlation can be generalized to other cases, for example to compute the correlation between
PDFs and the value of the strong coupling αs (mZ ), as studied in Ref. [?, ?], for any given values of
x and Q2 . For example, the correlation between the strong coupling and the gluon at x and Q2 (or in
general any other PDF) is defined as the usual correlation between two probability distributions, namely
(equation to be supplied later)
where averages over replicas include PDF sets with varying αs in the sense of Eq. (??). Note that
the computation of this correlation takes into account not only the central gluons of the fits with different
αs but also the corresponding uncertainties in each case.
8. LHAPDF and other tools
8.1 LHAPDF and Durham PDF plotter
Libraries such as PDFLIB [?] have been established that maintain a large collection of available pdfs.
However, PDFLIB is no longer supported, making easy access to the most up-to-date pdfs more difficult. In addition, the determination of the pdf uncertainty of any cross section typically involves the use of a large number of pdfs (on the order of 30-100), and PDFLIB is not set up for easy access to such a large number of pdfs.
Fig. 17: Screen capture of the Durham pdf plotter website.
At Les Houches in 2001, representatives from a number of pdf groups were present and an interface (Les Houches Accord 2, or LHAPDF) [?] that allows the compact storage of the information needed
to define a pdf was defined. Each pdf can be determined either from a grid in x and Q2 or by a few lines
of information (basically the starting values of the parameters at Q = Qo ) and the interface carries out
the evolution to any x and Q value, at either LO or NLO as appropriate for each pdf.
The interface is as easy to use as PDFLIB and consists essentially of 3 subroutine calls:
• call Initpdfset(name): called once at the beginning of the code; name is the file name of the external
pdf file that defines the pdf set (for example, CTEQ, GKK [?] or MRST)
• call Initpdf(mem): mem specifies the individual member of the pdf set
• call evolvepdf(x,Q,f): returns the pdf momentum densities for flavour f at a momentum fraction x
and scale Q
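For orientation, an equivalent call sequence is sketched below using the Python bindings provided by later LHAPDF versions (LHAPDF 6); the set name is only an example, and the exact sets available depend on the local installation.

    import lhapdf

    # Equivalent of Initpdfset/Initpdf: load one member of a set by name and member index.
    pdf = lhapdf.mkPDF("CT10", 0)          # member 0 is the central fit, 1..2N the error pdfs

    # Equivalent of evolvepdf: x*f(x,Q) for a given flavour (PDG id), x and scale Q.
    x, Q = 0.01, 100.0
    xg = pdf.xfxQ(21, x, Q)                # gluon momentum density x*g(x,Q)
    xu = pdf.xfxQ(2, x, Q)                 # up quark

    # All members of a set can be loaded at once, e.g. for loops over the error pdfs.
    members = lhapdf.mkPDFs("CT10")
    print(len(members), xg, xu)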
Responsibility for LHAPDF has been taken over by the Durham HEPDATA project [?] and regular
updates/improvements have been produced. Interfaces with LHAPDF are now included in most matrix
element programs. Recent modifications make it possible to include all error pdfs in memory at the same
time. Such a possibility reduces the amount of time needed for pdf error calculations on any observable.
The matrix element result can be calculated once using the central pdf and the relative (pdf)×(pdf)
parton-parton luminosity can be calculated for each of the error pdfs (or the values of x1 ,x2 , the flavour
of partons 1 and 2 and the value of Q2 can be stored). Such a pdf re-weighting has been shown to work
both for exact matrix element calculations as well as for matrix element+parton shower calculations.
A new routine LHAGLUE [?] provides an interface from PDFLIB to LHAPDF making it possible
to use the PDFLIB subroutine calls that may be present in older programs.
Also, extremely useful is the Durham pdf plotter (http://hepdata.cedar.ac.uk/pdf/pdf3.html) which
allows the fast plotting/comparisons of pdfs, including their error bands. All of the pdf plots in this book
were made with the Durham plotter.
8.2 PDF re-weighting, Applgrid and fastNLO
NLO and NNLO programs are notoriously slow. Thus, it can be very time-consuming to generate a
higher order cross section with one pdf, and then have to re-run the program as well for the 2n (where
n is the number of pdf eigenvectors) error pdfs. Such a step is in fact unnecessary, and most programs
have the ability to use pdf re-weighting to substitute a new pdf for the pdf used in the original generation,
using the re-weighting function shown in the equation below.
put in pdf re-weighting equation
The pdf error weights can either be stored at the time of generation, as discussed above in Section 4.3 (Sudakov form factors), or can be generated on the fly by the program.
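The re-weighting itself reduces to a ratio of pdf luminosities evaluated at the stored event kinematics. The sketch below is a generic illustration using the LHAPDF 6 Python interface, not the internal implementation of any particular generator; the set name is again only an example.

    import lhapdf

    def pdf_reweight_factor(new_pdf, gen_pdf, flav1, flav2, x1, x2, Q):
        # Event weight for substituting new_pdf for the pdf used in the generation:
        # w = f_new(x1) f_new(x2) / [ f_gen(x1) f_gen(x2) ] at the stored x1, x2, flavours and Q.
        num = new_pdf.xfxQ(flav1, x1, Q) * new_pdf.xfxQ(flav2, x2, Q)
        den = gen_pdf.xfxQ(flav1, x1, Q) * gen_pdf.xfxQ(flav2, x2, Q)
        return num / den

    members = lhapdf.mkPDFs("CT10")        # central pdf plus its error members
    central, errors = members[0], members[1:]

    # One stored event: flavours (PDG ids), momentum fractions and scale from the generation.
    flav1, flav2, x1, x2, Q = 21, 2, 0.02, 0.15, 91.2
    weights = [pdf_reweight_factor(m, central, flav1, flav2, x1, x2, Q) for m in errors]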
Since the pdf-dependent information in a QCD calculation can be factorized from the rest of the
hard scattering terms, it is possible to calculate the non-pdf terms one time and then to store the pdf
information on a grid (in terms of the pdf x values and their µr and µf dependence). This allows
for the fast calculation of any hard scattering cross section and the a posteriori inclusion of pdf’s and
the strong coupling constant αs in higher order QCD calculations. The technique also allows the a posteriori variation of the renormalization and factorization scales. This is the working principle of the
two programs fastNLO [] and Applgrid []. These programs are increasingly used for calculation of the
NLO matrix elements used in pdf fits.
more elaboration? relate to what is done with the B + S tuples?
9. PDF luminosities
needs to be updated using CT10 pdfs, and lhc energies of 7, 8 TeV, as well as 13.5 TeV.
It is useful to introduce the idea of differential parton-parton luminosities. Such luminosities,
when multiplied by the dimensionless cross section ŝσ̂ for a given process, provide a useful estimate of
the size of an event cross section at the LHC. Below we define the differential parton-parton luminosity
dLij /dŝ dy and its integral dLij /dŝ:
\frac{dL_{ij}}{d\hat{s}\,dy} = \frac{1}{s}\, \frac{1}{1+\delta_{ij}} \left[ f_i(x_1,\mu)\, f_j(x_2,\mu) + (1 \leftrightarrow 2) \right].     (12)
The prefactor with the Kronecker delta avoids double-counting in case the partons are identical. The
generic parton-model formula

\sigma = \sum_{i,j} \int_0^1 dx_1\, dx_2\, f_i(x_1,\mu)\, f_j(x_2,\mu)\, \hat{\sigma}_{ij}     (13)

can then be written as

\sigma = \sum_{i,j} \int \frac{d\hat{s}}{\hat{s}}\, dy \left( \frac{dL_{ij}}{d\hat{s}\, dy} \right) \left( \hat{s}\, \hat{\sigma}_{ij} \right).     (14)
(Note that this result is easily derived by defining τ = x1 x2 = ŝ/s and observing that the Jacobian
∂(τ, y)/∂(x1 , x2 ) = 1.)
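Eq. (12) translates directly into a few lines of code; the sketch below evaluates the gg luminosity at a chosen ŝ and y with an LHAPDF set, assuming the scale choice µ = √ŝ (the set name and the kinematic point are illustrative).

    import math
    import lhapdf

    def dlum_dshat_dy(pdf, i, j, shat, y, s):
        # Differential parton-parton luminosity of Eq. (12) for flavours i, j (PDG ids).
        tau = shat / s
        x1, x2 = math.sqrt(tau) * math.exp(y), math.sqrt(tau) * math.exp(-y)
        if x1 >= 1.0 or x2 >= 1.0:
            return 0.0
        mu = math.sqrt(shat)                           # scale choice mu = sqrt(shat)
        f = lambda pid, x: pdf.xfxQ(pid, x, mu) / x    # convert x*f(x,mu) to f(x,mu)
        sym = 0.5 if i == j else 1.0                   # 1/(1 + delta_ij)
        return sym * (f(i, x1) * f(j, x2) + f(j, x1) * f(i, x2)) / s

    pdf = lhapdf.mkPDF("CT10", 0)                      # set name is an example
    s = 14000.0**2                                     # LHC at 14 TeV, in GeV^2
    print(dlum_dshat_dy(pdf, 21, 21, shat=200.0**2, y=0.0, s=s))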
Figure 18 shows a plot of the luminosity function integrated over rapidity, dL_{ij}/dŝ = ∫ (dL_{ij}/dŝ dy) dy, at the LHC (√s = 14 TeV) for various parton flavour combinations, calculated using the CTEQ6.1 parton distribution functions [?]. The widths of the curves indicate an estimate for the pdf uncertainties. We assume µ = √ŝ for the scale. As expected, the gg luminosity is large at low √ŝ but falls rapidly with respect to the other parton luminosities. The gq luminosity is large over the entire kinematic region plotted.
One can further specify the parton-parton luminosity for a specific rapidity y and ŝ, dLij /dŝ dy.
If one is interested in a specific partonic initial state, then the resulting differential luminosity can be
displayed in families of curves as shown in Figure 19, where the differential parton-parton luminosity at the LHC is shown as a function of the subprocess centre-of-mass energy √ŝ at various values of rapidity for the produced system for several different combinations of initial state partons. One can read from the curves the parton-parton luminosity for a specific value of mass fraction and rapidity. (It is also easy to use the Durham pdf plotter to generate the pdf curve for any desired flavour and kinematic configuration².)
²http://durpdg.dur.ac.uk/hepdata/pdf3.html
Fig. 18: The parton-parton luminosity [dL_{ij}/dτ] in picobarns, integrated over y. Green = gg, Blue = Σ_i (g q_i + g q̄_i + q_i g + q̄_i g), Red = Σ_i (q_i q̄_i + q̄_i q_i), where the sum runs over the five quark flavours d, u, s, c, b.
It is also of great interest to understand the uncertainty in the parton-parton luminosity for specific
kinematic configurations. Some representative parton-parton luminosity uncertainties, integrated over
rapidity, are shown in Figures 20, 21 and 22. The pdf uncertainties were generated from the CTEQ6.1
Hessian error analysis using the standard ∆χ2 = 100 criterion. Except for kinematic regions where
one or both partons is a gluon at high x, the pdf uncertainties are of the order of 5–10%. Even tighter
constraints will be possible once the LHC Standard Model data is included in the global pdf fits. Again,
the uncertainties for individual pdfs can also be calculated online using the Durham pdf plotter. Often
it is not the pdf uncertainty for a cross section that is required, but rather the pdf uncertainty for an
acceptance for a given final state. The acceptance for a particular process may depend on the input
pdfs due to the rapidity cuts placed on the jets, leptons, photons, etc. and the impacts of the varying
longitudinal boosts of the final state caused by the different pdf pairs. An approximate “rule-of-thumb”
is that the pdf uncertainty for the acceptance is a factor of 5–10 times smaller than the uncertainty for the
cross section itself.
In Figure 23, the pdf luminosity curves shown in Figure 18 are overlaid with equivalent luminosity
curves from the Tevatron. In Figure 24, the ratios of the pdf luminosities at the LHC to those at the
Tevatron are plotted. The most dramatic increase in pdf luminosity at the LHC comes from gg initial
states, followed by gq initial states and then q q̄ initial states. The latter ratio is smallest because of
the availability of valence antiquarks at the Tevatron at moderate to large x. As an example, consider chargino pair production with √ŝ = 0.4 TeV. This process proceeds through qq̄ annihilation; thus, there is only a factor of 10 enhancement at the LHC compared to the Tevatron.
Backgrounds to interesting physics at the LHC proceed mostly through gg and gq initial states.
Thus, there will be a commensurate increase in the rate for background processes at the LHC.
Fig. 19: d(Luminosity)/dy at rapidities (right to left) y = 0, 2, 4, 6. Green = gg, Blue = Σ_i (g q_i + g q̄_i + q_i g + q̄_i g), Red = Σ_i (q_i q̄_i + q̄_i q_i), where the sum runs over the five quark flavours d, u, s, c, b.

Fig. 20: Fractional uncertainty of the gg luminosity integrated over y.

Fig. 21: Fractional uncertainty for the parton-parton luminosity integrated over y for Σ_i (g q_i + g q̄_i + q_i g + q̄_i g), where the sum runs over the five quark flavours d, u, s, c, b.

Fig. 22: Fractional uncertainty for the luminosity integrated over y for Σ_i (q_i q̄_i + q̄_i q_i), where the sum runs over the five quark flavours d, u, s, c, b.
Fig. 23: The parton-parton luminosity [(1/ŝ) dL_{ij}/dτ] in pb integrated over y. Green = gg, Blue = Σ_i (g q_i + g q̄_i + q_i g + q̄_i g), Red = Σ_i (q_i q̄_i + q̄_i q_i), where the sum runs over the five quark flavours d, u, s, c, b. The top family of curves are for the LHC and the bottom for the Tevatron.
Fig. 24: The ratio of parton-parton luminosities [(1/ŝ) dL_{ij}/dτ] in pb integrated over y at the LHC and Tevatron. Green = gg (top), Blue = Σ_i (g q_i + g q̄_i + q_i g + q̄_i g) (middle), Red = Σ_i (q_i q̄_i + q̄_i q_i) (bottom), where the sum runs over the five quark flavours d, u, s, c, b.