Variation, jumps, market frictions and
high frequency data in financial econometrics∗
Ole E. Barndorff-Nielsen
The T.N. Thiele Centre for Mathematics in Natural Science,
Department of Mathematical Sciences,
University of Aarhus, Ny Munkegade, DK-8000 Aarhus C, Denmark
[email protected]
Neil Shephard
Nuffield College, Oxford OX1 1NF, UK
[email protected]
July 14, 2005
1 Introduction
We will review the econometrics of non-parametric estimation of the components of the variation
of asset prices. This very active literature has been stimulated by the recent advent of complete
records of transaction prices, quote data and order books. In our view the interaction of the
new data sources with new econometric methodology is leading to a paradigm shift in one of
the most important areas in econometrics: volatility measurement, modelling and forecasting.
We will describe this new paradigm which draws together econometrics with arbitrage free
financial economics theory. Perhaps the two most influential papers in this area have been
Andersen, Bollerslev, Diebold, and Labys (2001) and Barndorff-Nielsen and Shephard (2002),
but many other papers have made important contributions. This work is likely to have deep
impacts on the econometrics of asset allocation and risk management. One of our observations
will be that inferences based on these methods, computed from observed market prices and so
under the physical measure, are also valid as inferences under all equivalent measures. This
also places the subject at the heart of the econometrics of derivative pricing.
∗ Prepared for the invited symposium on Financial Econometrics, 9th World Congress of the Econometric
Society, London, 20th August 2005. We are grateful to Tim Bollerslev, Eric Ghysels, Peter Hansen, Jean Jacod,
Dmitry Kulikov, Jeremy Large, Asger Lunde, Andrew Patton, Mark Podolskij, Kevin Sheppard and Jun Yu for
comments on an earlier draft. Talks based on this paper were also given in 2005 as the Hiemstra Lecture at the
13th Annual conference of the Society of Non-linear Dynamics and Econometrics in London, the keynote address
at the 3rd Nordic Econometric Meeting in Helsinki and as a Special Invited Lecture at the 25th European Meeting
of Statisticians in Oslo. Ole Barndorff-Nielsen’s work is supported by CAF (www.caf.dk), which is funded by the
Danish Social Science Research Council. Neil Shephard’s research is supported by the UK’s ESRC through the
grant “High frequency financial econometrics based upon power variation.”
One of the most challenging problems in this context is dealing with various forms of market
frictions, which obscure the efficient price from the econometrician. Here we will characterise four
types of statistical models of frictions and discuss how econometricians have been attempting
to overcome them.
In section 2 we will set out the basis of the econometrics of arbitrage-free price processes,
focusing on the centrality of quadratic variation. In section 3 we will discuss central limit
theorems for estimators of the QV process, while in section 4 the role of jumps in QV will be
highlighted, with bipower and multipower variation being used to identify them and to test
the hypothesis that there are no jumps in the price process. In section 5 we write about the
econometrics of market frictions, while in section 6 we conclude.
2 Arbitrage-free, frictionless price processes

2.1 Semimartingales and quadratic variation
Given a complete record of transaction or quote prices it is natural to model prices in continuous time (e.g. Engle (2000)). This matches with the vast continuous time financial economic
arbitrage-free theory based on a frictionless market. In this section and the next, we will discuss how to make inferences on the degree of variation in such frictionless worlds. Section 5
will extend this by characterising the types of frictions seen in practice and discuss strategies
econometricians have been using to overcome these difficulties.
In its most general case the fundamental theory of asset prices says that a vector of log-prices
at time t,

Y_t = (Y_t^1, ..., Y_t^p)′,

must obey a semimartingale process (written Y ∈ SM) on some filtered probability space
(Ω, F, (F_t)_{t≥0}, P) in a frictionless market. The semimartingale is defined as a process
which can be written as

Y = A + M,    (1)
where A is a local finite variation process (A ∈ FV_loc) and M is a local martingale (M ∈ M_loc).
Compact introductions to the economics and mathematics of semimartingales are given in Back
(1991) and Protter (2004), respectively.
The Y process can exhibit jumps. It is tempting to decompose Y = Y^ct + Y^d, where Y^ct
and Y^d are the purely continuous and discontinuous sample path components of Y. However,
technically this definition is not clear as the jumps of the Y process can be so active that they
cannot be summed up. Thus we will define

Y^ct = A^c + M^c,

where M^c is the continuous part of the local martingale component of Y and A^c is A minus the
sum of its jumps¹. Likewise, the continuous sample path subsets of classes of processes such as
SM and M will be denoted by SM^c and M^c.
Crucial to semimartingales, and to the economics of financial risk, is the quadratic variation
(QV) process of (Y′, X′)′ ∈ SM. This can be defined as

[Y, X]_t = p-lim_{n→∞} Σ_{t_j ≤ t} (Y_{t_j} − Y_{t_{j−1}}) (X_{t_j} − X_{t_{j−1}})′,    (2)

(e.g. Protter (2004, p. 66–77)) for any deterministic sequence² of partitions 0 = t_0 < t_1 < ... <
t_n = T with sup_j {t_{j+1} − t_j} → 0 for n → ∞. The convergence is also locally uniform in time.
It can be shown that this probability limit exists for all semimartingales.
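As an illustration of the limit in (2), the following simulation sketch (all parameter values our own, purely illustrative) uses scaled Brownian motion Y_t = σW_t, for which [Y]_t = σ²t is known in closed form, and checks that the discretised sum approaches it on a fine partition:

```python
import numpy as np

def realised_qv(returns):
    """Sum of outer products of returns over a partition, as in (2);
    `returns` has shape (n, p) and the result is the p x p matrix."""
    return returns.T @ returns

# Scaled Brownian motion Y_t = sigma * W_t, for which [Y]_t = sigma^2 * t.
rng = np.random.default_rng(0)
sigma, t, n = 0.2, 1.0, 100_000
returns = sigma * np.sqrt(t / n) * rng.standard_normal((n, 1))
rqv = realised_qv(returns)[0, 0]   # close to sigma^2 * t = 0.04 for large n
```

With n this large the sum is within a fraction of a percent of the theoretical QV; coarser partitions give noisier values, a point taken up in section 3.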
Throughout we employ the notation that

[Y]_t = [Y, Y]_t,

while we will sometimes refer to √([Y^l]_t) as the quadratic volatility (QVol) process for Y^l. It is
well known that³

[Y] = [Y^ct] + [Y^d],  where  [Y^d]_t = Σ_{0≤u≤t} ΔY_u ΔY_u′,    (3)

with ΔY_t = Y_t − Y_{t−} the jumps in Y, noting that [A^ct] = 0. In the probability literature
QV is usually defined in a different, but equivalent, manner (see, for example, Protter (2004, p.
66))
[Y]_t = Y_t Y_t′ − 2 ∫_0^t Y_{u−} dY_u′.    (4)

2.2 Brownian semimartingales
In economics the most familiar semimartingale is the Brownian semimartingale (Y ∈ BSM)

Y_t = ∫_0^t a_u du + ∫_0^t σ_u dW_u,    (5)
¹ It is tempting to use the notation Y^c for Y^ct, but in the probability literature if Y ∈ SM then Y^c = M^c, so
Y^c ignores A^c.
² The assumption that the times are deterministic can be relaxed to allow them to be any Riemann sequence
of adapted subdivisions. This is discussed in, for example, Jacod and Shiryaev (2003, p. 51). Economically this
is important for it means that we can also think of the limiting argument as the result of a joint process of Y and
a counting process N whose arrival times are the t_j. So long as Y and N are adapted to at least their bivariate
natural filtration, the limiting argument holds as the intensity of N increases off to infinity with n.
³ Although the sum of jumps of Y does not exist in general when Y ∈ SM, the sum of outer products of the
jumps always does exist. Hence [Y^d] can be properly defined.
where a is a vector of predictable drifts, σ is a matrix volatility process whose elements are
càdlàg and W is a vector Brownian motion. The stochastic integral σ•W_t, where f•g_t is generic
notation for the process ∫_0^t f_u dg_u, is said to be a stochastic volatility process (σ•W ∈ SV) —
e.g. the reviews in Ghysels, Harvey, and Renault (1996) and Shephard (2005). This vector
process has elements which are M^c_loc. Doob (1953) showed that all continuous local martingales
with absolutely continuous quadratic variation can be written in the form of an SV process (see
Karatzas and Shreve (1991, p. 170–172))⁴. The drift ∫_0^t a_u du has elements which are absolutely
continuous — an assumption which looks ad hoc; however, arbitrage freeness plus the SV model
implies this property must hold (Karatzas and Shreve (1998, p. 3) and Andersen, Bollerslev,
Diebold, and Labys (2003, p. 583)). Hence Y ∈ BSM is a rather canonical model in the finance
theory of continuous sample path processes. Its use is bolstered by the fact that Ito calculus
for continuous sample path processes is relatively simple.
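A process of the form (5) is easy to simulate by Euler discretisation. The sketch below uses an illustrative log-AR(1) spot volatility (our own assumption for the demonstration, not a model from this paper) and checks that the realised QV of the simulated path tracks the discretised integrated variance:

```python
import numpy as np

def simulate_bsm(n, dt, a=0.0, rho=0.999, omega=2.0, seed=1):
    """Euler sketch of (5): returns over intervals of length dt, with an
    illustrative log-AR(1) spot volatility (an assumption for the demo)."""
    rng = np.random.default_rng(seed)
    mean_log_sigma = np.log(0.2)
    log_sigma = np.full(n, mean_log_sigma)
    for i in range(1, n):
        log_sigma[i] = (rho * log_sigma[i - 1] + (1 - rho) * mean_log_sigma
                        + omega * np.sqrt(dt) * rng.standard_normal())
    sigma = np.exp(log_sigma)
    returns = a * dt + sigma * np.sqrt(dt) * rng.standard_normal(n)
    return returns, sigma

n = 50_000
returns, sigma = simulate_bsm(n=n, dt=1.0 / n)
rqv = np.sum(returns**2)     # realised QV of the simulated path
iv = np.sum(sigma**2) / n    # discretised integrated variance int sigma_u^2 du
# rqv approximates iv, consistent with [Y]_t being the integrated (co)variance
```

The two quantities agree to within about a percent here, which is exactly the identification of [Y]_t with ∫ Σ_u du stated next.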
If Y ∈ BSM then

[Y]_t = ∫_0^t Σ_u du,  where  Σ_t = σ_t σ_t′,

the integrated covariance process, while

dY_t | F_t ∼ N(a_t dt, Σ_t dt),    (6)

where F_t is the natural filtration — that is, the information from the entire sample path of Y up
to time t. Thus a_t dt and Σ_t dt have clear interpretations as the infinitesimal predictive mean
and covariance of asset returns. This implies that A_t = ∫_0^t E(dY_u | F_u) du while, centrally to our
interests,

d[Y]_t = Cov(dY_t | F_t)  and  [Y]_t = ∫_0^t Cov(dY_u | F_u) du.

Thus A and [Y] are the integrated infinitesimal predictive mean and covariance of the asset
prices, respectively.
2.3 Jump processes

There is no plausible economic theory which says that prices must follow continuous sample path
processes. Indeed we will see later that statistically it is rather easy to reject this hypothesis even
for price processes drawn from very thickly traded markets. In this paper we will add a finite
activity jump process (this means there are a finite number of jumps in a fixed time interval)
J_t = Σ_{j=1}^{N_t} C_j, adapted to the filtration generated by Y, to the Brownian semimartingale model.

⁴ An example of a continuous local martingale which has no SV representation is a time-changed Brownian
motion where the time-change takes the form of the so-called "devil's staircase," which is continuous and non-decreasing
but not absolutely continuous (see, for example, Munroe (1953, Section 27)). This relates to the work
of, for example, Calvet and Fisher (2002) on multifractals.
This yields

Y_t = ∫_0^t a_u du + ∫_0^t σ_u dW_u + Σ_{j=1}^{N_t} C_j.    (7)

Here N is a simple counting process and the C are the associated non-zero jumps (which we
assume have a covariance) which happen at times 0 = τ_0 < τ_1 < τ_2 < ... . It is helpful to
decompose J into J = J^A + J^M, where, assuming J has an absolutely continuous intensity,
J^A_t = ∫_0^t c_u du, and c_t = E(dJ_t | F_t). Then J^M is the compensated jump process, so J^M ∈ M,
while J^A ∈ FV^ct_loc. Thus Y has the decomposition as in (1), with A_t = ∫_0^t (a_u + c_u) du and

M_t = ∫_0^t σ_u dW_u + Σ_{j=1}^{N_t} C_j − ∫_0^t c_u du.

It is easy to see that [Y^d]_t = Σ_{j=1}^{N_t} C_j C_j′ and so

[Y]_t = ∫_0^t Σ_u du + Σ_{j=1}^{N_t} C_j C_j′.
Again we note that E(dY_t | F_t) = (a_t + c_t) dt, but now,

Cov(σ_t dW_t, dJ_t | F_t) = 0,    (8)

so

Cov(dY_t | F_t) = Σ_t dt + Cov(dJ_t | F_t) ≠ d[Y]_t.
This means that the QV process aggregates the components of the variation of prices and so is
not sufficient to learn the integrated covariance process.
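A simulation sketch of (7) (zero drift, constant σ, Poisson jumps; all values our own illustrative choices) shows the realised QV picking up both components, as in the displayed decomposition of [Y]_t:

```python
import numpy as np

rng = np.random.default_rng(2)
n, t, sigma, lam = 100_000, 1.0, 0.2, 5.0   # lam: jump intensity per unit time
dt = t / n
diffusion = sigma * np.sqrt(dt) * rng.standard_normal(n)
# Finite-activity jumps: roughly Poisson(lam * t) jumps with N(0, 0.1^2) sizes C_j
jump_flags = rng.random(n) < lam * dt
jumps = np.where(jump_flags, 0.1 * rng.standard_normal(n), 0.0)
returns = diffusion + jumps

rqv = np.sum(returns**2)          # realised QV of Y
sum_sq_jumps = np.sum(jumps**2)   # sum of C_j^2
# rqv ~= sigma^2 * t + sum_sq_jumps: QV aggregates the two sources of variation
```

The realised QV lands close to σ²t plus the realised sum of squared jumps, so on its own it cannot separate the two; that is the task bipower variation addresses next.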
To identify the components of the QV process we can use the bipower variation (BPV)
process introduced by Barndorff-Nielsen and Shephard (2006). So long as it exists, the p × p
matrix BPV process {Y} has l, k-th element

{Y^l, Y^k} = (1/4) ({Y^l + Y^k} − {Y^l − Y^k}),  l, k = 1, 2, ..., p,    (9)

where, so long as the limit exists and the convergence is locally uniform in t,⁵

{Y^l}_t = p-lim_{δ↓0} Σ_{j=1}^{⌊t/δ⌋} |Y^l_{δ(j−1)} − Y^l_{δ(j−2)}| |Y^l_{δj} − Y^l_{δ(j−1)}|.    (10)
⁵ In order to simplify some of the later results we consistently ignore end effects in variation statistics. This can
be justified in two ways: either by (a) setting Y_t = 0 for t < 0, or (b) letting Y start being a semimartingale, at zero,
at time C < 0 rather than at time 0. The latter seems realistic when dealing with markets open 24 hours a day, borrowing
returns from small periods of the previous day. It means that there is a modest degree of wash over from one
day's variation statistics into the next day. There seems little econometric reason why this should be a worry.
Assumption (b) can also be used in equity markets when combined with some form of stochastic imputation,
adding in artificial simulated returns for the missing period — see the related comments in Barndorff-Nielsen and
Shephard (2002).
Here ⌊x⌋ is the floor function, which is the largest integer less than or equal to x. Combining
the results in Barndorff-Nielsen and Shephard (2006) and Barndorff-Nielsen, Graversen, Jacod,
Podolskij, and Shephard (2005), if Y is of the form (7) then, without any additional assumptions,

µ_1^{−2} {Y}_t = ∫_0^t Σ_u du,

where µ_r = E|U|^r, U ∼ N(0, 1) and r > 0, which means that

[Y]_t − µ_1^{−2} {Y}_t = Σ_{j=1}^{N_t} C_j C_j′.
At first sight the robustness of BPV looks rather magical, but it is a consequence of the fact
that only a finite number of terms in the sum (10) are affected by jumps, while each return
which does not have a jump goes to zero in probability. Therefore, since the probability of
jumps in contiguous time intervals goes to zero as δ ↓ 0, those terms which do include jumps do
not impact the probability limit. The extension of this result to the case where J is an infinite
activity jump process is discussed in Section 4.4.
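The jump-robustness of the sum in (10) is easy to see in a scalar simulation (illustrative parameters of our own choosing): planting a single large jump inflates realised QV by roughly the squared jump, while realised BPV is barely moved because only the two contiguous products touch the jump return.

```python
import numpy as np

def realised_bpv(r):
    """Scalar realised bipower variation mu_1^{-2} sum_j |r_{j-1}| |r_j|, as in (10)."""
    mu1 = np.sqrt(2.0 / np.pi)        # mu_1 = E|U|, U ~ N(0,1)
    return mu1 ** -2 * np.sum(np.abs(r[:-1]) * np.abs(r[1:]))

rng = np.random.default_rng(3)
n = 100_000
r = 0.2 * np.sqrt(1.0 / n) * rng.standard_normal(n)   # sigma = 0.2, so IV = 0.04
r[n // 2] += 0.5                                      # one large planted jump

rv = np.sum(r**2)        # contaminated: roughly 0.04 + 0.25
bpv = realised_bpv(r)    # robust: close to the integrated variance 0.04
```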
2.4 Forecasting

Suppose Y obeys (7) and introduce the generic notation

y_{t+s,t} = Y_{t+s} − Y_t = a_{t+s,t} + m_{t+s,t},  t, s > 0.

So long as the covariance exists,

Cov(y_{t+s,t} | F_t) = Cov(a_{t+s,t} | F_t) + Cov(m_{t+s,t} | F_t)
+ Cov(a_{t+s,t}, m_{t+s,t} | F_t) + Cov(m_{t+s,t}, a_{t+s,t} | F_t).

Notice how complicated this expression is compared to the covariance in (6), which is due to the
fact that s is not necessarily dt and so a_{t+s,t} is no longer known given F_t — while ∫_t^{t+dt} a_u du was.
However, in all likelihood for small s, a makes a rather modest contribution to the predictive
covariance of Y.

This suggests using the approximation that

Cov(y_{t+s,t} | F_t) ≈ Cov(m_{t+s,t} | F_t).
Now using (8),

Cov(m_{t+s,t} | F_t) = E([Y]_{t+s} − [Y]_t | F_t) − E{ (∫_t^{t+s} c_u du) (∫_t^{t+s} c_u du)′ | F_t }.

Hence if c or s is small then we might approximate

Cov(Y_{t+s} − Y_t | F_t) ≈ E([Y]_{t+s} − [Y]_t | F_t)
= E([σ•W]_{t+s} − [σ•W]_t | F_t) + E([J]_{t+s} − [J]_t | F_t).
Thus an interesting forecasting strategy for covariances is to forecast the increments of the QV
process or its components. As the QV process and its components are themselves estimable,
though with substantial possible error, this is feasible. This approach to forecasting has been
advocated in a series of influential papers by Andersen, Bollerslev, Diebold, and Labys (2001),
Andersen, Bollerslev, Diebold, and Ebens (2001) and Andersen, Bollerslev, Diebold, and Labys
(2003), while the important earlier paper by Andersen and Bollerslev (1998a) was stimulating
in the context of measuring the forecast performance of GARCH models. Forecasting using
estimates of the increments of the components of QV was introduced by Andersen,
Bollerslev, and Diebold (2003). We will return to it in section 3.9 when we have developed an
asymptotic theory for estimating the QV process and its components.
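As a toy version of this strategy (our own illustration, far simpler than the models in the papers just cited), one can fit an autoregression to daily log realised QV on simulated data and use it to forecast the next daily increment:

```python
import numpy as np

rng = np.random.default_rng(7)
days, m = 500, 288
# Persistent latent daily integrated variance (illustrative log-AR(1) dynamics)
log_iv = np.full(days, np.log(1e-4))
for i in range(1, days):
    log_iv[i] = 0.97 * log_iv[i - 1] + 0.03 * np.log(1e-4) + 0.25 * rng.standard_normal()
iv = np.exp(log_iv)
r = np.sqrt(iv[:, None] / m) * rng.standard_normal((days, m))
rv = np.sum(r**2, axis=1)            # daily realised QV: a noisy proxy for iv

# AR(1) for log rv by least squares; one-step-ahead forecast of the daily QV
phi, c = np.polyfit(np.log(rv[:-1]), np.log(rv[1:]), 1)
forecast = np.exp(c + phi * np.log(rv[-1]))
```

The fitted autoregressive coefficient is close to the persistence of the latent variance, slightly attenuated because realised QV measures it with error; that measurement-error issue is exactly why the asymptotic theory of section 3 matters for forecasting.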
2.5 Realised QV & BPV

The QV process can be estimated in many different ways. The most immediate is the realised
QV estimator

[Y_δ]_t = Σ_{j=1}^{⌊t/δ⌋} (Y_{jδ} − Y_{(j−1)δ}) (Y_{jδ} − Y_{(j−1)δ})′,

where δ > 0. This is the outer product of returns computed over a fixed interval of time of
length δ. By construction, as δ ↓ 0, [Y_δ]_t → [Y]_t in probability. Likewise

{Y^l_δ}_t = Σ_{j=1}^{⌊t/δ⌋} |Y^l_{δ(j−1)} − Y^l_{δ(j−2)}| |Y^l_{δj} − Y^l_{δ(j−1)}|,  l = 1, 2, ..., p,    (11)

{Y^l_δ, Y^k_δ} = (1/4) ({Y^l_δ + Y^k_δ} − {Y^l_δ − Y^k_δ}), and {Y_δ} → {Y} in probability.
In practice, the presence of market frictions can potentially mean that this limiting argument
is not really available as an accurate guide to the behaviour of these statistics for small δ. Such
difficulties with limiting arguments, which are present in almost all areas of econometrics and
statistics, do not invalidate the use of asymptotics, for it is used to provide predictions about
finite sample behaviour. Probability limits are, of course, coarse, and we will respond to this by
refining our understanding through central limit theorems, hoping they will make good
predictions when δ is moderately small. For very small δ these asymptotic predictions become
poor guides as frictions bite hard; this will be discussed in section 5.
In financial econometrics the focus is often on the increments of the QV and realised QV
over set time intervals, like one day. Let us define the daily QV

V_i = [Y]_{hi} − [Y]_{h(i−1)},  i = 1, 2, ...,

while it is estimated by the realised daily QV

V̂_i = [Y_δ]_{hi} − [Y_δ]_{h(i−1)},  i = 1, 2, ....

Clearly V̂_i → V_i in probability as δ ↓ 0. The l-th diagonal element of V̂_i, written V̂_i^{l,l}, is called the realised
variance⁶ of asset l, while its square root is its realised volatility. The latter estimates √(V_i^{l,l}),
the daily QVol process of asset l. The l, k-th element of V̂_i, V̂_i^{l,k}, is called the realised covariance
between assets l and k. Off these objects we can define standard dependence measures, like
realised regression

β̂_i^{l,k} = V̂_i^{l,k} / V̂_i^{k,k} → β_i^{l,k} = V_i^{l,k} / V_i^{k,k}  in probability,

which estimates the QV regression, and the realised correlation

ρ̂_i^{l,k} = V̂_i^{l,k} / √(V̂_i^{l,l} V̂_i^{k,k}) → ρ_i^{l,k} = V_i^{l,k} / √(V_i^{l,l} V_i^{k,k})  in probability,

which estimates the QV correlation. Similar daily objects can be calculated off the realised BPV
process

B̂_i = µ_1^{−2} ({Y_δ}_{hi} − {Y_δ}_{h(i−1)}),  i = 1, 2, ...,

which estimates

B_i = [Y^ct]_{hi} − [Y^ct]_{h(i−1)} = ∫_{h(i−1)}^{hi} Σ_u du,  i = 1, 2, ....
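These daily objects can be sketched on simulated data (constant volatility, one planted jump; all values illustrative): the difference V̂_i − B̂_i, floored at zero, flags the jump day.

```python
import numpy as np

mu1 = np.sqrt(2.0 / np.pi)
rng = np.random.default_rng(5)
days, m = 50, 288                    # m intra-day returns per day (e.g. 5-minute)
r = (0.1 / np.sqrt(m)) * rng.standard_normal((days, m))   # daily IV = 0.01
r[10, 100] += 0.3                    # one jump, planted on day 10

V_hat = np.sum(r**2, axis=1)                                            # realised daily QV
B_hat = mu1**-2 * np.sum(np.abs(r[:, :-1]) * np.abs(r[:, 1:]), axis=1)  # realised daily BPV
jump_qv = np.maximum(0.0, V_hat - B_hat)                                # estimated daily jump QV
# jump_qv is largest on the planted jump day
```

On the no-jump days the difference fluctuates around zero at the order of the sampling error, while the jump day stands out by roughly the squared jump size; section 4 turns this comparison into a formal test.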
Realised volatility has a very long history in financial economics. It appears in, for example,
Rosenberg (1972), Officer (1973), Merton (1980), French, Schwert, and Stambaugh (1987),
Schwert (1989) and Schwert (1998), with Merton (1980) making the implicit connection with the
case where δ ↓ 0 in the pure scaled Brownian motion plus drift case. Of course, in probability
theory QV was discussed as early as Wiener (1924) and Lévy (1937) and appears as a crucial
object in the development of the stochastic analysis of semimartingales which occurred in the
second half of the last century. For more general financial processes a closer connection between
realised QV and QV, and its use for econometric purposes, was made in a series of independent
and concurrent papers by Comte and Renault (1998), Barndorff-Nielsen and Shephard (2001)
and Andersen, Bollerslev, Diebold, and Labys (2001). The realised regressions and correlations
were defined and studied in detail by Andersen, Bollerslev, Diebold, and Labys (2003) and
Barndorff-Nielsen and Shephard (2004).

⁶ Some authors call V̂_i^{l,l} the realised volatility, but throughout this paper we follow the tradition in finance of
using volatility to mean standard deviation type objects.
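The realised regression and correlation can be sketched on simulated bivariate returns (instantaneous correlation 0.6 and equal spot volatilities are our own illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(4)
n, corr = 100_000, 0.6
chol = np.linalg.cholesky(np.array([[1.0, corr], [corr, 1.0]]))
z = rng.standard_normal((n, 2)) @ chol.T
returns = 0.2 * np.sqrt(1.0 / n) * z        # both spot vols 0.2, one "day"

V = returns.T @ returns                     # realised daily QV matrix
beta = V[0, 1] / V[1, 1]                    # realised regression
rho = V[0, 1] / np.sqrt(V[0, 0] * V[1, 1])  # realised correlation
# with equal volatilities both beta and rho are close to 0.6 here
```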
A major motivation for Barndorff-Nielsen and Shephard (2002) and Andersen, Bollerslev,
Diebold, and Labys (2001) was the fact that volatility in financial markets is highly and unstably
diurnal within a day, responding to regularly timed macroeconomic news announcements, social
norms such as lunch times and sleeping, or the opening of other markets. This makes estimating

lim_{ε↓0} ([Y]_{t+ε} − [Y]_t) / ε

extremely difficult. The very stimulating work of Genon-Catalot, Larédo, and Picard (1992),
Foster and Nelson (1996), Mykland and Zhang (2002) and Mykland and Zhang (2005) tries to
tackle this problem using a double asymptotics, as δ ↓ 0 and ε ↓ 0. However, in the last five years
many econometrics researchers have mostly focused on naturally diurnally robust quantities like
the daily or weekly QV.
2.6 Changes in probability law, QV and BPV

The observed price process Y ∈ SM, governed by its data generating process or measure P,
is not uniquely interesting. In financial economics the stochastic behaviour of Y under risk
neutral versions P∗ (i.e. so-called equivalent martingale measures) is also important, for it
determines the price of contingent assets based on Y. An interesting question is whether [Y]
computed under P tells us anything about the behaviour of [Y] under P∗. To discuss this, recall
that the notation P∗ << P means that the probability law P∗ is dominated by P. When
P∗ << P and P << P∗ then P and P∗ are said to be equivalent measures⁷, which is more
general than P∗ being an equivalent martingale measure for P.

Clearly under P, [Y_δ] → [Y] in probability, so if P and P∗ are equivalent measures then, as the region
where [Y_δ] − [Y] has substantial probability away from zero narrows as δ ↓ 0, so it must
under P∗ by equivalence. Consequently, in the limit we have that, almost surely,

[Y_P] = [Y_{P∗}],    (12)

where [Y_P] is the QV of Y under P and [Y_{P∗}] is the QV of Y under P∗. Hence, potentially,
[Y_δ] tells us a lot about [Y] under P∗.

This result appears in, for example, book length treatments of semimartingales such as Jacod
and Shiryaev (2003, p. 169) and Protter (2004, p. 95). However, its importance in econometrics
seems to have gone largely unnoticed. We believe it is powerful. It means that inference based

⁷ For those who are unfamiliar with these terms, imagine P and P∗ have discrete support. If this support
exactly coincides, then the measures are equivalent. This is crucial for it means we can rescale the measures of P
to produce P∗ and vice versa. Similar ideas hold in the continuous case; see for example Billingsley (1995).
on [Y_δ] can be regarded as valid inference on [Y] under each and every equivalent martingale
measure P∗ (i.e. it holds for incomplete markets). Thus we have a way of making inferences
on derivative prices. This is the only result we know of which allows inference under P to be
transferred to make conclusions about P∗. This, in our view, puts the concepts of realised QV
and realised BPV at the centre of financial econometrics. Of course, the dual properties of [Y_{P∗}]
and that Y must be a martingale under P∗ are not enough to identify P∗ (e.g. they give no hint at
the degree of leverage nor drift), but they do tell us a great deal about reasonable risk neutral
processes. To put this observation in context, Garcia, Ghysels, and Renault (2005) and Bates
(2003) review the econometric literature on pricing derivatives, while Bollerslev, Gibson, and
Zhou (2005) use realised volatility as inputs into estimating parameters of SV models used to
price options.
This result extends further, for (e.g. Jacod and Shiryaev (2003, p. 169))

[Y^ct_P] = [Y^ct_{P∗}],  which implies that  [Y^d_P] = [Y^d_{P∗}],    (13)

which means that BPV can be used to make inference on the continuous and discontinuous
components of Y under P∗. Further, if there are jumps under P then there must be jumps, at
the same time and of the same variation, under P∗.
2.7 Derivatives based on realised QV and QVol

In the last ten years an over the counter market in realised QV and QVol has been rapidly
developing. This has been stimulated by interest in hedging volatility risk — see Neuberger
(1990), Carr and Madan (1998), Demeterfi, Derman, Kamal, and Zou (1999) and Carr and
Lewis (2004). Examples of such options are where the payoffs are

max([Y_δ]_t − K_1, 0),  max(√([Y_δ]_t) − K_2, 0).    (14)

In practice δ is typically taken to be a day. Such options approximate, potentially poorly,

max([Y]_t − K_1, 0),  max(√([Y]_t) − K_2, 0).    (15)
The fair value of options of the type (15) has been studied by a number of authors, for
various volatility models. For example, Brockhaus and Long (1999) employ the Heston (1993)
SV model, Javaheri, Wilmott, and Haug (2002) the GARCH diffusion, while Howison, Rafailidis,
and Rasmussen (2004) study log-Gaussian OU processes. Carr, Geman, Madan, and Yor (2005)
look at the same problem based upon pure jump processes. Carr and Lee (2003a) have studied
how one might value such options based on replication without being specific about the volatility
model. See also the overview of Branger and Schlag (2005).

The common feature of these papers is that the calculations are based on replacing (14) by
(15). These authors do not take into account, to our knowledge, the potentially large difference
between using [Y_δ]_t and [Y]_t.
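The payoffs in (14) are simple to state in code; the numbers below are purely illustrative.

```python
import numpy as np

def call_on_realised_variance(rqv, k1):
    """max([Y_delta]_t - K1, 0), the first payoff in (14)."""
    return max(rqv - k1, 0.0)

def call_on_realised_vol(rqv, k2):
    """max(sqrt([Y_delta]_t) - K2, 0), the second payoff in (14)."""
    return max(np.sqrt(rqv) - k2, 0.0)

# Illustrative: realised variance of 0.06 against strikes K1 = 0.04 and K2 = 0.30
var_payoff = call_on_realised_variance(0.06, 0.04)   # 0.02
vol_payoff = call_on_realised_vol(0.06, 0.30)        # 0.0, as sqrt(0.06) < 0.30
```

The nonlinearity of the square root means the two contracts behave quite differently, which is one reason the variance and volatility versions are priced separately in the literature cited above.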
2.8 Empirical illustrations: measurement

To illustrate some of the empirical features of realised daily QV, and particularly their precision
as estimators of daily QV, we have used a series which records the log of the number of German
Deutsche Marks a single US Dollar buys (written Y^1) and the log of the Japanese Yen/Dollar rate
(written Y^2). It covers 1st December 1986 until 30th November 1996 and was kindly supplied to
us by Olsen and Associates in Zurich (see Dacorogna, Gencay, Müller, Olsen, and Pictet (2001)),
although we have made slightly different adjustments to deal with some missing data (described
in detail in Barndorff-Nielsen and Shephard (2002)). Capturing time stamped indicative bid
and ask quotes from a Reuters screen, they computed prices at each 5-minute period by linear
interpolation, averaging the log bid and log ask for the two closest ticks.
Figure 1 provides some descriptive statistics for the exchange rates starting on 4th February,
1991. Figure 1(a) shows the first four active days of the dataset, displaying the bivariate 10
minute returns⁸. Figure 1(b) details the daily realised volatilities for the DM, √(V̂_i^1), together
with 95% confidence intervals. These confidence intervals are based on the log-version of the
limit theory for the realised variance we will develop in the next subsection. When the volatility
is high, the confidence intervals tend to be very large as well. In Figure 1(c) we have drawn
the realised covariance V̂_i^{1,2} against i, together with the associated confidence intervals. These
terms move rather violently through this period. The corresponding realised correlations ρ̂_i^{1,2} are
given in Figure 1(d). These are quite stable through time, with only a single realised correlation
standing out from the others in the sample. The correlations are not particularly precisely
estimated, with the confidence intervals typically being around 0.2 wide.

Table 1 provides some additional daily summary statistics for 100 times the daily data (the
scaling is introduced to make the tables easier to read). It shows that the means of the squared
daily returns (Y_i^1 − Y_{i−1}^1)² and the estimated daily QVs V̂_i^1 are in line, but that the realised
BPV B̂_i^1 is below them. The RV and BPV quantities are highly correlated, but the BPV has
a smaller standard deviation. A GARCH(1,1) model is also fitted to the daily return data and
its conditional, one-step ahead predicted variances h_i, computed. These have similar means

⁸ This time resolution was selected so that the results are not very sensitive to market frictions.
[Figure 1 about here.]

Figure 1: DM and Yen against the Dollar. Data is 4th February 1991 onwards for 50 active
trading days. (a) 10 minute returns on the two exchange rates for the first 4 days of the dataset.
(b) Realised volatility for the DM series. This is marked with a cross, while the bars denote 95%
confidence intervals. (c) Realised covariance. (d) Realised correlation.
and lower standard deviations, but h_i is less strongly correlated with squared returns than the
realised measures.

2.9 Empirical illustration: time series behaviour

Figure 2 shows summaries of the time series behaviour of daily raw and realised DM quantities.
They are computed using the whole run of 10 years of 10 minute return data. Figure 2(a) shows
the raw daily returns and 2(b) gives the corresponding correlogram of daily squared and absolute
returns. As usual absolute returns are moderately more autocorrelated than squared returns,
with the degree of autocorrelation in these plots being modest, while the memory lasts a large
number of lags.

Figure 2(c) shows a time series plot of the daily realised volatilities √(V̂_i^1) for the DM series,
indicating bursts of high volatility and periods of rather tranquil activity. The correlogram for
                          Mean    Standard Dev/Cor
Daily QV: V̂_i^1          0.509   .50
BPV: B̂_i^1               0.441   .95   .40
GARCH: h_i                0.512   .55   .57   .22
(Y_i^1 − Y_{i−1}^1)²      0.504   .54   .48   .39   1.05

Table 1: Daily statistics for 100 times DM/Dollar return series: estimated QV, BPV, conditional
variance for GARCH and squared daily returns. Reported is the mean, standard deviation and
correlations.
this series is given in Figure 2(d). This shows lagged one correlations of around one half and
is around 0.25 at 10 lags. The correlogram then declines irregularly at larger lags. Figure 2(e)
shows √(B̂_i^1) using the lagged two bipower variation measure. This series does not display the
peaks and troughs of the realised QVol statistics and its correlogram in Figure 2(d) is modestly
higher, with its first lag being around 0.56 compared to 0.47. The corresponding estimated jump
QVol measure √(max(0, V̂_i^1 − B̂_i^1)) is displayed in Figure 2(f), while its correlogram is given in
Figure 2(d), which shows a very small degree of autocorrelation.
2.10 Empirical illustration: a more subtle example

2.10.1 Interpolation, last price, quotes and trades

So far we have not focused on the details of how we compute the prices used in these calculations.
This is important if we wish to try to exploit information buried in returns recorded for very
small values of δ, such as a handful of seconds. Our discussion will be based on data taken from
the London Stock Exchange's electronic order book, called SETS, in January 2004. The market
is open from 8am to 4.30pm, but we remove the first 15 minutes of each day following Engle and
Russell (1998). Times are accurate up to one second. We will use three pieces of the database:
transactions, best bid and best ask. Note the bid and ask are firm quotes, not indicative like
the exchange rate data previously studied. We average the bid and ask to produce a mid-quote,
which is taken to proxy the efficient price. We also give some results based on transaction prices.
We will focus on four high value stocks: Vodafone (telecoms), BP (hydrocarbons), AstraZeneca
(pharmaceuticals) and HSBC (banking).

The top row of Figure 3 shows the log of the mid-quotes, recorded every six seconds on the
2nd working day in January. The graphs indicate the striking discreteness of the price processes,
which is particularly important for the Vodafone series. Table 2 gives the tick size, the number
of mid-point updates and transactions for each asset. It shows the usual result that as the tick
size, as a percentage of the price, increases, the number of mid-quote price updates will
tend to fall, as larger tick sizes mean that there is a larger cost to impatience, that is, jumping
the queue in the order book by offering a better price than the best current and so updating the
[Figure 2 about here.]

Figure 2: All graphs refer to the Olsen group's five minute changes data DM/Dollar. Top left:
daily returns. Middle left: estimated daily QVol √(V̂_i^1). Bottom left: estimated daily continuous
QVol √(B̂_i^1). Bottom right: estimated jump QVol √(max(0, V̂_i^1 − B̂_i^1)). Top right: ACF
of squared and absolute returns. X-axis is marked off in days. Middle right: ACF of various
realised estimators.
best quotes.
The middle row of Figure 3 shows the corresponding daily realised QVol, computed using
0.015, 0.1, 1, 5 and 20 minute intervals based on mid-quotes. These are related to the signature
plots of Andersen, Bollerslev, Diebold, and Labys (2000). As the times of the mid-quotes fall
irregularly in time, there is the question of how to approximate the price at these time points.
The Olsen method uses linear interpolation between the prices at the nearest observations before
and after the correct time point. Another method is to use the last datapoint before the relevant
time — the last tick or raw method (e.g. Wasserfallen and Zimmermann (1985)). Typically, the former leads to falls in realised QVol as δ falls; indeed, in theory it converges to zero as δ ↓ 0, since the interpolated price process is continuous and of bounded variation (Hansen and Lunde (2006)),
Figure 3: LSE’s electronic order book on the 2nd working day in January 2004. Top graphs:
mid-quote log-price every 6 seconds, from 8.15am to 4.30pm. X-axis is in hours. Middle graphs:
realised daily QVol computed using 0.015, 0.1, 1, 5 and 20 minute midpoint returns. X-axis is in
minutes. Lower graphs: realised daily QVol computed using 0.1, 1, 5 and 20 minute transaction
returns. Middle and lower graphs are computed using interpolation and the last tick method.
while the latter increases modestly. The sensitivity to δ tends to be larger in cases where the
tick size is large as a percentage of price and this is the case here. Overall we have the conclusion
that the realised QVol does not change much when δ is 5 minutes or above and that it is more
stable for interpolation than for last price. When we use smaller time intervals there are large
dangers lurking. We will formally discuss the effect of market frictions in section 5.
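The contrast between the two price-construction schemes discussed above can be sketched as follows. This is a minimal illustration, not code from any of the cited papers; the function names and the toy data are our own.

```python
import numpy as np

def last_tick(times, prices, grid):
    """Previous-tick (raw) price: the last observed price at or before each grid time."""
    idx = np.searchsorted(times, grid, side="right") - 1
    return prices[np.clip(idx, 0, len(prices) - 1)]

def interpolated(times, prices, grid):
    """Linear interpolation between the observations bracketing each grid time
    (the Olsen method)."""
    return np.interp(grid, times, prices)

def realised_qv(grid_prices):
    """Realised QV: sum of squared returns over the chosen grid."""
    return float(np.sum(np.diff(grid_prices) ** 2))

# Toy irregular record: the interpolated grid prices vary less than the
# last-tick ones, consistent with interpolation pulling realised QVol down.
times = np.array([0.0, 1.0, 2.0])
prices = np.array([0.0, 1.0, 0.0])
grid = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
```

On this toy path the last-tick grid reproduces the full moves while the interpolated grid smooths them, so the interpolated realised QV is smaller, in line with the downward drift of the interpolation-based signature plot as δ falls.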
The bottom row in Figure 3 shows the corresponding results for realised QVols computed
using the transactions database. This ignores some very large over the counter trades. Realised
QVol increases more strongly as δ falls when we use the last tick rather than mid-quote data.
This is particularly the case for Vodafone, where bid/ask bounce has a large impact. Even the
interpolation method has difficulties with transaction data. Overall, one gets the impression
                             Vodafone    BP        AstraZeneca   HSBC
Daily        open-close      0.00968     0.00941   0.0143        0.00730
volatility   open-open       0.0159      0.0140    0.0140        0.00720
             Correlation     0.861       0.851     0.912         0.731
Tick size                    0.25        0.25      1.0           0.5
# of Mid-quotes per day      333         1,434     1,666         598
# of Transactions per day    3,018       2,995     2,233         2,264

Table 2: Top part of table: average daily volatility. Open is the mid-price at 8.15am, close is the mid-price at 4.30pm. Open-open looks at daily returns. Reported are the sample standard deviations of the returns over 20 days and the sample correlation between the open-close and open-open daily returns. Bottom part of table: descriptive statistics about the size of the dataset.
from this study that basing the analysis on mid-quote data is sound for the LSE data.9
A fundamental difficulty with equity data is that the equity markets are only open for a
fraction of the whole day and so it is quite possible that a large degree of their variation is at
times when there is little data. This is certainly true for the U.K. equity markets which are
closed during a high percentage of the time when U.S. markets are open. Table 2 gives daily
volatility for open to close and open to open returns, as well as the correlation between the two
return measures. It shows the open to close measures account for a high degree of the volatility
in the prices, with high correlations between the two returns. The weakest relationship is for the
Vodafone series, with the strongest for AstraZeneca. Hansen and Lunde (2005c) have studied
how one can use high-frequency information to estimate the QV throughout the day, taking into
account closed periods.
2.10.2 Epps effects
Market frictions affect the estimation of realised QVol, but if the asset is highly active, the tick
size is small as a percentage of the price, δ is well above a minute and the mid-quote/interpolation
method is used, then the effects are modest. The situation is much less rosy when we look at
estimating quadratic covariations due to the so called Epps (1979) effect. This has been documented in very great detail by Sheppard (2005), who provides various theoretical explanations.
We will come back to them in sections 3.8.3 and 5. For the moment it suffices to look at Figure 4
which shows the average daily realised correlation computed in January 2004 for the four stocks
looked at above. Throughout prices are computed using mid-quotes and interpolation. The
graph shows how this average varies with respect to δ. It trends downwards to zero as δ ↓ 0,
with extremely low dependence measures for low values of δ. This is probably caused by the
9 A good alternative would be to carry out the entire analysis on either all the best bids or all the best asks. This approach is used by Hansen and Lunde (2005c) and Large (2005).
Figure 4: LSE data during January 2004. Realised correlation for the pairs Vodafone–BP, Vodafone–AstraZeneca, Vodafone–HSBC, BP–AstraZeneca, BP–HSBC and AstraZeneca–HSBC, computed daily and averaged over the month. Realised quantities are computed using data at the frequency, in minutes, given on the x-axis.
fact that asset prices tend not to simultaneously move due to non-synchronous trading and the
differential rate at which information of different types is absorbed into individual stock prices.
3 Measurement error when Y ∈ BSM

3.1 Infeasible asymptotics
Market frictions mean that it is not wise to use realised variation objects based on very small δ.
This suggests refining our convergence in probability arguments to give a central limit theorem
which may provide reasonable predictions about the behaviour of RV statistics for moderate
values of δ, such as 5 or 10 minutes, where frictions are less likely to bite hard. Such CLTs will
be the focus of attention in this section. At the end of the section, in addition, we will briefly
discuss various alternative measures of variation, such as the realised range and subsampling and kernel-based estimators, which have recently been introduced to the literature. Finally we will also discuss how realised
objects can contribute to the practical forecasting of volatility.
We will derive the central limit theorem for $[Y_\delta]_t$, which can then be discretised to produce the CLT for $\widehat V_i$. Univariate results will be presented, since this reduces notational clutter. The
generalised results were developed in a series of papers by Jacod (1994), Jacod and Protter
(1998), Barndorff-Nielsen and Shephard (2002) and Barndorff-Nielsen and Shephard (2004).
Theorem 1 Suppose that $Y \in \mathrm{BSM}$ is one-dimensional and that (for all $t < \infty$) $\int_0^t a^2_u\,du < \infty$; then as $\delta \downarrow 0$
$$\delta^{-1/2}\left([Y_\delta]_t - [Y]_t\right) \to \sqrt{2}\int_0^t \sigma^2_u\,dB_u, \tag{16}$$
where $B$ is a Brownian motion which is independent from $Y$ and the convergence is in law stable as a process.
Proof. By Ito's lemma for continuous semimartingales
$$Y^2 = [Y] + 2Y \bullet Y,$$
then
$$\left(Y_{j\delta} - Y_{(j-1)\delta}\right)^2 = [Y]_{\delta j} - [Y]_{\delta(j-1)} + 2\int_{\delta(j-1)}^{\delta j}\left(Y_u - Y_{(j-1)\delta}\right)dY_u.$$
This implies that
$$\delta^{-1/2}\left([Y_\delta]_t - [Y]_t\right) = 2\delta^{-1/2}\sum_{j=1}^{\lfloor t/\delta\rfloor}\int_{\delta(j-1)}^{\delta j}\left(Y_u - Y_{(j-1)\delta}\right)dY_u = 2\delta^{-1/2}\int_0^{\delta\lfloor t/\delta\rfloor}\left(Y_u - Y_{\delta\lfloor u/\delta\rfloor}\right)dY_u.$$
Jacod and Protter (1998, Theorem 5.5) show that for $Y$ satisfying the conditions in Theorem 1, then$^{10}$
$$\delta^{-1/2}\int_0^t\left(Y_u - Y_{\delta\lfloor u/\delta\rfloor}\right)dY_u \to \frac{1}{\sqrt{2}}\int_0^t \sigma^2_u\,dB_u,$$
where $B \perp\!\!\!\perp Y$ and the convergence is in law stable as a process. This implies
$$\delta^{-1/2}\left([Y_\delta] - [Y]\right) \to \sqrt{2}\left(\sigma^2 \bullet B\right).$$
The most important point of this Theorem is that $B \perp\!\!\!\perp Y$. The appearance of the additional Brownian motion $B$ is striking. It means that Theorem 1 implies, for a single $t$,
$$\delta^{-1/2}\left([Y_\delta]_t - [Y]_t\right) \xrightarrow{L} MN\left(0,\; 2\int_0^t \sigma^4_u\,du\right), \tag{17}$$
where $MN$ denotes a mixed Gaussian distribution. This result implies in particular that, for $i \neq j$,
$$\delta^{-1/2}\begin{pmatrix}\widehat V_i - V_i \\ \widehat V_j - V_j\end{pmatrix} \xrightarrow{L} MN\left(\begin{pmatrix}0\\0\end{pmatrix},\; 2\begin{pmatrix}\int_{h(i-1)}^{hi}\sigma^4_u\,du & 0\\ 0 & \int_{h(j-1)}^{hj}\sigma^4_u\,du\end{pmatrix}\right),$$
so the $\widehat V_i - V_i$ are asymptotically uncorrelated through time, so long as $\mathrm{Var}(\widehat V_i - V_i) < \infty$.
10 At an intuitive level, if we ignore the drift then
$$\int_{\delta(j-1)}^{\delta j}\left(Y_u - Y_{(j-1)\delta}\right)dY_u \simeq \sigma^2_{\delta(j-1)}\int_{\delta(j-1)}^{\delta j}\left(W_u - W_{(j-1)\delta}\right)dW_u,$$
which is a martingale difference sequence in $j$ with zero mean and conditional variance of $\frac{1}{2}\sigma^4_{\delta(j-1)}\delta^2$. Applying a triangular martingale CLT one would expect this result, although formalising it requires a considerable number of additional steps.
Barndorff-Nielsen and Shephard (2002) showed that Theorem 1 can be used in practice as the integrated quarticity $\int_0^t \sigma^4_u\,du$ can be consistently estimated using $(1/3)\{Y_\delta\}^{[4]}_t$ where
$$\{Y_\delta\}^{[4]}_t = \delta^{-1}\sum_{j=1}^{\lfloor t/\delta\rfloor}\left(Y_{j\delta} - Y_{(j-1)\delta}\right)^4. \tag{18}$$
In particular then
$$\frac{\delta^{-1/2}\left([Y_\delta]_t - [Y]_t\right)}{\sqrt{\frac{2}{3}\{Y_\delta\}^{[4]}_t}} \xrightarrow{L} N(0, 1). \tag{19}$$
This is a nonparametric result as it does not require us to specify the form of $a$ or $\sigma$.
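The feasible limit theory can be checked by simulation. The sketch below, with our own variable names and the simplifying assumption of constant volatility $\sigma = 1$, simulates Brownian returns and forms the studentised statistic from the log-version of the CLT, which should be approximately standard normal:

```python
import numpy as np

rng = np.random.default_rng(7)
n, M = 256, 3000            # intraday returns per day, Monte Carlo replications
delta, iv = 1.0 / n, 1.0    # sampling interval; integrated variance of sigma = 1 BM

# BSM returns with constant volatility: r_j ~ N(0, delta)
r = rng.normal(0.0, np.sqrt(delta), size=(M, n))
rv = np.sum(r ** 2, axis=1)        # realised QV over [0, 1]
quart = np.sum(r ** 4, axis=1)     # delta * {Y_delta}^[4], the quarticity sum

# Studentised statistic from the log-version of the CLT; the delta^{-1/2}
# factors cancel, leaving log(rv/iv) / sqrt((2/3) * quart / rv^2)
z = np.log(rv / iv) / np.sqrt((2.0 / 3.0) * quart / rv ** 2)
print(np.mean(z), np.std(z))
```

The sample mean and standard deviation of `z` land close to 0 and 1, consistent with the reported finding that the log-version is well sized even at moderate sample sizes.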
The multivariate version of (16) has that as $\delta \downarrow 0$
$$\delta^{-1/2}\left([Y_\delta]^{(kl)} - [Y]^{(kl)}\right) \to \frac{1}{\sqrt{2}}\sum_{b=1}^q\sum_{c=1}^q\left(\sigma^{(kb)}\sigma^{(cl)} + \sigma^{(lb)}\sigma^{(ck)}\right)\bullet B^{(bc)}, \qquad k, l = 1, \ldots, q, \tag{20}$$
where $B$ is a $q \times q$ matrix of independent Brownian motions, independent of $Y$, and the convergence is in law stable as a process. In the mixed normal version of this result, the asymptotic covariance is a $q \times q \times q \times q$ array with elements
$$\left\{\int_0^t\left(\Sigma_{(kk')u}\Sigma_{(ll')u} + \Sigma_{(kl')u}\Sigma_{(lk')u} + \Sigma_{(kl)u}\Sigma_{(k'l')u}\right)du\right\}_{k,k',l,l'=1,\ldots,q}. \tag{21}$$
Barndorff-Nielsen and Shephard (2004) showed how to use high frequency data to estimate this array of processes. We refer the reader to that paper, and also Mykland and Zhang (2005), for details.
3.2 Finite sample performance & the bootstrap
Our analysis of [Yδ ]t − [Y ]t has been asymptotic as δ ↓ 0. Of course it is crucial to know if this
analysis is informative for the kind of moderate values of δ we see in practice. A number of
authors have studied the finite sample behaviour of the feasible limit theory given in (19) and a
log-version, derived using the delta-rule
$$\frac{\delta^{-1/2}\left(\log[Y_\delta]_t - \log[Y]_t\right)}{\sqrt{\frac{2}{3}\,\{Y_\delta\}^{[4]}_t\big/\left([Y_\delta]_t\right)^2}} \xrightarrow{L} N(0, 1). \tag{22}$$
We refer readers to Barndorff-Nielsen and Shephard (2005a), Meddahi (2002), Goncalves and
Meddahi (2004), and Nielsen and Frederiksen (2005). The overall conclusion is that (19) is quite
poorly sized, but that (22) performs pretty well. The asymptotic theory is challenged in cases
where there are components in volatility which are very quickly mean reverting. In the multivariate case, Barndorff-Nielsen and Shephard (2004) studied the finite sample behaviour of realised
regression and correlation statistics. They suggest various transformations which improve the
finite sample behaviour of these statistics, including the use of the Fisher transformation for the
realised correlation.
Goncalves and Meddahi (2004) have studied how one might try to bootstrap the realised
daily QV estimator. Their overall conclusions are that the usual Edgeworth expansions, which
justify the order improvement associated with the bootstrap, are not reliable guides to the finite
sample behaviour of the statistics. However, it is possible to design bootstraps which provide
very significant improvements over the limiting theory in (19). This seems an interesting avenue
to follow up, particularly in the multivariate case.
3.3 Irregularly spaced data
Mykland and Zhang (2005) have recently generalised (16) to cover the case where prices are recorded at irregular time intervals. See also the related work of Barndorff-Nielsen and Shephard (2005c). Mykland and Zhang (2005) define a random sequence of times, independent of $Y$,$^{11}$ over the interval $t \in [0, T]$,
$$\mathcal{G}_n = \{0 = t_0 < t_1 < \cdots < t_n = T\},$$
then continue to have $\delta = T/n$, and define the estimated QV process
$$[Y_{\mathcal{G}_n}]_t = \sum_{t_j \leq t}\left(Y_{t_j} - Y_{t_{j-1}}\right)^2 \xrightarrow{p} [Y]_t.$$
They show that as $n \to \infty$, so$^{12}$
$$\delta^{-1/2}\left([Y_{\mathcal{G}_n}]_t - [Y]_t\right) = 2\delta^{-1/2}\sum_{t_j \leq t}\int_{t_{j-1}}^{t_j}\left(Y_u - Y_{t_{j-1}}\right)dY_u \xrightarrow{L} MN\left(0,\; 2\int_0^t \frac{\partial H^G_u}{\partial u}\sigma^4_u\,du\right),$$
where
$$H^G_t = \lim_{n\to\infty} H^{\mathcal{G}_n}_t, \qquad \text{where} \qquad H^{\mathcal{G}_n}_t = \delta^{-1}\sum_{t_j \leq t}\left(t_j - t_{j-1}\right)^2,$$
and we have assumed that $\sigma$ follows a diffusion and $H^G$, which is a bit like a QV process but is scaled by $\delta^{-1}$, is differentiable with respect to time. The $H^G$ function is non-decreasing and
11 It is tempting to think of the $t_j$ as the time of the $j$-th trade or quote. However, it is well known that the process generating the times of trades and the price movements in tick time are not statistically independent (e.g. Engle and Russell (2005) and Rydberg and Shephard (2000)). This would seem to rule out the direct application of the methods we use here in tick time, suggesting care is needed in that case.
12 At an intuitive level, if we ignore the drift then
$$\int_{t_{j-1}}^{t_j}\left(Y_u - Y_{t_{j-1}}\right)dY_u \simeq \sigma^2_{t_{j-1}}\int_{t_{j-1}}^{t_j}\left(W_u - W_{t_{j-1}}\right)dW_u,$$
which is a martingale difference sequence in $j$ with zero mean and conditional variance of $\frac{1}{2}\sigma^4_{t_{j-1}}(t_j - t_{j-1})^2$. Although this suggests the stated result, formalising it requires a considerable number of additional steps.
runs quickly when the sampling is slower than normal. For regularly spaced data, $t_j = \delta j$ and so $H^G_t = t$, which reproduces (17).
It is clear that
$$[Y_{\mathcal{G}_n}]^{[4]}_t = \delta^{-1}\sum_{t_j \leq t}\left(Y_{t_j} - Y_{t_{j-1}}\right)^4 \xrightarrow{p} 3\int_0^t \frac{\partial H^G_u}{\partial u}\sigma^4_u\,du,$$
which implies the feasible distributional results in (19) and (22) also hold for irregularly spaced data, which was one of the results in Barndorff-Nielsen and Shephard (2005c).
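The $H$ function is easy to compute from a record of observation times. A minimal sketch, in our own notation:

```python
import numpy as np

def H_Gn(times, delta):
    """H^{G_n}: delta^{-1} times the cumulative sum of squared sampling gaps,
    evaluated at each observation time t_1, ..., t_n."""
    return np.cumsum(np.diff(times) ** 2) / delta

# Regular spacing t_j = j * delta gives H_t = t exactly; bunched or stretched
# sampling pushes the terminal value of H above t.
regular = H_Gn(np.linspace(0.0, 1.0, 101), 0.01)
irregular = H_Gn(np.array([0.0, 0.2, 1.0]), 0.5)
```

The comparison illustrates the remark in the text: $H^G$ "runs quickly" exactly when the spacings are uneven relative to $\delta$, inflating the asymptotic variance of the realised QV estimator.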
3.4 Multiple grids
Zhang (2004) extended the above analysis to the simultaneous use of multiple grids. In our exposition we will work with $\mathcal{G}_n(i) = \{0 = t^i_0 < t^i_1 < \cdots < t^i_n = T\}$ for $i = 0, 1, 2, \ldots, I$ and $\delta_i = T/n_i$. Then define the $i$-th estimated QV process $[Y_{\mathcal{G}_n(i)}]_t = \sum_{t^i_j \leq t}\left(Y_{t^i_j} - Y_{t^i_{j-1}}\right)^2$. Additionally we need a new cross term for the covariation between the time scales. The appropriate term is
$$H^{G(i)\cup G(k)}_t = \lim_{n\to\infty} H^{\mathcal{G}_n(i)\cup\mathcal{G}_n(k)}_t, \qquad \text{where} \qquad H^{\mathcal{G}_n(i)\cup\mathcal{G}_n(k)}_t = (\delta_i\delta_k)^{-1/2}\sum_{t^{i,k}_j \leq t}\left(t^{i,k}_j - t^{i,k}_{j-1}\right)^2,$$
where $t^{i,k}_j$ comes from
$$\mathcal{G}_n(i)\cup\mathcal{G}_n(k) = \left\{0 = t^{i,k}_0 < t^{i,k}_1 < \cdots < t^{i,k}_{2n} = T\right\}, \qquad i, k = 0, 1, 2, \ldots, I.$$
Clearly, for all $i$,
$$\delta_i^{-1/2}\left([Y_{\mathcal{G}_n(i)}]_t - [Y]_t\right) = 2\delta_i^{-1/2}\sum_{t^i_j \leq t}\int_{t^i_{j-1}}^{t^i_j}\left(Y_u - Y_{t^i_{j-1}}\right)dY_u,$$
so the scaled (by $\delta_i^{-1/2}$ and $\delta_k^{-1/2}$, respectively) asymptotic covariance matrix of $[Y_{\mathcal{G}_n(i)}]_t$ and $[Y_{\mathcal{G}_n(k)}]_t$ is
$$2\begin{pmatrix} \int_0^t \dfrac{\partial H^{G(i)}_u}{\partial u}\sigma^4_u\,du & \int_0^t \dfrac{\partial H^{G(i)\cup G(k)}_u}{\partial u}\sigma^4_u\,du \\[2ex] \int_0^t \dfrac{\partial H^{G(i)\cup G(k)}_u}{\partial u}\sigma^4_u\,du & \int_0^t \dfrac{\partial H^{G(k)}_u}{\partial u}\sigma^4_u\,du \end{pmatrix}.$$
Example 1 Let $t^0_j = \delta(j + \varepsilon)$, $t^1_j = \delta(j + \eta)$, where $|\varepsilon - \eta| \in [0, 1]$ are temporal offsets; then
$$H^{G(0)}_t = H^{G(1)}_t = t, \qquad H^{G(0)\cup G(1)}_t = t\left\{(\eta - \varepsilon)^2 + (1 - |\eta - \varepsilon|)^2\right\}.$$
Thus
$$\delta^{-1/2}\begin{pmatrix}[Y_{\mathcal{G}_n(0)}]_t - [Y]_t \\ [Y_{\mathcal{G}_n(1)}]_t - [Y]_t\end{pmatrix} \xrightarrow{L} MN\left(0,\; 2\begin{pmatrix}1 & (\eta-\varepsilon)^2 + (1-|\eta-\varepsilon|)^2 \\ (\eta-\varepsilon)^2 + (1-|\eta-\varepsilon|)^2 & 1\end{pmatrix}\int_0^t \sigma^4_u\,du\right).$$
The correlation between the two measures is minimised at $1/2$ by setting $|\eta - \varepsilon| = 1/2$.
Example 1 extends naturally to when $t^k_j = \delta\left(j + \frac{k}{K+1}\right)$, $k = 0, 1, 2, \ldots, K$, which allows many equally spaced realised QV-like estimators to be defined based on returns measured over $\delta$ periods. The scaled asymptotic covariance of $[Y_{\mathcal{G}_n(i)}]_t$ and $[Y_{\mathcal{G}_n(k)}]_t$ is
$$2\left\{\left(\frac{k-i}{K+1}\right)^2 + \left(1 - \left|\frac{k-i}{K+1}\right|\right)^2\right\}\int_0^t \sigma^4_u\,du.$$
If $K = 1$ or $K = 2$ then the correlation between the estimates is $1/2$ and $5/9$, respectively. As the sampling points become more dense the correlation quickly escalates, which means that each new realised QV estimator brings less and less additional information.
3.5 Subsampling
The multiple grid allows us to create a pooled grid estimator of QV, which is a special case of subsampling a statistic based on a random field; see for example the review of Politis, Romano, and Wolf (1999, Ch. 5). A simple example of this is
$$[Y_{\mathcal{G}^+_n(K)}]_t = \frac{1}{K+1}\sum_{i=0}^K [Y_{\mathcal{G}_n(i)}]_t, \tag{23}$$
which was mentioned in this context by Müller (1993) and Zhou (1996, p. 48). Clearly $[Y_{\mathcal{G}^+_n(K)}]_t \xrightarrow{p} [Y]_t$ as $\delta \downarrow 0$, while the properties of this estimator were first studied when $Y \in \mathrm{BSM}$ by Zhang, Mykland, and Aït-Sahalia (2005). Zhang (2004) also studies the properties of unequally weighted pooled estimators, while additional insights are provided by Aït-Sahalia, Mykland, and Zhang (2005b).
Example 2 Let $t^k_j = \delta\left(j + \frac{k}{K+1}\right)$, $k = 0, 1, 2, \ldots, K$. Then, for fixed $K$ as $\delta \downarrow 0$,
$$\delta^{-1/2}\left([Y_{\mathcal{G}^+_n(K)}]_t - [Y]_t\right) \xrightarrow{L} MN\left(0,\; \frac{2}{(K+1)^2}\sum_{i=0}^K\sum_{k=0}^K\left\{\left(\frac{k-i}{K+1}\right)^2 + \left(1 - \left|\frac{k-i}{K+1}\right|\right)^2\right\}\int_0^t \sigma^4_u\,du\right).$$
This subsampler is based on a sample size K +1 times the usual one but returns are still recorded
over intervals of length δ. When K = 1 then the constant in front of integrated quarticity is
1.5 while when K = 2 it drops to 1.4074. The next terms in the sequence are 1.3750, 1.3600,
1.3519 and 1.3469 while it asymptotes to 1.333, a result due to Zhang, Mykland, and Aı̈t-Sahalia
(2005). Hence the gain from using the entire sample path of Y via multiple grids is modest and
almost all the available gains occur by the time K reaches 2. However, we will see later that this
subsampler has virtues when there are market frictions.
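The pooled estimator (23) can be sketched on a tick record as follows. This is a toy implementation under the simplifying assumption that the base grid is regular with $K+1$ ticks per $\delta$ interval; the function name is ours.

```python
import numpy as np

def subsampled_rv(prices, K):
    """Average of the K + 1 offset realised variances: each sub-grid uses every
    (K + 1)-th observation, started at offset k = 0, ..., K, so returns span
    delta while the full record is K + 1 times denser."""
    rvs = [np.sum(np.diff(prices[k::K + 1]) ** 2) for k in range(K + 1)]
    return float(np.mean(rvs))
```

Setting `K = 0` recovers the plain realised variance on the base grid, so the function nests the ordinary estimator as a special case.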
3.6 Serial covariances
Suppose we define the notation $\mathcal{G}_\delta(\varepsilon, \eta) = \{\delta(\varepsilon + \eta), \delta(2\varepsilon + \eta), \ldots\}$; then the above theory implies that
$$\delta^{-1/2}\begin{pmatrix} [Y_{\mathcal{G}_n(2,0)}]_t - \int_0^t \sigma^2_u\,du \\ [Y_{\mathcal{G}_n(2,-1)}]_t - \int_0^t \sigma^2_u\,du \\ [Y_{\mathcal{G}_n(1,0)}]_t - \int_0^t \sigma^2_u\,du \end{pmatrix} \xrightarrow{L} MN\left(\begin{pmatrix}0\\0\\0\end{pmatrix},\; \begin{pmatrix}4 & 2 & 2\\ 2 & 4 & 2\\ 2 & 2 & 2\end{pmatrix}\int_0^t \sigma^4_u\,du\right).$$
Define the realised serial covariance as
$$\widehat\gamma_s(Y_\delta, X_\delta) = \sum_{j=1}^{\lfloor t/\delta\rfloor}\left(Y_{\delta j} - Y_{\delta(j-1)}\right)\left(X_{\delta(j-s)} - X_{\delta(j-s-1)}\right), \qquad s = 0, 1, 2, \ldots, S,$$
and say $\widehat\gamma_{-s}(Y, X) = \widehat\gamma_s(Y, X)$ while $\widehat\gamma_s(Y_\delta) = \widehat\gamma_s(Y_\delta, Y_\delta)$. Derivatives on such objects have recently been studied by Carr and Lee (2003b). We have that
$$2\widehat\gamma_1(Y_\delta) = [Y_{\mathcal{G}_n(2,0)}]_t + [Y_{\mathcal{G}_n(2,-1)}]_t - 2[Y_{\mathcal{G}_n(1,0)}]_t + o_p(\delta^{1/2}).$$
Note that $\widehat\gamma_0(Y_\delta) = [Y_{\mathcal{G}_n(1,0)}]_t$. Then, clearly
$$\delta^{-1/2}\begin{pmatrix} \widehat\gamma_0(Y_\delta) - \int_0^t \sigma^2_u\,du \\ \widehat\gamma_1(Y_\delta) \\ \vdots \\ \widehat\gamma_S(Y_\delta) \end{pmatrix} \xrightarrow{L} MN\left(\begin{pmatrix}0\\0\\\vdots\\0\end{pmatrix},\; \begin{pmatrix}2 & 0 & \cdots & 0\\ 0 & 1 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & 1\end{pmatrix}\int_0^t \sigma^4_u\,du\right), \tag{24}$$
see Barndorff-Nielsen, Hansen, Lunde, and Shephard (2004, Theorem 2). Consequently
$$\delta^{-1/2}\begin{pmatrix} \widehat\gamma_1(Y_\delta)/\widehat\gamma_0(Y_\delta) \\ \vdots \\ \widehat\gamma_S(Y_\delta)/\widehat\gamma_0(Y_\delta) \end{pmatrix} \xrightarrow{L} MN\left(0,\; I\,\frac{\int_0^t \sigma^4_u\,du}{\left(\int_0^t \sigma^2_u\,du\right)^2}\right),$$
which differs from the result of Bartlett (1946), inflating the usual standard errors as well as making inference multivariate mixed Gaussian. There are some shared characteristics with the familiar Eicker (1967) robust standard errors but the details are, of course, rather different.
3.7 Kernels
Following Bartlett (1950) and Eicker (1967), long run estimates of variances are often computed using kernels. We will see this idea may be helpful when there are market frictions and so we take some time discussing it here. It was introduced in this context by Zhou (1996) and Hansen and Lunde (2006), while a thorough discussion was given by Barndorff-Nielsen, Hansen, Lunde, and Shephard (2004, Theorem 2). A kernel estimator takes the form
$$RV_w(Y) = w_0[Y_\delta] + 2\sum_{i=1}^q w_i\,\widehat\gamma_i(Y_\delta), \tag{25}$$
where the weights $w_i$ are non-stochastic. It is clear from (24) that if the estimator is based on $\delta/K$ returns, so that it is compatible with (23), then
$$\left\{\frac{\delta}{K}\left(w_0^2 + 2\sum_{i=1}^q w_i^2\right)\right\}^{-1/2}\left(RV_w(Y_{\delta/K}) - w_0\int_0^t \sigma^2_u\,du\right) \xrightarrow{L} MN\left(0,\; 2\int_0^t \sigma^4_u\,du\right). \tag{26}$$
In order for this method to be consistent for integrated variance as $q \to \infty$ we need that $w_0 = 1 + o(1)$ and $\sum_{i=1}^q w_i^2/K = O(1)$ as a function of $q$.
Example 3 The Bartlett kernel puts $w_0 = 1$ and $w_i = (q + 1 - i)/(q + 1)$. When $q = 1$ then $w_1 = 1/2$ and the constant in front of integrated quarticity is 3, while when $q = 2$ then $w_1 = 2/3$, $w_2 = 1/3$ and the constant becomes $4 + 2/9$. For moderately large $q$ this is well approximated by $4(q+1)/3$. This means that we need $q/K \to 0$ for this method to be consistent. This result appears in Barndorff-Nielsen, Hansen, Lunde, and Shephard (2004, Theorem 2).
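The realised serial covariances and the kernel estimator (25) can be sketched as follows; the function names are ours, and the Bartlett weights follow Example 3.

```python
import numpy as np

def realised_autocov(r, s):
    """gamma_hat_s: realised serial covariance of the return series r at lag s."""
    if s == 0:
        return float(np.sum(r * r))
    return float(np.sum(r[s:] * r[:-s]))

def kernel_rv(r, weights):
    """RV_w = w_0 * gamma_0 + 2 * sum_i w_i * gamma_i, as in (25)."""
    rv = weights[0] * realised_autocov(r, 0)
    for i, w in enumerate(weights[1:], start=1):
        rv += 2.0 * w * realised_autocov(r, i)
    return rv

def bartlett_weights(q):
    """w_0 = 1 and w_i = (q + 1 - i) / (q + 1), i = 1, ..., q (Example 3)."""
    return [1.0] + [(q + 1 - i) / (q + 1) for i in range(1, q + 1)]
```

On the toy return vector used in the test below, the lag-0, lag-1 and lag-2 serial covariances are 6, -3 and 2, so the $q = 1$ Bartlett kernel gives $6 + 2\cdot\frac12\cdot(-3) = 3$.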
3.8 Other measures

3.8.1 Realised range
Suppose $Y = \sigma W$, a scaled Brownian motion; then
$$E\left[\left(\sup_{0\leq s,u\leq t}\left(Y_s - Y_u\right)\right)^2\right] = \varphi_2\,\sigma^2 t, \qquad \text{where} \qquad \varphi_r = E\left[\left(\sup_{0\leq s,u\leq 1}\left(W_s - W_u\right)\right)^r\right],$$
noting that $\varphi_2 = 4\log 2$ and $\varphi_4 = 9\zeta(3)$, where $\zeta$ is the Riemann zeta function. This observation led Parkinson (1980) to provide a simple estimator of $\sigma^2$ based on the highs and lows of asset prices. See also the work of Rogers and Satchell (1991), Alizadeh, Brandt, and Diebold (2002), Ghysels, Santa-Clara, and Valkanov (2004) and Brandt and Diebold (2004). One reason for the interest in ranges is the belief that they are quite informative and somewhat robust to market frictions. The problem with this analysis is that it does not extend readily when $Y \in \mathrm{BSM}$.
In independent work, Christensen and Podolskij (2005) and Martens and van Dijk (2005) have studied the realised range process. Christensen and Podolskij (2005) define the process as
$$\backslash Y\backslash_t = \operatorname*{p-lim}_{\delta\downarrow 0}\;\sum_{j=1}^{\lfloor t/\delta\rfloor}\;\sup_{s,u\in[(j-1)\delta,\,j\delta]}\left(Y_s - Y_u\right)^2, \tag{27}$$
which is estimated by the obvious realised version, written $\backslash Y_\delta\backslash_t$. Christensen and Podolskij (2005) have proved that if $Y \in \mathrm{BSM}$, then $\varphi_2^{-1}\backslash Y\backslash_t = \int_0^t \sigma^2_u\,du$. Christensen and Podolskij (2005) also show that under rather weak conditions
$$\delta^{-1/2}\left(\varphi_2^{-1}\backslash Y_\delta\backslash_t - [Y]_t\right) \xrightarrow{L} MN\left(0,\; \frac{\varphi_4 - \varphi_2^2}{\varphi_2^2}\int_0^t \sigma^4_u\,du\right),$$
where $\varphi_0 = (\varphi_4 - \varphi_2^2)/\varphi_2^2 \simeq 0.4$. This shows that it is around five times as efficient as the usual realised QV estimator. Christensen and Podolskij (2005) suggest estimating the integrated quarticity using
$$\delta^{-1}\varphi_4^{-1}\sum_{j=1}^{\lfloor t/\delta\rfloor}\;\sup_{s,u\in[(j-1)\delta,\,j\delta]}\left(Y_s - Y_u\right)^4,$$
which means this limit theorem is feasible. Martens and van Dijk (2005) have also studied the properties of $\backslash Y_\delta\backslash_t$ using simulation and empirical work.

As far as we know no results are known about estimating $[Y]$ using ranges when there are jumps in $Y$, although it is relatively easy to see that a bipower type estimator could be defined using contiguous ranges which would robustly estimate $[Y^{\mathrm{ct}}]$.
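A block high-low version of the range idea can be sketched as follows. This is our own toy implementation of the continuous-record scaling only: with discretely sampled prices the within-block range understates the true range, so in practice the estimator is biased downwards relative to the theory above.

```python
import numpy as np

PHI2 = 4.0 * np.log(2.0)  # E[(range of standard BM on [0,1])^2] = 4 log 2

def realised_range_iv(prices, m):
    """Sum over non-overlapping blocks of m observations of squared (high - low),
    scaled by 1/phi_2 to estimate integrated variance; discrete sampling within
    blocks biases this downwards relative to the continuous-record constant."""
    total = 0.0
    for j in range(len(prices) // m):
        block = prices[j * m:(j + 1) * m]
        total += (np.max(block) - np.min(block)) ** 2
    return total / PHI2
```

On a path with two blocks of unit high-low range the function returns $2/(4\log 2)$, i.e. the Parkinson scaling applied block by block.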
3.8.2 Discrete sine transformation
Curci and Corsi (2003) have argued that before computing realised QV we should prefilter the returns using a discrete sine transformation in order to reduce the impact of market frictions. This is efficient when the observed data $X$ follow a Gaussian random walk $Y$ plus independent Gaussian noise $\varepsilon$ model, where we think of the noise as market frictions. The Curci and Corsi (2003) method is equivalent to calculating the realised QV process on the smoother $E(Y|X; \theta)$, where $\theta$ are the estimated parameters indexing the Gaussian model. This type of approach was also advocated in Zhou (1996, p. 112).
3.8.3 Fourier approach
Motivated by the problem of irregularly spaced data, where the spacing is independent of $Y$, Malliavin and Mancino (2002) showed that if $Y \in \mathrm{BSM}$ then
$$\left[Y^l_J, Y^k_J\right]_{2\pi} = \pi^2\left(\frac{1}{J}\sum_{j=1}^J\left(a^l_j a^k_j + b^l_j b^k_j\right)\right) \xrightarrow{p} \left[Y^l, Y^k\right]_{2\pi},$$
as $J \to \infty$, where the Fourier coefficients of $Y$ are
$$a^l_j = \frac{1}{\pi}\int_0^{2\pi}\cos(ju)\,dY^l_u, \qquad b^l_j = \frac{1}{\pi}\int_0^{2\pi}\sin(ju)\,dY^l_u.$$
The Fourier coefficients are computed by, for example, integration by parts
$$a^l_j = \frac{1}{\pi}\left(Y^l_{2\pi} - Y^l_0\right) + \frac{j}{\pi}\int_0^{2\pi}\sin(ju)\,Y^l_u\,du \simeq \frac{1}{\pi}\left(Y^l_{2\pi} - Y^l_0\right) + \frac{1}{\pi}\sum_{i=0}^{n-1}\left\{\cos(jt_i) - \cos(jt_{i+1})\right\}Y^l_{t_i},$$
$$b^l_j \simeq \frac{1}{\pi}\sum_{i=0}^{n-1}\left\{\sin(jt_i) - \sin(jt_{i+1})\right\}Y^l_{t_i}. \tag{28}$$
This means that, in principle, one can use all the available data for all the series, even though
prices for different assets appear at different points in time. Indeed each series has its Fourier
coefficients computed separately, only performing the multivariate aspect of the analysis at step
(28). A similar type of analysis could be based on wavelets, see Hog and Lunde (2003).
The performance of this Fourier estimator of QV is discussed by, for example, Barucci and
Reno (2002b), Barucci and Reno (2002a), Kanatani (2004b), Precup and Iori (2005), Nielsen
and Frederiksen (2005) and Kanatani (2004a) who carry out some extensive simulation and
empirical studies of the procedure. Reno (2003) has used a multivariate version of this method
to study the Epps effects, while Mancino and Reno (2005) use it to look at dynamic principle
components. Kanatani (2004a, p. 22) has shown that in the univariate case the finite J Fourier
estimator can be written as a kernel estimator (25). For regularly spaced data he derived the
weight function, noting that as J increases, so each of these weights declined and so for fixed
δ so [YJ ]2π → [Yδ ]2π . An important missing component in this analysis is any CLT for this
estimator.
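The construction in (28) can be sketched directly. The code below is our own; observation times are rescaled to $[0, 2\pi]$ and the grid can be any irregular record. As a check we use a pure unit-jump path: every harmonic then contributes $a_j^2 + b_j^2 = 1/\pi^2$, so the estimator returns the true QV of 1 for any $J$ (jumps fall outside the BSM theory, but the arithmetic illustrates the construction).

```python
import numpy as np

def fourier_coeffs(times, prices, J):
    """a_j, b_j for j = 1..J via the integration-by-parts approximation (28);
    observation times are rescaled to [0, 2*pi]."""
    u = 2.0 * np.pi * (times - times[0]) / (times[-1] - times[0])
    j = np.arange(1, J + 1)[:, None]
    a = (prices[-1] - prices[0]) / np.pi + np.sum(
        (np.cos(j * u[:-1]) - np.cos(j * u[1:])) * prices[:-1], axis=1) / np.pi
    b = np.sum((np.sin(j * u[:-1]) - np.sin(j * u[1:])) * prices[:-1], axis=1) / np.pi
    return a, b

def fourier_qv(a1, b1, a2, b2):
    """pi^2 * J^{-1} * sum(a^l a^k + b^l b^k): the Malliavin-Mancino estimator."""
    return float(np.pi ** 2 * np.sum(a1 * a2 + b1 * b2) / len(a1))
```

Because each series only enters through its own coefficients, the two price records in a covariance calculation never need to share a time grid, which is the attraction of the method noted above.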
3.8.4 Generalised bipower variation
The realised bipower variation process suggests studying generic statistics of the form introduced by Barndorff-Nielsen, Graversen, Jacod, Podolskij, and Shephard (2005) and Barndorff-Nielsen, Graversen, Jacod, and Shephard (2005)
$$Y_\delta(g, h)_t = \delta\sum_{j=1}^{\lfloor t/\delta\rfloor} g\left(\delta^{-1/2}\left(Y_{\delta(j-1)} - Y_{\delta(j-2)}\right)\right)h\left(\delta^{-1/2}\left(Y_{\delta j} - Y_{\delta(j-1)}\right)\right), \tag{29}$$
where the multivariate $Y \in \mathrm{BSM}$ and $g$, $h$ are conformable matrices with elements which are continuous with at most polynomial growth in their arguments. Both QV and multivariate BPV can be cast in this form by the appropriate choice of $g$, $h$. Some of the choices of $g$, $h$ will deliver statistics which will be robust to jumps.

Barndorff-Nielsen, Graversen, Jacod, Podolskij, and Shephard (2005) have shown that as $\delta \downarrow 0$ the probability limit of this process is always the generalised BPV process
$$\int_0^t \rho_{\sigma_u}(g)\,\rho_{\sigma_u}(h)\,du,$$
where the convergence is locally uniform, $\rho_\sigma(g) = E\,g(X)$ and $X \sim N(0, \sigma\sigma')$. They also provide a central limit theorem for the generalised power variation estimator.
An example of the above framework which we have not covered yet is achieved by selecting $g(y) = |y^l|^r$ for $r > 0$ and $h(y) = 1$; then (29) becomes
$$\delta^{1-r/2}\sum_{j=1}^{\lfloor t/\delta\rfloor}\left|Y^l_{\delta(j-1)} - Y^l_{\delta(j-2)}\right|^r, \tag{30}$$
which is called the realised r-th order power variation. When r is an integer it has been studied
from a probabilistic viewpoint by Jacod (1994) while Barndorff-Nielsen and Shephard (2003)
look at the econometrics of the case where r > 0. The increments of these types of high frequency volatility measures have been informally used in the financial econometrics literature for
some time when r = 1, but until recently without a strong understanding of their properties.
Examples of their use include Schwert (1990), Andersen and Bollerslev (1998b) and Andersen and Bollerslev (1997), while they have also been abstractly discussed by Shiryaev (1999, pp.
349–350) and Maheswaran and Sims (1993). Following the work by Barndorff-Nielsen and Shephard (2003), Ghysels, Santa-Clara, and Valkanov (2004) and Forsberg and Ghysels (2004) have
successfully used realised power variation as an input into volatility forecasting competitions.
It is unclear how the greater flexibility over the choice of g, h will help econometricians in the
future to learn about new features of volatility and jumps, perhaps robustly to market frictions.
It would also be attractive if one could generalise (29) to allow g and h to be functions of the
path of the prices, not just returns.
3.9 Non-parametric forecasting

3.9.1 Background
We saw in section 2.4 that if s is small then
Cov (Yt+s − Yt |Ft ) ' E ([Y ]t+s − [Y ]t |Ft ) .
This suggests:
1. estimating components of the increments of QV;
2. projecting these terms forward using a time series model.
This separates out the task of historical measurement of past volatility (step 1) from the
problem of forecasting (step 2).
Suppose we wish to make a sequence of one-step or multi-step ahead predictions of $V_i = [Y]_{hi} - [Y]_{h(i-1)}$ using their proxies $\widehat V_i = [Y_\delta]_{hi} - [Y_\delta]_{h(i-1)}$, raw returns $y_i = Y_{hi} - Y_{h(i-1)}$ (to try to deal with leverage effects) and components $\widehat B_i = \{Y_\delta\}_{hi} - \{Y_\delta\}_{h(i-1)}$, where $i = 1, 2, \ldots, T$. For simplicity of exposition we set $h = 1$. This setup exploits the high frequency information set, but is somewhat robust to the presence of complicated intraday effects. Clearly if $Y \in \mathrm{BSM}$ then the CLT for realised QV implies that as $\delta \downarrow 0$, so long as the moments exist,
$$E(V_i \mid \mathcal{F}_{i-1}) \simeq E\left(\widehat V_i \mid \mathcal{F}_{i-1}\right) + o(\delta^{1/2}).$$
It is compelling to choose to use the coarser information set, so
$$\begin{aligned} \mathrm{Cov}&\left(Y_i - Y_{i-1} \mid \widehat V_{i-1}, \widehat V_{i-2}, \ldots, \widehat V_1, \widehat B_{i-1}, \widehat B_{i-2}, \ldots, \widehat B_1, y_{i-1}, \ldots, y_1\right) \\ &\simeq E\left(V_i \mid \widehat V_{i-1}, \widehat V_{i-2}, \ldots, \widehat V_1, \widehat B_{i-1}, \widehat B_{i-2}, \ldots, \widehat B_1, y_{i-1}, \ldots, y_1\right) \\ &\simeq E\left(\widehat V_i \mid \widehat V_{i-1}, \widehat V_{i-2}, \ldots, \widehat V_1, \widehat B_{i-1}, \widehat B_{i-2}, \ldots, \widehat B_1, y_{i-1}, \ldots, y_1\right). \end{aligned}$$
Forecasting can be carried out using structural or reduced form models. The simplest reduced form approach is to forecast $\widehat V_i$ using the past history $\widehat V_{i-1}, \widehat V_{i-2}, \ldots, \widehat V_1$, $y_{i-1}, y_{i-2}, \ldots, y_1$ and $\widehat B_{i-1}, \widehat B_{i-2}, \ldots, \widehat B_1$ based on standard forecasting methods such as autoregressions. The earliest modelling of this type that we know of was carried out by Rosenberg (1972), who regressed $\widehat V_i$ on $\widehat V_{i-1}$ to show, for the first time in the academic literature, that volatility was partially forecastable.
This approach to forecasting is convenient but potentially inefficient, for it fails to use all the available high frequency data. In particular, if $Y \in \mathrm{SV}$ then accurately modelled high frequency data may allow us to accurately estimate the spot covariance $\Sigma_{(i-1)h}$, which would be a more informative indicator than $\widehat V_{i-1}$. However, the results in Andersen, Bollerslev, and Meddahi (2004) are reassuring on that front. They indicate that if $Y \in \mathrm{SV}$ there is only a small loss in efficiency from forgoing $\Sigma_{(i-1)h}$ and using $\widehat V_{i-1}$ instead. Further, Ghysels, Santa-Clara, and Valkanov (2004) and Forsberg and Ghysels (2004) have forcefully argued that by additionally conditioning on low power variation statistics (30) very significant forecast gains can be achieved.
3.9.2 Univariate illustration
In this subsection we will briefly illustrate some of these suggestions in the univariate case.
Much more sophisticated studies are given in, for example, Andersen, Bollerslev, Diebold, and
Labys (2001), Andersen, Bollerslev, Diebold, and Ebens (2001), Andersen, Bollerslev, Diebold,
and Labys (2003), Bollerslev, Kretschmer, Pigorsch, and Tauchen (2005) and Andersen, Bollerslev, and Meddahi (2004), who look at various functional forms, differing asset types and more
involved dynamics. Ghysels, Santa-Clara, and Valkanov (2004) suggest an alternative method,
using high frequency data but exploiting more sophisticated dynamics through so-called MIDAS
regressions.
Table 3 gives a simple example of this approach for 100 times the returns on the DM/Dollar series. It shows the result of regressing $\widehat V_i$ on a constant and simple lagged versions of $\widehat V_i$ and $\widehat B_i$. We dropped a priori the use of $y_i$ as a regressor for this exchange rate, where leverage effects are usually not thought to be important. The unusual spacing, using 1, 5, 20 and 40 lags, mimics the approach used by Corsi (2003) and Andersen, Bollerslev, and Diebold (2003). The results
                       Realised QV terms                              Realised BPV terms                  Summary measures
 Const    V̂_{i−1}  V̂_{i−5}  V̂_{i−20}  V̂_{i−40}   B̂_{i−1}  B̂_{i−5}  B̂_{i−20}  B̂_{i−40}   log L      Port_49
 0.503                                                                                                     -1751.42   4660
(0.010)
 0.170     0.413    0.153     0.061     0.030                                                              -1393.41    199
(0.016)   (0.018)  (0.018)   (0.018)   (0.017)
 0.139    -0.137   -0.076    -0.017     0.116       0.713    0.270     0.091    -0.110                     -1336.81    108
(0.017)   (0.059)  (0.059)   (0.058)   (0.058)     (0.075)  (0.074)   (0.074)   (0.073)
 0.139                                              0.551    0.180     0.071     0.027                     -1342.03    122
(0.017)                                            (0.023)  (0.023)   (0.022)   (0.021)

Table 3: Prediction for 100 times returns on the DM/Dollar series. Dynamic regression, predicting future daily RV $\widehat V_i$ using its own lagged values and lagged values of the estimated realised BPV terms $\widehat B_i$. Software used was PcGive. Subscripts denote the lag length in this table. Everything is computed using 10 minute returns. Figures in brackets are asymptotic standard errors. Port_49 denotes the Box-Ljung portmanteau statistic computed with 49 lags, while log-L denotes the Gaussian likelihood.
are quite striking. None of the models have satisfactory Box-Ljung portmanteau tests (this can
be fixed by including a moving average error term in the model), but the inclusion of lagged
information is massively significant. The lagged realised volatilities seem to do a reasonable job
at soaking up the dependence in the data, but the effect of bipower variation is more important.
This is in line with the results in Andersen, Bollerslev, and Diebold (2003) who first noted this
effect. See also the work of Forsberg and Ghysels (2004) on the effect of inclusion of other power
variation statistics in forecasting.
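To make the construction of such regressions concrete, here is a minimal sketch (not part of the original study): it builds Corsi-style heterogeneous regressors from averages of a realised variance series over 1, 5, 20 and 40 days and fits them by OLS. The simulated series and all numerical settings are illustrative assumptions, not the DM/Dollar data.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated stand-in for a daily realised-variance series: a persistent
# AR(1) in logs.  The DM/Dollar data themselves are not reproduced here.
T = 2000
logv = np.zeros(T)
for t in range(1, T):
    logv[t] = 0.05 + 0.9 * logv[t - 1] + 0.3 * rng.standard_normal()
v = np.exp(logv)

def roll_mean(x, k):
    """Mean of x over windows of k consecutive days."""
    c = np.cumsum(np.concatenate(([0.0], x)))
    return (c[k:] - c[:-k]) / k

# Heterogeneous-autoregression regressors in the spirit of Corsi (2003):
# averages of realised variance over the previous 1, 5, 20 and 40 days,
# each lagged one day relative to the regressand.
lags = [1, 5, 20, 40]
m = max(lags)
X = np.column_stack(
    [np.ones(T - m)] + [roll_mean(v, k)[m - k : -1] for k in lags]
)
beta, *_ = np.linalg.lstsq(X, v[m:], rcond=None)   # OLS fit
```

Whether the table's lagged terms are raw lags or such rolling averages is an assumption here; the point is only the mechanics of stacking multi-horizon regressors.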
Table 4 shows some rather more sophisticated results. Here we model returns directly using
a GARCH type model, but also include lagged explanatory variables in the conditional variance.
This is in the spirit of the work of Engle and Gallo (2005). The results above the line show the
homoskedastic fit and the improvement resulting from the standard GARCH(1,1) model. Below
the line we include a variety of realised variables as explanatory variables; including longer lags
of realised variables does not improve the fit. The best combination has a large coefficient on
realised BPV and a negative coefficient on realised QV. This means when there is evidence for
a jump then the impact of realised volatility is tempered, while when there is no sign of jump
the realised variables are seen with full force. What is interesting from these results is that
the realised effects are very much more important than the lagged daily returns. In effect the
realised quantities have basically tested out the traditional GARCH model.
Overall this tiny empirical study confirms the results in the literature on the predictability
of realised volatility, and shows that it is quite easy to outperform a simple autoregressive model for
RV. We can see how useful bipower variation is and that, taken together, the realised quantities
do provide a coherent way of empirically forecasting future volatility.
                     Realised terms            Standard GARCH terms
       Const      V̂_{i-1}   B̂_{i-1}    (Y_{i-1} - Y_{i-2})²   h_{i-1}       log L
(1)    0.504                                                                -2636.59
      (0.021)
(2)    0.008                                   0.053            0.930       -2552.10
      (0.003)                                 (0.010)          (0.013)
(3)    0.017     -0.115      0.253             0.019            0.842       -2533.89
      (0.009)    (0.039)    (0.076)           (0.019)          (0.052)
(4)    0.011      0.085                        0.015            0.876       -2537.49
      (0.008)    (0.042)                      (0.017)          (0.049)
(5)    0.014                 0.120             0.013            0.853       -2535.10
      (0.009)               (0.058)           (0.019)          (0.055)
(6)    0.019     -0.104      0.282                              0.822       -2534.89
      (0.010)    (0.074)    (0.116)                            (0.062)
Table 4: Prediction for 100 times returns $Y_i - Y_{i-1}$ on the DM/Dollar series. GARCH type model
of the conditional variance $h_i$ of daily returns, using lagged squared returns $(Y_{i-1} - Y_{i-2})^2$, realised QV $\widehat{V}_{i-1}$, realised BPV $\widehat{B}_{i-1}$ and the lagged conditional variance $h_{i-1}$. Throughout a Gaussian quasi-likelihood is used. Robust standard errors are reported. Carried out using PcGive.
3.9.3 Multivariate illustration
TO BE ADDED
3.10 Parametric inference and forecasting
Throughout we have emphasised the non-parametric nature of the analysis. This is helpful due
to the strong and complicated diurnal patterns we see in volatility. These effects tend also
to be unstable through time and so are difficult to model parametrically. A literature which
mostly avoids this problem is that on estimating parametric SV models from low frequency data.
Much of this is reviewed in Shephard (2005, Ch. 1). Examples include the use of Markov chain
Monte Carlo methods (e.g. Kim, Shephard, and Chib (1998)) and efficient method of moments
(e.g. Chernov, Gallant, Ghysels, and Tauchen (2003)). Both approaches are computationally
intensive and intricate to code. Simpler method of moment procedures (e.g. Andersen and
Sørensen (1996)) have the difficulty that they are sensitive to the choice of moments and can be
rather inefficient.
Recently various researchers have used the time series of realised daily QV to estimate parametric SV models. These models ignore the intraday effects and so are theoretically misspecified.
Typically the researchers use various simple types of method of moments estimators, relying on
the great increase in information available from realised statistics to overcome the inefficiency
caused by the use of relatively crude statistical methods. The first papers to do this were
Barndorff-Nielsen and Shephard (2002) and Bollerslev and Zhou (2002), who studied the first
two dynamic moments of the time series $\widehat{V}_1, \widehat{V}_2, \ldots, \widehat{V}_T$ implied by various common volatility
models and used these to estimate the parameters embedded within the SV models. More sophisticated approaches have been developed by Corradi and Distaso (2004) and Phillips and Yu
(2005). Barndorff-Nielsen and Shephard (2002) also studied the use of these second order properties of the realised quantities to estimate $V_1, V_2, \ldots, V_T$ from the time series of $\widehat{V}_1, \widehat{V}_2, \ldots, \widehat{V}_T$
using the Kalman filter. This exploited the asymptotic theory for the measurement error (17).
See also the work of Meddahi (2002), Andersen, Bollerslev, and Meddahi (2004) and Andersen,
Bollerslev, and Meddahi (2005).
3.11 Forecast evaluation
One of the main early uses of realised volatility was to provide an instrument for measuring the
success of various volatility forecasting methods. Andersen and Bollerslev (1998a) studied the
correlation between $V_i$ or $\widehat{V}_i$ and $h_i$, the conditional variance from a GARCH model based on
daily returns from time 1 up to time i − 1. They used these results to argue that GARCH
models were more successful than had been previously understood in the empirical finance
literature. Hansen and Lunde (2005b) study a similar type of problem, but look at a wider class
of forecasting models and carry out formal testing of the superiority of one modelling approach
over another.
Hansen and Lunde (2005a) and Patton (2005) have focused on the delicate implications of
the use of different loss functions to discriminate between competing forecasting models, where
the object of the forecasting is $\mathrm{Cov}(Y_i - Y_{i-1} \mid \mathcal{F}_{i-1})$. They use $\widehat{V}_i$ to proxy this unobserved
covariance. See also the related work of Koopman, Jungbacker, and Hol (2005).
4 Jumps
4.1 Bipower variation
In this short section we will review some material which non-parametrically identifies the contribution of jumps to the variation of asset prices. A focus will be on using this method for testing
for jumps from discrete data. We will also discuss some work by Cecilia Mancini which provides
an alternative to BPV for splitting up QV into its continuous and discontinuous components.
Recall that $\mu_1^{-2}\{Y\}_t = \int_0^t \Sigma_u \,\mathrm{d}u$ when $Y$ is a BSM plus jump process given in (7). The BPV
process is consistently estimated by the $p \times p$ matrix realised BPV process $\{Y_\delta\}$, defined in
(10). This means that we can, in theory, consistently estimate $[Y^{c}]$ and $[Y^{d}]$ by $\mu_1^{-2}\{Y_\delta\}$ and
$[Y_\delta] - \mu_1^{-2}\{Y_\delta\}$, respectively.
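A minimal simulation sketch of this decomposition (univariate, constant volatility, one jump; all settings illustrative and not from the paper): realised BPV scaled by $\mu_1^{-2}$ recovers the continuous part of QV, while RV minus scaled BPV recovers the squared jump.

```python
import numpy as np

rng = np.random.default_rng(0)
mu1 = np.sqrt(2.0 / np.pi)        # mu_1 = E|N(0,1)|

n, T = 100_000, 1.0               # fine sampling over [0, T]
delta = T / n
sigma = 0.2                       # constant spot vol, so [Y^c]_T = 0.04

# Brownian semimartingale increments plus a single jump of size 0.5
r = sigma * np.sqrt(delta) * rng.standard_normal(n)
r[n // 2] += 0.5

rv = np.sum(r**2)                              # realised QV  [Y_delta]_T
bpv = np.sum(np.abs(r[1:]) * np.abs(r[:-1]))   # realised BPV {Y_delta}_T

cont_part = bpv / mu1**2          # estimates [Y^c]_T = sigma^2 T = 0.04
jump_part = rv - bpv / mu1**2     # estimates the squared jump, 0.25
```

Although the jump contributes 0.25 to realised QV, the scaled BPV stays close to 0.04, since the jump enters BPV only through two cross products with neighbouring $O(\sqrt{\delta})$ returns.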
One potential use of $\{Y_\delta\}$ is to test the hypothesis that a set of data is consistent with
a null hypothesis of continuous sample paths. We can do this by asking if $[Y_\delta]_t - \mu_1^{-2}\{Y_\delta\}_t$ is
statistically significantly bigger than zero — an approach introduced by Barndorff-Nielsen and
Shephard (2006). This demands a distribution theory for realised BPV objects, calculated under
the null that $Y \in BSM$ with $\sigma > 0$.
Building on the earlier CLT of Barndorff-Nielsen and Shephard (2006), Barndorff-Nielsen,
Graversen, Jacod, Podolskij, and Shephard (2005) have established a CLT which covers this
situation when $Y \in BSM$. We will only present the univariate result, which states that as $\delta \downarrow 0$
so
$$\delta^{-1/2}\left(\{Y_\delta\}_t - \{Y\}_t\right) \to \mu_1^2 \sqrt{2+\vartheta}\int_0^t \sigma_u^2\,\mathrm{d}B_u, \qquad (31)$$
where $B \perp\!\!\!\perp Y$, the convergence is in law stable as a process, and
$$\vartheta = \pi^2/4 + \pi - 5 \simeq 0.6090.$$
This result, unlike Theorem 1, has some quite technical conditions associated with it in order to
control the degree to which the volatility process can jump; however we will not discuss those
issues here. Extending the result to cover the joint distribution of the estimators of the QV and
the BPV processes, they showed that
$$\delta^{-1/2}\begin{pmatrix} \mu_1^{-2}\{Y_\delta\}_t - \mu_1^{-2}\{Y\}_t \\ [Y_\delta]_t - [Y]_t \end{pmatrix} \stackrel{L}{\to} MN\left(0,\; \begin{pmatrix} 2+\vartheta & 2 \\ 2 & 2 \end{pmatrix} \int_0^t \sigma_u^4\,\mathrm{d}u \right),$$
a Hausman (1978) type result as the estimator of the QV process is, of course, fully asymptotically efficient when Y ∈ BSM. Consequently
$$\frac{\delta^{-1/2}\left([Y_\delta]_t - \mu_1^{-2}\{Y_\delta\}_t\right)}{\sqrt{\vartheta \int_0^t \sigma_u^4\,\mathrm{d}u}} \stackrel{L}{\to} N(0,1), \qquad (32)$$
which can be used as the basis of a test of the null of no jumps.
4.2 Multipower variation
The “standard” estimator of integrated quarticity, given in (18), is not robust to jumps. One
way of overcoming this problem is to use a multipower variation (MPV) measure — introduced
by Barndorff-Nielsen and Shephard (2006). This is defined as
$$\{Y\}^{[r]}_t = \operatorname*{p\text{-}lim}_{\delta \downarrow 0}\; \delta^{\,1-r_+/2} \sum_{j=1}^{\lfloor t/\delta \rfloor} \prod_{i=1}^{I} \left|Y_{\delta(j-i)} - Y_{\delta(j-1-i)}\right|^{r_i},$$
where $r_i > 0$ for all $i$, $r = (r_1, r_2, \ldots, r_I)'$ and $r_+ = \sum_{i=1}^{I} r_i$. The usual BPV process is the special
case $\{Y\}_t = \{Y\}^{[1,1]}_t$.
If $Y$ obeys (7) and each $r_i < 2$ then
$$\{Y\}^{[r]}_t = \left(\prod_{i=1}^{I} \mu_{r_i}\right) \int_0^t \sigma_u^{r_+}\,\mathrm{d}u,$$
This process is approximated by the estimated MPV process
$$\{Y_\delta\}^{[r]}_t = \delta^{\,1-r_+/2} \sum_{j=1}^{\lfloor t/\delta \rfloor} \prod_{i=1}^{I} \left|Y_{\delta(j-i)} - Y_{\delta(j-1-i)}\right|^{r_i}.$$
In particular the scaled realised quadpower and tripower variation,
$$\mu_1^{-4}\{Y_\delta\}^{[1,1,1,1]}_t \quad \text{and} \quad \mu_{4/3}^{-3}\{Y_\delta\}^{[4/3,4/3,4/3]}_t,$$
respectively, both estimate $\int_0^t \sigma_u^4\,\mathrm{d}u$ consistently in the presence of jumps. Hence either of these
objects can be used to replace the integrated quarticity in (32), so producing a non-parametric
test for the presence of jumps in the interval $[0, t]$. The test is conditionally consistent, meaning
that if there is a jump it will be detected, and it has asymptotically the correct size. Extensive small
sample studies are reported in Huang and Tauchen (2005), who favour ratio versions of the
statistic like
$$\delta^{-1/2}\left(\frac{\mu_1^{-2}\{Y_\delta\}_t}{[Y_\delta]_t} - 1\right)\Bigg/ \sqrt{\vartheta\, \frac{\{Y_\delta\}^{[1,1,1,1]}_t}{\left(\{Y_\delta\}_t\right)^2}} \;\stackrel{L}{\to}\; N(0,1),$$
which has pretty reasonable finite sample properties. They also show that this test tends to
under-reject the null of no jumps in the presence of some forms of market frictions.
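A sketch of the ratio statistic on simulated returns (constant volatility, with quadpower variation used for the quarticity; this is an illustration under assumed settings, not Huang and Tauchen's code):

```python
import numpy as np

mu1 = np.sqrt(2.0 / np.pi)                 # E|N(0,1)|
theta = np.pi**2 / 4 + np.pi - 5           # ~ 0.6090

def ratio_jump_stat(r, delta):
    """Ratio jump statistic: asymptotically N(0,1) under the no-jump null;
    large negative values indicate jumps."""
    rv = np.sum(r**2)                                          # [Y_delta]
    bpv = np.sum(np.abs(r[1:]) * np.abs(r[:-1]))               # {Y_delta}
    qpv = np.sum(np.abs(r[3:]) * np.abs(r[2:-1])               # quadpower
                 * np.abs(r[1:-2]) * np.abs(r[:-3])) / delta
    return (delta**-0.5 * (bpv / (mu1**2 * rv) - 1.0)
            / np.sqrt(theta * qpv / bpv**2))

# Demo on simulated Brownian-semimartingale returns with sigma = 0.2
rng = np.random.default_rng(2)
n, delta = 10_000, 1.0 / 10_000
r = 0.2 * np.sqrt(delta) * rng.standard_normal(n)
s_null = ratio_jump_stat(r, delta)         # roughly N(0,1) draw
r_jump = r.copy()
r_jump[n // 2] += 0.5                      # add a single large jump
s_jump = ratio_jump_stat(r_jump, delta)    # far in the left tail
```

The test is one sided: jumps inflate RV relative to BPV, pushing the statistic below the lower critical value.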
It is clearly possible to carry out jump testing on separate days or weeks. Such tests are
asymptotically independent over these non-overlapping periods under the null hypothesis.
To illustrate this methodology we will apply the jump test to the DM/Dollar rate, asking
if the hypothesis of a continuous sample path is consistent with the data we have. Our focus
will mostly be on Friday January 15th 1988, although we will also give results for neighbouring
days to provide some context. In Figure 5 we plot 100 times the change during the week of the
discretised $Y_\delta$, so a one unit uptick represents a 1% change, for a variety of values of $n = 1/\delta$,
as well as giving the ratio jump statistics $\mu_1^{-2}\widehat{B}_i/\widehat{V}_i$ with their corresponding 99% critical values.
In Figure 5 there is a large uptick in the D-mark against the Dollar, with a movement of
nearly two percent in a five minute period. This occurred on the Friday and was a response to
the news of a large fall in the U.S. balance of payment deficit, which led to a large strengthening
of the Dollar. The data for January 15th had a large $\widehat{V}_i$ but a much smaller $\widehat{B}_i$. Hence the
statistics are attributing a large component of $\widehat{V}_i$ to the jump, with the adjusted ratio statistic
being larger than the corresponding 99% critical value. When δ is large the statistic is on the
borderline of being significant, while the situation becomes much clearer as δ becomes small.
This illustration is typical of results presented in Barndorff-Nielsen and Shephard (2006) which
showed that many of the large jumps in this exchange rate correspond to macroeconomic news
[Figure 5 appears here: four panels. Left hand panels: change in $Y_\delta$ during the week, using δ = 20 minutes (top) and δ = 5 minutes (bottom). Right hand panels: the ratio jump statistic $\mu_1^{-2}\widehat{B}_i/\widehat{V}_i$ together with its 99% critical values.]
Figure 5: Left hand side: change in $Y_\delta$ during a week, centred at 0 on Monday 11th January
and running until Friday of that week. Drawn every 20 and 5 minutes. An uptick of 1 indicates
strengthening of the Dollar by 1%. Right hand side shows an index plot of $\mu_1^{-2}\widehat{B}_i/\widehat{V}_i$, which should
be around 1 if there are no jumps. The test is one sided, with critical values also drawn as a line.
announcements. This is consistent with the recent economics literature documenting significant
intraday announcement effects, e.g. Andersen, Bollerslev, Diebold, and Vega (2003).
4.3 Grids
It is clear that the martingale based CLT for irregularly spaced data for the estimator of the
QV process can be extended to cover the BPV case. We define
$$\{Y_{G_n}\}_t = \sum_{j:\, t_j \le t} \left|Y_{t_{j-1}} - Y_{t_{j-2}}\right|\left|Y_{t_j} - Y_{t_{j-1}}\right| \stackrel{p}{\to} \{Y\}_t.$$
Using the same notation as before, we would expect the following result to hold, due to the fact
that H G is assumed to be continuous,
$$\delta^{-1/2}\begin{pmatrix} \mu_1^{-2}\{Y_{G_n}\}_t - \mu_1^{-2}\{Y\}_t \\ [Y_{G_n}]_t - [Y]_t \end{pmatrix} \stackrel{L}{\to} MN\left(0,\; \begin{pmatrix} 2+\vartheta & 2 \\ 2 & 2 \end{pmatrix} \int_0^t \frac{\partial H^G_u}{\partial u}\, \sigma_u^4\,\mathrm{d}u \right).$$
The integrated quarticity can be estimated using $\mu_1^{-4}\{Y_\delta\}^{[1,1,1,1]}_t$, or a grid version,
which again implies that the usual feasible CLT continues to hold for irregularly spaced data.
This is the expected result from the analysis of power variation provided by Barndorff-Nielsen
and Shephard (2005c).
Potentially there are modest efficiency gains to be had by computing the estimators of BPV
on multiple grids and then averaging them. The extension along these lines is straightforward
and will not be detailed here.
4.4 Infinite activity jumps
The probability limit of realised BPV is robust to finite activity jumps. Two natural questions to
ask are: (i) is the CLT also robust to jumps? (ii) is the probability limit also unaffected by infinite
activity jumps, that is, jump processes with an infinite number of jumps in any finite period of
time? Both issues are studied by Barndorff-Nielsen, Shephard, and Winkel (2004) in the case
where the jumps are of Lévy type, while Woerner (2004) looks at the probability limit for more
general jump processes.
Barndorff-Nielsen, Shephard, and Winkel (2004) find that the CLT for BPV is affected by
finite activity jumps, but this is not true of tripower and higher order measures of variation. The
reason for the robustness of the tripower results is quite technical and we will not discuss it here.
However, it potentially means that inference under the assumption of jumps can be carried out
using tripower variation, which seems an exciting possibility. Both Barndorff-Nielsen, Shephard,
and Winkel (2004) and Woerner (2004) give results which prove that the probability limit of
realised BPV is unaffected by some types of infinite activity jump processes. More work is
needed on this topic to make these results definitive. It is somewhat related to the parametric
study of Aït-Sahalia (2004). He shows that maximum likelihood estimation can disentangle a
homoskedastic diffusive component from a purely discontinuous infinite activity Lévy component
of prices. Outside the likelihood framework, the paper also studies the optimal combinations
of moment functions for the generalized method of moments estimation of homoskedastic jump-diffusions. Further insights can be found by looking at likelihood inference for Lévy processes,
which is studied by Aı̈t-Sahalia and Jacod (2005a) and Aı̈t-Sahalia and Jacod (2005b).
4.5 Testing the null of no continuous component
In some stimulating recent papers, Carr, Geman, Madan, and Yor (2003) and Carr and Wu
(2004) have argued that it is attractive to build SV models out of pure jump processes, with
no Brownian aspect. This is somewhat related to the material we discuss in section 5.6. It is
clearly important to be able to test this hypothesis, seeing if pure discreteness is consistent with
35
observed prices.
Barndorff-Nielsen, Shephard, and Winkel (2004) showed that
$$\delta^{-1/2}\left(\mu_{2/3}^{-3}\{Y_\delta\}^{[2/3,2/3,2/3]}_t - [Y^{c}]_t\right)$$
has a mixed Gaussian limit and is robust to jumps. But this result is only valid if $\sigma > 0$, which
rules out its use for testing for pure discreteness. However, we can artificially add a scaled
Brownian motion, $U = \sigma B$, to the observed price process and then test if
$$\delta^{-1/2}\left(\mu_{2/3}^{-3}\{Y_\delta + U_\delta\}^{[2/3,2/3,2/3]}_t - \sigma^2 t\right)$$
is statistically significantly greater than zero. In principle this would be a consistent non-parametric test of the maintained hypothesis of Peter Carr and his coauthors.
4.6 Alternative methods for identifying jumps
Mancini (2001), Mancini (2004) and Mancini (2003) has developed robust estimators of $[Y^{c}]_t$
in the presence of finite activity jumps. Her approach is to use the truncation
$$\sum_{j=1}^{\lfloor t/\delta \rfloor} \left(Y_{j\delta} - Y_{(j-1)\delta}\right)^2 I\!\left(\left|Y_{j\delta} - Y_{(j-1)\delta}\right| < r_\delta\right), \qquad (33)$$
where $I(\cdot)$ is an indicator function. The crucial function $r_\delta$ has to have the property that
$\sqrt{\delta \log \delta^{-1}}\, r_\delta^{-1} \downarrow 0$. It is motivated by the modulus of continuity of Brownian motion paths: almost surely
$$\limsup_{\delta \downarrow 0}\; \sup_{\substack{0 \le s,t \le T \\ |t-s| < \delta}} \frac{|W_s - W_t|}{\sqrt{2\delta \log \delta^{-1}}} = 1.$$
This is an elegant theory, which works when Y ∈ BSM. It is not prescriptive about the tuning
function rδ , which is an advantage and a drawback. Given the threshold in (33) is universal,
this method will throw out more returns as jumps during a high volatility period than during a
low volatility period.
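A minimal sketch of the truncation estimator (33) on simulated data. The choice $r_\delta = \delta^{0.49}$ is one admissible tuning (any $\delta^{\omega}$ with $0 < \omega < 1/2$ satisfies the rate condition); all other settings are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)

n, T = 10_000, 1.0
delta = T / n
sigma = 0.2                                  # so [Y^c]_T = 0.04

r = sigma * np.sqrt(delta) * rng.standard_normal(n)
jump_times = rng.choice(n, size=5, replace=False)
r[jump_times] += 0.3                         # five finite-activity jumps

# Threshold r_delta must dominate sqrt(2 delta log(1/delta)), the modulus
# of continuity of Brownian paths; delta^0.49 does so as delta -> 0.
r_delta = delta**0.49

trunc = np.sum(r**2 * (np.abs(r) < r_delta))   # truncated estimator (33)
rv = np.sum(r**2)                              # plain realised QV
```

Here the plain RV absorbs the five squared jumps while the truncated sum stays near the integrated variance 0.04.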
Aït-Sahalia and Jacod (2005b, Section 7 onwards) provide additional insights into these
types of truncation estimators in the case where Y is scaled Brownian motion plus a homogeneous
pure jump process. They develop a two-step procedure, which automatically selects the level of
truncation. Their analysis is broader still, providing additional insights into a range of power
variation type objects.
5 Mitigating market frictions
5.1 Background
The semimartingale model of the frictionless, arbitrage free market is a fiction. When we use
high frequency data to perform inference on either transaction or quote data then various market
36
frictions can become important. O’Hara (1995), Engle (2000), Hasbrouck (2003) and Engle and
Russell (2005) review the detailed modelling of these effects. Inevitably such modelling is quite
complicated.
With the exception of subsection 2.10, we have so far mostly ignored frictions by thinking of
δ as being only moderately small. This is ad hoc and it is wise to try to more formally identify
the impact of frictions. In this context the first econometric work was carried out by Fang
(1996) and Andersen, Bollerslev, Diebold, and Labys (2000), who used so-called signature plots
to assess the degree of bias caused by frictions using a variety of values of δ. The signature
plots we draw show the square root of the time series average of estimators of $V_i$ computed
over many days, plotting this against δ. If the log-price process was a pure martingale then we
would expect the plot to have roughly horizontal lines. Figure 6 shows the signature plot for the
Vodafone series, discussed in section 2.10, calculated over 20 days in January 2004. Included in
this plot is the standard deviation of daily returns $Y_i - Y_{i-1}$, which is around 0.01.
The signature plot for realised volatility for Vodafone reinforces the results from section
2.10. The daily RV is most reliable in the case of interpolated mid-quote data, where the bias
is rather moderate even with δ well below 5 minutes.13 Much larger biases appear when we
base the statistics on transaction data, with raw transaction data being particularly vulnerable.
The Figure also reports corresponding results for estimators we will discuss in this section: the
kernel, two scale and alternation estimators. Potentially these statistics may be less influenced
by frictions and have the potential to exhibit less bias.
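The construction behind such signature plots can be sketched as follows, using simulated prices contaminated by white noise of the type formalised as F1W below, rather than the Vodafone data; all settings are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)

# Efficient log-price on a 2-second grid over a 6.5 hour day, plus
# i.i.d. measurement noise; all numbers illustrative.
n = 11_700
iv = 0.01**2                     # daily integrated variance (0.01 daily vol)
y = np.cumsum(np.sqrt(iv / n) * rng.standard_normal(n))
x = y + 0.0005 * rng.standard_normal(n)

def rv_at(p, step):
    """Realised variance from every `step`-th observation."""
    r = np.diff(p[::step])
    return np.sum(r**2)

# Signature plot: RV against the sampling interval delta.  The friction
# bias grows as delta shrinks, so the curve rises sharply on the left.
steps = {"2s": 1, "1min": 30, "5min": 150, "20min": 600}
sig = {k: rv_at(x, s) for k, s in steps.items()}
```

Averaging `sig` over many days and taking square roots gives the plotted curves; a flat curve near the daily benchmark indicates mild frictions.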
In order to deepen our understanding we will characterise different types of frictions: (i)
liquidity effects, (ii) bid/ask bounce, (iii) discreteness, (iv) Epps effects. We will not discuss
other important frictions such as market closures. We will review some of the literature which
has tried to mitigate the effects of market frictions on the estimation of the QV of the efficient
semimartingale price. This is a very active area of research and some of the answers are less
clear cut than in previous sections as there is less agreement on the way to model the frictions.
5.2 Four statistical models of frictions
The economics of market frictions involve a large number of traders competing against one
another to maximise their expected utility. The econometrics literature on the effect of frictions
on realised quantities almost entirely ignores the details of this economics and employs reduced
form statistical models of the frictions. This is rather unsatisfactory but is typical of how
13
The reasonably flat signature plot for interpolated mid-quote data masks at least two offsetting effects: (i) an
upward bias of RV for small δ that we will discuss in a moment, (ii) a downward bias caused by the interpolation
which induces positive correlation amongst returns. In other datasets this offsetting may well not work. A more
thorough empirical study is provided by Hansen and Lunde (2006).
[Figure 6 appears here: four signature-plot panels — interpolated and raw mid-quote returns (top), interpolated and raw transaction returns (bottom) — each showing the RVol, kernel, two scale and (for mid-quotes) alternation estimators, together with the daily benchmark.]
Figure 6: Signature plot for various estimators for the Vodafone stock during January 2004.
Shows the square root of the long term average of estimators of $V_i$ based on data calculated over
intervals of length δ recorded in minutes. Estimators are squared daily returns (which do not
depend on δ), realised variance and the two scale, kernel and alternation (again does not depend upon δ)
estimators. Note that the 2 scale estimator could only be computed for δ ≥ 0.1; the results for
δ < 0.1 repeat the value recorded for δ = 0.1. code: lse RV.ox.
econometricians deal with measurement error in other areas of economics, reflecting the fact
that frictions are nuisances to us as our scientific focus is on estimating [Y ].
Suppose Y ∈ BSM is some latent efficient price obscured by frictions. Our desire is to
estimate $[Y]$ from the distorted sample path X. Four basic statistical assumptions about the
frictions seem worthy of study. Two use the notation $U = X - Y$, with $\rho_{\delta,j} = \mathrm{Cor}(U_{\delta i}, U_{\delta(i+j)})$,
assuming this correlation does not depend upon i.
Assumption F1: stationary in observation time. For all δ, $U_{\delta i}$ has a zero mean, variance
of $\kappa^2$ and is a covariance stationary process with $\mathrm{Cor}(U_{\delta i}, U_{\delta(i+j)}) = \rho_{\delta,j} = \rho_j$. For simplicity
of exposition we will also assume U is Gaussian.
Here the autocovariance function does not depend upon δ or i.14 This is not plausible
14
This assumption has some empirical attractions if it is applied to $\rho_{i-j} = \mathrm{Cor}(U_{t_i}, U_{t_j})$, the autocorrelation
38
empirically, as we can see from Figure 3. In particular it is strong to assume that the temporal
behaviour of Uδ , U2δ , U3δ ,... does not vary with δ. However, it is a simple structure where we
can give a relatively complete analysis.
The leading case of F1 is where $U_{\delta i} \sim NID(0, \kappa^2)$ over i — the white noise assumption. We
will write this as F1W. In this context it has been discussed by, for example, Zhou (1996), Bandi
and Russell (2003), Hansen and Lunde (2006), Zhang, Mykland, and Aït-Sahalia (2005) and
Zhang (2004). Bandi and Russell (2003), Hansen and Lunde (2006) and Aït-Sahalia, Mykland,
and Zhang (2005b) have studied the more general case which allows autocorrelation in the
and Zhang (2005b) have studied the more general case which allows autocorrelation in the
errors. Note Gloter and Jacod (2001a) and Gloter and Jacod (2001b) also study what happens
if Var(Uδi ) varies with δ.
F1 implies, for example, that $[U]_t = \infty$, which is inconvenient mathematically. Further, it
means that we can very precisely estimate $Y_{\delta i}$ by simply repeatedly recording $X_{\delta i}$ (or versions
of it very close in time to $\delta i$) and averaging it, for as δ ↓ 0 the dependence pattern in the data
peters out increasingly quickly in calendar time. This is most easily seen in the F1W case, but
it holds more generally. In this setup we would expect to eventually be able to estimate [Y ]
given enough data. This is indeed the case.
Assumption F2: diffusion and stationary in calendar time. $U \in BSM$, $E(U_t) = 0$.
Throughout this discussion, for simplicity of exposition, we will assume U is a Gaussian
process.
F2 is very different from F1, for $[U]_t < \infty$ and Y cannot be perfectly recovered by local
averaging of X. Aït-Sahalia, Mykland, and Zhang (2005a) have studied this case when Y is
Brownian motion and U is a Gaussian OU process which solves
$$\mathrm{d}U_t = -\lambda U_t\,\mathrm{d}t + \kappa\sqrt{2\lambda}\,\mathrm{d}B_t, \qquad (34)$$
where $B \perp\!\!\!\perp Y$ and is Brownian motion. Even if the full sample path of U from time 0 to time
t is available, it is not possible to consistently estimate λ, as the log-likelihood for the path as
a function of λ, given by the Radon-Nikodym derivative, is not degenerate. On the other hand,
the value of $\kappa^2 \lambda$ can of course be deduced from $[U]$. Thus $\mathrm{Var}(U_t)$ cannot be estimated just
from the sample path of U unless one a priori knows either κ or λ. This means it is impossible
to consistently estimate $[Y]$ just from the sample path of X if one allows F2. Hansen and Lunde
(2006) have studied the properties of realised QV under more general conditions, which are
closer to the full spirit of F2.
function in tick time. But we have already noted that this would require limit theorems for estimators of QV to
hold for arbitrary stochastic tj . Existing results for irregularly spaced data, such as Mykland and Zhang (2005),
are proved under the assumption that the tj are independent of Y . The independence assumption is well known
not to be innocuous.
Assumption F3: purely discontinuous prices. X is a pure point process and X − Y is
strictly stationary in business or transaction time.
Oomen (2004) studies the properties of realised QV and Large (2005) provides a new type
of estimator of $[Y]$ based on X.
Assumption F4: scrambled. (i) Non-synchronous trading or quote updating; (ii) delays
caused by reaction times, as information is absorbed into markets differentially quickly.
F4(i) was the motivation for the work of Malliavin and Mancino (2002) discussed in Section
3.8.3. It also appears in the work of Hayashi and Yoshida (2005), which we will discuss at the
end of this section.
Typically F1-F2 are strengthened by
Assumption IND. U ⊥⊥ Y .
IND is a strong assumption, which is studied at some length by Hansen and Lunde (2006),
who argue it is not empirically reasonable in a number of empirical examples.
F1-F3 are all capable of impacting on univariate realised analysis, while Assumption F4 can only
possibly help in the multivariate case, for it hardly impacts univariate quadratic variations over
moderately long stretches of time such as a day. Assumptions F1-F2 are purely statistical
abstractions for measurement error, somewhat convenient for carrying out calculations. They
are attempts to deal with some of the effects of liquidity and bid/ask bounce. F2 has the problem
that it implies X ∈ BSM and so needs additional assumptions in order to identify [Y ], such as
Y ∈ Mloc . F3 is motivated by the discreteness seen in most asset markets — see for example
Figure 3. F4 tries to capture a feature of Epps (1979) effects, which were discussed in section
2.10.2.
5.3 Properties of $[U_\delta]$ in the univariate case under F1W and F2
Trivially under F1 and F2
$$[X_\delta] = [Y_\delta] + [U_\delta] + 2[U_\delta, Y_\delta].$$
It is useful to think of the conditional moments $E_{U|Y}([X_\delta] - [Y_\delta])$ and $\mathrm{Var}_{U|Y}([X_\delta] - [Y_\delta])$, for
they give an impression of the effect of frictions.
Given space constraints we will specialise the results for F1 to F1W — there is little technical
loss in doing this as the results under the more general conditions of F1 are very similar. Write
$n = \lfloor t/\delta \rfloor$; then under IND we have that
$$E_U[U_\delta] = 2n\kappa^2, \quad E_{U|Y}[U_\delta, Y_\delta] = 0, \quad \mathrm{Cov}_{U|Y}([Y_\delta], [U_\delta, Y_\delta]) = 0,$$
$$\mathrm{Var}_{U|Y}[U_\delta, Y_\delta] = 2\kappa^2 [Y_\delta], \quad \mathrm{Cov}_{U|Y}([U_\delta], [U_\delta, Y_\delta]) = 0, \quad [Y_\delta] \perp\!\!\!\perp [U_\delta]. \qquad (35)$$
To calculate $\mathrm{Var}[U_\delta]$, write $u_{a,b} = U_a - U_b$ and use the normality of U,
$$\mathrm{Cov}\left(u^2_{a,b}, u^2_{c,d}\right) = 2\,\mathrm{Cov}^2\left(u_{a,b}, u_{c,d}\right).$$
Taken together with the fact that $[U_\delta] = \sum_{i=1}^{n} u^2_{\delta i, \delta(i-1)}$, then under F1W $\mathrm{Var}[U_\delta] \simeq 12\kappa^4 n$, so
$$E_{U|Y}([X_\delta] - [Y_\delta]) = 2n\kappa^2 \quad \text{and} \quad \mathrm{Var}_{U|Y}([X_\delta] - [Y_\delta]) \simeq 12\kappa^4 n + 8\kappa^2 [Y_\delta]. \qquad (36)$$
This was reported by Fang (1996), Bandi and Russell (2003), Hansen and Lunde (2006) and
Zhang, Mykland, and Aït-Sahalia (2005). Notice that as δ ↓ 0 both the bias and the variance go
to infinity. This formalises the well known result that $[X_\delta]$ is an inaccurate estimator of $[Y]$
under F1 when δ is very small. The generalisation of these results to F1 appears also in the
above papers and Aït-Sahalia, Mykland, and Zhang (2005b) — the key result being that the
orders in n do not change as we move to F1.
For mid-quote data $\kappa^2$ tends to be quite small and so the impact of terms which contain
$\kappa^4$ tends to be tiny unless n is massive. For transaction data this is not typically the case, for
then we would expect $\kappa^2$ to be much bigger due to bid/ask bounce.
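The bias formula in (36) is easy to check by simulation; a minimal sketch with illustrative parameter values:

```python
import numpy as np

rng = np.random.default_rng(6)

n = 1_000            # intraday returns per day
kappa2 = 1e-6        # noise variance kappa^2 under F1W (illustrative)
iv = 1e-4            # daily [Y], constant volatility (illustrative)

bias = []
for _ in range(500):
    y = np.cumsum(np.sqrt(iv / n) * rng.standard_normal(n + 1))
    u = np.sqrt(kappa2) * rng.standard_normal(n + 1)   # white-noise friction
    x = y + u
    # difference between contaminated and efficient realised QV
    bias.append(np.sum(np.diff(x)**2) - np.sum(np.diff(y)**2))

mean_bias = np.mean(bias)      # close to 2 * n * kappa2, as in (36)
```

Shrinking δ (raising n) makes the average bias grow linearly, which is the signature-plot explosion seen earlier.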
Hansen and Lunde (2006) have studied the empirically important case where $[U_\delta, Y_\delta]$ is
negative, which can happen when the IND assumption is dropped.
Under F2, if U is a Gaussian OU process (34) then $E(U_t) = 0$, $\mathrm{Var}(U_t) = \kappa^2$. Clearly for
small δ under IND
$$E_{U|Y}([X_\delta] - [Y_\delta]) = 2n\kappa^2\left(1 - \exp(-\lambda\delta)\right) \simeq 2t\lambda\kappa^2 = [U]_t, \qquad (37)$$
while from (17) the corresponding variance is approximately $2\delta \int_0^t \left(\sigma_u^2 + 2\lambda\kappa^2\right)^2 \mathrm{d}u$. Notice that
for large δ, $[U]_t$ provides a poor approximation to the bias. As δ decreases, at first the bias
increases but the variance falls; eventually, as δ gets very small, the variance becomes small
and the bias remains constant at $[U]_t$. Hence the results are materially different from the F1
case.
The bias can be very large as λ is likely to be very large in practice, in order for the half life
of U to be very brief. This type of result appears in the parametric case where Y is Brownian
motion plus drift in Aı̈t-Sahalia, Mykland, and Zhang (2005b, Section 9.2). Related work
includes Gloter and Jacod (2001a) and Gloter and Jacod (2001b). The bias (37) continues to
hold for non-Gaussian OU processes of the type discussed by Barndorff-Nielsen and Shephard
(2001).
Under F1, Bandi and Russell (2003) and Hansen and Lunde (2006) have studied the problem
of minimising $E\left([X_\delta] - [Y]\right)^2$ by choosing δ. A key issue is the problem of estimating $\kappa^2$, but
under F1 this is consistently estimated by $\sum_{i=1}^{n} \left(X_{\delta i} - X_{\delta(i-1)}\right)^2 / 2n$. For thickly traded assets
they typically select δ to be between 1 and 15 minutes, but the results are sensitive to the choice
of mid-quotes or transaction data and the use of interpolation or raw transforms. In recent work
Bandi and Russell (2005) have extended their analysis to the multivariate case. See also the
work of Martens (2003).
5.4 Subsampling
5.4.1 Raw subsampling
Recalling the definition of (23), we have that when $t_{kj} = \delta\left(j + \frac{k}{K+1}\right)$ then
$$[X_{G_{n+}(K)}] = \frac{1}{K+1} \sum_{i=0}^{K} [X_{G_n(i)}]_t, \quad \text{where} \quad [X_{G_n(i)}] = [Y_{G_n(i)}] + [U_{G_n(i)}] + 2[U_{G_n(i)}, Y_{G_n(i)}],$$
and so we can use (35) to calculate $E_{U|Y}[X_{G_{n+}(K)}]$. In particular under F1W
$$E_{U|Y}[X_{G_{n+}(K)}] = [Y_{G_{n+}(K)}] + 2n\kappa^2, \qquad (38)$$
so subsampling does not reduce the bias of the estimator. However, and crucially, under F1W
we trivially have
$$\mathrm{Var}_{U|Y}[U_{G_{n+}(K)}] = \frac{1}{K+1} \mathrm{Var}[U_{G_n(i)}], \qquad \mathrm{Var}_{U|Y}[U_{G_{n+}(K)}, Y_{G_{n+}(K)}] = 2\kappa^2 \frac{1}{K+1} [Y_{G_{n+}(K)}].$$
This means that, for fixed δ, as K → ∞ so
$$[X_{G_{n+}(K)}] \stackrel{p}{\to} \operatorname*{p\,lim}_{K \to \infty} [Y_{G_{n+}(K)}] + 2n\kappa^2.$$
This is a marked improvement over the realised QV, for in that case both the mean and variance
explode as δ ↓ 0. This important point was first made in the F1W context by Zhang, Mykland,
and Aı̈t-Sahalia (2005). Aı̈t-Sahalia, Mykland, and Zhang (2005b) extend this argument to the
case of dependence in observation time.
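A sketch of raw subsampling under F1W-type noise (all settings illustrative): averaging sparse RVs over offset grids shrinks the variance but, as in (38), leaves the bias $2n\kappa^2$ intact.

```python
import numpy as np

rng = np.random.default_rng(8)

n_fine = 23_400                  # one price per second over 6.5 hours
iv = 1e-4                        # daily integrated variance (illustrative)
kappa2 = 2.5e-7                  # white-noise friction variance (illustrative)

y = np.cumsum(np.sqrt(iv / n_fine) * rng.standard_normal(n_fine))
x = y + np.sqrt(kappa2) * rng.standard_normal(n_fine)

step = 300                       # 5-minute returns; K + 1 = 300 offset grids
rvs = [np.sum(np.diff(x[k::step]) ** 2) for k in range(step)]
rv_sub = np.mean(rvs)            # subsampled realised variance

rv_fine = np.sum(np.diff(x) ** 2)     # naive RV at the finest scale
n_coarse = n_fine // step             # coarse returns per sparse grid
# rv_sub still carries the bias ~ 2 * n_coarse * kappa2 relative to iv,
# but with far smaller variance than any single sparse grid.
```

Compare `rv_sub` (close to `iv` plus the small coarse-scale bias) with `rv_fine`, which is dominated by the noise term $2n\kappa^2$ at the one-second scale.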
Under F2, if U is a Gaussian OU process (34) then for small δ, as before,
$$E_{U|Y}[X_{G_{n+}(K)}] - [Y_{G_{n+}(K)}] = 2n\kappa^2\left(1 - \exp(-\lambda\delta)\right) \simeq [U]_t,$$
while from (17) the corresponding variance is approximately (see Example 2)
$$\frac{2\delta}{(K+1)^2} \sum_{i=0}^{K} \sum_{k=0}^{K} \left\{ \left(\frac{k-i}{K+1}\right)^2 + \left(1 - \left|\frac{k-i}{K+1}\right|\right)^2 \right\} \int_0^t \left(\sigma_u^2 + 2\lambda\kappa^2\right)^2 \mathrm{d}u. \qquad (39)$$
Hence under F2 subsampling has broadly the same properties as the raw realised QV estimator
studied in the previous subsection, but with a slightly smaller variance when K is large.
5.4.2 Bias correction
Subsampling has the great virtue that under F1 or F2 its variance gets smaller as δ ↓ 0. This
means that if we can bias correct these estimators then they will be consistent for [Y ].
Under F1W Zhang, Mykland, and Aït-Sahalia (2005) argued one should estimate [Y] by

/X_{δ,K}/ = [X_{G_n^+(K)}] − (1/(K+1)) [X_{δ/(K+1)}],

which they termed a “two scale estimator”, one based on returns computed over intervals of length δ, the other on intervals of length δ/(K+1). Clearly under F1W
E_U(/X_{δ,K}/) = [Y_{G_n^+(K)}] + 2nκ² − (1/(K+1)) { [Y_{δ/(K+1)}] + 2(K+1)nκ² }
             ≃ [Y_{G_n^+(K)}] − (1/(K+1)) [Y_{δ/(K+1)}],
with the second term becoming negligible as K increases. Hence the two scale estimator is consistent under F1W and IND. Zhang, Mykland, and Aı̈t-Sahalia (2005) calculate the asymptotic
distribution of /Xδ,K / − [Y ] and show it is mixed Gaussian, but with a slow rate of convergence.
Zhang (2004) provides a multiscale estimator which has a faster rate of convergence, but exploits
similar ideas. It is clear this argument also holds under F1 with time dependent errors which
are stationary in observation time — for in calendar time the dependence in the data weakens
as δ ↓ 0. A clear analysis of that setup is provided by Aı̈t-Sahalia, Mykland, and Zhang (2005b).
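A minimal sketch of the two scale construction, for a price vector recorded at the finest spacing δ/(K+1); this is a stylised version of the Zhang, Mykland, and Aït-Sahalia (2005) estimator that omits their finite-sample adjustment, applied to simulated F1W data of our own:

```python
import numpy as np

def rv(p):
    return float(np.sum(np.diff(p) ** 2))

def two_scale_rv(prices, K):
    """[X_{G+}] - [X_{delta/(K+1)}]/(K+1): sparse averaged RV, bias-corrected
    by the dense-grid RV, whose noise bias is (K+1) times larger."""
    sparse = np.mean([rv(prices[i::K + 1]) for i in range(K + 1)])
    dense = rv(prices)
    return float(sparse - dense / (K + 1))

rng = np.random.default_rng(2)
n, kappa = 23_400, 0.01
y = np.cumsum(0.02 / np.sqrt(n) * rng.standard_normal(n))  # [Y] about 4e-4
x = y + kappa * rng.standard_normal(n)                     # i.i.d. noise (F1W)
print(rv(x))                   # dominated by noise bias, roughly 2*n*kappa**2
print(two_scale_rv(x, K=300))  # orders of magnitude closer to [Y]
```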
Unfortunately this estimator falls over when we move to the F2 assumption, in accordance with the above comments that this is a much more difficult problem. In particular, for finite δ and K,

E_U(/X_{δ,K}/) = [Y_{G_n^+(K)}] − (1/(K+1)) [Y_{δ/(K+1)}]
                  + 2nκ²(1 − exp(−λδ)) − (1/(K+1)) 2(K+1)nκ²(1 − exp(−λδ/(K+1)))
             ≃ [Y_{G_n^+(K)}] − (1/(K+1)) [Y_{δ/(K+1)}] + 2λκ²t.
Thus, in this case, the two scale estimator does not really help. The problem here is that the
bias correction is of the wrong form.
Figure 7 repeats Figure 6 but now plots the results for the two scale estimator for Vodafone on 2nd January 2004. Throughout, the subsampling was based on a time series of 2 second returns, setting K = δ/0.02. It is important to understand that the choice of K matters, and the empirical properties of the two scale estimator may change quite significantly with this tuning parameter.¹⁵ We have little experience of how to select K well. The results in the top

¹⁵ When δ is small we would like to use a lot of subsampling, as the theory of Zhang, Mykland, and Aït-Sahalia (2005) suggests, but we cannot take K to be very large as we run out of data. Our dataset is recorded only up to one second. At the moment we cannot see how to overcome this problem. More empirical studies into the performance of the two scale estimator are given in Hansen and Lunde (2006).
row, which are made on mid-quotes, show the 2 scale estimator does well for interpolated data
and moderates the bias of the RV estimator for small δ. The 2 scale estimator is much more
challenged (as all the estimators are!) by the transaction data, but it does have a positive impact
in the case of the raw data. The signature plot for the two scale estimator for the Vodafone
series is given in Figure 6 and produces qualitatively very similar results.
[Figure 7 here: four signature-plot panels (Mid-quote returns: interpolated; Mid-quote returns: raw; Transaction returns: interpolated; Transaction returns: raw), each showing the RVol, Kernel, 2 scale and Alternation estimators against δ from 0.1 to 20 minutes.]

Figure 7: LSE's electronic order book on the 2nd working day in January 2004. Top 2 rows of graphs: subsampled realised daily QVol computed using 0.1, 1, 5 and 20 minute midpoint returns. X-axis is in minutes. First row corresponds to mid-points, the second row corresponds to transactions. Bottom 2 rows of graphs: 2 scale estimator of the QVol computed using 0.1, 1, 5 and 20 minute transaction returns.
5.5 Kernel
An alternative way of trying to remove frictions is by using kernels, discussed in this context
by, for example, French, Schwert, and Stambaugh (1987) and Zhou (1996), and formalised by
Hansen and Lunde (2006) and Barndorff-Nielsen, Hansen, Lunde, and Shephard (2004). We will
study

RV_w(X_δ) = w₀[X_δ] + 2 Σ_{i=1}^{q} w_i γ̂_i(X_δ)
          = RV_w(Y_δ) + RV_w(U_δ) + RV_w(Y_δ, U_δ) + RV_w(U_δ, Y_δ).
We already know from section 3.7 the properties of RV_w(Y_δ) when δ is small, while under the IND assumption

RV_w(Y_δ) ⊥ RV_w(Y_δ, U_δ), RV_w(U_δ, Y_δ),   RV_w(U_δ) ⊥ RV_w(Y_δ, U_δ),   RV_w(Y_δ) ⊥⊥ RV_w(U_δ).

The two cross terms RV_w(Y_δ, U_δ) and RV_w(U_δ, Y_δ) have zero mean and so, writing ρ°_s = Cor(U_t, U_{t−s}),

E_U(RV_w(X_δ)) = w₀[Y_δ] + 2nκ² ( w₀(1 − ρ°_δ) + Σ_{i=1}^{q} w_i (2ρ°_{iδ} − ρ°_{(i−1)δ} − ρ°_{(i+1)δ}) ).
Under F1W Barndorff-Nielsen, Hansen, Lunde, and Shephard (2004, Theorem 2) note that

E_U ( [U_δ], 2γ̂₁(U_δ), 2γ̂₂(U_δ), . . . , 2γ̂_q(U_δ) )′ = ( 2nκ², −2nκ², 0, . . . , 0 )′

and

Cov_U ( [U_δ], 2γ̂₁(U_δ), 2γ̂₂(U_δ), . . . , 2γ̂_q(U_δ) )′ = nκ⁴ ×

      ⎛  12  −16    4    0  · · ·  ⎞
      ⎜ −16   28  −16    4  · · ·  ⎟
      ⎜   4  −16   24  −16  · · ·  ⎟
      ⎜   0    4  −16   24  · · ·  ⎟
      ⎝   ⋮    ⋮    ⋮    ⋮    ⋱   ⎠

while E_{U|Y}[U_δ, Y_δ] = E_{U|Y} γ̂₁(U_δ, Y_δ) = 0 and

Cov_{U|Y} ( [U_δ, Y_δ], γ̂₁(U_δ, Y_δ), γ̂₁(Y_δ, U_δ) )′ = κ² ×

      ⎛ 2[Y_δ] − 2γ̂₁(Y_δ)                                                              ⎞
      ⎜ 2γ̂₁(Y_δ) − γ̂₂(Y_δ) − [Y_δ]    2[Y_δ] − 2γ̂₁(Y_δ)                                 ⎟
      ⎝ 2γ̂₁(Y_δ) − γ̂₂(Y_δ) − [Y_δ]    2γ̂₂(Y_δ) − γ̂₁(Y_δ) − γ̂₃(Y_δ)    2[Y_δ] − 2γ̂₁(Y_δ) ⎠
which implies

E_U{RV_w(X_δ)} = w₀[Y_δ] + 2nκ²(w₀ − w₁),

which is unbiased iff w₀ = w₁ = 1. The simplest unbiased estimator is thus

RV_{1,1}(X_δ) = [X_δ] + 2γ̂₁(X_δ).    (40)
This was proposed by Zhou (1996) and studied in detail by Hansen and Lunde (2006). Table
5 summarises the properties of various estimators under F1 and IND using these results. As lags are added, the middle term in the variance falls considerably, while the first term increases. This means that when δ is small, RV_{1,1,0.5}(X_δ) could be more precise than [X_δ] in the
presence of market friction. Barndorff-Nielsen, Hansen, Lunde, and Shephard (2004, Theorem
Estimator                                          Bias    Variance
[X_δ]                                              2nκ²    2δ∫₀ᵗ σ_u⁴ du + 12nκ⁴ + 8κ² ∫₀ᵗ σ_u² du
RV_{1,1}(X_δ) = [X_δ] + 2γ̂₁(X_δ)                    0       6δ∫₀ᵗ σ_u⁴ du + 8nκ⁴ + 8κ² ∫₀ᵗ σ_u² du
RV_{1,1,0.5}(X_δ) = [X_δ] + 2γ̂₁(X_δ) + γ̂₂(X_δ)      0       7δ∫₀ᵗ σ_u⁴ du + 2nκ⁴ + 4κ² ∫₀ᵗ σ_u² du

Table 5: Bias and variance of various kernel estimators under the F1 and IND assumptions
2) show that by adding more lags and choosing the weights appropriately they can produce a
consistent estimator, when small adjustments are made to deal with the intricate end effects we
have ignored in this survey. An alternative would be to subsample (40), which will transform
the unbiased statistic into a consistent one under the F1 assumption — although, as we have
seen in the previous subsection, such an argument fails under F2. Under F2, the kernel approach
will continue to produce an unbiased estimator of [Y ], but typically it will be inconsistent.
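The kernel class RV_w is a few lines of code. A minimal sketch, with Zhou's unbiased choice w₀ = w₁ = 1 from (40), on our own simulated F1W data:

```python
import numpy as np

def gamma_hat(r: np.ndarray, i: int) -> float:
    """i-th realised autocovariance of the return vector r."""
    return float(np.sum(r * r)) if i == 0 else float(np.sum(r[i:] * r[:-i]))

def rv_kernel(prices: np.ndarray, w: np.ndarray) -> float:
    """RV_w(X) = w0 [X] + 2 * sum_{i>=1} w_i * gamma_hat_i(X)."""
    r = np.diff(prices)
    return w[0] * gamma_hat(r, 0) + sum(
        2.0 * w[i] * gamma_hat(r, i) for i in range(1, len(w)))

rng = np.random.default_rng(3)
n = 23_400
y = np.cumsum(0.02 / np.sqrt(n) * rng.standard_normal(n))
x = y + 0.01 * rng.standard_normal(n)       # i.i.d. noise (F1W)
print(rv_kernel(x, np.array([1.0])))        # raw RV: biased up by about 2n*kappa^2
print(rv_kernel(x, np.array([1.0, 1.0])))   # RV_{1,1}: unbiased for [Y], still noisy
```

As in Table 5, the bias disappears but the 8nκ⁴ variance term keeps the fixed-δ estimator noisy, which is why consistent versions need more lags with carefully chosen weights.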
Consider the simple kernel weights w_i = (q + 1 − i)/q, where q is set to cover 5 minutes, whatever the value of δ. Then the bias is 0 under F1 and IND. When q = 1 then w₁ = 1; if q = 2 then w₁ = 1, w₂ = 1/2; while q = 3 implies w₁ = 1, w₂ = 2/3, w₃ = 1/3. When q = 4, then w₁ = 1, w₂ = 3/4, w₃ = 2/4, w₄ = 1/4. Figure 7 repeats Figure 3 but now plots the results for the kernel estimator. These seem pretty strong results, with the estimator being reasonably in line with the alternation estimator we will discuss in a moment. The results are pretty stable
with respect to δ, although the problem of dealing with frictions is clearly harder in the case
where we use transaction as opposed to mid-quote data. These results are confirmed by looking
at the signature plots in Figure 6, which shows the kernel still has bias in the transactions
case, but the biases are rather modest. Barndorff-Nielsen, Hansen, Lunde, and Shephard (2004,
Theorem 2) discuss more sophisticated choices of kernels.
5.6 F3 — pure point process
This is attractive as there is a great deal of literature which moves econometricians towards
modelling prices as point processes (e.g. Engle and Russell (1998), Engle (2000) and Bowsher
(2003)). It appears in a number of guises. There is a literature on continuous sample path
processes which are rounded to yield a point process. Work on this includes Gottlieb and Kalay
(1985), Jacod (1996) and Delattre and Jacod (1997). Another strand has started out with a
model for a point process for X, which has been introduced in this context by Oomen (2004) and Large (2005).

Oomen (2004) has studied the sampling properties of realised QV estimators in the case where the data generating process is purely made up of jumps. Hasbrouck (1999) studies the problem where there is an underlying time series which is rounded so that the prices live on
discrete lattice points, whose width is a tick. His analysis is fully parametric.
Closer to the content of this paper, Barndorff-Nielsen and Shephard (2005b) study the first two moments of realised QV when Y_t = Z_{τ_t}, where τ is the integral of a non-negative covariance
stationary process, that is a process with non-decreasing sample paths, and Z is a Lévy process.
Throughout they assume that Z ⊥⊥ τ . Such models have received some attention recently in
mathematical finance due to the papers by, for example, Carr, Geman, Madan, and Yor (2003)
and Carr and Wu (2004). Of course Lévy processes, outside the Brownian motion case, are pure
jump processes and can be made to obey lattice structures if desired. Barndorff-Nielsen and Shephard (2005b) show that the realised QV estimator is an inconsistent estimator of τ, but are able to characterise the bias and variance, and so the realised QV estimator can be used to provide inference on underlying parameters if the time-change model is parametric.
A radical departure is provided by Large (2005) who introduced the alternation estimator.
He looks at markets where prices move almost always by one tick. An example of this is Vodafone
in Figure 3. He uses solely one side of the quotes, say the best bid, and assumes that the pure
jump process X_t always jumps towards Y_t when it moves. He then estimates [Y]_t as

L̂_t = [X]_t (N_t − A_t) / A_t,

where N_t is the number of price movements in the single side of the market up to time t and A_t is the number of alternations, or immediate price reversals. We call this the alternation estimator. Under various assumptions he then shows that L̂_t →p [Y]_t as N_t → ∞ and establishes
a CLT for the estimator. This approach is an elegant combination of a market microstructure
model which is cointegrated with a BSM efficient price.
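The counting behind the estimator can be sketched as follows. This is our schematic reading of the ingredients (N_t one-tick moves, A_t immediate reversals); Large (2005) should be consulted for the precise definition and assumptions:

```python
import numpy as np

def alternation_estimator(quotes: np.ndarray) -> float:
    """Schematic alternation estimator for a one-tick market.

    N: number of price moves; A: number of alternations, i.e. moves that
    immediately reverse the previous move. Returns [X] * (N - A) / A."""
    moves = np.diff(quotes)
    moves = moves[moves != 0.0]                  # keep actual price moves
    N = len(moves)
    A = int(np.sum(moves[1:] * moves[:-1] < 0))  # immediate reversals
    return float(np.sum(moves ** 2)) * (N - A) / A

# A one-tick random walk (tick = 0.25): with no bid-ask bounce, continuations
# and alternations are equally likely, so the estimate is close to [X] = [Y].
rng = np.random.default_rng(4)
quotes = 100.0 + np.cumsum(0.25 * rng.choice([-1.0, 1.0], size=5_000))
print(alternation_estimator(quotes))
```

When quotes bounce back and forth (pure noise), almost every move alternates, N − A is small relative to A, and the estimate is driven towards zero, which is the intended correction.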
For the Vodafone share price, we computed the alternation estimator separately on the
bid and ask sides of the markets and averaged the values to compute the estimator of [Y ].
Figure 6 shows its value averaged over the 20 days in January 2004. It is in line with the
estimated unconditional standard deviation of the open to close daily returns. Figure 7 shows
the corresponding result for the 2nd of January, 2004. Again it suggests the alternation estimator
produces plausible values in practice. Of course the Vodafone example is a nice case for the
alternation estimator for it has a large tick size as a percentage of price and almost all of its
moves are by one tick. It is potentially challenged by stocks with more frequently updated
quotes.
5.7 F4 — scrambled multivariate process
There is very little work on the effect of market frictions in the multivariate case. At a trivial level, all the univariate results apply to the multivariate case through the use of polarisation

[Y^l, Y^m]_t = (1/4) { [Y^l + Y^m]_t − [Y^l − Y^m]_t },

and so frictionally robust estimators can be applied to the two components. Although this approach has significant merits, and is used in the paper by Bandi and Russell (2005) to select δ for daily realised QV calculations, it misses out on adequately dealing with Epps type effects.
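Polarisation is trivial to exploit in code: any univariate quadratic variation estimator yields a covariance estimator. A minimal sketch, where `est` is plain realised QV but a frictions-robust alternative could be slotted in (the simulated correlated paths are our own illustration):

```python
import numpy as np

def rv(p):
    return float(np.sum(np.diff(p) ** 2))

def polarised_cov(pl: np.ndarray, pm: np.ndarray, est=rv) -> float:
    """[Y^l, Y^m] = ([Y^l + Y^m] - [Y^l - Y^m]) / 4, with any univariate
    estimator est of quadratic variation applied to the two combinations."""
    return (est(pl + pm) - est(pl - pm)) / 4.0

rng = np.random.default_rng(5)
common = np.cumsum(0.01 * rng.standard_normal(5_000))   # shared factor
yl = common + np.cumsum(0.01 * rng.standard_normal(5_000))
ym = common + np.cumsum(0.01 * rng.standard_normal(5_000))
print(polarised_cov(yl, ym))   # estimates [common], about 0.5 here
```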
In the multivariate case there are potentially two new problems: (i) non-synchronous trading
or quote updating, (ii) delays caused by reaction times, as information is absorbed into markets
differentially quickly. The first of these has been studied for quite a long time in empirical finance
by, for example, Scholes and Williams (1977) and Lo and MacKinlay (1990). Martens (2003)
provides a review of some of this work and more modern papers as well as making contributions
of his own.
If Y ∈ BSM and the times of observations, t j , are independent from Y then this is a
precisely specified statistical problem and a number of solutions are available. We discussed the
Fourier method in section 3.8.3, but there is also the work of Hayashi and Yoshida (2005) which
characterises the bias caused by random sampling and suggests methods for trying to overcome
it. Hayashi and Yoshida (2005, Definition 3.1) introduced the estimator

{Y^l, Y^m}_t = Σ_{t_i ≤ t} Σ_{t_j ≤ t} (Y^l_{t_i} − Y^l_{t_{i−1}}) (Y^m_{t_j} − Y^m_{t_{j−1}}) I{(t_{i−1}, t_i) ∩ (t_{j−1}, t_j) ≠ ∅}.    (41)
This multiplies returns together whenever time intervals of the returns have any component
which are overlapping. This artificially includes terms with components which are approximately
uncorrelated (inflating the variance of the estimator), but it does not exclude any terms and
so does not miss any of the contributions to quadratic covariation. They show under various
assumptions that as the times of observations become denser over the interval from time 0 to
time t, this estimator converges to the desired quadratic covariation quantity.
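A minimal sketch of (41) for two non-synchronously observed series; the double loop is quadratic time, and the simulated latent prices and sampling times are purely illustrative:

```python
import numpy as np

def hayashi_yoshida(tl, pl, tm, pm) -> float:
    """Sum products of returns whose open time intervals overlap, as in (41).

    tl, pl: observation times and prices of asset l; tm, pm: of asset m."""
    rl, rm = np.diff(pl), np.diff(pm)
    total = 0.0
    for i in range(len(rl)):
        for j in range(len(rm)):
            # include (i, j) iff (t_{i-1}, t_i) and (t_{j-1}, t_j) intersect
            if tl[i] < tm[j + 1] and tm[j] < tl[i + 1]:
                total += rl[i] * rm[j]
    return total

rng = np.random.default_rng(6)
t = np.linspace(0.0, 1.0, 2_001)                # fine latent grid
z = 0.02 * rng.standard_normal((2, 2_000))      # latent Brownian shocks
y1 = np.concatenate(([0.0], np.cumsum(z[0])))
y2 = np.concatenate(([0.0], np.cumsum(0.8 * z[0] + 0.6 * z[1])))
i1 = np.sort(rng.choice(2_001, size=400, replace=False))  # asset 1 times
i2 = np.sort(rng.choice(2_001, size=300, replace=False))  # asset 2 times
print(hayashi_yoshida(t[i1], y1[i1], t[i2], y2[i2]))
```

No interpolation or common grid is imposed, so no covariation contribution is dropped; the cost, as noted above, is the extra variance from products over intervals that barely overlap.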
Table 6 illustrates the effect of using estimator (41), estimating the average realised correlation during January 2004 for the London Stock Exchange data discussed earlier. These results are comparable with Figure 4 and show that the estimator goes a modest way towards tackling this issue. This suggests that other types of frictions are additionally important.
              Vodafone     BP     AstraZeneca   HSBC
Vodafone         -
BP             .0681        -
AstraZeneca    .0456      .0430        -
HSBC           .0776      .0602      .0495        -

Table 6: The average of the daily Hayashi-Yoshida estimator of the correlation amongst the returns on these asset prices. These statistics are based on mid-quotes.
6 Conclusions
This paper has reviewed the literature on the measurement and forecasting of uncertainty
through quadratic variation type objects. The econometrics of this has focused on realised
objects, estimating QV and its components. Such an approach has been shown to provide a
leap forward in our understanding of time varying volatility and jumps, which are crucial in asset
allocation, derivative pricing and risk assessment. A drawback with these types of methods is
the potential for market frictions to complicate the analysis. Recent research has been trying to
address this issue and has introduced various innovative methods. There is still much work to
be carried through in that area.
7 Software
The calculations made for this paper were carried out using PcGive of Doornik and Hendry
(2005) and software written by the authors using the Ox language of Doornik (2001).
References
Aı̈t-Sahalia, Y. (2004). Disentangling diffusion from jumps. Journal of Financial Economics 74, 487–
528.
Aı̈t-Sahalia, Y. and J. Jacod (2005a). Fisher’s information for discretely sampled Lévy processes.
Unpublished paper: Department of Economics, Princeton University.
Aı̈t-Sahalia, Y. and J. Jacod (2005b). Volatility estimators for discretely sampled Lévy processes.
Unpublished paper: Department of Economics, Princeton University.
Aı̈t-Sahalia, Y., P. Mykland, and L. Zhang (2005a). How often to sample a continuous-time process in
the presence of market microstructure noise. Review of Financial Studies 18, 351–416.
Aı̈t-Sahalia, Y., P. Mykland, and L. Zhang (2005b). Ultra high frequency volatility estimation with
dependent microstructure noise. Unpublished paper: Department of Economics, Princeton University.
Alizadeh, S., M. Brandt, and F. Diebold (2002). Range-based estimation of stochastic volatility models.
Journal of Finance 57, 1047–1091.
Andersen, T. G. and T. Bollerslev (1997). Intraday periodicity and volatility persistence in financial
markets. Journal of Empirical Finance 4, 115–158.
Andersen, T. G. and T. Bollerslev (1998a). Answering the skeptics: yes, standard volatility models do
provide accurate forecasts. International Economic Review 39, 885–905.
Andersen, T. G. and T. Bollerslev (1998b). Deutsche mark-dollar volatility: intraday activity patterns,
macroeconomic announcements, and longer run dependencies. Journal of Finance 53, 219–265.
Andersen, T. G., T. Bollerslev, and F. X. Diebold (2003). Some like it smooth, and some like it rough:
untangling continuous and jump components in measuring, modeling and forecasting asset return
volatility. Unpublished paper: Economics Dept, Duke University.
Andersen, T. G., T. Bollerslev, F. X. Diebold, and H. Ebens (2001). The distribution of realized stock
return volatility. Journal of Financial Economics 61, 43–76.
Andersen, T. G., T. Bollerslev, F. X. Diebold, and P. Labys (2000). Great realizations. Risk 13,
105–108.
Andersen, T. G., T. Bollerslev, F. X. Diebold, and P. Labys (2001). The distribution of exchange rate
volatility. Journal of the American Statistical Association 96, 42–55. Correction published in 2003,
volume 98, page 501.
Andersen, T. G., T. Bollerslev, F. X. Diebold, and P. Labys (2003). Modeling and forecasting realized
volatility. Econometrica 71, 579–625.
Andersen, T. G., T. Bollerslev, F. X. Diebold, and C. Vega (2003). Micro effects of macro announcements: Real-time price discovery in foreign exchange. American Economic Review 93, 38–62.
Andersen, T. G., T. Bollerslev, and N. Meddahi (2004). Analytic evaluation of volatility forecasts.
International Economic Review 45, 1079–1110.
Andersen, T. G., T. Bollerslev, and N. Meddahi (2005). Correcting the errors: A note on volatility
forecast evaluation based on high-frequency data and realized volatilities. Econometrica 73, 279–
296.
Andersen, T. G. and B. Sørensen (1996). GMM estimation of a stochastic volatility model: a Monte
Carlo study. Journal of Business and Economic Statistics 14, 328–352.
Back, K. (1991). Asset pricing for general processes. Journal of Mathematical Economics 20, 371–395.
Bandi, F. M. and J. R. Russell (2003). Microstructure noise, realized volatility, and optimal sampling.
Unpublished paper, Graduate School of Business, University of Chicago.
Bandi, F. M. and J. R. Russell (2005). Realized covariation, realized beta and microstructure noise.
Unpublished paper, Graduate School of Business, University of Chicago.
Barndorff-Nielsen, O. E., S. E. Graversen, J. Jacod, M. Podolskij, and N. Shephard (2005). A central
limit theorem for realised power and bipower variations of continuous semimartingales. In Y. Kabanov and R. Liptser (Eds.), From Stochastic Analysis to Mathematical Finance, Festschrift for
Albert Shiryaev. Springer. Forthcoming.
Barndorff-Nielsen, O. E., S. E. Graversen, J. Jacod, and N. Shephard (2005). Limit theorems for
realised bipower variation in econometrics. Unpublished paper: Nuffield College, Oxford.
Barndorff-Nielsen, O. E., P. R. Hansen, A. Lunde, and N. Shephard (2004). Regular and modified
kernel-based estimators of integrated variance: the case with independent noise. Unpublished paper:
Nuffield College, Oxford.
Barndorff-Nielsen, O. E. and N. Shephard (2001). Non-Gaussian Ornstein–Uhlenbeck-based models
and some of their uses in financial economics (with discussion). Journal of the Royal Statistical
Society, Series B 63, 167–241.
Barndorff-Nielsen, O. E. and N. Shephard (2002). Econometric analysis of realised volatility and its
use in estimating stochastic volatility models. Journal of the Royal Statistical Society, Series B 64,
253–280.
Barndorff-Nielsen, O. E. and N. Shephard (2003). Realised power variation and stochastic volatility.
Bernoulli 9, 243–265. Correction published in pages 1109–1111.
Barndorff-Nielsen, O. E. and N. Shephard (2004). Econometric analysis of realised covariation: high
frequency covariance, regression and correlation in financial economics. Econometrica 72, 885–925.
Barndorff-Nielsen, O. E. and N. Shephard (2005a). How accurate is the asymptotic approximation to
the distribution of realised volatility? In D. W. K. Andrews and J. H. Stock (Eds.), Identification
and Inference for Econometric Models. A Festschrift in Honour of T.J. Rothenberg, pp. 306–331.
Cambridge: Cambridge University Press.
Barndorff-Nielsen, O. E. and N. Shephard (2005b). Impact of jumps on returns and realised variances:
econometric analysis of time-deformed Lévy processes. Journal of Econometrics. Forthcoming.
Barndorff-Nielsen, O. E. and N. Shephard (2005c). Power variation and time change. Theory of Probability and Its Applications. Forthcoming.
Barndorff-Nielsen, O. E. and N. Shephard (2006). Econometrics of testing for jumps in financial economics using bipower variation. Journal of Financial Econometrics 4. Forthcoming.
Barndorff-Nielsen, O. E., N. Shephard, and M. Winkel (2004). Limit theorems for multipower variation
in the presence of jumps in financial econometrics. Unpublished paper: Nuffield College, Oxford.
Bartlett, M. S. (1946). On the theoretical specification of sampling properties of autocorrelated time
series. Journal of the Royal Statistical Society, Supplement 8, 27–41.
Bartlett, M. S. (1950). Periodogram analysis and continuous spectra. Biometrika 37, 1–16.
Barucci, E. and R. Reno (2002a). On measuring volatility and the GARCH forecasting performance.
Journal of International Financial Markets, Institutions and Money 12, 182–200.
Barucci, E. and R. Reno (2002b). On measuring volatility of diffusion processes with high frequency
data. Economic Letters 74, 371–378.
Bates, D. S. (2003). Empirical option pricing: a retrospective. Journal of Econometrics 116, 387–404.
Billingsley, P. (1995). Probability and Measure (3 ed.). New York: Wiley.
Bollerslev, T., M. Gibson, and H. Zhou (2005). Dynamic estimation of volatility risk premia and
investor risk aversion from option-implied and realized volatilities. Unpublished paper, Economics
Department, Duke University.
Bollerslev, T., U. Kretschmer, C. Pigorsch, and G. Tauchen (2005). The dynamics of bipower variation,
realized volatility and returns. Unpublished paper: Department of Economics, Duke University.
Bollerslev, T. and H. Zhou (2002). Estimating stochastic volatility diffusion using conditional moments
of integrated volatility. Journal of Econometrics 109, 33–65.
Bowsher, C. G. (2003). Modelling security market events in continuous time: intensity based, multivariate point process models. Unpublished paper: Nuffield College, Oxford.
Brandt, M. W. and F. X. Diebold (2004). A no-arbitrage approach to range-based estimation of return
covariances and correlations. Journal of Business. Forthcoming.
Branger, N. and C. Schlag (2005). An economic motivation for variance contracts. Unpublished paper:
Faculty of Economics and Business Administration, Goethe University.
Brockhaus, O. and D. Long (1999). Volatility swaps made simple. Risk 2, 92–95.
Calvet, L. and A. Fisher (2002). Multifractality in asset returns: theory and evidence. Review of
Economics and Statistics 84, 381–406.
Carr, P., H. Geman, D. B. Madan, and M. Yor (2003). Stochastic volatility for Lévy processes. Mathematical Finance 13, 345–382.
Carr, P., H. Geman, D. B. Madan, and M. Yor (2005). Pricing options on realized variance. Finance
and Stochastics. Forthcoming.
Carr, P. and R. Lee (2003a). Robust replication of volatility derivatives. Unpublished paper: Courant
Institute, NYU.
Carr, P. and R. Lee (2003b). Trading autocorrelation. Unpublished paper: Courant Institute, NYU.
Carr, P. and K. Lewis (2004). Corridor variance swaps. Risk , 67–72.
Carr, P. and D. B. Madan (1998). Towards a theory of volatility trading. In R. Jarrow (Ed.), Volatility,
pp. 417–427. Risk Publications.
Carr, P. and L. Wu (2004). Time-changed Lévy processes and option pricing. Journal of Financial
Economics, 113–141.
Chernov, M., A. R. Gallant, E. Ghysels, and G. Tauchen (2003). Alternative models of stock price
dynamics. Journal of Econometrics 116, 225–257.
Christensen, K. and M. Podolskij (2005). Asymptotic theory for range-based estimation of integrated
volatility of a continuous semi-martingale. Unpublished paper: Aarhus School of Business.
Comte, F. and E. Renault (1998). Long memory in continuous-time stochastic volatility models. Mathematical Finance 8, 291–323.
Corradi, V. and W. Distaso (2004). Specification tests for daily integrated volatility, in the presence
of possible jumps. Unpublished paper: Queen Mary College, London.
Corsi, F. (2003). A simple long memory model of realized volatility. Unpublished paper: University of
Southern Switzerland.
Curci, G. and F. Corsi (2003). A discrete sine transformation approach to realized volatility measurement. Unpublished paper.
Dacorogna, M. M., R. Gencay, U. A. Müller, R. B. Olsen, and O. V. Pictet (2001). An Introduction to
High-Frequency Finance. San Diego: Academic Press.
Delattre, S. and J. Jacod (1997). A central limit theorem for normalized functions of the increments
of a diffusion process in the presence of round off errors. Bernoulli 3, 1–28.
Demeterfi, K., E. Derman, M. Kamal, and J. Zou (1999). A guide to volatility and variance swaps.
Journal of Derivatives 6, 9–32.
Doob, J. L. (1953). Stochastic Processes. New York: John Wiley and Sons.
Doornik, J. A. (2001). Ox: Object Oriented Matrix Programming, 3.0. London: Timberlake Consultants
Press.
Doornik, J. A. and D. F. Hendry (2005). PC Give, Version 10.4. London: Timberlake Consultants
Press.
Eicker, F. (1967). Limit theorems for regressions with unequal and dependent errors. In Proceedings
of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1, pp. 59–82.
Berkeley: University of California.
Engle, R. F. (2000). The econometrics of ultra-high frequency data. Econometrica 68, 1–22.
Engle, R. F. and J. P. Gallo (2005). A multiple indicator model for volatility using intra daily data.
Journal of Econometrics. Forthcoming.
Engle, R. F. and J. R. Russell (1998). Forecasting transaction rates: the autoregressive conditional
duration model. Econometrica 66, 1127–1162.
Engle, R. F. and J. R. Russell (2005). Analysis of high frequency data. In Y. Ait-Sahalia and L. P.
Hansen (Eds.), Handbook of Financial Econometrics. Amsterdam: North Holland. Forthcoming.
Epps, T. W. (1979). Comovements in stock prices in the very short run. Journal of the American
Statistical Association 74, 291–296.
Fang, Y. (1996). Volatility modeling and estimation of high-frequency data with Gaussian noise. Unpublished Ph.D. thesis, Sloan School of Management, MIT.
Forsberg, L. and E. Ghysels (2004). Why do absolute returns predict volatility so well. Unpublished
paper: Economics Department, UNC, Chapel Hill.
Foster, D. P. and D. B. Nelson (1996). Continuous record asymptotics for rolling sample variance
estimators. Econometrica 64, 139–174.
French, K. R., G. W. Schwert, and R. F. Stambaugh (1987). Expected stock returns and volatility.
Journal of Financial Economics 19, 3–29.
Garcia, R., E. Ghysels, and E. Renault (2005). The econometrics of option pricing. In Y. Ait-Sahalia
and L. P. Hansen (Eds.), Handbook of Financial Econometrics. North Holland. Forthcoming.
Genon-Catalot, V., C. Larédo, and D. Picard (1992). Non-parametric estimation of the diffusion coefficient by wavelet methods. Scandinavian Journal of Statistics 19, 317–335.
Ghysels, E., A. C. Harvey, and E. Renault (1996). Stochastic volatility. In C. R. Rao and G. S. Maddala
(Eds.), Statistical Methods in Finance, pp. 119–191. Amsterdam: North-Holland.
Ghysels, E., P. Santa-Clara, and R. Valkanov (2004). Predicting volatility: getting the most out of
return data sampled at different frequencies. Journal of Econometrics. Forthcoming.
Gloter, A. and J. Jacod (2001a). Diffusions with measurement errors. I — local asymptotic normality.
ESAIM: Probability and Statistics 5, 225–242.
Gloter, A. and J. Jacod (2001b). Diffusions with measurement errors. II — measurement errors.
ESAIM: Probability and Statistics 5, 243–260.
Goncalves, S. and N. Meddahi (2004). Bootstrapping realized volatility. Unpublished paper, CIRANO,
Montreal.
Gottlieb, G. and A. Kalay (1985). Implications of the discreteness of observed stock prices. Journal of
Finance 40, 135–153.
Hansen, P. R. and A. Lunde (2005a). Consistent ranking of volatility models. Journal of Econometrics.
Forthcoming.
Hansen, P. R. and A. Lunde (2005b). A forecast comparison of volatility models: Does anything beat
a GARCH(1,1)? Journal of Applied Econometrics. Forthcoming.
Hansen, P. R. and A. Lunde (2005c). A realized variance for the whole day based on intermittent
high-frequency data.
Hansen, P. R. and A. Lunde (2006). Realized variance and market microstructure noise (with discussion). Journal of Business and Economic Statistics 24. Forthcoming.
Hasbrouck, J. (1999). The dynamics of discrete bid and ask quotes. Journal of Finance 54, 2109–2142.
Hasbrouck, J. (2003). Empirical Market Microstructure: Economic and Statistical Perspectives on the
Dynamics of Trade in Security Markets. Unpublished manuscript, Department of Finance, Stern
Business School, NYU.
Hausman, J. A. (1978). Specification tests in econometrics. Econometrica 46, 1251–1271.
Hayashi, T. and N. Yoshida (2005). On covariance estimation of non-synchronously observed diffusion
processes. Bernoulli 11, 359–379.
Heston, S. L. (1993). A closed-form solution for options with stochastic volatility, with applications to
bond and currency options. Review of Financial Studies 6, 327–343.
Hog, E. and A. Lunde (2003). Wavelet estimation of integrated volatility. Unpublished paper: Aarhus
School of Business.
Howison, S. D., A. Rafailidis, and H. O. Rasmussen (2004). On the pricing and hedging of volatility
derivatives. Applied Mathematical Finance 11, 317–346.
Huang, X. and G. Tauchen (2005). The relative contribution of jumps to total price variation. Journal
of Financial Econometrics 3. Forthcoming.
Jacod, J. (1994). Limit of random measures associated with the increments of a Brownian semimartingale. Preprint number 120, Laboratoire de Probabilitiés, Université Pierre et Marie Curie, Paris.
Jacod, J. (1996). La variation quadratique du brownien en présence d'erreurs d'arrondi. Astérisque 236,
155–162.
Jacod, J. and P. Protter (1998). Asymptotic error distributions for the Euler method for stochastic
differential equations. Annals of Probability 26, 267–307.
Jacod, J. and A. N. Shiryaev (2003). Limit Theorems for Stochastic Processes (2 ed.). Springer-Verlag:
Berlin.
Javaheri, A., P. Wilmott, and E. Haug (2002). GARCH and volatility swaps. Unpublished paper:
available at www.wilmott.com.
Kanatani, T. (2004a). High frequency data and realized volatility. Ph.D. thesis, Graduate School of
Economics, Kyoto University.
Kanatani, T. (2004b). Integrated volatility measuring from unevenly sampled observations. Economics
Bulletin 3, 1–8.
Karatzas, I. and S. E. Shreve (1991). Brownian Motion and Stochastic Calculus (2 ed.), Volume 113
of Graduate Texts in Mathematics. Berlin: Springer–Verlag.
Karatzas, I. and S. E. Shreve (1998). Methods of Mathematical Finance. New York: Springer–Verlag.
Kim, S., N. Shephard, and S. Chib (1998). Stochastic volatility: likelihood inference and comparison
with ARCH models. Review of Economic Studies 65, 361–393.
Koopman, S. J., B. Jungbacker, and E. Hol (2005). Forecasting daily variability of the S&P 100
stock index using historical, realised and implied volatility measurements. Journal of Empirical
Finance 12, 445–475.
Large, J. (2005). Estimating quadratic variation when quoted prices jump by a constant increment.
Unpublished paper: Nuffield College, Oxford.
Lévy, P. (1937). Théorie de l'Addition des Variables Aléatoires. Paris: Gauthier-Villars.
Lo, A. and C. MacKinlay (1990). The econometric analysis of nonsynchronous trading. Journal of
Econometrics 45, 181–211.
Maheswaran, S. and C. A. Sims (1993). Empirical implications of arbitrage-free asset markets. In
P. C. B. Phillips (Ed.), Models, Methods and Applications of Econometrics, pp. 301–316. Basil
Blackwell.
Malliavin, P. and M. E. Mancino (2002). Fourier series method for measurement of multivariate volatilities. Finance and Stochastics 6, 49–61.
Mancini, C. (2001). Does our favourite index jump or not? Unpublished paper: Dipartimento di Matematica per le Decisioni, Università di Firenze.
Mancini, C. (2003). Statistics of a Poisson-Gaussian process. Unpublished paper: Dipartimento di Matematica per le Decisioni, Università di Firenze.
Mancini, C. (2004). Estimation of the characteristics of the jumps of a general Poisson-diffusion process.
Scandinavian Actuarial Journal 1, 42–52.
Mancino, M. and R. Reno (2005). Dynamic principal component analysis of multivariate volatilities via
Fourier analysis. Applied Mathematical Finance. Forthcoming.
Martens, M. (2003). Estimating unbiased and precise realized covariances. Unpublished paper.
Martens, M. and D. van Dijk (2005). Measuring volatility with the realized range. Unpublished paper:
Econometric Institute, Erasmus University, Rotterdam.
Meddahi, N. (2002). A theoretical comparison between integrated and realized volatilities. Journal of
Applied Econometrics 17, 479–508.
Merton, R. C. (1980). On estimating the expected return on the market: An exploratory investigation.
Journal of Financial Economics 8, 323–361.
Müller, U. A. (1993). Statistics of variables observed over overlapping intervals. Unpublished paper,
Olsen and Associates, Zurich.
Munroe, M. E. (1953). Introduction to Measure and Integration. Cambridge, MA: Addison-Wesley
Publishing Company, Inc.
Mykland, P. and L. Zhang (2005). ANOVA for diffusions. Annals of Statistics 33. Forthcoming.
Mykland, P. A. and L. Zhang (2002). Inference for volatility-type objects and implications for hedging.
Unpublished paper: Department of Statistics, University of Chicago.
Neuberger, A. (1990). Volatility trading. Unpublished paper: London Business School.
Nielsen, M. O. and P. H. Frederiksen (2005). Finite sample accuracy of integrated volatility estimators.
Unpublished paper, Department of Economics, Cornell University.
Officer, R. R. (1973). The variability of the market factor of the New York stock exchange. Journal of
Business 46, 434–453.
O’Hara, M. (1995). Market Microstructure Theory. Oxford: Blackwell Publishers.
Oomen, R. C. A. (2004). Properties of bias corrected realized variance in calendar time and business
time. Unpublished paper, Warwick Business School.
Parkinson, M. (1980). The extreme value method for estimating the variance of the rate of return.
Journal of Business 53, 61–66.
Patton, A. (2005). Volatility forecast evaluation and comparison using imperfect volatility proxies.
Unpublished paper: Department of Accounting and Finance, LSE.
Phillips, P. C. B. and J. Yu (2005). A two-stage realized volatility approach to the estimation for
diffusion processes from discrete observations. Unpublished paper: Cowles Foundation for Research
in Economics, Yale University.
Politis, D., J. P. Romano, and M. Wolf (1999). Subsampling. New York: Springer.
Precup, O. V. and G. Iori (2005). Cross-correlation measures in high-frequency domain. Unpublished
paper: Department of Economics, City University.
Protter, P. (2004). Stochastic Integration and Differential Equations. New York: Springer-Verlag.
Reno, R. (2003). A closer look at the Epps effect. International Journal of Theoretical and Applied
Finance 6, 87–102.
Rogers, L. C. G. and S. E. Satchell (1991). Estimating variance from high, low, open and close prices.
Annals of Applied Probability 1, 504–512.
Rosenberg, B. (1972). The behaviour of random variables with nonstationary variance and the distribution of security prices. Working paper 11, Graduate School of Business Administration, University
of California, Berkeley. Reprinted in Shephard (2005).
Rydberg, T. H. and N. Shephard (2000). A modelling framework for the prices and times of trades
made on the NYSE. In W. J. Fitzgerald, R. L. Smith, A. T. Walden, and P. C. Young (Eds.),
Nonlinear and Nonstationary Signal Processing, pp. 217–246. Cambridge: Isaac Newton Institute
and Cambridge University Press.
Scholes, M. and J. Williams (1977). Estimating betas from nonsynchronous trading. Journal of Financial Economics 5, 309–327.
Schwert, G. W. (1989). Why does stock market volatility change over time? Journal of Finance 44,
1115–1153.
Schwert, G. W. (1990). Indexes of U.S. stock prices from 1802 to 1987. Journal of Business 63, 399–426.
Schwert, G. W. (1998). Stock market volatility: Ten years after the crash. Brookings-Wharton Papers
on Financial Services 1, 65–114.
Shephard, N. (2005). Stochastic Volatility: Selected Readings. Oxford: Oxford University Press.
Sheppard, K. (2005). Measuring realized covariance. Unpublished paper: Department of Economics,
University of Oxford.
Shiryaev, A. N. (1999). Essentials of Stochastic Finance: Facts, Models and Theory. Singapore: World
Scientific.
Wasserfallen, W. and H. Zimmermann (1985). The behaviour of intraday exchange rates. Journal of
Banking and Finance 9, 55–72.
Wiener, N. (1924). The quadratic variation of a function and its Fourier coefficients. Journal of Mathematical Physics 3, 72–94.
Woerner, J. (2004). Power and multipower variation: inference for high frequency data. Unpublished
paper.
Zhang, L. (2004). Efficient estimation of stochastic volatility using noisy observations: a multi-scale
approach. Unpublished paper: Department of Statistics, Carnegie Mellon University.
Zhang, L., P. Mykland, and Y. Aït-Sahalia (2005). A tale of two time scales: determining integrated
volatility with noisy high-frequency data. Journal of the American Statistical Association. Forthcoming.
Zhou, B. (1996). High-frequency data and volatility in foreign-exchange rates. Journal of Business and
Economic Statistics 14, 45–52.