Estimating the Quadratic Variation
of Poisson Jump Diffusion Processes
with Noisy High-Frequency Data
Diploma thesis
submitted to
Prof. Dr. Anton Wakolbinger
JProf. Dr. Christoph Kühn
Institut für Stochastik und Mathematische Informatik
Fachbereich Informatik und Mathematik
Johann Wolfgang Goethe-Universität
Frankfurt am Main
by
Alireza Dorfard
Strackgasse 14, 61440 Oberursel
Matriculation number: 2151944
12 August 2007
Acknowledgments
I would like to sincerely thank Prof. Dr. Wakolbinger and JProf. Dr. Kühn, who introduced me to the topic of this thesis and who were at all times competent and, in every respect, helpful advisors. I would equally like to thank my parents and my sister, who with their love and trust always encouraged me in my interests and supported me with all their strength during my studies. Special thanks go to my enchanting girlfriend Sandra, my best friend Enrico and my long-standing friend Max, who have shaped me in decisive ways and whose advice and support gave me the strength to make decisions that have profoundly influenced the course of my life.
Alireza Dorfard,
Oberursel, 12 August 2007
Contents
1 Introduction
  1.1 A Guide to High-Frequency Financial Data, Microstructure Noise and the Goals of the Thesis
  1.2 The Main Result
2 Foundations
3 Quadratic Variation and Realized Volatility
  3.1 Quadratic Variation
  3.2 The Full Grid and Realized Volatility
  3.3 Volatility Measure
4 Estimating Quadratic Variation on the Full Grid
  4.1 Error from Market Microstructure Noise on the Full Grid
  4.2 Error from Discretization on the Full Grid
  4.3 Total Error and Optimal Sampling Frequency on the Full Grid
5 Estimating Quadratic Variation on the Multi Grid
  5.1 The Multi Grid
  5.2 Error from Market Microstructure Noise on the Multi Grid
  5.3 Error from Discretization on the Multi Grid
  5.4 Total Error and Optimal Subgrid Frequency on the Multi Grid
6 Estimating Quadratic Variation with Bias Correction
  6.1 Bias Corrected Two Time Scale Estimator
  6.2 Minimizing the Variance of the Estimation Error for the Two Time Scale Estimator
7 Conclusion
A Notation
B References
1 Introduction
1.1 A Guide to High-Frequency Financial Data, Microstructure Noise and the Goals of the Thesis
The characteristics of asset returns are essential for many subjects of financial economics
related to the price and risk of financial instruments. One of the most critical features of
the asset return is the return volatility which has been subject to an enormous amount
of research on its modeling and forecasting. In the past, the analysis of volatility has
mostly been done by studying direct indicators of volatility, by using parametric models
like GARCH, or by studying volatilities with specific option pricing models such as Black-Scholes.
Recently, the widespread availability of high-frequency financial data has led to a
change from monthly to intra-daily time horizons in volatility modeling. A result of this
availability of rich data sources is the non-parametric volatility measure termed realized
volatility, which is estimated from the sum of frequently sampled squared returns and which,
under the assumption that the logarithmic price process follows an Itô process, asymptotically approximates the integrated volatility, i.e. the integral of instantaneous volatility over a
specific time interval. Contributions on realized and integrated volatilities include works
by Hull and White (1987), Jacod and Protter (1998), Andersen, Bollerslev, Diebold, and
Labys (2001), Barndorff-Nielsen and Shephard (2002a), and Mykland and Zhang (2002).
The key problem in dealing with the realized volatility estimator is the empirical
finding that it loses its robustness at higher sampling frequencies: the realized volatility diverges as the sampling frequency increases, although in theory realized
volatility converges to the quadratic variation if returns are sampled arbitrarily finely.
The reason for this phenomenon is market microstructure, which includes, but is not
limited to, the discreteness of prices, irregular trading, bid-ask spreads and asymmetric
information. This reality has led the empirical finance literature to the common suggestion to
reduce the sampling frequency to one observation every couple of minutes, although information on asset prices is sometimes available at extremely high frequencies and
would therefore allow a sampling frequency of one observation every few seconds. The consequence
of this approach is that only a tiny proportion of the available information is utilized,
with the intention of reducing the bias caused by market microstructure and obtaining sounder
estimates of integrated volatilities.
This standard procedure in estimating integrated volatility has been criticized as suboptimal in Zhang, Mykland and Aït-Sahalia (2005). The criticism is motivated by two points: contamination due to market microstructure noise is not accounted for in the examined price process, and a large fraction of the available tick-by-tick price data
remains unused in the method presently employed by practitioners. Zhang et al. (2005)
propose a solution that models the impact of market microstructure through an observation error in the price process and incorporates all available high-frequency data. The
presented solution is thereby able to notably correct the detrimental effects of market
microstructure. Their analysis is based on X^*, the logarithmic price process of an asset,
which evolves in continuous time and follows a continuous Itô process
\[ X^*_t = X^*_0 + \int_0^t \mu_s \, ds + \int_0^t \sigma_s \, dB_s , \]
where µt and σt are the instantaneous drift and diffusion functions and Bt is a standard
Brownian motion.
This thesis will rest upon the theory and methodology presented in Zhang et al. (2005)
and extends details of their work from a continuous Itô process to a Poisson jump diffusion
process. The rationale behind this extension is the widely agreed upon fact that financial
movements exhibit unusual behavior relative to what would be expected from a normal
distribution, as financial price series sometimes exhibit time intervals of small changes
followed by wild and severe movement. The question raised in financial literature as
to whether jump diffusion processes with discontinuous sample paths provide a more
appropriate empirical model for financial price series therefore motivates this thesis to
study the implications of jumps on the contributions of Zhang et al. (2005). Given the
fact that jumps in financial price series are rare and of large size, a compound Poisson
process is assumed as the jump component of the price process. The considered Poisson
jump diffusion process X has the form
\[ X_t = X_0 + \int_0^t \mu_s \, ds + \int_0^t \sigma_s \, dB_s + \sum_{k=0}^{N_t} \gamma_k , \tag{1.1} \]
where N_t is a homogeneous Poisson process with intensity λ, and γ_k refers to the centered,
i.i.d. jump sizes.
To incorporate the indicated observation error resulting from microstructure noise, it is
assumed that the observed process Y at all observation times t_i ∈ [0, T] is of the following
form
\[ Y_{t_i} = X_{t_i} + \varepsilon_{t_i} , \tag{1.2} \]
where X is the latent true or efficient logarithmic price as in (1.1), which would be the
correct price in a perfect market without frictions, trading costs and informational asymmetries, and the ε_{t_i} are independent market microstructure noise terms around the true price.
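The observation model (1.1) and (1.2) can be simulated directly on an equidistant grid. The following is a minimal sketch under assumptions that are not made in the text: constant drift and diffusion coefficients, Gaussian jump sizes and Gaussian noise. The function name `simulate_noisy_jump_diffusion` and all parameter values are hypothetical and chosen only for illustration.

```python
import math
import random


def simulate_noisy_jump_diffusion(n, T=1.0, mu=0.0, sigma=0.2, lam=2.0,
                                  jump_sd=0.05, noise_sd=0.001, seed=0):
    """Return (X, Y): latent log prices and noisy observations at n + 1 equidistant times."""
    rng = random.Random(seed)
    dt = T / n
    X = [0.0]
    for _ in range(n):
        # Diffusion increment with constant drift mu and volatility sigma (assumption).
        dX = mu * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        # Jumps arriving within this interval: exponential inter-arrival times
        # with rate lam give a Poisson number of jumps per interval.
        elapsed = rng.expovariate(lam)
        while elapsed <= dt:
            dX += rng.gauss(0.0, jump_sd)  # centered i.i.d. jump size (Gaussian is an assumption)
            elapsed += rng.expovariate(lam)
        X.append(X[-1] + dX)
    # Observation error as in (1.2): independent noise added to the latent price.
    Y = [x + rng.gauss(0.0, noise_sd) for x in X]
    return X, Y
```

Calling `simulate_noisy_jump_diffusion(23400)` would, for instance, mimic one observation per second over a 6.5 hour trading day, with T = 1 interpreted as one day.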
In addition, we will consider the error from discretization, which results because it
is not possible to construct estimators based on realized volatility using returns sampled
arbitrarily finely, and hence the quadratic variation is measured with a discretization error,
where the extent of the error decreases as the sampling frequency increases.
This thesis focuses on the implications of microstructure and
discretization effects for the estimation of the quadratic variation of the process X, through
a series of estimators based on realized volatility, given high-frequency transaction price
information of the Poisson jump diffusion process X over a fixed time interval [0, T]. We
introduce an estimator from Zhang et al. (2005), termed the “two time scale” estimator, that
uses all available transaction data, copes with the contamination due to microstructure
noise and consistently estimates the quadratic variation of the process X. The main
contribution and finding of this thesis is the derivation and minimization of the variance
of the estimation error of the quadratic variation of the process X for the “two time scale”
estimator. This analysis is done under the assumption that the underlying process follows
a Poisson jump diffusion process with constant drift and diffusion coefficients. The task is
achieved by presenting findings on the estimation error resulting from microstructure noise
and combining these with our derivations regarding the error resulting from discretization.
1.2 The Main Result
We follow Zhang et al. and introduce the concept of the full and multi grid, where the
full grid, denoted by G^n, contains all times t_i^n ∈ [0, T], i = 0, ..., n, at which observations
of the price process occur, and the multi grid is a family of K disjoint subgrids of the full
grid G^n, denoted by G^(n,r), r = 1, ..., K. The time intervals between observations in the
full grid G^n are all of length t_i^n − t_{i−1}^n = T/n, and n is the number of time intervals between
observations on [0, T]. Moreover, the first and last observation of each subgrid G^(n,r) are
the time points t_{r−1} and t_{n−K+r}, where the time intervals between observations are of
length K·T/n, and n_K = (n − K + 1)/K is the number of time intervals between observations in any
subgrid G^(n,r).
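We denote the realized volatility of the process X along the full grid G^n by
\[ \lceil X, X \rceil^{G^n} = \sum_{i=1}^{n} \left( X_{t_i} - X_{t_{i-1}} \right)^2 , \]
and the average of the realized volatilities of X along the subgrids G^(n,r), called the multi grid
estimator, is denoted by
\[ \lceil X, X \rceil^{(avg)} = \frac{1}{K} \cdot \sum_{r=1}^{K} \lceil X, X \rceil^{(r)} = \frac{1}{K} \cdot \sum_{r=1}^{K} \sum_{j=1}^{n_K} \left( X_{t_{r-1+j \cdot K}} - X_{t_{r-1+(j-1) \cdot K}} \right)^2 . \]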
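The two definitions above translate almost literally into code. The following is a minimal sketch, assuming the observations are stored in a list `path` with `path[i]` holding the value at t_i; the function names are hypothetical, and rounding n_K down when n − K + 1 is not divisible by K is an assumption of this sketch.

```python
def realized_volatility(path):
    """Sum of squared increments between successive observations."""
    return sum((b - a) ** 2 for a, b in zip(path, path[1:]))


def multi_grid_estimator(path, K):
    """Average of the realized volatilities along the K subgrids."""
    n = len(path) - 1              # number of time intervals on the full grid
    n_K = (n - K + 1) // K         # intervals per subgrid (floor is an assumption)
    total = 0.0
    for r in range(1, K + 1):
        # Subgrid r consists of the points t_{r-1+j*K}, j = 0, ..., n_K.
        subgrid = [path[r - 1 + j * K] for j in range(n_K + 1)]
        total += realized_volatility(subgrid)
    return total / K
```

With K = 1 the only subgrid is the full grid, so both functions return the same value.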
Furthermore, we denote by [X, X]_T the quadratic variation of the process X on the time
interval [0, T], where theoretical results in stochastic processes state that as the sampling
frequency increases, i.e. as n → ∞,
\[ \lceil X, X \rceil^{G^n} \xrightarrow{\;p\;} [X, X]_T \qquad \text{and} \qquad \lceil X, X \rceil^{(avg)} \xrightarrow{\;p\;} [X, X]_T . \]
Proceeding as above, we denote by ⌈Y, Y⌉^{G^n} the observed realized volatility and by ⌈Y, Y⌉^{(avg)}
the observed multi grid estimator, which as we will show are both biased estimators
of ⌈X, X⌉^{G^n} and ⌈X, X⌉^{(avg)} respectively. In fact, the bias of ⌈Y, Y⌉^{G^n} increases linearly
with n and the bias of ⌈Y, Y⌉^{(avg)} increases linearly with n_K.
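The linear growth of the bias can be made plausible with a short calculation. The following is only a sketch, under the additional assumptions that the noise terms are i.i.d., centered, have a finite second moment and are independent of X. For a single increment the cross term vanishes in expectation, so that
\[ \mathbb{E}\left[ \left( Y_{t_i} - Y_{t_{i-1}} \right)^2 \,\middle|\, X \right] = \left( X_{t_i} - X_{t_{i-1}} \right)^2 + \mathbb{E}\left[ \left( \varepsilon_{t_i} - \varepsilon_{t_{i-1}} \right)^2 \right] = \left( X_{t_i} - X_{t_{i-1}} \right)^2 + 2 \cdot \mathbb{E}\,\varepsilon^2 . \]
Summing over the n increments of the full grid, and over the n_K increments of each subgrid before averaging, gives
\[ \mathbb{E}\left[ \lceil Y, Y \rceil^{G^n} \,\middle|\, X \right] = \lceil X, X \rceil^{G^n} + 2 \cdot n \cdot \mathbb{E}\,\varepsilon^2 , \qquad \mathbb{E}\left[ \lceil Y, Y \rceil^{(avg)} \,\middle|\, X \right] = \lceil X, X \rceil^{(avg)} + 2 \cdot n_K \cdot \mathbb{E}\,\varepsilon^2 . \]
Under these assumptions the noise-induced bias cancels when the full-grid quantity is weighted by n_K/n and subtracted from the averaged one, since 2 · n_K · 𝔼ε² − (n_K/n) · 2 · n · 𝔼ε² = 0.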
We combine these two estimators and construct the bias-adjusted estimator, which we
introduced and termed “two time scale” at the end of the last subsection. The “two time
scale” estimator is given by
\[ \widehat{[X, X]}_T = \lceil Y, Y \rceil_T^{(avg)} - \frac{n_K}{n} \cdot \lceil Y, Y \rceil_T^{G^n} . \]
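The estimator above can be sketched in a few lines. This is an illustration under stated assumptions rather than the thesis's own implementation: the function names are hypothetical, n_K is rounded down when n − K + 1 is not divisible by K, and the demonstration uses a constant latent price with standard Gaussian noise, so that the true quadratic variation is zero.

```python
import random


def realized_volatility(path):
    """Sum of squared increments along the given observations."""
    return sum((b - a) ** 2 for a, b in zip(path, path[1:]))


def two_time_scale_estimator(y, K):
    """Averaged subgrid realized volatility minus (n_K / n) times the full-grid one."""
    n = len(y) - 1
    n_K = (n - K + 1) // K  # intervals per subgrid (floor is an assumption)
    # The slice y[r : r + n_K*K + 1 : K] picks the n_K + 1 points of subgrid r + 1.
    avg = sum(realized_volatility(y[r:r + n_K * K + 1:K]) for r in range(K)) / K
    return avg - (n_K / n) * realized_volatility(y)


# Pure-noise illustration: X is constant, so Y consists of noise only (assumption).
rng = random.Random(0)
noise_only = [rng.gauss(0.0, 1.0) for _ in range(10001)]
naive = realized_volatility(noise_only)                   # of order 2 * n * E(eps^2)
corrected = two_time_scale_estimator(noise_only, 100)     # small compared with naive
```

In this setting the full-grid realized volatility is dominated by the noise, while the bias-adjusted value stays near the true quadratic variation of zero.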
We present the main finding of this thesis, which is the variance of the estimation error
for the “two time scale” estimator, \widehat{[X, X]}_T, when evaluating the quadratic variation
of X, [X, X]_T, under the assumption that the underlying process follows a Poisson jump
diffusion process with constant drift and diffusion coefficients. We further give the optimal
condition under which the variance of \widehat{[X, X]}_T − [X, X]_T is minimized for large n.
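If we assume the conditions of Theorem 3, and choose the number of subgrids as
K = c · n^{2/3}, then
\[ \mathbb{E}\left( \widehat{[X, X]}_T - [X, X]_T \right) = O\left( n^{-1/3} \right) , \tag{1.3a} \]
\[ \operatorname{Var}\left( \widehat{[X, X]}_T - [X, X]_T \right) = n^{-1/3} \cdot \left( c \cdot \Sigma^2 + c^{-2} \cdot \Xi_1^2 \right) + n^{-2/3} \cdot c^{-1} \cdot \Xi_2^2 + o\left( n^{-2/3} \right) , \tag{1.3b} \]
where Ξ_1², Ξ_2² and Σ² have the form
\[ \Xi_1^2 = 8 \cdot \left( \mathbb{E}\,\varepsilon^2 \right)^2 , \qquad \Xi_2^2 = 8 \cdot \mathbb{E}\left( \lceil X, X \rceil_T^{(avg)} \right) \cdot \mathbb{E}\,\varepsilon^2 - 2 \cdot \operatorname{Var}\left( \varepsilon^2 \right) , \]
and respectively
\[ \Sigma^2 = \frac{4}{3} \cdot T \int_0^T \sigma^4 \, ds + \frac{4}{3} \cdot (\lambda T)^2 \cdot (\operatorname{Var} \gamma)^2 + \frac{1}{2} \cdot \lambda T \cdot \mathbb{E}\,\gamma^4 + \frac{8}{3} \cdot \lambda T \cdot \operatorname{Var}(\gamma) \cdot \int_0^T \sigma^2 \, ds . \]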
Effectively, the quantity n^{−1/3} · c^{−2} · Ξ_1² + n^{−2/3} · c^{−1} · Ξ_2² in the variance is due to
microstructure noise and the quantity n^{−1/3} · c · Σ² is due to discretization effects. Additionally, the optimal c that minimizes the variance for large n is given by (2 · Ξ_1² / Σ²)^{1/3}.
Hence, this estimator copes with the detrimental effects of microstructure noise and
utilizes all available price information.
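The stated optimal c can be checked by a short calculation. The following sketch uses only the leading n^{−1/3} term of (1.3b) and treats Ξ_1² > 0 and Σ² > 0 as fixed constants; the name f is introduced here only for illustration.
\[ f(c) = c \cdot \Sigma^2 + c^{-2} \cdot \Xi_1^2 , \qquad f'(c) = \Sigma^2 - 2 \cdot c^{-3} \cdot \Xi_1^2 = 0 \iff c^3 = \frac{2 \cdot \Xi_1^2}{\Sigma^2} . \]
Since f''(c) = 6 · c^{−4} · Ξ_1² > 0 for c > 0, the stationary point is a minimum, and
\[ c^* = \left( \frac{2 \cdot \Xi_1^2}{\Sigma^2} \right)^{1/3} , \qquad f(c^*) = \frac{3}{2} \cdot c^* \cdot \Sigma^2 = \frac{3}{2} \cdot 2^{1/3} \cdot \left( \Xi_1^2 \right)^{1/3} \cdot \left( \Sigma^2 \right)^{2/3} . \]
Under this sketch the leading term of the variance at the optimum is n^{−1/3} · f(c^*).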
This thesis is organized in the following manner. In the second section we address
the key assumptions with respect to the Poisson jump diffusion process X. Sections 3.1
and 3.2 present corresponding definitions and results, and section 3.3 provides an excursus
regarding the usefulness of quadratic variation as a volatility measure. Sections 4 and 5
successively develop the variance of the total estimation error of quadratic variation for the
observed realized volatility, ⌈Y, Y⌉^{G^n}, and the observed multi grid estimator, ⌈Y, Y⌉^{(avg)},
respectively, under the assumption that the underlying process X follows a Poisson jump
diffusion process with constant drift and diffusion coefficients. Section 6 eventually introduces the “two time scale” estimator, \widehat{[X, X]}_T, and provides the main result. The final
conclusion is presented in section 7.
2 Foundations
Let the logarithmic price process p = (pt )t∈[0,T ] be defined on a complete probability
space, (Ω, F, P), evolving in continuous time with t ∈ [0, T ], where 0 < T < ∞. Moreover,
consider an information filtration, i.e. an increasing family of σ-fields (Ft )t∈[0,T ] ⊆ F,
which satisfies the usual conditions of P-completeness and right continuity. Let the σ-field
Ft reflect all available information at time t with F ≡ FT as the set of events that are
distinguishable at time T , and let p be adapted to the filtration F.
We assume p to be an arbitrage-free price process, which belongs to the class of special
semimartingales that have been investigated by Back (1991), whose work makes clear why
the use of the class of special semimartingales is reasonable for a price process from an
economic point of view. As p is a special semimartingale it can be uniquely canonically
decomposed¹ such that
\[ p_t = p_0 + A_t + M_t , \qquad 0 \le t \le T , \tag{2.1} \]
where M is a right continuous with left limits (cadlag) local martingale, A is a continuous
process of finite variation and p_0 = M_0 = A_0 = 0. Moreover, M may be further decomposed² into M_t = M^c_t + M^d_t, where the former is a continuous local martingale, M^c, and
the latter a compensated local jump martingale of finite variation, M^d, so that equation
(2.1) becomes
\[ p_t = p_0 + A_t + M^c_t + M^d_t . \tag{2.2} \]
The corresponding left continuous with right limits (caglad) process of p is defined through
p_{t−} ≡ lim_{s↑t} p_s ≡ lim_{s→t, s<t} p_s for each t ∈ [0, T], so that
\[ \Delta p_t \equiv p_t - p_{t-} \tag{2.3} \]
is the jump at time t.
Equation (2.1) provides a general form for an asset return process, and is representative for any Itô, jump or jump diffusion process. We further define SSM as the class
containing all processes of form (2.1) as well as FV ⊂ SSM as the class of all its processes
of finite variation, and denote by SSM^c ⊂ SSM and FV^c ⊂ FV their respective
subsets containing all continuous processes.
¹ See Protter (2003), Theorem III.30 (p. 129)
² See Protter (2003) (p. 191)
The process of special interest to us is a square integrable Poisson jump diffusion
process X ∈ SSM given by
\[ X_t = X^c_t + X^d_t , \qquad 0 \le t \le T , \tag{2.4} \]
where X^c ∈ SSM^c is an Itô process and X^d ∈ FV is a compound Poisson process, and
the two processes X^c and X^d are independent of each other.
The Itô process X^c has the form
\[ X^c_t = \int_0^t \mu_s \, ds + \int_0^t \sigma_s \, dB_s , \tag{2.5} \]
where B_t is a standard (F_t)-Brownian motion, and μ_t = μ(t, (X^c_s, s ≤ t)) as well as
σ_t = σ(t, (X^c_s, s ≤ t)) are (F_t)-adapted. The processes μ and σ are predictable as well
as bounded, and σ is also bounded away from zero. X^c ∈ SSM can be uniquely decomposed into A^c_t = ∫_0^t μ_s ds and M^c_t = ∫_0^t σ_s dB_s, where with (∫_0^T |μ_s| ds)² being integrable
the former is a continuous square integrable process of finite variation and with ∫_0^T σ_s² ds
being integrable the latter is a square integrable martingale, so that X^c is square integrable on [0, T]. This model hence allows a time-varying drift, stochastic volatility and
correlation between the price and volatility innovations.
The compound Poisson process X^d is given by the following equation
\[ X^d_t = \sum_{k=0}^{N_t} \gamma_k , \tag{2.6} \]
where X^d is adapted to the filtration F, and N_t is a homogeneous Poisson process with
intensity λ > 0, so that N_t < ∞ for t ∈ [0, T]. The γ_k refer to the i.i.d. jump sizes, which
are independent of N_t and satisfy 𝔼(γ⁴) < ∞. The process X^d ∈ SSM can be uniquely
decomposed into A^d_t = ∫_0^t λ · 𝔼(γ) ds, which is a continuous function of finite variation, and
M^d_t = X^d_t − A^d_t, which is a discontinuous square integrable martingale of finite variation.
The jump component X^d is supposed to represent the positive and negative effects
on the log price due to the arrival of important new information³. Although there is
common agreement that bad news has a greater potential for downward corrections
than good news has for upward corrections, and thus from an economic standpoint
the compound Poisson process would have a negative drift, we limit the scope of this thesis
to the case of no drift in the jump process. Hence, we assume the jump sizes γ_k to be
zero-mean, i.e. 𝔼(γ) = 0, so we have
\[ A^d = 0 \qquad \text{as well as} \qquad X^d = M^d , \tag{2.7} \]
and the process X ∈ SSM as given in (2.4) can be decomposed as in equation (2.2),
where
\[ A_t = \int_0^t \mu_s \, ds , \tag{2.8a} \]
\[ M^c_t = \int_0^t \sigma_s \, dB_s , \tag{2.8b} \]
\[ M^d_t = \sum_{k=0}^{N_t} \gamma_k , \tag{2.8c} \]
which means that the drift component of X_t is solely represented by A_t = ∫_0^t μ_s ds and
X^d is a discontinuous square integrable martingale.
³ See Merton (1976), (p. 127)
Note that some of the results in this thesis are delivered under stronger conditions,
namely that the underlying process X follows a Poisson jump diffusion process with constant drift and diffusion coefficients, where this is clearly indicated in the assumptions
preceding these results.
3 Quadratic Variation and Realized Volatility
3.1 Quadratic Variation
Definition 1 (Quadratic Variation) Let p, q be semimartingales. Then the quadratic
variation, [·, ·], of the process p on the interval [0, t] is given by
\[ [p, p]_t = p_t^2 - p_0^2 - 2 \int_0^t p_{s-} \, dp_s , \tag{3.1} \]
and we define the quadratic covariation of p and q as
\[ [p, q]_t = p_t q_t - p_0 q_0 - \int_0^t p_{s-} \, dq_s - \int_0^t q_{s-} \, dp_s . \tag{3.2} \]
The quadratic variation and covariation processes are cadlag, adapted and
of finite variation, and the quadratic variation is in addition non-decreasing. Furthermore, we have the polarization identity
\[ [p, q]_t = \frac{1}{2} \left( [p + q, p + q]_t - [p, p]_t - [q, q]_t \right) . \tag{3.3} \]
Additionally, the quadratic variation satisfies
\[ [p, p]_0 = p_0^2 = 0 , \tag{3.4a} \]
\[ \Delta [p, p]_t = (\Delta p_t)^2 , \tag{3.4b} \]
as well as the following important property⁴
\[ \sum_{i=1}^{n} \left( p_{\tau_i^n} - p_{\tau_{i-1}^n} \right)^2 \xrightarrow{\;p\;} [p, p]_T , \tag{3.5} \]
where (τ_0^n, τ_1^n, ..., τ_n^n) is a sequence of random partitions of [0, T] with the τ_i^n being stopping
times, 0 = τ_0^n ≤ τ_1^n ≤ ... ≤ τ_n^n = T, and max_{1≤i≤n} (τ_i^n − τ_{i−1}^n) → 0 as n → ∞.
Moreover, if one of the processes p or q is of finite variation, then
\[ [p, q]_t = \sum_{s \le t} \Delta p_s \, \Delta q_s , \tag{3.6} \]
so that the quadratic variation of a continuous process of finite variation is zero, and
the same is true for the quadratic covariation of a continuous process with a process of
finite variation. We refer to equation (2.2), and note that due to equation (3.6) we have
[A, A]_t = 0, [A, M]_t = 0, as well as [M^c, M^d]_t = 0. This gives us for a semimartingale p
\[ [p, p]_t = [M, M]_t = [M^c, M^c]_t + [M^d, M^d]_t = [M^c, M^c]_t + \sum_{0 \le s \le t} (\Delta M_s)^2 , \tag{3.7} \]
so that the quadratic variation of the process p depends only on the quadratic variation
of its martingale component M.
⁴ See Protter (2003), Theorem II.23 (p. 68) or Jacod and Shiryaev (2003), Theorem I.4.47 (p. 52)
Definition 2 (Sharp Bracket Process) Let p be a semimartingale with locally integrable quadratic variation, then the sharp bracket (or angle bracket, or conditional quadratic
variation) process, ⟨p, p⟩_t, is the compensator of [p, p]_t on the interval [0, t]. That is, it is
the unique predictable process that makes [p, p]_t − ⟨p, p⟩_t into a local martingale.
In case p^c is a continuous semimartingale with integrable quadratic variation, the sharp
bracket and quadratic variation processes do not differ, i.e.
\[ [p^c, p^c]_t = \langle p^c, p^c \rangle_t . \tag{3.8} \]
Additionally, if M is a square integrable martingale, then we have
\[ \mathbb{E}\left( M_t^2 \right) = \mathbb{E}\left( [M, M]_t \right) = \mathbb{E}\left( \langle M, M \rangle_t \right) . \tag{3.9} \]
3.2 The Full Grid and Realized Volatility
Due to the importance of observed price information for the realized volatility, we introduce the concept of grids which contain observation times over a specific time interval.
Subsequently, we define the realized volatility and realized covariance along these grids.
Definition 3 (Grids) Let p be a logarithmic price process, which is observed at times
0 = t_0^n ≤ t_1^n ≤ ... ≤ t_n^n = T, then the full grid of all observed points is defined as
\[ G^n := \{ t_0^n, ..., t_n^n \} , \]
where the time increments (t_i^n, t_{i+1}^n] are assumed to be equidistant, i.e.
\[ \delta t_i^n = \delta t^n = \frac{T}{|G^n|} , \qquad i = 1, ..., n , \tag{3.10} \]
with δt_i^n := t_i^n − t_{i−1}^n being the time between successive observations in G^n. Furthermore,
we define arbitrary subgrids H^n ⊆ G^n, where
\[ |H^n| := (\text{Number of points in the grid } H^n) - 1 . \]
To indicate successive elements in a subgrid, we define t_{i,−}^n and t_{i,+}^n as the preceding and
following elements of t_i^n ∈ H^n in the subgrid H^n.
Thus, if H^n = G^n, then t_{i,−}^n = t_{i−1}^n and t_{i,+}^n = t_{i+1}^n. Furthermore, |H^n| is also the
number of time increments (t_i^n, t_{i,+}^n] in H^n such that both endpoints of the time intervals
are contained in H^n, and hence |G^n| = n.
Note that when the number of time increments, |G^n| = n, is considered for n →
∞, the sequence of grids G^n = {0 = t_0^n, ..., t_n^n = T} of observation times is considered
asymptotically as well. That is, on a fixed interval [0, T], G^n becomes more and more
dense as the number of time increments in G^n increases such that
\[ \delta t^n = \frac{T}{|G^n|} \to 0 , \qquad \text{as } n \to \infty , \tag{3.11} \]
and rates of convergence are hence given relative to the partition scheme above.
For simpler notation, the use of double subscripts, indicating the dependence on n,
has been avoided where the meaning is clear, but it is emphasized that all conditions
and results have to be interpreted in the above noted way.
Definition 4 (Realized Volatility) The realized volatility and covariance (or observed
quadratic variation and covariation), ⌈·, ·⌉, of the processes p and q, on an arbitrary subgrid
H ⊆ G with observation times in [0, T], are defined as
\[ \lceil p, p \rceil_T^{H} = \sum_{t_{i,-}, t_i \in H, \; t_i \le T} \left( p_{t_i} - p_{t_{i,-}} \right)^2 , \tag{3.12a} \]
\[ \lceil p, q \rceil_T^{H} = \sum_{t_{i,-}, t_i \in H, \; t_i \le T} \left( p_{t_i} - p_{t_{i,-}} \right) \cdot \left( q_{t_i} - q_{t_{i,-}} \right) , \tag{3.12b} \]
where p_{t_i} is the observation of the process p at time t_i.
The realized volatility and covariance are cadlag processes of finite variation, which only change values at times t_i ∈ H; the realized volatility is in addition non-decreasing.
As seen earlier, the quadratic variation is obtained by cumulating the instantaneous
squares of returns. Therefore, for sufficiently large n, the realized volatility provides an
arbitrarily good approximation to quadratic variation, and from (3.5) we have the result
\[ \lceil p, p \rceil_T^{H^n} \xrightarrow{\;p\;} [p, p]_T \tag{3.13} \]
with max_i |t_i^n − t_{i,−}^n| → 0, min H^n → 0, and max H^n → T as n → ∞, where t_{i,−}^n, t_i^n ∈ H^n.
Hence, in this model the realized volatility consistently estimates the quadratic variation. This property is essential as the direct observability of the realized volatility, for any
fixed sampling frequency n on the time interval [0, T], also makes the quadratic variation
basically observable using high-frequency price information. Moreover, it is important
to note that convergence in the mean follows if additionally (⌈p, p⌉_T^{H^n})_{n∈ℕ} is uniformly
integrable.
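Definition 5 (Order Notation) Let {a_n} and {b_n} be sequences of real numbers. The
O- and o-notation is defined by
\[ a_n = o(b_n) \;:\Longleftrightarrow\; \lim_{n \to \infty} \frac{a_n}{b_n} = 0 , \qquad \text{and} \]
\[ a_n = O(b_n) \;:\Longleftrightarrow\; \limsup_{n \to \infty} \left| \frac{a_n}{b_n} \right| < \infty . \]
The O_p- and o_p-notation is defined by
\[ a_n = o_p(b_n) \;:\Longleftrightarrow\; \frac{a_n}{b_n} \xrightarrow{\;p\;} 0 \quad \text{as } n \to \infty , \qquad \text{and} \]
\[ a_n = O_p(b_n) \;:\Longleftrightarrow\; \forall \, \eta > 0 \;\; \exists \, M < \infty \;\; \exists \, n_0 : \; P\left( \left| \frac{a_n}{b_n} \right| > M \right) < \eta \quad \forall \, n > n_0 . \]
3.3 Volatility Measure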
The aim of this subsection is to give a rationale for the fact that practitioners see realized
volatility as a feasible volatility measure, and also to point out under which assumptions
such an undertaking is possible. Thus, we will introduce assumptions beyond those given
for the process p as in (2.1) or respectively those given for the process X as in (2.4),
and employ these assumptions in this subsection to validate the usefulness of the realized
volatility.
To understand the relationship between the quadratic variation and realized volatility
of a square integrable process p ∈ SSM as given in (2.1), and their importance as a
volatility measure, we proceed by taking a closer look at the variance of pt = At + Mt ,
which reveals
\[ \operatorname{Var}(p_t) = \operatorname{Var}(M_t) + \operatorname{Var}(A_t) + 2 \operatorname{Cov}(M_t, A_t) . \tag{3.14} \]
It is possible to see above that the interdependencies between M and A create further
complexities in the estimation of the variance of p. Andersen et al. (2003) reduce these
interdependencies by making the assumption that the process A is a continuous predetermined function over [0, T], and back this restrictive assumption with an economic and
financial rationale⁵. Moreover, the consequences of this assumption seem justified from a
down-to-earth perspective, as in many models for short term analysis the drift and the
martingale component are assumed to be uncorrelated, and if the time interval [0, t] is
small, the variance of the drift is regarded as almost negligible compared to the variance
of the martingale component. This assumption can be formulated as
\[ A \in FV^c , \qquad \operatorname{Cov}(A_t, M_t) = 0 , \qquad \operatorname{Var}(A_t) = 0 , \tag{3.15} \]
\[ \text{and} \qquad \mathbb{E}\left( \lceil A, M \rceil_t^{H^n} \right) \xrightarrow{n \to \infty} 0 , \qquad \mathbb{E}\left( \lceil A, A \rceil_t^{H^n} \right) \xrightarrow{n \to \infty} 0 , \tag{3.16} \]
where both assumptions above are true if A is deterministic, and (3.16) alone follows from
(3.13) together with ⌈A, M⌉_t^{H^n} and ⌈A, A⌉_t^{H^n} being uniformly integrable. The consequence
is
\[ \operatorname{Var}(p_t) \overset{(3.15)}{=} \operatorname{Var}(M_t) = \mathbb{E}\left( M_t^2 \right) \overset{(3.9)}{=} \mathbb{E}\left( [M, M]_t \right) = \mathbb{E}\left( [p, p]_t \right) , \tag{3.17} \]
which shows that under the economically justified assumptions given in (3.15) the variance of p_t equals the expectation of its quadratic variation, which is feasible, as we have
seen in (3.13) that the quadratic variation of p_t can be consistently estimated by the
realized volatility of p_t on the grid H.
⁵ Andersen et al. (2003), (p. 6 et sqq.) state: “Although these conclusions may appear to hinge on
restrictive assumptions, they apply to a wide set of models used in the literature. For example, a constant
mean is frequently invoked in models for daily or weekly asset returns.” For further discussion on the
economic and financial justifications see Andersen et al. (2003).
Note that the convergence in probability of realized volatility to quadratic variation
does generally not imply a convergence in expectation of these two terms. However, with
\[ \lceil p, p \rceil_t^{H^n} = \lceil A, A \rceil_t^{H^n} + 2 \cdot \lceil A, M \rceil_t^{H^n} + \lceil M, M \rceil_t^{H^n} , \]
and given the assumptions in (3.16) together with the uniform integrability of ⌈M, M⌉_t^{H^n}
we have
\[ \lim_{n \to \infty} \mathbb{E}\left( \lceil p, p \rceil_t^{H^n} \right) = \mathbb{E}\left( [M, M]_t \right) \overset{(3.15) \, \& \, (3.9)}{=} \operatorname{Var}(p_t) , \tag{3.18} \]
which means that the realized volatility is an asymptotically unbiased estimator of the variance of returns,
and would allow an acceptable approximation to it.
Hence, we have clarified under which conditions it is possible to draw a feasible connection between the variance, quadratic variation and realized volatility for models using
special semimartingales, where, as noted by Andersen et al. (2003), practitioners utilize
models with such assumptions as given above for a short term analysis of asset returns.
We apply these results firstly to the continuous Itô process X c ∈ SSMc , secondly to the
pure jump process X d ∈ FV and eventually sum up the outcomes for the jump diffusion
process X = X c + X d ∈ SSM.
For an Itô process X^c the quantities μ_t dt and σ_t² dt are the infinitesimal predictive
mean and volatility of returns with
\[ dX^c_t \mid F_{t-} \overset{L}{\sim} N\left( \mu_t \, dt , \; \sigma_t^2 \, dt \right) , \]
and we can infer that the quadratic variation of X^c equals the integrated volatility of the
process X^c over the time interval [0, t], i.e.
\[ [X^c, X^c]_t = \left[ \int_0^t \sigma_s \, dB_s , \int_0^t \sigma_s \, dB_s \right] = \int_0^t \sigma_s^2 \, ds , \tag{3.19} \]
so that the quadratic variation is a cumulative volatility measure for a univariate diffusion,
and under the conditions leading to equation (3.17) we have
\[ \mathbb{E}\left( \int_0^t \sigma_s^2 \, ds \right) = \mathbb{E}\left( [X^c, X^c]_t \right) = \operatorname{Var}\left( X^c_t \right) . \tag{3.20} \]
Most notably, it is possible to approximate the integrated volatility of X^c through X^c’s
realized volatility with (3.13), i.e.
\[ \lceil X^c, X^c \rceil_T^{H^n} \xrightarrow{\;p\;} \int_0^T \sigma_s^2 \, ds , \tag{3.21} \]
so that the realized volatility of an Itô process consistently estimates its integrated volatility, and thereby makes it observable through high-frequency data.
For the homogeneous compound Poisson process X^d the quantity λ · 𝔼(γ) dt is the
infinitesimal mean, and λ · 𝔼(γ²) dt is the infinitesimal volatility of returns so that the
integrated volatility of X^d over the time interval [0, t] is given by⁶
\[ \int_0^t \lambda \cdot \mathbb{E}\left( \gamma^2 \right) ds = \left\langle M^d, M^d \right\rangle_t \ne \left[ M^d, M^d \right]_t = \sum_{k=0}^{N_t} \gamma_k^2 . \tag{3.22} \]
Because of this fact we come to the conclusion that in the case of a compound Poisson process,
where due to its non-continuity the quadratic variation is not identical to the sharp bracket
process, the quadratic variation of X^d_t is not equal to X^d_t’s integrated volatility. Nonetheless, under condition (3.15) together with equation (3.9) we have
\[ \begin{aligned} \mathbb{E}\left( \int_0^t \lambda \cdot \mathbb{E}\left( \gamma^2 \right) ds \right) &= \mathbb{E}\left( \left\langle M^d, M^d \right\rangle_t \right) \overset{(3.9)}{=} \mathbb{E}\left( \left[ M^d, M^d \right]_t \right) = \mathbb{E}\left( \sum_{k=0}^{N_t} \gamma_k^2 \right) \\ &\overset{(3.9)}{=} \mathbb{E}\left( \left( M^d_t \right)^2 \right) \overset{(3.15)}{=} \operatorname{Var}\left( X^d_t \right) , \end{aligned} \tag{3.23} \]
and therefore we can state that based on assumption (3.15) the expectation of X^d_t’s
quadratic variation equals the expectation of X^d_t’s integrated volatility as well as its
variance. Although equation (3.22) indicates that, in contrast to an Itô process, in a pure
jump case it is not possible to make any general statements about a consistent estimation
of the integrated volatility of X^d_t through its realized volatility, we can assert that, under assumption (3.15) and the uniform integrability of ⌈X^d, X^d⌉_t^{H^n}, equation (3.18) holds
so that the realized volatility of X^d_t is an asymptotically unbiased estimator of X^d_t’s
variance, i.e.
\[ \lim_{n \to \infty} \mathbb{E}\left( \lceil X^d, X^d \rceil_t^{H^n} \right) = \operatorname{Var}\left( X^d_t \right) . \tag{3.24} \]
⁶ For the first equality see Klebaner (2005), Theorem 9.15 (p. 260)
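The difference between the sharp bracket and the quadratic variation of the jump component can be illustrated numerically. The following is a minimal Monte Carlo sketch under assumptions not made in the text: Gaussian jump sizes with standard deviation `jump_sd` and hypothetical parameter values. The sharp bracket λ · t · 𝔼(γ²) is a deterministic number, whereas each simulated quadratic variation is random; only their expectations agree.

```python
import random


def compound_poisson_qv(lam, t, jump_sd, rng):
    """One draw of the sum of squared jump sizes up to time t."""
    qv = 0.0
    arrival = rng.expovariate(lam)  # exponential inter-arrival times, rate lam
    while arrival <= t:
        qv += rng.gauss(0.0, jump_sd) ** 2  # centered Gaussian jump (assumption)
        arrival += rng.expovariate(lam)
    return qv


lam, t, jump_sd = 5.0, 1.0, 0.1  # hypothetical values
rng = random.Random(42)
draws = [compound_poisson_qv(lam, t, jump_sd, rng) for _ in range(20000)]
sharp_bracket = lam * t * jump_sd ** 2  # lambda * t * E(gamma^2), deterministic
mean_qv = sum(draws) / len(draws)       # Monte Carlo estimate of E([M^d, M^d]_t)
```

The individual entries of `draws` scatter widely around `sharp_bracket`, while `mean_qv` lies close to it, in line with the equality of expectations in (3.23).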
Moving to the process X we note that the quadratic variation of X_t follows from (3.7),
and is given by
\[ [X, X]_t = \int_0^t \sigma_s^2 \, ds + \sum_{k=1}^{N_t} \gamma_k^2 . \tag{3.25} \]
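Moreover, under the conditions leading to (3.17), and due to the independence of X^c and
X^d the variance of X_t results from equations (3.20) and (3.23), and equals
\[ \operatorname{Var}(X_t) = \operatorname{Var}\left( X^c_t \right) + \operatorname{Var}\left( X^d_t \right) \overset{(3.17)}{=} \mathbb{E}\left( [X^c, X^c]_t + [X^d, X^d]_t \right) = \mathbb{E}\left( \int_0^t \sigma_s^2 \, ds + \sum_{k=1}^{N_t} \gamma_k^2 \right) . \tag{3.26} \]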
Furthermore, (3.13) implies that the realized volatility of $X_t$ is a consistent estimator of the quadratic variation of $X_t$ on the grid $\mathcal{H}^n$, i.e.
\[
\lceil X, X \rceil_t^{\mathcal{H}^n} \xrightarrow{\;p\;} \int_0^t \sigma_s^2\, ds + \sum_{k=1}^{N_t} \gamma_k^2 , \tag{3.27}
\]
and that, under the specific assumption given in (3.15) and the uniform integrability of $\lceil X, X \rceil_t^{\mathcal{H}^n}$, the expectation of the realized volatility of $X_t$ is an asymptotically unbiased estimator of the variance of $X_t$, that is
\[
\lim_{n \to \infty} \mathbb{E}\left(\lceil X, X \rceil_t^{\mathcal{H}^n}\right) = \operatorname{Var}(X_t). \tag{3.28}
\]
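The convergence in (3.27) can be illustrated numerically. The following is a minimal sketch, not part of the original derivation: it assumes constant volatility, zero drift and normally distributed jump sizes, and the function name and all parameter values are hypothetical choices made for this illustration.

```python
import math
import random


def simulate_rv_and_qv(n, T=1.0, sigma=0.2, lam=3.0, jump_sd=0.1, seed=0):
    """Simulate one path of X = sigma*B + compound Poisson jumps on an
    equidistant n-step grid; return (realized volatility, quadratic variation).

    Assumptions for this sketch: constant sigma, zero drift, N(0, jump_sd^2) jumps.
    """
    rng = random.Random(seed)
    dt = T / n
    rv = 0.0             # sum of squared increments (realized volatility)
    sum_sq_jumps = 0.0   # sum of gamma_k^2, the jump part of [X, X]_T
    for _ in range(n):
        # continuous part: Brownian increment over one grid interval
        dx = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        # jump part: exponential waiting times generate the Poisson arrivals
        # inside this interval (memorylessness lets us restart the clock)
        t = rng.expovariate(lam)
        while t < dt:
            g = rng.gauss(0.0, jump_sd)   # zero-mean jump size gamma_k
            dx += g
            sum_sq_jumps += g * g
            t += rng.expovariate(lam)
        rv += dx * dx
    qv = sigma ** 2 * T + sum_sq_jumps    # right-hand side of (3.25) at t = T
    return rv, qv


if __name__ == "__main__":
    for n in (100, 10_000, 100_000):
        rv, qv = simulate_rv_and_qv(n, seed=42)
        print(n, round(rv, 5), round(qv, 5))
```

Each call draws a new path, so the quadratic variation differs between grid sizes; what matters is that the gap between the two returned numbers shrinks as n grows.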
Consequently, under the economically justified assumptions presented at the beginning of this subsection, the expectation of the realized volatility of $X_t$ can be estimated empirically by analyzing a large number of time intervals, which yields an approximation of the variance of the returns of the underlying asset. This is an important outcome of the derivation. In particular, the approach is non-parametric and thus requires no specification of the drift and diffusion functions.
These results give us adequate justification to regard the realized volatility as a sensible instrument in the study of a Poisson jump diffusion’s volatility, which in the sense
of the evaluation above is embodied by a jump diffusion’s quadratic variation. Although
the further analysis will not hinge on assumptions as presented above, we have an understanding of why practitioners regard the realized volatility and quadratic variation as
feasible. Therefore, we proceed as mentioned in section 1 by analyzing the effects of market microstructure noise and discretization on the estimation of X’s quadratic variation
with noisy high-frequency data of the Poisson jump diffusion process X.
4 Estimating Quadratic Variation on the Full Grid
The findings in this section partially extend the contributions in the second section of Zhang et al. (2005) from a model based on a continuous semimartingale to the Poisson jump diffusion process $X$ of form (2.4) with $X^c$ as in (2.5) and $X^d$ as in (2.6).
4.1 Error from Market Microstructure Noise on the Full Grid
Taking into account that contamination due to market microstructure noise is not considered in the examined price process $X$, we begin by incorporating the observation error of the true price through the introduction of the observed process $Y$ on the full grid $\mathcal{G}$ on $[0, T]$, which is given by
\[
Y_{t_i} = X_{t_i} + \epsilon_{t_i}, \tag{4.1}
\]
where $t_i \in \mathcal{G}$, and the process $X$ is the latent true logarithmic price as in (2.4), which would be the correct price in a perfect market without frictions, trading costs and informational asymmetries. The $\epsilon_{t_i}$'s that represent microstructure noise around the true price are independent of $\mathcal{F}$ and satisfy the assumptions that the $\epsilon_{t_i}$'s are i.i.d. with $\mathbb{E}(\epsilon_{t_i}) = 0$ as well as $\mathbb{E}(\epsilon_{t_i}^4) < \infty$.
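The observed realized volatility, $\lceil Y, Y \rceil_T^{\mathcal{G}}$, which is the realized volatility of the observed process $Y$ along the full grid $\mathcal{G}$ as detailed in (3.12a), has the following form
\[
\lceil Y, Y \rceil_T^{\mathcal{G}} = \lceil X, X \rceil_T^{\mathcal{G}} + 2 \cdot \lceil X, \epsilon \rceil_T^{\mathcal{G}} + \lceil \epsilon, \epsilon \rceil_T^{\mathcal{G}} . \tag{4.2}
\]
Lemma 1 Assume $X$ to be a process of form (2.4) and $Y_{t_i}$, $\epsilon_{t_i}$ to be defined as in (4.1), then the expectation and variance of the observed realized volatility, $\lceil Y, Y \rceil_T^{\mathcal{G}}$, conditional on the process $X$, are given by
\[
\mathbb{E}\left(\lceil Y, Y \rceil_T^{\mathcal{G}} \mid X\right) = \lceil X, X \rceil_T^{\mathcal{G}} + 2n\, \mathbb{E}(\epsilon^2) \tag{4.3}
\]
and respectively
\[
\operatorname{Var}\left(\lceil Y, Y \rceil_T^{\mathcal{G}} \mid X\right) = 4n\, \mathbb{E}(\epsilon^4) + 8\, \lceil X, X \rceil_T^{\mathcal{G}} \cdot \mathbb{E}(\epsilon^2) - 2 \operatorname{Var}(\epsilon^2) + O_p\left(n^{-1/2}\right). \tag{4.4}
\]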
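Proof: We begin with the derivation of the conditional expectation in (4.3), given by
\[
\begin{aligned}
\mathbb{E}\left(\lceil Y, Y \rceil_T^{\mathcal{G}} \mid X\right)
&= \mathbb{E}\left(\lceil X, X \rceil_T^{\mathcal{G}} + 2\, \lceil X, \epsilon \rceil_T^{\mathcal{G}} + \lceil \epsilon, \epsilon \rceil_T^{\mathcal{G}} \mid X\right) \\
&= \lceil X, X \rceil_T^{\mathcal{G}} + 2\, \underbrace{\mathbb{E}\left(\lceil X, \epsilon \rceil_T^{\mathcal{G}} \mid X\right)}_{I_{L1}} + \underbrace{\mathbb{E}\left(\lceil \epsilon, \epsilon \rceil_T^{\mathcal{G}}\right)}_{II_{L1}} .
\end{aligned}
\]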
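Recalling that the zero-mean $\epsilon_{t_i}$ are i.i.d. and independent of the process $X$, we continue by deriving $I_{L1}$ through a more general consideration on an arbitrary subgrid $\mathcal{H} \subseteq \mathcal{G}$, as we will need these more general results for some of the proofs in the next section, and obtain
\[
\begin{aligned}
\mathbb{E}\left(\lceil X, \epsilon \rceil_T^{\mathcal{H}} \mid X\right)
&= \mathbb{E}\left(\sum_{t_{i,-},\, t_i \in \mathcal{H},\; t_i \le T} \left(X_{t_i} - X_{t_{i,-}}\right) \cdot \left(\epsilon_{t_i} - \epsilon_{t_{i,-}}\right) \,\Big|\, X\right) \\
&= \sum_{t_{i,-},\, t_i \in \mathcal{H},\; t_i \le T} \left(X_{t_i} - X_{t_{i,-}}\right) \cdot \underbrace{\mathbb{E}\left(\epsilon_{t_i} - \epsilon_{t_{i,-}}\right)}_{=0} = 0
\end{aligned} \tag{4.5}
\]
and, respectively, for $II_{L1}$
\[
\begin{aligned}
\mathbb{E}\left(\lceil \epsilon, \epsilon \rceil_T^{\mathcal{H}}\right)
&= \mathbb{E}\left(\sum_{t_{i,-},\, t_i \in \mathcal{H},\; t_i \le T} \left(\epsilon_{t_i} - \epsilon_{t_{i,-}}\right)^2\right) \\
&= \sum_{t_{i,-},\, t_i \in \mathcal{H},\; t_i \le T} \left(\mathbb{E}(\epsilon_{t_i}^2) + \mathbb{E}(\epsilon_{t_{i,-}}^2)\right) - 2 \sum_{t_{i,-},\, t_i \in \mathcal{H},\; t_i \le T} \underbrace{\mathbb{E}(\epsilon_{t_i})}_{=0} \cdot \underbrace{\mathbb{E}(\epsilon_{t_{i,-}})}_{=0} \\
&= 2 \cdot |\mathcal{H}| \cdot \mathbb{E}(\epsilon^2),
\end{aligned} \tag{4.6}
\]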
so that in case $\mathcal{H} = \mathcal{G}$ this results in
\[
\mathbb{E}\left(\lceil Y, Y \rceil_T^{\mathcal{G}} \mid X\right) \overset{|\mathcal{G}^n| = n}{=} \lceil X, X \rceil_T^{\mathcal{G}} + 2 \cdot n \cdot \mathbb{E}(\epsilon^2) .
\]
For the calculation of the conditional variance in equation (4.4) see Zhang et al. (2005), subsection A.1 (p. 1407), as their evaluation likewise applies to the semimartingale $X$ as in (2.4) under the conditions stated for $X^c$ as in (2.5) and for $X^d$ as in (2.6), so that (4.4) holds. □
Lemma 1 shows how strongly the realized volatility is biased by market microstructure noise and that it estimates the wrong quantity. Rather than estimating the true realized volatility, $\lceil X, X \rceil_T^{\mathcal{G}}$, the observed realized volatility, $\lceil Y, Y \rceil_T^{\mathcal{G}}$, is dominated by the second moment, $\mathbb{E}(\epsilon^2)$, and the fourth moment, $\mathbb{E}(\epsilon^4)$, of the noise term $\epsilon$.
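The bias term $2n\,\mathbb{E}(\epsilon^2)$ in (4.3) can be checked by simulation. The sketch below is an illustration under assumptions that are not part of the thesis: the latent path is a hypothetical driftless Gaussian random walk that is drawn once and then held fixed (mimicking the conditioning on $X$), the noise is Gaussian, and all names and parameter values are chosen for this example only.

```python
import random


def noise_bias_check(n=1000, noise_sd=0.01, reps=200, seed=1):
    """Compare the Monte Carlo mean of (realized vol of Y) - (realized vol of X),
    with X held fixed, against the theoretical bias 2 * n * E(eps^2) from (4.3)."""
    rng = random.Random(seed)
    # One fixed latent path X (hypothetical: random walk with total variance
    # 0.2**2); the lemma conditions on X, so X is drawn only once.
    x = [0.0]
    for _ in range(n):
        x.append(x[-1] + rng.gauss(0.0, 0.2 / n ** 0.5))
    rv_x = sum((x[i] - x[i - 1]) ** 2 for i in range(1, n + 1))

    total = 0.0
    for _ in range(reps):
        # fresh i.i.d. zero-mean noise on every observation time
        y = [xi + rng.gauss(0.0, noise_sd) for xi in x]
        total += sum((y[i] - y[i - 1]) ** 2 for i in range(1, n + 1))
    mc_bias = total / reps - rv_x
    theory = 2 * n * noise_sd ** 2
    return mc_bias, theory


if __name__ == "__main__":
    print(noise_bias_check())
```

With these values the theoretical bias is 0.2, five times the total variance 0.04 of the latent path, which makes the dominance of the noise term visible.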
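Lemma 2 Assume $X$ to be a process of form (2.4) and $Y_{t_i}$, $\epsilon_{t_i}$ to be defined as in (4.1), then for $n \to \infty$, conditional on the $X$ process,
\[
\frac{1}{\sqrt{n}} \left(\lceil Y, Y \rceil_T^{\mathcal{G}} - 2n\, \mathbb{E}(\epsilon^2)\right) \Big|\, X \xrightarrow{\;\mathcal{L}\;} 2 \sqrt{\mathbb{E}(\epsilon^4)} \times Z_{\mathrm{noise}}, \tag{4.7}
\]
where $Z_{\mathrm{noise}}$ is a standard normal random variable with the subscript "noise" indicating that the randomness comes from the noise term $\epsilon$. Furthermore, a consistent estimator of the variance of the noise term is given by
\[
\widehat{\mathbb{E}(\epsilon^2)} = \frac{1}{2n} \cdot \lceil Y, Y \rceil_T^{\mathcal{G}},
\]
so that, conditional on $X$, as $n \to \infty$
\[
\sqrt{n} \cdot \left(\widehat{\mathbb{E}(\epsilon^2)} - \mathbb{E}(\epsilon^2)\right) \Big|\, X \xrightarrow{\;\mathcal{L}\;} \sqrt{\mathbb{E}(\epsilon^4)} \times Z, \tag{4.8}
\]
where the random variable $Z$ is standard normal, and the fourth moment of the noise term $\epsilon$ can be consistently estimated by
\[
\widehat{\mathbb{E}(\epsilon^4)} = \frac{1}{2n} \cdot \sum_{i=1}^{n} \left(Y_{t_i} - Y_{t_{i-1}}\right)^4 - 3 \cdot \left(\widehat{\mathbb{E}(\epsilon^2)}\right)^2 . \tag{4.9}
\]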
Proof: For the proof see Zhang et al. (2005), where these results are part of Theorem A.1, subsection A.2 (p. 1408-9). Theorem A.1 is based on Lemma A.2 (a), subsection A.2 (p. 1408), which does not use any properties specific to a continuous Itô process and can therefore be applied directly to a special semimartingale as used in this thesis. Moreover, the result of Theorem A.1 in Zhang et al. (2005) is conditional on the underlying process $X$, and thus unrelated to the nature of the process $X$. Finally, Theorem A.1 in Zhang et al. is only concerned with the distribution of the realized volatility of the noise terms in the full and multi grid case, hence the proof of Theorem A.1 remains unchanged in case $X$ is of form (2.4). □
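The two moment estimators of Lemma 2 translate directly into code. The following sketch is illustrative only: the function name is hypothetical, the fourth-moment estimator subtracts three times the square of the second-moment estimate, and the demonstration data (a random walk plus Gaussian noise) are an assumption made for this example.

```python
import random


def noise_moment_estimates(y):
    """Estimate E(eps^2) and E(eps^4) from observations y[0..n] on the full grid.

    e2_hat = realized volatility of Y / (2n), the estimator stated before (4.8);
    e4_hat = (1/(2n)) * sum of fourth powers of increments - 3 * e2_hat**2, cf. (4.9).
    """
    n = len(y) - 1
    diffs = [y[i] - y[i - 1] for i in range(1, n + 1)]
    rv_y = sum(d * d for d in diffs)
    e2_hat = rv_y / (2 * n)
    e4_hat = sum(d ** 4 for d in diffs) / (2 * n) - 3 * e2_hat ** 2
    return e2_hat, e4_hat


if __name__ == "__main__":
    # Hypothetical data: latent random walk plus N(0, 0.01^2) noise,
    # so E(eps^2) = 1e-4 and E(eps^4) = 3e-8.
    rng = random.Random(7)
    n, x = 20000, 0.0
    y = [rng.gauss(0.0, 0.01)]
    for _ in range(n):
        x += rng.gauss(0.0, 0.2 / n ** 0.5)
        y.append(x + rng.gauss(0.0, 0.01))
    print(noise_moment_estimates(y))
```

Both estimates absorb a small contribution from the latent path, which vanishes relative to the noise as the grid is refined.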
Lemma 1 shows that the bias of the realized volatility increases linearly as the sampling frequency is increased. Nonetheless, a high sampling frequency can be very beneficial if the effects from discretization are considered. On the one hand, the proportion of contamination due to market microstructure noise is increased by sampling finely and thus the observed realized volatility, $\lceil Y, Y \rceil_T^{\mathcal{G}}$, reflects the realized volatility of the underlying process, $\lceil X, X \rceil_T^{\mathcal{G}}$, with ever less accuracy. On the other hand, as the sampling frequency is increased, the time interval between two observation times becomes smaller, and hence the true quadratic variation between these two observations will be captured with greater precision by the realized volatility of the underlying process, $\lceil X, X \rceil_T^{\mathcal{G}}$. As a result, empirical researchers have established the general rule in regard to realized volatility to sample less frequently, that is to use only a fraction of the price information available, to balance these two effects, and thus discard a large proportion of the available data.
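The trade-off between the two effects can be made concrete with a small experiment. The sketch below assumes a continuous latent path with constant volatility and Gaussian noise; the helper names, the number of observations and the subsampling step are hypothetical values chosen for this illustration.

```python
import random


def realized_vol(path, step=1):
    """Realized volatility of `path` using every `step`-th observation."""
    pts = path[::step]
    return sum((b - a) ** 2 for a, b in zip(pts, pts[1:]))


def noisy_brownian_path(n, T=1.0, sigma=0.2, noise_sd=0.005, seed=2):
    """Observations Y = X + eps on an equidistant grid with n intervals,
    where X is a driftless Brownian motion with constant sigma (assumption)."""
    rng = random.Random(seed)
    sd = sigma * (T / n) ** 0.5
    x, y = 0.0, []
    for i in range(n + 1):
        if i > 0:
            x += rng.gauss(0.0, sd)
        y.append(x + rng.gauss(0.0, noise_sd))
    return y


if __name__ == "__main__":
    y = noisy_brownian_path(23400)
    print("full grid  :", realized_vol(y))        # dominated by 2 * n * E(eps^2)
    print("sparse grid:", realized_vol(y, 300))   # much closer to sigma^2 * T = 0.04
```

On the full grid the estimate is dominated by the noise term of order $2n\,\mathbb{E}(\epsilon^2)$, whereas the sparse grid gives up most of the data but lands far closer to the quadratic variation of the latent path.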
Zhang et al. implement sparse sampling by considering subgrids $\mathcal{H}^n \subseteq \mathcal{G}^n$ with $|\mathcal{H}^n| = n_{\mathrm{sparse}}$. To study the total error from noise and discretization, $\lceil Y, Y \rceil_T^{\mathcal{H}} - [X, X]_T$, we first present the effect of sparse sampling on the estimation of the observed realized volatility, $\lceil Y, Y \rceil_T^{\mathcal{H}}$, under the presence of market microstructure noise as given in Zhang et al. (2005). Subsequently, we analyze the discretization error between the realized volatility of the underlying process, $\lceil X, X \rceil_T^{\mathcal{H}}$, and the true quadratic variation, $[X, X]_T$.
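Lemma 3 Assume $X$ to be a process of form (2.4) and $Y_{t_i}$, $\epsilon_{t_i}$ to be defined as in (4.1). Suppose that for a given $n$, the grid $\mathcal{G}^n$ and subgrid $\mathcal{H}^n$ are given, and that $n_{\mathrm{sparse}} \to \infty$ and $\delta t^n \to 0$ as $n \to \infty$ is fulfilled for the sequence of grids $\mathcal{G}^n$. Then, conditional on the $X$ process,
\[
\begin{aligned}
\lceil Y, Y \rceil_T^{\mathcal{H}^n} = {} & \lceil X, X \rceil_T^{\mathcal{H}^n} + 2\, n_{\mathrm{sparse}} \cdot \mathbb{E}(\epsilon^2) \\
& + \left(4\, n_{\mathrm{sparse}} \cdot \mathbb{E}(\epsilon^4) + 8\, \lceil X, X \rceil_T^{\mathcal{H}^n} \cdot \mathbb{E}(\epsilon^2) - 2 \operatorname{Var}(\epsilon^2)\right)^{1/2} \times Z_{\mathrm{noise}} + O_p\left(n^{-1/4}\right),
\end{aligned} \tag{4.10}
\]
where $Z_{\mathrm{noise}}$ is an asymptotically standard normal random variable with the subscript "noise" indicating that the randomness comes from the noise term $\epsilon$.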
Proof: For the proof see Zhang et al. (2005), proof of Lemma 1, subsection A.2 (p. 1409), where the proof is based on Lemma A.2 (a) (p. 1408) as well as Theorem A.1 (p. 1408-9) from Zhang et al. (2005), which both remain true if $X$ is a special semimartingale⁷. Since the prerequisites of the proof of Lemma 1 of Zhang et al. (2005) remain unchanged, and since the result of Lemma 1 is conditional on the $X$ process and thus unrelated to its nature, we can assert that Lemma 1 of Zhang et al. (2005) is valid for the jump diffusion process $X$ as given in (2.4) of this thesis. Please note that in contrast to Lemma 1 of Zhang et al. (2005), the law of the noise term $\epsilon$ does not change with $n$ in Lemma 3 of this thesis. Thus, the remainder term $O_p\left(n^{-1/4}\right)$ that results from the conditional variance of the observed realized volatility as given in (4.4) does not depend on $\mathbb{E}(\epsilon^2)$ as is the case in Lemma 1 of Zhang et al. (2005). □
4.2 Error from Discretization on the Full Grid
We now aim for a result on the variance of the discretization error, $\lceil X, X \rceil_T^{\mathcal{G}^n} - [X, X]_T$, of the realized volatility of the process $X$ in regard to its quadratic variation. First, we introduce a result on the approximate behavior of the discretization error of the continuous process $X^c$. For this purpose it is necessary to introduce a mode of convergence termed "stable convergence in law"⁸.
⁷ Refer to the proof of Lemma 2 of this thesis, where we detail the reasons why Lemma A.2 (a) and Theorem A.1 of Zhang et al. (2005) apply to general semimartingales without change.
Definition 6 (Stable Convergence In Law) A sequence of random variables $Z_n$ converges stably in law to $Z$, written "$Z_n \xrightarrow{\;\mathcal{L}_S\;} Z$", if $Z$ is defined on an extension $(\tilde{\Omega}, \tilde{\mathcal{F}}, \tilde{P})$ of the probability space $(\Omega, \mathcal{F}, P)$, and if
\[
\lim_{n \to \infty} \mathbb{E}\left(U f(Z_n)\right) = \mathbb{E}\left(U f(Z)\right), \tag{4.11}
\]
for every bounded, continuous $f$, and every $\mathcal{F}$-measurable, bounded random variable $U$.
It is important to note that stable convergence in law implies weak convergence, which is obtained from the definition above by taking $U = 1$. We will give reasons for the introduction of this mode of convergence after its use in the next proposition.
Now, we restate the result from Zhang et al. (2005), equation (25) (p. 1398), on the discretization error of the continuous semimartingale $X^c$ in the full grid $\mathcal{G}^n$, $\lceil X^c, X^c \rceil_T^{\mathcal{G}^n} - [X^c, X^c]_T$, and present a proof which is based on Jacod and Protter (1998), Theorem 5.5 (p. 293).
Proposition 1 Under the conditions given in connection with (2.5) for the process $X^c$, as $n \to \infty$
\[
n^{1/2} \cdot \left(\lceil X^c, X^c \rceil_T^{\mathcal{G}^n} - [X^c, X^c]_T\right) \xrightarrow{\;\mathcal{L}_S\;} \left(2 \cdot T \int_0^T \sigma_s^4\, ds\right)^{1/2} \cdot Z_{\mathrm{disc}}^c, \tag{4.12}
\]
where $Z_{\mathrm{disc}}^c$ is a standard normal random variable which is independent of the process $X^c$, with the subscript "disc" denoting the randomness being caused by discretization effects.
Jacod and Protter (1998) use this mode of convergence, which is stronger than weak convergence, to prove the central limit theorem at hand, as the variance of the limiting discretization error given in the proposition above is stochastic. Although stable convergence in law in this case is weaker than convergence conditional on $X^c$, the convergence stated above indicates that the scaled discretization error, $n^{1/2} \cdot \left(\lceil X^c, X^c \rceil_T^{\mathcal{G}^n} - [X^c, X^c]_T\right)$, converges to the limiting random variable jointly with the underlying process $X^c$. In the proof we will see that $Z_{\mathrm{disc}}^c$ results from the appearance of an additional Brownian motion $W$, whose independence of the process $X^c$ is of great importance, because then conditional on $\mathcal{F}$ the limit of the scaled discretization error is a centered normal random variable with variance $2T \int_0^T \sigma_s^4\, ds$.
⁸ See Jacod and Protter (1998), Chapter 2 (p. 270) or Hall and Heyde (1980), Chapter 3.2.(iv) (p. 56).
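Proof: By the definition of the quadratic variation we have for $t_{i,-}, t_i \in \mathcal{H}^n$
\[
\left(X_t^c\right)^2 \overset{(3.1)}{=} [X^c, X^c]_t + 2 \int_0^t X_{s-}^c\, dX_s^c
\]
and thus
\[
\begin{aligned}
X_{t_i}^c \cdot X_{t_{i,-}}^c &= X_{t_{i,-}}^c \cdot \left(X_{t_i}^c - X_{t_{i,-}}^c\right) + \left(X_{t_{i,-}}^c\right)^2 \\
&= \int_{t_{i,-}}^{t_i} X_{t_{i,-}}^c\, dX_s^c + [X^c, X^c]_{t_{i,-}} + 2 \int_0^{t_{i,-}} X_{s-}^c\, dX_s^c ,
\end{aligned}
\]
so that for $t_{i-1}, t_i \in \mathcal{G}^n$ we have
\[
\left(X_{t_i}^c - X_{t_{i-1}}^c\right)^2 = [X^c, X^c]_{t_i} - [X^c, X^c]_{t_{i-1}} + 2 \int_{t_{i-1}}^{t_i} \left(X_{s-}^c - X_{t_{i-1}}^c\right) dX_s^c . \tag{4.13}
\]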
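Summing over $i = 1, ..., n$ for the full grid $\mathcal{G}^n$ leads to
\[
\begin{aligned}
\lceil X^c, X^c \rceil_T^{\mathcal{G}^n} - [X^c, X^c]_T
&= 2 \cdot \sum_{i=1}^{|\mathcal{G}^n|} \int_{t_{i-1}}^{t_i} \left(X_{s-}^c - X_{t_{i-1}}^c\right) dX_s^c \\
&= 2 \cdot \int_0^T \left(X_{s-}^c - X_{\delta t^n \cdot \lfloor s / \delta t^n \rfloor}^c\right) dX_s^c .
\end{aligned} \tag{4.14}
\]
For an analysis of the asymptotic behavior of $\int_0^T \left(X_{s-}^c - X_{\delta t^n \cdot \lfloor s / \delta t^n \rfloor}^c\right) dX_s^c$, we refer to Theorem 5.5 of Jacod and Protter (1998), where the authors show that if $X^c$ satisfies the assumptions in connection with (2.5), then as $n \to \infty$
\[
n^{1/2} \int_0^T \left(X_{s-}^c - X_{\delta t^n \cdot \lfloor s / \delta t^n \rfloor}^c\right) dX_s^c \xrightarrow{\;\mathcal{L}_S\;} \left(\frac{T}{2}\right)^{1/2} \cdot \int_0^T \sigma_s^2\, dW_s ,
\]
where the convergence is stable in law, and $W$ is a standard Brownian motion, independent of $X^c$. The independence of $W$ and $X^c$ together with the equality in (4.14) implies that
\[
n^{1/2} \cdot \left(\lceil X^c, X^c \rceil_T^{\mathcal{G}^n} - [X^c, X^c]_T\right) \xrightarrow{\;\mathcal{L}_S\;} \left(2 \cdot T \int_0^T \sigma_s^4\, ds\right)^{1/2} \cdot Z_{\mathrm{disc}}^c .
\]
□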
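For constant volatility and zero drift the limiting variance in Proposition 1 reduces to $2 T^2 \sigma^4$, which is easy to check by Monte Carlo. The sketch below is an illustration under exactly these simplifying assumptions; function names, sample sizes and the seed are hypothetical.

```python
import math
import random


def scaled_disc_errors(n=100, reps=4000, T=1.0, sigma=1.0, seed=5):
    """Draw `reps` values of sqrt(n) * (realized vol - quadratic variation)
    for X^c = sigma * B with constant sigma and zero drift (assumption)."""
    rng = random.Random(seed)
    sd = sigma * math.sqrt(T / n)
    out = []
    for _ in range(reps):
        rv = sum((sd * rng.gauss(0.0, 1.0)) ** 2 for _ in range(n))
        out.append(math.sqrt(n) * (rv - sigma ** 2 * T))
    return out


def sample_variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


if __name__ == "__main__":
    # Limiting variance from (4.12) with constant sigma: 2 * T * sigma^4 * T = 2
    print(sample_variance(scaled_disc_errors()))
```

In this special case the variance of the scaled error equals $2 T^2 \sigma^4$ for every $n$, so the simulation only has to overcome sampling error, not a finite-$n$ bias.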
Next, we introduce a result on the behavior of the discretization error of $X^d$.
Proposition 2 Assume the conditions given in connection with (2.6) for the process $X^d$, then for $n$ large enough
\[
\lceil X^d, X^d \rceil_T^{\mathcal{G}^n} - [X^d, X^d]_T \overset{a.s.}{=} 0 . \tag{4.15}
\]
Proof: For large enough $n$ almost all paths of $X^d$ have no more than one jump between any two successive observation times of $\mathcal{G}^n$, so that $\left(X_{t_i}^d - X_{t_{i-1}}^d\right)^2 = [X^d, X^d]_{t_i} - [X^d, X^d]_{t_{i-1}}$ for all $i = 1, ..., n$. □
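Proposition 2 is a pathwise statement and can be reproduced on a single simulated path. The following sketch assumes normally distributed jump sizes; the function names and parameter values are hypothetical. Once the mesh is smaller than the smallest gap between two jump times, the realized volatility coincides with the sum of squared jumps.

```python
import random


def jump_path(T=1.0, lam=5.0, jump_sd=0.1, seed=3):
    """Jump times and sizes of a compound Poisson process on [0, T)."""
    rng = random.Random(seed)
    times, sizes = [], []
    t = rng.expovariate(lam)
    while t < T:
        times.append(t)
        sizes.append(rng.gauss(0.0, jump_sd))
        t += rng.expovariate(lam)
    return times, sizes


def realized_vol_jumps(times, sizes, n, T=1.0):
    """Realized volatility of the pure jump path on an equidistant n-step grid."""
    dt = T / n
    per_interval = {}
    for t, g in zip(times, sizes):
        i = int(t // dt)                      # index of the interval containing t
        per_interval[i] = per_interval.get(i, 0.0) + g
    return sum(v * v for v in per_interval.values())


def fine_grid_size(times, T=1.0):
    """A grid size whose mesh is below half the smallest gap between jumps."""
    gaps = [b - a for a, b in zip(times, times[1:])]
    if not gaps:
        return 1
    return 2 * (int(T / min(gaps)) + 1)


if __name__ == "__main__":
    times, sizes = jump_path()
    qv = sum(g * g for g in sizes)
    for n in (1, 10, fine_grid_size(times)):
        print(n, realized_vol_jumps(times, sizes, n), qv)
```

On the coarsest grid all jumps fall into one interval and the realized volatility is the squared sum of the jumps, which is where the cross terms $\gamma_k \gamma_l$ analyzed later in this section come from.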
Now, we begin the analysis of the variance of the discretization error, $\lceil X, X \rceil_T^{\mathcal{G}^n} - [X, X]_T$, of the realized volatility of the process $X$ in regard to its quadratic variation, where we contribute to this topic through an analysis of the variance of the discretization error resulting from the process $X^d$ as well as the error resulting from the interdependencies between the processes $X^c$ and $X^d$.
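Theorem 1 Let $X$ be a process of form (2.4) with $X^c$ as in (2.5) and $X^d$ as in (2.6) with (2.7) in effect. We additionally assume that $\mu$ and $\sigma$ are constant. Then,
\[
\mathbb{E}\left(\lceil X, X \rceil_T^{\mathcal{G}^n} - [X, X]_T\right) = o\left(n^{-1/2}\right) \tag{4.16}
\]
\[
\operatorname{Var}\left(\lceil X, X \rceil_T^{\mathcal{G}^n} - [X, X]_T\right) = n^{-1} \cdot \Sigma^2 + o\left(n^{-1}\right), \tag{4.17}
\]
where $\Sigma^2$ has the following form
\[
\Sigma^2 = 2T \int_0^T \sigma^4\, ds + 2\, (\lambda T)^2\, (\operatorname{Var} \gamma)^2 + 4\, \lambda T \operatorname{Var}(\gamma) \int_0^T \sigma^2\, ds . \tag{4.18}
\]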
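Lemma 4 Let $X^c$ be a process as in (2.5) and $X^d$ as in (2.6) with (2.7) in effect. Then, $\lceil X^d, X^d \rceil_T^{\mathcal{G}^n} - [X^d, X^d]_T$ and $\lceil X^c, X^d \rceil_T^{\mathcal{G}^n}$ have zero expectation and variance
\[
\operatorname{Var}\left(\lceil X^d, X^d \rceil_T^{\mathcal{G}^n} - [X^d, X^d]_T\right) = \frac{1}{n} \cdot 2\, (\lambda T)^2\, (\operatorname{Var} \gamma)^2 , \quad \text{and} \tag{4.19}
\]
\[
\operatorname{Var}\left(\lceil X^c, X^d \rceil_T^{\mathcal{G}^n}\right) = \frac{1}{n} \cdot \lambda T \operatorname{Var}(\gamma) \int_0^T \mathbb{E}\left(\sigma_s^2\right) ds + O\left(n^{-3/2}\right). \tag{4.20}
\]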
Lemma 5 Let $X^c$ be a process as in (2.5). We additionally assume that $\mu$ and $\sigma$ are constant. Then,
\[
\mathbb{E}\left(\lceil X^c, X^c \rceil_T^{\mathcal{G}^n} - [X^c, X^c]_T\right) = o\left(n^{-1/2}\right) \tag{4.21}
\]
\[
\operatorname{Var}\left(\lceil X^c, X^c \rceil_T^{\mathcal{G}^n} - [X^c, X^c]_T\right) = \frac{1}{n} \cdot 2 \cdot T \int_0^T \sigma^4\, ds + o\left(n^{-1}\right). \tag{4.22}
\]
Proof of Lemma 4:
In the first step we show (4.19) and in the second step (4.20).
Step 1
Firstly, we concentrate the analysis on the variance of the discretization error of $X^d$ on the full grid, $\lceil X^d, X^d \rceil_T^{\mathcal{G}^n} - [X^d, X^d]_T$. We conduct the derivations for an arbitrary subgrid of form $\mathcal{H}^n = \{0, ..., T\}$, as we require these more general results for the proofs in the multi grid case presented in the next chapter, and then apply the outcome to the full grid $\mathcal{G}^n$.
We start with the calculation of the mean, which equals
\[
\mathbb{E}\left(\lceil X^d, X^d \rceil_T^{\mathcal{H}^n} - [X^d, X^d]_T\right)
= \mathbb{E}\left(\sum_{t_{i,-},\, t_i \in \mathcal{H}^n,\; t_i \le T} \left(X_{t_i}^d - X_{t_{i,-}}^d\right)^2\right) - \mathbb{E}\left(\sum_{i=1}^{N_T} \gamma_i^2\right) = 0, \tag{4.23}
\]
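as the first summand in the second equality is given by
\[
\begin{aligned}
\mathbb{E}\left(\sum_{t_{i,-},\, t_i \in \mathcal{H}^n,\; t_i \le T} \left(X_{t_i}^d - X_{t_{i,-}}^d\right)^2\right)
&= \sum_{t_{i,-},\, t_i \in \mathcal{H}^n,\; t_i \le T} \mathbb{E}\left(\left(\sum_{k=N_{t_{i,-}}}^{N_{t_i}} \gamma_k\right)^2\right) \\
&= \sum_{t_{i,-},\, t_i \in \mathcal{H}^n,\; t_i \le T} \left[\mathbb{E}\left(\sum_{k=N_{t_{i,-}}}^{N_{t_i}} \gamma_k^2\right) + 2 \cdot \underbrace{\mathbb{E}\left(\sum_{N_{t_{i,-}} \le k < l \le N_{t_i}} \gamma_k \gamma_l\right)}_{=\, 0 \text{ as } k \neq l}\right] \\
&= \sum_{t_{i,-},\, t_i \in \mathcal{H}^n,\; t_i \le T} \mathbb{E}\left(\mathbb{E}\left(\sum_{k=N_{t_{i,-}}}^{N_{t_i}} \gamma_k^2 \,\Big|\, N_{t_i} - N_{t_{i,-}}\right)\right) \\
&= \sum_{t_{i,-},\, t_i \in \mathcal{H}^n,\; t_i \le T} \mathbb{E}\left(N_{t_i} - N_{t_{i,-}}\right) \cdot \mathbb{E}\left(\gamma^2\right) \\
&= \sum_{t_{i,-},\, t_i \in \mathcal{H}^n,\; t_i \le T} \lambda \cdot (t_i - t_{i,-}) \cdot \mathbb{E}\left(\gamma^2\right) \\
&= \lambda T \cdot \mathbb{E}\left(\gamma^2\right),
\end{aligned} \tag{4.24}
\]
and respectively the second summand equals
\[
\mathbb{E}\left(\sum_{i=1}^{N_T} \gamma_i^2\right) = \mathbb{E}\left(\mathbb{E}\left(\sum_{i=1}^{N_T} \gamma_i^2 \,\Big|\, N_T\right)\right) = \mathbb{E}\left(N_T\, \mathbb{E}\left(\gamma^2\right)\right) = \lambda T \cdot \mathbb{E}\left(\gamma^2\right), \tag{4.25}
\]
so that the summands cancel each other.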
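Thus, with the result regarding the mean in (4.23), the variance of the discretization error of $X^d$ equals the second moment of the discretization error of $X^d$
\[
\begin{aligned}
& \operatorname{Var}\left(\lceil X^d, X^d \rceil_T^{\mathcal{H}^n} - [X^d, X^d]_T\right)
= \mathbb{E}\left(\left(\lceil X^d, X^d \rceil_T^{\mathcal{H}^n} - [X^d, X^d]_T\right)^2\right) \\
&= \mathbb{E}\left(\left(\sum_{t_{i,-},\, t_i \in \mathcal{H}^n,\; t_i \le T} \left[\left(\lceil X^d, X^d \rceil_{t_i}^{\mathcal{H}^n} - \lceil X^d, X^d \rceil_{t_{i,-}}^{\mathcal{H}^n}\right) - \left([X^d, X^d]_{t_i} - [X^d, X^d]_{t_{i,-}}\right)\right]\right)^2\right) \\
&= \sum_{t_{i,-},\, t_i \in \mathcal{H}^n,\; t_i \le T} \mathbb{E}\left(\left[\left(\lceil X^d, X^d \rceil_{t_i}^{\mathcal{H}^n} - \lceil X^d, X^d \rceil_{t_{i,-}}^{\mathcal{H}^n}\right) - \left([X^d, X^d]_{t_i} - [X^d, X^d]_{t_{i,-}}\right)\right]^2\right) \\
&\quad + 2 \cdot \sum_{0 \le t_i < t_j \le T;\; t_i, t_j \in \mathcal{H}^n} \underbrace{\mathbb{E}\left(\lceil X^d, X^d \rceil_{t_i}^{\mathcal{H}^n} - \lceil X^d, X^d \rceil_{t_{i,-}}^{\mathcal{H}^n} - [X^d, X^d]_{t_i} + [X^d, X^d]_{t_{i,-}}\right)}_{=\, 0 \text{ analogous to (4.23)}} \\
&\qquad\qquad \cdot \underbrace{\mathbb{E}\left(\lceil X^d, X^d \rceil_{t_j}^{\mathcal{H}^n} - \lceil X^d, X^d \rceil_{t_{j,-}}^{\mathcal{H}^n} - [X^d, X^d]_{t_j} + [X^d, X^d]_{t_{j,-}}\right)}_{=\, 0 \text{ analogous to (4.23)}} \\
&= \sum_{t_{i,-},\, t_i \in \mathcal{H}^n,\; t_i \le T} \mathbb{E}\left(\left[\left(\lceil X^d, X^d \rceil_{t_i}^{\mathcal{H}^n} - \lceil X^d, X^d \rceil_{t_{i,-}}^{\mathcal{H}^n}\right) - \left([X^d, X^d]_{t_i} - [X^d, X^d]_{t_{i,-}}\right)\right]^2\right),
\end{aligned} \tag{4.26}
\]
where the last equality is due to the fact that the expectation of the mixed discretization terms vanishes: the compound Poisson process $X^d$ has independent increments, and therefore the discretization errors on two time intervals $(t_{i,-}, t_i]$ and $(t_{j,-}, t_j]$ are independent for $i \neq j$ and $t_i, t_j \in \mathcal{H}^n$. This results in equation (4.26), since analogous to (4.23) the expectation of the discretization error of $X^d$ between any two observations is zero.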
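Considering the result above, it is possible to obtain the variance of the discretization error of $X^d$ on $[0, T]$ by deriving the variance of the discretization error on each time interval $(t_{i,-}, t_i]$, where $t_{i,-}, t_i \in \mathcal{H}^n$, which is given by
\[
\begin{aligned}
& \operatorname{Var}\left(\left(\lceil X^d, X^d \rceil_{t_i}^{\mathcal{H}^n} - \lceil X^d, X^d \rceil_{t_{i,-}}^{\mathcal{H}^n}\right) - \left([X^d, X^d]_{t_i} - [X^d, X^d]_{t_{i,-}}\right)\right) \\
&= \mathbb{E}\left(\left(\left(\lceil X^d, X^d \rceil_{t_i}^{\mathcal{H}^n} - \lceil X^d, X^d \rceil_{t_{i,-}}^{\mathcal{H}^n}\right) - \left([X^d, X^d]_{t_i} - [X^d, X^d]_{t_{i,-}}\right)\right)^2\right) \\
&= \mathbb{E}\left(\left(\left(\sum_{k=N_{t_{i,-}}}^{N_{t_i}} \gamma_k\right)^2 - \sum_{k=N_{t_{i,-}}}^{N_{t_i}} \gamma_k^2\right)^2\right)
= \mathbb{E}\left(\left(2 \cdot \sum_{N_{t_{i,-}} \le k < l \le N_{t_i}} \gamma_k \cdot \gamma_l\right)^2\right) \\
&= \mathbb{E}\left(4 \cdot \mathbb{E}\left(\left(\sum_{N_{t_{i,-}} \le k < l \le N_{t_i}} \gamma_k \cdot \gamma_l\right)^2 \,\Big|\, N_{t_i} - N_{t_{i,-}}\right)\right)
\end{aligned} \tag{4.27a}
\]
\[
\begin{aligned}
&= 4 \cdot \mathbb{E}\left(\sum_{N_{t_{i,-}} \le k < l \le N_{t_i}} \mathbb{E}\left(\gamma_k^2\, \gamma_l^2\right)\right) \\
&= 4 \cdot \mathbb{E}\left(\frac{1}{2} \cdot \left(\left(N_{t_i} - N_{t_{i,-}}\right)^2 - \left(N_{t_i} - N_{t_{i,-}}\right)\right)\right) \cdot \left(\mathbb{E}(\gamma^2)\right)^2 \\
&= 2 \cdot \Big(\lambda\, (t_i - t_{i,-}) \left(\lambda\, (t_i - t_{i,-}) + 1\right) - \lambda\, (t_i - t_{i,-})\Big) \cdot \left(\mathbb{E}(\gamma^2)\right)^2
\end{aligned} \tag{4.27b}
\]
\[
= 2 \cdot \lambda^2\, (t_i - t_{i,-})^2 \cdot (\operatorname{Var} \gamma)^2 , \tag{4.27c}
\]
where the fifth equality results from the fact that the zero mean random variables $\gamma_k$ and $\gamma_l$ are independent for $k \neq l$, so that only those mixed terms of the squared sum in equation (4.27a) remain whose expectation is non-zero, which is the case for those $\frac{1}{2} \cdot \left(\left(N_{t_i} - N_{t_{i,-}}\right)^2 - \left(N_{t_i} - N_{t_{i,-}}\right)\right)$ summands with pairwise identical indices.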
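The step from (4.27b) to (4.27c) rests on the identity $\mathbb{E}\left(\frac{1}{2}(N^2 - N)\right) = \frac{1}{2} m^2$ for a Poisson random variable $N$ with mean $m = \lambda\, (t_i - t_{i,-})$. A short numerical check, given as an illustrative sketch with hypothetical function names and a truncated series:

```python
import math


def expected_pairs(mean, terms=200):
    """E[N(N-1)/2] for N ~ Poisson(mean), obtained by summing over the pmf.

    The series is truncated after `terms` summands, which is ample for the
    small means that arise on a single grid interval.
    """
    total = 0.0
    pmf = math.exp(-mean)           # P(N = 0)
    for k in range(terms):
        total += 0.5 * k * (k - 1) * pmf
        pmf *= mean / (k + 1)       # P(N = k + 1) from P(N = k)
    return total


def jump_error_variance(lam, delta, var_gamma):
    """Per-interval variance 4 * E[pairs] * (Var gamma)^2, cf. (4.27b)-(4.27c)."""
    return 4.0 * expected_pairs(lam * delta) * var_gamma ** 2


if __name__ == "__main__":
    m = 0.3
    print(expected_pairs(m), 0.5 * m ** 2)
    print(jump_error_variance(3.0, 0.1, 0.01), 2 * (3.0 * 0.1) ** 2 * 0.01 ** 2)
```

The count $\frac{1}{2}(N^2 - N)$ is the number of unordered pairs of jumps inside one interval, which is why intervals with at most one jump contribute nothing to the variance.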
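Hence, it is possible to use (4.26) and (4.27c) to conclude that the variance of the discretization error of $X^d$ for an arbitrary subgrid $\mathcal{H}^n = \{0, ..., T\}$ equals
\[
\operatorname{Var}\left(\lceil X^d, X^d \rceil_T^{\mathcal{H}^n} - [X^d, X^d]_T\right) = 2 \cdot \lambda^2 \cdot (\operatorname{Var} \gamma)^2 \cdot \sum_{t_{i,-},\, t_i \in \mathcal{H}^n,\; t_i \le T} \left(i - (i,-)\right)^2 \cdot \frac{T^2}{n^2} .
\]
Since the full grid $\mathcal{G}^n$ satisfies $\left(i - (i,-)\right)^2 = \left(i - (i-1)\right)^2 = 1$ for $i = 1, ..., n$, the variance of the discretization error of $X^d$ for the full grid $\mathcal{G}^n$ equals
\[
\operatorname{Var}\left(\lceil X^d, X^d \rceil_T^{\mathcal{G}^n} - [X^d, X^d]_T\right) = \frac{2}{n}\, (\lambda T)^2 \cdot (\operatorname{Var} \gamma)^2 . \tag{4.28}
\]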
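Step 2
The next step of the proof analyzes the interdependencies of $X^d$ and $X^c$ when evaluating the variance of the total discretization error, $\lceil X, X \rceil_T^{\mathcal{G}^n} - [X, X]_T$. We calculate the variance of the observed realized covariance of the processes $X^c$ and $X^d$.
Recalling that $X^c$ and $X^d$ are independent, that the compound Poisson process has independent increments, and that the jump sizes $\gamma_k$ of $X^d$ have zero mean, we go on calculating the mean of $\lceil X^c, X^d \rceil_T^{\mathcal{G}^n}$ more generally for an arbitrary subgrid $\mathcal{H}^n$, which equals
\[
\begin{aligned}
\mathbb{E}\left(\lceil X^c, X^d \rceil_T^{\mathcal{H}^n}\right)
&= \mathbb{E}\left(\sum_{t_{j,-},\, t_j \in \mathcal{H}^n,\; t_j \le T} \left(X_{t_j}^c - X_{t_{j,-}}^c\right) \cdot \left(X_{t_j}^d - X_{t_{j,-}}^d\right)\right) \\
&= \sum_{t_{j,-},\, t_j \in \mathcal{H}^n,\; t_j \le T} \mathbb{E}\left(X_{t_j}^c - X_{t_{j,-}}^c\right) \cdot \underbrace{\mathbb{E}\left(X_{t_j}^d - X_{t_{j,-}}^d\right)}_{=\, 0} = 0 .
\end{aligned} \tag{4.29}
\]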
This leads to the calculation of $\operatorname{Var}\left(\lceil X^c, X^d \rceil_T^{\mathcal{G}^n}\right)$, where we aim for a more general result for subgrids with equidistant observation times, as we will need these derivations in the next chapter during the analysis of the discretization error on the multi grid. Suppose that $\mathcal{H}_{\mathrm{eq}}^n \subseteq \mathcal{G}^n$, with
\[
t_i^n - t_{i,-}^n = K_n \cdot \delta t^n = K_n \cdot \frac{T}{n} \quad \text{for all } t_{i,-}^n, t_i^n \in \mathcal{H}_{\mathrm{eq}}^n , \tag{4.30a}
\]
\[
\text{where } n > K_n \in \mathbb{N} \text{ and } \frac{K_n}{n} \to 0 \text{ as } n \to \infty . \tag{4.30b}
\]
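Then, we have
\[
\begin{aligned}
\operatorname{Var}\left(\lceil X^c, X^d \rceil_T^{\mathcal{H}_{\mathrm{eq}}^n}\right)
&\overset{(4.29)}{=} \mathbb{E}\left(\left(\lceil X^c, X^d \rceil_T^{\mathcal{H}_{\mathrm{eq}}^n}\right)^2\right) \\
&= \mathbb{E}\left(\left(\sum_{t_{i,-},\, t_i \in \mathcal{H}_{\mathrm{eq}}^n,\; t_i \le T} \left(X_{t_i}^c - X_{t_{i,-}}^c\right) \cdot \left(X_{t_i}^d - X_{t_{i,-}}^d\right)\right)^2\right) \\
&= \sum_{t_{i,-},\, t_i \in \mathcal{H}_{\mathrm{eq}}^n,\; t_i \le T} \mathbb{E}\left(\left(X_{t_i}^c - X_{t_{i,-}}^c\right)^2\right) \cdot \underbrace{\mathbb{E}\left(\left(X_{t_i}^d - X_{t_{i,-}}^d\right)^2\right)}_{=\, \lambda\, \delta t^n \cdot K_n \cdot \operatorname{Var}(\gamma), \text{ see (4.24)}} \\
&\quad + 2 \cdot \sum_{(i,-)+1 \le r < s \le i} \mathbb{E}\left(\left(X_{t_r}^c - X_{t_{r,-}}^c\right) \left(X_{t_s}^c - X_{t_{s,-}}^c\right)\right) \cdot \underbrace{\mathbb{E}\left(X_{t_r}^d - X_{t_{r,-}}^d\right)}_{=\, 0} \cdot \underbrace{\mathbb{E}\left(X_{t_s}^d - X_{t_{s,-}}^d\right)}_{=\, 0} \\
&= \lambda\, \delta t^n \cdot K_n \cdot \operatorname{Var}(\gamma) \cdot \sum_{t_{i,-},\, t_i \in \mathcal{H}_{\mathrm{eq}}^n,\; t_i \le T} \underbrace{\mathbb{E}\left(\left(X_{t_i}^c - X_{t_{i,-}}^c\right)^2\right)}_{V.1_P} ,
\end{aligned} \tag{4.31}
\]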
n , and analogous
so we proceed to analyze V.1P as given in equation (4.31) for tni,− , tni ∈ Heq
to the derivations leading to (4.13), we have
Xtci
−
Xtci,−
2
Z
ti
= [X, X]ti − [X, X]ti,− + 2 ·
ti,−
Z
ti
=
ti,−
σs2 ds + 2 ·
Z
ti
ti,−
c
Xs−
− Xtci,− dXsc
c
Xs−
− Xtci,− dXsc .
(4.32)
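Note that under the assumptions regarding the process $X^c$ given in (2.5) we do know that the quantity $\int_0^T \sigma_s^2\, ds$ is integrable, so that by the Itô isometry we have the equality
\[
\mathbb{E}\left(\left(\int_{t_{i,-}}^{t_i} \sigma_s\, dB_s\right)^2\right) = \mathbb{E}\left(\int_{t_{i,-}}^{t_i} \sigma_s^2\, ds\right) = \int_{t_{i,-}}^{t_i} \mathbb{E}\left(\sigma_s^2\right) ds < \infty ,
\]
where the last equality is due to Fubini's theorem.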
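Considering the second summand on the right-hand side of (4.32), we go one step further and evaluate the integrability of the quantity $\left(\frac{n}{K}\right)^{3/2} \cdot \int_{t_{i,-}}^{t_i} \left(X_{s-}^c - X_{t_{i,-}}^c\right) dX_s^c$, as we will need this outcome shortly to make a statement with respect to the order of the expectation of the quantity, which will reveal the asymptotic negligibility of the mean of $\int_{t_{i,-}}^{t_i} \left(X_{s-}^c - X_{t_{i,-}}^c\right) dX_s^c$ compared to $\int_{t_{i,-}}^{t_i} \mathbb{E}\left(\sigma_s^2\right) ds$.
\[
\begin{aligned}
& \left(\frac{n}{K}\right)^{3/2} \cdot \mathbb{E}\left|\int_{t_{i,-}}^{t_i} \left(X_{s-}^c - X_{t_{i,-}}^c\right) dX_s^c\right| \\
&\le \underbrace{\left(\frac{n}{K}\right)^{3/2} \cdot \mathbb{E}\left|\int_{t_{i,-}}^{t_i} \left(X_{s-}^c - X_{t_{i,-}}^c\right) \mu_s\, ds\right|}_{J.1}
+ \underbrace{\left(\frac{n}{K}\right)^{3/2} \cdot \mathbb{E}\left|\int_{t_{i,-}}^{t_i} \left(X_{s-}^c - X_{t_{i,-}}^c\right) \sigma_s\, dB_s\right|}_{J.2} .
\end{aligned} \tag{4.33}
\]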
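Firstly, we assess $J.1$ as given in (4.33) and apply the Cauchy-Schwarz inequality and Fubini's theorem, which leads to
\[
\begin{aligned}
J.1 &\le \left(\frac{n}{K}\right)^{3/2} \mathbb{E}\left(\sqrt{\int_{t_{i,-}}^{t_i} \mu_s^2\, ds} \cdot \sqrt{\int_{t_{i,-}}^{t_i} \left(X_{s-}^c - X_{t_{i,-}}^c\right)^2 ds}\right) \\
&\le \left(\frac{n}{K}\right)^{3/2} \sqrt{\int_{t_{i,-}}^{t_i} \mathbb{E}\left(\mu_s^2\right) ds} \cdot \sqrt{\int_{t_{i,-}}^{t_i} \mathbb{E}\left(\left(X_{s-}^c - X_{t_{i,-}}^c\right)^2\right) ds} \\
&\le \left(\frac{n}{K}\right)^{3/2} \sqrt{\frac{K}{n} \cdot T \underbrace{\sup_{s \in [0,T]} \mathbb{E}\left(\mu_s^2\right)}_{J.1.A}} \cdot \sqrt{\frac{K}{n} \cdot T \underbrace{\sup_{s \in [t_{i,-}, t_i]} \mathbb{E}\left(\left(\int_{t_{i,-}}^{s-} \mu_u\, du + \int_{t_{i,-}}^{s-} \sigma_u\, dB_u\right)^2\right)}_{J.1.B}} ,
\end{aligned} \tag{4.34}
\]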
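where $J.1.A$ is finite as $\mu$ is bounded, and for $J.1.B$ we have with the Cauchy-Schwarz inequality
\[
\begin{aligned}
J.1.B &= \sup_{s \in [t_{i,-}, t_i]} \mathbb{E}\left(\left(\int_{t_{i,-}}^{s-} \mu_u\, du\right)^2 + \left(\int_{t_{i,-}}^{s-} \sigma_u\, dB_u\right)^2 + 2 \int_{t_{i,-}}^{s-} \mu_u\, du \int_{t_{i,-}}^{s-} \sigma_u\, dB_u\right) \\
&\le \mathbb{E}\left(\left(\frac{K}{n} \cdot T \sup_{u \in [0,T]} |\mu_u|\right)^2\right) + \mathbb{E}\left(\int_{t_{i,-}}^{t_i} \sigma_u^2\, du\right) \\
&\quad + 2 \cdot \sqrt{\sup_{s \in [t_{i,-}, t_i]} \mathbb{E}\left(\left(\int_{t_{i,-}}^{s-} \mu_u\, du\right)^2\right)} \cdot \sqrt{\sup_{r \in [t_{i,-}, t_i]} \mathbb{E}\left(\left(\int_{t_{i,-}}^{r-} \sigma_u\, dB_u\right)^2\right)} \\
&\le \left(\frac{K}{n} \cdot T\right)^2 \cdot \mathbb{E}\left(\sup_{u \in [0,T]} |\mu_u|^2\right) + \frac{K}{n} \cdot T \cdot \mathbb{E}\left(\sup_{u \in [0,T]} \sigma_u^2\right) \\
&\quad + 2 \cdot \sqrt{\left(\frac{K}{n} \cdot T\right)^2 \cdot \mathbb{E}\left(\sup_{u \in [0,T]} |\mu_u|^2\right)} \cdot \sqrt{\frac{K}{n} \cdot T \cdot \mathbb{E}\left(\sup_{u \in [0,T]} \sigma_u^2\right)} \\
&=: \Theta \cdot \frac{K}{n} \cdot T < \infty ,
\end{aligned} \tag{4.35}
\]
where $\sup_{u \in [0,T]} |\mu_u|^2$ and $\sup_{u \in [0,T]} \sigma_u^2$ are integrable due to the boundedness of $\mu$ and $\sigma$, and thus $\Theta$ is bounded from above. Note that the upper bound of $J.1.B$ as detailed in (4.35) applies for all $n \in \mathbb{N}$ and for all $t_{i,-}, t_i \in \mathcal{H}_{\mathrm{eq}}^n$. We insert the upper bound of $J.1.B$ into the derivations for $J.1$ as detailed in (4.34), which results in
\[
J.1 \le \sqrt{T \sup_{s \in [0,T]} \mathbb{E}\left(\mu_s^2\right) \cdot T^2 \cdot \Theta} < \infty . \tag{4.36}
\]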
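Secondly, we evaluate $J.2$ as given in (4.33)
\[
J.2 \le \left(\frac{n}{K}\right)^{3/2} \mathbb{E}\left(\sup_{r \in [t_{i,-}, t_i]} \left|\int_{t_{i,-}}^{r} \left(X_{s-}^c - X_{t_{i,-}}^c\right) \sigma_s\, dB_s\right|\right)
\le C \cdot \sqrt{T \sup_{s \in [0,T]} \mathbb{E}\left(\sigma_s^2\right) \cdot T^2 \cdot \Theta} < \infty , \tag{4.37}
\]
where the last line and the finiteness are due to the fact that the process $\left(\varphi_{t_{i,-}}(r)\right)_{r \in [t_{i,-}, t_i]} := \left(\int_{t_{i,-}}^{r} \left(X_{s-}^c - X_{t_{i,-}}^c\right) \sigma_s\, dB_s\right)_{r \in [t_{i,-}, t_i]}$ is a local martingale, which means that we know from the Burkholder-Davis-Gundy inequality⁹ that a constant $C$, $0 < C < \infty$, exists so that
\[
\mathbb{E}\left(\sup_{r \in [t_{i,-}, t_i]} \left|\varphi_{t_{i,-}}(r)\right|\right) \le C \cdot \mathbb{E}\left(\sqrt{[\varphi_{t_{i,-}}, \varphi_{t_{i,-}}]_{t_i}}\right),
\]
so with the knowledge of the quadratic variation of an Itô integral and the application of the Cauchy-Schwarz inequality as well as Fubini's theorem we have
\[
\begin{aligned}
C \cdot \mathbb{E}\left(\sqrt{[\varphi_{t_{i,-}}, \varphi_{t_{i,-}}]_{t_i}}\right)
&= C \cdot \mathbb{E}\left(\sqrt{\int_{t_{i,-}}^{t_i} \left(X_{s-}^c - X_{t_{i,-}}^c\right)^2 \sigma_s^2\, ds}\right) \\
&\le C \cdot \mathbb{E}\left(\sqrt[4]{\int_{t_{i,-}}^{t_i} \sigma_s^4\, ds} \cdot \sqrt[4]{\int_{t_{i,-}}^{t_i} \left(X_{s-}^c - X_{t_{i,-}}^c\right)^4 ds}\right) \\
&\le C \cdot \mathbb{E}\left(\sqrt{\int_{t_{i,-}}^{t_i} \sigma_s^2\, ds} \cdot \sqrt{\int_{t_{i,-}}^{t_i} \left(X_{s-}^c - X_{t_{i,-}}^c\right)^2 ds}\right) \\
&\le C \cdot \sqrt{\int_{t_{i,-}}^{t_i} \mathbb{E}\left(\sigma_s^2\right) ds} \cdot \sqrt{\int_{t_{i,-}}^{t_i} \mathbb{E}\left(\left(X_{s-}^c - X_{t_{i,-}}^c\right)^2\right) ds} \\
&\le C \cdot \left(\frac{K}{n}\right)^{3/2} \sqrt{T \sup_{s \in [0,T]} \mathbb{E}\left(\sigma_s^2\right) \cdot T^2 \cdot \Theta} < \infty ,
\end{aligned}
\]
where we used the outcome given in (4.35) in the last line.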
Through the evaluations that resulted in (4.36) and (4.37), we have shown that the upper bounds of $J.1$ and $J.2$ apply for all $n \in \mathbb{N}$ as well as for all $t_{i,-}, t_i \in \mathcal{H}_{\mathrm{eq}}^n$, and hence we can conclude that a real number $B$ exists such that
\[
\sup_n \left(\frac{n}{K}\right)^{3/2} \mathbb{E}\left|\int_{t_{i,-}}^{t_i} \left(X_{s-}^c - X_{t_{i,-}}^c\right) dX_s^c\right| \le B < \infty \qquad \forall\; t_{i,-}, t_i \in \mathcal{H}_{\mathrm{eq}}^n . \tag{4.38}
\]
⁹ See Protter (2003), Theorem IV.73 (p. 222).
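Thus, together with the integrability results we deduced for the quantities on the right side of (4.32), the variance detailed in (4.31) equals
\[
\begin{aligned}
\operatorname{Var}\left(\lceil X^c, X^d \rceil_T^{\mathcal{H}_{\mathrm{eq}}^n}\right)
&= \lambda\, \delta t^n \cdot K_n \cdot \operatorname{Var}(\gamma) \cdot \sum_{t_{i,-},\, t_i \in \mathcal{H}_{\mathrm{eq}}^n,\; t_i \le T} \mathbb{E}\left(\int_{t_{i,-}}^{t_i} \sigma_s^2\, ds + 2 \int_{t_{i,-}}^{t_i} \left(X_{s-}^c - X_{t_{i,-}}^c\right) dX_s^c\right) \\
&= \lambda\, \delta t^n \cdot K_n \cdot \operatorname{Var}(\gamma) \cdot \sum_{t_{i,-},\, t_i \in \mathcal{H}_{\mathrm{eq}}^n,\; t_i \le T} \int_{t_{i,-}}^{t_i} \mathbb{E}\left(\sigma_s^2\right) ds \\
&\quad + \sum_{t_{i,-},\, t_i \in \mathcal{H}_{\mathrm{eq}}^n,\; t_i \le T} \lambda\, \delta t^n \cdot K_n \cdot \operatorname{Var}(\gamma) \cdot 2 \cdot \mathbb{E}\left(\int_{t_{i,-}}^{t_i} \left(X_{s-}^c - X_{t_{i,-}}^c\right) dX_s^c\right) \\
&= \lambda T \cdot \frac{K_n}{n} \cdot \operatorname{Var}(\gamma) \cdot \int_{\min \mathcal{H}_{\mathrm{eq}}^n}^{\max \mathcal{H}_{\mathrm{eq}}^n} \mathbb{E}\left(\sigma_s^2\right) ds \\
&\quad + \lambda T \cdot \frac{K_n}{n} \cdot \operatorname{Var}(\gamma) \cdot 2 \cdot \mathbb{E}\left(\int_{\min \mathcal{H}_{\mathrm{eq}}^n}^{\max \mathcal{H}_{\mathrm{eq}}^n} \left(X_{s-}^c - X_{\delta t^n \cdot K_n \cdot \lfloor s / (\delta t^n \cdot K_n) \rfloor}^c\right) dX_s^c\right),
\end{aligned} \tag{4.39}
\]
where $\lfloor x \rfloor$ denotes the floor function of a real number $x$, which returns the largest integer less than or equal to $x$. In fact, with the outcome in (4.38), this results in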
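\[
\limsup_n \left(\frac{n}{K_n}\right)^{1/2} \mathbb{E}\left|\int_{\min \mathcal{H}_{\mathrm{eq}}^n}^{\max \mathcal{H}_{\mathrm{eq}}^n} \left(X_{s-}^c - X_{\delta t^n \cdot K_n \cdot \lfloor s / (\delta t^n \cdot K_n) \rfloor}^c\right) dX_s^c\right| < \infty , \tag{4.40}
\]
and therefore $\mathbb{E}\left(\int_{\min \mathcal{H}_{\mathrm{eq}}^n}^{\max \mathcal{H}_{\mathrm{eq}}^n} \left(X_{s-}^c - X_{\delta t^n \cdot K_n \cdot \lfloor s / (\delta t^n \cdot K_n) \rfloor}^c\right) dX_s^c\right)$ has the order $O\left(\sqrt{K_n / n}\right)$.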
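Together with (4.39) we come to the following conclusion
\[
\operatorname{Var}\left(\lceil X^c, X^d \rceil_T^{\mathcal{H}_{\mathrm{eq}}^n}\right) = \lambda T \cdot \frac{K_n}{n} \cdot \operatorname{Var}(\gamma) \cdot \int_{\min \mathcal{H}_{\mathrm{eq}}^n}^{\max \mathcal{H}_{\mathrm{eq}}^n} \mathbb{E}\left(\sigma_s^2\right) ds + O\left(\frac{K_n^{3/2}}{n^{3/2}}\right), \tag{4.41}
\]
and consequently for the grid $\mathcal{G}^n$, this result translates to
\[
\operatorname{Var}\left(\lceil X^c, X^d \rceil_T^{\mathcal{G}^n}\right) = \frac{\lambda T}{n} \cdot \operatorname{Var}(\gamma) \cdot \int_0^T \mathbb{E}\left(\sigma_s^2\right) ds + O\left(n^{-3/2}\right). \tag{4.42}
\]
□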
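The leading term of (4.42) can be compared with a simulated variance of the realized covariance. The sketch below is an illustration only: it assumes constant volatility, zero drift and normal jump sizes, and all names and parameter values are hypothetical. It uses the fact that grid intervals without a jump contribute nothing to the sum, so a Brownian increment is drawn only where a jump occurred.

```python
import math
import random


def realized_cov_cd(n, rng, T=1.0, sigma=0.2, lam=5.0, jump_sd=0.1):
    """One draw of the realized covariance of X^c and X^d on the full grid."""
    dt = T / n
    jump_incr = {}                      # interval index -> increment of X^d
    t = rng.expovariate(lam)
    while t < T:
        i = min(int(t / dt), n - 1)
        jump_incr[i] = jump_incr.get(i, 0.0) + rng.gauss(0.0, jump_sd)
        t += rng.expovariate(lam)
    s = sigma * math.sqrt(dt)
    # Intervals without jumps have a zero X^d increment, so the Brownian
    # increment is only needed on intervals that contain a jump.
    return sum(s * rng.gauss(0.0, 1.0) * g for g in jump_incr.values())


def mc_variance(n=100, reps=20000, seed=9):
    rng = random.Random(seed)
    vals = [realized_cov_cd(n, rng) for _ in range(reps)]
    m = sum(vals) / reps
    return sum((v - m) ** 2 for v in vals) / (reps - 1)


if __name__ == "__main__":
    n = 100
    theory = 5.0 * 1.0 * 0.1 ** 2 * 0.2 ** 2 * 1.0 / n   # lam*T*Var(gamma)*sigma^2*T / n
    print(mc_variance(n), theory)
```

Under these simplifying assumptions the leading term is exact for every $n$, so any remaining difference is Monte Carlo error.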
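Proof of Lemma 5: From equation (4.14) we know that $\lceil X^c, X^c \rceil_T^{\mathcal{G}^n} - [X^c, X^c]_T = 2 \cdot \sum_{i=1}^{|\mathcal{G}^n|} \int_{t_{i-1}}^{t_i} \left(X_{s-}^c - X_{t_{i-1}}^c\right) dX_s^c$. With the results leading to (4.38) and due to the fact that $\mu$ and $\sigma$ are constant, the zero mean property of the Itô integral holds for the quantities below
\[
\begin{aligned}
& \mathbb{E}\left(\lceil X^c, X^c \rceil_T^{\mathcal{G}^n} - [X^c, X^c]_T\right) \\
&= 2 \cdot \sum_{i=1}^{|\mathcal{G}^n|} \Bigg[\mathbb{E}\left(\int_{t_{i-1}}^{t_i} \int_{t_{i-1}}^{s-} \mu\, du\, \mu\, ds\right) + \underbrace{\mathbb{E}\left(\int_{t_{i-1}}^{t_i} \int_{t_{i-1}}^{s-} \sigma\, dB_u\, \mu\, ds\right)}_{=\, 0} \\
&\qquad\qquad + \underbrace{\mathbb{E}\left(\int_{t_{i-1}}^{t_i} \int_{t_{i-1}}^{s-} \mu\, du\, \sigma\, dB_s\right)}_{=\, 0} + \underbrace{\mathbb{E}\left(\int_{t_{i-1}}^{t_i} \int_{t_{i-1}}^{s-} \sigma\, dB_u\, \sigma\, dB_s\right)}_{=\, 0}\Bigg] \\
&= 2 \cdot n \cdot \frac{1}{2} \cdot \frac{T^2}{n^2} \cdot \mu^2 = O\left(\frac{1}{n}\right).
\end{aligned} \tag{4.43}
\]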
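The variance has the following form
\[
\begin{aligned}
\operatorname{Var}\left(\lceil X^c, X^c \rceil_T^{\mathcal{G}^n} - [X^c, X^c]_T\right)
&= 4 \cdot \operatorname{Var}\left(\sum_{i=1}^{|\mathcal{G}^n|} \int_{t_{i-1}}^{t_i} \left(X_{s-}^c - X_{t_{i-1}}^c\right) dX_s^c\right) \\
&= 4 \cdot \sum_{i=1}^{|\mathcal{G}^n|} \underbrace{\operatorname{Var}\left(\int_{t_{i-1}}^{t_i} \left(X_{s-}^c - X_{t_{i-1}}^c\right) dX_s^c\right)}_{V} ,
\end{aligned} \tag{4.44}
\]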
where the last equality holds because $\mu$ and $\sigma$ are constant and the Brownian motion has independent increments, so that the covariances vanish due to independence. The quantity $V$ as given in (4.44) equals
\[
V = \underbrace{\mathbb{E}\left(\left(\int_{t_{i-1}}^{t_i} \left(X_{s-}^c - X_{t_{i-1}}^c\right) dX_s^c\right)^2\right)}_{V.I} - \underbrace{\left(\mathbb{E}\left(\int_{t_{i-1}}^{t_i} \left(X_{s-}^c - X_{t_{i-1}}^c\right) dX_s^c\right)\right)^2}_{=\, O(n^{-4}) \text{ [see (4.43)]}} , \tag{4.45}
\]
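and $V.I$ has the following form
\[
\begin{aligned}
V.I &= \mathbb{E}\left(\left(\int_{t_{i-1}}^{t_i} \left(X_{s-}^c - X_{t_{i-1}}^c\right) \mu\, ds + \int_{t_{i-1}}^{t_i} \left(X_{s-}^c - X_{t_{i-1}}^c\right) \sigma\, dB_s\right)^2\right) \\
&= \mathbb{E}\Bigg(\Bigg(\int_{t_{i-1}}^{t_i} \int_{t_{i-1}}^{s-} \mu\, du\, \mu\, ds + \int_{t_{i-1}}^{t_i} \int_{t_{i-1}}^{s-} \sigma\, dB_u\, \mu\, ds \\
&\qquad\quad + \int_{t_{i-1}}^{t_i} \int_{t_{i-1}}^{s-} \mu\, du\, \sigma\, dB_s + \int_{t_{i-1}}^{t_i} \int_{t_{i-1}}^{s-} \sigma\, dB_u\, \sigma\, dB_s\Bigg)^2\Bigg).
\end{aligned} \tag{4.46}
\]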
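With similar methods as those leading to (4.38) it is possible to show that the dominant quantity in $V.I$ is $\mathbb{E}\left(\left(\int_{t_{i-1}}^{t_i} \int_{t_{i-1}}^{s-} \sigma\, dB_u\, \sigma\, dB_s\right)^2\right)$ and that the other quantities in $V.I$ are of comparatively negligible order. With the Itô isometry we obtain
\[
\begin{aligned}
\mathbb{E}\left(\left(\int_{t_{i-1}}^{t_i} \int_{t_{i-1}}^{s-} \sigma\, dB_u\, \sigma\, dB_s\right)^2\right)
&= \mathbb{E}\left(\int_{t_{i-1}}^{t_i} \left(\int_{t_{i-1}}^{s-} \sigma\, dB_u\, \sigma\right)^2 ds\right) \\
&= \int_{t_{i-1}}^{t_i} \int_{t_{i-1}}^{s-} \sigma^2\, du\, \sigma^2\, ds = \frac{1}{2} \cdot (t_i - t_{i-1})^2 \cdot \sigma^4 \\
&= \frac{1}{2} \cdot \frac{T}{n} \int_{t_{i-1}}^{t_i} \sigma^4\, ds ,
\end{aligned} \tag{4.47}
\]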
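so that together with the explanations above $V$ as detailed in (4.45) has the following form
\[
\operatorname{Var}\left(\int_{t_{i-1}}^{t_i} \left(X_{s-}^c - X_{t_{i-1}}^c\right) dX_s^c\right) = \frac{1}{2} \cdot \frac{T}{n} \int_{t_{i-1}}^{t_i} \sigma^4\, ds + o\left(n^{-2}\right). \tag{4.48}
\]
Hence, we refer to (4.44) and obtain
\[
\operatorname{Var}\left(\lceil X^c, X^c \rceil_T^{\mathcal{G}^n} - [X^c, X^c]_T\right) = 2 \cdot \frac{T}{n} \int_0^T \sigma^4\, ds + o\left(n^{-1}\right). \tag{4.49}
\]
□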
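Proof of Theorem 1:
We first analyze the discretization error, $\lceil X, X \rceil_T^{\mathcal{G}^n} - [X, X]_T$, which has the form
\[
\begin{aligned}
\lceil X, X \rceil_T^{\mathcal{G}^n} - [X, X]_T = {} & \left(\lceil X^c, X^c \rceil_T^{\mathcal{G}^n} - [X^c, X^c]_T\right) + \left(\lceil X^d, X^d \rceil_T^{\mathcal{G}^n} - [X^d, X^d]_T\right) \\
& + 2 \cdot \Big(\lceil X^c, X^d \rceil_T^{\mathcal{G}^n} - \underbrace{[X^c, X^d]_T}_{=\, 0}\Big),
\end{aligned} \tag{4.50}
\]
where $[X^c, X^d]_T = 0$ as $X_t^c$ is continuous and $X_t^d$ is of finite variation.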
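The order of the expectation of $\lceil X, X \rceil_T^{\mathcal{G}^n} - [X, X]_T$ as given in (4.16) now follows directly from Lemmas 4 and 5. The variance of the discretization error is given by
\[
\begin{aligned}
& \operatorname{Var}\left(\lceil X, X \rceil_T^{\mathcal{G}^n} - [X, X]_T\right) \\
&= \underbrace{\operatorname{Var}\left(\lceil X^c, X^c \rceil_T^{\mathcal{G}^n} - [X^c, X^c]_T\right)}_{V-I_P} + \underbrace{\operatorname{Var}\left(\lceil X^d, X^d \rceil_T^{\mathcal{G}^n} - [X^d, X^d]_T\right)}_{V-II_P} \\
&\quad + 2 \cdot \underbrace{\operatorname{Cov}\left(\lceil X^c, X^c \rceil_T^{\mathcal{G}^n} - [X^c, X^c]_T ,\; \lceil X^d, X^d \rceil_T^{\mathcal{G}^n} - [X^d, X^d]_T\right)}_{=\, 0 \text{ [due to independence]}} \\
&\quad + 4 \cdot \underbrace{\operatorname{Cov}\left(\lceil X^d, X^d \rceil_T^{\mathcal{G}^n} - [X^d, X^d]_T ,\; \lceil X^c, X^d \rceil_T^{\mathcal{G}^n}\right)}_{V-III_P} \\
&\quad + 4 \cdot \underbrace{\operatorname{Cov}\left(\lceil X^c, X^c \rceil_T^{\mathcal{G}^n} - [X^c, X^c]_T ,\; \lceil X^c, X^d \rceil_T^{\mathcal{G}^n}\right)}_{V-IV_P} + 4 \cdot \underbrace{\operatorname{Var}\left(\lceil X^c, X^d \rceil_T^{\mathcal{G}^n}\right)}_{V-V_P} ,
\end{aligned} \tag{4.51}
\]
with Lemmas 4 and 5 providing insight into the quantities $V-I_P$, $V-II_P$ as well as $V-V_P$, where we adapt the results in Lemma 4 to constant $\mu$ and $\sigma$.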
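We continue by defining the variation of an arbitrary process $Z$ between two successive observations in the full grid $\mathcal{G}^n$ as $\delta Z_{t_m} := Z_{t_m} - Z_{t_{m-1}}$. We start to derive $V-III_P$ as given in (4.51)
\[
\begin{aligned}
& \operatorname{Cov}\left(\lceil X^d, X^d \rceil_T^{\mathcal{G}^n} - [X^d, X^d]_T ,\; \lceil X^c, X^d \rceil_T^{\mathcal{G}^n}\right) \\
&\overset{(4.29)}{=} \mathbb{E}\left(\left(\lceil X^d, X^d \rceil_T^{\mathcal{G}^n} - [X^d, X^d]_T\right) \cdot \lceil X^c, X^d \rceil_T^{\mathcal{G}^n}\right) \\
&= \mathbb{E}\left(\left(\sum_{i=1}^{|\mathcal{G}^n|} \left(\sum_{k=N_{t_{i-1}}}^{N_{t_i}} \gamma_k\right)^2 - \sum_{k=0}^{N_T} \gamma_k^2\right) \cdot \sum_{j=1}^{|\mathcal{G}^n|} \delta X_{t_j}^c \cdot \delta X_{t_j}^d\right) \\
&= \mathbb{E}\left(2 \cdot \sum_{i=1}^{|\mathcal{G}^n|} \sum_{N_{t_{i-1}} \le k < l \le N_{t_i}} \gamma_k \cdot \gamma_l \cdot \sum_{j=1}^{|\mathcal{G}^n|} \delta X_{t_j}^c \cdot \sum_{m=N_{t_{j-1}}}^{N_{t_j}} \gamma_m\right) \\
&= \sum_{i,j} \mathbb{E}\left(\delta X_{t_j}^c\right) \cdot 2 \cdot \sum_{N_{t_{i-1}} \le k < l \le N_{t_i}} \; \sum_{m=N_{t_{j-1}}}^{N_{t_j}} \underbrace{\mathbb{E}\left(\gamma_k \cdot \gamma_l \cdot \gamma_m\right)}_{=\, 0 \text{ as } k \neq l} = 0 ,
\end{aligned} \tag{4.52}
\]
and continue calculating $V-IV_P$ as given in (4.51)
\[
\begin{aligned}
& \operatorname{Cov}\left(\lceil X^c, X^c \rceil_T^{\mathcal{G}^n} - [X^c, X^c]_T ,\; \lceil X^c, X^d \rceil_T^{\mathcal{G}^n}\right) \\
&\overset{(4.29)}{=} \mathbb{E}\left(\left(\lceil X^c, X^c \rceil_T^{\mathcal{G}^n} - [X^c, X^c]_T\right) \cdot \lceil X^c, X^d \rceil_T^{\mathcal{G}^n}\right) \\
&= \mathbb{E}\left(\left(\lceil X^c, X^c \rceil_T^{\mathcal{G}^n} - [X^c, X^c]_T\right) \cdot \sum_{i=1}^{|\mathcal{G}^n|} \delta X_{t_i}^c \cdot \delta X_{t_i}^d\right) \\
&= \sum_{i=1}^{|\mathcal{G}^n|} \mathbb{E}\left(\left(\lceil X^c, X^c \rceil_T^{\mathcal{G}^n} - [X^c, X^c]_T\right) \cdot \delta X_{t_i}^c\right) \cdot \underbrace{\mathbb{E}\left(\delta X_{t_i}^d\right)}_{=\, 0} = 0 .
\end{aligned} \tag{4.53}
\]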
Final Step
We are finally able to combine the derived results and see that the discretization error of $X$ on the full grid, $\lceil X, X \rceil_T^{\mathcal{G}^n} - [X, X]_T$, is the sum of the quantities $\lceil X^c, X^c \rceil_T^{\mathcal{G}^n} - [X^c, X^c]_T$, $\lceil X^d, X^d \rceil_T^{\mathcal{G}^n} - [X^d, X^d]_T$ and $2 \cdot \lceil X^c, X^d \rceil_T^{\mathcal{G}^n}$, where the respective covariances of these three summands are zero, and hence we add the variances, which results in the assertion of the theorem. □
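The variance formula (4.17) with $\Sigma^2$ from (4.18) can be compared with a simulated discretization error. The following sketch is an illustration under simplifying assumptions that go beyond the theorem: zero drift, constant volatility and normally distributed zero-mean jumps. All names and parameter values are hypothetical.

```python
import math
import random


def sigma_squared(T, sigma, lam, var_gamma):
    """Sigma^2 from (4.18) for constant sigma."""
    return (2 * T * sigma ** 4 * T
            + 2 * (lam * T) ** 2 * var_gamma ** 2
            + 4 * lam * T * var_gamma * sigma ** 2 * T)


def discretization_error(n, rng, T, sigma, lam, jump_sd):
    """One draw of (realized volatility on the full grid) - [X, X]_T."""
    dt = T / n
    jump_incr = [0.0] * n               # increment of X^d per grid interval
    qv = sigma ** 2 * T                 # continuous part of [X, X]_T
    t = rng.expovariate(lam)
    while t < T:
        g = rng.gauss(0.0, jump_sd)
        jump_incr[min(int(t / dt), n - 1)] += g
        qv += g * g                     # jump part of [X, X]_T
        t += rng.expovariate(lam)
    s = sigma * math.sqrt(dt)
    rv = 0.0
    for j in jump_incr:
        dx = s * rng.gauss(0.0, 1.0) + j
        rv += dx * dx
    return rv - qv


def variance_ratio(n=50, reps=5000, T=1.0, sigma=0.2, lam=3.0, jump_sd=0.1, seed=11):
    """Sample variance of the discretization error divided by Sigma^2 / n."""
    rng = random.Random(seed)
    errs = [discretization_error(n, rng, T, sigma, lam, jump_sd) for _ in range(reps)]
    mean = sum(errs) / reps
    var = sum((e - mean) ** 2 for e in errs) / (reps - 1)
    return var / (sigma_squared(T, sigma, lam, jump_sd ** 2) / n)


if __name__ == "__main__":
    print(variance_ratio())   # should be close to 1
```

The three summands of `sigma_squared` correspond to the continuous part, the pure jump part and the cross term, so the relative weight of each source of discretization error can be read off directly.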
4.3 Total Error and Optimal Sampling Frequency on the Full Grid
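Corollary 1 Assume the conditions of Lemma 1 and Theorem 1 as well as the additional requirement that the subgrid $\mathcal{H}^n = \{0, ..., T\}$ exhibits equidistant observations, and apply a sparse sampling frequency $|\mathcal{H}^n| = n_{\mathrm{sparse}}$. Then, the total error from noise and discretization is given by
\[
\mathbb{E}\left(\lceil Y, Y \rceil_T^{\mathcal{H}^n} - [X, X]_T\right) = 2\, n_{\mathrm{sparse}} \cdot \mathbb{E}(\epsilon^2) + o\left(n^{-1/2}\right), \tag{4.54a}
\]
\[
\operatorname{Var}\left(\lceil Y, Y \rceil_T^{\mathcal{H}^n} - [X, X]_T\right) = \Xi^2 + O\left(n^{-1/2}\right), \tag{4.54b}
\]
where $\Xi^2$ has the form
\[
\begin{aligned}
\Xi^2 = {} & \underbrace{4\, n_{\mathrm{sparse}} \cdot \mathbb{E}(\epsilon^4) + 8 \cdot \mathbb{E}\left(\lceil X, X \rceil_T^{\mathcal{G}}\right) \cdot \mathbb{E}(\epsilon^2) - 2 \cdot \operatorname{Var}(\epsilon^2)}_{\text{due to noise}} \\
& + \underbrace{\frac{2}{n_{\mathrm{sparse}}} \cdot T \int_0^T \sigma^4\, dt + \frac{2}{n_{\mathrm{sparse}}} \cdot (\lambda T)^2 \cdot (\operatorname{Var} \gamma)^2 + \frac{4}{n_{\mathrm{sparse}}} \cdot \lambda T \cdot \operatorname{Var}(\gamma) \cdot \int_0^T \sigma^2\, ds}_{\text{due to discretization}} .
\end{aligned}
\]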
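Proof: This is a direct result of Lemma 1 and Theorem 1, where it is possible to combine the outcomes since the $\varepsilon$'s and the $X$ process are independent. As $\mathcal{H}^n$ is assumed to be equidistant, we make the derivations for the full grid $\mathcal{G}^n$ without loss of generality. The expectation follows from Lemma 1 and Theorem 1, so that we move to the variance and receive
\[
\begin{aligned}
\operatorname{Var}\Big( \lceil Y, Y \rceil_T^{\mathcal{G}^n} - [X, X]_T \Big)
&= \operatorname{Var}\Big( \lceil Y, Y \rceil_T^{\mathcal{G}^n} - \lceil X, X \rceil_T^{\mathcal{G}^n} + \lceil X, X \rceil_T^{\mathcal{G}^n} - [X, X]_T \Big) \\
&= \underbrace{\operatorname{Var}\Big( \lceil Y, Y \rceil_T^{\mathcal{G}^n} - \lceil X, X \rceil_T^{\mathcal{G}^n} \Big)}_{I} + \underbrace{\operatorname{Var}\Big( \lceil X, X \rceil_T^{\mathcal{G}^n} - [X, X]_T \Big)}_{II} \\
&\quad + \underbrace{2 \cdot \operatorname{Cov}\Big( \lceil Y, Y \rceil_T^{\mathcal{G}^n} - \lceil X, X \rceil_T^{\mathcal{G}^n} ,\ \lceil X, X \rceil_T^{\mathcal{G}^n} - [X, X]_T \Big)}_{III} .
\end{aligned}
\tag{4.55}
\]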
The quantity II is given in Theorem 1. We recall the assumptions given in connection
with (4.1) and first look at I and afterwards at III.
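\[
\begin{aligned}
I &= \operatorname{Var}\Big( \lceil Y, Y \rceil_T^{\mathcal{G}^n} \Big) + \operatorname{Var}\Big( \lceil X, X \rceil_T^{\mathcal{G}^n} \Big) - 2 \cdot \underbrace{\operatorname{Cov}\Big( \lceil Y, Y \rceil_T^{\mathcal{G}^n} ,\ \lceil X, X \rceil_T^{\mathcal{G}^n} \Big)}_{=\, \operatorname{Var}( \lceil X, X \rceil_T^{\mathcal{G}^n} ) \text{ with (4.2) and } \mathbb{E}(\varepsilon) = 0} \\
&= \underbrace{\mathbb{E}\Big( \operatorname{Var}\big( \lceil Y, Y \rceil_T^{\mathcal{G}^n} \,\big|\, X \big) \Big)}_{\text{given in (4.4)}} + \underbrace{\operatorname{Var}\Big( \mathbb{E}\big( \lceil Y, Y \rceil_T^{\mathcal{G}^n} \,\big|\, X \big) \Big)}_{=\, \operatorname{Var}( \lceil X, X \rceil_T^{\mathcal{G}^n} ) \text{ see (4.3)}} - \operatorname{Var}\Big( \lceil X, X \rceil_T^{\mathcal{G}^n} \Big) \\
&\overset{(4.4)}{=} 4\, n \cdot \mathbb{E}\,\varepsilon^4 + 8 \cdot \mathbb{E}\,\lceil X, X \rceil_T^{\mathcal{G}^n} \cdot \mathbb{E}\,\varepsilon^2 - 2 \cdot \operatorname{Var}\,\varepsilon^2 + O\big( n^{-1/2} \big),
\end{aligned}
\tag{4.56}
\]
where the $O\big( n^{-1/2} \big)$ results from¹⁰ $\mathbb{E}\,( \delta X_{t_n} - \delta X_{t_1} )$ and $\mathbb{E} \sum \delta X_{t_i} \cdot \delta X_{t_{i-1}}$. This order is obvious for the former quantity, and evidence for the order of the latter is given in conjunction with (5.51), where we show that $\mathbb{E}\,( \delta X_{t_i} \cdot \delta X_{t_{i-1}} ) = O\big( n^{-3/2} \big)$ for $i = 1, \dots, n$.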
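We move to the quantity $III$ as given in equation (4.55)
\[
III \overset{(4.2)}{=} \operatorname{Cov}\Big( 2 \cdot \lceil X, \varepsilon \rceil_T^{\mathcal{G}^n} + \lceil \varepsilon, \varepsilon \rceil_T^{\mathcal{G}^n} ,\ \lceil X, X \rceil_T^{\mathcal{G}^n} - [X, X]_T \Big) = 0,
\tag{4.57}
\]
again due to the independence of the $\varepsilon$'s and the $X$ process and $\mathbb{E}(\varepsilon) = 0$. Hence, the assertion of the corollary holds.
□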
As detailed in Zhang et al. (2005), in financial practice the realized volatility is usually computed from returns sampled at intervals of several minutes, even though high-frequency prices are mostly available at intervals of only a few seconds. In view of the equations in (4.54), this practice of sparse sampling is reasonable: by sampling less frequently, practitioners control the bias resulting from market microstructure noise, while keeping the sampling intervals short enough that the realized volatility of the underlying process still provides an acceptable approximation to the true quadratic variation.
Having understood the rationale behind sparse sampling, and having an impression of the interdependencies between the errors resulting from noise and discretization respectively, the task is to determine the optimal sparse sampling frequency, $n^*_{sparse} = |\mathcal{H}_{opt}|$, for a specific equidistant subgrid $\mathcal{H}_{opt}$ of the grid of observation times $\mathcal{G}$.
Analogous to Zhang et al. (2005), we move to minimize the mean squared error (MSE) of
10
See Zhang et al. (2005), subsection A.1, below equation (A.4) (p. 1407)
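the estimator $\lceil Y, Y \rceil_T^{\mathcal{H}^n}$, where the bias and variance of the estimator are given in (4.54), and hence the MSE equals
\[
\mathrm{MSE} = \big( 2\, n_{sparse} \cdot \mathbb{E}\,\varepsilon^2 \big)^2 + O(1) + 4 \cdot n_{sparse} \cdot \mathbb{E}\,\varepsilon^4 + \frac{1}{n_{sparse}} \cdot \Gamma,
\tag{4.58}
\]
where $\Gamma$ is defined as
\[
\Gamma := 2 \cdot \Bigg( T \int_0^T \sigma^4\, dt + (\operatorname{Var}\gamma)^2 (\lambda T)^2 + 2 \cdot \lambda T \cdot \operatorname{Var}(\gamma) \int_0^T \sigma^2\, ds \Bigg).
\]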
The form of the MSE shows that the balance between the sampling frequency, $n_{sparse}$, and the variance of the noise, $\mathbb{E}\,\varepsilon^2$, depends primarily on the magnitude of the bias from noise versus the size of the error due to discretization. This is due to the fact that in equation (4.58) the noise effect on $n_{sparse}$ that comes from the variance is of lower order than that which comes from the bias.
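We minimize the MSE and aim for an approximation of the optimal sampling frequency $n^*_{sparse}$ by setting $\partial\, \mathrm{MSE} / \partial n_{sparse} = 0$, which results in
\[
\begin{aligned}
0 &\approx 8\, n_{sparse} \cdot \big( \mathbb{E}\,\varepsilon^2 \big)^2 + 4 \cdot \mathbb{E}\,\varepsilon^4 - n_{sparse}^{-2} \cdot \Gamma = \frac{\partial\, \mathrm{MSE}}{\partial n_{sparse}} \\
\Leftrightarrow \quad 0 &\approx n_{sparse}^3 + \big( \mathbb{E}\,\varepsilon^2 \big)^{-2} \cdot \Bigg( \frac{n_{sparse}^2}{2} \cdot \mathbb{E}\,\varepsilon^4 - \frac{1}{8} \cdot \Gamma \Bigg),
\end{aligned}
\tag{4.59}
\]
so that we receive an approximation to the optimal sampling frequency
\[
n^*_{sparse} = \frac{1}{2} \cdot \sqrt[3]{ \big( \mathbb{E}\,\varepsilon^2 \big)^{-2} \cdot \Gamma } \cdot (1 + o(1)), \quad \text{as } \mathbb{E}\,\varepsilon^2 \to 0,
\tag{4.60}
\]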
where the error contribution, $(1 + o(1))$, in (4.60) is due to the variance of the noise in (4.59), which is asymptotically negligible for $\mathbb{E}\,\varepsilon^2 \to 0$ as it is of lower order than the contribution from discretization. This result shows that the smaller the variance of the noise is, the higher the sampling frequency is allowed to be. These considerations work if the variance of the noise is small. Note that in case $n^*_{sparse} > n$, we set $n^*_{sparse} = n$. The estimator optimally sampled according to $n^*_{sparse}$ is denoted as $\lceil Y, Y \rceil_T^{\mathcal{H}_{opt}}$.
It is important to note that, since the estimation of the components of the quantity $\Gamma$ under the model given in (4.1) has not yet been established, the practical feasibility of the result above still needs to be clarified.
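The leading-order rule (4.60) can be illustrated numerically. The following is only a sketch under stated assumptions: the function names are hypothetical, $\mu$ and $\sigma$ are taken to be constant so that the integrals in $\Gamma$ are supplied as plain numbers, and the $(1 + o(1))$ factor is dropped.

```python
import math


def gamma_term(T, int_sigma4, int_sigma2, lam, var_gamma):
    """Gamma as defined below (4.58); the two integrals are passed in as numbers."""
    return 2.0 * (T * int_sigma4
                  + (var_gamma ** 2) * (lam * T) ** 2
                  + 2.0 * lam * T * var_gamma * int_sigma2)


def optimal_sparse_frequency(noise_var, gamma, n):
    """Leading-order n*_sparse from (4.60), capped at the full-grid size n."""
    n_star = 0.5 * (gamma / noise_var ** 2) ** (1.0 / 3.0)
    return min(n_star, float(n))


# Illustrative (made-up) parameters: T = 1, sigma = 0.2, lambda = 10,
# Var(gamma) = 1e-4, noise variance E(eps^2) = 2.5e-7.
g = gamma_term(T=1.0, int_sigma4=0.2 ** 4, int_sigma2=0.2 ** 2,
               lam=10.0, var_gamma=1e-4)
n_star = optimal_sparse_frequency(noise_var=2.5e-7, gamma=g, n=23400)
```

By construction, the uncapped value satisfies the dominant balance $8\, n^3 \cdot (\mathbb{E}\,\varepsilon^2)^2 = \Gamma$ obtained from (4.59) when the term $4 \cdot \mathbb{E}\,\varepsilon^4$ is neglected.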
5
Estimating Quadratic Variation on the Multi Grid
The findings in this section partially expand the contributions in the third section of Zhang et al. (2005).
As presented at the end of the last section, the estimator $\lceil Y, Y \rceil_T^{\mathcal{H}_{opt}}$ uses only a fraction of the financial data available, and thus ignores one of the principal rules of statistics, which states that all available data should be utilized. Therefore, Zhang et al. (2005) introduce a method which does not merely rely on subsampling of the available information, but also combines the estimators of a multitude of subgrids of the full grid $\mathcal{G}^n = \{0 = t_0^n, \dots, t_n^n = T\}$ and averages these estimators. Consequently, this method joins the benefits resulting from subsampling, as shown in the last section, with the variance-reducing effects gained through averaging, and additionally uses all the price information available. The estimator $\lceil Y, Y \rceil_T^{(avg)}$, which is based on this method, will be studied in more detail in this section. Specifically, we aim for new results on the variance of the estimation error, $\lceil Y, Y \rceil_T^{(avg)} - [X, X]_T$, with the logarithmic price process being a Poisson jump diffusion process.
5.1
The Multi Grid
Definition 7 (Multi Grid) Let $\mathcal{G}^n = \{0 = t_0^n, \dots, t_n^n = T\}$ be the full grid of equidistant observation times as given in Definition 3. Define the subgrids $\mathcal{G}^{(i,n)} \subseteq \mathcal{G}^n$, $i = 1, \dots, K_n$, such that
\[
\mathcal{G}^n = \bigcup_{i=1}^{K_n} \mathcal{G}^{(i,n)}, \quad \text{where } \mathcal{G}^{(i,n)} \cap \mathcal{G}^{(j,n)} = \emptyset \text{ when } i \ne j,
\tag{5.1}
\]
where the observation times of $\mathcal{G}^{(i,n)}$ are allocated by
\[
\mathcal{G}^{(i,n)} = \big\{ t^n_{i-1},\ t^n_{i-1+K_n},\ t^n_{i-1+2 \cdot K_n},\ \dots,\ t^n_{i-1+n_K \cdot K_n} \big\}, \quad i = 1, \dots, K_n .
\tag{5.2}
\]
The integer $n_K$ is defined as
\[
n_K := |\mathcal{G}^{(i,n)}| = \frac{|\mathcal{G}^n| - K_n + 1}{K_n} = \frac{n - K_n + 1}{K_n},
\]
so that $t^n_{i-1+n_K \cdot K_n}$ is the last element in the subgrid $\mathcal{G}^{(i,n)}$, and denote this as regular allocation of sample points to subgrids.
Note that under Definition 7 the assumptions of Definition 3 remain valid and that asymptotics are still under (3.11) and
\[
\frac{n}{K} \to \infty \quad \text{as } n \to \infty.
\tag{5.3}
\]
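The regular allocation in (5.2) can be made concrete with index sets. The sketch below is illustrative only: the function name is hypothetical, observation times are represented by their indices $0, \dots, n$ in the full grid, and $n - K + 1$ is assumed to be divisible by $K$ so that $n_K$ is an integer.

```python
def regular_subgrids(n, K):
    """Return n_K and the index sets of the subgrids G^(i), i = 1..K, as in (5.2)."""
    if (n - K + 1) % K != 0:
        raise ValueError("n - K + 1 must be divisible by K")
    n_K = (n - K + 1) // K
    # Subgrid i contains the indices i-1, i-1+K, ..., i-1+n_K*K.
    grids = [[i - 1 + j * K for j in range(n_K + 1)] for i in range(1, K + 1)]
    return n_K, grids


# Small example: n = 11 intervals in the full grid, K = 4 subgrids.
n_K, grids = regular_subgrids(11, 4)
```

Each subgrid then holds $n_K + 1$ observation times, that is $n_K$ time intervals, and the subgrids are disjoint and jointly cover all $n + 1$ full-grid indices, in line with (5.1).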
Furthermore, to simplify notation, the use of double subscripts indicating the dependence on $n$ has been avoided where the meaning is clear, but it is emphasized that all conditions and results have to be interpreted in the way noted above.
According to Definition 7, the integer $n_K$ represents the number of time intervals for each subgrid $\mathcal{G}^{(i)}$, $i = 1, \dots, K$, and depends on the subgrid frequency $K$, which represents the number of subgrids as given in the definition above. Moreover, under the regular allocation of sample points to grids, the observations of each subgrid $\mathcal{G}^{(i)}$ partition the reference time $T$ as follows
\[
\begin{aligned}
T &= ( t_{i-1} - t_0 ) + \sum_{j=1}^{n_K} \big( t_{i-1+j \cdot K} - t_{i-1+(j-1) \cdot K} \big) + ( t_n - t_{i-1+n_K \cdot K} ) \\
&= (i - 1) \cdot \delta t + n_K \cdot K \cdot \delta t + (K - i) \cdot \delta t \\
&= \underbrace{n_K \cdot K \cdot \frac{T}{n}}_{=\, \max \mathcal{G}^{(i)} - \min \mathcal{G}^{(i)}} + (K - 1) \cdot \frac{T}{n},
\end{aligned}
\]
which together with the asymptotics in (5.3) demonstrates that
\[
\min \mathcal{G}^{(i)} \xrightarrow{\,n \to \infty\,} 0 \quad \text{as well as} \quad \max \mathcal{G}^{(i)} \xrightarrow{\,n \to \infty\,} T.
\]
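Furthermore, the realized volatility of the subsampled observations $Y_t = X_t + \varepsilon_t$, $t \in \mathcal{G}^{(i)}$, is denoted by $\lceil Y, Y \rceil_T^{(i)}$ and has the following form
\[
\lceil Y, Y \rceil_T^{(i)} = \sum_{\substack{t_{j,-},\, t_j \in \mathcal{G}^{(i)} \\ t_j \le T}} \big( Y_{t_j} - Y_{t_{j,-}} \big)^2 = \sum_{j=1}^{n_K} \big( Y_{t_{i-1+j \cdot K}} - Y_{t_{i-1+(j-1) \cdot K}} \big)^2,
\]
where $t_{j,-} \in \mathcal{G}^{(i)}$ is the element preceding $t_j \in \mathcal{G}^{(i)}$ in the subgrid $\mathcal{G}^{(i)}$.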
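As indicated earlier, Zhang et al. (2005) combine subsampling and averaging effects to introduce a contestant to the estimators $\lceil Y, Y \rceil_T^{\mathcal{G}^n}$ and $\lceil Y, Y \rceil_T^{\mathcal{H}_{opt}}$, which is given by
\[
\begin{aligned}
\lceil Y, Y \rceil_T^{(avg)} &= \frac{1}{K} \sum_{i=1}^{K} \lceil Y, Y \rceil_T^{(i)} \\
&= \frac{1}{K} \sum_{i=1}^{K} \Big( \lceil X, X \rceil_T^{(i)} + 2\, \lceil X, \varepsilon \rceil_T^{(i)} + \lceil \varepsilon, \varepsilon \rceil_T^{(i)} \Big) \\
&= \lceil X, X \rceil_T^{(avg)} + 2\, \lceil X, \varepsilon \rceil_T^{(avg)} + \lceil \varepsilon, \varepsilon \rceil_T^{(avg)},
\end{aligned}
\tag{5.4}
\]
and depicts the “pooled” realized volatility of the subsampled observations. We will focus this section on the study of $\lceil Y, Y \rceil_T^{(avg)}$, and seek similar results as in the full grid case, which we evaluated in the last section.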
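The estimator in (5.4) is straightforward to compute from a vector of observations. The following is a minimal sketch, assuming the observations $Y_{t_0}, \dots, Y_{t_n}$ are stored in a one-dimensional NumPy array and that $n - K + 1$ is divisible by $K$; the function names are hypothetical.

```python
import numpy as np


def subgrid_rv(y, i, K):
    """Realized volatility on subgrid G^(i): squared increments of every K-th
    observation, starting at index i-1 (i = 1..K)."""
    sub = y[i - 1::K]
    return float(np.sum(np.diff(sub) ** 2))


def pooled_rv(y, K):
    """Average of the K subgrid realized volatilities, as in (5.4)."""
    return sum(subgrid_rv(y, i, K) for i in range(1, K + 1)) / K


# Toy input: y_t = t on a grid with n = 11 intervals, so each K = 4 subgrid
# has two increments of size 4.
y = np.arange(12.0)
value = pooled_rv(y, 4)
```

For $K = 1$ the function reduces to the realized volatility on the full grid.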
5.2
Error from Market Microstructure Noise on the Multi Grid
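Lemma 6 Assume $X$ to be a process of form (2.4) and $Y_{t_i}$, $\varepsilon_{t_i}$ to be defined as in (4.1), then the expectation and variance of the “pooled” realized volatility, $\lceil Y, Y \rceil_T^{(avg)}$, conditional on the process $X$, are given by
\[
\mathbb{E}\Big( \lceil Y, Y \rceil_T^{(avg)} \,\Big|\, X \Big) = \lceil X, X \rceil_T^{(avg)} + 2 \cdot n_K \cdot \mathbb{E}\,\varepsilon^2
\tag{5.5}
\]
and respectively
\[
\operatorname{Var}\Big( \lceil Y, Y \rceil_T^{(avg)} \,\Big|\, X \Big) = \frac{4}{K} \cdot n_K \cdot \mathbb{E}\,\varepsilon^4 + \frac{1}{K} \Big( 8 \cdot \lceil X, X \rceil_T^{(avg)} \cdot \mathbb{E}\,\varepsilon^2 - 2 \cdot \operatorname{Var}\,\varepsilon^2 \Big) + O_p\Big( \frac{1}{\sqrt{n \cdot K}} \Big).
\tag{5.6}
\]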
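Proof: We begin with the derivation of the conditional expectation
\[
\begin{aligned}
\mathbb{E}\Big( \lceil Y, Y \rceil_T^{(avg)} \,\Big|\, X \Big)
&= \frac{1}{K} \cdot \mathbb{E}\Bigg( \sum_{i=1}^{K} \Big( \lceil X, X \rceil_T^{(i)} + 2\, \lceil X, \varepsilon \rceil_T^{(i)} + \lceil \varepsilon, \varepsilon \rceil_T^{(i)} \Big) \,\Bigg|\, X \Bigg) \\
&= \lceil X, X \rceil_T^{(avg)} + \frac{1}{K} \cdot \sum_{i=1}^{K} \Bigg( 2 \cdot \underbrace{\mathbb{E}\Big( \lceil X, \varepsilon \rceil_T^{(i)} \,\Big|\, X \Big)}_{=\,0 \text{ [see (4.5)]}} + \underbrace{\mathbb{E}\Big( \lceil \varepsilon, \varepsilon \rceil_T^{(i)} \Big)}_{=\, 2 \cdot |\mathcal{G}^{(i)}| \cdot \mathbb{E}(\varepsilon^2) \text{ [see (4.6)]}} \Bigg) \\
&= \lceil X, X \rceil_T^{(avg)} + \frac{1}{K} \cdot \sum_{i=1}^{K} 2 \cdot \underbrace{|\mathcal{G}^{(i)}|}_{=\, n_K} \cdot\, \mathbb{E}\,\varepsilon^2 \\
&= \lceil X, X \rceil_T^{(avg)} + 2 \cdot n_K \cdot \mathbb{E}\,\varepsilon^2 .
\end{aligned}
\tag{5.7}
\]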
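Next, we derive the conditional variance
\[
\operatorname{Var}\Big( \lceil Y, Y \rceil_T^{(avg)} \,\Big|\, X \Big) = \frac{1}{K^2} \cdot \Bigg( \sum_{i=1}^{K} \underbrace{\operatorname{Var}\Big( \lceil Y, Y \rceil_T^{(i)} \,\Big|\, X \Big)}_{I_L} + 2 \cdot \sum_{i<j} \underbrace{\operatorname{Cov}\Big( \lceil Y, Y \rceil_T^{(i)} ,\ \lceil Y, Y \rceil_T^{(j)} \,\Big|\, X \Big)}_{=\,0} \Bigg),
\tag{5.8}
\]
where conditional on the process $X$, the randomness of the observed squared returns on an arbitrary grid results solely from the noise terms $\varepsilon_{t_i}$ which are i.i.d., and due to (5.1), conditional on $X$, the quantities $\lceil Y, Y \rceil_T^{(i)}$ and $\lceil Y, Y \rceil_T^{(j)}$ are independent when $i \ne j$, so that their common covariance is equal to zero.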
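We continue by defining the variation of an arbitrary process $Z$ between two observations in the subgrid $\mathcal{G}^{(i)}$ on the time interval $( t_{m-K}, t_m ]$ as $\delta^{(i)} Z_{t_m} := Z_{t_m} - Z_{t_{m,-}}$, where $t_{m,-}, t_m \in \mathcal{G}^{(i)}$, and begin with the calculation of $I_L$ as given in equation (5.8)
\[
\begin{aligned}
\operatorname{Var}\Big( \lceil Y, Y \rceil_T^{(i)} \,\Big|\, X \Big)
&= \operatorname{Var}\Bigg( \sum_{\substack{t_{m,-},\, t_m \in \mathcal{G}^{(i)} \\ t_m \le T}} \big( \delta^{(i)} Y_{t_m} \big)^2 \,\Bigg|\, X \Bigg) \\
&= \sum_{\substack{t_{m,-},\, t_m \in \mathcal{G}^{(i)} \\ t_m \le T}} \underbrace{\operatorname{Var}\Big( \big( \delta^{(i)} Y_{t_m} \big)^2 \,\Big|\, X \Big)}_{I.1_L} \\
&\quad + 2 \cdot \sum_{\substack{t_{m,-},\, t_m,\, t_{m,+} \in \mathcal{G}^{(i)} \\ t_{m,+} \le T}} \underbrace{\operatorname{Cov}\Big( \big( \delta^{(i)} Y_{t_{m,+}} \big)^2 ,\ \big( \delta^{(i)} Y_{t_m} \big)^2 \,\Big|\, X \Big)}_{I.2_L},
\end{aligned}
\tag{5.9}
\]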
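where the form of the sum of the covariances in the second equality follows from the explanations above, since $Y_{t_{m,+}} - Y_{t_m}$ depends on $Y_{t_m} - Y_{t_{m,-}}$ only through $\varepsilon_{t_m}$. The derivation of $I.1_L$ and $I.2_L$ as given in equation (5.9) is identical to the approach in the calculations of $\operatorname{Var}\big( (\delta Y_{t_m})^2 \,\big|\, X \big)$ and $\operatorname{Cov}\big( (\delta Y_{t_{m,+}})^2 , (\delta Y_{t_m})^2 \,\big|\, X \big)$ as given in Zhang et al. (2005), subsection A.1 (p. 1407), with the only difference being that the expressions $\delta X_{t_m}$ and $\delta \varepsilon_{t_m}$ in their calculations have to be replaced by $\delta^{(i)} X_{t_m}$ and $\delta^{(i)} \varepsilon_{t_m}$ respectively, which results in
\[
\operatorname{Var}\Big( \big( \delta^{(i)} Y_{t_m} \big)^2 \,\Big|\, X \Big) = 8\, \big( \delta^{(i)} X_{t_m} \big)^2\, \mathbb{E}\,\varepsilon^2 + 2 \cdot \mathbb{E}\,\varepsilon^4 + 2 \cdot \big( \mathbb{E}\,\varepsilon^2 \big)^2,
\]
\[
\begin{aligned}
\operatorname{Cov}\Big( \big( \delta^{(i)} Y_{t_{m,+}} \big)^2 ,\ \big( \delta^{(i)} Y_{t_m} \big)^2 \,\Big|\, X \Big)
={}& \mathbb{E}\,\varepsilon^4 - \big( \mathbb{E}\,\varepsilon^2 \big)^2 - 4\, \delta^{(i)} X_{t_m} \cdot \delta^{(i)} X_{t_{m,+}} \cdot \mathbb{E}\,\varepsilon^2 \\
&- 2 \cdot \mathbb{E}\,\varepsilon^3 \cdot \big( \delta^{(i)} X_{t_{m,+}} - \delta^{(i)} X_{t_m} \big).
\end{aligned}
\]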
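Therefore, $I_L$ as given in (5.8) and detailed in (5.9) has the following form
\[
\begin{aligned}
\operatorname{Var}\Big( \lceil Y, Y \rceil_T^{(i)} \,\Big|\, X \Big)
={}& \sum_{\substack{t_{m,-},\, t_m \in \mathcal{G}^{(i)} \\ t_m \le T}} \Big( 2 \cdot \mathbb{E}\,\varepsilon^4 + 2 \cdot \big( \mathbb{E}\,\varepsilon^2 \big)^2 + 8\, \big( \delta^{(i)} X_{t_m} \big)^2\, \mathbb{E}\,\varepsilon^2 \Big) \\
&+ 2 \cdot \sum_{\substack{t_{m,-},\, t_m,\, t_{m,+} \in \mathcal{G}^{(i)} \\ t_{m,+} \le T}} \Big( \mathbb{E}\,\varepsilon^4 - \big( \mathbb{E}\,\varepsilon^2 \big)^2 - 4\, \delta^{(i)} X_{t_m} \cdot \delta^{(i)} X_{t_{m,+}} \cdot \mathbb{E}\,\varepsilon^2 \Big) \\
&- 2 \cdot 2 \cdot \mathbb{E}\,\varepsilon^3 \cdot \sum_{\substack{t_{m,-},\, t_m,\, t_{m,+} \in \mathcal{G}^{(i)} \\ t_{m,+} \le T}} \big( \delta^{(i)} X_{t_{m,+}} - \delta^{(i)} X_{t_m} \big) \\
={}& \underbrace{2 \cdot |\mathcal{G}^{(i)}| \cdot \Big( \mathbb{E}\,\varepsilon^4 + \big( \mathbb{E}\,\varepsilon^2 \big)^2 \Big) + 2 \cdot \big( |\mathcal{G}^{(i)}| - 1 \big) \cdot \Big( \mathbb{E}\,\varepsilon^4 - \big( \mathbb{E}\,\varepsilon^2 \big)^2 \Big)}_{=\, 4 \cdot |\mathcal{G}^{(i)}| \cdot \mathbb{E}(\varepsilon^4) \,-\, 2 \cdot \operatorname{Var}(\varepsilon^2)} \\
&+ 8 \cdot \lceil X, X \rceil_T^{(i)} \cdot \mathbb{E}\,\varepsilon^2 - 8 \cdot \sum_{\substack{t_{m,-},\, t_m,\, t_{m,+} \in \mathcal{G}^{(i)} \\ t_{m,+} \le T}} \delta^{(i)} X_{t_m} \cdot \delta^{(i)} X_{t_{m,+}} \cdot \mathbb{E}\,\varepsilon^2 \\
&- 4 \cdot \mathbb{E}\,\varepsilon^3 \cdot \big( \delta^{(i)} X_{t_{i-1+n_K \cdot K}} - \delta^{(i)} X_{t_{i-1+K}} \big),
\end{aligned}
\]
where, analogous to the proof of the conditional variance in the full grid case of Zhang et al. (2005), subsection A.1 (p. 1407), the quantities $\sum \delta^{(i)} X_{t_m} \cdot \delta^{(i)} X_{t_{m,+}}$ and $\delta^{(i)} X_{t_{i-1+n_K \cdot K}} - \delta^{(i)} X_{t_{i-1+K}}$ are both of order $O_p\big( n_K^{-1/2} \big)$, so that $I_L$ is given by
\[
\operatorname{Var}\Big( \lceil Y, Y \rceil_T^{(i)} \,\Big|\, X \Big) = 4 \cdot n_K \cdot \mathbb{E}\,\varepsilon^4 + 8 \cdot \lceil X, X \rceil_T^{(i)} \cdot \mathbb{E}\,\varepsilon^2 - 2 \cdot \operatorname{Var}\,\varepsilon^2 + O_p\big( n_K^{-1/2} \big).
\tag{5.10}
\]
Consequently, the conditional variance detailed in (5.8) equals
\[
\operatorname{Var}\Big( \lceil Y, Y \rceil_T^{(avg)} \,\Big|\, X \Big) = 4 \cdot \frac{n_K}{K} \cdot \mathbb{E}\,\varepsilon^4 + \frac{1}{K} \Big( 8 \cdot \lceil X, X \rceil_T^{(avg)} \cdot \mathbb{E}\,\varepsilon^2 - 2 \cdot \operatorname{Var}\,\varepsilon^2 \Big) + O_p\Big( \frac{1}{\sqrt{n \cdot K}} \Big).
\tag{5.11}
\]
□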
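The conditional moments of Lemma 6 can be checked by simulation in a special case. The sketch below makes several assumptions that are not part of the text: $X \equiv 0$ (so that $\lceil X, X \rceil^{(avg)}_T = 0$ and the $O_p$ remainder vanishes), standard normal noise (so $\mathbb{E}\,\varepsilon^4 = 3$ and $\operatorname{Var}\,\varepsilon^2 = 2$), and illustrative values for $n$, $K$ and the number of simulated paths.

```python
import numpy as np


def pooled_rv_paths(y, K):
    """Pooled realized volatility for each row of y (shape: paths x (n + 1))."""
    rvs = [np.sum(np.diff(y[:, i::K], axis=1) ** 2, axis=1) for i in range(K)]
    return np.mean(rvs, axis=0)


rng = np.random.default_rng(0)           # fixed seed for reproducibility
n, K, paths = 99, 4, 20000               # n - K + 1 = 96 is divisible by K
n_K = (n - K + 1) // K                   # 24 intervals per subgrid

eps = rng.normal(0.0, 1.0, size=(paths, n + 1))   # Y = eps because X = 0
estimates = pooled_rv_paths(eps, K)

mean_theory = 2 * n_K * 1.0                       # (5.5) with E(eps^2) = 1
var_theory = (4 * n_K * 3.0 - 2 * 2.0) / K        # (5.6) with X = 0
```

The sample mean and sample variance of `estimates` should then lie close to `mean_theory` and `var_theory`, up to Monte Carlo error.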
This result shows that the quantity $\lceil Y, Y \rceil_T^{(avg)}$, which is obtained by averaging the subsampled estimators $\lceil Y, Y \rceil_T^{(i)}$ over the $K$ subgrids of equal size $n_K$, is still a biased estimator of the quadratic variation of the true return process, $[X, X]_T$. Nonetheless, the subsampling and averaging method introduced above reduces the bias resulting from the noise compared to no subsampling at all: the bias of $\lceil Y, Y \rceil_T^{(avg)}$, which amounts to $2 \cdot n_K \cdot \mathbb{E}\,\varepsilon^2$, increases only with the number of observations on the subgrids, $n_K$, rather than with the number of observations $n$ in the full grid $\mathcal{G}^n$. Additionally, this estimator is superior to the optimally sampled estimator $\lceil Y, Y \rceil_T^{\mathcal{H}_{opt}}$, which has been studied in the last section, in the sense that it utilizes all available financial data.
The next lemma gives a result from Zhang et al. (2005) on the asymptotic error distribution of the noise in the multi grid case.
Lemma 7 Assume $X$ to be a process of form (2.4) and $Y_{t_i}$, $\varepsilon_{t_i}$ to be defined as in (4.1). Suppose that for a given $n$, the grid $\mathcal{G}^n$ as well as the associated subgrids $\mathcal{G}^{(i)}$, $i = 1, \dots, K$, are given as in Definition 3 and 7. Moreover, assume that $\delta t_n \to 0$ and $n_K \to \infty$ as $n \to \infty$ is fulfilled for the sequence of grids $\mathcal{G}^n$ as well as the associated subgrids. Then, conditional on the $X$ process,
\[
\sqrt{\frac{K}{n_K}} \cdot \Big( \lceil Y, Y \rceil_T^{(avg)} - \lceil X, X \rceil_T^{(avg)} - 2 \cdot n_K \cdot \mathbb{E}\,\varepsilon^2 \Big) \,\Big|\, X \ \xrightarrow{\ \mathcal{L}\ }\ 2 \sqrt{\mathbb{E}\,(\varepsilon^4)} \cdot Z_{noise}^{(avg)},
\]
where $Z_{noise}^{(avg)}$ is a standard normal random variable with the subscript ”noise” indicating that the randomness comes from the noise term $\varepsilon$.
Proof: The proof is given in Zhang et al. (2005), where this result is again part of Theorem A.1, subsection A.2 (p. 1408-9). As discussed earlier, Theorem A.1 is based on Lemma A.2 (a), subsection A.2 (p. 1408), which does not use any properties of a continuous Itô process, but can be applied directly to the jump diffusion process $X$ given in (2.4). Apart from using Lemma A.2 (a), the proof of Theorem A.1 in Zhang et al. (2005) is conditional on the underlying process $X$ and thus unrelated to the nature of the $X$ process.
□
5.3
Error from Discretization on the Multi Grid
We first present an asymptotic result from Zhang et al. (2005) on the “pooled” discretization error of the process $X^c$, $\lceil X^c, X^c \rceil_T^{(avg)} - [X^c, X^c]_T$, for equidistant observation times and a regular allocation of points to subgrids.
Proposition 3 Let $X^c$ be a process as in (2.5), and additionally suppose that $\mu$ and $\sigma$ are continuous. Assume that for a given $n$, the grid $\mathcal{G}^n$ as well as the associated subgrids $\mathcal{G}^{(i)}$, $i = 1, \dots, K$, are given as in Definition 3 and 7. Moreover, assume that $\delta t_n \to 0$ and $n_K \to \infty$ as $n \to \infty$ is fulfilled for the sequence of grids $\mathcal{G}^n$ as well as the associated subgrids. Then as $n \to \infty$
\[
\Big( \frac{n}{K} \Big)^{1/2} \cdot \Big( \lceil X^c, X^c \rceil_T^{(avg)} - [X^c, X^c]_T \Big) \ \xrightarrow{\ \mathcal{L}_S\ }\ \Bigg( \frac{4}{3} \cdot T \int_0^T \sigma_t^4\, dt \Bigg)^{1/2} \cdot Z^c_{disc},
\tag{5.12}
\]
where $\xrightarrow{\ \mathcal{L}_S\ }$ denotes convergence stable in law, and $Z^c_{disc}$ is a standard normal random variable which is independent of the process $X^c$.
Proof: This proposition originates from Zhang et al. (2005), Theorem 3, section 3.4 (p. 1401), where the proof is given in section A.3 (p. 1411).
Continuing the analysis of the “pooled” realized volatility, $\lceil X, X \rceil_T^{(avg)}$, we focus on the discretization resulting from discrete observations and evaluate the variance of the “pooled” discretization error, $\lceil X, X \rceil_T^{(avg)} - [X, X]_T$.
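Theorem 2 Let $X$ be a process of form (2.4) with $X^c$ as in (2.5), and $X^d$ as in (2.6) with (2.7) in effect. Additionally, assume that $\mu$ and $\sigma$ are constant. Suppose that for a given $n$, the grid $\mathcal{G}^n$ as well as the associated subgrids $\mathcal{G}^{(i)}$, $i = 1, \dots, K$, are given as in Definition 3 and 7. Then,
\[
\mathbb{E}\Big( \lceil X, X \rceil_T^{(avg)} - [X, X]_T \Big) = O\Big( \frac{K}{n} \Big),
\tag{5.13}
\]
\[
\operatorname{Var}\Big( \lceil X, X \rceil_T^{(avg)} - [X, X]_T \Big) = \frac{K}{n} \cdot \Sigma^2 + o\Big( \frac{K}{n} \Big),
\tag{5.14}
\]
where $\Sigma^2$ has the following form
\[
\Sigma^2 = \frac{4}{3} \cdot T \int_0^T \sigma^4\, dt + \frac{4}{3} \cdot \Big( (\lambda T)^2 \cdot (\operatorname{Var}\gamma)^2 + \frac{1}{2} \cdot \lambda T \cdot \mathbb{E}\,\gamma^4 \Big) + \frac{8}{3} \cdot \lambda T \cdot \operatorname{Var}(\gamma) \cdot \int_0^T \sigma^2\, ds .
\tag{5.15}
\]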
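To get a feeling for the size of the three contributions to $\Sigma^2$, the constant in (5.15) can be evaluated numerically. The sketch below assumes constant $\sigma$, as in Theorem 2, so that $\int_0^T \sigma^4\, dt = \sigma^4 T$ and $\int_0^T \sigma^2\, ds = \sigma^2 T$; the function name and the parameter values are hypothetical.

```python
def sigma_squared(T, sigma, lam, var_gamma, m4_gamma):
    """Evaluate (5.15) for constant sigma; returns the total and its three parts."""
    continuous = 4.0 / 3.0 * T * (sigma ** 4 * T)
    jump = 4.0 / 3.0 * ((lam * T) ** 2 * var_gamma ** 2 + 0.5 * lam * T * m4_gamma)
    cross = 8.0 / 3.0 * lam * T * var_gamma * (sigma ** 2 * T)
    return continuous + jump + cross, (continuous, jump, cross)


# Illustrative values; the jump sizes are taken to be N(0, 1e-4), so that
# E(gamma^4) = 3 * Var(gamma)^2.
total, parts = sigma_squared(T=1.0, sigma=0.2, lam=10.0,
                             var_gamma=1e-4, m4_gamma=3e-8)
```

The three parts correspond to the continuous component $X^c$, the jump component $X^d$, and four times the variance constant of the cross term $\lceil X^c, X^d \rceil^{(avg)}_T$.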
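Lemma 8 Let $X^c$ be a process as in (2.5), and $X^d$ as in (2.6) with (2.7) in effect. Additionally, suppose that $\sigma$ is wide-sense stationary, i.e. $\mathbb{E}(\sigma_s) = \mathbb{E}(\sigma_t)$ and $\mathbb{E}\big( \sigma_s^2 \big) = \mathbb{E}\big( \sigma_t^2 \big)$ $\forall\, s, t \in [0, T]$. Suppose that for a given $n$, the grid $\mathcal{G}^n$ as well as the associated subgrids $\mathcal{G}^{(i)}$, $i = 1, \dots, K$, are given as in Definition 3 and 7. Then,
\[
\begin{aligned}
\mathbb{E}\Big( \lceil X^d, X^d \rceil_T^{(avg)} - [X^d, X^d]_T \Big) &= O\Big( \frac{1}{n} \Big), \\
\operatorname{Var}\Big( \lceil X^d, X^d \rceil_T^{(avg)} - [X^d, X^d]_T \Big) &= \frac{4}{3} \cdot \frac{K}{n} \cdot \Big( (\lambda T)^2 \cdot (\operatorname{Var}\gamma)^2 + \frac{1}{2} \cdot \lambda T \cdot \mathbb{E}\,\gamma^4 \Big) + o\Big( \frac{K}{n} \Big), \quad \text{and}
\end{aligned}
\tag{5.16}
\]
\[
\begin{aligned}
\mathbb{E}\,\lceil X^c, X^d \rceil_T^{(avg)} &= 0, \\
\operatorname{Var}\,\lceil X^c, X^d \rceil_T^{(avg)} &= \frac{2}{3} \cdot \frac{K}{n} \cdot \lambda T \cdot \operatorname{Var}(\gamma) \cdot \int_0^T \mathbb{E}\,\sigma_s^2\, ds + o\Big( \frac{K}{n} \Big).
\end{aligned}
\tag{5.17}
\]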
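Lemma 9 Let $X^c$ be a process as in (2.5), and additionally suppose that $\mu$ and $\sigma$ are constant. Assume that for a given $n$, the grid $\mathcal{G}^n$ as well as the associated subgrids $\mathcal{G}^{(i)}$, $i = 1, \dots, K$, are given as in Definition 3 and 7. Then,
\[
\mathbb{E}\Big( \lceil X^c, X^c \rceil_T^{(avg)} - [X^c, X^c]_T \Big) = O\Big( \frac{K}{n} \Big),
\tag{5.18}
\]
\[
\operatorname{Var}\Big( \lceil X^c, X^c \rceil_T^{(avg)} - [X^c, X^c]_T \Big) = \frac{4}{3} \cdot \frac{K}{n} \cdot T \int_0^T \sigma^4\, dt + o\Big( \frac{K}{n} \Big).
\tag{5.19}
\]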
Proof of Lemma 8:
In the first step we show (5.16) and in the second step (5.17).
Step 1
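The first step is the analysis of the “pooled” discretization error of $X^d$ on a multi grid as given in Definition 7, $\lceil X^d, X^d \rceil_T^{(avg)} - [X^d, X^d]_T$. We begin with the calculation of the mean of the “pooled” discretization error of the process $X^d$, which has the following form
\[
\begin{aligned}
\mathbb{E}\Big( \lceil X^d, X^d \rceil_T^{(avg)} - [X^d, X^d]_T \Big)
&= \frac{1}{K} \sum_{i=1}^{K} \mathbb{E}\,\lceil X^d, X^d \rceil_T^{(i)} - \mathbb{E}\,[X^d, X^d]_T \\
&\overset{(4.24)\,\&\,(4.25)}{=} \frac{1}{K} \sum_{i=1}^{K} \Bigg( \underbrace{\sum_{\substack{t_{j,-},\, t_j \in \mathcal{G}^{(i)} \\ t_j \le T}} \lambda \cdot ( t_j - t_{j,-} )}_{=\, \lambda \cdot ( \max \mathcal{G}^{(i)} - \min \mathcal{G}^{(i)} ) \,=\, \lambda T \cdot \frac{n-K+1}{n}} - \lambda T \Bigg) \cdot \operatorname{Var}(\gamma) \\
&= - \lambda T \cdot \frac{K-1}{n} \cdot \operatorname{Var}(\gamma),
\end{aligned}
\tag{5.20}
\]
where the derivations above are based on the proof of the discretization error’s mean in the full grid case as given in equations (4.24) and (4.25).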
We continue by calculating the variance of the “pooled” discretization error of $X^d$, which is given by
\[
\begin{aligned}
\operatorname{Var}\Big( \lceil X^d, X^d \rceil_T^{(avg)} - [X^d, X^d]_T \Big)
&= \frac{1}{K^2} \operatorname{Var}\Bigg( \sum_{i=1}^{K} \Big( \lceil X^d, X^d \rceil_T^{(i)} - [X^d, X^d]_T \Big) \Bigg) \\
&= \frac{1}{K^2} \sum_{i=1}^{K} \underbrace{\operatorname{Var}\Big( \lceil X^d, X^d \rceil_T^{(i)} - [X^d, X^d]_T \Big)}_{I_P} \\
&\quad + 2 \cdot \frac{1}{K^2} \sum_{1 \le h < i \le K} \underbrace{\operatorname{Cov}\Big( \lceil X^d, X^d \rceil_T^{(h)} - [X^d, X^d]_T ,\ \lceil X^d, X^d \rceil_T^{(i)} - [X^d, X^d]_T \Big)}_{II_P} .
\end{aligned}
\tag{5.21}
\]
In Part 1.A we will derive the variance given by $I_P$ and in Part 1.B move to calculate the covariance given by $II_P$.
Part 1.A
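To calculate $I_P$ we derive the second moment of $\lceil X^d, X^d \rceil_T^{(i)} - [X^d, X^d]_T$ and receive
\[
\begin{aligned}
&\mathbb{E}\Big( \big( \lceil X^d, X^d \rceil_T^{(i)} - [X^d, X^d]_T \big)^2 \Big) \\
&= \mathbb{E}\Bigg[ \Bigg( \sum_{\substack{t_{j,-},\, t_j \in \mathcal{G}^{(i)} \\ t_j \le T}} \Big( \underbrace{\lceil X^d, X^d \rceil_{t_j}^{(i)} - \lceil X^d, X^d \rceil_{t_{j,-}}^{(i)}}_{=:\ \delta^{(i)} \lceil X^d, X^d \rceil_{t_j}^{(i)}} - \underbrace{\big( [X^d, X^d]_{t_j} - [X^d, X^d]_{t_{j,-}} \big)}_{=:\ \delta^{(i)} [X^d, X^d]_{t_j}} \Big) \\
&\qquad\quad - \Big( [X^d, X^d]_{t_{i-1}} - \underbrace{[X^d, X^d]_0}_{=\,0} + [X^d, X^d]_T - [X^d, X^d]_{t_{i-1+n_K \cdot K}} \Big) \Bigg)^2 \Bigg] \\
&= \underbrace{\mathbb{E}\Bigg( \bigg( \sum_{\substack{t_{j,-},\, t_j \in \mathcal{G}^{(i)} \\ t_j \le T}} \Big( \delta^{(i)} \lceil X^d, X^d \rceil_{t_j}^{(i)} - \delta^{(i)} [X^d, X^d]_{t_j} \Big) \bigg)^2 \Bigg)}_{I.1_P} \\
&\quad - 2 \cdot \underbrace{\mathbb{E}\Bigg( \sum_{\substack{t_{j,-},\, t_j \in \mathcal{G}^{(i)} \\ t_j \le T}} \Big( \delta^{(i)} \lceil X^d, X^d \rceil_{t_j}^{(i)} - \delta^{(i)} [X^d, X^d]_{t_j} \Big) \Bigg)}_{=\,0} \cdot\ \mathbb{E}\Big( [X^d, X^d]_{t_{i-1}} + [X^d, X^d]_T - [X^d, X^d]_{t_{i-1+n_K \cdot K}} \Big) \\
&\quad + \underbrace{\mathbb{E}\Big( \big( [X^d, X^d]_{t_{i-1}} + [X^d, X^d]_T - [X^d, X^d]_{t_{i-1+n_K \cdot K}} \big)^2 \Big)}_{I.2_P},
\end{aligned}
\tag{5.22}
\]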
where the quantities in the second summand of the last equality are independent due to
the fact that the compound Poisson process X d has independent increments and the two
quantities are based on non-overlapping intervals with the result that the second summand
vanishes as the discretization error of X d in the full grid case on an arbitrary interval has
an expectation of zero.
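We continue by deriving $I.1_P$ as given in equation (5.22)
\[
\begin{aligned}
&\mathbb{E}\Bigg( \bigg( \sum_{\substack{t_{j,-},\, t_j \in \mathcal{G}^{(i)} \\ t_j \le T}} \Big( \delta^{(i)} \lceil X^d, X^d \rceil_{t_j}^{(i)} - \delta^{(i)} [X^d, X^d]_{t_j} \Big) \bigg)^2 \Bigg) \\
&\overset{(4.26)}{=} \sum_{\substack{t_{j,-},\, t_j \in \mathcal{G}^{(i)} \\ t_j \le T}} \mathbb{E}\Big( \big( \delta^{(i)} \lceil X^d, X^d \rceil_{t_j}^{(i)} - \delta^{(i)} [X^d, X^d]_{t_j} \big)^2 \Big) \\
&\overset{(4.27b)}{=} \sum_{\substack{t_{j,-},\, t_j \in \mathcal{G}^{(i)} \\ t_j \le T}} 2 \cdot \underbrace{\mathbb{E}\Big( \big( N_{t_j} - N_{t_{j,-}} \big)^2 - \big( N_{t_j} - N_{t_{j,-}} \big) \Big)}_{=\, ( t_j - t_{j,-} )^2 \cdot \lambda^2 \,=\, ( \delta t \cdot K )^2 \cdot \lambda^2} \cdot \big( \mathbb{E}\,\gamma^2 \big)^2 \\
&\overset{|\mathcal{G}^{(i)}| \,=\, n_K}{=} 2 \cdot \underbrace{n_K}_{=\, \frac{n - (K-1)}{K}} \cdot\, ( \lambda\, \delta t )^2 \cdot K^2 \cdot (\operatorname{Var}\gamma)^2 \\
&= 2 \cdot (\lambda T)^2 \cdot \Big( \frac{K}{n} - \frac{(K-1)\, K}{n^2} \Big) \cdot (\operatorname{Var}\gamma)^2,
\end{aligned}
\tag{5.23}
\]
where the first equality is again due to the fact that $X^d$ has independent increments, so that the discretization errors on two disjoint time intervals are independent, and additionally that the discretization errors have zero expectation, which leads to the disappearance of the mixed terms. Furthermore, the second equality results from conditioning on the Poisson process $N_t$, where the calculations are identical to the derivations that lead to equation (4.27b), which we conducted to obtain the variance of the discretization error in the full grid case.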
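Next we calculate $I.2_P$ as given in equation (5.22) and note that, because of the independent increment property of $X^d$, we know that if $t_{i-1} + T - t_{i-1+n_K \cdot K} = T \cdot \frac{(K-1)}{n} = b - a$, then the quantity $\mathbb{E}\Big( \big( [X^d, X^d]_{t_{i-1}} + [X^d, X^d]_T - [X^d, X^d]_{t_{i-1+n_K \cdot K}} \big)^2 \Big)$ is equal to $\mathbb{E}\Big( \big( [X^d, X^d]_b - [X^d, X^d]_a \big)^2 \Big)$. Thus, we calculate the second moment of the quadratic variation of $X^d$ on an arbitrary interval $(a, b\,]$ and receive
\[
\begin{aligned}
\mathbb{E}\Big( \big( [X^d, X^d]_b - [X^d, X^d]_a \big)^2 \Big)
&= \mathbb{E}\Bigg( \bigg( \sum_{i = N_a}^{N_b} \gamma_i^2 \bigg)^2 \Bigg) \\
&= \mathbb{E}\Bigg( \mathbb{E}\bigg( \sum_{i = N_a}^{N_b} \gamma_i^4 + 2 \sum_{N_a \le i < j \le N_b} \gamma_i^2\, \gamma_j^2 \,\bigg|\, N_b - N_a \bigg) \Bigg) \\
&= \mathbb{E}\,( N_b - N_a ) \cdot \mathbb{E}(\gamma^4) + \mathbb{E}\Big( ( N_b - N_a )^2 - ( N_b - N_a ) \Big) \cdot \big( \mathbb{E}\,\gamma^2 \big)^2 \\
&= \lambda\, (b - a)\, \mathbb{E}\,\gamma^4 + \Big( \lambda\, (b - a)\, \big( \lambda\, (b - a) + 1 \big) - \lambda\, (b - a) \Big) \cdot (\operatorname{Var}\gamma)^2 \\
&= \lambda\, (b - a) \cdot \mathbb{E}\,\gamma^4 + \big( \lambda\, (b - a) \big)^2 \cdot (\operatorname{Var}\gamma)^2,
\end{aligned}
\tag{5.24}
\]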
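The last two equalities in (5.24) rest on the Poisson moments $\mathbb{E}(N) = \lambda (b-a)$ and $\mathbb{E}(N^2 - N) = (\lambda (b-a))^2$. As a numerical sanity check (a sketch with hypothetical function names, not part of the derivation), the conditional expression can be averaged over a truncated Poisson distribution and compared with the closed form; `mu` stands for $\lambda (b - a)$, and `m2`, `m4` for $\mathbb{E}\,\gamma^2$ and $\mathbb{E}\,\gamma^4$.

```python
import math


def qv_second_moment_series(mu, m2, m4, k_max=100):
    """Average k*m4 + k*(k-1)*m2^2 over a Poisson(mu) count, truncated at k_max."""
    total = 0.0
    p = math.exp(-mu)                # P(N = 0)
    for k in range(k_max + 1):
        total += p * (k * m4 + k * (k - 1) * m2 ** 2)
        p *= mu / (k + 1)            # recursion P(N = k+1) = P(N = k) * mu / (k+1)
    return total


def qv_second_moment_closed(mu, m2, m4):
    """Closed form from the last line of (5.24)."""
    return mu * m4 + mu ** 2 * m2 ** 2


series_value = qv_second_moment_series(mu=2.5, m2=0.3, m4=0.4)
closed_value = qv_second_moment_closed(mu=2.5, m2=0.3, m4=0.4)
```

For moderate `mu` the truncation error is negligible, so the two values agree to floating-point accuracy.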
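so that $I.2_P$ as given in (5.22) has the following form
\[
\begin{aligned}
&\mathbb{E}\Big( \big( [X^d, X^d]_{t_{i-1}} + [X^d, X^d]_T - [X^d, X^d]_{t_{i-1+n_K \cdot K}} \big)^2 \Big) \\
&\quad = \lambda T \cdot \frac{(K-1)}{n} \cdot \mathbb{E}\,\gamma^4 + \Big( \lambda T \cdot \frac{(K-1)}{n} \Big)^2 \cdot (\operatorname{Var}\gamma)^2 .
\end{aligned}
\tag{5.25}
\]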
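Now, we use $I.1_P$ as detailed in (5.23) as well as $I.2_P$ as detailed in (5.25) to deduce the second moment of $\lceil X^d, X^d \rceil_T^{(i)} - [X^d, X^d]_T$ as detailed in (5.22), and together with its expectation derived in (5.20), we receive the following result for $I_P$ as given in (5.21)
\[
\begin{aligned}
&\operatorname{Var}\Big( \lceil X^d, X^d \rceil_T^{(i)} - [X^d, X^d]_T \Big) \\
&= \mathbb{E}\Big( \big( \lceil X^d, X^d \rceil_T^{(i)} - [X^d, X^d]_T \big)^2 \Big) - \Big( \mathbb{E}\big( \lceil X^d, X^d \rceil_T^{(i)} - [X^d, X^d]_T \big) \Big)^2 \\
&= 2 \cdot (\lambda T)^2 \cdot \Big( \frac{K}{n} - \frac{(K-1)\, K}{n^2} \Big) \cdot (\operatorname{Var}\gamma)^2 \\
&\quad + \lambda T \cdot \frac{K-1}{n} \cdot \mathbb{E}\,\gamma^4 + \Big( \lambda T \cdot \frac{K-1}{n} \Big)^2 \cdot (\operatorname{Var}\gamma)^2 - \Big( \lambda T \cdot \frac{K-1}{n} \cdot \operatorname{Var}\gamma \Big)^2 \\
&= 2 \cdot \frac{K}{n}\, (\lambda T)^2 \cdot (\operatorname{Var}\gamma)^2 + \lambda T \cdot \frac{K-1}{n} \cdot \mathbb{E}\,\gamma^4 + O\Big( \frac{K^2}{n^2} \Big),
\end{aligned}
\tag{5.26}
\]
where the error term $O\big( \frac{K^2}{n^2} \big)$ includes the quantity $(-2) \cdot (\lambda T)^2 \cdot \frac{(K-1)\, K}{n^2} \cdot (\operatorname{Var}\gamma)^2 - \big( \lambda T \cdot \frac{K-1}{n} \cdot \operatorname{Var}\gamma \big)^2$.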
Part 1.B
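We proceed with the calculation of $II_P$ as given in equation (5.21)
\[
\begin{aligned}
&\operatorname{Cov}\Big( \lceil X^d, X^d \rceil_T^{(h)} - [X^d, X^d]_T ,\ \lceil X^d, X^d \rceil_T^{(i)} - [X^d, X^d]_T \Big) \\
&= \mathbb{E}\Big( \big( \lceil X^d, X^d \rceil_T^{(h)} - [X^d, X^d]_T \big) \cdot \big( \lceil X^d, X^d \rceil_T^{(i)} - [X^d, X^d]_T \big) \Big) \\
&\quad - \underbrace{\mathbb{E}\Big( \lceil X^d, X^d \rceil_T^{(h)} - [X^d, X^d]_T \Big) \cdot \mathbb{E}\Big( \lceil X^d, X^d \rceil_T^{(i)} - [X^d, X^d]_T \Big)}_{=\, ( \lambda T \cdot \frac{K-1}{n} )^2 \cdot (\operatorname{Var}\gamma)^2 \text{ analogous to (5.20)}} \\
&= \underbrace{\mathbb{E}\Big( \big( [X^d, X^d]_T \big)^2 \Big)}_{II.1_P} - \underbrace{\mathbb{E}\Big( \lceil X^d, X^d \rceil_T^{(h)} \cdot [X^d, X^d]_T \Big)}_{II.2_P} - \underbrace{\mathbb{E}\Big( \lceil X^d, X^d \rceil_T^{(i)} \cdot [X^d, X^d]_T \Big)}_{II.2_P} \\
&\quad + \underbrace{\mathbb{E}\Big( \lceil X^d, X^d \rceil_T^{(h)} \cdot \lceil X^d, X^d \rceil_T^{(i)} \Big)}_{II.3_P} - \underbrace{\Big( \lambda T \cdot \frac{K-1}{n} \Big)^2 \cdot (\operatorname{Var}\gamma)^2}_{=\, O( K^2 / n^2 )} .
\end{aligned}
\tag{5.27}
\]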
We start out by calculating the three components in (5.27) individually and subsequently combine the outcomes.
The quantity $II.1_P$ in (5.27) is given by the derivations leading to (5.24) and equals
\[
\mathbb{E}\Big( \big( [X^d, X^d]_T \big)^2 \Big) = \lambda T \cdot \mathbb{E}\,\gamma^4 + (\lambda T)^2 \cdot (\operatorname{Var}\gamma)^2,
\tag{5.28}
\]
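so we continue with the evaluation of $II.2_P$ as given in equation (5.27), where due to
\[
[X^d, X^d]_T = [X^d, X^d]_{t_{i-1}} + \Big( [X^d, X^d]_{t_{i-1+n_K \cdot K}} - [X^d, X^d]_{t_{i-1}} \Big) + \Big( [X^d, X^d]_T - [X^d, X^d]_{t_{i-1+n_K \cdot K}} \Big)
\]
the expression $II.2_P$ has the following form
\[
\begin{aligned}
&\mathbb{E}\Big( \lceil X^d, X^d \rceil_T^{(i)} \cdot [X^d, X^d]_T \Big) \\
&= \mathbb{E}\Big( \lceil X^d, X^d \rceil_T^{(i)} \cdot \big( [X^d, X^d]_{t_{i-1+n_K \cdot K}} - [X^d, X^d]_{t_{i-1}} \big) \Big) \\
&\quad + \underbrace{\mathbb{E}\,\lceil X^d, X^d \rceil_T^{(i)}}_{=\, \lambda T \cdot ( 1 - \frac{(K-1)}{n} ) \cdot \operatorname{Var}\gamma} \cdot \Bigg( \underbrace{\mathbb{E}\,[X^d, X^d]_{t_{i-1}}}_{=\, \lambda T \cdot \frac{(i-1)}{n} \cdot \operatorname{Var}\gamma} + \underbrace{\mathbb{E}\Big( [X^d, X^d]_T - [X^d, X^d]_{t_{i-1+n_K \cdot K}} \Big)}_{=\, \lambda T \cdot \frac{(K-i)}{n} \cdot \operatorname{Var}\gamma} \Bigg) \\
&= \underbrace{\mathbb{E}\Bigg( \sum_{\substack{t_{j,-},\, t_j \in \mathcal{G}^{(i)} \\ t_j \le T}} \Big( \lceil X^d, X^d \rceil_{t_j}^{(i)} - \lceil X^d, X^d \rceil_{t_{j,-}}^{(i)} \Big) \cdot \sum_{\substack{t_{j,-},\, t_j \in \mathcal{G}^{(i)} \\ t_j \le T}} \Big( [X^d, X^d]_{t_j} - [X^d, X^d]_{t_{j,-}} \Big) \Bigg)}_{II.2.A_P} \\
&\quad + (\lambda T)^2 \cdot \Big( \frac{(K-1)}{n} - \frac{(K-1)^2}{n^2} \Big) \cdot (\operatorname{Var}\gamma)^2,
\end{aligned}
\tag{5.29}
\]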
where the first equality results from the fact that the two factors in the second summand, $\lceil X^d, X^d \rceil_T^{(i)}$ and $[X^d, X^d]_{t_{i-1}} + [X^d, X^d]_T - [X^d, X^d]_{t_{i-1+n_K \cdot K}}$, are independent by the independent increment property of a compound Poisson process, as they are based on disjoint time intervals.
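Next, we derive the quantity $II.2.A_P$ as given in (5.29)
\[
\begin{aligned}
II.2.A_P
={}& \sum_{\substack{t_{j,-},\, t_j \in \mathcal{G}^{(i)} \\ t_j \le T}} \mathbb{E}\Big( \big( \lceil X^d, X^d \rceil_{t_j}^{(i)} - \lceil X^d, X^d \rceil_{t_{j,-}}^{(i)} \big) \cdot \big( [X^d, X^d]_{t_j} - [X^d, X^d]_{t_{j,-}} \big) \Big) \\
&+ 2 \cdot \sum_{\substack{0 \le t_r < t_s \le T \\ t_r,\, t_s \in \mathcal{G}^{(i)}}} \mathbb{E}\Big( \big( \lceil X^d, X^d \rceil_{t_r}^{(i)} - \lceil X^d, X^d \rceil_{t_{r,-}}^{(i)} \big) \cdot \big( [X^d, X^d]_{t_s} - [X^d, X^d]_{t_{s,-}} \big) \Big) \\
={}& \sum_{\substack{t_{j,-},\, t_j \in \mathcal{G}^{(i)} \\ t_j \le T}} \mathbb{E}\Bigg( \bigg( \sum_{m = N_{t_{j,-}}}^{N_{t_j}} \gamma_m \bigg)^2 \cdot \sum_{m = N_{t_{j,-}}}^{N_{t_j}} \gamma_m^2 \Bigg) \\
&+ 2 \cdot \sum_{\substack{i \le r < s \le (i-1+n_K \cdot K) \\ t_r,\, t_s \in \mathcal{G}^{(i)}}} \underbrace{\mathbb{E}\Bigg( \bigg( \sum_{p = N_{t_{r,-}}}^{N_{t_r}} \gamma_p \bigg)^2 \cdot \sum_{q = N_{t_{s,-}}}^{N_{t_s}} \gamma_q^2 \Bigg)}_{\text{Objects are independent due to disjoint time intervals}} \\
={}& \sum_{\substack{t_{j,-},\, t_j \in \mathcal{G}^{(i)} \\ t_j \le T}} \mathbb{E}\Bigg( \bigg( \sum_{m = N_{t_{j,-}}}^{N_{t_j}} \gamma_m^2 \bigg)^2 \Bigg) + 2 \sum_{\substack{t_{j,-},\, t_j \in \mathcal{G}^{(i)} \\ t_j \le T}} \mathbb{E}\Bigg( \sum_{m = N_{t_{j,-}}}^{N_{t_j}} \ \sum_{N_{t_{j,-}} \le r < s \le N_{t_j}} \underbrace{\mathbb{E}\big( \gamma_m^2 \cdot \gamma_r \cdot \gamma_s \big)}_{=\,0 \text{ as } r \ne s} \Bigg) \\
&+ 2 \cdot \sum_{\substack{i \le r < s \le (i-1+n_K \cdot K) \\ t_r,\, t_s \in \mathcal{G}^{(i)}}} \underbrace{\mathbb{E}\Bigg( \bigg( \sum_{p = N_{t_{r,-}}}^{N_{t_r}} \gamma_p \bigg)^2 \Bigg)}_{=\, ( t_r - t_{r,-} ) \cdot \lambda \cdot \operatorname{Var}\gamma \text{ [s. (4.24)]}} \cdot \underbrace{\mathbb{E}\Bigg( \sum_{q = N_{t_{s,-}}}^{N_{t_s}} \gamma_q^2 \Bigg)}_{=\, ( t_s - t_{s,-} ) \cdot \lambda \cdot \operatorname{Var}\gamma \text{ [s. (4.25)]}} \\
={}& \sum_{\substack{t_{j,-},\, t_j \in \mathcal{G}^{(i)} \\ t_j \le T}} \mathbb{E}\Bigg( \sum_{m = N_{t_{j,-}}}^{N_{t_j}} \mathbb{E}\,\gamma_m^4 + 2 \cdot \sum_{N_{t_{j,-}} \le r < s \le N_{t_j}} \mathbb{E}\,\gamma_r^2 \cdot \mathbb{E}\,\gamma_s^2 \,\Bigg|\, N_{t_j} - N_{t_{j,-}} \Bigg) \\
&+ 2 \cdot \sum_{\substack{i \le r < s \le (i-1+n_K \cdot K) \\ t_r,\, t_s \in \mathcal{G}^{(i)}}} ( t_r - t_{r,-} )\, ( t_s - t_{s,-} ) \cdot \lambda^2 \cdot \big( \mathbb{E}\,\gamma^2 \big)^2 \\
={}& \sum_{\substack{t_{j,-},\, t_j \in \mathcal{G}^{(i)} \\ t_j \le T}} \Bigg( \underbrace{\mathbb{E}\big( N_{t_j} - N_{t_{j,-}} \big)}_{=\, ( t_j - t_{j,-} ) \cdot \lambda}\, \mathbb{E}\,\gamma^4 + \underbrace{\mathbb{E}\Big( \big( N_{t_j} - N_{t_{j,-}} \big)^2 - \big( N_{t_j} - N_{t_{j,-}} \big) \Big)}_{=\, ( t_j - t_{j,-} )^2 \cdot \lambda^2}\, (\operatorname{Var}\gamma)^2 \Bigg) \\
&+ 2 \cdot \sum_{\substack{i \le r < s \le (i-1+n_K \cdot K) \\ t_r,\, t_s \in \mathcal{G}^{(i)}}} ( t_r - t_{r,-} )\, ( t_s - t_{s,-} ) \cdot \lambda^2 \cdot (\operatorname{Var}\gamma)^2 \\
={}& \lambda T\, \Big( 1 - \frac{(K-1)}{n} \Big) \cdot \mathbb{E}\,\gamma^4 + \lambda^2 \underbrace{\sum_{\substack{i \le r,\, s \le (i-1+n_K \cdot K) \\ t_r,\, t_s \in \mathcal{G}^{(i)}}} ( t_r - t_{r,-} )\, ( t_s - t_{s,-} )}_{=\, \big( \sum_{t_{j,-},\, t_j \in \mathcal{G}^{(i)}}^{t_j \le T} ( t_j - t_{j,-} ) \big)^2 \,=\, T^2 \cdot ( 1 - \frac{(K-1)}{n} )^2} \cdot\, (\operatorname{Var}\gamma)^2 \\
={}& \lambda T\, \Big( 1 - \frac{(K-1)}{n} \Big) \cdot \mathbb{E}\,\gamma^4 + (\lambda T)^2 \cdot \Big( 1 - \frac{(K-1)}{n} \Big)^2 \cdot (\operatorname{Var}\gamma)^2,
\end{aligned}
\tag{5.30}
\]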
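so that together with the result in equation (5.30) we can evaluate $II.2_P$ as given in (5.27) and detailed in (5.29) and receive the following result
\[
\mathbb{E}\Big( \lceil X^d, X^d \rceil_T^{(i)} \cdot [X^d, X^d]_T \Big) = \lambda T\, \Big( 1 - \frac{(K-1)}{n} \Big) \cdot \mathbb{E}\,\gamma^4 + (\lambda T)^2 \cdot \Big( 1 - \frac{(K-1)}{n} \Big) \cdot (\operatorname{Var}\gamma)^2 .
\tag{5.31}
\]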
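Finally, we pursue the calculation of $II.3_P$ as given in equation (5.27), and begin by defining $\delta Z_{t_i} := Z_{t_i} - Z_{t_{i-1}}$ as the variation of an arbitrary process $Z$ between successive observations in the full grid $\mathcal{G}^n$. A closer look at the terms $\lceil X^d, X^d \rceil_T^{(i)}$, $i = 1, \dots, K$, reveals that
\[
\begin{aligned}
\lceil X^d, X^d \rceil_T^{(i)}
&= \sum_{\substack{t_{j,-},\, t_j \in \mathcal{G}^{(i)} \\ t_j \le T}} \big( X^d_{t_j} - X^d_{t_{j,-}} \big)^2 \\
&= \sum_{\substack{t_{j,-},\, t_j \in \mathcal{G}^{(i)} \\ t_j \le T}} \bigg( \sum_{(j,-)+1 \le m \le j} \delta X^d_{t_m} \bigg)^2 \\
&= \sum_{\substack{t_{j,-},\, t_j \in \mathcal{G}^{(i)} \\ t_j \le T}} \bigg( \sum_{(j,-)+1 \le m \le j} \big( \delta X^d_{t_m} \big)^2 + 2 \cdot \sum_{(j,-)+1 \le r < s \le j} \delta X^d_{t_r} \cdot \delta X^d_{t_s} \bigg) \\
&\overset{(5.2)}{=} \sum_{m=i}^{i-1+n_K \cdot K} \big( \delta X^d_{t_m} \big)^2 + \sum_{\substack{t_{j,-},\, t_j \in \mathcal{G}^{(i)} \\ t_j \le T}} 2 \cdot \sum_{(j,-)+1 \le r < s \le j} \delta X^d_{t_r} \cdot \delta X^d_{t_s},
\end{aligned}
\tag{5.32}
\]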
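so that we can apply this derivation in the calculation of $II.3_P$ and receive
\[
\begin{aligned}
&\mathbb{E}\Big( \lceil X^d, X^d \rceil_T^{(h)} \cdot \lceil X^d, X^d \rceil_T^{(i)} \Big) \\
&\overset{(5.32)}{=} \mathbb{E}\Bigg( \bigg( \sum_{u=h}^{h-1+n_K \cdot K} \big( \delta X^d_{t_u} \big)^2 + \sum_{\substack{t_{g,-},\, t_g \in \mathcal{G}^{(h)} \\ t_g \le T}} 2 \cdot \sum_{(g,-)+1 \le p < q \le g} \delta X^d_{t_p} \cdot \delta X^d_{t_q} \bigg) \\
&\qquad\quad \cdot \bigg( \sum_{v=i}^{i-1+n_K \cdot K} \big( \delta X^d_{t_v} \big)^2 + \sum_{\substack{t_{j,-},\, t_j \in \mathcal{G}^{(i)} \\ t_j \le T}} 2 \cdot \sum_{(j,-)+1 \le r < s \le j} \delta X^d_{t_r} \cdot \delta X^d_{t_s} \bigg) \Bigg) \\
&= \underbrace{\mathbb{E}\Bigg( \sum_{u=h}^{h-1+n_K \cdot K} \big( \delta X^d_{t_u} \big)^2 \cdot \sum_{v=i}^{i-1+n_K \cdot K} \big( \delta X^d_{t_v} \big)^2 \Bigg)}_{II.3.A_P} \\
&\quad + \sum_{v=i}^{i-1+n_K \cdot K} \ \sum_{\substack{t_{g,-},\, t_g \in \mathcal{G}^{(h)} \\ t_g \le T}} 2 \cdot \sum_{(g,-)+1 \le p < q \le g} \underbrace{\mathbb{E}\Big( \big( \delta X^d_{t_v} \big)^2 \cdot \delta X^d_{t_p} \cdot \delta X^d_{t_q} \Big)}_{=\,0 \text{ as } t_p \ne t_q} \\
&\quad + \sum_{u=h}^{h-1+n_K \cdot K} \ \sum_{\substack{t_{j,-},\, t_j \in \mathcal{G}^{(i)} \\ t_j \le T}} 2 \cdot \sum_{(j,-)+1 \le r < s \le j} \underbrace{\mathbb{E}\Big( \big( \delta X^d_{t_u} \big)^2 \cdot \delta X^d_{t_r} \cdot \delta X^d_{t_s} \Big)}_{=\,0 \text{ as } t_r \ne t_s} \\
&\quad + \underbrace{\mathbb{E}\Bigg( \sum_{\substack{t_{g,-},\, t_g \in \mathcal{G}^{(h)} \\ t_g \le T}} 2 \cdot \sum_{(g,-)+1 \le p < q \le g} \delta X^d_{t_p} \cdot \delta X^d_{t_q} \ \cdot \sum_{\substack{t_{j,-},\, t_j \in \mathcal{G}^{(i)} \\ t_j \le T}} 2 \cdot \sum_{(j,-)+1 \le r < s \le j} \delta X^d_{t_r} \cdot \delta X^d_{t_s} \Bigg)}_{II.3.B_P} .
\end{aligned}
\tag{5.33}
\]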
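Moving on to the derivation of $II.3.A_P$ as given in equation (5.33), we see that since $h < i$, the two sums $\sum_{u=h}^{h-1+n_K \cdot K} \big( \delta X^d_{t_u} \big)^2$ and $\sum_{v=i}^{i-1+n_K \cdot K} \big( \delta X^d_{t_v} \big)^2$ have identical summands when $u = v = i, \dots, h-1+n_K \cdot K$, so that the expectation of the product of these quantities results in a sum with $(n - (K-1))^2$ summands, of which $n - (K-1) - (i-h)$ terms are of form $\mathbb{E}\big( \delta X^d_{t_m} \big)^4$ and $(n - (K-1))^2 - (n - (K-1) - (i-h))$ terms are of form $\mathbb{E}\big( \delta X^d_{t_r} \big)^2 \cdot \mathbb{E}\big( \delta X^d_{t_s} \big)^2$, where $t_s \ne t_r$. We know that $\big( \mathbb{E}\big( \delta X^d_{t_r} \big)^2 \big)^2$ equals $( \lambda\, \delta t\, \operatorname{Var}\gamma )^2$ from the derivations in equation (4.24), and thus continue to calculate the unknown quantity $\mathbb{E}\big( \delta X^d_{t_m} \big)^4$, which equals
\[
\begin{aligned}
\mathbb{E}\big( \delta X^d_{t_m} \big)^4
&= \mathbb{E}\Bigg( \bigg( \sum_{k = N_{t_{m,-}}}^{N_{t_m}} \gamma_k \bigg)^4 \Bigg) \\
&= \mathbb{E}\Bigg( \mathbb{E}\bigg( \sum_{k_1 + \dots + k_{\delta N_{t_m}} = 4} \binom{4}{k_1, \dots, k_{\delta N_{t_m}}}\, \gamma_1^{k_1} \cdots \gamma_{\delta N_{t_m}}^{k_{\delta N_{t_m}}} \,\bigg|\, \delta N_{t_m} \bigg) \Bigg) \\
&= \mathbb{E}\Bigg( \delta N_{t_m}\, \mathbb{E}\,\gamma^4 + \binom{\delta N_{t_m}}{2} \binom{4}{2, 2} \big( \mathbb{E}\,\gamma^2 \big)^2 \Bigg) \\
&= \mathbb{E}\,( \delta N_{t_m} ) \cdot \mathbb{E}\,\gamma^4 + 3 \cdot \mathbb{E}\big( \delta N_{t_m}\, ( \delta N_{t_m} - 1 ) \big) \cdot \big( \mathbb{E}\,\gamma^2 \big)^2 \\
&= \lambda\, \delta t \cdot \mathbb{E}\,\gamma^4 + 3 \cdot ( \lambda\, \delta t )^2\, (\operatorname{Var}\gamma)^2,
\end{aligned}
\tag{5.34}
\]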
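The combinatorial step in (5.34) can be verified exactly in a simple special case. The sketch below assumes, purely for illustration, symmetric two-point jump sizes $\gamma = \pm a$ with equal probability (so $\mathbb{E}\,\gamma^2 = a^2$ and $\mathbb{E}\,\gamma^4 = a^4$); the function names are hypothetical and `mu` stands for $\lambda\, \delta t$.

```python
import math


def cond_fourth_moment(k, a):
    """E[(gamma_1 + ... + gamma_k)^4] for i.i.d. gamma = +/- a, by enumerating
    the number j of positive jumps (binomial weights)."""
    return sum(math.comb(k, j) * (a * (2 * j - k)) ** 4 for j in range(k + 1)) / 2 ** k


def fourth_moment_series(mu, a, k_max=60):
    """Average the conditional fourth moment over a truncated Poisson(mu) count."""
    total = 0.0
    p = math.exp(-mu)                # P(N = 0)
    for k in range(k_max + 1):
        total += p * cond_fourth_moment(k, a)
        p *= mu / (k + 1)            # P(N = k+1) from P(N = k)
    return total


def fourth_moment_closed(mu, a):
    """Last line of (5.34) with E(gamma^4) = a^4 and Var(gamma) = a^2."""
    return mu * a ** 4 + 3 * mu ** 2 * a ** 4


series_value = fourth_moment_series(mu=0.8, a=0.5)
closed_value = fourth_moment_closed(mu=0.8, a=0.5)
```

Conditional on $k$ jumps, the enumeration reproduces $k \cdot \mathbb{E}\,\gamma^4 + 3\, k (k-1) \cdot (\mathbb{E}\,\gamma^2)^2$, which is the third line of (5.34).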
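so that on the basis of this result and the explanations above the quantity $II.3.A_P$ as given in equation (5.33) can be expressed as
\[
\begin{aligned}
&\mathbb{E}\Bigg( \sum_{u=h}^{h-1+n_K \cdot K} \big( \delta X^d_{t_u} \big)^2 \cdot \sum_{v=i}^{i-1+n_K \cdot K} \big( \delta X^d_{t_v} \big)^2 \Bigg) \\
&= ( n - (K-1) - (i-h) ) \cdot \Bigg( 3\, \Big( \lambda T \cdot \frac{1}{n} \Big)^2 (\operatorname{Var}\gamma)^2 + \lambda T \cdot \frac{1}{n} \cdot \mathbb{E}\,\gamma^4 \Bigg) \\
&\quad + \Big( ( n - (K-1) )^2 - ( n - (K-1) - (i-h) ) \Big) \cdot \Big( \lambda T \cdot \frac{1}{n} \Big)^2 (\operatorname{Var}\gamma)^2 \\
&= (\lambda T)^2 \cdot \frac{n^2 - 2 \cdot n \cdot (K-1) + (K-1)^2 + 2 \cdot ( n - (K-1) - (i-h) )}{n^2} \cdot (\operatorname{Var}\gamma)^2 \\
&\quad + \lambda T \cdot \frac{( n - (K-1) - (i-h) )}{n} \cdot \mathbb{E}\,\gamma^4 \\
&= (\lambda T)^2 \cdot \Big( 1 + \frac{2}{n} - 2 \cdot \frac{(K-1)}{n} \Big) \cdot (\operatorname{Var}\gamma)^2 + \lambda T \cdot \Big( 1 - \frac{(K-1) + (i-h)}{n} \Big) \cdot \mathbb{E}\,\gamma^4 + O\Big( \frac{K^2}{n^2} \Big),
\end{aligned}
\tag{5.35}
\]
where the error term $O\big( \frac{K^2}{n^2} \big)$ in the last equality includes the asymptotically negligible quantity $\frac{(K-1)^2 - 2 \cdot (K-1+i-h)}{n^2} \cdot (\lambda T)^2 \cdot (\operatorname{Var}\gamma)^2$. Having derived $II.3.A_P$, the final part missing to derive $II.3_P$ as given in equation (5.27) is the quantity $II.3.B_P$ as given in equation (5.33).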
The derivation of $II.3.B_P$ depends on understanding the relationships between the time increments of the observations of the subgrids $\mathcal{G}^{(h)}$ and $\mathcal{G}^{(i)}$. Two time intervals between successive observations of different subgrids $\mathcal{G}^{(h)}$ and $\mathcal{G}^{(i)}$, say $( t_{h-1+(g-1) \cdot K}, t_{h-1+g \cdot K} ]$ and $( t_{i-1+(j-1) \cdot K}, t_{i-1+j \cdot K} ]$, can either have no intersection, and thus lead to independent increments of a compound Poisson process on these intervals, or have an intersection, and therefore lead to dependent increments of a compound Poisson process. In case two time intervals between observations of $\mathcal{G}^{(h)}$ and $\mathcal{G}^{(i)}$ overlap, the length of their intersection is either $|i - h| \cdot \delta t$ or $(K - |i - h|) \cdot \delta t$. Remembering that $h < i$, we begin the derivation of $II.3.B_P$ as given in equation (5.33)
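\[
\begin{aligned}
&\mathbb{E}\Bigg( \sum_{\substack{t_{g,-},\, t_g \in \mathcal{G}^{(h)} \\ t_g \le T}} 2 \cdot \sum_{(g,-)+1 \le p < q \le g} \delta X^d_{t_p} \cdot \delta X^d_{t_q} \ \cdot \sum_{\substack{t_{j,-},\, t_j \in \mathcal{G}^{(i)} \\ t_j \le T}} 2 \cdot \sum_{(j,-)+1 \le r < s \le j} \delta X^d_{t_r} \cdot \delta X^d_{t_s} \Bigg) \\
&= 4 \cdot \sum_{\substack{t_{g,-},\, t_g \in \mathcal{G}^{(h)} \\ t_g \le T}} \ \sum_{\substack{t_{j,-},\, t_j \in \mathcal{G}^{(i)} \\ t_j \le T}} \ \sum_{(g,-)+1 \le p < q \le g} \ \sum_{(j,-)+1 \le r < s \le j} \mathbb{E}\big( \delta X^d_{t_p} \cdot \delta X^d_{t_q} \cdot \delta X^d_{t_r} \cdot \delta X^d_{t_s} \big) \\
&= 4 \cdot \sum_{l=1}^{n_K} \ \sum_{h+(l-1) \cdot K \le p < q \le h-1+l \cdot K} \ \sum_{i+(l-1) \cdot K \le r < s \le i-1+l \cdot K} \mathbb{E}\big( \delta X^d_{t_p} \cdot \delta X^d_{t_q} \cdot \delta X^d_{t_r} \cdot \delta X^d_{t_s} \big) \\
&\quad + 4 \cdot \sum_{m=1}^{n_K - 1} \ \sum_{h+m \cdot K \le p < q \le h-1+(m+1) \cdot K} \ \sum_{i+(m-1) \cdot K \le r < s \le i-1+m \cdot K} \mathbb{E}\big( \delta X^d_{t_p} \cdot \delta X^d_{t_q} \cdot \delta X^d_{t_r} \cdot \delta X^d_{t_s} \big) \\
&= 4 \cdot n_K \cdot \frac{(K - i + h) \cdot (K - i + h - 1)}{2} \cdot \Big( \mathbb{E}\,( \delta X^d_t )^2 \Big)^2 + 4 \cdot ( n_K - 1 ) \cdot \frac{(i - h) \cdot (i - h - 1)}{2} \cdot \Big( \mathbb{E}\,( \delta X^d_t )^2 \Big)^2 \\
&= 2 \cdot \Big( n_K \cdot \big( K\, (K - 1) + 2\, \big( (i - h)^2 - K \cdot (i - h) \big) \big) - (i - h) \cdot (i - h - 1) \Big) \cdot \underbrace{\Big( \mathbb{E}\,( \delta X^d_t )^2 \Big)^2}_{=\, ( \lambda T \cdot n^{-1} \cdot \operatorname{Var}\gamma )^2} \\
&= 2 \cdot (\lambda T)^2 \cdot \Bigg( \frac{K - 1}{n} + 2 \cdot \frac{(i - h)^2 / K - (i - h)}{n} \Bigg) \cdot (\operatorname{Var}\gamma)^2 + O\Big( \frac{K^2}{n^2} \Big),
\end{aligned}
\tag{5.36}
\]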
where the second equality above is due to the fact that, under the regular allocation of sample points to subgrids as detailed in Definition 7, the time intervals between successive observations of two different subgrids $\mathcal{G}^{(h)}$ and $\mathcal{G}^{(i)}$ have exactly $n_K$ intersections of length $(K-(i-h))\cdot\delta t$, namely when $(t_{h-1+(l-1)\cdot K},\,t_{h-1+l\cdot K}]$ and $(t_{i-1+(l-1)\cdot K},\,t_{i-1+l\cdot K}]$ overlap for $l=1,\dots,n_K$, as well as $(n_K-1)$ intersections of length $(i-h)\cdot\delta t$, namely when $(t_{h-1+m\cdot K},\,t_{h-1+(m+1)\cdot K}]$ and $(t_{i-1+(m-1)\cdot K},\,t_{i-1+m\cdot K}]$ overlap for $m=1,\dots,n_K-1$. All mixed terms $\mathbb{E}\big(\delta X^d_{t_p}\cdot\delta X^d_{t_q}\cdot\delta X^d_{t_r}\cdot\delta X^d_{t_s}\big)$, except those within the intersections noted above, vanish because the increments are independent of each other outside the intersections and $\mathbb{E}(\delta X^d_t)=0$. In addition, let us assume without loss of generality that the overlap of two time increments is of the form $(t_{a-1},\,t_{a-1+z}]$, so that within the intersection there are exactly $\tfrac12\cdot z\cdot(z-1)$ zero-mean products $\delta X^d_{t_r}\cdot\delta X^d_{t_s}$ with $a\le r<s\le a-1+z$. Since the sum of these $\tfrac12\cdot z\cdot(z-1)$ terms is multiplied by the sum of the same number of zero-mean products $\delta X^d_{t_p}\cdot\delta X^d_{t_q}$ with $a\le p<q\le a-1+z$, the expectation of the mixed terms $\mathbb{E}\big(\delta X^d_{t_p}\cdot\delta X^d_{t_q}\cdot\delta X^d_{t_r}\cdot\delta X^d_{t_s}\big)$ within the intersection $(t_{a-1},\,t_{a-1+z}]$ is zero due to independence, unless $r=p$ and $s=q$. This is the case for exactly $\tfrac12\cdot z\cdot(z-1)$ of the mixed terms, whose expectation then has the shape $\mathbb{E}\big((\delta X^d_{t_r})^2\cdot(\delta X^d_{t_s})^2\big)=\big(\mathbb{E}(\delta X^d_t)^2\big)^2$. The third equality is a direct conclusion of the above explanations. Moreover, the $O(K^2/n^2)$ term in the last equality incorporates the quantity
\[
2\cdot\left(-\frac{K^2}{n^2}+\frac{2K(i-h)}{n^2}-\frac{3(i-h)^2}{n^2}-\frac{(i-h)}{n^2}+\frac{2(i-h)^2/K}{n^2}+\frac{2K}{n^2}-\frac{1}{n^2}\right)\cdot(\lambda T\cdot\mathrm{Var}\,\gamma)^2 .
\]
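The counting argument above can be checked by brute force. The following is a minimal sketch, assuming that subgrid number `start` consists of `n_blocks` consecutive blocks of `K` full-grid increments beginning at increment index `start`; the function names are hypothetical and only illustrate the combinatorics, not the stochastic part of the derivation.

```python
from itertools import combinations


def block_pairs(start, n_blocks, K):
    """Index pairs (p, q), p < q, that lie in the same block of K consecutive
    full-grid increments; the blocks of this subgrid begin at index `start`."""
    pairs = set()
    for l in range(n_blocks):
        lo = start + l * K  # first increment index of block l
        pairs.update(combinations(range(lo, lo + K), 2))
    return pairs


def common_pairs(h, i, n_blocks, K):
    """Number of increment pairs shared by subgrid h and subgrid i (h < i)."""
    return len(block_pairs(h, n_blocks, K) & block_pairs(i, n_blocks, K))


def predicted(h, i, n_blocks, K):
    """n_K overlaps of length K-(i-h) and n_K-1 overlaps of length i-h,
    each overlap of length z contributing z*(z-1)/2 pairs."""
    d = i - h
    return (n_blocks * (K - d) * (K - d - 1) // 2
            + (n_blocks - 1) * d * (d - 1) // 2)
```

Only the pairs counted here contribute a non-zero expectation, which is why the factor $n_K\cdot\tfrac12(K-i+h)(K-i+h-1)+(n_K-1)\cdot\tfrac12(i-h)(i-h-1)$ appears in front of $\big(\mathbb{E}(\delta X^d_t)^2\big)^2$.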
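Thus, $\mathrm{II.3}_P$ as given in equation (5.27) can now be derived by joining the results of $\mathrm{II.3.A}_P$ as detailed in (5.35) and $\mathrm{II.3.B}_P$ as detailed in (5.36), which equals
\[
\begin{aligned}
\mathbb{E}\Big(\lceil X^d,X^d\rceil_T^{(h)}\cdot\lceil X^d,X^d\rceil_T^{(i)}\Big)
&= 2\cdot\left(\frac12+\frac1n+2\cdot\frac{(i-h)^2/K-(i-h)}{n}\right)\cdot(\lambda T)^2\cdot(\mathrm{Var}\,\gamma)^2\\
&\quad+\lambda T\cdot\left(1-\frac{(K-1)+(i-h)}{n}\right)\cdot\mathbb{E}\gamma^4+O\!\left(\frac{K^2}{n^2}\right).
\end{aligned}
\tag{5.37}
\]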
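Now it is possible to derive $\mathrm{II}_P$ as given in equation (5.21) and detailed in (5.27). We combine the results for $\mathrm{II.1}_P$ in (5.28), $\mathrm{II.2}_P$ in (5.31) and $\mathrm{II.3}_P$ in (5.37), which leads to
\[
\begin{aligned}
&\mathrm{Cov}\Big(\lceil X^d,X^d\rceil_T^{(h)}-[X^d,X^d]_T,\ \lceil X^d,X^d\rceil_T^{(i)}-[X^d,X^d]_T\Big)\\
&= \lambda T\cdot\mathbb{E}\gamma^4+(\lambda T)^2\cdot(\mathrm{Var}\,\gamma)^2\\
&\quad-2\cdot\lambda T\left(1-\frac{K-1}{n}\right)\cdot\mathbb{E}\gamma^4-2\cdot(\lambda T)^2\cdot\left(1-\frac{K-1}{n}\right)\cdot(\mathrm{Var}\,\gamma)^2\\
&\quad+(\lambda T)^2\cdot\left(1+\frac2n+4\cdot\frac{(i-h)^2/K-(i-h)}{n}\right)\cdot(\mathrm{Var}\,\gamma)^2\\
&\quad+\lambda T\cdot\left(1-\frac{(K-1)+(i-h)}{n}\right)\cdot\mathbb{E}\gamma^4+O\!\left(\frac{K^2}{n^2}\right)\\
&= 2\cdot(\lambda T)^2\cdot\left(\frac Kn+2\cdot\frac{(i-h)^2/K-(i-h)}{n}\right)\cdot(\mathrm{Var}\,\gamma)^2
+\lambda T\cdot\frac{(K-1)-(i-h)}{n}\cdot\mathbb{E}\gamma^4+O\!\left(\frac{K^2}{n^2}\right).
\end{aligned}
\tag{5.38}
\]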
Part 1.C
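Finally, we obtain the variance of the “pooled” discretization error as detailed in (5.21) by inserting the results for $\mathrm{I}_P$ as detailed in (5.26) as well as $\mathrm{II}_P$ as detailed in (5.38):
\[
\begin{aligned}
&\mathrm{Var}\Big(\lceil X^d,X^d\rceil_T^{(avg)}-[X^d,X^d]_T\Big)\\
&= \frac{1}{K^2}\cdot K\cdot\left(2\cdot\frac Kn\,(\lambda T)^2\cdot(\mathrm{Var}\,\gamma)^2+\lambda T\cdot\frac{K-1}{n}\cdot\mathbb{E}\gamma^4+O\!\left(\frac{K^2}{n^2}\right)\right)\\
&\quad+\frac{1}{K^2}\cdot2\cdot\sum_{1\le h<i\le K}\left(2\cdot(\lambda T)^2\cdot\left(\frac Kn+2\cdot\frac{(i-h)^2/K-(i-h)}{n}\right)\cdot(\mathrm{Var}\,\gamma)^2\right)\\
&\quad+\frac{1}{K^2}\cdot2\cdot\sum_{1\le h<i\le K}\left(\lambda T\cdot\frac{(K-1)-(i-h)}{n}\cdot\mathbb{E}\gamma^4+O\!\left(\frac{K^2}{n^2}\right)\right)\\
&= 2\cdot\frac Kn\,(\lambda T)^2\cdot(\mathrm{Var}\,\gamma)^2+\lambda T\cdot\frac{K-1}{n}\cdot\mathbb{E}\gamma^4+O\!\left(\frac{K^2}{n^2}\right)\\
&\quad-\frac{2}{K^2}\cdot\left(4\cdot\frac{(\lambda T)^2}{n}\cdot(\mathrm{Var}\,\gamma)^2+\frac{\lambda T}{n}\cdot\mathbb{E}\gamma^4\right)\cdot\sum_{1\le h<i\le K}(i-h)\\
&\quad+\frac{2}{K^2}\cdot4\cdot\frac{(\lambda T)^2}{n}\cdot(\mathrm{Var}\,\gamma)^2\cdot\frac1K\cdot\sum_{1\le h<i\le K}(i-h)^2,
\end{aligned}
\tag{5.39}
\]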
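and with
\[
\begin{aligned}
\sum_{1\le h<i\le K}(i-h)&=\sum_{m=1}^{K-1}\sum_{x=1}^{m}x=\sum_{m=1}^{K-1}\frac{m\cdot(m+1)}{2}=\frac{K\cdot(K-1)\cdot(K+1)}{6},\\
\sum_{1\le h<i\le K}(i-h)^2&=\sum_{m=1}^{K-1}\sum_{x=1}^{m}x^2=\sum_{m=1}^{K-1}\frac{m\cdot(2m+1)\cdot(m+1)}{2\cdot3}=\frac{K^2\cdot(K-1)\cdot(K+1)}{6\cdot2}
\end{aligned}
\]
it is possible to add up the terms and obtain the following result for the variance of the “pooled” discretization error that was detailed in (5.21):
\[
\begin{aligned}
&\mathrm{Var}\Big(\lceil X^d,X^d\rceil_T^{(avg)}-[X^d,X^d]_T\Big)\\
&= \left(\frac43\cdot\frac Kn+\frac23\cdot\frac{1}{K\cdot n}\right)\cdot(\lambda T)^2\cdot(\mathrm{Var}\,\gamma)^2+O\!\left(\frac{K^2}{n^2}\right)
+\left(\frac23\cdot\frac Kn-\frac1n+\frac13\cdot\frac{1}{K\cdot n}\right)\cdot\lambda T\cdot\mathbb{E}\gamma^4\\
&= \frac43\cdot\frac Kn\cdot\left((\lambda T)^2\cdot(\mathrm{Var}\,\gamma)^2+\frac12\cdot\lambda T\cdot\mathbb{E}\gamma^4\right)+O\!\left(\frac{K^2}{n^2}\right)+O\!\left(\frac1n\right).
\end{aligned}
\tag{5.40}
\]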
This result together with (5.20) gives us the expectation and variance of the discretization error of the process X d on the multi grid.
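The arithmetic leading from (5.39) to (5.40) can be reproduced with exact rational numbers. The sketch below is only a check of the two sums over $1\le h<i\le K$ and of the resulting coefficients with the error terms dropped; the function names are hypothetical.

```python
from fractions import Fraction


def pair_sums(K):
    """Sums of (i-h) and (i-h)^2 over all pairs 1 <= h < i <= K."""
    pairs = [(h, i) for h in range(1, K + 1) for i in range(h + 1, K + 1)]
    s1 = sum(i - h for h, i in pairs)
    s2 = sum((i - h) ** 2 for h, i in pairs)
    return s1, s2


def coefficients(K, n):
    """Coefficients of (lambda T)^2 (Var gamma)^2 and of lambda T E[gamma^4]
    obtained by inserting the two sums into (5.39), error terms dropped."""
    s1, s2 = pair_sums(K)
    K, n = Fraction(K), Fraction(n)
    var_coeff = 2 * K / n - (2 / K**2) * (4 / n) * s1 + (2 / K**2) * (4 / n) * s2 / K
    g4_coeff = (K - 1) / n - (2 / K**2) * (1 / n) * s1
    return var_coeff, g4_coeff
```

For every $K$ the leading terms are $\tfrac43\cdot\tfrac Kn$ and $\tfrac23\cdot\tfrac Kn$; the remaining terms are of order $1/n$ or smaller, which is what the last line of (5.40) absorbs.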
Step 2
The next step is the analysis of the realized covariance of the processes $X^d$ and $X^c$ on the multi grid. The expectation is given by
\[
\mathbb{E}\Big(\lceil X^c,X^d\rceil_T^{(avg)}\Big)=\frac1K\sum_{i=1}^{K}\underbrace{\mathbb{E}\Big(\lceil X^c,X^d\rceil_T^{(i)}\Big)}_{=\,0,\ \text{see (4.29)}}=0,
\tag{5.41}
\]
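and the variance has the following form
\[
\mathrm{Var}\Big(\lceil X^c,X^d\rceil_T^{(avg)}\Big)
=\frac{1}{K^2}\cdot\Bigg(\sum_{i=1}^{K}\underbrace{\mathrm{Var}\Big(\lceil X^c,X^d\rceil_T^{(i)}\Big)}_{\mathrm{VM-V.1}_P}
+2\cdot\sum_{h<i}^{K}\underbrace{\mathrm{Cov}\Big(\lceil X^c,X^d\rceil_T^{(h)},\ \lceil X^c,X^d\rceil_T^{(i)}\Big)}_{\mathrm{VM-V.2}_P}\Bigg).
\tag{5.42}
\]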
Part 2.A
The quantity $\mathrm{VM-V.1}_P$ as given in (5.42) has already been deduced more generally in Section 3.2, equation (4.41), during the derivation of the quantity $\mathrm{Var}\big(\lceil X^c,X^d\rceil_T^{H^n_{eq}}\big)$, where $H^n_{eq}$ is a subgrid with equidistant observations as presented in (4.30). Since the subgrids $\mathcal{G}^{(i)}$, $i=1,\dots,K$, fulfill the assumptions in (4.30), we conclude that $\mathrm{VM-V.1}_P$ equals
\[
\mathrm{Var}\Big(\lceil X^c,X^d\rceil_T^{(i)}\Big)=\lambda T\cdot\frac Kn\cdot\mathrm{Var}(\gamma)\cdot\int_{t_{i-1}}^{t_{i-1+n_K\cdot K}}\mathbb{E}\,\sigma_s^2\,ds+O\!\left(\frac{K^{3/2}}{n^{3/2}}\right).
\]
Effectively, we have
\[
\mathbb{E}\int_{t_{i-1}}^{t_{i-1+n_K\cdot K}}\sigma_s^2\,ds=\mathbb{E}\int_0^T\sigma_s^2\,ds+O\!\left(\frac Kn\right).
\tag{5.43}
\]
This equality holds as $n\cdot\int_{t_{i-1}}^{t_i}\sigma_s^2\,ds$ is dominated by $T\cdot\sup_{s\in[0,T]}\sigma_s^2$ for $t_{i-1},t_i\in\mathcal{G}^n$, where we have $\mathbb{E}\big(T\cdot\sup_{s\in[0,T]}\sigma_s^2\big)<\infty$ due to the boundedness of $\sigma$, and therefore
\[
\limsup_{n}\left(n\int_{t^n_{i-1}}^{t^n_i}\sigma_s^2\,ds\right)\le T\sup_{s\in[0,T]}\sigma_s^2<\infty
\tag{5.44}
\]
so that by the reversed Lemma of Fatou
\[
\limsup_{n}\frac{\mathbb{E}\int_{t^n_{i-1}}^{t^n_i}\sigma_s^2\,ds}{1/n}
\le\mathbb{E}\left(\limsup_{n}\frac{\int_{t^n_{i-1}}^{t^n_i}\sigma_s^2\,ds}{1/n}\right)
\le\mathbb{E}\left(T\sup_{s\in[0,T]}\sigma_s^2\right)<\infty,
\tag{5.45}
\]
and thus $\mathbb{E}\int_{t_{i-1}}^{t_i}\sigma_s^2\,ds$ is of order $O\big(\tfrac1n\big)$, which explains the order of the error term given in the equation above. Hence, $\mathrm{VM-V.1}_P$ as given in (5.42) has the following form
\[
\mathrm{Var}\Big(\lceil X^c,X^d\rceil_T^{(i)}\Big)=\lambda T\cdot\frac Kn\cdot\mathrm{Var}(\gamma)\cdot\mathbb{E}\int_0^T\sigma_s^2\,ds+O\!\left(\frac{K^2}{n^2}\right).
\tag{5.46}
\]
Part 2.B
Before we go on we repeat the definitions for the variation of an arbitrary process $Z$ between two successive observations in the full grid $\mathcal{G}^n$, i.e. $\delta Z_{t_m}:=Z_{t_m}-Z_{t_{m-1}}$, as well as the variation between two successive observations in the subgrid $\mathcal{G}^{(i)}$, which extends over the time interval $(t_{m-K},\,t_m]$, i.e. $\delta^{(i)}Z_{t_m}:=Z_{t_m}-Z_{t_{m,-}}=Z_{t_m}-Z_{t_{m-K}}$, where $t_{m,-},t_m\in\mathcal{G}^{(i)}$.
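Recalling that $h<i$, we begin with the calculation of $\mathrm{VM-V.2}_P$ as given in (5.42):
\[
\begin{aligned}
\mathrm{Cov}\Big(\lceil X^c,X^d\rceil_T^{(i)},\ \lceil X^c,X^d\rceil_T^{(h)}\Big)
&\overset{(4.29)}{=}\mathbb{E}\Big(\lceil X^c,X^d\rceil_T^{(i)}\cdot\lceil X^c,X^d\rceil_T^{(h)}\Big)\\
&=\sum_{\substack{t_{j,-},\,t_j\in\mathcal{G}^{(i)}\\ t_j\le T}}\ \sum_{\substack{t_{g,-},\,t_g\in\mathcal{G}^{(h)}\\ t_g\le T}}\mathbb{E}\Big(\delta^{(i)}X^c_{t_j}\cdot\delta^{(h)}X^c_{t_g}\Big)\cdot\underbrace{\mathbb{E}\Big(\delta^{(i)}X^d_{t_j}\cdot\delta^{(h)}X^d_{t_g}\Big)}_{\mathrm{VM-V.2.A}_P}.
\end{aligned}
\tag{5.47}
\]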
The quantity $\mathrm{VM-V.2.A}_P$ has the following form for $g=h-1+K,\dots,h-1+n_K\cdot K$ and $j=i-1+K,\dots,i-1+n_K\cdot K$:
\[
\mathbb{E}\Big(\delta^{(i)}X^d_{t_j}\cdot\delta^{(h)}X^d_{t_g}\Big)=\sum_{s=j-K+1}^{j}\ \sum_{r=g-K+1}^{g}\mathbb{E}\Big(\delta X^d_{t_s}\cdot\delta X^d_{t_r}\Big),
\]
where the terms $\delta X^d_{t_s}$ and $\delta X^d_{t_r}$ are zero-mean, so that in case $r\neq s$ the expectation of the product $\delta X^d_{t_s}\cdot\delta X^d_{t_r}$ vanishes due to independence. Therefore, with the same line of arguments we presented in conjunction with equation (5.36), on page 53 et seq., regarding the number and size of intersections of time intervals between observations in different subgrids $\mathcal{G}^{(h)}$, $\mathcal{G}^{(i)}$ with $h<i$, and the vanishing of the expectation of mixed increments of the process $X^d$ due to independence, we further obtain
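\[
\begin{aligned}
&\mathrm{Cov}\Big(\lceil X^c,X^d\rceil_T^{(i)},\ \lceil X^c,X^d\rceil_T^{(h)}\Big)\\
&=\sum_{l=1}^{n_K}\mathbb{E}\Big[\big(X^c_{t_{i-1+l\cdot K}}-X^c_{t_{i-1+(l-1)\cdot K}}\big)\big(X^c_{t_{h-1+l\cdot K}}-X^c_{t_{h-1+(l-1)\cdot K}}\big)\Big]\cdot(K-i+h)\cdot\underbrace{\mathbb{E}\big(\delta X^d_t\big)^2}_{=\,\lambda T\cdot\frac1n\cdot\mathrm{Var}(\gamma)}\\
&\quad+\sum_{m=1}^{n_K-1}\mathbb{E}\Big[\big(X^c_{t_{i-1+m\cdot K}}-X^c_{t_{i-1+(m-1)\cdot K}}\big)\big(X^c_{t_{h-1+(m+1)\cdot K}}-X^c_{t_{h-1+m\cdot K}}\big)\Big]\cdot(i-h)\cdot\underbrace{\mathbb{E}\big(\delta X^d_t\big)^2}_{=\,\lambda T\cdot\frac1n\cdot\mathrm{Var}(\gamma)},
\end{aligned}
\tag{5.48}
\]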
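so that it is necessary to find out more about the expectation of the product of increments of $X^c$ as given above in the sums of (5.48), that is a quantity which has the form $\mathbb{E}\big[(X^c_{t_s}-X^c_{t_{s-K}})(X^c_{t_r}-X^c_{t_{r-K}})\big]$ with observation times $t_s,t_{s-K},t_r,t_{r-K}\in\mathcal{G}^n$, $s-K\le r<s$:
\[
\begin{aligned}
&\mathbb{E}\Big[\big(X^c_{t_s}-X^c_{t_{s-K}}\big)\big(X^c_{t_r}-X^c_{t_{r-K}}\big)\Big]\\
&=\mathbb{E}\Big[\big(X^c_{t_s}-X^c_{t_r}+X^c_{t_r}-X^c_{t_{s-K}}\big)\big(X^c_{t_r}-X^c_{t_{s-K}}+X^c_{t_{s-K}}-X^c_{t_{r-K}}\big)\Big]\\
&=\mathbb{E}\Big[\big(X^c_{t_s}-X^c_{t_r}\big)\cdot\big(X^c_{t_r}-X^c_{t_{s-K}}\big)\Big]+\mathbb{E}\Big[\big(X^c_{t_s}-X^c_{t_r}\big)\cdot\big(X^c_{t_{s-K}}-X^c_{t_{r-K}}\big)\Big]\\
&\quad+\mathbb{E}\Big[\big(X^c_{t_r}-X^c_{t_{s-K}}\big)\cdot\big(X^c_{t_{s-K}}-X^c_{t_{r-K}}\big)\Big]+\mathbb{E}\big(X^c_{t_r}-X^c_{t_{s-K}}\big)^2 .
\end{aligned}
\tag{5.49}
\]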
The result for the last summand in the above equation (5.49) follows from (4.32) as well as the subsequent derivations up to page 30, which show that we have
\[
\mathbb{E}\big(X^c_{t_r}-X^c_{t_{s-K}}\big)^2=\int_{t_{s-K}}^{t_r}\mathbb{E}\,\sigma_s^2\,ds+O\!\left(\frac{K^{3/2}}{n^{3/2}}\right).
\tag{5.50}
\]
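Furthermore, the first three summands in equation (5.49) are expectations of the product of increments of the diffusion process $X^c$ on non-overlapping time intervals, i.e. $\mathbb{E}\big[(X^c_{t_a}-X^c_{t_{a-C}})(X^c_{t_b}-X^c_{t_{b-D}})\big]$ such that the observation times have the form $t_a,t_{a-C},t_b,t_{b-D}\in\mathcal{G}^n$ with $C,D\le K$ as well as $(a-C,a]\cap(b-D,b]=\emptyset$, and can be expressed as
\[
\begin{aligned}
\mathbb{E}\Big[\big(X^c_{t_a}-X^c_{t_{a-C}}\big)\big(X^c_{t_b}-X^c_{t_{b-D}}\big)\Big]
&=\underbrace{\mathbb{E}\left(\int_{t_{a-C}}^{t_a}\mu_s\,ds\cdot\int_{t_{b-D}}^{t_b}\mu_s\,ds\right)}_{J.1}
+\underbrace{\mathbb{E}\left(\int_{t_{a-C}}^{t_a}\mu_s\,ds\cdot\int_{t_{b-D}}^{t_b}\sigma_s\,dB_s\right)}_{J.2}\\
&\quad+\underbrace{\mathbb{E}\left(\int_{t_{b-D}}^{t_b}\mu_s\,ds\cdot\int_{t_{a-C}}^{t_a}\sigma_s\,dB_s\right)}_{J.3}
+\underbrace{\mathbb{E}\left(\int_{t_{a-C}}^{t_a}\sigma_s\,dB_s\cdot\int_{t_{b-D}}^{t_b}\sigma_s\,dB_s\right)}_{J.4}.
\end{aligned}
\tag{5.51}
\]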
We apply the Cauchy-Schwarz inequality to $J.1$, $J.2$ and $J.3$. Due to the fact that we have $\mathbb{E}\big(T\cdot\sup_{s\in[0,T]}|\mu_s|\big)^2<\infty$, we use the same line of arguments that we present further below in conjunction with (5.65) and (5.66) (reversed Lemma of Fatou) and come to the conclusion that $\mathbb{E}\big(\int_{t_{x-K}}^{t_x}\mu_s\,ds\big)^2$ is of order $O\big(\tfrac{K^2}{n^2}\big)$. Additionally, we again apply the arguments in conjunction with (5.44) and (5.45) (also reversed Lemma of Fatou) and find that $\mathbb{E}\big(\int_{t_{y-K}}^{t_y}\sigma_s\,dB_s\big)^2=\mathbb{E}\int_{t_{y-K}}^{t_y}\sigma_s^2\,ds$ is of order $O\big(\tfrac Kn\big)$. Consequently, with $C,D\le K$ we can conclude that $J.1=O\big(\tfrac{K^2}{n^2}\big)$, $J.2=O\big(\tfrac{K^{3/2}}{n^{3/2}}\big)$ and $J.3=O\big(\tfrac{K^{3/2}}{n^{3/2}}\big)$.
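Moreover, $J.4$ is equal to
\[
\begin{aligned}
\mathbb{E}\left(\int_{t_{a-C}}^{t_a}\sigma_s\,dB_s\cdot\int_{t_{b-D}}^{t_b}\sigma_s\,dB_s\right)
&=\mathbb{E}\Bigg(\underbrace{\int_0^T\mathbb{1}_{\{s\in(a-C,\,a]\}}\,\sigma_s\,dB_s}_{\alpha}\cdot\underbrace{\int_0^T\mathbb{1}_{\{s\in(b-D,\,b]\}}\,\sigma_s\,dB_s}_{\beta}\Bigg)\\
&=\mathbb{E}\int_0^T\underbrace{\mathbb{1}_{\{s\in(a-C,\,a]\}}\,\mathbb{1}_{\{s\in(b-D,\,b]\}}}_{=\,0}\,\sigma_s^2\,ds=0,
\end{aligned}
\tag{5.52}
\]
where the second equality results from the identity $\alpha\beta=\tfrac12\cdot\big((\alpha+\beta)^2-\alpha^2-\beta^2\big)$ as well as the subsequent application of the Itô isometry, and the last equality is due to $(a-C,a]\cap(b-D,b]=\emptyset$.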
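Due to the derivations in conjunction with equations (5.50), (5.51) and (5.52), we can conclude that the quantity detailed in (5.49) equals
\[
\mathbb{E}\Big[\big(X^c_{t_s}-X^c_{t_{s-K}}\big)\big(X^c_{t_r}-X^c_{t_{r-K}}\big)\Big]=\int_{t_{s-K}}^{t_r}\mathbb{E}\,\sigma_s^2\,ds+O\!\left(\frac{K^{3/2}}{n^{3/2}}\right),
\]
and hence it is possible to directly apply this result to $\mathrm{VM-V.2}_P$ as given in (5.42) and detailed in (5.48), which then has the form
\[
\begin{aligned}
\mathrm{Cov}\Big(\lceil X^c,X^d\rceil_T^{(i)},\ \lceil X^c,X^d\rceil_T^{(h)}\Big)
&=\frac{\lambda T}{n}\cdot\mathrm{Var}(\gamma)\cdot\sum_{l=1}^{n_K}(K-i+h)\int_{t_{i-1+(l-1)\cdot K}}^{t_{h-1+l\cdot K}}\mathbb{E}\,\sigma_s^2\,ds\\
&\quad+\frac{\lambda T}{n}\cdot\mathrm{Var}(\gamma)\cdot\sum_{m=1}^{n_K-1}(i-h)\int_{t_{h-1+m\cdot K}}^{t_{i-1+m\cdot K}}\mathbb{E}\,\sigma_s^2\,ds+O\!\left(\frac{K^{3/2}}{n^{3/2}}\right),
\end{aligned}
\tag{5.53}
\]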
where the error term results from $O\big(K^{3/2}/n^{3/2}\big)\cdot 1/n\cdot n_K\cdot O(K)=O\big(K^{3/2}/n^{3/2}\big)$. Since we further have
\[
\sum_{l=1}^{n_K}\int_{t_{i-1+(l-1)\cdot K}}^{t_{h-1+l\cdot K}}\mathbb{E}\,\sigma_s^2\,ds+\sum_{m=1}^{n_K-1}\int_{t_{h-1+m\cdot K}}^{t_{i-1+m\cdot K}}\mathbb{E}\,\sigma_s^2\,ds
=\int_{t_{i-1}}^{t_{h-1+n_K\cdot K}}\mathbb{E}\,\sigma_s^2\,ds
=\int_0^T\mathbb{E}\,\sigma_s^2\,ds+O(K/n),
\tag{5.54}
\]
where the last equality holds because for $t_{i-1},t_i\in\mathcal{G}^n$ the term $\int_{t_{i-1}}^{t_i}\mathbb{E}\,\sigma_s^2\,ds$ is of order $O(1/n)$, which we verified in conjunction with (5.43). Hence, we include this fact in (5.53) and have
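\[
\begin{aligned}
\mathrm{Cov}\Big(\lceil X^c,X^d\rceil_T^{(i)},\ \lceil X^c,X^d\rceil_T^{(h)}\Big)
&=\frac{\lambda T}{n}\cdot\mathrm{Var}(\gamma)\cdot(K-(i-h))\int_0^T\mathbb{E}\,\sigma_s^2\,ds+O\big(K^2/n^2\big)\\
&\quad+\frac{\lambda T}{n}\cdot\mathrm{Var}(\gamma)\cdot(2\cdot(i-h)-K)\sum_{m=1}^{n_K-1}\int_{t_{h-1+m\cdot K}}^{t_{i-1+m\cdot K}}\mathbb{E}\,\sigma_s^2\,ds,
\end{aligned}
\tag{5.55}
\]
where the error term $O\big(K^{3/2}/n^{3/2}\big)$ from equation (5.53) is included in $O\big(K^2/n^2\big)$ above, which results from the substitution of $\int_{t_{i-1}}^{t_{h-1+n_K\cdot K}}\mathbb{E}\,\sigma_s^2\,ds$ as given in (5.54). Having no further information regarding the process $\sigma$ means that we cannot simplify the expression above any further.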
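Therefore, we go back to (5.53) and assume the process $\sigma$ to be wide-sense stationary, that is $\mathbb{E}(\sigma_s)=\mathbb{E}(\sigma_t)$ and $\mathbb{E}\,\sigma_s^2=\mathbb{E}\,\sigma_t^2$ for all $s,t\in[0,T]$. This implies for $\mathrm{VM-V.2}_P$ as given in (5.42) that
\[
\begin{aligned}
\mathrm{Cov}\Big(\lceil X^c,X^d\rceil_T^{(i)},\ \lceil X^c,X^d\rceil_T^{(h)}\Big)
&=\frac{\lambda T}{n}\cdot\mathrm{Var}(\gamma)\cdot\Big(n_K\,(K-(i-h))^2+(n_K-1)\,(i-h)^2\Big)\cdot\delta t\cdot\mathbb{E}\,\sigma_s^2+O\!\left(\frac{K^{3/2}}{n^{3/2}}\right)\\
&=\frac{\lambda T}{n}\cdot\mathrm{Var}(\gamma)\cdot\Big(K+2\cdot\big((i-h)^2/K-(i-h)\big)\Big)\int_0^T\mathbb{E}\,\sigma_s^2\,ds+O\!\left(\frac{K^2}{n^2}\right),
\end{aligned}
\tag{5.56}
\]
where the error term has changed in the last equality due to the factoring of the quantity $n_K\,(K-(i-h))^2+(n_K-1)\,(i-h)^2$.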
Part 2.C
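The outcome above together with the result for $\mathrm{VM-V.1}_P$ as given in (5.42) and detailed in (5.46) leads to the variance of the realized covariance of the processes $X^d$ and $X^c$ on the multi grid as detailed in (5.42), which is given by
\[
\begin{aligned}
\mathrm{Var}\Big(\lceil X^c,X^d\rceil_T^{(avg)}\Big)
&=\frac{1}{K^2}\sum_{i=1}^{K}\left(\lambda T\cdot\frac Kn\cdot\mathrm{Var}(\gamma)\cdot\int_0^T\mathbb{E}\,\sigma_s^2\,ds+O\!\left(\frac{K^2}{n^2}\right)\right)\\
&\quad+\frac{2}{K^2}\cdot\frac{\lambda T}{n}\cdot\mathrm{Var}(\gamma)\cdot\int_0^T\mathbb{E}\,\sigma_s^2\,ds\cdot\sum_{h<i}^{K}\Big(K+2\cdot\big((i-h)^2/K-(i-h)\big)\Big),
\end{aligned}
\]
where we have
\[
\frac{2}{K^2\cdot n}\sum_{h<i}^{K}\Big(K+2\cdot\big((i-h)^2/K-(i-h)\big)\Big)=\frac23\cdot\frac Kn-\frac1n+\frac13\cdot\frac{1}{K\cdot n},
\tag{5.57}
\]
and therefore
\[
\mathrm{Var}\Big(\lceil X^c,X^d\rceil_T^{(avg)}\Big)=\frac23\cdot\frac Kn\cdot\lambda T\cdot\mathrm{Var}(\gamma)\cdot\int_0^T\mathbb{E}\,\sigma_s^2\,ds+O\!\left(\frac{K^2}{n^2}\right)+O\!\left(\frac{1}{K\cdot n}\right).
\tag{5.58}
\]
This result together with (5.41) gives us the expectation and variance of the realized covariance of the processes $X^d$ and $X^c$ on the multi grid. □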
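The coefficient of $\lambda T\cdot\mathrm{Var}(\gamma)\cdot\int_0^T\mathbb{E}\,\sigma_s^2\,ds$ in (5.58) can be checked with exact fractions. The sketch below assumes the leading terms of (5.46) and (5.56) with all error terms dropped; the function name is hypothetical.

```python
from fractions import Fraction


def cross_term_coefficient(K, n):
    """Coefficient of lambda*T*Var(gamma)*int_0^T E[sigma_s^2] ds obtained by
    inserting (5.46) and (5.56) into (5.42), error terms dropped."""
    K_f, n_f = Fraction(K), Fraction(n)
    # K identical variance terms, each with coefficient K/n, weighted by 1/K^2
    diagonal = (1 / K_f**2) * K * (K_f / n_f)
    # covariance terms over all pairs h < i, weighted by 2/(K^2 n)
    off_diagonal = sum(
        K_f + 2 * (Fraction((i - h) ** 2) / K_f - (i - h))
        for h in range(1, K + 1)
        for i in range(h + 1, K + 1)
    )
    return diagonal + (2 / (K_f**2 * n_f)) * off_diagonal
```

The result equals $\tfrac23\cdot\tfrac Kn+\tfrac13\cdot\tfrac{1}{K\cdot n}$ for every $K\ge2$, so the leading term is $\tfrac23\cdot\tfrac Kn$ and the remainder is of order $1/(K\cdot n)$.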
Proof of Lemma 9:
Define $D_T:=\lceil X^c,X^c\rceil_T^{(avg)}-[X^c,X^c]_T$. For the expectation it is possible to use the same methods leading to equation (4.43) in the proof of Lemma 5, where analogous derivations result in
\[
\mathbb{E}(D_T)=O\!\left(\frac Kn\right).
\tag{5.59}
\]
As detailed in the second section, $D_T=A_T+M_T$ can be decomposed into its finite variation component, $A_T$, and its martingale component, $M_T$. We use the square integrability of $M$ and obtain $\mathrm{Var}(M_T)=\mathbb{E}\,M_T^2=\mathbb{E}([M,M]_T)=\mathbb{E}([D,D]_T)$. Utilizing a result for the quantity $[D,D]_T$ from Zhang et al. (2005), Theorem 2, section 3.4 (p. 1401), we obtain the following for constant $\sigma$:
\[
\begin{aligned}
\mathbb{E}([D,D]_T)&=\mathbb{E}\left(\frac Kn\cdot T\cdot\left(\frac43+o(1)\right)\cdot\sum_{i=1}^{n}\sigma_{t_i}^4\cdot\delta t\right)+o\!\left(\frac Kn\right)\\
&=\mathbb{E}\left(\frac Kn\cdot T\cdot\frac43\cdot\int_0^T\sigma^4\,dt\right)+o\!\left(\frac Kn\right)\\
&=\frac43\cdot\frac Kn\cdot T\cdot\int_0^T\sigma^4\,dt+o\!\left(\frac Kn\right).
\end{aligned}
\tag{5.60}
\]
Considering that in the variance of $D_T$ the quantities $\mathrm{Var}(A_T)$ and $\mathrm{Cov}(A_T,M_T)$ are of negligible order compared to $\mathrm{Var}(M_T)$, the assertion holds. □
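Proof of Theorem 2:
We begin by first taking a look at $\lceil X,X\rceil_T^{(avg)}$, which equals
\[
\begin{aligned}
\lceil X,X\rceil_T^{(avg)}&=\frac1K\sum_{i=1}^{K}\lceil X,X\rceil_T^{(i)}\\
&=\frac1K\sum_{i=1}^{K}\Big(\lceil X^c,X^c\rceil_T^{(i)}+\lceil X^d,X^d\rceil_T^{(i)}+2\cdot\lceil X^c,X^d\rceil_T^{(i)}\Big)\\
&=\lceil X^c,X^c\rceil_T^{(avg)}+\lceil X^d,X^d\rceil_T^{(avg)}+2\cdot\lceil X^c,X^d\rceil_T^{(avg)} .
\end{aligned}
\tag{5.61}
\]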
The order of the expectation of the discretization error, $\lceil X,X\rceil_T^{(avg)}-[X,X]_T$, on the multi grid $\mathcal{G}^n$ now follows directly from Lemmas 8 and 9, and the variance has the following form
\[
\begin{aligned}
&\mathrm{Var}\Big(\lceil X,X\rceil_T^{(avg)}-[X,X]_T\Big)\\
&=\underbrace{\mathrm{Var}\Big(\lceil X^c,X^c\rceil_T^{(avg)}-[X^c,X^c]_T\Big)}_{\mathrm{VM-I}_P}
+\underbrace{\mathrm{Var}\Big(\lceil X^d,X^d\rceil_T^{(avg)}-[X^d,X^d]_T\Big)}_{\mathrm{VM-II}_P}\\
&\quad+2\cdot\underbrace{\mathrm{Cov}\Big(\lceil X^c,X^c\rceil_T^{(avg)}-[X^c,X^c]_T,\ \lceil X^d,X^d\rceil_T^{(avg)}-[X^d,X^d]_T\Big)}_{=\,0\ \text{[due to independence]}}\\
&\quad+4\cdot\underbrace{\mathrm{Cov}\Big(\lceil X^d,X^d\rceil_T^{(avg)}-[X^d,X^d]_T,\ \lceil X^c,X^d\rceil_T^{(avg)}\Big)}_{\mathrm{VM-III}_P}\\
&\quad+4\cdot\underbrace{\mathrm{Cov}\Big(\lceil X^c,X^c\rceil_T^{(avg)}-[X^c,X^c]_T,\ \lceil X^c,X^d\rceil_T^{(avg)}\Big)}_{\mathrm{VM-IV}_P}
+4\cdot\underbrace{\mathrm{Var}\Big(\lceil X^c,X^d\rceil_T^{(avg)}\Big)}_{\mathrm{VM-V}_P}.
\end{aligned}
\tag{5.62}
\]
Lemmas 8 and 9 provide insight into the quantities $\mathrm{VM-I}_P$, $\mathrm{VM-II}_P$ as well as $\mathrm{VM-V}_P$, where we adapt the results in Lemma 8 to constant $\mu$ and $\sigma$.
We proceed to derive $\mathrm{VM-III}_P$ and $\mathrm{VM-IV}_P$ as given in (5.62), where we derive these for general processes $\mu$ and $\sigma$ as given in (2.5).
We start by evaluating $\mathrm{VM-III}_P$:
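\[
\mathrm{Cov}\Big(\lceil X^d,X^d\rceil_T^{(avg)}-[X^d,X^d]_T,\ \lceil X^c,X^d\rceil_T^{(avg)}\Big)
=\frac{1}{K^2}\cdot\sum_{i=1}^{K}\sum_{h=1}^{K}\underbrace{\mathrm{Cov}\Big(\lceil X^d,X^d\rceil_T^{(i)}-[X^d,X^d]_T,\ \lceil X^c,X^d\rceil_T^{(h)}\Big)}_{\mathrm{VM-III.1}_P},
\tag{5.63}
\]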
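and further evaluate the term $\mathrm{VM-III.1}_P$ as given in (5.63):
\[
\begin{aligned}
&\mathrm{Cov}\Big(\lceil X^d,X^d\rceil_T^{(i)}-[X^d,X^d]_T,\ \lceil X^c,X^d\rceil_T^{(h)}\Big)\\
&\overset{(4.29)}{=}\mathbb{E}\Big(\big(\lceil X^d,X^d\rceil_T^{(i)}-[X^d,X^d]_T\big)\cdot\lceil X^c,X^d\rceil_T^{(h)}\Big)\\
&=\mathbb{E}\Bigg[\Bigg(\sum_{\substack{t_{j,-},\,t_j\in\mathcal{G}^{(i)}\\ t_j\le T}}\Big(\sum_{k=N_{t_{j,-}}}^{N_{t_j}}\gamma_k\Big)^2-\sum_{k=0}^{N_T}\gamma_k^2\Bigg)\cdot\sum_{\substack{t_{g,-},\,t_g\in\mathcal{G}^{(h)}\\ t_g\le T}}\big(X^c_{t_g}-X^c_{t_{g,-}}\big)\cdot\big(X^d_{t_g}-X^d_{t_{g,-}}\big)\Bigg]\\
&=\mathbb{E}\Bigg[\Bigg(2\cdot\sum_{\substack{t_{j,-},\,t_j\in\mathcal{G}^{(i)}\\ t_j\le T}}\ \sum_{N_{t_{j,-}}\le k<l\le N_{t_j}}\gamma_k\cdot\gamma_l-\sum_{k=0}^{N_{\min\mathcal{G}^{(i)}}}\gamma_k^2-\sum_{k=N_{\max\mathcal{G}^{(i)}}}^{N_T}\gamma_k^2\Bigg)\cdot\sum_{\substack{t_{g,-},\,t_g\in\mathcal{G}^{(h)}\\ t_g\le T}}\big(X^c_{t_g}-X^c_{t_{g,-}}\big)\cdot\sum_{m=N_{t_{g,-}}}^{N_{t_g}}\gamma_m\Bigg]\\
&=2\cdot\sum_{\substack{t_{j,-},\,t_j\in\mathcal{G}^{(i)}\\ t_j\le T}}\ \sum_{N_{t_{j,-}}\le k<l\le N_{t_j}}\ \sum_{\substack{t_{g,-},\,t_g\in\mathcal{G}^{(h)}\\ t_g\le T}}\big(X^c_{t_g}-X^c_{t_{g,-}}\big)\cdot\sum_{m=N_{t_{g,-}}}^{N_{t_g}}\underbrace{\mathbb{E}(\gamma_k\cdot\gamma_l\cdot\gamma_m)}_{=\,0\ \text{as}\ k\neq l}\\
&\quad-\mathbb{E}\underbrace{\Bigg(\Bigg(\sum_{k=0}^{N_{\min\mathcal{G}^{(i)}}}\gamma_k^2+\sum_{k=N_{\max\mathcal{G}^{(i)}}}^{N_T}\gamma_k^2\Bigg)\cdot\sum_{\substack{t_{g,-},\,t_g\in\mathcal{G}^{(h)}\\ t_g\le T}}\big(X^c_{t_g}-X^c_{t_{g,-}}\big)\cdot\sum_{m=N_{t_{g,-}}}^{N_{t_g}}\gamma_m\Bigg)}_{J_P},
\end{aligned}
\tag{5.64}
\]
so that it is necessary to distinguish cases regarding the values of $h$ and $i$ to evaluate
\[
\mathbb{E}\Bigg(\Bigg(\sum_{k=0}^{N_{\min\mathcal{G}^{(i)}}}\gamma_k^2+\sum_{k=N_{\max\mathcal{G}^{(i)}}}^{N_T}\gamma_k^2\Bigg)\cdot\lceil X^c,X^d\rceil_T^{\mathcal{G}^{(h)}}\Bigg)=\mathbb{E}(J_P).
\]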
Beginning with the case $i=h$, it is apparent that $\mathbb{E}(J_P)$ in equation (5.64) is zero, due to the fact that $X^c$ and $X^d$ are independent, and additionally the time intervals $[0,\min\mathcal{G}^{(i)}]\cup(\max\mathcal{G}^{(i)},T]$ and $(\min\mathcal{G}^{(i)},\max\mathcal{G}^{(i)}]$ are disjoint so that the two factors of $J_P$ are independent, which makes the expectation of the product vanish as $\mathbb{E}\big(\lceil X^c,X^d\rceil_T^{\mathcal{G}^{(h)}}\big)\overset{(4.29)}{=}0$.
In case $h<i$ we have an intersection of the intervals between the observations of the two factors of $J_P$ on the time interval $[0,t_{i-1}]\cap(t_{h-1},t_{h-1+n_K\cdot K}]=(t_{h-1},t_{i-1}]$, as $\min\mathcal{G}^{(i)}\overset{(5.2)}{=}t_{i-1}$ for all $i=1,\dots,K$. Thus, again due to the independence of $X^c$ and $X^d$ as well as the independent increment property of the compound Poisson process, only those mixed terms $\gamma_k^2\cdot\gamma_m$ of $J_P$ have non-zero expectation for which $k=m$, and those are all within the intersection $(t_{h-1},t_{i-1}]$, so that together with (5.64)
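\[
\begin{aligned}
&\mathrm{Cov}\Big(\lceil X^d,X^d\rceil_T^{(i)}-[X^d,X^d]_T,\ \lceil X^c,X^d\rceil_T^{(h)}\Big)\\
&=-\,\mathbb{E}\Bigg(\sum_{k=N_{t_{h-1}}}^{N_{t_{i-1}}}\gamma_k^2\cdot\big(X^c_{t_{h-1+K}}-X^c_{t_{h-1}}\big)\cdot\gamma_k\Bigg)\\
&=-\,\mathbb{E}\Bigg(\mathbb{E}\Bigg(\sum_{k=N_{t_{h-1}}}^{N_{t_{i-1}}}\gamma_k^3\cdot\big(X^c_{t_{h-1+K}}-X^c_{t_{h-1}}\big)\ \Bigg|\ N_{t_{i-1}}-N_{t_{h-1}}\Bigg)\Bigg)\\
&=-\,\mathbb{E}\big(N_{t_{i-1}}-N_{t_{h-1}}\big)\cdot\mathbb{E}\,\gamma^3\cdot\mathbb{E}\big(X^c_{t_{h-1+K}}-X^c_{t_{h-1}}\big)\\
&=-\,\frac{\lambda T}{n}\cdot(i-h)\cdot\mathbb{E}\,\gamma^3\cdot\mathbb{E}\left(\int_{t_{h-1}}^{t_{h-1+K}}\mu_s\,ds\right),
\end{aligned}
\]
as $\mathbb{E}\big(\int_{t_{h-1}}^{t_{h-1+K}}\sigma_s\,dB_s\big)=0$ with the assumption that $\mathbb{E}\int_0^T\sigma_s^2\,ds<\infty$.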
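In case $i<h$ we use the same arguments as above, and because the time overlap is $(t_{i-1+n_K\cdot K},T]\cap(t_{h-1},t_{h-1+n_K\cdot K}]=(t_{i-1+n_K\cdot K},t_{h-1+n_K\cdot K}]$ we obtain
\[
\begin{aligned}
&\mathrm{Cov}\Big(\lceil X^d,X^d\rceil_T^{(i)}-[X^d,X^d]_T,\ \lceil X^c,X^d\rceil_T^{(h)}\Big)\\
&=-\,\mathbb{E}\Bigg(\sum_{k=N_{t_{i-1+n_K\cdot K}}}^{N_{t_{h-1+n_K\cdot K}}}\gamma_k^2\cdot\big(X^c_{t_{h-1+n_K\cdot K}}-X^c_{t_{h-1+(n_K-1)\cdot K}}\big)\cdot\gamma_k\Bigg)\\
&=-\,\frac{\lambda T}{n}\cdot(h-i)\cdot\mathbb{E}\,\gamma^3\cdot\mathbb{E}\left(\int_{t_{h-1+(n_K-1)\cdot K}}^{t_{h-1+n_K\cdot K}}\mu_s\,ds\right).
\end{aligned}
\]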
For $A_t=\int_0^t\mu_s\,ds$ and $t^n_{l-1},t^n_l\in\mathcal{G}^n$ the quantity $n\cdot\delta A_{t^n_l}=n\cdot\int_{t^n_{l-1}}^{t^n_l}\mu_s\,ds$ is dominated by the random variable $A^*_T:=T\cdot\sup_{s\in[0,T]}|\mu_s|$, which is integrable due to the boundedness of the process $\mu$. Therefore, we have
\[
\limsup_{n}\frac{|\delta A_{t_l}|}{1/n}\le A^*_T<\infty
\tag{5.65}
\]
so that by the reversed Lemma of Fatou
\[
\limsup_{n}\frac{\mathbb{E}\,|\delta A_{t_l}|}{1/n}\le\mathbb{E}\left(\limsup_{n}\frac{|\delta A_{t_l}|}{1/n}\right)\le\mathbb{E}(A^*_T)<\infty,
\tag{5.66}
\]
which shows that $\mathbb{E}\int_{t_{l-1}}^{t_l}\mu_s\,ds$ is also of order $O\big(\tfrac1n\big)$, and hence for $t_{l-K},t_l\in\mathcal{G}^n$ we have $\mathbb{E}\int_{t_{l-K}}^{t_l}\mu_s\,ds=O\big(\tfrac Kn\big)$. Consequently, it is possible to apply this fact to the derivations above and conclude that the order of $\mathrm{VM-III.1}_P$ as given in (5.63) and detailed in (5.64) is
\[
\mathrm{Cov}\Big(\lceil X^d,X^d\rceil_T^{(i)}-[X^d,X^d]_T,\ \lceil X^c,X^d\rceil_T^{(h)}\Big)=O\!\left(\frac{K^2}{n^2}\right),
\tag{5.67}
\]
and as a result $\mathrm{VM-III}_P$ as given in (5.62) and detailed in (5.63) is of order
\[
\mathrm{Cov}\Big(\lceil X^d,X^d\rceil_T^{(avg)}-[X^d,X^d]_T,\ \lceil X^c,X^d\rceil_T^{(avg)}\Big)=O\!\left(\frac{K^2}{n^2}\right),
\tag{5.68}
\]
which means that the quantity $\mathrm{VM-III}_P$ is asymptotically negligible compared to the other summands of which $\mathrm{Var}\big(\lceil X,X\rceil_T^{(avg)}-[X,X]_T\big)$ as detailed in (5.62) is comprised.
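Next, we calculate $\mathrm{VM-IV}_P$ as given in (5.62), which equals
\[
\mathrm{Cov}\Big(\lceil X^c,X^c\rceil_T^{(avg)}-[X^c,X^c]_T,\ \lceil X^c,X^d\rceil_T^{(avg)}\Big)
=\frac{1}{K^2}\cdot\sum_{i=1}^{K}\sum_{h=1}^{K}\underbrace{\mathrm{Cov}\Big(\lceil X^c,X^c\rceil_T^{(i)}-[X^c,X^c]_T,\ \lceil X^c,X^d\rceil_T^{(h)}\Big)}_{=\,0}=0,
\tag{5.69}
\]
due to the fact that
\[
\begin{aligned}
&\mathrm{Cov}\Big(\lceil X^c,X^c\rceil_T^{(i)}-[X^c,X^c]_T,\ \lceil X^c,X^d\rceil_T^{(h)}\Big)\\
&\overset{(4.29)}{=}\mathbb{E}\Big(\big(\lceil X^c,X^c\rceil_T^{(i)}-[X^c,X^c]_T\big)\cdot\lceil X^c,X^d\rceil_T^{(h)}\Big)\\
&=\mathbb{E}\Bigg(\big(\lceil X^c,X^c\rceil_T^{(i)}-[X^c,X^c]_T\big)\cdot\sum_{\substack{t_{g,-},\,t_g\in\mathcal{G}^{(h)}\\ t_g\le T}}\big(X^c_{t_g}-X^c_{t_{g,-}}\big)\cdot\big(X^d_{t_g}-X^d_{t_{g,-}}\big)\Bigg)\\
&=\sum_{\substack{t_{g,-},\,t_g\in\mathcal{G}^{(h)}\\ t_g\le T}}\mathbb{E}\Big(\big(\lceil X^c,X^c\rceil_T^{(i)}-[X^c,X^c]_T\big)\cdot\big(X^c_{t_g}-X^c_{t_{g,-}}\big)\Big)\cdot\underbrace{\mathbb{E}\big(X^d_{t_g}-X^d_{t_{g,-}}\big)}_{=\,0}\\
&=0.
\end{aligned}
\tag{5.70}
\]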
Final Step
Finally, we combine the outcomes and observe that the discretization error of $X$ on the multi grid, $\lceil X,X\rceil_T^{(avg)}-[X,X]_T$, is the sum of the quantities $\lceil X^c,X^c\rceil_T^{(avg)}-[X^c,X^c]_T$, $\lceil X^d,X^d\rceil_T^{(avg)}-[X^d,X^d]_T$ as well as $2\cdot\lceil X^c,X^d\rceil_T^{(avg)}$. The error from the respective covariances of these three summands is of comparatively negligible order due to the independence of $X^c$ and $X^d$ as well as (5.68) and (5.69). Hence, we can add the variances of the quantities above and obtain the assertion of the theorem. □
Theorem 2 hence gives a result regarding the variance of the discretization error of $X$ on the multi grid, $\lceil X,X\rceil_T^{(avg)}-[X,X]_T$. We established an understanding of the discretization error in case the underlying logarithmic price process $X$ is a Poisson Jump Diffusion process with constant drift and diffusion coefficients, and have thus partially extended the results from Zhang et al. (2005). An important outcome of this derivation is that, when the quadratic variation is evaluated via the realized volatility, the variance of the discretization error of the Itô process is of the same order as the variance of the discretization error resulting from the compound Poisson process.
It is particularly remarkable that, regardless of the grid we use, the same constant factor appears in the variance of the discretization error for the Itô process as well as the compound Poisson process. Additionally, irrespective of the behavior of the subgrid frequency $K$, the variance of the discretization error on the multi grid as given in (5.14) differs from the variance of the discretization error on the full grid as given in (4.17) only by a factor of $2/3\cdot K$. These observations are principally due to the choice of the grid and, to some extent, to the independent increment property of the Poisson process as well as the Brownian motion. This means that the constant factors in the discretization error depend mainly on the characteristics of the grid along which the realized volatility estimators are observed and constructed.
Taking the full and multi grid as an example, we give some facts to back the rationale presented above and explain where the factor $4/3\cdot K$ in the variance of the discretization error in the multi grid case comes from. From equation (5.32) we know that
\[
\lceil X,X\rceil_T^{(i)}=\sum_{m=i}^{n-K+i}(\delta X_{t_m})^2+\sum_{j=1}^{n_K}2\cdot\sum_{i+(j-1)\cdot K\,\le\,r<s\,\le\,i+j\cdot K}\delta X_{t_r}\cdot\delta X_{t_s}.
\]
Summing over $i=1,\dots,K$, multiplying by $1/K$ and ignoring the end effects coming from the first and last $K-1$ observation times leads to
\[
\lceil X,X\rceil_T^{(avg)}=\lceil X,X\rceil_T^{\mathcal{G}^n}+\sum_{i=K-1}^{n}\delta X_{t_i}\cdot2\cdot\sum_{j=1}^{K-1}\left(1-\frac jK\right)\cdot\delta X_{t_{i-j}}+O_p\!\left(\frac Kn\right).
\]
Considering that together with Proposition 1 we know that $\lceil X,X\rceil_T^{\mathcal{G}^n}-[X,X]_T=O_p\big(\tfrac{1}{\sqrt n}\big)$, we come to the conclusion that
\[
\mathrm{Var}\Big(\lceil X,X\rceil_T^{(avg)}-[X,X]_T\Big)
=\mathrm{Var}\Bigg(\sum_{i=K-1}^{n}\delta X_{t_i}\cdot2\cdot\sum_{j=1}^{K-1}\left(1-\frac jK\right)\cdot\delta X_{t_{i-j}}+O_p\!\left(\frac Kn\right)+O_p\!\left(\frac{1}{\sqrt n}\right)\Bigg).
\]
Now, the reason the factor $4/3\cdot K$ appears in the variance of the discretization error in the multi grid case for both the Itô as well as the compound Poisson process is that during the further derivation of the variance the following quantity emerges:
\[
4\cdot\sum_{j=1}^{K-1}\left(1-\frac jK\right)^2=\frac43\cdot K+O(1).
\]
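The weight sum above has a closed form that makes the $O(1)$ remainder explicit. The following is a minimal sketch using exact rational arithmetic; the function name is hypothetical.

```python
from fractions import Fraction


def weight_sum(K):
    """Exact value of 4 * sum_{j=1}^{K-1} (1 - j/K)^2."""
    return 4 * sum((1 - Fraction(j, K)) ** 2 for j in range(1, K))
```

Since $\sum_{j=1}^{K-1}(1-j/K)^2=\tfrac{(K-1)(2K-1)}{6K}$, the value equals $\tfrac43\cdot K-2+\tfrac{2}{3K}$, so the remainder is bounded and the leading behaviour is $\tfrac43\cdot K$.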
Consequently, it becomes apparent that the way the grid and the estimator are constructed, that is the way observation times are allocated to grids and realized volatility estimators are combined, plays a vital role in the magnitude of the variance of the discretization error both on the full and on the multi grid. Thus, it is necessary to emphasize that the results that have been derived regarding the discretization error only hold in case of a regular allocation of observation times to grids as presented in Definitions 3 and 7.
5.4
Total Error and Optimal Subgrid Frequency on the Multi Grid
In this section we have presented the error resulting from the noise term and derived the variance of the discretization error in the multi grid case. We proceed by detailing the variance of the total error of the observed multi grid estimator $\lceil Y,Y\rceil_T^{(avg)}$ when estimating the quadratic variation.
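Corollary 2 Assume the conditions of Lemma 6 and Theorem 2. Then the total error from noise and discretization on the multi grid is given by
\[
\mathbb{E}\Big(\lceil Y,Y\rceil_T^{(avg)}-[X,X]_T\Big)=2\,n_K\cdot\mathbb{E}\,\varepsilon^2+o\!\left(\sqrt{\frac Kn}\right),
\tag{5.71a}
\]
\[
\mathrm{Var}\Big(\lceil Y,Y\rceil_T^{(avg)}-[X,X]_T\Big)=\Xi_M^2+o\!\left(\frac Kn\right),
\tag{5.71b}
\]
where $\Xi_M^2$ has the form
\[
\begin{aligned}
\Xi_M^2&=\underbrace{4\cdot\frac{n_K}{K}\cdot\mathbb{E}\,\varepsilon^4+\frac1K\Big(8\cdot\mathbb{E}\,\lceil X,X\rceil_T^{(avg)}\cdot\mathbb{E}\,\varepsilon^2-2\cdot\mathrm{Var}\,\varepsilon^2\Big)}_{\text{due to noise}}\\
&\quad+\underbrace{\frac43\cdot\frac Kn\cdot T\int_0^T\sigma^4\,ds+\frac43\cdot\frac Kn\cdot\left((\lambda T)^2\cdot(\mathrm{Var}\,\gamma)^2+\frac12\cdot\lambda T\cdot\mathbb{E}\gamma^4\right)+\frac83\cdot\frac Kn\cdot\lambda T\cdot\mathrm{Var}(\gamma)\cdot\int_0^T\sigma^2\,ds}_{\text{due to discretization}}.
\end{aligned}
\tag{5.72}
\]
Proof: This is a direct result of Lemma 6 and Theorem 2, where it is possible to combine the outcomes since the $\varepsilon$'s and the $X$ process are independent. The detailed derivations are analogous to those we presented in the proof of Corollary 1. □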
We can observe that the variance of the total error is minimal if the variances resulting from noise and from discretization are of the same order, that is if we have $a\cdot n_K/K=1/n_K$, where $a$ is a constant. Hence, we can conclude that the variance of the total error of the multi grid estimator $\lceil Y,Y\rceil_T^{(avg)}$ when estimating the quadratic variation $[X,X]_T$ is minimal if we choose $K=O\big(n^{2/3}\big)$. We will come to a similar conclusion for the number of subgrids $K$ in the next section when we analyze the unbiased “two time scale” estimator.
6
Estimating Quadratic Variation with Bias Correction
Although the analysis in the last sections has shown that it is possible to build an improved estimator through averaging and subsampling, we are still confronted with a bias resulting from the contamination through market microstructure noise. Therefore, we want to make further adjustments to the estimator $\lceil Y,Y\rceil_T^{(avg)}$ so that we obtain a bias correction, which results in a consistent estimation of the quadratic variation of the latent true logarithmic price process $X$. Furthermore, we present the variance of the total estimation error of quadratic variation for this asymptotically unbiased estimator that we term “two time scale”.
6.1
Bias Corrected Two Time Scale Estimator
To achieve the bias correction for the “two time scale” estimator, Zhang et al. (2005) present a method which uses the full grid as well as the multi grid such that the result is a bias corrected estimator using both time scales. We begin the construction of this “two time scale” estimator by referring to Lemma 2, where a consistent estimator of $\mathbb{E}\,\varepsilon^2$ was introduced,
\[
\widehat{\mathbb{E}(\varepsilon^2)}=\frac{1}{2n}\cdot\lceil Y,Y\rceil_T^{\mathcal{G}^n},
\]
so that $2\cdot n_K\cdot\mathbb{E}\,\varepsilon^2$, which is the bias of the estimator $\lceil Y,Y\rceil_T^{(avg)}$, can be consistently estimated by the term $2\cdot n_K\cdot\widehat{\mathbb{E}(\varepsilon^2)}$. This fact is used by Zhang et al. (2005) to present the bias-adjusted estimator $\widehat{[X,X]}_T$ defined as
\[
\widehat{[X,X]}_T:=\lceil Y,Y\rceil_T^{(avg)}-2\cdot n_K\cdot\widehat{\mathbb{E}(\varepsilon^2)}
=\lceil Y,Y\rceil_T^{(avg)}-\frac{n_K}{n}\cdot\lceil Y,Y\rceil_T^{\mathcal{G}^n},
\tag{6.1}
\]
which uses the full and multi grid estimator, and hence is based on two time scales. To understand the reason why this estimator is unbiased, we utilize (4.3) as well as (5.5) and for arbitrary constants $a,b\in\mathbb{R}$ consider
\[
\begin{aligned}
\mathbb{E}\Big(a\cdot\lceil Y,Y\rceil_T^{(avg)}-b\cdot\frac{n_K}{n}\cdot\lceil Y,Y\rceil_T^{\mathcal{G}^n}\ \Big|\ X\Big)
&=a\cdot\Big(\lceil X,X\rceil_T^{(avg)}+2\cdot n_K\cdot\mathbb{E}\,\varepsilon^2\Big)-b\cdot\frac{n_K}{n}\cdot\Big(\lceil X,X\rceil_T^{\mathcal{G}^n}+2\cdot n\cdot\mathbb{E}\,\varepsilon^2\Big)\\
&=a\cdot\lceil X,X\rceil_T^{(avg)}-b\cdot\frac{n_K}{n}\cdot\lceil X,X\rceil_T^{\mathcal{G}^n}+2\cdot(a-b)\cdot n_K\cdot\mathbb{E}\,\varepsilon^2,
\end{aligned}
\tag{6.2}
\]
where the bias is eliminated when $a=b$, and for this reason $\widehat{[X,X]}_T$ is defined as in (6.1).
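The construction in (6.1) can be illustrated with a short computation on a vector of observed log prices. The sketch below is an illustration under stated assumptions, not the thesis' own implementation: the observations are assumed to sit on a regular grid, subgrid $i$ is taken to be every $K$-th observation starting at index $i-1$, and $n_K$ is taken to be the average subgrid size $(n-K+1)/K$ as in Zhang et al. (2005); the function names are hypothetical.

```python
def realized_variance(y, step=1, offset=0):
    """Sum of squared increments of y sampled at offset, offset+step, ..."""
    sub = y[offset::step]
    return sum((b - a) ** 2 for a, b in zip(sub, sub[1:]))


def two_time_scale(y, K):
    """Bias-adjusted estimator: averaged subgrid realized variance minus
    (n_K / n) times the full-grid realized variance."""
    n = len(y) - 1                      # number of full-grid increments
    avg = sum(realized_variance(y, K, i) for i in range(K)) / K
    n_bar = (n - K + 1) / K             # assumed average subgrid size n_K
    return avg - (n_bar / n) * realized_variance(y)
```

For example, `two_time_scale([0, 1, 3, 6, 10], 2)` averages the subgrid values 58 and 25 and subtracts 1.5/4 times the full-grid value 30. The subtraction removes the noise term $2\cdot n_K\cdot\mathbb{E}\,\varepsilon^2$ in expectation, which corresponds to the choice $a=b$ in (6.2).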
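Lemma 10 Assume $X$ to be a process of form (2.4) and $Y_{t_i}$, $\varepsilon_{t_i}$ to be defined as in (4.1). Then, the expectation and variance of the “two time scale” estimator, $\widehat{[X,X]}_T$, conditional on the process $X$, are given by
\[
\mathbb{E}\Big(\widehat{[X,X]}_T\ \Big|\ X\Big)=\lceil X,X\rceil_T^{(avg)}-\frac{n_K}{n}\cdot\lceil X,X\rceil_T^{\mathcal{G}^n},
\tag{6.3}
\]
and respectively
\[
\mathrm{Var}\Big(\widehat{[X,X]}_T\ \Big|\ X\Big)=\frac1K\Big(8\cdot\lceil X,X\rceil_T^{(avg)}\cdot\mathbb{E}\,\varepsilon^2-2\cdot\mathrm{Var}\,\varepsilon^2\Big)+8\cdot\frac{n_K}{K}\cdot\big(\mathbb{E}\,\varepsilon^2\big)^2+o_p\!\left(\frac1K\right).
\tag{6.4}
\]
Proof: The conditional expectation follows directly from (6.2), and the derivation of the conditional variance is given in Zhang et al. (2005), subsection A.1 (p. 1407-8). □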
This result provides evidence that the “two time scale” estimator really is an unbiased
estimator of the quadratic variation of the true return process, [X, X]T . Unlike the biased
estimators that we analyzed in the last sections that rather captured the noise from market
microstructure than the variation of the true underlying process, this estimator relates
solely to the true realized volatility on the full and multi grid.
6.2
Minimizing the Variance of the Estimation Error for the Two Time
Scale Estimator
The next Lemma originates from Zhang et al. (2005), and is an extension of the Lemmas
3 and 7 given on pages 20 and 42 respectively. We will utilize this more general result
afterwards to uncover the total error of the “two time scale” estimator.
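Lemma 11 Assume the conditions of Lemma 7. Then, conditional on the $X$ process,
\[
\left.\begin{pmatrix}\sqrt{\dfrac{K}{n_K}}\cdot\Big(\lceil Y,Y\rceil_T^{(avg)}-\lceil X,X\rceil_T^{(avg)}-2\cdot n_K\cdot\mathbb{E}\,\varepsilon^2\Big)\\[2ex]\sqrt n\cdot\Big(\widehat{\mathbb{E}(\varepsilon^2)}-\mathbb{E}\,\varepsilon^2\Big)\end{pmatrix}\ \right|\ X\ \xrightarrow{\ \mathcal{L}\ }\ G,
\]
where the limiting random variable $G$ is independent of $X$ and is bivariate normal with the following distribution
\[
\mathcal{N}\left(0,\ \begin{pmatrix}4\,\mathbb{E}\,\varepsilon^4&2\,\mathrm{Var}\,\varepsilon^2\\ 2\,\mathrm{Var}\,\varepsilon^2&\mathbb{E}\,\varepsilon^4\end{pmatrix}\right).
\]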
Proof: This result originates from Theorem A.1, subsection A.2 (p. 1408-9), of Zhang et al. (2005), where the proof remains unchanged in case the process $X$ is of form (2.4). This is due to the fact that the proof of Theorem A.1 uses Lemma A.2 (a), subsection A.2 (p. 1408), which is also valid for the Jump Diffusion process $X$ as given in (2.4) without change. Apart from using Lemma A.2 (a), the proof of Theorem A.1 in Zhang et al. (2005) is conditional on the underlying process $X$ and is thus unrelated to the nature of the $X$ process. □
The following Theorem is of true importance as it provides insight regarding the question of the optimal number of subgrids K as n tends to infinity, and most importantly
reveals the variance of the total estimation error for the “two time scale” estimator when
evaluating the quadratic variation.
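Theorem 3 Assume the conditions of Lemma 7 and Theorem 2. Then, with respect to the bias-adjusted estimator $\widehat{[X,X]}_T$ the optimal choice of $K$ as $n\to\infty$ is
\[
K=c\cdot n^{2/3}.
\]
Given this result, we then have
\[
\mathbb{E}\Big(\widehat{[X,X]}_T-[X,X]_T\Big)=O\big(n^{-1/3}\big),
\tag{6.5a}
\]
\[
\mathrm{Var}\Big(\widehat{[X,X]}_T-[X,X]_T\Big)=n^{-1/3}\cdot\big(c\cdot\Sigma^2+c^{-2}\cdot\Xi_1^2\big)+n^{-2/3}\cdot c^{-1}\cdot\Xi_2^2+o\big(n^{-2/3}\big),
\tag{6.5b}
\]
where $\Xi_1^2$, $\Xi_2^2$ and $\Sigma^2$ have the form
\[
\Xi_1^2=8\cdot\big(\mathbb{E}\,\varepsilon^2\big)^2,\qquad
\Xi_2^2=8\cdot\mathbb{E}\,\lceil X,X\rceil_T^{(avg)}\cdot\mathbb{E}\,\varepsilon^2-2\cdot\mathrm{Var}\,\varepsilon^2,
\]
and respectively
\[
\Sigma^2=\frac43\cdot T\int_0^T\sigma^4\,ds+\frac43\cdot\left((\lambda T)^2\cdot(\mathrm{Var}\,\gamma)^2+\frac12\cdot\lambda T\cdot\mathbb{E}\gamma^4\right)+\frac83\cdot\lambda T\cdot\mathrm{Var}(\gamma)\cdot\int_0^T\sigma^2\,ds .
\]
The optimal $c$ that minimizes the variance for large $n$ is given by
\[
c^*=\left(\frac{2\cdot\Xi_1^2}{\Sigma^2}\right)^{1/3}.
\tag{6.6}
\]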
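The constant in (6.6) follows from minimizing the $n^{-1/3}$ coefficient of (6.5b) over $c$: setting the derivative $\Sigma^2-2\cdot c^{-3}\cdot\Xi_1^2$ to zero gives $c^3=2\cdot\Xi_1^2/\Sigma^2$. A minimal numerical sketch, with hypothetical function names and arbitrary positive inputs standing in for $\Sigma^2$ and $\Xi_1^2$:

```python
def leading_variance(c, sigma2, xi1_2):
    """n^(-1/3) coefficient of (6.5b) as a function of c."""
    return c * sigma2 + xi1_2 / c**2


def optimal_c(sigma2, xi1_2):
    """Minimizer of leading_variance over c > 0, as in (6.6)."""
    return (2.0 * xi1_2 / sigma2) ** (1.0 / 3.0)
```

Because the coefficient is convex in $c>0$, the stationary point is the global minimum; larger noise ($\Xi_1^2$) pushes $c^*$ up, while a larger discretization constant ($\Sigma^2$) pushes it down.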
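Proof: Considering the total estimation error for the “two time scale” estimator when evaluating the quadratic variation reveals that
\[
\widehat{[X,X]}_T-[X,X]_T=\Big(\widehat{[X,X]}_T-\lceil X,X\rceil_T^{(avg)}\Big)+\Big(\lceil X,X\rceil_T^{(avg)}-[X,X]_T\Big).
\]
The analysis of the difference between the unbiased “two time scale” and multi grid estimator, $\widehat{[X,X]}_T-\lceil X,X\rceil_T^{(avg)}$, leads to the following result for its scaled asymptotic distribution, conditional on the process $X$:
\[
\begin{aligned}
&\underbrace{\big(K/n_K\big)^{1/2}}_{=\,O(K/n^{1/2})}\Big(\widehat{[X,X]}_T-\lceil X,X\rceil_T^{(avg)}\Big)\\
&=\left(\frac{K}{n_K}\right)^{1/2}\Big(\lceil Y,Y\rceil_T^{(avg)}-2\cdot n_K\cdot\widehat{\mathbb{E}(\varepsilon^2)}-\lceil X,X\rceil_T^{(avg)}\Big)\\
&=\left(\frac{K}{n_K}\right)^{1/2}\Big(\lceil Y,Y\rceil_T^{(avg)}-2\cdot n_K\cdot\mathbb{E}\,\varepsilon^2-\lceil X,X\rceil_T^{(avg)}\Big)-\underbrace{\big(K\cdot n_K\big)^{1/2}}_{=\,O(\sqrt n)}\cdot2\cdot\Big(\widehat{\mathbb{E}(\varepsilon^2)}-\mathbb{E}\,\varepsilon^2\Big)\\
&\xrightarrow{\ \mathcal{L}\ }\ \mathcal{N}\Big(0,\ 8\,\big(\mathbb{E}\,\varepsilon^2\big)^2\Big).
\end{aligned}
\tag{6.7}
\]
The convergence given above holds according to Lemma 11, where the variance of the limiting random variable follows from the equality $4\cdot\mathbb{E}\,\varepsilon^4-2^3\cdot\mathrm{Var}\,\varepsilon^2+2^2\cdot\mathbb{E}\,\varepsilon^4=8\cdot\big(\mathbb{E}\,\varepsilon^2\big)^2$.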
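The limiting variance in (6.7) is the quadratic form of the covariance matrix from Lemma 11 with the weight vector $(1,-2)$. The sketch below evaluates it exactly for given second and fourth noise moments; the function name and the moment arguments `m2`, `m4` are hypothetical.

```python
from fractions import Fraction


def limit_variance(m2, m4):
    """Variance of G1 - 2*G2 for the bivariate normal limit of Lemma 11,
    where m2 = E[eps^2] and m4 = E[eps^4]."""
    var_eps2 = m4 - m2**2               # Var(eps^2)
    cov = [[4 * m4, 2 * var_eps2],
           [2 * var_eps2, m4]]
    w = (1, -2)
    return sum(w[a] * cov[a][b] * w[b] for a in range(2) for b in range(2))
```

Expanding the quadratic form gives $4\,m_4-8\,(m_4-m_2^2)+4\,m_4=8\,m_2^2$, so the fourth moment cancels and only $\big(\mathbb{E}\,\varepsilon^2\big)^2$ remains.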
Furthermore, due to the asymptotic result in (6.7) as well as Proposition 3, the order of the total estimation error for the “two time scale” estimator $\widehat{[X,X]}_T$ when evaluating the true quadratic variation $[X,X]_T$ can be given by
$$\begin{aligned}
\widehat{[X,X]}_T - [X,X]_T &= \left(\widehat{[X,X]}_T - \lceil X,X\rceil_T^{(avg)}\right) + \left(\lceil X,X\rceil_T^{(avg)} - [X,X]_T\right)\\
&= O_p\left(\frac{n^{1/2}}{K}\right) + O_p\left(\frac{K^{1/2}}{n^{1/2}}\right).
\end{aligned} \tag{6.8}$$
Minimizing the total error above by balancing the magnitude of the two sources of error requires $O_p\left(K^{1/2}/n^{1/2}\right) = O_p\left(n^{1/2}/K\right)$, so that
$$K = c\cdot n^{2/3}, \quad\text{which implies that}\quad \widehat{[X,X]}_T - [X,X]_T = O_p\left(n^{-1/6}\right).$$
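The balancing step can be written out explicitly. The following worked computation is only a sketch of the rate argument: it ignores the constant $c$, which does not affect the exponents.

```latex
\frac{K^{1/2}}{n^{1/2}} = \frac{n^{1/2}}{K}
\;\Longleftrightarrow\; K^{3/2} = n
\;\Longleftrightarrow\; K = n^{2/3},
\qquad\text{and then}\qquad
\frac{K^{1/2}}{n^{1/2}} = \frac{n^{1/3}}{n^{1/2}} = n^{-1/6},
\quad
\frac{n^{1/2}}{K} = \frac{n^{1/2}}{n^{2/3}} = n^{-1/6}.
```

Both error sources are therefore of the same order $n^{-1/6}$ at the balanced choice of $K$; any other power of $n$ makes one of the two terms strictly larger.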
With this choice for $K$, the expectation given in (6.5a) now follows directly from the expectations given in Lemma 10 and in Theorem 2, where the error term $O\left(n^{-1/3}\right) = O\left(K/n\right)$ from (5.13) also includes the quantity $\frac{n_K}{n}\cdot\lceil X,X\rceil_T^{\mathcal{G}_n} = O(1/K) = O\left(n^{-2/3}\right)$ from (6.3).
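The variance of the total estimation error for the “two time scale” estimator when evaluating the quadratic variation has the form
$$\begin{aligned}
\operatorname{Var}\left(\widehat{[X,X]}_T - [X,X]_T\right)
&= \operatorname{Var}\left(\widehat{[X,X]}_T - \lceil X,X\rceil_T^{(avg)} + \lceil X,X\rceil_T^{(avg)} - [X,X]_T\right)\\
&= \underbrace{\operatorname{Var}\left(\widehat{[X,X]}_T - \lceil X,X\rceil_T^{(avg)}\right)}_{I} + \underbrace{\operatorname{Var}\left(\lceil X,X\rceil_T^{(avg)} - [X,X]_T\right)}_{\text{given in (5.14)}}\\
&\quad + 2\cdot\underbrace{\operatorname{Cov}\left(\widehat{[X,X]}_T - \lceil X,X\rceil_T^{(avg)},\; \lceil X,X\rceil_T^{(avg)} - [X,X]_T\right)}_{II}.
\end{aligned} \tag{6.9}$$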
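We begin with the derivation of the quantity $I$:
$$\begin{aligned}
I &= \operatorname{Var}\left(\widehat{[X,X]}_T\right) + \operatorname{Var}\left(\lceil X,X\rceil_T^{(avg)}\right) - 2\cdot\operatorname{Cov}\left(\widehat{[X,X]}_T,\; \lceil X,X\rceil_T^{(avg)}\right)\\
&= \underbrace{\mathbb{E}\left(\operatorname{Var}\left(\widehat{[X,X]}_T \,\middle|\, X\right)\right)}_{\text{given in (6.4)}} + \underbrace{\operatorname{Var}\left(\mathbb{E}\left(\widehat{[X,X]}_T \,\middle|\, X\right)\right)}_{\text{given in (6.3)}} + \operatorname{Var}\left(\lceil X,X\rceil_T^{(avg)}\right)\\
&\quad \underbrace{-\,2\cdot\operatorname{Cov}\left(\widehat{[X,X]}_T,\; \lceil X,X\rceil_T^{(avg)}\right)}_{I.1},
\end{aligned} \tag{6.10}$$
with
$$\operatorname{Var}\left(\mathbb{E}\left(\widehat{[X,X]}_T \,\middle|\, X\right)\right) \overset{(6.3)}{=} \operatorname{Var}\left(\lceil X,X\rceil_T^{(avg)}\right) + \underbrace{\frac{n_K^2}{n^2}\cdot\operatorname{Var}\left(\lceil X,X\rceil_T^{\mathcal{G}_n}\right)}_{=\,O\left(1/K^2\right)\cdot O(1)\,=\,O\left(n^{-4/3}\right)} - 2\cdot\frac{n_K}{n}\cdot\operatorname{Cov}\left(\lceil X,X\rceil_T^{(avg)},\; \lceil X,X\rceil_T^{\mathcal{G}_n}\right),$$
and
$$\begin{aligned}
I.1 &\overset{(6.1)}{=} -2\cdot\operatorname{Cov}\left(\lceil Y,Y\rceil_T^{(avg)} - \frac{n_K}{n}\cdot\lceil Y,Y\rceil_T^{\mathcal{G}_n},\; \lceil X,X\rceil_T^{(avg)}\right)\\
&= -2\cdot\operatorname{Cov}\left(\lceil X,X\rceil_T^{(avg)} - \frac{n_K}{n}\cdot\lceil X,X\rceil_T^{\mathcal{G}_n},\; \lceil X,X\rceil_T^{(avg)}\right)\\
&= -2\cdot\operatorname{Var}\left(\lceil X,X\rceil_T^{(avg)}\right) + 2\cdot\frac{n_K}{n}\cdot\operatorname{Cov}\left(\lceil X,X\rceil_T^{(avg)},\; \lceil X,X\rceil_T^{\mathcal{G}_n}\right),
\end{aligned}$$
where the second equality above is because of the form of $\lceil Y,Y\rceil_T^{\mathcal{G}_n}$ and $\lceil Y,Y\rceil_T^{(avg)}$ as given in (4.2) and (5.4) as well as the fact that the $\epsilon$'s are independent of $X$ with $\mathbb{E}(\epsilon) = 0$.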
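Substituting the derivations above into (6.10) leads to the following result for the quantity $I$ as given in equation (6.9):
$$\operatorname{Var}\left(\widehat{[X,X]}_T - \lceil X,X\rceil_T^{(avg)}\right) = \mathbb{E}\left(\operatorname{Var}\left(\widehat{[X,X]}_T \,\middle|\, X\right)\right) + O\left(n^{-4/3}\right). \tag{6.11}$$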
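Next, we evaluate the quantity $II$ as given in equation (6.9) and obtain
$$\begin{aligned}
II &\overset{(6.1)}{=} \operatorname{Cov}\left(\lceil Y,Y\rceil_T^{(avg)} - \frac{n_K}{n}\cdot\lceil Y,Y\rceil_T^{\mathcal{G}_n} - \lceil X,X\rceil_T^{(avg)},\; \lceil X,X\rceil_T^{(avg)} - [X,X]_T\right)\\
&\overset{(5.4)}{=} \operatorname{Cov}\left(\lceil \epsilon,\epsilon\rceil_T^{(avg)} + 2\cdot\lceil X,\epsilon\rceil_T^{(avg)} - \frac{n_K}{n}\cdot\lceil Y,Y\rceil_T^{\mathcal{G}_n},\; \lceil X,X\rceil_T^{(avg)} - [X,X]_T\right)\\
&= -\frac{n_K}{n}\cdot\operatorname{Cov}\left(\lceil X,X\rceil_T^{\mathcal{G}_n},\; \lceil X,X\rceil_T^{(avg)} - [X,X]_T\right),
\end{aligned}$$
so that, by the Cauchy-Schwarz inequality,
$$\left|II\right| \le \underbrace{\frac{n_K}{n}}_{=\,O(K^{-1})\,=\,O(n^{-2/3})}\cdot\Bigg(\underbrace{\operatorname{Var}\left(\lceil X,X\rceil_T^{\mathcal{G}_n}\right)}_{=\,O(1)}\cdot\underbrace{\operatorname{Var}\left(\lceil X,X\rceil_T^{(avg)} - [X,X]_T\right)}_{=\,O(K/n)\,=\,O(n^{-1/3})}\Bigg)^{1/2} = O\left(n^{-5/6}\right) = o\left(n^{-2/3}\right), \tag{6.12}$$
where the third equality is again due to the form of $\lceil Y,Y\rceil_T^{\mathcal{G}_n}$ and the fact that the $\epsilon$'s are independent of $X$ with $\mathbb{E}(\epsilon) = 0$.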
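Combining the results for $I$ as given in (6.11) and $II$ as given in (6.12), we obtain the following result for the variance as detailed in equation (6.9):
$$\begin{aligned}
\operatorname{Var}\left(\widehat{[X,X]}_T - [X,X]_T\right)
&= \mathbb{E}\left(\operatorname{Var}\left(\widehat{[X,X]}_T \,\middle|\, X\right)\right) + \operatorname{Var}\left(\lceil X,X\rceil_T^{(avg)} - [X,X]_T\right) + o\left(n^{-2/3}\right)\\
&\overset{(6.3)\,\&\,(5.14)}{=} n^{-1/3}\cdot\left(c\cdot\Sigma^2 + c^{-2}\cdot\Xi_1^2\right) + n^{-2/3}\cdot c^{-1}\cdot\Xi_2^2 + o\left(n^{-2/3}\right),
\end{aligned} \tag{6.13}$$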
where we set $K = c\cdot n^{2/3}$ due to the derivations above. Effectively, the quantity $n^{-1/3}\cdot c^{-2}\cdot\Xi_1^2 + n^{-2/3}\cdot c^{-1}\cdot\Xi_2^2$ in the variance is due to microstructure noise and the quantity $n^{-1/3}\cdot c\cdot\Sigma^2$ is due to discretization effects. For the adjustment of the constant $c$ in the variance, we consider the fact that the quantity $n^{-2/3}\cdot c^{-1}\cdot\Xi_2^2$ is asymptotically negligible for large $n$ compared to the other variance components, so that the derivation of the optimal $c^{*}$ that minimizes the variance when $n$ is large is simply a minimization of $c^{-2}\cdot\Xi_1^2 + c\cdot\Sigma^2$ as a function of $c$.
$\square$
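The minimization that closes the proof can be spelled out as a worked computation. The function name $f$ is introduced here only for illustration, and the sketch assumes $\Xi_1^2 > 0$ and $\Sigma^2 > 0$ so that the minimizer is interior.

```latex
\begin{aligned}
f(c) &= c\,\Sigma^2 + c^{-2}\,\Xi_1^2, \qquad c > 0,\\
f'(c) &= \Sigma^2 - 2\,c^{-3}\,\Xi_1^2 = 0
\;\Longleftrightarrow\; c^3 = \frac{2\,\Xi_1^2}{\Sigma^2}
\;\Longleftrightarrow\; c^{*} = \left(\frac{2\,\Xi_1^2}{\Sigma^2}\right)^{1/3},\\
f''(c) &= 6\,c^{-4}\,\Xi_1^2 > 0,\\
f(c^{*}) &= \frac{3}{2}\,c^{*}\,\Sigma^2 = \frac{3}{2}\left(2\,\Xi_1^2\right)^{1/3}\left(\Sigma^2\right)^{2/3}.
\end{aligned}
```

The stationary point agrees with (6.6), and the positive second derivative confirms that it is a minimum. The last line follows from $c^{*-2}\,\Xi_1^2 = \tfrac{1}{2}\,c^{*}\,\Sigma^2$ and gives the leading constant of the minimized variance, which is then multiplied by $n^{-1/3}$.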
With this result we have achieved the goal of this thesis: we have derived the minimized variance of the estimation error of a bias-adjusted estimator of the quadratic variation that uses all available transaction data and simultaneously corrects for the effects of market microstructure noise of any size, under the assumption that the underlying logarithmic price process follows a Poisson jump diffusion process with constant drift and diffusion coefficients.
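To make the construction concrete, the following is a minimal simulation sketch, not the thesis's own implementation. It combines the subgrid-averaged and full-grid realized volatilities in the form that appears in the proof above, namely the average over subgrids minus $(n_K/n)$ times the full-grid value, with $K$ of order $n^{2/3}$. Several ingredients are assumptions made only for this illustration: the function names, the choice $n_K = (n-K+1)/K$ for the average number of increments per subgrid, Gaussian jump sizes and Gaussian noise, and all parameter values.

```python
import math
import random


def realized_volatility(y, step=1, offset=0):
    """Sum of squared increments of y along the grid offset, offset+step, ..."""
    points = y[offset::step]
    return sum((b - a) ** 2 for a, b in zip(points, points[1:]))


def two_time_scale(y, K):
    """Subgrid-averaged realized volatility minus (n_K / n) times the full-grid one."""
    n = len(y) - 1                      # number of increments on the full grid
    avg = sum(realized_volatility(y, K, k) for k in range(K)) / K
    n_K = (n - K + 1) / K               # assumed average number of increments per subgrid
    return avg - (n_K / n) * realized_volatility(y)


def simulate_noisy_jump_diffusion(n, T=1.0, mu=0.05, sigma=0.3, lam=5.0,
                                  jump_sd=0.05, noise_sd=0.005, seed=0):
    """Return (noisy observations Y, true quadratic variation of X) on n+1 equidistant times."""
    rng = random.Random(seed)
    dt = T / n
    # Drift and diffusion increments with constant coefficients.
    increments = [mu * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0) for _ in range(n)]
    qv = sigma ** 2 * T                 # continuous part of the quadratic variation
    # Poisson jump times via exponential waiting times; Gaussian jump sizes (assumption).
    t = rng.expovariate(lam)
    while t < T:
        gamma = rng.gauss(0.0, jump_sd)
        increments[min(int(t / dt), n - 1)] += gamma
        qv += gamma ** 2                # each jump contributes its square
        t += rng.expovariate(lam)
    x, path = 0.0, [0.0]
    for inc in increments:
        x += inc
        path.append(x)
    # Observed process: latent log-price plus i.i.d. mean-zero noise, independent of X.
    y = [p + rng.gauss(0.0, noise_sd) for p in path]
    return y, qv


if __name__ == "__main__":
    n = 10000
    y, qv = simulate_noisy_jump_diffusion(n, seed=1)
    K = round(n ** (2 / 3))             # c = 1 chosen purely for illustration
    print("true quadratic variation:", qv)
    print("full-grid realized volatility:", realized_volatility(y))
    print("two time scale estimate:", two_time_scale(y, K))
```

With noise of this size the full-grid realized volatility is dominated by the noise term of order $2n\,\mathbb{E}\,\epsilon^2$, while the bias-adjusted combination should stay close to the true quadratic variation. The constant $c = 1$ is not the optimal $c^{*}$ of (6.6); computing $c^{*}$ would require estimates of $\Xi_1^2$ and $\Sigma^2$, which this sketch does not attempt.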
7 Conclusion
In this thesis we investigated the error resulting from the discrete estimation of quadratic variation in the presence of market microstructure noise under alternative sampling schemes based on equidistant observations on a fixed time interval. To motivate these investigations, we discussed why quadratic variation is a suitable nonparametric volatility measure from an empirical finance point of view. Unlike most of the existing literature, which employs a continuous Itô process as the model for the logarithmic asset price, the analysis in this thesis relied on a Poisson jump diffusion process. Using the framework introduced in Zhang et al. (2005), this thesis contributed to the discussion on discretization by showing that, in the case of a compound Poisson process, the variance of the discretization error of the realized volatility estimator of the quadratic variation is of order $O\left(n^{-1/2}\right)$. Furthermore, this thesis employed the methods in Zhang et al. (2005) to present the “two time scale” estimator of quadratic variation that i) uses all available high-frequency price information, ii) copes with the detrimental effects of market microstructure noise and iii) consistently estimates the quadratic variation. The “two time scale” estimator is obtained by combining results on the estimation error from noise and discretization for two biased estimators, one based on the full grid and one on the multi grid, both of which have been studied extensively in this thesis. The benefit of this estimator is particularly pronounced because most consistent estimators of quadratic variation in the existing literature lose the property of asymptotic unbiasedness once market microstructure noise is incorporated in the model. Moreover, one of the main findings of this thesis was the variance of the estimation error of the “two time scale” estimator when the underlying logarithmic price process is a Poisson jump diffusion process with constant drift and diffusion coefficients, together with the condition under which this variance is minimized. Finally, it is important to note that the estimation of some components of the variance of the discretization error has not yet been established, so the practical usefulness of these results for a Poisson jump diffusion price process remains to be clarified.
A Notation

$\approx$                                   Approximate equality
$\overset{a.s.}{=}$                         Almost sure equality
$:\Longleftrightarrow$                      Definitional equivalence
$\overset{\mathcal{L}}{\sim}$               Distributional equivalence
$N(\mu, \sigma^2)$                          Normal distribution with mean $\mu$ and variance $\sigma^2$
$\overset{p}{\longrightarrow}$              Convergence in probability
$\overset{\mathcal{L}}{\longrightarrow}$    Convergence in law
$\overset{\mathcal{LS}}{\longrightarrow}$   Stable convergence in law
$o(1)$                                      Tends to zero
$O(1)$                                      Bounded
$o_p(1)$                                    Tends to zero in probability
$O_p(1)$                                    Bounded in probability
$\lfloor x \rfloor$                         Largest integer that is less than or equal to $x$
$[Z,Z]_T$                                   Quadratic variation of the process $Z$ on the time interval $[0,T]$
$\lceil Z,Z\rceil_T^{\mathcal{H}}$          Realized volatility of the process $Z$ along the grid $\mathcal{H}$, where the observation times in $\mathcal{H}$ originate from the time interval $[0,T]$
$\delta Z_{t_m}$                            $Z_{t_m} - Z_{t_{m-1}}$, i.e. the variation of the process $Z$ between successive observations in the full grid $\mathcal{G}$
$\delta^{(i)} Z_{t_m}$                      $Z_{t_m} - Z_{t_{m,-}} = Z_{t_m} - Z_{t_{m-K}}$, where $t_{m,-}, t_m \in \mathcal{G}^{(i)}$, i.e. the variation of the process $Z$ between successive observations in the subgrid $\mathcal{G}^{(i)}$
B References

Aït-Sahalia, Y., Mykland, P. A., and Zhang, L. (2005), "How Often to Sample a Continuous-Time Process in the Presence of Market Microstructure Noise," Review of Financial Studies, 18, 351-416.
Andersen, T. G., Bollerslev, T., Diebold, F. X., and Labys, P. (2001), "The Distribution of Exchange Rate Realized Volatility," Journal of the American Statistical Association, 96, 42-55.
Andersen, T. G., Bollerslev, T., Diebold, F. X., and Labys, P. (2003), "Modeling and forecasting realized volatility," Econometrica, 71, 579-625.
Back, K. (1991), "Asset pricing for general processes," Journal of Mathematical Economics, 20, 371-395.
Bandi, F. M., and Russell, J. R. (2003), "Microstructure Noise, Realized Volatility and Optimal Sampling," technical report, University of Chicago, Graduate School of Business.
Barndorff-Nielsen, O. E., and Shephard, N. (2002a), "Econometric Analysis of Realized Volatility and Its Use in Estimating Stochastic Volatility Models," Journal of the Royal Statistical Society, Ser. B, 64, 253-280.
Barndorff-Nielsen, O. E., and Shephard, N. (2002b), "Estimating Quadratic Variation Using Realized Variance," Journal of Applied Econometrics, 17, 457-478.
Barndorff-Nielsen, O. E., and Shephard, N. (2003), "Power and Bipower Variation with Stochastic Volatility and Jumps," technical report, University of Oxford, Nuffield College, Economics Working Paper No. 2003-W18.
Barndorff-Nielsen, O. E., and Shephard, N. (2005), "Econometrics of Testing for Jumps in Financial Economics Using Bipower Variation," Journal of Financial Econometrics, Vol. 4, 1, 1-30.
Christensen, K., and Podolskij, M. (2006), "Realized Range-Based Estimation of Integrated Variance," technical report, Aarhus School of Business; Ruhr University of Bochum, Dept. of Probability and Statistics.
Gallant, A. R., Hsu, C.-T., and Tauchen, G. T. (1999), "Using Daily Range Data to Calibrate Volatility Diffusions and Extract the Forward Integrated Variance," The Review of Economics and Statistics, 81, 617-631.
Heston, S. (1993), "A Closed-Form Solution for Options With Stochastic Volatility With Applications to Bonds and Currency Options," Review of Financial Studies, 6, 327-343.
Jacod, J., and Protter, P. (1998), "Asymptotic Error Distributions for the Euler Method for Stochastic Differential Equations," The Annals of Probability, 26, 267-307.
Jacod, J., and Shiryaev, A. N. (2003), Limit Theorems for Stochastic Processes (2nd ed.), New York: Springer-Verlag.
Klebaner, F. C. (2005), Introduction to stochastic calculus with applications (2nd ed.), London: Imperial College Press.
Mykland, P. A., and Zhang, L. (2002), "ANOVA for Diffusions," technical report, University of Chicago, Dept. of Statistics.
Oomen, R. (2004), "Properties of Realized Variance for a Pure Jump Process: Calendar Time Sampling versus Business Time Sampling," technical report, University of Warwick, Warwick Business School.
Oomen, R. (2005), "Properties of Realized Variance under Alternative Sampling Schemes," technical report, University of Warwick, Warwick Business School.
Protter, P. (2003), Stochastic Integration and Differential Equations (2nd ed.), New York: Springer-Verlag.
Rootzen, H. (1980), "Limit Distributions for the Error in Approximations of Stochastic Integrals," The Annals of Probability, 8, 241-251.
Tauchen, G., and Zhou, H. (2006), "Identifying Realized Jumps on Financial Markets," technical report, Duke University, Dept. of Economics.
Zhang, L., Mykland, P. A., and Aït-Sahalia, Y. (2005), "A tale of two time scales: determining integrated volatility with noisy high-frequency data," Journal of the American Statistical Association, Vol. 100, 472, 1394-1411.
I hereby declare that I have prepared this diploma thesis without the help of third parties and using only the sources and aids indicated.
Alireza Dorfard,
Oberursel, 12 August 2007