Stochastic Volatility Models with Integrated GARCH(t) Structure
Peter Bloomfield
Department of Statistics
North Carolina State University
April 1, 2008
1 Introduction
A stochastic volatility model consists of a pair of stochastic processes {Xt , Yt },
of which only Yt is observed, but where the conditional distribution of Yt |Xt =
xt has a scale that depends on xt . The unobserved Xt is interpreted as a
state variable that affects the processes that result in the observed Yt .
The conditional heteroscedasticity (CH) approach to modeling volatility is based on the conditional variance function

$$h_t = \operatorname{Var}(Y_t \mid Y_{t-1} = y_{t-1}, Y_{t-2} = y_{t-2}, \dots).$$

Engle's seminal ARCH model and the many subsequent variants are based on various specifications of the function

$$h_t = h_t(y_{t-1}, y_{t-2}, \dots).$$
When combined with an assumption about the conditional distribution, given Yt−1 = yt−1, Yt−2 = yt−2, . . . , of the standardized quantity

$$e_t = \frac{Y_t}{\sqrt{h_t}},$$

the model allows the likelihood function of parameters in ht(·) to be constructed, leading directly to estimation and tests. The conditional variance function ht(·) must be interpreted with care; on its face, it describes how the current variance is associated with past observed values; however, in many cases the association cannot be interpreted as causal.
The conditional variance function ht(·) can be defined for a stochastic volatility model, but typically cannot be found in closed form. Also, in general, the distribution of $e_t$ depends on yt−1, yt−2, . . . . Estimation and testing are accordingly more complex. The model constructed in the next section, however, has the familiar IGARCH form for ht(·) and a fixed distribution for $e_t$. Related multivariate models are constructed in the following sections. The models exploit the inverse Γ and inverse Wishart distributions as the natural conjugate prior distributions for the variance in the Gaussian family.
2 Univariate Model

2.1 The Latent Process
The latent process {Xt} is defined by:

$$X_0 \sim \Gamma\left(\frac{\nu}{2}, \frac{\eta^2}{2}\right), \qquad (1)$$

and for t > 0

$$X_t = B_t X_{t-1}, \qquad (2)$$

where

$$\theta B_t \sim \beta\left(\frac{\nu}{2}, \frac{1}{2}\right) \qquad (3)$$

and {Bt} are i.i.d. and independent of X0.
The shape parameter ν will be required to satisfy ν > 2 to ensure that $E(X_0^{-1}) < \infty$. Requiring $E(X_t^{-1}) = E(X_0^{-1})$ for all t > 0 is also convenient, and is met if

$$\theta = \frac{\nu - 2}{\nu - 1}. \qquad (4)$$
2.2 The Observed Process
The observed process is defined by

$$Y_t = \sigma_t \epsilon_t, \qquad (5)$$

where

$$\sigma_t = \frac{1}{\sqrt{X_t}} \qquad (6)$$

and

$$\epsilon_t \sim N(0, 1), \qquad (7)$$

with $\{\epsilon_t\}$ i.i.d. and independent of {Xt}.
An equivalent definition is that, given Xu = xu for 0 ≤ u and Yu = yu for 0 ≤ u < t,

$$Y_t \sim N(0, \sigma_t^2),$$

with the same definition of σt.
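Definitions (1)–(7) translate directly into a forward simulation, which is useful for checking the distributional results that follow. Below is a minimal sketch in Python with NumPy; the function name and parameter values are illustrative, not from the paper, and the Γ distribution in (1) is taken in its shape–rate parameterization.

```python
import numpy as np


def simulate_sv_path(n, nu, eta, seed=None):
    """Simulate one path of {X_t, Y_t} from the model (1)-(7); assumes nu > 2."""
    rng = np.random.default_rng(seed)
    theta = (nu - 2.0) / (nu - 1.0)                       # equation (4)
    x = np.empty(n)
    # X_0 ~ Gamma(nu/2, eta^2/2) with rate eta^2/2, i.e. scale 2/eta^2:
    x[0] = rng.gamma(shape=nu / 2.0, scale=2.0 / eta**2)  # equation (1)
    # theta * B_t ~ Beta(nu/2, 1/2), so B_t is a Beta draw divided by theta:
    b = rng.beta(nu / 2.0, 0.5, size=n - 1) / theta       # equation (3)
    for t in range(1, n):
        x[t] = b[t - 1] * x[t - 1]                        # equation (2)
    y = rng.standard_normal(n) / np.sqrt(x)               # equations (5)-(7)
    return x, y


x, y = simulate_sv_path(n=1000, nu=8.0, eta=1.0, seed=0)
```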
2.3 Derived Distributions

2.3.1 Marginal distribution of Y0
Since $\eta^2 X_0 \sim \Gamma\left(\frac{\nu}{2}, \frac{1}{2}\right) = \chi^2_\nu$,

$$\frac{Y_0}{\sqrt{\eta^2/\nu}} = \frac{\epsilon_0}{\sqrt{\eta^2 X_0/\nu}} \sim t_\nu.$$
Since ν > 2,

$$\operatorname{Var}(Y_0) = \frac{\eta^2}{\nu} \times \frac{\nu}{\nu - 2} = \frac{\eta^2}{\nu - 2}.$$
Write

$$h_0 = \frac{\eta^2}{\nu - 2},$$

and $t^*(\nu)$ for the standardized t-distribution with ν degrees of freedom:

$$t^*(\nu) = \sqrt{\frac{\nu - 2}{\nu}} \times t(\nu).$$

Then

$$Y_0 \sim \sqrt{h_0}\, t^*(\nu).$$
2.3.2 Conditional distribution of X0 | Y0
The joint density of X0 and Y0 is

$$f_{X_0}(x_0) \times f_{Y_0|X_0=x_0}(y_0 \mid x_0) = \frac{(\eta^2/2)^{\nu/2}}{\Gamma(\nu/2)}\, x_0^{\frac{\nu}{2}-1} \exp\left(-\frac{\eta^2 x_0}{2}\right) \times \sqrt{\frac{x_0}{2\pi}}\, \exp\left(-\frac{y_0^2 x_0}{2}\right)$$

$$= A\, x_0^{\frac{\nu+1}{2}-1} \exp\left(-\frac{(\eta^2 + y_0^2)\, x_0}{2}\right)$$

for a constant A, so conditionally on Y0 = y0,

$$X_0 \sim \Gamma\left(\frac{\nu + 1}{2}, \frac{\eta^2 + y_0^2}{2}\right).$$
2.3.3 Conditional distribution of X1 | Y0

From (3),

$$B_1 \sim \frac{1}{\theta}\, \beta\left(\frac{\nu}{2}, \frac{1}{2}\right),$$

and because B1 is independent of Y0, it has the same distribution conditionally on Y0 = y0. Recall that if G ∼ Γ(α1 + α2, β), B ∼ β(α1, α2), and G and B are independent, then BG ∼ Γ(α1, β). So conditionally on Y0 = y0,

$$X_1 = B_1 X_0 \sim \Gamma\left(\frac{\nu}{2}, \theta\, \frac{\eta^2 + y_0^2}{2}\right).$$
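The beta–gamma identity invoked here is easy to verify by Monte Carlo. A small Python check (the parameter values are arbitrary, chosen only for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
a1, a2, rate = 2.5, 0.5, 1.5        # arbitrary alpha_1, alpha_2, beta (rate)
g = rng.gamma(shape=a1 + a2, scale=1.0 / rate, size=100_000)
b = rng.beta(a1, a2, size=100_000)
# If G ~ Gamma(a1 + a2, rate) and B ~ Beta(a1, a2) are independent,
# then B*G ~ Gamma(a1, rate); the KS statistic should be near zero.
print(stats.kstest(b * g, stats.gamma(a=a1, scale=1.0 / rate).cdf))
```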
2.3.4 Conditional distribution of Y1 | Y0

The conditional distribution of Y1 given Y0 = y0 follows from the same argument as in Section 2.3.1, and is

$$Y_1 \sim \sqrt{h_1}\, t^*(\nu),$$

where

$$h_1 = \theta\, \frac{\eta^2 + y_0^2}{2} \left(\frac{\nu}{2} - 1\right)^{-1} = \frac{\theta}{\nu - 2}\left(\eta^2 + y_0^2\right) = \theta h_0 + \frac{\theta}{\nu - 2}\, y_0^2. \qquad (8)$$

Now

$$\theta = \frac{\nu - 2}{\nu - 1},$$

which means that

$$\frac{\theta}{\nu - 2} = \frac{1}{\nu - 1} = 1 - \theta,$$

and (8) becomes

$$h_1 = \theta h_0 + (1 - \theta) y_0^2.$$
2.4 The Recursion
Write $Y_{(t-1):0} = (Y_{t-1}, Y_{t-2}, \dots, Y_0)$. The argument of Sections 2.3.2–2.3.4, used recursively, proves:

Proposition 1. Suppose that {Xt} is defined by (1)–(4), and that {Yt} is defined by (5)–(7). Then

$$Y_0 \sim \sqrt{h_0}\, t^*(\nu),$$

where

$$h_0 = \frac{\eta^2}{\nu - 2},$$

and for t > 0, conditionally on $Y_{(t-1):0} = y_{(t-1):0}$,

$$Y_t \sim \sqrt{h_t}\, t^*(\nu),$$

where

$$h_t = \theta h_{t-1} + (1 - \theta) y_{t-1}^2.$$

That is, {Yt} has IGARCH(t) structure with

$$\nu = \frac{2 - \theta}{1 - \theta}$$

degrees of freedom.
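Proposition 1 can be checked numerically: run the IGARCH recursion for ht over a simulated path and test whether the standardized values $Y_t/\sqrt{h_t}$ are compatible with t∗(ν). A sketch, reusing the hypothetical simulate_sv_path function from the Section 2.2 sketch:

```python
import numpy as np
from scipy import stats

nu, eta = 8.0, 1.0
theta = (nu - 2.0) / (nu - 1.0)
_, y = simulate_sv_path(n=50_000, nu=nu, eta=eta, seed=1)  # Section 2.2 sketch

h = np.empty_like(y)
h[0] = eta**2 / (nu - 2.0)                                 # h_0
for t in range(1, len(y)):
    h[t] = theta * h[t - 1] + (1.0 - theta) * y[t - 1] ** 2

# Y_t / sqrt(h_t) should be t*(nu): an ordinary t(nu) scaled by sqrt((nu-2)/nu).
z = y / np.sqrt(h)
print(stats.kstest(z, stats.t(df=nu, scale=np.sqrt((nu - 2.0) / nu)).cdf))
```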
3 Multivariate Model With Scalar Volatility

3.1 The Latent Process
The latent process is similar to that in Section 2.1, except that

$$\theta B_t \sim \beta\left(\frac{\nu}{2}, \frac{p}{2}\right), \qquad (9)$$

where p is the dimension of the observed process, and

$$\theta = \frac{\nu - 2}{\nu - 2 + p}. \qquad (10)$$

3.2 The Observed Process
The observed process is a p-dimensional vector

$$Y_t = \sigma_t \epsilon_t, \qquad (11)$$

where σt is again the scalar volatility given by

$$\sigma_t = \frac{1}{\sqrt{X_t}},$$

and

$$\epsilon_t \sim N_p(0, \Sigma). \qquad (12)$$

3.3 The Recursion
A similar analysis to that of Section 2.3 yields:

Proposition 2. Suppose that {Xt} is defined by (1), (2), (9), and (10), and that {Yt} is defined by (6), (11), and (12). Then

$$Y_0 \sim \sqrt{h_0}\, t_p^*(\Sigma, \nu),$$

where

$$h_0 = \frac{\eta^2}{\nu - 2},$$

and for t > 0, conditionally on $Y_{(t-1):0} = y_{(t-1):0}$,

$$Y_t \sim \sqrt{h_t}\, t_p^*(\Sigma, \nu),$$

where

$$h_t = \theta h_{t-1} + (1 - \theta)\, \frac{y_{t-1}' \Sigma^{-1} y_{t-1}}{p}.$$

Here $t_p^*(\Sigma, \nu)$ denotes the p-dimensional multivariate t-distribution with covariance matrix Σ and ν (> 2) degrees of freedom.
Since the conditional covariance matrix of Yt is a scalar multiple of the
fixed matrix Σ, this model is even more restrictive than Bollerslev’s constant
correlation model, and is presumably of little practical interest.
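Even so, the filtering recursion of Proposition 2 is simple to state in code. A minimal sketch, assuming Σ, η, and ν are known (the function name is illustrative):

```python
import numpy as np


def scalar_vol_filter(y, sigma, eta, nu):
    """Proposition 2 recursion: h_t = theta*h_{t-1} + (1-theta)*y'Sigma^{-1}y/p."""
    n, p = y.shape
    theta = (nu - 2.0) / (nu - 2.0 + p)      # equation (10)
    sigma_inv = np.linalg.inv(sigma)
    h = np.empty(n)
    h[0] = eta**2 / (nu - 2.0)               # h_0
    for t in range(1, n):
        q = y[t - 1] @ sigma_inv @ y[t - 1]  # quadratic form y'_{t-1} Sigma^{-1} y_{t-1}
        h[t] = theta * h[t - 1] + (1.0 - theta) * q / p
    return h
```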
4 Multivariate Model With Matrix Volatility

4.1 The Latent Process
A more interesting multivariate extension results from replacing the scalar latent process {Xt} by a matrix analog. The distribution of the initial state X0 is Wishart, the multivariate analog of the χ² distribution (and hence of the Γ distribution used in the scalar model):

$$X_0 \sim W_p\left(\eta^{-1}, \nu^*\right), \qquad (13)$$

where $W_p(\Sigma, \nu^*)$ denotes the Wishart distribution in p dimensions with covariance matrix Σ and degrees of freedom ν∗. The degrees of freedom ν∗ will be required to satisfy ν∗ > p + 1, to ensure that $E(X_0^{-1})$ is finite.
For t > 0, Xt is updated from Xt−1 in a manner that generalizes the scalar update of multiplication by an independent β-distributed random variable. Let the (ν∗ + 1) × p matrix Qt be uniformly distributed on the manifold $Q'Q = I_p$. Partition Qt as

$$Q_t = \begin{pmatrix} Q_t^{(1)} \\ Q_t^{(2)} \end{pmatrix},$$

where $Q_t^{(1)}$ has ν∗ rows and $Q_t^{(2)}$ consists of a single row. Write the Cholesky factorization of Xt−1 as

$$X_{t-1} = L_{t-1} L_{t-1}'.$$

Now Xt is defined as

$$X_t = \frac{1}{\theta}\, L_{t-1} Q_t^{(1)\prime} Q_t^{(1)} L_{t-1}'. \qquad (14)$$

Requiring that $E(X_t^{-1}) = E(X_0^{-1})$ implies that

$$\theta = \frac{\nu^* - p - 1}{\nu^* - p}. \qquad (15)$$
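The update (14) can be simulated by drawing Qt uniformly on the manifold via the Q-R decomposition of a Gaussian matrix, which is exactly the fact exploited in Section 4.3.4 below. A minimal sketch in Python; the function name is illustrative:

```python
import numpy as np


def update_matrix_state(x_prev, nu_star, rng):
    """One step of the matrix latent update (14), with theta from (15)."""
    p = x_prev.shape[0]
    theta = (nu_star - p - 1.0) / (nu_star - p)     # equation (15)
    # Q uniform on Q'Q = I_p: orthonormalize a Gaussian matrix, fixing
    # the signs of R's diagonal so Q is exactly uniformly distributed.
    z = rng.standard_normal((nu_star + 1, p))
    q, r = np.linalg.qr(z)
    q *= np.sign(np.diag(r))
    q1 = q[:nu_star, :]                             # Q^(1): first nu* rows
    l_prev = np.linalg.cholesky(x_prev)             # X_{t-1} = L L'
    return (l_prev @ q1.T @ q1 @ l_prev.T) / theta  # equation (14)


rng = np.random.default_rng(0)
p, nu_star = 3, 7                                   # requires nu* > p + 1
eta = np.eye(p)
# X_0 ~ W_p(eta^{-1}, nu*): sum of nu* outer products of N_p(0, eta^{-1}) draws.
z0 = rng.multivariate_normal(np.zeros(p), np.linalg.inv(eta), size=nu_star)
x = z0.T @ z0
x = update_matrix_state(x, nu_star, rng)
```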
4.2 The Observed Process
The observed process {Yt} is defined by its conditional distributions: conditionally on Xu = xu, 0 ≤ u, and Yu = yu, 0 ≤ u < t,

$$Y_t \sim N_p(0, \Sigma_t), \qquad (16)$$

where

$$\Sigma_t = X_t^{-1}. \qquad (17)$$
4.3 Derived Distributions

4.3.1 Joint distribution of X0 and Y0
The joint density of X0 and Y0 is

$$\frac{\det(x_0)^{\frac{1}{2}(\nu^* - p - 1)} \exp\left(-\frac{1}{2}\operatorname{trace}(\eta x_0)\right)}{2^{\frac{1}{2}p\nu^*} \det(\eta)^{-\frac{1}{2}\nu^*}\, \Gamma_p\left(\frac{1}{2}\nu^*\right)} \times \frac{\det(x_0)^{\frac{1}{2}} \exp\left(-\frac{1}{2} y_0' x_0 y_0\right)}{(2\pi)^{\frac{1}{2}p}}$$

$$= \frac{\det(x_0)^{\frac{1}{2}(\nu^* - p)} \exp\left(-\frac{1}{2}\operatorname{trace}\left[(\eta + y_0 y_0')\, x_0\right]\right)}{2^{\frac{1}{2}p\nu^*} \det(\eta)^{-\frac{1}{2}\nu^*}\, \Gamma_p\left(\frac{1}{2}\nu^*\right) \times (2\pi)^{\frac{1}{2}p}},$$

where $\Gamma_p\left(\frac{1}{2}\nu^*\right)$ is the multivariate Γ-function defined by

$$\Gamma_p\left(\frac{1}{2}\nu^*\right) = \pi^{\frac{1}{4}p(p-1)} \prod_{i=1}^{p} \Gamma\left(\frac{1}{2}(\nu^* + 1 - i)\right).$$
4.3.2 Marginal distribution of Y0

The marginal density of Y0 is obtained by integrating out x0, and is seen to be proportional to

$$\det(\eta + y_0 y_0')^{-\frac{1}{2}(\nu^* + 1)} = \det(\eta)^{-\frac{1}{2}(\nu^* + 1)} \left(1 + y_0' \eta^{-1} y_0\right)^{-\frac{1}{2}(\nu^* + 1)};$$

consequently

$$Y_0 \sim t_p^*\left(\frac{1}{\nu^* - p - 1}\, \eta,\ \nu^* - p + 1\right).$$

Write $\nu = \nu^* - p + 1$, and

$$H_0 = \frac{1}{\nu^* - p - 1}\, \eta = \frac{1}{\nu - 2}\, \eta,$$

so that

$$Y_0 \sim t_p^*(H_0, \nu).$$
4.3.3 Conditional distribution of X0 | Y0

By inspection of the joint density, conditionally on Y0 = y0,

$$X_0 \sim W_p\left[(\eta + y_0 y_0')^{-1},\ \nu^* + 1\right].$$
4.3.4 Conditional distribution of X1 | Y0

The conditional distribution of X1 given Y0 = y0 is derived from the conditional density of X0 using the following facts: if Z is an n × p matrix whose rows are a random sample from $N_p(0, \Lambda)$, and Z = QR is the Q-R decomposition of Z (that is, $Q'Q = I_p$, and R is upper-triangular), then:

• Q and R are independent;
• Q is uniformly distributed on the manifold $Q'Q = I_p$;
• R is the Cholesky factor of the $W_p(\Lambda, n)$-distributed matrix $Z'Z$ and therefore has the same distribution.

Then $Q_1 L_0'$ is a (ν∗ + 1) × p matrix whose rows are, conditionally on Y0 = y0, distributed as a random sample from $N_p\left(0, (\eta + y_0 y_0')^{-1}\right)$, and $Q_1^{(1)} L_0'$ consists of the first ν∗ rows of $Q_1 L_0'$.
Therefore conditionally on Y0 = y0,

$$X_1 = \frac{1}{\theta}\left(Q_1^{(1)} L_0'\right)'\left(Q_1^{(1)} L_0'\right) \sim \frac{1}{\theta}\, W_p\left[(\eta + y_0 y_0')^{-1},\ \nu^*\right].$$
4.3.5 Conditional distribution of Y1 | Y0

The conditional distribution of Y1 given Y0 = y0 follows from the same argument as in Section 4.3.2, and is

$$Y_1 \sim t_p^*(H_1, \nu),$$

where

$$H_1 = \frac{\theta}{\nu - 2}\left(\eta + y_0 y_0'\right) = \theta H_0 + \frac{\theta}{\nu - 2}\, y_0 y_0'.$$

As before,

$$\frac{\theta}{\nu - 2} = 1 - \theta,$$

so this becomes

$$H_1 = \theta H_0 + (1 - \theta) y_0 y_0'.$$
4.4 The Recursion
The argument of Sections 4.3.3–4.3.5, used recursively, proves:

Proposition 3. Suppose that {Xt} is defined by (13)–(15), and that {Yt} is defined by (16)–(17). Then

$$Y_0 \sim t_p^*(H_0, \nu),$$

where

$$H_0 = \frac{1}{\nu - 2}\, \eta,$$

and for t > 0, conditionally on $Y_{(t-1):0} = y_{(t-1):0}$,

$$Y_t \sim t_p^*(H_t, \nu),$$

where

$$H_t = \theta H_{t-1} + (1 - \theta)\, y_{t-1} y_{t-1}'.$$

That is, {Yt} has the integrated form of the scalar BEKK multivariate GARCH model, but where the conditional distribution of Yt is the multivariate t-distribution with

$$\nu = \frac{2 - \theta}{1 - \theta}$$

degrees of freedom.
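Proposition 3 thus yields a matrix-valued filter in which Ht is a convex combination of Ht−1 and the outer product of the most recent observation. A sketch of the recursion in Python (names illustrative):

```python
import numpy as np


def matrix_igarch_filter(y, eta, nu):
    """Run H_t = theta*H_{t-1} + (1-theta)*y_{t-1}y'_{t-1} over an (n, p) array y."""
    theta = (nu - 2.0) / (nu - 1.0)   # inverts nu = (2 - theta)/(1 - theta)
    n, p = y.shape
    h = np.empty((n, p, p))
    h[0] = eta / (nu - 2.0)           # H_0, with eta a (p, p) scale matrix
    for t in range(1, n):
        h[t] = theta * h[t - 1] + (1.0 - theta) * np.outer(y[t - 1], y[t - 1])
    return h
```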