A Geometric View of Interest Rate Theory - S-WoPEc

A Geometric View of Interest Rate Theory
Tomas Björk
∗
†
Department of Finance
Stockholm School of Economics
Box 6501
S-113 83 Stockholm SWEDEN
e-mail: [email protected]
August 11, 2000
To appear in
Handbook of Mathematical Finance,
Cambridge University Press
Abstract
The purpose of this essay is to give an overview of some recent work
concerning structural properties of the evolution of the forward rate curve
in an arbitrage free bond market. The main problems to be discussed are
as follows.
• When is a given forward rate model consistent with a given family
of forward rate curves?
• When can the inherently infinite dimensional forward rate process
be realized by means of a finite dimensional state space model.
We consider interest rate models of Heath-Jarrow-Morton type, where
the forward rates are driven by a multidimensional Wiener process, and
where he volatility is allowed to be an arbitrary smooth functional of the
present forward rate curve. Within this framwork we give necessary and
sufficient conditions for consistency, as well as for the existence of a finite
dimensional realization, in terms of the forward rate volatilities.
∗ Support
from ITM is gratefully acknowledged.
am grateful to D. Filipovic and J. Zabzcyk for a number of very helpful discussions. A
number of valuable comments from an unknown referee on one of the underlying papers helped
to improve that part considerably. I am also highly indebted to B.J. Christensen, A. Gombani
and L. Svensson for their generosity in letting me use our joint results for this overview.
†I
1
Contents
1 Introduction
3
1.1
Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3
1.2
Main Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4
2 Linear Realization Theory
4
2.1
Deterministic Forward Rate Volatilities . . . . . . . . . . . . . .
5
2.2
Existence of Finite Linear Realizations . . . . . . . . . . . . . . .
7
2.3
Transfer Functions . . . . . . . . . . . . . . . . . . . . . . . . . .
8
2.4
Minimal Realizations . . . . . . . . . . . . . . . . . . . . . . . . .
11
2.5
Economic Interpretation of the State Space . . . . . . . . . . . .
12
2.6
Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
13
2.7
Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
15
3 Invariant Manifolds
16
3.1
Parameter Recalibration . . . . . . . . . . . . . . . . . . . . . . .
16
3.2
Invariant Manifolds . . . . . . . . . . . . . . . . . . . . . . . . . .
18
3.3
The Formalized Problem . . . . . . . . . . . . . . . . . . . . . . .
19
3.3.1
The Space . . . . . . . . . . . . . . . . . . . . . . . . . . .
19
3.3.2
The Forward Curve Manifold . . . . . . . . . . . . . . . .
19
3.3.3
The Interest Rate Model . . . . . . . . . . . . . . . . . . .
20
3.3.4
The Problem . . . . . . . . . . . . . . . . . . . . . . . . .
20
3.4
The Invariance Conditions . . . . . . . . . . . . . . . . . . . . . .
20
3.5
Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
23
3.5.1
The Nelson-Siegel Family . . . . . . . . . . . . . . . . . .
23
3.5.2
The Hull-White and Ho-Lee Models . . . . . . . . . . . .
24
Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
25
3.6
4 Existence of Nonlinear Realizations
26
4.1
Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
26
4.2
The Geometric problem . . . . . . . . . . . . . . . . . . . . . . .
27
4.3
The Main Result . . . . . . . . . . . . . . . . . . . . . . . . . . .
28
2
4.4
4.5
1
1.1
Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
31
4.4.1
Constant Volatility . . . . . . . . . . . . . . . . . . . . . .
31
4.4.2
Constant Direction Volatility . . . . . . . . . . . . . . . .
32
4.4.3
When is the Short Rate a Markov Process? . . . . . . . .
34
Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
36
Introduction
Setup
We consider a bond market model (see [2], [27]) living on a filtered probability
space (Ω, F , F, Q) where F = {Ft }t≥0 . The basis is assumed to carry a standard
m-dimensional Wiener process W , and we also assume that the filtration F is
the internal one generated by W .
By p(t, x) we denote the price, at t, of a zero coupon bond maturing at t + x,
and the forward rates r(t, x) are defined by
∂ log p(t, x)
.
∂x
Note that we use the Musiela parameterisation, where x denotes the time to
maturity. The short rate R n
is defined as
o R(t) = r(t, 0), and the money account
Rt
B is given by B(t) = exp 0 R(s)ds . The model is assumed to be free of
arbitrage in the sense that the measure Q above is a martingale measure for
the model. In other words, for every fixed time of maturity T ≥ 0, the process
Z(t, T ) = p(t, T − t)/B(t) is a Q-martingale.
r(t, x) = −
Let us now consider a given forward rate model of the form
(
dr(t, x) = β(t, x)dt + σ(t, x)dW,
r(0, x)
= ro (0, x).
(1)
where, for each x, β and σ are given optional processes. The initial curve
{ro (0, x); x ≥ 0} is taken as given. It is interpreted as the observed forward
rate curve.
The standard Heath-Jarrow-Morton drift condition ([20]) can easily be transferred to the Musiela parameterisation. The result (see [6], [26]) is as follows.
Proposition 1.1 (The Forward Rate Equation) Under the martingale measure Q the r-dynamics are given by
Z x
∂
r(t, x) + σ(t, x)
σ(t, u)? du dt + σ(t, x)dW (t), (2)
dr(t, x) =
∂x
0
o
(3)
r(0, x) = r (0, x).
3
where
1.2
?
denotes transpose.
Main Problems
Suppose now that we are give a concrete model M within the above framework,
i.e. suppose that we are given a concrete specification of the volatility process
σ. We now formulate a couple of natural problems:
1. Take, in addition to M, also as given a parameterized family G of forward
rate curves. Under which conditions is the family G consistent with the
dynamics of M? Here consistency is interpreted in the sense that, given
an initial forward rate curve in G, the interest rate model M will only
produce forward rate curves belonging to the given family G.
2. When can the given, inherently infinite dimensional, interest rate model M
be written as a finite dimensional state space model? More precisely,
we seek conditions under which the forward rate process r(t, x) induced
by the model M, can be realized by a system of the form
dZt
r(t, x)
= a (Zt ) dt + b (Zt ) dWt ,
= G (Zt , x) .
(4)
(5)
where Z (interpreted as the state vector process) is a finite dimensional
diffusion, a(z), b(z) and G(z, x) are deterministic functions and W is the
same Wiener process as in in (2).
As will be seen below, these two problems are intimately connected, and the
main purpose of this chapter is to give an overview of some recent work in this
area. The text is mainly based on [3], [4] and [5], but the presentation given
below is more focused on geometric intuition than the original articles, where
full proofs, technical details and further results can be found. In the analysis
below we use ideas from systems and control theory (see [24]) as well as from
nonlinear filtering theory (see [8]). References to the literature will sometimes
be given in the text, but will mainly be summarised in the Notes at the end of
each section.
The organisation of the text is as follows. In Section 2 we study the existence
of a finite dimensional factor realization in the comparatively simple case when
the forward rate volatilities are deterministic. In Section 3 we study the general
consistency problem, and in Section 4 we use the consistency results from Section
3 in order to give a fairly complete picture of the nonlinear realization problem.
2
Linear Realization Theory
In the general case, the forward rate equation (2) is a highly nonlinear infinite
dimensional SDE but, as can be expected, the special case of linear dynamics is
4
much easier to handle. In this section we therefore concentrate on linear forward
rate models, and look for finite dimensional linear realizations.
2.1
Deterministic Forward Rate Volatilities
For the rest of the section we only consider the case when the volatility σ(t, x) =
[σ1 (t, x), . . . , σm (t, x)] is a deterministic time-independent function σ(x) of x
only.
Assumption 2.1 The volatility σ is a deterministic C ∞ -mapping σ : R+ →
Rm
Denoting the function x 7−→ r(t, x) by r(t) we have, from (2),
= {Fr(t) + D} dt + σdW (t),
dr(t)
o
= r (0).
r(0)
(6)
(7)
Here the linear operator F is defined by
∂
,
∂x
F=
whereas the function D is given by
Z
x
D(x) = σ(x)
(8)
σ(s)? ds.
(9)
0
The point to note here is that, because of our choice of a deterministic volatility
σ(x), the forward rate equation (6) is a linear (or rather affine) SDE. Because
of this linearity (albeit in infinite dimensions) we therefore expect to be able to
provide an explicit solution of (6). We now recall that a scalar equation of the
form
dy(t) = [ay(t) + b] dt + cdW (t)
has the solution
Z
y(t) = eat y(0) +
t
Z
ea(t−s) bds +
0
t
ea(t−s) cdW (s),
0
and we are led to conjecture that the solution to (6) is given by the formal
expression
Z t
Z t
Ft o
F(t−s)
e
Dds +
eF(t−s) σdW (s).
r(t) = e r +
0
0
Ft
The formal exponential e acts on real valued functions, and we have to figure
out how it operates. From the standard series expansion of the exponential
function one is led to write
∞ n
X
Ft t
[F n f ] (x)
(10)
e f (x) =
n!
n=0
5
∂n
∂xn ,
In our case F n =
so (assuming f to be analytic) we have
∞ n n
X
t ∂ f
(x)
eFt f (x) =
n! ∂xn
n=0
(11)
This is, however, just
series expansion of f around the point x, so for
a Taylor
analytic f we have eFt f (x) = f (x + t). We have in fact the following precise
result (which can be proved rigorously).
Proposition 2.1 The operator F is the infinitesimal generator of the semigroup
of left translations, i.e. for any f ∈ C[0, ∞) we have
Ft e f (x) = f (t + x).
The solution of the forward rate equation (6) is given by as
Z
Ft o
t
r(t, x) = e r (0, x) +
e
F(t−s)
Z
t
D(x)ds +
0
eF(t−s) σ(x)dW (s)
(12)
σ(x + t − s)dW (s).
(13)
0
or equivalently by
Z
t
r(t, x) = ro (0, x + t) +
0
Z
D(x + t − s)ds +
0
t
From (12) it is clear by inspection that we may write the forward rate equation
(6) as
dr0 (t, x)
r(t, x)
=
=
Fr0 (t, x)dt + σ(x)dW (t), r0 (0, x) = 0
r0 (t, x) + δ(t, x),
(14)
(15)
where δ is given by
Z
δ(t, x) = ro (0, x + t) +
0
t
D(x + t − s)ds
(16)
Since δ(t, x) is not affected by the input W , we see that the problem of finding
a realization for the term structure system (6) is equivalent to that of finding a
realization for (14). We are thus led to the following definition.
Definition 2.1 A matrix triple [A, B, C(x)] is called an n-dimensional realization of the systems (6) and (14) if r0 has the representation
dZ(t)
r0 (t, x)
= AZ(t)dt + BdW (t), Z(0) = 0.
= C(x)Z(t),
6
(17)
(18)
Our main problems are now as follows.
• Take as a priori given a volatility structure σ(x).
• When does there exists a finite dimensional realization?
• If there exists a finite dimensional realization, what is the minimal dimension?
• How do we construct a minimal realization from knowledge of σ?
• Is there an economic interpretation of the state process Z in the realization?
2.2
Existence of Finite Linear Realizations
We will now go on to study the existence of a finite dimensional realization of
the stochastic system (14), and in order to get some ideas, suppose that there
actually exists a finite dimensional realisation of (14) of the form (17)-(18).
Solving (14) we have
Z
r0 (t, x) =
t
eF(t−s) σ(x)dW (s) =
Z
0
t
0
σ(x + t − s)dW (s),
while, from the realization (17)-(18), we also have
Z
r0 (t, x) = C(x)Z(t) = C(x)
t
eA(t−s) BdW (s)
0
Thus we have, with probability one, for each x and each t,
Z t
Z t
σx (t − s)dW (s) =
C(x)eA(t−s) BdW (s)
0
(19)
0
where we use subindex x to denote left translation, i.e. fx (t) = f (x + t). This
leads us immediately to conjecture that the equation
σx (t) = C(x)eAt B
must hold for all x and t, and we have our first main result.
Proposition 2.2
1. The forward rate process has a finite dimensional linear realization if and
only if the volatility function σ can be written on the form
σ(x) = C0 eAx B
7
(20)
2. If σ has the form (20) then a concrete realization of r0 is given by
dZ(t)
r0 (t, x)
= AZ(t)dt + BdW (t), Z(0) = 0.
(21)
= C(x)Z(t),
(22)
with A, B as in (20), and with C(x) = C0 eAx . The forward rates r(t, x)
are then given by (15)-(16).
Proof. It is clear from the discussion above that if there exists a finite realisation, then we must have the factorisation σx (t) = C(x)eAt B. Setting x = 0, and
denoting C(0) by C0 , in this gives us the relation (20). If, on the other hand,
σ factors as in (20), then we simply define Z as in (21). A direct calculation as
above then shows that we have r0 (t, x) = C0 eAx z(t).
Remark 2.1 Let us call a function of the form ceAx b, where c is a row vector,
A is a square matrix and b is a column vector, a quasi-exponential (or QE)
function. The general form of a quasi-exponential function f is given by
X
X
eλi x +
eαi x [pj (x) cos(ωj x) + qj (x) sin(ωj x)] ,
(23)
f (x) =
i
j
where λi , α1 , ωj are real numbers, whereas pj and qj are real polynomials.
QE functions will turn up again, so we list some simple properties.
Lemma 2.1 The following hold for the quasi-exponential functions.
• A function is QE if and only if it is a component of the solution of a vector
valued linear ODE with constant coefficients.
• A function is QE if and only if it can be written as f (x) = ceAx b.
• If f is QE, then f 0 is QE.
• If f is QE, then its primitive function is QE.
• If f and g are QE, then f g is QE.
2.3
Transfer Functions
Using ideas from linear systems theory, an alternative view of the realization
problem is obtained by studying transfer functions, i.e. by going to the frequency domain. To get some intuition, consider again the equation
dr0 (t, x) = Fr0 (t, x)dt + σ(x)dW (t), r0 (0, x) = 0.
8
(24)
Let us now formally “divide by dt”, which gives us
dW
dr0
(t, x) = Fr0 (t, x) + σ(x)
(t),
dt
dt
where the formal time derivative dW
dt (t) is interpreted as white noise. We interpret this equation as an input-output system where the random input signal t 7−→ dW
dt (t) is transformed into the infinite dimensional output signal
t 7−→ r0 (t, ·). We thus view the equation as a version of the following controlled
ODE
dr0
(t, x) = Fr0 (t, x) + σ(x)u(t),
dt
r0 (0) = 0,
(25)
where u is a deterministic input signal. Generally speaking, tricks like this does
not work directly, since we are ignoring the difference between standard differential calculus, which is used to analyze (25), and Itô calculus which we use when
dealing with SDEs. In this case, however, because of the linear structure, the
second order Itô term will not come into play, so we are safe. (See the discussion
in Section 3.4 around the Stratonovich integral for how to treat the nonlinear
situation.)
It is now natural to study the transfer function for the system (25), which relates
the Laplace transform of the input signal to the Laplace transform of the output
signal.
Definition 2.2 The transfer function, K(s, x), for (25) is determined by the
relation
r̃0 (s, x) = K(s, x)ũ(s),
where ˜ denotes the Laplace transform in the t-variable.
From the uniqueness of the Laplace transform we then have the following result.
Lemma 2.2 The system
dZ(t)
r0 (t, x)
= AZ(t)dt + BdW (t), Z(0) = 0.
(26)
= C(x)Z(t),
(27)
is a realization of
dr0 (t, x) = Fr0 (t, x)dt + σ(x)dW (t), r0 (0, x) = 0
(28)
if and only if the deterministic control system
dr0
(t, x) = Fr0 (t, x) + σ(x)u(t),
dt
9
(29)
has the same transfer function as the system
dZ
(t)
dt
r0 (t, x)
= AZ(t) + Bu(t),
(30)
= C(x)Z(t).
(31)
Furthermore we have
Lemma 2.3 The transfer function K(s, x) of (29) is given by
K(s, x) = L [σx ] (s),
where L denotes the Laplace transform, and σx denotes left translation.
Proof. From (29) we have
Z
r0 (t, x) =
0
t
σ(x + t − s)u(s)ds = [σx ? u] (t),
and thus
r̃0 (s, x) = L [σx ] (s)ũ(s).
For concrete computation of a realization, the following result is useful.
Lemma 2.4
• The transfer function of the system (30)-(31) is given by
K(s, x) = C(x) [sI − A]−1 B.
• The r0 system has a finite realization if and only if there exists a factorization of the form
−1
L [σx ] (s) = C(x) [sI − A]
B
• Denote the transfer function of r0 by K(s, x), and assume that that there
exits a finite dimensional realization. If we have fund A, B and C such
that
K(s, 0) = C [sI − A]−1 B,
then a realization of r0 is given by A, B, CeAx .
10
Proof. The first assertion is immediately obtained by taking the Laplace transform of (30)-(31). The second follows at from Lemma 2.2, and the third from
Proposition 2.2.
If we want to find a concrete realization for a given system, we thus have two
possibilities. We can either look for a factorization of the volatility function
as σ(x) = CeAx B, or we can try to factor the transfer function as K(s, 0) =
C [sI − A]−1 B. From a logical point of view the two approaches are equivalent,
but from a practical point of view it is much easier to factor the transfer function
than to factor the volatility. There are in fact a number of standard algorithms in
the systems theoretic literature which construct a realization, given knowledge
of the transfer functions. See [7].
2.4
Minimal Realizations
The purpose of this section is to determine the minimal dimension of a finite
dimensional realization.
Definition 2.3 The dimension of a realization [A, B, C(x)] is defined as the
dimension of the corresponding state space. A realization [A, B, C(x)] is said
to be minimal if there is no other realization with smaller dimension. The
McMillan-degree, D, of the forward rate system is defined as the dimension
of a minimal realization.
In order to get a feeling for how to determine the McMillan degree, we note
that r0 has a finite dimensional realization if and only if r0 evolves on a finite
dimensional subspace in the infinite dimensional function space H. Furthermore,
it seems obvious that the McMillan degree equals the dimension of this subspace.
In order to determine the subspace above, let us again view the r0 system as a
special case of the following controlled equation, where we have suppressed x.

 dr0 = Fr0 (t) + σu(t),
dt
(32)

r0 (0) = 0.
The solution of this equation is given by
Z t
Z tX
∞
(t − s)n n
F σu(s)ds.
eF(t−s) σu(s)ds =
r0 (t) =
n!
0
0
0
This is a linear combination of vectors of the form Fn σi , so we see that the
smallest subspace R which contains r0 (t) for all t and for all choices of the
input signal u is given by
(33)
R = span σ, Fσ, F2 σ, · · · = span Fk σi ; i = 1, · · · , m. k = 0, 1, · · ·
We thus have the following result.
11
Proposition 2.3 Take the volatility function
σ = [σ1 , · · · , σm ]
as given. Then the McMillan degree, D, is given by
D = dim (R) ,
(34)
with R defined as in (34). The forward rate system thus admits a finite dimensional realization if and only if the space spanned by the components of σ and
all their derivatives is finite dimensional.
2.5
Economic Interpretation of the State Space
In general, the state space of the minimal realization of a given system has no
concrete (e.g. physical) interpretation. In our case, however, the states of the
minimal realization turn out to have a simple economic interpretation in terms
of a minimal set of “benchmark” forward rates.
Assume that [A, B, C] is a minimal realization, of dimension n, of the forward
rates as in (21)-(22). Let us choose a set of “benchmark ” maturities x1 , · · · , xn .
We use the notation x̄ = (x1 , · · · , xn ). Assume furthermore that the maturity
vector x̄ is chosen so that the matrix


CeAx1


..
T (x̄) = 

.
CeAxn
is invertible. It can be shown (see [4]) that, outside a set of measure zero, this
can always be done be done as long as the maturities are distinct. We use the
notation


r0 (t, x1 )


..
r0 (t, x̄) = 

.
r0 (t, xn )
and corresponding interpretations for column vectors like r(t, x̄), δ(t, x̄) etc.
The following result shows how the entire term structure is determined by the
benchmark forward rates.
Proposition 2.4 Assume that (21)-(22) is a minimal realization of the forward
rates, and assume furthermore that a maturity vector x̄ = (x1 , · · · , xn ) is chosen
as above. Then the following hold.
• With notation as above, the vector r(t, x̄) of benchmark forward rates has
the dynamics
dr(t, x̄) = T (x̄)AT −1 (x̄)r(t, x̄) + Ψ(t, x̄) dt + T (x̄)BdW (t), (35)
r(0, x̄) = r? (0, x̄),
12
where the deterministic function Ψ is given by
Ψ(t, x̄) =
∂r?
(0, tē + x̄) + D(tē + x̄) − T (x̄)AT −1 (x̄)δ(t, x̄)
∂x
Here ē ∈ Rn denotes the vector with unit components, i.e.


1,
 1, 


ē =  . 
 .. 
1
• The system of benchmark forward rates determine the entire forward rate
process according to the formula
r(t, x) = CeAx T −1 (x̄)r(t, x̄) − CeAx T −1 (x̄)δ(t, x̄) + δ(t, x).
(36)
• The correspondence between Z and r is given by
r0 (t, x̄) = T (x̄)Z(t)
(37)
Proof. See [4].
The conclusion is thus that the state variables of a minimal realization can be
interpreted as an affine transformation of a vector of benchmark forward rates.
2.6
Examples
In this section we will give some simple illustrations of the theory. Note the
handling of multiple roots of the matrix A, and the fact that the input noise
can have dimension smaller than the dimension of A.
Example 2.1 σ(x) = σe−ax
We consider a model driven by a one dimensional Wiener process, having the
forward rate volatility structure
σ(x) = σe−ax ,
where σ in the right hand side denotes a constant. (The reader will probably
recognize this example as the Hull-White model.) We start by determining the
McMillan degree D, and by Proposition 2.3 we have
D = dim(R),
13
where the space R is given by
dk
−ax
σe
; k≥0 .
R = span
dxk
It is obvious that R is one dimensional, and that it is spanned by the single
function e−ax . Thus the McMillan degree is given by D = 1. We now want
to apply Proposition 2.2 to find a realization, so we must factor the volatility
function. In this case this is easy, since we have the trivial factorization σ(x) =
1 · e−ax · σ. In the notation of Proposition 2.2 we thus have
C0
=
A =
B =
1,
−a,
σ.
A realization of the forward rates is thus given by
dZ(t)
r0 (t, x)
r(t, x)
= −aZ(t)dt + σdW (t),
= e−ax Z(t),
= r0 (t, x) + δ(t, x),
and since the state space in this realization is of dimension one, the realization
is minimal. We see that if a > 0 then the system is asymptotically stable.
We now go on to the interpretation of the state space, and since D = 1 we can
choose a single benchmark maturity. The canonical choice is of course x1 = 0,
i.e. we choose the instantaneous short rate R(t) as the state variable. In the
notation of Proposition 2.4 we then have
T (x̄) =
r(t, x̄) =
1,
R(t),
and we get rate dynamics
dR(t) = {Ψ(t, 0) − aR(t)} dt + σdW (t).
Thus we see that we have indeed the Hull-White extension of the Vasiček model.
Note however that we do not have to choose the benchmark maturity as x1 = 0.
We can in fact choose any fixed maturity, x1 and then use the corresponding
forward rate as benchmark. This will give us the dynamics
dr(t, x1 ) = {Ψ(t, x1 ) − ar(t, x1 )} dt + e−ax1 dW (t),
and now the entire forward rate curve will be determined by the x1 -rate according to formula (36).
Example 2.2 σ(x) = xe−ax
In this example we still have a single driving Wiener process, but the volatility
function is now “hump-shaped”.
14
By taking derivatives of σ(x) we immediately see, from Proposition 2.3 that R
is given by
R = span xe−ax , e−ax ,
so in this case D = 2, and we have a two dimensional minimal state space. In
order to obtain a realization we compute the transfer function K(s, x), which is
given by Lemma 2.3 as
i
h
K(s, x) = L (x + ·)e−a(x+·) (s).
An easy calculation gives us
K(s, x) =
sxe−ax + (1 + ax)e−ax
xe−ax
e−ax
=
+
,
(a + s)2
(a + s)
(a + s)2
and we now look for a realization of this transfer function (for a fixed x). The
obvious thing to do is to use the standard controllable realization (see [7]), and
we obtain
C(x) = xe−ax , (1 + ax)e−ax
−2a −a2
,
A =
1
0
1
B =
.
0
Since D = 2 and this realization is two-dimensional we have a minimal realization, given by
dZ1 (t) =
dZ2 (t) =
r0 (t, x)
r(t, x)
=
=
−2aZ1 (t)dt − a2 Z2 (t)dt + dW (t),
Z1 (t)dt
xe−ax Z1 (t) + (1 + ax)e−ax Z2 (t)
r0 (t, x) + δ(t, x).
We have a double eigenvalue of the system matrix A at λ1 = −a, so if a > 0
the system is asymptotically stable.
2.7
Notes
This section is mainly based on [4]. The first paper to appear in this area was
to our knowledge the preprint [26], where the Musiela parameterization and the
space R is discussed in some detail. See also the closely related and interesting
preprints [15] [16] and [32]. Because of the linear structure, the theory above
is closely connected to (and in a sense inverse to) the theory of affine term
structures developed in [13]. The standard reference on infinite dimensional
SDEs is [12], where one also can find a presentation of the connections between
control theory and infinite dimensional linear stochastic equations.
15
3
Invariant Manifolds
In this section we study when a given submanifold of forward rate curves is
invariant under the action of a given interest rate model. This problem is of
interest from an applied as well as from a theoretical point of view. In particular
we will use the results from this section to analyze problems about existence
of finite dimensional factor realizations for interest rate models on forward rate
form. Invariant manifolds are, however, also of interest in their own right, so we
begin by discussing a concrete problem which naturally leads to the invariance
concept.
3.1
Parameter Recalibration
A standard procedure when dealing with concrete interest rate models on a high
frequency (say, daily) basis can be described as follows:
1. At time t = 0, use market data to fit (calibrate) the model to the observed
bond prices.
2. Use the calibrated model to compute prices of various interest rate derivatives.
3. The following day (t = 1), repeat the procedure in 1. above in order to
recalibrate the model, etc..
To carry out the calibration in step 1. above, the analyst typically has to produce a forward rate curve {ro (0, x); x ≥ 0} from the observed data. However,
since only a finite number of bonds actually trade in the market, the data consist
of a discrete set of points, and a need to fit a curve to these points arises. This
curve-fitting may be done in a variety of ways. One way is to use splines, but also
a number of parameterized families of smooth forward rate curves have become
popular in applications—the most well-known probably being the Nelson-Siegel
(see [28]) family. Once the curve {ro (0, x); x ≥ 0} has been obtained, the parameters of the interest rate model may be calibrated to this.
Now, from a purely logical point of view, the recalibration procedure in step
3. above is of course slightly nonsensical: If the interest rate model at hand is
an exact picture of reality, then there should be no need to recalibrate. The
reason that everyone insists on recalibrating is of course that any model in fact
only is an approximate picture of the financial market under consideration, and
recalibration allows incorporating newly arrived information in the approximation. Even so, the calibration procedure itself ought to take into account that
it will be repeated. It appears that the optimal way to do so would involve
a combination of time series and cross-section data, as opposed to the purely
cross-sectional curve-fitting, where the information contained in previous curves
is discarded in each recalibration. .
16
The cross-sectional fitting of a forward curve and the repeated recalibration is
thus, in a sense, a pragmatic and somewhat non-theoretical endeavour. Nonetheless, there are some nontrivial theoretical problems to be dealt with in this context, and the problem to be studied in this section concerns the consistency
between, on the one hand, the dynamics of a given interest rate model, and, on
the other hand, the forward curve family employed.
What, then, is meant by consistency in this context? Assume that a given interest rate model M (e.g. the Hull–White model) in fact is an exact picture of
the financial market. Now consider a particular family G of forward rate curves
(e.g. the Nelson-Siegel family) and assume that the interest rate model is calibrated using this family. We then say that the pair (M, G) is consistent (or,
that M and G are consistent) if all forward curves which may be produced by
the interest rate model M are contained within the family G. Otherwise, the
pair (M, G) is inconsistent.
Thus, if M and G are consistent, then the interest rate model actually produces
forward curves which belong to the relevant family. In contrast, if M and G are
inconsistent, then the interest rate model will produce forward curves outside
the family used in the calibration step, and this will force the analyst to change
the model parameters all the time—not because the model is an approximation
to reality, but simply because the family does not go well with the model.
Put into more operational terms this can be rephrased as follows.
• Suppose that you are using a fixed interest rate model M. If you want
to do recalibration, then your family G of forward rate curves should be
chosen is such a way as to be consistent with the model M.
Note however that the argument also can be run backwards, yielding the following conclusion for empirical work.
• Suppose that a particular forward curve family G has been observed to
provide a good fit, on a day-to-day basis, in a particular bond market.
Then this gives you modelling information about the choice of an interest
rate model in the sense that you should try to use/construct an interest
rate model which is consistent with the family G.
We now have a number of natural problems to study.
I Given an interest rate model M and a family of forward curves G, what are
necessary and sufficient conditions for consistency?
II Take as given a specific family G of forward curves (e.g. the Nelson-Siegel
family). Does there exist any interest rate model M which is consistent
with G?
17
III Take as given a specific interest rate model M (e.g. the Hull-White model).
Does there exist any finitely parameterized family of forward curves G
which is consistent with M?
In this section we will mainly address problem (I) above. Problem II has been
studied, for special cases, in [17], [18], whereas Problem III can be shown (see
Proposition 4.2) to be equivalent to the problem of finding a finite dimensional
factor realization of the model M and we provide a farily complete solution in
Section 4.
3.2
Invariant Manifolds
We now move on to give precise mathematical definition of the consistency
property discussed above, and this leads us to the concept of an invariant
manifold.
Definition 3.1 (Invariant manifold) Take as given the forward rate process
dynamics (2). Consider also a fixed family (manifold) of forward rate curves
G. We say that G is locally invariant under the action of r if, for each point
(s, r) ∈ R+ × G, the condition rs ∈ G implies that rt ∈ G, on a time interval
with positive length. If r stays forever on G, we say that G is globally invariant.
The purpose of this section is to characterize invariance in terms of local characteristics of G and M, and in this context local invariance is the best one can
hope for. In order to save space, local invariance will therefore be referred to as
invariance.
To get some intuitive feeling for the invariance concepts one can consider the
following two-dimensional deterministic system
dy1
dt
dy2
dt
=
y2 ,
=
−y1 .
For this system it is obvious that the unit circle C = (y1 , y2 ) : y12 + y22 = 1
is globally invariant, i.e. if we start the
on C.
system on C it will stay forever
The ‘upper half’ of the circle, Cu = (y1 , y2 ) : y12 + y22 = 1, y2 > 0 , is on the
other hand only locally invariant, since the system will leave Cu at the point
(1, 0). This geometric situation is in fact the generic one also for our infinite
dimensional stochastic case. The forward rate trajectory will never leave a locally
invariant manifold at a point in the relative interior of the manifold. Exit from
the manifold can only take place at the relative boundary points. We have no
general method for determining whether a locally invariant manifold also is
globally invariant or not. Problems of this kind has to be solved separately for
each particular case.
18
3.3
3.3.1
The Formalized Problem
The Space
As our basic space of forward rate curves we will use a weighted Sobolev space,
where a generic point will be denoted by r.
Definition 3.2 Consider a fixed real number γ > 0. The space Hγ is defined
as the space of all differentiable (in the distributional sense) functions
r : R+ → R
satisfying the norm condition krkγ < ∞. Here the norm is defined as
2
krkγ =
Z
∞
r2 (x)e−γx dx +
0
Z
∞
0
dr
(x)
dx
2
e−γx dx.
Remark 3.1 The variable x is as before interpreted as time to maturity. With
the inner product
Z ∞
Z ∞
dq
dr
(x)
(x) e−γx dx,
r(x)q(x)e−ax dx +
(r, q) =
dx
dx
0
0
the space Hγ becomes a Hilbert space. Because of the exponential weighting
function all constant forward rate curves will belong to the space. In the sequel
we will suppress the subindex γ, writing H instead of Hγ .
3.3.2
The Forward Curve Manifold
We consider as given a mapping
G : Z → H,
(38)
where the parameter space Z is an open connected subset of Rd , i.e. for each
parameter value z ∈ Z ⊆ Rd we have a curve G(z) ∈ H. The value of this curve
at the point x ∈ R+ will be written as G(z, x), so we see that G can also be
viewed as a mapping
(39)
G : Z × R+ → R.
The mapping G is thus a formalization of the idea of a finitely parameterized
family of forward rate curves, and we now define the forward curve manifold as
the set of all forward rate curves produced by this family.
Definition 3.3 The forward curve manifold G ⊆ H is defined as
G = Im (G)
19
3.3.3
The Interest Rate Model
We take as given a volatility function σ of the form
σ : H × R+ → Rm
i.e. σ(r, x) is a functional of the infinite dimensional r-variable, and a function
of the real variable x. Denoting the forward rate curve at time t by rt we then
have the following forward rate equation.
Z x
∂
?
rt (x) + σ(rt , x)
σ(rt , u) du dt + σ(rt , x)dWt .
(40)
drt (x) =
∂x
0
Remark 3.2 For notational simplicity we have assumed that the r-dynamics
are time homogenous. The case when σ is of the form σ(t, r, x) can be treated
in exactly the same way. See [3].
We need some regularity assumptions, and the main ones are as follows. See [3]
for technical details.
Assumption 3.1 We assume the following .
• The volatility mapping r 7−→ σ(r) is smooth.
• The mapping z 7−→ G(z) is a smooth imbedding, so in particular the
Frechet derivative G0z (z) is injective for all z ∈ Z.
• For every initial point r0 ∈ G, there exists a unique strong solution in H
of equation (40).
3.3.4
The Problem
Our main problem is the following.
• Suppose that we are given
– A volatility σ, specifying an interest rate model M as in (40)
– A mapping G, specifying a forward curve manifold G.
• Is G then invariant under the action of r?
3.4
The Invariance Conditions
In order to study the invariance problem we need to introduce some compact
notation.
20
Definition 3.4 We define Hσ by
Z
x
σ(r, s)ds
Hσ(r, x) =
0
Suppressing the x-variable, the Itô dynamics for the forward rates are thus given
by
∂
?
rt + σ(rt )Hσ(rt ) dt + σ(rt )dWt
(41)
drt =
∂x
and we write this more compactly as
drt = µ0 (rt )dt + σ(rt )dWt ,
(42)
where the drift µ0 is given by the bracket term in (41). To get some intuition
we now formally “divide by dt” and obtain
dr
= µ0 (rt ) + σ(rt )Ẇt ,
dt
(43)
where the formal time derivative Ẇt is interpreted as an “input signal” chosen
by chance. As in Section 2.3 we are thus led to study the associated deterministic
control system
dr
= µ0 (rt ) + σ(rt )ut .
(44)
dt
The intuitive idea is now that G is invariant under (42) if and only if G is invariant
under (44) for all choices of the input signal u. It is furthermore geometrically
obvious that this happens if and only if the velocity vector µ(r) + σ(r)u is
tangential to G for all points r ∈ G and all choices of u ∈ Rm . Since the tangent
space of G at a point G(z) is given by Im [G0z (z)], where G0z denotes the Frechet
derivative (Jacobian), we are led to conjecture that G is invariant if and only if
the condition
µ0 (r) + σ(r)u ∈ Im [G0z (z)]
is satisfied for all u ∈ Rm . This can also be written
µ0 (r)
σ(r)
∈ Im [G0z (z)] ,
∈ Im [G0z (z)] ,
where the last inclusion is interpreted componentwise for σ.
This “result” is, however, not correct due to the fact that the argument above
neglects the difference between ordinary calculus, which is used for (44), and
Itô calculus, which governs (42). In order to bridge this gap we have to rewrite
the analysis in terms of Stratonovich integrals instead of Itô integrals.
Definition 3.5 For given semimartingales
X and Y , the Stratonovich inteRt
gral of X with respect to Y , 0 X(s) ◦ dY (s), is defined as
Z t
Z t
1
Xs ◦ dYs =
Xs dYs + hX, Y it .
(45)
2
0
0
21
The first term on the RHS is the Itô integral. In the present case, with only
Wiener processes as driving noise, we can define the ‘quadratic variation process’
hX, Y i in (45) by
(46)
dhX, Y it = dXt dYt ,
with the usual ‘multiplication rules’ dW · dt = dt · dt = 0, dW · dW = dt. We
now recall the main result and raison d’être for the Stratonovich integral.
Proposition 3.1 (Chain rule) Assume that the function F (t, y) is smooth.
Then we have
∂F
∂F
(t, Yt )dt +
◦ dYt .
(47)
dF (t, Yt ) =
∂t
∂y
Thus, in the Stratonovich calculus, the Itô formula takes the form of the standard chain rule of ordinary calculus.
Returning to (42), the Stratonovich dynamics are given by
∂
1
?
rt + σ(rt )Hσ(rt ) dt − dhσ(rt ), Wt i
drt =
∂x
2
+ σ(rt ) ◦ dWt .
(48)
In order to compute the Stratonovich correction term above we use the infinite
dimensional Itô formula (see [12]) to obtain
dσ(rt ) = {. . .} dt + σr0 (rt )σ(rt )dWt ,
(49)
where σr0 denotes the Frechet derivative of σ w.r.t. the infinite dimensional rvariable. From this we immediately obtain
dhσ(rt ), Wt i = σr0 (rt )σ(rt )dt.
(50)
Remark 3.3 If the Wiener process W is multidimensional, then σ is a vector
σ = [σ1 , . . . , σm ], and the rhs of (50) should be interpreted as
σr0 (rt )σ(rt , x) =
m
X
0
σir
(rt )σi (rt )
i=1
Thus (48) becomes
drt
=
+
∂
1 0
?
rt + σ(rt )Hσ(rt ) − σr (rt )σ(rt ) dt
∂x
2
σ(rt ) ◦ dWt
(51)
We now write (51) as
drt = µ(rt )dt + σ(rt ) ◦ dWt
22
(52)
where
∂
r(x) + σ(rt , x)
µ(r, x) =
∂x
Z
x
0
σ(rt , u)? du −
1 0
[σ (rt )σ(rt )] (x).
2 r
(53)
Given the heuristics above, our main result is not surprising. The formal proof,
which is somewhat technical, is left out. See [3].
Theorem 3.1 (Main Theorem) The forward curve manifold G is locally invariant for the forward rate process r(t, x) in M if and only if,
1
G0x (z) + σ (r) Hσ (r)? − σr0 (r) σ (r)
2
σ (r)
∈ Im[G0z (z)] ,
∈
Im[G0z (z)]
,
(54)
(55)
hold for all z ∈ Z with r = G(z).
Here, G0z and G0x denote the Frechet derivative of G with respect to z and x,
respectively. The condition (55) is interpreted componentwise for σ. Condition
(54) is called the consistent drift condition, and (55) is called the consistent
volatility condition.
Remark 3.4 It is easily seen that if the family G is invariant under shifts in
the x-variable, then we will automatically have the relation
G0x (z) ∈ Im[G0z (z)],
so in this case the relation (54) can be replaced by
1
σ(r)Hσ(r)? − σr0 (r) σ (r) ∈ Im[G0z (z)],
2
with r = G(z) as usual.
3.5
Examples
The results above are extremely easy to apply in concrete situations. As a test
case we consider the Nelson–Siegel (see [28]) family of forward rate curves. We
analyze the consistency of this family with the Ho-Lee and Hull-White interest
rate models. It should be emphasised that these examples are chosen only in
order to illustrate the general methodology. For more examples and details, see
[3].
3.5.1
The Nelson-Siegel Family
The Nelson–Siegel (henceforth NS) forward curve manifold G is parameterized
by z ∈ R4 , the curve x 7−→ G(z, x) as
G(z, x) = z1 + z2 e−z4 x + z3 xe−z4 x .
23
(56)
For z4 6= 0, the Frechet derivatives are easily obtained as
G0z (z, x) = 1, e−z4 x , xe−z4 x , −(z2 + z3 x)xe−z4 x ,
G0x (z, x) = (z3 − z2 z4 − z3 z4 x)e−z4 x .
(57)
(58)
In order for the image of this map to be included in Hγ , we need to impose
the condition
z4 > −γ/2. In this
case, the natural parameter space is thus
Z = z ∈ R4 : z4 6= 0, z4 > −γ/2 . However, as we shall see below, the results
are uniform w.r.t. γ. Note that the mapping G indeed is smooth, and for z4 6= 0,
G and G0z are also injective.
In the degenerate case z4 = 0, we have
G(z, x) = z1 + z2 + z3 x ,
(59)
We return to this case below.
3.5.2
The Hull-White and Ho-Lee Models
As our test case, we analyze the Hull and White (henceforth HW) extension of
the Vasiček model. On short rate form the model is given by
dR(t) = {Φ(t) − aR(t)} dt + σdW (t),
(60)
where a, σ > 0. As is well known, the corresponding forward rate formulation
is
(61)
dr(t, x) = β(t, x)dt + σe−ax dWt .
Thus, the volatility function is given by σ(x) = σe−ax , and the conditions of
Theorem 3.1 become
G0x (z, x) +
σ 2 −ax
e
− e−2ax ∈
a
σe−ax ∈
Im[G0z (z, x)],
(62)
Im[G0z (z, x)].
(63)
To investigate whether the NS manifold is invariant under HW dynamics, we
start with (63) and fix a z-vector. We then look for constants (possibly depending
on z) A, B, C, and D, such that for all x ≥ 0 we have
σe−ax = A + Be−z4 x + Cxe−z4 x − D(z2 + z3 x)xe−z4 x .
(64)
This is possible if and only if z4 = a, and since (63) must hold for all choices
of z ∈ Z we immediately see that HW is inconsistent with the full NS manifold
(see also the Notes below).
Proposition 3.2 (Nelson-Siegel and Hull-White) The Hull-White model
is inconsistent with the NS family.
24
We have thus obtained a negative result for the HW model. The NS manifold is
‘too small’ for HW, in the sense that if the initial forward rate curve is on the
manifold, then the HW dynamics will force the term structure off the manifold
within an arbitrarily short period of time. For more positive results see [3].
Remark 3.5 It is an easy exercise to see that the minimal manifold which is
consistent with HW is given by
G(z, x) = z1 e−ax + z2 e−2ax .
In the same way, one may easily test the consistency between NS and the model
obtained by setting a = 0 in (60). This is the continuous time limit of the Ho and
Lee model [21], and is henceforth referred to as HL. Since we have a pedagogical
point to make, we give the results on consistency, which are as follows.
Proposition 3.3 (Nelson-Siegel and Ho-Lee)
(a) The full NS family is inconsistent with the Ho-Lee model.
(b) The degenerate family G(z, x) = z1 + z3 x is in fact consistent with Ho-Lee.
Remark 3.6 We see that the minimal invariant manifold provides information
about the model. From the result above, the HL model is closely tied to the class
of affine forward rate curves. Such curves are unrealistic from an economic point
of view, implying that the HL model is overly simplistic.
3.6
Notes
The section is based on [3]. As we very easily detected above, neither the HW
nor the HL model is consistent with the Nelson-Siegel famliy of forward rate
curves. A much more difficult problem is to determine whether any interest
rate model is. This is Problem II in Section 3.1 for the NS family, and it has
been solved recently (using different techniques) in [17], where it is shown that
no nontrivial Wiener driven model is consistent with NS. Thus, for a model
to be consistent with Nelson-Siegel, it must be deterministic. In [18] (which is
a technical tour de force) this result is extended to a much larger exponential
polynomial family than the NS family. In our presentation we have used strong
solutions of the infinite dimensional forward rate SDE. This is of course restrictive. The invariance problem for weak solutions has recently been studied in
[19]. An alternative way of studying invariance is by using some version of the
Stroock–Varadhan support theorem, and this line of thought is carried out in
depth in [32].
25
4
Existence of Nonlinear Realizations
We now turn to Problem 2 in Section 1.2, i.e. the problem when a given forward
rate model has a finite dimensional factor realization. For ease of exposition we
mostly confine ourselves to a discussion of the case of a single driving Wiener
process and to time invariant forward rate dynamics. Multidimensional Wiener
processes, and time varying systems can be treated similarly, and for completeness we state the results for the multidimensional case. We will use some ideas
and concepts from differential geometry, and a general reference here is [30].
The section is based on [5].
4.1
Setup
In order to study the realization problem we need (see Remark 4.1) a very
regular space to work in.
Definition 4.1 Consider a fixed real number γ > 0. The space Bγ is defined as
the space of all infinitely differentiable functions
r : R+ → R
satisfying the norm condition krkγ < ∞. Here the norm is defined as
krk2γ =
∞
X
2−n
n=0
Z
∞
0
dn r
(x)
dxn
2
e−γx dx.
Note that B is not a space of distributions, but a space of functions. As with H
we will often suppress the subindex γ. With the obvious inner product B is a
pre-Hilbert space, and in [5] the following result is proved.
Proposition 4.1 The space B is a Hilbert space, i.e. it is complete. Furthermore, every function in the space is fact real analytic, and can thus be uniquely
extended to a holomorphic function in the entire complex plane.
We now take as given a volatility σ : B → B and consider the induced forward
rate model (on Stratonovich form)
drt = µ(rt )dt + σ(rt ) ◦ dWt
(65)
where as before (see Section 3.4).
µ(r) =
1
∂
r + σ(r)Hσ(r)? − σr0 (r)σ(r)
∂x
2
We need some regularity assumptions.
26
(66)
Assumption 4.1 We assume that σ is chosen such that the following hold.
• The mapping σ is smooth.
• The mapping
1
r 7−→ σ(r)Hσ(r)? − σr0 (r)σ(r)
2
is a smooth map from B to B.
Remark 4.1 The reason for our choice of B as the underlying space, is that the
linear operator F = d/dx is bounded in this space. Together with the assumptions above, this implies that both µ and σ are smooth vector fields on B, thus
ensuring the existence of a strong local solution to the forward rate equation for
every initial point ro ∈ B.
4.2
The Geometric problem
Given a specification of the volatility mapping σ, and an initial forward rate
curve ro we now investigate when (and how) the corresponding forward rate
process possesses a finite, dimensional realization. We are thus looking for smooth
d-dimensional vector fields a and b, an initial point z0 ∈ Rd , and a mapping
G : Rd → B such that r, locally in time, has the representation
dZt
r(t, x)
= a (Zt ) dt + b (Zt ) dWt , Z0 = z0
= G (Zt , x) .
(67)
(68)
Remark 4.2 Let us clarify some points. Firstly, note that in principle it may
well happen that, given a specification of σ, the r-model has a finite dimensional
realization given a particular initial forward rate curve ro , while being infinite
dimensional for all other initial forward rate curves in a neighbourhood of ro .
We say that such a model is a non-generic or accidental finite dimensional
model. If, on the other hand, r has a finite dimensional realization for all initial
points in a neighbourhood of ro , then we say that the model is a generically
finite dimensional model. In this text we are solely concerned with the generic
problem. Secondly, let us emphasise that we are looking for local (in time)
realizations.
We can now connect the realization problem to our studies of invariant manifolds.
Proposition 4.2 The forward rate process possesses a finite dimensional realization if and only if there exists an invariant finite dimensional submanifold G
with ro ∈ G.
27
Proof. See [3] for the full proof. The intuitive argument runs as follows. Suppose
that there exists a finite dimensional invariant manifold G with ro ∈ G. Then
G has a local coordinate system, and we may define the Z process as the local
coordinate process for the r-process. On the other hand it is clear that if r has
a finite dimensional realization as in (67)-(68), then every forward rate curve
that will be produced by the model is of the form x 7−→ G(z, x) for some choice
of z. Thus there exists a finite dimensional invariant submanifold G containing
the initial forward rate curve ro , namely G = ImG.
Using Theorem 3.1 we immediately obtain the following geometric characterisation of the existence of a finite realization.
Corollary 4.1 The forward rate process possesses a finite dimensional realization if and only if there exists a finite dimensional manifold G containing ro ,
such that, for each r ∈ G the following conditions hold.
µ(r)
σ(r)
∈ TG (r),
∈ TG (r).
Here TG (r) denotes the tangent space to G at the point r, and the vector fields
µ and σ are as above.
4.3
The Main Result
Given the volatility vector field σ, and hence also the field µ, we now are faced
with the problem of determining if there exists a finite dimensional manifold G
with the property that µ and σ are tangential to G at each point of G. In the
case when the underlying space is finite dimensional, this is a standard problem
in differential geometry, and we will now give the heuristics.
To get some intuition we start with a simpler problem and therefore consider the
space B (or any other Hilbert space), and a smooth vector field f on the space.
For each fixed point ro ∈ B we now ask if there exists a finite dimensional
manifold G with ro ∈ G such that f is tangential to G at every point. The
answer to this question is yes, and the manifold can in fact be chosen to be
one-dimensional. To see this, consider the infinite dimensional ODE
drt
dt
r0
= f (rt ),
o
= r .
(69)
(70)
If rt is the solution, at time t, of this ODE, we use the notation
rt = ef t ro .
ft
:
t
∈
R
, and we note that the
We have thus defined
a
group
of
operators
e
ft o
set e r : t ∈ R ⊆ B is nothing else than the integral curve of the vector field
28
f , passing through ro . If we define G as this integral curve, then our problem is
solved, since f will be tangential to G by construction.
Let us now take two vector fields f1 and f2 as given, where the reader informally
can think of f1 as σ and f2 as µ. We also fix an initial point ro ∈ B and the
question is if there exists a finite dimensional manifold G, containing ro , with
the property that f1 and f2 are both tangential to G at each point of G. We
call such a manifold an tangential manifold for the vector fields. At a first
glance it would seem that there always exists an tangential manifold, and that it
can even be chosen to be two-dimensional. The geometric idea
is that we start
at ro and let f1 generate the integral curve ef1 s ro : s ≥ 0 . For each point
ef1 s ro on this curve we now let f2 generate the integral curve starting at that
point. This gives us the object ef2 t ef1 s ro and thus it seems that we sweep out a
two dimensional surface G in B. This is our obvious candidate for an tangential
manifold.
In the general case this idea will, however, not work, and the basic problem is as
follows. In the construction above we started with the integral curve generated
by f1 and then applied f2 , and there is of course no guarantee that we will
obtain the same surface if we start with f2 and then apply f1 . We thus have
some sort of commutativity problem, and the key concept is the Lie bracket.
Definition 4.2 Given smooth vector fields f and g on B, the Lie bracket [f, g]
is a new vector field defined by
[f, g] (r) = f 0 (r)g(r) − g 0 (r)f (r)
(71)
The Lie bracket measures the lack of commutativity on the infinitesimal scale
in our geometric program above, and for the procedure to work we need a
condition which says that the lack of commutativity is “small”. It turns out
that the relevant condition is that the Lie bracket should be in the linear hull
of the vector fields.
Definition 4.3 Let f1 , . . . , fn be smooth independent vector fields on some space
X. Such a system is called a distribution, and the distribution is said to be
involutive if
[fi , fj ] (x) ∈ span {f1 (x), . . . , fn (x)} , ∀i, j,
where the span is the linear hull over the real numbers.
We now have the following basic result, which extends a classic result from finite
dimensional differential geometry (see [30]).
Theorem 4.1 (Frobenius) Let f1 , . . . , fk and be independent smooth vector
fields in B and consider a fixed point ro ∈ B. Then the following statements are
equivalent.
29
• For each point r in a neighbourhood of ro , there exists a k-dimensional
tangential manifold passing through r.
• The system f1 , . . . , fk of vector fields is (locally) involutive.
Proof. See [5], which provides a self contained proof of the Frobenius Theorem
in Banach space.
Let us now go back to our interest rate model. We are thus given the vector fields
µ, σ, and an initial point ro , and the problem is whether there exists a finite
dimensional tangential manifold containing ro . Using the infinite dimensional
Frobenius theorem, this situation is now easily analyzed. If {µ, σ} is involutive
then there exists a two dimensional tangential manifold. If {µ, σ} is not involutive, this means that the Lie bracket [µ, σ] is not in the linear span of µ and σ,
so then we consider the system {µ, σ, [µ, σ]}. If this system is involutive there
exists a three dimensional tangential manifold. If it is not involutive at least one
of the brackets [µ, [µ, σ]], [σ, [µ, σ]] is not in the span of {µ, σ, [µ, σ]}, and we
then adjoin this (these) bracket(s). We continue in this way, forming brackets of
brackets, and adjoining these to the linear hull of the previously obtained vector
fields, until the point when the system of vector fields thus obtained actually is
closed under the Lie bracket operation.
Definition 4.4 Take the vector fields f1 , . . . , fk as given. The Lie algebra
generated by f1 , . . . , fk is the smallest linear space (over R) of vector fields
which contains f1 , . . . , fk and is closed under the Lie bracket. This Lie algebra
is denoted by
L = {f1 , . . . , fk }LA
The dimension of L is defined, for each point r ∈ B as
dim [L(r)] = dim span {f1 (r), . . . , fk (r)} .
Putting all these results together, we have the following main result on finite
dimensional realizations.
Theorem 4.2 (Main Result) Take the volatility mapping σ = (σ1 , . . . , σm )
as given. Then the forward rate model generated by σ generically admits a finite
dimensional realization if and only if
dim {µ, σ1 , . . . , σm }LA < ∞
in a neighbourhood of ro .
The result above thus provides a general solution to Problem II from Section
1.2. For any given specification of forward rate volatilities, the Lie algebra can
30
in principle be computed, and the dimension can be checked. Note, however,
that the theorem is a pure existence result. If, for example, the Lie algebra has
dimension five, then we know that there exists a five-dimensional realisation, but
the theorem does not directly tell us how to construct a concrete realization. This
is the subject of ongoing research. Note also that realizations are not unique,
since any diffeomorphic mapping of the factor space Rd onto itself will give a
new equivalent realization.
When computing the Lie algebra generated by µ and σ, the following observations are often useful.
Lemma 4.1 Take the vector fields f1 , . . . , fk as given. The Lie algebra L =
{f1 , . . . , fk }LA remains unchanged under the following operations.
• The vector field fi (r) may be replaced by α(r)fi (r), where α is any smooth
nonzero scalar field.
• The vector field fi (r) may be replaced by
X
αj (r)fj (r),
fi (r) +
j6=i
where αj is any smooth scalar field.
Proof. The first point is geometrically obvious, since multiplication by a scalar
field will only change the length of the vector field fi , and not its direction, and
thus not the tangential manifold. Formally it follows from the “Leibnitz rule”
[f, αg] = α [f, g] − (α0 f )g. The second point follows from the bilinear property
of the Lie bracket together with the fact that [f, f ] = 0.
4.4
Applications
In this section we give some simple applications of the theory developed above.
For more examples and results, see [5].
4.4.1
Constant Volatility
We start with the simplest case, which is when the volatility σ(r, x) is a constant
vector in B. We are thus back in the framework of Section 2, and we assume
for simplicity that we have only one driving Wiener process. Then we have no
Stratonovich correction term and the vector fields are given by
Z x
σ(s)ds,
µ(r, x) = Fr(x) + σ(x)
0
σ(r, x)
=
σ(x).
31
where as before F =
∂
∂x .
The Frechet derivatives are trivial in this case. Since F is linear (and bounded
in our space), and σ is constant as a function of r, we obtain
µ0r
σr0
=
F,
=
0.
Thus the Lie bracket [µ, σ] is given by
[µ, σ] = Fσ,
and in the same way we have
[µ, [µ, σ]] = F2 σ.
Continuing in the same manner it is easily seen that the relevant Lie algebra L
is given by
L = {µ, σ}LA = span µ, σ, Fσ, F2 σ, . . . = span {µ, Fn σ ; n = 0, 1, 2, . . .}
It is thus clear that L is finite dimensional (at each point r) if and only if the
function space
span {Fn σ ; n = 0, 1, 2, . . .}
is finite dimensional. We have thus obtained our old condition from Proposition
2.3 and we have the following result which extends Proposition 2.2 by in principle
allowing the realization to be non-linear.
Proposition 4.3 Under the above assumptions, there exists a finite dimensional realization if and only if σ is a quasi-exponential function.
4.4.2
Constant Direction Volatility
We go on to study the most natural extension of the deterministic volatility case
(still in the case of a scalar Wiener process) namely the case when the volatility
is of the form
σ(r, x) = ϕ(r)λ(x).
(72)
In this case the individual vector field σ has the constant direction λ ∈ H,
but is of varying length, determined by ϕ, where ϕ is allowed to be any smooth
functional of the entire forward rate curve. In order to avoid trivialities we make
the following assumption.
Assumption 4.2 We assume that ϕ(r) 6= 0 for all r ∈ H.
After a simple calculation the drift vector µ turns out to be
1
µ(r) = Fr + ϕ2 (r)D − ϕ0 (r)[λ]ϕ(r)λ,
2
32
(73)
where ϕ0 (r)[λ] denotes the Frechet derivative ϕ0 (r) acting on the vector λ, and
where the constant vector D ∈ H is given by
Z x
λ(s)ds.
D(x) = λ(x)
0
We now want to know under what conditions on ϕ and λ we have a finite
dimensional realization, i.e. when the Lie algebra generated by
µ(r)
=
σ(r)
=
1
Fr + ϕ2 (r)D − ϕ0 (r)[λ]ϕ(r)λ,
2
ϕ(r)λ,
is finite dimensional. Under Assumption 4.2 we can use Lemma 4.1, to see that
the Lie algebra is in fact generated by the simpler system of vector fields
f0 (r)
=
Fr + Φ(r)D,
f1 (r)
=
λ,
where we have used the notation
Φ(r) = ϕ2 (r).
Since the field f1 is constant, it has zero Frechet derivative. Thus the first Lie
bracket is easily computed as
[f0 , f1 ] (r) = Fλ + Φ0 (r)[λ]D.
The next bracket to compute is [[f0 , f1 ] , f1 ] which is given by
[[f0 , f1 ] , f1 ] = Φ00 (r)[λ; λ]D.
Note that Φ00 (r)[λ; λ] is the second order Frechet derivative of Φ operating on
the vector pair [λ; λ]. This pair is to be distinguished from (notice the semicolon)
the Lie bracket [λ, λ] (with a comma), which if course would be equal to zero.
We now make a further assumption.
Assumption 4.3 We assume that Φ00 (r)[λ; λ] 6= 0 for all r ∈ H.
Given this assumption we may again use Lemma 4.1 to see that the Lie algebra
is generated by the following vector fields
f0 (r)
=
Fr,
f1 (r)
f3 (r)
=
=
λ,
Fλ,
f4 (r)
=
D.
33
Of these vector fields, all but f0 are constant, so all brackets are easy. After
elementary calculations we see that in fact
{µ, σ}LA = span {Fr, Fn λ, Fn D; n = 0, 1, . . .} .
From this expression it follows immediately that a necessary condition for the Lie
algebra to be finite dimensional is that the vector space spanned by {Fn λ; n ≥ 0}
is finite dimensional. This occurs if and only if λ is quasi-exponential (see Remark 2.1). If, on the other hand, λ is quasi-exponential, then we know from
Lemma 2.1, that also D is quasi-exponential, since it is the integral of the QE
function λ multiplied by the QE function λ. Thus the space {Fn D; n = 0, 1, . . .}
is also finite dimensional, and we have proved the following result.
Proposition 4.4 Under Assumptions 4.2 and 4.3, the interest rate model with
volatility given by σ(r, x) = ϕ(r)λ(x) has a finite dimensional realization if and
only if λ is a quasi-exponential function. The scalar field ϕ is allowed to be any
smooth field.
4.4.3
When is the Short Rate a Markov Process?
One of the classical problems concerning the HJM approach to interest rate
modelling is that of determining when a given forward rate model is realized
by a short rate model, i.e. when the short rate is Markovian. We now briefly
indicate how the theory developed above can be used in order to analyze this
question. For the full theory see [5].
Using the results above, we immediately have the following general necessary
condition.
Proposition 4.5 The forward rate model generated by σ is a generic short rate
model, i.e the short rate is generically a Markov process, only if
dim {µ, σ}LA ≤ 2
(74)
Proof. If the model is really a short rate model, then bond prices are given as
p(t, x) = F (t, Rt , x) where F solves the term structure PDE. Thus bond prices,
and forward rates are generated by a two dimensional factor model with time t
and the short rate R as the state variables.
Remark 4.3 The most natural case is dim {µ, σ}LA = 2. It is an open problem whether there exists a non-deterministic generic short rate model with
dim {µ, σ}LA = 1.
34
Note that condition (74) is only a sufficient condition for the existence of a short
rate realization. It guarantees that there exists a two-dimensional realization,
but the question remains whether the realization can chosen in such a way
that the short rate and running time are the state variables. This question is
completely resolved by the following central result.
Theorem 4.3 Assume that the model is not deterministic, and take as given a
time invariant volatility σ(r, x). Then there exists a short rate realization if and
only if the vector fields [µ, σ] and σ are parallel, i.e. if and only if there exists a
scalar field α(r) such that the following relation holds (locally) for all r.
[µ, σ] (r) = α(r)σ(r).
(75)
Proof. See [5].
It turns out that the class of generic short rate models is very small indeed. We
have, in fact, the following result, which was first proved in [25] (using techniques
different from those above). See [5] for a proof based on Theorem 4.3.
Theorem 4.4 Consider a HJM model with one driving Wiener process and a
volatility structure of the form
σ(r, x) = g(R, x).
where R = r(0) is the short rate. Then the model is a generic short rate model
if and only if g has one of the following forms.
• There exists a constant c such that
g(R, x) ≡ c.
• There exist constants a and c such that.
g(R, x) = ce−ax .
• There exist constants a and b, and a function α(x), where α satisfies a
certain Riccati equation, such that
√
g(R, x) = α(x) aR + b
We immediately recognize these cases as the Ho-Lee model, the Hull-White extended Vasiček model, and the Hull-White extended Cox-Ingersoll-Ross model.
Thus, in this sense the only generic short rate models are the affine ones, and the
moral of this, perhaps somewhat surprising, result is that most short rate models considered in the literature are not generic but “accidental”. To understand
the geometric picture one can think of the following program.
35
1. Choose an arbitrary short rate model, say of the form
dRt = a(Rt )dt + b(Rt )dWt
with a fixed initial point R0 .
2. Solve the associated PDE in order to compute bond prices. This will also
produce:
• An initial forward rate curve r̂o (x).
• Forward rate volatilities of the form g(R, x).
3. Forget about the underlying short rate model, and take the forward rate
volatility structure g(R, x) as given in the forward rate equation.
4. Initiate the forward rate equation with an arbitrary initial forward rate
curve ro (x)
The question is now whether the thus constructed forward rate model will produce a Markovian short rate process. Obviously, if you choose the initial forward
rate curve ro as ro = r̂o , then you are back where you started, and everything is
OK. If, however, you choose another initial forward rate curve than r̂o , say the
observed forward rate curve of today, then it is no longer clear that the short
rate will be Markovian. What the theorem above says, is that only the models
listed above will produce a Markovian short rate model for all initial points in
a neighbourhood of r̂o . If you take another model (like, say, the Dothan model)
then a generic choice of the initial forward rate curve will produce a short rate
process which is not Markovian.
4.5
Notes
The section is based on [5] where full proofs and further results can be found,
and where also the time varying case is considered. In our study of the constant
direction model above, ϕ was allowed to be any smooth functional of the entire
forward rate curve. The simpler special case when ϕ is a point evaluation of the
short rate, i.e. of the form ϕ(r) = h(r(0)) has been studied in [1], [23] and [29].
All these cases falls within our present framwork and, the results are included
as special cases of the general theory above. A different case, treated in [10], occurs when σ is a finite point evaluation, i.e. when σ(t, r) = h(t, r(x1 ), . . . r(xk ))
for fixed benchmark maturities x1 , . . . , xk . In [10] it is studied when the corresponding finite set of benchmark forward rates is Markovian.
A classic paper on Markovian short rates is [9], where a deterministic volatility
of the form σ(t, x) is considered. Theorem 4.4 was first stated and proved in
[25]. See [14] for an example with a driving Levy process.
The geometric ideas presented above and in [5] are intimately connected to controllability problems in systems theory, where they have been used extensively
36
(see [24]). They have also been used in filtering theory, where the problem is
to find a finite dimensional realization of the unnormalized conditional density
process, the evolution of which is given by the Zakai equation. See [8] for an
overview of these areas.
References
[1] Bhar, R. & Chiarella, C. (1997) Transformation of Heath-Jarrow-Morton
models to markovian systems. European Journal of Finance, 3, No. 1, 1-26.
[2] Björk, T. (1997). Interest Rate Theory. In W. Runggaldier (ed.), Financial
Mathematics. Springer Lecture Notes in Mathematics, Vol. 1656. Springer
Verlag, Berlin.
[3] Björk, T. & Christensen, B.J. (1999) Interest rate dynamics and consistent
forward rate curves. Mathematical Finance, 9, No. 4, 323-348.
[4] Björk, T. & Gombani A. (1999) Minimal realization of interest rate models.
Finance and Stochastics, 3, No. 4, 413-432.
[5] Björk, T. & Svensson, L. (1999) On the existence of finite dimensional
nonlinear realizations of interest rate models. Working paper, Stockholm
School of Economics.
[6] Brace, A. & Musiela M. (1994) A multi factor Gauss Markov implementation of Heath Jarrow and Morton. Mathematical Finance 4, Vol.3, 563-576.
[7] Brockett, R.W. (1970) Finite Dimensional Linear Systems. Wiley.
[8] Brockett, R.W. (1981) Nonlinear systems and nonlinear estimation theory.
In Stochastic systems: The mathematics of filtering and identification and
applications (eds. Hazewinkel, M & Willems, J.C.) Reidel.
[9] Carverhill, A. (1994) When is the spot rate Markovian? Mathematical Finance, 4, 305-312.
[10] Chiarella, C & Kwon, K. (1998) Forward rate dependent Markovian transformations of the Heath-Jarrow-Morton term structure model. Working
paper. School of Finance and Economics, University of Technology, Sydney.
[11] Cox, J., Ingersoll, J. & Ross, S. (1985b). A Theory of the Term Structure
of Interest Rates. Econometrica 53, 385–408.
[12] Da Prato, G. & Zabczyk, J.(1992) Stochastic Equations in Infinite Dimensions. Cambridge University Press.
[13] Duffie, D. & Kan, R (1996) A yield factor model of interest rates. Mathematical Finance, 6, 379-406.
37
[14] Eberlein, E. & Raible, S. (1999) Term structure models driven by general
Levy processes. Mathematical Finance, 9, 31-53.
[15] El Karoui, N. & Lacoste, V (1993) Multifactor models of the term structure
of interest rates . Preprint
[16] El Karoui, N. & Geman, H. & Lacoste, V (1997) On the role of state
variables in interest rate models. Preprint
[17] Filipović, D. (1998a): A Note on the Nelson-Siegel family. Mathematical
Finance, 9, No. 4, 349-359.
[18] Filipović, D. (1998b): Exponential-polynomial families and the term structure of interest rates. To appear in Bernoulli.
[19] Filipović, D. (1999): Invariant manifolds for weak solutions of stochastic
equations. To appear in Probability Theory and Related Fields.
[20] Heath, D. & Jarrow, R. & Morton, A. (1992) Bond pricing and the term
structure of interest rates. Econometrica 60 No.1, 77-106.
[21] Ho, T. & Lee, S. (1986) Term structure movements and pricing interest
rate contingent claims. Journal of Finance 41, 1011-1029.
[22] Hull, J & White, A. (1990) Pricing interest-rate-derivative securities. The
Review of Financial Studies 3, 573-592.
[23] Inui, K. & Kijima, M. (1998) A markovian framework in multi-factor HeathJarrow-Morton models. JFQA 333 no. 3, 423-440.
[24] Isidori, A. (1989) Nonlinear Control Systems. Springer-Verlag, Berlin.
[25] Jeffrey, A. (1995) Single factor Heath-Jarrow-Morton term structure models
based on Markovian spot interest rates. JFQA 30 no.4, 619-642.
[26] Musiela M. (1993) Stochastic PDE:s and term structure models. Preprint.
[27] Musiela, M. & Rutkowski, M. (1997). Martingale Methods in Financial
Modeling. Springer Verlag, Berlin Heidelberg New York.
[28] Nelson, C. & Siegel, A. (1987): Parsimonious modelling of yield curves.
Journal of Business, 60, 473-489.
[29] Ritchken, P. & Sankarasubramanian, L. (1995) Volatility structures of forward rates and the dynamics of the term structure. mathematical Finance,
5, no. 1, 55-72.
[30] Warner, F.W. (1979) Foundations of differentiable manifolds and Lie
groups. Scott, Foresman, Hill.
[31] Vasic̆ek, O. (1977) An equilibrium characterization of the term structure.
Journal of Financial Economics 5, 177-188.
38
[32] Zabczyk, J.(1992) Stochastic invariance and conistency of financial models.
Preprint. Scuola Normale Superiore, Pisa.
39