A generator for risk neutral scenarios of credit risky portfolios
by
T.D. Zelders (644335)
B.Sc. Tilburg University 2009
A thesis submitted in partial fulfillment of the requirements for
the degree of Master of Science in Quantitative Finance and Actuarial Sciences
Faculty of Economics and Business Administration
Tilburg University
Academic supervisor: dr.ir. G.W.P. Charlier
Company supervisor: dr. A. Boer
Second reader: prof.dr. J.M. Schumacher
Date: March 17, 2010
Abstract
This thesis presents a simulation framework to generate risk neutral scenarios for the value of defaultable bond portfolios. We use a generalized continuous time Markov model for rating transition probabilities, including default, combined with a Black-Scholes model for stock prices and a two-factor Hull-White model for interest rates. Transition probabilities depend on two economic state variables: (1) the short rate, and (2) the instantaneous return on stocks. Hence the default time is doubly stochastic and follows a Cox process. We contribute to the literature in several ways. The main contribution is a closed form solution of the defaultable bond price in this modeling framework, which also allows for positive fractional recovery of market value. We further suggest a calibration procedure and define a complete simulation framework with which we actually simulate risk neutral scenarios of defaultable bond portfolio returns. Correlations between issuers in the portfolio are modeled implicitly through the economic dependence of the transition probabilities, but we also suggest a way to simulate direct correlations between issuers. Future work can be done on calibrating to the complete term structure of spreads and on extending the model to other credit risky products such as CDOs.
"In the last 50 years, the ten most extreme days in the financial markets represent half the returns."

Nassim Nicholas Taleb
Acknowledgements
First I would like to thank Alex Boer of Ortec Finance for all the effort he made and time
he spent in order for me to succeed. I could not have wished for a better and more dedicated
company supervisor and I really enjoyed working together. Thank you Alex for your great
support and help with this challenging thesis, I can truly say I learned a lot from you. Next
I want to thank my academic supervisor Erwin Charlier for his useful comments and fruitful
discussions, I always enjoyed our meetings. I am also grateful to David van Bragt for giving me
the opportunity to write my thesis at Ortec Finance and for this interesting topic. During my
stay at Ortec Finance I also had the opportunity to work as a student assistant at the pension risk management department. I have really enjoyed this and want to thank the whole team, but especially Chantal de Groot and Linda Hooft, for being wonderful colleagues and for having faith in me to take on some responsible tasks. I wish you all the best in your work and personal lives. I
am also very grateful to my moms Yolande, Elisabeth, and Iene for their support and how they
raised me to the man I am now. Last but certainly not least I want to thank my girlfriend Irma
for giving me the support and love I sometimes so desperately needed and for keeping up with
the crazy hours of the last couple of weeks. I love you and we are going to be very happy in
Amsterdam.
Contents
1 Introduction
  1.1 Thesis Objective
  1.2 Outline of this thesis
2 Credit risk models and Embedded options
  2.1 Structural form models
  2.2 Reduced form models
  2.3 Hybrid form models
  2.4 Credit model in PALM
  2.5 Credit risk and embedded options
3 Economic framework: The Black-Scholes two-factor Hull-White model
  3.1 The one-factor Hull-White model
  3.2 The two-factor Hull-White model
  3.3 Black-Scholes and Two-factor Hull-White
  3.4 Distributional properties
4 Intensity based Modeling
  4.1 Constant default intensity
  4.2 Deterministic Time-Varying default Intensity
  4.3 Cox process: Stochastic default intensity
  4.4 Rating based Transition intensities: A continuous Markov chain
  4.5 Lando's model: A rating based Cox process
  4.6 Finding a historical generator matrix
5 Final Model
  5.1 Model description
  5.2 Assessment of the transition model
  5.3 Analysis of the Constant Eigenvector Hypothesis
  5.4 Calibration
  5.5 The current term structure of U.S. interest rates
  5.6 The current term structure of U.S. credit risky yields
6 Bond pricing
  6.1 Risk neutral pricing of a non-defaultable bond
  6.2 Risk neutral pricing of a defaultable bond
  6.3 Pricing of a defaultable bond with an intensity model
  6.4 Analytic bond price
  6.5 Model implied credit spreads
  6.6 Checking bond price by means of Monte Carlo
7 Portfolio Simulation
  7.1 Simulation Framework
  7.2 Simulation Results
8 Conclusion
A Definitions
B Proofs
  B.1 Lemmas
  B.2 Proof theorem 1
  B.3 Proof lemma 1
  B.4 Proof of Analytic defaultable bond price
C A Linear Programming Stripping Procedure
D Credit risk model in PALM
  D.1 Z-scores: mapping transition probabilities to the normal distribution
  D.2 Relating the business cycle to the credit cycle: macro-economic dependence of the z-score deviations
  D.3 Valuation and Pricing of Defaultable Bonds in PALM
  D.4 A simple example of the PALM model
  D.5 Simulation
1 Introduction
Life insurance contracts often include several embedded options, such as minimum return guarantees and bonus options, in most cases with one of the asset portfolios of the insurance company as underlying. The awareness among regulators and insurance companies of the possible impact of these options was raised when, in 2000, the firm Equitable Life Assurance Society came very close to bankruptcy because it failed to properly manage the risks of the implicit options embedded in its contracts. These options are liabilities on the balance sheet of the insurer and need to be valued at market value.1 Valuation of these claims is not straightforward because they often have complex structures and sometimes concern non-traded assets.2 In practice it is common to assume no arbitrage and use risk neutral valuation techniques to obtain the fair value of these embedded options. In order to do so, risk neutral scenarios of the underlying value of the embedded options are required. Since this often concerns an asset portfolio with credit risky or defaultable bonds, risk neutral scenarios of the value of the defaultable bond portfolio at the maturity date are also needed. This thesis presents a simulation framework to generate these risk neutral scenarios for the value of defaultable bond portfolios.
The uncertainty about whether or not a bond contract will be honored is referred to as default risk. If the underlying firm defaults, the investor loses its investment or receives only part of it, the so-called recovery. But credit risk also concerns a drop in credit quality, for instance through a change in credit rating, which also results in a different price of an asset. This is often referred to as migration or downgrade risk. Because credit risk is only partially idiosyncratic, the expected return of a defaultable bond is higher than that of a treasury bond, which in most of the literature is assumed to be default free. The credit spread, the difference in yield between defaultable and default free bonds, is therefore in expectation always positive because it represents the risk premium for taking credit risk. A frequently used approach for the valuation of defaultable bonds is to model the spread and the term structure of interest rates separately and apply risk neutral valuation. This approach however does not allow bonds to actually default, so it is insufficient if we would like to simulate future values of defaultable bond portfolios.
ALS-life, the model used by the Insurance Consultancy department of Ortec Finance, uses a combination of the Black-Scholes model for stock prices and the two-factor Hull-White model for the term structure of interest rates as its economic framework. Valuation of embedded options is done using risk neutral simulations generated from the Black-Scholes and Hull-White model (BSHW) and a Monte Carlo estimate of the option price. Currently ALS-life simply imposes a normally distributed credit spread on defaultable bond portfolios. Obviously, this does not give a good fit to what we actually see in the market, because defaultable bond returns are not normally distributed: they can actually default and have little upside potential. Therefore Ortec Finance needs a risk neutral scenario generator for defaultable bond portfolios. It has to be a modified version of, and therefore consistent with, the credit model already in use by the Pension Risk Management (PRM) department of Ortec Finance.
PRM analyzes future scenarios of the ratio between a fund's assets and its liabilities, the so-called
1 This is due to new accounting standards such as the International Financial Reporting Standards (IFRS) and the new Financial Assessment Framework (FTK).
2 Arbitrage opportunities can persist for non-traded assets because arbitrageurs cannot trade them.
funding ratio. So it is very important to have an idea about future asset or portfolio returns in order to measure and manage the risk of being underfunded. PRM does this using the Pension Asset Liability Model (PALM), developed by Ortec Finance. Real world economic
scenarios are generated using a VAR(1) model, and include key indices of the financial market
such as the term structure of interest rate and MSCI indices. In order to simulate scenarios
for future returns of portfolios with defaultable bonds, PALM currently uses a reduced-form
model based on credit ratings. So not only default risk but also migration risk is modeled. The
transition matrix, which denotes the expected probabilities to go from one rating to another or to
default within one time unit, depends in this model on the state of the economy and the historic
average transition probabilities. For valuation these real world transition matrices are mapped
to risk neutral transition matrices assuming a constant ’risk premium’, which is calibrated using
current market prices. Another feature of the model is that it allows for positive recovery, which
is assumed to be a constant rate in time. An important issue to mention is that this is actually
a model for simulating returns of a single bond. Dependence structures between bonds within
a portfolio are only modeled implicitly by the dependence of the transition probabilities on
the business cycle. In credit portfolio modeling it is more common to model the dependence
structures explicitly, in order to capture the concentration risk: the risk that due to a certain
dependence between bonds, bonds default or downgrade as a direct consequence of the default of
another bond. This is especially important for financial institutions because they are often bond
issuers and investors in defaultable bonds at the same time. In the simulation framework we
develop in this thesis it will be possible to take into account direct correlations between issuers.
1.1 Thesis Objective
The thesis objective is to develop a simulation framework which generates risk neutral scenarios of defaultable bond portfolios and has the following properties:
• It is consistent with the credit model in PALM, in the sense that it possesses all the good properties of the PALM model.
• Where possible it improves on the credit model in PALM, in the sense that the disadvantages of the PALM model are addressed.
• It is an extension of the economic framework of ALS-life: the Black-Scholes two-factor Hull-White model.
• It admits a closed form solution of the defaultable bond price, because otherwise the computational complexity becomes unacceptably high.
• It is an arbitrage free model, so the return on defaultable bond portfolios is on average equal to that of the money market account.
1.2 Outline of this thesis
The outline of this thesis is as follows. In section 2 we give an overview of the models available in the literature as well as a description of the credit risk model in PALM. The section
ends with an example of the valuation of an embedded option with a credit risky portfolio as
underlying. Section 3 presents the economic framework of ALS-life: the Black-Scholes two-factor Hull-White model. Our defaultable bond price depends on this economic framework in two ways: (1) cash flows are discounted with interest rates, and (2) transition probabilities depend on the state of the economy. In section 4 we elaborate on intensity models and the default model we eventually use. We begin with a simple model with constant default probabilities in time and conclude with our final default model. Section 5 describes our final model and here
we also explain a possible way of calibrating the model to the short spreads. In this section
we also estimate the current term structure of interest rates and U.S. credit risky yields. The
derivation of the closed form solution of a defaultable bond in our modeling framework can be
found in section 6, and in section 7 we define our simulation framework. There we also simulate risk neutral scenarios for the returns of example defaultable bond portfolios. Finally we give our conclusions and suggestions for future work in section 8.
2 Credit risk models and Embedded options
The current financial crisis painfully showed the importance of incorporating what are now called Black Swans3 in risk models. Even AAA rated structured securitized loans defaulted, contributing, among other things, to the downfall of Lehman Brothers. This showed once more that the default event
of a bond is a Black Swan and can have a destructive effect on the financial position of the bond
holder. Omitting default risk would seriously underestimate the value of some of the embedded
options of insurance companies.
What makes credit portfolio modeling analytically and practically difficult is that, whereas equity returns are more or less normally distributed, the return on a portfolio of defaultable products such as bonds is typically highly skewed and leptokurtic4, with a fat left tail compared to the normal distribution (see figure 1). This combination of high kurtosis and skewness is nowadays sometimes referred to in economics and finance as a Taleb distribution, after the writer of the book The Black Swan (Taleb 2007). Approximating it with a normal distribution would result in significant kurtosis and skewness risk.
Credit risky returns are skewed because of the high probability of a small gain and the limited upside potential. Note that the distribution of a normally distributed variable is completely described by its mean and standard deviation, which makes it relatively easy to work with. Even a skewed normal distribution can be described by three parameters, now including a skewness coefficient. But the fat left tail of the distribution, which is the result of the improbable but big event of default, is what really makes it challenging to model credit returns.
Figure 1: Comparison of the normal distribution, which is often used to model equity returns, to a typical distribution of a portfolio of credit returns, which is leptokurtic and negatively skewed (Source: JP Morgan)
In this section we will first give a brief overview of the existing credit model literature. The
main concepts and assumptions of the models will be discussed, but not described in detail. We
will follow with a short description of the credit risk model in PALM. The risk neutral model
in ALS-life should be consistent with the PALM model and should to a large extent have the
3 Black Swans are rare events with an extreme impact; not incorporating Black Swans in a model is like seeing the grass but missing the trees. For more details see (Taleb 2007).
4 A leptokurtic distribution is a distribution with positive excess kurtosis.
same properties and assumptions. Of course, if we can omit some simplifying assumptions or otherwise improve the model we should do so. This section is concluded with an illustrative example of the problem of pricing embedded options with a defaultable bond portfolio as underlying.
Basically, there are two classes of credit models, which for a long time have been seen as two completely different approaches: the structural (or classical) and the reduced form models. Structural models are based on the economic mechanism underlying the default event, i.e. a firm failing to meet its financial obligations, and model the timing of such an event explicitly. Reduced form models instead assume a certain process for migration or default and use this to estimate default and migration probabilities. We first start with a brief overview of the literature on the classical approach, then we elaborate more on the reduced form literature with special attention to rating class models, and we conclude with hybrid models which try to reconcile the structural and the reduced form approaches.
2.1 Structural form models
The classic model of Merton (Merton 1974) only considers default at maturity, since this would be the only time creditors can verify the asset value of the issuer. Models based on this assumption are therefore appropriately called default at maturity models, and they often also define default as the event that the market value of the total assets of the issuer is below the face value of the issued debt. Merton argues that in this case the equity of the firm can be seen as a call option on the total asset value of the issuer, with the face value of the debt or another barrier level as the strike. The payoff at maturity of this option is the maximum of zero and the so-called distance to default, the difference between an issuer's assets and the barrier level of default. Because this is a plain vanilla European option it can be priced with the Black-Scholes formula (Black and Scholes 1973). The price of the total debt then equals the asset value minus the value of the call option. Among others, (Delianedis and Geske 1999) showed that the Merton model has predictive power for rating transition probabilities and defaults. Important contributions to the default at maturity literature are in (Ingersoll 1977), (Geske 1977), (Merton 1977), and (Smith and Warner 1979). Later, (Jones, Mason, and Rosenfeld 1984) showed that the default at maturity models undervalue credit spreads because of their assumption that default only occurs at maturity5.
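To make the mechanics concrete, here is a minimal sketch of the default at maturity valuation in Python. The function name and the example inputs are our own illustrative choices, not part of the original model specification: equity is priced as a Black-Scholes call on the assets, risky debt is the asset value minus equity, and the credit spread follows from the discount at which the risky debt trades.

```python
import math
from statistics import NormalDist

def merton_debt(V, K, r, sigma, T):
    """Merton (1974) sketch: equity is a European call on the firm's
    assets V with strike K (the face value of the debt).
    Risky debt is the part of the firm value not claimed by shareholders."""
    N = NormalDist().cdf
    d1 = (math.log(V / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    equity = V * N(d1) - K * math.exp(-r * T) * N(d2)   # Black-Scholes call
    debt = V - equity
    pd = N(-d2)                             # risk-neutral P(V_T < K)
    spread = -math.log(debt / K) / T - r    # risky yield minus risk-free rate
    return debt, pd, spread
```

For instance, with assets worth 120, debt face value 100, r = 5%, asset volatility 20% and one year to maturity, the risky debt is worth less than the default free price 100·e^(-0.05), so the implied credit spread is positive.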
Already in 1976, (Black and Cox 1976) relaxed this assumption and introduced a new class within the structural models, that of first-passage models, which in mathematical terms interpret default as a first stopping time problem. In this type of model default occurs at the first time assets fall below a certain default boundary, so default can occur at any time between initiation and maturity instead of only at maturity. Main papers here are (Briys and De Varenne 1997), (Ericsson and Reneby 1998), (Wang 1998), (Kim, Ramaswamy, and Sundaresan 1993), (Nielsen, Saà-Requejo, and Santa-Clara 1993), (Longstaff and Schwartz 1995), and (Saa-Requejo and Santa-Clara 1997). A non-parametric structural model to estimate default probabilities using the market prices of option contracts on equity is described in (Capuano 2008). This model uses an important corporate finance theorem stating that the distance to default option is equal to the equity value of the firm.6 Therefore, an equity option can be seen as an option on the distance to default option described earlier.

5 Think of the difference in price between a European and an American option.
6 The value of a firm is equal to the sum of debt and equity. Equity is junior to debt, so at default debt holders are paid first and then the shareholders get the remains. Hence E = max(0, V − D).
The Merton model lies at the base of the KMV model (Kealhofer 1995), which is widely used in practice. This model empirically estimates default probabilities with a measure called the expected default frequency (EDF) and relies heavily on the concept of distance to default.
The models discussed so far only assess the probability of default of a single issuer. Care is needed when assessing the default risk of portfolios, because in that case the dependence structure of defaults is also important. The workhorse of this kind of model is the Gaussian copula model introduced by (Vasicek 1987). This is one of the first models to generate the joint distribution of stopping times, using the Gaussian copula to describe the dependence structure. (Li 2000) looks at default as a survival problem, as originally studied in actuarial theory, and extends this model by allowing other copulas to generate the joint distribution of what he calls survival times.
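The copula mechanism can be illustrated with a small Monte Carlo sketch. The one-factor structure, the exponential marginals and all parameter values below are our own simplifying assumptions for illustration, not the general model: a common factor drives the correlation, and each issuer's uniform marginal is mapped to a survival time.

```python
import math
import random
from statistics import NormalDist

def sample_default_times(n_issuers, rho, lam, n_scen, seed=0):
    """One-factor Gaussian copula sketch (in the spirit of Vasicek 1987 /
    Li 2000): latent X_i = sqrt(rho)*Z + sqrt(1-rho)*eps_i, mapped through
    the normal CDF to a uniform, then to an Exponential(lam) default time."""
    rng = random.Random(seed)
    Phi = NormalDist().cdf
    scenarios = []
    for _ in range(n_scen):
        z = rng.gauss(0, 1)                        # common (systematic) factor
        taus = []
        for _ in range(n_issuers):
            x = math.sqrt(rho) * z + math.sqrt(1 - rho) * rng.gauss(0, 1)
            u = Phi(x)                             # uniform marginal via copula
            taus.append(-math.log(1 - u) / lam)    # inverse exponential CDF
        scenarios.append(taus)
    return scenarios
```

With rho > 0 the default times within a scenario cluster: a low draw of the common factor pushes several issuers toward early default at once, which is exactly the joint behavior a single-issuer model cannot produce.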
2.2 Reduced form models
Structural models have many disadvantages: for one, they are often not analytically tractable, and moreover a lot of firm-specific data is needed. Especially when assessing credit portfolios with many different issuers, it is very hard and time consuming to gather all the needed data. Implicitly, structural models assume that investors have the same information as the managers of the issuers, while in real life investors receive information at set times, for example through quarterly results. Then there is the issue that this often concerns balance sheet data, which is known for not always presenting the facts well; consider for example some of the big accounting scandals such as Enron and Ahold. When firms are near bankruptcy they are likely to take much effort to withhold this information from the shareholders.
Therefore more recent models are of so-called reduced form: because the issuer-specific information is reduced, they directly model default or migration probabilities. Most reduced form models assume that default occurs at the first arrival time of a Poisson process. The mean arrival rate of this Poisson process is called the intensity and is one of the most important notions in the credit literature. One of the major advantages of intensity based modeling is that it is possible to keep a strong analogy with term structure modeling and thereby to reduce the technical difficulties of modeling defaultable bonds to those of non defaultable bonds, see (Duffie and Singleton 1999a). The simplest form is to assume a constant intensity; one step further is a deterministic time-varying intensity; and finally one can treat the intensity as a random process in either discrete or continuous time. If the intensity follows a stochastic continuous process, default is said to be doubly stochastic. The filtration of available information is essential: it heavily influences the outcomes of these models and thereby to a large extent defines them.
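The first two cases can be made concrete in a few lines of Python (a hedged sketch; the function names and step sizes are ours): a constant intensity gives an exponentially distributed default time, and a deterministic time-varying intensity triggers default when the integrated hazard first exceeds a unit-exponential threshold.

```python
import random

def default_time_constant(lam, rng):
    """Constant intensity: tau ~ Exponential(lam), so P(tau > t) = exp(-lam*t)."""
    return rng.expovariate(lam)

def default_time_deterministic(lam_fn, rng, dt=0.01, t_max=50.0):
    """Deterministic time-varying intensity lam_fn(t): default occurs when
    the integrated hazard int_0^t lam(s) ds first exceeds an Exp(1) draw."""
    threshold = rng.expovariate(1.0)
    hazard, t = 0.0, 0.0
    while hazard < threshold and t < t_max:
        hazard += lam_fn(t) * dt   # Riemann-sum approximation of the integral
        t += dt
    return t  # t == t_max means "no default before the horizon"
```

With a constant lam_fn the second sampler reproduces the exponential distribution of the first, which is a convenient sanity check on the discretization.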
Reduced form modeling was introduced by (Jarrow and Turnbull 1995), where default follows an exogenous time-inhomogeneous Markov process and default intensities are constant in time. The famous Jarrow, Lando, and Turnbull (JLT) model (Jarrow, Lando, and Turnbull 1997) extends this by incorporating rating classes. So instead of only considering the two states
defaulted and not defaulted, the not defaulted state is broken up into different rating classes. The major reason for this extension is that the price of debt changes as the credit rating of the issuer changes; hence not only default risk but also migration risk is taken into account. (Wei 2000) extends
the discrete version of the JLT model by letting intensities depend on macro economic variables, yielding time-varying intensities or, in the case of stochastic macro economic variables, stochastic intensities.
(Lando 1998) is one of the first to consider stochastic default intensities in continuous time and thereby introduces an important class of models in the literature, namely that of the already mentioned doubly stochastic Poisson process, where the intensities or state variables follow a Cox process. This model has the important feature of allowing intensities to depend on a driving state vector which can include, for example, macro-economic or firm specific variables. In (Lando 1998) a closed form solution is derived for an example of a one factor model, where the state process is the intensity process itself, and it is shown how to obtain a term structure model with an affine-like structure. This is convenient because affine state space models have, under some mild conditions, closed form solutions for bond prices, see (Duffie, Filipović, and Schachermayer 2003).
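The doubly stochastic survival probability P(τ > T) = E[exp(−∫₀ᵀ λ_s ds)] can always be checked by simulation when no closed form is at hand. The sketch below uses a CIR process for the intensity as the affine example; this specific choice, the Euler scheme and the parameters are our own illustrative assumptions, not necessarily Lando's specification.

```python
import math
import random

def cox_survival_prob(lam0, kappa, theta, sigma, T,
                      n_paths=5000, n_steps=100, seed=1):
    """Monte Carlo estimate of P(tau > T) = E[exp(-int_0^T lam_s ds)]
    for a doubly stochastic default time whose intensity follows a CIR
    process, simulated with a full-truncation Euler scheme."""
    rng = random.Random(seed)
    dt = T / n_steps
    total = 0.0
    for _ in range(n_paths):
        lam, integral = lam0, 0.0
        for _ in range(n_steps):
            lam_pos = max(lam, 0.0)          # truncate negative excursions
            integral += lam_pos * dt
            lam += (kappa * (theta - lam_pos) * dt
                    + sigma * math.sqrt(lam_pos * dt) * rng.gauss(0, 1))
        total += math.exp(-integral)
    return total / n_paths
```

A useful sanity check: with sigma = 0 and lam0 = theta the intensity is constant, so the estimator must return exp(−lam0·T) exactly, path by path.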
In the Lando model, default correlation between issuers is modeled in an indirect way, via correlation in default probabilities induced by the state variables. This works well for large portfolios, but (Schonbucher and Schubert 2001) argue that the default correlations in these models are typically too low and propose copula-dependent intensities for the modeling of portfolios. Other work by (Davis and Lo 2001) and (Jarrow and Yu 2001) models default correlation by allowing joint upward jumps in intensities triggered by the default of other firms or, for instance, a credit crisis. Drawbacks of this approach are that it is already computationally cumbersome to derive the default probability of only two issuers, let alone a large portfolio, and that the jump factor is hard to estimate because it is unclear how to calibrate it on historical default data.
2.3 Hybrid form models
One of the main differences between the reduced form and the structural approach is whether default can be predicted or is a sudden event. Structural models assume continuous monitoring of the assets and do not see the default event as a surprise, because one can observe the assets nearing the default trigger. In the reduced form approach default cannot be foreseen because there is no continuous information flow about the firm's assets. This difference can also be expressed in terms of the information set or filtration the model uses: reduced form models use the information available in the market, whereas structural models use the manager's information set. In recent years many attempts have been made to reconcile these two approaches.
(Duffie and Lando 2001) are among the first to consider the Merton model with incomplete accounting information, and show that in that case default occurs with some intensity and hence the model is equivalent to a reduced form model. The information is incomplete in the sense that the value of the assets is only known at discrete points in time, and that noise is added to the accounting value of the assets. Note that this is a reduced form model with endogenously determined intensities, in contrast to original reduced form modeling, where intensities have to be exogenously specified. So with this type of modeling the intensities follow from the economic fundamentals of the firm.
Instead of dealing, like (Duffie and Lando 2001), with incomplete information about the value of the assets, (Giesecke 2006) assumes full asset information but allows incomplete information about the default barrier level. He finds that this makes default an unpredictable event but that it does not admit an intensity based representation. When information about both the value of the assets and the default barrier is incomplete, the model is equivalent to a reduced form model. More generally, Giesecke shows that in order to admit intensities it is necessary that the default event is unpredictable, i.e. a sudden event, but that this is not sufficient.
Other important work in structural modeling with incomplete information is for example (Çetin,
Jarrow, Protter, and Yildirim 2004), (Frey and Runggaldier 2007), and (Frey, Prosdocimi, and
Runggaldier 2008).
2.4 Credit model in PALM
Now that the objective of this thesis is clear, we first need some understanding of the credit model used in PALM, because the risk neutral credit model for ALS-life should be consistent with the PALM model. A much more detailed description of the PALM model can be found in appendix D.
The current credit model (Londen 2002) used in PALM to model credit portfolios is based on (Wei 2000), but with some modifications. Default probabilities are estimated using historical default information and the state of the economy. So the model is of the reduced form class and not of the structural form class, because default probabilities do not depend on the financial health of the underlying firm specifically. The reduced form is chosen here because PALM is used for a lot of different funds with very diverse credit portfolios; gathering all the data needed for a structural model would be extremely time consuming and in most cases impossible. Instead of having only one credit class, i.e. two states (defaulted and not previously defaulted), PALM uses multiple classes in the form of credit ratings. So the possible states of a bond are the K − 1 different credit ratings, which together form the non defaulted states, and the Kth rating class: the defaulted state. It is assumed that the defaulted state is absorbing, so once an underlying firm defaults it cannot revive or be non defaulted in the following years. The real world probabilities of going from one rating to another are called the transition probabilities and are collected in the one-year transition matrix P(t). Recall that PALM is not used for pricing derivatives but for risk management purposes, i.e. it generates real world scenarios instead of the risk neutral scenarios in ALS-life. So in this subsection, unless explicitly stated otherwise, we only consider real world probabilities; if the term "probability" is used, it means the real world probability.
P(t) = \begin{pmatrix} p(t)_{11} & p(t)_{12} & p(t)_{13} & \cdots & p(t)_{1K} \\ p(t)_{21} & p(t)_{22} & p(t)_{23} & \cdots & p(t)_{2K} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ p(t)_{K-1,1} & p(t)_{K-1,2} & p(t)_{K-1,3} & \cdots & p(t)_{K-1,K} \\ 0 & 0 & 0 & \cdots & 1 \end{pmatrix} \qquad (1)
So p(t)_{12} is the estimated probability of migrating from the first rating class to the second within one year, and p(t)_{11} is the estimated probability of staying in the first rating class within one year. Notice that the default state is indeed absorbing, because it has probability one of staying in the same rating state. The rows of this matrix sum to one, because the probability of migrating to a rating class which is not in the matrix equals zero.

p(t)_{i1} + p(t)_{i2} + p(t)_{i3} + \ldots + p(t)_{iK} = 1, \qquad \text{for } i = 1, 2, \ldots, K \qquad (2)
As we can see, the transition matrix is not constant in time but can differ from year to year. This is an important feature of the model, because it makes it possible to let the transition probabilities themselves be stochastic and dependent on the state of the economy. More specifically, the deviation of the probabilities from their historical average depends on some of the macroeconomic variables from the economic scenario generator in PALM. Hence this specifies the relation between the credit cycle and the business cycle. This is also the only way correlation in default is captured, so in an indirect way: conditional on the economic variables there is no correlation.
In order to price the defaultable bonds in a portfolio, PALM uses a risk neutral transition matrix. For each current rating the real world transition probabilities, except the default probability, are transformed into risk neutral probabilities by multiplying the real world probabilities with a rating dependent "risk premium". The risk neutral default probability then simply follows from the fact that the probabilities, for each rating class, should add up to one. The "risk premia" are calibrated to fit current defaultable bond prices observed in the market. Notice the implicit assumption here that risk premia are constant in time and the same for each non-default rating. Moreover, for pricing, PALM also assumes the transition probabilities and the recovery rate to be constant in time. So to simulate how bonds in a portfolio downgrade, upgrade, and default, transition probabilities are stochastic due to their dependence on the state of the economy, whereas the pricing of these bonds is done with a model which assumes constant transition probabilities. This is an important drawback of the PALM model, because these two assumptions contradict each other. The pricing model we will develop in this thesis does account for stochastic transition probabilities.
To summarize, these are the main properties of the PALM model:
• Reduced form model, not firm specific
• Markov model, i.e. no memory: transition probabilities only depend on the current rating.
• Rating model, i.e. does not model only the default event but also the credit risky event of
rating transitions.
• Macro-economic dependence of transition probabilities in the transition simulations of a portfolio, i.e. stochastic transition probabilities.
• Risk premia assumed constant in time.
• Recovery rate assumed constant in time and the same for each rating.
• Pricing model assumes transition probabilities constant in time.
• Real world transition probabilities calibrated on historical transition matrix.
• Risk premia calibrated to fit current defaultable bond prices.
• Macro-economic dependence coefficients estimated using historical data and linear regression.
2.5 Credit risk and embedded options
To illustrate the issues arising when pricing options with a defaultable bond portfolio as underlying, we start with a stylized example. Consider a minimum return guarantee option, written by the insurer and held by the insured, which guarantees a minimum annual return of c% and matures T years from now. The insurer invests the lump sum premium paid at time 0 only in corporate zero coupon bonds⁷, hence defaultable bonds. The insurer applies a buy and hold strategy on this portfolio and buys a(0) investment grade bonds and b(0) speculative grade bonds, both with maturity M > T. Now denote the value of this portfolio at time t as
A(t) := a(t)P^I(t, M) + b(t)P^S(t, M)

and K := (1 + c)^T A(0) as the guaranteed value at T. Here a(t) and b(t) are the weights for each rating: a(t) gives the number of investment grade bonds held at time t and b(t) the number of speculative grade bonds. Notice that although a buy and hold strategy is applied, the weights a(t) and b(t) do depend on time. This is because corporate bonds can downgrade, upgrade, or default. A bond which is initially rated speculative can be investment grade at time T, or it may have defaulted before that time and only hold the recovery value⁸. So unless we assume rating transitions and the default event to be deterministic, a(t) and b(t) also follow a random process.
So we now know the payoff at maturity of the option equals max(0, K − A(T)). If we apply risk neutral valuation to determine the fair value of the option, the value equals the discounted risk neutral expectation of the payoff:

\mathbb{E}^Q\left[\max(0, K - A(T))\,\frac{B(t)}{B(T)}\right]

where B(t) is the money market account and Q the risk neutral measure. So we need the joint risk neutral distribution of a(T), b(T), B(T), P^I(T, M), and P^S(T, M), from which we can also calculate the desired expectation. Suppose we are able to derive proper risk neutral marginal distributions of these variables and we assume a linear dependence structure defined by correlation coefficients. Then only in a few cases is it possible to derive an analytical expression for the joint distribution, and in even fewer cases are we able to derive the desired expectation. Note that this is a very simple and stylized example; if, for example, rebalancing strategies are applied, or the portfolio also consists of equity and derivatives, it is very likely impossible to obtain an analytical solution for the value of the guarantee option. A solution would be to simulate paths of the values with, for example, an Euler scheme and estimate the expectation with the Monte Carlo estimator,
\mathbb{E}^Q\left[\max(0, K - A(T))\,\frac{B(0)}{B(T)}\right] \approx \frac{1}{N}\sum_{i=1}^{N} \max(0, K - A_i(T))\,\frac{B(0)}{B_i(T)}
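The Monte Carlo estimator above can be sketched in a few lines. This is an illustrative stand-in, not the thesis's implementation: we assume a constant short rate (so B(0)/B(T) = e^{−rT}) and a lognormal terminal portfolio value A(T), whereas in the full framework both would come from the BSHWII credit model. All function and parameter names below are ours.

```python
import math
import random

def mc_guarantee_value(A0, c, T, r, sigma, n_paths, seed=0):
    """Monte Carlo estimate of E^Q[max(0, K - A(T)) * B(0)/B(T)].

    Illustrative stand-ins: a constant short rate r (so B(0)/B(T) = e^{-rT})
    and a lognormal terminal portfolio value A(T).
    """
    rng = random.Random(seed)
    K = (1 + c) ** T * A0       # guaranteed value at maturity
    disc = math.exp(-r * T)     # B(0)/B(T) for a constant short rate
    total = 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)
        # risk neutral lognormal terminal value of the portfolio
        A_T = A0 * math.exp((r - 0.5 * sigma ** 2) * T + sigma * math.sqrt(T) * z)
        total += max(0.0, K - A_T)
    return disc * total / n_paths
```

With sigma = 0 the estimator collapses to the deterministic discounted payoff, which is a convenient sanity check.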
This is actually also the approach of Ortec Finance. They developed a generator for risk neutral
paths of equity and bonds, but they still lack a generator for defaultable bonds. As we saw in
the example above, what is needed to achieve this is:
1. Risk neutral scenarios for the weights of a defaultable bond portfolio for every time t ≥ 0,
2. The price of a defaultable bond, given that it is in credit rating i, for every time t ≥ 0 and
maturity T > t.
7 Note that zero coupon bonds are mainly a theoretical concept and are not very often traded.
8 Another issue is how the investor invests the received recovery value if an issuer has defaulted. A usual assumption is that the recovery value appreciates with the money market account.
The transition process including default has to be consistent with the model already in use by PRM, which is implemented in PALM, and the defaultable bond price has to fit the existing generators for equity and interest rates. Another important requirement is that the defaultable bond price should be available in analytic form. If we were to approximate it with Monte Carlo as well, we would have to simulate new paths for every given scenario at time T, which would blow up the calculation time to an unacceptable level.
3 Economic framework: The Black-Scholes two-factor Hull-White model
In many models, such as the original Black-Scholes model, interest rates are assumed to be deterministic and constant in time. These models have proved useful in modeling stock price behavior and especially in pricing equity derivatives. Equity derivative prices are typically not very sensitive to small changes in interest rates, so adding randomness to interest rates has little added value in these cases. Moreover, it makes computations more cumbersome and leaves even fewer cases with an analytical expression for the price. In that case, it is not unlikely that the benefit of stochastic interest rates is outweighed by the approximation error of, for instance, a Monte Carlo estimate. However, in pricing interest rate derivatives, such as some embedded options, it is for obvious reasons crucial to add randomness to interest rates. The payoff of such a product depends directly and only on a particular interest rate or yield curve in the future, so if we modeled this deterministically, the payoff would also be deterministic. Hence, the fundamental idea of an option is lost.
Therefore Ortec Finance uses a combined Black-Scholes two-factor Hull-White (BSHWII) model to price options embedded in life-insurance contracts. The risk neutral version of the Black-Scholes (BS) model is used for stock prices and the two-factor Hull-White (HWII) model describes the risk neutral process of the term structure of interest rates. They are combined in the sense that stocks and interest rates are heavily correlated, especially in the risk neutral world where the drift of stocks equals the instantaneous short rate. Besides that, the random components of the HWII and BS model, which are Wiener processes, are also correlated. Notice that we include stocks in our economic framework because the transition probabilities in our model will also depend on stock returns.
In this section we will explain the Black-Scholes two-factor Hull-White model. This is the
financial framework in which the credit model should be implemented so it is important to
have a good understanding of this model. First the Hull-White model for stochastic interest
rates is explained, the original one-factor model and an extension to a two-factor model. Then
we will focus on the well known Black-Scholes model used for stock prices combined with the
Hull-White model: The Black Scholes two-factor Hull-White model. Finally we will elaborate
on calibration of the BSHWII model and provide some distributional properties which are used
in further sections.
3.1 The one-factor Hull-White model
The one-factor Hull-White model (Hull and White 1990) is a no-arbitrage model and is therefore,
in contrast to equilibrium models, consistent with today’s term structure of interest rates. It
models the process followed by the instantaneous short rate r in a risk neutral world. Economic
arguments, as well as empirical research, show that interest rates appear to revert to some long-term average over time. This is known as mean reversion and is incorporated in the Hull-White model. The short rate reverts to the time dependent but deterministic long-term average θ(t)/a (or reversion level) with a constant reversion rate a:

dr(t) = [\theta(t) - ar(t)]dt + \sigma\, dW(t) \qquad (3)
Here W(t) is a standard Wiener process and σ a constant volatility parameter. If we write (3) as

dr(t) = a\left[\frac{\theta(t)}{a} - r(t)\right]dt + \sigma\, dW(t)

it is clear that this is equivalent to the well known Vasicek model⁹, but with a time-dependent reversion level. Because of this time-dependency, the drift of the HWI model is also time-dependent. This makes it possible to calibrate θ(t) consistently with the initial term structure using the no arbitrage constraint. It can easily be shown that, if we denote by F(0,t) the initial instantaneous forward rate curve, in order to have no arbitrage opportunities the following should hold (Hull and White 1990):
\theta(t) = \frac{\delta F(0,t)}{\delta t} + aF(0,t) + \frac{\sigma^2}{2a}\left(1 - e^{-2at}\right)

3.2 The two-factor Hull-White model
A well known drawback of the one-factor Hull-White model is that it does not allow for a "hump shaped" volatility structure of forward rates, which is often observed in the market.¹⁰ By volatility structure we mean the term structure of volatilities, i.e. across maturities, which in the one-factor Hull-White model has an exponentially declining form. Capturing the right volatility structure is important for interest rate derivative pricing, because Heath, Jarrow and Morton (Heath, Jarrow, and Morton 1992) show that this structure fully determines the price. The two-factor Hull-White model (Hull and White 1994) does allow for many more volatility structures, including hump shaped ones.
The two-factor Hull-White model extends the HWI model by including a stochastic factor u(t) in the drift term. This results in a stochastic reversion level, namely (θ(t) + u(t))/a, in contrast to the deterministic reversion level θ(t)/a of the HWI model:

dr(t) = [\theta(t) + u(t) - ar(t)]dt + \sigma_r\, dW_r(t) \qquad (4)
where u(t) has an initial value of zero and follows the process

du(t) = -bu(t)dt + \sigma_u\, dW_u(t)

W_r(t) and W_u(t) are Wiener processes, correlated with correlation parameter ρ_{ru}. So u(t) is mean reverting to a zero reversion level with rate b.
(Hull and White 1994) show that the price at time t of a default free discount bond with maturity T can be derived analytically and is of the form

P(t,T) = A(t,T)e^{-B(t,T)r(t) - C(t,T)u(t)}

where A(t,T), B(t,T), and C(t,T) can be found in (Hull and White 1994).
9 dr(t) = a[b − r(t)]dt + σ dW(t)
10 See for example (Ritchken and Chuang 2000)
As in the one-factor model it is possible to calibrate θ(t) consistent with the initial term structure.
In the same paper Hull and White show that in order to fit the model to the initial term structure
of interest rates observed in the market, θ(t) must be equal to:
\theta(t) = \frac{\delta F(0,t)}{\delta t} + aF(0,t) + \frac{\delta \phi(0,t)}{\delta t} + a\phi(0,t) \qquad (5)

With,

\phi(t,T) = \frac{1}{2}\sigma_r^2 B(t,T)^2 + \frac{1}{2}\sigma_u^2 C(t,T)^2 + \rho_{ru}\sigma_r\sigma_u B(t,T)C(t,T) \qquad (6)

Where B(t,T) = \frac{1}{a}\left(1 - e^{-a(T-t)}\right) and C(t,T) = \frac{1}{a(a-b)}e^{-a(T-t)} - \frac{1}{b(a-b)}e^{-b(T-t)} + \frac{1}{ab}
Observe that if we omit the relatively small term \frac{\delta \phi(0,t)}{\delta t} + a\phi(0,t), the drift of the spot rate process r(t) equals \frac{\delta F(0,t)}{\delta t} + a(F(0,t) - r(t)) + u(t). Hence the mean of the spot rate follows the instantaneous forward curve, which represents the expected spot rate implied by the market, but with disturbance u(t). Furthermore, if r deviates from the instantaneous forward rate it reverts back with rate a.
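The functions B(t,T), C(t,T), and φ(t,T) above are mechanical to implement. A minimal sketch (function and argument names are ours, chosen for illustration):

```python
import math

def B_func(t, T, a):
    # B(t,T) = (1 - e^{-a(T-t)}) / a
    return (1.0 - math.exp(-a * (T - t))) / a

def C_func(t, T, a, b):
    # C(t,T) = e^{-a(T-t)}/(a(a-b)) - e^{-b(T-t)}/(b(a-b)) + 1/(ab)
    return (math.exp(-a * (T - t)) / (a * (a - b))
            - math.exp(-b * (T - t)) / (b * (a - b))
            + 1.0 / (a * b))

def phi(t, T, a, b, sigma_r, sigma_u, rho_ru):
    # phi(t,T) from eq. (6)
    Bv, Cv = B_func(t, T, a), C_func(t, T, a, b)
    return (0.5 * sigma_r ** 2 * Bv ** 2
            + 0.5 * sigma_u ** 2 * Cv ** 2
            + rho_ru * sigma_r * sigma_u * Bv * Cv)
```

A quick consistency check: both B(T,T) and C(T,T) vanish, so φ(T,T) = 0, as a discount bond at its own maturity carries no volatility.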
3.3 Black-Scholes and two-factor Hull-White
The BSHWII model is a combination of the famous Black-Scholes model for stock prices and
the two-factor Hull-White model we just described. Because we are only interested in the risk
neutral world, the risk neutral version of the BS model is used. Hence, on average stocks
appreciate with the instantaneous short rate, i.e. investors are indifferent to risk and so there is
no such thing as a risk premium.
The BSHWII model can be defined by the general continuous-time state space model based on stochastic differential equations (SDE) driven by a Wiener process,

dX(t) = \mu_X(t, X(t))dt + \sigma_X(t, X(t))dW(t)

with,

X(t) = \begin{pmatrix} S(t) \\ r(t) \\ u(t) \end{pmatrix}, \quad \mu_X(t, X(t)) = \begin{pmatrix} r(t)S(t) \\ \theta(t) + u(t) - ar(t) \\ -bu(t) \end{pmatrix}, \quad \sigma_X(t, X(t)) = \begin{pmatrix} \sigma_S S(t) \\ \sigma_r \\ \sigma_u \end{pmatrix} \qquad (7)
where W (t) = (WS (t), Wr (t), Wu (t)) denotes a vector-valued Wiener process with dimension 3
and variance-covariance matrix Σ:
\Sigma = \begin{pmatrix} 1 & \rho_{rS} & \rho_{Su} \\ \rho_{rS} & 1 & \rho_{ru} \\ \rho_{Su} & \rho_{ru} & 1 \end{pmatrix}
Note that because the variance of the above individual Wiener processes equals one, Σ is also
the correlation matrix of W (t).
When described by the BS model, i.e. with constant short rates, the stock price has a closed form solution. The solution for the BSHWII model, conditional on the interest rates between time s and t, {r(u) : s ≤ u ≤ t}, can be derived in the same way, i.e. by applying a log transformation to the stock price. By Itô's formula we obtain:
d\ln S(t) = \left(r(t) - \frac{1}{2}\sigma_S^2\right)dt + \sigma_S\, dW_S(t) \qquad (8)

By using the telescope rule we get for every t ≥ s,

\ln\frac{S(t)}{S(s)} = \int_s^t d\ln S(\tau) = \int_s^t r(\tau)\,d\tau - \frac{t-s}{2}\sigma_S^2 + \int_s^t \sigma_S\, dW_S(\tau) = \int_s^t r(\tau)\,d\tau - \frac{t-s}{2}\sigma_S^2 + \sigma_S\sqrt{t-s}\,Z_S
Here Z_S is part of the vector Z = (Z_S, Z_r, Z_u), which is multivariate normally distributed with correlation matrix Σ.
So conditional on \mathcal{F}_s and {r(u) : s ≤ u ≤ t}, the explicit solution of the SDE describing the stock price behavior equals,

S(t) = S(s)\exp\left(\int_s^t r(\tau)\,d\tau - \frac{t-s}{2}\sigma_S^2 + \sigma_S\sqrt{t-s}\,Z_S\right)
The calibration of the parameters of the BSHWII model is beyond the scope of this thesis and is thoroughly discussed in (Habti 2007), where the model is calibrated using swaptions. For the remainder of this thesis we therefore use the following stylized but realistic parameter set:
a = 0.72
b = 0.026
σr = 0.015
σu = 0.006
σs = 0.020
ρrs = 0.01
ρru = −0.1
ρsu = 0
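Given the parameter set above, a single path of (S, r, u) can be simulated with a simple Euler scheme. The sketch below is ours, not the thesis's implementation: it takes a user-supplied θ(t) (in the model θ(t) would be fitted to the initial term structure via (5)) and induces correlation between the three Wiener increments with a Cholesky factor of Σ.

```python
import math
import random

# stylized parameter set from the text
a, b = 0.72, 0.026
sigma_r, sigma_u, sigma_s = 0.015, 0.006, 0.020
rho_rs, rho_ru, rho_su = 0.01, -0.1, 0.0

def cholesky3(M):
    """Lower-triangular L with M = L L^T for a 3x3 SPD matrix."""
    L = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(M[i][i] - s) if i == j else (M[i][j] - s) / L[j][j]
    return L

def simulate_bshw(S0, r0, theta, T, n_steps, seed=0):
    """One Euler path of (S, r, u); theta(t) is user-supplied here."""
    rng = random.Random(seed)
    dt = T / n_steps
    L = cholesky3([[1.0, rho_rs, rho_su],
                   [rho_rs, 1.0, rho_ru],
                   [rho_su, rho_ru, 1.0]])
    S, r, u = S0, r0, 0.0
    for step in range(n_steps):
        z = [rng.gauss(0.0, 1.0) for _ in range(3)]
        # correlated Wiener increments: dW = sqrt(dt) * L z
        dW = [math.sqrt(dt) * sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(3)]
        S += r * S * dt + sigma_s * S * dW[0]
        r += (theta(step * dt) + u - a * r) * dt + sigma_r * dW[1]
        u += -b * u * dt + sigma_u * dW[2]
    return S, r, u
```

The update order matters: each increment uses the state at the start of the step, as the Euler scheme prescribes.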
3.4 Distributional properties
We continue by deriving some distributional properties of the model, especially for r. These results are needed to eventually derive a closed form solution of the defaultable zero coupon bond price. An important property of the HWII model is that short rates are normally distributed; this is very useful and key in deriving a closed form solution of (defaultable) bond prices.
Theorem 1. In the BSHWII model r(t) is, conditional on \mathcal{F}_s where t ≥ s, normally distributed with expectation and variance given by,

\mu_r(t) := \mathbb{E}^Q[r(t)|\mathcal{F}_s] = r(s)e^{-a(t-s)} + u(s)\frac{e^{-b(t-s)} - e^{-a(t-s)}}{a-b} + \int_s^t \theta(\tau)e^{-a(t-\tau)}\,d\tau,

\sigma_r^2(t) := \mathrm{var}^Q[r(t)|\mathcal{F}_s] = \sigma_1^2\,\frac{1 - e^{-2a(t-s)}}{2a} + \frac{\sigma_2^2}{(a-b)^2}\left(\frac{1 - e^{-2a(t-s)}}{2a} + \frac{1 - e^{-2b(t-s)}}{2b} - 2\,\frac{1 - e^{-(a+b)(t-s)}}{a+b}\right) + \frac{2\sigma_1\sigma_2\rho}{a-b}\left(\frac{1 - e^{-(a+b)(t-s)}}{a+b} - \frac{1 - e^{-2a(t-s)}}{2a}\right)

Proof. See appendix B.2
Normally distributed short rates have one main drawback: negative rates occur with non-zero probability. However, depending on the calibration of the parameters, this risk neutral probability,

Q\{r(t) < 0 \,|\, \mathcal{F}_s\} = \Phi\left(-\frac{\mu_r(t)}{\sigma_r(t)}\right),

is typically very low.
On the other hand, the normality property makes it possible to derive closed form solutions for the price of a large class of interest rate derivatives. This property is also used to derive an analytical expression for the non-defaultable bond price in section 6.
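The negative-rate probability Φ(−μ_r(t)/σ_r(t)) is easy to evaluate; a small sketch using the error function (function names are ours):

```python
import math

def norm_cdf(x):
    # standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def prob_negative_rate(mu_r, sigma_r):
    """Q{r(t) < 0 | F_s} = Phi(-mu_r(t) / sigma_r(t))."""
    return norm_cdf(-mu_r / sigma_r)
```

For instance, a mean of 3% with a standard deviation of 1% gives Φ(−3) ≈ 0.13%, illustrating how small the probability typically is.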
Because we are going to need the continuously compounded accumulated interest over a certain
period we also need to know the auto-correlation of r implied by the BSHWII model. In theorem
1 we derived an expression for the variance, so in order to obtain the auto-correlation we need
to find the covariance between r(u) and r(v) for every v ≤ u.
Lemma 1. The covariance between r(u) and r(v), conditional on \mathcal{F}_s where s ≤ v ≤ u, is equal to:

\mathrm{cov}[r(u), r(v)|\mathcal{F}_s] = e^{-a(u+v)}\Bigg[\frac{\sigma_2^2}{(a-b)^2}\int_s^v \left(e^{u(a-b)+b\tau} - e^{a\tau}\right)\left(e^{v(a-b)+b\tau} - e^{a\tau}\right)d\tau + \sigma_1^2\int_s^v e^{2a\tau}\,d\tau + \rho\,\frac{\sigma_1\sigma_2}{a-b}\int_s^v \left(e^{a(v+\tau)-b(v-\tau)} + e^{a(u+\tau)-b(u-\tau)} - 2e^{2a\tau}\right)d\tau\Bigg]

Proof. See appendix B.3
4 Intensity based modeling
Now that we have specified the economic framework including interest rates, we need the second ingredient for the simulation of the value of a defaultable bond portfolio: a model for the time of default. The default model serves two purposes in our simulation framework: (1) to simulate defaults and possibly rating transitions, and (2) to derive a price for defaultable bonds, which depends on the default probabilities and the term structure of interest rates. As stated earlier, in our case this has to be a reduced form model, because we do not have the data available to model the value of all issuing firms in the portfolios of insurance companies. PALM also uses a reduced form model, which is another reason to choose this modeling class.
In this section we start with a reduced form model with a default intensity that is constant in time. This will be extended to allow intensities to change deterministically in time and finally to follow a stochastic process. These models have a state space of only two states, default and survival. Because we also want to take into account the credit risky event of a rating transition, we will expand these models to allow for multiple ratings. What follows in this section can be used to model real world probabilities as well as risk neutral probabilities, although it is not generally true that the corresponding risk neutral probabilities follow the same model as the real world probabilities. If, for example, real world probabilities are Markovian, the risk neutral probabilities need not be Markovian, see (Jarrow, Lando, and Turnbull 1997).
4.1 Constant default intensity
In intensity based modeling default occurs at the first jump of some jump process. The mean arrival rate of that process is called the intensity of the process. First we consider the most commonly used jump process, the time homogeneous Poisson process.

Definition 1. A counting process {N(t), t ≥ 0} is a time homogeneous Poisson process with intensity λ > 0 if for all 0 ≤ s ≤ t,
• N(0) = 0
• N(t) − N(s) ∼ Poisson(λ(t − s))
• Independent increments. For 0 ≤ u < v ≤ w < z, the increments N(v) − N(u) and N(z) − N(w) are independent
• Stationary increments. The distribution of N(t + h) − N(t) only depends on h, not on t
Recall that the density function of a Poisson distributed random variable X = 0, 1, 2, \ldots is Pr(X = k) = \frac{\lambda^k}{k!}e^{-\lambda}. So

Pr[N(t) = 0] = Pr[N(t) - N(0) = 0] = \frac{(\lambda t)^0}{0!}e^{-\lambda t} = e^{-\lambda t} \qquad (9)
In other words, the probability that the firm survives t years equals e^{−λt}. Define τ as the time of the first jump, i.e. the time of default; a firm is then in the default state at t if t > τ. So according to definition 1 and (9) we can derive the default probability p(·) at time t as follows,

p(t) := Pr[\tau < t] = Pr[N(t) \geq 1] = 1 - Pr[N(t) = 0] = 1 - e^{-\lambda t} \qquad (10)
Notice that (10) equals the cumulative distribution function of an exponentially distributed variable, hence τ ∼ exp(λ). This representation is useful for simulating the default time τ. It is also easy to see here why default in intensity based models is a sudden, or more formally inaccessible, event: at each moment in time, no matter how close to default, the probability of default within the next t years is 1 − e^{−λt}.
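The exponential representation of τ makes simulation trivial. The sketch below draws default times and checks the empirical default frequency against p(t) = 1 − e^{−λt}; the values of λ, t, and the sample size are illustrative choices of ours.

```python
import math
import random

rng = random.Random(42)
lam, t, n = 0.05, 10.0, 100_000

# tau ~ Exp(lam) is the first jump time of a Poisson process with intensity lam
defaults = sum(1 for _ in range(n) if rng.expovariate(lam) < t)

p_emp = defaults / n                  # empirical default frequency
p_theory = 1.0 - math.exp(-lam * t)   # p(t) = 1 - e^{-lam t}
```

With λ = 0.05 and t = 10, p(t) ≈ 0.39, and the empirical frequency matches it to well within sampling error.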
4.2 Deterministic time-varying default intensity
If a constant default intensity is assumed, then one year default probabilities are also constant in time; if we additionally assume a constant risk premium, even credit spreads are constant in time. Since there is an entire strand of literature on explaining changes in credit spreads, and historical default rates change drastically over the years, we can say with confidence that this is a very strong assumption and a large restriction on our model. It is thus reasonable to believe that default probabilities change over the years. Therefore, the time homogeneous intensity model is often extended to a time inhomogeneous intensity model, where we define the default event as the first jump of a time inhomogeneous Poisson process.
Definition 2. A counting process {N(t), t ≥ 0} is a time inhomogeneous Poisson process with the (right-) continuous intensity function λ(·) > 0 if for all 0 ≤ s ≤ t,
• N(0) = 0
• N(t) − N(s) ∼ Poisson\left(\int_s^t \lambda(u)\,du\right)
• Independent increments. For 0 ≤ u < v ≤ w < z, the increments N(v) − N(u) and N(z) − N(w) are independent
Definition 2 is very similar to the definition of a time homogeneous Poisson process. The difference is that the intensity of the Poisson process changes in time, and the Poisson parameter is the integral of the intensity function, also referred to in the literature as the hazard function or cumulated hazard rate. Notice also that the stationarity property no longer applies: the distribution of the increments now does depend on t. This has a clear intuition if we look at the survival probability in the case of constant intensity. Define the t-period survival probability as q(t). With constant intensities we saw that this equals

q(t) = e^{-\lambda t} = \exp\big(-(\underbrace{\lambda + \lambda + \ldots + \lambda}_{t\ \text{times}})\big) = \underbrace{e^{-\lambda}e^{-\lambda}\cdots e^{-\lambda}}_{t\ \text{times}}

Now suppose that λ changes every year and is constant during that year; then

q(t) = e^{-\lambda(1)}e^{-\lambda(2)}\cdots e^{-\lambda(t)} = e^{-(\lambda(1)+\lambda(2)+\ldots+\lambda(t))}
Now suppose that λ changes every 1/n year and is constant during that period. Then

q(t) = e^{-\frac{1}{n}\lambda(\frac{1}{n})}\, e^{-\frac{1}{n}\lambda(\frac{2}{n})} \cdots e^{-\frac{1}{n}\lambda(t)} = \exp\left(-\frac{1}{n}\sum_{s=1}^{nt}\lambda\left(\frac{s}{n}\right)\right)

So if λ is continuously changing, ∆ ↓ 0, then under some mild conditions on λ(t)

q(t) = \lim_{\Delta \downarrow 0}\exp\left(-\sum_{s=1}^{t/\Delta}\Delta\,\lambda(s\Delta)\right) = \exp\left(-\int_0^t \lambda(u)\,du\right) \qquad (11)
Notice that this has the same functional form as the formula for a non-defaultable bond price obtained from forward rates.
The intensity process λ is now time-varying, but still deterministic. This implies that whether a firm has defaulted or not is the only relevant information for the default probability. If we want to model, for example, economic dependence of the default probabilities, we also need the intensity process to be stochastic. The model then becomes doubly stochastic and follows a Cox process, which is described in the next subsection.
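Equation (11) can be evaluated numerically for any deterministic intensity function; a minimal sketch (ours) using a midpoint rule for the integral:

```python
import math

def survival_prob(lam, t, n_steps=10_000):
    """q(t) = exp(-int_0^t lam(u)du), with the integral approximated
    by the midpoint rule for a deterministic intensity function lam."""
    dt = t / n_steps
    integral = sum(lam((k + 0.5) * dt) for k in range(n_steps)) * dt
    return math.exp(-integral)
```

For a constant intensity this reduces exactly to e^{−λt}, which reproduces the homogeneous case above.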
4.3 Cox process: Stochastic default intensity
We now consider the case of a stochastic default intensity, which in the literature is referred to as a doubly stochastic Poisson process or Cox process. So now the intensity itself is also a stochastic process, and conditional on the intensity path, default is still the first jump of an inhomogeneous Poisson process.
A more formal definition of a Cox process is the following:

Definition 3. "N is called a Cox process, if there is a nonnegative adapted stochastic process λ(t) (called the intensity of the Cox process) with \int_0^t \lambda(s)\,ds < \infty\ \forall t > 0, and conditional on the realization {λ(t)}_{t>0} of the intensity, N(t) is a time-inhomogeneous Poisson process with intensity λ(t)."
So, conditional on survival up to time s, the probability of surviving from time s to t becomes

q(s,t) = \mathbb{E}\left[\exp\left(-\int_s^t \lambda(u)\,du\right)\Big|\mathcal{F}_s\right]

Notice that this could be the expectation under the real world measure or the risk neutral measure, depending on which probability space the model is specified on. We will only use risk neutral probabilities, so the transition model is only specified on the probability space (Ω, F, Q) and the above expectation is under the measure Q.
4.4 Rating based transition intensities: A continuous Markov chain
The credit spread process of a specific defaultable bond typically has both a jump and a continuous component. The jump part reflects new information on the credit quality of the bond, such as a change in credit rating or default, which always arrives at discrete points in time. The continuous component may be due to continuous changes in credit quality, such as the state of the economy, or due to changes in risk premiums or liquidity risk. The two state Cox process incorporates both components, but its jumps are only due to the default event. In order to also capture jumps due to rating transitions we need a model with more than two states, for example the Markov model explained in this subsection.
We now move to the more general approach of allowing more than two credit rating categories. (Jarrow, Lando, and Turnbull 1997) show that the two state intensity model can be generalized to a continuous time Markov chain with more than two states. It turns out that the models explained above are a special case of this more general continuous Markov model.
Define the (t − s)-period transition probability matrix at time s, for intensities constant in time, as

\Pi(s, t) := e^{(t-s)\Lambda}\,^{11} \qquad (12)
If we define Λ_{ij} as the transition intensity of moving from rating i to rating j, then Λ is called the generator matrix of the continuous time Markov chain:

\Lambda := \begin{pmatrix} \Lambda_{11} & \Lambda_{12} & \Lambda_{13} & \cdots & \Lambda_{1,K} \\ \Lambda_{21} & \Lambda_{22} & \Lambda_{23} & \cdots & \Lambda_{2,K} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \Lambda_{K-1,1} & \Lambda_{K-1,2} & \Lambda_{K-1,3} & \cdots & \Lambda_{K-1,K} \\ 0 & 0 & 0 & \cdots & 0 \end{pmatrix}

Hence, the probability of going from rating i to j between t and t + dt equals \Lambda_{ij}\,dt.
To make sure that Π(s,t) is indeed a transition matrix and to ensure an appropriate evolution of the credit spreads, we impose some constraints on the generator Λ, similar to those in (Arvanitis, Gregory, and Laurent 1999).

1. All off-diagonal entries of the generator matrix Λ should be non-negative:
\Lambda_{ij} \geq 0 \quad \forall i, j,\ i \neq j
This ensures no negative entries in any transition matrix.

2. The sum of every row of the generator matrix Λ equals zero:
\sum_{j=1}^{K} \Lambda_{ij} = 0 \quad \text{for } i = 1, \ldots, K
Together with the next constraint this makes sure that each row of any transition matrix sums to 1.

3. Every diagonal entry of the generator Λ should be non-positive:
\Lambda_{ii} \leq 0 \quad \forall i

4. The default state is absorbing:
\Lambda_{Kj} = 0 \quad \text{for } j = 1, \ldots, K

5. The stochastic monotonicity constraint, which ensures that the credit spread is monotonically increasing in rating, so that the credit spread of a AAA rated bond is always lower than that of, for instance, a AA or CCC rated bond. This is equivalent to saying that state i + 1 is always more risky than state i:
\sum_{j \geq k} \Lambda_{ij} \leq \sum_{j \geq k} \Lambda_{i+1,j} \quad \forall i, k,\ k \neq i+1

11 The exponential of a matrix A is defined as e^A \equiv \sum_{i=0}^{\infty} \frac{A^i}{i!}
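Constraints 1-4 are mechanical to check for a candidate generator matrix. The sketch below (function name is ours; the monotonicity constraint 5 is omitted for brevity) validates a generator given as a list of rows:

```python
def is_valid_generator(gen, tol=1e-12):
    """Check constraints 1-4 on a candidate generator matrix (list of rows):
    non-negative off-diagonals, zero row sums, non-positive diagonal,
    and an absorbing default (last) state."""
    K = len(gen)
    for i in range(K):
        if gen[i][i] > tol:                      # 3. diagonal non-positive
            return False
        if abs(sum(gen[i])) > tol:               # 2. rows sum to zero
            return False
        if any(gen[i][j] < -tol for j in range(K) if j != i):
            return False                         # 1. off-diagonals non-negative
    return all(abs(x) <= tol for x in gen[K - 1])  # 4. default state absorbing
```

Such a check is useful after calibration, since estimated intensities can easily violate the constraints numerically.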
In the following example we show intuitively how this corresponds to the definition of the default event in the case of two rating categories.

Example 1. Assume, just as in the default intensity case, that we only have two rating categories: non-defaulted and defaulted. Denote Λ_{1,K} = λ; then

\Lambda := \begin{pmatrix} -\lambda & \lambda \\ 0 & 0 \end{pmatrix}

From the definition in (12) the 1-year transition probability matrix equals:

\Pi(s, s+1) := e^{\Lambda} = \sum_{i=0}^{\infty} \frac{\Lambda^i}{i!} = \begin{pmatrix} e^{-\lambda} & 1 - e^{-\lambda} \\ 0 & 1 \end{pmatrix}

Notice that the 1-year survival probability in this example is q(1) = e^{−λ}, just as in the constant default intensity case.
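Example 1 can be verified numerically by evaluating the matrix exponential through its defining power series. The truncated series below is our sketch and is adequate for small, well-scaled matrices; for production use a Padé-based routine (e.g. scipy's `expm`) would be preferable.

```python
import math

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_exp(A, terms=30):
    """e^A = sum_{i>=0} A^i / i!, truncated power series."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # identity
    power = [row[:] for row in result]
    fact = 1.0
    for i in range(1, terms):
        power = mat_mul(power, A)
        fact *= i
        for r in range(n):
            for c in range(n):
                result[r][c] += power[r][c] / fact
    return result

lam = 0.1
Pi = mat_exp([[-lam, lam], [0.0, 0.0]])
# Example 1 predicts [[e^-lam, 1 - e^-lam], [0, 1]]
```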
However, it is not generally true that if transition intensities vary continuously over time, \Pi(s,t) = e^{\int_s^t \Lambda(u)du}, as in the two ratings case (11). This is due to the fact that, without further assumptions, when A and B are matrices of equal dimensions, in general

e^A e^B \neq e^{A+B}

In order to have e^A e^B = e^{A+B} we need A and B to commute (see the formal definition of e^A). So in the case of a continuously changing Λ we need that

\Lambda(s)\Lambda(t) = \Lambda(t)\Lambda(s) \quad \forall s, t > 0 \qquad (13)
Consider the following assumption:

Assumption 1 (The constant eigenvector assumption). For every t ≥ 0, Λ(t) can be decomposed in B, B^{−1}, and µ(t) as follows:

\Lambda(t) = B\mu(t)B^{-1}

I.e. Λ(t) can be diagonalized for every t and, moreover, its eigenvectors are the same for each t. So what we actually assume is that default probabilities only vary in time due to variation in the eigenvalues of the transition intensities. In the next subsection we will examine how strong this assumption is, but for now we simply assume it to be true. As shown in the following lemma, this immediately solves our commutativity problem.
Lemma 2. If we make assumption (1), then Λ(s)Λ(t) = Λ(t)Λ(s) ∀s, t > 0.

Proof.
\Lambda(s)\Lambda(t) = B\mu(s)B^{-1}B\mu(t)B^{-1} = B\mu(s)\mu(t)B^{-1} = B\mu(t)\mu(s)B^{-1} = B\mu(t)B^{-1}B\mu(s)B^{-1} = \Lambda(t)\Lambda(s) \qquad (14)

Note that in (14) we used that µ(t) and µ(s) commute because they only have entries on the diagonal.
Corollary 1. From assumption (1) it follows that \Pi(s,t) = e^{\int_s^t \Lambda(u)du}.

So this assumption allows us to get an analytical expression for Π(s,t).
Theorem 2. If we make assumption (1) then

\Pi(s,t) = Be^{\int_s^t \mu(u)du}B^{-1}

Proof. We start by noticing that \int_s^t \Lambda(u)du = \int_s^t B\mu(u)B^{-1}du = B\left(\int_s^t \mu(u)du\right)B^{-1}. Using this we obtain,

\Pi(s,t) = e^{\int_s^t \Lambda(u)du} = I + \int_s^t \Lambda(u)du + \frac{1}{2}\left(\int_s^t \Lambda(u)du\right)^2 + \ldots
= BB^{-1} + B\int_s^t \mu(u)du\,B^{-1} + \frac{1}{2}\left(B\int_s^t \mu(u)du\,B^{-1}\right)^2 + \ldots
= BIB^{-1} + B\int_s^t \mu(u)du\,B^{-1} + \frac{1}{2}B\left(\int_s^t \mu(u)du\right)^2 B^{-1} + \ldots
= B\left(I + \int_s^t \mu(u)du + \frac{1}{2}\left(\int_s^t \mu(u)du\right)^2 + \ldots\right)B^{-1}
= B\sum_{i=0}^{\infty} \frac{\left(\int_s^t \mu(u)du\right)^i}{i!}B^{-1} = Be^{\int_s^t \mu(u)du}B^{-1}
Here Π(s,t) is a matrix, but from the above we can easily derive an expression for each entry, i.e. the probability of the transition from being in rating i at time s to being in rating k at time t. If we define B_{ijk} := B_{ij}B^{-1}_{jk}, then

\Pi_{ik}(s,t) = \sum_{j=1}^{K} B_{ijk}\, e^{\int_s^t \mu_j(u)du}
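Under the constant eigenvector assumption, Π(s,t) reduces to an eigendecomposition plus scalar exponentials of the integrated eigenvalues. A sketch of ours using numpy, specialized to a generator that is constant on [s,t] (so the integrated eigenvalues are just (t − s) times the eigenvalues); the two-state generator from Example 1 serves as a check:

```python
import numpy as np

def transition_matrix(Lam, s, t):
    """Pi(s,t) = B exp((t-s) * diag(mu)) B^{-1} for a generator Lam
    that is constant on [s, t]; under assumption 1 the same
    decomposition applies with integrated eigenvalues in the exponent."""
    w, B = np.linalg.eig(np.asarray(Lam, dtype=float))  # eigenvalues w, eigenvectors B
    return B @ np.diag(np.exp((t - s) * w)) @ np.linalg.inv(B)

lam = 0.1
Pi = transition_matrix([[-lam, lam], [0.0, 0.0]], 0.0, 1.0)
# Example 1 predicts [[e^-lam, 1 - e^-lam], [0, 1]]
```

For a time-varying µ(t), one would replace `(t - s) * w` with a numerical integral of the eigenvalue paths while keeping B fixed.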
Letting only the eigenvalues depend on the information restricts the possibilities for economic dependence. On the other hand, this reduces the chance of overfitting, because far fewer parameters have to be estimated. Otherwise we would have to estimate parameters for each transition intensity, or make assumptions, as in the PALM case, that only parallel shifts are possible between the default and the non-default transition probabilities.
4.5 Lando's model: A rating based Cox process
We now extend the previous subsection with stochastic transition probabilities. In (Lando
1998) rating transitions follow a Cox process and µj (t) is assumed to be affine in state variables
that are either Gaussian or follow a CIR process. Lando also shows that in most cases this
allows for a closed form solution of the price of defaultable bonds. So Lando's framework allows
for spreads which are correlated with economic factors and for a closed form solution. Because
these are the two main criteria our model has to fulfill, we adopt this setup and assume that:
$$\mu_j(t) = \gamma_{j0} + \gamma_{j1} r(t) + \gamma_{j2} R^S(t)$$
Where r(t) is the short rate and R^S(t) the instantaneous return on stocks at time t. The short
rate and stock price are the state variables in our economic framework, and since the credit
spread literature^12 has shown that stock returns and interest rates are correlated with credit
spreads, we include both state variables. We take the instantaneous return on stocks because
it is the equity equivalent of the short rate. The main disadvantage of this setup is that, because
R^S(t) and r(t) are Gaussian, µj (t) has a positive probability of becoming negative. This would
result in negative transition probabilities, which is clearly undesirable. A possible solution to
this problem is found in (Schonbucher 2000), where a two-factor Hull-White tree is expanded
with the default event. This allows negative probabilities to be capped at zero in each branch,
so pricing is done by branching rather than with a closed form solution. If we do not strictly
need a closed form solution, we could just as well assume a non-affine function of µ in the state
variables with zero probability of negative entries in µ. We adopt the Lando framework, but it
is important to set the parameters carefully to avoid high probabilities of negative transition
probabilities (and eventually negative spreads).
Now conditional on $\mathcal{F}_s$ the t − s year transition matrix is equal to:
$$\Pi(s,t) = B\,\mathrm{diag}\left(E\left[\exp\left(-\int_s^t \gamma_0 + \gamma_1 r(\tau) + \gamma_2 R^S(\tau)\,d\tau\right)\Big|\,\mathcal{F}_s\right]\right)B^{-1}$$

^12 See for example (Huang and Kong 2003).
And if currently in rating i, and we denote $B_{ijk} := B_{ij}B^{-1}_{jk}$, then the probability of going to
rating k between time s and t equals
$$\Pi_{ik}(s,t) = \sum_{j=1}^{K} B_{ijk}\, E\left[\exp\left(-\int_s^t \gamma_{j0} + \gamma_{j1} r(\tau) + \gamma_{j2} R^S(\tau)\,d\tau\right)\Big|\,\mathcal{F}_s\right]$$

4.6 Finding a historical generator matrix
Rating agencies do not publish transition matrices for time periods much shorter than one year.
Because transitions do not occur very often, a shorter time period would not result in a very
good estimate: there are simply not enough observations of transitions in a short period. These
historical transition matrices can be used in a discrete yearly Markov model, but in a continuous
Markov model a generator matrix is needed to produce a proper transition matrix for each t.
In most of the literature a historical generator is simply assumed to exist. The only ones who
have addressed this problem are (Jarrow, Lando, and Turnbull 1997) and, in much more depth,
(Israel, Rosenthal, and Wei 2001). Israel et al. show that a generator matrix need not exist,
and if it exists it need not be unique, in that multiple generators may produce the same one
year transition matrix. They prove that, if the transition matrix is strictly diagonally dominant^13
and P is a K × K Markov transition matrix, then:
$$\tilde Q = \log(P) := (P - I) - \frac{(P - I)^2}{2} + \frac{(P - I)^3}{3} - \frac{(P - I)^4}{4} + \ldots$$
converges, and produces a K × K matrix Q̃ such that exp(Q̃) = P and the row sums of Q̃ are
zero. Empirically, historical rating transition matrices are always strictly diagonally dominant,
so this theorem is very useful in determining a generator. However, it does not guarantee
non-negative values in the transition matrices. This is only guaranteed if the off-diagonal entries
of the generator matrix are non-negative. If the negative off-diagonal entries are small, which is
very likely in the case of rating transitions, Israel et al. propose a simple method to solve this:
replace the negative off-diagonal entries with zero and reallocate these negative values to the
corresponding diagonal entry of that row. That is, from Q̃ we can obtain Q with non-negative
off-diagonal entries by setting
$$q_{ij} = \max[\tilde q_{ij}, 0],\ j \neq i; \qquad q_{ii} = \tilde q_{ii} + \sum_{j \neq i} \min[\tilde q_{ij}, 0]$$
Now exp(Q) does not exactly equal P, but they show that the spectral norm distance is
relatively small in comparison to the method suggested by Jarrow et al. (Jarrow, Lando, and
Turnbull 1997). Therefore we adopt this procedure to estimate generator matrices from one
year transition matrices.
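The procedure above can be sketched in a few lines. The matrix P below is a hypothetical 3-state transition matrix used only to exercise the code; the method is the log series of Israel et al. followed by the reallocation of negative off-diagonal entries:

```python
import numpy as np

def generator_from_transition(P, n_terms=50):
    """Q_tilde = log(P) via the matrix power series of Israel, Rosenthal
    and Wei (2001), valid when P is strictly diagonally dominant, then
    adjusted so that off-diagonal entries are non-negative."""
    K = P.shape[0]
    A = P - np.eye(K)
    Q = np.zeros_like(P)
    term = np.eye(K)
    for n in range(1, n_terms + 1):
        term = term @ A                      # (P - I)^n
        Q += ((-1) ** (n + 1)) * term / n
    # Zero out negative off-diagonal entries and reallocate them to the
    # diagonal so that the row sums stay zero.
    off = Q - np.diag(np.diag(Q))
    neg = np.minimum(off, 0.0)
    Q_adj = np.maximum(off, 0.0)
    np.fill_diagonal(Q_adj, np.diag(Q) + neg.sum(axis=1))
    return Q_adj

# Toy transition matrix (hypothetical numbers, last state absorbing):
P = np.array([[0.90, 0.08, 0.02],
              [0.05, 0.85, 0.10],
              [0.00, 0.00, 1.00]])
Q = generator_from_transition(P)
print(Q.sum(axis=1))  # rows of a proper generator sum to (numerically) zero
```

On the S&P matrix of table 1 this is exactly the computation that produces table 2.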
We use the historical one year transition matrix of Standard & Poor's estimated in 2009
(source: Standard & Poor's). After eliminating the Not Rated category and normalizing the
matrix we obtain the historical transition matrix in table 1.
The historical generator we find using the procedure described above can be found in table 2,
and the Euclidean norm distance ||exp(Q) − P|| is only 0.000214.
^13 Every diagonal entry of P is larger than 1/2.
Table 1: Modified average 1-year historical transition matrix of Standard & Poor's estimated in
2009

                                    Rating at the end of year
Initial rating   AAA      AA       A        BBB      BB       B        CCC      D
AAA              0.9133   0.0788   0.0055   0.0006   0.0008   0.0003   0.0006   0.0000
AA               0.0060   0.9051   0.0810   0.0056   0.0006   0.0009   0.0003   0.0003
A                0.0004   0.0214   0.9150   0.0561   0.0042   0.0017   0.0003   0.0008
BBB              0.0001   0.0016   0.0414   0.9024   0.0428   0.0074   0.0017   0.0026
BB               0.0002   0.0006   0.0021   0.0587   0.8387   0.0800   0.0089   0.0110
B                0.0000   0.0006   0.0017   0.0030   0.0645   0.8297   0.0493   0.0512
CCC              0.0000   0.0000   0.0027   0.0040   0.0113   0.1377   0.5460   0.2985
D                0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   1.0000
Table 2: Generator matrix estimated by modified historical transition matrix

                                    Rating at the end of year
Initial rating   AAA       AA        A         BBB       BB        B         CCC       D
AAA              -0.0911   0.0867    0.0021    0.0004    0.0009    0.0002    0.0008    0.0000
AA               0.0066    -0.1010   0.0890    0.0034    0.0004    0.0009    0.0004    0.0002
A                0.0004    0.0235    -0.0913   0.0617    0.0032    0.0015    0.0003    0.0007
BBB              0.0001    0.0012    0.0456    -0.1058   0.0490    0.0060    0.0019    0.0020
BB               0.0002    0.0005    0.0007    0.0675    -0.1814   0.0952    0.0093    0.0079
B                0.0000    0.0006    0.0017    0.0006    0.0774    -0.1969   0.0728    0.0438
CCC              0.0000    0.0000    0.0034    0.0049    0.0081    0.2038    -0.6139   0.3938
D                0.0000    0.0000    0.0000    0.0000    0.0000    0.0000    0.0000    0.0000
5 Final Model

Now that we have a model to simulate economic scenarios, the BSHWII model, and a credit risk
model, we can give the full model description. We will also elaborate on calibration and
analyze the assumptions of our model.
5.1 Model description

Our final model can be defined by the general continuous-time state space model on the
probability space (Ω, F, Q), based on stochastic differential equations (SDEs) driven by a Wiener
process,
$$dX(t) = \mu_X(t, X(t))\,dt + \sigma_X(t, X(t))\,dW(t) \qquad (15)$$
$$Y(t) = \pi_Y(t, X(t)) \qquad (16)$$
with,
$$X(t) = \begin{pmatrix} \ln S(t) \\ r(t) \\ u(t) \end{pmatrix}, \qquad Y(t) = \begin{pmatrix} r(t) \\ \mu(t) \end{pmatrix}$$
$$\mu_X(t, X(t)) = \begin{pmatrix} r(t) - \frac{1}{2}\sigma_S^2 \\ \theta(t) + u(t) - a r(t) \\ -b u(t) \end{pmatrix}, \qquad \sigma_X(t, X(t)) = \begin{pmatrix} \sigma_S \\ \sigma_r \\ \sigma_u \end{pmatrix}$$
$$\pi_Y(t, X(t)) = \begin{pmatrix} r(t) \\ \gamma_0 + \gamma_1 r(t) + \gamma_2 \frac{d}{dt}\ln S(t) \end{pmatrix}$$
Where W(t) = (W_S(t), W_r(t), W_u(t)) denotes a vector-valued Wiener process with dimension 3
and variance-covariance matrix Σ:
$$\Sigma = \begin{pmatrix} 1 & \rho_{rS} & \rho_{Su} \\ \rho_{rS} & 1 & \rho_{ru} \\ \rho_{Su} & \rho_{ru} & 1 \end{pmatrix}$$
And, where µ(t), γ_0, γ_1, and γ_2 are vectors of length K − 1.
Note that because the variance of each of the above individual Wiener processes equals one, Σ is
also the correlation matrix of W(t).
Throughout the rest of this thesis we denote the instantaneous (continuously compounded) return
on stocks as $R^S(t) := \frac{d}{dt}\ln S(t)$.
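The state dynamics (15) can be simulated with a simple Euler discretization. The following sketch uses hypothetical parameter values (a, b, the volatilities, the correlations, and a constant placeholder for the fitting function θ(t) are all assumptions, not calibrated quantities):

```python
import numpy as np

rng = np.random.default_rng(0)
a, b = 0.1, 0.2                        # mean reversion speeds (assumed)
sig_S, sig_r, sig_u = 0.2, 0.01, 0.005 # volatilities (assumed)
Sigma = np.array([[1.0, 0.3, 0.1],     # correlation matrix of (W_S, W_r, W_u)
                  [0.3, 1.0, 0.2],
                  [0.1, 0.2, 1.0]])
L = np.linalg.cholesky(Sigma)          # to generate correlated increments

def theta(t):
    return 0.02                        # placeholder for the HW fitting function

dt, n_steps = 1.0 / 252, 252
lnS, r, u = np.log(100.0), 0.02, 0.0
for k in range(n_steps):
    dW = L @ rng.standard_normal(3) * np.sqrt(dt)
    lnS += (r - 0.5 * sig_S**2) * dt + sig_S * dW[0]   # d ln S(t)
    r_new = r + (theta(k * dt) + u - a * r) * dt + sig_r * dW[1]
    u += -b * u * dt + sig_u * dW[2]
    r = r_new
print(np.exp(lnS), r)
```

In the full scenario generator this path feeds the output map π_Y, which turns the simulated short rate and instantaneous stock return into transition intensities.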
5.2 Assessment of the transition model
Here we check what restrictions we need for the transition intensities to represent a generator
matrix, i.e. the rows of the resulting transition matrix should add up to one, and every entry in
the transition matrix should be non-negative.
Lemma 3. The rows of Λ(s, t) add up to zero ∀s, t if the first K − 1 rows of B −1 add up to zero
and the last column of µ(s, t) is zero ∀s, t.
Proof. First note that proving that the rows of Λ(s, t) add up to zero is equivalent to proving
that Bµ(t)B^{-1}1_K = 0_K ∀s, t, where 1_K is the one-vector of length K and 0_K the zero-vector
of length K. This follows readily from the assumptions on B and µ: since the first K − 1 rows
of B^{-1} add up to zero, B^{-1}1_K can only be non-zero in its last entry. Hence, µ(s, t)B^{-1}1_K is
a multiple of the last column of µ(s, t), which is 0_K by assumption.
Note that according to lemma 3 the rows of Π(s, t) add up to one independent of µj (s, t) for
j = 1, ..., K − 1 and that µK (s, t) is always zero because the default state is absorbing. This is
convenient because it means that we do not have to add restrictions on µ(s, t) when we model
its process.
It turns out that in order to have only non-negative entries in the transition matrix we do need
restrictions on both µ(s, t) and B, which makes analysis and calibration much more complex.
Moreover, these restrictions are non-linear, and including them would mean that µ(s, t) is no
longer affine in the state vector, making a closed form price infeasible.
We now examine whether it is possible to get a good fit to the historical transition data if we
make assumption 1 in section 4.4. Since we only have data on one year historical transitions, we
use the intensity based model with yearly changing intensities. The historical transition matrices
we use are constructed by Standard & Poor's, account for eight different rating categories, and
cover the years 1988 until 2008. First we estimate the generator matrix from the one-year
transition matrix using the method described in the previous section, and secondly we diagonalize
these estimated generator matrices. Using a numerical solver we can approximately determine
the B that fits the data best, by minimizing the error. We used two definitions of the error
term: (1) the distance between the estimated and the historical intensity, and (2) the distance
between the estimated and the historical transition probability. The mean squared error (MSE)
of minimizing the error in intensities is 0.00269 and that of minimizing the error in transition
probabilities is 0.00182. The problem is that in both cases we see a lot of negative values of
the transition probabilities and, moreover, a AAA rated bond can have a higher probability of
default than a lower rated bond, e.g. BBB.
In order to solve this we examined the case of three rating categories: investment grade,
speculative grade, and default. The historical transition matrices of eight ratings are transformed
to represent the historical transitions of this three category case by collapsing ratings according
to some portfolio. Now it turns out that if we use the historical B matrix of one of the 21
transition matrices, the MSE for intensity errors is 0.00267. Moreover, we see almost no negative
entries and higher default probabilities for speculative grade than for investment grade (for every
matrix). The estimated multi-year transition matrices also have these desired properties.
So we conclude that in order to get transition matrices with positive entries and higher default
probabilities for lower credit ratings, we can model two rating categories (excluding default).
Conditional on a fixed eigenvector matrix B, the transition probabilities are very prone to taking
negative values, depending on the eigenvalues of the intensities. By admitting only two rating
categories we reduce this chance and our results are more robust.
An important drawback of having only two (non-default) rating categories combined with the
generator-decomposition assumption is pointed out in (Duffie and Singleton 2003): rating
transitions become perfectly correlated. To see this, assume we have two rating categories,
Investment (I) and Speculative (S) grade. We also make assumption 1. Then the transition
intensities
Λ_IS and Λ_SI are equal to,
$$\Lambda_{IS}(t) = B_{11}B_{12}\,\frac{\mu_1(t) - \mu_2(t)}{B_{11}B_{22} - B_{12}B_{21}} \qquad (17)$$
$$\Lambda_{SI}(t) = B_{21}B_{22}\,\frac{\mu_1(t) - \mu_2(t)}{B_{11}B_{22} - B_{12}B_{21}} \qquad (18)$$
So Λ_IS and Λ_SI are proportional to each other, hence perfectly correlated. As a result this
model cannot capture asymmetric up- and downgrade patterns. So in spite of the economic
dependency of the transition probabilities, with only two ratings it is not possible to incorporate
e.g. the often seen effect that when the stock market is doing well there are more upgrades
than downgrades. Notice that this follows from the decomposition assumption 1 and the fact
that we only have two rating categories alone, regardless of how the eigenvalues of the
intensities are modeled. Fortunately, the default intensities are not necessarily perfectly
correlated. Calculating Bµ(t)B^{-1} and selecting the entries of the last column of this matrix
results, for two rating categories, in the following,
$$\Lambda_{ID}(t) = -(\Lambda_{IS}(t) + \eta\mu_1(t) + (1 - \eta)\mu_2(t)) \qquad (19)$$
$$\Lambda_{SD}(t) = -(\Lambda_{SI}(t) + (1 - \eta)\mu_1(t) + \eta\mu_2(t)) \qquad (20)$$
where $\eta = \frac{B_{11}B_{22}}{B_{11}B_{22} - B_{12}B_{21}}$. Hence the default intensities are only perfectly correlated if η equals
1/2.
Nevertheless, if it is possible to model all 8 S&P rating categories without a substantial
probability of negative transition rates, this is preferred to only two rating categories.
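The proportionality of (17) and (18) is easy to confirm numerically. A toy check, with a hypothetical 2×2 eigenvector matrix and arbitrarily chosen eigenvalue paths:

```python
import numpy as np

# Hypothetical constant eigenvector matrix of a two-rating generator.
B = np.array([[1.0, 0.8],
              [1.0, -0.5]])
Binv = np.linalg.inv(B)

def Lambda(t):
    # Diagonal eigenvalues mu_1(t), mu_2(t), chosen arbitrarily.
    mu = np.diag([-0.05 * (1 + t), -0.20 * (1 + 2 * t)])
    return B @ mu @ Binv

# Ratio of the two off-diagonal intensities over time: constant,
# so the up- and downgrade intensities are perfectly correlated.
ratios = [Lambda(t)[0, 1] / Lambda(t)[1, 0] for t in np.linspace(0.1, 5, 20)]
print(min(ratios), max(ratios))
```

However the eigenvalues move, the ratio Λ_IS(t)/Λ_SI(t) stays fixed at the value implied by the entries of B, which is exactly the perfect-correlation problem described above.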
5.3 Analysis of the Constant Eigenvector Hypothesis
In this subsection we investigate how strong assumption (1) really is by evaluating the impact
of this assumption on historic transition matrices. We use 21 historic transition matrices from
Standard & Poor’s.
The only work we know of that examines the Constant Eigenvector Assumption (1) is by
Arvanitis et al. (Arvanitis, Gregory, and Laurent 1999). Without further explanation or proof
they state that, for two generator matrices A and B, the quantity rNd (relative norm distance)^14
$$rNd(A, B) = \frac{\|AB - BA\|}{\|A\| \cdot \|B\|}$$
equals zero if A and B have a common set of eigenvectors, and in general is smaller than
two for every A and B. Observe that this quantity measures how commutative two matrices
are: if they fully commute then rNd is zero. The connection with eigenvectors is that if two
matrices commute they have a common set of (normalized) eigenvectors. Arvanitis et al. find
that for their calibrated generator matrices for different days this value is always lower than 0.08.
A drawback of using calibrated generators is that, in the same paper, Arvanitis et al. state that
because a constant generator matrix can match the observed credit spreads in the market very
well, the calibrated stochastic generator will not be very different from the constant generator.

^14 Here, ||X|| denotes the Euclidean norm of X, also often denoted as ||X||_2.

Figure 2: Values of the quantity rNd for the estimated generator matrices of years 1988 until
2008. This quantity is zero if the eigenvectors of Λ(t) and Λ(t') are equal. It is always smaller
than two.
Hence, by their calibration procedure the generator matrix will not differ very much over time,
so it is likely that rNd is very low. Also, comparing day-to-day matrices for 250 days does not
even cover the change within one year. The main advantage of using calibrated generators is
that these are risk neutral generators instead of real world generators. We only need risk neutral
generators and we want to analyze whether in this world the Constant Eigenvector Assumption
is justified. However, if we assume that the difference in B due to risk premia is constant in time,
constant eigenvectors in the real world imply constant eigenvectors in the risk neutral world.
Therefore in figure (2) we depict the values of rNd for generator matrices estimated from the
one-year historical S&P transition matrices of the years 1988 until 2008. These generator matrices
are estimated using the procedure described in section 4.6. We find that rNd is lower than
0.4, which is a lot higher than the 0.08 found by Arvanitis et al. It is not entirely clear what we
can conclude from this, but it seems that the eigenvectors of historical generators are neither
completely different from each other nor as close to each other as suggested by Arvanitis et al.
5.4 Calibration
The main goal of this thesis is to find an appropriate model for generating risk neutral scenarios
of defaultable bond portfolios. Studying how to accurately calibrate the model parameters is
not our priority and has to be done in subsequent work. However, in order to be able to provide
some adequate results and assess the model, we will provide a method which calibrates the model
to the initial short spread and estimated spread sensitivities.
The initial term structure of interest rates of the HWII model is consistent with the initial term
structure observed in the market. As described in section 3 the function θ(t) is calibrated such
that this consistency holds. Since we only have 3 parameters to calibrate per rating class, we
cannot make the whole initial defaultable term structure equal to that observed in the market.
It is possible to take a similar approach as in (Lando 1998): we calibrate the coefficients such
that the initial short spread, also called the spot spread, is equal to that observed in the market,
and likewise the first partial derivatives of the spot spread with respect to the state variables
X(t), here r(t) and R^S(t).
Define the spot spread for rating class i at time t as:
$$s^i(t) = \lim_{T \to t} -\frac{\partial}{\partial T} \ln P^i(t, T) - r(t) \qquad (21)$$
Observe that this is the difference between the instantaneous forward rate at t of an i-rated
defaultable bond and the short rate.
Now assume that at time 0 we observe the following spot spread vector in the market,
$$\hat s(0) = \left(\hat s^1(0), \ldots, \hat s^{K-1}(0)\right)$$
and that our estimate of the sensitivity of the spot yield spread to the spot rate for every credit
rating is,
$$\frac{\partial \hat s}{\partial r}(0) =: \hat{ds}_r(0) = \left(\hat{ds}_r^{\,1}(0), \ldots, \hat{ds}_r^{\,K-1}(0)\right)$$
and the sensitivity to the instantaneous return on stocks for every credit rating is,
$$\frac{\partial \hat s}{\partial R^S}(0) =: \hat{ds}_{R^S}(0) = \left(\hat{ds}_{R^S}^{\,1}(0), \ldots, \hat{ds}_{R^S}^{\,K-1}(0)\right)$$
A straightforward extension of proposition 6.1 in (Lando 1998) shows that calibration to the
observed spot spread and sensitivities comes down to solving the systems:
$$-\beta\left(\gamma_0 + \gamma_1 r(0) + \gamma_2 R^S(0)\right) = \hat s(0)$$
$$-\beta\gamma_1 = \hat{ds}_r(0)$$
$$-\beta\gamma_2 = \hat{ds}_{R^S}(0)$$
Where,
$$\beta = (B_{ijK})_{i,j=1,\ldots,K-1}, \qquad \gamma_i = (\gamma_{i,1}, \ldots, \gamma_{i,K-1}) \text{ for } i = 0, 1, 2$$
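Given β, these are just linear solves. A minimal sketch, where the matrix β, the spreads, the sensitivities, and the current state values are all hypothetical placeholders:

```python
import numpy as np

K1 = 3                                # K - 1 non-default ratings (toy size)
beta = np.array([[0.6, 0.3, 0.1],     # beta_ij = B_ijK, assumed values
                 [0.2, 0.5, 0.3],
                 [0.1, 0.3, 0.6]])
s_hat = np.array([0.01, 0.02, 0.05])  # observed spot spreads (assumed)
ds_r = np.array([-0.2, -0.4, -0.6])   # sensitivities to r (assumed)
ds_RS = np.array([-0.1, -0.3, -0.5])  # sensitivities to R^S (assumed)
r0, RS0 = 0.02, 0.05                  # current state variables (assumed)

# Solve -beta * gamma_1 = ds_r and -beta * gamma_2 = ds_RS directly,
# then back out gamma_0 from the spot spread equation.
gamma1 = np.linalg.solve(-beta, ds_r)
gamma2 = np.linalg.solve(-beta, ds_RS)
gamma0 = np.linalg.solve(-beta, s_hat) - gamma1 * r0 - gamma2 * RS0
print(gamma0, gamma1, gamma2)
```

Substituting the solution back into the three systems reproduces the observed spreads and sensitivities exactly, which is the whole content of the calibration step.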
Calibration of B, the constant eigenvector matrix of the generator, is not straightforward. We
could use historical 1-year transition matrices as in section 5.3 and take, for example, the
average or the last observed eigenvectors of the transition matrices. The first problem with
this approach is that Israel et al. (Israel, Rosenthal, and Wei 2001) show that a generator
consistent with a given 1-year transition matrix need not exist. Moreover, it is possible
that multiple generators are consistent with one 1-year transition matrix, which will produce
different transition probabilities for other time horizons. The second problem is that, since these
observed transitions happened in the real world and we are only interested in the risk neutral
world, this requires additional assumptions: the mapping of the real world measure to the risk
neutral measure, i.e. specifying the Radon-Nikodym process. Clearly, calibrating B to historical
1-year transition matrices is very complex, and since calibration is not our main goal we will
not pursue this approach for now.
An alternative would be to calibrate B to the initial credit spread as well. Since we proposed to
calibrate the coefficient matrix γ to the short spread, we could choose to calibrate B so that
the model fits the complete initial term structure of credit spreads. However, doing this without
further constraints would probably result in overfitting and is unlikely to produce realistic
transition probabilities.
Arvanitis et al. (Arvanitis, Gregory, and Laurent 1999) come up with a hybrid solution to
this problem in the case of a constant generator matrix. They propose a least squares
optimization method which minimizes the squared errors in the prices plus the squared error
relative to the estimated historical generator matrix. In other words, a trade-off is made between
fitting the model to market prices and staying close to the historical generator matrix. Notice
that this is a way to map the historical generator onto the risk neutral generator, since observed
market prices are used to modify the historical generator. Because the objective function also
incorporates deviation from the historical generator, this method is much more likely to produce
realistic credit spreads.
In our case, we only want to estimate the constant risk neutral eigenvector matrix B. We
could try to estimate B such that errors from the historical B and the observed bond prices are
minimized:
$$\min_{\hat B} \sum_{i=1}^{K} \sum_{n=1}^{N} \left(P_i^{n,\mathrm{model}}(B) - P_i^{n,\mathrm{market}}\right)^2 + \sum_{i=1}^{K} \sum_{j=1}^{K-1} \frac{\left(B_{ij} - B_{ij}^{\mathrm{hist}}\right)^2}{\beta_{ij}}$$
Observe the important assumption here: risk premia in B are assumed constant in time. This
is actually also directly implied by the model, because otherwise B would also be stochastic, or
at least deterministically changing in time. The main problem with this approach is that,
depending on the number of bond prices we calibrate to, it quickly changes the complete dynamics
of the transition process. By letting only the first K − 1 columns and rows of the B matrix vary
we can make sure that the rating transition probabilities still add up to one. We tried this, but
even after one year the calibrated B gave negative transition probabilities and even negative
spreads. So we conclude that this does not work and we stick to the eigenvectors of the
historical generator.
We assume 7 rating categories including default, use the eigenvectors of the estimated generator
matrix^15 in table 2 for the B matrix, and use the estimated current spot spreads in (25) to
calibrate the coefficients γ_0. To calibrate all our coefficients we also need estimates of the
partial derivatives of the spot spreads with respect to our state variables r(t) and R^S(t).
Estimation of these partial derivatives is beyond the scope of this thesis and should be done in
subsequent work. In the literature on explaining credit spreads by economic factors it is widely
found that spreads are negatively correlated with both treasury rates and returns on the stock
market. It is also found that the effect of these economic factors on the spread becomes stronger
for lower ratings, and that the correlation between interest rates and spreads is higher than that
between stock returns and spreads. For more details see for example the extensive study in
(Huang and Kong 2003). We therefore (arbitrarily) choose the spot spread sensitivities to be:
$$\hat{ds}_r(0) = (-0.2, -0.3, -0.4, -0.5, -0.6, -0.8, -1.0) \qquad (22)$$
$$\hat{ds}_{R^S}(0) = (-0.1, -0.2, -0.3, -0.4, -0.5, -0.7, -0.9) \qquad (23)$$
Notice the relatively low values. This is because if we have high correlation between the economic
state variables and the default probabilities, we also have higher probabilities of negative entries
in the transition matrix. So in order to derive some adequate results we kept the correlation
relatively low. Indeed, this is a big disadvantage of the model: it does not allow for a realistically
high correlation between the transition probabilities and the economic variables.

^15 We estimated the generator again after removing the AAA column and row from the historical 1-year
transition matrix and renormalizing.

Table 3: Estimate of Nelson Siegel parameters for the U.S. Treasury yield curve on February
26, 2010

β̂0       β̂1        β̂2        τ̂        MSE
0.0551   -0.0528   -0.0610   1.5428   2.1953E-06
5.5 The current term structure of U.S. interest rates
From Bloomberg we downloaded the U.S. treasury yields on February 26, 2010 for maturities
from 1 to 10 years and 15, 20, and 30 years. These are fixed points, but we actually need a
continuous function which we can differentiate in order to calibrate θ in our short rate model.
Therefore we fit the following Nelson Siegel function to the treasury data by minimizing the
quadratic errors:
$$R(0, t) = \beta_0 + (\beta_1 + \beta_2)\frac{\tau}{t}\left(1 - e^{-t/\tau}\right) - \beta_2 e^{-t/\tau} \qquad (24)$$
Estimation results are denoted in table 3 and show a very good fit.
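The fit of (24) is an ordinary non-linear least squares problem. A sketch with synthetic yield observations (the maturities and the "true" parameters below are stand-ins, not the Bloomberg data):

```python
import numpy as np
from scipy.optimize import least_squares

def nelson_siegel(t, b0, b1, b2, tau):
    # Yield curve (24): R(0,t) = b0 + (b1+b2)(tau/t)(1 - e^{-t/tau}) - b2 e^{-t/tau}
    return b0 + (b1 + b2) * (tau / t) * (1 - np.exp(-t / tau)) - b2 * np.exp(-t / tau)

# Synthetic observations: a Nelson Siegel curve plus a small perturbation.
t_obs = np.array([1, 2, 3, 5, 7, 10, 15, 20, 30], dtype=float)
y_obs = nelson_siegel(t_obs, 0.055, -0.053, -0.061, 1.54) + 0.0002 * np.sin(t_obs)

def residuals(p):
    return nelson_siegel(t_obs, *p) - y_obs

# Bounds keep tau strictly positive so the exponentials stay well behaved.
fit = least_squares(residuals, x0=[0.05, -0.05, -0.05, 1.0],
                    bounds=([-1, -1, -1, 0.01], [1, 1, 1, 50]))
b0, b1, b2, tau = fit.x
print(fit.x, "short rate estimate:", b0 + b1)
```

The same routine, run once per rating class on the credit risky yield curves, produces the parameters in table 5, and `b0 + b1` is the short end of the fitted curve used below.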
5.6 The current term structure of U.S. credit risky yields
In order to obtain current zero coupon prices we follow the procedure in (Allen, Thomas, and
Zheng 2000). In our model the price of a defaultable bond at a certain time is determined by
its maturity and its rating. In reality there are many more factors that influence the price, such
as liquidity, sectorial factors, firm specific factors, and so on. Hence, the price of a bond with a
certain maturity and rating is not unique. Standard procedures for stripping coupons, such as
bootstrapping and linear regression, only work if there is exactly one bond for each maturity. If
Table 4: Defaultable Zero-Coupon Bond Prices

t          1        2        3        4        5        6        7        8        9        10
Treasury   0.9967   0.9841   0.9597   0.9265   0.8880   0.8468   0.8051   0.7641   0.7242   0.6860
AA         0.9670   0.8963   0.8551   0.8219   0.7833   0.7422   0.7005   0.6594   0.6196   0.5814
A          0.9487   0.8780   0.8172   0.7106   0.6720   0.5684   0.4682   0.4190   0.3792   0.3409
BBB        0.9475   0.8728   0.7908   0.6782   0.6396   0.5360   0.4358   0.3866   0.3370   0.2988
BB         0.9128   0.8366   0.7174   0.5989   0.5604   0.4567   0.3566   0.3073   0.2578   0.2195
B          0.9128   0.8033   0.6585   0.5401   0.5015   0.3978   0.2977   0.2484   0.1989   0.1607
CCC        0.8907   0.7812   0.6119   0.4934   0.4283   0.3246   0.1371   0.0878   0.0382
there are more or no bonds maturing in a certain period or year, these procedures cannot
generate a unique set of discount prices. To solve this, (Allen, Thomas, and Zheng 2000) describe
the problem of stripping bonds as a Linear Programming (LP) problem which simply finds the
discount prices that minimize the pricing error. Constraints are added to make sure that the
discount price of a bond with a higher maturity is lower than that of a lower maturity, and
that the price of a bond with a higher credit rating is higher than that of a bond with a lower
rating.^16 For a detailed description of the procedure see appendix C.
Our data for the defaultable bonds comes from Datastream and concerns all straight and fixed
coupon US industry corporate bonds that have a maturity date up to February 23, 2020 and
for which the following datatypes are available on February 23, 2010: S&P (long term) rating,
maturity, coupon, coupon date, market price, market value, next call date, date of last price
change, issued amount, and outstanding amount. Just as in (Allen, Thomas, and Zheng 2000)
we exclude bonds that have not been traded in the last two months or have a market value of
less than 100,000. These bonds may be illiquid and therefore have irrelevant market prices. We
also exclude callable and unrated bonds and bonds with a different issued than outstanding
amount. Lastly we filter out all bonds for which the rating information dates from before
January 1, 2009, and bonds that have defaulted. The remaining list contains information on 835
bonds: 38 AA, 173 A, 444 BBB, 98 BB, 55 B, and 27 CCC rated. After our filtration we only
had one AAA rated bond left, so we omitted it. Estimating the discount prices with a simplex
algorithm results in the prices in table 4. The continuously compounded yield spreads which
follow from these discount prices can be seen in figure 3.
An important issue is calculating the cashflows at the sampling dates, because bonds have
different coupon and maturity dates. We use a semi-annual sampling period, which is a natural
choice since all our bonds also have semi-annual coupons. The current date, the date for which
we downloaded the market prices, is February 22, 2010, but we set the sampling dates on
May 15 and November 15 because these are common coupon dates. So our zero sampling date
is November 15, 2009. Hence, the market prices we observed already include accrued interest.
To deal with these different sampling and coupon dates, (Allen, Thomas, and Zheng 2000) use
simple interpolation, which is explained in detail in the appendix. To calibrate γ we need to
estimate the current spot spread defined by (21) for every i ∈ {1, ..., K − 1}. We can do this
by fitting, for every rating category, the Nelson Siegel function (24) to the corresponding yield

^16 Or equivalently: the stochastic monotonicity constraint for credit spreads is satisfied (credit spreads do
not cross each other).
Figure 3: Continuously compounded yield spreads implied by stripping procedure and term
structures in table 4 (estimated credit spread against years to maturity, one curve per rating
from AA to CCC).
curve. In (Nelson and Siegel 1987) it is shown that, if a yield curve is modeled by a Nelson
Siegel function, the instantaneous forward rate f(t) at time t is equal to,
$$f(t) = \beta_0 + \beta_1 e^{-t/\tau} + \beta_2 \frac{t}{\tau} e^{-t/\tau}$$
So f(0), the instantaneous forward rate at time 0, is β_0 + β_1. Hence, if we approximate the
yield curve with a Nelson Siegel function, the spot spread (21) equals β_0 + β_1 − r(0). Table 5
shows the estimated Nelson Siegel parameters fitted to the yield curves^17 implied by the
estimated zero coupon prices. The short rate can be approximated by taking the limit of (24)
for t ↓ 0. This also results in r(0) = β_0 + β_1, which gives rise to an estimate of the short rate
on February 22, 2010 of 0.0023. Using this and the Nelson Siegel fits for the defaultable bonds
we get the following estimate of the current spot spreads from classes AA down to CCC (in
basis points),
$$\hat s(0) = (306, 277, 369, 338, 494, 777) \qquad (25)$$
Notice that the monotonicity constraint for spreads is not satisfied here. To the extent that
this is not the result of estimation errors, market participants apparently estimate default
probabilities to be higher for an AA rated bond than for an A rated bond. A possible explanation
is that due to the current financial crisis markets are a lot less efficient. Investors make decisions
based on emotion rather than rationality, and especially the defaultable bond market is a lot
less liquid. Even though we tried to eliminate illiquidity effects by excluding bonds that have
not traded recently and bonds with low market value, in very illiquid markets this will probably
not be enough to eliminate all the effects. Also observe that these spot spreads are relatively
high, probably because in a financial crisis risk premia are likely to be higher. As said before,
studies show a negative relation between treasury rates and credit spreads, which also explains
the high observed spreads, because the current U.S. treasury rates are very low.
^17 Continuously compounded.
Table 5: Estimate of Nelson Siegel parameters for the yield curves of credit risky bonds on
February 22, 2010

       β̂0       β̂1        β̂2        τ̂         MSE
AA     0.0603   -0.0274   -0.0691   5.2981    2.85514E-07
A      0.0716   -0.0415   -0.0576   1.5492    8.82771E-06
BBB    0.0784   -0.0392   -0.0747   1.3948    7.01411E-06
BB     0.1020   -0.0659   -0.0020   3.9564    1.22472E-05
B      0.1393   -0.0876   -0.0019   7.0122    1.94533E-05
CCC    7.3654   -7.2854   -7.7763   20.9682   0.000106291
6 Bond pricing

As we saw in the introduction, the risk neutral value of a portfolio of defaultable bonds at time
t > 0 is determined by two things:
• The market value at time t of the different defaultable bonds within the portfolio.
• The composition of the portfolio in terms of credit ratings, including default.
Now that we have discussed our economic framework and our credit risk model, we are ready
to derive a defaultable bond price. In this section we will first give some general results on bond
prices in intensity based models and then derive a closed form solution of the defaultable bond
price, with and without recovery.
6.1 Risk neutral pricing of a non-defaultable bond
The main theorem of mathematical finance states that there are no arbitrage opportunities if
and only if there exists a measure Q_N, equivalent to the real world measure P and depending
on the Numéraire N(t), such that every relative price process in the market X(t)/N(t) is a
martingale with respect to Q_N:
$$\frac{X(t)}{N(t)} = E^{Q_N}\left[\frac{X(T)}{N(T)}\,\Big|\,\mathcal{F}_t\right], \quad \forall t \leq T \qquad (26)$$
Where Q_N denotes the equivalent martingale measure with respect to the Numéraire N(t). In
risk neutral pricing we take the money market account as the Numéraire, and the equivalent
martingale measure is then called the risk neutral measure. Hence, investors are indifferent
between investing in any contract and the money market account.
Now consider a non-defaultable discount bond with maturity T. According to (26) the no arbitrage price, or risk neutral price, of such a bond equals^{18}

\frac{P(t,T)}{B(t)} = E^{Q_B}\left[ \frac{1}{B(T)} \,\Big|\, \mathcal{F}_t \right]

so that

P(t,T) = E^{Q_B}\left[ \frac{B(t)}{B(T)} \,\Big|\, \mathcal{F}_t \right] = E^{Q_B}\left[ e^{-\int_t^T r(\tau)\,d\tau} \,\Big|\, \mathcal{F}_t \right]
The price of a discount bond extends easily to a bond with any positive face value: since the face value is fixed, the price is simply the discount bond price times the face value. Because a coupon paying bond can be seen as a portfolio of zero coupon bonds, we restrict ourselves to pricing zero coupon bonds. By the law of one price, the price of a coupon bond must equal the price of the equivalent portfolio of zero coupon bonds, i.e. a linear combination of zero coupon bond prices.
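The law-of-one-price argument can be sketched numerically: a coupon bond priced as the equivalent portfolio of zero coupon bonds. The flat discount curve, the function names and all values below are illustrative assumptions, not part of the model:

```python
import math

def zcb_price(T, r=0.04):
    """Illustrative zero coupon bond price under a flat 4% short rate."""
    return math.exp(-r * T)

def coupon_bond_price(coupon_rate, face, payment_times, discount=zcb_price):
    """Law of one price: a coupon bond equals a linear combination of zero
    coupon bond prices -- one per coupon payment, plus one for the face value."""
    coupons = sum(coupon_rate * face * discount(T) for T in payment_times)
    redemption = face * discount(payment_times[-1])
    return coupons + redemption

# 5% annual coupon, face 100, five yearly payments
price = coupon_bond_price(0.05, 100.0, [1, 2, 3, 4, 5])
```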
6.2 Risk neutral pricing of a defaultable bond
We can also use the main theorem of mathematical finance to determine the risk neutral price of a defaultable bond. To this end, consider a defaultable discount bond with maturity T, i.e. a loan that promises 1 unit of currency, say $1, at time T, which will be paid if and only if the issuer has not defaulted before time T. So the payoff of such a contract is 1_{[\tau > T]}, where τ denotes the time of default. Assuming the issuer survived up to time t, applying the main theorem of mathematical finance we can derive the following theorem:
Theorem 3. The risk neutral price of a defaultable discount bond with maturity T and zero recovery is

P(t,T) = E^{Q}\left[ e^{-\int_t^T r(s)\,ds}\, 1_{[\tau > T]} \,\Big|\, \mathcal{F}_t \right]
Proof. The payoff of this contract equals 1_{[\tau > T]}, therefore according to the main theorem of finance we get

\frac{P(t,T)}{B(t)} = E^{Q_B}\left[ \frac{1}{B(T)}\, 1_{[\tau > T]} \,\Big|\, \mathcal{F}_t \right]

so that

P(t,T) = E^{Q_B}\left[ e^{-\int_t^T r(s)\,ds}\, 1_{[\tau > T]} \,\Big|\, \mathcal{F}_t \right]

Notice that if the default time τ and the short rate r(t) are independent, the defaultable bond price is just the non-defaultable bond price times the survival probability.
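For a constant short rate and a constant default intensity this factorization is easy to verify numerically; a small check with illustrative values (constant r and λ are assumptions of this example only):

```python
import math

# Under independence of the default time and the short rate, the defaultable
# ZCB price factors into the riskfree price times the survival probability.
r, lam, T = 0.04, 0.02, 5.0          # illustrative constant rate and intensity
riskfree = math.exp(-r * T)          # non-defaultable discount bond price
survival = math.exp(-lam * T)        # Q(tau > T) for a constant intensity
defaultable = riskfree * survival    # equals exp(-(r + lam) * T)
```

The product collapses to a single discounting at rate r + λ, which previews the intensity-based pricing result of Lando used below.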
Until now we have only considered defaultable bonds with zero recovery. If an issuer has defaulted, let W be the amount of recovery received at default.^{19} It is convenient to see such a product as a portfolio of two bond products: one that pays the face value at time T if and only if the issuer survived until T, i.e. a defaultable bond without recovery, and another that pays the amount W at the default time τ if and only if the firm defaulted before T. Together these products form a defaultable bond with positive recovery, so the risk neutral price at time t is the sum of the prices of these two products:

^{18} Recall that the money market account follows dB(t) = r(t)B(t)dt, so B(t) = \exp\left(\int_0^t r(\tau)\,d\tau\right).
^{19} Note that for now we do not specify how this amount is determined, for example as a fraction of the face value.
Theorem 4. The risk neutral price of a defaultable discount bond with maturity T and recovery payment W at default is (Duffie and Singleton 2003)

P(t,T) = E^{Q}\left[ e^{-\int_t^T r(s)\,ds}\, 1_{[\tau > T]} \,\Big|\, \mathcal{F}_t \right] + E^{Q}\left[ e^{-\int_t^{\tau} r(s)\,ds}\, W\, 1_{[\tau \le T]} \,\Big|\, \mathcal{F}_t \right]
Note that these expressions for bond prices are very general: they assume nothing about the modeling of the default event, the process of the spot rate, or even the recovery process. So they can be used to derive bond prices for intensity based models as well as for structural models.
With regard to recovery there are basically two options: fractional recovery of face value or fractional recovery of market value. There are arguments for both approaches, and from a legal point of view it is not possible to favor one over the other. We therefore choose fractional recovery of market value because it is the more tractable approach, so that we can also derive an analytic defaultable bond price for positive recovery.
6.3 Pricing of a defaultable bond with an intensity model
Now suppose we model default probabilities with a doubly stochastic intensity model, so that conditional on the (stochastic) intensity process, default occurs at the first jump of a time-inhomogeneous Poisson process. (Lando 1998) shows the following lemma.
Lemma 4. If the time of default τ follows a Cox process, i.e. a doubly stochastic intensity model, then the following identity holds:

E^{Q}\left[ e^{-\int_t^T r(s)\,ds}\, 1_{[\tau > T]} \,\Big|\, \mathcal{F}_t \right] = 1_{[\tau > t]}\, E^{Q}\left[ e^{-\int_t^T (r(s)+\lambda(s))\,ds} \,\Big|\, \mathcal{F}_t \right]

Proof. See (Lando 1998).

The intuition of this result is rather straightforward, since we know that the probability of surviving between t and T, i.e. τ > T, equals E^{Q}\left[ e^{-\int_t^T \lambda(s)\,ds} \,\Big|\, \mathcal{F}_t \right].
Corollary 2. If the time of default τ follows a Cox process, the price of an i-rated defaultable zero coupon bond at time t maturing at T equals:

P^i(t,T) = E^{Q}\left[ e^{-\int_t^T (r(s)+\lambda_i(s))\,ds} \,\Big|\, \mathcal{F}_t \right]
6.4 Analytic bond price
Here we will give the defaultable bond price if we use the model discussed in section 5. A
complete proof can be found in appendix B. We find:
Theorem 5. If short rates follow a two-factor Hull-White model and the credit events of rating transitions and defaults are modeled as described in section 5, then the closed form solution of the price of an i-rated defaultable zero coupon bond at time t with maturity date T equals

P^i(t,T) = \sum_{j=1}^{K} \beta_{ij}^{K} \exp\Bigg( \gamma_{j0} - \frac{1}{2}\gamma_{j2}\sigma_S^2(t-s)
  + (\gamma_{j1}+\gamma_{j2}-1)\, E^{Q}\Big[ \int_t^T r(\tau)\,d\tau \,\Big|\, \mathcal{F}_t \Big]
  + \frac{1}{2}(\gamma_{j1}+\gamma_{j2}-1)^2\, \mathrm{var}\Big( \int_t^T r(\tau)\,d\tau \,\Big|\, \mathcal{F}_t \Big)
  + \frac{1}{2}(\gamma_{j2})^2 (t-s)\sigma_S^2
  + 2(\gamma_{j1}+\gamma_{j2}-1)\gamma_{j2} \Big[ \frac{\sigma_r \sigma_S \rho_{r,S}}{a^2} \big( a(t-s) + e^{-a(t-s)} - 1 \big)
  + \frac{\sigma_u \sigma_S \rho_{Su}}{a-b} \Big( \frac{1}{a^2}\big( e^{-a(t-s)} - 1 \big) - \frac{1}{b^2}\big( e^{-b(t-s)} - 1 \big) \Big) \Big] \Bigg)

E^{Q}\left[ \int_t^T r(\tau)\,d\tau \,\middle|\, \mathcal{F}_s \right] and \mathrm{var}\left( \int_t^T r(\tau)\,d\tau \,\middle|\, \mathcal{F}_s \right) can be found in appendix B.
Proof. See appendix B.
This is for zero recovery. For constant nonzero fractional recovery of market value δ, (Duffie and Singleton 1999b) prove that in that case

P^i(t,T) = E^{Q}\left[ e^{-\int_t^T \left( r(\tau) + (1-\delta)\lambda_i(\tau) \right) d\tau} \right]

This leads to a simple adjustment in our price:

E^{Q}\left[ e^{-\int_t^T r(\tau)+(1-\delta)\lambda_i(\tau)\,d\tau} \right] = E^{Q}\left[ e^{-\int_t^T r(\tau)\,d\tau}\, e^{-\int_t^T (1-\delta)\lambda_i(\tau)\,d\tau} \right]
= E^{Q}\left[ e^{-\int_t^T r(\tau)\,d\tau} \left( -B\, e^{\int_t^T (1-\delta)\mu(\tau)\,d\tau}\, B^{-1} \right)_{iK} \right]
= \sum_{j=1}^{K} \beta_{ij}^{K}\, E^{Q}\left[ e^{\int_t^T \left( (1-\delta)\mu_j(\tau) - r(\tau) \right) d\tau} \right]

So if we define γ̃ := (1 − δ)γ, the formula for the price remains the same but with γ̃ as the coefficient matrix.
6.5 Model implied credit spreads
Figure 4: Continuously compounded yield spreads after the calibration procedure

Now that we have a closed form solution of the defaultable bond price, we can easily derive the model implied credit spreads. When the literature refers to the credit spread, in most cases this means the credit yield spread.^{20} It is defined as the yield difference between defaultable and non-defaultable bonds. Hence, the credit yield spread at time t of an i-rated bond with maturity T equals:

YS_i(t,T) = \frac{\ln P(t,T) - \ln P^i(t,T)}{T - t}
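As a sketch, the spread computation from two discount bond prices; the function name and the prices below are illustrative, not model output:

```python
import math

def credit_yield_spread(P_riskfree, P_defaultable, t, T):
    """Continuously compounded yield spread of a defaultable over a
    non-defaultable discount bond, both maturing at T, seen from time t."""
    return (math.log(P_riskfree) - math.log(P_defaultable)) / (T - t)

spread = credit_yield_spread(0.82, 0.78, 0.0, 5.0)  # illustrative prices
```

Equivalently, the defaultable price equals the riskfree price discounted at the spread over the remaining life: P^i = P · e^{−YS·(T−t)}.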
In section 5 we derived, by means of calibration to observed spot spreads, a parameter set for our model. However, our estimated spot spreads did not satisfy the Stochastic Monotonicity Constraint. Figure 4 depicts the model implied credit spreads if we use this particular parameter set. We can see that credit spreads cross each other and some lie very close to each other. Moreover, if we calculate transition matrices, we see a high rate of negative transition matrix entries. The exact cause should be further investigated, but it is likely that this is due to the fact that the Stochastic Monotonicity Constraint is not satisfied, and that the very low initial short rate yields a relatively high probability of negative interest rates. So, especially in financial-crisis markets, this model is hard to calibrate while keeping simulations realistic.
In order to derive some realistic results, for the remainder of this thesis we use stylized parameters for the spot spreads and the current term structure of U.S. interest rates. We choose the spot spreads, in basis points, to be^{21}

ŝ(0) = (16, 20, 27, 44, 89, 150, 250)

^{20} Sometimes the difference in forward rates is meant.
^{21} Notice that we now have 8 rating categories including default.
Figure 5: Continuously compounded model implied yield spreads using the stylized parameters (credit spread against years to maturity, for rating classes AAA through CCC)
And the Nelson-Siegel parameters of the current term structure of U.S. interest rates: β0 = 0.045, β1 = −0.015, β2 = 0, τ = 4.
Calibration, i.e. solving the systems in section 5.4, results in the (K−1) × 3 matrix γ = [γ0 γ1 γ2]:

        [ −0.1465   1.0545   1.0351 ]
        [ −0.1609   1.4673   1.1155 ]
        [ −0.0535   0.5142   0.4033 ]
    γ = [ −0.1347   1.1902   0.9280 ]
        [ −0.1094   0.9591   0.8020 ]
        [ −0.1128   0.9803   0.7967 ]
        [ −0.1201   1.0471   0.8281 ]
The resulting term structure of credit spreads is depicted in figure 5. Credit spreads are still high, but we do see what is often observed in the market: investment grade spreads have an upward sloping and speculative grade spreads a downward sloping yield curve. We also find no, or only very small, negative entries in the transition matrices, so we use these parameter values for the remaining sections.
6.6 Checking the bond price by means of Monte Carlo
We can estimate a confidence interval for the price of a defaultable discount bond by applying
Monte Carlo simulations. These are the steps:
1. Simulate sample paths of the economic variables by means of a discretization scheme. Throughout this thesis we will use the Euler scheme to simulate sample paths. Euler discretization of the BSHWII model with time step ∆t means generating a three dimensional time series (S(t), r(t), u(t)), t = 0, 1, 2, ..., iteratively by means of:

S(t+1) = S(t) + r(t)S(t)∆t + σ_S S(t)√∆t Z_S(t),   S(0) = S_0
r(t+1) = r(t) + [θ(t) + u(t) − a r(t)]∆t + σ_r √∆t Z_r(t),   r(0) = r_0
u(t+1) = u(t) − b u(t)∆t + σ_u √∆t Z_u(t),   u(0) = u_0

where Z_S(t), Z_r(t), Z_u(t) are random numbers sampled from a multivariate normal distribution with zero mean, unit variances, and correlation matrix

        [ 1      ρ_rS    ρ_Su ]
    Σ = [ ρ_rS   1       ρ_ru ]
        [ ρ_Su   ρ_ru    1    ]

Sampling from a multivariate normal distribution can be done by decomposing the joint distribution into uncorrelated marginal distributions, for instance with a Cholesky decomposition. This is also the method we will use throughout this thesis.
2. Approximate the integral over r(t) appearing in the formula for the price by

\int_a^b X(t)\,dt \approx \sum_{j=a/\Delta t}^{b/\Delta t} X(j)\,\Delta t
3. If N paths are sampled with the Euler scheme and n denotes the n-th path, then using (31) the sample price of an i-rated defaultable discount bond at time t with T years to maturity in sample path n equals

P^i_n(t,T) = \sum_{j=1}^{K} \beta_{ij}^{K} \exp\left( -\gamma_{j0}(t-s) + (1+\gamma_{j1}) \sum_{k=t/\Delta t}^{T/\Delta t} r^n(k)\,\Delta t + \gamma_{j2} \ln \frac{S^n(T/\Delta t)}{S^n(t/\Delta t)} \right)
4. The Monte Carlo estimate (MCE) of the price is then simply the sample mean of the sample prices,

P^{i}_{MCE}(t,T) = \frac{1}{N} \sum_{n=1}^{N} P^i_n(t,T)

We can also determine an α% confidence interval for the defaultable bond price. By the Central Limit Theorem the sample mean converges in distribution to a normal distribution for large N. In order to derive a confidence interval we need the variance of the exponential term; since we do not know it, we use the sample variance to estimate it:

S^2(P_i) = \frac{1}{N-1} \sum_{n=1}^{N} \left( P^i_n(t,T) - P^{i}_{MCE}(t,T) \right)^2
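The steps above can be sketched as follows. For brevity the sketch prices the non-defaultable bond, i.e. it simulates only the two short-rate factors and omits the stock and intensity terms; all parameter values, function names and the left-point Riemann sum are illustrative assumptions:

```python
import numpy as np

def euler_short_rate_integrals(n_paths, n_steps, dt, r0, u0, a, b, theta,
                               sig_r, sig_u, rho_ru, seed=0):
    """Steps 1-2: Euler scheme for the two-factor Hull-White short rate and a
    left-point Riemann approximation of the integral of r over [0, T]."""
    rng = np.random.default_rng(seed)
    corr = np.array([[1.0, rho_ru], [rho_ru, 1.0]])
    L = np.linalg.cholesky(corr)            # decompose into uncorrelated factors
    r = np.full(n_paths, r0, dtype=float)
    u = np.full(n_paths, u0, dtype=float)
    integral_r = np.zeros(n_paths)
    sq = np.sqrt(dt)
    for _ in range(n_steps):
        Zr, Zu = L @ rng.standard_normal((2, n_paths))
        integral_r += r * dt                # left-point approximation of the integral
        r = r + (theta + u - a * r) * dt + sig_r * sq * Zr
        u = u - b * u * dt + sig_u * sq * Zu
    return integral_r

def mc_discount_bond(integral_r):
    """Steps 3-4: Monte Carlo estimate and 95% confidence interval for
    P(0, T) = E[exp(-integral of r)]; 1.96 is Phi^{-1}(0.975)."""
    disc = np.exp(-integral_r)
    est = disc.mean()
    half = 1.96 * disc.std(ddof=1) / np.sqrt(len(disc))
    return est, (est - half, est + half)
```

Setting the volatilities to zero and θ = a·r0 keeps the short rate constant, which gives a deterministic check: the estimate must equal exp(−r0·T) with a zero-width interval.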
Figure 6: Monte Carlo check of the analytic price of a AAA bond (blue line) and a CCC bond (red line). The dotted lines are the upper and lower bounds of the 95% confidence intervals of the Monte Carlo estimates (500 scenarios); the solid lines are the analytic prices. The analytic prices are always within the bounds, so our analytic price seems correct.
So the α confidence interval equals

\left( P^{i}_{MCE}(t,T) - \Phi^{-1}\!\left(1 - \tfrac{\alpha}{2}\right) \frac{S(P^i(t,T))}{\sqrt{N}},\; P^{i}_{MCE}(t,T) + \Phi^{-1}\!\left(1 - \tfrac{\alpha}{2}\right) \frac{S(P^i(t,T))}{\sqrt{N}} \right)
Using the parameter set given in the previous section, we derived 95% confidence intervals for a AAA and a CCC bond and compared them with the prices obtained from our closed form solution, to check whether the latter lie within the intervals. In figure 6 we see that the analytic prices always lie between the upper and lower bounds of the 95% confidence interval for different maturities, so we can say with high confidence that the pricing formula we derived is correct.
7 Portfolio Simulation
We now have the two main ingredients to simulate risk neutral scenarios for the value of a
portfolio including defaultable bonds: (1) an analytic formula for the prices of defaultable bonds
and (2) risk neutral transition matrices. We use the risk neutral transition matrices to simulate
rating transitions (including default) of the bonds in the portfolio, and the analytic prices
to determine the value of the different bonds in each scenario. Probabilities implicit in the simulation represent the risk neutral measure because we both price and migrate in the risk neutral world. Remember that the purpose of this model is to price embedded options with a Monte Carlo estimate, and therefore we need risk neutral scenarios of the portfolio.
In this section we start by explaining the simulation framework in general and then we derive
some results for different investment portfolios.
7.1 Simulation Framework
We simulate S^e economic scenarios (paths) of stock prices, interest rates and the latent HWII
factor u. In each scenario we have different transition matrices and different bond prices. There
are two options here: (1) we assume an infinitely large portfolio and therefore transitions occur
exactly according to their transition probabilities, or (2) we also simulate the actual transitions
of the bonds in the portfolio. If we choose to also simulate rating scenarios, we can simulate one rating scenario in each economic scenario or multiple rating scenarios in each economic scenario. The latter option significantly increases computational complexity, since in that case we have to simulate multiple transition paths in each economic scenario. This results in a simulation on top of a simulation, due to our doubly stochastic model. If we were, for instance, to simulate 100 transition and 500 economic scenarios, we would have to simulate 50,000 scenarios in total. The problem with assuming that rating transitions happen exactly according to their probabilities is that this understates the risk of small portfolios. Consider the extreme example of a portfolio
of only one 15-year AAA zero-coupon bond with nominal value 100. At maturity this bond
is either worth 100 or the recovery value, but if we do not simulate the transitions our model
produces a lot of different portfolio values. Every scenario has a different value of the portfolio
according to the different default probabilities (at maturity the bond is worth the nominal value
independently of the rating except for the default state). Say that in a certain scenario the
default probability is 0.05 and we have zero recovery then, without simulating transitions, the
portfolio value equals 95. Moreover, only with very high default probabilities^{22} is it possible to get a negative return on the bond. This can be seen in figure 7. Note that we do see a left skewed distribution with a fat left tail. We also report that the sample mean of the geometric return equals 0.0434, which is very close to the sample mean of the geometric return of the money market account, 0.0424. So there are no arbitrage opportunities: each asset under measure Q accumulates wealth on average equal to the money market account. The difference is explained by the fact that we calculated geometric returns for the AAA bond instead of continuously compounded returns, and by possible approximation errors.
Now let's consider the same bond portfolio with only the 15-year AAA bond, but with simulation

^{22} More precisely, we get a negative return at maturity if the default probability is higher than 1 − P/100, where P is the price of the bond.
Figure 7: Distribution of the return at maturity of a 15-year AAA bond, simulated under the assumption that transitions happen exactly according to the transition probabilities. The red line represents the fit of a normal distribution.
of the transitions. The resulting distribution can be seen in figure 8. Here we see that there are indeed only two possibilities for a defaultable bond at maturity: default, which corresponds to a return of −1.0, or the nominal value. So we see an extremely fat tail and a heavily skewed distribution. We can conclude that, especially for small portfolios in the sense of only a few issuers, we also need to simulate transitions in order to get realistic dynamics.
These are the steps in our simulation framework:
1. Under the risk neutral measure Q we simulate a set of economic scenarios Ω^e = {1, 2, ..., S^e} of the short rate r(t), stock price S(t), and latent variable u(t) according to the model of section 5, for 0 ≤ t ≤ M. Here M is the risk horizon, which in the case of embedded option pricing is the maturity of the option. Outcomes are denoted as r_i(t), S_i(t), and u_i(t), where i is an element of Ω^e.
2. Determine the transition matrices for each 0 ≤ t ≤ M in each scenario in Ω^e, i.e. calculate the intensity eigenvalues and compose the transition matrices using the fixed B. Hence for each 0 ≤ t ≤ M we have S^e transition matrices. Which t's we need transition matrices for depends on the applied investment strategy. For buy and hold, we only need the transition matrices that give the transition probabilities for the holding period M. For yearly rebalancing, the 1-year transition matrices are needed for every integer t with 0 ≤ t ≤ M.
3. Simulate in each economic scenario, for each bond in the portfolio and for each 0 ≤ t ≤ M ,
the rating at time t. Again if we apply a buy and hold strategy we only need to simulate the
ratings of the bonds in the portfolio for time M. Simulation is done by drawing a random
number x from a (0,1) uniformly distributed variable X. Consider bond n ∈ {1, .., N } with
Figure 8: Distribution of the return at maturity of a 15-year AAA bond, where the transitions are also simulated. The red line represents the fit of a normal distribution.
initial rating k. If we denote by Π^c(s,t) the cumulated transition matrix between s and t, then the simulated rating at time t equals:

rating_n(t) = 1   if X ≤ Π^c_{k,1}(0,t)
rating_n(t) = l   if Π^c_{k,l−1}(0,t) < X ≤ Π^c_{k,l}(0,t), for l ∈ {2, ..., K−1}
rating_n(t) = K   if Π^c_{k,K−1}(0,t) < X

Note: we do not simulate multiple rating scenarios for each economic scenario; per economic scenario we simulate only one rating scenario.
4. Calculate the price of each bond in the portfolio for each "combined" economic-rating scenario. This is done using the closed form solutions of the defaultable discount bond prices, which give us the price of a defaultable bond in each economic scenario, at every time 0 ≤ t ≤ M, for every maturity and every rating. Each economic-rating scenario contains the simulated weights of the portfolio, so by multiplying these weights with the prices for that scenario we obtain the portfolio value in that scenario.
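Step 3 amounts to inverting a cumulated transition row with a uniform draw; a minimal sketch, with the function name and example row being illustrative assumptions:

```python
import numpy as np

def simulate_rating(cum_row, x):
    """Map a uniform(0,1) draw x to a rating index (0-based; the last index
    is the default state) using one row of the cumulated transition matrix."""
    # first category whose cumulative transition probability covers the draw
    return int(np.searchsorted(np.asarray(cum_row), x, side="left"))

# Illustrative cumulated row for a bond with current rating k:
# P(stay) = 0.90, P(downgrade) = 0.07, P(default) = 0.03
cum_row = [0.90, 0.97, 1.00]
```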
In step 3 it is also possible to simulate multiple rating scenarios for each economic scenario. As stated before, this gives rise to much more computational complexity and is in most cases not very useful, but in some cases it can be. JP Morgan's CreditMetrics™ takes a similar approach to ours, but draws from, and takes quantiles of, a normal distribution instead of a uniform distribution. If we draw from univariate normal distributions this is exactly the same, but we could also draw from a multivariate normal distribution and imply direct correlations between issuers. This can be useful if, for instance, we have bonds of two large financial institutions in our portfolio and we know that one of these financials also has a large position in the other. To be able to draw correlated normally distributed variables we can use a Cholesky decomposition, for
Figure 9: Distribution of the return at year 10 of a portfolio of 10 15-year AAA bonds and 5 15-year CCC bonds. Sample kurtosis is 2.959 and sample skewness −0.508.
instance. When including such correlations, multiple rating scenarios within an economic scenario can be very useful.
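A sketch of such correlated draws via a Gaussian copula: Cholesky-correlate standard normals, then map them through the standard normal CDF to obtain correlated uniforms for the transition draws. The function name and the correlation value are illustrative assumptions:

```python
import math
import numpy as np

def correlated_uniforms(corr, n_draws, seed=0):
    """Correlated uniform(0,1) draws via a Gaussian copula: correlate standard
    normals with a Cholesky factor, then apply the standard normal CDF."""
    L = np.linalg.cholesky(np.asarray(corr, dtype=float))
    Z = L @ np.random.default_rng(seed).standard_normal((len(corr), n_draws))
    # standard normal CDF via the error function
    return 0.5 * (1.0 + np.vectorize(math.erf)(Z / math.sqrt(2.0)))

# Two issuers whose rating transitions we want to correlate directly (illustrative)
U = correlated_uniforms([[1.0, 0.8], [0.8, 1.0]], 10_000)
```

Each row of U can then be fed into the cumulated-transition-row inversion of step 3, one row per issuer.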
7.2 Simulation Results
We simulate 1,000 economic scenarios and only consider portfolios with a buy and hold strategy, zero recovery, and only zero-coupon bonds. This can easily be extended to the case of coupon bonds and positive recovery. The first portfolio we consider consists of 10 AAA and 5 CCC bonds, all with a maturity of 15 years and notional 100. The resulting distribution of the return on this portfolio, with zero recovery, can be seen in figure 9.
Again, as expected, we see a left skewed leptokurtic distribution, hence with a fat left tail. This is also reflected by the sample kurtosis and skewness. The worst case scenarios do not show strongly negative returns because our portfolio has relatively many AAA bonds. The probability that one or more of these bonds defaults within 10 years is relatively small, so defaults of the CCC bonds do not have a very big impact on the total portfolio value. For comparison, consider the return distribution of a portfolio of 10 CCC bonds, all with a maturity of 15 years (figure 10).
Here we can clearly see that this is a much riskier portfolio, and that in several of the 1,000 scenarios all CCC bonds have defaulted after 10 years. What is remarkable is that the distribution is skewed not to the left but to the right, causing a longer right tail, which we can also see from the positive sample skewness. This is probably because CCC rated bonds have, especially over a longer time period such as 10 years, relatively high upside potential with low probability. A CCC bond can be upgraded, which gives a positive jump in the yield. Also, the closer time gets to the maturity date, the bigger the chance that the return
Figure 10: Distribution of the return at year 10 of a portfolio of 10 15-year CCC bonds. Sample kurtosis is 2.745 and sample skewness 0.3963.
Figure 11: Distribution of the return at year 10 of a portfolio of 10 15-year AAA bonds. Sample kurtosis is 5.3609 and sample skewness −1.0678.
Figure 12: Distribution of the return at year 10 of a portfolio of 10 15-year bonds for each rating class, so 70 different issuers in total. Sample kurtosis is 2.506 and sample skewness −0.040.
over the money market account will equal the complete credit spread of a CCC rated bond.
Now let's see whether a portfolio with many issuers (70), 10 for each rating, has a more Gaussian distribution and is less risky than the smaller portfolios. So we consider a portfolio with 10 different bonds in each rating class, all from different issuers, with 15 years to maturity and zero recovery. The distribution of the return can be seen in figure 12. Diversification does reduce kurtosis and skewness risk, but we still see fat tails, even with 70 different issuers. Nevertheless, the probability of negative returns is much lower than for the smaller portfolios we considered.
8 Conclusion
In this thesis we developed a risk neutral simulation framework for portfolios including credit risky bonds. Credit risk and market risk are combined, and credit spreads are correlated with treasury rates and stock returns. We also model the credit event of rating transitions, including default, which introduces jumps in the credit spread of a particular bond, and we allow for a positive constant recovery of market value. One of our main results is the derivation of a closed form solution for the price of defaultable bonds within the Black-Scholes two-factor Hull-White economic framework. This was one of the main, and at the same time most challenging, goals of this thesis.
The drawback of a model that allows a closed form solution of the defaultable bond price is that one is restricted to simple processes for the risk drivers, i.e. Gaussian and CIR processes. Because our state variables are Gaussian and we needed spreads to correlate with these variables, we have a positive probability of negative spreads. This is due to the fact that we model our intensities as affine in the state variables, hence they are also Gaussian. Therefore there is a positive probability of negative transition intensities, which gives rise to negative default and transition probabilities. Calibrating the model while keeping the probability of negative spreads low is therefore complex and requires further work. We suggested a calibration procedure, but in today's financial-crisis markets it produced negative transition probabilities and negative spreads with high probability. This was mainly because the observed spreads did not satisfy the stochastic monotonicity constraint for spreads, and because the initial short rate was very low (so it could become negative quickly). The only solution to this problem in the literature is to price defaultable bonds by a branching method which caps the transition probabilities. In our case this is not an option because it increases computational complexity exponentially.
With a stylized set of parameters we simulated risk neutral portfolio returns. These possessed most of the desired properties of a risk neutral credit risk model. The distribution was highly skewed and had fat tails due to downgrade and default events. Importantly, on average portfolios accumulate wealth equal to the money market account, which is a crucial property in the risk neutral world: investors are risk neutral and therefore risky securities do not earn a risk premium. If this were not satisfied, the model would have allowed arbitrage opportunities. We also suggested a way to model direct correlations between bonds in a portfolio.
We have managed to completely fulfill the thesis objective: the model we derived has the same, and sometimes better, properties than the credit model in PALM. One improvement is that we do not have to specify the Radon-Nikodym process, i.e. the mapping of real world to risk neutral probabilities; in PALM this is done using a very strong assumption. Another improvement is that our pricing model also takes the stochastic transition probabilities into account, whereas in PALM bonds are priced with fixed transition matrices.
We contributed to the literature in several ways. To begin, we derived a simulation framework for risk neutral scenarios of credit risky portfolios which incorporates both market and credit risk. This had been done for real world scenarios but not for risk neutral scenarios. Secondly, we extended the credit risk model of (Lando 1998) in multiple ways: (1) instead of a Vasicek model we use the two-factor Hull-White model for the short rate, and (2) we also include the instantaneous return on stocks as a state variable. But our main contribution is the
derivation of a closed form solution of the defaultable bond price in our modeling framework.
Further work can be done on assessing the model further, for instance how portfolio credit risk is correlated with the economic state variables. It is also possible to extend the simulation framework with other credit risky securities such as callable bonds, structured products and credit derivatives. Calibrating the model to the current term structure of spreads, without a high probability of negative transition probabilities, is also a topic for future research. Finally, introducing direct correlations in a CreditMetrics-like approach, as suggested earlier, could be an interesting extension of the model and should be further explored.
References
Allen, D., L. Thomas, and H. Zheng (2000). Stripping coupons with linear programming. The
Journal of Fixed Income 10 (2), 80–87. 32, 33, 67
Amato, J. and E. Remolona (2003). The credit spread puzzle. BIS Quarterly Review 22,
51–64. 72
Arvanitis, A., J. Gregory, and J. Laurent (1999). Building models for credit spreads. The
Journal of Derivatives 6 (3), 27–43. 20, 28, 31
Black, F. and J. Cox (1976). Valuing corporate securities: Some effects of bond indenture
provisions. Journal of Finance, 351–367. 5
Black, F. and M. Scholes (1973). The pricing of options and corporate liabilities. Journal of
political economy 81 (3), 637. 5
Brigo, D. and F. Mercurio (2006). Interest rate models: theory and practice: with smile,
inflation, and credit. Springer Verlag. 58
Briys, E. and F. De Varenne (1997). Valuing risky fixed rate debt: An extension. Journal of
Financial and Quantitative Analysis, 239–248. 5
Capuano, C. (2008). The Option-iPoD. The Probability of Default Implied by Option Prices
Based on Entropy. Technical report, IMF Working Paper 08/194 (Washington, International Monetary Fund). 5
Çetin, U., R. Jarrow, P. Protter, and Y. Yildirim (2004). Mathematics¿ Probability Title:
Modeling Credit Risk with Partial Information. Journal reference: Annals of Applied
Probability 14 (3), 1167–1178. 8
Davis, M. and V. Lo (2001). Modelling default correlation in bond portfolios. Mastering
Risk 2, 141–151. 7
Delianedis, G. and R. Geske (1999). Credit Risk And Risk Neutral Default Probabilities:
Information About Rating Migrations And Defaults. 5
Duffie, D., D. Filipović, and W. Schachermayer (2003). Affine processes and applications in
finance. Annals of Applied Probability, 984–1053. 7
Duffie, D. and D. Lando (2001). Term structures of credit spreads with incomplete accounting
information. Econometrica, 633–664. 7, 8
Duffie, D. and K. Singleton (1999a). Modeling term structures of defaultable bonds. Review
of Financial Studies 12 (4), 687–720. 6
Duffie, D. and K. Singleton (1999b). Modeling term structures of defaultable bonds. Review
of Financial Studies 12 (4), 687–720. 38
Duffie, D. and K. Singleton (2003). Credit risk: pricing, measurement, and management.
Princeton University Press Princeton, NJ. 27, 37
Ericsson, J. and J. Reneby (1998). A framework for valuing corporate securities. Applied
Mathematical Finance 5 (3), 143–163. 5
Frey, R., C. Prosdocimi, and W. Runggaldier (2008). Affine credit risk models under incomplete information. In Stochastic Processes and Applications to Mathematical Finance,
51
Proceedings of the 6th Ritsumeikan International Symposium, Ritsumeikan University,
Japan, 6-10 March 2006, pp. 97–114. 8
Frey, R. and W. Runggaldier (2007). Credit risk and incomplete information: a nonlinear
filtering approach. Preprint. 8
Geske, R. (1977). The valuation of corporate liabilities as compound options. Journal of
Financial and Quantitative Analysis, 541–552. 5
Giesecke, K. (2006). Default and information. Journal of Economic Dynamics and Control 30 (11), 2281–2303. 8
Habti, F. (2007). Valuation of swaptions and guaranteed annuity options with a two-factor
hull-white model. Master’s thesis, Erasmus University Rotterdam. 15
Heath, D., R. Jarrow, and A. Morton (1992). Bond pricing and the term structure of interest
rates: A new methodology for contingent claims valuation. Econometrica: Journal of the
Econometric Society, 77–105. 13
Huang, J. and W. Kong (2003). Explaining Credit Spread Changes. The journal of derivatives 11 (1), 30–44. 23, 32
Hull, J. and A. White (1990). Pricing interest-rate-derivative securities. Review of financial
studies, 573–592. 12, 13
Hull, J. and A. White (1994). Numerical Procedures for Implementing Term Structure Models
II. The Journal of Derivatives 2 (2), 37–48. 13
Ingersoll, J. (1977). A contingent-claims valuation of convertible securities. Journal of Financial Economics 4 (3), 289–321. 5
Israel, R., J. Rosenthal, and J. Wei (2001). Finding generators for Markov chains via empirical
transition matrices, with applications to credit ratings. Mathematical Finance 11 (2), 245–
265. 24, 30
Jarrow, R., D. Lando, and S. Turnbull (1997). A Markov model for the term structure of
credit risk spreads. Review of financial studies, 481–523. 6, 17, 20, 24
Jarrow, R. and S. Turnbull (1995). Pricing derivatives on financial securities subject to credit
risk. Journal of finance, 53–85. 6
Jarrow, R. and F. Yu (2001). Counterparty risk and the pricing of defaultable securities.
Journal of Finance, 1765–1799. 7
Jones, E., S. Mason, and E. Rosenfeld (1984). Contingent claims analysis of corporate capital
structures: an empirical analysis. Journal of Finance 39 (611-625). 5
Kealhofer, S. (1995). Managing default risk in portfolios of derivatives. Derivative Credit Risk ,
49–66. 6
Kijima, M. and K. Komoribayashi (1998). A Markov chain model for valuing credit risk
derivatives. The Journal of Derivatives 6 (1), 97–108. 73
Kim, I., K. Ramaswamy, and S. Sundaresan (1993). Does default risk in coupons affect the
valuation of corporate bonds?: A contingent claims model. Financial Management, 117–
131. 5
Lando, D. (1998). On Cox processes and credit risky securities. Review of Derivatives Research 2 (2), 99–120. 7, 23, 30, 37, 49
Li, D. (2000). On Default Correlation. The Journal of Fixed Income 9 (4), 43–54. 6
Londen, J. (2002). Modeling long-term credit bond portfolios within an ALM context. Master's thesis, Financial Econometrics, Erasmus University Rotterdam. 8, 68, 70
Longstaff, F. and E. Schwartz (1995). A simple approach to valuing risky fixed and floating
rate debt. Journal of finance, 789–819. 5
Merton, R. (1974). On the pricing of corporate debt: The risk structure of interest rates.
Journal of finance, 449–470. 5
Merton, R. (1977). On the pricing of contingent claims and the Modigliani-Miller theorem.
Journal of Financial Economics 5 (2), 241–249. 5
Nelson, C. and A. Siegel (1987). Parsimonious modeling of yield curves. The Journal of Business 60 (4), 473–489. 34
Nielsen, L., J. Saà-Requejo, and P. Santa-Clara (1993). Default risk and interest rate risk:
The term structure of default spreads. Arbeitspapier, INSEAD, Fontainebleau. 5
Ritchken, P. and I. Chuang (2000). Interest rate option pricing with volatility humps. Review
of Derivatives Research 3 (3), 237–262. 13
Saa-Requejo, J. and P. Santa-Clara (1997). Bond pricing with default risk. Anderson Graduate
School of Management Working Paper 13. 5
Schonbucher, P. (2000). A Tree Implementation of a Credit Spread Model for Credit Derivatives. Bonn Econ Discussion Papers. 23
Schonbucher, P. and D. Schubert (2001). Copula-dependent default risk in intensity models.
Department of Statistics, Bonn University, Working Paper . 7
Smith, C. and J. Warner (1979). On financial contracting: An analysis of bond covenants.
Journal of Financial Economics 7 (2), 117–161. 5
Taleb, N. (2007). The black swan: the impact of the highly improbable. Random House Inc. 4
Vasicek, O. (1987). Probability of loss on loan portfolio. KMV Corporation 12. 6
Wang, D. (1998). Pricing defaultable debt: some exact results. Arxiv preprint cond-mat/9808168. 5
Wei, J. (2000). A multi-factor, Markov chain model for credit migrations and credit spreads. Journal of International Money and Finance 23. 7, 8
A Definitions
Roughly speaking, a model can be either deterministic or stochastic, and either static or dynamic in time. The simplest case is that of a static and deterministic model, which is not very useful because it is just a number. More useful is a stochastic but static model, which is nothing other than a model of a random variable. What we need is a model that is dynamic over time and captures the uncertainty of financial objects, so one that is also stochastic. This is what characterizes a stochastic process: it is the dynamic equivalent of a random variable.
In order to give a formal definition of a stochastic process we need a short review of probability theory, especially the probability space (Ω, F, P). The probability space consists of the sample space Ω, the event space or σ-algebra F, and the measure P which assigns probabilities to events. A random variable X maps Ω to R, X : Ω → R. More formally:
Definition 4. The probability space associated with a random experiment is a triple (Ω, F, P)
where:
1. Ω is the set of all possible outcomes of the random experiment, and it is called the sample
space.
2. F is a family of subsets of Ω which has the structure of a σ-field:
(a) ∅ ∈ F
(b) If A ∈ F, then also A^c ∈ F
(c) If A_1, A_2, ... ∈ F, then ⋃_{i=1}^∞ A_i ∈ F
3. P is a function which assigns a probability P(A) to each set A ∈ F and has the following
properties:
(a) 0 ≤ P(A) ≤ 1
(b) P(Ω) = 1
(c) For any sequence A_1, A_2, ... of disjoint sets in F:
P( ⋃_{i=1}^∞ A_i ) = Σ_{i=1}^∞ P(A_i)
A stochastic process is the following:
Definition 5. Stochastic Process Given a set T, a collection of random variables X = (Xt )t∈T
is a stochastic process.
Definition 6. Filtration A filtration F is a collection of σ-algebras: F = (Ft )t∈T for which
Fs ⊂ Ft if s ≤ t.
In finance, F_t can be interpreted as the information available at time t. This is a very important concept because it facilitates modeling conditional on available information, and modeling how key variables change as more information becomes available. Think, for instance, of the influence of the current and historical stock prices on the expected stock price one year ahead. Another important notion is that of adaption.
Definition 7. Adaption A process X(t) is adapted to the filtration F = (Ft )t∈T if ∀t X(t) is
Ft -measurable.
So conditional on F_t, the information available at time t, the value of the process X(t) is known. Think for example of an investment strategy as a process adapted to the stock prices observed so far. It has to be adapted and cannot be anticipative, i.e. it cannot depend on uncertain future values of the stock price.
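To make the notion of adaptedness concrete, the following sketch (an illustration only; the random walk stands in for a stock price, and all names are hypothetical) contrasts an adapted trading rule, which uses information up to time t, with an anticipative one, which would peek at the future:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a discrete random walk as a stand-in for a stock price path.
steps = rng.choice([-1.0, 1.0], size=100)
price = 100.0 + np.cumsum(steps)

# An adapted strategy: the position held over (t, t+1] may only use
# prices observed up to and including time t (here: a momentum rule).
position = np.zeros_like(price)
position[1:] = np.sign(price[1:] - price[:-1])   # uses info up to time t

# An anticipative (NOT adapted) rule would be
#   position[t] = sign(price[t+1] - price[t])    # not F_t-measurable!

# P&L over (t, t+1] uses the position chosen at time t.
pnl = position[:-1] * (price[1:] - price[:-1])
```

The adapted rule is F_t-measurable at every t; the commented-out rule is not, which is exactly what Definition 7 excludes.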
B Proofs

B.1 Lemmas

We start by deriving a few important lemmas which are used extensively in the proofs in the rest of this section.
Lemma 5. Let Y (t) be an integral transform of the martingale process Z(t) by means of the
process X(t), then ∀s ≤ t
E[Y (t)|Fs ] = Y (s)
Proof.
E[Y(t) − Y(s)|F_s] = E[ Y(s) + ∫_s^t X(τ)dZ(τ) − Y(s) | F_s ]
= E[ ∫_s^t X(τ)dZ(τ) | F_s ]
= E[ lim_{Δτ↓0} Σ_i X(τ_i)(Z(τ_{i+1}) − Z(τ_i)) | F_s ]
= lim_{Δτ↓0} Σ_i E[ X(τ_i) E[Z(τ_{i+1}) − Z(τ_i)|F_{τ_i}] | F_s ]
= lim_{Δτ↓0} Σ_i E[ X(τ_i) · 0 | F_s ]
= 0
So in other words, an integral transform of a martingale process is again a martingale process.
Corollary 3. If X(t) is adapted to the Wiener process W(t), then
E[ ∫_s^t X(τ)dW(τ) | F_s ] = 0
Hence, the expected value of an integral transform of a Wiener process equals zero.
Lemma 6. Let the process Z(t) be adapted to the Wiener process W(t), then
var[ ∫_s^t Z(τ)dW(τ) ] = ∫_s^t E[Z(τ)²]dτ
Proof. Let X(t) be an integral transform of the Wiener process W(t):
X(t) = ∫_s^t Z(τ)dW(τ)
This is equivalent to dX(t) = Z(t)dW(t). Then by the Itô rule and the fact that the quadratic variation of X(t), d[X(t), X(t)], is given by Z²(t)dt,
dX²(t) = 2X(t)dX(t) + d[X(t), X(t)] = 2Z(t)X(t)dW(t) + Z²(t)dt
Therefore,
E[X²(t)] = E[ ∫_s^t dX²(τ) ]
= 2E[ ∫_s^t Z(τ)X(τ)dW(τ) ] + E[ ∫_s^t Z²(τ)dτ ]
= ∫_s^t E[Z²(τ)]dτ
From this it follows that,
var[ ∫_s^t Z(τ)dW(τ) ] = E[X²(t)] − E[X(t)]² = E[X²(t)] − 0 = ∫_s^t E[Z²(τ)]dτ
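Lemma 6 (the Itô isometry) can be checked numerically. The sketch below is a hedged Monte Carlo illustration with the deterministic integrand Z(τ) = e^{−τ}, chosen purely for convenience; it compares the sample variance of the discretized stochastic integral with ∫_0^1 E[Z(τ)²]dτ.

```python
import numpy as np

rng = np.random.default_rng(1)

# Check var[∫_0^1 Z(τ)dW(τ)] = ∫_0^1 E[Z(τ)²]dτ for Z(τ) = exp(-τ).
n_paths, n_steps, T = 50_000, 200, 1.0
dt = T / n_steps
# Midpoints of the time grid (reduces the quadrature bias of Σ Z² dt).
tau = (np.arange(n_steps) + 0.5) * dt
Z = np.exp(-tau)

# Brownian increments: ΔW_i ~ N(0, dt), one row per simulated path.
dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
ito_integral = dW @ Z                    # Σ_i Z(τ_i) ΔW_i per path

sample_var = ito_integral.var()
exact_var = (1.0 - np.exp(-2.0 * T)) / 2.0   # ∫_0^1 e^{-2τ} dτ ≈ 0.4323
```

With 50,000 paths the two variances agree to roughly two decimal places.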
Lemma 7. Let X(t) be an integral transform of the Wiener process W1(t) by means of the process Z(t), let Y(t) be an integral transform of the Wiener process W2(t) by means of the process V(t), and let ρ be the correlation between W1(t) and W2(t). Then the covariance between X(t) and Y(t) is given by
cov[X(t), Y(t)] := cov[ ∫_s^t Z(τ)dW1(τ), ∫_s^t V(τ)dW2(τ) ] = ρ ∫_s^t E[Z(τ)V(τ)]dτ
Proof.
dX(t) = Z(t)dW1(t)
dY(t) = V(t)dW2(t)
Then by Itô's lemma and the fact that the quadratic covariation between X(t) and Y(t), d[X(t), Y(t)], is given by ρZ(t)V(t)dt,
d(X(t)Y(t)) = Y(t)dX(t) + X(t)dY(t) + d[X(t), Y(t)]
= Y(t)Z(t)dW1(t) + X(t)V(t)dW2(t) + ρZ(t)V(t)dt
Therefore,
E[X(t)Y(t)] = E[ ∫_s^t d(X(τ)Y(τ)) ]
= E[ ∫_s^t Y(τ)Z(τ)dW1(τ) ] + E[ ∫_s^t X(τ)V(τ)dW2(τ) ] + E[ ∫_s^t ρZ(τ)V(τ)dτ ]
= ρ ∫_s^t E[Z(τ)V(τ)]dτ
From this it follows that,
cov[X(t), Y(t)] = cov[ ∫_s^t Z(τ)dW1(τ), ∫_s^t V(τ)dW2(τ) ]
= E[X(t)Y(t)] − E[X(t)]E[Y(t)]
= E[X(t)Y(t)] − 0 = ρ ∫_s^t E[Z(τ)V(τ)]dτ
B.2 Proof of theorem 1
Recall that in the HW2 model the short rate r(t) follows the process:
dr(t) = [θ(t) + u(t) − ar(t)]dt + σ1 dW1(t)   (27)
du(t) = −bu(t)dt + σ2 dW2(t)   (28)
In (Brigo and Mercurio 2006) it is shown by simple integration and substitution that this is equivalent to expressing r(t) conditional on the σ-algebra F_s as follows:
r(t) = r(s)e^{−a(t−s)} + u(s)(e^{−b(t−s)} − e^{−a(t−s)})/(a−b) + ∫_s^t θ(τ)e^{−a(t−τ)}dτ
+ σ1 ∫_s^t e^{−a(t−τ)}dW1(τ) + σ2/(a−b) ∫_s^t (e^{−b(t−τ)} − e^{−a(t−τ)})dW2(τ)   (29)
As we saw earlier, the stochastic integrals in (29) can be written as the limit in probability of a sum of scaled increments of a Wiener process, i.e. approximated by a linear combination of increments of a Wiener process. Since these increments are independent and normally distributed, and normality is preserved in the limit, the stochastic integrals are also normally distributed random variables. Hence r(t) is a normally distributed variable for every t, and therefore r(t) is a Gaussian process. So by specifying only the expectations and covariances we can completely describe the process r(t). Note that conditional on F_s, r(t) is normally distributed.
We use corollary 3 to derive:
E[r(t)|F_s] = r(s)e^{−a(t−s)} + u(s)(e^{−b(t−s)} − e^{−a(t−s)})/(a−b) + ∫_s^t θ(τ)e^{−a(t−τ)}dτ
+ σ1 E[ ∫_s^t e^{−a(t−τ)}dW1(τ) | F_s ] + σ2/(a−b) E[ ∫_s^t (e^{−b(t−τ)} − e^{−a(t−τ)})dW2(τ) | F_s ]
= r(s)e^{−a(t−s)} + u(s)(e^{−b(t−s)} − e^{−a(t−s)})/(a−b) + ∫_s^t θ(τ)e^{−a(t−τ)}dτ   (30)
We also need an explicit formula for the variance of r(t) conditional on the σ-algebra F_s. Using lemmas 6 and 7 we obtain,
var[r(t)|F_s] = σ1² var[ ∫_s^t e^{−a(t−τ)}dW1(τ) | F_s ]
+ σ2²/(a−b)² var[ ∫_s^t (e^{−b(t−τ)} − e^{−a(t−τ)})dW2(τ) | F_s ]
+ 2σ1σ2/(a−b) cov[ ∫_s^t e^{−a(t−τ)}dW1(τ), ∫_s^t (e^{−b(t−τ)} − e^{−a(t−τ)})dW2(τ) | F_s ]
= σ1² ∫_s^t e^{−2a(t−τ)}dτ
+ σ2²/(a−b)² ∫_s^t (e^{−2b(t−τ)} + e^{−2a(t−τ)} − 2e^{−(b+a)(t−τ)})dτ
+ 2σ1σ2ρ/(a−b) ∫_s^t (e^{−(b+a)(t−τ)} − e^{−2a(t−τ)})dτ
= σ1² (1 − e^{−2a(t−s)})/(2a)
+ σ2²/(a−b)² [ (1 − e^{−2a(t−s)})/(2a) + (1 − e^{−2b(t−s)})/(2b) − 2(1 − e^{−(a+b)(t−s)})/(a+b) ]
+ 2σ1σ2ρ/(a−b) [ (1 − e^{−(a+b)(t−s)})/(a+b) − (1 − e^{−2a(t−s)})/(2a) ]
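The conditional variance above can be validated by simulation. The sketch below is an assumption-laden illustration: it Euler-discretizes (27)–(28) with a constant θ and arbitrary, purely illustrative parameter values, and compares the sample variance of r(T) with the closed-form expression.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical parameters (illustration only); theta is taken constant.
a, b, s1, s2, rho = 0.5, 0.1, 0.01, 0.005, 0.3
theta, r0, u0, T = 0.02, 0.03, 0.0, 1.0
n_paths, n_steps = 20_000, 500
dt = T / n_steps

r = np.full(n_paths, r0)
u = np.full(n_paths, u0)
for _ in range(n_steps):
    dW1 = rng.normal(0.0, np.sqrt(dt), n_paths)
    # Correlated second Brownian increment with correlation rho.
    dW2 = rho * dW1 + np.sqrt(1 - rho**2) * rng.normal(0.0, np.sqrt(dt), n_paths)
    r = r + (theta + u - a * r) * dt + s1 * dW1   # uses u before its update
    u = u - b * u * dt + s2 * dW2

def closed_form_var(t):
    e = np.exp
    term1 = s1**2 * (1 - e(-2*a*t)) / (2*a)
    term2 = (s2 / (a - b))**2 * ((1 - e(-2*a*t))/(2*a) + (1 - e(-2*b*t))/(2*b)
                                 - 2*(1 - e(-(a+b)*t))/(a+b))
    term3 = 2*rho*s1*s2/(a - b) * ((1 - e(-(a+b)*t))/(a+b) - (1 - e(-2*a*t))/(2*a))
    return term1 + term2 + term3

# r.var() should be close to closed_form_var(T)
```

The deterministic terms of r(t) do not affect the variance, so the comparison isolates the stochastic-integral part of (29).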
B.3 Proof of lemma 1
The covariance between r(u) and r(v), conditional on F_s where s ≤ v ≤ u, is equal to:
cov[r(u), r(v)|F_s] = e^{−a(u+v)} [ σ2²/(a−b)² ∫_s^v (e^{v(a−b)+bτ} − e^{aτ})(e^{u(a−b)+bτ} − e^{aτ})dτ
+ σ1² ∫_s^v e^{2aτ}dτ + ρσ1σ2/(a−b) ∫_s^v (e^{a(v+τ)−b(v−τ)} + e^{a(u+τ)−b(u−τ)} − 2e^{2aτ})dτ ]
Proof. Due to expression (29) in the appendix, the covariance between r(u) and r(v) is equal to:
cov[r(u), r(v)|F_s] = cov[ σ1 ∫_s^u e^{−a(u−τ)}dW1(τ) + σ2/(a−b) ∫_s^u (e^{−b(u−τ)} − e^{−a(u−τ)})dW2(τ),
σ1 ∫_s^v e^{−a(v−τ)}dW1(τ) + σ2/(a−b) ∫_s^v (e^{−b(v−τ)} − e^{−a(v−τ)})dW2(τ) | F_s ]
= A + B + C + D
Where,
A = cov[ σ1 ∫_s^u e^{−a(u−τ)}dW1(τ), σ1 ∫_s^v e^{−a(v−τ)}dW1(τ) | F_s ]
B = cov[ σ1 ∫_s^u e^{−a(u−τ)}dW1(τ), σ2/(a−b) ∫_s^v (e^{−b(v−τ)} − e^{−a(v−τ)})dW2(τ) | F_s ]
C = cov[ σ2/(a−b) ∫_s^u (e^{−b(u−τ)} − e^{−a(u−τ)})dW2(τ), σ1 ∫_s^v e^{−a(v−τ)}dW1(τ) | F_s ]
D = cov[ σ2/(a−b) ∫_s^u (e^{−b(u−τ)} − e^{−a(u−τ)})dW2(τ), σ2/(a−b) ∫_s^v (e^{−b(v−τ)} − e^{−a(v−τ)})dW2(τ) | F_s ]
Because s ≤ v ≤ u we can write,
A = cov[ σ1 ∫_s^v e^{−a(u−τ)}dW1(τ) + σ1 ∫_v^u e^{−a(u−τ)}dW1(τ), σ1 ∫_s^v e^{−a(v−τ)}dW1(τ) | F_s ]
= cov[ σ1 ∫_s^v e^{−a(u−τ)}dW1(τ), σ1 ∫_s^v e^{−a(v−τ)}dW1(τ) | F_s ]
+ cov[ σ1 ∫_v^u e^{−a(u−τ)}dW1(τ), σ1 ∫_s^v e^{−a(v−τ)}dW1(τ) | F_s ]
= cov[ σ1 ∫_s^v e^{−a(u−τ)}dW1(τ), σ1 ∫_s^v e^{−a(v−τ)}dW1(τ) | F_s ]
Here we used that increments of a Wiener process are independent, so they have zero covariance.
Due to lemma 7 we can write,
A = σ1² ∫_s^v e^{−a(u−τ)} e^{−a(v−τ)}dτ
In a similar way we can derive that,
B = σ1σ2ρ/(a−b) ∫_s^v e^{−a(u−τ)} (e^{−b(v−τ)} − e^{−a(v−τ)})dτ
C = σ1σ2ρ/(a−b) ∫_s^v (e^{−b(u−τ)} − e^{−a(u−τ)}) e^{−a(v−τ)}dτ
D = σ2²/(a−b)² ∫_s^v (e^{−b(u−τ)} − e^{−a(u−τ)})(e^{−b(v−τ)} − e^{−a(v−τ)})dτ
Straightforward simplification results in lemma 1.
B.4 Proof of the analytic defaultable bond price

Theorem 6. If short rates follow a two-factor Hull-White model and the credit events of rating transitions and defaults are modeled as described in section 5, then the closed form solution of the price of an i-rated defaultable zero coupon bond at time t with maturity date T equals,
P^i(t,T) = Σ_{j=1}^K β_{ijK} exp( γ_{j0} − ½γ_{j2}σ_S²(t−s) + (γ_{j1} + γ_{j2} − 1) E^Q[∫_s^t r(τ)dτ]
+ ½ [ (γ_{j1} + γ_{j2} − 1)² var[∫_s^t r(τ)dτ] + (γ_{j2})²(t−s)σ_S²
+ 2(γ_{j1} + γ_{j2} − 1)γ_{j2} ( σ_rσ_Sρ_{r,S}/a² (a(t−s) + e^{−a(t−s)} − 1)
+ σ_uσ_Sρ_{Su}/(a−b) ( (b(t−s) + e^{−b(t−s)} − 1)/b² − (a(t−s) + e^{−a(t−s)} − 1)/a² ) ) ] )
Expressions for E^Q[∫_s^t r(τ)dτ] and var[∫_s^t r(τ)dτ] can be found in this appendix.
Proof. As we saw in corollary 2, the price at time t of an i-rated defaultable zero coupon bond maturing at T equals E^Q[exp(−∫_t^T r(τ) + λ_i(τ)dτ)]. If λ_i(t) is modeled as discussed in section 5 we obtain the following. Define:
β_{ijK} = −B_{ij} B_{jK}^{−1}
E^Q[ e^{−∫_t^T r(τ)+λ_i(τ)dτ} ] = E^Q[ e^{−∫_t^T r(τ)dτ − ∫_t^T λ_i(τ)dτ} ]
= E^Q[ e^{−∫_t^T r(τ)dτ} Σ_{j=1}^K (−B_{ij}B_{jK}^{−1}) e^{∫_t^T μ_j(τ)dτ} ]
= Σ_{j=1}^K β_{ijK} E^Q[ e^{−∫_t^T r(τ)dτ} e^{∫_t^T μ_j(τ)dτ} ]
= Σ_{j=1}^K β_{ijK} E^Q[ e^{∫_t^T μ_j(τ)−r(τ)dτ} ]
Because we assumed the affine dependence μ_j(t) = γ_{j0} + γ_{j1}r(t) + γ_{j2}R^S(t), we get
P^i(t,T) = Σ_{j=1}^K β_{ijK} E^Q[ e^{∫_t^T μ_j(τ)−r(τ)dτ} ] = Σ_{j=1}^K β_{ijK} E^Q[ e^{∫_t^T γ_{j0}+γ_{j1}r(τ)+γ_{j2}R^S(τ)−r(τ)dτ} ]   (31)
Obtaining a closed form expression for (31) is possible because it can be shown that r(t) and R^S(t) are both normally distributed variables. Because an integral is basically an infinite sum, also ∫_s^t r(τ)dτ and ∫_s^t R^S(τ)dτ are normally distributed. Finally we can use that if Z is normally distributed with mean μ_Z and variance σ_Z², the moment generating function equals:
M_Z(t) = exp( μ_Z t + ½σ_Z² t² )
Hence, M_Z(1) = E[e^Z] = e^{μ_Z + ½σ_Z²}. So it follows that if we know that Z is normally distributed, then in order to calculate the expected value of the exponential transform of Z we only need μ_Z and σ_Z².
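This moment-generating-function identity is easy to verify numerically; the sketch below uses illustrative parameter values only.

```python
import numpy as np

rng = np.random.default_rng(3)

# Check E[e^Z] = exp(mu + sigma^2 / 2) for Z ~ N(mu, sigma^2).
mu, sigma = 0.1, 0.25
Z = rng.normal(mu, sigma, size=1_000_000)

mc = np.exp(Z).mean()                    # Monte Carlo estimate of E[e^Z]
exact = np.exp(mu + 0.5 * sigma**2)      # M_Z(1), approximately 1.1403
```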
The risk neutral price process of stocks is described by the Black-Scholes model but with stochastic short rate, as described in section 3:
dS(t) = r(t)S(t)dt + σ_S S(t)dW_S(t)
Here the Wiener process W_S(t) is correlated with W_r(t) with correlation ρ_{rS}. By using the telescope rule and equation (8) we get,
∫_s^t R^S(τ)dτ = ln( S(t)/S(s) ) = ∫_s^t r(τ)dτ − ((t−s)/2)σ_S² + ∫_s^t σ_S dW_S(τ)
Hence,
E^Q[ exp( ∫_s^t γ_{j0} + (γ_{j1}−1)r(τ) + γ_{j2}R^S(τ)dτ ) ]
= E^Q[ exp( ∫_s^t γ_{j0} + (γ_{j1}−1)r(τ)dτ + γ_{j2}( ∫_s^t r(τ)dτ − ((t−s)/2)σ_S² + ∫_s^t σ_S dW_S(τ) ) ) ]
= E^Q[ exp( γ_{j0} − ½γ_{j2}σ_S²(t−s) + (γ_{j1}+γ_{j2}−1) ∫_s^t r(τ)dτ + γ_{j2} ∫_s^t σ_S dW_S(τ) ) ]
= exp( γ_{j0} − ½γ_{j2}σ_S²(t−s) ) E^Q[ exp( (γ_{j1}+γ_{j2}−1) ∫_s^t r(τ)dτ + γ_{j2} ∫_s^t σ_S dW_S(τ) ) ]
= exp( γ_{j0} − ½γ_{j2}σ_S²(t−s) ) exp( E^Q[ (γ_{j1}+γ_{j2}−1) ∫_s^t r(τ)dτ + γ_{j2} ∫_s^t σ_S dW_S(τ) ]
+ ½ var[ (γ_{j1}+γ_{j2}−1) ∫_s^t r(τ)dτ + γ_{j2} ∫_s^t σ_S dW_S(τ) ] )
The expression in the exponential is normally distributed with:
E^Q[ (γ_{j1}+γ_{j2}−1) ∫_s^t r(τ)dτ + γ_{j2} ∫_s^t σ_S dW_S(τ) ]
= (γ_{j1}+γ_{j2}−1) E^Q[ ∫_s^t r(τ)dτ ] + γ_{j2} E^Q[ ∫_s^t σ_S dW_S(τ) ]
= (γ_{j1}+γ_{j2}−1) E^Q[ ∫_s^t r(τ)dτ ]
and,
var[ (γ_{j1}+γ_{j2}−1) ∫_s^t r(τ)dτ + γ_{j2} ∫_s^t σ_S dW_S(τ) ]
= (γ_{j1}+γ_{j2}−1)² var[ ∫_s^t r(τ)dτ ] + (γ_{j2})² var[ ∫_s^t σ_S dW_S(τ) ]
+ 2(γ_{j1}+γ_{j2}−1)γ_{j2} cov[ ∫_s^t r(τ)dτ, ∫_s^t σ_S dW_S(τ) ]
From lemma 6 we know that,
var[ ∫_s^t σ_S dW_S(τ) ] = ∫_s^t σ_S² dτ = (t−s)σ_S²
To proceed we use the following,
Lemma 8.
cov[ ∫_s^t r(τ)dτ, ∫_s^t σ_S dW_S(τ) ] = σ_rσ_Sρ_{r,S}/a² (a(t−s) + e^{−a(t−s)} − 1)
+ σ_uσ_Sρ_{Su}/(a−b) ( (b(t−s) + e^{−b(t−s)} − 1)/b² − (a(t−s) + e^{−a(t−s)} − 1)/a² )
Proof. First note that,
cov[ ∫_s^t r(τ)dτ, ∫_s^t σ_S dW_S(v) ] = ∫_s^t cov[ r(τ), ∫_s^t σ_S dW_S(v) ]dτ   (32)
Now we can derive that,
cov[ r(τ), ∫_s^t σ_S dW_S(v) ]
= cov[ σ_r ∫_s^τ e^{−a(τ−u)}dW_r(u) + σ_u/(a−b) ∫_s^τ (e^{−b(τ−u)} − e^{−a(τ−u)})dW_u(u), ∫_s^t σ_S dW_S(v) ]
= σ_rσ_S cov[ ∫_s^τ e^{−a(τ−u)}dW_r(u), ∫_s^τ dW_S(u) ]
+ σ_uσ_S/(a−b) cov[ ∫_s^τ (e^{−b(τ−u)} − e^{−a(τ−u)})dW_u(u), ∫_s^τ dW_S(u) ]
Due to lemma 7 we can proceed as follows,
σ_rσ_S cov[ ∫_s^τ e^{−a(τ−u)}dW_r(u), ∫_s^τ dW_S(u) ] = σ_rσ_Sρ_{S,r} ∫_s^τ e^{−a(τ−u)}du = σ_rσ_Sρ_{S,r}/a (1 − e^{−a(τ−s)})
and,
σ_uσ_S/(a−b) cov[ ∫_s^τ (e^{−b(τ−u)} − e^{−a(τ−u)})dW_u(u), ∫_s^τ dW_S(u) ]
= σ_uσ_Sρ_{Su}/(a−b) ∫_s^τ (e^{−b(τ−u)} − e^{−a(τ−u)})du = σ_uσ_Sρ_{Su}/(a−b) ( (1 − e^{−b(τ−s)})/b − (1 − e^{−a(τ−s)})/a )
Conclude that using (32) we derive,
∫_s^t cov[ r(τ), ∫_s^t σ_S dW_S(v) ]dτ
= ∫_s^t σ_rσ_Sρ_{S,r}/a (1 − e^{−a(τ−s)})dτ + ∫_s^t σ_uσ_Sρ_{Su}/(a−b) ( (1 − e^{−b(τ−s)})/b − (1 − e^{−a(τ−s)})/a )dτ
= σ_rσ_Sρ_{r,S}/a² (a(t−s) + e^{−a(t−s)} − 1)
+ σ_uσ_Sρ_{Su}/(a−b) ( (b(t−s) + e^{−b(t−s)} − 1)/b² − (a(t−s) + e^{−a(t−s)} − 1)/a² )
So we obtained the following expression for the variance,
var[ (γ_{j1}+γ_{j2}−1) ∫_s^t r(τ)dτ + γ_{j2} ∫_s^t σ_S dW_S(τ) ]
= (γ_{j1}+γ_{j2}−1)² var[ ∫_s^t r(τ)dτ ] + (γ_{j2})²(t−s)σ_S²
+ 2(γ_{j1}+γ_{j2}−1)γ_{j2} ( σ_rσ_Sρ_{r,S}/a² (a(t−s) + e^{−a(t−s)} − 1)
+ σ_uσ_Sρ_{Su}/(a−b) ( (b(t−s) + e^{−b(t−s)} − 1)/b² − (a(t−s) + e^{−a(t−s)} − 1)/a² ) )
Now consider the following lemma.
Lemma 9. If Z(t) ∼ N(μ(t), σ²(t)) ∀t > 0 with cov[Z(u), Z(v)] ≠ 0 for some u, v, then ∫_a^b Z(t)dt is normally distributed with expectation and variance given by,
E[ ∫_a^b Z(t)dt ] = ∫_a^b μ(t)dt
var[ ∫_a^b Z(t)dt ] = ∫_a^b ∫_a^b cov[Z(u), Z(v)]dudv
Proof. Let t_0, ..., t_n be a partition of the interval [a, b], so a = t_0 < t_1 < ... < t_n = b, and define Δt_j := t_j − t_{j−1}. Then assuming ∫_a^b Z(t)dt exists,
E[ ∫_a^b Z(t)dt ] = E[ lim_{n→∞} Σ_{j=1}^n Z(t_j)Δt_j ]
= lim_{n→∞} E[ Σ_{j=1}^n Z(t_j)Δt_j ]
= lim_{n→∞} Σ_{j=1}^n E[Z(t_j)]Δt_j
= ∫_a^b μ(t)dt
And,
var[ ∫_a^b Z(t)dt ] = var[ lim_{n→∞} Σ_{j=1}^n Z(t_j)Δt_j ]
= lim_{n→∞} var[ Σ_{j=1}^n Z(t_j)Δt_j ]
= lim_{n→∞} Σ_{i=1}^n Σ_{j=1}^n cov[ Z(t_i)Δt_i, Z(t_j)Δt_j ]
= lim_{n→∞} Σ_{i=1}^n Σ_{j=1}^n Δt_iΔt_j cov[ Z(t_i), Z(t_j) ]
= ∫_a^b ∫_a^b cov[Z(u), Z(v)]dudv
Starting with the expectation, due to lemma 9 we have
E^Q[ ∫_s^t r(τ)dτ ] = ∫_s^t E^Q[r(τ)]dτ
Using theorem 1 we can rewrite this as,
∫_s^t E^Q[r(τ)]dτ = ∫_s^t ( r(s)e^{−a(τ−s)} + u(s)(e^{−b(τ−s)} − e^{−a(τ−s)})/(a−b) )dτ + ∫_s^t ∫_s^τ θ(v)e^{−a(τ−v)}dv dτ
It can easily be shown with partial integration that,
∫_s^t θ(v)e^{−a(t−v)}dv = f(0,t) + φ(0,t) − (f(0,s) + φ(0,s))e^{−a(t−s)}
Therefore by simple integration we obtain,
∫_s^t ∫_s^τ θ(v)e^{−a(τ−v)}dv dτ = (e^{−a(t−s)} − 1)/a (f(0,s) + φ(0,s)) + ∫_s^t f(0,τ) + φ(0,τ)dτ
Due to lemma 1 we can write
∫_s^t ∫_s^u cov(r(u), r(v))dv du =
∫_s^t ∫_s^u e^{−a(u+v)} [ σ2²/(a−b)² ∫_s^v (e^{v(a−b)+bτ} − e^{aτ})(e^{u(a−b)+bτ} − e^{aτ})dτ
+ σ1² ∫_s^v e^{2aτ}dτ + ρσ1σ2/(a−b) ∫_s^v (e^{a(v+τ)−b(v−τ)} + e^{a(u+τ)−b(u−τ)} − 2e^{2aτ})dτ ] dv du
We can use this to determine var[ ∫_s^t r(τ)dτ ]:
var[ ∫_s^t r(τ)dτ ] = ∫_s^t ∫_s^t cov[r(u), r(v)]dv du
Notice that in the integral v can be larger than u. Since cov[r(u), r(v)] = cov[r(v), r(u)] we can rewrite the above as,
∫_s^t ∫_s^t cov[r(u), r(v)]dv du = ∫_s^t ∫_s^u cov[r(u), r(v)]dv du + ∫_s^t ∫_u^t cov[r(u), r(v)]dv du
= 2 ∫_s^t ∫_s^u cov[r(u), r(v)]dv du
Obtaining a solution to the double integral is straightforward but cumbersome and results in the following,
∫_s^t ∫_s^u cov[r(u), r(v)]dv du = −1/(2a³b³(a−b)²(a+b)) ×
{ a⁴ [ 2b³σ1²(s−t) + 4b²ρσ1σ2(s−t) + 2bσ2( σ2(s−t) − 2ρσ1(e^{b(s−t)} − 1) ) + σ2²( −4e^{b(s−t)} + e^{2b(s−t)} + 3 ) ]
− a³b [ −b²σ1( σ1(−4e^{a(s−t)} + e^{2a(s−t)} + 3) + 4ρσ2(t−s) ) + 2bσ2( −2ρσ1e^{(a+b)(s−t)} + 2ρσ1e^{a(s−t)} + σ2(s−t) ) + 2b³σ1²(s−t) − σ2²( e^{2b(s−t)} − 1 ) ]
− a²b² [ b²σ1( σ1(−4e^{a(s−t)} + e^{2a(s−t)} + 3) + 4ρσ2(s−t) ) + 2bσ2( ρσ1( 2e^{(a+b)(s−t)} − 4e^{a(s−t)} + e^{2a(s−t)} − 2e^{b(s−t)} + 3 ) + σ2(s−t) ) − 4σ2²( −e^{(a+b)(s−t)} + e^{a(s−t)} + e^{b(s−t)} − 1 ) + 2b³σ1²(s−t) ]
+ b⁴ ( −4e^{a(s−t)} + e^{2a(s−t)} + 3 )( b²σ1² + 2bρσ1σ2 + σ2² )
+ ab³ [ −b²σ1( σ1(−4e^{a(s−t)} + e^{2a(s−t)} + 3) + 4ρσ2(t−s) ) + 2bσ2( 2ρσ1( e^{a(s−t)} − 1 ) + σ2(s−t) ) + σ2²( e^{2a(s−t)} − 1 ) + 2b³σ1²(s−t) ] }
C A Linear Programming Stripping Procedure
We adopted the procedure described in (Allen, Thomas, and Zheng 2000) to strip market prices of coupon bonds in order to obtain zero-coupon bond prices. This procedure can deal with different market prices for bonds with the same maturity and rating, as well as with the absence of a bond that matures in a certain year for a certain rating. The problem of stripping market prices of coupon bonds is formulated as a Linear Programming (LP) problem. If we denote v_j(t) as the estimated price of a j-rated discount bond with t years to maturity, P_i as the observed present value of bond i, and c_i(t) as the expected cashflow of bond i at time t, then the LP problem is:
min Σ_{i=1}^N (a_i + b_i)
s.t. P_i + a_i = Σ_{t=1}^T c_i(t) v_{d(i)}(t) + b_i
v_j(t+1) − v_{j+1}(t+1) ≥ v_j(t) − v_{j+1}(t)
a_i, b_i ≥ 0
for i = 1, ..., N, j = 0, ..., K−1, and t = 0, ..., T−1.
The paper also describes how to deal with cashflows that are not on the sampling dates, such as maturity and coupon dates. Let α be the time between a coupon date and the next sampling date, proportional to the sampling interval, and let β be the difference in time between the current date and the next sampling date, proportional to the sampling interval. If α ≥ β, there is no cashflow between the current and the next sampling date, and if α < β there is precisely one cashflow. (Allen, Thomas, and Zheng 2000) approximate the cashflows on the sampling dates with simple interpolation and show that if α ≥ β, the present value P_i equals the observed market price at the current date plus accrued interest (α − β)c_i. Here c_i is the semi-annually fixed coupon payment of bond i. Moreover, the approximated cashflows on the sampling dates are,
c_i(1) = αc_i
c_i(t) = c_i   for t = 2, ..., T−2
c_i(T−1) = c_i + αF_i
c_i(T) = (1−α)(c_i + F_i)
If α < β, the accrued interest is (1 + α − β)c_i and the approximated cashflows are,
c_i(0) = (α/β)c_i
c_i(1) = (α + (β−α)/β)c_i
c_i(t) = c_i   for t = 2, ..., T−2
c_i(T−1) = c_i + αF_i
c_i(T) = (1−α)(c_i + F_i)
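The LP above can be sketched with scipy. The toy instance below is an illustration only: it uses a single rating class (so the cross-rating monotonicity constraint drops out) and three hypothetical bonds whose prices are exactly consistent with discount factors 0.95, 0.90, 0.85, in which case the slack variables a_i, b_i are driven to zero and the v(t) are recovered.

```python
import numpy as np
from scipy.optimize import linprog

# Cashflow matrix: rows = bonds, columns = years 1..3 (hypothetical bonds).
C = np.array([[100.0,   0.0,   0.0],    # 1y zero coupon
              [  5.0, 105.0,   0.0],    # 2y 5% coupon
              [  5.0,   5.0, 105.0]])   # 3y 5% coupon
true_v = np.array([0.95, 0.90, 0.85])
prices = C @ true_v                      # consistent market prices

n_bonds, T = C.shape
# Variables: [v(1..T), a_1..a_N, b_1..b_N]; minimize sum(a_i + b_i).
obj = np.concatenate([np.zeros(T), np.ones(2 * n_bonds)])
# Equality  P_i + a_i = sum_t c_i(t) v(t) + b_i  <=>  C v - a + b = P.
A_eq = np.hstack([C, -np.eye(n_bonds), np.eye(n_bonds)])
bounds = [(0.0, 1.0)] * T + [(0.0, None)] * (2 * n_bonds)

res = linprog(obj, A_eq=A_eq, b_eq=prices, bounds=bounds, method="highs")
v_hat = res.x[:T]
# v_hat recovers [0.95, 0.90, 0.85] with objective value 0
```

With inconsistent or missing prices, the same program trades off the pricing errors a_i + b_i instead of solving exactly, which is the point of the formulation.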
D Credit risk model in PALM

In this appendix we explain the credit model of PALM (Londen 2002) in more detail.
D.1 Z-scores: mapping transition probabilities to the normal distribution
In order to model the transition probabilities, so-called z-score deviations are used. Basically, it is assumed that the deviation from the historic mean depends on the state of the economy. For modeling reasons, on which we will elaborate later, the probabilities are first mapped onto the normal distribution, which results in so-called z-scores. This relates to a methodology developed by JP Morgan and used in their widely used model CreditMetrics, namely relating the rating of an issuer directly to its asset return. Of course this is very simplistic, because rating agencies use much more sophisticated models with many state variables, but it does help us to understand and interpret what z-scores are. Because we are mapping to the normal distribution, we assume for the interpretation that the future asset return is normally distributed. If this assumption is not satisfied there are no consequences, because it is only a way to interpret the z-score approach. Actually we could have taken any continuous distribution, but the nice properties of the normal distribution ease calculation most.
The z-score of the default probability is defined as
z_{iK} = Φ^{−1}(p_{iK})
Here p_{iK} is the default probability of a bond currently in class i, Φ^{−1} the inverse cumulative distribution function of the standard normal distribution, and z_{iK} the resulting z-score for the default probability. So the probability of a standard normal variable being less than the z-score z_{iK} equals the default probability. This can be interpreted as default happening when the asset return has dropped below the z-score. This is the worst scenario and the biggest possible drop in credit quality. But since we have multiple rating classes, the bond's rating class can also drop to the second worst scenario. Suppose for example we have eight rating classes: AAA, AA, A, BBB, BB, B, CCC, and D (defaulted). Then the second worst scenario is that a bond drops to the CCC rating. The probability that the asset return is below the z-score z_{i,7} should then equal the probability that the bond either defaults or has a CCC rating. So z_{i,7} = Φ^{−1}(p_{i,7} + p_{i,8}). In general,
z_{ij} = Φ^{−1}(p_{ij} + p_{i,j+1} + ... + p_{iK})   (33)
If we define P_{ij} ≡ p_{ij} + p_{i,j+1} + ... + p_{iK}, then we can rewrite (33) as,
z_{ij} = Φ^{−1}(P_{ij})
Note that what is actually done here is dividing the domain of a standard normal distribution into bins with the z-scores as borders. So each transition matrix can be transformed one-to-one into a z-score matrix.
Z(t) = [ z(t)_{1,1}     z(t)_{1,2}     z(t)_{1,3}     ...  z(t)_{1,K−1}
         z(t)_{2,1}     z(t)_{2,2}     z(t)_{2,3}     ...  z(t)_{2,K−1}
         ...
         z(t)_{K−1,1}   z(t)_{K−1,2}   z(t)_{K−1,3}   ...  z(t)_{K−1,K−1} ]
Figure 13: Example of mapping the transition probabilities onto the standard normal distribution for a BBB rated bond and the resulting bins. The dotted lines represent the corresponding
z-scores.
Because the mapping is one-to-one, the opposite is also true: each z-score matrix can be uniquely transformed back into a transition matrix. Note that the matrix is (K−1) × (K−1) and not K × K like the transition matrix. This is due to the fact that in order to divide a domain into K bins, only K−1 boundaries are needed. To illustrate, consider in our example a BBB rated bond with the following transition probabilities.
             AAA     AA      A       BBB     BB      B       CCC     D
P_BBB(t) =   0.002   0.008   0.030   0.880   0.044   0.028   0.005   0.003
This corresponds to the following z-scores.
             AA      A       BBB     BB       B        CCC      D
Z_BBB(t) =   2.878   2.326   1.751   −1.405   −1.799   −2.409   −2.748
For example, z_{BBB,D}(t) = Φ^{−1}(0.003) = −2.748 and z_{BBB,BBB}(t) = Φ^{−1}(0.003 + 0.005 + 0.028 + 0.044 + 0.88) = 1.751. See figure 13 for a graphical representation of the mapping in this example.
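The mapping (33) is easy to reproduce; the sketch below recovers the Z_BBB(t) row of the example with scipy.

```python
import numpy as np
from scipy.stats import norm

# BBB transition row from the example above (ordered AAA ... D).
p_bbb = np.array([0.002, 0.008, 0.030, 0.880, 0.044, 0.028, 0.005, 0.003])

# z_ij = Phi^{-1}(p_ij + ... + p_iK): cumulate from the worst state upward.
tail = np.cumsum(p_bbb[::-1])[::-1]   # tail[j] = P_ij for j = AAA..D
z = norm.ppf(tail[1:])                # K-1 = 7 boundaries (AAA bin has no
                                      # lower boundary, so drop its entry)
# z ≈ [2.878, 2.326, 1.751, -1.405, -1.799, -2.409, -2.748]
```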
Now that we know how the probabilities are mapped onto the normal distribution, we can make the probabilities dependent on the macro-economy. We will then also see why this mapping is needed in the first place in order to model the economic dependence.
D.2 Relating the business cycle to the credit cycle: macro-economic dependence of the z-score deviations
In this subsection the macro-economic dependence of the transition probabilities is explained. Making the transition probabilities conditional on current information, hence on the state of the economy, is modeled through the z-scores rather than through the probabilities themselves. Moreover, only the average deviations of the z-scores from their historical mean are conditional on current information. So suppose we know the historical average transition matrix P and the annual transition matrices P(t) at each data point. Now we can transform these matrices to z-score matrices.
P → Z, P (t) → Z(t)
Now the z-score deviation at each data point equals
ΔZ(t) = Z − Z(t)
So this is simply the deviation from the historical average. Because empirical studies show that the shift in z-scores is more or less the same for each rating category, we will assume an equal magnitude of the shift in z-scores for a particular rating. Therefore we will only model the dependence between the credit and the business cycle through the average z-score deviation per rating category. This also makes the model more workable and reduces computation time by a big factor. The average z-score deviation per rating category is defined as,
Δz(t)_i = (1/(K−1)) Σ_{j=1}^{K−1} Δz(t)_{ij}
PALM assumes that these average z-score deviations are dependent on macro-economic variables. The average z-score deviation of a particular rating category also depends on the lagged average z-score deviations of the other ratings. This is modeled using dependency between the average z-score deviation and the common shift: the average of the average z-score deviations,
Δz(t) = (1/(K−1)) Σ_{i=1}^{K−1} Δz(t)_i
The dependence structure is described by means of a vector autoregressive model with one lag (VAR(1)):
Δz(t)_1 = β_0^1 + β_1^1 Δz(t−1) + β_2^1 Δz(t−1)_1 + β_3^1 X(t) + ε_1
Δz(t)_2 = β_0^2 + β_1^2 Δz(t−1) + β_2^2 Δz(t−1)_2 + β_3^2 X(t) + ε_2
...
Δz(t)_{K−1} = β_0^{K−1} + β_1^{K−1} Δz(t−1) + β_2^{K−1} Δz(t−1)_{K−1} + β_3^{K−1} X(t) + ε_{K−1}
Here X(t) is the vector of explanatory variables and β_3 the corresponding vector of coefficients. Note that if we had taken the probabilities directly, we would have had to use truncated regression techniques, because the dependent variable could only take values between zero and one. Moreover, the probabilities would have to sum to 1, which makes it very difficult to apply regression techniques.
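The per-rating equations above can be estimated one by one with ordinary least squares. The sketch below uses synthetic data (all numbers illustrative, one macro factor) just to show the layout of the design matrix.

```python
import numpy as np

rng = np.random.default_rng(4)

# Synthetic data: T observations of average z-score deviations for
# K-1 = 7 rating classes plus one macro-economic variable.
T, n_ratings = 40, 7
dz = rng.normal(size=(T, n_ratings))   # Δz(t)_i
x = rng.normal(size=T)                 # macro variable X(t)
common = dz.mean(axis=1)               # common shift, Δz(t)-bar

betas = []
for i in range(n_ratings):
    y = dz[1:, i]
    X = np.column_stack([np.ones(T - 1),   # β0: intercept
                         common[:-1],      # β1: lagged common shift
                         dz[:-1, i],       # β2: own lag
                         x[1:]])           # β3: contemporaneous macro
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    betas.append(beta)
betas = np.array(betas)   # one (β0, β1, β2, β3) row per rating class
```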
The following macro-economic variables are included in X(t) to describe the dependence between the credit and the business cycle (Londen 2002). The variables are US based because a large part of Standard & Poor's obligors are also US based.
1. Equity price index growth (EPI): In the Merton model the assets of a firm directly influence
the value of the credit bond. In the rating-based models the equity returns influence the
transitions.
2. Gross Domestic Product growth (GDP): Wilson (2000) finds that 90% of the variance of the default probability of the speculative grade is explained by a combination of GDP growth and the unemployment rate. GDP growth is also considered to be a good estimator for the health of the economy.
3. Unemployment ratio (UNEM): see point 2
4. MSCI US (MSCIUS): The MSCI gives the annual average return on equity. The Merton
model and the rating based model are both based on the return on equity.
5. Real estate NAREIT (OGUS): Gives the annual average return on real estate. Brown (2001) discusses that the real estate market is subject to the same uncertainties as the credit market. Brown also finds that adding Mortgage Backed Securities to the regression increases the R² of the regression by 50%.
6. Wage index (WI): The wage index influences the business cycle. Also, in a pension fund environment it is always considered.
7. Consumer Price Index (CPI): see point 6
All these macro-economic series are expressed in USD. Some of these series are highly correlated with each other, so we should be careful not to include both of two correlated variables, in order to avoid multicollinearity. In the PALM system a VAR model is used to simulate future time series of these economic variables. The parameters of this VAR model (so not those of the macro-economic dependence of the z-score deviations) are estimated using the Yule-Walker estimation method. The Yule-Walker estimation method produces US scenarios with averages, standard deviations and correlations between the variables equal to the historical values.
D.3 Valuation and Pricing of Defaultable Bonds in PALM
PALM has a discrete time model for yearly changes in interest rates and transition probabilities. Therefore the market value of a fixed income investment, such as a default free bond, equals the sum of the discounted future cashflows, with the current term structure of interest rates as discount factors.
V(t,T) = Σ_{n=1}^T CF(n) / (1 + r(t,n))^n
Here T is the maturity of the investment, CF(n) the cashflow at time n, and r(t,n) the n-year interest rate at time t. This is the no arbitrage price of an asset with no uncertainty about the cashflows. However, defaultable bonds do have uncertain cashflows due to default risk. This risk cannot be diversified away, so since investors are assumed to be risk averse it has to be priced.²³ Hence, on top of the interest rate, investors charge a risk premium ϕ to the issuer of the investment.
V(t) = Σ_{n=1}^T E_t^P(CF(n)) / (1 + r(t,n) + ϕ(t))^n
We can rewrite this using a change of measure, namely the equivalent martingale measure Q, to
V(t) = Σ_{n=1}^T E_t^Q(CF(n)) / (1 + r(t,n))^n
This is called a martingale measure because under this measure the (discounted) price process is a martingale; hence, conditional on current information, the best estimate of future values is the current value. In the literature you will also find the term risk neutral measure. This refers to the fact that this measure can be interpreted as resembling the beliefs of a risk neutral investor, equivalent to the real world probabilities, which reflect the beliefs of a risk averse investor.
Denote q_i(t,n) as the n-year risk neutral default probability at time t for a bond in rating class i, CF*(n) as the cashflow at n conditional on no default yet²⁴, and φ as the recovery percentage of the cashflows. Then
CF(n) = CF*(n)    w.p. 1 − q_i(t,n)
CF(n) = φCF*(n)   w.p. q_i(t,n)   (34)
So due to (34), the expected cash flow under the risk neutral measure equals
E_t^Q(CF(n)) = [1 − q_i(t,n)]CF*(n) + q_i(t,n)φCF*(n) = [1 − (1−φ)q_i(t,n)]CF*(n)   (35)
Now consider the example of a discount²⁵ bond with maturity T. According to (35), the price of such a bond at t simply equals
P(t,T) = [1 − (1−φ)q_i(t,T)] / (1 + r(t,T))^T   (36)
Next consider a bond with a yearly coupon of $100c$%, principal of \$K, and maturity T. The
conditional cashflows $CF^*(n)$ are $cK$ for $n = 1, 2, \ldots, T-1$, and $(1+c)K$ for $n = T$. Therefore
\[
P(t,T) = \sum_{n=1}^{T-1} \frac{cK\,[1 - (1-\phi)\, q_{iK}(t,n)]}{(1 + r(t,n))^{n}}
+ \frac{(1+c)K\,[1 - (1-\phi)\, q_{iK}(t,T)]}{(1 + r(t,T))^{T}} \tag{37}
\]
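As a numerical illustration of (36) and (37), the following sketch prices a defaultable discount and coupon bond from risk neutral default probabilities. It assumes a flat annually compounded rate for simplicity; the function and variable names are mine, not from PALM.

```python
# Sketch of pricing formulas (36) and (37): price of a defaultable
# discount / coupon bond given risk neutral default probabilities.
# Names are illustrative, not from the thesis.

def discount_bond_price(q, r, T, phi):
    """Eq. (36): zero coupon bond, principal 1, T-year default
    probability q, flat T-year rate r, recovery fraction phi."""
    return (1 - (1 - phi) * q) / (1 + r) ** T

def coupon_bond_price(q, r, T, phi, c, K):
    """Eq. (37): yearly coupon rate c on principal K; q[n-1] is the
    n-year risk neutral default probability."""
    price = sum(c * K * (1 - (1 - phi) * q[n - 1]) / (1 + r) ** n
                for n in range(1, T))
    price += (1 + c) * K * (1 - (1 - phi) * q[T - 1]) / (1 + r) ** T
    return price

# Numbers from the worked example in appendix D.5:
print(round(discount_bond_price(0.032, 0.04, 2, 0.1), 3))  # 0.898
```

With c = 0 and K = 1 the coupon formula collapses to the discount bond formula, which is a quick consistency check on (37).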
²³ A phenomenon that has kept researchers busy for a while now is the credit premium puzzle: systematic risk
only explains a part of the credit risk premium. Because the returns of defaultable bonds are highly skewed with
only limited upside potential, it takes tens of thousands of contracts to diversify idiosyncratic credit risk away.
This is why some of the literature suggests that a part of the idiosyncratic risk is priced as well (Amato and
Remolona 2003).
²⁴ Note that in this case $CF^*(n) = E_t^P(CF(n))$.
²⁵ Zero coupon, with principal normalized to 1.
To obtain the risk neutral probabilities we can make use of actual market prices of traded
bonds. Assuming no arbitrage opportunities, these prices should reflect investors' beliefs and
hence also the risk neutral beliefs. However, this does not give a unique measure unless we
impose some assumptions on the risk premium. Therefore in PALM it is assumed that
\[
q_{ij}(t) = \pi_i\, p_{ij}(t) \quad \forall i,\ j \neq K,
\qquad \text{and} \qquad
q_{iK}(t) = 1 - \sum_{j \neq K} q_{ij}(t)
\]
So for each current rating the real world transition probabilities, except the default probability,
are transformed into risk neutral probabilities by multiplying the real world probabilities by a
rating dependent "risk premium" $\pi_i$. The risk neutral default probability then simply follows
from the fact that the probabilities, for each rating class, should add up to one. Note that we
expect a risk premium between zero and one, $0 \leq \pi_i \leq 1$, because this results in a higher risk
neutral default probability.
To calibrate $\pi_i$, market prices of 1-year default free and defaultable bonds can be used. If $P(0,1)$
denotes the price at time 0 of a 1-year default free bond, and $P^i(0,1)$ that of a 1-year defaultable
bond with rating i, then (see Kijima and Komoribayashi (1998) for a proof):
\[
\pi_i = \frac{P^i(0,1) - \phi\, P(0,1)}{(1-\phi)\, P(0,1)} \cdot \frac{1}{1 - p_{iK}(0)} \tag{38}
\]
So by substituting the market prices $P(0,1)$ and $P^i(0,1)$ into (38) we can obtain $\pi_i$.
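Given the relation $q_{iK}(t) = 1 - \pi_i(1 - p_{iK}(t))$ implied by the PALM assumption above, (38) can be checked with a small numerical round trip: price a 1-year defaultable bond from a chosen $\pi_i$ and recover $\pi_i$ from the prices. The numbers below are illustrative only, not market data.

```python
# Round-trip check of calibration formula (38), assuming the relation
# q_iK = 1 - pi_i * (1 - p_iK) from the text. Illustrative numbers.

def defaultable_price(P, pi_i, p_iK, phi):
    """1-year defaultable bond price implied by risk premium pi_i:
    P^i(0,1) = P(0,1) * [1 - (1 - phi) * q_iK]."""
    q_iK = 1 - pi_i * (1 - p_iK)
    return P * (1 - (1 - phi) * q_iK)

def calibrate_pi(P, P_i, p_iK, phi):
    """Eq. (38): recover pi_i from default free / defaultable prices."""
    return (P_i - phi * P) / ((1 - phi) * P * (1 - p_iK))

P = 1 / 1.04        # 1-year default free discount bond price
pi_true = 0.9924    # risk premium, as in the example of D.4
p_iK = 0.02         # real world 1-year default probability (assumed)
phi = 0.1           # recovery fraction (assumed)
P_i = defaultable_price(P, pi_true, p_iK, phi)
print(round(calibrate_pi(P, P_i, p_iK, phi), 4))  # 0.9924
```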
D.4 A simple example of the PALM model
In order to get a better understanding of the credit part of the PALM model we now work out
a simple example. Suppose there are only 3 rating classes, A, B and D, where D denotes the
defaulted class. Consider the stylized historical transition matrices P (1),P (2),P (3),P (4),P (5).
\[
P(1) = \begin{pmatrix} 0.75 & 0.22 & 0.03 \\ 0.10 & 0.85 & 0.05 \\ 0 & 0 & 1 \end{pmatrix}, \quad
P(2) = \begin{pmatrix} 0.85 & 0.13 & 0.02 \\ 0.11 & 0.85 & 0.04 \\ 0 & 0 & 1 \end{pmatrix}, \quad
P(3) = \begin{pmatrix} 0.95 & 0.04 & 0.01 \\ 0.12 & 0.85 & 0.03 \\ 0 & 0 & 1 \end{pmatrix},
\]
\[
P(4) = \begin{pmatrix} 0.97 & 0.02 & 0.01 \\ 0.14 & 0.84 & 0.02 \\ 0 & 0 & 1 \end{pmatrix}, \quad
P(5) = \begin{pmatrix} 0.70 & 0.26 & 0.04 \\ 0.09 & 0.85 & 0.06 \\ 0 & 0 & 1 \end{pmatrix}
\]
The average historical transition matrix does not have to equal the average of the transition
matrices considered here. Each year, rating agencies publish the average historical transition
matrix, which typically makes use of much more data, so it is better to use those. In this
example we will, for the sake of simplicity, just use the average of the historical transition
matrices above,
\[
\bar P = \begin{pmatrix} 0.84 & 0.14 & 0.02 \\ 0.10 & 0.87 & 0.03 \\ 0 & 0 & 1 \end{pmatrix}
\]
The corresponding Z-score matrices are
\[
Z(1) = \begin{pmatrix} -0.674 & -1.881 \\ 1.282 & -1.645 \end{pmatrix}, \quad
Z(2) = \begin{pmatrix} -1.036 & -2.054 \\ 1.227 & -1.751 \end{pmatrix}, \quad
Z(3) = \begin{pmatrix} -1.645 & -2.326 \\ 1.175 & -1.881 \end{pmatrix},
\]
\[
Z(4) = \begin{pmatrix} -1.881 & -2.326 \\ 1.080 & -2.054 \end{pmatrix}, \quad
Z(5) = \begin{pmatrix} -0.524 & -1.751 \\ 1.341 & -1.555 \end{pmatrix}
\]
And,
\[
\bar Z = \begin{pmatrix} -0.994 & -2.054 \\ 1.282 & -1.881 \end{pmatrix}
\]
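The Z-score entries can be reproduced by applying the inverse standard normal CDF to cumulative transition probabilities, counted from the default state upwards. A sketch of that computation, using only the Python standard library (my reading of the construction; it reproduces Z(1) above):

```python
# Sketch: Z-score matrix from a transition matrix, reading the entries
# as z_ij = Phi^{-1}(probability of moving to class j or worse).
from statistics import NormalDist

inv_Phi = NormalDist().inv_cdf

def z_matrix(P):
    """P: rows = current ratings A, B (absorbing D row omitted),
    columns = probabilities of moving to A, B, D."""
    return [[inv_Phi(p[1] + p[2]),   # threshold: B or worse
             inv_Phi(p[2])]          # threshold: default
            for p in P]

# P(1) from the example (non-default rows only)
Z1 = z_matrix([[0.75, 0.22, 0.03], [0.10, 0.85, 0.05]])
print([[round(z, 3) for z in row] for row in Z1])
# [[-0.674, -1.881], [1.282, -1.645]]
```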
This results in the following Z-score deviations,
\[
\Delta Z(1) = \bar Z - Z(1) = \begin{pmatrix} -0.320 & -0.173 \\ 0 & -0.236 \end{pmatrix}, \quad
\Delta Z(2) = \begin{pmatrix} 0.042 & 0 \\ 0.055 & -0.130 \end{pmatrix}, \quad
\Delta Z(3) = \begin{pmatrix} 0.650 & 0.273 \\ 0.107 & 0 \end{pmatrix},
\]
\[
\Delta Z(4) = \begin{pmatrix} 0.236 & 0 \\ 0.095 & 0.173 \end{pmatrix}, \quad
\Delta Z(5) = \begin{pmatrix} -0.470 & -0.303 \\ -0.059 & -0.326 \end{pmatrix}
\]
The average Z-score deviations are
\[
\Delta z(1)_B = -0.246, \quad \Delta z(1)_D = -0.118, \qquad
\Delta z(2)_B = 0.021, \quad \Delta z(2)_D = -0.038,
\]
\[
\Delta z(3)_B = 0.461, \quad \Delta z(3)_D = 0.053, \qquad
\Delta z(4)_B = 0.118, \quad \Delta z(4)_D = 0.134,
\]
\[
\Delta z(5)_B = -0.387, \quad \Delta z(5)_D = -0.193
\]
The common shift per year is then
\[
\Delta z(1) = -0.182, \quad \Delta z(2) = -0.008, \quad \Delta z(3) = 0.257, \quad
\Delta z(4) = 0.126, \quad \Delta z(5) = -0.290
\]
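The averaging step can be sketched as follows, taking ∆Z(1) as input: the row averages give ∆z(1)_B and ∆z(1)_D, and their mean gives the common shift ∆z(1). This reading reproduces the reported −0.246, −0.118 and −0.182 after rounding.

```python
# Averaging the Z-score deviation matrix: per-row averages, then the
# common shift as the mean of those averages (values for t = 1).
dZ1 = [[-0.320, -0.173], [0.0, -0.236]]

row_avgs = [sum(row) / len(row) for row in dZ1]   # dz(1)_B, dz(1)_D
common_shift = sum(row_avgs) / len(row_avgs)      # dz(1)

print(row_avgs, common_shift)
```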
In this example we will only consider GDP growth (%) as explanatory variable for the
Z-score deviation. The resulting regression model is²⁶
\[
\Delta z(t)_B = \beta_{0B} + \beta_{1B}\, \Delta z(t-1)_B + \beta_{2B}\, GDP(t) + \varepsilon_B
\]
\[
\Delta z(t)_D = \beta_{0D} + \beta_{1D}\, \Delta z(t-1)_D + \beta_{2D}\, GDP(t) + \varepsilon_D
\]
Suppose GDP growth took the values (1, 2, 3, 4, 0.9). The resulting least squares estimates of
the coefficients are
\[
\hat\beta_{0B} = -0.58, \quad \hat\beta_{1B} = -0.73, \quad \hat\beta_{2B} = 0.28
\]
\[
\hat\beta_{0D} = -0.26, \quad \hat\beta_{1D} = -0.18, \quad \hat\beta_{2D} = 0.10
\]
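The least squares step can be sketched as a plain OLS fit on the four usable observations t = 2, …, 5. The thesis does not report intermediate output, so no coefficient values are asserted here; the sketch only shows the mechanics.

```python
# OLS sketch for dz(t)_B = b0 + b1 * dz(t-1)_B + b2 * GDP(t) + e,
# fitted on t = 2..5 (four observations, three coefficients).
import numpy as np

dz_B = [-0.246, 0.021, 0.461, 0.118, -0.387]   # dz(1)_B .. dz(5)_B
gdp = [1, 2, 3, 4, 0.9]                        # GDP growth (%)

y = np.array(dz_B[1:])                          # dz(2)_B .. dz(5)_B
X = np.column_stack([np.ones(4), dz_B[:-1], gdp[1:]])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)  # [b0_B, b1_B, b2_B]
```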
Intuitively the signs of these estimates create the right dynamics. A high deviation now
predicts a low deviation next period, which indicates mean reversion. Also, if the economy is
predicted to do well (a high GDP growth), the model predicts a high deviation, meaning lower
default probabilities. Not all of these coefficients are significant²⁷, but for now we neglect this
fact: the likely reason is that we do not have enough data to reject that the coefficients are
equal to zero. Now we can simulate transition matrices. Suppose we want to predict the
transition matrix one period ahead; then we first need a prediction or scenario of GDP(6), say
2.5. The average Z-score deviations are predicted using the regression model,
\[
\Delta \hat z(6)_B = \hat\beta_{0B} + \hat\beta_{1B}\, \Delta z(5)_B + \hat\beta_{2B}\, GDP(6) = 0.41
\]
\[
\Delta \hat z(6)_D = \hat\beta_{0D} + \hat\beta_{1D}\, \Delta z(5)_D + \hat\beta_{2D}\, GDP(6) = 0.03
\]
Now we can easily obtain the predicted Z-score matrix, and the equivalent transition matrix,
by subtracting the predicted deviations from the average Z-score,
\[
\hat Z(6) = \bar Z - \begin{pmatrix} \Delta\hat z(6)_B & \Delta\hat z(6)_B \\ \Delta\hat z(6)_D & \Delta\hat z(6)_D \end{pmatrix}
= \begin{pmatrix} -1.401 & -2.461 \\ 1.254 & -1.909 \end{pmatrix}
\iff
\hat P(6) = \begin{pmatrix} 0.919 & 0.074 & 0.007 \\ 0.105 & 0.867 & 0.028 \\ 0 & 0 & 1 \end{pmatrix}
\]
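Mapping the predicted Z-scores back to probabilities reverses the Z-score construction: apply the standard normal CDF to the thresholds and take differences. A sketch (names are mine; it reproduces P̂(6) above):

```python
# Sketch: transition probabilities from a Z-score matrix, inverting
# z_ij = Phi^{-1}(prob of class j or worse) by applying Phi.
from statistics import NormalDist

Phi = NormalDist().cdf

def transition_row(z_B, z_D):
    """Thresholds for 'B or worse' and 'D' -> probabilities (A, B, D)."""
    p_BD, p_D = Phi(z_B), Phi(z_D)
    return [1 - p_BD, p_BD - p_D, p_D]

Z6 = [[-1.401, -2.461], [1.254, -1.909]]   # predicted Z-hat(6)
P6 = [transition_row(*row) for row in Z6]
print([[round(p, 3) for p in row] for row in P6])
# [[0.919, 0.074, 0.007], [0.105, 0.867, 0.028]]
```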
Suppose we have estimated the risk premium parameter to be 0.9924 for A rated bonds and
0.9804 for B rated bonds. The resulting risk neutral transition matrix is then,
\[
\hat Q(6) = \begin{pmatrix} 0.912 & 0.072 & 0.015 \\ 0.104 & 0.850 & 0.046 \\ 0 & 0 & 1 \end{pmatrix}
\]
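Applying the risk premia row-wise reproduces Q̂(6) up to rounding; small last-digit differences arise because the rounded P̂(6) is used as input here.

```python
# Sketch: risk neutral matrix from the real world matrix, using
# q_ij = pi_i * p_ij for j != D and q_iD = 1 - sum of the others,
# per the PALM assumption in the text.
P6 = [[0.919, 0.074, 0.007], [0.105, 0.867, 0.028], [0.0, 0.0, 1.0]]
pi = [0.9924, 0.9804, 1.0]   # risk premium per current rating (D: none)

Q6 = []
for p_row, pi_i in zip(P6, pi):
    q_nondefault = [pi_i * p for p in p_row[:-1]]
    Q6.append(q_nondefault + [1 - sum(q_nondefault)])
print([[round(q, 3) for q in row] for row in Q6])
```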
D.5 Simulation
In this subsection we elaborate on how to actually generate scenarios for a bond portfolio by
means of simulation. The preceding example is used to do so.
²⁶ Since we have only three rating classes we omitted the common shift per year in order to avoid collinearity.
²⁷ At a 0.95 confidence level.
Now that we have a prediction of the risk neutral transition matrix we can, assuming a constant
flat interest rate r of 0.04, simulate the value of, for instance, a 3-year A-rated zero coupon
bond bought at t = 5. Suppose we would like to know the value of this bond at t = 6, $P_1(6,8)$.
In order to do this we need the two-year risk neutral transition matrix predicted at t = 6.
Because we assume the rating process to be a Markov chain,
\[
\hat Q(6,8) = \hat Q^2(6) = \begin{pmatrix} 0.839 & 0.127 & 0.032 \\ 0.183 & 0.730 & 0.087 \\ 0 & 0 & 1 \end{pmatrix}
\]
So $\hat q_{1K}(6,8) = 0.032$ and by (36), with recovery $\phi = 0.1$, we obtain
\[
P_1(6,8) = \frac{1 - 0.032 + 0.032 \cdot 0.1}{(1.04)^2} = 0.898
\]
Note that this is the value of the bond if it remains in the same rating class. However, if after
one year the bond is downgraded to B, the value would be
\[
P_1(6,8) = \frac{1 - 0.087 + 0.087 \cdot 0.1}{(1.04)^2} = 0.852
\]
In the worst case the bond is downgraded to D, i.e. it defaults, and is only worth the recovery
value 0.1. So, concluding,
\[
P_1(6,8) = \begin{cases} 0.898 & \text{w.p. } 0.70 \\ 0.852 & \text{w.p. } 0.24 \\ 0.1 & \text{w.p. } 0.06 \end{cases} \tag{39}
\]
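The two-year matrix and the resulting bond value per rating can be reproduced with a short sketch (numpy; the Markov squaring is as in the text):

```python
# Sketch: two-year risk neutral matrix as the square of the one-year
# matrix (Markov assumption), then the bond value per rating at t = 6.
import numpy as np

Q6 = np.array([[0.912, 0.072, 0.015],
               [0.104, 0.850, 0.046],
               [0.000, 0.000, 1.000]])
Q68 = Q6 @ Q6                      # Q-hat(6,8)

r, phi = 0.04, 0.1
values = [(1 - (1 - phi) * Q68[i, 2]) / (1 + r) ** 2 for i in range(2)]
values.append(phi)                 # defaulted: recovery value only
print(np.round(Q68, 3))            # matches Q-hat(6,8) above
print([round(v, 3) for v in values])  # [0.898, 0.852, 0.1]
```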
In this example we used a deterministic path for the macro economic variables, here only GDP
growth. Therefore we have only one probability distribution (39) for the value of a bond
portfolio. However, as stated earlier, PALM generates stochastic scenarios of the economic
variables, and therefore we get a different probability distribution in each scenario of the
economy. Notice that in this model transition probabilities are independent of future values
of the economy. So what we can do is, in each scenario, simulate the value of the bond portfolio
by drawing from the probability distribution of the bond portfolio in that particular scenario.
For this example we can draw a random number X, uniform between zero and one, and let
$P_1(6,8)$ be
\[
P_1(6,8) = \begin{cases} 0.898 & X \leq 0.70 \\ 0.852 & 0.70 < X \leq 0.94 \\ 0.1 & X > 0.94 \end{cases}
\]
If we do this for, say, 500 scenarios we get a good representation of the probability distribution.
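The drawing scheme can be sketched with inverse transform sampling; the seed is arbitrary and only fixes reproducibility.

```python
# Sketch: simulate the bond value by inverse transform sampling from
# the discrete distribution (39).
import random

def draw_bond_value(rng):
    x = rng.random()           # uniform on [0, 1)
    if x <= 0.70:
        return 0.898           # bond stays in A
    elif x <= 0.94:
        return 0.852           # downgraded to B
    return 0.1                 # defaulted

rng = random.Random(42)
draws = [draw_bond_value(rng) for _ in range(500)]
freq_A = draws.count(0.898) / len(draws)
print(round(freq_A, 2))        # close to 0.70
```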