
RESEARCH

Correlation Matrix with Block Structure and Efficient Sampling Methods

September 2010

Jinggang Huang*
Liming Yang, Senior Director, Quantitative Analytical Research Group, 212-438-1870, [email protected]

A version of this paper was published in the Journal of Computational Finance, September 2010.
* Work on this paper done while employed by Standard & Poor's.
Abstract
Random sampling from a multivariate normal distribution is essential for Monte Carlo simulations in many credit
risk models. For a portfolio of N obligors, standard methods usually require O(N^2) calculations to get one
random sample. In many applications, the correlation matrix has a block structure that, as we show, can be
converted to a "quasi-factor" model. As a result, the cost to get one sample can be reduced to O(N). Such a
conversion also enables us to check whether a user-defined "correlation" matrix is positive semidefinite and "fix"
it if necessary in an efficient manner.
Disclaimer: The models and analyses presented here are exclusively part of a quantitative research effort
intended to improve the computation time of Monte Carlo simulations when we deal with a correlation matrix that
has a block structure. The views expressed in this paper are the authors’ own and do not necessarily represent
the views of Standard & Poor’s. Furthermore, no inferences should be made with regard to Standard & Poor’s
credit ratings or any current or future criteria or models used in the ratings process for credit portfolios or any type
of financial security.
Table of Contents

Abstract ............................................. 2
1 Introduction ....................................... 3
2 Correlation Matrix With Block Structure ............ 3
3 Implied Factor Model For Some Special Cases ........ 4
4 A Quasi-Factor Model For General Cases ............. 6
5 Performance Of Our Method In Real Life Problems .... 11
6 Acknowledgments .................................... 13
1 Introduction
A typical portfolio or collateral pool of a structured deal usually consists of more than one obligor. It is well known
that the overall risk of a portfolio or a structured product depends not only on single-obligor risk, but also on
how the obligors are correlated with each other. Therefore, the correlation between obligors is an important component in
understanding the overall risk of a portfolio. In practice, two methods are generally used to specify the correlation
structure: using a factor model, or specifying the correlation matrix directly.
In a factor model, all obligors correlate to each other through some “common factors”. Mathematically, we
assume that X , an N × 1 vector of standard normal random variables, can be written as:
$$X = BF + \epsilon \qquad (1)$$

where B is a constant N × d matrix of factor loadings and the common factor F is a d × 1 vector of independent
standard normal random variables. ε is the idiosyncratic error term, an N × 1 vector of independent normal
random variables; the variance of each component of ε is chosen so that the corresponding component
of X has variance 1. Also, F and ε are independent of each other. It is clear that, in such a setup, once the factor
loading matrix B is given, the correlation matrix of X can be determined. In a factor model, the common factors
can be identified in advance (as in the observable factor model as described in McNeil et al. (2005)) or derived
from historical data (as in the latent factor model described in McNeil et al. (2005)).
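To make the cost claim concrete, the following sketch draws one sample of X = BF + ε with O(N·d) work. The sizes, the loadings, and the function name are invented for this illustration; they are not taken from the paper.

```python
import numpy as np

def sample_factor_model(B, rng):
    """One draw of X = B F + eps with Var(X_i) = 1; cost O(N * d)."""
    N, d = B.shape
    F = rng.standard_normal(d)                     # d common factors
    idio_sd = np.sqrt(1.0 - (B ** 2).sum(axis=1))  # idiosyncratic std devs
    eps = idio_sd * rng.standard_normal(N)         # independent error terms
    return B @ F + eps

rng = np.random.default_rng(0)
N, d = 1000, 3
B = rng.uniform(-0.5, 0.5, (N, d))                   # illustrative loadings
B *= 0.9 / np.linalg.norm(B, axis=1, keepdims=True)  # keep row norms below 1
X = sample_factor_model(B, rng)
```

Each additional draw reuses B, so a Monte Carlo run of S trials costs O(S·N·d) rather than the O(S·N^2) of a dense approach.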
For the other approach, the correlations between obligors are directly specified by analysts, who usually classify
the obligors into groups based on industries, sectors, countries, etc., and then determine the group-specific
correlations. As a result, the correlation matrix defined in this way has a block structure; that is, the correlation
between any two obligors is determined by the groups that they belong to. Examples of this approach can be
found in Standard & Poor's (2008) and Fitch Ratings (2005).
A detailed discussion of the pros and cons of each approach is beyond the scope of this paper, but we would
like to point out one (apparent) advantage of the factor approach over the correlation matrix approach. In a
factor model, a random sample can be obtained efficiently because of the assumption of independence given the
common factors; therefore, the cost of getting one random sample is proportional to N , the number of obligors
in the portfolio. In this case, Monte Carlo simulations can be performed quickly to get an estimate of the portfolio
loss. On the other hand, the cost of sampling from a multi-dimensional Gaussian distribution using a standard
method (e.g., the Cholesky factorization) is proportional to N^2. Since the estimation of portfolio loss is crucial
in practice, it is important to find ways to speed up random sampling.
In this paper, we show that, if the correlation matrix has a block structure, the model is equivalent to a “factor
model like” structure. Hence, the cost of performing the Monte Carlo calculation can be reduced to a level similar
to that of a factor model.
This paper is organized as follows. In Section 2, we introduce the concept of a correlation matrix with block
structure. In Section 3, we consider a special case where a model specified by a correlation matrix with block
structure can be transformed to a factor model. In Section 4, we consider an efficient sampling method for a
general block correlation matrix. In Section 5, we present examples that compare the performance of our method
with the standard approach.
2 Correlation Matrix With Block Structure
Assume that X = (X_1, X_2, ..., X_N) is a vector of Gaussian random variables and each X_i is a standard Gaussian
random variable with zero mean and unit standard deviation. Assume we can divide them into K groups (for
example, with respect to industry, region, size, or some combination of these criteria). We rewrite X as
X = (X^(1), X^(2), ..., X^(K))^T, where each X^(i) = (X_1^(i), ..., X_{N_i}^(i))^T is the vector of variables that belong to the same
group i. We consider the case where the correlation between two random variables depends only on the groups they
belong to. That is, for any two groups i, j (including the case i = j), there exists −1 ≤ ρ_{i,j} ≤ 1 such that, for any
two (non-identical¹) random variables X_m^(i), X_n^(j) from the two groups, respectively, we have the group-specific
correlation:

$$\rho\bigl(X_m^{(i)}, X_n^{(j)}\bigr) = \rho_{i,j}. \qquad (2)$$
The correlation matrix ΣX has the block structure, as shown below:

$$
\Sigma_X = \begin{pmatrix}
1 & \rho_{1,1} & \cdots & \rho_{1,1} & \rho_{1,2} & \cdots & \rho_{1,2} & \cdots & \rho_{1,K} & \cdots & \rho_{1,K}\\
\rho_{1,1} & 1 & \cdots & \rho_{1,1} & \rho_{1,2} & \cdots & \rho_{1,2} & \cdots & \rho_{1,K} & \cdots & \rho_{1,K}\\
\vdots & \vdots & \ddots & \vdots & \vdots & & \vdots & & \vdots & & \vdots\\
\rho_{1,1} & \rho_{1,1} & \cdots & 1 & \rho_{1,2} & \cdots & \rho_{1,2} & \cdots & \rho_{1,K} & \cdots & \rho_{1,K}\\
\rho_{2,1} & \rho_{2,1} & \cdots & \rho_{2,1} & 1 & \cdots & \rho_{2,2} & \cdots & \rho_{2,K} & \cdots & \rho_{2,K}\\
\vdots & \vdots & & \vdots & \vdots & \ddots & \vdots & & \vdots & & \vdots\\
\rho_{2,1} & \rho_{2,1} & \cdots & \rho_{2,1} & \rho_{2,2} & \cdots & 1 & \cdots & \rho_{2,K} & \cdots & \rho_{2,K}\\
\vdots & \vdots & & \vdots & \vdots & & \vdots & \ddots & \vdots & & \vdots\\
\rho_{K,1} & \rho_{K,1} & \cdots & \rho_{K,1} & \rho_{K,2} & \cdots & \rho_{K,2} & \cdots & 1 & \cdots & \rho_{K,K}\\
\vdots & \vdots & & \vdots & \vdots & & \vdots & & \vdots & \ddots & \vdots\\
\rho_{K,1} & \rho_{K,1} & \cdots & \rho_{K,1} & \rho_{K,2} & \cdots & \rho_{K,2} & \cdots & \rho_{K,K} & \cdots & 1
\end{pmatrix} \qquad (3)
$$
where the K submatrices on the diagonal are of dimensions N1 × N1 , N2 × N2 ,... NK × NK .
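For experimentation, ΣX can be assembled directly from the group-level correlations and the group sizes. The helper below is ours (the function name and the three-group values are purely illustrative): it fills block (i, j) with ρ_{i,j} and resets the diagonal to 1.

```python
import numpy as np

def block_corr(rho, sizes):
    """Sigma_X[a, b] = rho[g(a), g(b)] where g maps obligors to groups;
    the diagonal is reset to 1."""
    groups = np.repeat(np.arange(len(sizes)), sizes)  # group index per obligor
    sigma = np.asarray(rho, dtype=float)[np.ix_(groups, groups)]
    np.fill_diagonal(sigma, 1.0)
    return sigma

rho = [[0.3, 0.1, 0.1],
       [0.1, 0.25, 0.05],
       [0.1, 0.05, 0.2]]          # illustrative group-level correlations
sigma = block_corr(rho, [2, 3, 4])
```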
To sample from N(0, ΣX) using the standard Cholesky factorization requires O(N^2) calculations per sample.
However, we can exploit the block structure present in ΣX to reduce the cost of sampling. Below, we propose
two methods: the implied factor model and a "quasi-factor" model. The former approach reduces the problem to a
factor model that is easy to implement, but it only works in some special cases. The latter method needs more
effort to implement but can be applied to any case where ΣX has block structure and is positive semidefinite.
Since the correlations are usually specified by the user, it may happen that ΣX is not positive semidefinite (as a
result, the sampling problem is not well defined). We present efficient methods to detect and fix such problems.
3 Implied Factor Model For Some Special Cases
As we mentioned in the introduction, one advantage of factor models is that sampling can be carried out efficiently
(with O(N) calculations). Therefore, it is desirable to convert (if possible) a multivariate Gaussian distribution to
a factor model. We note that, in Andersen et al. (2003), the authors propose an iteration procedure to find a
¹ In the identical case, i.e. when i = j and m = n, we have, of course, ρ(X_m^(i), X_n^(j)) = 1.
factor model approximation of a general Gaussian distribution. In this paper, we focus on a correlation matrix
that has a block structure and derive a factor model (or a “quasi-factor” model) that exactly replicates the original
distribution. Moreover, compared to the procedure of Andersen et al. (2003), our approach is much faster because
we do not need any iteration and because most of the matrix operations are performed on much smaller “group”
level matrices.
For Gaussian distributions with block structure, we show in this section that, for some special cases, we can
derive a factor model that induces the same distribution.
We define the K × K "group level correlation matrix"² R = (ρ_{i,j}), where ρ_{i,j} is as defined in Equation (2) for
i, j = 1, 2, ..., K.
Suppose that R is positive semidefinite. Then we can find a "square root" G of R, i.e.:

$$G\,G^T = R \qquad (4)$$
by using the Cholesky factorization. Now we consider the following factor model. For X_m^(i), the m-th
random variable from group i, let:

$$X_m^{(i)} = \sum_{j=1}^{K} G(i,j)\,\epsilon_j + \sqrt{1-\rho_{i,i}}\;\eta_m^{(i)} \qquad (5)$$

where ε_j, j = 1, 2, ..., K, and all the η_m^(i) are independent standard normal random variables. It is easy to check
that the following result is true.
Proposition 1. If R is positive-semidefinite, then the factor model defined in equation (5) induces the correlation
matrix ΣX .
Hence we have converted the original model (specified by a correlation matrix) to a factor model with K common
factors. Notice that, for each group, the "factor loadings" of every obligor in that group are all the same. It is easy
to see that a random sample from this factor model needs O(N + K^2) calculations.
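A minimal sketch of the implied factor model, assuming R is positive semidefinite (the group sizes, the R values, and the function name are ours and purely illustrative):

```python
import numpy as np

R = np.array([[0.3, 0.1, 0.1],
              [0.1, 0.25, 0.05],
              [0.1, 0.05, 0.2]])   # R[i, i] is the intra-group correlation
sizes = [4, 3, 5]
G = np.linalg.cholesky(R)          # "square root": G @ G.T == R, as in (4)

def sample_implied_factor(G, R, sizes, rng):
    """One draw from the factor model (5): O(N + K^2) work."""
    K = len(sizes)
    f = rng.standard_normal(K)                   # K common factors
    parts = []
    for i in range(K):
        loading = G[i] @ f                       # identical for all of group i
        eta = rng.standard_normal(sizes[i])      # idiosyncratic terms
        parts.append(loading + np.sqrt(1.0 - R[i, i]) * eta)
    return np.concatenate(parts)

x = sample_implied_factor(G, R, sizes, np.random.default_rng(0))
```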
Notice that, in order for this method to work, we have to assume that R is positive semidefinite, which might not
be true even when the original "full" matrix ΣX is positive definite. The following case shows such an example.
Consider the following correlation matrix with block structure:

$$
\Sigma_X = \begin{pmatrix}
1.0 & 0.1 & 0.099 & 0.099 & 0.08 & 0.08\\
0.1 & 1.0 & 0.099 & 0.099 & 0.08 & 0.08\\
0.099 & 0.099 & 1.0 & 0.1 & 0.06 & 0.06\\
0.099 & 0.099 & 0.1 & 1.0 & 0.06 & 0.06\\
0.08 & 0.08 & 0.06 & 0.06 & 1.0 & 0.1\\
0.08 & 0.08 & 0.06 & 0.06 & 0.1 & 1.0
\end{pmatrix}.
$$
The matrix is positive definite because each diagonal entry is greater than the sum of all other entries in the same
row. Now we consider the corresponding group level "correlation matrix":

$$
R = \begin{pmatrix}
0.1 & 0.099 & 0.08\\
0.099 & 0.1 & 0.06\\
0.08 & 0.06 & 0.1
\end{pmatrix}.
$$
The determinant of this matrix is negative (approximately −2.97 × 10⁻⁵), indicating that it is not positive semidefinite.
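The example is easy to verify numerically; the snippet below (variable names are ours) confirms that the 6 × 6 matrix is positive definite while R has a negative eigenvalue:

```python
import numpy as np

# The group-level matrix R from the example above, with two obligors per group.
R = np.array([[0.1, 0.099, 0.08],
              [0.099, 0.1, 0.06],
              [0.08, 0.06, 0.1]])
groups = np.repeat([0, 1, 2], 2)
sigma = R[np.ix_(groups, groups)].copy()
np.fill_diagonal(sigma, 1.0)        # the 6 x 6 Sigma_X of the example

det_R = np.linalg.det(R)                         # negative: R is not PSD
min_eig_sigma = np.linalg.eigvalsh(sigma).min()  # positive: Sigma_X is PD
```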
² Notice that, in general, ρ_{i,i} ≠ 1, because it is the intra-group correlation, not the correlation of an entity with itself, so R is not really a
correlation matrix in the usual sense.
On the other hand, the correlation matrix ΣX itself is not always positive semidefinite when it is defined using
group-level correlations.³ Therefore, it is important to check whether ΣX is positive semidefinite to make sure
the model is well defined. An ordinary method involves calculating all the eigenvalues to see whether they are all
non-negative; however, such a method needs O(N^3) calculations, which is expensive when N is large. When ΣX
has the block structure, this task can be performed more efficiently using the much smaller R matrix with O(K^3)
calculations, as the following result indicates.
Proposition 2. ΣX is positive semidefinite if and only if the K × K matrix

$$
R + \begin{pmatrix}
\frac{1}{N_1}(1-\rho_{1,1}) & 0 & \cdots & 0\\
0 & \frac{1}{N_2}(1-\rho_{2,2}) & \cdots & 0\\
\vdots & \vdots & \ddots & \vdots\\
0 & 0 & \cdots & \frac{1}{N_K}(1-\rho_{K,K})
\end{pmatrix} \qquad (6)
$$

is positive semidefinite.
We will prove this conclusion after Proposition 5. This result can also be used by analysts when they determine
the correlation structure. For example, if they want to make sure that the correlation numbers they specify always
lead to a well-defined distribution (i.e. ΣX is positive semidefinite) regardless of the number of obligors in each
group, then they should make sure that R is positive semidefinite, because as the N_k → ∞, the matrix
defined in (6) converges to R.
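Proposition 2 turns the positive-semidefiniteness check into a K × K eigenvalue problem. A sketch (the helper name is ours; the example reuses the three-group R from the previous section):

```python
import numpy as np

def block_psd_check(R, sizes, tol=1e-12):
    """Proposition 2: Sigma_X is PSD iff R + diag((1 - rho_kk) / N_k) is.
    Cost O(K^3) regardless of the portfolio size N."""
    R = np.asarray(R, dtype=float)
    shifted = R + np.diag((1.0 - np.diag(R)) / np.asarray(sizes, dtype=float))
    return np.linalg.eigvalsh(shifted).min() >= -tol

R = [[0.1, 0.099, 0.08],
     [0.099, 0.1, 0.06],
     [0.08, 0.06, 0.1]]
ok_small = block_psd_check(R, [2, 2, 2])           # the 6 x 6 example: PSD
ok_large = block_psd_check(R, [1000, 1000, 1000])  # large groups: not PSD
```

The same R passes the check with two obligors per group but fails with 1,000 per group, consistent with the remark above that the matrix in (6) converges to the non-PSD R as the group sizes grow.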
4 A Quasi-Factor Model For General Cases
The factor model approach described in the previous section only works if the group level "correlation matrix" R is
positive-semidefinite. In this section, we study the more general cases where we only require that the ΣX matrix
is well defined (i.e. positive-semidefinite). The method we adopt here can be summarized as follows: We derive
an eigenvector decomposition of ΣX and exploit patterns shown in the decomposition to get an efficient random
sampling.
First, we present some simple results that lead to the decomposition. For each k = 1, 2, ..., K, let V_k be the
subspace of R^N whose elements are of the following form:

$$\hat{x} = (\underbrace{0, \dots, 0}_{N_1}, \underbrace{0, \dots, 0}_{N_2}, \dots, \underbrace{0, \dots, 0}_{N_{k-1}}, \underbrace{x_1, x_2, \dots, x_{N_k}}_{N_k}, \underbrace{0, \dots, 0}_{N_{k+1}}, \dots, \underbrace{0, \dots, 0}_{N_K})^T \qquad (7)$$

where

$$x_1 + x_2 + \dots + x_{N_k} = 0.$$

Let V_0 be the subspace of R^N whose elements are of the following form:

$$\hat{x} = (\underbrace{x_1, x_1, \dots, x_1}_{N_1}, \underbrace{x_2, x_2, \dots, x_2}_{N_2}, \dots, \underbrace{x_K, x_K, \dots, x_K}_{N_K})^T. \qquad (8)$$
³ If ΣX is derived, for example, empirically from a complete set of historical data, then it is positive semidefinite. However, in practice, ΣX is
usually defined directly and is not always positive semidefinite. More details are given in the following sections in regard to this.
Let x = (x_1, x_2, ..., x_K)^T; we define the mapping P: R^K → R^N by

$$P(x) = \frac{\hat{x}}{\|\hat{x}\|_2}$$

where x̂ is as in (8).
The following lemma can be easily proved.
Lemma 1. All the subspaces V_k, k = 0, 1, ..., K, are perpendicular to each other, and they span R^N, i.e.:

$$\mathbb{R}^N = V_0 \oplus V_1 \oplus \dots \oplus V_K.$$

Each V_k is an invariant subspace of ΣX, i.e.:

$$\Sigma_X V_k \subset V_k,$$

and for each k ≠ 0, V_k consists of eigenvectors of ΣX with eigenvalue (1 − ρ_{k,k}), i.e., for each v ∈ V_k, we have ΣX v = (1 − ρ_{k,k})v.

Now we consider the following two different cases: V_k, k ≠ 0, and V_0.
For the case V_k, k ≠ 0, to get an orthonormal basis of V_k, we define a matrix F_m of dimension m × (m − 1) for
any integer m > 1:

$$
F_m = \begin{pmatrix}
\frac{1}{\sqrt{1\cdot 2}} & \frac{1}{\sqrt{2\cdot 3}} & \frac{1}{\sqrt{3\cdot 4}} & \cdots & \frac{1}{\sqrt{(m-1)m}}\\
-\frac{1}{\sqrt{1\cdot 2}} & \frac{1}{\sqrt{2\cdot 3}} & \frac{1}{\sqrt{3\cdot 4}} & \cdots & \frac{1}{\sqrt{(m-1)m}}\\
0 & -\frac{2}{\sqrt{2\cdot 3}} & \frac{1}{\sqrt{3\cdot 4}} & \cdots & \frac{1}{\sqrt{(m-1)m}}\\
0 & 0 & -\frac{3}{\sqrt{3\cdot 4}} & \cdots & \frac{1}{\sqrt{(m-1)m}}\\
\vdots & \vdots & \vdots & \ddots & \vdots\\
0 & 0 & 0 & \cdots & -\frac{m-1}{\sqrt{(m-1)m}}
\end{pmatrix}. \qquad (9)
$$
It is straightforward to check that

$$F_m^T F_m = I_{m-1} \qquad (10)$$

where I_{m−1} is the identity matrix of dimension (m − 1) × (m − 1).
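The matrix F_m of (9) is cheap to construct, and property (10), together with the zero column sums that place its columns inside V_k, is easy to confirm (a sketch; the function name is ours):

```python
import numpy as np

def make_F(m):
    """The m x (m - 1) matrix of (9): column j holds j copies of
    1/sqrt(j(j+1)), then -j/sqrt(j(j+1)), then zeros."""
    F = np.zeros((m, m - 1))
    for j in range(1, m):
        c = 1.0 / np.sqrt(j * (j + 1))
        F[:j, j - 1] = c
        F[j, j - 1] = -j * c
    return F

F5 = make_F(5)
```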
For any k ≠ 0, we define

$$
U_k = \begin{pmatrix}
0_{N_1 \times (N_k - 1)}\\
\vdots\\
0_{N_{k-1} \times (N_k - 1)}\\
F_{N_k}\\
0_{N_{k+1} \times (N_k - 1)}\\
\vdots\\
0_{N_K \times (N_k - 1)}
\end{pmatrix}
$$

where 0_{i×j} is the zero matrix of dimension i × j. It is straightforward to check that each column v of U_k is a
unit vector (whose L2-norm is 1) in V_k and all the columns of U_k are orthogonal to each other. So, by definition,
the columns of U_k form an orthonormal basis of the subspace V_k. Note that each column of U_k is an eigenvector
of ΣX with respect to the eigenvalue 1 − ρ_{k,k}, but these eigenvectors depend only on the size N_k and have
nothing to do with any of the group correlations ρ_{i,j}.
Next, we discuss an efficient method to derive the eigenvectors from V_0. Formally, we are looking for vectors
v ∈ V_0 that are eigenvectors of ΣX, i.e.,

$$\Sigma_X v = \lambda v \qquad (11)$$

where v is of the form (8) and λ is some real number. For that purpose, we define

$$
B = \begin{pmatrix}
1 + (N_1 - 1)\rho_{1,1} & N_2\rho_{1,2} & N_3\rho_{1,3} & \cdots & N_K\rho_{1,K}\\
N_1\rho_{2,1} & 1 + (N_2 - 1)\rho_{2,2} & N_3\rho_{2,3} & \cdots & N_K\rho_{2,K}\\
N_1\rho_{3,1} & N_2\rho_{3,2} & 1 + (N_3 - 1)\rho_{3,3} & \cdots & N_K\rho_{3,K}\\
\vdots & \vdots & \vdots & \ddots & \vdots\\
N_1\rho_{K,1} & N_2\rho_{K,2} & N_3\rho_{K,3} & \cdots & 1 + (N_K - 1)\rho_{K,K}
\end{pmatrix}.
$$
It is straightforward to check that (11) holds if and only if

$$B(x_1, x_2, \dots, x_K)^T = \lambda (x_1, x_2, \dots, x_K)^T \qquad (12)$$

which means that we can reduce problem (11) from the N-dimensional space to a K-dimensional space. We
restate this result in the following:

Lemma 2. x ∈ R^K is an eigenvector of B with respect to an eigenvalue λ if and only if P(x) is an eigenvector
of ΣX with respect to eigenvalue λ.
If the eigenvalues of B are all different, then the K eigenvectors P(x_k) of ΣX, mapped from the K eigenvectors
x_k (k = 1, 2, ..., K) of B, are automatically orthogonal to each other.⁴ On the other hand, if for some
eigenvalue λ_k of B there are multiple eigenvectors v_1, v_2, ..., v_m, then P(v_1), P(v_2), ..., P(v_m) may not be
orthogonal to each other, but we can use the Gram-Schmidt process to orthogonalize them.
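Lemma 2 can be checked numerically in a few lines (a sketch with illustrative R and group sizes; all names are ours): expand an eigenvector x of B by repeating x_k over group k, normalize, and verify the eigenvector relation for ΣX.

```python
import numpy as np

R = np.array([[0.3, 0.1, 0.1],
              [0.1, 0.25, 0.05],
              [0.1, 0.05, 0.2]])
sizes = np.array([4, 3, 5])
groups = np.repeat(np.arange(3), sizes)

sigma = R[np.ix_(groups, groups)].copy()       # the block Sigma_X
np.fill_diagonal(sigma, 1.0)

# B(i, j) = N_j * rho_ij off the diagonal, 1 + (N_i - 1) * rho_ii on it.
B = R * sizes[None, :]
np.fill_diagonal(B, 1.0 + (sizes - 1) * np.diag(R))

lam, vecs = np.linalg.eig(B)                   # B is not symmetric in general
lam0, x = lam[0].real, vecs[:, 0].real
v = x[groups].astype(float)                    # expand x over the groups ...
v /= np.linalg.norm(v)                         # ... and normalize: P(x)
```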
We have shown that the problem of decomposing ΣX (with a cost of O(N^3)) can be reduced to the problem of
decomposing B (with a cost of O(K^3)), and we summarize this in the following result.

Proposition 3. ΣX has the following eigenvector decomposition:

$$\Sigma_X = U D U^T$$

where U is an orthogonal matrix and D is a diagonal matrix. Furthermore,

$$
U = \begin{pmatrix}
G & \begin{matrix}
F_{N_1} & 0 & \cdots & 0\\
0 & F_{N_2} & \cdots & 0\\
\vdots & \vdots & \ddots & \vdots\\
0 & 0 & \cdots & F_{N_K}
\end{matrix}
\end{pmatrix} \qquad (13)
$$

where each F_{N_k} is as defined in (9) and G is an N × K matrix, each column of G equal to P(x) with x an
eigenvector of B. The first K entries on the main diagonal of D are the eigenvalues of B; beginning from the
(K + 1)-th entry on the main diagonal, we have the following eigenvalues: 1 − ρ_{1,1} with multiplicity N_1 − 1, 1 − ρ_{2,2}
with multiplicity N_2 − 1, ..., 1 − ρ_{K,K} with multiplicity N_K − 1.
⁴ This is a known result of linear algebra. The proof is actually very simple: suppose ΣX v_1 = λ_1 v_1 and ΣX v_2 = λ_2 v_2, where λ_1 ≠ λ_2;
then λ_1⟨v_1, v_2⟩ = ⟨ΣX v_1, v_2⟩ = ⟨v_1, ΣX^T v_2⟩ = ⟨v_1, ΣX v_2⟩ = λ_2⟨v_1, v_2⟩, hence ⟨v_1, v_2⟩ = 0.
Once we have the decomposition presented in Proposition 3, a sample can be obtained by calculating:

$$\tilde{X} = U\sqrt{D}\,\epsilon \qquad (14)$$

where ε is an N × 1 vector of independent standard normal random variables. Note that in (14), the first K factors
(ε_1, ..., ε_K) affect all variables, the next N_1 − 1 factors (ε_{K+1}, ..., ε_{K+N_1−1}) affect only variables in group 1, the
next N_2 − 1 factors affect only variables in group 2, and so on. Such a structure bears similarities to a factor
model; the difference is that, in factor models, we have one "idiosyncratic" factor for each obligor, whereas in (14) we
have N_k − 1 "idiosyncratic" factors for each group k. Because of such similarities, we call the model in (14) a
quasi-factor model. So Proposition 3 indicates that a model with a block correlation matrix can be converted to a
quasi-factor model. The following result is just a simple observation, but we state it as a proposition due to its
importance.
Proposition 4. A sample from the quasi-factor model (14) (and therefore from any model with block correlation
structure) can be obtained with O(K^2) + O(N) steps.

Proof. First, since D is a diagonal matrix, √D ε can be calculated in O(N) steps. In (13), the G matrix has
the same rows within each group, i.e., the first N_1 rows are all the same, the next N_2 rows are all the same, and
so on; hence the multiplication of G with the first K entries of √D ε can be carried out with O(K × K) calculations.
Now, for each group k, we need to calculate F_{N_k} ε_k, where ε_k is the sub-vector of ε corresponding to the
sub-matrix F_{N_k} in (13). By the definition of F_{N_k} in (9), any two adjacent rows of F_{N_k} are the same except for
(at most) two entries. Therefore, given F_{N_k}(i, :) ε_k (where F_{N_k}(i, :) is the i-th row of F_{N_k}), we only need a
fixed number of steps to get F_{N_k}(i − 1, :) ε_k. Therefore, F_{N_k} ε_k can be computed in O(N_k) steps. So the
total number of steps needed to get one sample from the quasi-factor model (14) is O(K^2) + O(N). □
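The full sampling pipeline is short enough to sketch end to end. The code below (illustrative R and group sizes; all names are ours) eigen-decomposes the K × K matrix of Proposition 5, then maps a standard normal vector ε to X = U√D ε without forming any N × N matrix; the F_{N_k} multiplication uses the adjacent-rows observation from the proof, implemented with suffix sums.

```python
import numpy as np

R = np.array([[0.3, 0.1, 0.1],
              [0.1, 0.25, 0.05],
              [0.1, 0.05, 0.2]])
sizes = np.array([40, 30, 50])
K, N = len(sizes), int(sizes.sum())
Msqrt = np.sqrt(sizes.astype(float))

# Eigen-decompose the symmetrized K x K matrix of Proposition 5.
B_tilde = np.sqrt(np.outer(sizes, sizes)) * R
np.fill_diagonal(B_tilde, 1.0 + (sizes - 1) * np.diag(R))
lam, Y = np.linalg.eigh(B_tilde)
G_group = Y / Msqrt[:, None]      # group-level rows of G (the P(x) values)

def F_apply(eps):
    """y = F_m @ eps in O(m) via suffix sums (eps has length m - 1)."""
    m = len(eps) + 1
    j = np.arange(1, m)
    c = eps / np.sqrt(j * (j + 1))
    suffix = np.zeros(m)
    suffix[:m - 1] = np.cumsum(c[::-1])[::-1]
    y = np.empty(m)
    y[0] = suffix[0]
    y[1:] = suffix[1:] - j * c
    return y

def quasi_factor_apply(eps):
    """X = U sqrt(D) eps from (14), in O(K^2 + N)."""
    common = G_group @ (np.sqrt(lam) * eps[:K])    # K common factors
    x = np.repeat(common, sizes)
    pos, start = K, 0
    for k in range(K):
        m = sizes[k]
        x[start:start + m] += np.sqrt(1.0 - R[k, k]) * F_apply(eps[pos:pos + m - 1])
        pos += m - 1
        start += m
    return x

x = quasi_factor_apply(np.random.default_rng(0).standard_normal(N))
```

Applying the map to the N unit vectors recovers the full implied covariance, which doubles as a correctness check that it equals the block ΣX exactly.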
Note that the cost of getting one random sample using the quasi-factor model is of the same order as that of
the implied factor model of Section 3 (when, of course, R is positive semidefinite). It is natural to ask whether
the sampling time can be further improved. This seems hopeless if we want to exactly replicate ΣX using a
(quasi-) factor model. On the other hand, in practice, it might suffice to approximate ΣX using a smaller number
of common factors. As we mentioned earlier, Andersen et al. (2003) proposed an approximation method that
converts a general full matrix ΣX to a factor model. As we noted earlier, this approach might be slow when
the portfolio is large. It is interesting to see whether we can combine the technique developed here with that of
Andersen et al. (2003) to come up with an efficient approximation procedure when ΣX has block structure. This
is research that we are currently undertaking.
By Lemma 2, we can always find a set of real eigenvectors of B that span R^K. On the other hand, B is not
symmetric, and our experience shows that ordinary numerical routines often produce complex eigenvectors. So
it is interesting to see whether we can transform problem (12) into another problem that involves only a
symmetric matrix. That can be done by the following:
Proposition 5. Define a symmetric matrix

$$
\tilde{B} = \begin{pmatrix}
1 + (N_1 - 1)\rho_{1,1} & \sqrt{N_1 N_2}\,\rho_{1,2} & \sqrt{N_1 N_3}\,\rho_{1,3} & \cdots & \sqrt{N_1 N_K}\,\rho_{1,K}\\
\sqrt{N_2 N_1}\,\rho_{2,1} & 1 + (N_2 - 1)\rho_{2,2} & \sqrt{N_2 N_3}\,\rho_{2,3} & \cdots & \sqrt{N_2 N_K}\,\rho_{2,K}\\
\sqrt{N_3 N_1}\,\rho_{3,1} & \sqrt{N_3 N_2}\,\rho_{3,2} & 1 + (N_3 - 1)\rho_{3,3} & \cdots & \sqrt{N_3 N_K}\,\rho_{3,K}\\
\vdots & \vdots & \vdots & \ddots & \vdots\\
\sqrt{N_K N_1}\,\rho_{K,1} & \sqrt{N_K N_2}\,\rho_{K,2} & \sqrt{N_K N_3}\,\rho_{K,3} & \cdots & 1 + (N_K - 1)\rho_{K,K}
\end{pmatrix}
$$

and let M = diag(√N_1, √N_2, ..., √N_K); then for x ∈ R^K, Bx = λx if and only if B̃[Mx] = λ[Mx].

Proof. It is straightforward to check that B = M^{-1}B̃M, so Bx = λx if and only if M^{-1}B̃Mx = λx, if and
only if B̃Mx = λMx. □
So we do not have to solve problem (12) directly; instead, we can solve the following:

$$\tilde{B} y = \lambda y$$

and we get the solution of (12) by taking x = M^{-1} y.
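In code, this means a symmetric eigensolver can be used throughout (a sketch with illustrative data; names are ours):

```python
import numpy as np

R = np.array([[0.3, 0.1, 0.1],
              [0.1, 0.25, 0.05],
              [0.1, 0.05, 0.2]])
sizes = np.array([4, 3, 5], dtype=float)

B = R * sizes[None, :]                          # the nonsymmetric B of (12)
np.fill_diagonal(B, 1.0 + (sizes - 1) * np.diag(R))

B_tilde = np.sqrt(np.outer(sizes, sizes)) * R   # symmetric, same eigenvalues
np.fill_diagonal(B_tilde, 1.0 + (sizes - 1) * np.diag(R))

lam, Y = np.linalg.eigh(B_tilde)                # real, orthonormal output
X = Y / np.sqrt(sizes)[:, None]                 # x = M^{-1} y solves (12)
```

Symmetric routines such as eigh are guaranteed to return real eigenvalues and orthonormal eigenvectors, which is exactly what the construction of G in Proposition 3 needs.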
As a by-product of Propositions 3 and 5, we can now prove Proposition 2 from the previous section.

Proof of Proposition 2. By Proposition 3, ΣX is positive semidefinite if and only if the eigenvalues of B are all
non-negative, which in turn is true if and only if B̃ is positive semidefinite, by Proposition 5. On the other hand,
the matrix defined in (6) equals

$$
\mathrm{diag}\Bigl(\tfrac{1}{\sqrt{N_1}}, \tfrac{1}{\sqrt{N_2}}, \dots, \tfrac{1}{\sqrt{N_K}}\Bigr)\; \tilde{B}\; \mathrm{diag}\Bigl(\tfrac{1}{\sqrt{N_1}}, \tfrac{1}{\sqrt{N_2}}, \dots, \tfrac{1}{\sqrt{N_K}}\Bigr),
$$

i.e. M^{-1}B̃M^{-1}. Hence ΣX is positive semidefinite if and only if the matrix of (6) is positive semidefinite. □
So far, we have assumed that the model with block structure (3) is well defined, i.e., ΣX is positive semidefinite.
But in practice, it is possible that the user-defined ΣX is not positive semidefinite. For the remainder
of this section, we consider the practical problem of "fixing" ΣX when it is not positive semidefinite. Methods for
"fixing" a general 'correlation' matrix can be found in Rebonato and Jäckel (2000); here we focus on more efficient
methods when the correlation matrix has a block structure.
First, we present a common method for "fixing" ΣX when it is not positive definite.⁵ Consider any eigenvector
decomposition⁶

$$\Sigma_X = \tilde{U}\tilde{D}\tilde{U}^T \qquad (15)$$
where Ũ is an orthogonal matrix (i.e. ŨŨ^T = I) and D̃ is a diagonal matrix:

$$
\tilde{D} = \begin{pmatrix}
\lambda_1 & 0 & 0 & \cdots & 0\\
0 & \lambda_2 & 0 & \cdots & 0\\
0 & 0 & \lambda_3 & \cdots & 0\\
\vdots & \vdots & \vdots & \ddots & \vdots\\
0 & 0 & 0 & \cdots & \lambda_N
\end{pmatrix}. \qquad (16)
$$

Each λ_i is an eigenvalue of ΣX, and Ũ(:, i), the i-th column of Ũ, is the corresponding eigenvector, i.e.:

$$\Sigma_X \tilde{U}(:, i) = \lambda_i \tilde{U}(:, i).$$
If ΣX is specified through user-defined ρ_{i,j}, it is possible that ΣX is not positive definite, i.e., one or more of
the eigenvalues λ_i might be negative. We can fix this problem by setting all the negative λ_i's in (16) to 0 to get
a "fixed" D̃ and using (15) (with the "fixed" D̃ matrix) to get a "new" positive semidefinite ΣX. Of course, to use
such a method to fix ΣX, we need to find the decomposition (15), which in general incurs a cost of O(N^3)
calculations. Our question now is: can we achieve that efficiently? The answer is yes, because the "specific"
decomposition described in Proposition 3 can be carried out with O(K^3) calculations instead of O(N^3). The
only technical detail left is: will this "fixing" procedure using the specific decomposition of Proposition 3 lead to the
same positive semidefinite ΣX as that induced by any decomposition (15)? For that we have the following result.
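The clipping step itself is only a few lines. The sketch below applies it directly to a small, deliberately indefinite block matrix (two groups of two with ρ_11 = ρ_22 = 0.1 and ρ_12 = 0.9; these values and the helper name are ours, chosen only to trigger the fix). In the method above, the same clipping would be applied to the K × K eigenvalue problem instead of the full N × N one.

```python
import numpy as np

def spectral_fix(mat):
    """Zero out negative eigenvalues and rebuild U D_new U^T."""
    lam, U = np.linalg.eigh(mat)
    return (U * np.maximum(lam, 0.0)[None, :]) @ U.T

rho = np.array([[0.1, 0.9],
                [0.9, 0.1]])
groups = np.repeat([0, 1], 2)
sigma = rho[np.ix_(groups, groups)].astype(float)
np.fill_diagonal(sigma, 1.0)      # 4 x 4, not positive semidefinite

fixed = spectral_fix(sigma)
```

Note that clipping perturbs the unit diagonal slightly; Rebonato and Jäckel (2000) discuss how to rescale afterwards.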
⁵ This is the "spectral decomposition" method introduced in Rebonato and Jäckel (2000). Other methods for "fixing" can also be found in
the same paper. In this research, we focus on developing an efficient "spectral decomposition" method, but the general idea of reducing the
complexity of matrix operations can also be applied to other "fixing" methods when the correlation matrix has block structure.
⁶ In general, the eigenvector decomposition is not unique, so we use Ũ instead of U to differentiate an arbitrary eigenvector decomposition
from that of Proposition 3.
Proposition 6. The resulting positive-semidefinite ΣX of the aforementioned fixing procedure does not depend
on the choice of Ũ .
Proof. Write ΣX = ŨD̃Ũ^T. Suppose there is a negative eigenvalue λ < 0 and, without loss of generality, assume
λ_1 = λ_2 = ... = λ_m = λ < 0, with no other eigenvalues equal to λ. We "fix" this eigenvalue λ by setting it to 0,
ending up with a new diagonal matrix D̃_new and, as a result, a new Σ̃X = ŨD̃_newŨ^T. It is simple to check that

$$\tilde{\Sigma}_X = \Sigma_X - \lambda\,\tilde{U}(:, 1{:}m)\,\tilde{U}(:, 1{:}m)^T$$

where Ũ(:, 1:m) denotes the first m columns of Ũ. Assume we have another decomposition

$$\Sigma_X = \hat{U}\hat{D}\hat{U}^T.$$

The columns Û(:, 1), Û(:, 2), ..., Û(:, m) of Û span the subspace of R^N that consists of all λ-eigenvectors of
ΣX, the same space as that spanned by Ũ(:, 1), Ũ(:, 2), ..., Ũ(:, m); in other words, each of the two sets is an
orthonormal basis of the same subspace. Therefore

$$\hat{U}(:, 1{:}m) = \tilde{U}(:, 1{:}m)\,A$$

where A is an m × m orthogonal matrix. So the fixing procedure using Û results in the following covariance matrix:

$$\Sigma_X - \lambda\,\hat{U}(:, 1{:}m)\,\hat{U}(:, 1{:}m)^T = \Sigma_X - \lambda\,\tilde{U}(:, 1{:}m)\,A A^T\,\tilde{U}(:, 1{:}m)^T = \Sigma_X - \lambda\,\tilde{U}(:, 1{:}m)\,\tilde{U}(:, 1{:}m)^T.$$

So the fixing procedure is independent of the choice of Ũ. So far, we have only "fixed" one negative eigenvalue;
we can repeat this procedure to fix all negative eigenvalues and obtain a positive semidefinite ΣX that does not
depend on the choice of Ũ. □
5 Performance Of Our Method In Real Life Problems
We have performed tests comparing Monte Carlo simulations using both the standard Cholesky factorization
and the quasi-factor model method. All tests were carried out on Intel Duo CPU 2.33 GHz, 1.95
GB RAM systems. The programming language used is C++, together with the numerical package provided
by the Numerical Algorithms Group (NAG). Tests are performed for two large loan portfolios, whose size and group
information is presented in Table 1. The group-specific correlations, i.e., the ρ_{i,j}'s, are determined by assigning
intra-group (i = j) and inter-group (i ≠ j) correlations to each pair of groups according to their regions and
sectors. Some statistics of the correlations are presented in Table 2.
To prepare for the Monte Carlo simulation using the standard method, we perform a Cholesky factorization
using the "Numerical Recipes" algorithm. If the correlation matrix is not positive semidefinite, this factorization fails,
and we perform the "fixing" procedure described in the previous sections. The eigenvalue/eigenvector computations
are carried out using NAG's nag_real_symm_eigenvalues (f02aac) function. After the "fixing" procedure, we
perform the Cholesky factorization again. Similarly, to prepare for the Monte Carlo simulation using the
quasi-factor model we proposed, we perform eigenvalue/eigenvector computations (using nag_real_symm_eigenvalues) of
the matrix in Proposition 5 and "fix" it if there are negative eigenvalues.

Table 1: Portfolios

Portfolio ID | # of Obligors | # of Groups | # of obligors in each group (min / max / mean)
1            | 5,134         | 101         | 1 / 428 / 51
2            | 10,929        | 123         | 1 / 1,083 / 89

Table 2: Statistics of group correlations

Portfolio ID | intra-group (min / max / mean) | inter-group (min / max / mean)
1            | 0 / 0.5 / 0.20                 | 0 / 0.35 / 0.004
2            | 0 / 0.5 / 0.19                 | 0 / 0.35 / 0.003

Table 3 shows the time needed for the preparation. Note that the original 'correlation' matrix of portfolio 1 is
not positive semidefinite and needs "fixing"; the 'correlation' matrix of portfolio 2 is positive semidefinite. We see
that the time needed to prepare for the quasi-factor model is much less.

The time needed for performing the Monte Carlo simulations is presented in Table 4. Again, the calculation time
is greatly reduced when the quasi-factor model approach is used.
Table 3: Time needed to prepare for simulation

Portfolio ID | standard: fixing time | standard: Cholesky fact. | quasi-factor
1            | 13 hours              | 2 hours                  | < 1 minute
2            | (not needed)          | 15 hours                 | < 1 minute
Table 4: Simulation time comparison

Portfolio   | number of trials | standard           | quasi-factor
Portfolio 1 | 10,000           | 15 minutes         | 3 minutes
Portfolio 1 | 100,000          | 1 hour 40 minutes  | 24 minutes
Portfolio 1 | 500,000          | 8 hours 10 minutes | 1 hour 20 minutes
Portfolio 2 | 10,000           | 1 hour 7 minutes   | 3 minutes
Portfolio 2 | 100,000          | 5 hours 30 minutes | 28 minutes
Portfolio 2 | 500,000          | 1 day 2 hours      | 2 hours 28 minutes
Portfolio 2 | 1,000,000        | 4 days             | 4 hours 28 minutes
Portfolio 2 | 2,000,000        | 8 days             | 8 hours 30 minutes

6 Acknowledgments
We would like to thank Bill Morokoff, Craig Friedman, Jayson Rome and other colleagues of the Quantitative
Analytics Research Group for helpful suggestions and discussions.
References
L. Andersen, J. Sidenius, and S. Basu. All your hedges in one basket. Risk, (November):67–72, 2003.
Fitch Ratings. The Fitch default vector model: user manual. Fitch Ratings Report, 2005.
A. McNeil, R. Frey, and P. Embrechts. Quantitative Risk Management. Princeton University Press, 2005.
R. Rebonato and P. Jäckel. The most general methodology to create a valid correlation matrix for risk management and option pricing purposes. The Journal of Risk, 2:17–28, 2000.
Standard & Poor's. CDO Evaluator system version 4.1 user guide. Standard & Poor's Structured Finance Group, 2008.
For more information, visit us at www.standardandpoors.com or call:

Americas: 1 212.438.7280
Australia: 61 1300.792.553
Europe: 44 20.7176.7176
Japan: 81 3.4550.8711
Singapore: 65 6239.6316
Standard & Poor’s Ratings Services
55 Water Street
New York, NY 10041
www.standardandpoors.com
Copyright © 2013 by Standard & Poor’s Financial Services LLC. All rights reserved.