Risk measures with Model Risk
Alessandro Pollastri∗
Peter C. Schotman†
February 15, 2016
Abstract
We study how dynamic volatility models affect value at risk (VaR). Using a dynamic model for
realized volatility we estimate the density of future volatility. Mixing this density
with the conditional density of returns given the volatility, we derive the predictive
density of returns, which we use to estimate the risk measures. We find that different
dynamic specifications lead to very diverse VaR estimates, especially when longer
horizons are considered. Furthermore, using a Bayesian approach we consider the
effect of parameter uncertainty on the risk measures. We show that parameter risk
also contributes significantly to the VaR estimate.
∗ Maastricht University, P.O. Box 616, 6200 MD Maastricht, The Netherlands.
Email: [email protected]
† Email: [email protected]

1 Introduction
Risk management is a fundamental task for financial institutions. A widely known
measure of risk is value at risk (VaR), introduced by Jorion, which has since become a
standard indicator for supervisors and risk managers.
Regulators of the banking sector are mainly concerned with next-day risk measures
of a given position of a bank. However, pension funds and insurers, for
example, have longer investment horizons. In this paper we study how different
dynamic models affect risk measures at different horizons, and what the impact
of estimation risk on these measures is.
We propose to compute value at risk for the next T days at confidence level α using
the predictive density:
$$p(R_{t+T}) = \int_{0}^{+\infty} p(R_{t+T} \mid \Sigma^2_{t+T})\, p(\Sigma^2_{t+T} \mid \theta, I_{t-1})\, p(\theta \mid I_{t-1})\, d\Sigma^2_{t+T}\, d\theta \tag{1}$$
where $p(R_{t+T} \mid \Sigma^2_{t+T})$ is the conditional distribution of the cumulative returns over the
investment horizon T given the integrated variance $\Sigma^2_{t+T}$, defined by:
$$\Sigma^2_{t+T} = \sum_{i=1}^{T} \sigma^2_{t+i}$$
where $\sigma^2_t$ is the daily variance. Then, VaR at any confidence level α is obtained as
$$\alpha = \int_{-\infty}^{-VaR(\alpha)} p(R_{t+T})\, dR_{t+T} \tag{2}$$
To obtain an estimate of $p(R_{t+T})$, a dynamic model for the variance $\sigma^2_{t+T}$ at any
maturity T is required; value at risk can then be obtained according to (2).
We aim at comparing value at risk estimates obtained using different volatility models.
Escanciano and Olmo (2011) study the issue of model risk within a backtesting
framework. For proper backtesting they propose to account for model misspecification
through the asymptotic distribution of these models, using a bootstrap procedure.
We differ from their study because we
analyze the VaR obtained from different dynamic realized volatility models.
Moreover, it is common practice to obtain value at risk using the square root scaling
rule, i.e. the T-period VaR is given by the square root of T times the value at risk for the
next period. We want to compare how exploiting the dynamic structure of these
models leads to different estimates of VaR compared to the square root scaling rule.
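As a minimal sketch of the square root scaling rule just described, the Python snippet below scales a one-day VaR to a T-day horizon and forms the ratio of a model-based T-day VaR to the scaled figure. The function names and the numbers in the usage lines are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math


def sqrt_time_var(var_one_day: float, horizon_days: int) -> float:
    """Scale a one-day VaR to a T-day horizon with the square root rule."""
    if horizon_days < 1:
        raise ValueError("horizon_days must be a positive integer")
    return math.sqrt(horizon_days) * var_one_day


def scaling_ratio(var_dynamic: float, var_one_day: float, horizon_days: int) -> float:
    """Ratio of a model-based T-day VaR to its square-root-scaled counterpart.

    A ratio above one means the scaling rule understates the model-based VaR.
    """
    return var_dynamic / sqrt_time_var(var_one_day, horizon_days)


# Hypothetical inputs: a one-day VaR of 2.0 and a model-based 25-day VaR of 12.0.
scaled = sqrt_time_var(2.0, 25)
ratio = scaling_ratio(12.0, 2.0, 25)
```

A ratio that grows with the horizon would indicate that the rule becomes less adequate as T increases.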
Danielsson and Zigrand (2006) study the effects of using the square-root-of-time rule
(SRTR) within a jump-diffusion model and conclude that it leads to an underestimation
of VaR, which becomes worse for longer horizons and for quantiles further out in
the tail of the distribution. Christoffersen, Diebold, and Schuermann (1998) study
this issue within a GARCH framework and point out the opposite: they claim
that the SRTR leads to excessive fluctuations in volatility and hence overestimates VaR.
Our work differs from these studies in several ways. First, both of the above-mentioned
studies consider the conditional density of returns given the previous day's volatility. In
this setting Danielsson and Zigrand (2006) show that fat tails of the distribution
imply that the scaling law does not hold. Our model follows Andersen, Bollerslev,
Diebold, and Labys (2003) in specifying a conditionally normal model given current-day
stochastic volatility. Fat tails then result from mixing the return distribution
with the stochastic volatility. This model is particularly well suited for studying
temporal aggregation.
Furthermore, this paper investigates how parameter uncertainty for a given
model of volatility affects the value at risk estimate, and how this estimate differs
from the one obtained using only the dynamic properties of the model. This issue has
not been extensively analyzed in the literature; an example is Escanciano
and Olmo (2010). Their focus is mainly on backtesting and on how estimation risk affects
common backtesting procedures when a parametric model is chosen to compute
VaR. They point out that common procedures to evaluate risk models may give
misleading indications when it comes to rejecting a model. Our approach focuses directly
on adjusting value at risk for parameter uncertainty.
Gourieroux and Zakoïan (2013) also study the effect of estimation risk on VaR.
They propose a methodology, based on an expansion of the residual, to reduce the bias
arising from estimating a given parametric model. When facing the issue of
parameter uncertainty, we differ from their methodology by taking a Bayesian
perspective: sampling the variances using the posterior draws of the parameters of the
volatility model allows us to take parameter uncertainty into account.
With the introduction of high-frequency data, the way of modeling and forecasting
volatilities has changed, and many models have been introduced to capture the
information content of these data. The research in this field assumes that
stock returns follow a jump-diffusion model described by a stochastic differential
equation (SDE). The quadratic variation describes how such a process varies; however,
it is not observable and must be estimated using high-frequency data. A widely
accepted estimator of the quadratic variation is the realized variance (RV), the sum of
squared intraday returns. Andersen, Bollerslev, Diebold, and Labys (2003) suggested the
use of reduced-form time series forecasting models for realized volatilities. A well-known
example is Corsi (2009), who models next-day realized variance as a restricted AR(22)
to capture the long-memory property of RV. In the same fashion, further models have been
introduced that add different predictors for RV: a variable estimating the jump component
of the quadratic variation, a variable measuring the integrated variance of the quadratic
variation, or alternative and robust estimators of these variables. However, these
models do not specify any dynamics for these additional variables, making it impossible
to build iterated forecasts of RV.
The paper is organized as follows: section two introduces the methodology to obtain
value at risk with and without parameter uncertainty, section three shows the results
and section four concludes the paper.
2 Methodology

2.1 The Model
Let $p(R_{t+T} \mid \Sigma^2_{t+T})$ be the density of the cumulative return over the investment horizon
of T days. We assume $p(R_{t+T} \mid \Sigma^2_{t+T})$ to be normal with mean zero and
variance $\Sigma^2_{t+T}$. The cumulative return over this period is given by:
$$R_{t+T} = r_{t+1} + r_{t+2} + \cdots + r_{t+T} \tag{3}$$
Assuming independence of returns over time, the variance over the
investment period is the sum of the daily variances:
$$\Sigma^2_{t+T} = \sum_{i=1}^{T} \sigma^2_{t+i}$$
The notation highlights how the resulting density depends on the cumulative variance
$\Sigma^2_{t+T}$. To obtain a forecast of the variance we need a model, which is
characterized by its parameter vector θ. We are interested in the predictive
density of the T-day cumulative return, marginalized with respect to the cumulative
variance and the parameters:
$$p(R_{t+T}) = \int p(R_{t+T} \mid \Sigma^2_{t+T})\, p(\Sigma^2_{t+T} \mid \theta)\, p(\theta)\, d\Sigma^2_{t+T}\, d\theta \tag{4}$$
To study the effect of using different models to forecast volatility, we consider
two models for realized variance, an estimator of quadratic variation
defined as follows:
$$RV_t = \sum_{i=1}^{1/\Delta} r^2_{i,t}$$
where ∆ denotes the intraday sampling interval, so that 1/∆ is the number of intraday
returns $r_{i,t}$ on day t. In our framework we set $\sigma^2_t = RV_t$.
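The realized variance defined above can be sketched in a few lines of Python. The function names are illustrative, and the input is assumed to be the list of intraday returns of a single trading day (for example 5-minute returns); the log transform anticipates the log realized variance used by the models below.

```python
import math


def realized_variance(intraday_returns):
    """Realized variance of one trading day: the sum of squared intraday returns."""
    return sum(r * r for r in intraday_returns)


def log_realized_variance(intraday_returns):
    """Log realized variance, the variable modeled by the HAR and AR(1) models."""
    rv = realized_variance(intraday_returns)
    if rv <= 0.0:
        raise ValueError("realized variance must be positive to take logs")
    return math.log(rv)


# Hypothetical intraday returns for one day, for illustration only.
rv_example = realized_variance([0.1, -0.2, 0.1])
```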
The first model we consider is the HAR model on log realized variance, $h_t = \log RV_t$,
introduced in Corsi (2009), which captures the long-memory property of realized variance:
$$h_{t+1} = \mu + \alpha_1 (h_t - \mu) + \alpha_2 (h^w_t - \mu) + \alpha_3 (h^m_t - \mu) + \omega \eta_{t+1} \tag{5}$$
where µ is the unconditional mean, $\eta_{t+1}$ is standard normally distributed and ω is
the volatility-of-volatility (VoV) parameter. Furthermore, $h^w_t$ is the average of the
log realized variance over the previous five days, whereas $h^m_t$ is the average of the
log realized variance over the previous twenty-one days.
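To make the HAR regressors concrete, here is a minimal Python sketch of the weekly and monthly averages and of the one-step-ahead conditional mean in (5). It assumes that both averaging windows end at day t and include it, which the text does not state explicitly; the function names and the window defaults (5 and 21 days) are illustrative choices.

```python
def har_regressors(h, weekly=5, monthly=21):
    """Return (h_t, h_w, h_m) from a list of daily log RV, most recent value last.

    Assumption: both averaging windows end at day t and include it.
    """
    if len(h) < monthly:
        raise ValueError("need at least `monthly` observations")
    h_t = h[-1]
    h_w = sum(h[-weekly:]) / weekly
    h_m = sum(h[-monthly:]) / monthly
    return h_t, h_w, h_m


def har_forecast(h, mu, a1, a2, a3, weekly=5, monthly=21):
    """One-step-ahead conditional mean of h_{t+1} under the HAR model (5)."""
    h_t, h_w, h_m = har_regressors(h, weekly, monthly)
    return mu + a1 * (h_t - mu) + a2 * (h_w - mu) + a3 * (h_m - mu)
```

When all three regressors equal µ the forecast is µ itself, which is the equilibrium case used for some of the results reported later.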
The second model we consider is a simple AR(1):
$$h_{t+1} = \mu + \rho (h_t - \mu) + \omega \eta_{t+1} \tag{6}$$
Equation (6) shows that this model is nested in the HAR model: it is obtained when
α2 and α3 are equal to zero, with ρ = α1.
To compare value at risk at different horizons between the HAR model and
the AR(1) model we need the T-step cumulative variance
$\Sigma^2_{t+T}$. For the AR(1) model this can be obtained simply by recursion. Exploiting this
structure, we recognize that the autoregressive parameter has a strong impact on
the variance of the volatility distribution, which results in a more fat-tailed return
distribution as ρ → 1. As for the HAR model, given its slightly more
complicated structure, it is convenient to write it in state space form and
then draw from a normal distribution with the mean and variance given in Appendix A.
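The recursion for the AR(1) model can be sketched as a simple path simulation in Python: each step draws the next log variance from (6), exponentiates it to obtain the daily variance, and accumulates the sum. This is an illustrative sketch using the standard library only; the function name and argument names are hypothetical, and `rng` is assumed to be a `random.Random` instance.

```python
import math
import random


def simulate_cumulative_variance(h_t, mu, rho, omega, horizon, rng):
    """One draw of the T-day cumulative variance under the AR(1) for log RV.

    Each step applies h_{t+1} = mu + rho * (h_t - mu) + omega * eta_{t+1}
    and adds sigma^2_{t+i} = exp(h_{t+i}) to the running total.
    """
    h = h_t
    total = 0.0
    for _ in range(horizon):
        h = mu + rho * (h - mu) + omega * rng.gauss(0.0, 1.0)
        total += math.exp(h)
    return total


# Hypothetical parameter values, for illustration only.
rng = random.Random(0)
draws = [simulate_cumulative_variance(0.0, 0.0, 0.98, 0.5, 21, rng) for _ in range(1000)]
```

With ω = 0 and the starting value at µ the draw collapses to T times exp(µ), which is a convenient sanity check.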
In the first case we consider the two dynamic specifications without parameter
uncertainty, meaning that the integration with respect to θ = (µ, α, ω) in (4) is not
carried out. In this case we use the
OLS estimates for the HAR model and three different values of the autoregressive
parameter for the AR(1). Integration over the cumulative variance is performed by
Monte Carlo sampling, where we evaluate a normal density with the drawn cumulative variances.
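One way to read this Monte Carlo step is sketched below in Python: the predictive distribution function of the cumulative return is approximated by averaging zero-mean normal distribution functions over the drawn cumulative variances, and VaR is the point at which this average equals α as in (2). Solving for that point by bisection is an assumption of this sketch, not a procedure described in the paper, and the function names are illustrative.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def predictive_cdf(x, cumulative_variances):
    """Monte Carlo estimate of P(R <= x): average of N(0, Sigma^2) cdfs over draws."""
    n = len(cumulative_variances)
    return sum(_STD_NORMAL.cdf(x / s2 ** 0.5) for s2 in cumulative_variances) / n


def value_at_risk(alpha, cumulative_variances, tol=1e-8):
    """Solve alpha = P(R <= -VaR) by bisection; VaR is reported as a positive number."""
    if not 0.0 < alpha < 0.5:
        raise ValueError("alpha must lie in (0, 0.5)")
    if not cumulative_variances or min(cumulative_variances) <= 0.0:
        raise ValueError("need at least one strictly positive variance draw")
    lo, hi = 0.0, 1.0
    # Expand the bracket until the left-tail probability at -hi is below alpha.
    while predictive_cdf(-hi, cumulative_variances) > alpha:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if predictive_cdf(-mid, cumulative_variances) > alpha:
            lo = mid  # tail probability still too large: VaR lies further out
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

With a single constant variance the result reduces to the usual normal quantile. With dispersed variance draws the 1% VaR exceeds that of a normal distribution with the same average variance, which is the fat-tail effect of mixing.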
Instead, when we consider parameter uncertainty in the HAR model we take a
Bayesian perspective and add a second Monte Carlo step, a
Gibbs sampler, to obtain draws of the parameters in θ. The Gibbs sampler
starts from an initial value of the parameters and then draws the first block of
parameters conditional on the others, then moves to the next block, and so on. The
posterior of the parameters is proportional to the product of the likelihood and the
prior:
$$p(\theta \mid h) \propto p(h \mid \theta)\, p(\theta)$$
where h is the vector containing the log RV. We draw using the following blocks:
1. Draw $\alpha \mid \mu, \omega^2$
2. Draw $\mu \mid \alpha, \omega^2$
3. Draw $\omega^2 \mid \alpha, \mu$
Given the simple structure of the model we use 100,000 draws; to
avoid dependence on the initial condition, we discard the first 5,000 draws. We
choose a flat prior on α and ω, whereas for the unconditional mean the prior is
normal with mean equal to the average of $h_t$ and variance equal to the
autocovariance-adjusted variance given in Appendix A. The algorithm and the posteriors
of the parameters are derived and explained in more detail in Appendix A.
3 Results

3.1 Data
Our database consists of twelve 5-minute return series of major US companies.
The sample covers 2489 trading days from January 1999 to December
2008. Table 6 in the appendix shows the summary statistics of the log realized variance of
these companies.
3.2 Model Uncertainty
We first compare how the different volatility models affect value at
risk at different horizons, T = 1, 5, 21, 63, 126 and 252.
Figure 1 shows how a change in the autoregressive coefficient affects value at
risk at different horizons. As expected, for T = 1 the models are indistinguishable,
given that the autoregressive parameter does not enter the variance of h. However,
for larger T the differences in the parameter values and in the structure relative
to the HAR model kick in, and the random walk (ρ = 1) always yields the highest
value at risk.
Figure 1: Value at risk at different horizons T and for different parameter values.
Panels: (a) T = 1, (b) T = 5, (c) T = 21, (d) T = 63, (e) T = 126, (f) T = 252.
As briefly mentioned in the introduction, it is common practice to obtain
the value at risk at long horizons by simply multiplying the one-step-ahead VaR by the
square root of the horizon. Figures 2 and 3 show clearly the difference between a
square root scaling of the risk measure and building a forecast using the recursive
properties of the models we consider.
Figure 2: Value at risk at different horizons T for different models.
Panels: (a) AR(1): ρ = .98, (b) AR(1): ρ = .99, (c) AR(1): ρ = 1, (d) HAR.
Figure 2 shows the levels of value at risk at 5% and 1% from T = 5 to
T = 252. For the AR(1), for every parameter value, the one-year risk measure
is implausibly high compared to the
HAR model. Figure 3 instead shows the ratios between the value at risk obtained using
the dynamic properties of the model considered and the next-day value at risk
multiplied by the square root of the horizon, at the same confidence levels as in the
previous figure.
Figure 3: Ratio between VaR for different horizons T and next day VaR scaled by √T
for different models.
Panels: (a) AR(1): ρ = .98, (b) AR(1): ρ = .99, (c) AR(1): ρ = 1, (d) HAR.
Figure 3 clearly shows that value at risk computed using the square root scaling rule
underestimates value at risk compared to the one obtained using the dynamic
properties of these models. Moreover, Figure 3 shows that this ratio is monotonically
increasing in T.
3.3 Parameter Uncertainty
The second part of the analysis considers the HAR model only; we want to see
whether parameter uncertainty plays a significant role in value at risk forecasts at
different horizons. In particular, we want to obtain a return distribution that takes
parameter uncertainty into account, as highlighted in equation (4). For the HAR model
the parameter vector is given by θ = (µ, α, ω) where α = (α1, α2, α3).
Figure 4 shows the value at risk at two different confidence levels when we take
parameter uncertainty into account. To integrate out parameter uncertainty we
use Gibbs sampling; technical details can be found in the appendix.
Figure 4: Levels and Ratios of Value at Risk for the HAR model taking into account
parameter uncertainty. Panels: (a) Levels, (b) Ratios.
Panel (a) of Figure 4 shows the levels of value at risk obtained with and without
parameter uncertainty. Panel (b) instead shows
the ratio between the value at risk with parameter uncertainty and the one without.
Starting from T = 5, at both confidence levels the value
at risk computed taking parameter uncertainty into account always lies above the
one without parameter risk. Moreover, the longer the forecast horizon
for the variance, the larger the ratio of these two quantities becomes. It is also
interesting to note that for a confidence level further out in the tail of the
distribution the ratio between the value at risk with parameter uncertainty and the one
without becomes larger.
The previous plots consider the case where $h_t$, $h^w_t$ and $h^m_t$ are at the equilibrium level
µ. Figure 5 shows how the ratios for two different confidence levels evolve over
a time span of ten years, taking the values of $h_t$, $h^w_t$ and $h^m_t$ at the last trading day
of each year as the starting point. Tables 1, 2, 3 and 4 show the levels of value at risk when
we consider the dynamic properties of the HAR model with and without parameter
uncertainty for the same two confidence levels.
Figure 5: Ratios of Value at Risk at different horizons at two different confidence
levels for the HAR model with and without parameter uncertainty.
Panels: (a) T = 1, (b) T = 5, (c) T = 21, (d) T = 63.
                   1999    2000    2001    2002    2003    2004    2005    2006    2007    2008
PANEL A: α = 5%
VaR                2.25    3.44    1.63    2.63    1.87    1.57    3.61    2.10    3.29    6.66
VaR^PU             2.29    3.49    1.65    2.68    1.90    1.59    3.66    2.14    3.34    6.65
PANEL B: α = 1%
VaR                3.43    5.19    2.48    4.00    2.85    2.39    5.41    3.20    4.97    7.70
VaR^PU             3.55    5.33    2.56    4.15    2.95    2.47    5.55    3.31    5.12    7.70
Table 1: 1-day-ahead VaR estimates with (VaR^PU) and without (VaR) parameter uncertainty for
the HAR model
                   1999    2000    2001    2002    2003    2004    2005    2006    2007    2008
PANEL A: α = 5%
VaR                5.64    8.02    4.04    6.42    4.47    3.64    8.40    5.12    7.60   18.69
VaR^PU             5.76    8.19    4.11    6.56    4.55    3.71    8.59    5.22    7.78   18.76
PANEL B: α = 1%
VaR                8.39   11.91    6.01    9.55    6.64    5.41   12.47    7.61   11.29   22.26
VaR^PU             8.69   12.33    6.19    9.89    6.85    5.59   12.92    7.87   11.71   22.29
Table 2: 5-day-ahead VaR estimates with (VaR^PU) and without (VaR) parameter uncertainty for
the HAR model
                   1999    2000    2001    2002    2003    2004    2005    2006    2007    2008
PANEL A: α = 5%
VaR               12.50   17.02    8.86   13.89    9.68    8.07   17.72   11.04   15.79   44.24
VaR^PU            12.84   17.55    9.06   14.29    9.89    8.23   18.29   11.30   16.29   45.02
PANEL B: α = 1%
VaR               18.57   25.24   13.16   20.63   14.36   11.98   26.27   16.38   23.42   56.13
VaR^PU            19.36   26.41   13.66   21.52   14.91   12.41   27.53   17.02   24.51   56.55
Table 3: 21-day-ahead VaR estimates with (VaR^PU) and without (VaR) parameter uncertainty for
the HAR model
                   1999    2000    2001    2002    2003    2004    2005    2006    2007    2008
PANEL A: α = 5%
VaR               23.20   30.26   17.24   25.36   18.55   15.89   31.40   20.78   28.27   75.82
VaR^PU            24.01   31.66   17.69   26.37   19.09   16.27   32.84   21.43   29.50   79.83
PANEL B: α = 1%
VaR               35.27   45.90   26.27   38.53   28.27   24.24   47.64   31.63   42.93  106.10
VaR^PU            37.44   49.26   27.67   41.08   29.84   25.52   51.08   33.45   45.92  110.08
Table 4: 63-day-ahead VaR estimates with (VaR^PU) and without (VaR) parameter uncertainty for
the HAR model
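As a small worked example of the ratios discussed in this section, the Python snippet below divides the VaR with parameter uncertainty by the VaR without it, using only the 1999 entries of Tables 1 to 4. The variable and function names are illustrative.

```python
# 1999 VaR levels for the HAR model, copied from Tables 1-4 (horizons in days).
horizons = [1, 5, 21, 63]
var_no_pu = {0.05: [2.25, 5.64, 12.50, 23.20], 0.01: [3.43, 8.39, 18.57, 35.27]}
var_pu = {0.05: [2.29, 5.76, 12.84, 24.01], 0.01: [3.55, 8.69, 19.36, 37.44]}


def pu_ratios(alpha):
    """Ratio of VaR with parameter uncertainty to VaR without, per horizon."""
    return [pu / base for pu, base in zip(var_pu[alpha], var_no_pu[alpha])]


for alpha in (0.05, 0.01):
    for horizon, ratio in zip(horizons, pu_ratios(alpha)):
        print(f"alpha={alpha:.2f}  T={horizon:>2}  ratio={ratio:.4f}")
```

For these entries the ratio rises with the horizon at both confidence levels and is larger at 1% than at 5%, in line with the pattern described for Figure 4.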
4 Conclusions
In this paper we analyze the impact of model uncertainty and parameter uncertainty
on long-term value at risk. Using two dynamic models for realized volatility based
on high-frequency data, we obtain value at risk estimates for different investment
horizons. First, we show that using the dynamic properties of these models to obtain
a risk measure leads to significantly different results compared to the commonly
used scaling rule. Moreover, we show that different models for volatility imply very
different scalings for long-term value at risk.
Furthermore, we show that considering the HAR model with parameter uncertainty,
through Bayesian analysis, leads to higher VaR estimates for both small and large T.
The current setting can be extended in many directions. First, we plan to consider
the latest developments in the HAR literature, which use new
factors to improve the forecast of realized volatility. One example is Andersen,
Bollerslev, and Diebold (2005), who separate the continuous part and
the discontinuous part of realized volatility to improve the forecast. Another
example we intend to consider is Bollerslev, Patton, and Quaedvlieg (2015),
who use realized quarticity to attenuate the measurement error in realized
volatility.
Furthermore, we want to consider an alternative and more parsimonious specification,
an AR(1) with time-varying coefficients.
Moreover, given the analysis we have performed on estimation risk, we will consider
different priors to check how diverse specifications affect the resulting VaR.
Finally, we plan to extend the current setting to a portfolio selection framework where
model and parameter risk affect both volatilities and correlations, hence having an
impact on portfolio weights and the overall portfolio's risk measures.
References
Andersen, Torben G., Tim Bollerslev, and Francis X. Diebold, 2005, Roughing It Up:
Including Jump Components, Review of Economics and Statistics 89, 701–720.
Andersen, Torben G., Tim Bollerslev, Francis X. Diebold, and Paul Labys, 2003,
Modeling and Forecasting Realized Volatility, Econometrica 71, 579–625.
Bollerslev, Tim, Andrew J. Patton, and Rogier Quaedvlieg, 2015, Exploiting the Errors:
A Simple Approach for Improved Volatility Forecasting, Journal of Econometrics, forthcoming.
Christoffersen, P., F. Diebold, and T. Schuermann, 1998, Horizon problems and extreme
events in financial risk management, Federal Reserve Bank of New York Economic
Policy Review, 109–118.
Corsi, Fulvio, 2009, A simple approximate long-memory model of realized volatility,
Journal of Financial Econometrics 7, 174–196.
Danielsson, Jon, and Jean-Pierre Zigrand, 2006, On time-scaling of risk and the
square-root-of-time rule, Journal of Banking and Finance 30, 2701–2713.
Escanciano, J. Carlos, and Jose Olmo, 2010, Backtesting Parametric Value-at-Risk
With Estimation Risk, Journal of Business & Economic Statistics 28, 36–51.
Escanciano, J. Carlos, and Jose Olmo, 2011, Robust backtesting tests for value-at-risk
models, Journal of Financial Econometrics 9, 132–161.
Gourieroux, Christian, and Jean-Michel Zakoïan, 2013, Estimation-Adjusted VaR,
Econometric Theory 29, 735–770.
A Technical Appendix

A.1 Monte Carlo Simulation
The AR(1) and HAR models can be written in state space form as follows:
$$Y_{t+1} = F Y_t + G E_{t+1}$$
where $Y_{t+1}$ is a vector containing $h_{t+1}$ in the first position followed by its lags down to
$h_{t-p+2}$, and p = 22 is the order of the restricted autoregressive process. Moreover, F is
the matrix containing the autoregressive coefficients in its first row: for the AR(1)
this row has a nonzero element in the first position and zeros elsewhere,
whereas for the HAR model it contains the coefficient estimates of the transformed
AR(22) process. The lower-left (p − 1) × (p − 1) block of F is an identity matrix and the
remainder of the last column consists of p − 1 zeros. Finally, G is a column vector
containing a one in the first position and zeros elsewhere. By simple recursion,
$$E[Y_{t+T}] = F^{T} Y_t$$
$$V[Y_{t+T}] = \omega^2 \sum_{j=0}^{T-1} F^{j} G G' (F^{j})'$$
where $F^{T}$ denotes the T-th power of F and a prime denotes transposition.
If T < p then only the T × T block is considered, whereas if T > p the matrix F is
T × T and differs from the previous case in the size of its inner blocks.
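The recursion for the mean and variance of the state vector can be sketched with plain Python lists. The sketch below assumes that the state is expressed in deviations from µ (the state equation above has no intercept) and that the HAR regressors map into AR lags through 5-day and 22-day averages that include day t; both are assumptions of this example, and all function names are illustrative.

```python
def har_to_ar(a1, a2, a3, weekly=5, monthly=22):
    """AR coefficients implied by HAR coefficients (assumed window lengths)."""
    phi = [a3 / monthly] * monthly
    for i in range(weekly):
        phi[i] += a2 / weekly
    phi[0] += a1
    return phi


def companion_matrix(phi):
    """State space matrix F: AR coefficients in the first row, identity block below."""
    p = len(phi)
    F = [[0.0] * p for _ in range(p)]
    F[0] = list(phi)
    for i in range(1, p):
        F[i][i - 1] = 1.0
    return F


def mat_vec(F, v):
    """Matrix-vector product for list-of-lists matrices."""
    return [sum(a * b for a, b in zip(row, v)) for row in F]


def forecast_moments(phi, y_t, omega, horizon):
    """Return E[Y_{t+T}] = F^T Y_t and V[Y_{t+T}] = omega^2 sum_j F^j G G' (F^j)'."""
    p = len(phi)
    if len(y_t) != p:
        raise ValueError("y_t must have the same length as phi")
    F = companion_matrix(phi)
    mean = list(y_t)              # y_t[0] = h_t - mu, y_t[1] = h_{t-1} - mu, ...
    v = [1.0] + [0.0] * (p - 1)   # F^0 G
    cov = [[0.0] * p for _ in range(p)]
    for _ in range(horizon):
        mean = mat_vec(F, mean)
        for i in range(p):
            for k in range(p):
                cov[i][k] += omega * omega * v[i] * v[k]
        v = mat_vec(F, v)         # F^{j+1} G
    return mean, cov
```

For an AR(1) with ρ = 0.5, ω = 1 and T = 3 the top-left variance element is 1 + 0.25 + 0.0625, the familiar geometric sum.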
A.2 Gibbs Sampling
The second step in the analysis considers the HAR model with and without parameter
uncertainty: without parameter uncertainty the Monte Carlo step described in the
previous section is used, whereas with parameter uncertainty we use the Gibbs sampler
described here.
The parameters we draw with the Gibbs sampler are µ, α and ω².
Recall the structure of the HAR model:
$$h_{t+1} = \mu + \alpha_1 (h_t - \mu) + \alpha_2 (h^w_t - \mu) + \alpha_3 (h^m_t - \mu) + \omega \eta_{t+1} \tag{7}$$
We specify a flat prior on α, $p(\alpha) \propto 1$, a normal prior on µ, $\mu \sim N(\mu_0, V_0^2)$, and a
prior on ω² that is flat in logs, $p(\omega^2) \propto 1/\omega^2$. The prior parameters for µ are
set equal to $\mu_0 = \bar{h}_t$ and
$$V_0^2 = \frac{k}{T}\left(C_0 + 2\sum_{l=1}^{L}\left(1 - \frac{l}{L+1}\right) C_l\right), \qquad C_l = \frac{1}{T}\sum_{t=l+1}^{T} h_t h_{t-l}.$$
The algorithm draws the parameters conditional on the previous draws of the other ones.
Specifically, the algorithm cycles as follows:
1. Draw $\alpha \mid \omega^2, \mu$ from the posterior $\alpha \mid \omega^2, \mu \sim N(\hat{\alpha}, \omega^2 (\tilde{H}'\tilde{H})^{-1})$, where
$\hat{\alpha} = (\tilde{H}'\tilde{H})^{-1}\tilde{H}'\tilde{h}$, $\tilde{h} = h_{t+1} - \mu$ and $\tilde{H} = H - J\mu$, with H being a T × 3 matrix
containing the RHS variables of the HAR model and J being a T × 3 matrix
containing ones.
We accept the draw if $\max|\lambda_F(\alpha)| < 1$, where $\lambda_F$ denotes the eigenvalues of
the state space matrix F described in the previous section.
2. Draw $\mu \mid \alpha, \omega^2$ from the posterior $\mu \mid \alpha, \omega^2 \sim N(\hat{\mu}, \hat{V}_\mu^2)$, where
$$\hat{V}_\mu^2 = \left(\frac{j'j}{\omega^2} + \frac{1}{V_0^2}\right)^{-1}, \qquad \hat{\mu} = \hat{V}_\mu^2\left(\frac{j'g}{\omega^2} + \frac{\mu_0}{V_0^2}\right).$$
Furthermore, $g = h_{t+1} - H\alpha$ and $j = \iota - J\alpha$, where ι is a vector of ones.
3. Draw $\omega^2 \mid \alpha, \mu$ from the posterior $e'e/\omega^2 \sim \chi^2(T + 1)$, where e are the residuals
obtained in the first step.
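The three-block cycle can be illustrated on the nested AR(1) special case, where α reduces to the scalar ρ and the matrix expressions above become sums. The Python sketch below uses only the standard library and is not the authors' implementation. Redrawing ρ until |ρ| < 1 is this sketch's reading of the acceptance step, the residuals in block 3 are recomputed at the current draws, and `mu0` and `v0_sq` are hypothetical names for the prior mean and variance of µ.

```python
import math
import random


def gibbs_ar1(h, mu0, v0_sq, n_draws, burn_in, rng):
    """Gibbs sampler for h[t+1] = mu + rho * (h[t] - mu) + omega * eta[t+1].

    Returns a list of (mu, rho, omega_sq) draws after the burn-in.
    """
    y, x = h[1:], h[:-1]
    n = len(y)
    mu, omega_sq = sum(h) / len(h), 1.0   # arbitrary starting values
    draws = []
    for it in range(n_draws):
        # Block 1: rho | mu, omega^2 ~ N(rho_hat, omega^2 / sum(x_tilde^2)), flat prior.
        xt = [v - mu for v in x]
        yt = [v - mu for v in y]
        sxx = sum(v * v for v in xt)
        rho_hat = sum(a * b for a, b in zip(xt, yt)) / sxx
        while True:
            rho = rng.gauss(rho_hat, math.sqrt(omega_sq / sxx))
            if abs(rho) < 1.0:            # stationarity check (assumed redraw rule)
                break
        # Block 2: mu | rho, omega^2 with g = y - rho * x and j = (1 - rho) * iota.
        j = 1.0 - rho
        sum_g = sum(b - rho * a for a, b in zip(x, y))
        v_mu = 1.0 / (n * j * j / omega_sq + 1.0 / v0_sq)
        mu_hat = v_mu * (j * sum_g / omega_sq + mu0 / v0_sq)
        mu = rng.gauss(mu_hat, math.sqrt(v_mu))
        # Block 3: omega^2 | rho, mu from e'e / omega^2 ~ chi-square(n + 1).
        ee = sum(((b - mu) - rho * (a - mu)) ** 2 for a, b in zip(x, y))
        chi2 = rng.gammavariate((n + 1) / 2.0, 2.0)   # chi-square as a gamma draw
        omega_sq = ee / chi2
        if it >= burn_in:
            draws.append((mu, rho, omega_sq))
    return draws
```

Each retained triple can then be fed into the variance simulation to produce one draw of the cumulative variance, so that averaging over the draws integrates out θ as in (4).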
B Tables
Par.      GM     HD    HNZ    HON    IBM   INTC   MSFT    WFC    WMT    WYE    XOM    XRX
α0     0.041  0.044  0.009  0.059  0.020  0.042  0.027  0.014  0.023  0.048  0.032  0.053
α1     0.404  0.396  0.359  0.424  0.433  0.464  0.441  0.430  0.383  0.323  0.446  0.392
α2     0.289  0.381  0.303  0.337  0.382  0.285  0.361  0.375  0.341  0.370  0.408  0.323
α3     0.273  0.178  0.292  0.179  0.145  0.218  0.162  0.170  0.238  0.250  0.084  0.248
Table 5: OLS parameter estimates of the HAR model for different stocks
Company      Min     Mean   Median      Max      Std  Skewness  Kurtosis
GM       -1.5984   1.1201   0.9626   7.5068   1.0487    1.2283    5.8864
HD       -1.7046   1.0016   0.9426   4.6232   0.9066    0.3815    2.9883
HNZ      -2.1337   0.1952   0.1250   4.0143   0.9068    0.3275    2.8350
HON      -1.1859   1.0059   0.9347   5.5792   0.9033    0.5140    3.2652
IBM      -1.9867   0.5447   0.4598   4.2632   0.9766    0.3825    2.8664
INTC     -0.8819   1.3180   1.2308   4.4911   0.9251    0.3010    2.5528
MSFT     -1.6334   0.7899   0.7751   4.1528   0.9756    0.2502    2.5664
WFC      -2.3207   0.5745   0.5070   5.4122   1.1889    0.5179    3.0866
WMT      -1.8243   0.6759   0.5856   4.2754   0.9488    0.3238    2.5723
WYE      -1.6016   0.8784   0.7850   4.5365   0.8909    0.5448    3.3008
XOM      -1.8106   0.5333   0.4789   4.9588   0.8067    0.7063    4.4320
XRX      -1.2731   1.4551   1.3447   5.7347   1.0665    0.5346    3.2075
Table 6: Summary statistics of $h_t$ for different stocks