Estimating Expected Shortfall Using a Conditional
Autoregressive Model: CARES∗
Yin Liao and Daniel Smith
Queensland University of Technology
Brisbane, QLD, 4001
April 21, 2015
Abstract
Expected shortfall (ES) has recently become an increasingly popular measure
of downside risk because of its conceptual appeal and desirable properties. This article proposes a new conditional autoregressive model for estimating ES (CARES)
by specifying the evolution of ES over time using an autoregressive process. We
develop a tail-based least-squares method of estimating the model parameters and
establish the consistency and asymptotic normality of the resultant estimator.
Our simulation results show that the CARES model demonstrates superior finite-sample forecasting performance compared with existing methods. As an empirical
illustration, we apply the model to evaluate the ES of one stock
index and two individual stocks.
1 Introduction
The recent financial crisis has spurred renewed interest in developing accurate downside
risk measures for financial markets and institutions. Value at risk (VaR), which measures
the possible maximum loss of an asset or a certain portfolio within a given time period
∗ We thank conference participants at the "Princeton-QUT-SMU Financial Frontiers Workshop" at the Queensland
University of Technology, the "Current Research in Empirical Finance and Macroeconomics Workshop" at the University
of Queensland and the 2014 Australasian Econometric Society Meeting at the University of Tasmania.
at a given confidence level, has been the most popular measure of downside risk used in
practice over the past decade. Despite its great success, VaR has two major drawbacks.
First, this measure lacks subadditivity and is therefore not a coherent risk measure. Hence, the VaR
of a portfolio can be larger than the sum of the VaRs of its components, which contradicts the
conventional wisdom that diversification reduces risk. Second, the VaR measure ignores
the magnitude of losses beyond the VaR level because it accounts for only the probability of such losses, not
their sizes. In response to these shortcomings, a number of alternative risk measures have
been gaining traction, particularly expected shortfall (ES) (see Artzner, Delbaen, Eber,
and Heath (1999)). ES is defined as the conditional expectation of a loss given that the
loss is larger than the VaR. In contrast to VaR, ES provides information regarding the
magnitude of the loss beyond the VaR level and is subadditive. However, the calculation
of ES can be an intricate computational exercise given the lack of a closed-form formula
(Yuan and Wong (2010)), and little research has examined the methods of ES estimation.
Our work here contributes to filling this gap.
Let $X_t$, $t = 1, \dots, n$, denote the price of an asset or a portfolio over n periods, and let $y_t = -\log(X_t / X_{t-1})$
represent the negative logarithmic return over the t-th period, which is assumed to be a stationary process with the marginal distribution function F. The VaR satisfies $VaR_\tau =
\inf\{u : F(u) \ge \tau\}$, which is essentially the $\tau$th quantile of the return distribution. Consequently, the ES is defined as $ES_\tau = E(y_t \mid y_t > VaR_\tau)$. Although the ES is conceptually
superior to the VaR, ES modeling and estimation remain a challenging problem whose
solution lacks consensus. First, the ES measures the tail magnitude of the return distribution beyond a certain point (VaR) such that its modeling and estimation cannot
be independent of the modeling and estimation of the VaR. Second, because the distribution of returns typically changes over time, estimating the ES requires a suitable
method of modeling the time-varying tail magnitude beyond a certain quantile.
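The two definitions above can be illustrated with their unconditional sample analogues. The following is a minimal sketch, not the estimator proposed in this paper: the function names and the toy loss series are hypothetical, and the data are treated as a plain i.i.d. sample of negative log returns.

```python
import math

def empirical_var(losses, tau):
    """Smallest observed u whose empirical CDF satisfies F(u) >= tau."""
    ordered = sorted(losses)
    n = len(ordered)
    # Number of observations that must lie at or below u; the small
    # tolerance guards against floating-point noise in tau * n.
    k = max(1, math.ceil(tau * n - 1e-12))
    return ordered[k - 1]

def empirical_es(losses, tau):
    """Average of the losses strictly above the empirical VaR."""
    var = empirical_var(losses, tau)
    tail = [y for y in losses if y > var]
    # If no observation exceeds the VaR, fall back to the VaR itself.
    return sum(tail) / len(tail) if tail else var

# Hypothetical toy losses 1, 2, ..., 20 (negative log returns).
losses = [float(i) for i in range(1, 21)]
var_90 = empirical_var(losses, 0.90)
es_90 = empirical_es(losses, 0.90)
print(var_90, es_90)
```

With these toy losses and τ = 0.90, the VaR is the 18th order statistic and the ES is the average of the two larger losses, which shows that the ES depends on the VaR estimate it conditions on.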
A variety of techniques for VaR estimation have been proposed in the literature. Of these
methods, most focus on first modeling and estimating the entire distribution of returns
and then extracting a certain quantile of the estimated distribution. Whereas parametric methods for this purpose involve the parameterization of the time-varying financial
asset return distribution, nonparametric methods simply rely on historical data for the
estimation of the return distribution without imposing any distributional assumptions. However, each
of these types of methods can be easily criticized. First, the parametric approach requires
an assumption regarding the return distribution, but unfortunately, these assumptions
are typically inconsistent with the real data. Meanwhile, the nonparametric approach
is notoriously difficult to apply when the data sample is limited, and it assumes that
returns are independent and identically distributed (i.i.d.) and hence does not allow for
time-varying volatility.
In a recent exception to traditional approaches, the quantile is modeled directly rather
than the entire distribution. Engle and Manganelli (2004) propose a conditional autoregressive value-at-risk (CAViaR) model, which specifies the evolution of the VaR
over time using a specialized type of autoregressive process. Unknown parameters are
estimated using quantile regression. The CAViaR model relies on the empirical finding
that stock market return volatilities cluster over time. Consequently, the VaR, which
is tightly linked to the standard deviation of the distribution, must exhibit similar behavior. This approach has strong appeal in that it provides a modeling framework but
does not rely on distributional assumptions. However, the focus of this model is solely
on VaR estimation, with no consideration of how to estimate the corresponding ES.
We pursue a similar concept in proposing a conditional autoregressive model for the
calculation of ES (henceforth referred to as the CARES model). The CARES model
specifies the evolution of the ES itself over time using an autoregressive process, and the
model parameters are estimated by minimizing the squared loss function in the region
of losses in excess of the VaR. The resultant parameter estimates can be regarded as
tail-based least-squares estimators. More specifically, to solve the problem of minimizing the squared loss in excess of the VaR, we jointly address two other minimization
problems. The first involves minimizing the check loss function
\[ \min_{\beta} \frac{1}{T} \sum_{t=1}^{T} (\tau - I(y_t < f_t(\beta)))(y_t - f_t(\beta)) \]
to obtain an estimate for $f_t$, where $f_t(\beta)$ is a dynamic specification
for $VaR_t$ that is known up to $\beta$, and the second involves minimizing the squared loss
function
\[ \min_{\gamma} \frac{1}{T} \sum_{t=1}^{T} I(y_t < f_t(\beta))(y_t - g_t(\gamma))^2, \]
where $g_t(\gamma)$ is a dynamic specification
for $ES_t$ that is known up to $\gamma$. The first-order conditions of these two minimization
problems provide two moment conditions that are useful in generalized method of moments (GMM) estimation, and we therefore derive the asymptotic theory of these model
parameter estimators under a GMM framework to study their theoretical properties.
Meanwhile, we conduct a series of Monte Carlo studies to investigate the finite-sample
properties of the model parameter estimators and the finite-sample forecasting performance of the CARES model.
Finally, the closely related work of Taylor (2008) should be noted. Taylor
(2008) also develops an autoregressive model for the conditional ES (henceforth referred
to as the CARE model), but this model differs from our CARES model in several
aspects. First, given the one-to-one mapping between quantiles and expectiles for a
given distribution (Efron (1991); Yuan and Wong (2010); Yao and Tong (1996)), the
CARE model employs the expectile to obtain both VaR and ES estimates. Therefore,
the success of the CARE model largely depends on how the coverage probability of the
expectile (α) is formulated as a function of the coverage probability of the quantile (τ ).
Because the return distribution varies over time, the value of α changes over time even
for a fixed τ . The need to estimate α at each time point causes the CARE model not
only to be computationally demanding but also to incur greater estimation errors. This
effect will degrade the ES forecast, as shown in Section 3. Second, rather than directly
estimating the autoregressive model for the conditional ES, Taylor (2008) relies on an
algebraic link (which is a function of α and τ ) between the conditional expectile and
quantile to infer the CARE model parameters based on an autoregressive model for the
estimated conditional expectile. This link holds for many widely used distributions, but
it is violated when the underlying return distribution becomes overly complicated. In
this sense, our model allows for a greater degree of flexibility in the model parameters,
and the CARE model is therefore nested in our CARES model as a special case when the
set of restrictions on the model parameters implied by the link between the conditional
expectile and quantile holds. We formally test these restrictions on simulated and real
data in Section 3 and Section 4.
The remainder of this paper is structured as follows. Section 2 introduces the CARES
model and establishes the consistency and asymptotic normality of the model parameter
estimators. Section 3 presents a range of Monte Carlo simulations performed to study
the finite-sample properties of the model. Several empirical applications are presented
in Section 4. Section 5 concludes the paper.
2 CARES model
In this section, we provide a description of the CARES model, introduce the model estimation procedure, and discuss the asymptotic theory of the model parameter estimators.
2.1 Model Description
To motivate the functional form, we consider a simple scenario in which an asset return
$r_t$ follows a Gaussian distribution with mean $\mu_t$ and standard deviation $\sigma_t$. Let $\phi(r_t; \mu_t, \sigma_t^2)$
and $\Phi(r_t; \mu_t, \sigma_t^2)$ respectively denote the density and distribution functions of the return.
$VaR_t$ for a given probability $\tau$ can be simply expressed as
\[ VaR_t = \mu_t + \sigma_t \Phi^{-1}(\tau) \tag{1} \]
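and the corresponding ES can be computed as
\[
\begin{aligned}
ES_t &= E(r_t \mid r_t \le VaR_t) \\
&= \int_{-\infty}^{VaR_t} r_t \, \frac{\phi(r_t; \mu_t, \sigma_t^2)}{\Phi(VaR_t; \mu_t, \sigma_t^2)} \, dr_t \\
&= \mu_t + \left[ -\sigma_t^2 \, \frac{\phi(r_t; \mu_t, \sigma_t^2)}{\Phi(VaR_t; \mu_t, \sigma_t^2)} \right]_{-\infty}^{VaR_t} \\
&= \mu_t - \sigma_t^2 \, \frac{\phi(VaR_t; \mu_t, \sigma_t^2)}{\Phi(VaR_t; \mu_t, \sigma_t^2)}.
\end{aligned}
\]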
Because $(VaR_t - \mu_t)/\sigma_t = \Phi^{-1}(\tau)$, we can rearrange the above equation to obtain
\[ ES_t = E(r_t \mid r_t \le VaR_t) = \mu_t - \sigma_t \cdot \frac{\phi(\Phi^{-1}(\tau))}{\tau}. \tag{2} \]
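As a numerical sanity check on equations (1) and (2), the closed-form Gaussian ES can be compared with a direct numerical integration of the truncated mean. This is a minimal sketch using only the Python standard library; the function names and the example values of the mean, standard deviation and τ are hypothetical.

```python
from statistics import NormalDist

def gaussian_var_es(mu, sigma, tau):
    """Closed-form lower-tail VaR and ES from equations (1) and (2)."""
    std = NormalDist()
    z = std.inv_cdf(tau)                      # standard normal quantile
    var = mu + sigma * z                      # equation (1)
    es = mu - sigma * std.pdf(z) / tau        # equation (2)
    return var, es

def es_by_integration(mu, sigma, tau, steps=100_000):
    """Midpoint-rule approximation of E(r | r <= VaR) for a Gaussian return."""
    dist = NormalDist(mu, sigma)
    var = dist.inv_cdf(tau)
    lower = mu - 12.0 * sigma                 # truncation point; the mass below is negligible
    width = (var - lower) / steps
    total = 0.0
    for i in range(steps):
        r = lower + (i + 0.5) * width
        total += r * dist.pdf(r) * width
    return total / tau                        # divide by P(r <= VaR) = tau

# Hypothetical example values.
var_t, es_t = gaussian_var_es(mu=0.05, sigma=1.2, tau=0.05)
es_num = es_by_integration(mu=0.05, sigma=1.2, tau=0.05)
print(var_t, es_t, es_num)
```

The two ES values should agree up to the integration error, and the ES lies below the VaR because it averages the returns in the lower tail.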
Both $VaR_t$ and $ES_t$ are thus linear in the standard deviation $\sigma_t$ with the same location $\mu_t$, and
we can therefore reasonably assume the same functional form for the ES as used for
the VaR. Note that we use the Gaussian distribution as an example here to illustrate
the link between volatility and ES (and VaR). An analogous relationship, with different
scaling constants, can be expected to hold for other distributions. Consequently, we draw inspiration
from the CAViaR model of Engle and Manganelli (2004) to propose a CARES model
to formalize the dynamic characteristics.
Recall the CAViaR model, in which the conditional quantile is specified as an autoregressive function $f_t(\beta)$ that depends on the parameter vector $\beta$ as follows:
\[ f_t(\beta) = \beta_0 + \sum_{i=1}^{q} \beta_i \cdot f_{t-i}(\beta) + \sum_{i=1}^{r} \beta_{q+i} \cdot l(x_{t-i}), \tag{3} \]
where the $\beta_i f_{t-i}(\beta)$, $i = 1, \dots, q$, are the autoregressive terms, which ensure that the
quantile changes smoothly over time, and the role of $l(x_{t-i})$ is to link $f_t(\beta)$ to observable
variables that belong to the information set.
We introduce a similar functional form for the conditional ES to specify its dynamics as
\[ g_t(\gamma) = \gamma_0 + \sum_{i=1}^{q} \gamma_i \cdot g_{t-i}(\gamma) + \sum_{i=1}^{r} \gamma_{q+i} \cdot m(x_{t-i}), \tag{4} \]
where $\gamma_i g_{t-i}(\gamma)$, $i = 1, \dots, q$, are the autoregressive terms and $m(x_{t-i})$ contains the exogenous variables that belong to the information set. Some examples of the CARES model
can be readily obtained as follows:
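Symmetric absolute value:
\[ g_t(\gamma) = \gamma_1 + \gamma_2 g_{t-1}(\gamma) + \gamma_3 |r_{t-1}| \]
Asymmetric slope:
\[ g_t(\gamma) = \gamma_1 + \gamma_2 g_{t-1}(\gamma) + \gamma_3 (r_{t-1})^{+} + \gamma_4 (r_{t-1})^{-} \]
Indirect GARCH(1,1):
\[ g_t(\gamma) = (\gamma_1 + \gamma_2 g_{t-1}^2(\gamma) + \gamma_3 r_{t-1}^2)^{1/2} \]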
Including the lagged returns as exogenous variables is a natural choice, as we would
indeed expect the VaR and ES to increase during the next period when the return
becomes either very negative or very positive. The first and third models allow the ES
to respond symmetrically to past returns, whereas the second model allows for variation
in the responses to positive and negative returns. As discussed in Engle and Manganelli
(2004), the indirect GARCH model will correctly specify the ES when the underlying
return data truly follow a GARCH(1,1) process with an i.i.d. error distribution. The
symmetric absolute value or asymmetric slope quantile specification will be correct when
the actual return follows a GARCH process in which the standard deviation rather than
the variance is modeled symmetrically or asymmetrically with i.i.d. errors. In the
following analysis, we primarily focus on these three example models.
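The three example specifications can be written as simple recursions over a return series. The sketch below is illustrative only: the function names, the initial value `g_init`, and the convention $(x)^+ = \max(x, 0)$, $(x)^- = -\min(x, 0)$ (the one used by Engle and Manganelli (2004)) are assumptions, and the indirect GARCH recursion returns the positive square root exactly as the formula is written, so the sign convention for a lower-tail ES is left to the user.

```python
import math

def symmetric_absolute_value(returns, gamma, g_init):
    """g_t = gamma1 + gamma2 * g_{t-1} + gamma3 * |r_{t-1}|."""
    g1, g2, g3 = gamma
    path = [g_init]
    for r in returns:                 # path[t] is built from returns[t - 1]
        path.append(g1 + g2 * path[-1] + g3 * abs(r))
    return path

def asymmetric_slope(returns, gamma, g_init):
    """g_t = gamma1 + gamma2 * g_{t-1} + gamma3 * (r_{t-1})^+ + gamma4 * (r_{t-1})^-."""
    g1, g2, g3, g4 = gamma
    path = [g_init]
    for r in returns:
        positive_part = max(r, 0.0)   # assumed convention: (x)^+ = max(x, 0)
        negative_part = -min(r, 0.0)  # assumed convention: (x)^- = -min(x, 0)
        path.append(g1 + g2 * path[-1] + g3 * positive_part + g4 * negative_part)
    return path

def indirect_garch(returns, gamma, g_init):
    """g_t = (gamma1 + gamma2 * g_{t-1}^2 + gamma3 * r_{t-1}^2)^(1/2)."""
    g1, g2, g3 = gamma
    path = [g_init]
    for r in returns:
        path.append(math.sqrt(g1 + g2 * path[-1] ** 2 + g3 * r * r))
    return path

# Hypothetical two-period example.
toy_returns = [1.0, -2.0]
sav_path = symmetric_absolute_value(toy_returns, (0.1, 0.5, 0.2), g_init=1.0)
as_path = asymmetric_slope(toy_returns, (0.1, 0.5, 0.2, 0.3), g_init=1.0)
ig_path = indirect_garch(toy_returns, (0.1, 0.5, 0.2), g_init=1.0)
print(sav_path, as_path, ig_path)
```

Each function returns the initial value followed by one fitted value per return, so the last element is the one-step-ahead value implied by the final observed return.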
2.2 Model Estimation
Next, we estimate the model parameters using a tail-based least-squares method. This
method minimizes the mean of the squared deviation between the return observations
beyond a certain quantile (VaR) and the estimated ES from the model. The resulting parameter estimates can be regarded as GMM-type estimators, and we derive the
asymptotic theory for these estimators under a GMM framework.
In practice, the tail-based least-squares method can be implemented using a two-stage
process. Assume that the $\tau$th quantile of a sample of return observations $y_1, \dots, y_T$ follows
the CAViaR model, that is,
\[ VaR_t(\tau) = f_t(\beta_0(\tau)), \tag{5} \]
where $f$ is assumed to be known up to the vector of parameters $\beta_0$, and that the corresponding
ES depends on another vector of parameters $\gamma_0$ such that
\[ ES_t(\tau) = g_t(\gamma_0(\tau)). \tag{6} \]
Then both $VaR_t$ and $ES_t$ are determined by the $\theta_0 = (\theta_{01}, \theta_{02}) = (\beta_0(\tau)', \gamma_0(\tau)')$ that
minimizes the loss function
\[ E[I(y_t < VaR_t(\tau)) \cdot (y_t - ES_t(\tau))^2]. \tag{7} \]
A standard method of obtaining the estimator for $\theta_0$, denoted as $\hat{\theta}$, is to minimize the
sample counterpart
\[ T^{-1} \sum_{t=1}^{T} I(y_t < VaR_t(\tau)) \cdot (y_t - ES_t(\tau))^2. \tag{8} \]
We solve this minimization problem using a two-stage procedure.
In the first stage, we estimate equation (5) by solving
\[ \min_{\beta} \frac{1}{T} \sum_{t=1}^{T} (\tau - I(y_t < VaR_t(\tau))) \cdot (y_t - VaR_t(\tau)) \tag{9} \]
to obtain $\hat{\beta}$ and $\widehat{VaR}_t(\tau)$. This standard quantile regression estimation is used in Engle and Manganelli (2004) to estimate the CAViaR model. In the second stage, the
estimated $\widehat{VaR}_t(\tau)$ is used as an observation in estimating the parameters of (6) by
solving
\[ \min_{\gamma} \frac{1}{T} \sum_{t=1}^{T} I(y_t < \widehat{VaR}_t(\tau)) \cdot (y_t - ES_t(\tau))^2. \tag{10} \]
This method can be regarded as a variation of the least-squares approach that focuses
only on the tail observations to minimize the sum of the squared estimation errors.
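The two-stage logic can be made concrete in the simplest possible case, in which both the VaR and the ES are constants rather than autoregressive processes. The sketch below is a hedged illustration under that assumption: the function names, the toy data and the grid search are hypothetical stand-ins for a numerical optimizer applied to the dynamic specifications.

```python
def check_loss(y, f, tau):
    """Sample check loss from the first stage: mean of (tau - I(y_t < f_t)) * (y_t - f_t)."""
    return sum((tau - (yt < ft)) * (yt - ft) for yt, ft in zip(y, f)) / len(y)

def tail_squared_loss(y, f_hat, g):
    """Second-stage loss: mean of I(y_t < f_hat_t) * (y_t - g_t)^2."""
    return sum((yt < ft) * (yt - gt) ** 2 for yt, ft, gt in zip(y, f_hat, g)) / len(y)

# Hypothetical toy returns -10, -9, ..., 9 and a constant-VaR, constant-ES model.
y = [float(v) for v in range(-10, 10)]
tau = 0.125
n = len(y)

# Stage 1: grid search over constant VaR candidates (the observed values).
var_hat = min(sorted(y), key=lambda c: check_loss(y, [c] * n, tau))

# Stage 2: grid search over constant ES candidates, holding the stage-1 VaR fixed.
es_grid = [-10.0 + 0.5 * i for i in range(39)]
es_hat = min(es_grid, key=lambda c: tail_squared_loss(y, [var_hat] * n, [c] * n))

print(var_hat, es_hat)
```

In this constant case the first stage recovers a sample quantile and the second stage recovers the average of the observations below it, which is the sense in which the second-stage estimator is a least-squares fit restricted to the tail.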
Alternatively, the parameters of equations (5) and (6) can be jointly estimated by solving
the two problems defined in (9) and (10) simultaneously. The two first-order conditions
involved, as specified below, yield moments that are useful in GMM estimation:
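\[
\begin{pmatrix}
\frac{1}{T} \sum_{t=1}^{T} \nabla_\beta f_t(\beta(\tau)) \cdot (\tau - I(y_t < f_t(\beta(\tau)))) = 0 \\
\frac{1}{T} \sum_{t=1}^{T} \nabla_\gamma g_t(\gamma(\tau)) \cdot (y_t - g_t(\gamma(\tau))) \cdot I(y_t < f_t(\beta(\tau))) = 0
\end{pmatrix},
\tag{11}
\]
where
\[ \nabla f_t(\beta) = \frac{d}{d\beta} f_t(\beta) \tag{12} \]
and
\[ \nabla g_t(\gamma) = \frac{d}{d\gamma} g_t(\gamma). \tag{13} \]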
Therefore, θ̂ is actually the resulting GMM estimator given the two moment conditions
presented above. The asymptotic distribution of θ̂ can then be established within the
GMM framework as follows.
Theorem 2.1 and Theorem 2.2 show that the GMM estimator θ̂ is consistent and asymptotically normal. The relevant assumptions and a detailed proof are provided in Appendix A.
Theorem 2.1. (Consistency) Under assumptions 5.1 and 5.2, we have
\[ \hat{\theta}(\tau) \xrightarrow{P} \theta_0(\tau) \]
as $T \to \infty$.
Proof. See Appendix A.
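Theorem 2.2. (Asymptotic normality) Given assumptions 5.3-5.5, we have
\[ \sqrt{T} (\hat{\theta} - \theta_0) \xrightarrow{D} N(0, \Sigma(\theta_0)) \tag{14} \]
as $T \to \infty$,
where $\Sigma(\theta_0) = D(\theta_0)^{-1} S(\theta_0) (D(\theta_0)^{-1})'$ with
\[
D(\theta_0) =
\begin{pmatrix} D_{11} & D_{12} \\ D_{21} & D_{22} \end{pmatrix}
=
\begin{pmatrix}
-E(\nabla_\beta f_t(\beta_0(\tau)) \nabla_\beta f_t(\beta_0(\tau))' h(0)) & 0 \\
E(\nabla_\gamma g_t(\gamma_0(\tau)) \nabla_\beta f_t(\beta_0(\tau))' (f_t(\beta_0(\tau)) - g_t(\gamma_0(\tau))) h(0)) & -E(\nabla_\gamma g_t(\gamma_0(\tau)) \nabla_\gamma g_t(\gamma_0(\tau))' \tau)
\end{pmatrix}
\tag{15}
\]
and
\[
S(\theta_0) =
\begin{pmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{pmatrix}
=
\begin{pmatrix}
\tau (1 - \tau) E(\nabla_\beta f_t(\beta_0(\tau)) \nabla_\beta f_t(\beta_0(\tau))') & 0 \\
0 & E(\nabla_\gamma g_t(\gamma_0(\tau)) \nabla_\gamma g_t(\gamma_0(\tau))') \cdot TV
\end{pmatrix},
\tag{16}
\]
where $h(\cdot)$ is the density function and $TV = E((y_t - g_t(\gamma_0(\tau)))^2 \cdot I(y_t - f_t(\beta_0(\tau)) < 0))$.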
Proof. See Appendix A. The basic concept is that we approximate the (discontinuous)
gradient of the objective function as its continuously differentiable expectation and then
relate this approximation to the asymptotic first-order condition to set the approximation of the gradient asymptotically equal to zero. This approach enables the use of the
standard Taylor expansion to derive the asymptotic theory for the model parameter estimators. The method used to obtain such an approximation is provided by the theorem
of Huber (1967). This technique is widely used in quantile and expectile regression; see
Engle and Manganelli (2004) and Kuan, Yeh, and Hsu (2009) for recent applications.
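Under the assumptions and conditions of Theorem 2.1 and Theorem 2.2, the asymptotic
variance-covariance matrix $\Sigma(\theta)$ can be consistently estimated as $\hat{\Sigma}(\theta) = \hat{D}(\theta)^{-1} \hat{S}(\theta) (\hat{D}(\theta)^{-1})'$,
where
\[
\hat{D}(\theta) = \begin{bmatrix} \hat{D}_{11}(\theta) & \hat{D}_{12}(\theta) \\ \hat{D}_{21}(\theta) & \hat{D}_{22}(\theta) \end{bmatrix},
\]
\[
\hat{D}_{11}(\theta) = -\frac{1}{2 T c_T} \sum_{t=1}^{T} \nabla_\beta f_t(\hat{\beta}(\tau)) \nabla_\beta f_t(\hat{\beta}(\tau))' \, I(|y_t - f_t(\hat{\beta}(\tau))| < c_T),
\]
\[
\hat{D}_{12}(\theta) = 0,
\]
\[
\hat{D}_{21}(\theta) = \frac{1}{2 T c_T} \sum_{t=1}^{T} \nabla_\gamma g_t(\hat{\gamma}(\tau)) \nabla_\beta f_t(\hat{\beta}(\tau))' \, (f_t(\hat{\beta}(\tau)) - g_t(\hat{\gamma}(\tau))) \, I(|y_t - f_t(\hat{\beta}(\tau))| < c_T),
\]
\[
\hat{D}_{22}(\theta) = -\frac{1}{T} \sum_{t=1}^{T} \nabla_\gamma g_t(\hat{\gamma}(\tau)) \cdot \nabla_\gamma g_t(\hat{\gamma}(\tau))' \, \tau,
\]
\[
\hat{S}(\theta) = \begin{bmatrix} \hat{S}_{11}(\theta) & \hat{S}_{12}(\theta) \\ \hat{S}_{21}(\theta) & \hat{S}_{22}(\theta) \end{bmatrix},
\]
\[
\hat{S}_{11}(\theta) = \frac{1}{T} \sum_{t=1}^{T} \tau (1 - \tau) \nabla_\beta f_t(\hat{\beta}(\tau)) \nabla_\beta f_t(\hat{\beta}(\tau))',
\]
\[
\hat{S}_{12}(\theta) = 0, \qquad \hat{S}_{21}(\theta) = 0,
\]
\[
\hat{S}_{22}(\theta) = \frac{1}{T} \sum_{t=1}^{T} \nabla_\gamma g_t(\hat{\gamma}(\tau)) \nabla_\gamma g_t(\hat{\gamma}(\tau))' \, (y_t - g_t(\hat{\gamma}(\tau)))^2 \cdot I(y_t - f_t(\hat{\beta}(\tau)) < 0),
\]
where $c_T$ is a bandwidth. We can compute the bandwidth $c_T$ in several ways. First, we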
follow Engle and Manganelli (2004) in estimating cT using a k-nearest neighbor estimator
with k = 40 for 1% coverage probability and k = 60 for 5% coverage probability.
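Alternatively, we follow Koenker (2005) to estimate $c_T$ as
\[ c_T = \hat{s} \, (\Phi^{-1}(\tau + h_T) - \Phi^{-1}(\tau - h_T)), \tag{17} \]
where $\hat{s} = \min(SD(y_t - f_t(\hat{\beta})), IQR(y_t - f_t(\hat{\beta})))/1.34$, and
\[ h_T = T^{-1/5} \left[ \frac{4.5 \, \phi^4(\Phi^{-1}(\tau))}{(2 \Phi^{-1}(\tau)^2 + 1)^2} \right]^{1/5} \]
(see Bofinger (1975)) or
\[ h_T = T^{-1/3} \, \Phi^{-1}(1 - 0.025)^{2/3} \left[ \frac{1.5 \, \phi^2(\Phi^{-1}(\tau))}{(2 \Phi^{-1}(\tau)^2 + 1)^2} \right]^{1/3} \]
(see Hall and Sheather (1988)).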
Next, we perform a small simulation study to examine the finite-sample performance
of the model parameter estimators. In doing so, we assume that an asset or portfolio's
return follows a GARCH(1,1) model:
\[ r_t = \sigma_t z_t, \qquad \sigma_t^2 = a_0 + a_1 r_{t-1}^2 + a_2 \sigma_{t-1}^2, \]
where the parameters are set to $a_0 = 0.025$, $a_1 = 0.0500$, and $a_2 = 0.9250$ and the disturbance $z_t$ follows a standard Gaussian distribution. Based on the relationship between
the conditional VaR/ES and the standard deviation of the return, as shown in Section 2.1,
the true values of the parameters of the CARES model with the indirect GARCH(1,1)
specification are implied to be $\beta_0 = a_0 (\Phi^{-1}(\tau))^2$, $\beta_1 = a_2$, $\beta_2 = a_1 (\Phi^{-1}(\tau))^2$, $\gamma_0 =
a_0 (-\phi(\Phi^{-1}(\tau))/\tau)^2$, $\gamma_1 = a_2$, and $\gamma_2 = a_1 (-\phi(\Phi^{-1}(\tau))/\tau)^2$, where $\Phi$ and $\phi$ are the cumulative distribution function and probability density function, respectively, of the standard
Gaussian distribution and $\tau$ is the coverage probability. See Appendix B for more details
on the derivation. We then generate 10,000 samples from the GARCH(1,1) model with
sample sizes of 1,000, 2,000, 5,000 and 10,000. The initial return and volatility values
are drawn from their unconditional distributions. For each sample, we estimate the parameters of the indirect GARCH(1,1) CARES model when the coverage probability is
5% or 1%. The means and standard deviations of these estimators computed based on
10,000 simulation iterations are reported in Table 1.
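The implied true parameter values can be reproduced directly from the stated mapping. The following minimal sketch uses only the Python standard library; the function name and dictionary keys are hypothetical labels for the parameters defined above.

```python
from statistics import NormalDist

def implied_cares_parameters(a0, a1, a2, tau):
    """Map GARCH(1,1) parameters to the implied indirect-GARCH VaR and ES parameters."""
    std = NormalDist()
    z = std.inv_cdf(tau)                          # Gaussian quantile
    var_scale = z ** 2                            # squared VaR multiplier
    es_scale = (-std.pdf(z) / tau) ** 2           # squared ES multiplier
    return {
        "beta0": a0 * var_scale, "beta1": a2, "beta2": a1 * var_scale,
        "gamma0": a0 * es_scale, "gamma1": a2, "gamma2": a1 * es_scale,
    }

params_5pct = implied_cares_parameters(0.025, 0.05, 0.925, tau=0.05)
params_1pct = implied_cares_parameters(0.025, 0.05, 0.925, tau=0.01)
for name, value in params_5pct.items():
    print(name, round(value, 4))
```

Rounded to four decimals, these values should match the "True Parameter" entries of Table 1 for both coverage probabilities.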
Table 1: Finite-Sample Properties of Each Parameter Estimator in the CARES Model

Panel A: τ = 0.05
                                              Sample Size
                                              Mean Estimated Parameter (Standard Deviation)
True Parameter                                T = 1000    T = 2000    T = 5000    T = 10000
β0 = a0 (Φ^(−1)(τ))^2 = 0.0676                0.0790      0.0743      0.0703      0.0685
                                              (0.0470)    (0.0329)    (0.0194)    (0.0140)
                                              [0.0356]    [0.0314]    [0.0192]    [0.0123]
β1 = a2 = 0.9250                              0.9219      0.9231      0.9246      0.9248
                                              (0.0248)    (0.0173)    (0.0108)    (0.0086)
                                              [0.0211]    [0.0111]    [0.0102]    [0.0082]
β2 = a1 (Φ^(−1)(τ))^2 = 0.1353                0.1310      0.1332      0.1333      0.1348
                                              (0.0561)    (0.0388)    (0.0248)    (0.0171)
                                              [0.0402]    [0.0285]    [0.0214]    [0.0141]
γ0 = a0 (−φ(Φ^(−1)(τ))/τ)^2 = 0.1064          0.2119      0.1555      0.1216      0.1139
                                              (0.1881)    (0.0914)    (0.0432)    (0.0283)
                                              [0.1054]    [0.0825]    [0.0394]    [0.0223]
γ1 = a2 = 0.9250                              0.8924      0.9093      0.9199      0.9223
                                              (0.0496)    (0.0264)    (0.0147)    (0.0103)
                                              [0.0309]    [0.0213]    [0.0134]    [0.0101]
γ2 = a1 (−φ(Φ^(−1)(τ))/τ)^2 = 0.2127          0.2382      0.2276      0.2186      0.2163
                                              (0.0938)    (0.0633)    (0.0401)    (0.0294)
                                              [0.0828]    [0.0529]    [0.0392]    [0.0261]
Note: This table reports the means and standard deviations of the CARES model parameter
estimators (when the coverage probability is 5%) computed from 10,000 simulation iterations
with sample sizes of T = 1,000, T = 2,000, T = 5,000 and T = 10,000 when the underlying return follows a GARCH(1,1) process. The standard deviations of the parameter estimators over
all 10,000 simulations and the average theoretical standard deviations implied by the asymptotic
theory of these parameters are presented in round and square brackets, respectively.
Panel B: τ = 0.01
                                              Sample Size
                                              Mean Estimated Parameter (Standard Deviation)
True Parameter                                T = 1000    T = 2000    T = 5000    T = 10000
β0 = a0 (Φ^(−1)(τ))^2 = 0.1353                0.1534      0.1531      0.1420      0.1398
                                              (0.0989)    (0.0815)    (0.0523)    (0.0366)
                                              [0.2360]    [0.1218]    [0.0685]    [0.0281]
β1 = a2 = 0.9250                              0.9248      0.9221      0.9239      0.9243
                                              (0.0279)    (0.0217)    (0.0144)    (0.0099)
                                              [0.0489]    [0.0250]    [0.0141]    [0.0099]
β2 = a1 (Φ^(−1)(τ))^2 = 0.2706                0.2492      0.2658      0.2684      0.2689
                                              (0.1395)    (0.0963)    (0.0611)    (0.0312)
                                              [0.1826]    [0.1175]    [0.0780]    [0.0302]
γ0 = a0 (−φ(Φ^(−1)(τ))/τ)^2 = 0.1776          0.2342      0.2139      0.1897      0.1852
                                              (0.2089)    (0.1382)    (0.0762)    (0.0406)
                                              [0.3049]    [0.2156]    [0.0950]    [0.0404]
γ1 = a2 = 0.9250                              0.9151      0.9191      0.9229      0.9238
                                              (0.0388)    (0.0144)    (0.0139)    (0.0096)
                                              [0.0711]    [0.0308]    [0.0217]    [0.0095]
γ2 = a1 (−φ(Φ^(−1)(τ))/τ)^2 = 0.3552          0.3564      0.3559      0.3555      0.3553
                                              (0.2177)    (0.1416)    (0.0860)    (0.0498)
                                              [0.4288]    [0.2618]    [0.1073]    [0.0470]
Note: This table reports the means and standard deviations of the CARES model parameter
estimators (when the coverage probability is 1%) computed from 10,000 simulation iterations
with sample sizes of T = 1,000, T = 2,000, T = 5,000 and T = 10,000 when the underlying return follows a GARCH(1,1) process. The standard deviations of the parameter estimators over
all 10,000 simulations and the average theoretical standard deviations implied by the asymptotic
theory of these parameters are presented in round and square brackets, respectively.
These results reveal several noteworthy observations. First, these estimators perform
well even when the sample size is moderate (T = 1,000). Their bias and standard deviations decline, as expected, with increasing sample size. The observation that each
parameter estimator converges to the true value of the parameter as T increases confirms the consistency of these estimators. Meanwhile, we calculate the average theoretical standard error of each parameter estimator (the values reported in square brackets
in Table 1) using the estimated values of the parameters from each simulation along
with the asymptotic theory described above and compare these theoretical values with
the standard deviation of each parameter estimator over all simulation iterations (the
values reported in parentheses in Table 1). The small deviation between the asymptotic and finite-sample standard errors confirms the validity of the asymptotic theory
derived above. Finally, to investigate the degree of efficiency loss in the CARES model
estimation, we alternatively compute the theoretical standard errors of the parameter
estimators in the CARES model based on the asymptotic standard errors of the above
GARCH(1,1) model parameters (a0, a1, and a2), with an appropriate scaling based on
the relationship between the GARCH(1,1) model parameters and the CARES model parameters. Using 10,000 samples with a sample size of T = 10,000, the implied standard
errors of the parameter estimators in the CARES model with respect to the GARCH(1,1)
model are computed to be 0.0124 for β0 , 0.0081 for β1 , 0.0134 for β2 , 0.0194 for γ0 , 0.0081
for γ1 and 0.0211 for γ2 when τ = 0.05 and are computed to be 0.0248 for β0 , 0.0081
for β1 , 0.0269 for β2 , 0.0325 for γ0 , 0.0081 for γ1 and 0.0353 for γ2 when τ = 0.01.
The closeness between these implied standard errors and those computed based on the
simulation suggests that our estimation procedure yields efficient estimates of the model
parameters.
3 Simulation Study
In this section, we present a series of simulation studies to illustrate the finite-sample
properties of the CARES model. In all cases, we examine the model performance in
terms of the one-step-ahead ES forecast, and performance is measured in terms of the
root mean squared error (RMSE). The RMSE of an ES forecast $\widehat{ES}$ from an arbitrary
model has the standard definition $\sqrt{E((ES - \widehat{ES})^2)}$, where ES is the true value of the
ES. Our one-step-ahead ES forecast procedure involves the following:
• Step 1. Return data are simulated with a sample size of T + 500 + 1, where T = 500,
T = 1,250, T = 2,500 or T = 5,000. The first 500 observations are discarded to
allow for a sufficiently long burn-in period.
• Step 2. With the remainder of the sample, the first T observations are used to fit
the three CARES models described in Section 2, and the (T + 1)th observation is
reserved for one-step-ahead out-of-sample forecast evaluation.
• Step 3. The above steps are repeated $10^4$ times, and the RMSE is approximated
as the square root of the average of all $10^4$ simulated values of $(ES - \widehat{ES})^2$.
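The three steps can be sketched as a small Monte Carlo loop. The code below is an illustration under several stated assumptions rather than the procedure used for the reported results: it uses the Gaussian GARCH(1,1) design, replaces the CARES fit with a plain historical-simulation forecast as a stand-in, starts the recursion at the unconditional variance, uses only 200 replications for speed, and computes the true ES from the known one-step-ahead volatility instead of drawing the (T + 1)th return. All function names are hypothetical.

```python
import math
import random
from statistics import NormalDist

def simulate_garch(n, a0, a1, a2, rng):
    """Simulate n Gaussian GARCH(1,1) returns; also return the next-period volatility."""
    variance = a0 / (1.0 - a1 - a2)        # start at the unconditional variance
    returns = []
    for _ in range(n):
        r = math.sqrt(variance) * rng.gauss(0.0, 1.0)
        returns.append(r)
        variance = a0 + a1 * r * r + a2 * variance
    return returns, math.sqrt(variance)    # variance now refers to period n + 1

def historical_es(returns, tau):
    """Stand-in forecast: average of the lowest tau share of past returns."""
    ordered = sorted(returns)
    k = max(1, int(tau * len(ordered)))
    return sum(ordered[:k]) / k

def rmse(true_values, forecasts):
    """Square root of the average squared forecast error."""
    squared = [(t - f) ** 2 for t, f in zip(true_values, forecasts)]
    return math.sqrt(sum(squared) / len(squared))

def monte_carlo_rmse(T, tau, replications, seed=0, burn_in=500):
    rng = random.Random(seed)
    std = NormalDist()
    es_multiplier = -std.pdf(std.inv_cdf(tau)) / tau
    truths, forecasts = [], []
    for _ in range(replications):
        returns, sigma_next = simulate_garch(burn_in + T, 0.025, 0.05, 0.925, rng)
        sample = returns[burn_in:]                  # Step 1: discard the burn-in
        forecasts.append(historical_es(sample[-250:], tau))   # Step 2: forecast
        truths.append(sigma_next * es_multiplier)   # true one-step-ahead ES
    return rmse(truths, forecasts)                  # Step 3: aggregate

hs_rmse = monte_carlo_rmse(T=500, tau=0.05, replications=200)
print(hs_rmse)
```

Swapping `historical_es` for a fitted CARES forecast would reproduce the structure of the comparison, with the replication count raised to $10^4$.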
We consider the following three experimental designs.
Design 1 A simple GARCH(1,1) model, as described in Section 2.2.
Design 2 A GARCH(1,1) model with time-varying skewness and kurtosis:
\[ r_t = \sigma_t z_t, \qquad \sigma_t^2 = a_0 + b_0^{+} (r_{t-1}^{+})^2 + b_0^{-} (r_{t-1}^{-})^2 + c_0 \sigma_{t-1}^2. \]
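The disturbance $z_t$ follows a generalized Student's t-distribution¹ with a time-varying
asymmetry parameter $\lambda_t$ and a time-varying tail-fatness parameter $\eta_t$, namely, $z_t \sim
GT(z_t \mid \eta_t, \lambda_t)$, where
\[
\begin{aligned}
\tilde{\eta}_t &= a_1 + b_1^{+} y_{t-1}^{+} + b_1^{-} y_{t-1}^{-} + c_1 \tilde{\eta}_{t-1}, \\
\tilde{\lambda}_t &= a_2 + b_2 y_{t-1} + c_2 \tilde{\lambda}_{t-1}, \\
\eta_t &= g_{[2,+30]}(\tilde{\eta}_t), \\
\lambda_t &= g_{[-1,1]}(\tilde{\lambda}_t),
\end{aligned}
\tag{18}
\]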
with g representing the logistic map. Following the S&P 500 stock index return analysis
of Jondeau and Rockinger (2003), we set the model parameters to $a_0 = 0.0074$, $b_0^{+} = 0.0384$,
$b_0^{-} = 0.0759$, $c_0 = 0.9366$, $a_1 = -0.5191$, $b_1^{+} = -0.5615$, $b_1^{-} = -0.0653$, $c_1 =
0.5999$, $a_2 = -0.0062$, $b_2 = 0.0626$, and $c_2 = 0.6961$. This model not only accommodates
a time-varying volatility but also allows for dynamics in higher-order moments: skewness
and kurtosis.
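1 The density of the generalized t distribution (GT) is defined as
\[
gt(z \mid \eta, \lambda) =
\begin{cases}
bc \left( 1 + \frac{1}{\eta - 2} \left( \frac{bz + a}{1 - \lambda} \right)^2 \right)^{-(\eta+1)/2} & \text{if } z < -a/b, \\
bc \left( 1 + \frac{1}{\eta - 2} \left( \frac{bz + a}{1 + \lambda} \right)^2 \right)^{-(\eta+1)/2} & \text{if } z \ge -a/b,
\end{cases}
\]
where $a \equiv 4 \lambda c \frac{\eta - 2}{\eta - 1}$, $b^2 \equiv 1 + 3\lambda^2 - a^2$, and $c \equiv \frac{\Gamma((\eta+1)/2)}{\sqrt{\pi(\eta-2)} \, \Gamma(\eta/2)}$.

[Figure 1: two panels plotting CAViaR-IGARCH-VaR against True VaR and CARES-IGARCH-ES against True ES.]
Figure 1: The ES forecasts of the CARES model vs. the true ES for a GARCH(1,1)-GAUSSIAN model

Design 3 A Markov switching stochastic volatility (MS-SV) model:
\[ y_t = \mu_{s_t} + \sigma_t u_t \tag{19} \]
\[ \sigma_t^2 = \omega_{s_t} + \alpha_{s_t} \varepsilon_{t-1}^2 + \beta_{s_t} \sigma_{t-1}^2 + s_t, \tag{20} \]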
where $s_t$ is an ergodic homogeneous Markov chain on a finite set $S = \{1, \dots, n\}$, with a
transition matrix P defined by the probabilities $\eta_{ij} = P(s_t = i \mid s_{t-1} = j)$. For simplicity,
we set n = 2 to reflect a two-regime switching SV model. Following the S&P 500
index analysis of Bauwens, Preminger, and Jeroen (2010), the parameters are set to
η11 = 0.979, η22 = 0.986, µ1 = 0.069, µ2 = −0.012, ω1 = 0.313, ω2 = 0.049, α1 = 0,
α2 = 0.055, β1 = 0, and β2 = 0.917.
To provide an initial visual impression of the performance of the CARES model, Figure 1 shows the 5% ES forecasts obtained using the CARES model (with the indirect
GARCH(1,1) specification) against the true values of the 5% ES when the sample size
is 2,000. We observe that the true value of the ES exhibits strong dynamic clustering
and that the 5% ES forecasts are able to properly capture this pattern.
For the sake of comparison, we also compute the RMSEs of the ES forecasts obtained
using three alternative methods: historical simulation (HS), a kernel density estimator
(KDE), and the CARE model of Taylor (2008). Assuming i.i.d. asset returns, the HS and
KDE approaches estimate empirical distributions of the return using past observations
to obtain the VaR and ES. When using HS, we vary the length of recent past observations from 250 to 500 to construct the empirical distribution. The ES determined using
the KDE approach takes the form $ES_{KDE} = (np)^{-1} \sum_{t=1}^{n} r_t G_h(VaR_{KDE} - r_t)$, where
$VaR_{KDE}$ is the kernel-based VaR estimator, $G_h(t) = G(t/h)$, $G(t) = \int_{-\infty}^{t} K(u) \, du$, and
K and h denote the standard Gaussian kernel and the optimal bandwidth, respectively.
Because the standard KDE estimator is known to be biased,² we use the jackknife technique to correct this bias. In addition, to better describe the time-varying features of
the return distribution, we also apply the exponentially weighted HS (EWHS) and KDE
(EWKDE) approaches, in which the empirical distribution is constructed from exponentially weighted past observations. Following Taylor (2008), we set the exponential decay
parameter λ to its optimal value to minimize the RMSE of the ES forecast.
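The KDE-based ES formula above can be sketched as follows. Several elements are assumptions made for illustration only: the kernel VaR is taken to be the solution of the smoothed empirical distribution equation $n^{-1} \sum_t G_h(v - r_t) = p$ and is found by bisection, the bandwidth uses a rule-of-thumb value instead of the optimal bandwidth, no jackknife correction is applied, and the toy sample is a deterministic grid of Gaussian quantiles.

```python
from statistics import NormalDist, stdev

STD = NormalDist()   # for a Gaussian kernel K, G is the standard normal CDF

def kde_var(returns, p, h, iterations=60):
    """Solve (1/n) * sum G((v - r_t) / h) = p for v by bisection."""
    n = len(returns)

    def smoothed_cdf(v):
        return sum(STD.cdf((v - r) / h) for r in returns) / n

    lo = min(returns) - 10.0 * h
    hi = max(returns) + 10.0 * h
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if smoothed_cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def kde_es(returns, p, h):
    """ES_KDE = (n p)^(-1) * sum r_t * G_h(VaR_KDE - r_t), without bias correction."""
    v = kde_var(returns, p, h)
    n = len(returns)
    return sum(r * STD.cdf((v - r) / h) for r in returns) / (n * p)

# Hypothetical toy sample: an evenly spaced grid of standard Gaussian quantiles.
n_obs = 1000
sample = [STD.inv_cdf((i + 0.5) / n_obs) for i in range(n_obs)]
bandwidth = 1.06 * stdev(sample) * n_obs ** (-0.2)   # rule-of-thumb bandwidth (assumption)
var_kde = kde_var(sample, 0.05, bandwidth)
es_kde = kde_es(sample, 0.05, bandwidth)
print(var_kde, es_kde)
```

For this toy sample the uncorrected estimate differs somewhat from the Gaussian value of the 5% ES (about −2.06), which is consistent with the bias noted above.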
In the CARE model, the function that describes the relationship between the coverage
probability of the expectile (α) and the coverage probability of the quantile (τ ) has a
closed form for the GARCH(1,1) design; thus, the true value of α is known for a given
value of τ . Therefore, in the GARCH(1,1) data-generating process (DGP), we obtain
ES forecasts using the CARE model for two scenarios to study the effect of estimating
α on the ES forecast: in one scenario, α is estimated (the estimator is denoted as α̂)
using a grid search method,³ and the other scenario uses the true value of α (denoted
by α0 ). For the other two simulation designs, because the true value of α is unknown,
we obtain ES forecasts using only the CARE model with estimated α.
The results of the 5% and 1% VaR and ES forecasts for the three simulation designs
are shown in Table 2, Table 3 and Table 4, respectively.⁴ The far left-hand column
contains the model names, and the remaining columns present the bias and RMSEs of
the one-step-ahead ES forecasts for sample sizes of T = 500, T = 1,250, T = 2,500
and T = 5,000. Regarding the outcome, the ranking of these methods in terms of the
RMSE is invariant with respect to the sample size. The CARES model with the indirect
GARCH specification yields the lowest RMSE, followed by the CARES model with the
other specifications, the CARE model with the true value of α, the CARE model with
estimated α, the exponentially weighted HS and KDE approaches, and standard HS and
KDE approaches, in that order.
2 See Theorem 2 of Chen (2008) for more details.
3 Following Taylor (2008), the optimal value of α is determined by estimating models with different
values on a grid with a step size of 0.0001.
4 Consistent with Taylor (2008), we find that the asymmetric-slope CARE model and CARES model
are outperformed by the other versions of the two models. Therefore, we do not report the results for
the asymmetric-slope CARE model and CARES model throughout the remainder of this analysis.
Although comparing the RMSEs provides an indication of the relative forecast accuracy,
it provides no information regarding whether any of the observed differences in performance are significant. To this end, we employ the test of equal predictive accuracy of
Diebold and Mariano (1995) (DM), and we use asterisks to indicate the results that are
found to be significant when the CARES model with the indirect GARCH specification is
used as the benchmark. The DM test results confirm that the CARES model, particularly that with the indirect GARCH specification, demonstrates significantly superior
performance compared with the other alternatives, except for the CARE model with
the true value of α. The significant reduction in RMSE observed for the CARES model
compared with the HS and KDE approaches reflects the benefit of directly modeling the
time-varying tail of the return distribution. The significant reduction in RMSE observed
for the CARES model compared with the CARE model with estimated α provides evidence that the error that arises as a result of estimating α significantly diminishes the
model’s forecasting performance. Unsurprisingly, the CARES model with the indirect
GARCH specification performs the best, as it correctly specifies the ES when the underlying return data truly follow a GARCH(1,1) process with an i.i.d. error distribution.
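The comparison against the benchmark can be sketched as a Diebold and Mariano (1995) test on squared forecast errors. This is a minimal sketch under stated assumptions: squared-error loss, one-step-ahead forecasts (so no autocorrelation correction of the variance), and a one-sided normal approximation. The function name `diebold_mariano` and the simulated errors are hypothetical; the exact variant used for the tables is not described in this section.

```python
import math

import numpy as np


def diebold_mariano(errors_model, errors_benchmark):
    """DM statistic and one-sided p-value under squared-error loss.

    A positive statistic means the model has a larger average squared
    error than the benchmark.
    """
    e1 = np.asarray(errors_model, dtype=float)
    e2 = np.asarray(errors_benchmark, dtype=float)
    d = e1 ** 2 - e2 ** 2                       # loss differential
    n = d.size
    dm_stat = d.mean() / math.sqrt(d.var(ddof=1) / n)
    # P(Z > dm_stat) for a standard normal Z.
    p_value = 0.5 * math.erfc(dm_stat / math.sqrt(2.0))
    return dm_stat, p_value


# Simulated forecast errors, for illustration only.
rng = np.random.default_rng(0)
benchmark_errors = rng.normal(0.0, 1.0, size=500)
competitor_errors = rng.normal(0.0, 2.0, size=500)
dm_stat, p_value = diebold_mariano(competitor_errors, benchmark_errors)
```

Under this sketch, an asterisk in the tables would correspond to a p-value below 0.05 for the competing model against the CARES-IG benchmark.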
To further understand the differences between our CARES model and the CARE model
of Taylor (2008), we apply the GMM Wald test to the demeaned simulated data to
empirically examine whether the relationship between the conditional quantile and ES
used in the CARE model holds. The test details are provided in Appendix C. We
employ the 5% significance level for the test; therefore, if the relationship holds, the
ideal rate of rejection of the null hypothesis should be approximately 5%. The test
results, based on 104 simulations, are reported in Table 5. While the rejection
rates are 0.0678 and 0.0923 in the GARCH(1,1)-GAUSSIAN model and the
GARCH(1,1) model with time-varying skewness and kurtosis, respectively, the NM-SV model yields
a higher rejection rate of 0.2874. These test results imply that the relationship used in
the CARE model holds when the underlying return follows a GARCH(1,1) process or a
GARCH(1,1) process with time-varying high-order moments but that the relationship
is violated when the return follows a more complicated NM-SV model.
4 Empirical Analysis
To evaluate our CARES model using real data, we perform a simple empirical study
to assess the ES of a stock index and two individual stocks. We apply several different
CARES model specifications and then evaluate both the in-sample and out-of-sample
forecasting performance of these specifications.
Table 2: VaR and ES Forecasts Obtained for Data from the GARCH(1,1)-GAUSSIAN Model

Panel A: 5% VaR
Model          T=500               T=1,250             T=2,500             T=5,000
                  Bias  RMSE          Bias  RMSE          Bias  RMSE          Bias  RMSE
HS(250)        -0.0039  0.2788*    -0.0031  0.2649*     0.0025  0.2840*    -0.0096  0.3106*
HS(500)        -0.0068  0.2736*    -0.0084  0.2529*    -0.0002  0.2643*    -0.0050  0.2850*
KDE            -0.0781  0.2784*    -0.0615  0.2564*    -0.0385  0.2584*    -0.0293  0.2485*
KDE-JK         -0.0074  0.2667*    -0.0111  0.2488*    -0.0004  0.2537*    -0.0004  0.2471*
EWHS(500)      -0.0018  0.2303*     0.0026  0.2184*     0.0132  0.2356*     0.0164  0.2214*
EWKDE          -0.0636  0.3064*    -0.0459  0.1961*    -0.0271  0.2147*    -0.0160  0.1997*
EWKDE-JK        0.0220  0.2759*     0.0042  0.1985*     0.0109  0.2221*     0.0128  0.2065*
CARE-SAV(α0)    0.0025  0.2020*     0.0069  0.1268*     0.0030  0.0973*     0.0035  0.0828*
CARE-IG(α0)     0.0193  0.2054*     0.0091  0.1142      0.0038  0.0804     -0.00008 0.0588
CARE-SAV(α̂)     0.0181  0.2121*     0.0394  0.1314*     0.0358  0.1026*     0.0857  0.0857*
CARE-IG(α̂)      0.0291  0.2058*     0.0385  0.1182*     0.0365  0.0873*     0.0626  0.0626*
CARES-SAV      -0.0096  0.2014*    -0.0045  0.1232*    -0.0039  0.0874*    -0.0003  0.0762*
CARES-IG        0.0058  0.1886      0.0064  0.1103      0.0019  0.0765      0.0050  0.0509

Panel B: 1% VaR
Model          T=500               T=1,250             T=2,500             T=5,000
                  Bias  RMSE          Bias  RMSE          Bias  RMSE          Bias  RMSE
HS(250)        -0.0278  0.4372*    -0.0171  0.4272*     0.0228  0.3918*    -0.0801  0.4047*
HS(500)        -0.0610  0.4196*    -0.0491  0.4000*    -0.0333  0.4232*    -0.1307  0.3451*
KDE            -0.1563  0.4267*    -0.1436  0.4034*    -0.1091  0.3820*    -0.1743  0.3594*
KDE-JK         -0.0525  0.4047*    -0.0733  0.3869*    -0.0562  0.3718*    -0.1347  0.3419*
EWHS(500)      -0.0170  0.4203*     0.0125  0.4249*     0.0275  0.4241*     0.0134  0.3237*
EWKDE          -0.0760  0.3372*    -0.0375  0.3593*    -0.0110  0.3632*     0.0217  0.3131*
EWKDE-JK        0.0339  0.3560*     0.0397  0.3861*     0.0456  0.3834*     0.0693  0.3339*
CARE-SAV(α0)    0.0454  0.3276*     0.0470  0.2307*     0.0626  0.2152*     0.0375  0.1291*
CARE-IG(α0)     0.0699  0.3294*     0.0501  0.2162*     0.0602  0.1761*     0.0412  0.1039*
CARE-SAV(α̂)     0.0572  0.3368*     0.0245  0.2336*     0.0259  0.2170*    -0.0124  0.1324*
CARE-IG(α̂)      0.0629  0.3348*     0.0275  0.2231*     0.0358  0.1944*    -0.0088  0.1115*
CARES-SAV      -0.0734  0.3329*    -0.0621  0.2208*    -0.0223  0.2307*    -0.0616  0.1269*
CARES-IG        0.0164  0.3047      0.0053  0.2023      0.0098  0.1667     -0.0058  0.0832

Panel A: 5% ES
Model          T=500               T=1,250             T=2,500             T=5,000
                  Bias  RMSE          Bias  RMSE          Bias  RMSE          Bias  RMSE
HS(250)        -0.0285  0.3476*    -0.0237  0.3359*    -0.0156  0.3522*    -0.0343  0.3896*
HS(500)        -0.0302  0.3468*    -0.0352  0.3285*    -0.0252  0.3352*    -0.0316  0.3646*
KDE             0.0523  0.3471*     0.0047  0.3225*    -0.0005  0.3267*    -0.0163  0.3134*
KDE-JK         -0.0482  0.3473*    -0.0630  0.3285*    -0.0500  0.3305*    -0.0514  0.3138*
EWHS(500)      -0.0017  0.2806*     0.0026  0.2184*     0.0087  0.2841*     0.0100  0.2736*
EWKDE           0.1257  0.3064*    -0.0459  0.1961*     0.0964  0.3014*     0.0823  0.2987*
EWKDE-JK        0.0220  0.2759*     0.0042  0.2685*     0.0444  0.2835*     0.0543  0.2543*
CARE-SAV(α0)    0.0031  0.2487*     0.0086  0.2008*     0.0037  0.1220*     0.0044  0.0934*
CARE-IG(α0)     0.0256  0.2333      0.0126  0.1590      0.0236  0.1084*     0.0288  0.0654
CARE-SAV(α̂)     0.0025  0.2662*     0.0110  0.1615*     0.0020  0.1228*    -0.0002  0.1009*
CARE-IG(α̂)      0.0325  0.3199*     0.2443  0.2032*     0.2414  0.2601*     0.2394  0.2499*
CARES-SAV      -0.0006  0.2556*    -0.0031  0.1578*    -0.0050  0.1233*    -0.0042  0.1023*
CARES-IG        0.0141  0.2398      0.0100  0.1559      0.0031  0.0942     -0.0004  0.0617

Panel B: 1% ES
Model          T=500               T=1,250             T=2,500             T=5,000
                  Bias  RMSE          Bias  RMSE          Bias  RMSE          Bias  RMSE
HS(250)        -0.0492  0.5292*    -0.0293  0.5055*     0.0109  0.5217*    -0.1029  0.5018*
HS(500)        -0.0780  0.5039*    -0.0543  0.4742*    -0.0338  0.5114*    -0.1752  0.4266*
KDE            -0.1399  0.5092*    -0.1584  0.4764*    -0.1062  0.3718*    -0.2325  0.4328*
KDE-JK          0.0230  0.5011*    -0.0585  0.4580*    -0.0591  0.3820*    -0.1830  0.4086*
EWHS(500)      -0.0038  0.5176*     0.0034  0.5184*     0.0087  0.5238*     0.0100  0.4736*
EWKDE           0.2624  0.5235*     0.2502  0.5229*     0.2628  0.5456*     0.2382  0.4702*
EWKDE-JK        0.0339  0.4387*     0.1288  0.4516*     0.1794  0.4947*     0.1731  0.4255*
CARE-SAV(α0)    0.0655  0.3859*     0.0281  0.2676*     0.0259  0.2452*    -0.0142  0.1429*
CARE-IG(α0)     0.2406  0.4247*     0.2032  0.3114*     0.2154  0.2829*     0.1601  0.1962*
CARE-SAV(α̂)     0.0453  0.3883*     0.0204  0.2711*     0.0263  0.2463*    -0.0182  0.1479*
CARE-IG(α̂)      0.2460  0.4279*     0.2124  0.3144*     0.2170  0.2834*     0.1855  0.2149*
CARES-SAV      -0.0010  0.3827*    -0.0181  0.2641*    -0.0096  0.2208*    -0.0662  0.1420*
CARES-IG        0.0189  0.3814      0.0356  0.2540      0.0322  0.2064     -0.0068  0.1268
Note: This table reports the bias and RMSEs of one-step-ahead ES forecasts obtained using CARES
models and other competing models when the underlying return follows a GARCH(1,1) process. Panels
A and B report the results obtained based on 5% and 1% coverage probabilities, respectively. HS(250)
and HS(500) denote the historical simulation method based on the most recent 250 and 500 observations. KDE and KDE-JK denote the kernel density estimator before and after jackknife bias correction.
CARE-SAV(α0 ) (CARE-SAV(α̂)) and CARE-IG(α0 ) (CARE-IG(α̂)) denote the CARE model with the
symmetric absolute value specification and the indirect GARCH(1,1) specification, respectively, when
α takes its true value (and when α is estimated). CARES-SAV and CARES-IG denote the CARES
model with the symmetric absolute value specification and the indirect GARCH(1,1) specification, respectively. We use the CARES-IG model as a benchmark (because it always yields the best performance with the smallest RMSE). Based on the test of Diebold and Mariano (1995) at the 5% level of
significance, an asterisk indicates that the RMSE of the ES forecast obtained using the corresponding
model is significantly larger than that of the benchmark for a sample size of 2,000.
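The historical simulation benchmarks HS(250) and HS(500) named in the note can be sketched as an empirical quantile and tail average over the most recent observations. This is a minimal sketch, assuming VaR is reported as the (negative) return quantile and ES as the average return at or below it; the function name `historical_simulation`, its parameters `tau` and `window`, and NumPy's default quantile interpolation are assumptions rather than the exact estimator used for the tables.

```python
import numpy as np


def historical_simulation(returns, tau=0.05, window=250):
    """Historical-simulation VaR and ES from the last `window` returns."""
    recent = np.asarray(returns, dtype=float)[-window:]
    var = np.quantile(recent, tau)       # empirical tau-quantile of returns
    tail = recent[recent <= var]         # observations at or beyond the VaR
    es = tail.mean()                     # average loss in the tail
    return var, es


# Evenly spaced toy returns from -5.0 to 4.9, for illustration only.
toy_returns = np.arange(-50, 50) / 10.0
var_5, es_5 = historical_simulation(toy_returns, tau=0.05, window=100)
```

Because every observation in the window receives equal weight, this estimator reacts slowly to changes in volatility, which is the behavior the exponentially weighted and autoregressive alternatives are meant to address.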
Table 3: ES Forecasts Obtained for Data from the GARCH(1,1) Model with Time-varying Skewness and Kurtosis

Panel A: 5% VaR
Model          T=500               T=1,250             T=2,500             T=5,000
                  Bias  RMSE          Bias  RMSE          Bias  RMSE          Bias  RMSE
HS(250)         0.0046  0.2739*    -0.0740  0.2186*    -0.0872  0.1898*     0.0880  0.2423*
HS(500)         0.0164  0.2569*    -0.0552  0.1603*    -0.0657  0.1416*     0.0831  0.2993*
KDE            -0.0058  0.2534*    -0.0768  0.1333*    -0.0491  0.0991*     0.0292  0.2584*
KDE-JK          0.0072  0.2535*    -0.0670  0.1291*    -0.0408  0.0956*     0.0349  0.2584*
EWHS(500)      -0.0228  0.2129*    -0.0864  0.1494*    -0.0627  0.1400*     0.0337  0.2219*
EWKDE          -0.0270  0.2029*    -0.0830  0.1406*    -0.0646  0.1352*     0.0386  0.2267*
EWKDE-JK       -0.0141  0.2035*    -0.0738  0.1388*    -0.0566  0.1411*     0.0459  0.2286*
CARE-SAV(α0)    0.0202  0.1900*    -0.0211  0.0940*    -0.0069  0.0614*     0.0639  0.1269*
CARE-IG(α0)     0.0237  0.1868*    -0.0307  0.1110*     0.0014  0.0721*     0.0642  0.1299*
CARE-SAV(α̂)    -0.0391  0.1916*    -0.0457  0.1044*    -0.0107  0.0659*     0.0315  0.1323*
CARE-IG(α̂)     -0.0338  0.1898*    -0.0448  0.1414*    -0.0158  0.0746*     0.0141  0.1344*
CARES-SAV       1.2074  0.1818*    -0.0284  0.0937*    -0.0194  0.0593*     0.0308  0.1279*
CARES-IG       -0.0623  0.1747     -0.0117  0.0752     -0.0237  0.0629      0.0233  0.0821

Panel B: 1% VaR
Model          T=500               T=1,250             T=2,500             T=5,000
                  Bias  RMSE          Bias  RMSE          Bias  RMSE          Bias  RMSE
HS(250)        -0.2568  1.3416*    -0.0123  0.6166*    -0.0158  0.4188*     0.0408  0.5430*
HS(500)        -0.1472  0.6154*     0.0285  0.4779*     0.0577  0.3902*     0.0018  0.6519*
KDE            -0.1578  0.6183*     0.0920  0.4017*     0.0359  0.2936*    -0.0756  0.3981*
KDE-JK         -0.1392  0.6080*     0.1004  0.4036*     0.0416  0.2929*    -0.0725  0.3980*
EWHS(500)      -0.2133  0.6011*    -0.2603  0.3982*    -0.4463  0.2066*     0.1025  0.3664*
EWKDE          -0.2133  0.6044*    -0.1440  0.3147*     0.0234  0.1562*     0.1049  0.3519*
EWKDE-JK       -0.2070  0.6268*    -0.1368  0.3181*     0.0089  0.1478*     0.1114  0.3592*
CARE-SAV(α0)    0.1802  0.4569*     0.0894  0.3872*     0.1005  0.3688*     0.0909  0.5007*
CARE-IG(α0)     0.2083  0.5386*     0.1386  0.4617*     0.1062  0.3503*    -0.0045  0.4703*
CARE-SAV(α̂)     0.0457  0.6057*     0.0920  0.3961*    -0.0063  0.4041*     0.0536  0.5864*
CARE-IG(α̂)      0.0763  0.5495*     0.1926  0.4704*     0.0383  0.3666*     0.0503  0.5169*
CARES-SAV      -0.5070  0.4523*     0.1345  0.4315*     0.0641  0.3132*    -0.0235  0.2894*
CARES-IG       -0.3474  0.4315      0.1372  0.3208      0.1148  0.3032      0.0196  0.2058

Panel A: 5% ES
Model          T=500               T=1,250             T=2,500             T=5,000
                  Bias  RMSE          Bias  RMSE          Bias  RMSE          Bias  RMSE
HS(250)         0.0529  0.3988*    -0.1827  0.5105*    -0.1134  0.4584*     0.1439  0.3871*
HS(500)         0.0692  0.3727*    -0.0749  0.5017*    -0.1307  0.3756*     0.1829  0.4958*
KDE            -0.0058  0.3534*    -0.0788  0.3150*    -0.0558  0.3011*     0.0557  0.4867*
KDE-JK          0.0072  0.3535*    -0.0873  0.3178*    -0.0624  0.3027*     0.0514  0.4863*
EWHS(500)       0.0374  0.3129*    -0.0797  0.2911*    -0.0162  0.3777*    -0.0483  0.3767*
EWKDE          -0.0270  0.3029*    -0.0785  0.2471*    -0.0646  0.2439*     0.0502  0.2471*
EWKDE-JK       -0.0141  0.3035*    -0.0138  0.2388*     0.0326  0.2389*     0.0459  0.1286*
CARE-SAV(α0)    0.0636  0.2113*    -0.0450  0.1693*     0.0001  0.1202*     0.0522  0.1917*
CARE-IG(α0)     0.3006  0.2734*     0.2134  0.2699*     0.2507  0.2841*     0.3191  0.2045*
CARE-SAV(α̂)     0.0350  0.2153*    -0.0382  0.1713*    -0.0129  0.1283*     0.0569  0.2011*
CARE-IG(α̂)      0.3252  0.3036*     0.2088  0.2884*     0.2350  0.2959*     0.3100  0.3823*
CARES-SAV       0.0332  0.2712     -0.0521  0.1695      0.0043  0.1143      0.0258  0.1905*
CARES-IG       -0.0102  0.2709     -0.0316  0.1663     -0.0046  0.1295*     0.0057  0.1788

Panel B: 1% ES
Model          T=500               T=1,250             T=2,500             T=5,000
                  Bias  RMSE          Bias  RMSE          Bias  RMSE          Bias  RMSE
HS(250)        -0.3835  2.6997*    -0.0943  1.2984*    -0.0158  1.2188*     0.0783  1.1130*
HS(500)        -0.1669  2.6212*     0.1486  1.1872*     0.0577  1.1902*     0.2824  0.9883*
KDE            -0.1651  2.7188*     0.1514  1.2419*     0.2541  0.9521*    -0.0353  1.0147*
KDE-JK         -0.1721  2.7189*     0.1458  1.2404*     0.2506  0.9506*    -0.0379  1.0153*
EWHS(500)       0.3567  2.1214*    -0.1628  1.5039*    -0.4717  0.9540*     0.2704  0.9223*
EWKDE           0.3186  2.0349*     0.3126  1.1220*    -0.0099  0.9963*     0.5004  0.9223*
EWKDE-JK        0.3057  2.0294*     0.3064  1.1175*    -0.0139  0.9937*     0.4958  0.9177*
CARE-SAV(α0)    0.2909  1.0061*     0.1540  0.9652*     0.1661  0.6587*     0.1575  0.8471*
CARE-IG(α0)     0.7468  1.0312*     0.7007  0.9650*     0.6848  0.8826*     0.4970  0.8235*
CARE-SAV(α̂)     0.2636  1.0123*     0.0920  0.9961*     0.1749  0.6765*     0.1937  0.9612*
CARE-IG(α̂)      0.6493  1.0466*     0.1926  0.9704*     0.6526  0.8697*     0.6117  0.9568*
CARES-SAV       0.2090  1.0503      0.1001  0.8788      0.1926  0.6132      0.1294  0.5919
CARES-IG       -2.0151  1.1230*     0.1327  0.8995*     0.1031  0.8470*     0.1975  0.7501*
Note: This table reports the bias and RMSEs of one-step-ahead ES forecasts obtained using CARES
models and other competing models when the underlying return follows a GARCH(1,1) process with
time-varying skewness and kurtosis. Panels A and B report the results obtained based on 5% and
1% coverage probabilities, respectively. HS(250) and HS(500) denote the historical simulation method
based on the most recent 250 and 500 observations. KDE and KDE-JK denote the kernel density estimator before and after jackknife bias correction. CARE-SAV(α0 ) (CARE-SAV(α̂)) and CARE-IG(α0 )
(CARE-IG(α̂)) denote the CARE model with the symmetric absolute value specification and the indirect GARCH(1,1) specification, respectively, when α takes its true value (and when α is estimated).
CARES-SAV and CARES-IG denote the CARES model with the symmetric absolute value specification and the indirect GARCH(1,1) specification, respectively. We use the CARES-IG model as a benchmark (because it always yields the best performance with the smallest RMSE). Based on the test of
Diebold and Mariano (1995) at the 5% level of significance, an asterisk indicates that the RMSE of the
ES forecast obtained using the corresponding model is significantly larger than that of the benchmark
for a sample size of 2,000.
Table 4: ES Forecasts Obtained for Data from the MS(2)-SV Model

Panel A: 5% VaR
Model          T=500               T=1,250             T=2,500             T=5,000
                  Bias  RMSE          Bias  RMSE          Bias  RMSE          Bias  RMSE
HS(250)        -0.0644  0.4951*    -0.0577  0.4476*     0.2011  0.6667*    -0.3133  0.6281*
HS(500)        -0.0644  0.5093*    -0.0365  0.5110*     0.1337  0.7755*    -0.2351  0.6214*
KDE            -0.1276  0.5107*    -0.1779  0.5798*     0.0090  0.7101*    -0.1328  0.6130*
KDE-JK         -0.0804  0.5025*    -0.1423  0.5717*     0.0343  0.7097*    -0.1125  0.6143*
EWHS(500)      -0.1027  0.4925*     0.0039  0.3367*     0.0600  0.3945*    -0.2546  0.5998*
EWKDE-JK       -0.0656  0.4821*     0.0278  0.3111*     0.0723  0.3943*    -0.2664  0.5084*
EWKDE          -0.0093  0.3784*    -0.0190  0.3066*     0.0410  0.3922*    -0.2835  0.5448*
CARE-SAV(α0)   -0.0107  0.3866*    -0.0083  0.1619*    -0.0805  0.2597*    -0.0518  0.3030*
CARE-IG(α0)    -0.0058  0.3787*    -0.0120  0.1758*    -0.0951  0.2481*    -0.0604  0.3062*
CARE-SAV(α̂)     0.0736  0.4050*     0.1108  0.2055*     0.0578  0.2620*     0.0793  0.3068*
CARE-IG(α̂)      0.0807  0.3924*     0.1073  0.2154*     0.0410  0.2494*     0.0701  0.3086*
CARES-SAV       0.0472  0.3618*    -0.0166  0.1563     -0.1230  0.2294*    -0.0402  0.3027*
CARES-IG        0.0265  0.3550      0.0261  0.1529     -0.1070  0.2177     -0.0235  0.2942

Panel B: 1% VaR
Model          T=500               T=1,250             T=2,500             T=5,000
                  Bias  RMSE          Bias  RMSE          Bias  RMSE          Bias  RMSE
HS(250)        -0.6657  1.1556*    -0.3867  0.7811*    -0.2706  1.0834*    -0.6142  0.9532*
HS(500)        -0.5727  1.0228*    -0.3822  0.8264*    -0.3195  0.9750*     0.5685  0.9458*
KDE            -0.6384  1.0630*    -0.5463  1.0023*    -0.5524  0.8960*    -0.4977  0.8435*
KDE-JK         -0.5743  1.0156*    -0.5103  0.9899*    -0.5240  0.8792*    -0.4760  0.8288*
EWHS(500)      -0.4753  0.8988*    -0.1641  0.6246*     0.0772  0.8174*    -0.4112  0.8710*
EWKDE          -0.4492  0.9742*    -0.1791  0.5006*     0.0157  0.9036*    -0.4012  0.8143*
EWKDE-JK       -0.4183  0.8586*    -0.1129  0.4987*     0.0664  0.7799*    -0.3808  0.8134*
CARE-SAV(α0)   -0.0900  0.5761*    -0.0797  0.3158*    -0.0321  0.3253*    -0.1737  0.4383*
CARE-IG(α0)    -0.1294  0.5489*    -0.1024  0.3165*    -0.0643  0.3216*    -0.1803  0.4518*
CARE-SAV(α̂)    -0.1010  0.5902*    -0.0075  0.3304*     0.0620  0.3353*    -0.0610  0.5122*
CARE-IG(α̂)     -0.0713  0.6105*    -0.0238  0.3423*     0.0313  0.3232*    -0.0699  0.5219*
CARES-SAV      -0.6131  0.5531*    -0.2709  0.3083*    -0.2766  0.3220*    -0.1702  0.4287
CARES-IG       -0.1300  0.5234     -0.0696  0.2650     -0.0932  0.3178     -0.1067  0.4231

Panel A: 5% ES
Model          T=500               T=1,250             T=2,500             T=5,000
                  Bias  RMSE          Bias  RMSE          Bias  RMSE          Bias  RMSE
HS(250)        -0.2156  0.6750*    -0.2639  0.6069*     0.1080  0.8938*    -0.4991  0.8217*
HS(500)        -0.2055  0.6505*    -0.2336  0.6566*     0.0320  0.8641*    -0.4325  0.8033*
KDE            -0.1811  0.6437*    -0.3332  0.7887*    -0.1333  0.8883*    -0.3200  0.7915*
KDE-JK         -0.0804  0.6625*    -0.3727  0.8040*    -0.1620  0.8939*    -0.1125  0.7884*
EWHS(500)       0.0472  0.4925*    -0.1098  0.4256*    -0.0096  0.5122*    -0.3715  0.7155*
EWKDE          -0.0656  0.4821*    -0.0854  0.4166*     0.0894  0.5200*    -0.3039  0.7059*
EWKDE-JK        0.0106  0.4772*    -0.0322  0.4137*     0.0508  0.5184*    -0.3012  0.6977*
CARE-SAV(α0)   -0.0160  0.4894*    -0.0117  0.2047*    -0.1032  0.3286*    -0.0672  0.3817*
CARE-IG(α0)     0.1855  0.4862*     0.1747  0.2911*     0.1079  0.3177*     0.1271  0.3871*
CARE-SAV(α̂)    -0.0099  0.4899*    -0.0144  0.2426*    -0.1185  0.3351*    -0.0725  0.3838*
CARE-IG(α̂)      0.2329  0.4883*     0.2071  0.3412*     0.1777  0.3583*     0.1991  0.4179*
CARES-SAV      -0.0106  0.4774*    -0.0561  0.2166*    -0.1153  0.3123*    -0.0912  0.3735*
CARES-IG       -0.0149  0.4662     -0.0177  0.2084     -0.1367  0.3058     -0.0759  0.3521

Panel B: 1% ES
Model          T=500               T=1,250             T=2,500             T=5,000
                  Bias  RMSE          Bias  RMSE          Bias  RMSE          Bias  RMSE
HS(250)        -0.8166  1.4289*    -0.5454  0.9582*    -0.3653  0.9564*    -0.7935  0.9532*
HS(500)        -0.7549  1.2389*    -0.5080  0.9570*    -0.4224  0.9503*    -0.6789  0.9458*
KDE            -0.7009  1.1809*    -0.6526  1.1578*    -0.7333  1.1352*    -0.6692  1.0521*
KDE-JK         -0.7858  1.2469*    -0.6970  1.1781*    -0.7652  1.1547*    -0.6929  1.0669*
EWHS(500)      -0.3421  0.9210*    -0.5124  0.6890*    -0.0914  0.8594*    -0.6367  1.0520*
EWKDE          -0.4183  0.9742*    -0.2741  0.6423*     0.0415  0.9036*    -0.3833  0.8829*
EWKDE-JK       -0.3067  0.9148*    -0.1930  0.6355*     0.1065  0.9298*    -0.3524  0.8713*
CARE-SAV(α0)   -0.1040  0.6525*    -0.0922  0.3801*    -0.0379  0.3711*    -0.2007  0.5694*
CARE-IG(α0)     0.0146  0.5925*     0.0397  0.3679*     0.0850  0.3883*    -0.0384  0.5549*
CARE-SAV(α̂)    -0.1797  0.6622*    -0.1040  0.3813*    -0.0422  0.3743*    -0.1906  0.5899*
CARE-IG(α̂)      0.0487  0.6529*    -0.1428  0.3625*     0.1348  0.3944*     0.0246  0.5173*
CARES-SAV      -0.5890  0.6708*    -0.1342  0.3552*    -0.2766  0.3981*    -0.2573  0.5615*
CARES-IG       -0.1642  0.6324     -0.0417  0.3314     -0.0932  0.3471     -0.2076  0.5085
Note: This table reports the bias and RMSEs of one-step-ahead ES forecasts obtained using CARES
models and other competing models when the underlying return follows the MS(2)-SV model. Panels
A and B report the results obtained based on 5% and 1% coverage probabilities, respectively. HS(250)
and HS(500) denote the historical simulation method based on the most recent 250 and 500 observations. KDE and KDE-JK denote the kernel density estimator before and after jackknife bias correction.
CARE-SAV(α0 ) (CARE-SAV(α̂)) and CARE-IG(α0 ) (CARE-IG(α̂)) denote the CARE model with the
symmetric absolute value specification and the indirect GARCH(1,1) specification, respectively, when
α takes its true value (and when α is estimated). CARES-SAV and CARES-IG denote the CARES
model with the symmetric absolute value specification and the indirect GARCH(1,1) specification, respectively. We use the CARES-IG model as a benchmark (because it always yields the best performance with the smallest RMSE). Based on the test of Diebold and Mariano (1995) at the 5% level of
significance, an asterisk indicates that the RMSE of the ES forecast obtained using the corresponding
model is significantly larger than that of the benchmark for a sample size of 2,000.
Table 5: CARE Model Specification Test
Model                                                        Rejection Rate
GARCH(1,1)-GAUSSIAN model                                    0.0678
GARCH(1,1) model with time-varying skewness and kurtosis     0.0963
MS(2)−GARCH model                                            0.2874
Note: This table reports the specification test (GMM Wald test) results for the
CARE model obtained from simulations of 1,000 samples based on the GARCH(1,1)-GAUSSIAN
model, the GARCH(1,1) model with time-varying skewness and kurtosis,
and the MS(2)−GARCH model with a sample size of 2,500. We perform the test at the
5% significance level; therefore, the ideal rejection rate should be 5%.
4.1 Data
We consider two individual stocks, General Motors (GM) and IBM, and one stock index,
S&P 500, to conduct this empirical study. Following Engle and Manganelli (2004),
we first obtain a sample of 3,392 daily prices from Datastream for each study target,
spanning the period from April 7, 1986, to April 7, 1999. This sample is useful for
testing whether the ES estimates produced by our CARES model can provide the same
risk indication provided by the VaR estimates produced by the CAViaR model of Engle
and Manganelli (2004). Second, we obtain a recent sample of daily prices from Wharton
Research Data Services (WRDS) for the above two stocks and one index, ranging from
January 1, 2005, to December 31, 2011. This sample period overlaps with the recent
global financial crisis to allow us to study the ability of our model to adapt to new
risk environments. The daily returns are calculated as 100 times the difference in the
logarithms of the prices.
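The return definition above can be sketched in a few lines. This is a minimal illustration; the function name `percent_log_returns` and the toy prices are hypothetical.

```python
import numpy as np


def percent_log_returns(prices):
    """Daily returns as 100 times the first difference of log prices."""
    prices = np.asarray(prices, dtype=float)
    return 100.0 * np.diff(np.log(prices))


# Made-up price path, for illustration only.
toy_prices = [100.0, 110.0, 99.0]
toy_percent_returns = percent_log_returns(toy_prices)
```

A series of n prices yields n - 1 returns, since the first price has no predecessor.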
4.2 Empirical Results
For the first sample, we use the first 2,892 observations to establish the CARES models
and reserve the last 500 observations for one-step-ahead out-of-sample forecasting. The
5% VaR and ES forecasts for GM are plotted in Figure 2, and the model estimation
results are reported in Table 6 and Table 7. As expected, the VaR results are similar to
those reported in Engle and Manganelli (2004); consistent with Engle and Manganelli
(2004), the always significant coefficient of the autoregressive term (β2 ) in both the VaR
and ES models confirms that the phenomenon of volatility clustering is also relevant in
the tails.
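The role of the autoregressive coefficient can be illustrated with the symmetric absolute value recursion of Engle and Manganelli (2004), f_t = β1 + β2 f_{t−1} + β3 |y_{t−1}|. The sketch below is an assumption-laden illustration: it treats the risk measure as a positive magnitude, uses made-up coefficients rather than the estimates in the tables, and the function name `sav_recursion` is hypothetical. Applying the same functional form to the ES with coefficients (γ1, γ2, γ3) is likewise an assumption, since the exact CARES specification is given in Section 2.1 and not in this section.

```python
import numpy as np


def sav_recursion(params, returns, initial_value):
    """Symmetric absolute value recursion.

    path[t] = b1 + b2 * path[t - 1] + b3 * |returns[t - 1]|, so path[-1]
    is the one-step-ahead value implied by the last observed return.
    """
    b1, b2, b3 = params
    returns = np.asarray(returns, dtype=float)
    path = np.empty(returns.size + 1)
    path[0] = initial_value
    for t in range(1, path.size):
        path[t] = b1 + b2 * path[t - 1] + b3 * abs(returns[t - 1])
    return path


# Made-up coefficients and returns, for illustration only.
toy_path = sav_recursion((0.1, 0.9, 0.2), [1.0, -2.0], initial_value=1.0)
```

With β2 close to one, a large absolute return raises the risk measure for many subsequent periods, which is the clustering behavior described in the text.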
Table 6: Estimation Results of the CARES Models for Real Data (Part A)
          Symmetric Absolute Value              Indirect GARCH
          GM          IBM         S&P 500       GM          IBM         S&P 500
VaR: 1% Coverage Probability
β1        0.4702*     0.0928      0.0457        0.1460      0.2648      0.0178
(Std1)    (0.2691)    (0.1533)    (0.0532)      (0.1346)    (0.2107)    (0.0248)
(Std2)    (0.3218)    (0.1422)    (0.0573)      (0.1600)    (0.3519)    (0.0290)
(Std3)    (0.3100)    (0.1394)    (0.0543)      (0.1430)    (0.4966)    (0.0260)
β2        0.7851***   0.9411***   0.9240***     0.9590***   0.9614***   0.9701***
(Std1)    (0.0929)    (0.0950)    (0.0390)      (0.0094)    (0.0136)    (0.0042)
(Std2)    (0.1127)    (0.0933)    (0.0430)      (0.0117)    (0.0230)    (0.0050)
(Std3)    (0.1059)    (0.0961)    (0.0411)      (0.0105)    (0.0317)    (0.0043)
β3        0.3497***   0.1248      0.2020        0.2380      0.1444      0.2183*
(Std1)    (0.1650)    (0.2374)    (0.1307)      (1.4698)    (0.3609)    (1.2146)
(Std2)    (0.1915)    (0.2390)    (0.1497)      (1.7949)    (0.3387)    (1.3364)
(Std3)    (0.1774)    (0.2587)    (0.1398)      (1.6078)    (0.9364)    (1.2704)
VaR: 5% Coverage Probability
β1        0.1583***   0.0250      0.0093*       0.3336***   0.2350**    0.0262**
(Std1)    (0.0580)    (0.0217)    (0.0055)      (0.1037)    (0.1816)    (0.0099)
(Std2)    (0.0741)    (0.0237)    (0.0083)      (0.1277)    (0.0826)    (0.0149)
(Std3)    (0.0870)    (0.0251)    (0.0073)      (0.1521)    (0.1103)    (0.0117)
β2        0.8856***   0.9313***   0.9528***     0.9042***   0.9010***   0.9287***
(Std1)    (0.0248)    (0.0161)    (0.0211)      (0.0133)    (0.0394)    (0.0061)
(Std2)    (0.0344)    (0.0193)    (0.0251)      (0.0185)    (0.0165)    (0.0108)
(Std3)    (0.0376)    (0.0192)    (0.0239)      (0.0223)    (0.0217)    (0.0088)
β3        0.1145***   0.1213***   0.0851*       0.1220      0.1241***   0.1407
(Std1)    (0.0419)    (0.0284)    (0.0427)      (0.1151)    (0.0481)    (0.6198)
(Std2)    (0.0329)    (0.0341)    (0.0512)      (0.0860)    (0.0699)    (0.3561)
(Std3)    (0.0388)    (0.0313)    (0.0482)      (0.1375)    (0.0553)    (0.4446)
Note: This table reports the CARES model estimation results for two individual stocks,
General Motors (GM) and IBM, and for one stock index, S&P 500. The sample contains
2,892 daily prices for each study target, spanning the period from April 7, 1986, to April
7, 1997. We report the standard errors (where cT is estimated using a k-nearest neighbor
estimator as in Engle and Manganelli (2004)) as Std1 and two bias-corrected standard
errors as Std2 (where cT is estimated using Koenker’s bandwidth with Bofinger’s hT )
and Std3 (where cT is estimated using Koenker’s bandwidth with Hall and Sheather’s
hT ). *, ** and *** indicate that the coefficients are significant at the 5% significance
level based on Std1, both Std1 and Std2, and all three standard errors, respectively.
Table 7: Estimation Results of the CARES Models for Real Data (Part B)
          Symmetric Absolute Value              Indirect GARCH
          GM          IBM         S&P 500       GM          IBM         S&P 500
ES: 1% Coverage Probability
γ̂1        2.1648      -0.0802     -0.0440       -0.9024     1.1747      -0.2470***
(Std1)    (1.5962)    (1.2069)    (0.5396)      (1.0177)    (5.4666)    (0.0703)
(Std2)    (1.4586)    (1.2582)    (0.6173)      (1.0245)    (9.1299)    (0.0710)
(Std3)    (1.6313)    (1.1975)    (0.5791)      (1.0161)    (12.2810)   (0.0705)
γ̂2        0.2143      0.9001      0.6981***     0.9171***   0.9101***   0.8919***
(Std1)    (0.2300)    (0.9532)    (0.4026)      (0.0878)    (0.1034)    (0.0214)
(Std2)    (0.2058)    (1.0297)    (0.4128)      (0.0883)    (0.1744)    (0.0216)
(Std3)    (0.2391)    (1.0202)    (0.4085)      (0.0876)    (0.2327)    (0.0215)
γ̂3        1.8901***   0.6174*     1.8698***     1.3730***   0.9595***   2.8273***
(Std1)    (0.8261)    (0.7204)    (0.4376)      (0.5860)    (0.3782)    (0.1030)
(Std2)    (0.8147)    (0.1286)    (0.4309)      (0.5899)    (0.4315)    (0.0990)
(Std3)    (0.8416)    (0.1226)    (0.4376)      (0.5832)    (0.3252)    (0.1010)
ES: 5% Coverage Probability
γ̂1        0.6092      0.0717      0.2367        2.0529      0.6679      1.0458
(Std1)    (0.5707)    (0.3806)    (0.3854)      (1.8022)    (2.2157)    (3.3992)
(Std2)    (0.9244)    (0.3916)    (0.4094)      (2.1404)    (1.0939)    (3.1835)
(Std3)    (0.9737)    (0.3896)    (0.3812)      (2.5239)    (1.2997)    (3.1485)
γ̂2        0.6488***   0.9190***   0.5610***     0.6886***   0.8764***   0.3302
(Std1)    (0.2449)    (0.2156)    (0.2775)      (0.0993)    (0.1320)    (0.3156)
(Std2)    (0.3578)    (0.2084)    (0.2769)      (0.1141)    (0.0628)    (0.2621)
(Std3)    (0.3710)    (0.2095)    (0.2588)      (0.1336)    (0.0756)    (0.2628)
γ̂3        0.5755***   0.1924***   1.1267        0.8747***   0.3883      3.8413
(Std1)    (0.2205)    (0.0890)    (0.8126)      (0.5039)    (0.5361)    (2.6250)
(Std2)    (0.2077)    (0.0584)    (0.8020)      (0.5336)    (0.4246)    (2.6027)
(Std3)    (0.2125)    (0.0629)    (0.7828)      (0.5024)    (0.4433)    (2.6138)
Note: This table reports the CARES model estimation results for two individual stocks,
General Motors (GM) and IBM, and for one stock index, S&P 500. The sample contains
2,892 daily prices for each study target, spanning the dates from April 7, 1986, to April
7, 1997. We report the standard errors (where cT is estimated using a k-nearest neighbor
estimator as in Engle and Manganelli (2004)) as Std1 and two bias-corrected standard
errors as Std2 (where cT is estimated using Koenker’s bandwidth with Bofinger’s hT ) and
Std3 (where cT is estimated using Koenker’s bandwidth with Hall and Sheather’s hT ). *,
** and *** indicate that the coefficients are significant at the 5% significance level based
on Std1, both Std1 and Std2, and all three standard errors, respectively.
Figure 2: 5% VaR and ES Estimates Obtained Using the CARES Models for GM
The top panel of Figure 2 presents a plot of the 5% VaR and ES estimates obtained using
the CARES model with the symmetric absolute value specification for GM,⁵ and the
bottom panel of Figure 2 presents a plot of the 5% VaR and ES estimates obtained using
the CARES model with the indirect GARCH specification for GM. The ES plot exhibits
a pattern similar to that of the VaR plot, with a spike near the beginning of the sample
corresponding to the 1987 crash and an increase toward the end of the sample that
reflects the increase in volatility following the Russian and Asian crises. These findings
indicate that the ES estimates produced by the CARES model are able to provide the
same risk indication as the VaR estimates produced by the CAViaR model of Engle and
Manganelli (2004) and can therefore be regarded as an alternative risk measure. With
respect to the model estimation results, most coefficients related to both VaR and ES are
statistically significant at the 5% significance level, strongly supporting the time-varying
nature of the tail of the distribution.
⁵ The plot exhibits the same trend shown in Figure 1 of Engle and Manganelli (2004); the only
difference is that the VaR is reported as a negative rather than positive value.
For the second sample, we use the first 1,262 observations to establish the CARES models
and again reserve the last 500 observations for out-of-sample forecasting. We estimate
the 1% and 5% one-day-ahead ESs using the two CARES specifications discussed in
Section 2.1. The 5% VaR and ES estimates for IBM and S&P 500 are plotted in Figure 3
and Figure 4, respectively. The VaR and ES estimates are reported as negative numbers
in these plots. The common spike in the middle of the sample (between the end of
2008 and 2009) reflects the recent global financial crisis, and the increased risk toward
the end of the sample reflects the recent euro zone crisis. The estimation results are
reported in Table 8 and Table 9. Again, the coefficients of the autoregressive terms in the
CARES models are always significant. This finding confirms that the phenomenon of the
clustering of volatilities and higher-order moments is also relevant in the tails. Finally, we
empirically test whether the relationship between the parameters of the autoregressive
VaR model (β1, β2 and β3) and the parameters of the autoregressive ES model implied by
the expectile-based models (CARE models) holds in our CARES model. We apply
the GMM Wald test to the estimated parameters, with the results indicating that the
linear proportionality between the conditional quantile and ES used in Taylor (2008)
does not hold in these two empirical data sets.
Figure 3: 5% VaR and ES Estimates Obtained Using the CARES Models for IBM
Figure 4: 5% VaR and ES Estimates Obtained Using the CARES Models for the S&P 500 Index
Table 8: Estimation Results of the CARES Models for Real Data (Part C)
          Symmetric Absolute Value              Indirect GARCH
          GM          IBM         S&P 500       GM          IBM         S&P 500
VaR: 1% Coverage Probability
β1        0.0353      0.1750      0.0423***     -0.0109     0.2781***   0.1178
(Std1)    (0.1220)    (0.0471)    (0.0203)      (0.0872)    (0.0846)    (0.0860)
(Std2)    (0.1185)    (0.0463)    (0.0208)      (0.0798)    (0.0925)    (0.0737)
(Std3)    (0.1205)    (0.0471)    (0.0207)      (0.0795)    (0.0906)    (0.0775)
β2        0.9283***   0.7196***   0.9288***     0.9686***   0.7043***   0.9158***
(Std1)    (0.0758)    (0.0564)    (0.0104)      (0.0078)    (0.0294)    (0.0099)
(Std2)    (0.0742)    (0.0592)    (0.0128)      (0.0064)    (0.0291)    (0.0099)
(Std3)    (0.0748)    (0.0650)    (0.0129)      (0.0064)    (0.0283)    (0.0099)
β3        0.2330      0.5498***   0.1880***     0.2590      1.0718***   0.4434
(Std1)    (0.1519)    (0.1774)    (0.0194)      (0.1822)    (0.4555)    (1.4155)
(Std2)    (0.1493)    (0.1840)    (0.0278)      (0.3956)    (0.4635)    (1.1890)
(Std3)    (0.1497)    (0.2091)    (0.0275)      (0.3860)    (0.4530)    (1.1683)
VaR: 5% Coverage Probability
β1        -0.0033     0.0131***   0.0057        -0.0068     0.0268***   0.0040
(Std1)    (0.0070)    (0.0085)    (0.0182)      (0.0206)    (0.0123)    (0.0164)
(Std2)    (0.0110)    (0.0095)    (0.0123)      (0.0246)    (0.0108)    (0.0143)
(Std3)    (0.0092)    (0.0091)    (0.0121)      (0.0218)    (0.0107)    (0.0168)
β2        0.9783***   0.8785***   0.9320***     0.9824***   0.8299***   0.9421***
(Std1)    (0.0136)    (0.0236)    (0.0269)      (0.0061)    (0.0195)    (0.0090)
(Std2)    (0.0162)    (0.0333)    (0.0303)      (0.0072)    (0.0182)    (0.0075)
(Std3)    (0.0151)    (0.0308)    (0.0322)      (0.0064)    (0.0187)    (0.0096)
β3        0.0561***   0.2235***   0.1528***     0.0615      0.3712***   0.1861
(Std1)    (0.0261)    (0.0394)    (0.0553)      (0.1883)    (0.1399)    (0.4788)
(Std2)    (0.0297)    (0.0613)    (0.0693)      (0.2471)    (0.2321)    (0.5035)
(Std3)    (0.0286)    (0.0551)    (0.0709)      (0.2050)    (0.2050)    (0.7112)
Note: This table reports the CARES model estimation results for two individual stocks,
General Motors (GM) and IBM, and for one stock index, S&P 500. The sample contains 1,262 daily prices for each study target, spanning the period from January 1, 2005,
to December 31, 2009. We report the standard errors (where cT is estimated using a
k-nearest neighbor estimator as in Engle and Manganelli (2004)) as Std1 and two bias-corrected standard errors as Std2 (where cT is estimated using Koenker’s bandwidth
with Bofinger’s hT ) and Std3 (where cT is estimated using Koenker’s bandwidth with
Hall and Sheather’s hT ). *, ** and *** indicate that the coefficients are significant at
the 5% significance level based on Std1, both Std1 and Std2, and all three standard errors, respectively.
Table 9: Estimation Results of the CARES Models for Real Data (Part D)
          Symmetric Absolute Value              Indirect GARCH
          GM          IBM         S&P 500       GM          IBM         S&P 500
ES: 1% Coverage Probability
γ̂1        0.1538      0.3703      0.0304        0.5621***   0.8743      0.3303
(Std1)    (0.2907)    (0.5324)    (0.0626)      (0.1690)    (0.8722)    (0.2458)
(Std2)    (0.1524)    (0.5650)    (0.0592)      (0.1440)    (0.8765)    (0.2391)
(Std3)    (0.3747)    (0.5801)    (0.0592)      (0.1440)    (0.8877)    (0.2374)
γ̂2        0.8731***   0.7000***   0.9250***     0.9456***   0.6490***   0.8902***
(Std1)    (0.0563)    (0.3401)    (0.1532)      (0.0975)    (0.1930)    (0.0201)
(Std2)    (0.0629)    (0.3631)    (0.1272)      (0.0974)    (0.1966)    (0.0210)
(Std3)    (0.0967)    (0.3708)    (0.1293)      (0.0974)    (0.2000)    (0.0211)
γ̂3        0.5127**    0.4603      0.2768        0.5515      1.0359*     0.7826
(Std1)    (0.3464)    (0.4445)    (0.6114)      (0.3690)    (0.5886)    (0.6683)
(Std2)    (0.2868)    (0.4727)    (0.5070)      (0.3690)    (0.6323)    (0.6334)
(Std3)    (0.1213)    (0.4749)    (0.5157)      (0.3690)    (0.6338)    (0.6341)
ES: 5% Coverage Probability
γ̂1        0.0139      0.1123      0.0442        0.0736***   0.2058*     0.0757*
(Std1)    (0.0560)    (0.1458)    (0.0483)      (0.0320)    (0.1246)    (0.0465)
(Std2)    (0.0560)    (0.2173)    (0.0562)      (0.0372)    (0.1154)    (0.0505)
(Std3)    (0.0560)    (0.1956)    (0.0550)      (0.0330)    (0.1168)    (0.0522)
γ̂2        0.9192***   0.7875***   0.9081***     0.9322***   0.7016***   0.9219***
(Std1)    (0.0604)    (0.1871)    (0.0507)      (0.0065)    (0.0521)    (0.0061)
(Std2)    (0.0594)    (0.2700)    (0.0473)      (0.0065)    (0.0471)    (0.0060)
(Std3)    (0.0594)    (0.2424)    (0.0502)      (0.0065)    (0.0469)    (0.0059)
γ̂3        0.2480***   0.3686      0.2297*       0.3837      0.8893**    0.3533
(Std1)    (0.1490)    (0.2720)    (0.1266)      (0.2158)    (0.4671)    (0.2101)
(Std2)    (0.1089)    (0.3709)    (0.1139)      (0.2153)    (0.4078)    (0.2212)
(Std3)    (0.1156)    (0.3344)    (0.1241)      (0.2156)    (0.4231)    (0.1954)
Note: This table reports the CARES model estimation results for two individual stocks,
General Motors (GM) and IBM, and for one stock index, S&P 500. The sample contains 1,262 daily prices for each study target, spanning the period from January 1, 2005,
to December 31, 2009. We report the standard errors (where cT is estimated using a
k-nearest neighbor estimator as in Engle and Manganelli (2004)) as Std1 and two bias-corrected standard errors as Std2 (where cT is estimated using Koenker’s bandwidth
with Bofinger’s hT) and Std3 (where cT is estimated using Koenker’s bandwidth with
Hall and Sheather’s hT). *, ** and *** indicate that the coefficients are significant at
the 5% significance level based on Std1, both Std1 and Std2, and all three standard errors, respectively.
5 Conclusion
In this article, we propose a new model for ES estimation. Most existing ES estimation
methods focus on the entire distribution of returns and recover its quantile and the
expectation of loss beyond the quantile to estimate the VaR and ES indirectly. By
contrast, we directly model the quantile and the expected loss beyond the quantile. For
this purpose, we introduce a new class of models, the CARES models, which use the
CAViaR model for quantile estimation in addition to specifying the evolution of the ES
over time using a specialized type of autoregressive procedure. The model parameters
are estimated using a tail-based least-squares method, and we derive the limiting theory
of these parameter estimators within a GMM framework. Our simulations indicate that
the new model performs better than several popular alternatives even when a moderate
sample size is used. The advantage of the model appears to be more pronounced when
the return distribution exhibits more complicated tail dynamics. Applications to real
data highlight the ability of the new model to adapt to new risk environments.
Appendix A
Because the estimator θ̂ = (β̂, γ̂) can be asymptotically regarded as a GMM estimator,
its asymptotic distribution can be established within a GMM framework.
In our particular problem, θ̂ can be identified as
θ̂ = argmin{QT (θ)},
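where
\[
Q_T(\theta) = m_T(\theta)' V_T^{-1} m_T(\theta),
\]
\[
m_T(\theta) = \frac{1}{T}\sum_{t=1}^{T}\phi_t(\theta)
= \begin{bmatrix}
\frac{1}{T}\sum_{t=1}^{T} \nabla_\beta f_t(\beta)\cdot\big(\tau - I(y_t < f_t(\beta))\big) \\
\frac{1}{T}\sum_{t=1}^{T} \nabla_\gamma g_t(\gamma)\cdot\big(y_t - g_t(\gamma)\big)\cdot I(y_t < f_t(\beta))
\end{bmatrix},
\]
\[
E_0[\phi_t(\theta_0)] = 0, \qquad V_T \xrightarrow{P} V,
\]
where E0 represents the expectation and V is the weighting matrix.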
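To make the moment vector and the objective function concrete, the following minimal sketch evaluates them for a deliberately simplified, hypothetical special case: constant models f_t(β) = β and g_t(γ) = γ (so both gradients equal 1) and an identity weighting matrix in place of V_T. The function names and the toy data are illustrative assumptions and do not come from the article.

```python
def phi_t(y_t, f_t, g_t, grad_f, grad_g, tau):
    """Stacked moment contribution of one observation:
    quantile block first, tail least-squares block second."""
    hit = 1.0 if y_t < f_t else 0.0              # indicator I(y_t < f_t)
    quantile_part = [d * (tau - hit) for d in grad_f]
    tail_part = [d * (y_t - g_t) * hit for d in grad_g]
    return quantile_part + tail_part


def m_T(ys, beta, gamma, tau):
    """Sample moment vector for the constant model (assumption):
    f_t = beta and g_t = gamma, so each gradient is [1.0]."""
    total = [0.0, 0.0]
    for y in ys:
        contrib = phi_t(y, beta, gamma, [1.0], [1.0], tau)
        total = [a + b for a, b in zip(total, contrib)]
    return [s / len(ys) for s in total]


def q_T(ys, beta, gamma, tau):
    """Quadratic-form objective with an identity weighting matrix (assumption)."""
    return sum(v * v for v in m_T(ys, beta, gamma, tau))


if __name__ == "__main__":
    ys = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    tau = 0.2
    # beta = -1.5 leaves 20% of the sample below it; gamma = -2.5 is the
    # mean of those tail observations, so both moments vanish.
    print(q_T(ys, beta=-1.5, gamma=-2.5, tau=tau))
    # A gamma away from the tail mean gives a strictly positive objective.
    print(q_T(ys, beta=-1.5, gamma=-1.0, tau=tau))
```

In this toy setting the objective is (numerically) zero exactly when β is an empirical τ-quantile and γ is the average of the observations below it, which is the intuition behind the tail-based least-squares step.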
Proof of Theorem 2.1 To establish the consistency of the estimator θ̂, we require the
following assumptions:
Assumption 5.1. Let m0(θ) = E0[ϕt(θ)]; then,
\[
\sup_{\theta\in\Theta} |m_T(\theta) - m_0(\theta)| \xrightarrow{P} 0,
\]
where | · | is the Euclidean norm. This assumption ensures that mT(θ) converges uniformly to m0(θ)
in probability.
Assumption 5.2. For all θ ∈ Θ such that ||θ − θ0|| > ε, we have Q0(θ) − Q0(θ0) > 0.
This assumption ensures that the population objective function Q0(θ) has a unique
minimum at θ0.
Let us define the population objective function as
\[
Q_0(\theta) = E_0[\phi_t(\theta)]' V^{-1} E_0[\phi_t(\theta)].
\]
Under Assumption 5.1, we then obtain
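\[
\begin{aligned}
\sup_{\theta\in\Theta}|Q_T(\theta) - Q_0(\theta)|
&= \sup_{\theta\in\Theta}\big| m_T(\theta)' V_T^{-1} m_T(\theta) - E_0[\phi(\omega_i,\theta)]' V^{-1} E_0[\phi(\omega_i,\theta)] \big| \\
&= \sup_{\theta\in\Theta}\big| m_T(\theta)' V_T^{-1} m_T(\theta) - m_T(\theta)' V_T^{-1} E_0[\phi(\omega_i,\theta)] \\
&\qquad\quad + m_T(\theta)' V_T^{-1} E_0[\phi(\omega_i,\theta)] - m_T(\theta)' V^{-1} E_0[\phi(\omega_i,\theta)] \\
&\qquad\quad + m_T(\theta)' V^{-1} E_0[\phi(\omega_i,\theta)] - E_0[\phi(\omega_i,\theta)]' V^{-1} E_0[\phi(\omega_i,\theta)] \big| \\
&\le \sup_{\theta\in\Theta}\big| m_T(\theta)' V_T^{-1} m_T(\theta) - m_T(\theta)' V_T^{-1} E_0[\phi(\omega_i,\theta)] \big| \\
&\quad + \sup_{\theta\in\Theta}\big| m_T(\theta)' V_T^{-1} E_0[\phi(\omega_i,\theta)] - m_T(\theta)' V^{-1} E_0[\phi(\omega_i,\theta)] \big| \\
&\quad + \sup_{\theta\in\Theta}\big| m_T(\theta)' V^{-1} E_0[\phi(\omega_i,\theta)] - E_0[\phi(\omega_i,\theta)]' V^{-1} E_0[\phi(\omega_i,\theta)] \big| \\
&= \sup_{\theta\in\Theta}\big| m_T(\theta)' V_T^{-1} \big(m_T(\theta) - E_0[\phi(\omega_i,\theta)]\big) \big| \\
&\quad + \sup_{\theta\in\Theta}\big| m_T(\theta)' \big(V_T^{-1} - V^{-1}\big) E_0[\phi(\omega_i,\theta)] \big| \\
&\quad + \sup_{\theta\in\Theta}\big| \big(m_T(\theta)' - E_0[\phi(\omega_i,\theta)]'\big) V^{-1} E_0[\phi(\omega_i,\theta)] \big| \\
&\le \sup_{\theta\in\Theta}\big| m_T(\theta)' V_T^{-1} \big| \, \sup_{\theta\in\Theta}\big| m_T(\theta) - E_0[\phi(\omega_i,\theta)] \big| \\
&\quad + \sup_{\theta\in\Theta}\big| m_T(\theta)' \big| \, \sup_{\theta\in\Theta}\big| V_T^{-1} - V^{-1} \big| \, \sup_{\theta\in\Theta}\big| E_0[\phi(\omega_i,\theta)] \big| \\
&\quad + \sup_{\theta\in\Theta}\big| m_T(\theta)' - E_0[\phi(\omega_i,\theta)]' \big| \, \sup_{\theta\in\Theta}\big| V^{-1} E_0[\phi(\omega_i,\theta)] \big| \\
&\xrightarrow{P} 0;
\end{aligned}
\]
that is,
\[
\sup_{\theta\in\Theta}|Q_T(\theta) - Q_0(\theta)| \xrightarrow{P} 0.
\]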
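Then, let ε > 0 be an arbitrarily small real number. Suppose that ||θ − θ0|| > ε; then,
by Assumption 5.2, there exists a δ > 0 such that Q0(θ) − Q0(θ0) > δ. Then,
\[
\begin{aligned}
\Pr(\|\hat\theta - \theta_0\| \ge \varepsilon)
&\le \Pr\big(Q_0(\hat\theta) - Q_0(\theta_0) \ge \delta\big) \\
&= \Pr\big(Q_0(\hat\theta) - Q_T(\hat\theta) + Q_T(\hat\theta) - Q_0(\theta_0) \ge \delta\big) \\
&= \Pr\big(Q_0(\hat\theta) - Q_T(\hat\theta) + Q_T(\theta_0) + o_p(1) - Q_0(\theta_0) \ge \delta\big) \\
&\le \Pr\big[\big(|Q_0(\hat\theta) - Q_T(\hat\theta)| \ge \delta/2\big) \cup \big(|Q_T(\theta_0) - Q_0(\theta_0)| \ge \delta/2\big)\big] \\
&\le \Pr\big[2\sup_{\theta\in\Theta}|Q_0(\theta) - Q_T(\theta)| \ge \delta\big].
\end{aligned}
\]
We have
\[
\sup_{\theta\in\Theta}|Q_T(\theta) - Q_0(\theta)| \xrightarrow{P} 0
\]
from the above proof; therefore, Pr(||θ̂ − θ0|| ≥ ε) → 0, or equivalently,
\[
\hat\theta \xrightarrow{P} \theta_0.
\]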
Thus, we have proven that θ̂ is a consistent estimator.
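As an informal numerical illustration of this consistency result (not a substitute for the proof), the sketch below uses a hypothetical constant-parameter special case in which the tail-based least-squares solution reduces to the average of the observations at or below the empirical τ-quantile. The data-generating process (i.i.d. standard normal returns), the seed, and all function names are assumptions made purely for illustration.

```python
import random
from statistics import NormalDist


def tail_estimates(ys, tau):
    """Empirical tau-quantile and the mean of the observations at or below it."""
    ordered = sorted(ys)
    k = max(1, int(tau * len(ordered)))   # number of tail observations
    tail = ordered[:k]
    return tail[-1], sum(tail) / k


def true_values(tau):
    """Quantile and lower-tail ES of the standard normal distribution."""
    nd = NormalDist()
    q = nd.inv_cdf(tau)
    return q, -nd.pdf(q) / tau


def es_errors(sample_sizes, tau=0.05, seed=0):
    """Absolute ES estimation error for each sample size T."""
    rng = random.Random(seed)
    _, es_true = true_values(tau)
    errors = []
    for T in sample_sizes:
        ys = [rng.gauss(0.0, 1.0) for _ in range(T)]
        _, es_hat = tail_estimates(ys, tau)
        errors.append(abs(es_hat - es_true))
    return errors


if __name__ == "__main__":
    sizes = (1_000, 10_000, 100_000)
    for T, err in zip(sizes, es_errors(sizes)):
        print(f"T = {T:>7d}   |ES_hat - ES_true| = {err:.4f}")
```

With a large sample the estimation error is typically small, in line with θ̂ converging in probability to θ0; a single simulated path need not shrink monotonically in T.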
Proof of Theorem 2.2 This proof builds on the asymptotic theory of the GMM estimator in addition to Theorem 3 of Huber (1967). One of the assumptions in the
asymptotic theory of the GMM estimator is that the objective function is twice continuously differentiable with respect to the parameters; however, this assumption does not hold for
our objective function QT(θ) because mT(θ) is not continuously differentiable. Hence, we
find an approximation for mT(θ), denoted by m0(θ), that is continuously differentiable,
and we derive the asymptotic distribution of θ̂ based on this approximation.
The following assumptions are required:
Assumption 5.3. The parameter space Θ is compact.
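Assumption 5.4. The matrix
\[
\frac{\partial m_0(\hat\theta)'}{\partial\theta}\, V^{-1}\, \frac{\partial m_0(\tilde\theta)}{\partial\theta'}
\]
is non-singular and has an inverse for all θ ∈ Θ.
Assumption 5.5. E0(||∂ϕ(ωi, θ)/∂θ′||) is uniformly bounded above on the parameter space Θ; that is,
\[
E_0\Big(\sup_{\theta\in\Theta}\Big\|\frac{\partial\phi(\omega_i,\theta)}{\partial\theta'}\Big\|\Big) < +\infty.
\]
Assumption 5.6. The variance of ϕ(ωi, θ) is finite; that is,
\[
Var_0(\phi(\omega_i,\theta)) = E_0\big(\phi(\omega_i,\theta)\,\phi(\omega_i,\theta)'\big) < +\infty.
\]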
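Let the expectation of the first-order condition, m0(θ), and its derivative be
\[
m_0(\theta) = E_0[\phi_t(\theta)] =
\begin{bmatrix}
E\big(\nabla_\beta f_t(\beta(\tau))\cdot(\tau - I(y_t < f_t(\beta(\tau))))\big) \\
E\big(\nabla_\gamma g_t(\gamma(\tau))\cdot(y_t - g_t(\gamma(\tau)))\cdot I(y_t < f_t(\beta(\tau)))\big)
\end{bmatrix},
\]
\[
\nabla_\theta m_0(\theta) = D_\theta =
\begin{bmatrix}
D_{11} & D_{12} \\
D_{21} & D_{22}
\end{bmatrix},
\]
where the four elements of the Dθ matrix are as follows: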
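\[
\begin{aligned}
D_{11} &= \frac{\partial}{\partial\beta(\tau)} E\big(\nabla_\beta f_t(\beta(\tau))\cdot(\tau - I(y_t - f_t(\beta(\tau)) < 0)) \,\big|\, F_{t-1}\big) \\
&= \frac{\partial}{\partial\beta(\tau)} E\Big(\nabla_\beta f_t(\beta(\tau))\cdot\tau - \nabla_\beta f_t(\beta(\tau))\cdot\int_{-\infty}^{f_t(\beta(\tau))} h(y|F_{t-1})\,dy\Big) \\
&= \frac{\partial^2}{\partial\beta(\tau)\partial\beta(\tau)'} E\big(f_t(\beta(\tau))\cdot\tau\big)
 - \frac{\partial^2}{\partial\beta(\tau)\partial\beta(\tau)'} E\Big(f_t(\beta(\tau))\cdot\int_{-\infty}^{f_t(\beta(\tau))} h(y|F_{t-1})\,dy\Big) \\
&\quad - E\big(\nabla_\beta f_t(\beta(\tau))\cdot\nabla_\beta f_t(\beta(\tau))'\, h(f_t(\beta(\tau))|F_{t-1})\big) \\
&= -E\big(\nabla_\beta f_t(\beta(\tau))\cdot\nabla_\beta f_t(\beta(\tau))'\, h(0|F_{t-1})\big), \\[6pt]
D_{12} &= \frac{\partial}{\partial\gamma(\tau)} E\big(\nabla_\beta f_t(\beta(\tau))\cdot(\tau - I(y_t - f_t(\beta(\tau)) < 0)) \,\big|\, F_{t-1}\big) = 0, \\[6pt]
D_{21} &= \frac{\partial}{\partial\beta(\tau)} E\big(\nabla_\gamma g_t(\gamma(\tau))\cdot(y_t - g_t(\gamma(\tau)))\, I(y_t - f_t(\beta(\tau)) < 0)\big) \\
&= \frac{\partial}{\partial\beta(\tau)} E\big(\nabla_\gamma g_t(\gamma(\tau))\cdot(y_t - g_t(\gamma(\tau)))\big)
 - \frac{\partial}{\partial\beta(\tau)} E\Big(\nabla_\gamma g_t(\gamma(\tau))\cdot\int_{-\infty}^{f_t(\beta(\tau))} (y - g_t(\gamma(\tau)))\cdot h(y|F_{t-1})\,dy\Big) \\
&= -E\big(\nabla_\gamma g_t(\gamma(\tau))\cdot\nabla_\beta f_t(\beta(\tau))'\cdot(f_t(\beta(\tau)) - g_t(\gamma(\tau)))\cdot h(0|F_{t-1})\big), \\[6pt]
D_{22} &= \frac{\partial}{\partial\gamma(\tau)} E\big(\nabla_\gamma g_t(\gamma(\tau))\cdot(y_t - g_t(\gamma(\tau)))(\tau - I(y_t - f_t(\beta(\tau)) < 0)) \,\big|\, F_{t-1}\big) \\
&= \frac{\partial}{\partial\gamma(\tau)} E\big(\nabla_\gamma g_t(\gamma(\tau))\cdot(y_t - g_t(\gamma(\tau)))\cdot\tau\big)
 - \frac{\partial}{\partial\gamma(\tau)} E\Big(\nabla_\gamma g_t(\gamma(\tau))\cdot\int_{-\infty}^{f_t(\beta(\tau))} (y - g_t(\gamma(\tau)))\cdot h(y|F_{t-1})\,dy\Big) \\
&= -E\big(\nabla_\gamma g_t(\gamma(\tau))\cdot\nabla_\gamma g_t(\gamma(\tau))'\cdot\tau\big).
\end{aligned}
\]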
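Clearly, m0(θ0) = 0. It can be verified that Assumptions 5.3 to 5.5 are sufficient for conditions (N-1) to (N-3) of Huber (1967). Lemma 3 of Huber (1967) and Assumption 5.6 together imply
that
\[
\sqrt{T}\, m_0(\hat\theta) +
\begin{bmatrix}
\frac{1}{\sqrt{T}}\sum_{t=1}^{T} \nabla_\beta f_t(\beta_0(\tau))\cdot\big(\tau - I(y_t - f_t(\beta_0(\tau)) < 0)\big) \\
\frac{1}{\sqrt{T}}\sum_{t=1}^{T} \nabla_\gamma g_t(\gamma_0(\tau))\cdot\big(y_t - g_t(\gamma_0(\tau))\big)\cdot I(y_t - f_t(\beta_0(\tau)) < 0)
\end{bmatrix}
= o_P(1).
\]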
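Assuming that there exists a θ̃ ∈ (θ0, θ̂), we now apply the mean value theorem to
decompose m0(θ̂) as
\[
m_0(\hat\theta) = m_0(\theta_0) + \frac{\partial m_0(\tilde\theta)}{\partial\theta}(\hat\theta - \theta_0),
\]
where m0(θ0) = 0 based on the first-order condition. We rearrange the above equation
to obtain
\[
\begin{aligned}
\sqrt{T}\, m_0(\hat\theta)
&= -\begin{bmatrix}
\frac{1}{\sqrt{T}}\sum_{t=1}^{T} \nabla_\beta f_t(\beta_0(\tau))\cdot\big(\tau - I(y_t - f_t(\beta_0(\tau)) < 0)\big) \\
\frac{1}{\sqrt{T}}\sum_{t=1}^{T} \nabla_\gamma g_t(\gamma_0(\tau))\cdot\big(y_t - g_t(\gamma_0(\tau))\big)\cdot I(y_t - f_t(\beta_0(\tau)) < 0)
\end{bmatrix} \\
&= \frac{\partial m_0(\tilde\theta)}{\partial\theta}\sqrt{T}(\hat\theta - \theta_0) + o_P(1).
\end{aligned}
\]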
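Hence,
\[
\sqrt{T}(\hat\theta - \theta_0) = -\Big(\frac{\partial m_0(\tilde\theta)}{\partial\theta}\Big)^{-1}
\begin{bmatrix}
\frac{1}{\sqrt{T}}\sum_{t=1}^{T} \nabla_\beta f_t(\beta_0(\tau))\cdot\big(\tau - I(y_t - f_t(\beta_0(\tau)) < 0)\big) \\
\frac{1}{\sqrt{T}}\sum_{t=1}^{T} \nabla_\gamma g_t(\gamma_0(\tau))\cdot\big(y_t - g_t(\gamma_0(\tau))\big)\cdot I(y_t - f_t(\beta_0(\tau)) < 0)
\end{bmatrix}
+ o_P(1).
\]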
By the continuity of ∇θ m0(θ), we have
\[
\nabla_\theta m_0(\tilde\theta) = \frac{\partial m_0(\tilde\theta)}{\partial\theta} \xrightarrow{P} D(\theta_0).
\]
Meanwhile, the consistency of θ̂ implies that θ̃ also converges to θ0 . Therefore, it follows
that
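\[
\sqrt{T}(\hat\theta - \theta_0) = -D(\theta_0)^{-1}
\begin{bmatrix}
\frac{1}{\sqrt{T}}\sum_{t=1}^{T} \nabla_\beta f_t(\beta_0(\tau))\cdot\big(\tau - I(y_t - f_t(\beta_0(\tau)) < 0)\big) \\
\frac{1}{\sqrt{T}}\sum_{t=1}^{T} \nabla_\gamma g_t(\gamma_0(\tau))\cdot\big(y_t - g_t(\gamma_0(\tau))\big)\cdot I(y_t - f_t(\beta_0(\tau)) < 0)
\end{bmatrix}
+ o_P(1).
\]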
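Now, the central limit theorem is applied to yield
\[
\begin{bmatrix}
\frac{1}{\sqrt{T}}\sum_{t=1}^{T} \nabla_\beta f_t(\beta_0(\tau))\cdot\big(\tau - I(y_t - f_t(\beta_0(\tau)) < 0)\big) \\
\frac{1}{\sqrt{T}}\sum_{t=1}^{T} \nabla_\gamma g_t(\gamma_0(\tau))\cdot\big(y_t - g_t(\gamma_0(\tau))\big)\cdot I(y_t - f_t(\beta_0(\tau)) < 0)
\end{bmatrix}
\xrightarrow{D} N(0, S(\theta_0)),
\]
where
\[
S(\theta_0) = E\big(\phi_t(\theta_0)\cdot\phi_t(\theta_0)'\big) =
\begin{bmatrix}
S_{11} & S_{12} \\
S_{21} & S_{22}
\end{bmatrix}
\]
and the four elements of the S(θ0) matrix are listed as follows: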
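\[
\begin{aligned}
S_{11} &= E\Big(\big(\nabla_\beta f_t(\beta_0(\tau))\cdot(\tau - I(y_t - f_t(\beta_0(\tau)) < 0))\big)\cdot\big(\nabla_\beta f_t(\beta_0(\tau))\cdot(\tau - I(y_t - f_t(\beta_0(\tau)) < 0))\big)'\Big) \\
&= E\big(\nabla_\beta f_t(\beta_0(\tau))\cdot\nabla_\beta f_t(\beta_0(\tau))'\big)\cdot E\big((\tau - I(y_t - f_t(\beta_0(\tau)) < 0))^2\big) \\
&= \tau\cdot(1-\tau)\cdot E\big(\nabla_\beta f_t(\beta_0(\tau))\cdot\nabla_\beta f_t(\beta_0(\tau))'\big), \\[6pt]
S_{12} = S_{21} &= E\Big(\big(\nabla_\beta f_t(\beta_0(\tau))\cdot(\tau - I(y_t - f_t(\beta_0(\tau)) < 0))\big) \\
&\qquad\cdot\big(\nabla_\gamma g_t(\gamma_0(\tau))\cdot(y_t - g_t(\gamma_0(\tau)))\cdot(\tau - I(y_t - f_t(\beta_0(\tau)) < 0))\big)'\Big) \\
&= E\big(\nabla_\beta f_t(\beta_0(\tau))\cdot\nabla_\gamma g_t(\gamma_0(\tau))'\big) \\
&\qquad\cdot E\big((\tau - I(y_t - f_t(\beta_0(\tau)) < 0))\cdot(y_t - g_t(\gamma_0(\tau)))(\tau - I(y_t - f_t(\beta_0(\tau)) < 0))\big) \\
&= E\big(\nabla_\beta f_t(\beta_0(\tau))\cdot\nabla_\gamma g_t(\gamma_0(\tau))'\big)\cdot E\big(\tau - I(y_t - f_t(\beta_0(\tau)) < 0)\big)\cdot 0 \\
&= 0, \\[6pt]
S_{22} &= E\Big(\big(\nabla_\gamma g_t(\gamma_0(\tau))\cdot(y_t - g_t(\gamma_0(\tau)))\cdot(\tau - I(y_t - f_t(\beta_0(\tau)) < 0))\big) \\
&\qquad\cdot\big(\nabla_\gamma g_t(\gamma_0(\tau))\cdot(y_t - g_t(\gamma_0(\tau)))\cdot(\tau - I(y_t - f_t(\beta_0(\tau)) < 0))\big)'\Big) \\
&= E\big(\nabla_\gamma g_t(\gamma_0(\tau))\cdot\nabla_\gamma g_t(\gamma_0(\tau))'\big)\cdot E\big((y_t - g_t(\gamma_0(\tau)))^2 \,\big|\, y_t < f_t(\beta_0(\tau))\big).
\end{aligned}
\]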
Therefore, it follows that Theorem 2.2 holds as
\[
\sqrt{T}(\hat\theta - \theta_0) \xrightarrow{D} N(0, \Sigma(\theta_0)),
\]
where
\[
\Sigma(\theta_0) = D(\theta_0)^{-1} S(\theta_0) \big(D(\theta_0)^{-1}\big)'.
\]
As in Engle and Manganelli (2004) and Kuan et al. (2009), the asymptotic covariance
matrix Σ(θ0 ) can be consistently estimated using its sample counterparts, as stated in
Section 2.2.
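The sandwich form Σ(θ0) = D(θ0)⁻¹ S(θ0) (D(θ0)⁻¹)′ can be illustrated with a small numerical sketch. The example assumes scalar β and γ, so that D and S are 2×2, and uses made-up values for their entries that respect the structure derived above (D12 = 0 and S12 = S21 = 0); the helper names and numbers are hypothetical and are not estimates from the article.

```python
import math


def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul(a, b):
    """Product of two small matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def sandwich(D, S):
    """Sigma = D^{-1} S (D^{-1})'."""
    D_inv = inverse_2x2(D)
    return matmul(matmul(D_inv, S), transpose(D_inv))


def standard_errors(sigma, T):
    """Standard errors of theta_hat implied by sqrt(T)-asymptotics."""
    return [math.sqrt(sigma[i][i] / T) for i in range(len(sigma))]


if __name__ == "__main__":
    # Hypothetical values: D is lower triangular because D12 = 0,
    # and S is diagonal because S12 = S21 = 0.
    D = [[-2.0, 0.0], [1.0, -0.5]]
    S = [[0.04, 0.0], [0.0, 0.25]]
    sigma = sandwich(D, S)
    print(sigma)
    print(standard_errors(sigma, T=100))
```

Even though S is block diagonal, the non-zero D21 term makes the off-diagonal element of Σ non-zero in this toy example: uncertainty in the quantile parameter feeds into the ES parameter.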
Appendix B
Assume that the asset return yt follows a GARCH(1,1) model
\[
y_t = \sigma_t z_t, \qquad \sigma_t^2 = a_0 + a_1 y_{t-1}^2 + a_2 \sigma_{t-1}^2;
\]
meanwhile, when we specify an indirect GARCH model for the VaR or ES, we have
\[
f_t^2 = b_0 + b_1 f_{t-1}^2 + b_2 y_{t-1}^2, \tag{21}
\]
where ft denotes the VaR or ES at time t, and
\[
VaR = \sigma\,\Phi^{-1}(\pi), \qquad ES = -\sigma\cdot\frac{\phi(\Phi^{-1}(\pi))}{\pi}, \tag{22}
\]
as described in Section 2. Then, we can equate the above equations to obtain
\[
\begin{aligned}
f_t^2 &= (a_0 + a_1 y_{t-1}^2 + a_2 \sigma_{t-1}^2)\cdot\xi^2 \\
&= b_0 + b_1 f_{t-1}^2 + b_2 y_{t-1}^2 \\
&= b_0 + b_1 \sigma_{t-1}^2 \xi^2 + b_2 y_{t-1}^2,
\end{aligned} \tag{23}
\]
where Φ and φ are the cumulative distribution function and probability density function,
respectively, of the standard Gaussian distribution and π is the coverage probability (denoted τ elsewhere in the article).
Here ξ = Φ⁻¹(π) for the conditional VaR model, and ξ = φ(Φ⁻¹(π))/π for the conditional ES
model. Therefore, we have a0ξ² = b0, a1ξ² = b2, and a2 = b1.
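The coefficient matching in this appendix can be checked numerically. The sketch below computes ξ for the VaR and ES cases under the Gaussian assumption and maps hypothetical GARCH(1,1) parameters (a0, a1, a2) to the indirect GARCH parameters (b0, b1, b2); the function names and parameter values are illustrative assumptions only.

```python
from statistics import NormalDist


def xi(pi, measure):
    """Scaling factor linking sigma_t to the VaR or ES under normality."""
    nd = NormalDist()
    z = nd.inv_cdf(pi)                 # Phi^{-1}(pi)
    if measure == "VaR":
        return z
    if measure == "ES":
        return nd.pdf(z) / pi          # phi(Phi^{-1}(pi)) / pi
    raise ValueError("measure must be 'VaR' or 'ES'")


def indirect_garch_params(a0, a1, a2, pi, measure):
    """Map GARCH(1,1) parameters to (b0, b1, b2) of the indirect GARCH model."""
    x2 = xi(pi, measure) ** 2
    return a0 * x2, a2, a1 * x2


if __name__ == "__main__":
    a0, a1, a2 = 0.05, 0.10, 0.85      # hypothetical GARCH parameters
    b0, b1, b2 = indirect_garch_params(a0, a1, a2, pi=0.05, measure="ES")

    # One step of both recursions from the same (hypothetical) state.
    sigma2_prev, y_prev = 1.5, -0.7
    x2 = xi(0.05, "ES") ** 2
    f2_from_garch = (a0 + a1 * y_prev ** 2 + a2 * sigma2_prev) * x2
    f2_from_indirect = b0 + b1 * (sigma2_prev * x2) + b2 * y_prev ** 2
    print(f2_from_garch, f2_from_indirect)   # the two values should agree
```

Because only ξ² enters the recursion, the sign convention chosen for the ES does not affect the mapping of the coefficients.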
Appendix C
The CARE model of Taylor (2008) relies on the assumption of a relationship between
the ES and the corresponding expectile (quantile) as follows:
\[
ES_t(\tau) = \Big(1 + \frac{\alpha}{(1-2\alpha)\tau}\Big)\mu_t(\tau). \tag{24}
\]
This expression relates the lower-tail ES associated with the τ quantile of a zero-mean
distribution and the α expectile that coincides with that quantile. Meanwhile, this
relationship assists in converting the conditional autoregressive expectile model into a
CARES model. For example, the symmetric absolute value conditional autoregressive
expectile model can be written as
\[
\mu_t(\tau) = \beta_0 + \beta_1 \mu_{t-1}(\tau) + \beta_2 |y_{t-1}|. \tag{25}
\]
Substituting µt(τ) from expression (24) into expression (25) yields the following symmetric absolute value CARES model:
\[
ES_t(\tau) = \gamma_0 + \gamma_1 ES_{t-1}(\tau) + \gamma_2 |y_{t-1}|, \tag{26}
\]
where γ1 = β1 and
\[
\gamma_i = \Big(1 + \frac{\alpha}{(1-2\alpha)\tau}\Big)\beta_i
\]
for i = 0 and 2.
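The parameter mapping implied by expressions (24) to (26) can be sketched in a few lines. The values of α, τ, the β coefficients, and the return path below are hypothetical placeholders chosen only to show that scaling the expectile recursion reproduces the ES recursion; in practice α would have to be the expectile level whose expectile coincides with the τ quantile.

```python
def es_scale(alpha, tau):
    """Multiplier 1 + alpha / ((1 - 2*alpha) * tau) from expression (24)."""
    return 1.0 + alpha / ((1.0 - 2.0 * alpha) * tau)


def cares_params(beta0, beta1, beta2, alpha, tau):
    """Map symmetric absolute value expectile parameters to (gamma0, gamma1, gamma2)."""
    c = es_scale(alpha, tau)
    return c * beta0, beta1, c * beta2


def recursion(p0, p1, p2, start, returns):
    """Run x_t = p0 + p1 * x_{t-1} + p2 * |y_{t-1}| over a return path."""
    path = [start]
    for y_prev in returns:
        path.append(p0 + p1 * path[-1] + p2 * abs(y_prev))
    return path


if __name__ == "__main__":
    alpha, tau = 0.01, 0.05                    # hypothetical levels
    beta = (-0.02, 0.90, -0.15)                # hypothetical expectile parameters
    gamma = cares_params(*beta, alpha, tau)
    returns = [0.4, -1.2, 0.3, -0.8, 2.1]      # hypothetical returns
    c = es_scale(alpha, tau)

    mu_path = recursion(*beta, start=-1.0, returns=returns)
    es_path = recursion(*gamma, start=c * -1.0, returns=returns)
    # Scaling the expectile path by c should reproduce the ES path.
    for mu, es in zip(mu_path, es_path):
        print(f"{c * mu:.6f}  {es:.6f}")
```

The two printed columns should agree up to floating-point rounding, which is exactly the restriction examined in the remainder of this appendix.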
Here, we empirically test whether the relationship between the model parameters in the
conditional quantile and ES model used in Taylor (2008) holds for real data or in more
general cases. Because the model parameters can be estimated using the GMM, we can
test these restrictions using a GMM Wald test.
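Let h(θ0) = 0 be a set of restrictions on the true parameter values θ0, where
\[
h(\theta_0) =
\begin{bmatrix}
\gamma_0 - \big(1 + \frac{\alpha}{(1-2\alpha)\tau}\big)\beta_0 \\
\gamma_1 - \beta_1 \\
\gamma_2 - \big(1 + \frac{\alpha}{(1-2\alpha)\tau}\big)\beta_2
\end{bmatrix};
\]
then, we test
\[
H_0: h(\theta_0) = 0 \quad \text{against} \quad H_1: h(\theta_0) \neq 0. \tag{27}
\]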
The Wald test is based on the concept of using the difference between h(θ̂) and 0 to determine whether the null hypothesis is true. To determine whether h(θ̂) is significantly
different from 0, we require the following asymptotic result, which follows from the delta method:
\[
h(\hat\theta) \sim N\Big(h(\theta_0),\; \frac{1}{n} H(\theta_0)\big[D(\theta_0)^{-1} S(\theta_0) (D(\theta_0)^{-1})'\big] H(\theta_0)'\Big), \tag{28}
\]
where H(θ) = ∂h(θ)/∂θ′. If the null hypothesis is true, such that h(θ0) = 0, then we have
the following result for the distribution of the Wald test statistic:
\[
W \equiv n\, h(\hat\theta)' \big\{H(\hat\theta)\big[D(\theta_0)^{-1} S(\theta_0) (D(\theta_0)^{-1})'\big] H(\hat\theta)'\big\}^{-1} h(\hat\theta) \sim \chi^2(J), \tag{29}
\]
where J is the number of restrictions in h, n is the number of observations, and
D(θ0)⁻¹S(θ0)(D(θ0)⁻¹)′ is the asymptotic variance-covariance matrix of θ̂. Given the
size of the test and the corresponding critical value from the χ²(J) distribution, the null
hypothesis is rejected if the value of the test statistic W is greater than the critical value.
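A minimal sketch of this test is given below, using the standard delta-method form W = n·h(θ̂)′[H Σ H′]⁻¹h(θ̂), where Σ denotes the asymptotic variance-covariance matrix of θ̂. The parameter ordering θ = (β0, β1, β2, γ0, γ1, γ2), the scale value, the covariance matrix, and the estimates are all hypothetical; the 5% critical value for three restrictions is hard-coded from standard χ² tables.

```python
CHI2_CRIT_5PCT_3DF = 7.8147   # 95th percentile of chi-square(3), from tables


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-14:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def wald_statistic(theta_hat, sigma, scale, n):
    """Wald statistic for gamma0 = scale*beta0, gamma1 = beta1, gamma2 = scale*beta2.

    theta_hat is ordered (beta0, beta1, beta2, gamma0, gamma1, gamma2) and
    sigma is the 6x6 asymptotic covariance matrix of sqrt(n)*(theta_hat - theta0).
    """
    b0, b1, b2, g0, g1, g2 = theta_hat
    h = [g0 - scale * b0, g1 - b1, g2 - scale * b2]
    # Jacobian of h with respect to theta (3 x 6).
    H = [[-scale, 0.0, 0.0, 1.0, 0.0, 0.0],
         [0.0, -1.0, 0.0, 0.0, 1.0, 0.0],
         [0.0, 0.0, -scale, 0.0, 0.0, 1.0]]
    H_sigma = [[sum(H[i][k] * sigma[k][j] for k in range(6)) for j in range(6)]
               for i in range(3)]
    middle = [[sum(H_sigma[i][k] * H[j][k] for k in range(6)) for j in range(3)]
              for i in range(3)]
    return n * sum(hi * xi for hi, xi in zip(h, solve(middle, h)))


if __name__ == "__main__":
    identity = [[1.0 if i == j else 0.0 for j in range(6)] for i in range(6)]
    theta_hat = [1.0, 0.5, 1.0, 3.0, 0.5, 2.0]      # hypothetical estimates
    W = wald_statistic(theta_hat, identity, scale=2.0, n=100)
    print(W, "reject H0" if W > CHI2_CRIT_5PCT_3DF else "do not reject H0")
```

In an application, Σ would be replaced by a consistent estimate of D(θ0)⁻¹S(θ0)(D(θ0)⁻¹)′ and `scale` by 1 + α/((1 − 2α)τ).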
References
P. Artzner, F. Delbaen, J. M. Eber, and D. Heath. Coherent measures of risk. Mathematical Finance, 9:203–228, 1999.
L. Bauwens, A. Preminger, and J.V.K. Rombouts. Theory and inference for a Markov
switching GARCH model. Econometrics Journal, 13:218–244, 2010.
E. Bofinger. Estimation of a density function using order statistics. Australian Journal
of Statistics, 17:1–7, 1975.
S.X. Chen. Nonparametric estimation of expected shortfall. Journal of Financial Econometrics, 2:87–107, 2008.
F.X. Diebold and R. S. Mariano. Comparing predictive accuracy. Journal of Business
and Economic Statistics, 13:253–263, 1995.
B. Efron. Regression percentiles using asymmetric squared error loss. Statistica Sinica,
1:93–125, 1991.
R. Engle and S. Manganelli. CAViaR: Conditional autoregressive value at risk by regression quantiles. Journal of Business and Economic Statistics, 22:367–381, 2004.
P. Hall and S. Sheather. On the distribution of a studentized quantile. Journal of the
Royal Statistical Society, Series B, 50:381–391, 1988.
P.J. Huber. The behaviour of maximum likelihood estimates under nonstandard conditions. Proceedings of the Fifth Berkeley Symposium, 4:221–233, 1967.
E. Jondeau and M. Rockinger. Conditional volatility, skewness, and kurtosis: Existence,
persistence, and comovements. Journal of Economic Dynamics and Control, 27:1699–
1737, 2003.
R. Koenker. Quantile Regression (Econometric Society Monographs). Cambridge University Press, Cambridge, MA, 2005.
C.M. Kuan, J.H. Yeh, and Y.C. Hsu. Assessing value at risk with CARE, the conditional
autoregressive expectile models. Journal of Econometrics, 150:261–270, 2009.
J.W. Taylor. Estimating value at risk and expected shortfall using expectiles. Journal
of Financial Econometrics, 6:231–252, 2008.
Q. Yao and H. Tong. Asymmetric least squared regression estimation: A nonparametric
approach. Journal of Nonparametric Statistics, 6:273–292, 1996.
M.L. Yuan and M.C.W. Wong. Analytical VaR and expected shortfall for quadratic
portfolios. Journal of Derivatives, 7:33–44, 2010.