
Indirect Inference for Learning Models
Laurent E. Calvet and Veronika Czellar∗
December 2011
Abstract
This paper proposes an indirect inference-based (Gouriéroux, Monfort and Renault 1993; Smith 1993) estimation method for a large class of structural dynamic equilibrium models, including economies with incomplete information,
heterogeneous agents, or market imperfections. We demonstrate by Monte
Carlo simulations its good accuracy on an asset pricing model with investor
learning, and also report empirical estimates based on 80 years of daily U.S.
equity returns. We find that the effect of learning in the return equation is
significant, i.e. agents are not fully informed about the state of fundamentals. The structural incomplete-information model has good predictive power
out-of-sample, and produces more accurate value at risk estimates than its
full-information variant, GARCH(1,1), or historical simulations.
JEL classification: C01; C13; C15; C53; C58
Keywords: Hidden Markov model, particle filter, learning, indirect inference,
value at risk.
∗ Calvet: Department of Finance, HEC Paris, 1 rue de la Libération, 78351 Jouy en Josas, France;
[email protected]. Czellar: Department of Economics and Decision Sciences, HEC Paris, 1 rue de la
Libération, 78351 Jouy en Josas, France, [email protected]. We received helpful comments from Pavel
Chigansky, Adlai Fisher, Thierry Foucault, Itzhak Gilboa, Christian Gouriéroux, Lars Hansen, Per
Mykland, Hashem Pesaran, Nicholas Polson, Elvezio Ronchetti, Andrew Siegel, Ramon van Handel,
Pietro Veronesi, Paolo Zaffaroni, and seminar participants at CORE, the University of Chicago, the
University of Evry, the Second HEC Finance and Statistics Conference, CEF 2010 in London, the
2010 Toulouse School of Economics Financial Econometrics Conference, ISBIS 2010 in Portoroz,
and the 2010 Econometric Society World Congress in Shanghai. We gratefully acknowledge the
computer support of EUROFIDAI and the financial support of the American Statistical Association,
the Europlace Institute of Finance, and the National Institute of Statistical Sciences.
1 Introduction
The structural estimation of nonlinear dynamic equilibrium models is the subject of
intense research in macroeconomics (e.g. Fernández-Villaverde and Rubio-Ramirez,
2007; Schorfheide, 2011; Smith, 1993), finance (Dridi, Guay and Renault, 2007),
labor economics (e.g. Guvenen and Smith, 2010), and many other fields. The researcher now has access to reliable methods for estimating equilibrium models in
which utility-maximizing agents select optimal economic outcomes. Given the technical challenges involved with the development of these methods, most work in this
area has until now focused on specifications in which agents are perfectly informed
about the state of fundamentals, have identical characteristics, and face no market
frictions.
A distinct branch of the literature relies on calibration to explore the equilibrium consequences of more general economic assumptions. Incomplete information,
heterogeneous agent characteristics, borrowing constraints, or other forms of market
incompleteness have been widely studied. The analysis frequently begins with a calibrated plain vanilla model that has good tractability properties; it is precisely the
type of model that can be efficiently estimated given recent econometric advances.
The full-fledged structural model, parameterized by θ ∈ Θ ⊆ Rp, is technically more challenging, but explains features of the data that the plain vanilla model cannot match.
In this paper, we propose that indirect inference (Gouriéroux, Monfort and Renault 1993; Smith 1993) is well-suited to conduct estimation in such environments.
Indirect inference is a simulation-based method that imputes the structural parameters of a model via an auxiliary parameter.1 Let φ ∈ Rq , q ≤ p, denote the parameter
of the plain vanilla model, and φ̂T denote an estimator of φ. Let η̂T denote a set
of statistics quantifying features of the data that the full-fledged structural model is designed to capture.

¹ Further advances in indirect inference include Calzolari, Fiorentini and Sentana (2004), Czellar, Karolyi and Ronchetti (2007), Czellar and Ronchetti (2010), Genton and Ronchetti (2003), Gouriéroux, Phillips and Yu (2010), Heggland and Frigessi (2004), and Sentana, Calzolari and Fiorentini (2008).

The vector
\[
\hat{\mu}_T = \begin{pmatrix} \hat{\phi}_T \\ \hat{\eta}_T \end{pmatrix} \qquad (1.1)
\]
is a natural auxiliary estimator for the computation of θ. Thus, the structure of
the dynamic model leads us to construct the auxiliary estimator by expanding the
plain vanilla model’s estimator with a vector of statistics that the structural model
is designed to capture.
The approach naturally applies to general equilibrium models with incomplete
information. Sequential learning by economic agents is a powerful mechanism that
theoretically explains key properties of asset returns, aggregate performance and
other equilibrium outcomes (e.g., Pástor and Veronesi, 2009a).2 In this earlier research, the learning model is calibrated and shown to account for features of the data
that full-information versions of the model cannot explain. Despite their theoretical
appeal, however, learning models are not used in practice to forecast and price assets
because they have until recently been econometrically intractable.
To clarify the exposition, we define a class of recursive incomplete-information
economies that nests the examples of Brandt, Zeng, and Zhang (2004), Calvet and
Fisher (2007), David and Veronesi (2006), Lettau, Ludvigson and Wachter (2008),
Moore and Schaller (1996), and van Nieuwerburgh and Veldkamp (2006). The economy includes three levels of information, corresponding to nature, an agent and the
econometrician, as is illustrated in Figure 1. At the beginning of every period t,
nature selects a Markov vector Mt . The agent observes a signal, whose distribution
is contingent on Mt , and computes the conditional probability distribution (“belief”)
Πt of the Markov vector Mt given current and past signals. According to her beliefs
and signal, the agent also computes a data point yt, which the econometrician observes. The data point may, for example, include asset returns, prices, or production decisions.

² In financial economics, investor learning has been used to explain phenomena as diverse as the level and volatility of equity prices, return predictability, portfolio choice, mutual fund flows, firm profitability following initial public offerings, and the performance of venture capital investments. In particular, the portfolio and pricing implications of learning are investigated in Brennan (1998), Brennan and Xia (2001), Calvet and Fisher (2007), David (1997), Guidolin and Timmermann (2003), Hansen (2007), Pástor and Veronesi (2009b), Timmermann (1993, 1996), and Veronesi (1999, 2000). We refer the reader to Pástor and Veronesi (2009a) for a recent survey of learning in finance.

Figure 1: Information structure. Nature sets the state Mt; the agent observes the signal st and infers a belief Πt about Mt; the econometrician observes the data point yt.
The parameter θ of an incomplete-information economy specifies the Markov
chain Mt , the precision of the signals received by the agent, and their impact on the
observation yt . The state of the learning economy consists of nature’s Markov vector
and the agent’s belief: xt = (Mt , Πt ). In a recent paper (Calvet and Czellar 2011),
we have introduced the State-Observation Sampling (SOS) filter, which extends the
bootstrap particle filter of Gordon, Salmond and Smith (1993) to dynamic systems
with intractable observation densities. This advance allows us to track the state xt
of a learning economy with known structural parameters. In addition, the new algorithm permits the estimation of the structural parameter θ of the model. However,
the computational requirements of this type of estimation method are very large, making it currently impractical for the large-scale applications that one would like to consider
in financial economics.
The parameter θ of an incomplete-information economy can be estimated by indirect inference. When the agent observes nature’s vector Mt , the economy reduces to
a plain vanilla Markov-switching economy (Hamilton, 1989). We define the auxiliary
estimator by stacking (a) the full-information economy's maximum likelihood estimator with (b) a set of statistics that the incomplete-information model is designed
to capture.
We investigate the performance of our estimation technique on a structural model
of daily equity returns. Because the rich dynamics of the return series requires a large
state space, we base our analysis on the multifrequency learning economy of Calvet
and Fisher (“CF” 2007). We verify by Monte Carlo simulation that the indirect
inference estimator performs well in finite samples.
We estimate the structural model on the daily excess returns of the CRSP U.S.
value-weighted index between 1926 and 1999. We show that the effect of learning in the return equation is significant, i.e. agents are not fully informed about
the state of fundamentals. We conduct the out-of-sample analysis by relying on
the SOS filter of Calvet and Czellar (2011). For the out-of-sample period (2000–2009), the incomplete-information model provides accurate value-at-risk forecasts,
which significantly outperform the predictions obtained from historical simulations,
GARCH(1,1), and the full-information (FI) model.
The paper is organized as follows. In section 2, we discuss the indirect inference
estimation of structural economic models. In section 3, we introduce a class of recursive learning economies. Section 4 applies these methods to a multifrequency investor
learning model; we verify the accuracy of our inference methodology by Monte Carlo
simulations, and conduct inference on the daily returns of a U.S. aggregate equity
index between 1926 and 2009. Section 5 concludes.
2 Indirect Inference Estimation of a Dynamic Structural Model

2.1 Setup
We consider a dataset YT = (y1 , . . . , yT ) generated from a structural economic model
with parameter θ∗ ∈ Θ ⊆ Rp . We assume that for every θ ∈ Θ, it is possible to
simulate a sample path from the structural model defined by θ.
We also assume that to each structural model θ ∈ Θ, we can associate an auxiliary
plain vanilla economy that can be conveniently estimated. For instance, the structural model can be an incomplete-information economy, and the plain vanilla model
a full-information version in which agents fully observe the state of fundamentals, as
will be the case in section 3. In addition or alternatively, one could consider that the
structural model allows for market frictions or imperfections, while the basic model
is frictionless. The structural model may also include agents with heterogeneous
characteristics, while the basic model assumes identical characteristics.
The plain vanilla model is specified by φ ∈ Rq , where q ≤ p. The basic model
may be a restricted version of the full-fledged model (nested case), or a discretized
version of the possibly restricted structural model (nonnested case). From a practical
standpoint, the important feature is that there be a tight link between the structural
and the basic models. The other requirement is that the parameter φ of the basic
model can be efficiently imputed by an estimator φ̂T .
2.2 Estimation Methodology
The definition of the indirect inference estimator of the structural parameter θ∗
proceeds in three steps.
1. We define an auxiliary estimator µ̂T by stacking the plain vanilla model’s estimator with a set of p − q auxiliary statistics:
\[
\hat{\mu}_T = \begin{pmatrix} \hat{\phi}_T \\ \hat{\eta}_T \end{pmatrix} \in \mathbb{R}^p. \qquad (2.1)
\]
By construction, µ̂T has the same dimension as the structural parameter θ.
2. Let H ≥ 1. For any admissible parameter θ, we can simulate a sample path
YHT (θ) of length HT, and compute the corresponding pseudo-auxiliary estimator:
\[
\hat{\mu}_{HT}(\theta) = \begin{pmatrix} \hat{\phi}_{HT}(\theta) \\ \hat{\eta}_{HT}(\theta) \end{pmatrix}, \qquad (2.2)
\]
where φ̂HT (θ) and η̂HT (θ) denote the auxiliary statistics of the simulated path.
3. The indirect inference estimator θ̂T is defined by:
\[
\hat{\theta}_T = \arg\min_{\theta}\; \bigl[\hat{\mu}_{HT}(\theta) - \hat{\mu}_T\bigr]'\, \Omega\, \bigl[\hat{\mu}_{HT}(\theta) - \hat{\mu}_T\bigr], \qquad (2.3)
\]
where Ω is a positive definite weighting matrix.
The investigation of a dynamic economic model often begins with the characterization of a plain vanilla version, so the estimation method we are proposing follows
the natural progression used in the literature. In the definition of the auxiliary estimator (2.1), the p − q auxiliary statistics η̂T are chosen to quantify features of the
dataset YT that the full-fledged structural model is designed to capture. We focus on
the exactly identified case (dim(µ̂T ) = dim(θ̂T )) to simplify the exposition and because earlier evidence indicates that parsimonious auxiliary models tend to provide
more accurate inference in finite samples (e.g. Andersen, Chung, and Sorensen, 1999;
Czellar and Ronchetti, 2010). Our approach naturally extends to the overidentified
case when it is economically important to match a larger set of statistics.
2.3 Asymptotics and Numerical Implementation
The asymptotic properties of θ̂T directly follow from the general results of Gouriéroux
et al. (1993) and Gouriéroux and Monfort (1996). Assume that assumptions A1–
A3 given in Appendix A hold and that the dataset is generated from the structural
model with parameter θ∗ . When the sample size T goes to infinity, the auxiliary
estimator µ̂T converges in probability to a deterministic function µ(θ∗), called the binding function, and $\sqrt{T}\,[\hat{\mu}_T - \mu(\theta^*)] \xrightarrow{d} N(0, W^*)$. Furthermore, for a fixed H,
the estimator θ̂T is consistent and asymptotically normal:
\[
\sqrt{T}\,(\hat{\theta}_T - \theta^*) \xrightarrow{d} N(0, \Sigma),
\]
where
\[
\Sigma = \left(1 + \frac{1}{H}\right) \left[\frac{\partial \mu(\theta^*)}{\partial \theta'}\right]^{-1} W^* \left[\frac{\partial \mu(\theta^*)'}{\partial \theta}\right]^{-1}. \qquad (2.4)
\]
A finite-sample equivalent of Σ is readily available, as is explained in Appendix A.
The numerical implementation can be accelerated by the efficient method of moments (EMM) when the estimation of the plain vanilla model is expensive. Assume
that the auxiliary estimator maximizes a criterion function QHT (µ). Since in the
just-identified case µ̂HT (θ̂T ) = µ̂T , the simulated auxiliary estimator µ̂HT (θ) satisfies
\[
\frac{\partial Q_{HT}}{\partial \mu}\bigl[\underbrace{\hat{\mu}_{HT}(\theta)}_{\hat{\mu}_T},\, Y_{HT}(\theta)\bigr] = 0\,.
\]
Hence, the indirect inference estimator θ̂T minimizes the EMM-type objective function:
\[
\frac{\partial Q_{HT}}{\partial \mu'}\bigl[\hat{\mu}_T, Y_{HT}(\theta)\bigr]\; W_T\; \frac{\partial Q_{HT}}{\partial \mu}\bigl[\hat{\mu}_T, Y_{HT}(\theta)\bigr], \qquad (2.5)
\]
where WT is any positive-definite weighting matrix. This property can be used to
compute θ̂T . For each iteration of θ, the evaluation of the EMM objective function
(2.5) requires only the evaluation of the score. By contrast, the evaluation of the
objective function (2.3) requires the calculation of the auxiliary estimator µ̂HT (θ).
We refer the reader to Appendix A for further discussion.
3 Recursive Incomplete-Information Economies
Indirect inference can be used to estimate incomplete-information economies. To
clarify the exposition, we consider a class of discrete-time stochastic economies
defined at t = 0, . . . , ∞ on the probability space (Ω, F, P) and parameterized by
θ ∈ Θ ⊆ Rp, p ≥ 1. In every period t, we define three levels of information corresponding to nature, a Bayesian agent, and the econometrician, as illustrated in
Figure 1.
3.1 Nature
At the beginning of every period t, nature selects a vector Mt driving the fundamentals of the economy. We assume that Mt follows a first-order Markov chain on
the set of mutually distinct states {m1(θ), . . . , md(θ)}. For every i, j ∈ {1, . . . , d}, we
denote by
ai,j (θ) = P(Mt = mj (θ)|Mt−1 = mi (θ); θ)
the transition probability from state i to state j. We assume that the Markov chain
Mt is irreducible, aperiodic, positive recurrent, and therefore ergodic. For notational
simplicity, we henceforth drop the argument θ from the states mj and transition
probabilities ai,j .
3.2 Agent
At the beginning of every period t, the agent receives a signal st ∈ RnS , which is
partially revealing on the Markov chain Mt . Let fS (st |Mt ; θ) denote the probability
density function of the signal conditional on Mt , and let St = (s1 , . . . , st ) denote the
vector of signals up to date t.
Assumption 1 (Signal). The signal satisfies the following conditions:
(a) P(Mt = mj |Mt−1 = mi , St−1 ; θ) = ai,j for all i, j ;
(b) fS (st |Mt , Mt−1 , . . . , M0 , St−1 ; θ) = fS (st |Mt ; θ).
Condition (a) implies that the signal does not contain leading information about the
Markov chain Mt . Condition (b) states that st is impacted by the current but not
by lagged versions of Mt .
The agent is Bayesian and satisfies the following conditions.
Assumption 2 (Agent Knowledge). The agent knows the structural parameter
θ, the Markov chain’s levels mj (θ) and transition probabilities ai,j (θ), and the signal’s
conditional density fS (·|Mt ; θ).
The agent recursively applies Bayes’ rule to compute the conditional probability
distribution of Mt given the signal’s history.
Proposition 1 (Agent Belief). The conditional probabilities Πjt = P(Mt = mj | St; θ) satisfy the recursion:
\[
\Pi_t^j = \frac{\omega^j(\Pi_{t-1}, s_t; \theta)}{\sum_{i=1}^{d} \omega^i(\Pi_{t-1}, s_t; \theta)} \quad \text{for all } j \in \{1, \dots, d\} \text{ and } t \ge 1, \qquad (3.1)
\]
where $\Pi_{t-1} = (\Pi_{t-1}^1, \dots, \Pi_{t-1}^d)$ and $\omega^j(\Pi_{t-1}, s_t; \theta) = f_S(s_t \,|\, M_t = m^j; \theta) \sum_{i=1}^{d} a_{i,j}\, \Pi_{t-1}^i$.
In applications, the agent values assets or makes financial, production or purchasing
decisions as a function of the belief vector Πt . Our methodology easily extends to
learning models with non-Bayesian agents, as in Brandt, Zeng, and Zhang (2004)
and Cecchetti, Lam and Mark (2000).
The state of the learning economy at a given date t is the mixed variable xt =
(Mt , Πt ), which consists of nature’s Markov chain Mt and the agent’s belief Πt . The
state space is therefore
\[
X = \{m^1, \dots, m^d\} \times \Delta_+^{d-1}, \qquad (3.2)
\]
where $\Delta_+^{d-1} = \{\Pi \in \mathbb{R}_+^d \mid \sum_{i=1}^{d} \Pi^i = 1\}$ denotes the (d − 1)-dimensional unit simplex.
Proposition 2 (State of the Learning Economy). The state of the learning
economy, xt = (Mt , Πt ), is first-order Markov. It is ergodic if the transition probabilities between states of nature are strictly positive: ai,j > 0 for all i, j, and the signal’s
conditional probability density functions fS (s|Mt = mj ; θ) are strictly positive for all
s ∈ RnS and j ∈ {1, . . . , d}.
The state of the learning economy xt preserves the first-order Markov structure of
nature’s Markov chain Mt . By Bayes’ rule (3.1), the transition kernel of the Markov
state xt is sparse when the dimension of the signal, nS , is lower than the dimension
of the unit simplex: nS < d − 1. Under the conditions stated in Proposition 2, the
state xt is ergodic for all values nS and d, which guarantees that the economy is
asymptotically independent of the initial state x0 .
3.3 Econometrician
Each period, the econometrician observes a data point yt ∈ RnY , which is assumed
to be a deterministic function of the agent’s signal and conditional probabilities over
states of nature:
\[
y_t = R(s_t, \Pi_t, \Pi_{t-1}; \theta). \qquad (3.3)
\]
We include Πt−1 in this definition to accommodate the possibility that yt is a growth
rate or return.
The parameter vector θ ∈ Rp specifies the three building blocks of the incomplete-information economy: (i) the levels and transition probabilities of the Markov chain
Mt ; (ii) the signal’s conditional density fS (·|Mt , θ), and (iii) the data function R(·; θ).
In some applications, it may be useful to add measurement error in (3.3); the estimation procedure of the next section applies equally well to such an extension.
3.4 Estimation of the Structural Parameter
The econometrician observes a dataset YT = (y1, . . . , yT) generated from the incomplete-information (II) economy with parameter θ∗, and seeks to estimate θ∗. The learning
model can be conveniently simulated as follows.
Simulation of the Learning Model
Step 1: sample Mt from Mt−1 using the transition probabilities ai,j ;
Step 2: sample the signal st from fS (·|Mt ; θ);
Step 3: apply Bayes’ rule (3.1) to impute the agent’s belief Πt ;
Step 4: compute the simulated data point ỹt = R(st , Πt , Πt−1 ; θ).
The estimation of θ∗ can therefore proceed by indirect inference.
To each learning model θ ∈ Θ, we associate an auxiliary full information (FI)
economy in which the agent observes both the state of nature Mt and the signal
st . The agent’s conditional probabilities are then Πjt = P(Mt = mj |St , Mt ; θ) for all
j. The belief vector reduces to Πt = 1Mt , where 1Mt denotes the vector whose j th
component is equal to 1 if Mt = mj and 0 otherwise, and by (3.3) the full information
data point is defined by yt = R(st , 1Mt , 1Mt−1 ; θ).
Consistent with the method discussed in section 2, the auxiliary estimator µ̂T is
defined by stacking an estimator of the full-information auxiliary economy with a set
of p − q auxiliary statistics.
Proposition 3 (Auxiliary Full-Information Likelihood). Assume that the probability density function of the data point yt conditional on the current and lagged states of nature, fi,j(yt; φ) = fY,FI(yt | Mt = mj, Mt−1 = mi; φ), is available analytically for
all i, j ∈ {1, . . . , d}. The log-likelihood function LF I (φ|YT ) is then available analytically.
The ML estimator of the auxiliary full-information economy φ̂T = arg maxφ LF I (φ|YT ) ∈
Rq can therefore be conveniently computed and used in the definition of the auxiliary
estimator.
In this subsection, we have assumed that the Markov process Mt takes finitely
many values. When Mt has an infinite support, we can discretize its distribution and
use the corresponding full-information discretized economy as an auxiliary model.
The definition and properties of the indirect inference estimator are otherwise identical.
4 Application to an Asset Pricing Model with Investor Learning
We now apply our methodology to a consumption-based asset pricing model. We
adopt the Lucas tree economy with regime-switching fundamentals of CF (2007),
which we use to specify the dynamics of daily equity returns.
4.1 Specification
Markov chain. The rich dynamics of daily returns requires a large state space. For
this reason, we consider that the state is a vector containing k̄ components:
\[
M_t = (M_{1,t}, \dots, M_{\bar{k},t})' \in \mathbb{R}_+^{\bar{k}}.
\]
We assume that Mt follows a binomial Markov Switching Multifractal (CF 2001,
2004, 2008). That is, the components are mutually independent across k. Let M
denote a Bernoulli distribution that takes either a high value m0 or a low value
2 − m0 with equal probability. Given a value Mk,t for the k th component at date t,
the next-period multiplier Mk,t+1 is either:
\[
M_{k,t+1} = \begin{cases}
\text{drawn from the distribution } M & \text{with probability } \gamma_k, \\
\text{equal to its current value } M_{k,t} & \text{with probability } 1 - \gamma_k.
\end{cases}
\]
Since each component of the state vector can take two possible values, the state space contains d = 2^k̄ elements m1, . . . , md. The transition probabilities γk are parameterized by
\[
\gamma_k = 1 - (1 - \gamma_{\bar{k}})^{b^{\,k - \bar{k}}}, \qquad k = 1, \dots, \bar{k},
\]
where b > 1.
Bayesian Agent. The agent receives an exogenous consumption stream {Ct } and
prices the stock, which is a claim on an exogenous dividend stream {Dt }. Every
period, the agent observes a signal st ∈ R^{k̄+2} consisting of dividend growth, consumption growth and a noisy version of nature's vector Mt:
\[
s_{1,t} = \ln(D_t/D_{t-1}) = g_D - 0.5\,\sigma_D^2(M_t) + \sigma_D(M_t)\,\varepsilon_{D,t}, \qquad (4.1)
\]
\[
s_{2,t} = \ln(C_t/C_{t-1}) = g_C + \sigma_C\,\varepsilon_{C,t}, \qquad (4.2)
\]
\[
s_{i+2,t} = M_{i,t} + \sigma_\delta\, z_{i,t}, \qquad i = 1, \dots, \bar{k}. \qquad (4.3)
\]
The noise parameter σδ ∈ R+ controls information quality. The stochastic volatility
of dividends is given by:
\[
\sigma_D(M_t) = \bar{\sigma}_D \left( \prod_{k=1}^{\bar{k}} M_{k,t} \right)^{1/2}, \qquad (4.4)
\]
where σ̄D ∈ R+. The innovations εC,t, εD,t, and zt are jointly normal and have zero
means and unit variances. We assume that εC,t and εD,t have correlation ρC,D , and
that all the other correlation coefficients are zero.
The agent has isoelastic expected utility, $U_0 = E_0 \sum_{t=0}^{\infty} \delta^t C_t^{1-\alpha}/(1-\alpha)$, where δ
is the discount rate and α is the coefficient of relative risk aversion. In equilibrium,
the log interest rate is constant. The stock’s price-dividend ratio is negatively related
to volatility and linear in the belief vector:
\[
Q(\Pi_t) = \sum_{j=1}^{d} Q(m^j)\, \Pi_t^j. \qquad (4.5)
\]
The linear coefficients Q(mj ) are available analytically (see Appendix C).
Econometric Specification of Stock Returns. The econometrician observes the
log excess return process:
\[
y_t = \ln\!\left[\frac{1 + Q(\Pi_t)}{Q(\Pi_{t-1})}\right] + s_{1,t} - r_f. \qquad (4.6)
\]
The return process is negatively skewed under incomplete information, as we now
explain. Assume for instance that the noise parameter σδ is large, so that the agent
learns about Mt primarily through the dividend growth s1,t. Because large realizations of dividend growth are implausible in a low-volatility regime, the agent tends to
learn abruptly about a volatility increase. Conversely, when volatility switches from
a high to a low state, the agent learns only gradually that volatility has gone down
because realizations of dividend growth near the mean are likely outcomes under any
Mt . Learning about volatility is therefore an asymmetric process. As a result, the
stock price falls abruptly after a volatility increase (bad news), but appreciates only
gradually following a drop in volatility (good news).
4.2 Indirect Inference Estimator
We now develop an estimator for the vector of structural parameters:
θ = (m0, γk̄, b, σδ)′ ∈ [1, 2] × (0, 1] × [1, ∞) × R+,
where m0 controls the variability of dividend volatility, γk̄ the transition probability
of the most transitory volatility component, b the spacing of the transition probabilities, and σδ the precision of the signal received by the representative agent. As
is traditional in the asset pricing literature, we calibrate all the other parameters
on aggregate consumption data and constrain the mean price-dividend ratio to a
plausible long-run value:
\[
E[Q(\Pi_t)] = Q. \qquad (4.7)
\]
The long-run mean Q is set equal to 25 in yearly units.3
³ An alternative approach would be to estimate all the parameters of the learning economy on aggregate excess return data. In the 2005 NBER version of their paper, CF applied this method to the FI model and obtained broadly similar results to the ones reported in the published version. This alternative approach has the disadvantage of not taking into account the economic constraints imposed by the model, and we do not pursue it here.
The learning economy is specified by p = 4 parameters, θ = (m0 , γk̄ , b, σδ )′ , while
the FI economy is specified by q = 3 parameters, φ = (m0 , γk̄ , b)′ . For this reason, the
definition of the auxiliary estimator requires an additional statistic η̂T ∈ R. Since
the noise parameter σδ controls the skewness of excess returns, the third moment
seems like a natural choice. We are concerned, however, that the third moment may
be too noisy to produce an efficient estimator of θ. For this reason, we consider an
alternative based on the observation that by restriction (4.7), the mean return is
nearly independent of the structural parameter:
\[
E(y_t) \approx \ln(1 + 1/Q) + g_D - r_f - 0.5\, \bar{\sigma}_D^2, \qquad (4.8)
\]
as is verified in Figure 3. Since the mean is fixed, the median can be used as a robust
measure of skewness.
The auxiliary estimator µ̂T = (φ̂T , η̂T )′ is therefore defined by expanding the
ML estimator of the full-information economy, φ̂T , with either the third moment
($\hat{\eta}_T = T^{-1} \sum_{t=1}^{T} y_t^3$) or median ($\hat{\eta}_T = \mathrm{median}\{y_t\}$) of returns.
Plots of the Binding Function. In Figure 2, we illustrate the relation between
the median-based auxiliary estimator µ̂T and the structural parameter θ on a long
simulated sample of length HT = 10^7 with k̄ = 3. The graphs can be viewed as cuts
of the binding function µ(θ). The top three rows show that for all i ∈ {1, 2, 3}, the
auxiliary parameter µ̂T,i increases monotonically with the corresponding parameters
θi of the learning economy, and is much less sensitive to the other parameters θj ,
j ≠ i (including σδ). Moreover, we note that the auxiliary estimator of b, based on FI
ML, is a biased estimator of the parameter b of the incomplete-information economy;
this finding illustrates the pitfalls of employing quasi-maximum likelihood estimation
in this setting. The bottom row shows that as the noise parameter σδ increases, the
median return increases monotonically, consistent with the fact that returns become
more negatively skewed. In Figure 3, we verify that the third moment is decreasing
monotonically with σδ . The structural parameter θ is thus well identified by our two
candidate auxiliary estimators.

Figure 2: Auxiliary Estimator. This figure illustrates the relation between the median-based auxiliary estimator and the structural parameter θ. In each column, one structural parameter is allowed to vary while the other three parameters are set to their reference values. Reference values for the four parameters are: m0 = 1.7, γk̄ = 0.06, b = 2, σδ = 1. The auxiliary estimate reported for every θ is obtained from a simulated sample of length 10^7 generated under the learning model θ with k̄ = 3 volatility components.
Figure 3: Moments. We report the first four sample moments for simulated data of size 10^7 generated under the learning model with k̄ = 3 and reference parameter values m0 = 1.7, γk̄ = 0.06, b = 2 and σδ = 1. In a given sensitivity plot, one parameter varies and the others are fixed at their true parameter values.
Figure 4: Leverage and correlation coefficients. We report the leverage coefficient $\sum_{t=2}^{T'} y_{t-1}\, y_t^2 / T'$ and the measure of volatility autocorrelation $\sum_{t=2}^{T'} y_{t-1}^2\, y_t^2 / T'$ with T′ = 10^7 simulated under the learning model with k̄ = 3 and reference parameter values m0 = 1.7, γk̄ = 0.06, b = 2 and σδ = 1. In a given sensitivity plot, one parameter varies and the others are fixed at their true parameter values.
Alternative Estimators. As a benchmark, we also construct a simulated method
of moments (SMM) estimator. In Figures 3 and 4, we illustrate the impact of the
structural parameter θ on the expected values of $y_t^n$, n ∈ {1, . . . , 4}, the leverage coefficient $y_{t-1}\, y_t^2$, and the volatility autocorrelation measure $y_{t-1}^2\, y_t^2$. The leverage
measure and the second, third and fourth moments appear to be the most sensitive
to the structural parameter θ, and are therefore selected for the definition of the
SMM estimator.
4.3 Monte Carlo Simulations
In Figure 5, we report boxplots of SMM, third moment-based and median-based II
estimates of θ obtained from 100 simulated sample paths of length T = 20,000 from the learning model with k̄ = 3 volatility components. For all three estimators, we set the simulation size to H = 500, so that each simulated path contains HT = 10^7
simulated data points. The indirect inference procedures provide more accurate
and less biased estimates of the structural parameters of the learning economy than
SMM. The median-based estimator provides substantially more accurate estimates
of the parameter σδ that controls the agent’s information quality. The median-based
estimator thus strongly dominates the other two candidate estimators, and we now
use it empirically. The Monte Carlo simulations confirm the excellent properties of
the estimation technique proposed in the paper.
4.4 Empirical Parameter and Likelihood Estimates
We apply our estimation methodology to the daily log excess returns on the U.S.
CRSP value-weighted equity index from 2 January 1926 to 31 December 2009. The
dataset contains 22,276 observations, which are illustrated in Figure 6. We partition
the dataset into an in-sample period, which runs until 31 December 1999, and an out-of-sample period, which covers the remaining ten years.
In Table 1, we report the empirical estimates of the II model for k̄ = 1, . . . , 4.
Figure 5: Monte Carlo Simulations of the Learning Model Estimators. This figure illustrates boxplots of the structural parameter estimates obtained using SMM (left boxplot of each panel), the indirect inference estimator based on the third moment (middle boxplots), and the median-based indirect inference estimator (right boxplots). The horizontal lines correspond to the true values of the parameters.

Figure 6: U.S. Equity Return Data. This figure illustrates the daily log excess returns on the CRSP U.S. value-weighted equity index between 2 January 1926 and 31 December 2009. The dashed line separates the in-sample and out-of-sample periods.

Table 1: Empirical estimates

k̄      m0          γk̄          b           σδ           Log-likelihood
1      1.732       0.063       −           47.291       65,669.2
      (0.0043)    (0.0035)                (99.6972)    (−10.4447)
2      1.714       0.054       21.051      3.834        67,106.8
      (0.0038)    (0.0078)    (8.9345)    (1.0387)     (−7.9008)
3      1.690       0.069       16.631      2.487        67,559.1
      (0.0186)    (0.0059)    (2.8115)    (0.4138)     (−7.8274)
4      1.587       0.046       5.110       1.415        68,165.2
      (0.0111)    (0.0066)    (1.4316)    (0.3034)

Notes: We report empirical estimates of the learning model (with standard errors in parentheses) based on the daily excess returns of the CRSP index between 2 January 1926 and 31 December 1999. The log-likelihood estimates are based on an SOS filter containing N = 10^7 particles. HAC-adjusted Vuong tests comparing the k̄ ≤ 3 specifications to k̄ = 4 are reported in parentheses below the log-likelihood estimates.
These estimates minimize the EMM-type objective function, with the score provided in Appendix C.
We use the median-based indirect inference estimator, let HT = 10^7, and report
standard errors in parentheses. The estimate of σδ is significant for k̄ > 1. Thus
the effect of learning in the return equation is statistically significant. The estimate
of σδ also declines with k̄.⁴ This finding is consistent with the intuition that as k̄
increases, the effect of learning becomes increasingly powerful, and a lower σδ better
matches the negatively skewed excess return series.

⁴ When k̄ = 1, the auxiliary parameter is nearly invariant to σδ in the relevant region of the parameter space. The derivative of the score is almost singular, and we compute the standard errors using 200 Monte Carlo estimates in this case. The specification with k̄ = 1 cannot match the median of historical returns and is therefore severely misspecified. These findings illustrate the empirical importance of using higher values of k̄.
We also report the log-likelihood L of each specification, estimated by an SOS
filter (Calvet and Czellar, 2011) with kernel Kht (y) = K(y/ht )/ht :
\[
\hat{L} = \sum_{t=1}^{T} \ln\!\left[ \frac{1}{N} \sum_{n=1}^{N} K_{h_t}\bigl(y_t - \tilde{y}_t^{(n)}\bigr) \right]. \qquad (4.9)
\]
The pseudo-observations in (4.9) are defined by the following SOS algorithm.
SOS Filter
Step 1 (State-observation sampling): For every n = 1, . . . , N, we simulate a state-observation pair $(\tilde{x}_t^{(n)}, \tilde{y}_t^{(n)})$ from $f_{X,Y}(\cdot \,|\, x_{t-1}^{(n)}, Y_{t-1})$.

Step 2 (Importance weights): We observe the new data point yt and compute
\[
p_t^{(n)} = \frac{K_{h_t}\bigl(y_t - \tilde{y}_t^{(n)}\bigr)}{\sum_{n'=1}^{N} K_{h_t}\bigl(y_t - \tilde{y}_t^{(n')}\bigr)}, \qquad n = 1, \dots, N.
\]

Step 3 (Multinomial resampling): For every n = 1, . . . , N, we draw $x_t^{(n)}$ from $\tilde{x}_t^{(1)}, \dots, \tilde{x}_t^{(N)}$ with importance weights $p_t^{(1)}, \dots, p_t^{(N)}$.
We use the quasi-Cauchy kernel K(u) = 1/(1 + Cu²)², C = (π/2)², with the optimal bandwidth $h_t = \sigma_t \left( \frac{5\pi^{9/2}}{48\,N} \right)^{1/5}$, where σt is estimated by the sample standard deviation of the simulated observations $\{\tilde{y}_t^{(n)}\}_{n=1}^{N}$.
Under regularity conditions, the SOS estimated log-likelihood converges in mean squared error at rate N^{−2/5}. The SOS filter thus requires a large number of particles, and we choose N = 10^7 here. A simulated ML approach based on SOS would be an alternative estimation method for the learning model, but it would require the maximization of a function with numerical complexity T × N = 19,761 × 10^7, which would be highly impractical. Our indirect inference approach instead reduces the numerical complexity of the objective function to T × H = 19,761 × 500.
The likelihood function of the II model increases steadily with k̄. We report in
parentheses the t-ratios of a HAC-adjusted Vuong (1989) test, that is, the rescaled
differences between the log-likelihoods of the lower-dimensional (k̄ ∈ {1, 2, 3}) and
the highest-dimensional (k̄ = 4) specifications. The four-component model has a
significantly higher likelihood than the other specifications and is therefore selected
for the out-of-sample analysis.
4.5 Value at Risk Forecasts
We now turn to the out-of-sample implications of the incomplete-information model.
The value at risk $VaR_{t+1}^p$ constructed on day t is such that the return on day t + 1 will be lower than $-VaR_{t+1}^p$ with probability p. The failure rate is specified as the fraction of observations for which the actual loss exceeds the value at risk. In a well
specified VaR model, the failure rate is on average equal to p. We use as a benchmark
historical simulations (e.g. Christoffersen 2009) and Student GARCH(1,1), which are
widely used in practice. The historical VaR estimates are based on a window of 60
days, which corresponds to a calendar period of about three months. In Table 2, we
report the failure rates of the $VaR_{t+1}^p$ forecasts for p = 1%, 5%, 10%, at horizons of 1
and 5 days produced by: historical simulations, GARCH, the full-information model
and the learning model with k̄ = 4. Standard deviations are reported in parentheses.
A failure rate is in bold characters if it differs from its theoretical value at the 1%
significance level.

Table 2: Failure rates of value-at-risk forecasts

                         One Day                          Five Days
Models            1%        5%        10%         1%        5%        10%
Historical VaR    −         0.069     0.119       −         0.066     0.129
                            (0.0051)  (0.0065)              (0.0111)  (0.0150)
GARCH             0.081     0.154     0.197       0.048     0.123     0.165
                  (0.0054)  (0.0072)  (0.0079)    (0.0095)  (0.0147)  (0.0166)
FI, k̄ = 4         0.016     0.070     0.132       0.012     0.068     0.143
                  (0.0025)  (0.0051)  (0.0067)    (0.0048)  (0.0112)  (0.0156)
II, k̄ = 4         0.010     0.047     0.094       0.010     0.060     0.135
                  (0.0020)  (0.0042)  (0.0058)    (0.0044)  (0.0106)  (0.0153)

Notes: This table reports the failure rates of the 1-day and 5-day value at risk forecasts produced by various methods in the out-of-sample period (2000-2009). The historical VaR is based on a rolling window of 60 days. The GARCH, FI and II forecasts are computed using in-sample parameter estimates. II forecasts are based on an SOS filter with N = 10^7 particles. The significance level is 1%.
Historical simulations provide inaccurate VaR forecasts at the 1-day horizon. The
failure rates are significantly higher than their theoretical values, which suggests that
historical simulations provide overly optimistic estimates of value at risk. GARCH
failure rates are significantly higher than their theoretical values in all cases, while the FI model's VaR predictions are rejected in three out of six cases. On the other hand, the VaR predictions
from the learning model are all consistent with the data. Our empirical findings
suggest that the learning model captures well the dynamics of daily stock returns,
and outperforms out of sample some of the best reduced-form specifications. We
note that this is an excellent result for a consumption-based asset pricing model.
5 Conclusion
This paper illustrates that indirect inference is well suited for the estimation of a wide
class of dynamic equilibrium models. The technique builds on the observation that
plain vanilla equilibria, based on strong economic assumptions such as full information or the absence of frictions, can often be easily defined and efficiently estimated.
An auxiliary estimator of the full-fledged structural model can then be defined by
expanding the plain vanilla model’s estimator with a set of auxiliary statistics. We
have defined a class of recursive learning economies based on a hidden Markov chain,
and shown that the full-information version of the model, which has a closed-form
likelihood, is then a natural building block for parameter estimation.
We have applied these methods to a consumption-based asset pricing model with
investor learning, in which volatility has from 1 to 4 degrees of persistence. We have
verified by Monte Carlo simulations the accuracy of our indirect inference estimators,
and have implemented these techniques on a long series of daily excess stock returns.
We have estimated the parameters driving fundamentals and the quality of the signals received by investors. We have then used the SOS filter to estimate the likelihood
of each imputed model. The effect of learning in the return equation is reported to
be statistically significant in the preferred specifications with 2 to 4 degrees of persistence. The structural equilibrium model also provides good value-at-risk forecasts
out of sample. This paper illustrates that indirect inference, in combination with the
SOS filter, is a powerful method to estimate and filter structural general equilibrium
models.
A Appendix to Section 2 (Estimation)
We provide a set of sufficient conditions for the asymptotic results stated in section
2, and then discuss numerical implementation.
A.1 Sufficient Conditions for Convergence
We assume that the plain vanilla model’s estimator φ̂T maximizes a criterion function
F(φ, YT ), and that the vector of auxiliary statistics η̂T maximizes a criterion H(η, YT )
that does not depend on φ̂T . The auxiliary estimator µ̂T = (φ̂′T , η̂T′ )′ can therefore
be written as:
\[
\hat{\mu}_T = \arg\max_{\mu}\, Q_T(\mu, Y_T), \qquad (A.1)
\]
where QT (µ, YT ) = F(φ, YT ) + H(η, YT ).
Assumption A1 (Binding Function) Under the structural model θ∗ , the auxiliary criterion function QT (µ, YT ) converges in probability to Q∞ (µ, θ∗ ) for all µ.
Moreover, the binding function µ : Rp → Rp defined by
\[
\mu(\theta) = \arg\max_{\mu}\, Q_\infty(\mu, \theta)
\]
is injective.
Assumption A2 (Score) The renormalized score satisfies:
\[
\sqrt{T}\, \frac{\partial Q_T}{\partial \mu}\bigl[\mu(\theta^*), Y_T\bigr] \xrightarrow{d} N(0, I_0),
\]
where I0 is a positive definite symmetric matrix.
Assumption A3 (Hessian of Criterion Function) The Hessian matrix
\[
-\frac{\partial^2 Q_T}{\partial \mu\, \partial \mu'}\bigl[\mu(\theta^*), Y_T\bigr]
\]
is invertible and converges in probability to a nonsingular matrix J0 .
Under assumptions A1–A3, the auxiliary estimator satisfies $\sqrt{T}\,[\hat{\mu}_T - \mu(\theta^*)] \xrightarrow{d} N(0, W^*)$, where $W^* = J_0^{-1} I_0 J_0^{-1}$, and the asymptotic results at the end of section 2
hold (Gouriéroux, Monfort, and Renault, 1993; Gouriéroux and Monfort, 1996).
A.2 Numerical Implementation
In the just-identified case, the asymptotic variance-covariance matrix of the indirect
inference estimator simplifies to
\[
\Sigma = \left(1 + \frac{1}{H}\right) \left[ \frac{\partial^2 Q_\infty}{\partial \theta\, \partial \mu'}\bigl[\mu(\theta^*), \theta^*\bigr]\; I_0^{-1}\; \frac{\partial^2 Q_\infty}{\partial \mu\, \partial \theta'}\bigl[\mu(\theta^*), \theta^*\bigr] \right]^{-1},
\]
as shown in Gouriéroux and Monfort (1996). In practice, we can estimate it as
follows.
Assumption A4 (Decomposable Score) There exists a function ψ : Rt × Rp →
R, such that the score function can be written as
\[
\frac{\partial Q_T}{\partial \mu}(\mu, Y_T) \equiv \frac{1}{T} \sum_{t=1}^{T} \psi(y_t | Y_{t-1}; \mu)
\]
for all YT and µ.
By Assumption A4, the auxiliary parameter satisfies the first-order condition:
\[
\frac{\partial Q_T}{\partial \mu}(\hat{\mu}_T, Y_T) = \frac{1}{T} \sum_{t=1}^{T} \psi(y_t | Y_{t-1}; \hat{\mu}_T) = 0. \qquad (A.2)
\]
We estimate I0 by the Newey and West (1987) variance-covariance matrix:
\[
\hat{I}_0 = \hat{\Gamma}_0 + \sum_{v=1}^{\tau} \left(1 - \frac{v}{\tau + 1}\right) \bigl(\hat{\Gamma}_v + \hat{\Gamma}_v'\bigr), \qquad (A.3)
\]
where $\hat{\Gamma}_v = T^{-1} \sum_{t=v+1}^{T} \psi(y_t | Y_{t-1}; \hat{\mu}_T)\, \psi(y_{t-v} | Y_{t-v-1}; \hat{\mu}_T)'$. All the results reported in the
paper are based on τ = 10 lags. We approximate $\frac{\partial^2 Q_\infty}{\partial \theta\, \partial \mu'}[\mu(\theta^*), \theta^*]$ by
\[
\frac{\partial^2 Q_{HT}}{\partial \theta\, \partial \mu'}\bigl[\hat{\mu}_T, Y_{HT}(\hat{\theta}_T)\bigr],
\]
and obtain a finite-sample estimate of the asymptotic variance-covariance matrix Σ.
B Appendix to Section 3 (Learning Economies)

B.1 Proof of Proposition 1
We infer from Bayes' rule that
\[
\Pi_t^j \propto \underbrace{f_S(s_t | M_t = m^j, S_{t-1}; \theta)}_{=\, f_S(s_t | M_t = m^j; \theta) \text{ by As. 1(b)}} \, P(M_t = m^j | S_{t-1}; \theta),
\]
where
\[
P(M_t = m^j | S_{t-1}; \theta) = \sum_{i=1}^{d} \underbrace{P(M_t = m^j | M_{t-1} = m^i, S_{t-1}; \theta)}_{=\, a_{i,j} \text{ by As. 1(a)}}\, P(M_{t-1} = m^i | S_{t-1}; \theta),
\]
and Proposition 1 holds.

B.2 Proof of Proposition 2
Bayes’ rule (3.1) implies that for every i ∈ {1, . . . , d},
\[
\Pi_t \,|\, (M_t = m^i, x_{t-1}, \dots, x_1) = \Pi_t \,|\, (M_t = m^i, \Pi_{t-1}). \qquad (B.1)
\]
Also, by Assumption 1(a),
\[
P(M_t = m^i | x_{t-1}, \dots, x_1; \theta) = P(M_t = m^i | M_{t-1}; \theta). \qquad (B.2)
\]
From (B.1) and (B.2), we conclude that xt is first-order Markov.
We know from Kaijser (1975) that under the conditions stated in the proposition,
the belief process Πt has a unique invariant distribution. Proposition 2.1 in van
Handel (2009) implies that (Mt , Πt ) also has a unique invariant measure Λ∞ .5 We
infer from the Birkhoff-Khinchin theorem that for any integrable function Φ : X → R,
the sample average $T^{-1} \sum_{t=1}^{T} \Phi(x_t)$ converges almost surely to the expectation of Φ
under the invariant measure Λ∞ .
B.3 Proof of Proposition 3
The econometrician recursively applies Bayes' rule:
\[
P(M_t = m^j | Y_t; \phi) = \frac{f_{Y,FI}(y_t | M_t = m^j, Y_{t-1}; \phi)\, P(M_t = m^j | Y_{t-1}; \phi)}{f_{Y,FI}(y_t | Y_{t-1}; \phi)}\,.
\]
Since $f_{Y,FI}(y_t | M_t = m^j, Y_{t-1}; \phi) = \sum_{i=1}^{d} f_{i,j}(y_t; \phi)\, P(M_{t-1} = m^i | M_t = m^j, Y_{t-1}; \phi)$, we infer that $f_{Y,FI}(y_t | M_t = m^j, Y_{t-1}; \phi)\, P(M_t = m^j | Y_{t-1}; \phi) = \sum_{i=1}^{d} f_{i,j}(y_t; \phi)\, P(M_{t-1} = m^i, M_t = m^j | Y_{t-1}; \phi)$, and therefore
\[
P(M_t = m^j | Y_t; \phi) = \frac{\sum_{i=1}^{d} a_{i,j}\, f_{i,j}(y_t; \phi)\, P(M_{t-1} = m^i | Y_{t-1}; \phi)}{f_{Y,FI}(y_t | Y_{t-1}; \phi)}\,.
\]
The econometrician's conditional probabilities are therefore computed recursively. Since the conditional probabilities $P(M_t = m^j | Y_t; \phi)$ add up to unity, the conditional density of $y_t$ satisfies
\[
f_{Y,FI}(y_t | Y_{t-1}; \phi) = \sum_{i=1}^{d} \sum_{j=1}^{d} a_{i,j}\, f_{i,j}(y_t; \phi)\, P(M_{t-1} = m^i | Y_{t-1}; \phi).
\]
The log-likelihood function $L_{FI}(\phi | Y_T) = \sum_{t=1}^{T} \ln f_{Y,FI}(y_t | Y_{t-1}; \phi)$ is thus available analytically.

⁵ Chigansky (2006) derives a similar result in continuous time.
C Appendix to Section 4 (Application)

C.1 Bayesian Updating: Computation of Πjt
The conditional density of the signal st conditional on Mt = mj is
\[
f_S(s_t | M_t = m^j; \theta) = N\!\left[ \begin{pmatrix} g_D - \dfrac{\sigma_D^2(m^j)}{2} \\ g_C \end{pmatrix},\; \begin{pmatrix} \sigma_D^2(m^j) & \sigma_C \sigma_D(m^j)\rho_{C,D} \\ \sigma_C \sigma_D(m^j)\rho_{C,D} & \sigma_C^2 \end{pmatrix} \right] \; N\!\left( m^j,\, \sigma_\delta^2 I_{\bar{k}} \right),
\]
and the probabilities Πjt are recursively determined from the initial value Πj1 = 1/d, ∀j
and Bayes’ rule given in Proposition 1.
C.2 Specification
The log interest rate is constant and satisfies $r_f = -\ln(\delta) + \alpha g_C - \alpha^2 \sigma_C^2/2$. The linear coefficients are given by $\bigl(Q(m^1), \dots, Q(m^d)\bigr)' = (I - B)^{-1}\iota - \iota$, where $B = (b_{ij})_{1 \le i,j \le d}$ is the matrix with components $b_{ij} = a_{i,j} \exp\bigl(g_D - r_f - \alpha\, \rho_{C,D}\, \sigma_C\, \sigma_D(m^j)\bigr)$ and $\iota = (1, \dots, 1)'$.
The calibrated parameters are chosen such that they provide empirically plausible
results in the analysis of U.S. daily excess returns reported in CF (2007). Specifically,
we let gC = 0.75 basis point (bp) (or 1.18% per year), rf = 0.42 bp per day (1%
2.93% per year), σ̄D = 0.70% per day (about 11% per year), and ρC,D = 0.6. The
2.93% per year), σ D = 0.70% per day (about 11% per year), and ρC,D = 0.6. The
risk aversion coefficient α is chosen so that the average price-dividend ratio is $Q = d^{-1} \sum_{i=1}^{d} Q(m^i) = 6000$ in daily units (25 in yearly units).
C.3 Pseudo Data Generator
We can estimate the structural parameter θ using pseudo-data {yt∗ (θ)}t=2,...,HT , H ≥
1, simulated from the structural learning model with a given θ in the following way.
Simulate the following variates:
• j0 from a discrete uniform U{1, . . . , d};
• $\{u_{k,t}^*\}_{t=2,\dots,HT}^{k=1,\dots,\bar{k}}$ with $u_{k,t}^* \sim U[0,1]$;
• $\{\varepsilon_t^*\}_{t=2}^{HT}$ with
\[
\varepsilon_t^* = \begin{pmatrix} \varepsilon_{C,t}^* \\ \varepsilon_{D,t}^* \end{pmatrix} \sim N\!\left[ \begin{pmatrix} 0 \\ 0 \end{pmatrix},\; \begin{pmatrix} 1 & \rho_{C,D} \\ \rho_{C,D} & 1 \end{pmatrix} \right];
\]
• $\{z_t^*\}_{t=2,\dots,HT}$ where $z_t^* = (z_{1,t}^*, \dots, z_{\bar{k},t}^*)'$ with $z_{k,t}^* \sim N(0,1)$;
• $\{N_{k,t}^*\}_{t=2,\dots,HT}^{k=1,\dots,\bar{k}}$ with $N_{k,t}^* \sim \mathrm{Bernoulli}(0.5)$.

The pseudo-data is then obtained by calculating $M_1^* = m^{j_0}$ and
\[
M_{k,t}^* = \begin{cases} m_0^{N_{k,t}^*}\, (2 - m_0)^{1 - N_{k,t}^*} & \text{if } u_{k,t}^* < \gamma_k, \\ M_{k,t-1}^* & \text{otherwise.} \end{cases}
\]
The other variables are defined by:
\[
\begin{cases}
s_{1,t}^*(\theta) = g_D - 0.5\,\sigma_D^2[M_t^*(\theta)] + \sigma_D[M_t^*(\theta)]\,\varepsilon_{D,t}^* \\[4pt]
s_{2,t}^*(\theta) = g_C + \sigma_C\,\varepsilon_{C,t}^* \\[4pt]
s_{i+2,t}^* = M_{i,t}^*(\theta) + \sigma_\delta\, z_{i,t}^*, \qquad i = 1, \dots, \bar{k} \\[4pt]
y_t^*(\theta) = \ln\!\left[\dfrac{1 + Q[\Pi_t^*(\theta)]}{Q[\Pi_{t-1}^*(\theta)]}\right] + s_{1,t}^*(\theta) - r_f
\end{cases}
\quad \text{for all } t \ge 2. \qquad (C.1)
\]
C.4 Indirect Inference Estimator Based on the Median
The auxiliary estimator µ̂T = (φ̂T , η̂T )′ maximizes the auxiliary criterion function
$Q_T(\mu, Y_T(\theta)) = T^{-1} L_{FI}(m_0, \gamma_{\bar{k}}, b \,|\, Y_T) - T^{-1} \sum_{t=1}^{T} |y_t - \eta|$. Since it satisfies
\[
\frac{\partial Q_T}{\partial \mu}(\hat{\mu}_T, Y_T) = \frac{1}{T} \sum_{t=1}^{T} \begin{bmatrix} \dfrac{\partial \ln f_{Y,FI}}{\partial \phi}(y_t | Y_{t-1}; \hat{\phi}_T) \\[6pt] -\,\mathrm{sign}(y_t - \hat{\eta}_T) \end{bmatrix} = 0,
\]
we can apply the estimation methodology highlighted in section 2.
References
Andersen, T.G., H.J. Chung and B.E. Sorensen (1999). Efficient Method
of Moment Estimation of a Stochastic Volatility Model: A Monte Carlo Study.
Journal of Econometrics 91, 61–87.
Brandt, M. W., Zeng, Q., and L. Zhang (2004). Equilibrium Stock Return
Dynamics under Alternative Rules of Learning about Hidden States. Journal of
Economic Dynamics and Control 28, 1925–1954.
Brennan, M. (1998). The Role of Learning in Dynamic Portfolio Decisions. European Finance Review 1, 295–396.
Brennan, M., and Y. Xia (2001). Stock Price Volatility and the Equity Premium.
Journal of Monetary Economics 47, 249–283.
Calvet, L. E. and Czellar, V. (2011). State-Observation Sampling. manuscript.
Calvet, L. E., and A. J. Fisher (2001). Forecasting Multifractal Volatility. Journal of Econometrics 105, 27–58.
Calvet, L. E., and A. J. Fisher (2004). How to Forecast Long-Run Volatility:
Regime-Switching and the Estimation of Multifractal Processes. Journal of Financial Econometrics 2, 49–83.
Calvet, L. E., and A. J. Fisher (2007). Multifrequency News and Stock Returns.
Journal of Financial Economics 86, 178–212.
Calvet, L. E., and A. J. Fisher (2008). Multifractal Volatility: Theory, Forecasting and Pricing. Elsevier – Academic Press.
Calvet, L. E., Fisher, A. J., and S. Thompson (2006). Volatility Comovement:
A Multifrequency Approach. Journal of Econometrics 131, 179–215.
Calzolari G., Fiorentini, G. and Sentana, E. (2004). Constrained Indirect
Estimation. Review of Economic Studies 71, 945–973.
Cecchetti, S. G., Lam, P. and Mark, N. C. (2000). Asset Pricing with Distorted Beliefs: Are Equity Returns Too Good to Be True? American Economic
Review 90(4), 787–805.
Chigansky, P. (2006). An Ergodic Theorem for Filtering with Applications to
Stability. Systems and Control Letters 55(11), 908–917.
Christoffersen, P. (2009). Backtesting. In: Encyclopedia of Quantitative Finance, R. Cont (ed.), John Wiley and Sons.
Czellar, V., Karolyi, G. A. and Ronchetti, E. (2007). Indirect Robust Estimation of the Short-Term Interest Process. Journal of Empirical Finance 14,
546–563.
Czellar, V. and Ronchetti E. (2010). Accurate and Robust Tests for Indirect
Inference. Biometrika 97(3), 621–630.
David, A. (1997). Fluctuating Confidence in Stock Markets: Implications for Returns and Volatility. Journal of Financial and Quantitative Analysis 32, 427–462.
David, A., and P. Veronesi (2006). Inflation and Earnings Uncertainty and
Volatility Forecasts. Working paper, University of Calgary and University of
Chicago.
Dridi, R., Guay, A. and Renault, E. (2007). Indirect Inference and Calibration
of Dynamic Stochastic General Equilibrium Models. Journal of Econometrics 136,
397–430.
Fernández-Villaverde, J., and J. Rubio-Ramirez (2007). Estimating
Macroeconomic Models: A Likelihood Approach. Review of Economic Studies 74,
1059–1087.
Genton, M. G., and E. Ronchetti (2003). Robust Indirect Inference. Journal
of the American Statistical Association 98, 67–76.
Gordon, N., Salmond, D., and A. F. Smith (1993). Novel Approach to
Nonlinear/Non-Gaussian Bayesian State Estimation. IEE Proceedings F 140, 107–
113.
Gouriéroux, C. and Monfort, A. (1996). Simulation-Based Econometric Methods. Oxford University Press.
Gouriéroux, C., Monfort, A. and Renault, E. (1993). Indirect Inference.
Journal of Applied Econometrics 8, S85–S118.
Gouriéroux, C., Phillips, P.C.B. and Yu, J. (2010). Indirect inference for
dynamic panel models. Journal of Econometrics 157, 68–77.
Guidolin, M. and Timmermann, A. (2003). Option Prices under Bayesian Learning: Implied Volatility Dynamics and Predictive Densities. Journal of Economic
Dynamics and Control 27, 717–769.
Guvenen, F. and Smith, A. (2010). Inferring Labor Income Risk from Economic
Choices: An Indirect Inference Approach. Econometrica, revise and resubmit.
Hamilton, J. (1989). A New Approach to the Economic Analysis of Nonstationary
Time Series and the Business Cycle. Econometrica 57, 357–384.
Hansen, L. P. (2007). Beliefs, Doubts and Learning: Valuing Macroeconomic Risk.
American Economic Review 97(2), 1–30.
Heggland, K. and Frigessi, A. (2004). Estimating Functions in Indirect Inference. Journal of the Royal Statistical Society Series B, 66, 447–462.
Kaijser, T. (1975). A Limit Theorem for Partially Observed Markov Chains. Annals of Probability 3(4), 677–696.
Lettau, M., Ludvigson, S. and Wachter J. (2008). The Declining Equity Premium: What Role Does Macroeconomic Risk Play? Review of Financial Studies
21, 1653–1687.
Moore, B. and Schaller, H. (1996). Learning, Regime Switches, and Equilibrium Asset Pricing Dynamics. Journal of Economic Dynamics and Control 20(6-7), 979–1006.
Newey, W., and K. West (1987). A Simple, Positive Definite Heteroscedasticity
and Autocorrelation Consistent Covariance Matrix. Econometrica 55, 703–708.
Pástor, L., and Veronesi, P. (2009a). Learning in Financial Markets. Annual
Review of Financial Economics 1, 361 – 381.
Pástor, L., and Veronesi, P. (2009b). Technological Revolutions and Stock
Prices. American Economic Review 99, 1451–1483.
Schorfheide, F. (2011). Estimation and Evaluation of DSGE Models: Progress
and Challenges. NBER Working Paper 16781.
Sentana, E., Calzolari, G. and Fiorentini, G. (2008). Indirect Estimation of Large Conditionally Heteroskedastic Factor Models, with an Application to the Dow 30 Stocks. Journal of Econometrics 146, 10–25.
Smith, A. A. (1990). Three Essays on the Solution and Estimation of Dynamic Macroeconomic Models. PhD dissertation, Duke University.
Smith, A. A. (1993). Estimating Nonlinear Time Series Models Using Simulated
Vector Autoregressions. Journal of Applied Econometrics 8, S63–S84.
Storvik, G. (2002). Particle Filters in State Space Models with the Presence of
Unknown Static Parameters. IEEE Transactions on Signal Processing 50, 281–
289.
Timmermann, A. G. (1993). How Learning in Financial Markets Generates Excess Volatility and Predictability in Stock Prices. Quarterly Journal of Economics 108, 1135–1145.
Timmermann, A. G. (1996). Excess Volatility and Predictability of Stock Prices in
Autoregressive Dividend Models with Learning. Review of Economic Studies 63,
523–557.
Van Handel, R. (2009). Uniform Time Average Consistency of Monte Carlo Particle Filters. Stochastic Processes and Their Applications 119, 3835–3861.
Van Nieuwerburgh, S. and Veldkamp, L. (2006). Learning Asymmetries in
Real Business Cycles. Journal of Monetary Economics 53, 753–772.
Veronesi, P. (1999). Stock Market Overreaction to Bad News in Good Times: A
Rational Expectations Equilibrium Model. Review of Financial Studies 12, 975–
1007.
Veronesi, P. (2000). How Does Information Quality Affect Stock Returns? Journal
of Finance 55, 807–837.
Vuong, Q. (1989). Likelihood Ratio Tests for Model Selection and Non-Nested
Hypotheses. Econometrica 57, 307–333.