
Efficient Perturbation Methods for Solving
Switching DSGE Models
Junior Maih∗
Norges Bank
[email protected]
December 15, 2014
Abstract
In this paper we present a framework for solving switching nonlinear DSGE
models. In our approach, the probability of switching can be endogenous and
agents may react to anticipated events. The solution algorithms derived are
suitable for solving large systems. They use a perturbation strategy which,
unlike Foerster et al. (2014), does not rely on the partitioning of the switching
parameters and is therefore more efficient. The algorithms presented are all
implemented in RISE, an object-oriented toolbox that is flexible and can easily
integrate alternative solution methods. We apply our algorithms to, and are able to replicate, various examples found in the literature. Among those examples is a
switching RBC model for which we present a third-order perturbation solution.
JEL classification: C6, E3, G1.
∗ First version: August 1, 2011. The views expressed herein are mine and do not necessarily represent those of Norges Bank. Some of the ideas in this paper have been presented under the title "Rationality in Switching Environments". I would like to thank Dan Waggoner, Tao Zha, Hilde Bjørnland, and seminar participants at the Atlanta Fed (Sept. 2011), the IMF, Norges Bank, the JRC, the National Bank of Belgium, Dynare conferences (Zurich, Shanghai), Venastul, the Board of Governors of the Federal Reserve System, and the 87th WEAI conference in San Francisco.
1 Introduction
In an environment where economic structures break, variances change, distributions shift, conventional policies weaken and past events (e.g. crises) tend to recur, the Markov switching DSGE (MSDSGE) model appears as the natural framework for analyzing the dynamics of macroeconomic variables.[1][2] MSDSGE models are also important because many policy questions of interest seem best addressed in a framework of changing parameters.[3]
Not surprisingly then, besides the ever-growing empirical literature using MSDSGE models, many efforts have been invested in solving those models. In that respect, the literature has considered three main angles of attack. One strand of the literature approximates the solution of those models using "global" methods. Examples include Davig et al. (2011), Bi and Traum (2013) and Richter et al. (2014). Just as in constant-parameter models, global approximation methods in switching-parameter models face the curse of dimensionality and rely on a pre-specified set of grid points, typically constructed around one steady state even though the model may have many. The curse of dimensionality in particular implies that the number of state variables has to be as small as possible, and even solving small models involves substantial computational costs.
A second group of techniques applies Markov switching to the parameters of
linear or linearized DSGE models. Papers in this class include Farmer et al. (2011),
Cho (2014), Svensson and Williams (2007) to name a few. One advantage of this
approach is that one can handle substantially larger problems than the ones solved
using global methods. Insofar as the starting point is a linear model, all one has to
[1] Changes in shock variances have been documented by Stock and Watson (2003), Sims and Zha (2006), and Justiniano and Primiceri (2008), while breaks in structural parameters have been advocated by Bernanke et al. (1999), Lubik and Schorfheide (2004), and Davig and Leeper (2007). Other papers have documented changes in both variances and structural parameters; examples include Smets and Wouters (2007), Svensson and Williams (2007), Svensson and Williams (2009) and Cogley et al. (2012).
[2] In such a context, constant-parameter models could be misleading in their predictions and their implications for policymaking.
[3] Some of those questions are:
• What actions should we undertake today given the non-zero likelihood of a bad state occurring in the future?
• What can we expect of the dynamics of the macro-variables we care about when policy is constrained?
• How is the economy stabilized when policy is constrained?
worry about is how to compute and characterize the solution of the model. If the
original model is nonlinear, however, a first-order approximation may not be sufficient
to approximate the nonlinear dynamics implied by the true policy functions of the
model. The nonlinearity of the original model also implies that one cannot assume
away the switches in the parameters for the sake of linearization and then re-apply
them to the model parameters once the linearized form is obtained. Consistently
linearizing the model while taking into account the switching parameters calls for a
different strategy.
Finally, the third group of techniques attempts to circumvent the problems posed by the first two groups. More specifically, this literature embeds switching in perturbation solutions whose accuracy can be improved with higher and higher orders.[4] This is the approach followed by Foerster et al. (2014) and in this paper.
For many years, we have been developing, in the RISE toolbox, algorithms for solving and estimating models with switching parameters, including DSGEs, VARs, SVARs, and optimal policy models under commitment, discretion and loose commitment.[5] In the present paper, however, we focus on describing the theory behind the routines implementing the switching-parameter DSGE in RISE.[6]
Our approach is more general than the ones discussed above. In contrast to Foerster et al. (2014) and the papers cited above, our derivation of higher-order perturbations allows for endogenous transition probabilities. We also allow for anticipated events, following Maih (2010) and Juillard and Maih (2010). This feature is useful as it offers an alternative to the news shocks
[4] Model accuracy is often measured by the Euler approximation errors. It is important to note that high accuracy is not synonymous with the computed policy functions being close to the true ones.
[5] RISE is a MATLAB-based object-oriented toolbox. The flexibility and modularity of the toolbox are such that any user with little exposure to RISE is able to apply his own solution method without having to change anything in the core codes. The toolbox is available, free of charge, at https://github.com/jmaih/RISE_toolbox.
[6] We plan to describe algorithms for other modules of RISE in subsequent papers.
strategy that has often been used to analyze the effects of forward guidance.[7][8] The introduction of an arbitrary number of anticipated shocks does come at a cost, though: the number of cross terms to keep track of increases rapidly with the order of approximation, and computing those terms beyond the first order becomes intractable. This problem is circumvented by exploiting a simple trick by Levintal (2014), who shows a way of computing higher-order approximations by keeping all state variables in a single block.[9]
But before computing higher-order perturbations, two problems have to be addressed: the choice of the approximation point and the solving of a quadratic matrix equation. With respect to the choice of the approximation point, Foerster et al. (2014) propose a "partition perturbation" strategy in which the switching parameters are separated into two groups: one group comprises the switching parameters that affect the (unique) steady state, and the other collects the switching parameters that do not affect the steady state. Here also, the approach implemented in RISE is more general, more flexible and simpler: it does not require any partitioning of the switching parameters and is therefore more efficient. Moreover, it allows for the possibility of multiple steady states and delivers the results of
[7] There are substantial differences between what this paper calls "anticipated events" or "anticipated shocks" and "news shocks". First, solving the model with anticipated shocks does not require a modification of the equations of the original system, in contrast to news shocks, which are typically implemented by augmenting the law of motion of a shock process with additional shocks. Secondly, in the anticipated-shocks approach, future events are discounted, while in the news-shocks approach the impact of a news shock does not depend on the horizon at which the news occurs. Thirdly, an anticipated shock is genuinely a particular structural shock in the original system, while a news shock is a different iid shock with no interpretation other than "a news" and unrelated to any structural shock in the system; because it is unrelated, it has its own distribution, independent of other parts of the system. Fourthly, the estimation of models with news shocks requires additional variables to be declared as observables and to enter the measurement equation. The estimation procedure tries to fit the future information in the same way it fits the other observable variables. This feature makes Bayesian model comparison infeasible, since the comparison of two models requires that they have the same observable variables. In contrast, in the estimation of models with anticipated shocks, the anticipated information, which may not materialize, is separated from the actual data. Model comparison remains possible since the anticipated information never enters the measurement equations. Finally, in the anticipated-shocks approach the policy functions are explicitly expressed in terms of leads of future shocks, as opposed to lags in the news-shocks approach.
[8] In particular, it makes it possible to analyze the effects of hard conditions as well as soft conditions on future information. The discounting of future events also depends on the uncertainty around those events.
[9] Before the new implementation, the higher-order perturbation routines of RISE separated the state variables into three blocks: endogenous variables, the perturbation parameter, and exogenous shocks. Cross products of those blocks had to be computed and stored separately.
the "partition perturbation" of Foerster et al. (2014) as a special case.
When it comes to solving the system of quadratic matrix equations implied by the first-order perturbation, Foerster et al. (2014)'s proposal is to use the theory of Gröbner bases (see Buchberger (1965), Buchberger (2006)) to find all the solutions of the system and then apply the engineering concept of mean square stability to each of the solutions, as a way to check whether the Markov switching system admits a single stable solution. While we recognize the benefits of a procedure that can generate all possible solutions of a nonlinear system, and even plan to integrate a similar procedure in RISE, we argue that such an approach may not be practical or suitable for systems of the size that we have become accustomed to in constant-parameter DSGE models.[10] Furthermore, the approach has two major limitations: (1) stability of the first-order approximation does not imply stability of higher-order approximations, even in constant-parameter DSGE models; (2) there is no stability concept for switching models with endogenous transition probabilities.
Because we are ultimately interested in estimating those models in order to make
them really useful for policymaking, we take a more modest route: we derive efficient
functional iterations and Newton algorithms that are suitable for solving relatively
large systems. As an example, we have successfully solved a second-order perturbation of a model of 272 equations using our algorithms. We further demonstrate
the efficiency and usefulness of our solution methods by applying them to various
examples found in the literature. Our algorithms easily replicate the results found
by other authors. Among the examples we consider is a switching model by Foerster
et al. (2014) for which we present a third-order perturbation solution.
The rest of the paper proceeds as follows. Section 2 introduces the notation
we use alongside the generic switching model that we aim to solve. Then section
3 derives the higher-order approximations of the model. At the zero-th order, we
present various flexible strategies for choosing the approximation point. Section 4
takes on the solving of the quadratic matrix polynomial arising in the first-order
approximation. Section 5 evaluates the performance of the proposed algorithms and
section 6 concludes.
[10] Gröbner bases have problems of their own, and we will return to some of them later in the text.
2 The Markov Switching Rational Expectations Model

2.1 The economic environment
We are interested in characterizing an environment in which parameters switch in a model that is potentially nonlinear even in the absence of switching parameters. In that environment, we would like to allow the transitions controlling parameter switches to be endogenous, and not just exogenous as customarily assumed in the literature. Finally, recognizing that at the time of making decisions agents may have information about future events, it is desirable that in our framework future events, which may or may not materialize in the end, influence the current behavior of private agents; forward guidance on the behavior of policy is an example of such a possibility.
2.2 Dating and notation conventions
The dating convention used in this paper differs from the widely used convention of Schmitt-Grohe and Uribe (2004), in which the dating of a variable refers to the beginning of the period. Instead, we use the convention, also widely used, in which the dating of a variable refers to the end of the period. In that way, the dating determines the period in which the variable of interest is known. This is the type of notation used, for instance, in Adjemian et al. (2011).
Some solution methods for constant-parameter DSGE models (e.g. Klein (2000), Gensys, Schmitt-Grohe and Uribe (2004)) stack variables of different periods. This type of notation has also been used in switching-parameter DSGE models by Farmer et al. (2011) and Foerster et al. (2013). This notation is not appropriate for the type of problems this paper aims to solve. In addition to forcing the creation of auxiliary variables, and thereby increasing the size of the system, it also makes it cumbersome to compute expectations of variables in our context, since future variables could belong to a regime that is different from the current one. Clearly, doing so may restrict the class of problems one can solve or give the wrong answer in some types of problems.[11] The stacking of different time periods is therefore not appealing for our purposes.
[11] In the Farmer et al. (2011) procedure, for instance, the Newton algorithm is constructed on the assumption that the coefficient matrix on forward-looking variables depends only on the current regime.
2.3 The generic model
Many solution approaches, like Farmer et al. (2011) or Cho (2014), start out with a
linear model and then apply a Markov switching to the parameters. This strategy is
reasonable as long as one takes a linear specification as the structural model. When
the underlying structural model is nonlinear, however, the agents are aware of the
nonlinear nature of the system and the switching process. This has implications for
the solutions based on approximation and for the decision rules. In this setting, an
important result by Foerster et al. (2013) is that the first-order perturbation may
be non-certainty equivalent. Hence starting out with a linear specification may miss
this important point.
The problem to solve is

$$E_t \sum_{r_{t+1}=1}^{h} \pi_{r_t,r_{t+1}}(I_t)\, \tilde{d}_{r_t}(v) = 0$$

where $E_t$ is the expectation operator, $\tilde{d}_{r_t}: \mathbb{R}^{n_v} \longrightarrow \mathbb{R}^{n_d}$ is an $n_d \times 1$ vector of possibly nonlinear functions of its argument $v$ (defined below), $r_t = 1, 2, \ldots, h$ is the regime at time $t$, and $\pi_{r_t,r_{t+1}}(I_t)$ is the transition probability for going from regime $r_t$ in the current period to regime $r_{t+1}$ in the next period. This probability is potentially endogenous in the sense that it is a function of $I_t$, the information set at time $t$. The only restriction imposed on the endogenous switching probabilities is that the parameters affecting them do not switch over time and that the variables entering those probabilities have a unique steady state.[12]
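To fix ideas, here is a minimal sketch of what an endogenous transition matrix might look like. The two-regime setting, the logistic functional form and the parameter values are purely illustrative assumptions, not part of the framework above; the only property enforced is the restriction just stated, namely that the parameters `a` and `b` do not themselves switch.

```python
import math

def transition_matrix(x, a=(2.0, -1.5), b=(3.0, 3.0)):
    """Hypothetical endogenous 2-regime transition matrix pi(I_t).
    The probability of leaving regime r is a logistic function of a
    state variable x; a and b are non-switching parameters, as required
    by the restriction in the text."""
    pi = [[0.0, 0.0], [0.0, 0.0]]
    for r in range(2):
        leave = 1.0 / (1.0 + math.exp(-(a[r] + b[r] * x)))  # P(switch out of r)
        pi[r][1 - r] = leave
        pi[r][r] = 1.0 - leave
    return pi
```

Each row sums to one by construction, so the object behaves like a valid transition matrix whatever the value of the state variable.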
The $n_v \times 1$ vector $v$ is defined as

$$v \equiv \begin{bmatrix} b_{t+1}(r_{t+1})' & f_{t+1}(r_{t+1})' & s_t(r_t)' & p_t(r_t)' & b_t(r_t)' & f_t(r_t)' & p_{t-1}' & b_{t-1}' & \varepsilon_t' & \theta_{r_{t+1}}' \end{bmatrix}' \quad (1)$$

where $s_t$ is an $n_s \times 1$ vector of static variables,[13] $f_t$ is an $n_f \times 1$ vector of forward-looking variables,[14] $p_t$ is an $n_p \times 1$ vector of predetermined variables,[15] $b_t$ is an $n_b \times 1$ vector of variables that are both predetermined and forward-looking, $\varepsilon_t$ is an $n_\varepsilon \times 1$ vector of shocks with $\varepsilon_t \sim N(0, I_{n_\varepsilon})$, and $\theta_{r_{t+1}}$ is an $n_\theta \times 1$ vector of switching parameters (appearing with a lead in the model).
Note that we do not declare the parameters of the current regime $r_t$. They are implicitly attached to $\tilde{d}_{r_t}$. Also note that we could get rid of the parameters of future
[12] RISE automatically checks for these requirements.
[13] Those are the variables appearing in the model only at time $t$.
[14] Those are the variables appearing in the model both at time $t$ and in future periods $t+1$, $t+2$, etc.
[15] Those are the variables appearing in the model at time $t$ and at times $t-1$, $t-2$, etc.
regimes ($\theta_{r_{t+1}}$) by declaring auxiliary variables, which would then be forward-looking variables.
If we define the $n_d \times 1$ vector $d_{r_t,r_{t+1}}$ as $d_{r_t,r_{t+1}} \equiv \pi_{r_t,r_{t+1}}(I_t)\, \tilde{d}_{r_t}$, the objective becomes

$$E_t \sum_{r_{t+1}=1}^{h} d_{r_t,r_{t+1}}(v) = 0 \quad (2)$$
We assume that agents have information about all or some of the shocks $k \geq 0$ periods ahead into the future. And so, including a perturbation parameter $\sigma$, we define an $n_z \times 1$ vector of state variables as

$$z_t \equiv \begin{bmatrix} p_{t-1}' & b_{t-1}' & \sigma & \varepsilon_t' & \varepsilon_{t+1}' & \cdots & \varepsilon_{t+k}' \end{bmatrix}'$$

where $n_z = n_p + n_b + (k+1) n_\varepsilon + 1$.
Denoting by $y_t(r_t)$ the $n_y \times 1$ vector of all the endogenous variables, where $n_y = n_s + n_p + n_b + n_f$, we are interested in solutions of the form

$$y_t(r_t) \equiv \begin{bmatrix} s_t(r_t) \\ p_t(r_t) \\ b_t(r_t) \\ f_t(r_t) \end{bmatrix} = T^{r_t}(z_t) \equiv \begin{bmatrix} S^{r_t}(z_t) \\ P^{r_t}(z_t) \\ B^{r_t}(z_t) \\ F^{r_t}(z_t) \end{bmatrix} \quad (3)$$
In general, there is no analytical solution to (2), even in cases where $\tilde{d}_{r_t}$ or $d_{r_t,r_{t+1}}$ is linear. In this paper we rely on a perturbation that will allow us to approximate the decision rules in (3).[16] We can then solve for these approximated decision rules by inserting their functional forms into (2) and its derivatives. This paper develops methods for doing that.
For the subsequent derivations, it is useful to define, for all $g \in \{s, p, b, f\}$, an $n_g \times n_y$ matrix $\lambda_g$ that selects the solution of $g$-type variables in $T$ or $y$. We also define $\lambda_x \equiv \begin{bmatrix} \lambda_p \\ \lambda_b \end{bmatrix}$ and $\lambda_{bf} \equiv \begin{bmatrix} \lambda_b \\ \lambda_f \end{bmatrix}$ as the selectors for $p$-$b$ and $b$-$f$ variables respectively. In the same way, we define, for all $g \in \{p_{t-1}, b_{t-1}, \sigma, \varepsilon_t, \varepsilon_{t+1}, \ldots, \varepsilon_{t+k}\}$, a matrix $m_g$ of size $n_g \times n_z$ that selects the $g$-type variables in the state vector $z_t$.
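The selector matrices are simply rows of identity matrices. A minimal numpy sketch, with made-up illustrative dimensions:

```python
import numpy as np

def selector(rows, total):
    """Selector matrix that picks the given rows out of a vector of length `total`."""
    return np.eye(total)[list(rows), :]

# illustrative dimensions: n_s = 2, n_p = 1, n_b = 1, n_f = 2, k = 1, n_eps = 2
n_s, n_p, n_b, n_f = 2, 1, 1, 2
n_y = n_s + n_p + n_b + n_f
# y is ordered [s; p; b; f] as in (3)
lam_s = selector(range(0, n_s), n_y)
lam_p = selector(range(n_s, n_s + n_p), n_y)
lam_b = selector(range(n_s + n_p, n_s + n_p + n_b), n_y)
lam_f = selector(range(n_s + n_p + n_b, n_y), n_y)
lam_x = np.vstack([lam_p, lam_b])    # p-b selector
lam_bf = np.vstack([lam_b, lam_f])   # b-f selector

k, n_eps = 1, 2
n_z = n_p + n_b + (k + 1) * n_eps + 1
# z is ordered [p; b; sigma; eps_t; eps_{t+1}]
m_sigma = selector([n_p + n_b], n_z)
m_eps0 = selector(range(n_p + n_b + 1, n_p + n_b + 1 + n_eps), n_z)
```

Multiplying by a selector simply extracts the corresponding sub-vector, e.g. `lam_x @ y` returns the stacked $p$ and $b$ entries of $y$.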
[16] Alternatively, one could approximate (2) directly, as is done for instance in Dynare.
3 Approximations
Since the solution is in terms of the vector of state variables $z_t$, we proceed to express all the variables in the system as functions of $z_t$.[17] Since both $b_{t+1}(r_{t+1})$ and $f_{t+1}(r_{t+1})$ appear in the system (1), and given the solution (3), we need to express $z_{t+1}$ as a function of $z_t$. This is given by

$$z_{t+1} = h^{r_t}(z_t) + u z_t \quad (4)$$

where

$$h^{r_t}(z_t) \equiv \begin{bmatrix} (\lambda_x T^{r_t}(z_t))' & (m_\sigma z_t)' & (m_{\varepsilon,1} z_t)' & \cdots & (m_{\varepsilon,k} z_t)' & (0_{n_\varepsilon \times 1})' \end{bmatrix}'$$

and $u$ is an $n_z \times n_z$ random matrix defined by

$$u \equiv \begin{bmatrix} 0_{(n_p + n_b + 1 + k n_\varepsilon) \times n_z} \\ \varepsilon_{t+k+1} m_\sigma \end{bmatrix}$$
Following Foerster et al. (2014), we postulate a perturbation solution for $\theta_{r_{t+1}}$ as[18]

$$\theta_{r_{t+1}} = \bar{\theta}_{r_t} + \sigma \hat{\theta}_{r_{t+1}} \quad (5)$$

In this respect, this paper differs from Foerster et al. (2013) and Foerster et al. (2014) in two important ways. First, $\bar{\theta}_{r_t}$ need not be the ergodic mean of the parameters, as will be discussed below. Secondly, conditional on being in regime $r_t$, perturbation is never done with respect to the $\theta_{r_t}$ parameters. Perturbation is done only with respect to the parameters of the future regimes ($\theta_{r_{t+1}}$) that appear in the system for the current regime ($r_t$).
Given the solution, we can now express the vector $v$ in terms of the state variables:

$$v = \begin{bmatrix} \lambda_{bf} T^{r_{t+1}}(h^{r_t}(z_t) + u z_t) \\ T^{r_t}(z_t) \\ m_p z_t \\ m_b z_t \\ m_{\varepsilon,0} z_t \\ \bar{\theta}_{r_t} + \hat{\theta}_{r_{t+1}} m_\sigma z_t \end{bmatrix} \quad (6)$$
[17] Here we differentiate the policy functions. One could, alternatively, differentiate the model conditions, as done for instance in Dynare and Dynare++. This would involve taking the Taylor expansion of the original objective (2).
[18] This assumption is not necessary. We could, as argued above, simply declare new auxiliary variables and get rid of the parameters of the future regimes.
and the objective function (2) becomes

$$E_t \sum_{r_{t+1}=1}^{h} d_{r_t,r_{t+1}}(v(z_t, \varepsilon_{t+k+1})) = 0 \quad (7)$$

Having expressed the problem to solve in terms of the state variables in a consolidated vector $z_t$, as in Levintal (2014), we stand ready to take successive Taylor approximations of (7) to find the perturbation solutions.
3.1 Zero-th order
The first step of the perturbation technique is the choice of the approximation point. In a constant-parameter world, we typically approximate the system around the steady state, the resting point to which the system converges in the absence of future shocks. In a switching environment the choice is no longer so obvious.

Approximation around the ergodic mean. Foerster et al. (2014) propose to take a perturbation of the system around its ergodic mean. This ergodic mean can be found by solving

$$\tilde{d}_{r_t}\left(b_t, f_t, s_t, p_t, b_t, f_t, p_t, b_t, 0, \bar{\theta}\right) = 0$$

where $\bar{\theta}$ is the ergodic mean of the parameters.[19] The ergodic mean, however, need not be an attractor or resting point, i.e. a point towards which the system converges in the absence of further shocks. We propose two further possibilities.
Regime-specific steady states.[20] The first possibility is to approximate the system around regime-specific means. The system may not be stable at the mean of a given regime, but we assume at least that if the system happens to be exactly at one of its regime-specific means, it will stay there in the absence of any further shocks. We compute those means by solving

$$\tilde{d}_{r_t}\left(b_t(r_t), f_t(r_t), s_t(r_t), p_t(r_t), b_t(r_t), f_t(r_t), p_t(r_t), b_t(r_t), 0, \theta_{r_t}\right) = 0$$
The intuition behind this strategy is twofold. On the one hand, it is not too difficult to imagine that the relevant issue for rational agents living in a particular state of the system at some point in time is to insure against the possibility of switching to a different state, not to the ergodic mean. On the other hand, from a practical point of view, the point to which the system is to return matters for forecasting. Many inflation-targeting countries have moved from a regime of high inflation to a regime of lower inflation. Approximating the system around the ergodic mean in this case implies that the unconditional forecasts will be pulled towards a level that is consistently higher than the recent history of inflation, which is likely to yield substantial forecast errors. All this reinforces the idea that the ergodic mean is not necessarily an attractor.

[19] In the RISE toolbox, this is triggered by the option "unique".
[20] This approach is the default behavior in the RISE toolbox.
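As an illustration of the regime-specific strategy, the steady states can be computed by running a nonlinear solver regime by regime. The one-equation switching model below, $x = \rho_r x + c_r$, and its parameter values are made up purely for the example; `d_tilde` plays the role of the residual function above.

```python
import numpy as np
from scipy.optimize import fsolve

# hypothetical switching parameters: theta[r] = (rho_r, c_r)
theta = {0: (0.9, 0.2), 1: (0.5, 1.5)}

def d_tilde(x, r):
    """Residual of the (static) equilibrium condition in regime r:
    x - rho_r * x - c_r = 0, whose root is c_r / (1 - rho_r)."""
    rho, c = theta[r]
    return x - rho * x - c

# solve the same system once per regime, holding theta at theta[r]
steady_states = {r: fsolve(d_tilde, x0=np.array([0.0]), args=(r,))[0]
                 for r in theta}
```

The same loop structure carries over to a full model: the residual function is evaluated with all time indices set to the candidate point and the parameters held at their regime-$r$ values.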
Approximation around an arbitrary point. The second possibility is to impose an arbitrary approximation point. If the approximation point is chosen arbitrarily, obviously, neither of the two equations above will hold, and a correction will be needed in the dynamic system, with consequences for the solution as well. This approach may be particularly useful in certain applications, e.g. a situation in which one of the regimes persistently deviates from the reference steady state for an extended period of time.[21] The approach also bears some similarity with the constant-parameter case, where the approximation is sometimes taken around the risky or stochastic steady state.
Suppose we want to take an approximation around an arbitrary point $\left[\breve{s}, \breve{p}, \breve{b}, \breve{f}\right]$. The strategy we suggest is to evaluate that point in each regime. More specifically, we will have

$$\breve{d}_{r_t} \equiv \tilde{d}_{r_t}\left(\breve{b}, \breve{f}, \breve{s}, \breve{p}, \breve{b}, \breve{f}, \breve{p}, \breve{b}, 0, \theta_{r_t}, \theta_{r_t}\right)$$

This discrepancy is then forwarded to the first-order approximation when solving for the first-order coefficients.
Interestingly, both the regime-specific approach and the ergodic approach are special cases. In the former, because $\left[\breve{s}, \breve{p}, \breve{b}, \breve{f}\right] = \left[s_t(r_t), p_t(r_t), b_t(r_t), f_t(r_t)\right]$, the discrepancy $\breve{d}_{r_t}$ is zero. In the latter case, $\left[\breve{s}, \breve{p}, \breve{b}, \breve{f}\right] = \left[s^{ergodic}, p^{ergodic}, b^{ergodic}, f^{ergodic}\right]$ and the discrepancy need not be zero.
The approach suggested here is computationally more efficient than that of Foerster et al. (2014) and does not require any partitioning of the switching parameters between those that affect the steady state and those that do not. It will be shown later, in an application, that we recover their results.
[21] In the RISE toolbox, this is triggered by the option "imposed". When the steady state is not imposed, RISE uses the values provided as an initial guess in the solving of the approximation point.
3.2 First-order perturbation
Taking higher-order approximations of this system is tedious, especially if one attempts to compute all cross/partial derivatives in isolation. This is even worse if one allows for anticipated events. A less cumbersome approach exploits the strategy of Levintal (2014), in which the differentiation is done with respect to the entire vector $z$ directly, rather than element by element or block by block.[22] At first order, then, we seek to approximate $T^{r_t}$ in (3) with a solution of the form

$$T^{r_t}(z) \simeq T^{r_t}(\bar{z}_{r_t}) + T_z^{r_t}(z_t - \bar{z}_{r_t}) \quad (8)$$

Finding this solution requires differentiating (7) with respect to $z_t$ and applying the correction due to the choice of the approximation point, which is equivalent to taking a first-order Taylor expansion of (7). Using tensor notation, we have

$$\breve{d}_{r_t} + E_t \sum_{r_{t+1}=1}^{h} \left[ d_v^{r_t,r_{t+1}} \right]^{i}_{\alpha} \left[ v_z \right]^{\alpha}_{j} = 0 \quad (9)$$

where $[d_v^{r_t,r_{t+1}}]^{i}_{\alpha}$ denotes the derivative of the $i$th row of $d$ with respect to the $\alpha$th row of $v$ and, similarly, $[v_z]^{\alpha}_{j}$ denotes the derivative of the $\alpha$th row of $v$ with respect to the $j$th row of $z$.
Defining

$$A_0^{r_t,r_{t+1}} \equiv \begin{bmatrix} d_{s_0}^{r_t,r_{t+1}} & d_{p_0}^{r_t,r_{t+1}} & d_{b_0}^{r_t,r_{t+1}} & d_{f_0}^{r_t,r_{t+1}} \end{bmatrix}$$

we have

$$d_v = \begin{bmatrix} d_{b_+}^{r_t,r_{t+1}} & d_{f_+}^{r_t,r_{t+1}} & A_0^{r_t,r_{t+1}} & d_{p_-}^{r_t,r_{t+1}} & d_{b_-}^{r_t,r_{t+1}} & d_{\varepsilon_0}^{r_t,r_{t+1}} & d_{\theta_+}^{r_t,r_{t+1}} \end{bmatrix}$$

The equality in (9) follows from the fact that all the derivatives of (7) must be equal to 0. The derivatives of $v$ with respect to $z$ are given by

$$v_z = a_{0z} + a_{1z} u \quad (10)$$

where the definitions of $a_{0z}$ and $a_{1z}$ are given in appendix (A.1).

The derivative of $h$ with respect to $z$, needed in the calculation of $v_z$ as can be seen in (6), is given by

$$h_z^{r_t} = \begin{bmatrix} (\lambda_x T_z^{r_t})' & m_\sigma' & m_{\varepsilon,1}' & \cdots & m_{\varepsilon,k}' & 0_{n_z \times n_\varepsilon} \end{bmatrix}'$$

[22] This strategy is particularly useful when it comes to computing higher-order cross derivatives. By not separating the state variables, we always have one block of cross products, no matter the order of approximation, instead of an exponentially increasing number of cross blocks.
Unfolding the tensors, the problem reduces to

$$\breve{d}_{r_t} + \sum_{r_{t+1}=1}^{h} d_v^{r_t,r_{t+1}} E_t v_z = 0$$

Let us define $d_{g_q}^{r_t,r_{t+1}} \equiv \frac{\partial d_{r_t,r_{t+1}}}{\partial g_q}$ for $g = s, p, f, b$, referring to static, predetermined, forward-looking and "both" variables, and for $q = 0, +, -$, referring to current variables, future variables and past variables respectively. The problem can be expanded into

$$\sum_{r_{t+1}=1}^{h} \left( \begin{bmatrix} d_{b_+}^{r_t,r_{t+1}} & d_{f_+}^{r_t,r_{t+1}} \end{bmatrix} \lambda_{bf} T_z^{r_{t+1}} h_z^{r_t} + A_0^{r_t,r_{t+1}} T_z^{r_t} + \begin{bmatrix} d_{p_-}^{r_t,r_{t+1}} & d_{b_-}^{r_t,r_{t+1}} \end{bmatrix} \begin{bmatrix} m_p \\ m_b \end{bmatrix} + d_{\varepsilon_0}^{r_t,r_{t+1}} m_{\varepsilon,0} + d_{\theta_+}^{r_t,r_{t+1}} \hat{\theta}_{r_{t+1}} m_\sigma \right) = 0$$
Looking at $T_z^{r_t}$ and $h_z^{r_t}$ in detail, we see that they can be partitioned. In particular, with

$$T_{z,x}^{r_t} \equiv \begin{bmatrix} T_{z,p}^{r_t} & T_{z,b}^{r_t} \end{bmatrix}$$

we have

$$T_z^{r_t} = \begin{bmatrix} T_{z,x}^{r_t} & T_{z,\sigma}^{r_t} & T_{z,\varepsilon_0}^{r_t} & T_{z,\varepsilon_1}^{r_t} & \cdots & T_{z,\varepsilon_k}^{r_t} \end{bmatrix}$$

and

$$h_z^{r_t} = \begin{bmatrix}
\lambda_x T_{z,x}^{r_t} & \lambda_x T_{z,\sigma}^{r_t} & \lambda_x T_{z,\varepsilon_0}^{r_t} & \lambda_x T_{z,\varepsilon_1}^{r_t} & \cdots & \lambda_x T_{z,\varepsilon_k}^{r_t} \\
0_{1 \times n_x} & 1 & 0_{1 \times n_\varepsilon} & 0_{1 \times n_\varepsilon} & \cdots & 0_{1 \times n_\varepsilon} \\
0_{n_\varepsilon \times n_x} & 0_{n_\varepsilon \times 1} & 0_{n_\varepsilon} & I_{n_\varepsilon} & \cdots & 0_{n_\varepsilon} \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
0_{n_\varepsilon \times n_x} & 0_{n_\varepsilon \times 1} & 0_{n_\varepsilon} & 0_{n_\varepsilon} & \cdots & I_{n_\varepsilon} \\
0_{n_\varepsilon \times n_x} & 0_{n_\varepsilon \times 1} & 0_{n_\varepsilon} & 0_{n_\varepsilon} & \cdots & 0_{n_\varepsilon}
\end{bmatrix}$$
Hence the solving can be decomposed into smaller problems.
3.2.1 Impact of endogenous state variables
The problem to solve is

$$0 = A_0^{r_t} T_{z,x}^{r_t} + A_-^{r_t} + \sum_{r_{t+1}=1}^{h} A_+^{r_t,r_{t+1}} T_{z,x}^{r_{t+1}} \lambda_x T_{z,x}^{r_t} \quad (11)$$

with

$$A_0^{r_t} \equiv \sum_{r_{t+1}=1}^{h} A_0^{r_t,r_{t+1}}, \qquad A_-^{r_t} \equiv \sum_{r_{t+1}=1}^{h} \begin{bmatrix} d_{p_-}^{r_t,r_{t+1}} & d_{b_-}^{r_t,r_{t+1}} \end{bmatrix}, \qquad A_+^{r_t,r_{t+1}} \equiv \begin{bmatrix} 0_{n_d \times n_s} & 0_{n_d \times n_p} & d_{b_+}^{r_t,r_{t+1}} & d_{f_+}^{r_t,r_{t+1}} \end{bmatrix} \quad (12)$$
Since there are many algorithms for solving (11), we delay the presentation of our solution algorithms until section 4. For the time being, the reader should note the way $A_+^{r_t,r_{t+1}}$ enters (11). This says that our algorithms will be able to handle cases where the coefficient matrix on forward-looking terms is known in the current period ($A_+^{r_t,r_{t+1}} = A_+^{r_t,r_t}$), as in Farmer et al. (2011), but also the more complicated case where $A_+^{r_t,r_{t+1}} \neq A_+^{r_t,r_t}$, as in Cho (2014). This is part of the reason why the notation of Schmitt-Grohe and Uribe (2004), where one can stack variables, is not appropriate in this context. The assumption is very convenient in the Farmer et al. (2011) algorithm, as it allows them to derive their solution algorithm, which would be more difficult otherwise. It is also convenient as it leads to substantial computational savings. But as our derivations show, the assumption is incorrect in problems where $A_+^{r_t,r_{t+1}} \neq A_+^{r_t,r_t}$.
3.2.2 Impact of uncertainty
For the moment, we proceed under the assumption that we have solved for $T_{z,x}^{r_t}$. Now we have to solve

$$\breve{d}_{r_t} + \sum_{r_{t+1}=1}^{h} A_+^{r_t,r_{t+1}} T_{z,\sigma}^{r_{t+1}} + A_\sigma^{r_t} T_{z,\sigma}^{r_t} + \sum_{r_{t+1}=1}^{h} d_{\theta_+}^{r_t,r_{t+1}} \hat{\theta}_{r_t,r_{t+1}} = 0$$

which leads to

$$\begin{bmatrix} T_{z,\sigma}^{1} \\ T_{z,\sigma}^{2} \\ \vdots \\ T_{z,\sigma}^{h} \end{bmatrix} = - \begin{bmatrix} A_\sigma^{1} + A_{1,1}^{+} & A_{1,2}^{+} & \cdots & A_{1,h}^{+} \\ A_{2,1}^{+} & A_\sigma^{2} + A_{2,2}^{+} & \cdots & A_{2,h}^{+} \\ \vdots & \vdots & \ddots & \vdots \\ A_{h,1}^{+} & A_{h,2}^{+} & \cdots & A_\sigma^{h} + A_{h,h}^{+} \end{bmatrix}^{-1} \begin{bmatrix} \breve{d}_1 + \sum_{r_{t+1}=1}^{h} d_{\theta_+}^{1,r_{t+1}} \hat{\theta}_{1,r_{t+1}} \\ \breve{d}_2 + \sum_{r_{t+1}=1}^{h} d_{\theta_+}^{2,r_{t+1}} \hat{\theta}_{2,r_{t+1}} \\ \vdots \\ \breve{d}_h + \sum_{r_{t+1}=1}^{h} d_{\theta_+}^{h,r_{t+1}} \hat{\theta}_{h,r_{t+1}} \end{bmatrix} \quad (13)$$

where

$$A_\sigma^{r_t} \equiv A_0^{r_t} + \sum_{r_{t+1}=1}^{h} \left( d_{b_+}^{r_t,r_{t+1}} \lambda_b + d_{f_+}^{r_t,r_{t+1}} \lambda_f \right) T_{z,x}^{r_{t+1}} \lambda_x$$

It follows from equation (13) that, in our framework, it is the presence of (1) future parameters in the current regime's system and/or (2) an approximation taken around a point that is not the regime-specific steady state that creates non-certainty equivalence.
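Equation (13) is just a stacked linear system across regimes. A minimal numpy sketch of its assembly and solution (the inputs in the test are made-up, well-conditioned stand-ins):

```python
import numpy as np

def solve_sigma_impact(Asig, Ap, rhs):
    """Solve the stacked system (13). The (r, rp) block of the coefficient
    matrix is Ap[r][rp], plus Asig[r] on the diagonal; rhs[r] stands for
    d_breve[r] + sum_rp d_theta[r][rp] @ theta_hat[r][rp]."""
    h, n = len(Asig), Asig[0].shape[0]
    M = np.block([[Asig[r] * (r == rp) + Ap[r][rp] for rp in range(h)]
                  for r in range(h)])
    b = np.concatenate([rhs[r] for r in range(h)])
    x = -np.linalg.solve(M, b)
    # split the stacked solution back into per-regime pieces
    return [x[r * n:(r + 1) * n] for r in range(h)]
```

For large models one would of course avoid forming `M` densely, but the block structure is exactly the one displayed in (13).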
3.2.3 Impact of shocks
Define

$$U_{r_t} \equiv \left( \sum_{r_{t+1}=1}^{h} A_+^{r_t,r_{t+1}} T_{z,x}^{r_{t+1}} \right) \lambda_x + A_0^{r_t} \quad (14)$$

Contemporaneous shocks. We have

$$T_{z,\varepsilon_0}^{r_t} = -U_{r_t}^{-1} \sum_{r_{t+1}=1}^{h} d_{\varepsilon_0}^{r_t,r_{t+1}} \quad (15)$$

Future shocks. For $k = 1, 2, \ldots$ we have the recursive formula

$$T_{z,\varepsilon_k}^{r_t} = -U_{r_t}^{-1} \sum_{r_{t+1}=1}^{h} A_+^{r_t,r_{t+1}} T_{z,\varepsilon_{k-1}}^{r_{t+1}} \quad (16)$$
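The recursion (15)-(16) is cheap: one linear solve per horizon and regime. A minimal numpy sketch, where `D0[r]` stands for the already-summed $\sum_{r_{t+1}} d_{\varepsilon_0}^{r_t,r_{t+1}}$ and `Ap[r][rp]` for conformable (here square, made-up) coefficient matrices:

```python
import numpy as np

def anticipated_shock_impacts(U, Ap, D0, k):
    """Impacts of shocks anticipated j = 0..k periods ahead.
    j = 0 follows (15): T[0][r] = -U[r]^{-1} D0[r];
    j > 0 follows (16): T[j][r] = -U[r]^{-1} sum_rp Ap[r][rp] T[j-1][rp]."""
    h = len(U)
    T = [[-np.linalg.solve(U[r], D0[r]) for r in range(h)]]
    for j in range(1, k + 1):
        T.append([-np.linalg.solve(U[r],
                                   sum(Ap[r][rp] @ T[j - 1][rp] for rp in range(h)))
                  for r in range(h)])
    return T
```

In the scalar test case below the impact shrinks with the anticipation horizon, in line with the discounting of anticipated events mentioned in the introduction.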
3.3 Second-order solution
The second-order perturbation solution of $T^{r_t}$ in (3) takes the form

$$T^{r_t}(z) \simeq T^{r_t}(\bar{z}_{r_t}) + T_z^{r_t}(z_t - \bar{z}_{r_t}) + \tfrac{1}{2} T_{zz}^{r_t}(z_t - \bar{z}_{r_t})^{\otimes 2}$$

Since $T^{r_t}(\bar{z}_{r_t})$ and $T_z^{r_t}$ have been computed in earlier steps, at this stage we only need to solve for $T_{zz}^{r_t}$. To get the second-order solution, we differentiate (9) with respect to $z$ to get

$$E_t \sum_{r_{t+1}=1}^{h} \left( [d_{vv}^{r_t,r_{t+1}}]^{i}_{\alpha\beta} [v_z]^{\beta}_{k} [v_z]^{\alpha}_{j} + [d_v^{r_t,r_{t+1}}]^{i}_{\alpha} [v_{zz}]^{\alpha}_{jk} \right) = 0 \quad (17)$$

so that unfolding the tensors yields

$$\sum_{r_{t+1}=1}^{h} \left( d_{vv}^{r_t,r_{t+1}} E_t v_z^{\otimes 2} + d_v^{r_t,r_{t+1}} E_t v_{zz} \right) = 0 \quad (18)$$

We use the notation $A^{\otimes k}$ as shorthand for $A \otimes A \otimes \cdots \otimes A$ ($k$ times). We get $v_{zz}$ by differentiating $v_z$ with respect to $z$, yielding

$$v_{zz} = a_{0zz} + a_{1zz} u^{\otimes 2} + a_{1zz} \left( u \otimes h_z^{r_t} + h_z^{r_t} \otimes u \right) \quad (19)$$

where the definitions of $a_{0zz}$ and $a_{1zz}$, as well as the expressions for $E v_z^{\otimes 2}$, $E v_{zz}$ and $E u^{\otimes 2}$ needed to solve for $T_{zz}^{r_t}$, are given in appendix (A.2).
With those expressions in hand, expanding the problem to solve in (18) gives
Arzzt
+
h
X
rt
rt+1 rt
A+
rt ,rt+1 Tzz Czz + Urt Tzz = 0
(20)
rt+1 =1
with
Arzzt ≡
h
X
drvvt ,rt+1 Et vz⊗2
rt+1 =1
and
rt
Czz
≡ (hrzt )⊗2 + Eu⊗2
In the case where h = 1, efficient algorithms can be used, e.g. Kamenik (2005). Unfortunately, such algorithms cannot be applied in the switching case. The strategy implemented in RISE instead applies the "transpose-free quasi-minimal residual" (TFQMR) technique of Freund (1993).
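For concreteness, equation (20) is linear in the unknown T_{zz}, so for a small system it can be solved directly by vectorizing every term and stacking the h regime equations into one linear system. The sketch below is a minimal numpy illustration with hypothetical toy matrices and the selector structure dropped; it is not the RISE implementation, which relies on the matrix-free strategy just mentioned.

```python
import numpy as np

def solve_second_order(Azz, Ap, C, U):
    """Solve  Azz[r] + sum_rp Ap[r][rp] @ Tzz[rp] @ C[r] + U[r] @ Tzz[r] = 0
    for Tzz[r], r = 0..h-1, by vectorizing every term and stacking the
    h regime equations into one linear system."""
    h = len(Azz)
    n, m = Azz[0].shape                      # each Tzz[r] is n x m
    N = n * m
    G = np.zeros((h * N, h * N))
    rhs = np.concatenate([-Azz[r].ravel(order="F") for r in range(h)])
    for r in range(h):
        for rp in range(h):
            # vec(Ap @ Tzz @ C) = (C' kron Ap) vec(Tzz), column-major vec
            G[r*N:(r+1)*N, rp*N:(rp+1)*N] += np.kron(C[r].T, Ap[r][rp])
        # vec(U @ Tzz) = (I kron U) vec(Tzz)
        G[r*N:(r+1)*N, r*N:(r+1)*N] += np.kron(np.eye(m), U[r])
    sol = np.linalg.solve(G, rhs)
    return [sol[r*N:(r+1)*N].reshape(n, m, order="F") for r in range(h)]
```

The cost of building and factorizing G grows quickly with the model size, which is precisely why the Kronecker-free techniques of section 4 matter.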
3.4 Third-order solution
To get the third-order solution, we differentiate (17) with respect to z to get

0 = E_t \sum_{r_{t+1}=1}^{h} \left( [d_{vvv}^{r_t,r_{t+1}}]_{\alpha\beta\gamma}^{i} [v_z]_{l}^{\gamma} [v_z]_{k}^{\beta} [v_z]_{j}^{\alpha} + [d_{vv}^{r_t,r_{t+1}}]_{\alpha\beta}^{i} \sum_{rst \in \Omega_1} [v_{zz}]_{rs}^{\beta} [v_z]_{t}^{\alpha} + [d_{v}^{r_t,r_{t+1}}]_{\alpha}^{i} [v_{zzz}]_{jkl}^{\alpha} \right)

\Omega_1 = \{klj, jlk, jkl\}

In matrix form

0 = E_t \sum_{r_{t+1}=1}^{h} \left( d_{vvv}^{r_t,r_{t+1}} v_z^{\otimes 3} + \omega_1\left( d_{vv}^{r_t,r_{t+1}} (v_z \otimes v_{zz}) \right) + d_{v}^{r_t,r_{t+1}} v_{zzz} \right)

where \omega_1(.) is a function that computes the sum of permutations of tensors of the type A(B \otimes C), the permutations being given by the indices in \Omega_1.
We get v_{zzz} by differentiating v_{zz} with respect to z, yielding

v_{zzz} \equiv a_{zzz}^{0} + a_{zzz}^{1} P\left( (h_z^{r_t})^{\otimes 2} \otimes u \right) + a_{zzz}^{1} P\left( h_z^{r_t} \otimes u^{\otimes 2} \right) + a_{zzz}^{1} u^{\otimes 3} + \omega_1\left( a_{zz}^{1} (u \otimes h_{zz}^{r_t}) \right)    (21)

where h_{zz}^{r_t}, defined in (36), is the second derivative of h^{r_t} with respect to z.23 The definitions of a_{zzz}^{0} and a_{zzz}^{1}, as well as the expressions for E v_{zzz}, E(v_z \otimes v_z \otimes v_z), E(v_{zz} \otimes v_z) and E(v_z \otimes v_{zz}) needed to solve for T_{zzz}^{r_t}, are given in appendix (A.3).
In equation (21), the function P(.) denotes the sum of all possible permutations of Kronecker products, that is, P(A \otimes B \otimes C) = A \otimes B \otimes C + A \otimes C \otimes B + B \otimes A \otimes C + B \otimes C \otimes A + C \otimes A \otimes B + C \otimes B \otimes A. Only one product needs to be computed; the remaining ones can be derived as perfect shuffles of the computed one (see e.g. Van Loan (2000)).
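To illustrate the perfect-shuffle idea, the sketch below computes P(a ⊗ b ⊗ c) for vectors from a single Kronecker product: the one computed product is reshaped into a three-dimensional array and each remaining ordering is obtained by permuting axes (a shuffle), never by recomputing a Kronecker product. This is an illustrative toy, not RISE code.

```python
import itertools
import numpy as np

def perm_sum_kron3(a, b, c):
    """Sum of all six Kronecker orderings of vectors a, b, c, computed from
    a SINGLE Kronecker product: the computed tensor is reshaped to 3-D and
    every other ordering is recovered as an axis shuffle of it."""
    T = np.kron(np.kron(a, b), c).reshape(len(a), len(b), len(c))
    return sum(T.transpose(p).ravel() for p in itertools.permutations((0, 1, 2)))
```

Each `transpose(p).ravel()` is the perfect shuffle that turns the stored product into the ordering p, so the five remaining Kronecker products are never recomputed.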
Then, expanding the problem to solve, we get

A_{zzz}^{r_t} + \sum_{r_{t+1}=1}^{h} A_{r_t,r_{t+1}}^{+} T_{zzz}^{r_{t+1}} C_{zzz}^{r_t} + U_{r_t} T_{zzz}^{r_t} = 0    (22)

with

A_{zzz}^{r_t} \equiv \sum_{r_{t+1}=1}^{h} \left( d_{vvv}^{r_t,r_{t+1}} E v_z^{\otimes 3} + \omega_1\left( d_{vv}^{r_t,r_{t+1}} E (v_z \otimes v_{zz}) \right) + \left( d_{b_+}^{r_t,r_{t+1}} \lambda_b + d_{f_+}^{r_t,r_{t+1}} \lambda_f \right) T_{zz}^{r_{t+1}} \left( h_z^{r_t} \otimes h_{zz}^{r_t} \right) \right)    (23)

and

C_{zzz}^{r_t} \equiv P\left( h_z^{r_t} \otimes E u^{\otimes 2} \right) + E u^{\otimes 3} + (h_z^{r_t})^{\otimes 3}    (24)

The third-order perturbation solution of T^{r_t} in (3) takes the form

T^{r_t}(z) \simeq T^{r_t}(\bar{z}_{r_t}) + T_z^{r_t}(z_t - \bar{z}_{r_t}) + \frac{1}{2} T_{zz}^{r_t}(z_t - \bar{z}_{r_t})^{\otimes 2} + \frac{1}{3!} T_{zzz}^{r_t}(z_t - \bar{z}_{r_t})^{\otimes 3}
3.5 Higher-order solutions
The derivatives of (7) can be generalized to an arbitrary order, say p. In that case we have

0 = E_t \sum_{r_{t+1}=1}^{h} \sum_{l=1}^{p} \left[ d_{v^{(l)}}^{r_t,r_{t+1}} \right]_{\alpha_1 \alpha_2 \cdots \alpha_l}^{i} \sum_{c \in M_{l,p}} \prod_{m=1}^{l} \left[ v_{z^{|c_m|}} \right]_{k(c_m)}^{\alpha_m}

where:
• M_{l,p} is the set of all partitions of the set of p indices into l classes. In particular, M_{1,p} = {{1, 2, ..., p}} and M_{p,p} = {{1}, {2}, ..., {p}}
• |.| denotes the cardinality of a set
• c_m is the m-th class of partition c
• k(c_m) is a sequence of k's indexed by c_m

23 For any k ≥ 2, the k-order derivative of h^{r_t} with respect to z is given by h_{z^{(k)}}^{r_t} = \left[ \lambda_x T_{z^{(k)}}^{r_t} ; 0_{(1+(k+1)n_\epsilon) \times n_z^k} \right].
The p-order perturbation solution of T^{r_t} takes the form

T^{r_t}(z) \simeq T^{r_t}(\bar{z}_{r_t}) + T_z^{r_t}(z_t - \bar{z}_{r_t}) + \frac{1}{2} T_{zz}^{r_t}(z_t - \bar{z}_{r_t})^{\otimes 2} + \frac{1}{3!} T_{zzz}^{r_t}(z_t - \bar{z}_{r_t})^{\otimes 3} + \frac{1}{4!} T_{zzzz}^{r_t}(z_t - \bar{z}_{r_t})^{\otimes 4} + \cdots + \frac{1}{p!} T_{z \cdots z}^{r_t}(z_t - \bar{z}_{r_t})^{\otimes p}
4 Solving the system of quadratic matrix equations
One of the main objectives of this paper is to present algorithms for solving (11).
4.1 Existing solution methods
In terms of solving the first-order approximation of the switching DSGE models,
there are important contributions in the literature, most of which are ideally suited
for small models. We briefly review the essence of some of the existing solution
methods, most of which are nicely surveyed and evaluated in Farmer et al. (2011)
and then present our own.
The FP algorithm: The strategy in this technique by Farmer et al. (2008) is
to stack all the regimes and re-write the MSDSGE model as a constant parameter
model, whose solution is the solution of the original MSDSGE. As demonstrated by
Farmer et al. (2011), the FP algorithm may not find solutions even when they exist.
The SW algorithm: This functional iteration technique by Svensson and Williams
(2007) uses an iterative scheme to find the solution of a fixed-point problem, through
successive approximations of the solution, just as the Farmer et al. (2008) approach.
But the SW algorithm does not resort to stacking matrices as Farmer et al. (2008),
which is an advantage for solving large systems. Unfortunately, again as shown by
Farmer et al. (2011), the SW algorithm may not find solutions even when they exist.
The FWZ algorithm: This algorithm, by Farmer et al. (2011), uses a Newton
method to find the solutions of the MSDSGE model. It is a robust procedure in the
sense that it is able to find solutions that the other methods are not able to find. The
Newton method has the advantage of being fast and locally stable around any given
solution. However, unlike what was claimed in Farmer et al. (2011), in practice,
choosing a large enough grid of initial conditions does not guarantee that one will
find all possible MSV solutions.
The Cho algorithm: This algorithm, by Cho (2014), finds the so-called ”forward
solution” by extending the forward method of Cho and Moreno (2011) to MSDSGE
models. The paper also addresses the issue of determinacy – existence of a unique
stable solution. The essence of the algorithm is the same as that of the SW algorithm.
The FRWZ algorithm24: This technique, by Foerster et al. (2014), is one of the latest in the literature. It applies the theory of Gröbner bases (see Buchberger (1965, 2006)), which helps find all possible solutions of the system symbolically. While promising, the Gröbner bases avenue is not very attractive in the context of models of the size routinely used in central banks. Because of its exponential complexity, the computational cost incurred is substantial and prohibitively high (see for instance Ghorbal et al. (2014) and also Mayr and Meyer (1982))25. The algorithm is sensitive to the ordering of the variables and in many instances may simply fail to converge. In general, methods that aim at symbolically deriving solutions to systems of equations work well when the number of variables is small. When the number of variables is not too small, say, bigger than 20, such methods tend not to work well, as in general the number of solutions increases exponentially with the number of variables.
More to the point, Foerster et al. (2014)'s strategy is to find all possible solutions and then establish the stability of each of them, which is in itself a complicated step, in order to find out whether a particular parameterization admits a unique stable solution or not. The problem gets worse if one attempts to estimate a model, which is the goal we have in mind when deriving the algorithms in this paper: locating the global peak of the posterior kernel will typically require thousands of function evaluations.

24 This algorithm is not yet implemented in RISE.
25 This approach sometimes requires parallelization for moderately large systems (see e.g. Leykin (2004)).
Also, stability of a first-order approximation does not imply stability of higher-order approximations, even in the constant-parameter case. In the switching case with constant transition probabilities, the problem is worse. When we allow for endogenous probabilities, as is done in this paper, there is no stability concept such as Mean Square Stability (MSS) that applies. Therefore, even if we could compute all possible solutions, we would not have a criterion for picking one of them.
4.2 Proposed solution techniques
All of the algorithms presented above use constant transition probabilities. In many
cases, they also use the type of notation used in Klein (2000), Schmitt-Grohe and
Uribe (2004) and by Gensys. This notation works well with constant-parameter
models26 . When it comes to switching DSGE models, however, that notation has
some disadvantages, as we argued earlier. Among other things, it often requires considerable effort to manipulate the model into the required form, and the ensuing algorithms may be slow owing in part to the creation of additional auxiliary variables.
The solution methods proposed below do not suffer from those drawbacks, but
have their own weaknesses as well. And so they should be viewed as complementary
to existing ones27 .
4.2.1 Solution technique 1: A functional iteration algorithm
Functional iteration is a well-established technique for solving models that do not have an analytical solution but are nevertheless amenable to a fixed-point type of representation. It is used, for instance, when solving for the value function in the Bellman equation of dynamic programming problems. In the context of switching DSGE models, functional iteration has the advantages that (1) it typically converges fast whenever it is able to find a solution, (2) it can be used to solve large systems, and (3) it is easily implementable.
26 This is an attractive feature as it provides a compact and efficient representation of the first-order conditions or constraints of a system, leading to the well-known solution methods for constant-parameter DSGE models such as Klein (2000) and Gensys, among others.
27
These solution algorithms have been part of the RISE toolbox for many years.
We concentrate on minimum state variable (MSV) solutions and promptly re-write (11) as28

T_{z,x}^{r_t} = -\left[ A_{r_t}^{0} + \sum_{r_{t+1}=1}^{h} A_{r_t,r_{t+1}}^{+} T_{z,x}^{r_{t+1}} \lambda_x \right]^{-1} A_{r_t}^{-}    (25)

which is the basis for our functional iteration algorithm. We can iterate on equation (25) until convergence, i.e. until T_{z,x}^{r_t(k)} = T_{z,x}^{r_t(k-1)} for r_t = 1, 2, ..., h, given a start value T_{z,x}^{(0)} \equiv \left[ T_{z,x}^{1(0)} \cdots T_{z,x}^{h(0)} \right].
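A minimal numpy sketch of this functional iteration, assuming for simplicity that all variables are state variables (so the selector λx is the identity) and using hypothetical toy matrices:

```python
import numpy as np

def msv_functional_iteration(A0, Ap, Am, tol=1e-12, max_iter=500):
    """Iterate eq. (25): T[r] <- -(A0[r] + sum_rp Ap[r][rp] @ T[rp])^{-1} @ Am[r],
    starting from the zero guess, until successive iterates coincide."""
    h, n = len(A0), A0[0].shape[0]
    T = [np.zeros((n, n)) for _ in range(h)]
    for _ in range(max_iter):
        T_new = [-np.linalg.solve(A0[r] + sum(Ap[r][rp] @ T[rp] for rp in range(h)),
                                  Am[r]) for r in range(h)]
        if max(np.max(np.abs(T_new[r] - T[r])) for r in range(h)) < tol:
            return T_new
        T = T_new
    raise RuntimeError("functional iteration did not converge")
```

A converged output satisfies (A0[r] + Σ_rp Ap[r][rp] T[rp]) T[r] + Am[r] = 0 for every regime r, which is the fixed-point form of (25).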
Local convergence of functional iteration Now we are interested in deriving the conditions under which an initial guess will converge to a solution. We have the following theorem.

Theorem 1 Let T_{r_t}^{*}, r_t = 1, 2, ..., h, be a solution to (11) such that \|T_{r_t}^{*}\| \le \theta, \|A_{r_t}^{+}\| \le a^{+} and \|(A_{r_t}^{0} + \sum_{r_{t+1}=1}^{h} A_{r_t,r_{t+1}}^{+} T_{r_{t+1}}^{*})^{-1}\| \le \tau for r_t = 1, 2, ..., h, where \theta > 0, a^{+} > 0 and \tau > 0. Assume there exists \delta > 0 such that \|T_{r_t}^{(0)} - T_{r_t}^{*}\| \le \delta, r_t = 1, 2, ..., h. Then if \tau a^{+}(\theta + \delta) < 1, the iterative sequence \{T_{r_t}^{(k)}\}_{r_t=1}^{h} generated by functional iteration with initial guess T_{r_t}^{(0)}, r_t = 1, 2, ..., h, satisfies \|T_{r_t}^{(k)} - T_{r_t}^{*}\| \le \varphi \sum_{r_{t+1}=1}^{h} \|T_{r_{t+1}}^{(k-1)} - T_{r_{t+1}}^{*}\|, with \varphi \in (0, 1).

Proof. See appendix B.
Farmer et al. (2011) show that functional iterations are not able to find all
possible solutions. But still, functional iterations cannot be discounted: they are
useful for getting solutions for large systems. It is therefore important to understand
why in some cases functional iteration may fail to converge.
The theorem establishes some sufficient conditions for convergence of functional
iteration in this context. In other words it gives us hints about the cases in which
functional iteration might not perform well. In particular, inequality τ a+ (θ + δ) < 1
is more likely to be fulfilled for low values of τ , a+ , θ and δ rather than large ones.
28 The invertibility of A_{r_t}^{0} + \sum_{r_{t+1}=1}^{h} A_{r_t,r_{t+1}}^{+} T_{z,x}^{r_{t+1}} \lambda_x is potentially reinforced by the fact that all variables in the system appear as current. That is, all columns of matrix A_{r_t}^{0} have at least one non-zero element.
4.2.2 Solution technique 2: Newton with Kronecker products
We define

W_{r_t}(T_{z,x}) \equiv T_{z,x}^{r_t} + \left[ A_{r_t}^{0} + \sum_{r_{t+1}=1}^{h} A_{r_t,r_{t+1}}^{+} T_{z,x}^{r_{t+1}} \lambda_x \right]^{-1} A_{r_t}^{-}    (26)

We perturb the current solution T_{z,x} by a factor \Delta and expand (26) into

W_{r_t}(T_{z,x} + \Delta) = W_{r_t}(T_{z,x}) + \Delta_{r_t} - \sum_{r_{t+1}=1}^{h} L_{x_+}^{r_t,r_{t+1}} \Delta_{r_{t+1}} L_{x_-}^{r_t} + HOT

where

L_{x_+}^{r_t,r_{t+1}} \equiv U_{r_t}^{-1} A_{r_t,r_{t+1}}^{+}    (27)

and

L_{x_-}^{r_t} \equiv -\lambda_x T_{z,x}^{r_t}    (28)

Newton ignores HOT and attempts to solve for \Delta_{r_t} so that

W_{r_t}(T_{z,x}) + \Delta_{r_t} - \sum_{r_{t+1}=1}^{h} L_{x_+}^{r_t,r_{t+1}} \Delta_{r_{t+1}} L_{x_-}^{r_t} = 0    (29)

where \Delta_{r_t} - \sum_{r_{t+1}=1}^{h} L_{x_+}^{r_t,r_{t+1}} \Delta_{r_{t+1}} L_{x_-}^{r_t} is the Frechet derivative of W_{r_t} at T_{z,x} in direction \Delta_{r_t}.

• We vectorize \Delta_{r_t} - \sum_{r_{t+1}=1}^{h} L_{x_+}^{r_t,r_{t+1}} \Delta_{r_{t+1}} L_{x_-}^{r_t} = -W_{r_t}(T_{z,x}) to yield

\sum_{r_{t+1}=1}^{h} \left( (L_{x_-}^{r_t})' \otimes L_{x_+}^{r_t,r_{t+1}} \right) \delta_{r_{t+1}} - \delta_{r_t} = w_{r_t}(T_{z,x})    (30)

which we have to solve in order to find the Newton improving step \delta \equiv \left[ \delta_1' \cdots \delta_h' \right]'.

• Stacking all the equations, \delta is solved as \delta = \Gamma^{-1} w, where

\Gamma \equiv \begin{bmatrix}
(L_{x_-}^{1})' \otimes L_{x_+}^{1,1} - I_{n_x \times n_T} & \cdots & (L_{x_-}^{1})' \otimes L_{x_+}^{1,h} \\
\vdots & \ddots & \vdots \\
(L_{x_-}^{h})' \otimes L_{x_+}^{h,1} & \cdots & (L_{x_-}^{h})' \otimes L_{x_+}^{h,h} - I_{n_x \times n_T}
\end{bmatrix}    (31)

with w \equiv \left[ w_1' \cdots w_h' \right]'.
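The two bullets above can be sketched as a single Newton step: build Γ with explicit Kronecker products, solve Γδ = w, and update T. The toy below assumes λx = I and hypothetical matrices A0, Ap, Am; it is a sketch of the scheme, not the RISE code.

```python
import numpy as np

def newton_kron_step(T, A0, Ap, Am):
    """One Newton step for W_r(T) = 0 (eq. (26)): build Gamma of eq. (31)
    with explicit Kronecker products, solve Gamma @ delta = w, update T."""
    h, n = len(T), T[0].shape[0]
    N = n * n
    U = [A0[r] + sum(Ap[r][rp] @ T[rp] for rp in range(h)) for r in range(h)]
    W = [T[r] + np.linalg.solve(U[r], Am[r]) for r in range(h)]
    Lminus = [-T[r] for r in range(h)]                      # eq. (28), lambda_x = I
    Lplus = [[np.linalg.solve(U[r], Ap[r][rp]) for rp in range(h)]
             for r in range(h)]                             # eq. (27)
    Gamma = np.zeros((h * N, h * N))
    for r in range(h):
        for rp in range(h):
            Gamma[r*N:(r+1)*N, rp*N:(rp+1)*N] = np.kron(Lminus[r].T, Lplus[r][rp])
        Gamma[r*N:(r+1)*N, r*N:(r+1)*N] -= np.eye(N)        # minus identity on diagonal
    w = np.concatenate([W[r].ravel(order="F") for r in range(h)])
    delta = np.linalg.solve(Gamma, w)
    return [T[r] + delta[r*N:(r+1)*N].reshape(n, n, order="F") for r in range(h)]
```

Each Kronecker block encodes vec(L+ Δ L−) = (L−' ⊗ L+) vec(Δ) under column-major vec, which is exactly the vectorization step of (30).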
Relation to Farmer, Waggoner and Zha (2011) As discussed earlier, Farmer
et al. (2011) consider problems in which the coefficient matrix on the forward-looking
variables is known in the current period. Our algorithms consider the more general
case where those matrices are unknown in the current period. Another difference
between Farmer et al. (2011) and this paper is that rather than perturbing the T as we do, they build the Newton by computing derivatives directly. Our approach is more efficient as it solves a smaller system. In particular, in our case, Γ in (31) is of size n_x nh × n_x nh, while it would be of size (n + l)²h × (n + l)²h in the Farmer et al. (2011) case, where l is the number of expectational errors, n is the number of equations in the system and h is the number of regimes.
These computational gains can be substantial, but the computational costs remain significant. This owes to two facts:
• First, Kronecker products are expensive operations.
• Secondly, for models used for policy analysis, the matrices to build and invert are huge. In Smets and Wouters (2007), one of the models we will solve in the computational experiments, n = 40. With h = 2, the matrix to build and invert (Γ) is of size 3200 × 3200. And this has to be repeated at each step of the Newton algorithm until convergence.
The natural question then is to ask whether there is a chance of solving medium
scale (or large) MSDSGE models of the size that is relevant to policymakers in central
banks. The answer is yes, at least if we find a more efficient way of solving (30).
Here, the ideal method would:
1. only need matrix-vector products and compute those products efficiently
2. have modest memory requirements and in particular, avoid building and storing
the huge matrix Γ in (31).
3. be based on some kind of minimization principle
4.2.3 Solution technique 3: Newton without Kronecker products
The first and the second requirements are achieved by exploiting the insights of
Tadonki and Philippe (1999). They consider the problem of computing an expression
of the form (A ⊗ B) z where A and B are matrices and z is a vector. Relying on the identity in (32), they show how (I ⊗ B) z can be computed efficiently without using a Kronecker product. The result of this calculation is a vector, say z′. Then they also show that (A ⊗ I) z′ can be computed efficiently without using a Kronecker product.
(A ⊗ B) z = (A ⊗ I) [(I ⊗ B) z]
(32)
Hence we can compute (A ⊗ B) z without building (A ⊗ B) and without storing it.
Equation (30) involves this type of computation.
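The same effect can be obtained with the closely related identity (A ⊗ B) vec(Z) = vec(B Z A'), which evaluates the product with two small matrix multiplications. A minimal numpy sketch (column-major vec; the inputs are arbitrary):

```python
import numpy as np

def kron_mv(A, B, z):
    """Compute (A kron B) @ z without ever forming A kron B, via
    (A kron B) vec(Z) = vec(B @ Z @ A.T), with column-major vec."""
    m, n = A.shape
    p, q = B.shape
    Z = z.reshape(q, n, order="F")           # unstack z into a q x n matrix
    return (B @ Z @ A.T).ravel(order="F")    # restack the small product
```

For A and B of size n × n, this costs O(n³) flops and O(n²) memory, against O(n⁴) for forming the Kronecker product explicitly.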
The third requirement is achieved by exploiting the "transpose-free quasi-minimal residual" (TFQMR) method for solving large linear systems, due to Freund (1993). It is an iterative method defined by a quasi-minimization of the norm of a residual vector over a Krylov subspace. The advantage of this technique in addressing our problem is that it involves the coefficient matrix Γ only in the form of matrix-vector products, which we can compute efficiently as discussed above. Moreover, the TFQMR technique overcomes the problems of the bi-conjugate gradient (BCG) method, which is known to (i) exhibit erratic convergence behavior with large oscillations in the residual norm, and (ii) have a substantial likelihood of breakdowns (more precisely, division by 0)29.
The system to solve for finding the Newton improving step, (30), is equivalent to Γδ = w. It turns out that, for a given δ, we can compute Γδ without ever building Γ and without using a single Kronecker product. Hence we can solve for δ using the TFQMR technique.
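As a sketch of this strategy, the snippet below applies Γ to a vector block by block using only small matrix products (no Kronecker product is ever formed and Γ is never stored), and hands the resulting matrix-free operator to SciPy's TFQMR solver (available as `scipy.sparse.linalg.tfqmr` in SciPy 1.8+). The matrices L below are hypothetical toy inputs.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, tfqmr

rng = np.random.default_rng(2)
h, n = 2, 4                      # hypothetical toy sizes
N = n * n
Lminus = [0.1 * rng.standard_normal((n, n)) for _ in range(h)]
Lplus = [[0.1 * rng.standard_normal((n, n)) for _ in range(h)] for _ in range(h)]

def gamma_matvec(delta):
    """Apply Gamma of eq. (31) blockwise: block (r, rp) is
    (Lminus[r]' kron Lplus[r][rp]) with the identity subtracted on the
    diagonal, but each block acts as vec(Lplus @ D @ Lminus) instead of
    through an explicit Kronecker product."""
    delta = np.asarray(delta).ravel()
    out = -delta.copy()                                   # the -I part
    for r in range(h):
        acc = np.zeros((n, n))
        for rp in range(h):
            D = delta[rp*N:(rp+1)*N].reshape(n, n, order="F")
            acc += Lplus[r][rp] @ D @ Lminus[r]
        out[r*N:(r+1)*N] += acc.ravel(order="F")
    return out

Gamma = LinearOperator((h * N, h * N), matvec=gamma_matvec, dtype=np.float64)
w = rng.standard_normal(h * N)
delta, info = tfqmr(Gamma, w)    # Gamma is only ever touched through matvec
```

TFQMR only needs this matvec, so the memory footprint stays at the size of the blocks rather than the size of Γ.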
We are now armed with the tools for summarizing the whole Newton algorithm.

Algorithm 2 Pseudo code for the complete Newton algorithm

1. k = 0
2. pick a starting value T_{z,x}^{(k)} \equiv \left[ T_{z,x}^{1(k)} \cdots T_{z,x}^{h(k)} \right] and compute W^{(k)}(T_{z,x}) \equiv \left[ W_1(T_{z,x}^{(k)}) \cdots W_h(T_{z,x}^{(k)}) \right], using (26)
3. while W^{(k)} \neq 0
   (a) compute L_{x_+}^{r_t,r_{t+1}} and L_{x_-}^{r_t}, r_t, r_{t+1} = 1, 2, ..., h, as in (27) and (28) respectively
   (b) solve (29) for \Delta^{(k)} \equiv \left[ \Delta_1^{(k)} \cdots \Delta_h^{(k)} \right] using algorithm (4.2.2) or algorithm (4.2.3)
   (c) k = k + 1
   (d) T_{z,x}^{(k)} = T_{z,x}^{(k-1)} + \Delta^{(k-1)}
   (e) W_{r_t}(T_{z,x}^{(k)}) = T_{z,x}^{r_t(k)} + \left[ A_{r_t}^{0} + \sum_{r_{t+1}=1}^{h} A_{r_t,r_{t+1}}^{+} T_{z,x}^{r_{t+1}(k)} \lambda_x \right]^{-1} A_{r_t}^{-}, r_t = 1, 2, ..., h

29 The BCG method, due to Lanczos (1952), extends the classical conjugate gradient algorithm (CG) to non-Hermitian matrices.
4.3 Stability in the non-time-varying transition probabilities case
The traditional stability concepts for constant-parameter linear rational expectations models do not extend to the Markov switching case. Following the lead of Svensson and Williams (2007) and Farmer et al. (2011), among others, this paper uses the concept of mean square stability (MSS), as defined in Costa et al. (2005), to characterize stable solutions in the case of non-time-varying transition probabilities.
Consider the MSDSGE system whose solution is given by equation (8) and with constant transition probability matrix Q such that Q_{r_t,r_{t+1}} = p_{r_t,r_{t+1}}. Using the selection matrix \lambda_x, we can extract the portion of the solution for state endogenous variables only to get

x_t(z) = \lambda_x T^{r_t}(\bar{z}_{r_t}) + \lambda_x T_{z,x}^{r_t} \left( x_{t-1} - \lambda_x T^{r_t}(\bar{z}_{r_t}) \right) + \lambda_x T_{z,\sigma}^{r_t} \sigma + \lambda_x T_{z,\epsilon_0}^{r_t} \epsilon_t + \cdots + \lambda_x T_{z,\epsilon_k}^{r_t} \epsilon_{t+k}
This system, and thereby (8), is MSS if for any initial condition x_0 there exist a vector \mu and a matrix \Sigma, independent of x_0, such that \lim_{t \to \infty} \|E x_t - \mu\| = 0 and \lim_{t \to \infty} \|E x_t x_t' - \Sigma\| = 0. Hence the covariance matrix of the process is bounded. As shown by Gupta et al. (2003) and Costa et al. (2005, chapter 3), a necessary and sufficient condition for MSS is that the matrix \Upsilon, as defined in (33), has all its eigenvalues inside the unit circle30.


\Upsilon \equiv (Q \otimes I_{n^2 \times n^2}) \begin{bmatrix}
\lambda_x T_{z,x}^{1} \otimes \lambda_x T_{z,x}^{1} & & \\
& \ddots & \\
& & \lambda_x T_{z,x}^{h} \otimes \lambda_x T_{z,x}^{h}
\end{bmatrix}    (33)
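A direct numpy sketch of this MSS check, with Q row-stochastic and hypothetical toy solution matrices (the selector λx dropped):

```python
import numpy as np

def mss_stable(Q, T):
    """Mean square stability check: spectral radius of
    Upsilon = (Q kron I) blkdiag(T[r] kron T[r]) strictly below one (eq. (33))."""
    h, n = len(T), T[0].shape[0]
    N = n * n
    blk = np.zeros((h * N, h * N))
    for r in range(h):
        blk[r*N:(r+1)*N, r*N:(r+1)*N] = np.kron(T[r], T[r])
    Upsilon = np.kron(Q, np.eye(N)) @ blk
    return np.max(np.abs(np.linalg.eigvals(Upsilon))) < 1.0
```

Note that a regime can be individually explosive while the switching system is still MSS, which is what makes this check non-trivial.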
A particular note should be taken of the facts that:
• With endogenous probabilities, as this paper discusses, the MSS concept becomes useless.
30 It is not very hard to see that a computationally more efficient representation of \Upsilon is given by:

\Upsilon = \begin{bmatrix}
p_{1,1} \lambda_x T_{z,x}^{1} \otimes \lambda_x T_{z,x}^{1} & p_{1,2} \lambda_x T_{z,x}^{2} \otimes \lambda_x T_{z,x}^{2} & \cdots & p_{1,h} \lambda_x T_{z,x}^{h} \otimes \lambda_x T_{z,x}^{h} \\
p_{2,1} \lambda_x T_{z,x}^{1} \otimes \lambda_x T_{z,x}^{1} & p_{2,2} \lambda_x T_{z,x}^{2} \otimes \lambda_x T_{z,x}^{2} & \cdots & p_{2,h} \lambda_x T_{z,x}^{h} \otimes \lambda_x T_{z,x}^{h} \\
\vdots & \vdots & & \vdots \\
p_{h,1} \lambda_x T_{z,x}^{1} \otimes \lambda_x T_{z,x}^{1} & p_{h,2} \lambda_x T_{z,x}^{2} \otimes \lambda_x T_{z,x}^{2} & \cdots & p_{h,h} \lambda_x T_{z,x}^{h} \otimes \lambda_x T_{z,x}^{h}
\end{bmatrix}
• Even in the constant-transition probabilities case, there are no theorems implying that stability of the first-order perturbation implies the stability of higher
orders.
4.4 Proposed initialization strategies
Both functional iterations and the Newton methods require an initial guess of the
solution. As stated earlier, we do not attempt to compute all possible solutions of
(11) and then check that only one of them is stable as it is done in Foerster et al.
(2014). Initialization is key not only for finding a solution but also, potentially, for
the speed of convergence of any algorithm.
We derive all of our initialization schemes from making assumptions on equation (25), which, for convenience we report in (34), abstracting from the iteration
superscript.
"
! #−1
h
X
rt
0
+
rt+1
Tz,x = Art +
Art ,rt+1 Tz,x
A−
λx
(34)
rt
rt+1 =1
there are at least three ways to initialize a solution:
• The first approach makes the assumption that A_{r_t}^{-} = 0, r_t = 1, 2, ..., h. In this case, we initialize the solution at the one we would obtain in the absence of predetermined variables. This solution is trivially zero, as can be seen from equation (34). Formally, it corresponds to

T_{z,x}^{r_t(0)} = 0, for r_t = 1, 2, ..., h
• The second approach makes the assumption that A_{r_t,r_{t+1}}^{+} = 0, r_t, r_{t+1} = 1, 2, ..., h. While the first approach assumes there are no predetermined variables, this approach instead assumes there are no forward-looking variables. In that case, the solution is also easy to compute using (34). In RISE, this option is the default initialization scheme. It corresponds to setting:

T_{z,x}^{r_t(0)} = -\left( A_{r_t}^{0} \right)^{-1} A_{r_t}^{-}, for r_t = 1, 2, ..., h
• The third approach recognizes that making a good guess for a solution is difficult and instead tries to make guesses on the products A_{r_t,r_{t+1}}^{+} T_{z,x}^{r_{t+1}}. We proceed as follows: (i) we would like the guesses for each regime to be related, just as they would be in the final solution; (ii) we would like the typical element of A_{r_t,r_{t+1}}^{+} T_{z,x}^{r_{t+1}} to have a norm roughly of the squared order of magnitude of a solvent or final solution. Given those two requirements, we make the assumption that the elements of A_{r_t,r_{t+1}}^{+} T_{z,x}^{r_{t+1}}, r_t, r_{t+1} = 1, 2, ..., h, are drawn from a standard normal distribution scaled by the approximate norm of a solvent squared31. We approximate the norm of a solvent by

\sigma \equiv \frac{\eta_0 + \sqrt{\eta_0^2 + 4 \eta_+ \eta}}{2 \eta_+}

where \eta_0 \equiv \max_{r_t} \|A_{r_t}^{0}\|, \eta \equiv \max_{r_t} \|A_{r_t}^{-}\| and \eta_+ \equiv \max_{r_t} \|A_{r_t}^{+}\|. Then, with draws of A_{r_t,r_{t+1}}^{+} T_{z,x}^{r_{t+1}} in hand, we can use (34) to compute a guess for a solution. In RISE this approach is used when trying to locate alternative solutions. It can formally be expressed as

T_{r_t}^{(0)} = -\left( A_{r_t}^{0} + \sum_{r_{t+1}=1}^{h} \left( A_{r_t,r_{t+1}}^{+} T_{z,x}^{r_{t+1}} \right)^{guess} \right)^{-1} A_{r_t}^{-} \lambda_x

with \left[ \left( A_{r_t,r_{t+1}}^{+} T_{z,x}^{r_{t+1}} \right)^{guess} \right]_{ij} \sim \sigma^2 N(0, 1).
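The three schemes can be sketched as follows (numpy, λx = I, Frobenius norms, hypothetical toy inputs; the function name is illustrative):

```python
import numpy as np

def initial_guess(A0, Ap, Am, scheme="backward", rng=None):
    """The three initialization schemes built on eq. (34), with lambda_x = I."""
    h, n = len(A0), A0[0].shape[0]
    if scheme == "zero":          # no predetermined variables: A- = 0
        return [np.zeros((n, n)) for _ in range(h)]
    if scheme == "backward":      # no forward-looking variables: A+ = 0 (RISE default)
        return [-np.linalg.solve(A0[r], Am[r]) for r in range(h)]
    # random: draw the products A+ T, scaled by the squared approximate
    # norm of a solvent, then back out a guess through eq. (34)
    rng = rng if rng is not None else np.random.default_rng()
    eta0 = max(np.linalg.norm(A0[r]) for r in range(h))
    eta_minus = max(np.linalg.norm(Am[r]) for r in range(h))
    eta_plus = max(np.linalg.norm(Ap[r][rp]) for r in range(h) for rp in range(h))
    sigma = (eta0 + np.sqrt(eta0**2 + 4 * eta_plus * eta_minus)) / (2 * eta_plus)
    return [-np.linalg.solve(A0[r] + sum(sigma**2 * rng.standard_normal((n, n))
                                         for _ in range(h)), Am[r]) for r in range(h)]
```

The "backward" guess is exact when A+ = 0, which is why it is a cheap and often reasonable default.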
5 Performance of alternative solution methods
In this section, we investigate the performance of the different algorithms in terms
of their ability to find solutions but also in terms of their efficiency. To that end, we
use a set of problems found in the literature.
5.1 Algorithms in contest

The algorithms in the contest are:
1. Our two functional iteration algorithms: MFI full and MFI, with the latter
version further exploiting sparsity.
2. Our Newton algorithms using Kronecker products: MNK full and MNK,
with the latter version further exploiting sparsity.
3. Our Newton algorithms without Kronecker products: MN full and MN, with
the latter version further exploiting sparsity.
31 Implicit also is the assumption that A_{s_{t+1}}^{+} and T_{s_{t+1}} are of the same order of magnitude.
4. Our Matlab version of the Farmer et al. (2011) algorithm: FWZ. It is included for direct comparison as it is shown to outperform the FP and SW algorithms described above.32
5.2 Test problems
The details of the test problems considered are given in appendix (C). The models
considered are the following:
1. cho2014 EIS is a 6-variable model by Cho (2014), with switching elasticity of intertemporal substitution.
2. cho2014 MP is a 6-variable model by Cho (2014), with switching monetary policy parameters.
3. frwz2013 nk 1 is a 3-variable new Keynesian model by Foerster et al. (2013)
4. frwz2013 nk 2 is the same model as frwz2013 nk 1 but with a different
parameterization.
5. frwz2013 nk hab is a 4-variable new Keynesian model with habit persistence
by Foerster et al. (2013).
6. frwz2014 rbc is a 3-variable RBC model by Foerster et al. (2014)
7. fwz2011 1 is a 2-variable model by Farmer et al. (2011)
8. fwz2011 2 is the same model as fwz2011 1 but with a different parameterization.
9. fwz2011 3 is the same model as fwz2011 1 but with a different parameterization.
10. smets wouters is the well-known model by Smets and Wouters (2007). Originally the model is a constant-parameter model. We turn it into a switching
model by creating a regime in which monetary policy is unresponsive to changes
in inflation and output gap. See appendix (C) for further details.
32
This algorithm is also included among the solvers in RISE and the coding strictly follows the
formulae in the Farmer et al. (2011) paper.
All the tests use the same tolerance level for all the algorithms and are run on a Dell Precision M4700 computer with an Intel Core i7 processor running at 2.8 GHz with 16GB of RAM. Windows 7 (64-bit) is the operating system used. The results included in this paper are computed using the MATLAB 8.4.0.150421 (R2014b) environment.
5.3 Finding a solution: computational efficiency
The purpose of this experiment is to assess both the ability of each algorithm to find
a solution and the speed with which the solution is found. To that end, we solved each model 50 times for each solution procedure and averaged the computing time spent across runs33. The computation times we report include:
• the time spent in finding a solution,
• the time spent in checking whether the solution is MSS (which is oftentimes
the most costly step)
• some further overhead from various functions that call the solution routines in
the object oriented system.
The last two elements are constant across models and therefore should not have
an impact on the relative performances of the algorithms.
The results presented in table 1 are all computed using the default initialization scheme of RISE, i.e. the one that assumes there are no forward-looking variables in the model34. The table reports, for each model, its relative speed performance. This relative performance is computed as the absolute performance divided by the best performance. A relative performance of, say, x means that for that particular model the algorithm is x times slower than the best, whose performance is always 1. The table also reports in column 2 the number of variables in the model under consideration and in column 3 the speed in seconds of the fastest solver.
The computing time tends to increase with the size of the model, as could be expected, although this pattern is not uniform for small models. But we note that the MNK full algorithm tends to be the fastest algorithm on small models. The FWZ algorithm comes first in two of the contests but, as argued earlier, because it imposes that the coefficient matrix on forward-looking variables depends only on the current regime, it saves some computations and gives the wrong answer when the coefficient matrix on forward-looking variables also depends on the future regime. This is the case in the cho2014 EIS model.

33 Running the same code several times is done simply to smooth out the uncertainty across runs.
34 The results for other initialization schemes have also been computed and are available upon request.
The MNK full algorithm is also faster than its version that exploits the sparsity of
the problem, namely MNK. This is not surprising however because when the number
of variables is small, the overhead incurred in trying to exploit sparsity outweighs the
benefits. This is also true for the functional iteration algorithms as MFI full tends
to dominate MFI.
For the same small models, the MNK full algorithm, which uses Kronecker products, is also faster than its counterparts that do not use Kronecker products, namely MN full and MN. Again, there is no surprise here: when the number of variables is small, computing Kronecker products and inverting small matrices is faster than trying to avoid those computations, which is the essence of the MN full and MN algorithms.
The story does not end there, however. As the number of variables increases, the MN algorithm shows better performance. By the time the number of variables gets to 40, as in the Smets and Wouters model, the MN algorithm, which exploits sparsity and avoids the computation of Kronecker products, becomes the fastest. In particular, on the smets wouters model, the MN algorithm is about 7 times faster than its Kronecker version (MNK), which also exploits sparsity, and about 39 times faster than the version of the algorithm that builds Kronecker products and does not exploit sparsity (MNK full), which is the fastest in small models. For the parameterization under consideration, the FWZ and the functional iteration algorithms never find a solution35.
35 Although the Newton algorithms converge fast, rapid convergence occurs only in the neighborhood of a solution (see for instance Benner and Byers (1998)). It may well be the case that the first Newton step is disastrously large, requiring many iterations to find the region of rapid convergence. This is an issue we have not encountered so far, but it may be addressed by incorporating line searches along a search direction so as to reduce too-long steps or increase too-short ones, which in both cases would accelerate convergence even further.
                  nvars  best speed    mfi    mnk     mn  mfi full  mnk full  mn full    fwz
cho2014 EIS           6     0.01798  4.971  1.624  2.802     3.365     1.177    2.126      1
cho2014 MP            6     0.01259      1  1.221  1.101      1.11     1.124    1.082    Inf
frwz2013 nk 1         3     0.01486   1.34  1.198  1.459     1.219         1    1.435  1.092
frwz2013 nk 2         3     0.01571  1.309  1.111   1.42     1.218         1    1.407  1.059
frwz2013 nk hab       4     0.01622  1.059  1.106  1.134         1     1.009    1.125  1.146
frwz2014 rbc          3     0.01628  2.676  1.259  1.313      2.15         1    1.284    Inf
fwz2011 1             2     0.00924    Inf  1.051  1.139       Inf     1.015    1.068      1
fwz2011 2             2     0.01043   1.57  1.281  1.374      1.27         1    1.265  1.278
fwz2011 3             2     0.01108  1.877  1.185   1.35     1.443         1    1.279  1.112
smets wouters        40      0.2515    Inf  6.514      1       Inf     39.48    1.364    Inf

Table 1: Relative efficiency of various algorithms using the backward initialization scheme
5.4 Finding many solutions
Our algorithms are not designed to find all possible solutions since the results they
produce depend on the initial condition. Although sampling randomly may lead to
different solutions, many different starting points may also lead to the same solution.
Despite these shortcomings, we investigated the ability of the algorithms to find
many solutions for the test problems considered. Unlike Foerster et al. (2013) and
Foerster et al. (2014) our algorithms only produce real solutions, which are the ones
of interest. Starting from 100 different random points, we computed the results given
in table 2. For every combination of solution algorithm and test problem, there are
two numbers. The first one is the total number of solutions found and the second
one is the number of stable solutions. As the results show, we easily replicate the
findings in the literature. In particular,
• for the Cho (2014) problems, we replicate the finding that the model with switching elasticity of intertemporal substitution is determinate. Indeed, all of the algorithms tested find at least one solution, and at most one of the solutions found is stable in the MSS sense. We do not, however, find that the solution found for the model with switching monetary policy is stable.
• For the Farmer et al. (2011) examples, we replicate all the results. In the first
parameterization, there is a unique solution and the solution is stable. This
is what all the Newton algorithms find. The functional iteration algorithms
are not able to find that solution. In the second parameterization, our Newton
algorithms return 2 stable solutions and in the third parameterization, they
return 4 stable solutions just as found by Farmer et al. (2011).
• Foerster et al. (2013) examples: In the new Keynesian model without habits
our Newton algorithms return 3 real solutions out of which only one is stable.
In the second parameterization, they return 3 real solutions but this time two
of them are stable. For the new Keynesian model with habits, our Newton
algorithms return 6 real solutions out of which one is stable.
• for the Foerster et al. (2014) RBC model, there is a unique stable solution.
• The Smets and Wouters (2007) model shows that there are many real solutions and many of those solutions are stable. The different algorithms return different solutions since they are all initialized randomly in different places. This suggests that we most likely have not found all possible solutions and that by increasing the number of simulations we could have found many more solutions.
                  nvars   mfi   mnk    mn  mfi full  mnk full  mn full   fwz
cho2014 EIS           6   1/1   2/1   2/1       1/1       2/1      2/1   2/1
cho2014 MP            6   1/0   1/0   1/0       1/0       1/0      1/0   0/0
frwz2013 nk 1         3   1/1   3/1   3/1       1/1       3/1      3/1   3/1
frwz2013 nk 2         3   1/1   3/2   3/2       1/1       3/2      3/2   3/2
frwz2013 nk hab       4   1/1   6/1   6/1       1/1       6/1      6/1   6/1
frwz2014 rbc          3   1/1   1/1   1/1       1/1       2/1      2/1   0/0
fwz2011 1             2   0/0   1/1   1/1       0/0       1/1      1/1   1/1
fwz2011 2             2   1/1   2/2   2/2       1/1       2/2      2/2   1/1
fwz2011 3             2   1/1   4/4   4/4       1/1       4/4      4/4   3/3
smets wouters        40   0/0  11/2  11/4       0/0      16/2     10/3   0/0

Table 2: Solutions found after 100 random starting values and number of stable solutions (each cell: solutions found / stable solutions)
5.5 Higher orders
Our algorithms do not just solve higher-order approximations for switching DSGE models; they solve higher-order approximations of constant-parameter DSGE models as a special case.
We have successfully solved a second-order approximation of a version of NEMO, the DSGE model used by Norges Bank. This model includes upwards of 272 equations in its nonlinear form.
We have also tested our algorithms on various models against some of the existing and freely available algorithms solving for higher-order approximations. Those algorithms include: dynare, dynare++ and codes by Binning (2013b) and Binning (2013a). In all cases we find the same results.
Finally, we have run the switching RBC model of Foerster et al. (2014) in RISE;
the results for a third-order approximation are presented below. Because the steady
state is unique in Foerster et al. (2014), we simply had to pass this information
to the software, without any further need to partition the parameters. RISE prints
only the unique combinations of cross products and uses the symbol @sig to denote
the perturbation term. The results produced by RISE closely replicate the
second-order approximation presented by Foerster et al. (2014) (see footnote 36).
MODEL SOLUTION
SOLVER :: mfi

Regime 1: a = 1 & const = 1

Endo Name                  C            K            Z
steady state            2.082588    22.150375     1.007058
K{-1}                   0.040564     0.969201     0.000000
Z{-1}                   0.126481    -2.140611     0.100000
@sig                   -0.009066    -0.362957     0.018459
EPS                     0.009171    -0.155212     0.007251
K{-1},K{-1}            -0.000461    -0.000167     0.000000
K{-1},Z{-1}             0.001098    -0.047836     0.000000
K{-1},@sig             -0.000462    -0.008170     0.000000
K{-1},EPS               0.000080    -0.003468     0.000000
Z{-1},Z{-1}            -0.058668     1.168197    -0.044685
Z{-1},@sig              0.000323     0.018293     0.000916
Z{-1},EPS               0.000299     0.007642     0.000360
@sig,@sig              -0.024022     0.026995     0.000169
@sig,EPS                0.000023     0.001326     0.000066
EPS,EPS                 0.000022     0.000554     0.000026
K{-1},K{-1},K{-1}       0.000011     0.000005     0.000000
K{-1},K{-1},Z{-1}      -0.000009     0.000001     0.000000
K{-1},K{-1},@sig       -0.000001    -0.000000     0.000000
K{-1},K{-1},EPS        -0.000001     0.000000     0.000000
K{-1},Z{-1},Z{-1}      -0.000332     0.017407     0.000000
K{-1},Z{-1},@sig        0.000005     0.000269     0.000000
K{-1},Z{-1},EPS         0.000002     0.000115     0.000000
K{-1},@sig,@sig        -0.000185     0.000231     0.000000
K{-1},@sig,EPS          0.000000     0.000020     0.000000
K{-1},EPS,EPS           0.000000     0.000008     0.000000
Z{-1},Z{-1},Z{-1}       0.037330    -0.811474     0.028102
Z{-1},Z{-1},@sig       -0.000250    -0.006515    -0.000273
Z{-1},Z{-1},EPS        -0.000097    -0.002777    -0.000107
Z{-1},@sig,@sig        -0.000418    -0.000480     0.000006
Z{-1},@sig,EPS         -0.000001    -0.000043     0.000002
Z{-1},EPS,EPS           0.000001    -0.000018     0.000001
@sig,@sig,@sig         -0.002111     0.001640     0.000001
@sig,@sig,EPS          -0.000028    -0.000038     0.000000
@sig,EPS,EPS           -0.000000    -0.000003     0.000000
EPS,EPS,EPS             0.000000    -0.000001     0.000000

Regime 2: a = 2 & const = 1

Endo Name                  C            K            Z
steady state            2.082588    22.150375     1.007058
K{-1}                   0.040564     0.969201     0.000000
@sig                   -0.100434     0.926307    -0.041021
EPS                     0.026867    -0.464994     0.021752
K{-1},K{-1}            -0.000461    -0.000167     0.000000
K{-1},@sig             -0.001160     0.020327     0.000000
K{-1},EPS               0.000233    -0.010399     0.000000
@sig,@sig              -0.023223     0.043449     0.000835
@sig,EPS               -0.000525    -0.009756    -0.000443
EPS,EPS                 0.000187     0.004982     0.000235
K{-1},K{-1},K{-1}       0.000011     0.000005     0.000000
K{-1},K{-1},@sig        0.000007    -0.000003     0.000000
K{-1},K{-1},EPS        -0.000002     0.000000     0.000000
K{-1},@sig,@sig        -0.000188     0.000476     0.000000
K{-1},@sig,EPS         -0.000002    -0.000146     0.000000
K{-1},EPS,EPS           0.000001     0.000075     0.000000
@sig,@sig,@sig         -0.001411     0.002643    -0.000011
@sig,@sig,EPS          -0.000077    -0.000228     0.000006
@sig,EPS,EPS           -0.000002     0.000069    -0.000003
EPS,EPS,EPS             0.000001    -0.000036     0.000002

Footnote 36: The second-order terms of the RISE output should be multiplied by 2
in order to compare them with the output presented in Foerster et al. (2014).
Finally, we can use the same Foerster et al. (2014) model to illustrate, in a third-order
approximation, the effect of a technology shock that is announced 4 periods ahead.
We compare two economies. In the first economy, agents fully anticipate the
shock and therefore internalize its effects already in the current period. In the
second economy, agents become aware of the shock only when it materializes 4
periods from now. As figure 1 illustrates, in the "anticipated" scenario consumption
already increases today while capital falls.
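The timing in this experiment can be mimicked in a stylized, purely forward-looking example (our own Python illustration, not the solution of the RBC model itself): in x_t = a E_t x_{t+1} + e_t, solving forward gives x_t = sum_j a^j E_t e_{t+j}, so a shock announced for period 4 moves x_t on the announcement date, while an unanticipated shock leaves x_t unchanged until it materializes.

```python
import numpy as np

# Forward solution of x_t = a*E_t x_{t+1} + e_t with a single unit shock
# hitting in period `horizon`. The coefficient a = 0.9 is hypothetical.
a, horizon, T = 0.9, 4, 10

# Announced in period 0: agents discount the future shock by a^(horizon - t).
anticipated = np.array([a**(horizon - t) if t <= horizon else 0.0
                        for t in range(T)])
# Unanticipated: no reaction before the shock arrives.
unanticipated = np.array([1.0 if t == horizon else 0.0 for t in range(T)])

assert anticipated[0] == a**horizon      # immediate reaction on announcement
assert unanticipated[:horizon].max() == 0.0
```

The same logic is what makes consumption move on impact in the anticipated scenario of figure 1.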
Figure 1: Third-order perturbation impulse responses to a TFP shock in the
FRWZ2014 rbc model

6 Conclusion

In this paper we have proposed new algorithms for solving switching-parameter
DSGE models. The algorithms developed are useful tools for analyzing economic
data subject to breaks and shifts in a rational expectations environment, where
agents, while forming their expectations, explicitly take into account the probability of a regime shift. In the process we have derived higher-order perturbations of
switching DSGE models in which the probability of switching can be endogenous
and in which anticipated future events can affect the current behavior of economic
agents.
Despite the many benefits of departing from the assumption of constant parameters,
the techniques and ideas of MSDSGE modeling have yet to be adopted by
macroeconomists as widely as constant-parameter DSGE modeling, popularized
through Dynare. Two main reasons could explain this. On the one hand, such
models are substantially more difficult to solve and more computationally intensive
than their constant-parameter counterparts. On the other hand, there are few
flexible and easy-to-use tools that would allow economists unable or unwilling to
write complex code to take full advantage of these techniques and address current
policy questions. By providing both efficient solution algorithms and an
implementation of those algorithms in a toolbox, we hope to have alleviated both
problems.
References
Adjemian, S., H. Bastani, M. Juillard, F. Karame, F. Mihoubi, G. Perendia,
J. Pfeifer, M. Ratto, and S. Villemot (2011). Dynare: Reference manual, version 4. Dynare Working Papers 1, CEPREMAP.
Benner, P. and R. Byers (1998). An exact line search method for solving generalized
continuous-time algebraic Riccati equations. IEEE Transactions on Automatic
Control 43 (1), 101–107.
Bernanke, B. S., M. Gertler, and S. Gilchrist (1999). The financial accelerator
in a quantitative business cycle framework. In J. B. Taylor and M. Woodford (Eds.),
Handbook of Macroeconomics (1 ed.), Volume 1, Chapter 21, pp. 1341–1393.
Elsevier.
Bi, H. and N. Traum (2013). Estimating fiscal limits: The case of Greece. Journal
of Applied Econometrics (forthcoming).
Binning, A. J. (2013a). Solving second and third-order approximations to DSGE
models: A recursive Sylvester equation solution. Working Paper 2013-18,
Norges Bank.
Binning, A. J. (2013b). Third-order approximation of dynamic models without the
use of tensors. Working Paper 2013-13, Norges Bank.
Buchberger, B. (1965). Ein Algorithmus zum Auffinden der Basiselemente des
Restklassenringes nach einem nulldimensionalen Polynomideal. Ph. D. thesis,
University of Innsbruck.
Buchberger, B. (2006, March). An Algorithm for Finding the Basis Elements in
the Residue Class Ring Modulo a Zero Dimensional Polynomial Ideal. Ph. D.
thesis, RISC (Research Institute for Symbolic Computation), Johannes Kepler
University Linz.
Cho, S. (2014, February). Characterizing Markov-switching rational expectation
models. Working paper, Yonsei University.
Cogley, T., T. J. Sargent, and P. Surico (2012, March). The return of the Gibson
paradox. Working paper, New York University.
Costa, O. L. D. V., M. D. Fragoso, and R. P. Marques (2005). Discrete-Time
Markov Jump Linear Systems. Springer.
Davig, T. and E. M. Leeper (2007). Fluctuating macro policies and the fiscal theory. In NBER Macroeconomics Annual 2007, Volume 21, pp. 247–316. National
Bureau of Economic Research.
Davig, T., E. M. Leeper, and T. B. Walker (2011). Inflation and the fiscal limit.
European Economic Review 55 (1), 31–47.
Farmer, R. F., D. F. Waggoner, and T. Zha (2008). Minimal state variable solutions
to Markov switching rational expectations models. Working Paper 2008-23,
Federal Reserve Bank of Atlanta.
Farmer, R. F., D. F. Waggoner, and T. Zha (2011). Minimal state variable solutions
to Markov switching rational expectations models. Journal of Economic
Dynamics and Control 35 (12), 2150–2166.
Foerster, A., J. Rubio-Ramirez, D. Waggoner, and T. Zha (2013). Perturbation
methods for Markov-switching models. Working Paper 2013-1, Federal Reserve
Bank of Atlanta.
Foerster, A., J. Rubio-Ramirez, D. Waggoner, and T. Zha (2014). Perturbation
methods for Markov-switching models. Working Paper 20390, National Bureau
of Economic Research.
Freund, R. W. (1993). A transpose-free quasi-minimal residual algorithm for
non-Hermitian linear systems. SIAM Journal on Scientific Computing 14 (2),
470–482.
Ghorbal, K., A. Sogokon, and A. Platzer (2014). Invariance of conjunctions of polynomial equalities for algebraic differential equations. Lecture Notes in Computer Science 8723, 151–167.
Gupta, V., R. Murray, and B. Hassibi (2003). On the control of jump linear Markov
systems with Markov state estimation. In Proceedings of the 2003 American
Automatic Control Conference, pp. 2893–2898.
Juillard, M. and J. Maih (2010). Estimating DSGE models with observed real-time
expectation data. Technical report, Norges Bank.
Justiniano, A. and G. E. Primiceri (2008). The time-varying volatility of
macroeconomic fluctuations. American Economic Review 98 (3), 604–641.
Kamenik, O. (2005, February). Solving SDGE models: A new algorithm for the
Sylvester equation. Computational Economics 25 (1), 167–187.
Klein, P. (2000). Using the generalized Schur form to solve a multivariate linear
rational expectations model. Journal of Economic Dynamics and Control 24,
1405–1423.
Lanczos, C. (1952). Solution of systems of linear equations by minimized iterations.
Journal of Research of the National Bureau of Standards 49, 33–53.
Levintal, O. (2014). Fifth-order perturbation solution to DSGE models. Working
paper, Interdisciplinary Center Herzliya (IDC), School of Economics.
Leykin, A. (2004). On parallel computation of Gröbner bases. In ICPP Workshops,
pp. 160–164.
Lubik, T. A. and F. Schorfheide (2004, March). Testing for indeterminacy: An
application to U.S. monetary policy. American Economic Review 94 (1), 190–
217.
Maih, J. (2010). Conditional forecasts in DSGE models. Working Paper 2010-07,
Norges Bank.
Mayr, E. W. and A. R. Meyer (1982). The complexity of the word problems
for commutative semigroups and polynomial ideals. Advances in Mathematics 46 (3), 305–329.
Richter, A. W., N. A. Throckmorton, and T. B. Walker (2014, December).
Accuracy, speed and robustness of policy function iteration. Computational
Economics 44 (4), 445–476.
Schmitt-Grohe, S. and M. Uribe (2004). Solving dynamic general equilibrium
models using a second-order approximation to the policy function. Journal of
Economic Dynamics and Control 28, 755–775.
Sims, C. A. and T. Zha (2006). Were there regime switches in U.S. monetary policy?
American Economic Review 96 (1), 54–81.
Smets, F. and R. Wouters (2007). Shocks and frictions in US business cycles: A
Bayesian DSGE approach. American Economic Review 97 (3), 586–606.
Stock, J. H. and M. W. Watson (2003). Has the business cycle changed? evidence
and explanations. FRB Kansas City symposium, Jackson Hole.
Svensson, L. E. O. and N. Williams (2007). Monetary policy with model uncertainty: Distribution forecast targeting. CEPR Discussion Papers 6331,
C.E.P.R.
Svensson, L. E. O. and N. Williams (2009). Optimal monetary policy under
uncertainty in DSGE models: A Markov jump-linear-quadratic approach. In
K. Schmidt-Hebbel, C. E. Walsh, and N. Loayza (Eds.), Monetary Policy under
Uncertainty and Learning, Volume 13 of Central Banking, Analysis, and
Economic Policies, pp. 77–114. Central Bank of Chile.
Tadonki, C. and B. Philippe (1999). Parallel multiplication of a vector by
a Kronecker product of matrices. Parallel and Distributed Computing Practices 2 (4).
Van Loan, C. F. (2000). The ubiquitous Kronecker product. Journal of
Computational and Applied Mathematics 123, 85–100.
A Expressions for various expectations terms

A.1 First order

The definitions of $a_z^0$ and $a_z^1$ appearing in equation (10) are as follows:

$$
a_z^0 \equiv \begin{pmatrix} \lambda_{bf} T_z^{r_{t+1}} h_z^{r_t} \\ T_z^{r_t} \\ m_p \\ m_b \\ m_{\varepsilon,0} \\ \hat{\theta}_{r_{t+1}} m_\sigma \end{pmatrix},
\qquad
a_z^1 \equiv \begin{pmatrix} \lambda_{bf} T_z^{r_{t+1}} \\ 0_{(n_s+2n_p+2n_b+n_f+n_\varepsilon+n_\theta)\times n_z} \end{pmatrix}
$$

We have

$$
E v_z = a_z^0 \qquad (35)
$$
A.2 Second order

$$
h_{zz}^{r_t} = \begin{pmatrix} \lambda_x T_{zz}^{r_t} \\ 0_{(1+(k+1)n_\varepsilon)\times n_z^2} \end{pmatrix} \qquad (36)
$$

The definitions of $a_{zz}^0$ and $a_{zz}^1$ appearing in equation (19) are as follows:

$$
a_{zz}^0 \equiv \begin{pmatrix} \lambda_{bf} T_{z,x}^{r_{t+1}} \lambda_x T_{zz}^{r_t} + \lambda_{bf} T_{zz}^{r_{t+1}} (h_z^{r_t})^{\otimes 2} \\ T_{zz}^{r_t} \\ 0_{(n_x+n_\varepsilon+n_\theta)\times n_z^2} \end{pmatrix},
\qquad
a_{zz}^1 \equiv \begin{pmatrix} \lambda_{bf} T_{zz}^{r_{t+1}} \\ 0 \\ 0 \end{pmatrix}
$$

It follows that

$$
E v_{zz} \equiv a_{zz}^0 + a_{zz}^1 E u^{\otimes 2} \qquad (37)
$$

and

$$
E v_z^{\otimes 2} = \left(a_z^0\right)^{\otimes 2} + \left(a_z^1\right)^{\otimes 2} E u^{\otimes 2} \qquad (38)
$$

The definition of u in (4) implies that (see footnote 37)

$$
E\left(u^{\otimes 2}\right) = \begin{pmatrix} 0_{(n_p+n_b+1+k n_\varepsilon)^2 \times n_z^2} \\ \tilde{I}_{n_\varepsilon} \left(m_\sigma \otimes m_\sigma\right) \end{pmatrix}
$$

Footnote 37: In general, we will have $E\left(u^{\otimes k}\right) = \begin{pmatrix} 0_{(n_p+n_b+1+k n_\varepsilon)^k \times n_z^k} \\ \mathrm{vec}(M_k)\, m_\sigma^{\otimes k} \end{pmatrix}$, where $M_k$ is the tensor of the k-order moments of $\varepsilon_t$. This expression is cheap to compute since $m_\sigma$ is a sparse vector.
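The cheapness claim in footnote 37 can be checked on a small example. The sketch below (Python with hypothetical dimensions, illustrating the underlying Kronecker identity rather than the RISE code) computes the second Kronecker moment of w = Mε with ε ∼ N(0, I): the brute-force route (M ⊗ M)vec(I) agrees with the cheap route vec(MM'), which never forms a Kronecker product of the sparse loading.

```python
import numpy as np

# Hypothetical dimensions: 3 shocks loading on a 5-dimensional block,
# through a sparse loading matrix M (playing the role of m_sigma above).
n, n_eps = 5, 3
M = np.zeros((n, n_eps))
M[0, 0], M[2, 1], M[4, 2] = 0.1, 0.2, 0.3   # one nonzero per column

# For w = M eps with eps ~ N(0, I):  E(w x w) = (M x M) vec(I).
E_ee = np.eye(n_eps).reshape(-1, order="F")   # E(eps x eps) = vec(I)
dense = np.kron(M, M) @ E_ee                  # brute-force Kronecker route

# Exploiting the structure instead: (M x M) vec(I) = vec(M M').
cheap = (M @ M.T).reshape(-1, order="F")

assert np.allclose(dense, cheap)
```

With sparse loadings, the second route costs a small matrix product instead of a full Kronecker product, which is the point of the footnote.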
A.3 Third order

The definitions of $a_{zzz}^0$ and $a_{zzz}^1$ appearing in equation (21) are as follows:

$$
a_{zzz}^0 \equiv \begin{pmatrix} \lambda_{bf} T_{zzz}^{r_{t+1}} \left(h_z^{r_t}\right)^{\otimes 3} + \omega_1\!\left(\lambda_{bf} T_{zz}^{r_{t+1}} \left(h_z^{r_t} \otimes h_{zz}^{r_t}\right)\right) + \lambda_{bf} T_z^{r_{t+1}} h_{zzz}^{r_t} \\ T_{zzz}^{r_t} \\ 0_{(n_x+n_\varepsilon+n_\theta)\times n_z^3} \end{pmatrix} \qquad (39)
$$

$$
a_{zzz}^1 \equiv \begin{pmatrix} \lambda_{bf} T_{zzz}^{r_{t+1}} \\ 0_{(n_T+n_x+n_\varepsilon+n_\theta)\times n_z^3} \end{pmatrix} \qquad (40)
$$

where

$$
h_{zzz}^{r_t} = \begin{pmatrix} \lambda_x T_{zzz}^{r_t} \\ 0_{(1+(k+1)n_\varepsilon)\times n_z^3} \end{pmatrix}
$$

and

$$
\frac{\partial^3 T^{r_{t+1}}\left(h^{r_t}(z)+uz\right)}{\partial z^3} = T_{zzz}^{r_{t+1}} \left(h_z^{r_t}\right)^{\otimes 3} + \omega_1\!\left(T_{zz}^{r_{t+1}} \left(h_z^{r_t} \otimes h_{zz}^{r_t}\right)\right) + T_{z,x}^{r_{t+1}} \lambda_x T_{zzz}^{r_t} + T_{zzz}^{r_{t+1}} u^{\otimes 3} + \omega_1\!\left(T_{zz}^{r_{t+1}} \left(u \otimes h_{zz}^{r_t}\right)\right) + T_{zzz}^{r_{t+1}} \left[ P\!\left(\left(h_z^{r_t}\right)^{\otimes 2} \otimes u\right) + P\!\left(h_z^{r_t} \otimes u^{\otimes 2}\right) \right]
$$

We have

$$
E v_{zzz} = a_{zzz}^0 + a_{zzz}^1 P\!\left(h_z^{r_t} \otimes E u^{\otimes 2}\right) + a_{zzz}^1 E u^{\otimes 3} \qquad (41)
$$

We also need $E\left(v_z^{\otimes 3}\right)$ and $E\left(v_z \otimes v_{zz}\right)$. Using (19) and (10), those quantities are computed as

$$
E v_z^{\otimes 3} = \left(a_z^0\right)^{\otimes 3} + E\!\left[a_z^0 \otimes \left(a_z^1 u\right)^{\otimes 2}\right] + E\!\left[\left(a_z^1 u\right)^{\otimes 2} \otimes a_z^0\right] + E\!\left[\left(a_z^1 u\right) \otimes a_z^0 \otimes \left(a_z^1 u\right)\right] + E\!\left(a_z^1 u\right)^{\otimes 3} \qquad (42)
$$

$$
E\left(v_z \otimes v_{zz}\right) = a_z^0 \otimes a_{zz}^0 + \left(a_z^1 \otimes a_{zz}^1\right)\left( E u^{\otimes 2} \otimes h_z^{r_t} + E\!\left(u \otimes h_z^{r_t} \otimes u\right) \right) + a_z^0 \otimes a_{zz}^1 E u^{\otimes 2} + \left(a_z^1 \otimes a_{zz}^1\right) E u^{\otimes 3} \qquad (43)
$$
Proof of local convergence of functional iteration
Before proving the theorem we first present a perturbation lemma that will facilitate
the subsequent derivations.
Lemma 3 Perturbation Lemma: Let Ξ be a non-singular matrix. If Ω is such that
kΞ−1 k
kΞ−1 k kΩk < 1, then (Ξ + Ω)−1 ≤
−1
1−kΞ
42
kkΩk
Proof. We have
−1
(Ξ + Ω)−1 = h [Ξ (I + Ξ−1 Ω)]i
P∞
i
−1
−1
=
i=0 (−Ξ Ω) Ξ
which implies that
P
i
∞
(Ξ + Ω)−1 ≤ i=0 (−Ξ−1 Ω) kΞ−1 k
P∞
i
≤
k(−Ξ−1 Ω)k kΞ−1 k
i=0
P∞
i
−1
−1
=
i=0 k(Ξ Ω)k kΞ k
1
−1
=
kΞ k
1−k(Ξ−1 Ω)k
−1
kΞ k
≤
1−kΞ−1 kkΩk
Now we can prove the theorem. Define
Ξrt ≡
A0rt
h
X
+
∗
A+
rt ,rt+1 Trt+1
rt+1 =1
and
Ω(k−1)
rt
h
X
≡
(k−1)
∗
A+
rt ,rt+1 Trt+1 − Trt+1
rt+1 =1
which implies
P
h
(k−1) (k−1)
(∗) +
Ωrt = rt+1 =1 Art ,rt+1 Trt+1 − Trt+1 P
(k−1)
(∗) ≤ a+ hrt+1 =1 prt ,rt+1 Trt+1 − Trt+1 (44)
Also, we can write
(k)
Trt
−
(∗)
Trt
=
Ξ−1
rt
−
A0rt
+
(k−1)
+
rt+1 =1 Art ,rt+1 Trt+1
Ph
−1 A−
rt
−1
P
(k−1)
(k−1) (∗)
= − A0rt + hrt+1 =1 A+
Ωrt Trt
T
r
rt ,rt+1 t+1
−1 h
i
(k−1)
(k−1)
(∗)
− Ωrt
+ Ξrt
Ωrt
Trt
43
so that
i −1 h
(k−1)
(∗) (k)
Ph
(k−1)
(∗)
(∗) +
+ Ξrt
Trt Trt − Trt ≤ Ωrt
rt+1 =1 Art ,rt+1 Trt+1 − Trt+1
i
−1 hP
(∗) (k−1)
(k−1)
(∗)
h
+
Ω
+
Ξ
A
T
−
T
= Trt r
r
r
r
t
t+1
t+1
t
rt+1 =1 rt ,rt+1
−1 P
(∗) (k−1)
(k−1)
(∗) h
+
Ω
+
Ξ
T
−
T
≤
a
Trt r
r
r
rt
t
t+1
t+1 rt+1 =1
−1 P
(k−1)
(k−1)
(∗) h
≤
θa+ Ω
+
Ξ
T
−
T
rt+1
rt+1 rt
rt
rt+1 =1
(45)
(0)
∗
We can now use induction in the rest of the proof. Suppose Trt − Trt ≤ δ,
−1 (0) (0) +
rt = 1, 2, ..., h. Then (44) implies Ωrt ≤ a δ. Hence Ξrt Ωrt ≤ τ a+ δ. Note
+
that the requirement that τ a+ (θ + δ) < 1 automatically implies
that τ a δ < 1.
−1 (0)
≤
Therefore we can use the perturbation lemma to establish that Ω
+
Ξ
r
r
t
t
−1
kΞrt k (1)
(∗) τ
(0) ≤ 1−τ a+ δ . Plugging this result into (45) implies that Trt − Trt ≤
1−kΞ−1
Ω
rt
rt k
Ph
(0)
(∗) θa+ τ
θa+ τ
T
−
T
rt+1
rt+1 , essentially defining ϕ ≡ 1−τ a+ δ and saying that the
rt+1 =1
1−τ a+ δ
proposition holds for k = 1.
that the proposition holds
for some arbitrary order k, that is
Now suppose
Ph
(k−1)
(k)
∗ ∗
Trt − Trt ≤ ϕ rt+1 =1 Trt+1 − Trt+1 . We now have to show that it also holds
(k−1)
for order k + 1. Since by assumption Trt+1 − Tr∗t+1 ≤ δ, for rt = 1, 2, ..., h, we
Ph
(k)
(k−1)
(k) ∗
∗ have Trt − Trt ≤ ϕ rt+1 =1 Trt+1 − Trt+1 ≤ ϕδ ≤ δ. (44) implies Ωrt ≤
−1 (k)
τ
+
≤
a δ. Then by the perturbation lemma, if follows that Ωrt + Ξrt
.
1−τ a+ δ
This implies by (45) that
−1 Ph
(k+1)
(k)
(∗) (k)
(∗) +
− Trt ≤ θa Ωrt + Ξrt
T
−
T
Trt
r
r
t+1
t+1
rt+1 =1
Ph
(k)
(∗) θa+ τ
≤
rt+1 =1 Trt+1 − Trt+1 1−τ a+ δ
. And so, the proposition also holds at order k + 1.
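The iteration whose convergence is established above can be sketched on a toy two-regime system (our own Python illustration with hypothetical coefficients chosen small enough for the contraction condition to plausibly hold; this is not the RISE implementation):

```python
import numpy as np

# Toy two-regime system with hypothetical coefficient matrices.
h, n = 2, 2
A0 = [np.eye(n), 1.5 * np.eye(n)]
p = np.array([[0.9, 0.1], [0.3, 0.7]])
B = 0.2 * np.array([[0.5, 0.1], [0.0, 0.4]])
Aplus = [[p[r, s] * B for s in range(h)] for r in range(h)]
Aminus = [0.3 * np.eye(n), 0.2 * np.eye(n)]

# Functional iteration: T_r <- -(A0_r + sum_s Aplus[r][s] T_s)^(-1) Aminus_r
T = [np.zeros((n, n)) for _ in range(h)]
for _ in range(200):
    T = [-np.linalg.solve(A0[r] + sum(Aplus[r][s] @ T[s] for s in range(h)),
                          Aminus[r]) for r in range(h)]

# The limit solves (A0_r + sum_s Aplus[r][s] T_s) T_r + Aminus_r = 0.
for r in range(h):
    resid = (A0[r] + sum(Aplus[r][s] @ T[s] for s in range(h))) @ T[r] + Aminus[r]
    assert np.linalg.norm(resid) < 1e-12
```

With these small coefficients the map is a contraction, and the geometric convergence rate is visible if one prints the successive changes in T.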
C Models

C.1 Cho 2014: Markov switching elasticity of intertemporal substitution

C.1.1 Model equations
$$
\pi_t - \kappa y_t = \beta \pi_{t+1} + z_{S,t}
$$

$$
y_t + \frac{1}{\sigma(s_t)} i_t = \frac{1}{\sigma(s_t)} \pi_{t+1} + \frac{\sigma(s_{t+1})}{\sigma(s_t)} y_{t+1} + \frac{1}{\sigma(s_t)} z_{D,t}
$$

$$
-(1-\rho)\phi_\pi \pi_t + i_t = \rho i_{t-1} + z_{MP,t}
$$

$$
z_{S,t} = \rho_s z_{S,t-1} + \varepsilon^{S}_{t}, \qquad \varepsilon^{S}_{t} \sim N(0,1)
$$

$$
z_{D,t} = \rho_d z_{D,t-1} + \varepsilon^{D}_{t}, \qquad \varepsilon^{D}_{t} \sim N(0,1)
$$

$$
z_{MP,t} = \rho_{mp} z_{MP,t-1} + \varepsilon^{MP}_{t}, \qquad \varepsilon^{MP}_{t} \sim N(0,1)
$$
C.1.2 Parameterization

ρ      φπ    σ(1)   σ(2)   κ       β      ρs     ρd     ρmp   p1,1   p2,2
0.95   1.5   1      5      0.132   0.99   0.95   0.95   0     0.95   0.875

C.2 Cho 2014: Regime switching monetary policy
C.2.1 Model equations

$$
\pi_t - \kappa y_t = \beta \pi_{t+1} + z_{S,t}
$$

$$
y_t + \frac{1}{\sigma} i_t = \frac{1}{\sigma} \pi_{t+1} + y_{t+1} + z_{D,t}
$$

$$
-(1-\rho)\phi_\pi \pi_t + i_t = \rho i_{t-1} + z_{MP,t}
$$

$$
z_{S,t} = \rho_s z_{S,t-1} + \varepsilon^{S}_{t}, \qquad \varepsilon^{S}_{t} \sim N(0,1)
$$

$$
z_{D,t} = \rho_d z_{D,t-1} + \varepsilon^{D}_{t}, \qquad \varepsilon^{D}_{t} \sim N(0,1)
$$

$$
z_{MP,t} = \rho_{mp} z_{MP,t-1} + \varepsilon^{MP}_{t}, \qquad \varepsilon^{MP}_{t} \sim N(0,1)
$$
C.2.2 Parameterization

ρ      φπ(1)   φπ(2)   σ    κ       β      ρs   ρd   ρmp   p1,1   p2,2
0.95   0.9     1.5     1    0.132   0.99   0    0    0     0.85   0.95

C.3 Farmer, Waggoner and Zha (2011)

C.3.1 Model equations
$$
\phi \pi_t = \pi_{t+1} + \delta \pi_{t-1} + \beta r_t
$$

$$
r_t = \rho r_{t-1} + \varepsilon_t, \qquad \varepsilon_t \sim N(0,1)
$$
C.3.2 Parameterization

                      δ(1)   δ(2)   β(1)   β(2)   ρ(1)   ρ(2)   φ(1)   φ(2)   p1,1   p2,2
parameterization 1     0      0      1      1      0.9    0.9    0.5    0.8    0.8    0.9
parameterization 2    -0.7    0.4    1      1      0      0      0.5    0.8    0      0.64
parameterization 3    -0.7   -0.2    1      1      0      0      0.2    0.4    0.9    0.8
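The solution counts reported in Table 2 can be replicated on this small model. Substituting the MSV guess π_t = T(s_t)π_{t−1} + k(s_t)r_t into the model under parameterization 3 (where ρ = 0) gives, regime by regime, φ_i T_i = (Σ_j p_ij T_j) T_i + δ_i. The sketch below (our own illustrative Python, not RISE's Newton solver) solves this pair of coupled quadratics by Newton's method from a grid of starting values, deduplicates the limits, and checks mean-square stability via the spectral radius of diag(T_i²)P'. It recovers four distinct stable solutions, matching the fwz2011 3 row of Table 2.

```python
import numpy as np

# FWZ (2011), parameterization 3: the regime-specific slopes T_i of an MSV
# solution must satisfy  phi_i T_i = (sum_j p_ij T_j) T_i + delta_i.
phi = np.array([0.2, 0.4])
delta = np.array([-0.7, -0.2])
P = np.array([[0.9, 0.1], [0.2, 0.8]])

F = lambda T: phi * T - (P @ T) * T - delta           # residuals
J = lambda T: np.diag(phi - P @ T) - np.diag(T) @ P   # analytic Jacobian

sols = []
for t1 in np.linspace(-2.0, 2.0, 9):                  # grid of starting values
    for t2 in np.linspace(-2.0, 2.0, 9):
        T = np.array([t1, t2])
        for _ in range(60):                           # plain Newton steps
            try:
                T = T - np.linalg.solve(J(T), F(T))
            except np.linalg.LinAlgError:
                break
        if np.linalg.norm(F(T)) < 1e-10 and \
           all(np.linalg.norm(T - s) > 1e-6 for s in sols):
            sols.append(T)

# Mean-square stability: spectral radius of diag(T_i^2) P' below one.
stable = [T for T in sols
          if max(abs(np.linalg.eigvals(np.diag(T**2) @ P.T))) < 1]
assert len(sols) == 4 and len(stable) == 4
```

The same multi-start strategy, scaled up, is what produces the counts in Table 2 for the larger models.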
C.4 Foerster, Rubio-Ramirez, Waggoner and Zha (2014): RBC model

C.4.1 Model equations

$$
c_t^{\upsilon-1} = \beta E_t \left[ z_{t+1}^{\upsilon-1} c_{t+1}^{\upsilon-1} \left( \alpha z_{t+1}^{1-\alpha} k_t^{\alpha-1} + 1 - \delta \right) \right]
$$

$$
c_t + z_t k_t = z_t^{1-\alpha} k_{t-1}^{\alpha} + (1-\delta) k_{t-1}
$$

$$
\log(z_t) = (1-\rho(s_t))\mu(s_t) + \rho(s_t)\log(z_{t-1}) + \sigma(s_t)\varepsilon_t, \qquad \varepsilon_t \sim N(0,1)
$$

C.4.2 Steady state

$$
z_t = \exp(\mu)
$$

$$
k_t = \left( \frac{\frac{1}{\beta z_t^{\upsilon-1}} - 1 + \delta}{\alpha z_t^{1-\alpha}} \right)^{\frac{1}{\alpha-1}}
$$

$$
c_t = z_t^{1-\alpha} k_t^{\alpha} + (1-\delta) k_t - z_t k_t
$$
C.4.3 Parameterization

α      β        υ    δ      μ(1)     μ(2)      ρ(1)   ρ(2)   σ(1)     σ(2)     p1,1   p2,2
0.33   0.9976   -1   0.025  0.0274   -0.0337   0.1    0      0.0072   0.0216   0.75   0.5
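A consistency check on this parameterization: the unique steady state printed in the section 5.5 solution (C = 2.082588, K = 22.150375, Z = 1.007058) can be reproduced by evaluating the C.4.2 formulas at the ergodic mean of μ(s). The Python sketch below is our own check, not RISE code, and the "ergodic mean" step is our reading of how the unique steady state is constructed; the numbers nevertheless match the printed output closely.

```python
import numpy as np

# Parameters from C.4.3
alpha, beta, upsilon, delta = 0.33, 0.9976, -1.0, 0.025
mu = np.array([0.0274, -0.0337])
p11, p22 = 0.75, 0.5

# Ergodic probabilities of the two-state chain; z at the ergodic mean of mu
q1 = (1 - p22) / (2 - p11 - p22)
z = np.exp(q1 * mu[0] + (1 - q1) * mu[1])

# Steady-state formulas of C.4.2
k = ((1 / (beta * z**(upsilon - 1)) - 1 + delta)
     / (alpha * z**(1 - alpha)))**(1 / (alpha - 1))
c = z**(1 - alpha) * k**alpha + (1 - delta) * k - z * k

assert abs(z - 1.007058) < 1e-5   # matches the printed steady state of Z
assert abs(k - 22.150375) < 1e-3  # ... of K
assert abs(c - 2.082588) < 1e-3   # ... of C
```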
C.5 Foerster, Rubio-Ramirez, Waggoner and Zha (2013): NK model

C.5.1 Model equations
$$
1 - \beta E_t \left[ \frac{\left(1-\frac{\kappa}{2}(\pi_t-1)^2\right) Y_t}{\left(1-\frac{\kappa}{2}(\pi_{t+1}-1)^2\right) Y_{t+1} \exp(\mu_{t+1})} \frac{R_t}{\pi_{t+1}} \right] = 0
$$

$$
(1-\eta) + \eta \left(1-\frac{\kappa}{2}(\pi_t-1)^2\right) Y_t + \beta\kappa E_t \left[ \frac{1-\frac{\kappa}{2}(\pi_t-1)^2}{1-\frac{\kappa}{2}(\pi_{t+1}-1)^2} (\pi_{t+1}-1)\pi_{t+1} \right] - \kappa(\pi_t-1)\pi_t = 0
$$

$$
\left(\frac{R_{t-1}}{R_{ss}}\right)^{\rho} \pi_t^{(1-\rho)\psi_t} \exp(\sigma\varepsilon_t) - \frac{R_t}{R_{ss}} = 0, \qquad \varepsilon_t \sim N(0,1)
$$

C.5.2 Steady state solution
πt = 1,   yt = (η − 1)/η,   Rt = exp(μ) πt / β
C.5.3 Parameterization

β        κ     η    ρ     σr       p1,1   p2,2   μ(1)     μ(2)     ψ(1)   ψ(2)
0.9976   161   10   0.8   0.0025   0.9    0.9    0.0075   0.0025   3.1    0.9

C.6 Foerster, Rubio-Ramirez, Waggoner and Zha (2013): NK model with habits

C.6.1 Model equations
$$
\frac{1}{C_t - \varphi \exp(-\bar{\mu}) C_{t-1}} - \beta E_t \left[ \frac{\varphi}{X_{t+1}\exp(\mu_{t+1}) - \varphi C_t} \right] - \lambda_t = 0
$$

$$
\beta E_t \left[ \frac{\lambda_{t+1} R_{ss} \pi_t^{\psi_t} \exp(\sigma\varepsilon_t)}{\exp(\mu_{t+1}) \pi_{t+1}} \right] - \lambda_t = 0, \qquad \varepsilon_t \sim N(0,1)
$$

$$
(1-\eta) + \frac{\eta}{\lambda_t} - \kappa(\pi_t-1)\pi_t + \beta\kappa E_t \left[ \frac{\lambda_{t+1} X_{t+1} \left(1-\frac{\kappa}{2}(\pi_t-1)^2\right)}{\lambda_t C_t \left(1-\frac{\kappa}{2}(\pi_{t+1}-1)^2\right)} (\pi_{t+1}-1)\pi_{t+1} \right] = 0
$$

$$
X_t - C_t = 0
$$
C.6.2 Steady state

πt = 1,   λt = η/(η − 1),   Xt = ((exp(μ) − βφ)/(exp(μ) − φ)) · ((η − 1)/η),   Ct = Xt

C.6.3 Parameterization
β        κ     η    ρr    σ        p1,1   p2,2   μ(1)     μ(2)     ψ(1)   ψ(2)   Rss        φ
0.9976   161   10   0.8   0.0025   0.9    0.9    0.0075   0.0025   3.1    0.9    exp(μ)/β   0.7

C.7 The Smets-Wouters model with switching monetary policy

We consider the medium-scale DSGE model of Smets and Wouters (2007). We turn
it into a Markov switching model by introducing a Markov chain with two regimes,
controlling the behavior of the parameters entering the monetary policy reaction
function. In the good regime, the parameters assume the mode values estimated by
Smets and Wouters (2007) for the constant-parameter case. In the bad regime the
interest rate is unresponsive to inflation and the output gap. However, the interest rate is
allowed to respond to a persistent exogenous monetary policy shock and is therefore
exogenous. In a constant-parameter model, this bad regime would be unstable. In
a switching environment, however, to the extent that agents assign a sufficiently
high probability to the system returning to the good regime, the system may be
stable in the MSS sense. The table below displays the parameters: rπ is the reaction
to inflation; ρ, the smoothing parameter; ry, the reaction to the output gap; r∆y, the
reaction to the change in the output gap; pgood,good is the probability of remaining in the
good regime; pbad,bad is the probability of remaining in the bad regime.
              rπ       ρ        ry       r∆y      pgood,good   pbad,bad
Good regime   1.4880   0.8762   0.0593   0.2347   0.9          NA
Bad regime    0.0000   0.0000   0.0000   0.0000   NA           0.3
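The stability logic invoked above can be checked in a stripped-down scalar example (our own illustration with hypothetical numbers, not the Smets-Wouters computation itself): a law of motion x_t = T(s_t)x_{t−1} is stable in the mean-square sense if and only if the spectral radius of diag(T_i²)P' is below one (see Costa et al., 2005), and this can hold even when one regime is explosive in isolation.

```python
import numpy as np

# Scalar two-regime law of motion x_t = T(s_t) x_{t-1}.
# Hypothetical values: the bad regime is explosive on its own (|T_bad| > 1)
# but is exited with probability 0.7 each period.
T = np.array([0.95, 1.10])                 # T_good, T_bad
P = np.array([[0.9, 0.1],                  # p(good->good), p(good->bad)
              [0.7, 0.3]])                 # p(bad->good),  p(bad->bad)

# Second moments m_i(t) = E[x_t^2 1{s_t=i}] evolve as m(t) = M m(t-1)
# with M = diag(T_i^2) P'.  MSS stability <=> spectral radius of M below 1.
M = np.diag(T**2) @ P.T
rho = max(abs(np.linalg.eigvals(M)))

assert abs(T[1]) > 1          # the bad regime is explosive in isolation
assert rho < 1                # ...yet the switching system is MSS-stable
```

This is the mechanism that keeps the switching Smets-Wouters model stable despite the exogenous-interest-rate bad regime.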