Efficient Perturbation Methods for Solving Switching DSGE Models

Junior Maih*
Norges Bank
[email protected]

December 15, 2014

Abstract

In this paper we present a framework for solving switching nonlinear DSGE models. In our approach, the probability of switching can be endogenous and agents may react to anticipated events. The solution algorithms derived are suitable for solving large systems. They use a perturbation strategy which, unlike Foerster et al. (2014), does not rely on the partitioning of the switching parameters and is therefore more efficient. The algorithms presented are all implemented in RISE, an object-oriented toolbox that is flexible and can easily integrate alternative solution methods. We apply our algorithms to, and can replicate, various examples found in the literature. Among those examples is a switching RBC model for which we present a third-order perturbation solution.

JEL: C6, E3, G1.

* First version: August 1, 2011. The views expressed herein are mine and do not necessarily represent those of Norges Bank. Some of the ideas in this paper have often been presented under the title "Rationality in Switching Environments". I would like to thank Dan Waggoner, Tao Zha, Hilde Bjørnland, and seminar participants at the Atlanta Fed (Sept. 2011), the IMF, Norges Bank, the JRC, the National Bank of Belgium, Dynare conferences (Zurich, Shanghai), Venastul, the Board of Governors of the Federal Reserve System and the 87th WEAI conference in San Francisco.

1 Introduction

In an environment in which economic structures break, variances change, distributions shift, conventional policies weaken and past events (e.g. crises) tend to recur, the Markov switching DSGE (MSDSGE) model appears as the natural framework for analyzing the dynamics of macroeconomic variables.[1][2] MSDSGE models are also important because many policy questions of interest seem to be best addressed in a framework of changing parameters.[3]
Not surprisingly then, besides the ever-growing empirical literature using MSDSGE models, many efforts have been invested in solving those models. In that respect, the literature has considered three main angles of attack.

One strand of the literature approximates the solution of those models using "global" methods. Examples include Davig et al. (2011), Bi and Traum (2013) and Richter et al. (2014). Just as in constant-parameter models, global approximation methods in switching-parameter models face the curse of dimensionality, reliance on a pre-specified set of grid points typically constructed around one steady state although the model may have many, etc. The curse of dimensionality in particular implies that the number of state variables has to be as small as possible, and even solving small models involves substantial computational costs.

A second group of techniques applies Markov switching to the parameters of linear or linearized DSGE models. Papers in this class include Farmer et al. (2011), Cho (2014) and Svensson and Williams (2007), to name a few. One advantage of this approach is that one can handle substantially larger problems than the ones solved using global methods. Insofar as the starting point is a linear model, all one has to

[1] Changes in shock variances have been documented by Stock and Watson (2003), Sims and Zha (2006) and Justiniano and Primiceri (2008), while breaks in structural parameters have been advocated by Bernanke et al. (1999), Lubik and Schorfheide (2004) and Davig and Leeper (2007). Other papers have documented changes in both variances and structural parameters; examples include Smets and Wouters (2007), Svensson and Williams (2007, 2009) and Cogley et al. (2012).
[2] In such a context, constant-parameter models could be misleading in their predictions and their implications for policymaking.
[3] Some of those questions are:
• What actions should we undertake today given the non-zero likelihood of a bad state occurring in the future?
• What can we expect of the dynamics of the macro-variables we care about when policy is constrained?
• How is the economy stabilized when policy is constrained?

worry about is how to compute and characterize the solution of the model. If the original model is nonlinear, however, a first-order approximation may not be sufficient to capture the nonlinear dynamics implied by the true policy functions of the model. The nonlinearity of the original model also implies that one cannot assume away the switches in the parameters for the sake of linearization and then re-apply them to the model parameters once the linearized form is obtained. Consistently linearizing the model while taking into account the switching parameters calls for a different strategy.

Finally, the third group of techniques attempts to circumvent the problems posed by the first two groups. More specifically, this literature embeds switching in perturbation solutions whose accuracy can be improved with higher and higher orders.[4] This is the approach followed by Foerster et al. (2014) and in this paper.

For many years, we have been developing, in the RISE toolbox, algorithms for solving and estimating models with switching parameters, including DSGEs, VARs, SVARs, and optimal policy models of commitment, discretion and loose commitment.[5] In the present paper, however, we focus on describing the theory behind the routines implementing the switching-parameter DSGE in RISE.[6] Our approach is more general than the ones discussed earlier or found in the literature. In contrast to Foerster et al. (2014) and the papers cited above, in our derivation of higher-order perturbations we allow for endogenous transition probabilities. We also allow for anticipated events following Maih (2010) and Juillard and Maih (2010).
This feature is useful as it offers an alternative to the news-shocks strategy that has often been used to analyze the effects of forward guidance.[7][8] The introduction of an arbitrary number of anticipated shocks does come at a cost though: the number of cross terms to keep track of increases rapidly with the order of approximation, and computing those terms beyond the first order becomes intractable. This problem is circumvented by exploiting a simple trick by Levintal (2014), who shows a way of computing higher-order approximations by keeping all state variables in a single block.[9]

But before computing higher-order perturbations, two problems, relative to the choice of the approximation point and the solving of a quadratic matrix equation, have to be addressed. With respect to the choice of the approximation point, Foerster et al. (2014) propose a "partition perturbation" strategy in which the switching parameters are separated into two groups: one group comprises the switching parameters that affect the (unique) steady state, and the other group collects the switching parameters that do not affect the steady state. Here also, the approach implemented in RISE is more general, more flexible and simpler: it does not require any partitioning of the switching parameters and is therefore more efficient.

[4] Model accuracy is often measured by the Euler approximation errors. It is important to note that high accuracy is not synonymous with the computed policy functions being close to the true ones.
[5] RISE is a MATLAB-based object-oriented toolbox. The flexibility and modularity of the toolbox are such that any user with little exposure to RISE is able to apply his own solution method without having to change anything in the core codes. The toolbox is available, free of charge, at https://github.com/jmaih/RISE_toolbox.
[6] We plan to describe algorithms for other modules of RISE in subsequent papers.
Moreover, it allows for the possibility of multiple steady states and delivers the results of

[7] There are substantial differences between what this paper calls "anticipated events" or "anticipated shocks" and "news shocks". First, solving the model with anticipated shocks does not require a modification of the equations of the original system, in contrast to "news shocks", which are typically implemented by augmenting the law of motion of a shock process with additional shocks. Secondly, in the anticipated-shocks approach, future events are discounted, while in the news-shocks approach the impact of a news shock does not depend on the horizon at which the news occurs. Thirdly, an anticipated shock is genuinely a particular structural shock in the original system, while a news shock is a different iid shock with no other interpretation than "a news" and unrelated to any structural shock in the system. Because it is unrelated, it has its own distribution, independent of other parts of the system. Fourthly, the estimation of models with news shocks requires additional variables to be declared as observables and to enter the measurement equation. The estimation procedure tries to fit the future information in the same way it fits the other observable variables. This feature makes Bayesian model comparison infeasible, since the comparison of two models requires that they have the same observable variables. In contrast, in the estimation of models with anticipated shocks, the anticipated information, which may not materialize, is separated from the actual data. Model comparison remains possible since the anticipated information never enters the measurement equations. Finally, in the anticipated-shocks approach the policy functions are explicitly expressed in terms of leads of future shocks, as opposed to lags in the news-shocks approach.
[8] In particular, it makes it possible to analyze the effects of hard conditions as well as soft conditions on the future information. The discounting of future events also depends on the uncertainty around those events.
[9] Before the new implementation, the higher-order perturbation routines of RISE separated the state variables into three blocks: endogenous variables, the perturbation parameter and the exogenous shocks. Cross products of those had to be calculated and stored separately.

the "partition perturbation" of Foerster et al. (2014) as a special case.

When it comes to solving the system of quadratic matrix equations implied by the first-order perturbation, Foerster et al. (2014)'s proposal is to use the theory of Gröbner bases (see Buchberger (1965, 2006)) to find all the solutions of the system and then apply the engineering concept of mean square stability to each of the solutions as a way to check whether the Markov switching system admits a single stable solution. While recognizing the benefits of a procedure that can generate all possible solutions of a nonlinear system, and while we even plan to integrate a similar procedure in RISE, we argue that such an approach may not be practical or suitable for systems of the size that we have been accustomed to in constant-parameter DSGE models.[10] Furthermore, the approach has two major limitations: (1) stability of the first-order approximation does not imply stability of higher-order approximations, even in constant-parameter DSGE models; (2) there is no stability concept for switching models with endogenous transition probabilities. Because we are ultimately interested in estimating those models in order to make them really useful for policymaking, we take a more modest route: we derive efficient functional-iteration and Newton algorithms that are suitable for solving relatively large systems.
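For a candidate first-order solution with exogenous switching, the mean-square-stability check mentioned above is straightforward to implement. The sketch below (illustrative, not RISE code; the function name is ours) applies the standard condition from the Markov-jump-linear-systems literature: $x_{t+1} = A_{r_t} x_t$ is mean square stable when the spectral radius of $(P' \otimes I_{n^2})\,\mathrm{diag}(A_1 \otimes A_1, \ldots, A_h \otimes A_h)$ is below one.

```python
import numpy as np

def is_mean_square_stable(A_list, P):
    """Check mean square stability of x_{t+1} = A_{r_t} x_t.

    A_list : list of h regime matrices, each n x n.
    P      : h x h transition matrix, P[i, j] = Prob(r_{t+1}=j | r_t=i).
    Condition: spectral radius of
    (P' kron I_{n^2}) @ blkdiag(A_1 kron A_1, ..., A_h kron A_h) < 1.
    """
    h = len(A_list)
    n = A_list[0].shape[0]
    m = n * n
    # block-diagonal matrix of second-moment operators A_i kron A_i
    D = np.zeros((h * m, h * m))
    for i, A in enumerate(A_list):
        D[i * m:(i + 1) * m, i * m:(i + 1) * m] = np.kron(A, A)
    T = np.kron(P.T, np.eye(m)) @ D
    return max(abs(np.linalg.eigvals(T))) < 1
```

Note that this check only covers exogenous, constant transition probabilities; as argued above, no comparable stability concept exists once the probabilities are endogenous.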
As an example, we have successfully solved a second-order perturbation of a model of 272 equations using our algorithms. We further demonstrate the efficiency and usefulness of our solution methods by applying them to various examples found in the literature. Our algorithms easily replicate the results found by other authors. Among the examples we consider is a switching model by Foerster et al. (2014) for which we present a third-order perturbation solution.

The rest of the paper proceeds as follows. Section 2 introduces the notation we use alongside the generic switching model that we aim to solve. Section 3 then derives the higher-order approximations of the model; at the zero-th order, we present various flexible strategies for choosing the approximation point. Section 4 takes on the solving of the quadratic matrix polynomial arising in the first-order approximation. Section 5 evaluates the performance of the proposed algorithms and section 6 concludes.

[10] Gröbner bases have problems of their own and we will return to some of them later in the text.

2 The Markov Switching Rational Expectations Model

2.1 The economic environment

We are interested in characterizing an environment in which parameters switch in a model that is potentially already nonlinear even in the absence of switching parameters. In that environment we would like to allow the transitions controlling parameter switches to be endogenous, and not just exogenous as customarily found in the literature. Finally, recognizing that at the time of making decisions agents may have information about future events, it is desirable that in our framework future events (forward guidance on the behavior of policy is an example of such a possibility), which may or may not materialize in the end, influence the current behavior of private agents.
2.2 Dating and notation conventions

The dating convention used in this paper is different from the widely used convention of Schmitt-Grohe and Uribe (2004), in which the dating of the variables refers to the beginning of the period. Instead we use the (also widely used) dating convention in which the dating of the variables refers to the end of the period. In that way, the dating determines the period in which the variable of interest is known. This is the type of notation used, for instance, in Adjemian et al. (2011).

Some solution methods for constant-parameter DSGE models (e.g. Klein (2000), Gensys, Schmitt-Grohe and Uribe (2004)) stack variables of different periods. This type of notation has also been used in switching-parameter DSGE models by Farmer et al. (2011) and Foerster et al. (2013). This notation is not appropriate for the type of problems this paper aims to solve. In addition to forcing the creation of auxiliary variables, and thereby increasing the size of the system, it also makes it cumbersome to compute expectations of variables in our context, since future variables could belong to a state that is different from the current one. Clearly, doing so may restrict the class of problems one can solve or give the wrong answer in some types of problems.[11] The stacking of different time periods is therefore not appealing for our purposes.

[11] In the Farmer et al. (2011) procedure, for instance, the Newton algorithm is constructed on the assumption that the coefficient matrix on forward-looking variables depends only on the current regime.

2.3 The generic model

Many solution approaches, like Farmer et al. (2011) or Cho (2014), start out with a linear model and then apply Markov switching to the parameters. This strategy is reasonable as long as one takes a linear specification as the structural model. When the underlying structural model is nonlinear, however, the agents are aware of the nonlinear nature of the system and of the switching process.
This has implications for the solutions based on approximation and for the decision rules. In this setting, an important result by Foerster et al. (2013) is that the first-order perturbation may be non-certainty-equivalent. Hence starting out with a linear specification may miss this important point.

The problem to solve is

E_t \sum_{r_{t+1}=1}^{h} \pi_{r_t,r_{t+1}}(\mathcal{I}_t)\, \tilde{d}_{r_t}(v) = 0

where $E_t$ is the expectation operator, $\tilde{d}_{r_t}: \mathbb{R}^{n_v} \longrightarrow \mathbb{R}^{n_d}$ is an $n_d \times 1$ vector of possibly nonlinear functions of its argument $v$ (defined below), $r_t = 1, 2, \ldots, h$ is the regime at time $t$, and $\pi_{r_t,r_{t+1}}(\mathcal{I}_t)$ is the transition probability for going from regime $r_t$ in the current period to regime $r_{t+1}$ in the next period. This probability is potentially endogenous in the sense that it is a function of $\mathcal{I}_t$, the information set at time $t$. The only restriction imposed on the endogenous switching probabilities is that the parameters affecting them do not switch over time and that the variables entering those probabilities have a unique steady state.[12]

The $n_v \times 1$ vector $v$ is defined as

v \equiv \begin{bmatrix} b_{t+1}(r_{t+1})' & f_{t+1}(r_{t+1})' & s_t(r_t)' & p_t(r_t)' & b_t(r_t)' & f_t(r_t)' & p_{t-1}' & b_{t-1}' & \varepsilon_t' & \theta_{r_{t+1}}' \end{bmatrix}'   (1)

where $s_t$ is an $n_s \times 1$ vector of static variables,[13] $f_t$ is an $n_f \times 1$ vector of forward-looking variables,[14] $p_t$ is an $n_p \times 1$ vector of predetermined variables,[15] $b_t$ is an $n_b \times 1$ vector of variables that are both predetermined and forward-looking, $\varepsilon_t$ is an $n_\varepsilon \times 1$ vector of shocks with $\varepsilon_t \sim N(0, I_{n_\varepsilon})$, and $\theta_{r_{t+1}}$ is an $n_\theta \times 1$ vector of switching parameters (appearing with a lead in the model).

Note that we do not declare the parameters of the current regime $r_t$; they are implicitly attached to $\tilde{d}_{r_t}$. Also note that we could get rid of the parameters of future

[12] RISE automatically checks for these requirements.
[13] Those are the variables appearing in the model only at time $t$.
[14] Those are the variables appearing in the model both at time $t$ and in future periods $t+1$, $t+2$, etc.
[15] Those are the variables appearing in the model at time $t$ and at times $t-1$, $t-2$, etc.

regimes ($\theta_{r_{t+1}}$) by declaring auxiliary variables, which would then be forward-looking variables.

If we define the $n_d \times 1$ vector $d_{r_t,r_{t+1}}$ as $d_{r_t,r_{t+1}} \equiv \pi_{r_t,r_{t+1}}(\mathcal{I}_t)\, \tilde{d}_{r_t}$, the objective becomes

E_t \sum_{r_{t+1}=1}^{h} d_{r_t,r_{t+1}}(v) = 0   (2)

We assume that the agents have information about all or some of the shocks $k \geq 0$ periods ahead into the future. And so, including a perturbation parameter $\sigma$, we define an $n_z \times 1$ vector of state variables as

z_t \equiv \begin{bmatrix} p_{t-1}' & b_{t-1}' & \sigma & \varepsilon_t' & \varepsilon_{t+1}' & \cdots & \varepsilon_{t+k}' \end{bmatrix}'

where $n_z = n_p + n_b + (k+1) n_\varepsilon + 1$. Denoting by $y_t(r_t)$ the $n_y \times 1$ vector of all the endogenous variables, where $n_y = n_s + n_p + n_b + n_f$, we are interested in solutions of the form

y_t(r_t) \equiv \begin{bmatrix} s_t(r_t) \\ p_t(r_t) \\ b_t(r_t) \\ f_t(r_t) \end{bmatrix} = T^{r_t}(z_t) \equiv \begin{bmatrix} S^{r_t}(z_t) \\ P^{r_t}(z_t) \\ B^{r_t}(z_t) \\ F^{r_t}(z_t) \end{bmatrix}   (3)

In general, there is no analytical solution to (2), even in cases where $\tilde{d}_{r_t}$ or $d_{r_t,r_{t+1}}$ is linear. In this paper we rely on a perturbation that will allow us to approximate the decision rules in (3).[16] We can then solve for these approximated decision rules by inserting their functional forms into (2) and its derivatives. This paper develops methods for doing that.

For the subsequent derivations, it is useful to define, for all $g \in \{s, p, b, f\}$, an $n_g \times n_y$ matrix $\lambda_g$ that selects the solution of $g$-type variables in $T$ or $y$. We also define $\lambda_x \equiv \begin{bmatrix} \lambda_p \\ \lambda_b \end{bmatrix}$ and $\lambda_{bf} \equiv \begin{bmatrix} \lambda_b \\ \lambda_f \end{bmatrix}$ as the selectors for $p$-$b$ and $b$-$f$ variables respectively. In the same way, we define, for all $g \in \{p_{t-1}, b_{t-1}, \sigma, \varepsilon_t, \varepsilon_{t+1}, \ldots, \varepsilon_{t+k}\}$, a matrix $m_g$ of size $n_g \times n_z$ that selects the $g$-type variables in the state vector $z_t$.

[16] Alternatively, one could approximate (2) directly, as is done for instance in Dynare.

3 Approximations

Since the solution is in terms of the vector of state variables $z_t$, we proceed to expressing all the variables in the system as functions of $z_t$.[17]
Since both $b_{t+1}(r_{t+1})$ and $f_{t+1}(r_{t+1})$ appear in the system (1), and given the solution (3), we need to express $z_{t+1}$ as a function of $z_t$. This is given by

z_{t+1} = h^{r_t}(z_t) + u z_t   (4)

where

h^{r_t}(z_t) \equiv \begin{bmatrix} (\lambda_x T^{r_t}(z_t))' & (m_\sigma z_t)' & (m_{\varepsilon,1} z_t)' & \cdots & (m_{\varepsilon,k} z_t)' & (0_{n_\varepsilon \times 1})' \end{bmatrix}'

and $u$ is an $n_z \times n_z$ random matrix defined by

u \equiv \begin{bmatrix} 0_{(n_p + n_b + 1 + k n_\varepsilon) \times n_z} \\ \varepsilon_{t+k+1}\, m_\sigma \end{bmatrix}

Following Foerster et al. (2014), we postulate a perturbation solution for $\theta_{r_{t+1}}$[18] as

\theta_{r_{t+1}} = \bar{\theta}_{r_t} + \sigma \hat{\theta}_{r_{t+1}}   (5)

In this respect, this paper differs from Foerster et al. (2013) and Foerster et al. (2014) in two important ways. First, $\bar{\theta}_{r_t}$ need not be the ergodic mean of the parameters, as will be discussed below. Secondly, conditional on being in regime $r_t$, perturbation is never done with respect to the $\theta_{r_t}$ parameters. Perturbation is done only with respect to the parameters of the future regimes ($\theta_{r_{t+1}}$) that appear in the system for the current regime ($r_t$).

Given the solution, we can now express the vector $v$ in terms of the state variables:

v = \begin{bmatrix} \lambda_{bf} T^{r_{t+1}}(h^{r_t}(z_t) + u z_t) \\ T^{r_t}(z_t) \\ m_p z_t \\ m_b z_t \\ m_{\varepsilon,0} z_t \\ \bar{\theta}_{r_t} + \hat{\theta}_{r_{t+1}} m_\sigma z_t \end{bmatrix}   (6)

and the objective function (2) becomes

E_t \sum_{r_{t+1}=1}^{h} d_{r_t,r_{t+1}}(v(z_t, \varepsilon_{t+k+1})) = 0   (7)

Having expressed the problem to solve in terms of the state variables in a consolidated vector $z_t$, as in Levintal (2014), we stand ready to take successive Taylor approximations of (7) to find the perturbation solutions.

[17] Here we differentiate the policy functions. One could, alternatively, differentiate the model conditions, as is done for instance in Dynare and Dynare++. This would involve taking the Taylor expansion of the original objective (2).
[18] This assumption is not necessary. We could, as argued above, simply declare new auxiliary variables and get rid of the parameters of the future regimes.

3.1 Zero-th order

The first step of the perturbation technique requires the choice of the approximation point.
In a constant-parameter world, we typically approximate the system around the steady state, the resting point to which the system will converge in the absence of future shocks. In a switching environment the choice is not so obvious any more.

Approximation around the ergodic mean. Foerster et al. (2014) propose to take a perturbation of the system around its ergodic mean. This ergodic mean can be found by solving

\tilde{d}_{r_t}\left(\bar{b}, \bar{f}, \bar{s}, \bar{p}, \bar{b}, \bar{f}, \bar{p}, \bar{b}, 0, \bar{\theta}\right) = 0

where $\bar{\theta}$ is the ergodic mean of the parameters.[19] The ergodic mean, however, need not be an attractor or a resting point, i.e. a point towards which the system will converge in the absence of further shocks. We propose two further possibilities.

Regime-specific steady states.[20] The first possibility is to approximate the system around regime-specific means. The system may not be stable at the mean in a certain regime, but at least we assume that if the system happens to be exactly at one of its regime-specific means, it will stay there in the absence of any further shocks. We compute those means by solving

\tilde{d}_{r_t}\left(b_t(r_t), f_t(r_t), s_t(r_t), p_t(r_t), b_t(r_t), f_t(r_t), p_t(r_t), b_t(r_t), 0, \theta_{r_t}\right) = 0

The intuition behind this strategy is two-fold. On the one hand, it is not too difficult to imagine that the relevant issue for rational agents living in a particular state of the system at some point in time is to insure against the possibility of switching to a different state, and not to the ergodic mean. On the other hand, from a practical point of view, the point to which the system is to return matters for forecasting. Many inflation-targeting countries have moved from a regime of high inflation to a regime of lower inflation.

[19] In the RISE toolbox, this is triggered by the option "unique".
[20] This approach is the default behavior in the RISE toolbox.
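As a concrete toy illustration of regime-specific steady states (ours, not RISE code), consider the capital Euler equation of a simple RBC model, $\beta_{r}\,(\alpha k^{\alpha-1} + 1 - \delta) = 1$, where the discount factor $\beta_r$ switches across regimes. The sketch below solves for the regime-specific capital steady state by a plain Newton iteration; regimes with different $\beta_r$ deliver different steady states.

```python
def regime_steady_state(alpha, beta, delta, tol=1e-12):
    """Newton iteration on f(k) = beta*(alpha*k**(alpha-1) + 1 - delta) - 1 = 0,
    the capital steady state of a toy RBC model for one regime's beta."""
    k = 1.0
    for _ in range(200):
        f = beta * (alpha * k ** (alpha - 1) + 1 - delta) - 1
        fp = beta * alpha * (alpha - 1) * k ** (alpha - 2)  # f'(k)
        step = f / fp
        k -= step
        if abs(step) < tol:
            break
    return k
```

In this toy case the answer is also available in closed form, $k = (\alpha/(1/\beta - 1 + \delta))^{1/(1-\alpha)}$, which makes the sketch easy to verify.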
Approximating the system around the ergodic mean in this case implies that the unconditional forecasts will be pulled towards a level that is consistently higher than the recent history of inflation, which is likely to yield substantial forecast errors. All this reinforces the idea that the ergodic mean is not necessarily an attractor.

Approximation around an arbitrary point. Under the second possibility, one can impose an arbitrary approximation point. If the point of approximation is chosen arbitrarily, obviously, neither of the two equations above will hold and a correction will be needed in the dynamic system, with consequences for the solution as well. This approach may be particularly useful in certain applications, e.g. a situation in which one of the regimes persistently deviates from the reference steady state for an extended period of time.[21] The approach also bears some similarity with the constant-parameter case, where the approximation is sometimes taken around the risky or stochastic steady state.

Suppose we want to take an approximation around an arbitrary point $[\breve{s}, \breve{p}, \breve{b}, \breve{f}]$. The strategy we suggest is to evaluate that point in each regime. More specifically, we will have

\breve{d}_{r_t} \equiv \tilde{d}_{r_t}\left(\breve{b}, \breve{f}, \breve{s}, \breve{p}, \breve{b}, \breve{f}, \breve{p}, \breve{b}, 0, \theta_{r_t}\right)

This discrepancy is then forwarded to the first-order approximation when solving for the first-order coefficients. Interestingly, both the regime-specific approach and the ergodic approach are special cases. In the former, because $[\breve{s}, \breve{p}, \breve{b}, \breve{f}] = [s_t(r_t), p_t(r_t), b_t(r_t), f_t(r_t)]$, the discrepancy $\breve{d}_{r_t}$ is zero. In the latter case, $[\breve{s}, \breve{p}, \breve{b}, \breve{f}] = [s^{ergodic}, p^{ergodic}, b^{ergodic}, f^{ergodic}]$ and the discrepancy need not be zero.

The approach suggested here is computationally more efficient than that suggested by Foerster et al. (2014) and does not require any partitioning of the switching parameters between those that affect the steady state and those that do not.
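The mechanics of the discrepancy $\breve{d}_{r_t}$ can be sketched in a few lines (the helper name and the toy residual used to exercise it are ours): evaluate each regime's equations at the common imposed point, repeating the imposed $b$, $f$ and $p$ values in the lead, current and lag slots, following the ordering of the vector $v$. A regime whose own steady state happens to coincide with the imposed point returns a zero residual, which is the regime-specific special case described above.

```python
def discrepancy(d_tilde, point, thetas):
    """Evaluate one regime system d_tilde at a common imposed point for every
    regime parameter vector. Arguments of d_tilde follow the ordering of v:
    (b_lead, f_lead, s, p, b, f, p_lag, b_lag, eps, theta), so the imposed
    values of b, f and p are repeated in the lead/current/lag slots."""
    s, p, b, f = point
    return [d_tilde(b, f, s, p, b, f, p, b, 0.0, theta) for theta in thetas]
```

For instance, with the toy one-equation residual $\theta b_{lead} - b$, imposing $b = 2$ gives a zero discrepancy in a regime with $\theta = 1$ and a nonzero one in a regime with $\theta = 0.5$.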
It will be shown later on, in an application, that we recover their results.

[21] In the RISE toolbox, this is triggered by the option "imposed". When the steady state is not imposed, RISE uses the values provided as an initial guess when solving for the approximation point.

3.2 First-order perturbation

Taking high-order approximations of this system is tedious, especially if one attempts to compute all cross/partial derivatives in isolation. This is even worse if one allows for anticipated events. A less cumbersome approach exploits the strategy of Levintal (2014), in which the differentiation is done with respect to the entire vector $z$ directly, rather than element by element or block by block.[22]

At first order, then, we seek to approximate $T^{r_t}$ in (3) with a solution of the form

T^{r_t}(z) \simeq T^{r_t}(\bar{z}_{r_t}) + T_z^{r_t}(z_t - \bar{z}_{r_t})   (8)

Finding this solution requires differentiating (7) with respect to $z_t$ and applying the correction due to the choice of the approximation point, which is equivalent to taking a first-order Taylor expansion of (7). Using tensor notation, we have

\breve{d}_{r_t} + E_t \sum_{r_{t+1}=1}^{h} [d_v^{r_t,r_{t+1}}]^i_\alpha [v_z]^\alpha_j = 0   (9)

where $[d_v^{r_t,r_{t+1}}]^i_\alpha$ denotes the derivative of the $i$th row of $d$ with respect to the $\alpha$th row of $v$ and, similarly, $[v_z]^\alpha_j$ denotes the derivative of the $\alpha$th row of $v$ with respect to the $j$th row of $z$. Defining

A_0^{r_t,r_{t+1}} \equiv \begin{bmatrix} d_{s_0}^{r_t,r_{t+1}} & d_{p_0}^{r_t,r_{t+1}} & d_{b_0}^{r_t,r_{t+1}} & d_{f_0}^{r_t,r_{t+1}} \end{bmatrix}

we have

d_v = \begin{bmatrix} d_{b_+}^{r_t,r_{t+1}} & d_{f_+}^{r_t,r_{t+1}} & A_0^{r_t,r_{t+1}} & d_{p_-}^{r_t,r_{t+1}} & d_{b_-}^{r_t,r_{t+1}} & d_{\varepsilon_0}^{r_t,r_{t+1}} & d_{\theta_+}^{r_t,r_{t+1}} \end{bmatrix}

The equality in (9) follows from the fact that all the derivatives of (7) must be equal to 0. The derivatives of $v$ with respect to $z$ are given by

v_z = a_z^0 + a_z^1 u   (10)

where the definitions of $a_z^0$ and $a_z^1$ are given in appendix (A.1).
The derivative of $h$ with respect to $z$, needed in the calculation of $v_z$ as can be seen in (6), is given by

h_z^{r_t} = \begin{bmatrix} (\lambda_x T_z^{r_t})' & m_\sigma' & m_{\varepsilon,1}' & \cdots & m_{\varepsilon,k}' & 0_{n_z \times n_\varepsilon} \end{bmatrix}'

[22] This strategy is particularly useful when it comes to computing higher-order cross derivatives. By not separating the state variables, we always have one block of cross products, no matter the order of approximation, instead of an exponentially increasing number of cross blocks.

Unfolding the tensors, this problem reduces to

\breve{d}_{r_t} + \sum_{r_{t+1}=1}^{h} d_v^{r_t,r_{t+1}} E_t v_z = 0

Let us define $d_{g_q}^{r_t,r_{t+1}} \equiv \frac{\partial d_{r_t,r_{t+1}}}{\partial g_q}$ for $g = s, p, f, b$, referring to static, predetermined, forward-looking and "both" variables, and for $q = 0, +, -$, referring to current, future and past variables respectively. The problem can be expanded into

\sum_{r_{t+1}=1}^{h} \left( \begin{bmatrix} d_{b_+}^{r_t,r_{t+1}} & d_{f_+}^{r_t,r_{t+1}} \end{bmatrix} \lambda_{bf} T_z^{r_{t+1}} h_z^{r_t} + A_0^{r_t,r_{t+1}} T_z^{r_t} + \begin{bmatrix} d_{p_-}^{r_t,r_{t+1}} & d_{b_-}^{r_t,r_{t+1}} \end{bmatrix} \begin{bmatrix} m_p \\ m_b \end{bmatrix} + d_{\varepsilon_0}^{r_t,r_{t+1}} m_{\varepsilon,0} + d_{\theta_+}^{r_t,r_{t+1}} \hat{\theta}_{r_{t+1}} m_\sigma \right) = 0

Looking at $T_z^{r_t}$ and $h_z^{r_t}$ in detail, we see that they can be partitioned. In particular, with

T_{z,x}^{r_t} \equiv \begin{bmatrix} T_{z,p}^{r_t} & T_{z,b}^{r_t} \end{bmatrix}

we have

T_z^{r_t} = \begin{bmatrix} T_{z,x}^{r_t} & T_{z,\sigma}^{r_t} & T_{z,\varepsilon_0}^{r_t} & T_{z,\varepsilon_1}^{r_t} & \cdots & T_{z,\varepsilon_k}^{r_t} \end{bmatrix}

h_z^{r_t} = \begin{bmatrix}
\lambda_x T_{z,x}^{r_t} & \lambda_x T_{z,\sigma}^{r_t} & \lambda_x T_{z,\varepsilon_0}^{r_t} & \lambda_x T_{z,\varepsilon_1}^{r_t} & \cdots & \lambda_x T_{z,\varepsilon_k}^{r_t} \\
0_{1 \times n_x} & 1 & 0_{1 \times n_\varepsilon} & 0_{1 \times n_\varepsilon} & \cdots & 0_{1 \times n_\varepsilon} \\
0_{n_\varepsilon \times n_x} & 0_{n_\varepsilon \times 1} & 0_{n_\varepsilon} & I_{n_\varepsilon} & \cdots & 0_{n_\varepsilon} \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
0_{n_\varepsilon \times n_x} & 0_{n_\varepsilon \times 1} & 0_{n_\varepsilon} & 0_{n_\varepsilon} & \cdots & I_{n_\varepsilon} \\
0_{n_\varepsilon \times n_x} & 0_{n_\varepsilon \times 1} & 0_{n_\varepsilon} & 0_{n_\varepsilon} & \cdots & 0_{n_\varepsilon}
\end{bmatrix}

Hence the solving can be decomposed into smaller problems.

3.2.1 Impact of endogenous state variables

The problem to solve is

0 = A_0^{r_t} T_{z,x}^{r_t} + A_-^{r_t} + \sum_{r_{t+1}=1}^{h} A_+^{r_t,r_{t+1}} T_{z,x}^{r_{t+1}} \lambda_x T_{z,x}^{r_t}   (11)

with

A_0^{r_t} \equiv \sum_{r_{t+1}=1}^{h} A_0^{r_t,r_{t+1}}

A_-^{r_t} \equiv \sum_{r_{t+1}=1}^{h} \begin{bmatrix} d_{p_-}^{r_t,r_{t+1}} & d_{b_-}^{r_t,r_{t+1}} \end{bmatrix}

A_+^{r_t,r_{t+1}} \equiv \begin{bmatrix} 0_{n_d \times n_s} & 0_{n_d \times n_p} & d_{b_+}^{r_t,r_{t+1}} & d_{f_+}^{r_t,r_{t+1}} \end{bmatrix}   (12)

Since there are many algorithms for solving (11), we delay the presentation of our solution algorithms until section 4. For the time being, the reader should note the way $A_+^{r_t,r_{t+1}}$ enters (11).
This says that our algorithms will be able to handle cases where the coefficient matrix on forward-looking terms is known in the current period ($A_+^{r_t,r_{t+1}} = A_+^{r_t,r_t}$), as in Farmer et al. (2011), but also the more complicated case where $A_+^{r_t,r_{t+1}} \neq A_+^{r_t,r_t}$, as in Cho (2014). This is part of the reason why the notation of Schmitt-Grohe and Uribe (2004), where one can stack variables, is not appropriate in this context. The assumption of a current-period coefficient matrix is very convenient in the Farmer et al. (2011) procedure, as it allows them to derive their solution algorithm, which would be more difficult otherwise. It is also convenient as it leads to substantial computational savings. But as our derivations show, the assumption is incorrect in problems where $A_+^{r_t,r_{t+1}} \neq A_+^{r_t,r_t}$.

3.2.2 Impact of uncertainty

For the moment, we proceed with the assumption that we have solved for $T_{z,x}^{r_t}$. Now we have to solve

\breve{d}_{r_t} + \sum_{r_{t+1}=1}^{h} A_+^{r_t,r_{t+1}} T_{z,\sigma}^{r_{t+1}} + A_\sigma^{r_t} T_{z,\sigma}^{r_t} + \sum_{r_{t+1}=1}^{h} d_{\theta_+}^{r_t,r_{t+1}} \hat{\theta}_{r_t,r_{t+1}} = 0

which leads to

\begin{bmatrix} T_{z,\sigma}^{1} \\ T_{z,\sigma}^{2} \\ \vdots \\ T_{z,\sigma}^{h} \end{bmatrix} = - \begin{bmatrix} A_\sigma^1 + A_+^{1,1} & A_+^{1,2} & \cdots & A_+^{1,h} \\ A_+^{2,1} & A_\sigma^2 + A_+^{2,2} & \cdots & A_+^{2,h} \\ \vdots & \vdots & \ddots & \vdots \\ A_+^{h,1} & A_+^{h,2} & \cdots & A_\sigma^h + A_+^{h,h} \end{bmatrix}^{-1} \begin{bmatrix} \breve{d}_1 + \sum_{r_{t+1}=1}^{h} d_{\theta_+}^{1,r_{t+1}} \hat{\theta}_{1,r_{t+1}} \\ \breve{d}_2 + \sum_{r_{t+1}=1}^{h} d_{\theta_+}^{2,r_{t+1}} \hat{\theta}_{2,r_{t+1}} \\ \vdots \\ \breve{d}_h + \sum_{r_{t+1}=1}^{h} d_{\theta_+}^{h,r_{t+1}} \hat{\theta}_{h,r_{t+1}} \end{bmatrix}   (13)

where

A_\sigma^{r_t} \equiv A_0^{r_t} + \sum_{r_{t+1}=1}^{h} A_+^{r_t,r_{t+1}} T_{z,x}^{r_{t+1}} \lambda_x

It follows from equation (13) that, in our framework, it is the presence of (1) future parameters in the current-regime system and/or (2) an approximation taken around a point that is not the regime-specific steady state that creates non-certainty equivalence.

3.2.3 Impact of shocks

Define

U_{r_t} \equiv \left( \sum_{r_{t+1}=1}^{h} A_+^{r_t,r_{t+1}} T_{z,x}^{r_{t+1}} \right) \lambda_x + A_0^{r_t}   (14)

Contemporaneous shocks. We have

T_{z,\varepsilon_0}^{r_t} = -U_{r_t}^{-1} \sum_{r_{t+1}=1}^{h} d_{\varepsilon_0}^{r_t,r_{t+1}}   (15)

Future shocks. For $k = 1, 2, \ldots$
we have the recursive formula

T_{z,\varepsilon_k}^{r_t} = -U_{r_t}^{-1} \left( \sum_{r_{t+1}=1}^{h} A_+^{r_t,r_{t+1}} T_{z,\varepsilon_{k-1}}^{r_{t+1}} \right)   (16)

3.3 Second-order solution

The second-order perturbation solution of $T^{r_t}$ in (3) takes the form

T^{r_t}(z) \simeq T^{r_t}(\bar{z}_{r_t}) + T_z^{r_t}(z_t - \bar{z}_{r_t}) + \tfrac{1}{2} T_{zz}^{r_t}(z_t - \bar{z}_{r_t})^{\otimes 2}

Since $T^{r_t}(\bar{z}_{r_t})$ and $T_z^{r_t}$ have been computed in earlier steps, at this stage we only need to solve for $T_{zz}^{r_t}$. To get the second-order solution, we differentiate (9) with respect to $z$ to get

E_t \sum_{r_{t+1}=1}^{h} \left( [d_{vv}^{r_t,r_{t+1}}]^i_{\alpha\beta} [v_z]^\beta_k [v_z]^\alpha_j + [d_v^{r_t,r_{t+1}}]^i_\alpha [v_{zz}]^\alpha_{jk} \right) = 0   (17)

so that unfolding the tensors yields

\sum_{r_{t+1}=1}^{h} \left( d_{vv}^{r_t,r_{t+1}} E_t v_z^{\otimes 2} + d_v^{r_t,r_{t+1}} E_t v_{zz} \right) = 0   (18)

We use the notation $A^{\otimes k}$ as a shorthand for $A \otimes A \otimes \cdots \otimes A$. We get $v_{zz}$ by differentiating $v_z$ with respect to $z$, yielding

v_{zz} = a_{zz}^0 + a_{zz}^1 u^{\otimes 2} + a_{zz}^1 \left(u \otimes h_z^{r_t} + h_z^{r_t} \otimes u\right)   (19)

where the definitions of $a_{zz}^0$ and $a_{zz}^1$, as well as the expressions for $E v_z^{\otimes 2}$, $E v_{zz}$ and $E u^{\otimes 2}$ needed for solving for $T_{zz}^{r_t}$, are given in appendix (A.2). With those expressions in hand, expanding the problem to solve in (18) gives

A_{zz}^{r_t} + \sum_{r_{t+1}=1}^{h} A_+^{r_t,r_{t+1}} T_{zz}^{r_{t+1}} C_{zz}^{r_t} + U_{r_t} T_{zz}^{r_t} = 0   (20)

with

A_{zz}^{r_t} \equiv \sum_{r_{t+1}=1}^{h} d_{vv}^{r_t,r_{t+1}} E_t v_z^{\otimes 2}

and

C_{zz}^{r_t} \equiv (h_z^{r_t})^{\otimes 2} + E u^{\otimes 2}

In the case where $h = 1$, some efficient algorithms can be used, e.g. Kamenik (2005). Such algorithms cannot be applied in the switching case, unfortunately. The strategy implemented in RISE applies the "transpose-free quasi-minimal residual" technique of Freund (1993).

3.4 Third-order solution

To get the third-order solution, we differentiate (17) with respect to $z$ to get

0 = E_t \sum_{r_{t+1}=1}^{h} \left( [d_{vvv}^{r_t,r_{t+1}}]^i_{\alpha\beta\gamma} [v_z]^\gamma_l [v_z]^\beta_k [v_z]^\alpha_j + [d_{vv}^{r_t,r_{t+1}}]^i_{\alpha\beta} \sum_{rst \in \Omega_1} [v_{zz}]^\beta_{rs} [v_z]^\alpha_t + [d_v^{r_t,r_{t+1}}]^i_\alpha [v_{zzz}]^\alpha_{jkl} \right)

\Omega_1 = \{klj, jlk, jkl\}

In matrix form,

0 = E_t \sum_{r_{t+1}=1}^{h} \left( d_{vvv}^{r_t,r_{t+1}} v_z^{\otimes 3} + \omega_1\left(d_{vv}^{r_t,r_{t+1}} (v_z \otimes v_{zz})\right) + d_v^{r_t,r_{t+1}} v_{zzz} \right)

where $\omega_1(.)$ is a function that computes the sum of permutations of tensors of type $A(B \otimes C)$, the permutations being given by the indices in $\Omega_1$.
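The permutation sums underlying $\omega_1(.)$, and the related operator $P(.)$ used for the third-order terms, amount to summing Kronecker products over orderings of the factors. A naive numpy sketch (illustrative only; it recomputes every product from scratch rather than deriving the remaining ones as perfect shuffles of a single computed product):

```python
import numpy as np
from itertools import permutations

def perm_sum(mats):
    """Sum of Kronecker products over all orderings of the factors,
    e.g. P(A x B x C) = AxBxC + AxCxB + BxAxC + BxCxA + CxAxB + CxBxA
    for three factors (6 terms). Naive version for illustration."""
    total = 0
    for order in permutations(mats):
        term = order[0]
        for M in order[1:]:
            term = np.kron(term, M)
        total = total + term
    return total
```

For two identical factors the sum is simply twice the single product, which provides a quick sanity check of the implementation.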
We obtain v_{zzz} by differentiating v_{zz} with respect to z, yielding

v_{zzz} = a^{0}_{zzz} + a^{1}_{zzz} P\!\left( (h^{r_t}_{z})^{\otimes 2} \otimes u \right) + a^{1}_{zzz} P\!\left( h^{r_t}_{z} \otimes u^{\otimes 2} \right) + a^{1}_{zzz} u^{\otimes 3} + \omega_1\!\left( a^{1}_{zz} (u \otimes h^{r_t}_{zz}) \right)    (21)

where h^{r_t}_{zz}, defined in (36), is the second derivative of h^{r_t} with respect to z.23

23 For any k ≥ 2, the k-order derivative of h^{r_t} with respect to z is given by h^{r_t}_{z^{(k)}} = \begin{bmatrix} \lambda_x T^{r_t}_{z^{(k)}} \\ 0_{(1+(k+1)n_\varepsilon) \times n_z^k} \end{bmatrix}.

The definitions of a^{0}_{zzz} and a^{1}_{zzz}, as well as the expressions for E v_{zzz}, E(v_z \otimes v_z \otimes v_z), E(v_{zz} \otimes v_z) and E(v_z \otimes v_{zz}) needed to solve for T^{r_t}_{zzz}, are given in appendix (A.3).

In equation (21), the function P(.) denotes the sum of all possible permutations of Kronecker products. That is,

P(A \otimes B \otimes C) = A \otimes B \otimes C + A \otimes C \otimes B + B \otimes A \otimes C + B \otimes C \otimes A + C \otimes A \otimes B + C \otimes B \otimes A

Only one product needs to be computed; the remaining ones can be derived as perfect shuffles of the computed one (see e.g. Van Loan (2000)). Then, expanding the problem to solve, we get

A^{r_t}_{zzz} + \sum_{r_{t+1}=1}^{h} A^{+}_{r_t,r_{t+1}} T^{r_{t+1}}_{zzz} C^{r_t}_{zzz} + U_{r_t} T^{r_t}_{zzz} = 0    (22)

with

A^{r_t}_{zzz} \equiv \sum_{r_{t+1}=1}^{h} \left[ d^{r_t,r_{t+1}}_{vvv} E v_z^{\otimes 3} + \omega_1\!\left( d^{r_t,r_{t+1}}_{vv} E(v_z \otimes v_{zz}) \right) + \left( d^{r_t,r_{t+1}}_{b+} \lambda_b + d^{r_t,r_{t+1}}_{f+} \lambda_f \right) T^{r_{t+1}}_{zz} \left( h^{r_t}_{z} \otimes h^{r_t}_{zz} \right) \right]    (23)

and

C^{r_t}_{zzz} \equiv P\!\left( h^{r_t}_{z} \otimes E u^{\otimes 2} \right) + E u^{\otimes 3} + (h^{r_t}_{z})^{\otimes 3}    (24)

The third-order perturbation solution of T^{r_t} in (3) takes the form

T^{r_t}(z) \simeq T^{r_t}(\bar z_{r_t}) + T^{r_t}_{z}(z_t - \bar z_{r_t}) + \frac{1}{2} T^{r_t}_{zz}(z_t - \bar z_{r_t})^{\otimes 2} + \frac{1}{3!} T^{r_t}_{zzz}(z_t - \bar z_{r_t})^{\otimes 3}

3.5 Higher-order solutions

The derivatives of (7) can be generalized to an arbitrary order, say p. In that case we have

0 = E_t \sum_{r_{t+1}=1}^{h} \sum_{l=1}^{p} [d^{r_t,r_{t+1}}_{v^{(l)}}]^{i}_{\alpha_1 \alpha_2 \cdots \alpha_l} \sum_{c \in M_{l,p}} \prod_{m=1}^{l} [v_{z^{|c_m|}}]^{\alpha_m}_{k(c_m)}

where:

• M_{l,p} is the set of all partitions of the set of p indices with class l.
In particular, M_{1,p} = \{\{1, 2, \cdots, p\}\} and M_{p,p} = \{\{1\}, \{2\}, \cdots, \{p\}\}.
• |.| denotes the cardinality of a set.
• c_m is the m-th class of partition c.
• k(c_m) is a sequence of k's indexed by c_m.

The p-order perturbation solution of T^{r_t} takes the form

T^{r_t}(z) \simeq T^{r_t}(\bar z_{r_t}) + T^{r_t}_{z}(z_t - \bar z_{r_t}) + \frac{1}{2} T^{r_t}_{zz}(z_t - \bar z_{r_t})^{\otimes 2} + \frac{1}{3!} T^{r_t}_{zzz}(z_t - \bar z_{r_t})^{\otimes 3} + \frac{1}{4!} T^{r_t}_{zzzz}(z_t - \bar z_{r_t})^{\otimes 4} + \cdots + \frac{1}{p!} T^{r_t}_{z \ldots z}(z_t - \bar z_{r_t})^{\otimes p}

4 Solving the system of quadratic matrix equations

One of the main objectives of this paper is to present algorithms for solving (11).

4.1 Existing solution methods

In terms of solving the first-order approximation of switching DSGE models, there are important contributions in the literature, most of which are ideally suited for small models. We briefly review the essence of some of the existing solution methods, most of which are nicely surveyed and evaluated in Farmer et al. (2011), and then present our own.

The FP algorithm: The strategy in this technique by Farmer et al. (2008) is to stack all the regimes and re-write the MSDSGE model as a constant-parameter model, whose solution is the solution of the original MSDSGE model. As demonstrated by Farmer et al. (2011), the FP algorithm may not find solutions even when they exist.

The SW algorithm: This functional iteration technique by Svensson and Williams (2007) uses an iterative scheme to find the solution of a fixed-point problem through successive approximations of the solution, just as the Farmer et al. (2008) approach does. But the SW algorithm does not resort to stacking matrices as Farmer et al. (2008) do, which is an advantage for solving large systems. Unfortunately, again as shown by Farmer et al. (2011), the SW algorithm may not find solutions even when they exist.

The FWZ algorithm: This algorithm, by Farmer et al. (2011), uses a Newton method to find the solutions of the MSDSGE model.
It is a robust procedure in the sense that it is able to find solutions that the other methods are not able to find. The Newton method has the advantage of being fast and locally stable around any given solution. However, unlike what was claimed in Farmer et al. (2011), in practice, choosing a large enough grid of initial conditions does not guarantee that one will find all possible MSV solutions.

The Cho algorithm: This algorithm, by Cho (2014), finds the so-called "forward solution" by extending the forward method of Cho and Moreno (2011) to MSDSGE models. The paper also addresses the issue of determinacy, i.e. the existence of a unique stable solution. The essence of the algorithm is the same as that of the SW algorithm.

The FRWZ algorithm:24 This technique, by Foerster et al. (2014), is one of the latest in the literature. It applies the theory of Gröbner bases (see Buchberger (1965), Buchberger (2006)), which helps find all possible solutions of the system symbolically. While promising, the Gröbner bases avenue is not very attractive in the context of models of the size routinely used in central banks. Because of its exponential complexity, the computational cost incurred is substantial and prohibitively high (see for instance Ghorbal et al. (2014) and also Mayr and Meyer (1982)).25 The algorithm is sensitive to the ordering of the variables and in many instances may simply fail to converge. In general, methods that aim at symbolically deriving solutions to systems of equations work well when the number of variables is small. When the number of variables is not too small, say, bigger than 20, such methods tend not to work well, as in general the number of solutions increases exponentially with the number of variables. More to the point, Foerster et al. (2014)'s strategy is to find all possible solutions and then establish the stability of each of the solutions, which in itself is a complicated

24 This algorithm is not yet implemented in RISE.
step, in order to find out whether a particular parameterization admits a unique stable solution or not. The problem gets worse if one attempts to estimate a model, which is the goal we have in mind when deriving the algorithms in this paper: locating the global peak of the posterior kernel will typically require thousands of function evaluations.

Also, stability of a first-order approximation does not imply stability of higher-order approximations, even in the constant-parameter case. In the switching case with constant transition probabilities, the problem is worse. When we allow for endogenous probabilities, as is done in this paper, there is no applicable stability concept such as Mean Square Stability (MSS). Therefore, even if we could compute all possible solutions, we would not have a criterion for picking one of them.

25 This approach sometimes requires parallelization for moderately large systems (see e.g. Leykin (2004)).

4.2 Proposed solution techniques

All of the algorithms presented above use constant transition probabilities. In many cases, they also use the type of notation used in Klein (2000), Schmitt-Grohe and Uribe (2004) and by Gensys. This notation works well with constant-parameter models.26 When it comes to switching DSGE models, however, that notation has some disadvantages, as we argued earlier. Among other things, besides the fact that it often requires considerable effort to manipulate the model into the required form, the ensuing algorithms may be slow, owing in part to the creation of additional auxiliary variables. The solution methods proposed below do not suffer from those drawbacks, but they have weaknesses of their own. They should therefore be viewed as complementary to existing ones.27

4.2.1 Solution technique 1: A functional iteration algorithm

Functional iteration is a well-established technique for solving models that do not have an analytical solution but that are nevertheless amenable to a fixed-point type of representation.
It is used, for instance, when solving for the value function of the Bellman equation in dynamic programming problems. In the context of switching DSGE models, functional iteration has the advantages that (1) it typically converges fast whenever it is able to find a solution, (2) it can be used to solve large systems, and (3) it is easily implementable.

26 This is an attractive feature, as it provides a compact and efficient representation of the first-order conditions or constraints of a system, leading to the well-known solution methods for constant-parameter DSGE models such as Klein (2000) and Gensys, among others.
27 These solution algorithms have been part of the RISE toolbox for many years.

We concentrate on minimum state variable (MSV) solutions and promptly rewrite (11) as28

T^{r_t}_{z,x} = -\left[ A^{0}_{r_t} + \left( \sum_{r_{t+1}=1}^{h} A^{+}_{r_t,r_{t+1}} T^{r_{t+1}}_{z,x} \right) \lambda_x \right]^{-1} A^{-}_{r_t}    (25)

which is the basis for our functional iteration algorithm. We can iterate on equation (25) for r_t = 1, 2, ..., h until convergence, i.e. until T^{r_t(k)}_{z,x} = T^{r_t(k-1)}_{z,x}, given a starting value T^{(0)}_{z,x} \equiv \left[ T^{1(0)}_{z,x} \cdots T^{h(0)}_{z,x} \right].

Local convergence of functional iteration

We are now interested in deriving the conditions under which an initial guess will converge to a solution. We have the following theorem.

Theorem 1 Let T^{*}_{r_t}, r_t = 1, 2, ..., h, be a solution to (25) such that \| T^{*}_{r_t} \| \le \theta and \| ( A^{0}_{r_t} + \sum_{r_{t+1}=1}^{h} A^{+}_{r_t,r_{t+1}} T^{*}_{r_{t+1}} )^{-1} \| \le \tau for r_t = 1, 2, ..., h, where \| A^{+}_{r_t,r_{t+1}} \| \le a^{+}, with \theta > 0, a^{+} > 0 and \tau > 0. Assume there exists \delta > 0 such that \| T^{(0)}_{r_t} - T^{*}_{r_t} \| \le \delta, r_t = 1, 2, ..., h. Then if \tau a^{+} (\theta + \delta) < 1, the iterative sequence \{ T^{(k)}_{r_t} \}_{r_t=1}^{h} generated by functional iteration with initial guess T^{(0)}_{r_t}, r_t = 1, 2, ..., h, satisfies \| T^{(k)}_{r_t} - T^{*}_{r_t} \| \le \varphi \sum_{r_{t+1}=1}^{h} \| T^{(k-1)}_{r_{t+1}} - T^{*}_{r_{t+1}} \|, with \varphi \in (0, 1).

Proof. See appendix B.

Farmer et al. (2011) show that functional iterations are not able to find all possible solutions.
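The iteration on (25) can be sketched in a few lines. The following Python/NumPy toy example uses small hypothetical structural matrices (scaled so that the map is a contraction and Theorem 1's condition holds comfortably); \lambda_x is dropped, i.e. all variables are treated as states, to keep the sketch short. RISE's actual implementation is in MATLAB:

```python
import numpy as np

h, n = 2, 3
rng = np.random.default_rng(2)

# Hypothetical structural matrices A0_{rt}, A+_{rt,rt+1}, A-_{rt}; the A+ and
# A- blocks are scaled down so the fixed-point map is a strong contraction.
A0 = [np.eye(n) + 0.05 * rng.standard_normal((n, n)) for _ in range(h)]
Ap = [[0.02 * rng.standard_normal((n, n)) for _ in range(h)] for _ in range(h)]
Am = [0.1 * rng.standard_normal((n, n)) for _ in range(h)]

# Iterate T_{rt} <- -[A0_{rt} + sum_s A+_{rt,s} T_s]^{-1} A-_{rt}
T = [np.zeros((n, n)) for _ in range(h)]   # "no predetermined variables" start
for k in range(500):
    T_new = [-np.linalg.solve(A0[r] + sum(Ap[r][s] @ T[s] for s in range(h)),
                              Am[r]) for r in range(h)]
    step = max(np.max(np.abs(T_new[r] - T[r])) for r in range(h))
    T = T_new
    if step < 1e-12:
        break

# At a fixed point, (A0_r + sum_s A+_{r,s} T_s) T_r + A-_r = 0 in every regime
for r in range(h):
    res = (A0[r] + sum(Ap[r][s] @ T[s] for s in range(h))) @ T[r] + Am[r]
    assert np.max(np.abs(res)) < 1e-10
```

Each pass costs h linear solves of size n, with no stacking of regimes, which is why the scheme scales to large systems.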
But still, functional iterations cannot be discounted: they are useful for getting solutions for large systems. It is therefore important to understand why in some cases functional iteration may fail to converge. The theorem establishes sufficient conditions for the convergence of functional iteration in this context; in other words, it gives us hints about the cases in which functional iteration might not perform well. In particular, the inequality \tau a^{+} (\theta + \delta) < 1 is more likely to be fulfilled for low values of \tau, a^{+}, \theta and \delta than for large ones.

28 The invertibility of A^{0}_{r_t} + \left( \sum_{r_{t+1}=1}^{h} A^{+}_{r_t,r_{t+1}} T^{r_{t+1}}_{z,x} \right) \lambda_x is potentially reinforced by the fact that all variables in the system appear as current. That is, all columns of matrix A^{0}_{r_t} have at least one non-zero element.

4.2.2 Solution technique 2: Newton with Kroneckers

We define

W_{r_t}(T_{z,x}) \equiv T^{r_t}_{z,x} + \left[ A^{0}_{r_t} + \left( \sum_{r_{t+1}=1}^{h} A^{+}_{r_t,r_{t+1}} T^{r_{t+1}}_{z,x} \right) \lambda_x \right]^{-1} A^{-}_{r_t}    (26)

We perturb the current solution T_{z,x} by a factor \Delta and expand (26) into

W_{r_t}(T_{z,x} + \Delta) = W_{r_t}(T_{z,x}) + \Delta_{r_t} - \sum_{r_{t+1}=1}^{h} L^{r_t,r_{t+1}}_{x+} \Delta_{r_{t+1}} L^{r_t}_{x-} + HOT

where

L^{r_t,r_{t+1}}_{x+} \equiv U^{-1}_{r_t} A^{+}_{r_t,r_{t+1}}    (27)

L^{r_t}_{x-} \equiv -\lambda_x T^{r_t}_{z,x}    (28)

Newton ignores the higher-order terms (HOT) and attempts to solve for \Delta_{r_t} so that

W_{r_t}(T_{z,x}) + \Delta_{r_t} - \sum_{r_{t+1}=1}^{h} L^{r_t,r_{t+1}}_{x+} \Delta_{r_{t+1}} L^{r_t}_{x-} = 0    (29)

where \Delta_{r_t} - \sum_{r_{t+1}=1}^{h} L^{r_t,r_{t+1}}_{x+} \Delta_{r_{t+1}} L^{r_t}_{x-} is the Frechet derivative of W_{r_t} at T_{z,x} in the direction \Delta_{r_t}.

• We vectorize \Delta_{r_t} - \sum_{r_{t+1}=1}^{h} L^{r_t,r_{t+1}}_{x+} \Delta_{r_{t+1}} L^{r_t}_{x-} = -W_{r_t}(T_{z,x}) to yield

\sum_{r_{t+1}=1}^{h} \left( (L^{r_t}_{x-})' \otimes L^{r_t,r_{t+1}}_{x+} \right) \delta_{r_{t+1}} - \delta_{r_t} = w_{r_t}(T_{z,x})    (30)

which we have to solve in order to find the Newton improving step \delta \equiv \left[ \delta'_1 \cdots \delta'_h \right]'.

• Stacking all the equations, \delta is solved as \delta = \Gamma^{-1} w, where

\Gamma \equiv \begin{bmatrix} (L^{1}_{x-})' \otimes L^{1,1}_{x+} & \cdots & (L^{1}_{x-})' \otimes L^{1,h}_{x+} \\ \vdots & \ddots & \vdots \\ (L^{h}_{x-})' \otimes L^{h,1}_{x+} & \cdots & (L^{h}_{x-})' \otimes L^{h,h}_{x+} \end{bmatrix} - I    (31)

with w \equiv \left[ w'_1 \cdots w'_h \right]'.

Relation to Farmer, Waggoner and Zha (2011): As discussed earlier, Farmer et al.
(2011) consider problems in which the coefficient matrix on the forward-looking variables is known in the current period. Our algorithms consider the more general case where those matrices are unknown in the current period. Another difference between Farmer et al. (2011) and this paper is that, rather than perturbing T as we do, they build the Newton step by computing derivatives directly. Our approach is more efficient because it solves a smaller system. In particular, in our case, \Gamma in (31) is of size n_x n h \times n_x n h, while it would be of size (n + l)^2 h \times (n + l)^2 h in the Farmer et al. (2011) case, where l is the number of expectational errors, n is the number of equations in the system and h is the number of regimes.

These computational gains can be substantial, but the computational costs remain significant. This owes to two facts:

• First, Kronecker products are expensive operations.
• Secondly, for models used for policy analysis, the matrices to build and invert are huge. In Smets and Wouters (2007), one of the models we will solve in the computational experiments, n = 40. With h = 2, the matrix \Gamma to build and invert is of size 3200 \times 3200. And this has to be repeated at each step of the Newton algorithm until convergence.

The natural question, then, is whether there is any hope of solving medium-scale (or large) MSDSGE models of the size that is relevant to policymakers in central banks. The answer is yes, at least if we find a more efficient way of solving (30). Here, the ideal method would:

1. only need matrix-vector products and compute those products efficiently;
2. have modest memory requirements and, in particular, avoid building and storing the huge matrix \Gamma in (31);
3. be based on some kind of minimization principle.

4.2.3 Solution technique 3: Newton without Kroneckers

The first and the second requirements are achieved by exploiting the insights of Tadonki and Philippe (1999).
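Their insight can be sketched directly: a Kronecker matrix-vector product (A \otimes B) z can be evaluated without ever forming or storing A \otimes B, by applying B and then A to a reshaped copy of z. A Python/NumPy illustration with hypothetical matrices of illustrative sizes:

```python
import numpy as np

rng = np.random.default_rng(4)
a, b = 6, 5                              # illustrative dimensions
A = rng.standard_normal((a, a))
B = rng.standard_normal((b, b))
z = rng.standard_normal(a * b)

def kron_matvec(z):
    """(A ⊗ B) z computed as (A ⊗ I)[(I ⊗ B) z], without forming A ⊗ B."""
    Z = z.reshape(b, a, order="F")       # unvec: z = vec(Z), column-major
    Zb = B @ Z                           # (I ⊗ B) z
    return (Zb @ A.T).ravel(order="F")   # (A ⊗ I) applied to the result

# the shortcut agrees with the explicit Kronecker product
assert np.allclose(kron_matvec(z), np.kron(A, B) @ z)
```

The explicit product stores (ab)^2 numbers while the shortcut needs only a^2 + b^2; an iterative solver such as TFQMR touches \Gamma only through matrix-vector calls of exactly this kind, which is what makes the memory requirement above attainable.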
They consider the problem of computing an expression of the form (A \otimes B) z, where A and B are matrices and z is a vector. Relying on the identity in (32), they show how (I \otimes B) z can be computed efficiently without using a Kronecker product. The result of this calculation is a vector, say z'. Then they also show that (A \otimes I) z' can be computed efficiently without using a Kronecker product:

(A \otimes B) z = (A \otimes I) \left[ (I \otimes B) z \right]    (32)

Hence we can compute (A \otimes B) z without building (A \otimes B) and without storing it. Equation (30) involves this type of computation.

The third requirement is achieved by exploiting the "transpose-free quasi-minimum residuals" (TFQMR) method for solving large linear systems, due to Freund (1993). It is an iterative method defined by a quasi-minimization of the norm of a residual vector over a Krylov subspace. The advantage of this technique for our problem is that it involves the coefficient matrix \Gamma only in the form of matrix-vector products, which we can compute efficiently as discussed above. Moreover, the TFQMR technique overcomes the problems of the bi-conjugate gradient (BCG) method, which is known to (i) exhibit erratic convergence behavior with large oscillations in the residual norm, and (ii) have a substantial likelihood of breakdowns (more precisely, divisions by 0).29

The system to solve for finding the Newton improving step, (30), is equivalent to \Gamma \delta = w. It turns out that, for a given \delta, we can compute \Gamma \delta without ever building \Gamma and without using a single Kronecker product. Hence we can solve for \delta using the TFQMR technique. We are now armed with the tools for summarizing the whole Newton algorithm.

Algorithm 2 Pseudo code for the complete Newton algorithm

1. k = 0
2. pick a starting value T^{(k)}_{z,x} \equiv \left[ T^{1(k)}_{z,x} \cdots T^{h(k)}_{z,x} \right] and compute W^{(k)}(T_{z,x}) \equiv \left[ W_1(T^{(k)}_{z,x}) \cdots W_h(T^{(k)}_{z,x}) \right] using (26)
3.
while W^{(k)} \ne 0:

(a) compute L^{r_t,r_{t+1}}_{x+} and L^{r_t}_{x-}, r_t, r_{t+1} = 1, 2, ..., h, as in (27) and (28) respectively
(b) solve (29) for \Delta^{(k)} \equiv \left[ \Delta^{(k)}_1 \cdots \Delta^{(k)}_h \right] using algorithm (4.2.2) or algorithm (4.2.3)
(c) k = k + 1
(d) T^{(k)}_{z,x} = T^{(k-1)}_{z,x} + \Delta^{(k-1)}
(e) W_{r_t}(T^{(k)}_{z,x}) = T^{r_t(k)}_{z,x} + \left[ A^{0}_{r_t} + \left( \sum_{r_{t+1}=1}^{h} A^{+}_{r_t,r_{t+1}} T^{r_{t+1}(k)}_{z,x} \right) \lambda_x \right]^{-1} A^{-}_{r_t}, r_t = 1, 2, ..., h

29 The BCG method, due to Lanczos (1952), extends the classical conjugate gradient (CG) algorithm to non-Hermitian matrices.

4.3 Stability in the non-time-varying transition probabilities case

The traditional stability concepts for constant-parameter linear rational expectations models do not extend to the Markov switching case. Following the lead of Svensson and Williams (2007) and Farmer et al. (2011), among others, this paper uses the concept of mean square stability (MSS), as defined in Costa et al. (2005), to characterize stable solutions under non-time-varying transition probabilities.

Consider the MSDSGE system whose solution is given by equation (8) and with constant transition probability matrix Q such that Q_{r_t,r_{t+1}} = p_{r_t,r_{t+1}}. Using the selection matrix \lambda_x, we can extract the portion of the solution for the endogenous state variables only to get

x_t(z) = \lambda_x T^{r_t}(\bar z_{r_t}) + \lambda_x T^{r_t}_{z,x} \left( x_{t-1} - \lambda_x T^{r_t}(\bar z_{r_t}) \right) + \lambda_x T^{r_t}_{z,\sigma} \sigma + \lambda_x T^{r_t}_{z,\varepsilon_0} \varepsilon_t + \cdots + \lambda_x T^{r_t}_{z,\varepsilon_k} \varepsilon_{t+k}

This system, and thereby (8), is MSS if for any initial condition x_0 there exist a vector \mu and a matrix \Sigma, independent of x_0, such that \lim_{t \to \infty} \| E x_t - \mu \| = 0 and \lim_{t \to \infty} \| E x_t x_t' - \Sigma \| = 0. Hence the covariance matrix of the process is bounded. As shown by Gupta et al. (2003) and Costa et al. (2005, chapter 3), a necessary and sufficient condition for MSS is that the matrix \Upsilon, as defined in (33), has all its eigenvalues inside the unit circle:30

\Upsilon \equiv (Q \otimes I_{n_x^2}) \, \mathrm{diag}\!\left( \lambda_x T^{1}_{z,x} \otimes \lambda_x T^{1}_{z,x}, \ldots, \lambda_x T^{h}_{z,x} \otimes \lambda_x T^{h}_{z,x} \right)    (33)
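The MSS check, building \Upsilon block by block and testing its spectral radius, can be sketched in Python/NumPy. The two-regime scalar dynamics below are hypothetical, chosen so that one parameterization is MSS and the other is not:

```python
import numpy as np

def is_mss(T, Q):
    """Build Upsilon as in (33), with block (r, s) equal to
    Q[r, s] * kron(T[s], T[s]), and check that its spectral radius
    lies strictly inside the unit circle (mean square stability)."""
    h = len(T)
    blocks = [[Q[r, s] * np.kron(T[s], T[s]) for s in range(h)]
              for r in range(h)]
    Upsilon = np.block(blocks)
    return np.max(np.abs(np.linalg.eigvals(Upsilon))) < 1.0

# Hypothetical two-regime example with scalar state dynamics: even though
# regime 2 is quite persistent (0.9), the switching system is MSS ...
Q = np.array([[0.9, 0.1],
              [0.2, 0.8]])
T = [np.array([[0.5]]), np.array([[0.9]])]
assert is_mss(T, Q)
# ... while an explosive regime visited often enough destroys MSS
assert not is_mss([np.array([[0.5]]), np.array([[1.2]])], Q)
```

The T[r] arrays stand in for the state-transition blocks \lambda_x T^{r}_{z,x}; note that MSS allows individual regimes to be (mildly) explosive as long as they are visited rarely enough, which the spectral-radius criterion captures.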
30 It is not very hard to see that a computationally more efficient representation of \Upsilon is given by:

\Upsilon = \begin{bmatrix} p_{1,1} \lambda_x T^{1}_{z,x} \otimes \lambda_x T^{1}_{z,x} & p_{1,2} \lambda_x T^{2}_{z,x} \otimes \lambda_x T^{2}_{z,x} & \cdots & p_{1,h} \lambda_x T^{h}_{z,x} \otimes \lambda_x T^{h}_{z,x} \\ p_{2,1} \lambda_x T^{1}_{z,x} \otimes \lambda_x T^{1}_{z,x} & p_{2,2} \lambda_x T^{2}_{z,x} \otimes \lambda_x T^{2}_{z,x} & \cdots & p_{2,h} \lambda_x T^{h}_{z,x} \otimes \lambda_x T^{h}_{z,x} \\ \vdots & \vdots & \ddots & \vdots \\ p_{h,1} \lambda_x T^{1}_{z,x} \otimes \lambda_x T^{1}_{z,x} & p_{h,2} \lambda_x T^{2}_{z,x} \otimes \lambda_x T^{2}_{z,x} & \cdots & p_{h,h} \lambda_x T^{h}_{z,x} \otimes \lambda_x T^{h}_{z,x} \end{bmatrix}

A particular note should be taken of the facts that:

• With endogenous probabilities, as discussed in this paper, the MSS concept becomes unusable.
• Even in the constant-transition-probabilities case, there are no theorems implying that stability of the first-order perturbation implies the stability of higher orders.

4.4 Proposed initialization strategies

Both functional iteration and the Newton methods require an initial guess of the solution. As stated earlier, we do not attempt to compute all possible solutions of (11) and then check that only one of them is stable, as is done in Foerster et al. (2014). Initialization is key not only for finding a solution but also, potentially, for the speed of convergence of any algorithm. We derive all of our initialization schemes from making assumptions on equation (25), which, for convenience, we reproduce as (34), abstracting from the iteration superscript:

T^{r_t}_{z,x} = -\left[ A^{0}_{r_t} + \left( \sum_{r_{t+1}=1}^{h} A^{+}_{r_t,r_{t+1}} T^{r_{t+1}}_{z,x} \right) \lambda_x \right]^{-1} A^{-}_{r_t}    (34)

There are at least three ways to initialize a solution:

• The first approach makes the assumption that A^{-}_{r_t} = 0, r_t = 1, 2, ..., h. In this case, we initialize the solution at the solution that we would obtain in the absence of predetermined variables. This solution is trivially zero, as can be seen from equation (34). Formally, it corresponds to

T^{r_t(0)}_{z,x} = 0, for r_t = 1, 2, ..., h

• The second approach makes the assumption that A^{+}_{r_t,r_{t+1}} = 0, r_t, r_{t+1} = 1, 2, ..., h.
While the first approach assumes there are no predetermined variables, this approach instead assumes there are no forward-looking variables. In that case, the solution is also easy to compute using (34). In RISE, this option is the default initialization scheme. It corresponds to setting:

T^{r_t(0)}_{z,x} = -\left( A^{0}_{r_t} \right)^{-1} A^{-}_{r_t}, for r_t = 1, 2, ..., h

• The third approach recognizes that making a good guess for a solution is difficult and instead tries to make guesses on the products A^{+}_{r_t,r_{t+1}} T^{r_{t+1}}_{z,x}. We proceed as follows: (i) we would like the guesses for each regime to be related, just as they would be in the final solution; (ii) we would like the typical element of A^{+}_{r_t,r_{t+1}} T^{r_{t+1}}_{z,x} to have a norm roughly of the squared order of magnitude of a solvent or final solution. Given those two requirements, we make the assumption that the elements of A^{+}_{r_t,r_{t+1}} T^{r_{t+1}}_{z,x}, r_t, r_{t+1} = 1, 2, ..., h, are drawn from a standard normal distribution scaled by the approximate norm of a solvent squared.31 We approximate the norm of a solvent by

\sigma \equiv \frac{\eta_0 + \sqrt{\eta_0^2 + 4 \eta_+ \eta}}{2 \eta_+}

where \eta_0 \equiv \max_{r_t} \| A^{0}_{r_t} \|, \eta \equiv \max_{r_t} \| A^{-}_{r_t} \| and \eta_+ \equiv \max_{r_t, r_{t+1}} \| A^{+}_{r_t,r_{t+1}} \|. Then, with draws of A^{+}_{r_t,r_{t+1}} T^{r_{t+1}}_{z,x} in hand, we can use (34) to compute a guess for a solution. In RISE, this approach is used when trying to locate alternative solutions. It can formally be expressed as

T^{(0)}_{r_t} = -\left[ A^{0}_{r_t} + \left( \sum_{r_{t+1}=1}^{h} \left( A^{+}_{r_t,r_{t+1}} T^{r_{t+1}}_{z,x} \right)^{guess} \right) \lambda_x \right]^{-1} A^{-}_{r_t}

with \left[ \left( A^{+}_{r_t,r_{t+1}} T^{r_{t+1}}_{z,x} \right)^{guess} \right]_{ij} \sim \sigma^2 N(0, 1).

5 Performance of alternative solution methods

In this section, we investigate the performance of the different algorithms in terms of their ability to find solutions, but also in terms of their efficiency. To that end, we use a set of problems found in the literature.

5.1 Algorithms in contest

The algorithms in the contest are:

1. Our two functional iteration algorithms: MFI full and MFI, with the latter version further exploiting sparsity.
2.
Our Newton algorithms using Kronecker products: MNK full and MNK, with the latter version further exploiting sparsity.

3. Our Newton algorithms without Kronecker products: MN full and MN, with the latter version further exploiting sparsity.

4. Our Matlab version of the Farmer et al. (2011) algorithm: FWZ. It is included for direct comparison, as it has been shown to outperform the FP and SW algorithms described above.32

5.2 Test problems

The details of the test problems considered are given in appendix (C). The models considered are the following:

1. cho2014 EIS is a 6-variable model by Cho (2014), with a switching elasticity of intertemporal substitution.
2. cho2014 MP is a 6-variable model by Cho (2014), with switching monetary policy parameters.
3. frwz2013 nk 1 is a 3-variable new Keynesian model by Foerster et al. (2013).
4. frwz2013 nk 2 is the same model as frwz2013 nk 1 but with a different parameterization.
5. frwz2013 nk hab is a 4-variable new Keynesian model with habit persistence by Foerster et al. (2013).
6. frwz2014 rbc is a 3-variable RBC model by Foerster et al. (2014).
7. fwz2011 1 is a 2-variable model by Farmer et al. (2011).
8. fwz2011 2 is the same model as fwz2011 1 but with a different parameterization.
9. fwz2011 3 is the same model as fwz2011 1 but with a different parameterization.
10. smets wouters is the well-known model by Smets and Wouters (2007). Originally, the model is a constant-parameter model. We turn it into a switching model by creating a regime in which monetary policy is unresponsive to changes in inflation and the output gap. See appendix (C) for further details.

31 Implicit also is the assumption that A^{+}_{r_t,r_{t+1}} and T^{r_{t+1}} are of the same order of magnitude.
32 This algorithm is also included among the solvers in RISE and the coding strictly follows the formulae in the Farmer et al. (2011) paper.
All the tests use the same tolerance level for all the algorithms and are run on a Dell Precision M4700 computer with an Intel Core i7 processor running at 2.8 GHz and 16GB of RAM. Windows 7 (64-bit) is the operating system used. The results included in this paper are computed in the MATLAB 8.4.0.150421 (R2014b) environment.

5.3 Finding a solution: computational efficiency

The purpose of this experiment is to assess both the ability of each algorithm to find a solution and the speed with which the solution is found. To that end, we solved each model 50 times for each solution procedure and averaged the computing time across runs.33 The computation times we report include:

• the time spent in finding a solution,
• the time spent in checking whether the solution is MSS (which is oftentimes the most costly step),
• some further overhead from various functions that call the solution routines in the object-oriented system.

The last two elements are constant across models and therefore should not have an impact on the relative performance of the algorithms.

The results presented in table 1 are all computed using the default initialization scheme of RISE, i.e. the one that assumes there are no forward-looking variables in the model.34 The table reports, for each model, its relative speed performance. This relative performance is computed as the absolute performance divided by the best performance. A relative performance of, say, x means that for that particular model the algorithm is x times slower than the best, whose performance is always 1. The table also reports in column 2 the number of variables in the model under consideration and in column 3 the speed in seconds of the fastest solver.

The computing time tends to increase with the size of the model, as could have been expected, although this pattern is not uniform for small models. We note that the MNK full algorithm tends to be the fastest algorithm on small models.
The FWZ algorithm comes first in two of the contests but, as argued earlier, because it imposes that the coefficient matrix on forward-looking variables depends only on the current regime, it saves some computations and gives the wrong answer when the coefficient matrix on forward-looking variables also depends on the future regime. This is the case in the cho2014 EIS model.

33 Running the same code several times is done simply to smooth out the uncertainty across runs.
34 The results for other initialization schemes have also been computed and are available upon request.

The MNK full algorithm is also faster than its version that exploits the sparsity of the problem, namely MNK. This is not surprising: when the number of variables is small, the overhead incurred in trying to exploit sparsity outweighs the benefits. This is also true for the functional iteration algorithms, as MFI full tends to dominate MFI.

For the same small models, the MNK full algorithm, which uses Kronecker products, is also faster than its counterparts that do not use Kronecker products, namely MN full and MN. Again, there is no surprise here: when the number of variables is small, computing Kronecker products and inverting small matrices is faster than trying to avoid those computations, which is the essence of the MN full and MN algorithms.

The story does not end there, however. As the number of variables increases, the MN algorithm shows better performance. By the time the number of variables gets to 40, as in the Smets and Wouters model, the MN algorithm, which exploits sparsity and avoids the computation of Kronecker products, becomes the fastest. In particular, on the smets wouters model, the MN algorithm is about 7 times faster than its Kronecker version (MNK), which also exploits sparsity, and about 39 times faster than the version of the algorithm that builds Kronecker products and does not exploit sparsity (MNK full), which is the fastest on small models.
For the parameterization under consideration, the FWZ and the functional iteration algorithms never find a solution.35

35 Although the Newton algorithms converge fast, rapid convergence occurs only in the neighborhood of a solution (see for instance Benner and Byers (1998)). It may well be the case that the first Newton step is disastrously large, requiring many iterations to find the region of rapid convergence. This is an issue we have not encountered so far, but it could be addressed by incorporating line searches along a search direction so as to shorten steps that are too long or lengthen steps that are too short, which in both cases would accelerate convergence even further.

                  nvars  best speed    mfi    mnk     mn  mfi full  mnk full  mn full    fwz
cho2014 EIS           6     0.01798  4.971  1.624  2.802     3.365     1.177    2.126      1
cho2014 MP            6     0.01259      1  1.221  1.101      1.11     1.124    1.082    Inf
frwz2013 nk 1         3     0.01486   1.34  1.198  1.459     1.219         1    1.435  1.092
frwz2013 nk 2         3     0.01571  1.309  1.111   1.42     1.218         1    1.407  1.059
frwz2013 nk hab       4     0.01622  1.059  1.106  1.134         1     1.009    1.125  1.146
frwz2014 rbc          3     0.01628  2.676  1.259  1.313      2.15         1    1.284    Inf
fwz2011 1             2     0.00924    Inf  1.051  1.139       Inf     1.015    1.068      1
fwz2011 2             2     0.01043   1.57  1.281  1.374      1.27         1    1.265  1.278
fwz2011 3             2     0.01108  1.877  1.185   1.35     1.443         1    1.279  1.112
smets wouters        40      0.2515    Inf  6.514      1       Inf     39.48    1.364    Inf

Table 1: Relative efficiency of various algorithms using the backward initialization scheme

5.4 Finding many solutions

Our algorithms are not designed to find all possible solutions, since the results they produce depend on the initial condition. Although sampling randomly may lead to different solutions, many different starting points may also lead to the same solution. Despite these shortcomings, we investigated the ability of the algorithms to find many solutions for the test problems considered. Unlike Foerster et al. (2013) and Foerster et al. (2014), our algorithms only produce real solutions, which are the ones of interest.
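The multi-start strategy used in these experiments can be sketched generically: run a solver from many random starting points and keep the distinct solutions found, deduplicated by a norm-distance threshold. The Python sketch below uses a hypothetical scalar quadratic as a stand-in for the MSDSGE problem; `solve` and `random_start` are placeholders for an actual solver and initializer:

```python
import numpy as np

def distinct_solutions(solve, random_start, n_starts=100, tol=1e-6, seed=0):
    """Collect the distinct solutions reached from `n_starts` random starts.
    `solve` returns a solution array or None on failure."""
    rng = np.random.default_rng(seed)
    found = []
    for _ in range(n_starts):
        sol = solve(random_start(rng))
        if sol is None:
            continue
        # keep only solutions not already found (up to the tolerance)
        if all(np.max(np.abs(sol - s)) > tol for s in found):
            found.append(sol)
    return found

# Toy stand-in: T^2 - 3T + 2 = 0 has two solutions, T = 1 and T = 2;
# Newton from random starting points recovers both.
def toy_solve(T0):
    T = T0
    for _ in range(100):
        f, fp = T * T - 3 * T + 2, 2 * T - 3
        if abs(fp) < 1e-12:
            return None                  # derivative vanished: give up
        T -= f / fp
    return np.array([T])

sols = distinct_solutions(toy_solve, lambda rng: rng.uniform(-2.0, 5.0))
assert sorted(round(float(s[0])) for s in sols) == [1, 2]
```

As the text notes, this procedure offers no guarantee of exhaustiveness: different starting points may land in the same basin of attraction, so the number of distinct solutions found can only grow with the number of restarts.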
Starting from 100 different random points, we computed the results given in table 2. For every combination of solution algorithm and test problem, there are two numbers: the first is the total number of solutions found and the second is the number of stable solutions. As the results show, we easily replicate the findings in the literature. In particular,

• for the Cho (2014) problems, we replicate the finding that the model with switching elasticity of intertemporal substitution is determinate. Indeed, all of the algorithms tested find at least one solution and at most one of the solutions found is stable in the MSS sense. We do not, however, find that the solution found for the model with switching monetary policy is stable.

• For the Farmer et al. (2011) examples, we replicate all the results. In the first parameterization, there is a unique solution and the solution is stable; this is what all the Newton algorithms find. The functional iteration algorithms are not able to find that solution. In the second parameterization, our Newton algorithms return 2 stable solutions, and in the third parameterization they return 4 stable solutions, just as found by Farmer et al. (2011).

• For the Foerster et al. (2013) examples: in the new Keynesian model without habits, our Newton algorithms return 3 real solutions, out of which only one is stable. In the second parameterization, they return 3 real solutions, but this time two of them are stable. For the new Keynesian model with habits, our Newton algorithms return 6 real solutions, out of which one is stable.

• For the Foerster et al. (2014) RBC model, there is a unique stable solution.

• The Smets and Wouters (2007) model shows that there are many real solutions and that many of those solutions are stable. The different algorithms return different solutions since they are all initialized randomly in different places.
This suggests that we most likely have not found all possible solutions and that, by increasing the number of simulations, we could have found many more.

                  nvars   mfi   mnk    mn  mfi full  mnk full  mn full   fwz
cho2014 EIS           6   1 1   2 1   2 1      1 1       2 1      2 1   2 1
cho2014 MP            6   1 0   1 0   1 0      1 0       1 0      1 0   0 0
frwz2013 nk 1         3   1 1   3 1   3 1      1 1       3 1      3 1   3 1
frwz2013 nk 2         3   1 1   3 2   3 2      1 1       3 2      3 2   3 2
frwz2013 nk hab       4   1 1   6 1   6 1      1 1       6 1      6 1   6 1
frwz2014 rbc          3   1 1   1 1   1 1      1 1       2 1      2 1   0 0
fwz2011 1             2   0 0   1 1   1 1      0 0       1 1      1 1   1 1
fwz2011 2             2   1 1   2 2   2 2      1 1       2 2      2 2   1 1
fwz2011 3             2   1 1   4 4   4 4      1 1       4 4      4 4   3 3
smets wouters        40   0 0  11 2  11 4      0 0      16 2     10 3   0 0

Table 2: Solutions found after 100 random starting values and number of stable solutions

5.5 Higher orders

Our algorithms do not just solve higher-order approximations of switching DSGE models; they solve higher-order approximations of constant-parameter DSGE models as a special case. We have successfully solved a second-order approximation of a version of NEMO, the DSGE model used by Norges Bank. This model includes upwards of 272 equations in its nonlinear form. We have also tested our algorithms on various models against some of the existing and freely available algorithms solving for higher-order approximations. Those algorithms include dynare, dynare++ and codes by Binning (2013b) and Binning (2013a). In all cases, we find the same results.

Finally, we have run the switching RBC model by Foerster et al. (2014) in RISE, and the results for a third-order approximation are presented below. Because in Foerster et al. (2014) the steady state is unique, we simply had to pass this information to the software, without any further need to partition the parameters. RISE prints only the unique combinations of cross products and uses the symbol @sig to denote the perturbation term. The results produced by RISE closely replicate the second-order approximation presented by Foerster et al. (2014).36
MODEL SOLUTION
SOLVER :: mfi

Regime 1 : a = 1 & const = 1

Endo Name                      C            K            Z
steady state            2.082588    22.150375     1.007058
K{-1}                   0.040564     0.969201     0.000000
Z{-1}                   0.126481    -2.140611     0.100000
@sig                   -0.009066    -0.362957     0.018459
EPS                     0.009171    -0.155212     0.007251
K{-1},K{-1}            -0.000461    -0.000167     0.000000
K{-1},Z{-1}             0.001098    -0.047836     0.000000
K{-1},@sig             -0.000462    -0.008170     0.000000
K{-1},EPS               0.000080    -0.003468     0.000000
Z{-1},Z{-1}            -0.058668     1.168197    -0.044685
Z{-1},@sig              0.000323     0.018293     0.000916
Z{-1},EPS               0.000299     0.007642     0.000360
@sig,@sig              -0.024022     0.026995     0.000169
@sig,EPS                0.000023     0.001326     0.000066
EPS,EPS                 0.000022     0.000554     0.000026
K{-1},K{-1},K{-1}       0.000011     0.000005     0.000000
K{-1},K{-1},Z{-1}      -0.000009     0.000001     0.000000
K{-1},K{-1},@sig       -0.000001    -0.000000     0.000000
K{-1},K{-1},EPS        -0.000001     0.000000     0.000000
K{-1},Z{-1},Z{-1}      -0.000332     0.017407     0.000000

36 The second-order terms of the RISE output should be multiplied by 2 in order to compare them with the output presented in Foerster et al. (2014).
K{-1},Z{-1},@sig        0.000005     0.000269     0.000000
K{-1},Z{-1},EPS         0.000002     0.000115     0.000000
K{-1},@sig,@sig        -0.000185     0.000231     0.000000
K{-1},@sig,EPS          0.000000     0.000020     0.000000
K{-1},EPS,EPS           0.000000     0.000008     0.000000
Z{-1},Z{-1},Z{-1}       0.037330    -0.811474     0.028102
Z{-1},Z{-1},@sig       -0.000250    -0.006515    -0.000273
Z{-1},Z{-1},EPS        -0.000097    -0.002777    -0.000107
Z{-1},@sig,@sig        -0.000418    -0.000480     0.000006
Z{-1},@sig,EPS         -0.000001    -0.000043     0.000002
Z{-1},EPS,EPS           0.000001    -0.000018     0.000001
@sig,@sig,@sig         -0.002111     0.001640     0.000001
@sig,@sig,EPS          -0.000028    -0.000038     0.000000
@sig,EPS,EPS           -0.000000    -0.000003     0.000000
EPS,EPS,EPS             0.000000    -0.000001     0.000000

Regime 2 : a = 2 & const = 1

Endo Name                      C            K            Z
steady state            2.082588    22.150375     1.007058
K{-1}                   0.040564     0.969201     0.000000
@sig                   -0.100434     0.926307    -0.041021
EPS                     0.026867    -0.464994     0.021752
K{-1},K{-1}            -0.000461    -0.000167     0.000000
K{-1},@sig             -0.001160     0.020327     0.000000
K{-1},EPS               0.000233    -0.010399     0.000000
@sig,@sig              -0.023223     0.043449     0.000835
@sig,EPS               -0.000525    -0.009756    -0.000443
EPS,EPS                 0.000187     0.004982     0.000235
K{-1},K{-1},K{-1}       0.000011     0.000005     0.000000
K{-1},K{-1},@sig        0.000007    -0.000003     0.000000
K{-1},K{-1},EPS        -0.000002     0.000000     0.000000
K{-1},@sig,@sig        -0.000188     0.000476     0.000000
K{-1},@sig,EPS         -0.000002    -0.000146     0.000000
K{-1},EPS,EPS           0.000001     0.000075     0.000000
@sig,@sig,@sig         -0.001411     0.002643    -0.000011
@sig,@sig,EPS          -0.000077    -0.000228     0.000006
@sig,EPS,EPS           -0.000002     0.000069    -0.000003
EPS,EPS,EPS             0.000001    -0.000036     0.000002

Finally, we can use the same Foerster et al. (2014) model to illustrate, in a third-order approximation, the effect of a technology shock that is announced 4 periods ahead. We compare two economies. In the first economy, agents fully anticipate the shock and therefore internalize its effect already in the current period. In the second economy, agents become aware of the shock only when it materializes, 4 periods from now.
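The mechanism at work in this experiment — anticipated shocks move forward-looking variables on announcement, unanticipated ones only on impact — can be illustrated outside RISE with a deliberately simple present-value sketch (all numbers below are made up for illustration):

```python
import numpy as np

beta, n_ahead, T = 0.96, 4, 12
d = np.zeros(T)
d[n_ahead] = 1.0  # a unit innovation that materializes in period 4

def pv(t, info):
    """Present value sum_{j>=0} beta^j E_t d_{t+j} under two information sets:
    'full' -> the future shock is announced and fully anticipated at t = 0;
    'none' -> agents learn about the shock only when it materializes."""
    exp_d = d if info == "full" else np.where(np.arange(T) <= t, d, 0.0)
    return sum(beta ** (s - t) * exp_d[s] for s in range(t, T))

anticipated = [pv(t, "full") for t in range(T)]
unanticipated = [pv(t, "none") for t in range(T)]

# Under anticipation the forward-looking variable jumps already at t = 0
# (beta^4, roughly 0.85); without anticipation nothing moves until period 4.
print(round(anticipated[0], 4), round(unanticipated[0], 4))
```

The same logic drives the figure discussed next: in the anticipated scenario, consumption responds on the announcement date rather than on the impact date.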
As figure 1 illustrates, consumption already increases today while capital falls in the "anticipated" scenario.

Figure 1: Third-order perturbation impulse responses to a TFP shock in the FRWZ2014 rbc model

6 Conclusion

In this paper we have proposed new solution algorithms for solving switching-parameter DSGE models. The algorithms developed are useful tools for analyzing economic data subject to breaks and shifts in a rational expectations environment, where agents, while forming their expectations, explicitly take into account the probability of a regime shift. In the process we have derived higher-order perturbations of switching DSGE models in which the probability of switching can be endogenous and in which anticipated future events can affect the current behavior of economic agents.

Despite the many benefits of departing from the assumption of constant parameters, the techniques and ideas of MSDSGE modeling have yet to be adopted by macroeconomists as widely as the constant-parameter DSGE models popularized through Dynare. Two main reasons could potentially explain this. On the one hand, such models are a lot more difficult to solve and more computationally intensive than their constant-parameter counterparts; on the other hand, there are not many flexible and easy-to-use tools that would allow economists unable or unwilling to write complex code to take full advantage of these techniques and address current policy questions. By providing both efficient solution algorithms and an implementation of those algorithms in a toolbox, we hope to have alleviated these problems.

References

Adjemian, S., H. Bastani, M. Juillard, F. Karame, F. Mihoubi, G. Perendia, J. Pfeifer, M. Ratto, and S. Villemot (2011). Dynare: Reference manual, version 4. Dynare Working Papers 1, CEPREMAP.

Benner, P. and R. Byers (1998). An exact line search method for solving generalized continuous-time algebraic Riccati equations.
IEEE Transactions on Automatic Control 43(1), 101–107.

Bernanke, B. S., M. Gertler, and S. Gilchrist (1999). The financial accelerator in a quantitative business cycle framework. In J. B. Taylor and M. Woodford (Eds.), Handbook of Macroeconomics (1 ed.), Volume 1, Chapter 21, pp. 1341–1393. Elsevier.

Bi, H. and N. Traum (2013). Estimating fiscal limits: The case of Greece. Journal of Applied Econometrics (forthcoming).

Binning, A. J. (2013a). Solving second and third-order approximations to DSGE models: A recursive Sylvester equation solution. Working Paper 2013-18, Norges Bank.

Binning, A. J. (2013b). Third-order approximation of dynamic models without the use of tensors. Working Paper 2013-13, Norges Bank.

Buchberger, B. (1965). Ein Algorithmus zum Auffinden der Basiselemente des Restklassenringes nach einem nulldimensionalen Polynomideal. Ph.D. thesis, University of Innsbruck.

Buchberger, B. (2006, March). An Algorithm for Finding the Basis Elements in the Residue Class Ring Modulo a Zero Dimensional Polynomial Ideal. Ph.D. thesis, RISC (Research Institute for Symbolic Computation), Johannes Kepler University Linz.

Cho, S. (2014, February). Characterizing Markov-switching rational expectations models. Working paper, Yonsei University.

Cogley, T., T. J. Sargent, and P. Surico (2012, March). The return of the Gibson paradox. Working paper, New York University.

Costa, O. L. D. V., M. D. Fragoso, and R. P. Marques (2005). Discrete-Time Markov Jump Linear Systems. Springer.

Davig, T. and E. M. Leeper (2007). Fluctuating macro policies and the fiscal theory. In NBER Macroeconomics Annual 2007, Volume 21, pp. 247–316. National Bureau of Economic Research.

Davig, T., E. M. Leeper, and T. B. Walker (2011). Inflation and the fiscal limit. European Economic Review 55(1), 31–47.

Farmer, R. F., D. F. Waggoner, and T. Zha (2008). Minimal state variable solutions to Markov switching rational expectations models. Working Paper 2008-23, Federal Reserve Bank of Atlanta.

Farmer, R.
F., D. F. Waggoner, and T. Zha (2011). Minimal state variable solutions to Markov switching rational expectations models. Journal of Economic Dynamics and Control 35(12), 2150–2166.

Foerster, A., J. Rubio-Ramirez, D. Waggoner, and T. Zha (2013). Perturbation methods for Markov-switching models. Working Paper 2013-1, Federal Reserve Bank of Atlanta.

Foerster, A., J. Rubio-Ramirez, D. Waggoner, and T. Zha (2014). Perturbation methods for Markov-switching models. Working Paper 20390, National Bureau of Economic Research.

Freund, R. W. (1993). A transpose-free quasi-minimal residual algorithm for non-Hermitian linear systems. SIAM Journal on Scientific Computing 14(2), 470–482.

Ghorbal, K., A. Sogokon, and A. Platzer (2014). Invariance of conjunctions of polynomial equalities for algebraic differential equations. Lecture Notes in Computer Science 8723, 151–167.

Gupta, V., R. Murray, and B. Hassibi (2003). On the control of jump linear Markov systems with Markov state estimation. In Proceedings of the 2003 American Control Conference, pp. 2893–2898.

Juillard, M. and J. Maih (2010). Estimating DSGE models with observed real-time expectation data. Technical report, Norges Bank.

Justiniano, A. and G. E. Primiceri (2008). The time-varying volatility of macroeconomic fluctuations. American Economic Review 98(3), 604–641.

Kamenik, O. (2005, February). Solving SDGE models: A new algorithm for the Sylvester equation. Computational Economics 25(1), 167–187.

Klein, P. (2000). Using the generalized Schur form to solve a multivariate linear rational expectations model. Journal of Economic Dynamics and Control 24, 1405–1423.

Lanczos, C. (1952). Solution of systems of linear equations by minimized iterations. Journal of Research of the National Bureau of Standards 49, 33–53.

Levintal, O. (2014). Fifth order perturbation solution to DSGE models. Working paper, Interdisciplinary Center Herzliya (IDC), School of Economics.

Leykin, A. (2004).
On parallel computation of Gröbner bases. In ICPP Workshops, pp. 160–164.

Lubik, T. A. and F. Schorfheide (2004, March). Testing for indeterminacy: An application to U.S. monetary policy. American Economic Review 94(1), 190–217.

Maih, J. (2010). Conditional forecasts in DSGE models. Working Paper 2010-07, Norges Bank.

Mayr, E. W. and A. R. Meyer (1982). The complexity of the word problems for commutative semigroups and polynomial ideals. Advances in Mathematics 46(3), 305–329.

Richter, A. W., N. A. Throckmorton, and T. B. Walker (2014, December). Accuracy, speed and robustness of policy function iteration. Computational Economics 44(4), 445–476.

Schmitt-Grohe, S. and M. Uribe (2004). Solving dynamic general equilibrium models using a second-order approximation to the policy function. Journal of Economic Dynamics and Control 28, 755–775.

Sims, C. A. and T. Zha (2006). Were there regime switches in US monetary policy? American Economic Review 96(1), 54–81.

Smets, F. and R. Wouters (2007). Shocks and frictions in US business cycles: A Bayesian DSGE approach. American Economic Review 97(3), 586–606.

Stock, J. H. and M. W. Watson (2003). Has the business cycle changed? Evidence and explanations. FRB Kansas City symposium, Jackson Hole.

Svensson, L. E. O. and N. Williams (2007). Monetary policy with model uncertainty: Distribution forecast targeting. CEPR Discussion Papers 6331, C.E.P.R.

Svensson, L. E. O. and N. Williams (2009). Optimal monetary policy under uncertainty in DSGE models: A Markov jump-linear-quadratic approach. In K. Schmidt-Hebbel, C. E. Walsh, and N. Loayza (Eds.), Central Banking, Analysis, and Economic Policies (1 ed.), Volume 13 of Monetary Policy under Uncertainty and Learning, pp. 77–114. Central Bank of Chile.

Tadonki, C. and B. Philippe (1999). Parallel multiplication of a vector by a Kronecker product of matrices. Parallel Distributed Computing Practices 2(4).

Van Loan, C. F. (2000).
The ubiquitous Kronecker product. Journal of Computational and Applied Mathematics 123(1–2), 85–100.

A Expressions for various expectations terms

A.1 First order

The definitions of $a_z^0$ and $a_z^1$ appearing in equation (10) are as follows:
$$a_z^0 \equiv \begin{pmatrix} \lambda_{bf} T_z^{r_{t+1}} h_z^{r_t} \\ T_z^{r_t} \\ m_p \\ \hat{m} \\ m_{\varepsilon,0}\,\hat{\theta}_{r_{t+1}} \\ m_\sigma \end{pmatrix}, \qquad a_z^1 \equiv \begin{pmatrix} \lambda_{bf} T_z^{r_{t+1}} \\ 0_{(n_s+2n_p+2n_b+n_f+n_\varepsilon+n_\theta)\times n_z} \end{pmatrix}$$

We have
$$E v_z = a_z^0 \tag{35}$$

A.2 Second order

$$h_{zz}^{r_t} = \begin{pmatrix} \lambda_x T_{zz}^{r_t} \\ 0_{(1+(k+1)n_\varepsilon)\times n_z^2} \end{pmatrix} \tag{36}$$

The definitions of $a_{zz}^0$ and $a_{zz}^1$ appearing in equation (19) are as follows:
$$a_{zz}^0 \equiv \begin{pmatrix} \lambda_{bf} T_{z,x}^{r_{t+1}}\lambda_x T_{zz}^{r_t} + \lambda_{bf} T_{zz}^{r_{t+1}}\left(h_z^{r_t}\right)^{\otimes 2} \\ T_{zz}^{r_t} \\ 0_{(n_x+n_\varepsilon+n_\theta)\times n_z^2} \end{pmatrix}, \qquad a_{zz}^1 \equiv \begin{pmatrix} \lambda_{bf} T_{zz}^{r_{t+1}} \\ 0 \\ 0 \end{pmatrix}$$

It follows that
$$E v_{zz} \equiv a_{zz}^0 + a_{zz}^1 E u^{\otimes 2} \tag{37}$$
and
$$E v_z^{\otimes 2} = \left(a_z^0\right)^{\otimes 2} + \left(a_z^1\right)^{\otimes 2} E u^{\otimes 2} \tag{38}$$

The definition of $u$ in (4) implies that37
$$E\left(u^{\otimes 2}\right) = \begin{pmatrix} 0_{(n_p+n_b+1+kn_\varepsilon)^2\times n_z^2} \\ \tilde{I}_{n_\varepsilon}\left(m_\sigma\otimes m_\sigma\right) \end{pmatrix}$$
In general, we will have $E\left(u^{\otimes k}\right) = \begin{pmatrix} 0_{(n_p+n_b+1+kn_\varepsilon)^k\times n_z^k} \\ \mathrm{vec}\left(M_k\right) m_\sigma^{\otimes k} \end{pmatrix}$, where $M_k$ is the tensor of the $k$-order moments of $\varepsilon_t$.

37 This expression should be cheap to compute since $m_\sigma$ is a sparse vector.

A.3 Third order

The definitions of $a_{zzz}^0$ and $a_{zzz}^1$ appearing in equation (21) are as follows:
$$a_{zzz}^0 \equiv \begin{pmatrix} \lambda_{bf} T_{zzz}^{r_{t+1}}\left(h_z^{r_t}\right)^{\otimes 3} + \omega_1\!\left(\lambda_{bf} T_{zz}^{r_{t+1}}\left(h_z^{r_t}\otimes h_{zz}^{r_t}\right)\right) + \lambda_{bf} T_z^{r_{t+1}} h_{zzz}^{r_t} \\ T_{zzz}^{r_t} \\ 0_{(n_x+n_\varepsilon+n_\theta)\times n_z^3} \end{pmatrix}, \qquad a_{zzz}^1 \equiv \begin{pmatrix} \lambda_{bf} T_{zzz}^{r_{t+1}} \\ 0_{(n_T+n_x+n_\varepsilon+n_\theta)\times n_z^3} \end{pmatrix} \tag{39}$$
where
$$h_{zzz}^{r_t} = \begin{pmatrix} \lambda_x T_{zzz}^{r_t} \\ 0_{(1+(k+1)n_\varepsilon)\times n_z^3} \end{pmatrix} \tag{40}$$
and
$$\frac{\partial^3 T^{r_{t+1}}\left(h^{r_t}(z)+uz\right)}{\partial z^3} = T_{zzz}^{r_{t+1}}\left(h_z^{r_t}\right)^{\otimes 3} + \omega_1\!\left(T_{zz}^{r_{t+1}}\left(h_z^{r_t}\otimes h_{zz}^{r_t}\right)\right) + T_{z,x}^{r_{t+1}}\lambda_x T_{zzz}^{r_t} + T_{zzz}^{r_{t+1}} u^{\otimes 3} + \omega_1\!\left(T_{zz}^{r_{t+1}}\left(u\otimes h_{zz}^{r_t}\right)\right) + T_{zzz}^{r_{t+1}}\left[P\left(\left(h_z^{r_t}\right)^{\otimes 2}\otimes u\right) + P\left(h_z^{r_t}\otimes u^{\otimes 2}\right)\right]$$

We have
$$E v_{zzz} = a_{zzz}^0 + a_{zzz}^1 P\left(h_z^{r_t}\otimes E u^{\otimes 2}\right) + a_{zzz}^1 E u^{\otimes 3} \tag{41}$$

We also need $E\left(v_z^{\otimes 3}\right)$ and $E\left(v_z\otimes v_{zz}\right)$.
Using (19) and (10), those quantities are computed as
$$E v_z^{\otimes 3} = \left(a_z^0\right)^{\otimes 3} + E\left[a_z^0\otimes\left(a_z^1 u\right)^{\otimes 2}\right] + E\left[\left(a_z^1 u\right)^{\otimes 2}\otimes a_z^0\right] + E\left[\left(a_z^1 u\right)\otimes a_z^0\otimes\left(a_z^1 u\right)\right] + E\left(a_z^1 u\right)^{\otimes 3} \tag{42}$$
$$E\left(v_z\otimes v_{zz}\right) = a_z^0\otimes a_{zz}^0 + \left(a_z^1\otimes a_{zz}^1\right)\left(E u^{\otimes 2}\otimes h_z^{r_t} + E\left(u\otimes h_z^{r_t}\otimes u\right)\right) + a_z^0\otimes a_{zz}^1 E u^{\otimes 2} + \left(a_z^1\otimes a_{zz}^1\right) E u^{\otimes 3} \tag{43}$$

B Proof of local convergence of functional iteration

Before proving the theorem, we first present a perturbation lemma that will facilitate the subsequent derivations.

Lemma 3 (Perturbation Lemma) Let $\Xi$ be a non-singular matrix. If $\Omega$ is such that $\left\|\Xi^{-1}\right\|\left\|\Omega\right\| < 1$, then
$$\left\|\left(\Xi+\Omega\right)^{-1}\right\| \le \frac{\left\|\Xi^{-1}\right\|}{1-\left\|\Xi^{-1}\right\|\left\|\Omega\right\|}$$

Proof. We have
$$\left(\Xi+\Omega\right)^{-1} = \left[\Xi\left(I+\Xi^{-1}\Omega\right)\right]^{-1} = \sum_{i=0}^{\infty}\left(-\Xi^{-1}\Omega\right)^i\,\Xi^{-1}$$
which implies that
$$\left\|\left(\Xi+\Omega\right)^{-1}\right\| \le \sum_{i=0}^{\infty}\left\|\left(-\Xi^{-1}\Omega\right)^i\right\|\left\|\Xi^{-1}\right\| \le \sum_{i=0}^{\infty}\left\|\Xi^{-1}\Omega\right\|^i\left\|\Xi^{-1}\right\| = \frac{\left\|\Xi^{-1}\right\|}{1-\left\|\Xi^{-1}\Omega\right\|} \le \frac{\left\|\Xi^{-1}\right\|}{1-\left\|\Xi^{-1}\right\|\left\|\Omega\right\|}$$

Now we can prove the theorem. Define
$$\Xi_{r_t} \equiv A_{r_t}^0 + \sum_{r_{t+1}=1}^{h} A_{r_t,r_{t+1}}^{+} T_{r_{t+1}}^{*}$$
and
$$\Omega_{r_t}^{(k-1)} \equiv \sum_{r_{t+1}=1}^{h} A_{r_t,r_{t+1}}^{+}\left(T_{r_{t+1}}^{(k-1)} - T_{r_{t+1}}^{*}\right)$$
which implies
$$\left\|\Omega_{r_t}^{(k-1)}\right\| = \left\|\sum_{r_{t+1}=1}^{h} A_{r_t,r_{t+1}}^{+}\left(T_{r_{t+1}}^{(k-1)} - T_{r_{t+1}}^{*}\right)\right\| \le a^{+}\sum_{r_{t+1}=1}^{h} p_{r_t,r_{t+1}}\left\|T_{r_{t+1}}^{(k-1)} - T_{r_{t+1}}^{*}\right\| \tag{44}$$

Also, we can write
$$T_{r_t}^{(k)} - T_{r_t}^{*} = \Xi_{r_t}^{-1} A_{r_t}^{-} - \left(A_{r_t}^0 + \sum_{r_{t+1}=1}^{h} A_{r_t,r_{t+1}}^{+} T_{r_{t+1}}^{(k-1)}\right)^{-1} A_{r_t}^{-} = -\left(A_{r_t}^0 + \sum_{r_{t+1}=1}^{h} A_{r_t,r_{t+1}}^{+} T_{r_{t+1}}^{(k-1)}\right)^{-1}\Omega_{r_t}^{(k-1)} T_{r_t}^{*} = -\left[\Omega_{r_t}^{(k-1)} + \Xi_{r_t}\right]^{-1}\Omega_{r_t}^{(k-1)} T_{r_t}^{*}$$
so that
$$\left\|T_{r_t}^{(k)} - T_{r_t}^{*}\right\| \le \left\|\left(\Omega_{r_t}^{(k-1)} + \Xi_{r_t}\right)^{-1}\right\|\left\|\Omega_{r_t}^{(k-1)}\right\|\left\|T_{r_t}^{*}\right\| \le \theta a^{+}\left\|\left(\Omega_{r_t}^{(k-1)} + \Xi_{r_t}\right)^{-1}\right\|\sum_{r_{t+1}=1}^{h}\left\|T_{r_{t+1}}^{(k-1)} - T_{r_{t+1}}^{*}\right\| \tag{45}$$

We can now use induction in the rest of the proof. Suppose $\left\|T_{r_t}^{(0)} - T_{r_t}^{*}\right\| \le \delta$, $r_t = 1,2,\dots,h$. Then (44) implies $\left\|\Omega_{r_t}^{(0)}\right\| \le a^{+}\delta$. Hence $\left\|\Xi_{r_t}^{-1}\right\|\left\|\Omega_{r_t}^{(0)}\right\| \le \tau a^{+}\delta$. Note that the requirement that $\tau a^{+}\left(\theta+\delta\right) < 1$ automatically implies that $\tau a^{+}\delta < 1$.
Therefore we can use the perturbation lemma to establish that
$$\left\|\left(\Omega_{r_t}^{(0)} + \Xi_{r_t}\right)^{-1}\right\| \le \frac{\left\|\Xi_{r_t}^{-1}\right\|}{1-\left\|\Xi_{r_t}^{-1}\right\|\left\|\Omega_{r_t}^{(0)}\right\|} \le \frac{\tau}{1-\tau a^{+}\delta}.$$
Plugging this result into (45) implies that
$$\left\|T_{r_t}^{(1)} - T_{r_t}^{*}\right\| \le \frac{\theta a^{+}\tau}{1-\tau a^{+}\delta}\sum_{r_{t+1}=1}^{h}\left\|T_{r_{t+1}}^{(0)} - T_{r_{t+1}}^{*}\right\|,$$
essentially defining $\varphi \equiv \frac{\theta a^{+}\tau}{1-\tau a^{+}\delta}$ and saying that the proposition holds for $k=1$.

Now suppose that the proposition holds for some arbitrary order $k$, that is,
$$\left\|T_{r_t}^{(k)} - T_{r_t}^{*}\right\| \le \varphi\sum_{r_{t+1}=1}^{h}\left\|T_{r_{t+1}}^{(k-1)} - T_{r_{t+1}}^{*}\right\|.$$
We now have to show that it also holds for order $k+1$. Since by assumption $\left\|T_{r_{t+1}}^{(k-1)} - T_{r_{t+1}}^{*}\right\| \le \delta$ for $r_t = 1,2,\dots,h$, we have $\left\|T_{r_t}^{(k)} - T_{r_t}^{*}\right\| \le \varphi\sum_{r_{t+1}=1}^{h}\left\|T_{r_{t+1}}^{(k-1)} - T_{r_{t+1}}^{*}\right\| \le \varphi\delta \le \delta$. (44) implies $\left\|\Omega_{r_t}^{(k)}\right\| \le a^{+}\delta$. Then, by the perturbation lemma, it follows that $\left\|\left(\Omega_{r_t}^{(k)} + \Xi_{r_t}\right)^{-1}\right\| \le \frac{\tau}{1-\tau a^{+}\delta}$. This implies by (45) that
$$\left\|T_{r_t}^{(k+1)} - T_{r_t}^{*}\right\| \le \theta a^{+}\left\|\left(\Omega_{r_t}^{(k)} + \Xi_{r_t}\right)^{-1}\right\|\sum_{r_{t+1}=1}^{h}\left\|T_{r_{t+1}}^{(k)} - T_{r_{t+1}}^{*}\right\| \le \frac{\theta a^{+}\tau}{1-\tau a^{+}\delta}\sum_{r_{t+1}=1}^{h}\left\|T_{r_{t+1}}^{(k)} - T_{r_{t+1}}^{*}\right\|.$$
And so the proposition also holds at order $k+1$.

C Models

C.1 Cho 2014: Markov switching elasticity of intertemporal substitution

C.1.1 Model equations

$$\pi_t - \kappa y_t = \beta\pi_{t+1} + z_{S,t}$$
$$y_t + \frac{1}{\sigma(s_t)} i_t = \frac{\sigma(s_{t+1})}{\sigma(s_t)} y_{t+1} + \frac{1}{\sigma(s_t)}\pi_{t+1} + \frac{1}{\sigma(s_t)} z_{D,t}$$
$$-(1-\rho)\phi_{\pi}\pi_t + i_t = \rho i_{t-1} + z_{MP,t}$$
$$z_{S,t} = \rho_s z_{S,t-1} + \epsilon_t^{S}, \qquad \epsilon_t^{S}\sim N(0,1)$$
$$z_{D,t} = \rho_d z_{D,t-1} + \epsilon_t^{D}, \qquad \epsilon_t^{D}\sim N(0,1)$$
$$z_{MP,t} = \rho_{mp} z_{MP,t-1} + \epsilon_t^{MP}, \qquad \epsilon_t^{MP}\sim N(0,1)$$

C.1.2 Parameterization

ρ     φπ   σ(1)  σ(2)  κ      β     ρs    ρd    ρmp  p1,1  p2,2
0.95  1.5  1     5     0.132  0.99  0.95  0.95  0    0.95  0.875

C.2 Cho 2014: Regime switching monetary policy

C.2.1 Model equations

$$\pi_t - \kappa y_t = \beta\pi_{t+1} + z_{S,t}$$
$$y_t + \frac{1}{\sigma} i_t = y_{t+1} + \frac{1}{\sigma}\pi_{t+1} + z_{D,t}$$
$$-(1-\rho)\phi_{\pi}\pi_t + i_t = \rho i_{t-1} + z_{MP,t}$$
$$z_{S,t} = \rho_s z_{S,t-1} + \epsilon_t^{S}, \qquad \epsilon_t^{S}\sim N(0,1)$$
$$z_{D,t} = \rho_d z_{D,t-1} + \epsilon_t^{D}, \qquad \epsilon_t^{D}\sim N(0,1)$$
$$z_{MP,t} = \rho_{mp} z_{MP,t-1} + \epsilon_t^{MP}, \qquad \epsilon_t^{MP}\sim N(0,1)$$

C.2.2 Parameterization

ρ     φπ(1)  φπ(2)  σ  κ      β     ρs  ρd  ρmp  p1,1  p2,2
0.95  0.9    1.5    1  0.132  0.99  0   0   0    0.85  0.95

C.3 Farmer, Waggoner and Zha (2011)

C.3.1 Model equations

$$\phi\pi_t = \pi_{t+1} + \delta\pi_{t-1} + \beta r_t$$
$$r_t = \rho r_{t-1} + \varepsilon_t, \qquad \varepsilon_t\sim N(0,1)$$

C.3.2 Parameterization
                    δ(1)  δ(2)  β(1)  β(2)  ρ(1)  ρ(2)  φ(1)  φ(2)  p1,1  p2,2
parameterization 1  0     0     1     1     0.9   0.9   0.5   0.8   0.8   0.9
parameterization 2  -0.7  0.4   1     1     0     0     0.5   0.8   0     0.64
parameterization 3  -0.7  -0.2  1     1     0     0     0.2   0.4   0.9   0.8

C.4 Foerster, Rubio-Ramirez, Waggoner and Zha (2014): RBC model

C.4.1 Model equations

$$c_t^{\upsilon-1} = \beta z_t^{\upsilon-1} c_{t+1}^{\upsilon-1}\left(\alpha z_{t+1}^{1-\alpha} k_t^{\alpha-1} + 1 - \delta\right)$$
$$c_t + z_t k_t = z_t^{1-\alpha} k_{t-1}^{\alpha} + (1-\delta)k_{t-1}$$
$$\log(z_t) = \left(1-\rho(s_t)\right)\mu(s_t) + \rho(s_t)\log(z_{t-1}) + \sigma(s_t)\varepsilon_t, \qquad \varepsilon_t\sim N(0,1)$$

C.4.2 Steady state

$$z_t = \exp(\mu)$$
$$k_t = \left(\frac{\frac{1}{\beta z_t^{\upsilon-1}} - 1 + \delta}{\alpha z_t^{1-\alpha}}\right)^{\frac{1}{\alpha-1}}$$
$$c_t = z_t^{1-\alpha} k_t^{\alpha} + (1-\delta)k_t - z_t k_t$$

C.4.3 Parameterization

α     β       υ   δ      µ(1)    µ(2)     ρ(1)  ρ(2)  σ(1)    σ(2)    p1,1  p2,2
0.33  0.9976  -1  0.025  0.0274  -0.0337  0.1   0     0.0072  0.0216  0.75  0.5

C.5 Foerster, Rubio-Ramirez, Waggoner and Zha (2013): NK model

C.5.1 Model equations

$$1 - \beta E_t\left[\frac{1-\frac{\kappa}{2}\left(\pi_t-1\right)^2}{1-\frac{\kappa}{2}\left(\pi_{t+1}-1\right)^2}\,\frac{Y_t}{Y_{t+1}}\,\frac{R_t}{\exp(\mu_{t+1})\,\pi_{t+1}}\right] = 0$$
$$(1-\eta) + \eta\left[1-\frac{\kappa}{2}\left(\pi_t-1\right)^2\right]Y_t - \kappa\left(\pi_t-1\right)\pi_t + \beta\kappa E_t\left[\frac{1-\frac{\kappa}{2}\left(\pi_t-1\right)^2}{1-\frac{\kappa}{2}\left(\pi_{t+1}-1\right)^2}\left(\pi_{t+1}-1\right)\pi_{t+1}\right] = 0$$
$$\left(\frac{R_{t-1}}{R_{ss}}\right)^{\rho}\pi_t^{(1-\rho)\psi_t}\exp\left(\sigma\varepsilon_t\right) - \frac{R_t}{R_{ss}} = 0, \qquad \varepsilon_t\sim N(0,1)$$

C.5.2 Steady state solution

$$\pi_t = 1, \qquad y_t = \frac{\eta-1}{\eta}, \qquad R_t = \frac{\exp(\mu)\,\pi_t}{\beta}$$

C.5.3 Parameterization

β       κ    η   ρ    σr      p1,1  p2,2  µ(1)    µ(2)    ψ(1)  ψ(2)
0.9976  161  10  0.8  0.0025  0.9   0.9   0.0075  0.0025  3.1   0.9

C.6 Foerster, Rubio-Ramirez, Waggoner and Zha (2013): NK model with habits

C.6.1 Model equations

$$\frac{1}{C_t - \varphi\exp(-\bar{\mu})\,C_{t-1}} - \beta\,\frac{\varphi}{X_{t+1}\exp(\mu_{t+1}) - \varphi C_t} - \lambda_t = 0$$
$$\beta\lambda_{t+1}\,\frac{R_{ss}\,\pi_t^{\psi_t}\exp\left(\sigma\varepsilon_t\right)}{\exp(\mu_{t+1})\,\pi_{t+1}} - \lambda_t = 0, \qquad \varepsilon_t\sim N(0,1)$$
$$(1-\eta) + \eta\left[1-\frac{\kappa}{2}\left(\pi_t-1\right)^2\right]X_t - \kappa\left(\pi_t-1\right)\pi_t + \beta\kappa\,\frac{\lambda_{t+1}}{\lambda_t}\,\frac{1-\frac{\kappa}{2}\left(\pi_t-1\right)^2}{1-\frac{\kappa}{2}\left(\pi_{t+1}-1\right)^2}\left(\pi_{t+1}-1\right)\pi_{t+1} = 0$$
$$X_t - \frac{C_t}{1-\frac{\kappa}{2}\left(\pi_t-1\right)^2} = 0$$

C.6.2 Steady state

$$\pi_t = 1, \qquad C_t = X_t, \qquad X_t = \frac{\eta-1}{\eta}, \qquad \lambda_t = \frac{\eta}{\eta-1}\,\frac{\exp(\mu)-\beta\varphi}{\exp(\mu)-\varphi}$$

C.6.3 Parameterization

β       κ    η   ρr   σ       p1,1  p2,2  µ(1)    µ(2)    ψ(1)  ψ(2)  ϕ    Rss
0.9976  161  10  0.8  0.0025  0.9   0.9   0.0075  0.0025  3.1   0.9   0.7  exp(µ)/β

C.7 The Smets-Wouters model with switching monetary policy

We consider a medium-scale DSGE model by Smets and Wouters (2007).
We turn it into a Markov switching model by introducing a Markov chain with two regimes, controlling the behavior of the parameters entering the monetary policy reaction function. In the good regime, the parameters assume the mode values estimated by Smets and Wouters (2007) for the constant-parameter case. In the bad regime, the interest rate is unresponsive to inflation and the output gap. However, the interest rate is allowed to respond to a persistent exogenous monetary policy shock and is therefore exogenous. In a constant-parameter model, this bad regime would be unstable. In a switching environment, however, to the extent that agents assign a sufficiently high probability to the system returning to the good regime, the system may be stable in the MSS sense. The table below displays the parameters: rπ is the reaction to inflation; ρ, the smoothing parameter; ry, the reaction to the output gap; r∆y, the reaction to the change in the output gap; pgood,good is the probability of remaining in the good regime; pbad,bad is the probability of remaining in the bad regime.

             rπ      ρ       ry      r∆y     pgood,good  pbad,bad
Good regime  1.4880  0.8762  0.0593  0.2347  0.9         NA
Bad regime   0.0000  0.0000  0.0000  0.0000  NA          0.3
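The stabilizing role of a quick expected return to the good regime can be quantified directly from the transition probabilities above (a back-of-the-envelope sketch, not RISE output): the expected duration of regime i in a two-state chain is 1/(1 - p_ii), and the ergodic distribution shows how little time the system spends in the bad regime.

```python
import numpy as np

p_gg, p_bb = 0.9, 0.3
P = np.array([[p_gg, 1 - p_gg],
              [1 - p_bb, p_bb]])

# Expected duration of each regime: 1 / (1 - probability of remaining).
dur_good = 1 / (1 - p_gg)   # 10 periods in the good regime on average
dur_bad = 1 / (1 - p_bb)    # under 1.5 periods in the bad regime on average

# Ergodic probabilities: solve pi = pi P together with pi summing to one.
A = np.vstack([P.T - np.eye(2), np.ones(2)])
pi = np.linalg.lstsq(A, np.array([0.0, 0.0, 1.0]), rcond=None)[0]

print(round(dur_good, 2), round(dur_bad, 2), np.round(pi, 3))
```

With these numbers the economy spends 87.5 percent of the time in the good regime, which is why agents' expectations of a prompt return to responsive monetary policy can keep the switching system stable in the MSS sense even though the bad regime is unstable on its own.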