UvA-DARE (Digital Academic Repository)
Kiviet, J. F., & Niemczyk, J. (2006). On the limiting and empirical distribution of IV estimators when some of the instruments are invalid. UvA-Econometrics Working Paper No. 2006/02. Amsterdam: Faculteit Economie en Bedrijfskunde.

On the limiting and empirical distribution of IV estimators when some of the instruments are invalid

Jan F. Kiviet and Jerzy Niemczyk
Tinbergen Institute, University of Amsterdam
preliminary draft of 11 July 2006

JEL classification: C13, C15, C30
Keywords: empirical density, inconsistent estimators, invalid instruments, limiting distribution, weak instruments

Abstract

In practice IV estimation will often be employed when in fact some of the instruments are invalid.
This is because moment conditions can only be tested if a sufficient number of valid, but themselves untestable, instruments is already available. Moreover, tests for the validity of additional instruments, i.e. so-called overidentification restriction tests, will have limited power when instruments are weak and samples are small. We examine the case where the model is treated as overidentified or just identified, although some of the instruments may actually be invalid, whereas all variables are stationary. We derive an expression in terms of the various parameters of the data generating process for the inconsistency of the invalid IV estimator and obtain its limiting normal distribution. In specific simple models we scan this approximation to the finite sample distribution over the relevant parameter space and compare it with the actual empirical distribution obtained by simulation. This parameter space is transformed to measures for: (i) model fit; (ii) simultaneity; (iii) instrument invalidity; and (iv) instrument weakness. We present our findings over this multi-dimensional parameter space by using dynamic visualization techniques, which can best be enjoyed on screen. Our major findings are that: (a) for the accuracy of large sample asymptotic approximations instrument weakness is much more detrimental than instrument invalidity; (b) the realizations of IV estimators obtained from strong but possibly invalid instruments seem usually much closer to the true parameter values than those obtained from valid but weak instruments.
Department of Quantitative Economics, Faculty of Economics and Business, University of Amsterdam, Roetersstraat 11, 1018 WB Amsterdam, The Netherlands; phone +31.20.5254217; email [email protected] and [email protected]

1 Introduction

When in a regression model some of the explanatory variables are contemporaneously correlated with the disturbance term, whereas this correlation is unknown, one needs further variables to use as instruments in order to find consistent estimators by the method of moments. These instrumental variables should have known (usually zero) correlation with the disturbances. Then they provide so-called orthogonality conditions that make it possible to obtain consistent instrumental variable (IV) estimators. In practice, however, it is mostly hard to assess whether an instrumental variable is indeed valid, i.e. uncorrelated with the disturbance term. Firstly, instrument validity or orthogonality tests are only viable under just identification or overidentification by truly valid instruments. That is, they are built on the prerequisite of having already available a number of undisputedly valid instruments, at least as great as the number of coefficients ($k$) to be estimated, whereas the validity of the initial $k$ instruments is untestable. Moreover, orthogonality tests will have reasonable power only when the instruments employed and those under test are not too weak (are sufficiently correlated with the regressors) and the sample size is substantial. Therefore, it seems very likely that IV estimation will often be employed when in fact some of the instruments are invalid. In this case the IV estimator for the structural parameters is inconsistent, even when the structural equation itself is correctly specified for the parameters of interest.
In this paper we consider general and specific forms of linear structural equations and corresponding partial reduced form systems in stationary variables and examine the IV estimator when some of its exploited orthogonality conditions actually do not hold. We cover the general case where the number of moment conditions exploited ($l$), i.e. the valid and invalid conditions together, is at least as large as the number of unknown coefficients, i.e. we consider the (alleged) overidentified or just identified case ($l \geq k$). We focus on the distribution of such an invalid IV estimator for a single structural equation that for the rest has been specified correctly, in the sense that its implied error term is IID (independent and identically distributed) with unconditional expectation equal to zero. An alternative point of departure is chosen in Hale et al. (1980), where instruments are invalid due to omitted regressors. We derive an expression in terms of parameters and data moments for the inconsistency of the invalid IV estimator in an otherwise correctly specified model and also obtain its limiting distribution. These results yield a first-order asymptotic approximation to the actual distribution in finite sample of IV estimators. The asymptotic variance proves to be a rather complicated expression, although it can be substantially simplified when specialized for the just identified case ($l = k$). By simulation we verify over the relevant parameter space of simple classes of models whether these analytic findings are accurate regarding the actual estimator distribution in finite sample and how these depend on the various model parameters. In the illustrations we focus first on a very simple specific type of model, entailing just one explanatory variable and one instrument. Here we show that all our findings, both the analytic asymptotic and the simulated finite sample results, are driven by just four primal econometric model characteristics, in addition to the sample size.
These four characteristics are straightforward transformations of the underlying parameters of the data generating process. They are all related to particular correlation coefficients, viz.: (i) the model fit; (ii) the degree of simultaneity; (iii) the degree of invalidity of the instrument; and (iv) the degree of instrument weakness. Thus, even in the simple one-regressor one-instrument model, the distributional properties of the IV estimator are functions involving five arguments, which makes it difficult to depict their behavior over all relevant argument values. Instead of presenting extensive tables, we present some series of 2D and 3D graphs in print, and we use dynamic multi-dimensional visualization techniques to present our findings more elegantly and effectively on screen through animations. We also present some results for an (alleged) overidentified one-regressor model with two instruments. Here we find that both the actual finite sample distribution and its asymptotic approximation can be expressed using just one extra argument, whereas at first sight one might conjecture that both the invalidity and the strength of the extra instrument might matter separately. The analysis of IV estimators employing invalid instruments has not received much attention in the literature yet. Although its limiting behavior has been examined before, see Maasoumi and Phillips (1982), Hahn and Hausman (2003) and Hall and Inoue (2003), none of these studies provides a simple explicit formula for the asymptotic variance in the general linear multivariate case, where $l \geq k \geq 1$. Such a formula is obtained here by extending an approach[1] that yielded similar results for inconsistent OLS estimators. The latter can be found in Kiviet and Niemczyk (2005), which completes some initial results obtained in Joseph and Kiviet (2005).
As far as we know, simulation evidence on the actual finite sample distribution of valid and invalid IV estimators covering almost the entire parameter space of some basic models, as presented here, has not been produced before. The analysis of the exact finite sample properties of consistent IV estimators has a long history, see Sawa (1969) and Phillips (1980). Recent contributions and further references can be found in, for instance, Phillips (2005) and Hillier (2005). Our findings illustrate those on the effects of instrument weakness on the finite sample density of consistent IV estimators by both Woglom (2001), who focusses on just identified IV estimators, and Forchini (2006), who gives further theoretical underpinnings in case of overidentification. They also supplement these findings, because we provide more extensive illustrations and consider invalid instruments as well. Whereas much of the recent literature on weak instruments focusses on developing appropriate tests and confidence sets when instruments are weak but valid, see for instance Hahn and Inoue (2002) and Andrews and Stock (2005), the present study analyzes and illustrates properties of the distribution of coefficient estimators. From our simulations we establish that invalid but reasonably strong instruments yield IV estimators which have a distribution in small samples that is rather close to the analytic large-sample asymptotic approximations derived here. Hence, the distribution of these estimators is often close to normal, but has its probability mass centered around the pseudo-true-value instead of the true value. However, when instruments are very weak, we establish that the accuracy of standard large-sample asymptotics is very poor, as had already been established for the valid instrument case.
More importantly, though, for both valid and invalid instruments we also find that when the instrument is weak the probability mass of the actual distribution of instrumental variable estimators is generally much closer to the true value of the coefficient than indicated by these much too flat asymptotic approximations. For valid but rather weak instruments it had already been established that the finite sample distribution of IV can be skew, and that it becomes bimodal for very strong simultaneity, whereas for extreme weakness (i.e. close to underidentification) the dispersion explodes, while the median moves away from the true parameter value towards the probability limit of OLS. We find that for invalid weak instruments skewness, bimodality and a median away from the pseudo-true-value may occur for much more moderate weakness and simultaneity. Note, however, that in practice one can easily avoid using weak instruments if one would lift the ban on invalid instruments, since weakness (unlike validity) can straightforwardly be assessed. Because the invalid IV estimator is reasonably well behaved for reasonably strong instruments, a tentative conclusion is that it seems more promising to attempt to produce accurate inference from IV estimators based on strong (as in OLS) but possibly invalid instruments, than on valid but weak instruments. In the latter case, not only is the standard asymptotic approximation poor, but also the actual behavior of the distribution of the estimator is rather erratic and shows much larger estimation errors than invalid but strong instruments produce. So, even when its actual behavior could be adequately approximated by alternative weak-instrument asymptotic methods, it still may have an actual distribution that is less attractive than that of an IV estimator based on strong but possibly invalid instruments. The structure of this paper is as follows.

[1] A related approach can be found in an unpublished discussion paper by Rothenberg (1972).
In Section 2 we introduce the model to be estimated and the generating schemes for all explanatory and instrumental variables, with their underlying statistical assumptions. Focussing on the alleged overidentified case we consider the generalized IV or 2SLS estimator and derive its inconsistency and limiting distribution (proofs in appendices) for the case where all variables are weakly stationary, i.e. their first and second moments are constant through time. Next the results are specialized for the just identified case. Section 3 contains graphic illustrations of both the asymptotic and finite sample distributions in specific simple models. From these we examine the accuracy of the asymptotic approximations and the actual behavior over different values of all the various determining factors. Moreover, we compare the effectiveness of IV with respect to OLS, which always uses extremely strong but possibly invalid instruments. In separate subsections we consider models with $l = k = 1$ and with $l - 1 = k = 1$. Hence, we focus on the very simple model with just one endogenous explanatory variable and one or two (possibly weak and possibly invalid) instrumental variables. Finally, Section 4 concludes.

2 Model, assumptions and theorems

We consider data generating processes for variables for which $n$ observations have been collected in $y$, $X$ and $Z$, which all have $n$ rows. The matrices $X$ and $Z$ have $k$ and $l$ columns respectively, with $l \geq k$. $X$ contains the explanatory variables for vector $y$ in a linear structural model with structural disturbance vector $\varepsilon$. The $l$ variables collected in $Z$ will be used as instrumental variables for estimating the $k$ structural parameters of interest $\beta$. Not all these instruments are necessarily valid, some of them may be weak, whereas others may be extremely strong, especially when columns of $X$ correspond to (or are spanned by) columns of $Z$. The basic framework is characterized by the following parametrization and stationarity and regularity conditions.
Framework A. We have: (i) the structural equation $y = X\beta + \varepsilon$; (ii) with disturbances having (for $i \neq h = 1,\dots,n$) the (finite) unconditional moments $E(\varepsilon_i) = 0$, $E(\varepsilon_i\varepsilon_h) = 0$, $E(\varepsilon_i^2) = \sigma_\varepsilon^2$, $E(\varepsilon_i^3) = \gamma_3\sigma_\varepsilon^3$ and $E(\varepsilon_i^4) = \gamma_4\sigma_\varepsilon^4$; (iii) while $X = \bar X + \varepsilon\xi'$ and $Z = \bar Z + \varepsilon\zeta'$, such that $E(\bar X'\varepsilon) = 0$ and $E(\bar Z'\varepsilon) = 0$, with $\xi$ and $\zeta$ fixed parameter vectors of $k$ and $l$ elements respectively. Moreover, (iv) $\Sigma_{X'X} \equiv \mathrm{plim}_{n\to\infty}\, n^{-1}X'X$, $\Sigma_{Z'Z} \equiv \mathrm{plim}_{n\to\infty}\, n^{-1}Z'Z$ and $\Sigma_{Z'X} \equiv \mathrm{plim}_{n\to\infty}\, n^{-1}Z'X$ all have full column rank, and (v) so have $X'X$, $Z'Z$ and $Z'X$ with probability one. Finally, (vi) we have $E(n^{-1}Z'Z \mid \bar Z) - \Sigma_{Z'Z} = o_p(n^{-1/2})$ and $E(n^{-1}X'Z \mid \bar X, \bar Z) - \Sigma_{X'Z} = o_p(n^{-1/2})$.

Note that A(iii) implies
$$E(X'\varepsilon) = n\sigma_\varepsilon^2\xi \quad\text{and}\quad E(Z'\varepsilon) = n\sigma_\varepsilon^2\zeta. \qquad (1)$$
Hence, if $\xi_j = 0$ for some $j \in \{1,\dots,k\}$ then the $j$-th regressor in $X$ is predetermined and will establish a valid instrument; otherwise, when $\xi_j \neq 0$, the $j$-th regressor is endogenous. Likewise, if $\zeta_g = 0$ for some $g \in \{1,\dots,l\}$ then the $g$-th column of $Z$ establishes a valid instrument, and an invalid instrument otherwise. It can be shown that A(vi) boils down to the mild regularity assumptions $n^{-1}\bar Z'\bar Z - \mathrm{plim}\, n^{-1}\bar Z'\bar Z = o_p(n^{-1/2})$ and $n^{-1}\bar X'\bar Z - \mathrm{plim}\, n^{-1}\bar X'\bar Z = o_p(n^{-1/2})$.

Since $l \geq k$ the generalized instrumental variable (GIV) or 2SLS estimator of $\beta$ exists and is given by
$$\hat\beta_{GIV} = [X'Z(Z'Z)^{-1}Z'X]^{-1}X'Z(Z'Z)^{-1}Z'y = (\hat X'X)^{-1}\hat X'y, \qquad (2)$$
where we introduced the notation
$$\hat X = Z\hat\Pi = Z(Z'Z)^{-1}Z'X, \qquad (3)$$
where $\hat\Pi = (Z'Z)^{-1}Z'X$ contains the (reduced form) coefficient estimates of the first-stage regressions. In Framework A the probability limit of $\hat\beta_{GIV}$ exists.
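As a concrete illustration, the GIV/2SLS formula in (2)-(3) translates directly into sample-moment computations; the following is a minimal sketch (the function name is ours, not from the paper):

```python
import numpy as np

def giv_estimator(y, X, Z):
    """Generalized IV (2SLS) estimator of equation (2):
    beta_hat = [X'Z(Z'Z)^{-1}Z'X]^{-1} X'Z(Z'Z)^{-1}Z'y,
    computed via the first-stage fitted values X_hat = Z * Pi_hat of (3)."""
    Pi_hat = np.linalg.solve(Z.T @ Z, Z.T @ X)   # (Z'Z)^{-1} Z'X, first-stage coefficients
    X_hat = Z @ Pi_hat                           # projection of X onto the instrument space
    return np.linalg.solve(X_hat.T @ X, X_hat.T @ y)
```

With $l = k$ this reduces numerically to simple IV, $(Z'X)^{-1}Z'y$, and with $Z = X$ to OLS.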
We define
$$\beta_{GIV} \equiv \mathrm{plim}\,\hat\beta_{GIV} = \beta + \mathrm{plim}\,[X'Z(Z'Z)^{-1}Z'X]^{-1}X'Z(Z'Z)^{-1}Z'\varepsilon = \beta + \sigma_\varepsilon^2[\Sigma_{X'Z}\Sigma_{Z'Z}^{-1}\Sigma_{Z'X}]^{-1}\Sigma_{X'Z}\Sigma_{Z'Z}^{-1}\zeta, \qquad (4)$$
where $\beta_{GIV}$ is also known as the pseudo-true-value of $\hat\beta_{GIV}$. We shall denote the inconsistency of $\hat\beta_{GIV}$ as
$$\beta^\bullet_{GIV} \equiv \beta_{GIV} - \beta = \sigma_\varepsilon^2[\Sigma_{X'Z}\Sigma_{Z'Z}^{-1}\Sigma_{Z'X}]^{-1}\Sigma_{X'Z}\Sigma_{Z'Z}^{-1}\zeta = \sigma_\varepsilon^2\,\Sigma_{\hat X'\hat X}^{-1}\Sigma_{X'Z}\Sigma_{Z'Z}^{-1}\zeta, \qquad (5)$$
where we used $\Sigma_{\hat X'\hat X} \equiv \Sigma_{X'Z}\Sigma_{Z'Z}^{-1}\Sigma_{Z'X}$. Note that $\mathrm{plim}\,(Z'Z)^{-1}Z'X = \Sigma_{Z'Z}^{-1}\Sigma_{Z'X}$ and that in Framework A the GIV estimator is consistent if and only if $\zeta = 0$. Below, we shall also look into the special case $l = k$ (just identification), where the above GIV results specialize to simple IV, i.e.
$$\hat\beta_{IV} = (Z'X)^{-1}Z'y, \qquad \beta_{IV} = \beta + \sigma_\varepsilon^2\Sigma_{Z'X}^{-1}\zeta, \qquad \beta^\bullet_{IV} = \sigma_\varepsilon^2\Sigma_{Z'X}^{-1}\zeta. \qquad (6)$$
When in fact $Z = X$ (all regressors are used as instruments), i.e. $\zeta = \xi$, then IV specializes to OLS, i.e.
$$\hat\beta_{OLS} = (X'X)^{-1}X'y, \qquad \beta_{OLS} = \beta + \sigma_\varepsilon^2\Sigma_{X'X}^{-1}\xi, \qquad \beta^\bullet_{OLS} = \sigma_\varepsilon^2\Sigma_{X'X}^{-1}\xi. \qquad (7)$$
For the sake of simplicity, we will start by deriving special results for models with disturbances whose 3rd and 4th moments correspond to those of the normal distribution. Therefore, we also state:

Framework B. This specializes Framework A to the case $\gamma_3 = 0$ and $\gamma_4 = 3$.

For GIV and IV estimators we now obtain the following results (proved in appendices) on their convergence in distribution.

Theorem 1. In Framework B we have $n^{1/2}(\hat\beta_{GIV} - \beta_{GIV}) \to N(0, V_{GIV}^N)$, with
$$\begin{aligned}
V_{GIV}^N = {}& \sigma_\varepsilon^2 c_5(1 - c_3 + c_4)\,\Sigma_{\hat X'\hat X}^{-1} + \sigma_\varepsilon^2 c_4^2\,\Sigma_{\hat X'\hat X}^{-1}\Sigma_{X'X}\Sigma_{\hat X'\hat X}^{-1} \\
& - c_4\,[\Sigma_{\hat X'\hat X}^{-1}\Sigma_{X'X}\beta^\bullet_{GIV}\beta^{\bullet\prime}_{GIV} + \beta^\bullet_{GIV}\beta^{\bullet\prime}_{GIV}\Sigma_{X'X}\Sigma_{\hat X'\hat X}^{-1}] + \sigma_\varepsilon^4 c_4(1-2c_4)\,\Sigma_{\hat X'\hat X}^{-1}\xi\xi'\Sigma_{\hat X'\hat X}^{-1} \\
& + \sigma_\varepsilon^2 c_4(1-2c_5)\,[\Sigma_{\hat X'\hat X}^{-1}\xi\beta^{\bullet\prime}_{GIV} + \beta^\bullet_{GIV}\xi'\Sigma_{\hat X'\hat X}^{-1}] \\
& + [c_5(1-2c_5) - c_3 + \sigma_\varepsilon^{-2}(\beta^{\bullet\prime}_{GIV}\Sigma_{X'X}\beta^\bullet_{GIV})]\,\beta^\bullet_{GIV}\beta^{\bullet\prime}_{GIV},
\end{aligned}$$
where $c_1 \equiv \sigma_\varepsilon^2\zeta'\Sigma_{Z'Z}^{-1}\zeta$, $c_2 \equiv \zeta'\Sigma_{Z'Z}^{-1}\Sigma_{Z'X}\beta^\bullet_{GIV}$, $c_3 \equiv \xi'\beta^\bullet_{GIV}$, $c_4 \equiv c_1 - c_2$ and $c_5 \equiv 1 - c_3 - c_4$.

The $N$ in the superindex of $V_{GIV}^N$ indicates that it refers to the case where the disturbances are "almost normal", because $\gamma_3 = 0$ and $\gamma_4 = 3$. We find that the limiting distribution of $\hat\beta_{GIV}$ is still genuinely normal when instruments are invalid, although no longer centered at $\beta$ but at the pseudo-true-value $\beta_{GIV}$. When all instruments are valid, i.e. $\zeta = 0$, then $\beta_{GIV} = \beta$, $\beta^\bullet_{GIV} = 0$ and $c_1 = c_2 = c_3 = 0$, giving $c_4 = 0$ and $c_5 = 1$, so that Theorem 1 specializes to the standard result $n^{1/2}(\hat\beta_{GIV} - \beta) \to N(0, \sigma_\varepsilon^2\Sigma_{\hat X'\hat X}^{-1})$. Note that when all instruments are valid the asymptotic variance of $\hat\beta_{GIV}$ is not determined by the simultaneity $\xi$, because $\Sigma_{\hat X'\hat X} = \Sigma_{X'Z}\Sigma_{Z'Z}^{-1}\Sigma_{Z'X} = \Sigma_{\bar X'\bar Z}\Sigma_{\bar Z'\bar Z}^{-1}\Sigma_{\bar Z'\bar X}$. However, when instruments are invalid, i.e. $\zeta \neq 0$, then $\Sigma_{Z'X} = \Sigma_{\bar Z'\bar X} + \sigma_\varepsilon^2\zeta\xi'$ and thus $\Sigma_{\hat X'\hat X}$ is determined by both $\zeta$ and $\xi$. Then, when fitting $X$ to $Z$, the $\varepsilon\xi'$ part of $X$ is no longer (asymptotically) orthogonal to $Z$, due to the presence of $\varepsilon\zeta'$. This does not only lead to the inconsistency, but also to the many extra terms in the asymptotic variance. For the special case $l = k$ we have $\zeta'\Sigma_{Z'Z}^{-1}\Sigma_{Z'X}\beta^\bullet_{GIV} = \sigma_\varepsilon^2\zeta'\Sigma_{Z'Z}^{-1}\Sigma_{Z'X}\Sigma_{Z'X}^{-1}\zeta = \sigma_\varepsilon^2\zeta'\Sigma_{Z'Z}^{-1}\zeta$, so $c_1 = c_2$, $c_4 = 0$ and $c_5 = 1 - c_3$, giving:

Corollary 1. In Framework B for the special case $l = k$ we have $n^{1/2}(\hat\beta_{IV} - \beta_{IV}) \to N(0, V_{IV}^N)$, with
$$V_{IV}^N = \sigma_\varepsilon^2(1 - c_3)^2\,\Sigma_{Z'X}^{-1}\Sigma_{Z'Z}\Sigma_{X'Z}^{-1} - [2c_3^2 - 2c_3 + 1 - \sigma_\varepsilon^{-2}(\beta^{\bullet\prime}_{IV}\Sigma_{X'X}\beta^\bullet_{IV})]\,\beta^\bullet_{IV}\beta^{\bullet\prime}_{IV},$$
where $c_3 \equiv \xi'\beta^\bullet_{IV} = \sigma_\varepsilon^2\xi'\Sigma_{Z'X}^{-1}\zeta$.

When all instruments are valid, i.e. $\zeta = 0$, this result specializes to the standard result $n^{1/2}(\hat\beta_{IV} - \beta_{IV}) \to N(0, \sigma_\varepsilon^2\Sigma_{Z'X}^{-1}\Sigma_{Z'Z}\Sigma_{X'Z}^{-1})$. Since for general $\zeta$ the scalar $\sigma_\varepsilon^2\xi'\Sigma_{Z'X}^{-1}\zeta$ can either be positive or negative, no general conclusions can be drawn on the behavior of $V_{IV}^N$ in comparison to the reference case $\sigma_\varepsilon^2\Sigma_{Z'X}^{-1}\Sigma_{Z'Z}\Sigma_{X'Z}^{-1}$. Depending on the particular parametrization and data moment matrices the asymptotic variance of individual coefficient estimates may either increase or decrease, due to $\zeta \neq 0$ or $\xi \neq 0$. When $Z = X$, which gives $\zeta = \xi$ and $\hat\beta_{IV} = \hat\beta_{OLS}$, the resulting $V_{IV}^N = V_{OLS}^N$ is the same as the formula found for an inconsistent OLS estimator when the disturbances are (almost) normal, as derived in Kiviet and Niemczyk (2005).

Next we look at the case where the disturbances may have general 3rd and 4th moments. Let $\iota$ be an $n \times 1$ vector of unit elements. Upon defining
$$\Sigma_{Z'\iota} \equiv \mathrm{plim}\, n^{-1}Z'\iota = \mathrm{plim}\, n^{-1}\bar Z'\iota, \qquad \Sigma_{X'\iota} \equiv \mathrm{plim}\, n^{-1}X'\iota = \mathrm{plim}\, n^{-1}\bar X'\iota,$$
we find (superindex $NN$ indicates nonnormal disturbances):

Theorem 2. In Framework A we have $n^{1/2}(\hat\beta_{GIV} - \beta_{GIV}) \to N(0, V_{GIV}^{NN})$, where $V_{GIV}^{NN}$ is equal to $V_{GIV}^N$, given in Theorem 1, plus two additional terms. When $\gamma_4 \neq 3$ the first additional term is
$$(\gamma_4 - 3)\{\sigma_\varepsilon^4 c_4^2\,\Sigma_{\hat X'\hat X}^{-1}\xi\xi'\Sigma_{\hat X'\hat X}^{-1} + \sigma_\varepsilon^2 c_4 c_5\,[\beta^\bullet_{GIV}\xi'\Sigma_{\hat X'\hat X}^{-1} + \Sigma_{\hat X'\hat X}^{-1}\xi\beta^{\bullet\prime}_{GIV}] + c_5^2\,\beta^\bullet_{GIV}\beta^{\bullet\prime}_{GIV}\},$$
with $c_4$ and $c_5$ as defined in Theorem 1. When $\gamma_3 \neq 0$ the second additional term is
$$\gamma_3\sigma_\varepsilon^3\,[d\,\omega^{*\prime} + \omega^* d'],$$
where $\omega^* \equiv c_4\Sigma_{\hat X'\hat X}^{-1}\xi + \sigma_\varepsilon^{-2}c_5\beta^\bullet_{GIV}$ and
$$d \equiv c_4\Sigma_{\hat X'\hat X}^{-1}\Sigma_{X'\iota} - \sigma_\varepsilon^{-2}(\beta^{\bullet\prime}_{GIV}\Sigma_{X'\iota})\beta^\bullet_{GIV} + c_5\Sigma_{\hat X'\hat X}^{-1}\Sigma_{X'Z}\Sigma_{Z'Z}^{-1}\Sigma_{Z'\iota} + (\Sigma_{\hat X'\hat X}^{-1}\xi - \sigma_\varepsilon^{-2}\beta^\bullet_{GIV})(\sigma_\varepsilon^2\zeta - \Sigma_{Z'X}\beta^\bullet_{GIV})'\Sigma_{Z'Z}^{-1}\Sigma_{Z'\iota}.$$

When all instruments are valid, i.e. $\zeta$
$= 0$, then this result again collapses to the standard result, i.e. $V_{GIV}^{NN} = \sigma_\varepsilon^2\Sigma_{\hat X'\hat X}^{-1}$, which highlights that normality of the disturbances is not a requirement for the standard normal limiting distribution of $\hat\beta_{GIV}$. For the special case $l = k$ Theorem 2 yields:

Corollary 2. In Framework A for the special case $l = k$ we have $n^{1/2}(\hat\beta_{IV} - \beta_{IV}) \to N(0, V_{IV}^{NN})$, with
$$\begin{aligned}
V_{IV}^{NN} = {}& \sigma_\varepsilon^2 c_5^2\,\Sigma_{Z'X}^{-1}\Sigma_{Z'Z}\Sigma_{X'Z}^{-1} - [(5 - \gamma_4)c_5^2 - 2c_5 + 1 - \sigma_\varepsilon^{-2}(\beta^{\bullet\prime}_{IV}\Sigma_{X'X}\beta^\bullet_{IV})]\,\beta^\bullet_{IV}\beta^{\bullet\prime}_{IV} \\
& + \gamma_3\sigma_\varepsilon c_5\,[c_5(\Sigma_{Z'X}^{-1}\Sigma_{Z'\iota}\beta^{\bullet\prime}_{IV} + \beta^\bullet_{IV}\Sigma_{\iota'Z}\Sigma_{X'Z}^{-1}) - 2\sigma_\varepsilon^{-2}(\Sigma_{\iota'X}\beta^\bullet_{IV})\,\beta^\bullet_{IV}\beta^{\bullet\prime}_{IV}],
\end{aligned}$$
where $c_5 \equiv 1 - \xi'\beta^\bullet_{IV} = 1 - c_3$.

Of course, for $\gamma_3 = 0$ and $\gamma_4 = 3$ this result simplifies to that of Corollary 1. It also shows that an increase (decrease) in the kurtosis leads to a larger (smaller) asymptotic variance. In the proofs of the above theorems we employ a lemma that is a straightforward extension of the following simple CLT (central limit theorem), which says: let $v_i$ be a $k \times 1$ random vector such that $E(v_i) = 0$, $E(v_iv_i') = V_i$ and $E(v_iv_h') = O$ for $i \neq h = 1,\dots,n$; then $n^{1/2}\bar v \to N(0, \lim_{n\to\infty}\bar V)$, where $\bar v = n^{-1}\sum_{i=1}^n v_i$ and $\bar V = n^{-1}\sum_{i=1}^n V_i$. We employ the following generalized version:

Lemma. Let $W = (w_1,\dots,w_n)'$ be an $n \times k$ random matrix and $\omega$ a $k \times 1$ nonrandom vector, whereas the $n \times 1$ vector $\varepsilon = (\varepsilon_1,\dots,\varepsilon_n)'$ has mutually uncorrelated elements for which $E(\varepsilon_i \mid w_i) = 0$, $E(\varepsilon_i^2 \mid w_i) = \sigma_\varepsilon^2$, $E(\varepsilon_i^3 \mid w_i) = \gamma_3\sigma_\varepsilon^3$ and $E(\varepsilon_i^4) = \gamma_4\sigma_\varepsilon^4$. Then the $k \times 1$ vector $v_i = w_i\varepsilon_i + \omega(\varepsilon_i^2 - \sigma_\varepsilon^2)$ has zero expectation, conditional variance
$$E(v_iv_i' \mid w_i) = V_i = \sigma_\varepsilon^2 w_iw_i' + \gamma_3\sigma_\varepsilon^3(w_i\omega' + \omega w_i') + (\gamma_4 - 1)\sigma_\varepsilon^4\omega\omega',$$
whereas $E(v_iv_h') = O$ for $i \neq h$, so that for $n^{-1/2}\sum_{i=1}^n v_i = n^{-1/2}[W'\varepsilon + \omega(\varepsilon'\varepsilon - n\sigma_\varepsilon^2)]$ the CLT implies
$$n^{1/2}\bar v \to N[0,\ \sigma_\varepsilon^2\Sigma_{W'W} + \gamma_3\sigma_\varepsilon^3(\Sigma_{W'\iota}\omega' + \omega\Sigma_{\iota'W}) + (\gamma_4 - 1)\sigma_\varepsilon^4\omega\omega'],$$
where $\Sigma_{W'W} \equiv \mathrm{plim}\, n^{-1}W'W$ and $\Sigma_{W'\iota} \equiv \mathrm{plim}\, n^{-1}W'\iota$, with $\iota$ an $n \times 1$ vector of unit elements.
3 Illustrations

To illustrate the analytical asymptotic findings obtained in the foregoing section, we shall calculate the various formulas for particular models and show the corresponding normal densities over relevant parts of the parameter space. In addition, we will simulate these models and depict the empirical density of the estimators to check the relevance and accuracy of the first-order asymptotic approximations in finite sample. Also we compare IV and GIV estimators (using possibly invalid and possibly weak instruments) with OLS. The latter estimator always uses extremely strong instruments, which at the same time are invalid in case of simultaneity.

The limiting distributions obtained in the foregoing section are all of the generic form $n^{1/2}(\hat\beta - \beta - \beta^\bullet) \to N(0, V)$ and they imply a first-order approximation to the distribution of $\hat\beta$ in finite sample that can be expressed as
$$\hat\beta \overset{a}{\sim} N(\beta + \beta^\bullet,\ n^{-1}V). \qquad (8)$$
This entails a first-order asymptotic approximation to the mean error of $\hat\beta$ equal to $\beta^\bullet$ and to the mean squared error (AMSE) given by
$$\mathrm{AMSE}(\hat\beta) \equiv n^{-1}V + \beta^\bullet\beta^{\bullet\prime}. \qquad (9)$$
The actual values of $\beta^\bullet$ and of (the square root of) $\mathrm{AMSE}(\hat\beta)$ can be computed for any $n$ and any given values of the model parameters and asymptotic data moments. To find out how accurate the first-order asymptotic approximation (9) is, it should be compared with corresponding Monte Carlo estimates obtained from a series of realizations of $\hat\beta$ in simulated finite samples. However, these cannot be achieved in the standard way when $\hat\beta$ does not have finite first or second moments in finite sample, as is the case when $l - k \leq 1$. Then, irrespective of the number of Monte Carlo replications employed, the sample moments from Monte Carlo experiments are not informative as they do not converge. Appropriate alternatives for the mean error and for the root mean squared error are then the median error and the median of the absolute error.
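These median-based measures are obtained from the Monte Carlo draws by simple sorting; a minimal sketch (the function name is ours):

```python
import numpy as np

def mc_me_mae(beta_hat_draws, beta):
    """Monte Carlo estimates of the median error ME and the median absolute error
    MAE of an estimator from R realizations; these remain informative even when
    the estimator has no finite moments (as for IV with l - k <= 1)."""
    err = np.asarray(beta_hat_draws) - beta
    return np.median(err), np.median(np.abs(err))
```

Unlike the Monte Carlo mean and root mean squared error, these estimates converge as R grows even for heavy-tailed estimator distributions.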
For a scalar estimator $\hat\beta$ of $\beta$ the median error $\mathrm{ME}(\hat\beta)$ and the median absolute error $\mathrm{MAE}(\hat\beta)$ are defined as
$$\Pr\{(\hat\beta - \beta) \leq \mathrm{ME}(\hat\beta)\} = 0.5, \qquad \Pr\{|\hat\beta - \beta| \leq \mathrm{MAE}(\hat\beta)\} = 0.5. \qquad (10)$$
From a series of $R$ independent Monte Carlo realizations $\hat\beta^{(r)}$ $(r = 1,\dots,R)$ we estimate $\mathrm{ME}(\hat\beta)$ by sorting the values $(\hat\beta^{(r)} - \beta)$ and taking the median value, and likewise for $\mathrm{MAE}(\hat\beta)$ after sorting the values $|\hat\beta^{(r)} - \beta|$. Of course, $\mathrm{AMSE}(\hat\beta)$ is not the natural asymptotic counterpart of the Monte Carlo estimate of $\mathrm{MAE}(\hat\beta)$. We assess the (scalar) asymptotic version $\mathrm{AMAE}(\hat\beta)$ of $\mathrm{MAE}(\hat\beta)$ in the following way. Let the CDF of the normal approximation to the distribution of $\hat\beta - \beta$ be indicated by $\Phi_{\beta^\bullet,\hat\beta}(x)$. Then, for $m \equiv \mathrm{AMAE}(\hat\beta)$, we have
$$0.5 = \Pr\{|\hat\beta - \beta| \leq m\} = 1 - \Pr\{\hat\beta - \beta > m\} - \Pr\{\hat\beta - \beta < -m\} \overset{a}{=} \Phi_{\beta^\bullet,\hat\beta}(m) - \Phi_{\beta^\bullet,\hat\beta}(-m),$$
so that we can solve $m$ from
$$\Phi_{\beta^\bullet,\hat\beta}(m) = 0.5 + \Phi_{\beta^\bullet,\hat\beta}(-m). \qquad (11)$$
[Footnote 2: Since $m = \Phi^{-1}_{\beta^\bullet,\hat\beta}[0.5 + \Phi_{\beta^\bullet,\hat\beta}(-m)]$, we employed the iterative scheme $m_0 = 0$, $m_{i+1} = \Phi^{-1}_{\beta^\bullet,\hat\beta}[0.5 + \Phi_{\beta^\bullet,\hat\beta}(-m_i)]$ for $i = 0, 1, \dots$ until convergence. When $\beta^\bullet = 0$ no iteration is required, since $\Phi^{-1}_{0,\hat\beta}(0.75)$ conforms to the quartile.]

Below we will examine the empirical finite sample distribution of scalar $\hat\beta_{GIV}$ and compare it with $\hat\beta_{GIV} \overset{a}{\sim} N(\beta + \beta^\bullet_{GIV},\ n^{-1}V_{GIV}^N)$. In addition, for various estimators $\hat\beta_{GIV}$ (including $\hat\beta_{IV}$ and $\hat\beta_{OLS}$), we examine $\mathrm{MAE}(\hat\beta_{GIV})$ and compare it with $\mathrm{AMAE}(\hat\beta_{GIV})$ over the entire parameter space of two simple classes of models. We examined these models under Framework B only, employing normally distributed disturbances.

3.1 A simple just identified model

We commence by considering the most basic example one can think of, viz. a model with one regressor and one possibly invalid and either strong or weak instrument, i.e. $k = l = 1$. The two variables $x$ and $z$, together with the dependent variable $y$, are supposed to be jointly IID with zero mean and finite second moments. Hence, the variables are strongly stationary and our Theorems 1 and 2 apply.
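The fixed-point scheme of footnote 2 for solving (11) can be sketched in a few lines of standard-library code (the bisection-based normal quantile is our simplification, not the paper's):

```python
import math

def amae(bias, sd):
    """Solve (11) for m = AMAE: F(m) = 0.5 + F(-m), with F the CDF of N(bias, sd^2),
    using the iteration m_{i+1} = F^{-1}[0.5 + F(-m_i)] of footnote 2."""
    b = abs(bias)                        # the MAE depends on the bias only through |bias|
    if b == 0.0:
        return 0.6744897501960817 * sd   # third quartile of N(0,1): no iteration needed
    F = lambda x: 0.5 * (1.0 + math.erf((x - b) / (sd * math.sqrt(2.0))))
    def F_inv(p):                        # normal quantile by bisection (simple and robust)
        lo, hi = b - 12.0 * sd, b + 12.0 * sd
        for _ in range(200):
            mid = 0.5 * (lo + hi)
            lo, hi = (mid, hi) if F(mid) < p else (lo, mid)
        return 0.5 * (lo + hi)
    m = 0.0
    for _ in range(100):
        m_new = F_inv(0.5 + F(-m))
        if abs(m_new - m) < 1e-12:
            break
        m = m_new
    return m
```

For a substantial bias relative to sd, AMAE approaches |bias| itself, since then nearly all probability mass of the approximating normal lies on one side of zero.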
This model has been addressed often before, recently in Woglom (2001) and Hillier (2005), and for $l \geq 1$ in Bound et al. (1995) and Hahn and Hausman (2003), although only in the latter paper invalid instruments are being considered. We first evaluate the relevant expressions for the asymptotic distribution given in Corollary 1. In the model with $k = l = 1$ we can simplify the notation considerably, by writing $\sigma_x^2$ for $\Sigma_{X'X}$, and $\sigma_{xz}$ or $\rho_{xz}\sigma_x\sigma_z$ for $\Sigma_{Z'X}$, etc. Using $\zeta = \sigma_{z\varepsilon}/\sigma_\varepsilon^2$ and $\xi = \sigma_{x\varepsilon}/\sigma_\varepsilon^2$ we obtain
$$\beta^\bullet_{IV} = \sigma_\varepsilon^2\Sigma_{Z'X}^{-1}\zeta = \frac{\sigma_{z\varepsilon}}{\sigma_{xz}} = \frac{\rho_{z\varepsilon}}{\rho_{xz}}\frac{\sigma_\varepsilon}{\sigma_x}, \qquad c_3 = \sigma_\varepsilon^2\xi'\Sigma_{Z'X}^{-1}\zeta = \frac{\sigma_{z\varepsilon}\sigma_{x\varepsilon}}{\sigma_\varepsilon^2\sigma_{xz}} = \frac{\rho_{z\varepsilon}\rho_{x\varepsilon}}{\rho_{xz}}, \qquad \beta^{\bullet\prime}_{IV}\Sigma_{X'X}\beta^\bullet_{IV} = \frac{\rho_{z\varepsilon}^2\,\sigma_\varepsilon^2}{\rho_{xz}^2}, \qquad (12)$$
giving
$$V_{IV}^N = \frac{\sigma_\varepsilon^2}{\sigma_x^2}\,\frac{(1 - \rho_{z\varepsilon}^2)(\rho_{xz} - \rho_{z\varepsilon}\rho_{x\varepsilon})^2 + \rho_{z\varepsilon}^4(1 - \rho_{x\varepsilon}^2)}{\rho_{xz}^4} \qquad (13)$$
in the case where the disturbances are (almost) normally distributed. The expression for the inconsistency $\beta^\bullet_{IV}$ shows that its sign is determined by the sign of $\rho_{z\varepsilon}/\rho_{xz}$, whereas its magnitude is inversely related to the strength of the instrument, cf. Bound et al. (1995). $V_{IV}^N$ is unaffected by the signs of $\rho_{z\varepsilon}$, $\rho_{x\varepsilon}$ and $\rho_{xz}$ as long as the sign of the product $\rho_{z\varepsilon}\rho_{x\varepsilon}\rho_{xz}$ remains the same, or when either $\rho_{x\varepsilon}$ or $\rho_{z\varepsilon}$ is zero. Self-evidently, $V_{IV}^N$ diverges for $\rho_{xz}$ approaching zero.
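Expressions (12) and (13) are easily evaluated numerically; a sketch, with $\sigma_\varepsilon$ normalized to 1 (the function name is ours):

```python
import numpy as np

def iv_bias_avar(r_xe, r_ze, r_xz, sig_x):
    """Inconsistency (12) and asymptotic variance (13) of the IV estimator in the
    one-regressor/one-instrument model, with sigma_eps = 1 and normal disturbances."""
    bias = (r_ze / r_xz) / sig_x                                   # beta_bullet_IV
    num = (1 - r_ze**2) * (r_xz - r_ze * r_xe)**2 + r_ze**4 * (1 - r_xe**2)
    avar = num / (sig_x**2 * r_xz**4)                              # V_IV^N
    return bias, avar
```

With a valid instrument (rho_ze = 0) this collapses to the familiar variance $1/(\sigma_x^2\rho_{xz}^2)$, and it diverges as rho_xz approaches zero.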
Without loss of generality we may focus in this model on the case $\beta = 1$. This is just a normalization and not a restriction, because we can imagine that we started off from a model $y_i = \beta^\# x_i^\# + \varepsilon_i$, with $\beta^\# \neq 0$, and rescaled the explanatory variable such that $x_i = \beta^\# x_i^\#$. An important characteristic of the model is the signal-to-noise ratio ($SN$), which is equal to
$$SN = \beta^2\sigma_x^2/\sigma_\varepsilon^2 = \sigma_x^2/\sigma_\varepsilon^2. \qquad (14)$$
From (12) and (13) we find that $V_{IV}^N$ and $\beta^\bullet_{IV}$ are proportional to (the square root of) the inverse of $SN$. In fact, after normalization to $\beta = 1$, the approximation to the distribution of the IV estimator in this simple model, $\hat\beta_{IV} \overset{a}{\sim} N(\beta + \beta^\bullet_{IV}, n^{-1}V_{IV}^N)$, is completely determined by $n$ and the four model characteristics $\rho_{xz}$, $\rho_{x\varepsilon}$, $\rho_{z\varepsilon}$ and $SN$.

Next we focus on obtaining an appropriate data generating scheme for this model, which is to be used in the simulations. In the notation of Section 2 it should be given by
$$y_i = \beta x_i + \varepsilon_i, \qquad x_i = \bar x_i + \xi\varepsilon_i, \qquad z_i = \bar z_i + \zeta\varepsilon_i, \qquad (15)$$
where $\xi$ and $\zeta$ are scalar. In order to obtain $(\varepsilon_i, x_i, z_i)' \sim IID(0, \Sigma)$, with appropriate $3 \times 3$ covariance matrix $\Sigma$, we can first generate $v_i = (v_{i,1}, v_{i,2}, v_{i,3})' \sim IID(0, I_3)$ and then parameterize as follows:
$$\varepsilon_i = \sigma_\varepsilon v_{i,1}, \qquad \bar x_i = \pi_1 v_{i,2}, \qquad \bar z_i = \pi_2 v_{i,2} + \pi_3 v_{i,3}.$$
This provides full generality. The coefficient $\pi_1$ determines $\sigma_x^2$, whereas $E(\bar x_i\varepsilon_i) = 0$, as it should. Also $E(\bar z_i\varepsilon_i) = 0$, and $\pi_2$ and $\pi_3$ enable any correlation between $x_i$ and $z_i$ and any value of $\sigma_z^2$. The above implies
$$\begin{pmatrix}\varepsilon_i \\ x_i \\ z_i\end{pmatrix} = \Sigma^{1/2} v_i = \begin{pmatrix}\sigma_\varepsilon & 0 & 0 \\ \xi\sigma_\varepsilon & \pi_1 & 0 \\ \zeta\sigma_\varepsilon & \pi_2 & \pi_3\end{pmatrix}\begin{pmatrix}v_{i,1} \\ v_{i,2} \\ v_{i,3}\end{pmatrix}. \qquad (16)$$
Note that the zero elements do not entail restrictions on $\Sigma$, because $\Sigma^{1/2}$ is non-unique and a lower-triangular form with positive diagonal elements can be found for any positive definite $\Sigma$.
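The triangular factorization (16) translates directly into a data generating routine; a sketch (argument names are ours):

```python
import numpy as np

def draw_eps_x_z(n, xi, zeta, pi1, pi2, pi3, sig_e=1.0, rng=None):
    """Draw n IID triples (eps_i, x_i, z_i) via the lower-triangular factor of (16)."""
    rng = np.random.default_rng() if rng is None else rng
    v = rng.normal(size=(n, 3))            # v_i ~ IIN(0, I_3)
    eps = sig_e * v[:, 0]
    x = xi * eps + pi1 * v[:, 1]           # x_i = xbar_i + xi * eps_i
    z = zeta * eps + pi2 * v[:, 1] + pi3 * v[:, 2]
    return eps, x, z
```

The implied moments follow immediately from the factor: $E(x_i\varepsilon_i) = \xi\sigma_\varepsilon^2$, $E(z_i\varepsilon_i) = \zeta\sigma_\varepsilon^2$, $\sigma_{xz} = \xi\zeta\sigma_\varepsilon^2 + \pi_1\pi_2$ and $\sigma_z^2 = \zeta^2\sigma_\varepsilon^2 + \pi_2^2 + \pi_3^2$.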
In this simple model with $k = l = 1$ we have
$$\hat\beta_{IV} = \frac{\sum_i z_iy_i}{\sum_i z_ix_i} = \beta + \frac{\sum_i z_i\varepsilon_i}{\sum_i z_ix_i}, \qquad (17)$$
which clarifies that, irrespective of the sample size, the distribution of $\hat\beta_{IV}$ is invariant to the scale of $z_i$. We may also change the sign of all the $z_i$ without affecting $\hat\beta_{IV}$. Therefore, we may restrict ourselves in the illustrations to cases with $\rho_{xz} > 0$ (the case $\rho_{xz} = 0$ leads to underidentification and was already excluded in the assumptions). Since the distribution of $\hat\beta_{IV} - \beta$ becomes just its mirror-image when all $x_i$ are changed in sign, we shall also restrict ourselves to cases where $\rho_{z\varepsilon} \geq 0$, because of the following reasoning. The value of $\rho_{xz}$ is invariant to changing the signs of all $x_i$ and $z_i$ values. Hence, for any value of $\rho_{xz} > 0$ the distribution of $\hat\beta_{IV} - \beta$ for $\rho_{z\varepsilon} \geq 0$ and arbitrary positive or negative value of $\rho_{x\varepsilon}$ is equivalent with the distribution of $-(\hat\beta_{IV} - \beta)$ for $-\rho_{z\varepsilon} \leq 0$ and $-\rho_{x\varepsilon}$. It is also obvious that $\sigma_x$ and $\sigma_\varepsilon$ do not affect the distribution of $\hat\beta_{IV}$ separately, but only through their ratio. Hence, without loss of generality, we can impose some genuine equality restrictions on the 6 parameters of $\Sigma$. For these we choose
$$\sigma_\varepsilon = 1, \qquad \beta = 1 \qquad (18)$$
and
$$\sigma_z^2 = \zeta^2\sigma_\varepsilon^2 + \pi_2^2 + \pi_3^2 = 1. \qquad (19)$$
By (18) we normalize all results with respect to $\sigma_\varepsilon$, and (14) simplifies to
$$SN = \sigma_x^2. \qquad (20)$$
Because any GIV estimator is invariant to the scale of the instruments (only the space spanned by the instruments is relevant) we may impose (19), which will be used to obtain the value
$$\pi_3 = (1 - \zeta^2 - \pi_2^2)^{1/2}, \qquad (21)$$
where, without loss of generality, we may stick to positive values for $\pi_3$ as long as $v_i$ is symmetrically distributed. For similar reasons we would get observationally equivalent
data realizations if both $\pi_1$ and $\pi_2$ would be changed in sign. Therefore, below we will restrict ourselves to just positive values for both $\pi_1$ and $\pi_3$. The above yields the following data (co)variances and correlations:
$$\sigma_x^2 = \xi^2 + \pi_1^2, \qquad \sigma_y^2 = (1 + \xi)^2 + \pi_1^2, \qquad \rho_{x\varepsilon} = \xi/\sigma_x, \qquad \rho_{z\varepsilon} = \zeta, \qquad \rho_{xz} = (\xi\zeta + \pi_1\pi_2)/\sigma_x. \qquad (22)$$
Note that these, after the normalizations $\beta = 1$, $\sigma_\varepsilon = 1$ and $\sigma_z = 1$, depend on only 4 free parameters of the data generating process (DGP), viz. $\xi$, $\zeta$, $\pi_1$ and $\pi_2$. As we already established, the expressions for the inconsistency in (12) and the asymptotic variance (13), evaluated under $\gamma_3 = 0$ and $\gamma_4 = 3$ (the 3rd and 4th moments of $v_{i,1}$), depend on just four characteristics too, viz. on $\rho_{x\varepsilon}$, $\rho_{z\varepsilon}$, $\rho_{xz}$ and $SN = \sigma_x^2$. The latter four can be used in this simple model as a base for the Monte Carlo design parameter space, since they determine the parameters of the DGP through the relationships
$$\zeta = \rho_{z\varepsilon}, \qquad \xi = \rho_{x\varepsilon}\sigma_x, \qquad \pi_1 = \sigma_x(1 - \rho_{x\varepsilon}^2)^{1/2}, \qquad \pi_2 = (\rho_{xz} - \rho_{x\varepsilon}\rho_{z\varepsilon})/(1 - \rho_{x\varepsilon}^2)^{1/2}, \qquad (23)$$
from which $\pi_3$ follows directly by evaluating (21). This reparametrization is useful, because the parameters $\rho_{x\varepsilon}$, $\rho_{z\varepsilon}$ and $\rho_{xz}$ have a direct econometric interpretation, viz. the degree of simultaneity, instrument (in)validity, and instrument strength, whereas $SN$ is directly related to the model fit, which can be expressed as $SN/(SN + 1)$. We prefer to avoid using the 'concentration parameter' as one of the relevant characteristics of this model in the present context, because this concept refers exclusively to the case where all instruments are valid. From the above it follows that by varying the four parameters $|\rho_{x\varepsilon}| < 1$, $0 \leq \rho_{z\varepsilon} < 1$, $0 < \rho_{xz} < 1$ and $0 < \sigma_x^2/(\sigma_x^2 + 1) < 1$, we can examine the limiting and finite sample distributions of $\hat\beta_{IV}$ over the entire parameter space of this model. Note, however, that not all admissible values of these parameters will be compatible. For example, when $\rho_{x\varepsilon}$ is large and $\rho_{z\varepsilon}$ is small, this cannot be compatible with $\rho_{xz}$ being very large.
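In the simulations below, each design point $(\rho_{x\varepsilon}, \rho_{z\varepsilon}, \rho_{xz}, SN)$ is mapped to DGP parameters via (23) and (21); a sketch of that mapping and of one IV draw (function names are ours):

```python
import numpy as np

def dgp_params(r_xe, r_ze, r_xz, sig_x):
    """Map the design characteristics to DGP parameters via (23), with
    beta = sig_e = sig_z = 1; pi3 from (21) requires a compatible design point."""
    zeta = r_ze
    xi = r_xe * sig_x
    pi1 = sig_x * np.sqrt(1.0 - r_xe**2)
    pi2 = (r_xz - r_xe * r_ze) / np.sqrt(1.0 - r_xe**2)
    pi3 = np.sqrt(1.0 - zeta**2 - pi2**2)
    return zeta, xi, pi1, pi2, pi3

def one_iv_draw(n, r_xe, r_ze, r_xz, sig_x, rng):
    """One realization of beta_hat_IV from (17) at the given design point."""
    zeta, xi, pi1, pi2, pi3 = dgp_params(r_xe, r_ze, r_xz, sig_x)
    v = rng.normal(size=(n, 3))
    eps = v[:, 0]
    x = xi * eps + pi1 * v[:, 1]
    z = zeta * eps + pi2 * v[:, 1] + pi3 * v[:, 2]
    y = x + eps                                  # beta = 1
    return np.sum(z * y) / np.sum(z * x)
```

One can verify that the implied moments reproduce the design: cov(x, z) = xi*zeta + pi1*pi2 = rho_xz * sigma_x, and zeta^2 + pi2^2 + pi3^2 = 1 so that sigma_z = 1.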
Moreover, σ_x has just an effect on the scale of β̂_IV, V^N_IV and •_IV, so we may choose just one fixed value for σ_x, and from these findings the results for any value of σ_x can be obtained simply by rescaling. In our calculations and simulations we will fix σ_x²/σ_ε² = 10, yielding a population fit of the model of 10/11 = 0.909. Actual values of •_IV and of AMAE(β̂_IV) can be calculated now for any set of compatible values of n, ρ_xε, ρ_zε, ρ_xz and σ_x. Next they can be compared with corresponding Monte Carlo estimates obtained from β̂_IV realizations in simulated finite samples, in order to find out how accurate the first-order asymptotic approximations are. Before we present these summarizing characteristics, we will first examine the full density functions themselves. Figures 1 through 4 contain 8 panels each. In all these panels four densities are presented, viz. for n = 50 (dark/black lines) and n = 200 (grey/red lines), both for the actual empirical distribution (solid lines) and for its asymptotic approximation (dashed lines). The latter has been taken as N(β + •_IV, n^{-1}V^N_IV). In the simulations we took v_i ~ IIN(0, I_3) and used 1,000,000 replications. From the results we may expect to get quick insights into issues such as the following. For which combinations of the five design parameter values is: (a) the actual density of β̂_IV close to normal (symmetric, unimodal, etc.); (b) the actual median of β̂_IV close to the pseudo-true value β_IV = β + •_IV; (c) the actual tail behavior of β̂_IV reasonably well represented by that of the N(β_IV, n^{-1}V^N_IV) distribution? Hence, we focus on the correspondence in shape, location and spread of the asymptotic and the empirical distributions. Since in this just identified model the IV estimator does not have finite moments, we do know that even when the instruments are valid, the asymptotic approximation will not capture the fat tail characteristic of the finite sample distribution.
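A reduced version of such a Monte Carlo exercise might look as follows. This sketch (our own names and far fewer replications than the 1,000,000 used in the paper) checks that, with a strong but invalid instrument, the empirical distribution of β̂_IV indeed centers near the pseudo-true value β + •_IV, where the inconsistency equals σ_zε/σ_zx in this design.

```python
import numpy as np

# Sketch: for a strong but invalid instrument the empirical distribution of
# beta_hat_IV centers near the pseudo-true value beta + kappa, with
# kappa = sigma_zeps / sigma_zx (our notation for the inconsistency).
rng = np.random.default_rng(2)
beta, n, reps = 1.0, 200, 5000
rho_xe, rho_ze, rho_xz, sigma_x = 0.3, 0.2, 0.8, np.sqrt(10.0)

gamma, pi = rho_xe * sigma_x, rho_ze
lam1 = sigma_x * np.sqrt(1 - rho_xe**2)
lam2 = (rho_xz - rho_xe * rho_ze) / np.sqrt(1 - rho_xe**2)
lam3 = np.sqrt(1 - pi**2 - lam2**2)

v = rng.standard_normal((reps, n, 3))
eps = v[..., 0]
x = gamma * eps + lam1 * v[..., 1]
z = pi * eps + lam2 * v[..., 1] + lam3 * v[..., 2]
y = beta * x + eps

b = (z * y).sum(axis=1) / (z * x).sum(axis=1)     # (17), one estimate per replication
kappa = rho_ze / (rho_xz * sigma_x)               # inconsistency sigma_zeps / sigma_zx
assert abs(np.median(b) - (beta + kappa)) < 0.02  # mass centers at the pseudo-true value
```
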
In Figure 1 ρ_xz = 0.8, so the instrument is not ultra strong, but certainly not weak. In Figure 2 ρ_xz = 0.3, in Figure 3 ρ_xz = 0.1 and in Figure 4 ρ_xz = 0.01. Hence, in the latter figure the instrument is certainly weak and we may expect that standard large sample asymptotics does not provide a very accurate approximation. All four figures contain eight panels for particular combinations of ρ_xε and ρ_zε values. The panels in the left-hand columns have ρ_zε = 0, i.e. the instrument is valid and the standard asymptotic result applies. In the right-hand columns ρ_zε = 0.2, i.e. the instrument is invalid and the IV estimator is inconsistent. Nevertheless, the asymptotic approximation presented in Corollary 1 applies. The four rows of panels cover the cases ρ_xε = −0.3, ρ_xε = 0 (no simultaneity; hence, OLS would be more appropriate than IV), ρ_xε = 0.3 and ρ_xε = 0.6. From Figure 1 we see in the left-hand column that the standard asymptotic approximation of IV when using a valid and strong instrument is very accurate when the simultaneity is not very serious, but deteriorates when ρ_xε increases, especially when n is small. We note some skewness and one fat tail, but the asymptotic distribution is never extremely bad for the cases examined. In the right-hand column we see that the new result of Corollary 1 is almost of the same quality, but slightly less accurate. Especially for the smaller sample size we note some skewness and at least one fat tail in the empirical distribution, which are not captured by the first-order normal asymptotic approximation. In Figure 2, where the instrument is weaker, we find that when the instrument is valid the distribution is more skewed, and more so for serious simultaneity. In the right-hand column this occurs for different ρ_xε values. In most cases there is a substantial but not a dramatic difference between the actual distribution and its approximation.
The discrepancies are more pronounced in Figure 3, and affect both the standard (ρ_zε = 0) and the new (ρ_zε ≠ 0) asymptotic approximations. From Figure 4 it is clear that the asymptotic approximations are useless (at the sample sizes examined) when the instrument is really weak. When the instrument is valid the actual distributions show some median bias, but they are much less dispersed than suggested by n^{-1}V^N_IV. The magnitude of this bias in relation to the OLS bias and the weakness of the instrument has been analyzed by many authors; see Sawa (1969) and (further references in) Hillier (2005). When the instrument is invalid and very weak then the finite sample distribution of the inconsistent IV estimator is not centered at the pseudo-true value. Surprisingly, it is actually much closer to the true value (also when the instrument is not so weak), whereas the distribution becomes bimodal when the instrument is very weak. Maddala and Jeong (1992), Woglom (2001), Hillier (2005) and Forchini (2005) show that bimodality of the consistent IV estimator occurs for much more severe simultaneity than examined here, viz. for ρ_xε = 0.99, whereas Phillips (2005) shows that it is omnipresent in the simple Keynesian model, where simultaneity is always severe. Our findings suggest that using instruments that are both weak and invalid leads to bimodality, irrespective of the degree of simultaneity. From Figures 1 through 4 we conclude that, irrespective of whether the instruments are valid or not, one should avoid using standard large sample asymptotics when instruments are really weak. If one replaces the weak instrument with a strong one that is invalid (which is always possible by reverting to OLS), one may be able to produce inference on β by an inconsistent estimator, such as depicted in the right-hand column of Figure 1, which has a distribution that is much more concentrated around the true value than that from the consistent estimator depicted in the left-hand column of Figure 4.
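The finding that a weak and invalid instrument yields a distribution centered far from the pseudo-true value can be reproduced in a few lines. This sketch (our own names and values, chosen to mimic the ρ_xz = 0.01, ρ_zε = 0.2 panels) checks that the Monte Carlo median of β̂_IV lies much closer to the true value β than to β + •_IV.

```python
import numpy as np

# Sketch: with a very weak and invalid instrument (rho_xz = 0.01, rho_ze = 0.2)
# the finite sample distribution of beta_hat_IV is not centered at the
# pseudo-true value beta + kappa but has most of its mass near beta itself.
rng = np.random.default_rng(3)
beta, n, reps = 1.0, 50, 20000
rho_xe, rho_ze, rho_xz, sigma_x = 0.3, 0.2, 0.01, np.sqrt(10.0)

gamma, pi = rho_xe * sigma_x, rho_ze
lam1 = sigma_x * np.sqrt(1 - rho_xe**2)
lam2 = (rho_xz - rho_xe * rho_ze) / np.sqrt(1 - rho_xe**2)
lam3 = np.sqrt(1 - pi**2 - lam2**2)

v = rng.standard_normal((reps, n, 3))
eps = v[..., 0]
x = gamma * eps + lam1 * v[..., 1]
z = pi * eps + lam2 * v[..., 1] + lam3 * v[..., 2]
y = beta * x + eps

b = (z * y).sum(axis=1) / (z * x).sum(axis=1)
kappa = rho_ze / (rho_xz * sigma_x)      # inconsistency, here about 6.3
med = np.median(b)
assert abs(med - beta) < abs(med - (beta + kappa))   # mass sits near the true value
```
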
The general validity of the findings from Figures 1 through 4 will now be illustrated by scanning the median absolute error over almost the full parameter space of this simple model. Figure 5 provides an overview of the (in)accuracy of the asymptotic distribution of IV as an approximation to the actual distribution in finite samples for n = 20 and for n = 100. These figures (based on 10,000 replications) cover all compatible positive values of ρ_xε and ρ_xz, for ρ_zε = 0, 0.1, 0.3 and 0.6. This accuracy is expressed as log[MAE(β̂_IV)/AMAE(β̂_IV)]. Hence, positive values (yellow, amber) indicate larger absolute errors in finite samples than indicated by the asymptotic approximation, and negative values (blue) indicate that standard asymptotics is too pessimistic about the absolute errors of β̂_IV in finite samples. Note that this log-ratio is invariant with respect to the value of SN = σ_x²/σ_ε². We find that the degree of simultaneity ρ_xε has little effect, and neither has the (in)validity of the instrument ρ_zε. Just instrument weakness (roughly, when |ρ_xz| < n^{-1/2}) seriously deteriorates the accuracy of the large-n asymptotic approximation. Figure 6 examines log[MAE(β̂_OLS)/MAE(β̂_IV)], which is also invariant with respect to SN. It shows that in finite samples the absolute estimation errors committed by OLS are larger than those of IV only when both ρ_xε and ρ_xz are large. The area where IV beats OLS gets smaller for larger ρ_zε. We also note that OLS may beat IV by a much larger margin (when the instrument is weak and the simultaneity not so serious) than IV will ever beat OLS (which happens when the instrument is strong, the simultaneity serious, and the instrument not strongly invalid).

3.2 A simple overidentified model

The model of the above subsection can be extended such that we have two instruments z_{i,1} and z_{i,2}, i.e. l = 2 and z_i = (z_{i,1}, z_{i,2})'. First, we examine by which minimal set of data moments the limiting distribution is determined in this model.
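The MAE comparison between OLS and IV can be mimicked on a small scale. The sketch below (our own helper `mae_pair` and illustrative parameter values) checks the two regimes described above: with a strong valid instrument and serious simultaneity IV beats OLS, whereas with a weak instrument OLS has the smaller median absolute error.

```python
import numpy as np

# Sketch of the MAE comparison: median absolute error of OLS versus IV,
# first for a strong valid instrument under serious simultaneity (rho_xe = 0.6),
# then for a weak one; parameter values are illustrative, names are ours.
rng = np.random.default_rng(4)
beta, n, reps = 1.0, 100, 4000

def mae_pair(rho_xe, rho_ze, rho_xz, sigma_x=np.sqrt(10.0)):
    gamma, pi = rho_xe * sigma_x, rho_ze
    lam1 = sigma_x * np.sqrt(1 - rho_xe**2)
    lam2 = (rho_xz - rho_xe * rho_ze) / np.sqrt(1 - rho_xe**2)
    lam3 = np.sqrt(1 - pi**2 - lam2**2)
    v = rng.standard_normal((reps, n, 3))
    eps = v[..., 0]
    x = gamma * eps + lam1 * v[..., 1]
    z = pi * eps + lam2 * v[..., 1] + lam3 * v[..., 2]
    y = beta * x + eps
    b_iv = (z * y).sum(axis=1) / (z * x).sum(axis=1)
    b_ols = (x * y).sum(axis=1) / (x * x).sum(axis=1)
    return np.median(abs(b_ols - beta)), np.median(abs(b_iv - beta))

mae_ols, mae_iv = mae_pair(0.6, 0.0, 0.7)   # strong instrument, serious simultaneity
assert mae_iv < mae_ols                      # IV beats OLS
mae_ols, mae_iv = mae_pair(0.3, 0.0, 0.05)  # weak instrument
assert mae_ols < mae_iv                      # OLS beats IV
```
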
We assume again that all variables in the regression have been scaled such that β = 1 and σ_ε² = 1, whereas the instruments Z have been transformed such that Z'Z = I (while still spanning the original subspace). Such an orthonormal base for this subspace is non-unique, and without loss of generality we may choose one in which only z_{i,1} is possibly correlated with ε_i, so that the second element of Σ_Zε is zero. This implies that

σ_x̂ε = σ_xz1 σ_z1ε = ρ_xz1 ρ_z1ε σ_x, (24)

where, of course, σ_z1ε = ρ_z1ε because σ_z1 = σ_ε = 1. Now the various entries in the formula of Theorem 1, viz. σ_x̂² = ρ_x̂x² σ_x² > 0, •_GIV and the coefficients c_1, c_2 and c_3, specialize to expressions in ρ_x̂x, ρ_xz1, ρ_z1ε, ρ_xε and σ_x only, collected in (25), from which c_4 and c_5 readily follow. From the above we conclude that the limiting distribution of Theorem 1 is fully determined by (and varies with) the 5 data moments σ_x, ρ_x̂x, ρ_xε, ρ_z1ε and ρ_xz1. However, in the special case ρ_z1ε = 0 the minimal set of parameters is just one-dimensional, because ρ_x̂x σ_x suffices. For the general case the resulting explicit expression for V^N_GIV is given in (26). Note that this variance is invariant to sign changes of the correlations as long as the sign of ρ_z1ε ρ_xε ρ_xz1 is not affected, or when either ρ_xε or ρ_z1ε is zero. The sign of the inconsistency •_GIV is determined by the sign of ρ_z1ε ρ_xz1. For given values of ρ_x̂x and ρ_z1ε the magnitude of •_GIV is a multiple of ρ_xz1, so it will be large when the invalid instrument is relatively strong.
For the special case ρ_xz1 = ρ_x̂x, i.e. the second instrument is orthogonal to x, the variance formula specializes to (27), which, not surprisingly, corresponds to (13). Next we examine whether, apart from n, the same number of parameters is required to obtain in all generality the finite sample distribution of GIV by generating the appropriate data processes. For that purpose the schemes (15) and (16) can be extended as follows. Let now v_i ~ IID(0, I_4). Again we take ε_i = v_{i,1} and, again restricting ourselves to positive λ_1 for symmetrically distributed v_i, we have

x_i = γ v_{i,1} + λ_1 v_{i,2} = σ_x[ρ_xε v_{i,1} + (1 − ρ_xε²)^{1/2} v_{i,2}], (28)

with σ_x² = SN (this is all similar to the earlier example with l = 1). Now, however, we have to compose the l = 2 instruments as general linear combinations of v_{i,1} through v_{i,4}. These entail full generality, because they allow for both instruments any correlation with the disturbance ε_i, any correlation with the regressor x_i and any mutual correlation. Since it is only the space spanned by these two instruments that matters for β̂_GIV, we may replace z_{i,2} by a linear combination of z_{i,1} and z_{i,2} such that it no longer depends on v_{i,1}. Hence, the general case of two possibly invalid instruments can be represented fully by that of one valid and one possibly invalid instrument, as we already argued above from the asymptotic perspective. We can perform a similar operation again, now with respect to z_{i,1}, such that it no longer depends on v_{i,4}. Next rescaling the instruments such that they have unit variance leads to the generating schemes

z_{i,1} = π_1 v_{i,1} + π_2 v_{i,2} + (1 − π_1² − π_2²)^{1/2} v_{i,3},
z_{i,2} = π_5 v_{i,2} + π_6 v_{i,3} + (1 − π_5² − π_6²)^{1/2} v_{i,4}. (29)

Due to the symmetry of v_i, generality is maintained when we restrict ourselves to cases where particular coefficients are nonnegative.
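The argument that only the space spanned by the instruments matters for β̂_GIV, which licenses the rotations above, is itself easy to demonstrate. In this sketch (our own names and illustrative values) the 2SLS/GIV estimate is unchanged when Z is replaced by Z·A for any nonsingular A.

```python
import numpy as np

# Sketch: the (G)IV / 2SLS estimator depends on the instruments only through
# the space they span, so replacing Z by Z @ A for any nonsingular A leaves
# beta_hat unchanged; names and values are illustrative.
rng = np.random.default_rng(5)
n = 300
v = rng.standard_normal((n, 4))
eps = v[:, 0]
x = 0.5 * eps + 2.0 * v[:, 1]
Z = np.column_stack([0.2 * eps + 0.4 * v[:, 1] + 0.8 * v[:, 2],
                     0.3 * v[:, 1] + 0.9 * v[:, 3]])
y = 1.0 * x + eps

def tsls(Z, x, y):
    Pz_x = Z @ np.linalg.solve(Z.T @ Z, Z.T @ x)   # projection of x on span(Z)
    return (Pz_x @ y) / (Pz_x @ x)

A = np.array([[2.0, -1.0], [0.5, 3.0]])            # any nonsingular 2 x 2 matrix
assert np.isclose(tsls(Z, x, y), tsls(Z @ A, x, y))
```
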
This extends to π_2 and π_5, because the space spanned by the instruments does not change by multiplying all elements by −1, yielding

0 ≤ π_2 ≤ 1, 0 ≤ π_5 ≤ 1. (30)

We also maintain full generality by imposing that the two instruments have zero covariance, which implies π_2π_5 + π_6(1 − π_1² − π_2²)^{1/2} = 0, from which we find

π_6 = −π_2π_5(1 − π_1² − π_2²)^{-1/2}. (31)

So, for given values of σ_x, ρ_xε and ρ_z1ε = π_1, we would be able to generate data according to (28) and (29) if we also knew π_2 and π_5. The asymptotic overall strength of the two instruments can be controlled by the population R² of the regression of x on Z = (z_1, z_2), which is

R²_xZ = σ_xZ Σ_ZZ^{-1} σ_Zx / σ_x² = σ_x̂²/σ_x². (32)

Note that

ρ_x̂x² = R²_xZ, (33)

and, since we imposed Σ_ZZ = I, we have

ρ_x̂x² = ρ_xz1² + (1 − ρ_xε²)π_5², with ρ_xz1 = ρ_xε ρ_z1ε + (1 − ρ_xε²)^{1/2}π_2.

From these we can express the (nonnegative) values of π_5 and π_2 as

π_5 = [(ρ_x̂x² − ρ_xz1²)/(1 − ρ_xε²)]^{1/2} (34)

and

π_2 = (ρ_xz1 − ρ_xε ρ_z1ε)/(1 − ρ_xε²)^{1/2}, (35)

from which π_6 follows directly by evaluating (31). Hence, we can scan the finite sample distribution of GIV for this class of model for any n over its entire parameter space by simulating data for all compatible values of σ_x, ρ_x̂x, ρ_xε, ρ_xz1 and ρ_z1ε. Here again, these are found to be those data moments that characterize the asymptotic distribution. They determine γ and λ_1 via (28) and π_2, π_5 and π_6 via (35), (34) and (31), respectively. We may restrict ourselves to cases where ρ_x̂x > 0 (since the coefficients of the simulation design are just determined by ρ_x̂x²). Note that the coefficients of the data generation process, notably π_2, are unaffected (thus yielding the same distribution of β̂_GIV) if both ρ_z1ε and ρ_xz1 are changed in sign. Therefore, we will only examine cases with ρ_xz1 ≥ 0. However, we shall also examine only nonnegative values of ρ_z1ε, because changing the signs of both ρ_z1ε and ρ_xε yields the mirror image of the distribution of β̂_GIV when the distribution of ε is symmetric.
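The recursion through (35), (34) and (31) is easily checked for internal consistency. This sketch (our own names: `rho_xhatx` stands for ρ_x̂x, and the π's follow the numbering of (29)) recovers the coefficients from a set of design parameters and verifies the implied moments.

```python
import numpy as np

# Sketch of the overidentified design: given (rho_xe, rho_z1e, rho_xz1, rho_xhatx),
# recover the coefficients of scheme (29) via (35), (34) and (31), then check the
# implied population moments; names are ours.
def design(rho_xe, rho_z1e, rho_xz1, rho_xhatx):
    pi1 = rho_z1e
    pi2 = (rho_xz1 - rho_xe * rho_z1e) / np.sqrt(1 - rho_xe**2)    # (35)
    pi5 = np.sqrt((rho_xhatx**2 - rho_xz1**2) / (1 - rho_xe**2))   # (34)
    pi6 = -pi2 * pi5 / np.sqrt(1 - pi1**2 - pi2**2)                # (31)
    return pi1, pi2, pi5, pi6

rho_xe, rho_z1e, rho_xz1, rho_xhatx = 0.3, 0.2, 0.4, 0.6
pi1, pi2, pi5, pi6 = design(rho_xe, rho_z1e, rho_xz1, rho_xhatx)

# With z1 = pi1*v1 + pi2*v2 + c3*v3, z2 = pi5*v2 + pi6*v3 + ...*v4 and
# x = sigma_x*(rho_xe*v1 + sqrt(1-rho_xe^2)*v2), the moments come out right:
c3 = np.sqrt(1 - pi1**2 - pi2**2)
rho_xz2 = np.sqrt(1 - rho_xe**2) * pi5
assert np.isclose(rho_xe * pi1 + np.sqrt(1 - rho_xe**2) * pi2, rho_xz1)
assert np.isclose(pi2 * pi5 + c3 * pi6, 0.0)              # instruments uncorrelated
assert np.isclose(rho_xz1**2 + rho_xz2**2, rho_xhatx**2)  # joint strength, cf. (33)
```
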
In line with the just identified model, the distribution of β̂_GIV for ρ_z1ε ≥ 0 is equivalent with the distribution of −(β̂_GIV − β) for ρ_z1ε ≤ 0 and −ρ_xε, because: if we change the signs of ρ_z1ε, ρ_xε and all ε_i, then the variables x_i, z_{i,1}, z_{i,2} and thus x̂_i remain the same, whereas β̂_GIV − β = Σ x̂_i ε_i / Σ x̂_i² changes sign. In the special case that no instrument is invalid we have π_1 = ρ_z1ε = 0 in (29), and thus full generality is maintained by making z_{i,2} independent of v_{i,3}, giving π_6 = 0; zero covariance of the two instruments now implies π_2π_5 = 0. Hence, we may choose π_5 = 0, resulting in the simplified generating schemes

z_{i,1} = π_2 v_{i,2} + (1 − π_2²)^{1/2} v_{i,3}, z_{i,2} = v_{i,4}. (36)

These imply ρ_xz1 = ρ_x̂x and π_2 = ρ_x̂x(1 − ρ_xε²)^{-1/2} instead of (35). Hence, when ρ_z1ε = 0 the finite sample distribution is determined by just 3 parameters (viz. σ_x, ρ_xε and ρ_x̂x) instead of 5 (apart from n), whereas the limiting distribution just depends on σ_x̂² = ρ_x̂x²σ_x². In all calculations we fixed again σ_x²/σ_ε² = 10 (which here too has only a multiplicative effect, i.e. just affects the scale of the densities); as before we chose values ρ_xε = {−0.3, 0.0, 0.3, 0.6}, ρ_z1ε = π_1 = {0.0, 0.2} and ρ_x̂x = {0.8, 0.3, 0.1, 0.01}, whereas ρ_xz1 = {ρ_x̂x, ρ_x̂x/2, ρ_x̂x/8}. The latter values are associated with decreasing relative strength of z_1 (and, complementarily, a valid instrument z_2 that is either uncorrelated with x, contributes 50% to the joint strength of the instruments, or is relatively strong). Figures 7 and 8 contain some illustrative densities for ρ_xz1 < ρ_x̂x, again for n = 50 and n = 200. Since l − k = 1, β̂_GIV does have a finite first moment now. To save space we have put more cases into one figure.
Moreover, we have omitted the cases where ρ_xz1 = ρ_x̂x (and ρ_xz2 = 0). Although we already established that these yield similar asymptotic results as the k = l = 1 case, from the simulations we found that in this situation the finite sample densities do differ slightly from the "no finite moments" case, the more so for a weaker instrument z_1, especially when ρ_xz1 = ρ_x̂x = 0.01. When both instruments are valid and ρ_x̂x = 0.01 the l = 2 case produces estimators which are slightly more efficient than the corresponding l = 1 estimators. This seems at odds with the findings in Donald and Newey (2001), which suggest that efficiency benefits when weak instruments are discarded. Note, however, that their analysis assumes that the number of instruments grows at a smaller rate than the sample size, whereas in our experiments the number of instruments is fixed. When ρ_xz1 = 0.01, ρ_xz2 = 0 and ρ_z1ε = 0.2 we find that the bimodality of the GIV estimator is less pronounced than for the IV estimator. Figure 7 presents densities for ρ_x̂x = 0.8 and 0.3, and Figure 8 for the weaker instruments. In Figure 7 the asymptotic approximations are mostly reasonably accurate, but not in Figure 8, especially when ρ_x̂x = 0.01. In the latter case we note again that the asymptotic approximations are much too pessimistic. The actual (median) bias of the GIV estimator is much less dramatic than the inconsistency •_GIV suggests, and even though both instruments are very weak and one of them is also invalid, the actual density of β̂_GIV has most of its probability mass remarkably close to the true value 1. Although Forchini (2005) suspects bimodality in the overidentified model when the instruments are valid but weak, we do not find it at ρ_x̂x = 0.01. Finally, we look again at median absolute error results. Figure 9 gives a more global impression of the accuracy of the asymptotic approximation in this model.
We present results for n = 20 only and establish that the overall instrument strength ρ_x̂x is the major determining factor, although the measures for instrument invalidity and simultaneity have an effect too. Figure 10 makes comparisons with OLS for n = 100. We note that especially in the presence of invalid instruments there is much scope for OLS to produce more efficient inference than GIV. Anyhow, our simulation results do not generally support the conclusion by Hahn and Hausman (2003) that 2SLS is the preferred estimator when n ≥ 100 and ρ_x̂x² ≥ 0.1. They arrive at this conclusion by comparing second-order asymptotic approximations to MSE.

4 Conclusions

In this paper we obtained an explicit formula for the asymptotic variance of the generalized instrumental variable estimator when some of the employed instruments are invalid. We showed that the limiting distribution of such an inconsistent estimator is normal and is centered at the pseudo-true value (the true coefficient plus the inconsistency), whereas its asymptotic variance includes a number of terms and factors additional to the standard result. It can only be expressed when one is willing to make assumptions on the first four moments of the disturbances. To obtain our results we assumed covariance stationarity of all variables, i.e. the dependent, the explanatory and the instrumental variables. In the simple illustrative models that we used, the data observations are in fact IID, as is often assumed in cross-section applications. Note, however, that our theorems also hold for time-series applications, where independence of the sample observations is unrealistic. They are also directly applicable in case non-stationary series are involved, provided the model is formulated in error correction form and the long-run multipliers (the coefficients of the cointegrating vector) have been imposed, so that the model and the instruments can all be represented by transformations of the original data that are integrated of order zero.
We examined the accuracy of our analytic large sample results in small samples by simulating a simple just identified and a simple overidentified model and establishing the actual behavior of instrumental variable estimators. Through a reparametrization of the structural and reduced form coefficients into parameters that directly express the degree of simultaneity, the degree of (in)validity of the instrument(s), the strength of the instrument(s) and the signal-to-noise ratio of the model, and by condensing the numerical results into graphic displays, it proved possible to produce a rather complete taxonomy of the behavior of the examined instrumental variable estimators over their full parameter space. There is a quickly expanding literature on the shortcomings of standard large sample asymptotic approximations to the distribution of IV estimators when the sample size is small or moderate and some of the instruments are weak but valid, and on how alternative and better approximations could be obtained. The present study shows that it is possible to obtain an explicit large sample asymptotic approximation to the distribution of IV estimators when some of the instruments are invalid. Not surprisingly, however, that approximation is found to be vulnerable too when instruments are weak. One option now would be to replace it by an alternative approximation that can cope with weakness of instruments. However, our illustrations suggest that it seems more worthwhile to abandon the employment of weak instruments altogether and just stick to strong instruments, even if they are invalid. For that situation we at least seem to have obtained here a reasonably accurate approximation to the finite sample distribution of the resulting inconsistent estimator, whereas at the same time this finite sample distribution is such that it may yield much more accurate inference than that obtained on the basis of weak instruments.

References

Andrews, D.W.K., Stock, J.H., 2005. Inference with weak instruments.
Invited paper for the 2005 World Congress of the Econometric Society, London. Cowles Foundation Discussion Paper No. 1530.
Bound, J., Jaeger, D.A., Baker, R.M., 1995. Problems with instrumental variables estimation when the correlation between the instruments and the endogenous explanatory variable is weak. Journal of the American Statistical Association 90, 443-450.
Donald, S.G., Newey, W.K., 2001. Choosing the number of instruments. Econometrica 69, 1161-1191.
Forchini, G., 2006. On the bimodality of the exact distribution of the TSLS estimator. Forthcoming in Econometric Theory.
Hahn, J., Hausman, J.A., 2003. IV estimation with valid and invalid instruments: application to the returns to education. Mimeo. To appear in Les Annales d'Economie et de Statistique.
Hahn, J., Inoue, A., 2002. A Monte Carlo comparison of various asymptotic approximations to the distribution of instrumental variables estimators. Econometric Reviews 21, 309-336.
Hale, C., Mariano, R.S., Ramage, J.G., 1980. Finite sample analysis of misspecification in simultaneous equation models. Journal of the American Statistical Association 75, 418-427.
Hall, A.R., Inoue, A., 2003. The large sample behaviour of the generalized method of moments estimator in misspecified models. Journal of Econometrics 114, 361-394.
Hillier, G., 2005. Yet more on the exact properties of IV estimators. To appear in Econometric Theory.
Joseph, A.S., Kiviet, J.F., 2005. Viewing the relative efficiency of IV estimators in models with lagged and instantaneous feedbacks. Computational Statistics and Data Analysis 49, 417-444.
Kiviet, J.F., Niemczyk, J., 2005. The asymptotic and finite sample distributions of OLS and simple IV in simultaneous equations. UvA-Econometrics discussion paper 2005/01.
Maasoumi, E., Phillips, P.C.B., 1982. On the behavior of inconsistent instrumental variable estimators. Journal of Econometrics 19, 183-201.
Maddala, G.S., Jeong, J., 1992.
On the exact small sample distribution of the instrumental variable estimator. Econometrica 60, 181-183.
Phillips, P.C.B., 1980. The exact distribution of instrumental variable estimators in an equation containing n + 1 endogenous variables. Econometrica 48, 861-878.
Phillips, P.C.B., 2005. A remark on bimodality and weak instrumentation in structural equation estimation. Cowles Foundation Discussion Paper No. 1540.
Rothenberg, T.J., 1972. The asymptotic distribution of the least squares estimator in the errors in variables model. Unpublished mimeo.
Sawa, T., 1969. The exact sampling distribution of ordinary least squares and two-stage least squares estimators. Journal of the American Statistical Association 64, 923-937.
Woglom, G., 2001. More results on the exact small sample properties of the instrumental variable estimator. Econometrica 69, 1381-1389.

A Proof of Theorem 1

Because the estimator β̂_GIV tends for an increasing sample size not to β but to β_GIV, in order to establish its limiting distribution we should not focus on √n(β̂_GIV − β) but choose a center of the distribution that tends to β_GIV too; see Rothenberg (1972). For the sake of simplicity we shall center at β_GIV itself. Note that

√n(β̂_GIV − β_GIV) = √n[(X̂'X̂)^{-1}X̂'ε − •_GIV]. (37)

To obtain the limiting distribution we shall rewrite the right-hand side of (37) such that we can invoke the Lemma given at the end of Section 2. Below we first show that (37) can be rewritten as

√n(β̂_GIV − β_GIV) = (n^{-1}X̂'X̂)^{-1} n^{-1/2}[W'ε + ω(ε'ε − nσ_ε²)] + o_p(1) (38)

for an appropriate n × k matrix W with E(W'ε) = 0, and a nonrandom k × 1 vector ω. Next, invoking also a theorem often attributed to Cramér, the Lemma yields

√n(β̂_GIV − β_GIV) → N(0, σ_ε² Σ_X̂X̂^{-1} plim_{n→∞}[n^{-1}W'W + 2σ_ε² ωω'] Σ_X̂X̂^{-1}) (39)

upon assuming that the third and fourth standardized moments of the disturbances equal 0 and 3, respectively. We first set out to rewrite (37) in the form (38).
Using ^ p n( ^ GIV GIV ) = p = 1 ^0 ^ ( X X) n ^ 0 X) ^ n(X 1 1 f ^ 0 [Z 0 " (Z 0 Z) E(Z 0 ")] + ^ 0 E(Z 0 ") 1 p f ^ 0 [Z 0 " n 20 E(Z 0 ")] + n 2 ^0 "[ 2 " )] n + op (1); (38) 1 vector !: Next, invoking also 2 0 " !! 1 1 ^ 0X ^ X ; (39) Z 0 X we get 2 ^0 ^ "X X 1 ^0 ^ XX n 1 ^ 0X ^ X 1 ^ 0X ^ X X0Z X0Z 1 Z0Z 1 Z0Z ] g(40) g: For the second expression between square brackets in the …nal line of (40) we …nd 1 ^0 ^ 1 X X X^ 0 X^ n ^ 0 ( Z 0 Z 1 Z 0 Z) n 0 ^ ( Z 0 Z 1 Z 0 Z) n ^0 = = X0Z 1 Z0Z 1 Z0Z 1 Z0Z (41) 1 + ( X 0Z n 1 + [( X 0 Z n 1 ^0 ^ XX n 1 ^ 0X ^ X X0Z ) +( 1 Z0Z X0Z ) 1 ^0 ^ X X) n ^ 0X ^ X 1 ^ 0X ^ X 1 Z0Z ; X 0 Z )] and this contains a factor which can be rewritten as 1 ^0 ^ XX n ^ 0X ^ X = = ( ^ 0 1 Z 0X n 1 0 ^ ) Z0X ^ 0( 1 Z 0X Z0X ) Z0Z n 1 0 1 1 0 X Z) Z 01Z X Z[( Z 0 Z) 1 n n n 1 0 1 0 X Z) Z 01Z ^ 0 [( Z 0 Z Z Z) n n 1 Z0Z X0Z X0Z = f( X0Z = f( X0Z (42) Z0X 1 Z 0 Z ]g 1 Z 0 Z ]g ^ 0( 1 Z 0X n 1 0 ^ ( Z 0X n Z0X Z0X Z0X ) Z 0 X ): Now substituting the decompositions obtained in (41) and (42) into the expression within curly brackets in the …nal line of (40) we obtain 1 ^0 ^ 1 X X X^ 0 X^ n 1 0 E(Z 0 ")] + n 2" ^ 0 ( Z 0 Z Z Z) n ^ 0 [Z 0 " ^ 0 [Z 0 " = +n = ^ 0 [Z 0 " +n 2 " E(Z 0 ")] + n 2 1 0 "( X Z ( X0Z X0Z ) n E(Z 0 ")] n 2 ^0 "[ 1 Z0Z 2 ^0 1 0 ( ZZ " 1 0 X Z) n Z0Z ) n ^ 0( 2 "( +n Z0Z 1 0 Z Z) n 1 ^0 ^ X X) n +n 1 Z0Z 1 ^ and • GIV where we substituted Z 0 Z Z 0 X = plim (vi) of Framework A, we can employ (43) 1 Z0Z ^ 0X ^ X 1 Z0Z 1 Z0Z ] X0Z 2 " 1 ^ 0X ^ X 2 1 0 "( X Z n Z0X 1 ^ 0X ^ X X0Z 1 Z0Z X0Z ) 1 Z0Z ^ 0( 1 Z 0X n X0Z 1 Z0Z Z0X ) • GIV ; : Now exploiting item Z 0 " E(Z 0 ") = Z 0 " + ("0 " n 2" ) ; 1 0 1 0 1 1 ZZ = ZZ E(Z 0 Z j Z) + E(Z 0 Z j Z) Z0Z Z0Z n n n n 1 0 0 1 0 1 = Z" + " Z + ("0 " n 2" ) 0 + op (n 1=2 ); n n n 1 0 1 0 1 1 0 XZ = XZ E(X Z j X; Z) + E(X 0 Z j X; Z) X0Z n n n n 1 0 0 1 0 1 0 = X" + " Z + (" " n 2" ) 0 + op (n 1=2 ); n n n X0Z so that the …nal expression given for (43) can be written as ^ 0 Z 0 " + ("0 " + + 
2 0 0 "X " ^0 0 n 1 Z0Z n 2" ) (" " ^ 0 Z 0 " 0 • GIV 2 ^0 ") + 2" "0 Z 0 • GIV ^0 0 (" " 1 2 ^0 0 0 Z" " Z0Z 1 2 0 Z 0 Z + " (" " X 0 " 0 • GIV n 2" ) 0 • GIV + n 2 ^0 0 " Z Z 01Z " 0 1 2 ") Z0Z + "0 Z • GIV op (n1=2 ): 21 2 ^0 0 (" " n 2" ) " ^ 0 Z 0 " 0 • GIV + ^ 0 ("0 " n 2" ) 0 • GIV 0 1 Z0Z "0 Z • GIV ^ 0 "0 X • GIV This can be simpli…ed further by using c2 1 2 0 " Z0Z ; 0 • GIV = c3 0• c1 ^ 0 Z 0 " c1 GIV 2 0 " 1 Z0Z 1 ^ 0X ^ X Z0X 1 Z0Z X0Z ; ; giving ^ 0 Z 0 " + ("0 " n 2 ^0 ") +c1 X 0 " + 2 " 0 1 0 Z0Z Z " c2 X 0 " •0 0 GIV = + c1 ("0 " Z 0" n c2 ("0 " n ^ 0 •0 [c4 Ik 2 ^0 " +(c4 + c5 GIV ]X 0 ^0 0 0 1 0 Z0Z Z " ^ 0 • 0GIV X 0 " 2 ") n 2 ") + op (n 0 Z 0 " + c2 ("0 " c3 ^ 0 Z 0 " ^ 0 )( 1=2 2 ^0 ") n 0 + c2 ^ 0 Z 0 " + ^ 0 • GIV 2 ") " + [c5 ^ 0 + ( )(" " c1 ("0 " c3 ("0 " •0 2 0 " GIV X0Z ) 2 ^0 ") n n 2 ^0 ") + op (n1=2 ) 1 0 Z 0 Z ]Z " (44) ); where c4 c1 c2 and c5 1 c3 c4 : Note that (44) is equal to the factor in curly bracketspin the …nal line of (40), and we want to derive its limiting distribution after scaling by the factor 1= n; so we may neglect the remainder term. Cramér’s theorem implies that in deriving this limiting distribution we may replace ^ by its probability limit : Hence, de…ning the k k matrix 1 ; the k l matrix 2 and the k 1 vector !; such that c5 0 + ( c4 + c5 2 ! •0 0 c4 Ik 1 GIV 0 0 ; (45) •0 GIV 2 0 " )( 1 Z0Z ; X0Z ) ; we can invoke the Lemma now with W0 = For the case 3 = 0 and 4 1X 0 + 2Z 0 : (46) = 3 we then obtain the limiting distribution 1 p n 0 "+ 2Z + 2 Z0Z 1X 0 " + !("0 " 2 ") n 2 " V0 ); ! N(0; where V0 = 1 X0X 0 1 0 2 2 0 " !! 
+2 1 X0Z 0 2 Z0Z 2 " 0 + + 2 Z0X 0 1: (47) In evaluating V0 we can make use of plim n plim n plim n plim n Z0Z X0X Z0X X0Z Z 0Z = 1 0 XX= 1 0 ZX= 1 0 XZ= 1 2 " 2 " X0X Z0X 0 Z0X 0 0 and …nd that V0 can be expressed as 0 [c4 Ik •0 2 0 )[c4 Ik " 0 0 )( 2" 0 • GIV X 0 Z ) 0 ][c4 0 + c5 0 ] GIV +[c5 0 + ( +2 2" [c4 + c5 0 •0 +[c4 Ik ]( GIV +[c5 0 0 +( X0X ]( X0Z )( 2 0 " 2 0 )[c5 " 0 • GIV X 0 Z ) • GIV 1 Z 0 Z ]( + 0 ] 1 2 Z0Z ( " 1 Z 0 Z ]( 0 2 " Z0Z )[c5 + • )( Z0X 2 " Z0X 0 GIV )[c4 Ik 1 2 Z0Z ( " 0 • 0 GIV Z0X • )] 0 ]: Next we examine these 5 terms of V0 one by one. The …rst one is c24 ( 2 " 0 ) c4 ( 2 " X0X 0 ) • GIV 0 0 2 0 2 0 • ) + 0 • GIV ( X 0 X ) GIV 0 GIV ( X 0 X " " 0 0 2 2 2 0 2 c4 X 0 X c4 X 0 X • GIV + " c3 c4 " c4 0 0 2 2 0 c4 0 • GIV X 0 X + 2" c3 c4 0 0 + ( • GIV X 0 X • GIV " c3 ) c4 = X0X 0 •0 22 0 ; GIV )( 0 0 )] the second one is [c5 0 0 0 2 " [c5 0 = c25 0 +( Z0Z 0 2 0 " )( 0 2 0 " c5 [c5 0 2 0 + " [c5 0 ( 0 ( )( 2 0 " 0 0 +( 0 ( )( 2 0 " c5 (c5 2 0 " [c1 c5 2 0 " [c1 c5 + • Z0X 2 " )( Z0X 0 +( 0 0 0 )(c1 0 )(c1 0 0 )( 2 " c5 2 0 " )( 2 " + c1 c1 0 +( )c1 c4 0 ] + 0 0 + c1 c4 ( 0 1 Z0Z 1 Z0Z 0 0 2 " c5 2 0 " c5 0 2 " c2 + )( 0 GIV )( = c25 c4 ) 0 + 2c5 ) + c25 ] 0 2 " c4 (1 ^ 0X ^ + X 2 " [c4 (c4 )] 1 2 " c4 (c4 0 • 0 0 0 GIV • 0 )] • Z0X GIV )( Z0X • GIV 0 0 0 )( 0 )] ) )] X0Z ) • 2 " c4 c5 0 0 0 ) 2 •0 " GIV 0 0 + c2 +( 2 0 [c c " 2 5 c5 ) ) 0 2 •0 " GIV 0 0 ) ) )c2 c4 0 ] + c2 c4 ( 0 0 1 0 0 0 0 ) GIV 0 2 " c5 + 0 GIV 0 0 0 2 2 0 0 2 + 2" c2 ^ 0X ^ + " c4 " c4 " c2 X 0 0 2 + 2" c4 0 + 2" (c4 c5 )c5 0 0 " c4 2 0 0 2 2 0 + 2" c24 0 0 + 2" c4 c5 0 " c4 c5 " c4 = c25 0 ) 0 2 " c2 )( c2 2 " [c2 c5 0 0 ] 0 c5 0 )( )] 2 " 0 0 0 0 Z0X 0 0 )+( 0 • GIV 1 2 Z0Z ( " 2 " c5 0 1 Z0Z • GIV c2 )]( 0 )( Z0X 0 0 c5 • GIV X 0 Z + c5 0 • GIV 0 0 + c5 0 Z 0 X • GIV 0 0 • ) Z0X c2 0 c2 )]( 2" ( 0 2 " c2 GIV 0 0 2 0 " c5 0 2 0 " c5 0 2 " c2 ) • 0 )(c1 0 0 0 GIV •0 GIV 0 +( 2 " c5 2 " c5 GIV )+( X0Z ) 1 Z0Z 0 0 •0 1 Z0Z 1 Z0Z 0 2 " c1 0 2 " c1 0 )( 0 1 2 Z0Z ( " 0 
][c5 + X0Z GIV 2 " c5 +( + Z0Z 2 0 0 " c5 •0 0 2 " c5 c5 0 2 0 " 0 = c5 + GIV )( 2 0 " c5 [c5 0 2 0 " [c5 0 2 0 " [c5 + 2 0 " )( •0 0 0 )( 2" 0 • GIV 0 )(c1 0 c2 0 )][( 2" +( + Z0Z 2 0 0 " c5 +( 0 1 Z0Z 0 = c25 + 2 0 " GIV X 0 Z )][c5 + 1 •0 GIV X 0 Z ) Z 0 Z )( + c5 ( Z0Z +( 0 +( •0 2 0 " )( 0 0 0 0 )] 0 0 2 " c4 c5 + 2" c24 0 0 0 0 2 " c4 (c4 + 1 0 0 2 2 " c4 0 c5 ) ; the third one is 2 2 " c4 2 0 +2 0 •0 0 +2 X 0 Z ][c5 + 2 " c4 c5 0 +2 0 0 ; GIV )( 0 2 2 " c5 the fourth one is [c4 = c4 c5 X 0 Z + 2" c4 X 0 Z + 2 0 " c2 2 " c4 c5 2 " c1 c4 0 0 0 X0Z [ 2 " c4 c5 0 0 0 0 •0 + + 2 " •0 1 Z0Z GIV 0 •0 GIV GIV X0Z c4 X0Z GIV 2 " c3 c5 2 " c1 c3 X0Z 0 0 0 0 0 ][c5 1 2 Z0Z ( " + Z 01Z ( 2" Z0X • Z0X • 0 1 1 2 • " c4 X 0 Z Z 0 Z Z 0 Z Z 0 X GIV 0 1 0 • + 2" c2 0 0 Z 0 Z Z 0 X GIV 0 2 + 2" c1 c3 0 0 + 2" c2 c4 " c1 c4 0 2 + 2" c2 c3 0 0 " c2 c4 23 GIV 0 )( 0 + c4 •0 GIV 0 0 )] 0 )] X0Z X0Z 2 " c2 c3 1 Z0Z 1 Z0Z 0 0 Z0X Z0X • • GIV GIV 0 0 X0Z 2 0 " c2 2 " c4 (c4 + = c4 c5 0 2 2 " c4 ^ 0X ^ X 0 + 2" c4 0 0 0 + 2" c2 0 0 + 2" c3 (c5 c4 ) 0 + c5 ) 2 0 " c5 2 0 " c2 0 + 2 " c4 (c4 = c4 c5 0 0 c5 ) 2 0 0 " c4 0 2 0 " c2 0 2 2 0 " c4 0 0 2 " c3 c4 + + + 2 " c3 c4 0 0 c4 ) 0 0 2 " c3 (c5 + 0 0 2 " c4 0 0 2 " c4 0 0 2 " c5 ; and the …nal …fth one is the transpose of the fourth. 
Collecting terms we …nd V0 2 2 0 c4 X 0 X • GIV 0 + 2" c3 c4 0 " c4 0 0 0 2 2 0 c4 0 • GIV X 0 X + 2" c3 c4 0 0 + ( • GIV X 0 X • GIV " c3 ) +(1 c4 c3 )2 X^ 0 X^ + 2" c4 (1 c4 ) 0 + 2" c4 (2c4 + c3 2) 0 0 + 2" c4 (2c4 + c3 2 1 2c5 ) + c25 ] 0 0 + 2 2" c24 0 + 2 2" c4 c5 0 0 + 2 2" c4 c5 0 + 2 2" c45 " [c4 (c4 +2c4 c5 X^ 0 X^ 2 2" c24 0 + 2" c4 (c4 + c3 c5 ) 0 + 2" c4 (c4 + c3 c5 ) 0 0 +2 2" c3 (c2 c1 + c5 ) 0 0 2 2" c5 0 0 = c24 X0X 0 2 c4 X 0 X • GIV 0 c4 0 • GIV X 0 X ^ 0X ^ + c5 X ^ 0X ^ X 0 X + 2c4 c5 X 2 2 0 + 2" c4 (1 c4 ) 0 + 2 2" c24 0 2 2" c24 0 " c4 + 2" c3 c4 0 + 2" c4 (2c4 + c3 2) 0 + 2 2" c4 c5 0 + 2" c4 (c4 + c3 + 2" c3 c4 0 0 + 2" c4 (2c4 + c3 2) 0 0 + 2 2" c4 c5 0 0 + 2" c4 (c4 + c3 0 0 2 2 0 2 +( • GIV X 0 X • GIV 1 2c5 ) + c25 ] 0 0 " c3 ) " [c4 (c4 +2 2" c25 0 0 + 2 2" c3 (c2 c1 + c5 ) 0 0 2 2" c5 0 0 2) 0 0 0 = c24 = c24 + + = c24 + + (2c4 + c5 )c5 X0X 2 " c4 (1 2 2 " f c3 0 2c4 ) + [c4 (c4 1 c3 )2 + [(1 X0X 2 " c4 (1 0 2c5 ) X0X • 0 GIV 0 2 " c4 (c4 c4 ^ 0X ^ X 2 " c4 (1 2c5 ) 0 •0 + c3 c5 ) c4 ) 2c5 g 0 • c4 X0X •0 0 • 0 0 X0X GIV 2 " c4 (c4 0 GIV 0 0 + ( • GIV X0X 0 c4 + c3 c5 ) + 2 2 2c5 ) + c5 ] + 2c5 + 2c3 (c5 c24 ] + c4 ^ ^ 0X X c5 ) 0 c5 ) 0 0 ) 0 + • GIV X0X • GIV 0 0 c3 ] 0 0 X0X GIV GIV 0 0 + + 2 " c4 (1 2c4 ) 2 " [c5 (1 2c5 ) 0 0 : Thus, the asymptotic variance of the (generalized) GIV estimator (39) is N VGIV = 2 " [(1 + c3 )2 4 " c4 [1 +[c5 (1 0 1 1 1 2 2 0 c4 [ X^10 X^ X 0 X • GIV • GIV ^ 0X ^ + " c4 X ^ 0X ^ X X X ^ 0X ^ X 0 0 1 1 2 2c5 )[ X^10 X^ • GIV + • GIV 0 X^10 X^ ] ^ + " c4 (1 ^ ^ 0X ^ 0X X X 0 0 c3 + " 2 ( • GIV X 0 X • GIV )] • GIV • GIV ; c24 ] 2c4 ] 2c5 ) which gives the result of Theorem 1. Note that when k = l we have c1 = c2 ; so c4 = 0 and c5 = 1 to N VIV = 2 2 " c5 = 2 2 " c5 1 ^ 0X ^ X 1 ^ 0X ^ X + [(1 c3 )(2c3 [2c23 1) c3 + •0 " ( IV 2 2c3 + 1 2 0 + • GIV • GIV 1 ^ 0X ^] X c3 ; and then the variance simpli…es 0 ( • IV • • IV )] IV • • •0 X 0 X IV )] IV IV ; " X0X X0X •0 IV which is the result of Corollary 1. 
B Proof of Theorem 2

When $\gamma_4 \neq 3$ there is an additional contribution to the asymptotic variance, for which we have to evaluate $\sigma_\varepsilon^4\,\omega\omega'$. In obtaining the third term of (47) we already found the expression for $\omega\omega'$, which consists of four terms with coefficients $c_4^2$, $c_4 c_5$ (twice) and $c_5^2$; pre- and postmultiplying by $(\frac{1}{n}\hat X'\hat X)^{-1}$ then gives the corresponding contribution. In the overall asymptotic variance this term has factor $\gamma_4 - 1$. The expression for $V^{NN}_{GIV}$ already contains it with factor 2, so the additional term given in Theorem 2 has factor $\gamma_4 - 3$.

When $\gamma_3 \neq 0$ there is another additional contribution to the asymptotic variance, for which we have to evaluate $\gamma_3 \sigma_\varepsilon^3\,(W'\iota\,\omega' + \omega\,\iota' W)$. Expressing $W'\iota$ and $\omega$ in terms of $c_4$, $c_5$, $\bar\beta_{GIV}$ and the sample moments of $X$ and $Z$, pre- and postmultiplying by $(\frac{1}{n}\hat X'\hat X)^{-1}$, and collecting terms, the additional term follows. This expression can be simplified slightly when we assume that both matrices $X$ and $Z$ have a first column of ones.
Then

$$\Big(\tfrac{1}{n}\hat X'\hat X\Big)^{-1}\tfrac{1}{n}\hat X'\iota = e_{k,1}, \qquad \Big(\tfrac{1}{n}Z'Z\Big)^{-1}\tfrac{1}{n}Z'\iota = e_{l,1},$$

where $e_{f,g}$ denotes an $f \times 1$ unit vector, which has all elements equal to zero, apart from a unit element in position $g$. Substituting these unit vectors yields the simplified form of the additional term. When $k = l$, i.e. $c_1 = c_2$, so that $c_4 = 0$ and $c_5 = 1$, $V^{NN}_{GIV}$ specializes to the corresponding expression $V^{NN}_{IV}$.

Figure 1: Densities for $k = l = 1$; $\beta = 1$; $\sigma_x^2/\sigma_\varepsilon^2 = 10$; $n = 50, 200$ ($\rho_{xz} = 0.8$).

Figure 2: Densities for $k = l = 1$; $\beta = 1$; $\sigma_x^2/\sigma_\varepsilon^2 = 10$; $n = 50, 200$ ($\rho_{xz} = 0.3$).

Figure 3: Densities for $k = l = 1$; $\beta = 1$; $\sigma_x^2/\sigma_\varepsilon^2 = 10$; $n = 50, 200$ ($\rho_{xz} = 0.1$).

Figure 4: Densities for $k = l = 1$; $\beta = 1$; $\sigma_x^2/\sigma_\varepsilon^2 = 10$; $n = 50, 200$ ($\rho_{xz} = 0.01$).

Figure 5: $\log[MAE(\hat\beta_{IV})/AMAE(\hat\beta_{IV})]$ for $k = l = 1$; any SN; $n = 20, 100$.

Figure 6: $\log[MAE(\hat\beta_{OLS})/MAE(\hat\beta_{IV})]$ for $k = l = 1$; any SN; $n = 20, 100$.

Figure 7: Densities for $l - 1 = k = 1$; $\beta = 1$; $\sigma_x^2/\sigma_\varepsilon^2 = 10$; $n = 50, 200$.

Figure 8: Densities for $l - 1 = k = 1$; $\beta = 1$; $\sigma_x^2/\sigma_\varepsilon^2 = 10$; $n = 50, 200$.

Figure 9: $\log[MAE(\hat\beta_{GIV})/AMAE(\hat\beta_{GIV})]$ for $k = l - 1 = 1$; any SN; $n = 20$.

Figure 10: $\log[MAE(\hat\beta_{OLS})/MAE(\hat\beta_{GIV})]$ for $k = l - 1 = 1$; any SN; $n = 100$.
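The kind of comparison plotted in Figures 5 through 10 can be sketched with a hedged simulation (not the authors' code; the correlation values are picked from the grids shown in the figures, with unit variances for simplicity). It contrasts the median absolute error (MAE) of OLS and of simple IV for a strong ($\rho_{xz} = 0.8$) versus a weak ($\rho_{xz} = 0.1$) instrument, both mildly invalid ($\rho_{z\varepsilon} = 0.2$), under simultaneity $\rho_{x\varepsilon} = 0.3$:

```python
import numpy as np

# Sketch (not the authors' code) of the MAE comparisons behind Figures 5-10:
# median absolute estimation error of OLS and simple IV around beta, for a
# strong versus a weak instrument, both mildly invalid (rho_ze = 0.2),
# with simultaneity rho_xe = 0.3; unit variances throughout.
rng = np.random.default_rng(1)

def mae_ols_iv(rho_xz, rho_ze=0.2, rho_xe=0.3, beta=1.0, n=100, reps=3000):
    # joint normal (x, z, eps) with the prescribed correlation matrix
    R = np.array([[1.0, rho_xz, rho_xe],
                  [rho_xz, 1.0, rho_ze],
                  [rho_xe, rho_ze, 1.0]])
    C = np.linalg.cholesky(R)
    err_ols, err_iv = [], []
    for _ in range(reps):
        x, z, e = C @ rng.standard_normal((3, n))
        y = beta * x + e
        err_ols.append(abs((x @ y) / (x @ x) - beta))   # OLS slope error
        err_iv.append(abs((z @ y) / (z @ x) - beta))    # IV slope error
    return np.median(err_ols), np.median(err_iv)

results = {rho_xz: mae_ols_iv(rho_xz) for rho_xz in (0.8, 0.1)}
for rho_xz, (m_ols, m_iv) in results.items():
    # log MAE ratio in the spirit of Figure 6: positive favours IV over OLS
    print(rho_xz, round(np.log(m_ols / m_iv), 2))
```

With these hypothetical settings the strong-but-invalid instrument typically yields a far smaller MAE than the weak one with the same degree of invalidity, in line with finding (b) of the paper that instrument weakness is more detrimental than mild invalidity.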