Journal of Economic Growth, 7, 25–41, 2002
© 2002 Kluwer Academic Publishers. Manufactured in The Netherlands.

Income Inequality and Economic Growth: Evidence from American Data

UGO PANIZZA
Research Department, Inter-American Development Bank, Stop W-0436, 1300 New York Avenue, NW, Washington DC 20577, USA

While most cross-country studies find a negative relationship between income inequality and economic growth, studies that use panel data suggest the presence of a positive relationship between inequality and growth. This paper uses a cross-state panel for the United States to assess the relationship between inequality and growth. Using both standard fixed effects and GMM estimations, this paper does not find evidence of a positive relationship between inequality and growth but finds some evidence in support of a negative relationship between inequality and growth. The paper, however, shows that the relationship between inequality and growth is not robust and that small differences in the method used to measure inequality can result in large differences in the estimated relationship between inequality and growth.

Keywords: inequality, endogenous growth, political economy

JEL classification: D31, E62, P16, O41, I22

1. Introduction

The purpose of this paper is to study the relationship between inequality and growth using a panel of income distribution data that covers the 48 states of the continental US for the 1940–1980 period. This is an interesting experiment because while cross-country studies found a negative relationship between inequality and growth (Perotti, 1996), more recent work showed that panel estimations yield a positive relationship between inequality and growth (Li and Zou, 1998; Forbes, 2000).
At the same time, existing work that used US cross-state data found a negative relationship between inequality and growth when inequality is measured with the income share of the third quintile and a positive relationship between inequality and growth when inequality is measured with the Gini index (Partridge, 1997). This paper shows that panel estimations based on US data yield no support for the presence of a positive relationship between changes in inequality and changes in growth and concludes that, at the US cross-state level, there is no clear, robust relationship between inequality and growth and that small differences in the method used to measure inequality and in the econometric specification yield substantial differences in the estimated relationship between inequality and growth. In particular, the paper shows that, while there is a class of fixed effects specifications that yields a negative (or at least nonpositive) relationship between changes in inequality and changes in growth, the results are not extremely robust and are likely to be plagued by several econometric problems.

While Kaldor (1957) suggested that there was a trade-off between equity and growth, the recent theoretical literature often predicts a negative link between inequality and growth (Galor and Zeira, 1993; Alesina and Rodrik, 1994; Persson and Tabellini, 1994).¹ On the empirical side, most cross-country studies find support for a negative relationship between inequality and growth (Alesina and Rodrik, 1994; Persson and Tabellini, 1994; Perotti, 1996; Easterly, 2001). However, Forbes (2000) suggests that country-specific, omitted variables are the cause of a significant negative bias in the estimations of the effects of inequality on growth and concludes that fixed effects estimations yield the consistent result of a positive short-term correlation between inequality and growth.
Barro (2000), using a larger sample and three-stage least squares, finds a positive relationship between inequality and growth in developed countries and a negative relationship between inequality and growth in developing countries. Banerjee and Duflo (1999), however, argue that the relationship between inequality and growth is non-linear and that changes in inequality are associated with lower subsequent growth.

The main problem affecting cross-country studies of the relationship between income distribution and growth is the quality and comparability of the inequality data. Even though the data set assembled by Deininger and Squire (1996) greatly improved the quality of the available data on income inequality, this data set is far from being problem free. In particular, Atkinson and Brandolini (1999) and Székely and Hilgert (1999) show that, in the cases of OECD and Latin American countries, even the "high quality" data set of Deininger and Squire is plagued by serious problems of data comparability and quality. Székely and Hilgert also show that Forbes's (2000) results could be dependent on the method used to compute inequality.

A possible solution to these problems consists of using regional data. As the experience of the American states represents a very important source of data for studying the determinants of long-run growth (Barro and Sala-i-Martin, 1991), this paper tackles the data quality issue by building a cross-state panel of income distribution and using this data set to explore the links between inequality and growth. While this paper does not find evidence for a positive relationship between inequality and growth, Partridge (1997), who uses data for American states similar to the ones used in this paper, finds a positive correlation between the income share of the middle class and economic growth and a positive correlation between the Gini index and growth.
One of the main messages of Partridge's work is that different measures of inequality can convey very different messages. There are two main differences between this paper and Partridge's. First, while this paper uses tax data for the 1940–1980 period, Partridge uses Census data for the 1960–1980 period.² Second, while this paper concentrates on fixed effects and GMM estimations, Partridge focuses on pooled OLS. This paper shows that both the data and the econometric methodology play a role in explaining the differences between the results of this paper and Partridge's.

The paper is organized as follows: Section 2 illustrates the costs and benefits of using regional data, discusses the methods used in deriving the income distribution data set, and describes the variability of the cross-state data set; Section 3 presents the results of the reduced form estimates of an equation that links income distribution to economic growth and tests the robustness of these estimates; Section 4 concludes.

2. The Data

The measures of income distribution used in this paper were built using adjusted gross income data from the annual reports, Statistics of Income (SOI), published by the Internal Revenue Service. The SOI data were used to compute Gini indices and break up the population into quintiles for 1940, 1950, 1960, 1969, and 1980.³ The SOI data are based on pre-tax adjusted gross income. They include capital gains but exclude interest on state and local bonds and most transfer income. This is a good source of data for at least three reasons: (i) since tax evasion is not a substantial problem in the US, data on taxation are likely to be more accurate than the data of other surveys (such as the Census and the CPS), (ii) some theoretical models link economic growth to pre-tax and transfer income distribution, and (iii) pre-tax data show more variability than after-tax data.
The biggest problem with this data set is that it does not capture the income of the people who are not required to fill out a tax return.⁴

As the SOI data are grouped in income classes, it was necessary to use approximation techniques both to divide the population into quintiles and to estimate the Gini indices. To compute the quintiles, I used the split histogram method suggested by Cowell (1995). Technical details on the derivation of the quintiles are provided in the Appendix. The Gini index was instead computed by using a simple linear approximation to the Lorenz curve. It is well known that this method systematically understates inequality. However, Gastwirth (1972) shows that if the number of groups is large enough, the error is small.

One key issue with cross-state data is their limited variability. To compare the variability of the cross-state data set used in this paper with the cross-country variability, I concentrate on the coefficients of variation of the Gini index and income share of the third quintile (Q3) (Table 1 contains the summary statistics of the data set used by Perotti, 1996, and Table 2 the summary statistics for the data set used in this paper). As expected, cross-country data have larger variability, but the difference between the variability of the two data sets is not as dramatic as one would think. In fact, the coefficients of variation for the cross-state data set are often greater than one-third of the cross-country coefficient of variation.

As this paper focuses on fixed effects estimations, within-state variability is more important than cross-state variability. Rather than looking at the behavior of inequality in one state at a time, I study the degree of within-state variability by regressing the inequality indices on a set of state and decade dummies and check what proportion of the total variance of the inequality index is explained by this set of dummies.
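Two computations described in this section can be sketched in a few lines: the Gini index obtained by joining the grouped-data Lorenz points with straight lines, and the share of variance explained by state and decade dummies. This is a minimal illustration, not the code used to build the data set. The function names and toy numbers are hypothetical, and the second function assumes a balanced panel, in which case the fitted value from the dummy regression is the state mean plus the decade mean minus the grand mean.

```python
from collections import defaultdict


def gini_from_groups(pop_counts, total_incomes):
    """Gini index from grouped data, joining Lorenz-curve points with straight lines.

    pop_counts[k] and total_incomes[k] are the number of units and the total
    income in income class k, with classes ordered from poorest to richest.
    """
    n_total = sum(pop_counts)
    y_total = sum(total_incomes)
    area = 0.0        # area under the piecewise-linear Lorenz curve
    cum_prev = 0.0    # cumulative income share at the previous class boundary
    cum = 0.0
    for n_k, y_k in zip(pop_counts, total_incomes):
        cum += y_k / y_total
        area += (n_k / n_total) * (cum_prev + cum) / 2.0  # one trapezoid per class
        cum_prev = cum
    return 1.0 - 2.0 * area


def dummies_r2(panel):
    """R^2 from regressing a variable on state and decade dummies.

    panel maps (state, decade) -> value and must be balanced (assumption):
    every state is observed in every decade.
    """
    by_state, by_decade = defaultdict(list), defaultdict(list)
    for (state, decade), value in panel.items():
        by_state[state].append(value)
        by_decade[decade].append(value)
    if len(panel) != len(by_state) * len(by_decade):
        raise ValueError("panel must be balanced")
    grand = sum(panel.values()) / len(panel)
    s_mean = {s: sum(v) / len(v) for s, v in by_state.items()}
    d_mean = {d: sum(v) / len(v) for d, v in by_decade.items()}
    ssr = sum((v - (s_mean[s] + d_mean[d] - grand)) ** 2
              for (s, d), v in panel.items())
    sst = sum((v - grand) ** 2 for v in panel.values())
    return 1.0 - ssr / sst


# Toy data (hypothetical): three income classes, and a 2-state, 2-decade panel.
print(gini_from_groups([50, 30, 20], [100.0, 150.0, 250.0]))
toy_panel = {("A", 1940): 0.38, ("A", 1950): 0.41,
             ("B", 1940): 0.36, ("B", 1950): 0.44}
print(dummies_r2(toy_panel))
```

Because the straight-line segments ignore inequality within each income class, the first function gives a lower bound on the Gini index, which is the understatement noted above.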
A high $R^2$ would indicate that there is limited within-state and within-decade variability which, in turn, would lead to unstable fixed effects estimations.⁵ This experiment indicates that, while data variability is not a serious problem for Q3 ($R^2 = 0.38$), it could be a problem for the Gini index ($R^2 = 0.76$).

Table 1. Cross-country data. Income distribution and income per capita.

                              Min.              Max.              Av.      C.V.
Gini                                            0.62              0.448    0.18
Third Quintile                0.07 (Gabon)      0.18 (Denmark)    0.13     0.19
Third and Fourth Quintiles    0.225 (Kenya)     0.42 (Denmark)    0.342    0.156
Income Per Capita             208 (Tanzania)    7380 (USA)        2190     0.85

Note: All the variables are measured around 1960. Income per capita is at 1980 prices.

Table 2. Cross-state data.

Income Share of the Third Quintile
Year         Min.          Max.          Av.      C.V.
1940         0.105 (DE)    0.182 (ND)    0.163    0.07
1950         0.126 (DE)    0.179 (WA)    0.161    0.05
1960         0.097 (CA)    0.212 (NV)    0.164    0.11
1970         0.148 (FL)    0.192 (WY)    0.163    0.06
1980         0.118 (TX)    0.174 (CT)    0.153    0.05
1940–1980    0.097         0.212         0.161    0.08

Gini Index
Year         Min.          Max.          Av.      C.V.
1940         0.32 (ID)     0.584 (DE)    0.365    0.11
1950         0.305 (GA)    0.535 (DE)    0.411    0.06
1960         0.404 (UT)    0.673 (MT)    0.441    0.09
1970         0.420 (ME)    0.494 (NM)    0.459    0.04
1980         0.399 (CT)    0.524 (SD)    0.471    0.04
1940–1980    0.305         0.676         0.429    0.11

Note: CA, California; CT, Connecticut; DE, Delaware; FL, Florida; GA, Georgia; ID, Idaho; MT, Montana; ND, North Dakota; NM, New Mexico; NV, Nevada; SD, South Dakota; TX, Texas; WA, Washington; WY, Wyoming.

Another possible criticism of the use of cross-state data is that, as most redistributive and tax policies are administered by the federal government, cross-state data may be unable to capture the presence of a fiscal policy channel. There are, however, significant cross-state differences in the level of tax progressivity, income taxation, property taxation, and social expenditure. Such differences have been used by Partridge (1997) to study the links between inequality and growth and by Rodriguez (1999) to study the links between inequality and redistribution.

3. Estimations

Even though the aim of this paper is to fully exploit the panel structure of the data illustrated in Section 2, the paper starts by presenting the results of simple cross-sectional and pooled OLS estimations. Next, the paper moves on to standard fixed effects and GMM estimations and checks the robustness of the results.

3.1. Basic Estimations

This section presents cross-sectional, pooled OLS, fixed effects, and GMM estimations of the relationship between inequality and growth. The section starts by estimating, for each period, simple cross-sectional regressions of the form:

$$\mathrm{GROWTH}_i = \alpha + \beta y_i + \gamma\,\mathrm{DISTR}_i + \theta X_i + \omega R_i + \varepsilon_i, \tag{1}$$

where $\mathrm{GROWTH}_i$ is state $i$'s annual growth rate of income per capita, $y_i$ is state $i$'s log of income per capita, $\mathrm{DISTR}_i$ is a variable capturing income distribution (measured using the income share of the third quintile or the Gini index), $X_i$ is a matrix of controls, and $R_i$ is a matrix of regional dummies controlling for the possibility of different growth patterns in different regions of the US (South, Mid-West, and West). All the explanatory variables are measured at the beginning of the growth period.

The matrix $X_i$ includes a set of controls that are likely to be correlated with both income distribution and growth. In particular, I follow Perotti (1996) and control for the stock of human capital (High and Coll measure the percentage of adults with high school and college degrees), the degree of urbanization (Urb measures the fraction of the population that lives in urban areas), and age structure (Old measures the percentage of the population above 65 years of age).

I start by estimating equation (1) for all ten and twenty-year periods going from 1940 to 1990. Table 3 shows that the coefficients of the cross-sectional estimates are often positive when inequality is measured with Q3 and negative when inequality is measured with the Gini index.
However, these coefficients are never statistically significant (the table summarizes the results of 18 regressions and reports only the coefficients and t statistics attached to the inequality variables). Pooled OLS estimations of equation (1) suggest that there is a negative and significant relationship between inequality and growth when growth is measured over a ten-year period, and no significant relationship between inequality and growth when growth is measured over a twenty-year period (Table 4; the constant and regional dummies are omitted from the table).

Table 3. Cross-sectional regressions: ten and twenty-year growth episodes.
Length of Growth Episode 10 years Starting Year Q3 1940 0.92 (0.07) 5.36 (0.58) 0.19 (0.06) 4.8 (0.62) 12.01 (0.91) 1950 1960 1970 1980 20 years Gini ( ( ( ( 1.83 (0.58) 1.96 0.77) 1.35 0.84) 7.16 1.52) 5.79 1.26) Q3 Gini 2.01 (0.38) 5.14 (0.89) 0.38 (0.16) 3.62 ( 1.23) 0.47 (0.36) 2.8 ( 1.81) 1.21 ( 1.12) 0.09 (0.04)
Note: t statistics in parentheses.

Table 4. Pooled OLS: Ten and twenty-year growth episodes.
10 years, Q3: Y −4.19*** (−12.96); Q3 11.86** (2.12); High 3.59** (2.18); Coll 15.85*** (5.73); Urb 1.11** (2.21); Old 3.14 (1.52); R2 0.59; N.obs. 239.
10 years, Gini: Y −4.04*** (−11.59); Gini −3.92* (−1.88); High 4.46*** (2.69); Coll 15.06*** (5.49); Urb 1.15** (2.32); Old 3.23 (1.56); R2 0.59; N.obs. 240.
20 years, Q3: Y −1.10*** (−5.16); Q3 3.77 (1.03); High −1.67 (−1.59); Coll 8.40*** (3.60); Urb 0.09 (0.41); Old 2.25 (1.16); R2 0.65; N.obs. 95.
20 years, Gini: Y −0.99*** (−4.73); Gini −0.60 (−0.40); High −1.43 (−1.39); Coll 7.95*** (3.44); Urb 0.15 (0.70); Old 2.33 (1.16); R2 0.64; N.obs. 96.

Next, the paper uses a fixed effects model that allows controlling for unobserved time-invariant state characteristics. The basic model to be estimated is the following:

$$\mathrm{GROWTH}_{t,t+n,i} = \beta y_{t,i} + \gamma\,\mathrm{DISTR}_{t,i} + \theta X_{t,i} + \alpha_i + \eta_t + \varepsilon_{t,i}, \tag{2}$$

where $\mathrm{GROWTH}_{t,t+n}$ is the annual growth rate of income per capita from period $t$ to period $t+n$, $\alpha_i$ is a state-specific intercept, and $\eta_t$ is a period-specific intercept (all other variables are defined as in equation (1)).⁶ It is worth noting that the coefficient $\gamma$ of equation (2) has a different interpretation from the coefficient $\gamma$ of equation (1). While the latter measures the relationship between inequality and growth across states, the former should be interpreted as a measure of the correlation between changes in inequality and changes in growth within a given state (Forbes, 2000).

When I restrict $\eta_t$ to be equal to zero, I find a positive and statistically significant correlation between changes in the income share of the third quintile and changes in growth, and no significant correlation between changes in the Gini index and changes in growth (columns 1–6 of Table 5). The last three columns of Table 5, however, show that the correlation between inequality and growth changes when the regression is augmented with decade dummies. In particular, both the coefficient and t statistic attached to Q3 decrease substantially and the coefficient attached to the Gini index becomes statistically significant. Contrary to what was found by Partridge (1997), I never find that both Q3 and the Gini index are positively and significantly correlated with growth.

Interestingly, both sets of regressions of Table 5 (with and without time dummies) suggest the presence of a negative relationship between inequality and growth. However, when one does not control for decade-fixed effects, the relevant variable seems to be the income share of the third quintile, and when one controls for decade-fixed effects, the relevant variable seems to be the Gini index.
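The within-state interpretation of the inequality coefficient in equation (2), and its sensitivity to the inclusion of period effects, can be illustrated with a small within (fixed effects) estimator. This is only a sketch under stated assumptions: it handles a single regressor, it assumes a balanced panel (so that demeaning by state and then by period is equivalent to including both sets of dummies), and the function names and toy numbers are hypothetical rather than taken from the paper.

```python
from collections import defaultdict


def demean(panel, key_index):
    """Subtract group means, grouping on the state (0) or period (1) part of the key."""
    groups = defaultdict(list)
    for key, value in panel.items():
        groups[key[key_index]].append(value)
    means = {g: sum(v) / len(v) for g, v in groups.items()}
    return {key: value - means[key[key_index]] for key, value in panel.items()}


def within_slope(y, x, period_effects=False):
    """Slope of y on one regressor x after removing state (and optionally period) means.

    y and x map (state, period) -> value. The sequential demeaning used when
    period_effects=True matches the two-way within transformation only for a
    balanced panel (assumption).
    """
    y_t, x_t = demean(y, 0), demean(x, 0)
    if period_effects:
        y_t, x_t = demean(y_t, 1), demean(x_t, 1)
    sxy = sum(x_t[k] * y_t[k] for k in x_t)
    sxx = sum(x_t[k] ** 2 for k in x_t)
    return sxy / sxx


# Toy balanced panel (hypothetical): the true slope is 2 by construction,
# with additive state effects and period effects that are correlated with x.
x = {("A", 0): 1.0, ("A", 1): 2.0, ("A", 2): 4.0,
     ("B", 0): 2.0, ("B", 1): 2.0, ("B", 2): 3.0}
state_fx = {"A": 10.0, "B": -5.0}
period_fx = {0: 0.0, 1: 1.0, 2: 3.0}
y = {k: 2.0 * v + state_fx[k[0]] + period_fx[k[1]] for k, v in x.items()}

print(within_slope(y, x))                       # state effects only
print(within_slope(y, x, period_effects=True))  # state and period effects
```

In this toy example only the specification with period effects recovers a slope close to 2, because the omitted period effects are correlated with the regressor. That is one mechanism through which adding decade dummies can change the estimated coefficient.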
On the one hand, the inclusion of time dummies exacerbates the multicollinearity problem of the fixed effects estimations; on the other hand, their exclusion is likely to be the cause of omitted variable bias. As the time dummies are highly significant and an F-test rejects the null that they are jointly equal to zero, I tend to prefer the estimations with time dummies. However, I report both sets of estimations because some readers may think that the cost of including the time dummies outweighs the benefits of their inclusion.

When I estimate equation (2) using twenty-year growth episodes, I find that the model with time dummies yields no significant correlation between changes in inequality and changes in growth (Table 6) and that the model without time dummies yields a significant correlation between Q3 and growth (the Gini index is positive and significant when both measures of income inequality are included in the same regression). Although the difference between the results of the regressions for ten and twenty-year growth episodes may be due to the fact that the short-run relationship between changes in inequality and changes in growth is different from the respective long-run relationship, it could also be driven by the limited degrees of freedom in the regressions of Table 6.⁷

Table 5. Basic fixed effects regressions: Ten-year growth episodes.
No Controls Y 2.02*** ( 10.64) Q3 Controls 1.93*** ( 5.76) 2.07*** ( 6.06) 5.12*** ( 12.63) 5.07*** ( 11.21) 11.32) 6.57*** ( 14.42) 16.45** 13.22** 13.09* 5.17 (2.06) (1.92) (2.02) (1.73) (1.24) ( 2.39 0.58 0.75) (0.17) 2.81 ( High 1.03) 6.73*** Coll Urb Old 0.43 N.obs. 5.11*** ( 15.83** Gini R2 Controls and Decade Dummies 239 0.42 240 0.43 239 6.42*** ( 14.90) 6.95 6.03*** 0.04) 7.69*** ( 6.76*** (3.41) (3.86) (3.28) 18.21*** 17.11*** 18.18*** (5.86) (5.41) (5.67) 14.82) ( 0.12 ( 6.53*** ( 3.72) 1.34) 7.75*** ( 3.73) 1.37 1.89 1.97 (1.06) (1.51) (1.56) 3.47 3.05 3.42 (1.32) (1.20) (1.34) 0.10 1.16* 1.00 (1.45) (1.69) (1.44) 4.05* 4.19* 4.19* 1.16 1.28 1.35 (1.70) (1.71) (1.69) (0.77) (0.88) (0.93) 0.65 239 0.65 240 1.02** ( 0.65 2.36) 0.94** ( 0.78 239 2.28) 0.95** ( 0.77 239 2.29) 0.79 240 239
Notes: t statistics in parentheses. * Denotes a parameter which is statistically significant at 10%, ** at 5%, and *** at 1%.

Table 6. Basic fixed effects regressions: Twenty-year growth episodes.
No Controls Y 0.43*** ( Q3 3.76) Controls 0.58** ( 2.01) 1.05*** ( 3.87) 1.13*** ( 3.83) 3.98) 1.53*** ( 4.56) 21.47*** 12.90*** 17.49*** (3.49) (4.90) (2.88) (3.68) 1.86 (0.70) 6.11** 1.35 (2.45) ( Coll 0.57 0.20 0.29) (0.10) 8.56** (2.15) Urb Old 0.34 95 0.37 96 0.09 95 ( 8.40** 3.35*** ( 6.67 2.44* 0.20 0.06 (0.71) (0.23) 0.33 1.80 0.27 (0.12) (0.59) 0.51 0.25 ( 0.10) 0.56 95 1.54) 0.52) 0.09 0.17) (0.045) 0.74 0.68 96 1.70) 5.94** (2.03) 0.10 ( 0.35 95 2.45* ( (1.78) 0.13 ( 1.97 (1.61) 5.07* (1.99) 0.66) 0.40 (0.25) 2.18 ( 5.81** ( 8.06) (1.01) (1.69) 0.11 3.09*** ( (1.24) 0.47) (2.24) 9.16) 4.04 0.88 8.51** (2.03) 96 7.97) (2.21) (0.42) 95 3.05*** ( 5.07** (0.57) High N.obs. 1.49*** ( 16.58*** Gini R2 Controls and Decade Dummies 0.14 ( 0.67) 0.54 ( 0.26) 0.75 95
Notes: t statistics in parentheses. * Denotes a parameter which is statistically significant at 10%, ** at 5%, and *** at 1%.

The fixed effects estimations of Tables 5 and 6 may be biased by the fact that equation (2) contains a lag of the endogenous variable (Caselli et al., 1996; Judson and Owen, 1999).
To address this issue, I re-run the regressions of Table 5 using the two robust GMM estimators developed by Arellano and Bond (1991).⁸ As GMM estimations in differences require one extra period, the estimations of Tables 7 and 8 are for 1950–1990 rather than 1940–1990 (the fixed effects estimations reported under the FE columns refer to this subperiod). It should also be noted that the coefficient attached to the lagged dependent variable, $\tilde{\beta}$, should be interpreted as $\beta = (\tilde{\beta} - 1)/10$.

Table 7. GMM estimations. Regressions with decade dummies.
FE GMM1 GMM2 FE GMM1 GMM2 FE GMM1 GMM2 (1) (2) (3) (4) (5) (6) (7) (8) (9) Y 0.43*** 0.26** (6.96) Q3 0.44*** (2.39) 0.43*** (5.03) 1.3 3.55 4.6* (0.35) (1.06) (1.73) 1.7 ( High Coll Urb 0.43*** (5.05) 0.28*** (6.99) 1.06) 3.5** ( 2.26) ( 0.30) ( 1.04) 42.8*** ( 2.83) 0.43*** (2.89) (5.27) 1.5 ( 0.37) ( 2.68) 2.7 1.2 (0.35) 3.8** 3.7*** ( 2.68) 2.5** 1.8* 1.8* 2.6** 2.1** 2.2** 2.6** 2.1** 2.0** (2.42) (1.77) (1.85) (2.53) (2.09) (2.19) (2.54) (1.98) (1.96) 2.9 2.9* 4.1** 2.9 2.3 3.3** 2.9 3.0 4.3** (1.41) (1.66) (2.35) (1.39) (1.30) (1.93) (1.40) (1.70) (2.50) 0.4 ( 0.43*** (2.73) 1.7 Gini Old 0.28*** (7.06) 1.01) 0.6** ( 2.28) 1.2 1.3** (0.96) (2.25) 1.0*** ( 3.84) 1.8*** (3.20) 0.4 ( 0.98) 0.6*** ( 2.28) 1.2 1.3** (1.03) (1.96) 1.0*** ( 3.53) 1.8*** (2.82) 0.4 ( 0.96) 0.7** ( 2.39) 1.3 1.4** (1.03) (2.32) 1.0*** ( 3.67) 1.9*** (3.21) Test for 1st order serial 1.84 correlation p 0.07 2.67 p 0.007 2.26 p 0.02 2.92 p 0.004 2.31 P 0.02 2.87 p 0.004 Test for 2nd order serial 1.09 0.20 0.92 0.11 0.98 0.32 correlation p 0.28 p 0.84 p 0.36 P 0.91 P 0.33 p 0.75
Notes: t statistics in parentheses. * Denotes a parameter which is statistically significant at 10%, ** at 5%, and *** at 1%. The standard errors were computed using Arellano and Bond (1991) robust estimators that allow for heteroskedastic residuals.

Table 8. GMM estimations. Regressions without decade dummies.
FE (1) Y Q3 Gini High Coll Urb Old Test for 1st order serial correlation Test for 2nd order serial correlation 0.79*** (16.67) 10.6* (1.85) 6.7*** (4.23) 2.3 (0.81) 1.9*** (3.16) 3.6* (1.81) GMM1 (2) GMM2 (3) FE (4) GMM1 (5) GMM2 (6) FE (7) GMM1 (8) GMM2 (9) 0.41 (1.42) 14.3* (1.86) 0.39 0.75*** 0.72 0.41 0.75*** 0.67 0.51 (1.37) (15.26) (1.41) (0.86) (15.87) (1.41) (1.18) 13.6** 20.2*** 9.8 11.7 (2.03) (3.19) (1.18) (1.44) 4.9* 9.8 3.04 9.4*** 7.20 4.9 (1.85) ( 1.12) ( 0.46) (3.20) ( 0.94) (0.81) 0.9 0.9* 6.4*** 2.0 3.4 5.2*** 1.5 2.2 (0.43) (0.41) (4.03) (0.78) (1.43) (3.28) (0.54) (0.91) 5.7 5.6 2.4 4.4 3.5 3.5 5.6 5.1 (1.56) (1.57) (0.84) (0.97) (0.79) (1.25) (1.26) (1.16) 0.02 0.1 1.9*** 0.2 0.1 1.5** 0.2 0.3 (0.02) ( 0.16) (3.19) ( 0.14) (0.06) (2.54) ( 0.86) ( 0.23) 2.8** 2.8** 3.2* 3.8* 3.6* 3.1* 3.7** 3.4** (2.38) (2.35) (1.64) (1.90) (1.79) (1.60) (2.11) (1.99) 2.24 2.30 p 0.03 p 0.02 2.05 1.49 p 0.04 p 0.13 2.10 1.80 p 0.04 P 0.07 0.29 0.20 p 0.77 p 0.84 0.63 0.12 p 0.53 p 0.91 0.60 0.16 p 0.55 P 0.87
Notes: t statistics in parentheses. * Denotes a parameter which is statistically significant at 10%, ** at 5%, and *** at 1%. The standard errors were computed using Arellano and Bond (1991) robust estimators that allow for heteroskedastic residuals.

GMM estimations with time dummies suggest a positive but not statistically significant relationship between changes in Q3 and changes in growth (the coefficient is marginally significant when GMM2 is used) and a negative and significant correlation between changes in growth and changes in the Gini index (this is true for both GMM1 and GMM2). The last two columns of Table 7 also indicate that, contrary to the fixed-effects estimations, the negative correlation between the Gini index and growth is robust to the inclusion of Q3 in the regression.
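The reinterpretation of the coefficient on the lagged dependent variable follows from rewriting equation (2) in levels. The sketch below assumes that growth is measured as the average annual change in log income over an $n$-year episode, which is an assumption about how the variable is constructed rather than something stated in the text, and it uses $w_{t,i}$ as shorthand (introduced here) for all remaining terms of equation (2).

```latex
\begin{align*}
\frac{y_{t+n,i} - y_{t,i}}{n} &= \beta\, y_{t,i} + w_{t,i} \\
y_{t+n,i} &= \underbrace{(1 + n\beta)}_{\tilde{\beta}}\, y_{t,i} + n\, w_{t,i} \\
\beta &= \frac{\tilde{\beta} - 1}{n}, \qquad n = 10 .
\end{align*}
```

Under this assumption, a level-form coefficient of, say, 0.43 would correspond to $\beta = (0.43 - 1)/10 = -0.057$.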
As in the fixed effects estimations of Table 5, GMM estimations without time dummies yield a positive and significant correlation between changes in Q3 and changes in growth and a negative but not significant relationship between changes in growth and changes in the Gini index. The last rows of Tables 7 and 8 show that the tests reject the null of no first-order serial correlation in the differenced residuals but do not reject the null of no second-order serial correlation (only the latter is a necessary condition for consistent estimates).

One serious caveat with the estimates of Tables 7 and 8 is that the Sargan test applied to the homoskedastic estimators (the test is not defined for the robust estimators reported in Tables 7 and 8) always rejects the null that the over-identifying restrictions are valid. While this could be due to the presence of heteroskedasticity (with heteroskedasticity, the Sargan test tends to over-reject the null), it could also signal that there are problems with the instruments used in the estimation.

The fact that the estimated relationship between inequality and growth changes when I use different inequality measures (the Gini index versus Q3) is rather puzzling. One possible interpretation is that these two indices allow discriminating between the theoretical models that emphasize the role of the median voter (and hence Q3) and the models that focus on a more comprehensive inequality measure (and hence the Gini index). However, I do not think that the results of this paper are strong enough to justify the claim that the paper finds empirical support for some specific channel linking inequality to growth.

The economic impact of inequality is lower than the one found in cross-country studies.
While Forbes (2000) found that a one-standard deviation increase in the Gini index is correlated with a 1.3 percent increase in annual average growth over the next five years, the estimations of Table 7 (columns 5 and 6) indicate that a one-standard deviation decrease in the Gini index is associated with a 0.2 percentage point increase in average annual growth over the next ten years. To obtain an idea of the economic impact of inequality on growth, I will use the example of the state of Mississippi (the poorest state in the sample). During the 1960–1970 period, Mississippi had an average annual growth rate of 4.8 percent and a Gini index of 0.45. In the 1970–1980 period, the growth rate had decreased to 3.1 percent, and the Gini index had increased to 0.48. The estimations of Table 7 (column 5) suggest that this increase in inequality is associated with a change in growth of −0.1 percentage points (−3.52 × 0.03), less than 6 percent of the total change in growth between the two periods.

The finding that the economic impact of inequality on growth is smaller than in the case of cross-country studies is not surprising. In particular, higher factor mobility (both labor and capital) and the role of the federal government are important reasons for expecting income distribution to have a much smaller impact on growth in the cross-state sample as opposed to the cross-country sample.

3.2. Sensitivity Analysis

This section checks whether the results of the previous section are robust to outliers, tests for the presence of serial correlation and structural breaks in the panel, and explores why the results of this paper differ from the results in Partridge (1997).

To check for the role of outliers, I start by running the regressions of Tables 7 and 8 by dropping one state at a time and observe that the results are basically unchanged. Next, I drop one period at a time and observe that this substantially changes the results.
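The drop-one-at-a-time check described above can be sketched generically. The example below is only illustrative: it uses a bivariate OLS slope as a stand-in for the GMM regressions actually used in the paper, and the function names and toy data are hypothetical. The grouping variable can be a state or a period.

```python
def ols_slope(pairs):
    """OLS slope of y on x (with an intercept) for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    return sxy / sxx


def leave_one_group_out(rows, estimator=ols_slope):
    """Re-estimate after dropping each group in turn.

    rows is a list of (group, x, y) tuples; the result maps each dropped
    group to the estimate obtained from the remaining observations.
    """
    groups = sorted({group for group, _, _ in rows})
    return {
        dropped: estimator([(x, y) for group, x, y in rows if group != dropped])
        for dropped in groups
    }


# Toy data (hypothetical): group "C" contains a single influential observation.
rows = [("A", 0.0, 0.0), ("A", 1.0, 1.0),
        ("B", 2.0, 2.0), ("B", 3.0, 3.0),
        ("C", 4.0, 10.0)]
for dropped, slope in leave_one_group_out(rows).items():
    print(dropped, round(slope, 3))
```

Estimates that move a lot when a particular group is removed point to that group as a driver of the full-sample result, which is the pattern reported here for periods but not for states.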
In the GMM regressions, dropping 1960 or 1980 weakens the correlation between changes in inequality and changes in growth (none of the coefficients is statistically significant) and dropping 1970 strengthens the correlation between changes in inequality and changes in growth.⁹

A possible concern with the estimations of the previous section is the presence of serial correlation. In the GMM estimations of Tables 7 and 8, the tests developed by Arellano and Bond (1991) reject the null of no first order serial correlation in the differenced residuals but do not reject the null of no second order serial correlation. A rough way to check whether serial correlation is a serious problem in the fixed effects regressions is to lag the independent variables and observe if the results change significantly.¹⁰ When I re-estimate the full model (including time dummies) and lag inequality by ten years, the coefficients attached to Q3 and the Gini index drop substantially (however, the Gini index remains statistically significant), suggesting that serial correlation could be an issue.

As panel estimations require a stable relationship between the dependent and explanatory variables, they are not appropriate if there are structural breaks in the sample. Table 3 shows that, although never statistically significant, the coefficients attached to the inequality indices change noticeably over time. It is, therefore, important to formally test the hypothesis of no structural breaks. To this purpose, I divide the sample in two different ways (1940–1960 versus 1960–1990, and 1940–1970 versus 1970–1990) and perform Chow tests to verify equality of slopes in the two sub-periods. The results of the Chow tests are mixed. When I only test for differences in the inequality coefficients, I strongly reject the null of a constant slope.
However, when I test the equality of all parameters, I cannot reject the null of no structural breaks.¹¹

One final issue is the possibility of non-linearities in the relationship between inequality and growth (Banerjee and Duflo, 1999). While non-parametric estimations are beyond the scope of this paper, it is interesting to test for non-linearities by augmenting the regressions of Table 7 with quadratic and cubic terms of the inequality index. While I do not find any evidence for a significant quadratic or cubic relationship between changes in Q3 and changes in growth, I do find evidence for a significant quadratic relationship between changes in the Gini index and changes in growth. However, the coefficients (−3.49 for the linear term and 319 for the quadratic term) indicate that the correlation between changes in growth and changes in the Gini index becomes positive when the change in the Gini index is greater than 0.11. As this is an extremely high value (close to 3 standard deviations of the within-state change of the Gini index over the 1940–1980 period), it is fair to conclude that, for any reasonable change of the Gini index, the relationship between changes in inequality and changes in growth is non-positive.

Using pooled OLS for the 1960–1990 period, Partridge (1997) finds a positive relationship between Q3 and growth and between the Gini index and growth. There are four major differences between this paper and Partridge's paper: (i) the estimation technique (pooled OLS versus fixed effects), partly compensated by the fact that Partridge uses a larger set of controls; (ii) the period under analysis (1960–1980 versus 1940–1980); (iii) the source of the data (Census data versus IRS data); and (iv) the use of the level rather than the log of initial income per capita as control. Table 9 compares the results obtained with different estimation techniques using both this paper's and Partridge's data.
The ®rst three columns of part A of Table 9 reproduce Partridge's (1997) result of a strong positive relationship both between the Gini index and growth and Q3 and growth. The last three columns of part A of Table 9 show that, while ®xed-effects estimations strengthen the correlation between growth and the Gini index, the positive correlation between Q3 and growth is not robust to the use of ®xed-effects estimation. Furthermore, the last three columns of parts B and C of Table 9 show that, when one uses ®xed effects estimations and controls for the log of initial income, the two data sets yield the consistent result of a negative and signi®cant correlation between changes in the Gini index and changes in growth.12 The ®nding that replacing the level of income per capita with the log of income per capita completely reverses the results for the Gini index is puzzling because the correlation between the level and log income variables is very high (the correlation coef®cient is 0.987). Interestingly, this does not seem to be a problem for the data used in this paper 36 UGO PANIZZA Table 9. OLS and ®xed-effects regressions. Ten-year growth episodes 1960±1990. OLS Fixed Effects A. Using Partridge's data and yt 1 Q3 13.53 (1.09) Gini 6.47** (2.54) B. Using Partridge's data and log yt Q3 11.72 (0.98) Gini 4.35 (1.64) C. Using this paper's data and log yt Q3 0.53 (0.14) Gini 3.61** ( 2.01) D. Using this paper's data and yt 1 Q3 1.68 (0.45) Gini 4.36** ( 2.44) 44.87*** (3.16) 11.74*** (3.95) 1 38.37** (2.58) 9.63*** (2.91) 1 ( ( ( ( 5.45 1.19) 4.96** 2.29) 4.84 1.09) 5.53** 2.57) 47.36 (1.58) ( 19.76*** (4.79) 4.95 0.20) ( ( 39.19 (1.49) 19.36*** (4.71) 2.32 0.43) ( 2.40 (0.58) ( 10.13** 2.13) 3.51 1.31) 2.97 1.50) 16.21 0.65) 10.79** ( 2.21) ( ( ( ( ( 8.23 1.30) 5.42* 1.76) 1.48 0.30) 3.28 1.37) Notes: All regressions include time dummies. t statistics in parentheses. * Denotes a parameter which is statistically signi®cant at 10%, ** at 5%, and *** at 1%. 
(parts C and D of Table 9 show that the coefficients are rather stable across specifications). An exploration of the data shows that, while the overall data variability is not a key issue, the Partridge data set has a much higher within-state correlation. In the case of Partridge's data set, the state and decade dummies explain 86 percent of the variance of the Gini index, but in the data set used in this paper, the state and decade dummies explain less than 55 percent of the variance of the Gini index (this figure is different from the one reported in Section 2, because here I only consider the 1960–1980 period). It is probably this high within-state correlation that leads to the unstable results of Table 9.13

Although Table 9 indicates that fixed-effects estimations that control for the log of initial income yield the consistent result of a negative correlation between inequality and growth, it is fair to conclude that small differences either in the data used to measure inequality or in the methodology (and specification) used to estimate the relationship between inequality and growth could yield very different results. While I deem the fixed-effects estimation to be superior to the pooled OLS, it is not clear whether tax data are superior to survey data. The main problem with tax-based inequality measures is the incomplete coverage of households with income below the tax threshold (this may cause the measurement error to be non-random and correlated with income levels). The main problems with survey data include a less accurate measurement of income and the sampling error.14

4. Conclusions

This paper reassesses the relationship between inequality and growth using a US cross-state data set similar to the one used by Partridge (1997) and panel data techniques similar to the ones used by Forbes (2000).
Contrary to the findings of Forbes, this paper does not find any evidence of a positive relationship between changes in inequality and changes in growth, and contrary to the findings of Partridge (1997), this paper does not find that both the Gini index and the income share of the third quintile are positively correlated with growth. In fact, while the paper finds some evidence in support of a negative relationship between inequality and growth, the paper suggests that the cross-state relationship between inequality and growth is not robust to small changes in the data or econometric specification. In particular, the paper shows that the negative correlation between the Gini index and growth is not robust across all sub-periods and is highly dependent on the specification used (with or without time dummies). Furthermore, the Sargan test suggests that there may be problems with the identifying restriction imposed in the GMM estimations. Even with the above caveats, the paper never finds a significant positive relationship between inequality and growth.

Given the differences in data quality and coverage, it is not difficult to justify the differences between the results of this paper and Forbes' paper. However, the findings of this paper are harder to reconcile with Partridge's (1997) work. The latter, using a similar sample of cross-state data, finds a positive and statistically significant relationship between the income share of the third quintile and growth and a positive and statistically significant relationship between the Gini index and growth. Section 3.2 shows that the differences between the results of this paper and Partridge's are partly due to differences in the estimation technique, but it also shows that small differences in the source of the data used to measure inequality can make a big difference in the observed relationship between inequality and growth.

Appendix A.
Description of the Split Histogram Method

The split histogram method suggested by Cowell (1995) was used to divide the population into quintiles. Define $F(y)$ as the proportion of population with income less than or equal to $y$. Let $\Phi(y)$ be the proportion of total income received by those who have an income less than or equal to $y$. Let $a_i$ be the lower limit of income class $i$, $a_{i+1}$ its upper limit, and $\mu_i$ the average income. Interpolation on the Lorenz curve may be performed as follows: between the observations $i$ and $i+1$ the interpolated values of $F(y)$ and $\Phi(y)$ are

$$F(y) = F_i + \int_{a_i}^{y} f_i(x)\,dx, \qquad \Phi(y) = \Phi_i + \int_{a_i}^{y} x f_i(x)\,dx,$$

and the split histogram density function is

$$f(x) = \begin{cases} \dfrac{f_i\,(a_{i+1} - \mu_i)}{(a_{i+1} - a_i)(\mu_i - a_i)}, & \text{for } a_i \le x < \mu_i, \\[2ex] \dfrac{f_i\,(\mu_i - a_i)}{(a_{i+1} - a_i)(a_{i+1} - \mu_i)}, & \text{for } \mu_i \le x < a_{i+1}. \end{cases}$$

B. Data Sources

1. Data on income and growth. The data on per capita personal income are from the Bureau of Economic Analysis and from the Survey of Current Business.
2. Data on inequality. The Gini index and Q3 are computed using data on tax returns published by the Internal Revenue Service. The data are from the annual report Statistics of Income, Individual Income Tax Return (the data are not available for the period 1982–1986).
3. Data on school attainment, age structure of the population and urban concentration. These data are from the Census and are available online from the library of the University of Virginia at: http://fisher.lib.virginia.edu/census/.

Acknowledgments

I would like to thank Laurence Ball, Michelle Barnes, Nada Choueiri, Momi Dahan, Luisa Ferreira, Oded Galor, Alejandro Gaviria, Mandana Hajj, Louis Maccini, Carmen Pages-Serra, Miguel Székely, three anonymous referees, participants to the Hopkins macro lunch, and participants to the 7th Summer School in Economic Theory at the Hebrew University of Jerusalem for helpful comments.
I would also like to thank Ann Owen for providing a user-friendly version of the DPD Gauss program and Mark Partridge for useful comments and sharing his data with me. The usual caveats apply. The opinions expressed in this paper are my own and do not necessarily reflect the views of the Inter-American Development Bank.

Notes

1. Galor and Moav (1999) argue that inequality is beneficial for growth in early stages of development when physical capital is the prime engine of growth and harmful in more advanced stages when human capital is the prime engine of growth. Saint-Paul and Verdier (1993) and Galor and Tsiddon (1997), instead, predict a positive relationship between inequality and growth.
2. While Deininger and Squire (1996) dismiss tax records as non-representative, Atkinson and Brandolini (1999) suggest that income tax records should constitute an important source of primary data for the calculation of inequality indices.
3. A previous version of this paper also used data for 1920 and 1930. The observations for these two decades were dropped because of the very small percentage of people who filled out a tax report. Data for 1969 were used instead of 1970 data because of a change in the reporting procedure that greatly reduced the number of reporting individuals in 1970.
4. The censoring at the lower end of the distribution may explain the low correlation between the inequality indices computed with tax data and the inequality indices computed with Census data. While, at 0.44, the correlation between Gini indices computed with tax data and Gini indices computed with Census data may seem extremely small, this figure is not very different from the correlation (0.48) between the Gini indices in the Deininger and Squire data set and the Gini indices computed for a set of OECD countries by Gottschalk and Smeeding (1997).
This last result is even more puzzling, because for most countries, Deininger and Squire, and Gottschalk and Smeeding computed the Gini indices using the same primary source (the LIS data set).
5. Formally, assume that we are interested in estimating a model of the kind: $GR_{t,t+n;i} = \gamma\,DISTR_{t,i} + \alpha_i + \eta_t + \varepsilon_{t,i}$, where $DISTR_{t,i}$ is an inequality index and $\alpha_i$ and $\eta_t$ are vectors of state and time dummies. Define $R_D^2$ as the $R^2$ in the regression of $DISTR$ on $\alpha_i$ and $\eta_t$. Then, it is easy to show that: $$\mathrm{var}(\gamma) = \frac{\sigma^2}{(1 - R_D^2)\sum (DISTR_{t,i} - \overline{DISTR})^2}.$$ Therefore, $\lim_{R_D^2 \to 1} \mathrm{var}(\gamma) = \infty$.
6. To compare my results with Partridge's (1997), I also introduce both the Gini index and Q3 in the same regression.
7. Unfortunately, my data set does not include inequality data at a higher frequency and therefore precludes the possibility of studying the correlation between inequality and growth episodes over 5 and 15 years.
8. While the two estimators are asymptotically equivalent, the one-step estimator (GMM1) requires some assumption on the weighting matrix. The two-step estimator (GMM2) instead builds the weighting matrix using the residuals of the one-step estimator. Although the two-step estimator seems less ad hoc than the one-step estimator, the former tends to produce low-power t statistics (Arellano and Bond, 1991).
9. When 1960 is dropped from the panel the coefficient attached to the Gini index increases to 2.4 (with a t statistic of 1.6), when 1970 is dropped from the panel the coefficient goes to -8.1 (t statistic of -3.4), and when 1980 is dropped from the panel the coefficient goes to -7.4 (t statistic of -1.1). Similar results are found for the income share of the third quintile.
10. I am grateful to a referee for suggesting this method. To estimate the model with lagged inequality over the 1940–1990 period, I had to use 1930 inequality data. Similar changes in the inequality coefficients are found by estimating the model over the 1950–1990 period.
11.
In the tests of inequality of coefficients, the statistics are F(2,182) = 8.07 for the 1940–1960 versus 1960–1990 sub-periods and F(2,182) = 6.40 for the 1940–1970 versus 1970–1990 sub-periods (all above the 5 and 1 percent critical values of 3.00 and 4.72). In the test of inequality of all parameters, the statistics are F(56,72) = 1.46 for the 1940–1960 versus 1960–1990 sub-periods and F(56,72) = 1.04 for the 1940–1970 versus 1970–1990 sub-periods (below the 5 percent critical value of 1.5). All the tests are performed using a fixed-effects model that includes time dummies. GMM estimations could not be performed because the sub-periods are too short.
12. In the growth literature, it is standard to control for the log of initial income. However, Partridge (1997) uses the level of income per capita to make his results comparable with the ones of Persson and Tabellini (1994).
13. Pooled OLS estimations are unlikely to be a quick fix for this problem because formal tests reject the restriction that $\alpha_i = 0$ and, by suggesting that $E(\alpha_i, X_i) \neq 0$, indicate that the pooled OLS model will yield biased estimations. However, in Partridge's (1997) specification, the omitted variable bias is attenuated by the inclusion of a large set of control variables. It should also be pointed out that the presence of low within-state variability exacerbates the measurement error of fixed effects estimations.
14. Sampling error should not be a serious issue for the Census data used by Partridge (1997).

References

Aghion, P., E. Caroli, and C. Garcia-Peñalosa. (1999). "Inequality and Economic Growth: The Perspective of the New Growth Theories," Journal of Economic Literature 37, 1615–1660.
Alesina, A., and D. Rodrik. (1994). "Redistributive Politics and Economic Growth," Quarterly Journal of Economics 436, 465–490.
Arellano, M., and S. Bond. (1991).
"Some Tests of Specification for Panel Data: Monte Carlo Evidence and Application to Employment Equations," Review of Economic Studies 58, 277–297.
Atkinson, A., and A. Brandolini. (1999). "Promise and Pitfalls in the Use of Secondary Data-Sets: A Case Study of OECD Income Inequality." Mimeo, Oxford University.
Banerjee, A., and E. Duflo. (1999). "Inequality and Growth: What Can the Data Say?" Mimeo, MIT.
Barro, R. (2000). "Inequality and Growth in a Panel of Countries," Journal of Economic Growth 5, 5–32.
Barro, R., and X. Sala-i-Martin. (1991). "Convergence Across States and Regions," Brookings Papers on Economic Activity 1, 107–182.
Caselli, F., G. Esquivel, and F. Lefort. (1996). "Reopening the Convergence Debate: A New Look at Cross-country Empirics," Journal of Economic Growth 1, 363–389.
Cowell, F. (1995). Measuring Inequality, 2nd edn, London: Prentice Hall.
Deininger, K., and L. Squire. (1996). "Measuring Income Inequality: A New Database," World Bank Economic Review 10, 565–591.
Easterly, W. (2001). "The Middle Class Consensus and Economic Development," Journal of Economic Growth 6, 317–335.
Flora, P. (1987). State, Economy and Society in Western Europe, 1815–1975, Vol. II. Frankfurt: Campus Verlag.
Forbes, K. (2000). "A Reassessment of the Relationship Between Inequality and Growth," The American Economic Review 90, 869–887.
Galor, O., and O. Moav. (1999). "From Physical to Human Capital Accumulation: Inequality in the Process of Development." CEPR discussion paper N. 2307.
Galor, O., and D. Tsiddon. (1997). "Technological Progress, Mobility and Economic Growth," The American Economic Review 87, 363–382.
Galor, O., and J. Zeira. (1993). "Income Distribution and Macroeconomics," The Review of Economic Studies 60, 33–52.
Gastwirth, J. (1972). "The Estimation of the Lorenz Curve and Gini Index," Review of Economics and Statistics 54, 306–316.
Gottschalk, P., and T. Smeeding (1997).
"Cross-National Comparisons of Earning and Income Inequality," Journal of Economic Literature 35, 633–687.
Judson, R., and A. Owen. (1999). "Estimating Dynamic Panel Data Models: A Guide for Macroeconomists," Economics Letters 65, 9–15.
Kaldor, N. (1957). "A Model of Economic Growth," Economic Journal 67, 591–624.
Li, H., and H. Zou. (1998). "Income Inequality is Not Harmful for Growth: Theory and Evidence," Review of Development Economics 2, 318–334.
Partridge, M. (1997). "Is Inequality Harmful for Growth? Comment," The American Economic Review 87, 1019–1032.
Perotti, R. (1996). "Growth, Income Distribution, and Democracy: What the Data Say," Journal of Economic Growth 1, 149–187.
Persson, T., and G. Tabellini. (1994). "Is Inequality Harmful for Growth?" The American Economic Review 84, 600–621.
Rodriguez, F. (1999). "Does Inequality Lead to Redistribution? Evidence from the United States," Economics and Politics 11, 171–199.
Saint-Paul, G., and T. Verdier. (1993). "Education, Democracy and Growth," Journal of Development Economics 42, 399–407.
Székely, M., and M. Hilgert. (1999). "What's Behind the Inequality We Measure?" Inter-American Development Bank, Research Department Working Paper 409.
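As a supplement to Appendix A, the split histogram density can be checked numerically for a single income class. This is a minimal sketch rather than the procedure actually used for the paper: it assumes that the class frequency entering the density is the population share of the class, and the function name `split_histogram_heights` and the example class (bounds 0 and 10, mean 4, share 0.2) are hypothetical. The check confirms the two properties that motivate the method: the two-step density reproduces both the population share and the mean income of the class.

```python
def split_histogram_heights(a_lo, a_hi, mu, share):
    """Return the two density heights of the split histogram for one income class.

    a_lo, a_hi : lower and upper income limits of the class
    mu         : average income within the class
    share      : population share of the class (assumed meaning of the class frequency)
    """
    if not (a_lo < mu < a_hi):
        raise ValueError("class mean must lie strictly inside the class")
    width = a_hi - a_lo
    low = share * (a_hi - mu) / (width * (mu - a_lo))   # height on [a_lo, mu)
    high = share * (mu - a_lo) / (width * (a_hi - mu))  # height on [mu, a_hi)
    return low, high

# Hypothetical income class, for illustration only.
a_lo, a_hi, mu, share = 0.0, 10.0, 4.0, 0.2
low, high = split_histogram_heights(a_lo, a_hi, mu, share)

# Population mass implied by the two rectangles.
mass = low * (mu - a_lo) + high * (a_hi - mu)

# Mean income implied by the density: each rectangle contributes
# height * width * midpoint, normalised by the total mass.
mean = (low * (mu - a_lo) * (a_lo + mu) / 2
        + high * (a_hi - mu) * (mu + a_hi) / 2) / mass

print(f"heights: {low:.4f}, {high:.4f}")
print(f"mass: {mass:.4f}, mean: {mean:.4f}")
```

Because the class mean lies below the midpoint in this example, the density is higher on the lower part of the class, which is how the method uses the reported mean to refine a flat histogram.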