How sensitive are estimates of the marginal propensity to consume to measurement error in survey data in South Africa?

Reza C. Daniels, UCT
Vimal Ranchhod, UCT

Outline
• Context
• Question
• Econometric problem
• Proposed solution
• Data
• Results
• Caveats
• Conclusion

Context
• The best micro-level data in SA on incomes and expenditures come from the StatsSA Income and Expenditure Surveys (IES 1995, 2000 and 2005).
• From these, we can estimate the marginal propensity to consume (MPC), i.e. the proportion of every rand of disposable income that households spend. This can be broken up into various categories of expenditure.
• The marginal propensity to save (MPS) is defined as 1 – MPC, i.e. the proportion of every rand of disposable income that households save.

Context (2)
• A common problem in survey data is measurement error in responses, i.e. the data captured might not truly reflect the financial reality being measured.
• In the 'classical measurement error' case, this leads to attenuation bias: estimated relationships are weaker than the true relationships.
• For other types of measurement error, it may not even be possible to sign the bias.

Question
• How sensitive are estimates of the MPC to measurement error (m.e.) in the IES data?
– We propose to estimate this sensitivity using an instrumental variables approach, with wage data from the same households, but from a different survey, namely the LFS 2000:2, as our candidate instrument.

Econometric Problem
• Suppose the "true" relationship is:
– Yi = B0 + B1 Xi* + ui, where:
• Y is the outcome variable of interest,
• X* is the 'true' value of the explanatory variable,
• and u is a mean-zero error term.
• The subscript i refers to person i, where i = 1, …, n.
– It can be shown that the OLS estimator of B1 obtained from a regression of Y on X* is consistent if and only if:
• Cov(X*, u) = 0

OLS estimator with measurement error
• Suppose that we observe X instead of X*,
– where X = X* + e, E[e] = 0, cov(X*, e) = 0 and cov(u, e) = 0.
• By regressing Y on X, we can show that the probability limit of our estimate of B1 from an OLS regression is
B1 + cov(X, u – B1 e)/Var(X) = B1 – B1 (Var(e)/[Var(X*) + Var(e)])
or, writing \sigma_e^2 = Var(e) and \sigma_{x^*}^2 = Var(X*),
$$B_1\left(1 - \frac{\sigma_e^2}{\sigma_{x^*}^2 + \sigma_e^2}\right) = B_1\,\frac{\sigma_{x^*}^2}{\sigma_{x^*}^2 + \sigma_e^2}$$
This is known as attenuation bias.
– In essence, regardless of the type of m.e. we are considering, the crucial question to ask is whether or not the covariance between the observed X and the composite error term is zero.

M.E.: An IV solution
• Suppose we had another variable, Z, which is:
– Correlated with our observed X, and
– Uncorrelated with the composite error term, v = (u – B1 e).
Then Z would provide a valid instrument for the endogenous regressor X. In particular, another noisy measure of X*, e.g. Z = X* + k, would suffice if k is uncorrelated with both u and e.

Asymptotically,
$$\operatorname{plim}\hat{B}_{1,IV} = B_1 + \frac{\operatorname{corr}(Z,v)}{\operatorname{corr}(Z,X)}\left(\frac{\operatorname{var}(v)}{\operatorname{var}(X)}\right)^{0.5}$$
(Recall that v = (u – B1 e).)
Now, var(v) ≠ 0, var(X) ≠ 0 and corr(Z, X) ≠ 0, therefore the IV works iff corr(Z, v) = 0, i.e. corr(X* + k, u – B1 e) = 0. So the crucial requirement is that k and e are uncorrelated.

Implementing the solution
• We match data on individuals from the IES 2000 and LFS 2000:2.
• The data were obtained in October and September respectively.
• The IES contains more detailed information on multiple sources of income and categories of expenditure; expenditure is recorded at the HH level.
• The LFS contains information on employment status and wage income.

Summarizing the Data
Table 1: Summary statistics of sample

# of individuals    103214
# of HHs             25964
Mean HH size          3.96

            % of sample   Mean HH Yd p.a.   Mean HH Exp p.a.   Ratio of exp/Yd
African         79.4            21006              20933             0.997
Coloured        10.4            35837              35533             0.992
Indian           2.0            75232              69010             0.917
White            8.1           124999             129938             1.040
Total                           32058              32244             1.006
Male            47.5            40198              40277             1.002
Female          52.5            19590              19937             1.018

Notes:
1. Means do not include sampling weights.
2. Means are obtained by race or gender of HH head.

Mean Expenditure by Category

Category                   Mean Expenditure   As % of Yd
alcohol & cigs                   634.0            2.0
beverages                        533.3            1.7
clothes                         1384.6            4.3
durable goods                   1097.9            3.4
education                        980.8            3.1
food                            6706.1           20.9
health                          2241.7            7.0
housing                         4792.6           14.9
insurance                       3578.6           11.2
other non durable                955.5            3.0
own production                   679.3            2.1
recreation                        47.9            0.1
hh services                     1002.1            3.1
transport                       2944.7            9.2
utilities                       1025.8            3.2
TOTAL EXPENDITURE              32300.1          100.8
Total Disposable Income        32058

Notes:
1. Proportions in sub-categories do not necessarily add to 100, due to omitted categories such as debt servicing.

OLS and IV coefficients

COEFFICIENT           OLS yd        s.e.        IV yd         s.e.
alcohol & cigs        0.00362***    0.00014     0.00235***    0.00072
beverages             0.00321***    0.000065    0.00499***    0.00035
clothes               0.00978***    0.00017     0.00998***    0.00091
durable goods         0.0142***     0.00028     0.0184***     0.0014
education             0.0153***     0.00029     0.0208***     0.0014
food                  0.0355***     0.00045     0.0461***     0.003
health                0.0274***     0.00035     0.0378***     0.0018
housing               0.0747***     0.0016      0.0688***     0.0077
insurance             0.217***      0.0019      0.242***      0.0094
other non durable     0.0828***     0.0011      0.00574       0.0057
own production        0.315***      0.0033      0.325***      0.016
recreation            0.00214***    0.00011     0.00244***    0.00054
hh services           0.0317***     0.00037     0.0513***     0.0019
transport             0.0639***     0.00082     0.105***      0.0042
utilities             0.0119***     0.00018     0.0124***     0.00092
TOTAL                 0.981***      0.0054      1.049***      0.026

Caveats
• Results are sensitive to the method used to impute wage data from categories.
• There are conceptual difficulties in how to treat debt financing and dissaving or borrowing.
• The IVs are fairly weak: many HHs get income from grants.
• If m.e. is correlated with income levels, e.g. rich HHs always understate, then it is not clear that the solution is valid.
• Non-response is also not accounted for.

Conclusion
• Promising avenue of investigation.
• The IVs do have significant first stages.
• Lots more to be done …
– Do by income quintile,
– And by age of HH head, or some form of HH composition.
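
The attenuation result and the IV argument from the econometric slides can be checked with a small simulation. The sketch below is illustrative only and does not use the IES or LFS data: the function name simulate_mpc_estimates, the parameter values (B1 = 0.9, the variances, the sample size) and the normal distributions are all assumptions made for the example. It generates a true regressor X*, a noisy measure X = X* + e, and a second noisy measure Z = X* + k with k independent of e and u, then compares the OLS slope of Y on X with the simple IV slope cov(Z, Y)/cov(Z, X).

```python
import numpy as np


def simulate_mpc_estimates(n=200_000, b0=1.0, b1=0.9,
                           sd_xstar=2.0, sd_e=1.0, sd_k=1.0, seed=0):
    """Simulate classical measurement error (all values are hypothetical).

    Returns (ols_slope, iv_slope, predicted_ols_plim).
    """
    rng = np.random.default_rng(seed)

    x_star = rng.normal(10.0, sd_xstar, n)  # true regressor X*
    u = rng.normal(0.0, 1.0, n)             # structural error
    e = rng.normal(0.0, sd_e, n)            # m.e. in the observed regressor
    k = rng.normal(0.0, sd_k, n)            # m.e. in the second measure

    y = b0 + b1 * x_star + u                # true relationship
    x = x_star + e                          # observed regressor X
    z = x_star + k                          # instrument Z, independent of e and u

    # OLS slope from regressing Y on X: cov(X, Y) / var(X)
    ols_slope = np.cov(x, y)[0, 1] / np.var(x, ddof=1)

    # Simple IV (Wald) slope: cov(Z, Y) / cov(Z, X)
    iv_slope = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]

    # Attenuated probability limit: B1 * Var(X*) / (Var(X*) + Var(e))
    predicted_ols_plim = b1 * sd_xstar**2 / (sd_xstar**2 + sd_e**2)

    return ols_slope, iv_slope, predicted_ols_plim


if __name__ == "__main__":
    ols, iv, predicted = simulate_mpc_estimates()
    print(f"OLS slope:          {ols:.3f}")
    print(f"Predicted OLS plim: {predicted:.3f}")
    print(f"IV slope:           {iv:.3f}")
```

Under these assumptions the OLS slope should sit close to the attenuated value B1 Var(X*)/[Var(X*) + Var(e)], while the IV slope should sit close to B1. If k were made correlated with e, the IV estimate would no longer recover B1, which is the requirement stated on the IV slide.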