How sensitive are estimates of the marginal propensity to consume to measurement error in survey data in South Africa?
Reza C. Daniels
UCT
[email protected]
Vimal Ranchhod
UCT
[email protected]
Outline
• Context
• Question
• Econometric problem
• Proposed solution
• Data
• Results
• Caveats
• Conclusion
Context
• The best micro-level data on incomes and expenditures in SA
come from the StatsSA Income and Expenditure Surveys
(IES 1995, 2000 and 2005).
• From these, we can estimate the marginal
propensity to consume (MPC), i.e. the proportion of
each additional rand of disposable income that
households spend. This can be broken up into
various categories of expenditure.
• The marginal propensity to save (MPS) is defined
as 1 – MPC, with a corresponding interpretation.
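As an illustration of the definitions above, the MPC can be read as the slope from a regression of expenditure on disposable income, with the MPS following as 1 – MPC. The sketch below uses made-up household figures (not IES data) and a hypothetical helper `ols_slope`; it only shows the mechanics.

```python
# Hypothetical household data in rand per annum; NOT taken from the IES.
income = [10000.0, 20000.0, 30000.0, 40000.0, 50000.0]
expenditure = [9500.0, 18500.0, 27500.0, 36500.0, 45500.0]


def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x (with intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


mpc = ols_slope(income, expenditure)  # estimated marginal propensity to consume
mps = 1.0 - mpc                       # marginal propensity to save
print(round(mpc, 3), round(mps, 3))
```

Replacing total expenditure with spending on a single category would give the category-specific MPC in the same way.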
Context (2)
• A common problem in survey data is
measurement error in responses, i.e. the data
captured might not truly reflect the financial
reality being measured.
• In the ‘classical measurement error’ case, this
leads to attenuation bias: estimated relationships
are weaker than the true relationships.
• For other types of measurement error, it may not
even be possible to sign the bias.
Question
• How sensitive are estimates of the MPC to
measurement error (m.e.) in the IES data?
– We propose to estimate this sensitivity using an
instrumental variables (IV) approach, with wage data
from the same households, but from a different
survey, namely the LFS 2000:2, as our candidate
instrument.
Econometric Problem
• Suppose the “true” relationship is:
– Yi = B0 + B1 Xi* + ui , where:
• Y is the outcome variable of interest,
• X* is the ‘true’ value of the explanatory variable,
• u is a mean-zero error term, and
• the subscript i refers to person i, where i = 1, …, n.
– It can be shown that, asymptotically, the OLS
estimator of B1 obtained from a regression of Y on
X* will be consistent if and only if:
• Cov(X* , u) = 0
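The consistency condition follows from a short derivation. This is a sketch that assumes a random sample, finite second moments and Var(X*) > 0; none of these is stated on the slide.

```latex
\operatorname{plim}\,\hat{B}_1
  = \frac{\operatorname{Cov}(X^*, Y)}{\operatorname{Var}(X^*)}
  = \frac{\operatorname{Cov}(X^*,\; B_0 + B_1 X^* + u)}{\operatorname{Var}(X^*)}
  = B_1 + \frac{\operatorname{Cov}(X^*, u)}{\operatorname{Var}(X^*)}
```

The second term vanishes exactly when Cov(X*, u) = 0, which is the stated condition.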
OLS estimator with measurement error
• Suppose that we observe X instead of X*,
– Where X = X* + e, E[e]=0, cov(X* , e) = 0 and cov(u, e)=0
• By regressing Y on X, we can show that the probability limit of
our estimate of B1 from an OLS regression
= B1 + cov(X, u-B1e)/Var(X)
= B1 – B1 (Var(e)/[Var(X*) + Var(e)])
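$$= B_1\left(1 - \frac{\sigma_e^2}{\sigma_{x^*}^2 + \sigma_e^2}\right) = B_1\,\frac{\sigma_{x^*}^2}{\sigma_{x^*}^2 + \sigma_e^2}, \qquad \text{where } \sigma_{x^*}^2 = \mathrm{Var}(X^*),\ \sigma_e^2 = \mathrm{Var}(e).$$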
This is known as attenuation bias.
– In essence, regardless of the type of m.e. we are
considering, the crucial question to ask is whether or not
the covariance between the observed X and the composite
error term is zero.
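The attenuation result can be checked numerically. The simulation below is a minimal sketch under the classical assumptions (X*, e and u mutually independent); the parameter values, sample size and helper `ols_slope` are illustrative assumptions, not taken from the survey data.

```python
import random

random.seed(0)
n = 100_000
b0, b1 = 1.0, 0.8                      # assumed "true" parameters
sd_xstar, sd_e, sd_u = 2.0, 1.0, 1.0   # assumed standard deviations


def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x (with intercept)."""
    m = len(x)
    mean_x, mean_y = sum(x) / m, sum(y) / m
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


x_star = [random.gauss(0, sd_xstar) for _ in range(n)]          # true regressor
x_obs = [xs + random.gauss(0, sd_e) for xs in x_star]           # X = X* + e
y = [b0 + b1 * xs + random.gauss(0, sd_u) for xs in x_star]     # outcome

slope_true = ols_slope(x_star, y)   # regression on the true X*
slope_obs = ols_slope(x_obs, y)     # regression on the mismeasured X

# Attenuation factor Var(X*) / (Var(X*) + Var(e)) implied by the formula.
reliability = sd_xstar ** 2 / (sd_xstar ** 2 + sd_e ** 2)
predicted = b1 * reliability

print(round(slope_true, 3), round(slope_obs, 3), round(predicted, 3))
```

With these assumed variances the attenuation factor is 0.8, so the slope on the mismeasured regressor should settle near 0.64 rather than 0.8.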
M.E: An IV solution
• Suppose we had another variable, Z, which is:
– Correlated with our observed X, and
– Uncorrelated with the composite error term,
v=(u-B1e)
Then Z would provide a valid instrument for the
endogenous regressor X.
In particular, another noisy measure of X*,
e.g. Z = X* + k, would suffice if k is uncorrelated with
both u and e.
Asymptotically,
plim Bhat1,IV =
B1 + (corr(Z,v)/corr(Z,X)) * (var(v)/var(X))^0.5
(Recall that v = (u - B1e).)
Now, var(v) ≠ 0, var(X) ≠ 0 and corr(Z,X) ≠ 0,
therefore the IV works iff corr(Z,v) = 0,
i.e. corr(X* + k, u - B1e) = 0.
So, given the maintained assumptions on X*, u and e,
the crucial requirement is that k and e are
uncorrelated (k must also be uncorrelated with u).
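The IV argument can be illustrated with a second noisy measure of X* used as the instrument. This is a minimal sketch assuming X*, e, k and u are mutually independent; the parameter values and the helper `cov` are illustrative choices, and the simple ratio cov(Z,Y)/cov(Z,X) is the just-identified IV estimator with one regressor.

```python
import random

random.seed(1)
n = 100_000
b0, b1 = 1.0, 0.8  # assumed "true" parameters


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / (len(a) - 1)


x_star = [random.gauss(0, 2.0) for _ in range(n)]  # true regressor X*
e = [random.gauss(0, 1.0) for _ in range(n)]       # m.e. in the regressor
k = [random.gauss(0, 1.0) for _ in range(n)]       # m.e. in the instrument
u = [random.gauss(0, 1.0) for _ in range(n)]       # structural error

x = [xs + ei for xs, ei in zip(x_star, e)]            # observed X = X* + e
z = [xs + ki for xs, ki in zip(x_star, k)]            # instrument Z = X* + k
y = [b0 + b1 * xs + ui for xs, ui in zip(x_star, u)]  # outcome

beta_ols = cov(x, y) / cov(x, x)  # attenuated towards zero
beta_iv = cov(z, y) / cov(z, x)   # consistent when k is independent of e and u

print(round(beta_ols, 3), round(beta_iv, 3))
```

In this setup the OLS slope should be close to 0.64 and the IV slope close to the assumed 0.8, because cov(Z, v) = 0 by construction.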
Implementing the solution
• We match data on individuals from the IES
2000 and LFS 2000:2.
• The data were obtained in October and
September respectively.
• IES contains more detailed information on
multiple sources of income and categories of
expenditure; expenditure is recorded at the HH level.
• LFS contains information on employment
status and wage income.
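One way the matching step might look is sketched below in plain Python. All field names (`hh_id`, `wage`, `yd`, `exp`) and all records are hypothetical, and aggregating person-level LFS wages to the household before joining is an assumption about the procedure rather than a description of it.

```python
from collections import defaultdict

# Hypothetical person-level LFS records (illustrative values only).
lfs_persons = [
    {"hh_id": "A", "person": 1, "wage": 1200.0},
    {"hh_id": "A", "person": 2, "wage": 800.0},
    {"hh_id": "B", "person": 1, "wage": 0.0},
    {"hh_id": "C", "person": 1, "wage": 3000.0},
]

# Hypothetical household-level IES records (illustrative values only).
ies_households = [
    {"hh_id": "A", "yd": 25000.0, "exp": 24000.0},
    {"hh_id": "B", "yd": 9000.0, "exp": 9500.0},
    {"hh_id": "D", "yd": 40000.0, "exp": 38000.0},
]

# Sum LFS wages within each household.
hh_wage = defaultdict(float)
for person in lfs_persons:
    hh_wage[person["hh_id"]] += person["wage"]

# Keep only households that appear in both surveys (an inner join).
matched = [
    dict(hh, lfs_wage=hh_wage[hh["hh_id"]])
    for hh in ies_households
    if hh["hh_id"] in hh_wage
]

for row in matched:
    print(row)
```

Households present in only one survey (C and D here) drop out of the matched sample, which is one reason the matched sample can differ from either survey on its own.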
Summarizing the Data
Table 1: Summary statistics of sample
# of individuals: 103214
# of HHs: 25964
Mean HH size: 3.96

Group       % of sample   Mean HH Yd p.a.   Mean HH Exp p.a.   Ratio of exp/Yd
African        79.4            21006             20933              0.997
Coloured       10.4            35837             35533              0.992
Indian          2.0            75232             69010              0.917
White           8.1           124999            129938              1.040
Total                          32058             32244              1.006
Male           47.5            40198             40277              1.002
Female         52.5            19590             19937              1.018

Notes:
1. Means do not include sampling weights.
2. Means are obtained by race or gender of HH head.
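The reported ratios can be reproduced from the group means. The check below assumes that, for each group, the first figure listed is mean disposable income (Yd) and the second is mean expenditure; that reading is an inference from the arithmetic, since it is the one that matches the reported ratios.

```python
# (mean HH Yd p.a., mean HH Exp p.a.) by group of HH head, as read from Table 1.
table1 = {
    "African": (21006, 20933),
    "Coloured": (35837, 35533),
    "Indian": (75232, 69010),
    "White": (124999, 129938),
    "Total": (32058, 32244),
    "Male": (40198, 40277),
    "Female": (19590, 19937),
}

# Ratio of mean expenditure to mean disposable income, rounded as in the table.
ratios = {group: round(exp / yd, 3) for group, (yd, exp) in table1.items()}

for group, ratio in ratios.items():
    print(f"{group:<9} {ratio:.3f}")
```

These are ratios of means, i.e. average propensities, and are distinct from the marginal propensities estimated by regression.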
Mean Expenditure by Category
Category                   Mean Expenditure   As % of Yd
alcohol & cigs                   634.0             2.0
beverages                        533.3             1.7
clothes                         1384.6             4.3
durable goods                   1097.9             3.4
education                        980.8             3.1
food                            6706.1            20.9
health                          2241.7             7.0
housing                         4792.6            14.9
insurance                       3578.6            11.2
other non durable                955.5             3.0
own production                   679.3             2.1
recreation                        47.9             0.1
hh services                     1002.1             3.1
transport                       2944.7             9.2
utilities                       1025.8             3.2
TOTAL EXPENDITURE              32300.1           100.8
Total Disposable Income        32058

Notes:
1. Proportions in sub-categories do not necessarily add to 100,
due to omitted categories such as debt servicing.
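The note on omitted categories can be made concrete with a short calculation. The sketch assumes the category labels and figures pair up in the order listed; under that pairing it recomputes each share of Yd and the part of total expenditure not covered by the listed categories.

```python
# Mean expenditure by category (rand p.a.), paired with labels in listed order.
mean_exp = {
    "alcohol & cigs": 634.0, "beverages": 533.3, "clothes": 1384.6,
    "durable goods": 1097.9, "education": 980.8, "food": 6706.1,
    "health": 2241.7, "housing": 4792.6, "insurance": 3578.6,
    "other non durable": 955.5, "own production": 679.3, "recreation": 47.9,
    "hh services": 1002.1, "transport": 2944.7, "utilities": 1025.8,
}
total_expenditure = 32300.1
total_yd = 32058.0

# Each category as a percentage of mean disposable income.
share_of_yd = {cat: 100.0 * value / total_yd for cat, value in mean_exp.items()}

# Expenditure not attributed to any listed category.
listed_total = sum(mean_exp.values())
omitted = total_expenditure - listed_total

print(round(share_of_yd["food"], 1))
print(round(listed_total, 1), round(omitted, 1))
print(round(100.0 * omitted / total_yd, 1))
```

On this reading the listed categories cover roughly 89% of Yd, with the remainder of total expenditure falling in the omitted categories.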
OLS and IV coefficients
Coefficient on yd, by expenditure category

Category             OLS yd        s.e.        IV yd         s.e.
alcohol & cigs       0.00362***    0.00014     0.00235***    0.00072
beverages            0.00321***    0.000065    0.00499***    0.00035
clothes              0.00978***    0.00017     0.00998***    0.00091
durable goods        0.0142***     0.00028     0.0184***     0.0014
education            0.0153***     0.00029     0.0208***     0.0014
food                 0.0355***     0.00045     0.0461***     0.003
health               0.0274***     0.00035     0.0378***     0.0018
housing              0.0747***     0.0016      0.0688***     0.0077
insurance            0.217***      0.0019      0.242***      0.0094
other non durable    0.0828***     0.0011      0.00574       0.0057
own production       0.315***      0.0033      0.325***      0.016
recreation           0.00214***    0.00011     0.00244***    0.00054
hh services          0.0317***     0.00037     0.0513***     0.0019
transport            0.0639***     0.00082     0.105***      0.0042
utilities            0.0119***     0.00018     0.0124***     0.00092
TOTAL                0.981***      0.0054      1.049***      0.026
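One way to read the TOTAL row: under classical m.e. and a valid instrument, the ratio of the OLS to the IV coefficient estimates the attenuation factor Var(X*)/(Var(X*) + Var(e)). The arithmetic below is a rough sketch that takes the two point estimates at face value and ignores sampling error; the variable names are illustrative.

```python
# Point estimates for total expenditure from the OLS and IV columns.
ols_total = 0.981
iv_total = 1.049

# Implied attenuation (reliability) factor if the classical model holds.
implied_reliability = ols_total / iv_total

# Implied marginal propensities to save, MPS = 1 - MPC.
mps_ols = 1.0 - ols_total
mps_iv = 1.0 - iv_total

print(round(implied_reliability, 3))
print(round(mps_ols, 3), round(mps_iv, 3))
```

Taken literally, an IV estimate above 1 implies a negative MPS, which ties in with the conceptual difficulty of treating dissaving and borrowing.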
Caveats
• Results are sensitive to the method used to impute
wage data from categories.
• Conceptual difficulties in how to treat debt
financing and dissaving or borrowing.
• IVs are fairly weak; many HHs get income from
grants.
• If m.e. is correlated with income levels, e.g. rich
HHs always understate, then it's not clear that the
solution is valid.
• Non-response is also not accounted for.
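The caveat about m.e. that varies with income can be illustrated by simulation. In this sketch the reporting error is mean-reverting, e = -delta * X* + noise, so households above mean income understate and those below overstate. The functional form, `delta` and all parameter values are assumptions chosen for illustration.

```python
import random

random.seed(2)
n = 100_000
b0, b1 = 1.0, 0.8  # assumed "true" parameters
delta = 0.2        # assumed degree of income-related misreporting


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / (len(a) - 1)


x_star = [random.gauss(0, 2.0) for _ in range(n)]  # true income, demeaned

# Observed X = X* + e with e = -delta * X* + noise (error correlated with X*).
x = [(1 - delta) * xs + random.gauss(0, 1.0) for xs in x_star]
# Instrument: a second measure with classical error only.
z = [xs + random.gauss(0, 1.0) for xs in x_star]
y = [b0 + b1 * xs + random.gauss(0, 1.0) for xs in x_star]

beta_iv = cov(z, y) / cov(z, x)
predicted = b1 / (1 - delta)  # plim of the IV slope in this setup

print(round(beta_iv, 3), round(predicted, 3))
```

Here the IV slope converges to B1/(1 - delta) rather than B1, so the instrument removes the classical noise but not the income-related component of the error.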
Conclusion
• Promising avenue of investigation
• IVs do have significant first stages
• Lots more to be done …
– Do by income quintile,
– And by age of HH head, or some form of HH
composition.