Evidence from Income Underreporting of the Self

Are Household Surveys Like Tax Forms?
Evidence from Income Underreporting of the Self-Employed*
October 2012
Erik Hurst
University of Chicago
Geng Li
Board of Governors of the Federal Reserve System
Benjamin Pugsley
Federal Reserve Bank of New York
Abstract
There is a large literature showing that the self-employed underreport their income to tax authorities. In
this paper, we quantify the extent to which the self-employed also systematically underreport their income
in U.S. household surveys. To do so, we use the Engel curve describing the relationship between income
and expenditures of wage and salary workers to infer the actual income, and thus the reporting gap, of the
self-employed based on their reported expenditures. We find that, on average, the self-employed
underreport their income by about 25 percent. We show that failing to account for such income
underreporting leads to biased conclusions in a variety of settings.
* We would like to thank the following for their helpful comments and discussions: Kerwin Charles, Raj Chetty,
Steve Davis, Joshua Gottlieb, Larry Katz, Bruce Meyer, Matt Notowidingo, Michael Palumbo, Jesse Shapiro, Bob
Willis and three anonymous referees. We would also like to thank seminar participants at Boston College, Chicago,
Harvard Business School, MIT, Penn State, Stanford, the Institute for Fiscal Studies, and the Federal Reserve Bank
of Philadelphia. The views presented in this paper are those of the authors and do not represent those of the Federal
Reserve Bank of New York, the Federal Reserve Board, or the Federal Reserve System. Much of Pugsley’s work
on this paper was completed while at the University of Chicago. Hurst and Pugsley gratefully acknowledge the
financial support provided by the George J. Stigler Center for the Study of Economy and the State. Additionally,
Hurst thanks the financial support provided by the University of Chicago's Booth School of Business.
1.
Introduction
Evidence shows that individuals systematically underreport income to tax authorities. 1 A
separate literature finds that participants in experiments distort their behavior as a reaction to being
studied. 2 Even administrative data such as the Vital Statistics data are not immune to problems of
misreporting.3 Collectively, previous research has shown that individuals tend to misreport their actual
behavior to data collectors and administrative agencies when the incentives to do so are sufficiently large
(e.g., to avoid tax payments) and/or the cost of doing so is small (e.g., changing their behavior in
experimental settings). However, an implicit assumption made in the majority of empirical work using
household survey data is that the data within the household surveys are immune to such systematic errors.
In doing so, researchers are assuming that the problems that plague tax data, experimental data, and other
types of administrative data do not plague household survey data. In this paper, we assess whether such
an assumption is valid. Specifically, we address whether the self-employed, who have been shown to
have misreported their income to tax authorities, have also misreported their income to household
surveys.
A natural question is why interviewees would misreport their earnings to household surveys. On
the one hand, it appears there is little to gain from misreporting because unlike reports to tax authorities,
misreporting income to household surveys cannot decrease a household’s tax liabilities. On the other
hand, there is little to lose from misreporting income to household surveys. Additionally, the selfemployed may perceive other indirect net benefits from underreporting income to surveys. For example,
unlike wage and salary employees who receive W-2 forms, the self-employed have to expend efforts to
accurately account for their true income. If the self-employed have already supplied (or have an intention
to supply) a distorted income report to tax authorities, it may be easier to reuse this report for surveys
1
See, for example, Clotfelter (1983), Slemrod (1985), Feinstein (1991), Andreoni et al. (1998), Slemrod (2007), and Feldman
and Slemrod (2007).
2 This phenomenon is sometimes referred to as the Hawthorne Effect, named for a series of studies at the Hawthorne Works
factory where workers’ productivity initially improved while under observation but declined soon after. There is a large literature
within the social sciences on the Hawthorne Effect. See Levitt and List (2009) for a recent discussion.
3 For example, Blank et al. (2009) compare administrative data from Vital Statistic records to data from the U.S. Census to show
that in states where minimum age of marriage laws were binding, younger individuals appeared to have lied about their age to
government officials when applying for their marriage license.
1
instead of computing a separate and more accurate measure of income. Households may also feel
compelled to provide consistent measures between their income reported to tax authorities and household
surveys if they believe that the information provided to surveys may not be completely confidential.
Even a small probability of self-incrimination may lead to distorting their survey responses.
Our goals in this paper are twofold. First, we infer the extent to which the self-employed
underreport their income within U.S. household surveys. As far as we know, this is the first paper that
attempts to do so. Our empirical strategy is similar in spirit to the one set forth in Pissarides and Weber
(1989). This procedure estimates the relationship between expenditures and income for wage and salary
workers and uses the estimated coefficients from this relationship to predict the true income of the selfemployed based on their reported level of expenditures. Using data from the Consumer Expenditure
Survey and the Panel Study of Income Dynamics, we find that, on average, the self-employed underreport
their income to household surveys by about 25 percent. The estimated magnitudes are nearly identical
across both surveys for similar specifications.
Our estimating procedure makes a few key assumptions.
First, it assumes no differential
underreporting of expenditures by the self-employed relative to wage and salary workers. Second, it
assumes that there are no definitional differences in household surveys between wage and salary workers’
income and the self-employed’s income. Finally, it assumes that the underlying relationship between
income and expenditures, absent any underreporting, is similar between wage and salary workers and the
self-employed. We test for the validity of all of these assumptions, as well as provide a variety of
additional robustness analyses, in subsequent sections.
The second goal of the paper is to show several examples of how ignoring the underreporting of
income by the self-employed can bias many different types of empirical analyses. For example, given that
self-employment propensities differ over the lifecycle and across space, measures of lifecycle and spatial
differences in earnings are sensitive to the extent of underreporting of income by the self-employed. For
example, we find that roughly between 10 and 15 percent of the decline in earnings between the ages of
45 and 65 in both the Consumer Expenditure Survey and the Panel Study of Income Dynamics can be
2
explained by the underreporting of income by the self-employed given that self-employment propensities
rise with age. We also show that the underreporting of income by the self-employed can alter standard
estimates about (1) the importance of precautionary savings in explaining aggregate wealth holdings, (2)
wealth differentials between the self-employed and wage salary workers, and (3) differences in earnings
across cities.
Our work in this paper complements two different literatures. First, there is an important existing
literature that looks at reporting errors within household surveys. 4 Our work adds to this literature by
identifying another source of reporting errors. We find that a large group of individuals, the selfemployed, systematically underreport their income to household surveys. Additionally, we show that the
same misreporting issues that appear in non-survey settings may also exist in household survey responses.
Second, our work also complements a large existing literature that estimates the extent of underreporting
of income by the self-employed using tax records. Much of this work uses stratified random samples of
tax returns subjected to a thorough audit. A separate strand of research uses data from actual tax returns
(as opposed to random audits) to assess the amount of underreporting of income. 5 Our study differs from
these in that we analyze the underreporting of income by the self-employed to household surveys as
opposed to tax authorities.6
On balance, our work shows that it is naive for researchers to assume that individuals distorting
the truth in some contexts would always provide accurate responses when participating in household
surveys. While the benefits of providing distorted information to household surveys are small, so are the
costs of providing inaccurate information. Such potential biases need to be accounted for when analyzing
4
See, for example, Bound et al. (1994), Deaton (1997), Fitzgerald et al. (1998), Haider and Solon (2006), Meyer et al. (2009),
and Aguiar and Bils (2011). Bound et al. (2001) survey nearly fifty years of research examining the extent to which household
survey data are measured with error.
5 For a summary of audit literature, see the recent surveys by Andreoni et al. (1998) and Slemrod (2007). Recent papers in the
latter vein of the literature include Feldman and Slemrod (2007), LaLumia (2009), and Saez (2010).
6 Several authors have examined the underreporting of income by the self-employed within household surveys from other
countries. For example, Pissarides and Weber (1989) use their expenditure-based method to detect the underreporting of income
by the self-employed within Britain’s 1982 Family Expenditure Survey. They find that the self-employed underreported their
income by roughly 35 percent within Britain in 1982. Using a similar methodology, Johansson (2005) finds that the selfemployed underreport their income by 16-40 percent in Finland. Schuetze (2002) finds 17 percent underreporting in Canada, and
Gibson, Kim, and Chung (2009) find roughly 40 and 50 percent underreporting in Korea and Russian, respectively. Engström
and Holmund (2009) find that the self-employed underreport their income by about 30 percent within the Swedish Household
Budget Survey.
3
data from household surveys.
2.
Data Description: CE and PSID
We use two nationally representative household surveys, the Consumer Expenditure Survey (CE) and the
Panel Study of Income Dynamics (PSID), for the majority of our empirical analysis. In this section, we
describe both surveys and discuss our sample selection criteria.
2A.
The CE Sample
We start with the pooled NBER CE extracts spanning all waves from 1980 to 2003.7 We restrict the
sample to households reporting expenditures in all four quarters surveyed and sum their quarterly
responses for an annual expenditure measure. We further restrict the sample to include only households
that have a male head between the ages of 25 and 55 (inclusive), who is working at least 30 hours in an
average week, who has worked at least 40 weeks during the previous year.8 We exclude any households
where the male head is a wage and salary worker but the spouse (if present) was self-employed. We also
exclude any household reporting farm income. Finally, we drop households with zero or negative reported
household income or zero reported household expenditures. All workers in the CE data are asked to
classify their primary employment type as either working for someone else in the private sector, working
for the government, or being self-employed. We refer to households in the first two groups as being wage
and salary workers while we classify the latter responses as being self-employed. Thus, our final sample
includes 27,219 households, of which 2,508 are counted as self-employed.
We define three measures of expenditures and three measures of income. The expenditure
measures are: total food expenditures, total nondurable expenditures, and total expenditures. Total food
expenditures include expenditures on both food consumed at home and away from home. Following
Aguiar and Hurst (2009), we define nondurable expenditures as spending on food, alcoholic beverages,
tobacco, clothing and personal care, housing services, utilities, domestic care services, nondurable
7
See http://www.nber.org/data/ces_cbo.html for details on the NBER CE extracts.
Male heads are indentified as those male individuals who self-report themselves as being a head or, to be consistent with the
PSID, those males who are married to someone who reports herself as being a head.
8
4
transportation, nondurable entertainment, and “other” nondurable expenditures.
Our total expenditure
measure is the CE's measure of total household outlays including spending on nondurables, durables,
education, health care, and other household outlays.
Throughout this paper we use three different measures of household income. Our first measure of
household income sums all labor earnings from wage and salary employment plus all earnings generated
by one's business.9 Business earnings include both the returns to labor and the returns to capital for the
male heads and their spouses (if a spouse was present). We refer to this income measure as “labor plus
business income.” Our second measure of household income is total family income, which includes all
earnings, asset income, and transfer income received by the household. Our third measure of income is
after-tax (disposable) total family income. The CEX records all taxes paid (net of any refunds) by the
household during the prior 12 months. These taxes include federal income taxes, state income taxes,
payroll taxes, and property taxes. To compute after-tax total family income, we simply subtract our
measure of net taxes paid by the household from the measure of household total family income.
2B.
The PSID Sample
Compared with the CE, the PSID only collects limited information on household expenditures. Over the
sample period, the only expenditure category measured consistently is food expenditures. The PSID
measures of household income are similar to our CE measures.10
We use the PSID data from 1980 to 1997 except for the 1988 and 1989 waves, during which food
expenditures were not collected. After 1997, the PSID started collecting data biennially, making us
9
CE respondents are asked: “During the last 12 months, did you receive any money in wages or salary? Include all bonuses and
overtime pay, commissions, tips, allowances, Armed Forces pay, severance pay, teaching fellowships, etc.” If the respondent
answered in the affirmative to the above question, they were then asked: “During the last 12 months, how much did you receive
in wages and salaries for all jobs before any deductions?” This question was asked of all respondents regardless of their selfemployment status. Respondents were then asked the following question about nonfarm business income: “During the last 12
months, did you have any income or loss from your nonfarm business, partnership, or professional practice?” Respondents that
answered in the affirmative to this question were asked “What was the amount of income or loss after expenses?” It is possible
that the CE may understate the total income of the business owners if respondents do not report any retained earnings. We
explore this concern in much greater depth in Section 5.
10 The only difference is that PSID only includes reported measures of federal income taxes paid by the household up through
survey year 1991. There are no measures of actual state income taxes or payroll taxes paid by the household in any survey year.
For years up through survey year 1991, our measure of net federal taxes paid by the households is the actual reported taxes paid
by the household. For survey years after 1991, our measure of net federal taxes paid by the household is our imputation of taxes
paid by the household using NBER TAXSIM.10
5
unable to match income and expenditure occurred during the same year.11
With respect to income, business owners in the PSID were first asked: “Did the business show a
profit, a loss, or did it break even in the prior calendar year?” For those reporting a profit, the question
was followed up with “How much was your share of the total income from your business in [the prior
calendar year]—that is, the amount you took out plus any profit left in?” A separate question was asked
about the amount of business losses if the individual reported having business losses. Thus, the PSID
respondents were specifically asked to consider the amount of income earned from the business that was
reinvested back into the business.
Within the PSID, individuals are asked to report whether on their main job they “are selfemployed” or “employed by someone else.” Individuals can report being only employed by someone
else, only self-employed, or both employed by someone else and self-employed.
We define self-
employed households as being a household where the male head reports being self-employed only. Wage
and salary households are defined as ones where the male head reports only working for someone else.
We exclude households who report being both self-employed and working for someone else from our
analysis. Lastly, we exclude all households where the head is a wage and salary worker but the spouse (if
present) was self-employed. None of our results are sensitive to whether these households are included or
excluded from the analysis.
We form two samples using the PSID data.
The selection criteria of the first sample, which
exploits only the cross-sectional nature of the PSID data, are nearly identical to the CE sample discussed
above.
After applying these restrictions, our first PSID sample (PSID 1-Year sample) has 36,434
households, 4,446 of which are counted as self-employed. Our second PSID sample leverages the panel
dimension of the data.
We combine multiple waves of the PSID data to create a three-year average
measure of income. This approach has been taken by many others in the literature to construct measures
of permanent income within the PSID (see, for example, Solon 1992 and Gottschalk and Moffitt 1994).
11
Households are asked to report income earned during the previous calendar year and to report current weekly or monthly food
expenditures. We match the income report from survey year t+1 with food expenditures in survey year t, which is the standard
approach in the literature.
6
Specifically, for each household in year t, we compute income measures that average income between t-1
and t+1.12 This sample imposes two additional restrictions above and beyond the restrictions to our oneyear PSID sample.
between t-1 and t+1.
First, we impose that the household be in the survey for all consecutive years
Second, we impose that households who are classified as self-employed are self-
employed in t-1, t, and t+1 while households who are classified as wage and salary workers are wage and
salary workers consistently in all three years.
We exclude any household who transitioned from self-
employment to wage and salary workers during the three-year period (or vice versa) from our sample.
Additionally, when restricting the sample to include only those individuals with positive income, we
make the restriction based on the three-year average of income.13 These restrictions left us with 18,233
observations, 1,901 of which were self-employed.
2C.
Differences between Wage Workers and the Self-Employed
Table 1 shows descriptive statistics separately for wage and salary workers and the self-employed within
the CE sample, one-year PSID sample, and three-year average PSID sample.14 Across all surveys, the
self-employed are slightly older, more likely to be married, much less likely to be black and are somewhat
better educated. The two surveys are largely consistent in their demographic composition. It is not
surprising that there are some moderate demographic differences between the PSID three-year average
sample and the PSID one-year sample given that the former conditions on the household being in either
full time employment or full time self-employment for three consecutive years. If lower educated
households are more likely to transition between self-employment and wage work or if they are more
likely to transition in and out of the labor force, then the three-year average PSID sample would appear
more educated than the one-year PSID sample.
12
We use this alternative measure of permanent income to evaluate our choice of education dummies as an instrument for
permanent income in the one-year sample. As shown below, our estimates of underreporting using the two procedures to
measure permanent income are very similar.
13 The number of households we exclude was essentially zero when using the three-year average PSID sample. This is not
surprising given that we are averaging income over three years for households with full time working heads.
14 We convert all income and expenditure measures to year 2000 dollars using the CPI and express them in log levels.
All
estimates are weighted using the sampling weights of the respective surveys.
7
Before proceeding, we would like to highlight that, unconditionally, the self-employed report
working roughly 10 to 12 percent more annual hours than wage and salary workers. The difference
remains essentially unchanged even controlling for demographics. The difference in hours worked is
notable and we will return to it when we address whether the expenditure differences we document
conditional on income could be driven by varying tastes for leisure or difference in home production
needs for the self-employed.
3.
Detecting the Underreporting of Income of the Self-Employed
Our central identifying assumption in understanding the relationship between income and
expenditure is the log-linear Engel curve. Specifically, within each group, k , where k indicates either
self-employed ( k  S ) or wage and salary ( k  W ), we assume that household i has a preference that
generates, at least to a first order approximation, an Engel curve of the following form:
ln cikt   k   k ln yiktp  k ' X ikt  kt   ikt
(1)
where  k is the income elasticity, X ikt is a vector of demographic controls,  kt is a time effect, and  ikt
represents the cumulative effects of other unobserved determinants of the household’s consumption. The
vector of household controls includes: a series of five-year age dummies, a dummy if the household head
is black, a dummy if the household head is married, and a series of family size dummies. We assume that
households are able to borrow and lend in asset markets so that it is the permanent component of income,
yiktp , that determines the household’s consumption, as is standard in modern theories of consumption.15
Estimating (1) directly is problematic because the household’s annual income reported to the CE
and PSID represents both yiktp and a transitory component. Here we assume for the moment that all
households accurately report their current income, yikt , so that:
ln yikt  ln yitp  k ' X ikt  ikt ,
15
(2)
We do not require that markets are complete, only that the household is able to save and borrow in some form sufficient to
support a lifetime budget constraint in expected value. In section 5, we explore whether differential liquidity constraints between
the two groups can explain the patterns we see in the data.
8
where k ' X ikt and  ikt represent respectively the predictable and unpredictable components of transitory
income. The transitory fluctuations could represent both the true transitory variation in household income
and classical measurement error in the report of income. We assume that the unpredictable component
 ikt is orthogonal to both the controls X ikt and any unobserved determinants of the household’s
consumption  ikt (i.e., E[ X ikt ikt ]  0 and E[ ikt  ikt ]  0). Rewriting (1) in terms of (2) implies:
ln cikt   k   k ln yikt   k ' X ikt  ikt ,
(3)
where  k  k   k k and ikt   ikt   k ikt . Even if there was no measurement error in household
income reports, transitory income fluctuations introduce attenuation bias in our estimation of  k from (3)
since, by construction, E[ln yiktikt ]  0 .16
To mitigate the effects of both measurement error and transitory income fluctuations in our
estimation of equation (3), we follow two separate approaches. First, when using the CE sample or the
one-year PSID sample, we instrument for yikt using four educational attainment dummies (exactly high
school, some college, exactly college, and more than college), assuming that educational attainments only
affect consumption through affecting permanent income.
Second, when using the PSID three-year
average sample, we average the income measures over multiple years so that the transitory fluctuations
are mitigated. Moreover, by comparing the two procedures for the same sample, we can examine the
extent to which education dummies satisfy the exclusion restriction. As seen below, our estimates are
very similar under both procedures.
We have assumed that the log-linear Engel curve is a good representation of household behavior
as well as that the Engel curves between the two groups differ primarily due to differences in intercepts as
opposed to differences in slopes.
Figure 1 plots a nonparametric estimate of the income-expenditure
Engel curves estimated separately for the two groups using the PSID three-year average sample where our
16
Additionally, Andreoni (1992) shows that individuals can use income underreporting to the IRS as a way to smooth transitory
fluctuations in income if they do not have access to other sources of credit. For this reason, underreporting may be correlated
with transitory fluctuations in income. This suggests another reason to use measures of permanent income in our Engel curve
estimation of income underreporting.
9
measure of income is log average total family income and our measure of expenditure is log average food
spending.
We first condition out our demographic controls by separately regressing log average total
family income and log average food expenditure on our X vector of controls.
We then use a
nonparametric median spline procedure to estimate a nonlinear Engel curve from the residuals for each
group. Figure 1 plots the estimated nonlinear Engel curve for each group.17
There are two interesting facts from Figure 1. First, the estimated Engel curve for the selfemployed (dotted line) lies strictly above the estimated Engel curve for the wage and salary workers
(solid line) over essentially the entire support of the data. In other words, for a given amount of income,
the self-employed within the PSID spend more on food than the wage and salary workers.
This is
consistent with a potential story of the underreporting of income by the self-employed to household
surveys. Second, even though Figure 1 was estimated nonparametrically, the slope of the Engle curve for
the wage and salary workers was roughly constant as income increased. Moreover, the estimated slope of
the Engel curve for the self-employed was roughly similar to the estimated slope for the wage and salary
workers. When we estimate the log-linear Engel curve as specified in (3) separately for each group on
this exact sample, the estimated slope of the log-linear Engel curve for the sample of the self-employed
was 0.29 (standard error = 0.02) while the estimated slope of the Engel curve for the sample of the wage
and salary workers was 0.34 (standard error = 0.01).
Although the differences in the slopes are
statistically significant (p-value = 0.02), the economic magnitude of the difference is quite small.
If income reports in the surveys are accurate, Figure 1 shows that the self-employed consume
substantially more than wage and salary workers. We use a simple model that relaxes the assumption of
accurate reporting of income and allows us to estimate the magnitude of underreporting by the selfemployed. For households of each type, we now assume their income reports to the household surveys
are determined by
17
For expositional purposes, we re-centered the residuals at the unconditional log average income and log average expenditure
for each group. Finally, when plotting the data, we restrict the range of the log average income residuals to be within the
overlapping range for both groups. Within the overlapping range, we further restrict the support by truncating observations with
log average income below the 1st percentile and above the 99th percentile.
10
p
ln yiWt  ln yiWt
 W ' X iWt  iWt
(4)
ln yiSt  ln  S  ln yiStp   S ' X iSt  iSt
(5)
For each group, we assume that the observed income in any time period, yikt , is a noisy proxy of
permanent income, where  ikt includes both any classical measurement error in the income report and
the unpredictable component of transient variation around permanent income. As before, we assume that
ν has mean zero and is orthogonal to the unobserved determinants of log expenditure, ε, in (1).
Equations (4) and (5) embed two additional assumptions. First, we assume that self-employed
households (k = S) systematically misreport their earnings by a factor,  S . Although we do not impose
that  S  1 , our empirical estimates below reveal that, in fact,  S  1 . We assume that  S is constant
across self-employed households. Second, we assume that the wage and salary workers provide a
systematically unbiased report of their income to household surveys, i.e., W  1 . If wage and salary
households also systematically misreport their income to household surveys, κS would be an estimate of
the differential systematic underreporting by the self-employed.
The goal of this section is to provide a method to estimate  S . In order to do so, we also assume
that the parameters of the two Engel curves are constant between the self-employed and the wage and
salary workers. Lastly, we assume that household expenditures are not differentially misreported between
the two groups.
This amounts to assuming that if the self-employed under or over report their
expenditures systematically, the wage and salary workers under or over report their expenditures
systematically to the same extent. We provide evidence for this assumption in Section 5.
Given these assumptions, we can estimate  S via the following expression formed from
combining (3) with (4) and (5):
ln cikt     ln yikt   Dit  X ikt  t  ikt ,
(6)
where D is a dummy variable indicating whether the household head is self-employed. As before
11
     '  and the unobserved determinants are expressed as ikt   ikt   ikt . The fraction of
income reported by the self-employed,  S , is identified as exp(  /  ) , and we form the estimate ˆ S as
exp( ˆ / ˆ ) using the estimated coefficients from equation (6).18 We will often express our results in
terms of the amount that the self-employed underreport their income to household surveys, which can be
defined as 1-  S . When estimating (6) using the CE sample or the PSID one-year sample, we instrument
for reported ln yikt using the set of education dummies. Estimating (6) with OLS would underestimate
the income elasticity of consumption β and hence overestimate  S . When estimating (6) using the PSID
three-year average sample, we use three-year average income when computing our measure of ln yikt .
Given this, we only report the OLS estimates for the PSID sample.
Before proceeding, it is worth discussing the potential of measurement error in our expenditure
data. Given the fact that we are estimating (6), classical measurement error in our expenditure data will
not bias our estimates of 1-  S .
There is evidence, however, that measurement error in the expenditure
measures within the CE is large and that it may not be classical.
For example, Bee et al. (2011) and
Sabelhaus et al. (2011) show that CE respondents systematically underreport their expenditures to the
survey and that the underreporting has grown over time. Additionally, there is evidence that the extent of
the underreporting differs by expenditure category. Categories like food at home tend to be measured
well within the survey while categories like clothing are not. Despite this, our estimates of 1-  S using
the CE data will be consistent if (1) the systematic measurement error in expenditure reports does not
vary between the self-employed and wage and salary workers and (2) the systematic measurement error in
expenditure reports is uncorrelated with permanent income and other controls.
18
In the online robustness appendix that accompanies this paper, we compare our estimation to Pissarides and Weber (1989). The
main difference between the procedures is that Pissarides and Weber take a more parametric approach towards the income
process that adjusts for the greater volatility of transient income fluctuations for the self-employed. Ignoring the differences in
variance between groups could overestimate the extent of underreporting. In the appendix we show that adjusting for the
differences in variance using the Pissarides and Weber model decreases the estimated underreporting only slightly. Moreover,
their procedure only adjusts for differences in transient income volatility. If self-employed permanent income is also more
volatile than for employees, this attenuates the bias. After also correcting for differences in permanent income volatility, we find
estimates close to those in Table 3.
The online robustness appendix can be found at
http://faculty.chicagobooth.edu/erik.hurst/research/.
12
It is true that systematic biases in expenditure reports will affect the intercept of the estimated
Engel curves for both the self-employed and wage and salary workers separately.
However, if the
systematic bias is similar between the groups, the difference in the Engel curves (our estimate of γ) will
not be altered. There is no obvious reason to believe that the systematic measurement error documented
in the CE differs between the self-employed and wage and salary workers. Moreover, in Section 5, we
test directly for differences in systematic reporting of expenditures between the self-employed and wage
and salary workers and reject any systematic difference.
However, if the measurement error in
expenditure reports is correlated with permanent income, our estimates of β will be downward biased,
causing us to understate the amount of income underreporting by the self-employed. Aguiar and Bils
(2011) provide some evidence that the extent of income underreporting in the CE has increased to a
greater extent for high income households relative to low income households. Despite this, we believe
that our estimates are unbiased for two reasons. First, according to Aguiar and Bils (2011), the estimated
income elasticity for food in the CE (something akin to our estimate of β) has been relatively stable over
time, suggesting that the changing differential measurement error in total expenditures between rich and
poor households has not altered the estimates of the food income elasticity. This is consistent with food
being relatively better measured in the CE than other expenditure categories. Second, we are finding
similar estimates using PSID expenditure data which has not been shown to have the same systematic
measurement error problems.
4.
Estimating the Missing Income Using Expenditures
In this section, we show the baseline results from estimating (6) for our three samples—CE, PSID 1-year,
and PSID 3-year. Table 2 shows our estimates of both β and γ from equation (6) using different
measures of expenditures separately for the CE and PSID samples. For comparison, we show both the
OLS and IV results when using the CE and PSID one-year sample. Also, we separately show the results
where we use total family income and total after-tax family income.19 As seen from Table 2, across all
19
After-tax income, although subject to its own measurement issues, is our preferred measure of income. If the self-employed
pay less tax (whether through underreporting or other means), money that would have been paid in tax can instead be allocated to
13
expenditure measures and across both surveys, estimates of γ are positive and statistically different from
zero (shown in panel A). For example, using the CE sample and using the nondurable expenditure
measure, we estimate that the self-employed spend 18 percent more than wage and salary workers with
similar reported income. The estimates of γ are roughly similar whether or not we instrument for annual
income. However, the estimates of γ are slightly smaller when we use after-tax family income as our
measure of income (e.g., 14 percent for nondurable expenditures in the CE). Finally, the estimates of γ
for food expenditures are roughly similar between the PSID and CE samples.
The estimates of β from (6) are shown in Panel B of Table 2. Not surprisingly, given the
existence of transitory variation around permanent income, in addition to measurement error in annual
income, the estimates of β are sensitive to whether we estimate (6) via OLS or IV. The fact that current
reported income is a noisy proxy of permanent income attenuates the estimated expenditure income
elasticities. In all cases using the PSID one-year sample and the CE sample, the IV estimates of β are
higher than the OLS estimates of β. The IV estimates of the income elasticities for food expenditures
range from 0.31 to 0.42 across the various samples and specifications. The IV estimates of the income
elasticities for total nondurables and total expenditures are much higher. These results are consistent with
many empirical studies which find that food is a relative necessity compared to other consumption
categories. It is also comforting that the food income elasticity estimated via OLS using the three-year
average PSID sample is nearly identical to the food income elasticity estimated via IV using the one-year
average PSID sample and after-tax income.
This allows us to provide some estimates of underreporting
using the PSID three-year average sample that does not rely on the validity of our instruments.
Table 3 presents the central results of the paper, quantifying the self-employed’s missing income.
In this table, we use the estimates of β and γ shown in Table 2 to construct estimates of (1-κs) - the
estimated underreporting of income by the self-employed relative to wage and salary workers. We
consumption. Even if income is accurately reported to household surveys, the self- employed’s expenditures would be higher for
a given level of before-tax income. Using after-tax income is robust to this source of excess expenditures.
14
 ˆ 
 and compute standard errors using standard
 ˆ 
estimate this reporting gap as (1  ˆS )  1  exp 
asymptotic approximations. In this table, we only show the results from the IV regressions for the CE
and PSID one-year samples, which are the appropriate way to estimate (6) given the measurement error in
income. When reporting the results using the PSID three-year average sample, we compute the extent of
underreporting using the OLS results.
The results of Table 3 are striking. The estimated amount of underreporting is similar across the
two surveys and are very similar regardless of the measures of income and expenditures used in the
estimation. For example, the estimated underreporting using total family income and food expenditures
are 31.7 percent in the CE, 31.5 percent in the PSID one-year sample, and 29.4 in the PSID three-year
sample. Using after-tax total family money and food expenditures yields estimates of 25.2, 28.8, and 27.8
in the three samples. respectively.
The estimated underreporting is slightly smaller using broader
measures of consumption. For example, the estimated underreporting is roughly 25 percent using both
nondurable expenditures and total expenditure when the income measure is total family income.
after-tax total family income, the estimates fall to below 20 percent.
With
All results are statistically
significant at the one percent level. In the next section, we show that the underreporting of vehicle
expenditures by the self-employed (which are included in both nondurable expenditures and total
expenditures) could be the reason that our income underreporting estimates are lower when we use more
aggregate expenditure measures as opposed to the parameters we estimated we use food expenditures.
As discussed above, within the PSID, respondents were asked only to recount their actual federal
taxes paid up through 1991. After that, we have imputed the federal taxes paid using the NBER Taxsim
data.
Is the imputation potentially biasing our results with respect to the importance of conditioning on
after-tax income?
It does not appear so. We infer this from the fact that if we compare the results
between using pre-tax and after-tax income during the pre-1991 sample, they are similar to the results for
the full sample. For example, using the PSID three-year sample and restricting the sample to pre-1991
data (where no imputation was necessary), the estimated underreporting using pre-tax total family income
15
and after-tax total family income was 28.1 percent and 27.5 percent, respectively.
However, the gap
between the two estimates in the pre-1991 period is roughly similar to the gap between the pre- and aftertax estimates in the full sample.
The results in this section suggest that household surveys are like tax forms in that households
with an incentive to misreport their income to the tax authorities also misreport their income to household
surveys. The results also suggest more broadly that household surveys are not immune to the potential
problems of mis-measurement that plague other types of data.
Our estimates suggest that the self-
employed under-report their income by roughly 20-30 percent.
5.
Examining the Validity of Key Identification Assumptions
In this section, we assess the plausibility of our identifying assumptions, placing particular emphasis on
underreporting as the first order source of differences in predicted expenditure. First, our model imposes,
conditional on our controls, a log-linear Engel curve. Previous studies, including Pissarides and Weber
(1989), have found this to be a reasonable approximation. Besides analytic convenience, we consider
several alternative models incorporating higher order terms; in each of these models we fail to find
consistent evidence that the higher order terms are either significant or quantitatively important. Despite
potential nonlinearities for specific expenditure categories such as alcohol or clothing, we find the loglinear form fits the data well for food, nondurable, and total expenditures. This is consistent with our
findings in Figure 1 where we estimated the Engel curve for wage and salary workers nonparametrically.
Second, our baseline results constrain the effect of controls and the income elasticity to be identical across
groups. In all samples, we fail to reject equality of the covariate coefficient vector. We do find some
weak evidence that the income elasticity of the self-employed is lower than wage and salary households.
However, the difference is inconsistent across samples and small in magnitude.
Given our procedure, our estimate of underreporting remains the residual explanation for the
difference in predicted expenditure across groups conditional on income. There are other potential
explanations for this expenditure gap even if income were accurately reported. In the remainder of this
16
section we address several alternative explanations and show that our estimates holds when taking these
alternative explanations into account.
5A.
Potential Systematic Differences in the Reporting of Expenditures
The fact that some business expenditures can substitute for personal expenditures may undermine the
validity of the assumption of no differential misreporting of household expenditures between the two
groups. For example, the self-employed sometimes purchase vehicles for their business. In doing so,
they may treat gasoline and maintenance associated with the vehicle as business expenses. It is possible
that some of the expenses they attribute to the business actually should instead be recorded as household
expenditures. The story could also hold in reverse. The self-employed could conceivably report all
outlays on vehicle expenses as being personal consumption even though some of the expenses are for
their business. Given this, the self-employed could be underreporting or over-reporting their personal
expenditures to household surveys.20
To test for the importance of this, we perform two separate exercises. First, using the CE sample,
we re-estimate the amount of underreporting of income using a variety of expenditure subcategories in
order to explore the robustness of the results across the subcategories. Then, we can verify whether
expenditure subcategories that are not likely to be subject to the confounding of business and personal use
yield different estimated underreporting gaps than categories where the potential substitution for business
and personal use is larger. For example, while vehicle expenditures may be easily attributed to the
business even if they were for personal use, it is less likely that an individual would attribute the spending
on home utilities as being a business expense. To implement this procedure, we re-estimate (6) using the
log of spending on different expenditure sub-categories as the dependent variable using the CE data.
Focusing on utilities as a measure of expenditures and using after-tax family income as our measure of
income yields an underreporting of income by the self-employed of roughly 30 percent, nearly identical to
20
For some categories of expenditure, such as gasoline, the CE explicitly asks the share of that expenditure used for business.
For some other expenditure categories, such as auto repairs and parking, the CE explicitly reminds the consumers not to include
such expenditure for business use. However, it is still possible that some consumers counted personal expenditure as business
expenditure or vice versa, leading to reporting inaccurate personal expenditure to the CE.
17
the food expenditure estimates.
Across all the subcategories we explored, nearly all of them yielded an estimated underreporting
gap in the 20 to 30 percent range.
One notable exception, however, is nondurable transportation
expenditures (which includes spending on vehicle gasoline, car maintenance, parking fees, tolls, etc.).
Using this subcategory, we estimate underreporting of income by the self-employed close to zero. This is
consistent with anecdotes that the self-employed frequently classify a large amount of their vehicle
expenditures associated with personal use as business expenses. Overall, however, we find no evidence
that the results across any of the other expenditure categories differ in any substantive way from our main
results even though the substitutability with business expenditures differs markedly across the categories.
Given that nondurable vehicle spending is a component of nondurable spending and total outlays, it is not
surprising that our estimates of underreporting are lower for nondurable spending and total spending than
they are for all the other individual categories like food expenditures and utilities.
The results above, however, raise a different question. Is it possible that the self-employed
systematically misreport their expenditures on all categories? To test for this hypothesis, we estimate the
relationship between the share of spending on a given category and total expenditures based on Deaton
and Muellbauer's (1982) Almost Ideal Demand System (AIDS) and allowing for systematic measurement
error of expenditures.
If underreporting were uniform across categories, then while total expenditures
would be underreported, budget shares would be unaffected.
Our procedure tests whether the budget
shares of necessary goods and luxury goods, conditional on total expenditure, differ between the selfemployed and wage and salary workers. For example, if the self-employed underreport expenditures, then
their budget shares for luxury goods will be too high for their measured level of total expenditures.
To be concrete, suppose individuals report their expenditures on different consumption categories
(j) such that:
log ciktj  log ciktj*  log k  t
(7)
j
where cikt is the reported amount of spending on consumption category j by household i from group k in
18
j*
year t and cikt is the true amount of spending on consumption category j by household i from group k in
year t.21 The actual report is subject to two types of multiplicative error: a group specific term that is
constant across all categories in all time periods ( k ) that represents the fraction of true expenditures
reported, and a time specific term that is constant for all groups and for all categories ( t ). We include a
time effect because it is well documented that measurement error in the CE has been growing over time. 22
The key assumption in our procedure to estimate systematic misreporting of expenditures by the selfemployed is that there is a group specific measurement error term in reported expenditures that is constant
across both goods and time and that is additive in log expenditure.
Given this specification, reported total expenditure, cikt , is just the sum of reported expenditures
*
on each of the individual categories and actual total expenditures, cikt , is just the sum of the actual
expenditures on the individuals categories. With these assumptions, we can express total reported
expenditures as also being a function of the three components such that:
*
log cikt  log cikt
 log k  t .
(8)
Under an AI-style demand system, individuals have expenditure shares for consumption category j that
can be expressed as:23
*
siktj*   j   j ln cikt
  j X ikt   iktj
(9)
ciktj* ciktj
where s  * 
and X ikt is the same vector of demographic controls used above. Conditional on
cikt cikt
j*
ikt
demographics, the share of spending on some expenditure categories will be increasing in total
21
We assume that within each year households face similar prices for the expenditure categories so income differences are
driving changes in expenditure shares.
22 See Aguiar and Bils (2011) We wish to note that we could enrich the structure of the measurement error even further by
allowing for either a time-invariant category specific component or a time-varying category specific component. As long as we
assume that these category specific terms do not vary with self-employment status, our estimating procedure to identify
systematic misreporting of expenditures by the self-employed will not be altered.
23 This expression omits a price term. Although there may be year-to-year price variation, since we will aggregate up to fairly
broad categories where substitution effects are much smaller, omitting prices is reasonable. We also estimate the system for
specific years where price variation should be minimal and find the same effects.
19
expenditures (i.e., luxuries), while the share of spending on other expenditure categories will be
decreasing in total expenditures (i.e., necessities).
As above, we assume that the parameters of
expenditure share elasticity function are constant between the self-employed and wage and salary
workers. Given the above specifications and the assumptions on measurement error, we estimate:
siktj*     ln cikt   Dit  X ikt  t  ikt ,
(10)
where Dit is an indicator variable taking the value of one if the household is self-employed in year t and
t is a vector of year dummies. All other variables are defined above.
The coefficient  in (10) represents the extent to which the self-employed will under or over
report their expenditures relative to wage and salary workers. In particular,  can be shown to be equal
to   j log S where S is the fraction of expenditures reported by the self-employed relative to wage
and salary workers. If the self-employed underreport their expenditures by a constant amount on all
expenditure categories relative to wage and salary workers,  will be positive for luxuries and  will be
negative for necessities.
The patterns will be the opposite if the self-employed over-report their
expenditures by a constant amount on all expenditure categories.
If  is equal to zero for luxuries and
necessities, this says that the self-employed are not systematically under-reporting or over-reporting their
expenditures relative to wage and salary workers.
To implement our estimation procedure, we segment our nondurable consumption categories into
luxuries, necessities, and other. We use the estimated consumption category expenditure elasticities from
Aguiar and Bils (2011) to segment the categories.
We define luxuries to be those consumption
categories with estimated expenditure elasticities greater than 1.3. These categories include food away
from home, men’s and women’s clothing, and nondurable entertainment expenditures.
Necessities are
defined to be those consumption categories with estimated expenditures less than 0.7. These categories
20
include food at home, children’s clothing, and utilities. 24
Nondurable expenditure categories with
expenditure elasticities between 0.7 and 1.3 were considered neither luxuries nor necessities. We also
exclude transportation which, given our results above, may be miscategorized as business spending by the
self-employed. We estimate the demand system from (10) defined over the three categories of
expenditures. Our estimate of  from the regression using the luxury share as the dependent variable is
0.005 (standard error = 0.003) while the estimate of  from the regression using the necessity share as
the dependent variable is -0.003 (standard error = 0.003). An F test for   0 for each category has a pvalue of 0.25. This suggests that s , the fraction of expenditures that is systematically reported by the
self-employed relative to wage and salary workers, is very close to one.
Both of our additional tests suggest that, aside for the possibility of nondurable vehicle
expenditures, there is no systematic evidence that the self-employed misreport their expenditure data to
household surveys relative to wage and salary workers.
5B.
Potential Other Reasons That Could Cause Systematic Differences in the IncomeExpenditure Relationships.
In this subsection, we assess whether differences in the income-expenditure relationships between wage
and salary workers and the self-employed are driven by factors other than the systematic underreporting
of income by the self-employed. In particular, we examine 1) differences in the potential conceptual
measures of income between the two groups, 2) differences in desired expenditures due to differences in
hours worked, and 3) differences in desired saving propensities between the two groups.
Different Measures of Income
We begin by assessing whether the above underreporting results could be driven by differences in the
conceptual measures of income between the two types of individuals. As noted above, there are a few
24
Aguiar and Bils (2011) estimate their expenditure elasticities from a regression of log expenditure on a given category on log
total expenditure and controls as opposed to estimating the share of expenditure on a given category on log total expenditure and
controls (as we do). Hence, their estimates range from 0.25 to 2.10.
21
potential issues with our income measures. First, the results in Tables 2 and 3 focus on pre-tax and aftertax family money.
Do our results differ instead if we use before-tax business plus labor income? As
seen from column (1) of Table 4, the IV results using pre-tax business plus labor income are very similar
to the results shown in column (1) of Table 3. The regression we ran is identical to what we reported in
Table 3 aside from changing the income measure to pre-tax business plus labor income.
Additionally, as noted above, the CE data do not allow us to measure the retained earnings of the
self-employed. To the extent that some of the desired saving of the self-employed is done via retained
earnings, the self-employed could have a higher propensity to consume out of the income they draw out
of the business.
This, however, is not a problem with the PSID data. To test for the importance of the
conceptual difference in income as an explanation of our results in the CE, we examine the
underreporting behavior of the self-employed with zero reported business wealth relative to the
underreporting behavior of the self-employed with non-zero business wealth. The assumption is that if
the self-employed are reinvesting a substantial amount of their income into the business, it should show
up in the value of their business. Such an identification is made possible given the fact that a substantial
fraction of the self-employed owns a business with little value (see Hurst and Lusardi 2004). This is
consistent with the facts documented in Hurst and Pugsley (2011), who show that many of the selfemployed are in industries where very little physical capital is needed (skilled craftsmen, real estate
agents, lawyers, etc.).
To do this, we re-estimate (6) using the CE data including separate dummies for the business
wealth of the self-employed. The results from this specification show that the estimated underreporting
gap is still substantial even for those self-employed with zero business wealth. These findings suggest
that the differences in the conceptual measures of income between the self-employed and the wage and
salary workers is not the primary driver of the underreporting of income we documented above.
Difference in Expenditures Due to Differences in Hours Worked
As documented in Section 2, the self-employed do systematically work more hours than wage and salary
workers, even within the restricted sample of individuals working full time.
22
Theory suggests that
individuals who work more may engage in less home production compared to individuals who work less
(Becker 1965 and Ghez and Becker 1975), or that increased work hours may raise the marginal utility of
consumption.
Empirically, Aguiar and Hurst (2005, 2007, and 2009) show that a given consumption
bundle produced with more market inputs, as opposed to time inputs, is indeed more expensive. To the
extent that the self-employed engage in less home production, because they are allocating more hours to
market work, expenditures may be higher simply because they are spending more for an equivalent
consumption bundle.
To control for this potential difference in the market cost of the consumption bundle, we follow a
procedure similar to that of Blundell, Browning, and Meghir (1994) and further control for log work
hours. We re-estimate (6) with a log work hours control using after-tax total family income as our income
measure to compute new estimates of 1  ˆ . These results are shown in column (2) of Table 4. For each
expenditure measure, the results are very close to the original estimates shown in column (2) of Table 3.
Differences in Saving Propensities
A potentially more significant problem is the extent to which the self-employed have different saving
incentives relative to wage and salary workers. If the self-employed face greater income risk, they may
want to accumulate more wealth to insure themselves against potential shocks. Likewise, if the selfemployed face more binding liquidity constraints, they may accumulate more wealth to overcome such
constraints. Finally, if the self-employed receive fewer fringe benefits, they may have to save more for
health shocks and for retirement. All of these factors will shift the income-expenditure relationship
downward (as a result of the higher saving propensities), causing us to under-estimate the extent to which
the self-employed underreport their income.
However, the self-employed may also have a higher tolerance for risk (Barsky et al., 1997). As a
result, for a given amount of risk, the self-employed may have a lower desire to accumulate precautionary
savings. The differences in risk preferences could result in less desired saving by the self-employed and,
as a result, cause us to overstate the extent to which the self-employed underreport their income. A
23
similar result would occur if the self-employed systematically expect higher income in the future.
We can partially assess whether such possibilities affect the expenditure gap by segmenting the
sample by age. High expected future income, binding liquidity constraints, and the existence of labor
income risk are much more important for young households relative to older households (Gourinchas and
Parker, 2002). In columns (3) and (4) of Table 4 we re-estimate (6) on a sample of young and older
households, respectively. We define younger households as households between the ages of 25 and 40
and define older households as households between the ages of 40 and 55. Again, for comparison to the
results above, we use after-tax total family income as our measure of income. Across expenditure
measures, the estimated amount of underreporting is very similar between old and young households.
These results suggest that differences in future income expectations, differences in binding liquidity
constraints, and differences in precautionary motives are likely not the primary driver of our results.25
As an additional robustness exercise, we examined the extent of underreporting by the level of
household wealth. To do this, we augmented (6) to include dummies for whether the household was in
the second wealth quartile, the third wealth quartile or the top wealth quartile along with our
entrepreneurship dummy interacted separately with all of the wealth quartiles. The goals of this
specification are twofold. First, we ask the general question as to whether differences in household
wealth (which could result from past savings behavior) could explain the estimated underreporting by the
self-employed. Second, the specification allows us to assess whether the extent to which households
underreport differs by wealth quartiles. The results from this specification shows that the underreporting
estimates do not differ significantly across the wealth quartiles.
6.
The Importance of Income Underreporting by the Self-Employed
In this section, we show how not accounting for the systematic underreporting of income by the selfemployed can lead to biased conclusions in a variety of settings. To do so, we re-examine several
empirical examples where variation from the self-employed plays a prominent role. For the most part, we
25
Our results are not sensitive to the age cutoff. Even looking at 45-55 year olds or 50-55 year olds, we see little change in our
estimates of the amount of underreporting.
24
restrict the scope to results that can be explored using our existing PSID and CE samples. But, for some
analyses, we must slightly change the sample criteria. Below, we only give a brief overview of the
procedures. Collectively, the results in this section underscore our central point that taking a household
survey at face value without thinking about the incentives the households have to report correctly could
undermine a wide range of empirical analyses.
6A.
Differences in Earnings and Wealth between the Self-Employed and Wage and Salary
Workers
The most straightforward implication of our findings pertains to papers that estimate earnings differentials
between wage and salary workers and the self-employed. In a recent example, Hamilton (2000) estimates
the returns to job tenure and experience in both self-employment and wage employment. Using data from
the Survey of Income and Program Participation (SIPP), he constructs lifecycle wage profiles separately
for wage and salary workers and for the self-employed under several alternative measures of selfemployed income. He finds the median self-employed individual’s wages, cumulating over 10 years, are
35 percent lower than a comparable wage and salary worker, and he assigns a prominent role to nonpecuniary returns to self-employment over alternative explanations.
Throughout his analysis, Hamilton did not adjust his estimates for the fact that the earnings data
of the self-employed could be substantially underreported. In this subsection, we do not attempt to
replicate the analysis of Hamilton using the SIPP data. The reason for this is that it is obvious that not
accounting for the underreporting of income by the self-employed will severely bias these results. As we
show above, the self-employed earnings are underreported by roughly 25 percent. As a result, it is likely
that a substantial amount of the earnings gap estimated by Hamilton is simply due to the measurement
error in household surveys with respect to self-employment income.26
6B.
The Empirical Importance of Precautionary Savings
26
We want to stress that we are not saying that non-pecuniary benefits are not important for the self-employed. Moskowitz and
Vissing-Jorgensen (2002) do account for the fact that the self-employed do underreport their income. Even with this adjustment,
they still conclude that non-pecuniary benefits of self-employment could be very large. Additionally, in the NBER working
paper version of the paper, we show how the underreporting of income by the self-employed can alter the estimates in Gentry and
Hubbard (2004) showing that the self-employed are much wealthier than wage and salary workers conditional on income. Not
surprisingly, if the self-employed underreport their income, their wealth conditional on income will be biased upwards.
25
The above example shows how accounting for the underreporting of income by the self-employed could
lead to substantial changes in the results that directly compare the economic behavior of the selfemployed to that of wage and salary workers.
In the next several examples, we show that the
underreporting of income by the self-employed can bias a variety of other types of analyses.
We start by focusing on the importance of precautionary savings. Carroll and Samwick (1998)
use data from the PSID during the 1980s to explore the relationship between wealth holdings and income
risk conditional on the level of income and other demographics. They measure income risk as the
variance of actual household income as well as decompose the variance of household income into its
permanent and transitory components. They instrument income risk using industry dummies. They
conclude that up to 50 percent of household wealth accumulation for households under the age of 50 is
due to precautionary motives.
Defining a sample that is analogous to Carroll and Samwick and using the same specification as
theirs, we find that 47.5 percent of household saving can be attributed to precautionary motives. This
estimate is nearly identical to the results reported in their work.27
However, if we increase the income of
the self-employed by 25 percent to adjust for the underreporting, the estimated importance of
precautionary savings is 10.3 percent lower (from 47.5 percent to 42.6 percent). What drives this decline?
Industries in which the self-employed are more prevalent are also the industries which are found to have
more income risk. When regressing wealth on income risk controlling for household income, it appears
that households in industries where there is more income risk are holding more wealth for a given amount
of income. However, some of this relationship is simply due to the fact that we are mismeasuring the
level of income for those households. Our results show that ignoring the income underreporting of the
self-employed can introduce upward biases in the importance of precautionary savings in explaining total
wealth accumulation of younger households.
27
The details of the specification and sample can be found in the online robustness appendix.
26
6C.
Adjustments to Life Cycle Earnings Profiles
Self-employment probabilities change over the lifecycle. For example, within our PSID and CE samples,
the fraction of individuals that are self-employed rises from 4.4 percent and 5.9 percent respectively when
households are 25 to 18.3 percent and 11.2 percent respectively when households are 45. They rise
further to 39.9 percent and 21.6 percent when households are 65. Given that the self-employment
propensities are changing over the lifecycle, the extent to which individual earnings is measured with
error is changing over the lifecycle.
How much of the decline in average earnings between the ages of 45 and 65 could be driven by
the underreporting of income by the self-employed? Within the PSID and CE samples, respectively,
average earnings fell by 24.4 percent and 21.5 percent between the ages of 45 and 65.28 Simply adjusting
for underreporting reduces these declines by about 3.9 and 2.8 percentage points. Taken together,
between roughly 10 and 15 percent of the decline in lifecycle earnings between the ages of 45 and 65 in
the PSID and the CE can be accounted for by the underreporting of income by the self-employed.
6D.
Spatial Differences in Average Earnings
Finally, we show that the underreporting of income by the self-employed could bias conclusions about the
spatial differences in earnings.
As shown by Glaeser (2007), self-employment propensities differ
markedly across U.S. cities. Given this, comparing earnings across space will yield biased results if one
does not account for differences in self-employment propensities across space.
Using the Public Use Microdata Sample (PUMS) from the 2000 Census, we illustrate this point.
We construct a sample of male heads between the ages of 25 and 64, who are currently working, usually
work more than 30 hours per week, and worked 40 weeks during the previous year. We further restrict
this sample to only include individuals who live in a defined MSA. Within this sample, we find a
tremendous amount of variation in self-employment rates across MSAs. Across the MSAs, the mean and
median self-employment rates were 12.5 percent and 11.9 percent, respectively. The standard deviation
28
A more formal analysis of this question would want to adjust for either individual or cohort fixed effects. This example is only
meant to be illustrative.
27
in self-employment rates across the MSAs was 2.9 percent.
To illustrate the potential for the underreporting of income by the self-employed to yield biased
results when computing earnings differentials across place, we look at a few pair wise comparisons. To
do this, we examine conditional earnings differentials. Specifically, we regress log earnings on one-year
age dummies and a series of education, usual hours worked per week, and weeks worked per year
dummies. We then compare these residuals, with and without adjusting for the underreporting of income
by the self-employed, across MSAs.
For example, not accounting for the underreporting of income and conditional on the
demographic and work hour adjustments, individuals in our sample from Nassau County, New York
earned 8.9 percent more than individuals in our sample from Detroit. 16.8 percent of the individuals from
Nassau County were self-employed while only 10.2 percent of the individuals from Detroit were selfemployed. Adjusting upward the income of all self-employed individuals, conditional on observables, by
25 percent increases the Nassau-Detroit earnings differential by nearly 15 percent (from 8.9 percent to
10.3 percent). The earnings differential between the Barnstable-Yarmouth, Massachusetts MSA (which
has the highest self-employment rate among MSAs at 25.3 percent) and the Kokomo, Indiana MSA
(which has the lowest self-employment rate among MSAs at 6.0 percent) gets reduced by 30 percent
when adjusting for income underreporting by the self-employed (from -14.7 percent to -10.4 percent).
6E.
Potential Other Applications
The above results show that not accounting for the underreporting of income of the self-employed can
lead to quantitatively important biases when comparing the income and savings behavior of the selfemployed with wage and salary workers, when computing the economic importance of precautionary
savings, when estimating lifecycle earnings profiles, and when computing earnings differentials across
space. There are many other situations where not adjusting for the underreporting of income by the selfemployed can potentially lead to quantitatively important biases.
For example, given that self-
employment propensities differ across races and genders, failing to account for the underreporting of
28
income could lead to biased estimates of racial and gender earnings gaps. Likewise, there is much
interest in comparing the borrowing and default behavior of the self-employed to wage and salary
workers. Not properly measuring income for the self-employed could lead to biased results in this setting
as well. We leave an assessment of the quantitative importance of these potential biases to future work.
7.
Conclusion
Essentially all empirical work using data from household surveys assumes that household income is not
systematically mismeasured. However, there is reason to believe that this assumption may not hold,
particularly for the self-employed. Research from tax audits finds that the self-employed substantially
underreport their income to tax authorities. What was less known to the research community is whether
and the extent to which the self-employed underreport their income within U.S. surveys. Our paper
contributes to the literature by filling this gap.
Using data from both the Consumer Expenditure Survey and the Panel Study of Income
Dynamics, we find that, on average, the self-employed underreport their income by about 25 percent
within household surveys relative to wage and salary workers. The results are remarkably consistent
across both surveys. As robustness analyses, we implement a sequence of alternative specifications that
relax various elements of our assumptions. Our results are essentially unchanged throughout.
The results in this paper also have larger implications. There is an active literature showing that
individuals respond to incentives when reporting information to administrative sources (like tax
authorities) or alter their behavior when participating in experiments. Yet, researchers often ignore the
potential of such problems spilling over into the household surveys that we use for much of our economic
analysis.
It is naive to assume that individuals who are willing to misconstrue their behavior to
administrative sources would otherwise automatically provide accurate responses when participating in
household surveys.
While it is likely true that the benefits of providing distorted information to
household surveys are small, it is also likely true that the costs of providing inaccurate information are
also small. To the extent that individuals would have to exert effort to provide an accurate response to
29
household surveys or feel compelled to maintain consistency in light of concerns about confidentiality,
economic theory suggests that they would continue to provide the erroneous information, even if there is
no direct benefit of doing so. We show these effects are present with respect to the income reporting of
the self-employed. Future work should try to identify and explore other situations in which individuals
may provide systematically biased information to household surveys.
Finally, our paper speaks to the literature that attempts to infer the measurement error in
household surveys by comparing the reports of individual level variables (e.g., income) in household
surveys to administrative data for the same individual. A key assumption in much of this literature is that
any potential errors in the administrative data are orthogonal to the errors in household surveys.
The
results of this paper show that such an assumption is violated in some cases. If the household is willing to
misconstrue their behavior to tax authorities, there is no reason to believe that they will not misconstrue
their behavior to household surveys as well.
30
References
Aguiar, Mark and Mark Bils (2010). “Has Consumption Inequality Mirrored Income Inequality,”
University of Rochester Working Paper.
Aguiar, Mark and Erik Hurst (2009). “Deconstructing Lifecycle Expenditures,” University of Chicago
Working Paper.
Aguiar, Mark and Erik Hurst (2007). “Lifecycle Prices and Production,” American Economic Review,
97(5), pp. 1533-59.
Aguiar, Mark and Erik Hurst (2005). “Consumption vs. Expenditure,” Journal of Political Economy,
113(5), pp. 919-48.
Andreoni, James (1992). “IRS as Loan Shark: Tax Compliance with Borrowing Constraints,” Journal of
Public Economics, 49, pp. 35-46.
Andreoni, James, Brian Erard, and Jonathan Feinstein (1998). “Tax Compliance,” Journal of Economic
Literature, 36(2), pp. 818-60.
Barsky, Robert, Miles Kimball, Thomas Juster, and Matthew Shapiro (1997). “Preference Parameters and
Behavioral Heterogeneity: An Experimental Approach in the Health and Retirement Study,” The
Quarterly Journal of Economics, 112(2), pp. 537-79.
Becker, Gary (1965). “A Theory of the Allocation of Time,” The Economic Journal, 75, pp. 493 – 517.
Bee, Adam, Bruce Meyer, and James Sullivan (2011). “Micro and Macro Validation of the Consumer
Expenditure Survey,” mimeo.
Blank, Rebecca, Kerwin Charles, and James Sallee (2009). “A Cautionary Tale about the Use of
Administrative Data: Evidence from Age of Marriage Laws,” American Economic Journal:
Applied Economics, 1(2), pp. 128-49.
Blundell, Richard, Martin Browning, and Costas Meghir (1994). “Consumer Demand and the Life-Cycle
Allocation of Household Expenditures,” Review of Economic Studies, 61, pp. 57-80.
Bound, John, Brown, Charles, Duncan, Greg, and Rodgers, Willard (1994). “Evidence on the Validity of
Cross Sectional and Longitudinal Labor Market Data,” Journal of Labor Economics, vol. 12(3),
pp. 345-68.
Bound, John, Charles Brown, and Nancy Mathiowetz (2001). “Measurement Error in Survey Data,” in
Handbook of Econometrics, Volume 5, eds. James Heckman and Edward Leamer, Elsevier
Science.
Carroll, Chris and Samwick, Andrew (1998). “How Important is Precautionary Saving?” Review of
Economics and Statistics, 80(3), pp. 410-9.
Clotfelter, Charles (1983). “Tax Evasion and Tax Rates: An Analysis of Individual Returns,” Review of
Economics and Statistics, 65(3), pp. 363-73.
Deaton, Angus (1997). The Analysis of Household Surveys:
Development Policy. World Bank Publication.
A Microeconometric Approach to
Deaton, Angus and John Muellbauer (1980). “An Almost Ideal Demand System,” American Economic
Review, 70(3), pp. 312-26.
Engström, Per and Bertil Holmlund (2009). “Tax Evasion and Self-employment in a High-Tax Country:
Evidence from Sweden,” Applied Economics, 41(19), pp. 2419-30.
Feinstein, Jonathan (1991). “An Econometric Analysis of Income Tax Evasion and Its Detection,” Rand
Journal of Economics, 22(1), pp. 14-35.
Feldman, Naomi and Joel Slemrod (2007). “Estimating Tax Noncompliance with Evidence from
Unaudited Tax Returns,” Economic Journal, 117 (518), pp. 327-52.
Fitzgerald, John, Peter Gottschalk, and Robert Moffitt (1998). “An Analysis of Sample Attrition in Panel
Data: The Michigan Panel Study of Income Dynamics,” Journal of Human Resources, 33(2), pp.
251-99.
Gentry, William and R. Glenn Hubbard (2004). “Entrepreneurship and Household Saving,” Advances in
Economic Analysis & Policy, 4(1), Article 8.
Ghez, Gilbert and Gary Becker (1975). The Allocation of Time and Goods over the Life Cycle, Columbia
University Press, New York.
Gibson, John, Bonggeun Kim and Chul Chung (2009). “Using Panel Data to Exactly Estimate Income
under Reporting of the Self-employed,” Labor Economics Working Papers, East Asian Bureau
of Economic Research.
Glaeser, Edward (2007). “Entrepreneurship and the City.”
Discussion Paper No. 2140.
Harvard Institute of Economic Research
Gottschalk, Peter and Robert Moffitt (1994). “The Growth of Earnings Instability in the U.S. Labor
Market”, Brookings Papers on Economic Activity, 1994(2), pp. 217-72.
Gourinchas, Pierre-Olivier and Jonathan Parker (2002), “Consumption over the Life Cycle,”
Econometrica, 70(1), pp. 47-89.
Haider, Steven and Gary Solon (2006). “Life-Cycle Variation in the Association between Current and
Lifetime Earnings.” American Economic Review, 96(4), pp. 1308-20.
Hamilton, Barton H. (2000). “Does Entrepreneurship Pay? An Empirical Analysis of the Returns to SelfEmployment.” Journal of Political Economy, 108 (3), pp. 604-31.
Hurst, Erik and Annamaria Lusardi (2004). “Liquidity Constraints, Household Wealth, and
Entrepreneurship,” Journal of Political Economy, 112(2), pp. 319-47.
Hurst, Erik and Ben Pugsley (2011). “What do Small Businesses Do?,” Brookings Papers on Economic
Activity, 43(2), pp. 73-142.
32
Johansson, Edvard (2005). “An Estimate of Self Employment Income Underreporting in Finland,”
Nordic Journal of Political Economy, 31(1), pp. 99 - 109.
LaLumia, Sara (2009). “The Earned Income Tax Credit and Reported Self-Employment Income,”
National Tax Journal, 62(2), pp. 191-217.
Levitt, Steve and John List (2009). “Was there Really a Hawthorne Effect at the Hawthorne Plant? An
Analysis of the Original Illumination Experiments,” NBER Working Paper 15016.
Meyer, Bruce, Wallace K.C. Mok, and James Sullivan (2009). “The Under-Reporting of Transfers in
Household Surveys: Its Nature and Consequences,” Harris School Working Paper #09.03,
Moskowitz, Tobias J. and Annette Vissing-Jorgensen (2002). “The Returns to Entrepreneurial
Investment: A Private Equity Premium Puzzle?” American Economic Review 92(4), pp. 745-78.
Pissarides, Christopher and Guglielmo Weber (1989). “An Expenditure Based Estimate of Britain's Black
Economy,” Journal of Public Economics, 39(1), pp. 17-32.
Sabelhaus, John, David Johnson, Stephen Ash, David Swanson, Thesia Garner, John Greenlees, and
Steve Henderson (2011). “Is the Consumer Expenditure Survey Representative by Income,”
mimeo.
Saez, Emmanuel (2010). “Do Taxpayers Bunch at Kink Points?” American Economic Journal: Economic
Policy, 2(3),pp. 180-212.
Schuetze, Herb (2002). “Profiles of Tax Noncompliance Among the Self Employed in Canada:19691992,” Canadian Public Policy, 28(2), pp. 219-37.
Slemrod, Joel (1985). “An Empirical Test for Tax Evasion,” The Review of Economics and Statistics,
67(2), pp. 232-38.
Slemrod, Joel (2007). “Cheating Ourselves: The Economics of Tax Evasion,” Journal of Economic
Perspectives, 21(1), pp. 25-48.
Solon, Gary (1992). “Intergenerational Income Mobility in the United States,” American Economic
Review, 82(3) pp.393-408.
33
Table 1: Descriptive Statistics for Self-Employed and Wage and Salary Worker Samples, CE and PSID
Data Set
I. CE
II. PSID: 1-Year
III. PSID: 3-Year Average
Self
Wage
Self
Wage
Self
Wage
Income Measure/Sample
Employed
Workers
Employed
Workers
Employed
Workers
Labor Plus Business Income
Total Family Income
Total Family Disposable Income
Food Expenditure
Nondurable Expenditure
Total Expenditure
Average Age of Head
% Education = 12
% Education = Some College
% Education = College
% Education = Graduate School
% Black
% Married
Average Family Size
Average Head Work Hours
% Own Home
% Non-Positive Total Family Income
% Non-Positive Labor+Bus. Income
Sample Size
10.81
10.87
10.83
8.85
9.98
10.74
Panel C:
41.4
26.9
23.3
20.4
17.9
3.3
83.2
3.4
2,562
78.2
1.6
2.6
2,508
Panel A: Log Income Measures
10.83
11.01
10.90
10.86
11.20
11.00
10.75
11.04
10.86
Panel B: Log Expenditure Measures
8.67
8.97
8.73
9.78
10.51
Other Labor Supply and Demographic Measures
39.0
40.8
38.0
29.2
20.5
24.5
24.4
30.8
30.7
19.0
21.6
19.2
12.5
15.1
10.0
8.0
2.9
9.9
80.6
86.3
77.9
3.26
3.3
3.1
2,317
2,556
2,295
69.5
81.2
70.0
0.2
1.1
0.1
0.4
2.0
0.1
24,711
4,446
31,988
11.14
11.33
11.15
10.97
11.06
10.91
9.05
-
8.79
-
41.5
22.1
31.6
21.0
15.1
1.7
90.2
3.4
2,588
87.2
0.9
0.3
38.3
25.0
30.5
19.5
10.4
9.4
79.4
3.1
2,284
73.6
0.0
0.0
1,901
16,332
Notes: This table compares income, expenditure, and demographic variables between the self-employed and wage and salary workers within the CE and PSID .
Each sample is restricted to include all households with male heads between the ages of 25 and 55 (inclusive) who are working full time and have positive levels
of income and expenditure. Additional sample restrictions for each survey are discussed in the text. For the fraction of households with non-positive income, we
compute this fraction assuming all sample restrictions hold aside from the restriction that the household have positive levels of income and expenditure.
34
Table 2: Log Expenditure Differences Between Self-Employed and Wage Workers,
Conditional on Log Income and Demographics
I. Income = Pre-Tax
II. Income = After-Tax
Total Family Income
Total Family Income
Sample/Expenditure
Measure
OLS
IV
OLS
IV
A. Estimates of 
CEX, Nondurables
CEX, Total Expenditure
CEX, Food
PSID 1-Yr, Food
PSID 3-Yr Average, Food
0.17
(0.01)
0.20
(0.01)
0.15
(0.01)
0.12
(0.01)
0.11
(0.01)
0.18
(0.01)
0.21
(0.01)
0.15
(0.01)
0.12
(0.01)
-
0.14
(0.01)
0.17
(0.01)
0.13
(0.01)
0.12
(0.01)
0.12
(0.01)
0.13
(0.01)
0.16
(0.01)
0.12
(0.01)
0.12
(0.01)
-
B. Estimates of 
CEX, Nondurables
CEX, Total Expenditure
CEX, Food
PSID 1-Yr, Food
PSID 3-Yr Average, Food
0.37
(0.01)
0.42
(0.01)
0.26
(0.01)
0.29
(0.01)
0.32
(0.01)
0.60
(0.01)
0.76
(0.01)
0.40
(0.01)
0.31
(0.01)
-
0.38
(0.01)
0.43
(0.01)
0.27
(0.01)
0.31
(0.01)
0.36
(0.01)
0.63
(0.01)
0.80
(0.01)
0.42
(0.01)
0.36
(0.01)
-
Notes: This table shows the OLS and IV estimates of the log-linear Engel curve from equation (6). The income
elasticity is denoted as β, and the coefficient on the self-employment dummy is denoted γ. We estimate the
regression using different measures of expenditures and for different samples (indicated down the rows) and for
different measures of household income (indicated across the columns). Robust standard errors are shown in
parentheses. The samples are the same as the ones discussed in Table 1. The demographic controls are a series of
five-year age dummies, a dummy if the household head is black, a dummy if the household head is married, and a
series of family size dummies. We use education dummies as instruments for permanent income in the IV
specifications in the CE and one-year PSID samples.
For the PSID three-year average sample, we do not
instrument. Instead our measure of permanent income is average household income over three years.
35
Table 3: Predicted Fraction of Income Underreported by Self-Employed,
Baseline Specification
Estimate of (1-κ), in Percent
I. Income = Pre-Tax
II. Income = After-Tax
Expenditure Measure
Total Family Income
Total Family Income
CEX, Nondurable Expenditure
25.5
(1.5)
18.8
(1.6)
CEX, Total Expenditure
24.5
(1.5)
17.8
(1.5)
CEX, Food Expenditure
31.7
(2.0)
25.2
(2.0)
PSID 1-Yr, Food
31.5
(2.1)
28.8
(2.0)
PSID 3-Yr Average, Food
29.4
(2.1)
27.8
(2.0)
Notes: This table shows the estimates of the amount of underreporting of income by the self-employed, denoted by
(1 - κ), using the estimates of β and γ from Table 2. We show the different estimates of (1-κ) using different
measures of expenditure, using data from different surveys, and using different measures of income. For the CE
and one-year PSID sample, we use the IV estimates from Table 2. For the PSID three-year average sample, we use
the OLS estimates. Asymptotic standard errors are computed for the transformation 1-κ using a robust estimate of
the variance covariance matrix.
36
Table 4: Predicted Fraction of Income Underreported by Self-Employed, Alternate Specifications
Specification/Sample
Expenditure Measure
1
2
3
4
CEX, Nondurables
27.5
(1.6)
18.4
(1.6)
16.9
(2.6)
19.8
(1.9)
CEX, Total Expenditure
26.5
(1.5)
17.7
(1.5)
14.3
(2.4)
20.0
(1.9)
CEX, Food
33.5
(2.0)
24.9
(2.0)
26.9
(3.9)
23.9
(2.3)
PSID 1-Yr, Food
37.7
(2.0)
28.0
(2.0)
29.5
(2.7)
26.8
(2.7)
PSID 3-Yr Average, Food
36.0
(2.0)
26.8
(2.1)
27.4
(2.6)
27.2
(2.9)
Labor Earnings Plus
Business Income
After-Tax
Total Family Income
After-Tax
Total Family Income
After-Tax
Total Family Income
Base
Include Log Work
Hours as a Control
Income Measure
Specification
Restrict To Heads With Restrict To Heads With
Age 25-40
Age 40-55
Notes: This table shows the amount of underreporting of income by the self-employed (1-κ) for alternate specifications of equation (6) from the text. In column
(1), we re-estimate the specification in Table 3 with labor earnings plus business income as our measure of income. In column (2), we add log work hours to our
vector of control variables. In columns (3) and (4), we split the samples by age (young vs. old). For all specifications, the sample is otherwise the same as the
one described in the note to Table 1. Asymptotic standard errors are in parentheses.
37
8
8.5
9
9.5
10
Figure 1: Nonparametric Estimate of Food Engel Curve Using Three-Year PSID Panels
10
11.5
11
10.5
Log 3 Year Average Family Income
Wage and Salary Employees
12
Self Employed
Notes: Figure shows the estimates of (3) from the text estimated nonparametrically for both wage and salary
workers and for the self-employed. When creating the figure, we divide the support of log income residuals into
equally spaced bins and fit a cubic spline through the median log average income and log average expenditure for
each bin. Additionally, when making the figure, we use our three-year average PSID sample. Our measure of
income is the three-year average of after tax family income. Our measure of expenditure is the three-year average
of food expenditures. Our vector of additional controls is described in the text. See text for additional sample
restrictions.
38