This work is distributed as a Discussion Paper by the
STANFORD INSTITUTE FOR ECONOMIC POLICY RESEARCH
SIEPR Discussion Paper No. 16-017
Optimal Tax Mix with Income Tax Non-compliance
By
Jason Huang and Juan Rios
Stanford Institute for Economic Policy Research
Stanford University
Stanford, CA 94305
(650) 725-1874
The Stanford Institute for Economic Policy Research at Stanford University supports
research bearing on economic and public policy issues. The SIEPR Discussion Paper
Series reports on research and policy analysis conducted by researchers affiliated with
the Institute. Working papers in this series reflect the views of the authors and not
necessarily those of the Stanford Institute for Economic Policy Research or Stanford
University.
Optimal Tax Mix with Income Tax Non-compliance∗
Jason Huang†, Stanford University
Juan Rios‡, Stanford University
March 2016
Abstract
Although developing countries face high levels of income inequality, they rely
more on consumption taxes, which tend to be linear and are less effective for redistribution than a non-linear income tax. One explanation for this pattern is that consumption taxes are generally more enforceable in these economies. This paper studies
the optimal combination of a linear consumption tax and a non-linear income tax for
redistributive purposes. In our model, households might not comply with the income
tax code by reporting income levels that differ from their true income. However, the
consumption tax is fully enforceable. We derive a formula for the optimal income tax
schedule as a function of the consumption tax rate, the recoverable elasticities, and
the moments of the taxable income distribution. Our equation differs from those of
Mirrlees (1971) and Saez (2001) because households respond to income tax not only
through labor supply but also through mis-reporting their incomes. We then characterize the optimal mix between a linear consumption tax rate and a non-linear income
tax schedule. Finally, we find that the optimal consumption tax rate is non-decreasing
in the redistributive motives of the social planner.
JEL: D31, D63, D82, H21, H23, H26, I30, O21
∗ We are grateful to the editor, the two anonymous referees, Douglas Bernheim, Tim Bresnahan, Liran Einav, Caroline Hoxby, Matthew Jackson, Petra Persson, Florian Scheuer and Alex Wolitzky for their invaluable comments and advice.
† Stanford Economics Department; 579 Serra Mall, Stanford, CA 94305-6072. E-mail:jhuang99 at stanford.edu
‡ Stanford Economics Department; 579 Serra Mall, Stanford, CA 94305-6072. E-mail:juanfrr at stanford.edu
Keywords: Labor Supply, Tax Non-compliance, Optimal Tax, Income Tax, Redistribution, Tax
Evasion, Development
1 Introduction
In developing countries, higher proportions of tax revenues come from consumption
taxes, which are linear, rather than income taxes, which are typically non-linear. For instance, while in Mexico 73.5% of the tax revenue comes from consumption taxes, this
proportion is on average 32.6% among OECD countries (OECD, 2002) and it is below
7% in the United States (Office of Management and Budget, 2008). One possible explanation for this pattern is that the typically non-linear income tax is more vulnerable to
non-compliance than most linear taxes. There are at least three reasons for this. The first
is the simplicity of linear taxes. A non-linear income tax requires
the government to observe each individual's income, while linear taxes are impersonal
and therefore do not require the government to know how much of each good
each individual consumed. This milder information requirement makes it easier
for the government to enforce linear taxes. The second reason is the self-enforcing mechanism of the consumption value added taxes (VAT) adopted in many economies across the
world. Many countries have adopted VAT schemes because of their self-enforcing structure (Keen and Lockwood 2010 (10)). In theory, VAT and sales tax work the same way. But
the former is collected at every transaction, with tax rebates for input purchases,
while the latter is collected only at the final stage. The VAT's tax rebates create
self-enforcing incentives that help governments prevent tax evasion. This self-enforcement mechanism has been empirically documented in different settings (see for
instance Naritomi 2013 (18) and Pomeranz 2013 (21)). Finally, consumption taxes could
be easier to enforce since there are fewer points of collection (firms) than in the case of
income tax (workers).
Historically, we see a relationship between the maturity of a government’s tax collection
infrastructure and its reliance on non-linear income tax. The share of federal revenue in
the US coming from excise taxes decreased from 12.6% in 1960 down to 2.7% in 2008, while
the income tax share increased from 44% to 45.4% (Office of Management and Budget,
historical tables, government receipts by source). The U.S. did not enact an income tax
until 1861, while excise taxes have been in place since right after the ratification of the
Constitution in 1789 (Historical Statistics of the United States Series). More generally,
governments in early stages of development have relied on import tariffs, a form of linear
consumption tax, because the authorities can focus all their collection efforts at the ports.
Despite these enforcement advantages of consumption taxes, redistributing resources across
different individuals using these taxes is harder because of their linearity.
In this paper, we leverage the strength of one tax to overcome the weakness of the other.
We study the optimal use of a non-linear income tax, which is susceptible to non-compliance,
and a linear consumption tax, which we assume is perfectly enforced. There are three
main results.
First, we derive the optimal non-linear income tax schedule as a function of the linear
consumption tax rate, taxable and misreported income elasticities, and the moments of
the taxable income distribution. Our schedule contains a corrective term that captures
how the income tax adjusts to correct the limitation caused by the linear consumption tax.
Intuitively, the marginal income tax rates are lower under the presence of the consumption
tax, since less revenue needs to be collected through the income tax. Furthermore, as the
non-compliance behavior becomes more responsive to the non-linear tax, the capacity of
this tax to undo the linear tax distortions diminishes.
Second, we describe the optimal linear tax rate jointly with the optimal non-linear tax
schedule. A perturbation in the linear tax rate affects welfare by changing the reported
and misreported income that taxpayers choose in equilibrium. Nonetheless, the effect
on the households' reporting behavior is second order when an optimal non-linear tax
schedule is in place, so the linear tax rate affects welfare through two channels. First, a decrease (increase) in the consumption tax rate increases (decreases) social welfare by expanding (reducing) the set of implementable non-linear tax schedules, since the social planner is able to implement higher
(lower) marginal tax rates for the same level of mis-reporting. Second, it ambiguously
affects the marginal cost of non-compliance, since the direction of this effect depends on
the distribution of mis-reporting behavior across the households. More specifically, since
we assume the cost of mis-reporting increases with its absolute value, the increase (decrease) in the mis-reporting level would increase (decrease) this direct welfare cost for
households over-reporting their income, but would decrease (increase) for households
under-reporting it. We characterize the optimum linear tax rate by setting the net effect of
these two channels to zero.
Finally, the optimal linear consumption tax rate is non-decreasing in the redistributive
motives of the social planner. This result may appear surprising at first glance, since
linear taxes are perceived as regressive. However, our result involves the joint optimal
tax structure of the linear tax and the non-linear income tax. A more redistributive social planner tends to implement higher marginal income tax rates, causing households to
evade more taxes. Because the consumption tax cannot be evaded, a higher consumption
tax rate discourages tax evasion by lowering the marginal benefit of an additional unit of
evaded income. Hence, to combat the increase in evasion, the social planner sets a higher
consumption tax rate.
Our work contributes to the literature on optimal taxation started by Mirrlees (1971, (17)).
Part of this literature has addressed the possibility of evasion. Sandmo (1981, (23)) constructs a model with two groups of taxpayers (evaders and non-evaders) but restricts the
income tax to be linear and sets the probability of detection to be endogenous. Cremer and
Gahvari (1995, (5)) incorporate tax evasion into the general optimal income tax problem,
where the social planner chooses not only the optimal tax schedule but also the optimal
audit structure, restricting the analysis to two types. Alternatively, Schroyen (1997, (25))
models a two-class economy with an official and an unofficial labour market. The official
economy is taxed non-linearly, while unofficial income is only observable after a costly
audit upon which it is taxed at an exogenous penalty rate. Kopczuk (2001, (12)) considers
an optimal linear tax problem in which individuals differ not only in their labor productivity but also in their cost of avoidance. He finds that allowing for tax avoidance may
improve social welfare since allowing people to avoid paying tax can serve as a redistributive mechanism. Sandmo (2004, (24)) reviews this literature. More recently, Piketty
et al (2014, (20)) derive the optimal income tax formula as a function of the labor supply,
tax avoidance and compensation bargaining elasticities.
Another part of this literature has determined the optimal commodity taxes jointly with the
optimal income tax. Atkinson and Stiglitz (1976, (1)) showed that if a general income tax
function may be chosen by the government, no commodity tax should be employed on
commodities where the utility function is separable between labor and all commodities.
Boadway and Jacob (2014, (9)) consider this problem, restricting the commodity taxes to
be linear and writing the optimal tax formulae as a function of recoverable elasticities.
However, these papers have not examined the trade-off between a linear tax and a non-linear evadable income tax. It is important to note that we restrict ourselves to a linear
and uniform commodity tax over all consumption goods. Technically in our framework,
there is no difference between consumption and income tax except for the functional form
and the exposure to non-compliance.
The closest paper to the present work is Boadway, Marchand and Pestieau (1994, (3)).
They study the use of a non-linear income and a linear consumption tax, when households
can only evade income tax in a two-type economy. However, our model allows for a richer
heterogeneity in the population. This richness allows us to derive a formula for the non-linear income tax schedule as a function of recoverable elasticities and moments of the income
distribution. Furthermore, we provide a precise characterization of this linear tax, rather
than simply arguing that a positive consumption tax is optimal in the presence of tax non-compliance.
This precision allows us to perform comparative statics of the optimal linear
tax rate with respect to the social planner’s redistributive motives.
Finally, the present work also relates to the normative literature of taxation in developing countries. These papers take into account the tax evasion behavior, common in these
countries, to recommend tax policies. Best et al (2013, (2)) consider the corporate tax evasion of firms in Pakistan. They note that the presence of evasion justifies taxes on turnover
instead of on profits, sacrificing production efficiency but increasing tax revenue. Gordon
and Li (2009, (6)) show that if a government needs to rely on the information available
from bank records in order to enforce corporate taxes, the optimal tax structure includes
capital income taxes, tariffs and inflation. Kleven and Kopczuk (2011, (11)) solve for the
optimal anti-poverty program in an income maintenance framework. Since they are interested in the trade-off between mis-targeting and take-up of the program, the government's objective is to maximize the number of deserving poor receiving the benefits, given
a budget. There are two papers directly related to ours in this literature. Gorodnichenko et
al (2009 (7)) analyze the welfare costs of income tax reform in Russia, taking into account
the different margins of response (real vs mis-reporting). Kopczuk (2012, (14)) conducts a
similar exercise using a flat tax reform in Poland. In this paper, we discuss the implication
of income tax evasion beyond their partial equilibrium analysis, in an optimal income tax
context.
The paper is organized as follows. In section 2, we set up the model. In section 3, we
characterize the optimal non-linear tax schedule for a fixed linear tax rate. Then in section
4, we characterize the joint optimal non-linear tax schedule and the linear tax rate. In
section 5, we derive comparative statics results. And in section 6, we conclude.
2 Model
We consider a unit mass of heterogeneous individuals who differ in their level of labor
productivity, θ, i.e. the amount of income generated with a unit of labor. Let F(θ) denote a
differentiable cumulative distribution function with probability density function
f(θ) and bounded support θ ∈ [θ̲, θ̄] with θ̲ > 0. We assume that the individuals have
the following quasilinear preference for consumption, reported income and misreported
income:
$$U(c, \tilde y, \bar y; \theta) = c - \psi\!\left(\frac{\bar y + \tilde y}{\theta}\right) - \phi(\tilde y),$$
where c is consumption, ȳ is reported income, and ỹ is misreported income. Total income is the sum of the reported and misreported components, ỹ + ȳ, and labor
supply is total income divided by the household's productivity, i.e. (ỹ + ȳ)/θ. The continuously differentiable function ψ(·) captures the labor supply cost. We assume ψ′(·) > 0
and ψ″(·) > 0. The continuously differentiable function φ(·) captures the cost of non-compliance. We can interpret φ(·) as the effort that a household must exert to misreport its income or the expected disutility from the penalty it suffers when caught. We assume that
φ(ỹ) ≥ 0 and φ″(ỹ) > 0 for all ỹ, φ′(ỹ) > 0 for ỹ > 0 and φ′(ỹ) < 0 for ỹ < 0.
In our model, the domain of φ(·) is the set of the real numbers because the household
can understate or overstate its income, i.e. ỹ can be either positive or negative. When
ỹ > 0, total income y = ȳ + ỹ exceeds reported income ȳ, in which case the household hides income. When ỹ < 0, then ȳ > y, in
which case the household claims to have produced more than it actually did. A household
may want to inflate its income if it faces a negative marginal tax rate, as in the case of the
Earned Income Tax Credit. We need to allow ỹ to be negative to ensure that our solution
does not trivially simplify to the Mirrlees schedule. If hidden income were restricted to be
non-negative, the social planner could set the consumption tax high enough to prevent
any mis-reporting and set the income tax to obtain the Mirrleesian allocation. We
formalize this intuition in Proposition 3 in the Appendix.
Also note that θ enters only through the labor supply and does not impact the disutility of
tax non-compliance. A justification for this assumption is that when a household is caught
for misreporting its income, its penalty depends on the total amount of misreported income and not on the skill level of the household’s breadwinner.
The households only pay income taxes on their reported income ȳ so the amount collected
by the government is T (ȳ). Since our model is static, households consume all their after
tax income ȳ + ỹ − T (ȳ), paying a linear tax over this amount at rate t. Although there
is no distinction between consumption and income in our model, one could interpret the
linear tax as a consumption tax considering the usual pattern of linear consumption tax
versus a non-linear income tax. However, we denote t as the linear tax hereafter. The
households choose both ȳ and ỹ to solve the following problem:
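$$\max_{\bar y \ge 0,\ \tilde y \ge -\bar y}\ (1-t)\left[\tilde y + \bar y - T(\bar y)\right] - \psi\!\left(\frac{\tilde y + \bar y}{\theta}\right) - \phi(\tilde y). \tag{1}$$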
The first constraint, ȳ ≥ 0, requires the household to report non-negative income. The
second constraint, ỹ ≥ −ȳ, ensures that the household cannot produce negative total
income. The social planner’s problem is the following:
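$$\max_{T(\bar y),\,t} \int_{\theta\in\Theta} \left\{ (1-t)\left[\tilde y_{t,T}(\theta) + \bar y_{t,T}(\theta) - T(\bar y_{t,T}(\theta))\right] - \psi\!\left(\frac{\tilde y_{t,T}(\theta) + \bar y_{t,T}(\theta)}{\theta}\right) - \phi(\tilde y_{t,T}(\theta)) \right\} d\tilde F(\theta)$$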
with dF̃(θ) as the Pareto weight the social planner puts on type θ, and ȳt,T(θ) and ỹt,T(θ)
as the optimal amounts of income to report and mis-report, respectively, for a given income
tax schedule T(·) and a linear tax rate t. The chosen tax system must satisfy the following
resource constraint
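$$\int_{\theta\in\Theta} \Big\{ \underbrace{t\left[\tilde y_{t,T}(\theta) + \bar y_{t,T}(\theta) - T(\bar y_{t,T}(\theta))\right]}_{\text{linear tax revenue}} + \underbrace{T(\bar y_{t,T}(\theta))}_{\text{non-linear tax revenue}} \Big\}\, dF(\theta) = 0,$$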
which ensures that the social planner balances its budget.
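To make the household problem (1) concrete, the following is a minimal sketch that solves its interior first-order conditions in closed form. The functional forms ψ(x) = x^γ/γ and φ(ỹ) = |ỹ|^δ/δ, the linear income tax T(ȳ) = τȳ, and the name `household_choice` are all illustrative assumptions, not part of the model above. The two first-order conditions are (1 − t)(1 − τ) = ψ′(y/θ)/θ for reported income and, after combining them, φ′(ỹ) = (1 − t)τ for misreported income.

```python
# Hypothetical functional forms chosen only for illustration:
#   psi(x) = x**gamma / gamma        (labor supply cost, gamma > 1)
#   phi(m) = abs(m)**delta / delta   (misreporting cost, delta > 1)
#   T(ybar) = tau * ybar             (linear income tax with 0 < tau < 1)

def household_choice(theta, t, tau, gamma=2.0, delta=2.0):
    """Return (reported, misreported, total) income from the interior
    first-order conditions of problem (1) under the assumptions above."""
    net = 1.0 - t
    # FOC for reported income: (1 - t)(1 - tau) = psi'(y / theta) / theta
    total = theta * (theta * net * (1.0 - tau)) ** (1.0 / (gamma - 1.0))
    # Combining both FOCs: phi'(misreported) = (1 - t) * tau
    misreported = (net * tau) ** (1.0 / (delta - 1.0))
    reported = total - misreported
    if reported < 0:
        raise ValueError("corner solution: the constraint ybar >= 0 binds")
    return reported, misreported, total


if __name__ == "__main__":
    for t in (0.0, 0.2):
        print(t, household_choice(theta=2.0, t=t, tau=0.3))
```

In this sketch a higher linear tax rate t lowers misreported income for a fixed τ, because the marginal benefit of an evaded unit of income is (1 − t)τ.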
To jointly characterize the optimal non-linear tax schedule and the linear tax rate, we proceed in
two steps, following the approach developed in Rothschild and Scheuer (2013) (22). First,
we solve the inner problem: we characterize the optimal non-linear tax for a given linear
tax rate. This optimal schedule gives us a social welfare W(t) that depends on the linear
tax t. In the outer problem, we maximize the social welfare with respect to the linear tax
rate.
3 Inner Problem: Optimal Income Tax Schedule for a Given Linear Tax
In this section, we take the linear tax as given, and solve the Social Planner’s problem
for the non-linear tax. Rather than solving for the optimal T (·), we use the revelation
principle from mechanism design. We consider the isomorphic problem in which the
social planner offers a menu [c̄t(θ̂), ȳt(θ̂)] ∀θ̂ ∈ [θ̲, θ̄] that can depend on the linear tax t,
with ȳt (θ̂ ) as the income that a household that reports type θ̂ hands to the social planner
and c̄t (θ̂ ) as the transfer in terms of real consumption given back to that household. Note
that, because of tax non-compliance, reported type θ̂’s actual consumption is c̄t (θ̂ ) + (1 −
t)ỹ rather than c̄t (θ̂ ). This feature of our model differs from the traditional application of
the mechanism design approach in optimal non-linear taxation.
An individual that reports a type θ̂ receives a [c̄t (θ̂ ), ȳt (θ̂ )] bundle and decides the optimal
amount of income to mis-report by solving the following problem:
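$$\max_{\tilde y \ge -\bar y_t(\hat\theta)} \left\{ \bar c_t(\hat\theta) + (1-t)\tilde y - \psi\!\left(\frac{\bar y_t(\hat\theta) + \tilde y}{\theta}\right) - \phi(\tilde y) \right\} \tag{2}$$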
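Let:
$$\tilde y_t(\theta, \bar y) \equiv \arg\max_{\tilde y \ge -\bar y} \left\{ (1-t)\tilde y - \psi\!\left(\frac{\bar y + \tilde y}{\theta}\right) - \phi(\tilde y) \right\} \tag{3}$$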
We add a subscript t to ỹt (·) because a household’s misreporting decision depends on the
linear tax rate. Because we assume that the households have quasilinear preferences, this
amount of tax non-compliance does not depend on c̄. Also, because the objective function
of (2) is strictly concave in ỹ and continuously differentiable in ỹ, ȳ and t, the optimum is
unique, continuous and differentiable in t and ȳ.
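Because the objective in (3) is strictly concave in ỹ, its first-order condition is strictly decreasing in ỹ and can be solved by bisection. The sketch below does this under assumed iso-elastic forms ψ(x) = x^γ/γ and φ(ỹ) = |ỹ|^δ/δ; these forms and the function name `misreport` are illustrative choices, not the paper's specification.

```python
def misreport(theta, ybar, t, gamma=2.0, delta=2.0, tol=1e-12):
    """Solve problem (3) for misreported income by bisection on its FOC.

    Assumed (hypothetical) forms: psi(x) = x**gamma / gamma and
    phi(m) = abs(m)**delta / delta, with gamma > 1 and delta > 1.
    """
    def psi_prime(x):
        return x ** (gamma - 1.0)

    def phi_prime(m):
        return (1.0 if m >= 0 else -1.0) * abs(m) ** (delta - 1.0)

    def foc(m):
        # Derivative of (1 - t) * m - psi((ybar + m) / theta) - phi(m) in m
        return (1.0 - t) - psi_prime((ybar + m) / theta) / theta - phi_prime(m)

    lo, hi = -ybar, 1.0
    if foc(lo) <= 0:           # objective already decreasing at the lower bound
        return lo
    while foc(hi) > 0:         # expand the bracket until the FOC changes sign
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if foc(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Misreported income falls as the linear tax rate rises.
    print(misreport(2.0, 1.0, 0.2), misreport(2.0, 1.0, 0.4))
```

With γ = δ = 2 the first-order condition is linear, so the result can be checked against ỹ = ((1 − t) − ȳ/θ²)/(1 + 1/θ²). A large enough ȳ makes the solution negative, i.e. the household over-reports.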
Now we introduce the indirect utility function, ut (·), defined as
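$$u_t(\bar c, \bar y; \theta) = \bar c + (1-t)\tilde y_t(\theta, \bar y) - \psi\!\left(\frac{\bar y + \tilde y_t(\theta, \bar y)}{\theta}\right) - \phi(\tilde y_t(\theta, \bar y)), \tag{4}$$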
with ỹt (·) as defined above, c̄ as the official consumption allocation chosen by the social
planner and ȳ as the income collected by the social planner. This modified utility function
reflects the household’s preference by accounting for the act of tax non-compliance.
The utility of type θ when he reports θ̂ is ut (c̄(θ̂ ), ȳ(θ̂ ); θ ). The social planner wants to
ensure that
$$\theta \in \arg\max_{\hat\theta}\ u_t(\bar c_t(\hat\theta), \bar y_t(\hat\theta); \theta),$$
i.e. each type chooses to report its true type. Let vt (·) be the value function of type θ,
which we define as
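$$v_t(\theta) = u_t(\bar c_t(\theta), \bar y_t(\theta); \theta) = \bar c_t(\theta) + (1-t)\tilde y_t(\theta, \bar y_t(\theta)) - \psi\!\left(\frac{\tilde y_t(\theta, \bar y_t(\theta)) + \bar y_t(\theta)}{\theta}\right) - \phi(\tilde y_t(\theta, \bar y_t(\theta))). \tag{5}$$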
Using the envelope theorem with respect to c̄, ȳ and ỹt the local incentive constraint is
expressed as the following:
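$$v_t'(\theta) = \frac{\bar y_t(\theta) + \tilde y_t(\theta, \bar y_t(\theta))}{\theta^{2}}\, \psi'\!\left(\frac{\bar y_t(\theta) + \tilde y_t(\theta, \bar y_t(\theta))}{\theta}\right). \tag{6}$$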
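The envelope step behind (6) can be sketched as follows. This is a reconstruction from definitions (2) to (5) rather than the appendix proof, and it assumes that vt is differentiable at θ.

```latex
% Sketch, assuming v_t is differentiable at \theta.
% Type \theta solves two nested maximizations: over the report \hat\theta
% and over misreported income \tilde y.
v_t(\theta) = \max_{\hat\theta}\ \max_{\tilde y \ge -\bar y_t(\hat\theta)}
  \left\{ \bar c_t(\hat\theta) + (1-t)\tilde y
  - \psi\!\left(\frac{\bar y_t(\hat\theta) + \tilde y}{\theta}\right)
  - \phi(\tilde y) \right\}

% By the envelope theorem only the direct dependence on \theta matters,
% and \theta enters only through the argument of \psi:
v_t'(\theta) = -\frac{\partial}{\partial\theta}\,\psi\!\left(\frac{y}{\theta}\right)
  = \frac{y}{\theta^{2}}\,\psi'\!\left(\frac{y}{\theta}\right),
\qquad y = \bar y_t(\theta) + \tilde y_t(\theta, \bar y_t(\theta)).
```

The constraint set for ỹ does not depend on θ, which is why no additional term appears.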
Since (6) gives the marginal information rent that a type θ household extracts for not mimicking a household that is less productive by an arbitrarily small amount, this formula expresses
how the linear tax affects non-compliance and total production, which in turn affects
the incentive constraints. When t increases, the marginal benefit of
income decreases, and total income ȳt(θ) + ỹt(θ, ȳt(θ)) under any incentive compatible tax schedule also decreases. This decreases v′t(θ) and relaxes
the local incentive constraint. Figure 1 illustrates this by displaying vt(θ) for
t_L < t_M < t_H. As the consumption tax increases, the relationship between the value
function and the type becomes flatter, so the social planner needs to pay less informational rent to higher types.
[Figure 1: Value Function of Different Types. Vertical axis: vt(θ); horizontal axis: θ; curves for t_L, t_M and t_H.]
The following lemma shows that, with monotonicity of ȳ(θ ), these intuitions generalize
to global incentive compatibility constraints.
Lemma 1. Given the preference in (4), an allocation is incentive compatible if and only if it satisfies
equation (6) and ȳ(θ ) is non-decreasing.
Proof. See Appendix.
The intuition for the lemma is as follows. Because the cost of non-compliance through
φ(·) is the same for all types but the cost of supplying such misreported income is relatively lower for higher types, the marginal disutility of ȳ, even accounting for the non-compliance behavior, is decreasing in θ. And since the preference is quasilinear, we have
that the marginal rate of substitution between c̄ and ȳ is decreasing in θ as well.
With Lemma 1, we proceed with our direct mechanism approach with the new preference
(4), replacing the incentive compatibility constraint with the local incentive compatibility constraint and the monotonicity constraint.
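For a given t, the Social Planner Problem (SPP) is
$$\max_{v_t(\theta),\,\bar y_t(\theta)} \int_{\underline\theta}^{\bar\theta} v_t(\theta)\, d\tilde F(\theta)$$
with dF̃(θ)¹ as the Pareto weight, subject to (6),
$$v_t'(\theta) = \frac{\tilde y_t(\theta, \bar y_t(\theta)) + \bar y_t(\theta)}{\theta^{2}}\, \psi'\!\left(\frac{\tilde y_t(\theta, \bar y_t(\theta)) + \bar y_t(\theta)}{\theta}\right),$$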
and to the resource constraint, which is
$$\int_{\theta\in\Theta} \left\{ \bar y_t(\theta) - \bar c_t(\theta) + t\,\tilde y_t(\theta, \bar y_t(\theta)) \right\} dF(\theta) \ge 0. \tag{7}$$
The term t ∫_{θ∈Θ} ỹ dF(θ) is the revenue the social planner has from the linear tax that it can
use for redistribution. The solution of this problem yields the following proposition:
1 We do not require dF̃(θ) to be differentiable. This generalization allows us to consider a Rawlsian social planner who puts all the weight on the lowest type θ̲.
Proposition 1. The first order condition for the optimal income tax rate at positive reported income level ȳ with linear tax rate t can be written as either
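$$\frac{T'(\bar y_t(\theta)) + \frac{t}{1-t}\left(\frac{\tilde y E_{\tilde y,1-T'}}{\bar y E_{\bar y,1-T'}} + 1\right)}{1 - T'(\bar y_t(\theta))} = \left(\frac{\tilde y E_{\tilde y,1-T'}}{\bar y E_{\bar y,1-T'}} + 1\right) \frac{\tilde F(\theta) - F(\theta)}{f(\theta)\,\theta} \left(1 + \frac{1}{E_{y,1-T'}}\right) \tag{8}$$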
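or
$$\frac{T'(\bar y) + \frac{t}{1-t}\left(\frac{\tilde y E_{\tilde y,1-T'}}{\bar y E_{\bar y,1-T'}} + 1\right)}{E_{\bar y,1-T'}\,\bar y\,T''(\bar y) + (1 - T'(\bar y))} = \frac{\tilde H(\bar y) - H(\bar y)}{h(\bar y)\,\bar y}\, \frac{1}{E_{\bar y,1-T'}}. \tag{9}$$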
Here H(ȳ) is the cumulative distribution of reported income and h(ȳ) is the density of
reported income.
Proof. See Appendix
Our result is similar to those found in the optimal income tax literature, except we have
a linear tax rate and non-compliance elasticity. More specifically, suppose t = 0. Then
equation (8) is the same as Piketty (1997, (19)) and equation (9) is the same as equation
(34) in Bovenberg and Jacobs (2005, (16)). However, the schedule is a function of not only
the elasticity of taxable income E_{ȳ,1−T′} but also the elasticity of mis-reported income E_{ỹ,1−T′}. These
two objects are related. The cost of avoidance is one of the key determinants of the second
elasticity, at the same time that it affects the first elasticity as shown in Kopczuk (2005,
(13)).
The optimal non-linear tax adjusts in the presence of the linear tax, since the linear tax
introduces a wedge that the non-linear tax can partially undo. The extent of this adjustment depends on households' non-compliance behavior. In the extreme case when non-compliance is costless, i.e. φ(·) = 0, we have −E_{ỹ,1−T′} ỹ = E_{ȳ,1−T′} ȳ, since the first order
condition of the household problem with respect to ỹ implies dỹ/dȳ = −1. Hence, the optimal non-linear tax does not need to account for the effect of the linear tax. However, the
optimal non-linear schedule still depends on the linear tax since this tax influences households’ reported and misreported income elasticities. In particular it affects the marginal
disutility of reported income relative to the mis-reported income by reducing the marginal
utility of consumption. At the other extreme, in the absence of non-compliance, we
have E_{ỹ,1−T′} = 0, so the non-linear tax must fully account for the linear tax. The intuition is
that as misreported income becomes more elastic, the non-linear tax becomes less effective
in undoing the linear tax wedge. In fact, when t > 0, the optimal income tax schedule that
properly accounts for tax non-compliance behavior is more progressive than the schedule
that ignores such behavior.
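As a rough numerical reading of equation (8), the sketch below solves it for T′ while treating every other term as a fixed number. This is only a partial illustration: in the model the elasticities and the reported and misreported incomes are themselves endogenous to T′ and t. The function name `marginal_rate_from_eq8` and its argument names are hypothetical.

```python
def marginal_rate_from_eq8(t, ratio, weight_gap, e_total):
    """Solve equation (8) for T', holding all other terms fixed.

    ratio      : (ytilde * E_ytilde) / (ybar * E_ybar), the response of
                 misreported income relative to reported income
    weight_gap : (Ftilde(theta) - F(theta)) / (f(theta) * theta)
    e_total    : elasticity of total income, E_{y,1-T'}
    """
    k = ratio + 1.0
    rhs = k * weight_gap * (1.0 + 1.0 / e_total)
    # (T' + t/(1-t) * k) / (1 - T') = rhs
    #   =>  T' = (rhs - t/(1-t) * k) / (1 + rhs)
    return (rhs - t / (1.0 - t) * k) / (1.0 + rhs)


if __name__ == "__main__":
    # No non-compliance response (ratio = 0), with and without a linear tax.
    print(marginal_rate_from_eq8(0.0, 0.0, 0.5, 0.5))
    print(marginal_rate_from_eq8(0.2, 0.0, 0.5, 0.5))
    # Costless non-compliance (ratio = -1): both sides of (8) collapse.
    print(marginal_rate_from_eq8(0.2, -1.0, 0.5, 0.5))
```

With ratio = 0 the linear tax term enters in full, matching the case without non-compliance. With ratio = −1 the term multiplying t/(1 − t) vanishes, matching the costless non-compliance case discussed above.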
If the linear tax t = 0, we see from Proposition 1 that we get back the "no distortion at
the top" result, i.e. T′(ȳmax) = 0. This result differs from the one found in Grochulski
2008 (8), which states that, under certain non-compliance cost structures and even with
bounded support for skill types, the optimal income tax is progressive. The difference
arises because the incomes of households in our model come from supplying labor while
the incomes from households in Grochulski (2008) (8) are exogenously endowed.
So far, we have characterized the optimal income tax for strictly positive reported income.
We also have two comments for the case when ȳt (θ ) = 0. First, the usual condition used
to rule out a corner solution in other optimal income tax problems (the marginal disutility
from labor when supplying no labor is 0) does not ensure an interior solution because
of the presence of tax non-compliance. The intuition is the following. Without the tax
non-compliance, reported income equals total income, and at zero reported income, the
marginal change in social welfare when slightly increasing production is positive since
marginal disutility of production is 0. However, with positive misreported income, even
when a household reports zero income, its actual level of production is positive and the
marginal disutility from labor might be quite high. As a result, the marginal social welfare
of increasing reported income may be negative, even at level zero. Another interpretation
for why the non-negativity constraint on ȳ may bind is that the presence of non-compliance
restricts the set of implementable marginal tax schedules. When the marginal tax rates
exceed some threshold, households choose to report zero income.
Second, the set of types that report zero income must consist of an interval starting from
the lowest type. This result follows from the monotonicity of ȳt(θ) and agrees with the
general finding that informal markets are less efficient than formal markets.2
In summary, the characterization of the optimal non-linear tax schedule offers two seemingly opposing implications. On one hand, the presence of non-compliance imposes an
upper envelope for the feasible tax schedule. Gordon and Li (6) noted this fact when they
argued that developing countries need to set lower marginal tax rates so that firms do not
move into the informal economy. On the other hand, for the region of the schedule with
positive reported income and positive consumption tax, the presence of non-compliance
makes marginal tax rates higher than the optimal rates that ignore tax non-compliance.
2 La Porta and Shleifer (2009) (15) examine World Bank firm-level surveys and find that firms in the informal sector are smaller, much less productive, and managed by less educated managers than even small firms in the formal sector.
4 Outer Problem: Characterizing Optimal Consumption Tax
In the previous section, we solved the inner problem by characterizing the optimal non-linear tax schedule for a given linear tax rate. Now we solve the outer problem, i.e.
we maximize the welfare with the optimal income tax in place with respect to the linear
tax rate.
Proposition 2. The optimal linear tax rate, with the optimal non-linear income tax schedule derived in Proposition 1, must satisfy the following condition:
$$\int_{\underline\theta}^{\bar\theta} \left. \frac{\phi'(\tilde y_t(\theta, \bar y)) + \lambda_t(\theta)}{\phi''(\tilde y_t(\theta, \bar y))} \right|_{\bar y = \bar y_t(\theta)} f(\theta)\, d\theta = 0, \tag{10}$$
with ȳt(θ) defined by (18) and with λt(θ) as the Lagrange multiplier on the constraint ȳt(θ) ≥ 0.
Proof. See Appendix
Notice that all the effects that depend on total income, i.e. ψ() and its higher derivatives,
disappear when characterizing the optimal linear tax rate t. The social planner has full
control over them through the non-linear tax. Here, we see how the flexibility of the
income tax complements the linearity of the enforceable tax. And since the social planner
has already optimized the non-linear tax in the inner problem, changing t slightly does
not affect social welfare through total income.
However, the social planner cannot fully control tax non-compliance, and non-compliance
changes the solution in two ways. First, an increase in ỹ increases the social cost of non-compliance
by ∫ φ′(ỹ(θ)) dθ. Second, even when ψ′(0) = 0, non-compliance may cause
the constraint ȳ ≥ 0 to bind. We can think of this second force as non-compliance’s impact
on the social planner’s ability to implement a desired non-linear tax schedule. Hence, a
change in t only affects social welfare through the cost of non-compliance and the boundary conditions, the two channels that the social planner does not fully control.
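One way to see why only φ and the multiplier λt(θ) survive in (10) is the following envelope sketch. It is not the appendix proof. It assumes that the constraint ỹ ≥ −ȳ is slack, that the pointwise objective has the form that (11) takes with α = 1, and that λt(θ) is expressed per unit of density f(θ).

```latex
% Assumed pointwise objective (the form of (11) with \alpha = 1):
%   G = A(y) - \phi(\tilde y),  y = \bar y + \tilde y_t(\theta, \bar y),
%   A(y) = y - \psi(y/\theta)
%          + \frac{F(\theta) - \tilde F(\theta)}{f(\theta)}\,
%            \frac{y}{\theta^{2}}\,\psi'(y/\theta).

% Household first-order condition for \tilde y in (3):
(1-t) - \frac{1}{\theta}\,\psi'\!\left(\frac{y}{\theta}\right) - \phi'(\tilde y) = 0

% Implicit differentiation of this condition:
\frac{\partial \tilde y}{\partial \bar y}
  = -\frac{\psi''/\theta^{2}}{\psi''/\theta^{2} + \phi''},
\qquad
\frac{\partial \tilde y}{\partial t}
  = -\frac{1}{\psi''/\theta^{2} + \phi''}
  = -\frac{1}{\phi''}\left(1 + \frac{\partial \tilde y}{\partial \bar y}\right)

% Planner first-order condition in \bar y, with multiplier \lambda_t(\theta):
A'(y)\left(1 + \frac{\partial \tilde y}{\partial \bar y}\right)
  - \phi'(\tilde y)\,\frac{\partial \tilde y}{\partial \bar y} + \lambda_t(\theta) = 0
\;\Longrightarrow\;
\bigl(A'(y) - \phi'(\tilde y)\bigr)\left(1 + \frac{\partial \tilde y}{\partial \bar y}\right)
  = -\bigl(\phi'(\tilde y) + \lambda_t(\theta)\bigr)

% Envelope theorem in t (t enters G only through \tilde y_t):
\frac{\partial G}{\partial t}
  = \bigl(A'(y) - \phi'(\tilde y)\bigr)\,\frac{\partial \tilde y}{\partial t}
  = \frac{\phi'(\tilde y) + \lambda_t(\theta)}{\phi''(\tilde y)}
```

Integrating this expression against f(θ) and setting the result to zero reproduces the form of (10). All terms in ψ cancel because the planner's first-order condition in ȳ already accounts for them.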
To fix ideas, consider an exogenous increase of the social cost of evasion φ(·) which increases this cost to k · φ(·) for some k > 1. This would directly affect welfare by reducing
the utility of all households mis-reporting in equilibrium, but it would also affect the set of
households that would choose to report zero income. This would in turn change the possible set of schedules the social planner can choose in order to raise the revenue necessary
for redistribution.
Finally, note that the social planner cares about the breakdown of the total income into
reported and misreported to the extent that cost of non-compliance impacts the social
welfare. To better illustrate this point, consider an economy in which a 1 − α fraction
of the cost of non-compliance is transferred back to the social planner, as considered in
Chetty (2008, (4)). For example, if part of the costs of non-compliance consists of fines
households must pay when caught, these fines can be sources of additional revenue that
the social planner can use for redistribution. On the other hand, the α fraction of the cost
can reflect the efforts to learn about the tax code or the expense of hiring tax lawyers. We
can instead characterize the optimal income tax schedule for a given linear tax t as the
solution to the following point-wise maximization.
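$$\bar y_t(\theta) = \arg\max_{\bar y \ge 0} \left\{ \bar y + \tilde y_t(\theta, \bar y) - \psi\!\left(\frac{\tilde y_t(\theta, \bar y) + \bar y}{\theta}\right) - \alpha\,\phi(\tilde y_t(\theta, \bar y)) + \frac{F(\theta) - \tilde F(\theta)}{f(\theta)}\, \frac{\tilde y_t(\theta, \bar y) + \bar y}{\theta^{2}}\, \psi'\!\left(\frac{\tilde y_t(\theta, \bar y) + \bar y}{\theta}\right) \right\}. \tag{11}$$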
This problem is analogous to (18) found in the proof of Proposition 1, with an α in front of
the φ(·) term. Consider the case α = 0, i.e. all the misreporting costs incurred by the households
are transferred back to the social planner. If the non-negativity constraint on ȳ does not
bind, the optimal allocation depends only on total income and the linear tax
has no role. To see this more precisely, when α = 0, the objective function of (11) is
a function of ȳ + ỹt (θ, ȳ). As long as the constraint ȳ ≥ 0 does not bind, ȳ + ỹt (θ, ȳ)
is strictly increasing with respect to ȳ, regardless of the non-compliance behavior of the
household. Hence the social planner can choose the appropriate ȳ to yield the desired
total income. Intuitively, when evasion does not generate resource costs (α = 0), the
planner can implement redistribution without caring about the enforcement of each tax
instrument as long as ȳ + ỹ(θ, ȳ) is strictly increasing in ȳ. However, when mis-reporting
is wasteful (α > 0), the planner trades off the enforceability of the consumption tax against
the progressivity of the income tax which yields a mix of both instruments.
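The role of α can be illustrated numerically for a single type. The sketch below is only an illustration under assumptions that are not part of the paper's argument: quadratic costs ψ(x) = x²/2 and φ(ỹ) = ỹ²/2 (so the household's mis-reporting choice has a closed form), a hypothetical type θ = 1, and a hypothetical value w = −0.25 standing in for (F(θ) − F̃(θ))/f(θ). The function names are ours.

```python
# Point-wise problem (11) for one type, under ASSUMED quadratic costs
# psi(x) = x**2 / 2 and phi(m) = m**2 / 2 (illustrative choice only).

def misreport(theta, t, y_rep):
    """Household's optimal mis-reported income given reported income y_rep.

    Solves (1 - t) - psi'((y_rep + m) / theta) / theta - phi'(m) = 0,
    which is linear in m under the quadratic assumption.
    """
    return (theta**2 * (1.0 - t) - y_rep) / (theta**2 + 1.0)


def objective(theta, t, y_rep, alpha, w):
    """Objective of (11); w stands for (F - F_tilde) / f at this type."""
    m = misreport(theta, t, y_rep)
    y = y_rep + m                      # total income
    psi = 0.5 * (y / theta) ** 2
    psi_prime = y / theta
    phi = 0.5 * m**2
    return y - psi - alpha * phi + (y / theta**2) * psi_prime * w


def best_report(theta, t, alpha, w, hi=5.0, n=50001):
    """Grid search over y_rep in [0, hi]; returns (y_rep, total income)."""
    grid = [hi * i / (n - 1) for i in range(n)]
    y_rep = max(grid, key=lambda r: objective(theta, t, r, alpha, w))
    return y_rep, y_rep + misreport(theta, t, y_rep)


if __name__ == "__main__":
    theta, w = 1.0, -0.25              # hypothetical type and weight gap
    for alpha in (0.0, 0.5):
        for t in (0.1, 0.3):
            y_rep, y_tot = best_report(theta, t, alpha, w)
            print(f"alpha={alpha} t={t}: reported={y_rep:.3f} total={y_tot:.3f}")
```

In this parametrization, with α = 0 the chosen total income is the same for both values of t (only the reported part adjusts), whereas with α > 0 the chosen total income varies with t.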
5 Comparative Statics
In this section, we discuss how the optimal linear tax rate varies with the redistributive
motives of the social planner. Our analysis uses proposition 2. First, we formalize the
degree of a redistributive motive with the following definition.
Definition 1. Pareto weights F̃(θ) are more redistributive than Pareto weights F̃′(θ) when
F̃(θ) is first order stochastically dominated by F̃′(θ), i.e. F̃(θ) ≥ F̃′(θ) for all θ ∈ [θ̲, θ̄].
[Figure 2]
If we assume that φ′(·)/φ″(·) is non-decreasing, a condition that is satisfied by an iso-elastic
mis-reporting cost function φ(ỹ) = |ỹ|^δ/δ, and that W(t) is differentiable
and single-peaked, we can show that the optimal linear tax rate is weakly increasing with
the planner’s redistributive motives (corollary 1). The interpretation of the first condition is that the marginal impact of increasing t on social welfare through the cost of non-compliance does not decrease as mis-reported income increases. The second condition is
straightforward to interpret, but fully specifying the set of functional forms
and parameters that ensure it is quite cumbersome. For all the cases that we
numerically checked when assuming φ(ỹ) = |ỹ|^δ/δ and ψ(x) = x^γ/γ, functional forms commonly assumed in the optimal tax literature, the necessary conditions are also sufficient
conditions for optimality.
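As a quick check of the first condition for the iso-elastic case, the following sketch evaluates φ′/φ″ for φ(ỹ) = |ỹ|^δ/δ. The derivatives are worked out by hand (φ′(ỹ) = sign(ỹ)|ỹ|^(δ−1), φ″(ỹ) = (δ−1)|ỹ|^(δ−2)), so the ratio should equal ỹ/(δ−1), which is increasing whenever δ > 1. The value δ = 1.5 and the function names are illustrative assumptions.

```python
# Ratio phi'/phi'' for the iso-elastic cost phi(m) = |m|**delta / delta.

def phi_prime(m, delta):
    """First derivative: sign(m) * |m|**(delta - 1)."""
    sign = 1.0 if m > 0 else -1.0 if m < 0 else 0.0
    return sign * abs(m) ** (delta - 1.0)


def phi_second(m, delta):
    """Second derivative (delta - 1) * |m|**(delta - 2), for m != 0."""
    return (delta - 1.0) * abs(m) ** (delta - 2.0)


def ratio(m, delta):
    """phi'(m) / phi''(m); analytically equal to m / (delta - 1)."""
    return phi_prime(m, delta) / phi_second(m, delta)


if __name__ == "__main__":
    delta = 1.5                        # illustrative curvature, delta > 1
    for m in (-2.0, -1.0, -0.5, 0.5, 1.0, 2.0):
        print(m, ratio(m, delta), m / (delta - 1.0))
```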
Corollary 1. Suppose that φ′(·)/φ″(·) is a non-decreasing function and W(t) is differentiable and
single-peaked, i.e. the optimal t is unique. Then the optimal t never decreases when the social planner
becomes more redistributive.
Proof. See Appendix.
As discussed in more detail in the appendix, the condition that φ′(·)/φ″(·) is non-decreasing
ensures that the marginal social welfare with respect to changes in t, i.e. dW(t)/dt, is non-decreasing with respect to the redistributive motive of the social planner. Consider figure
2. Suppose W(t) and W′(t) are the social welfares as functions of t under F̃(θ) and
F̃′(θ), respectively, and that F̃(θ) is more redistributive than F̃′(θ). Then dW(t)/dt always lies weakly above dW′(t)/dt, and the two curves do not cross. Furthermore, a social welfare
function like W″(t) cannot exist, since its derivative curve crosses the derivative curve of
W′(t).
Corollary 1 may seem surprising, as non-linear taxes tend to be perceived as “progressive”
and linear taxes as “regressive.” However, when the social planner becomes more redistributive,
the marginal income tax rates increase. These increases in the rates result in higher mis-reported
income ỹ. More specifically, households who evade income will evade more and
households who inflate their income will inflate less. The effect of a marginal increase in
the linear tax rate on welfare, i.e. φ′(·)/φ″(·), is positive and, under our assumption, does not decrease as ỹ rises. As a result, the social planner
needs a higher linear tax rate to offset the rise in ỹ.
6 Conclusion
We view the contribution of this paper as twofold. The first is providing a method to
solve an optimal income tax problem when income levels are not perfectly observable. We
overcome this difficulty by providing modified preferences of the households that account
for their optimal non-compliance behavior. Our strategy allows us to have a model with
a continuum of heterogeneous households who choose both labor supply and reported
income.
Our second contribution is providing insights on how to combine tax instruments, each
limited in its own way, to yield better redistributive outcomes. As seen in the characterization of the optimal tax mix, the non-linearity of the income tax is crucial in its ability
to complement the rigidity of the linear consumption tax. The presence of linear taxes in
an economy should be taken into account in calibrations of optimal income tax schedule,
since even countries with high levels of compliance have linear taxes. This consideration requires an adjustment of the income tax schedule which yields, in general, lower
marginal tax rates. However, if we take into account the non-compliance behavior on income taxes, this adjustment should be smaller than what first appears to be the case. We
also see the complementarity between the two instruments in our analysis of the optimal
tax structure as we change the social planner’s redistributive motive. As the social planner puts more weight on the lower ability households, we see the non-linear tax becoming
more progressive as expected. But in reaction to the households’ increased evasion behavior
in response to the higher marginal tax rates, the optimal linear tax rate also increases.
References
[1] Anthony Barnes Atkinson and Joseph E Stiglitz. The design of tax structure: direct
versus indirect taxation. Journal of Public Economics, 6(1):55–75, 1976.
[2] Michael Carlos Best, Anne Brockmeyer, Henrik Jacobsen Kleven, Johannes Spinnewijn, and Mazhar Waseem. Production vs revenue efficiency with limited tax capacity: theory and evidence from Pakistan. 2013.
[3] Robin Boadway, Maurice Marchand, and Pierre Pestieau. Towards a theory of the
direct-indirect tax mix. Journal of Public Economics, 55:71–88, 1994.
[4] Raj Chetty. Is the taxable income elasticity sufficient to calculate deadweight loss?
The implications of evasion and avoidance.
[5] Helmuth Cremer and Firouz Gahvari. Tax evasion and the optimum general income
tax. Journal of Public Economics, 60(2):235–249, 1996.
[6] Roger Gordon and Wei Li. Tax structure in developing countries: Many puzzles and
a possible explanation. NBER, 2005.
[7] Yuriy Gorodnichenko, Jorge Martinez-Vazquez, and Klara Sabirianova Peter.
Myth and reality of flat tax reform: Micro estimates of tax evasion response and
welfare effects in Russia. Journal of Political Economy, 117(3):504–554, 2009.
[8] Borys Grochulski. Optimal nonlinear income taxation with costly tax avoidance. Economic Quarterly-Federal Reserve Bank of Richmond, 93(1):77, 2007.
[9] Bas Jacobs and Robin Boadway. Optimal linear commodity taxation under optimal
non-linear income taxation. Journal of Public Economics, 117:201–210, 2014.
[10] Michael Keen and Ben Lockwood. The value added tax: Its causes and consequences.
Journal of Development Economics, pages 138 – 151, 2010.
[11] Henrik Jacobsen Kleven and Wojciech Kopczuk. Transfer program complexity and
the take-up of social benefits. American Economic Journal: Economic Policy, pages 54–
90, 2011.
[12] Wojciech Kopczuk. Redistribution when avoidance behavior is heterogeneous. Journal of Public Economics, pages 51–71, 2000.
[13] Wojciech Kopczuk. Tax bases, tax rates and the elasticity of reported income. Journal
of Public Economics, 89(11):2093–2119, 2005.
[14] Wojciech Kopczuk. The Polish business “flat” tax and its effect on reported incomes:
a Pareto improving tax reform. Technical report, Mimeo, Columbia University, 2012.
[15] Rafael La Porta and Andrei Shleifer. The unofficial economy and economic development. Brookings Papers on Economic Activity, 2008.
[16] A Lans Bovenberg and Bas Jacobs. Redistribution and education subsidies are
Siamese twins. Journal of Public Economics, 89(11), 2005.
[17] James A Mirrlees. An exploration in the theory of optimum income taxation. The
Review of Economic Studies, pages 175–208, 1971.
[18] Joana Naritomi. Consumers as tax auditors. Job market paper, Harvard University, 2013.
[19] Thomas Piketty. La redistribution fiscale face au chômage. Revue française d’économie,
12(1):157–201, 1997.
[20] Thomas Piketty, Emmanuel Saez, Stefanie Stantcheva, et al. Optimal taxation of top
labor incomes: A tale of three elasticities. American Economic Journal: Economic Policy,
6(1):230–271, 2014.
[21] Dina Pomeranz. No taxation without information: Deterrence and self-enforcement
in the value added tax. Technical report, National Bureau of Economic Research,
2013.
[22] Casey Rothschild and Florian Scheuer. Redistributive taxation in the Roy model. The
Quarterly Journal of Economics, 128(2):623–668, 2013.
[23] Agnar Sandmo. Income tax evasion, labour supply, and the equity - efficiency tradeoff. Journal of Public Economics, 16(3):265–288, 1981.
[24] Agnar Sandmo. The theory of tax evasion: A retrospective view. National Tax Journal,
pages 643–663, 2005.
[25] Fred Schroyen. Pareto efficient income taxation under costly monitoring. Journal of
Public Economics, 65(3):343–366, 1997.
Appendix
The Case when ỹ is Constrained to be Non-negative
We discuss here why, if we disallow negative ỹ, the solution equals that of Mirrlees (1971).
Let y(θ)^M denote the optimal total income when the household cannot misreport its
income. More precisely, y(θ)^M is defined as
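\[
y(\theta)^M = \arg\max_{y \ge 0} \left\{ y - \psi\!\left(\frac{y}{\theta}\right) + \frac{y}{\theta^2}\,\psi'\!\left(\frac{y}{\theta}\right) \frac{F(\theta) - \tilde{F}(\theta)}{f(\theta)} \right\}. \tag{12}
\]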
Now we are ready to present the following proposition.
Proposition 3. Suppose that the constraint of the problem (2) is ỹ ≥ 0 rather than ỹ ≥ −ȳ. The
optimal solution involves any t ≥ t∗ with t∗ satisfying
\[
(1 - t^*) = \min_{\theta} \frac{\psi'\big(y(\theta)^M/\theta\big)}{\theta}. \tag{13}
\]
And the optimal allocation involves ỹ_t(θ, ȳ_t(θ)) = 0 and ȳ_t(θ) = y(θ)^M for all θ ∈ [θ̲, θ̄].
Proof. Suppose that in our optimal allocation, there exists a θ̂ ∈ [θ̲, θ̄] such that ỹ_t(θ̂, ȳ_t(θ̂)) >
0. We can increase social welfare by first setting t = t∗. Note that ỹ_{t∗}(θ, ȳ) = 0 for ȳ ≥ 0
and for all θ. Then set the new allocation as ȳ_{t∗}(θ) = ȳ_t(θ) + ỹ_t(θ, ȳ_t(θ)) for all θ. This new
allocation increases social welfare by at least φ(ỹ_t(θ̂, ȳ_t(θ̂))). Hence the optimal allocation must involve zero income tax evasion, and therefore it must coincide with that of
Mirrlees (1971).
Proof of Lemma 1
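Proof. Let
\[
b(\bar{y},\theta) \equiv (1-t)\,\tilde{y}_t(\theta,\bar{y}) - \psi\!\left(\frac{\tilde{y}_t(\theta,\bar{y}) + \bar{y}}{\theta}\right) - \phi\big(\tilde{y}_t(\theta,\bar{y})\big).
\]
Using the definition of ỹ_t(θ, ȳ), we can rewrite
\[
b(\bar{y},\theta) = \max_{\tilde{y}} \; (1-t)\,\tilde{y} - \psi\!\left(\frac{\bar{y} + \tilde{y}}{\theta}\right) - \phi(\tilde{y}).
\]
Applying the envelope theorem when differentiating with respect to θ gives us
\[
b_\theta(\bar{y},\theta) = \frac{\tilde{y}_t(\theta,\bar{y}) + \bar{y}}{\theta^2}\,\psi'\!\left(\frac{\tilde{y}_t(\theta,\bar{y}) + \bar{y}}{\theta}\right).
\]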
(⇒): We rewrite IC as
\[
\hat{\theta} \in \arg\max_{\theta} \left\{ u_t\big(\bar{c}_t(\hat{\theta}), \bar{y}_t(\hat{\theta}); \theta\big) - v_t(\theta) \right\}, \tag{14}
\]
where v_t(θ) is defined in (5). Since the objective function of the above expression is differentiable with respect to θ, IC implies that the first order condition evaluated at θ = θ̂
is:
\[
v_t'(\hat{\theta}) = b_\theta\big(\bar{y}_t(\hat{\theta}), \hat{\theta}\big), \tag{15}
\]
which is equivalent to (6).
We now show that IC implies monotonicity of ȳt (θ ). For any θ1 ≥ θ0 , v(θ1 ) − v(θ0 ) =
[c̄(θ1 ) − c̄(θ0 )] + [b(ȳt (θ1 ), θ1 ) − b(ȳt (θ0 ), θ0 )]. Hence note that:
b(ȳt (θ0 ), θ1 ) − b(ȳt (θ0 ), θ0 ) ≤ v(θ1 ) − v(θ0 ) ≤ b(ȳt (θ1 ), θ1 ) − b(ȳt (θ1 ), θ0 )
where the first inequality comes from the IC of θ1 and the second from the IC of θ0 . We
can hence simplify this expression as follows:
b(ȳt (θ0 ), θ1 ) − b(ȳt (θ1 ), θ1 ) ≤ c̄(θ1 ) − c̄(θ0 ) ≤ b(ȳt (θ0 ), θ0 ) − b(ȳt (θ1 ), θ0 )
Therefore:
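\[
b(\bar{y}_t(\theta_0),\theta_0) - b(\bar{y}_t(\theta_1),\theta_0) - \big[ b(\bar{y}_t(\theta_0),\theta_1) - b(\bar{y}_t(\theta_1),\theta_1) \big] = \int_{\theta_0}^{\theta_1} \big[ b_\theta(\bar{y}_t(\theta_1),x) - b_\theta(\bar{y}_t(\theta_0),x) \big]\,dx \ge 0.
\]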
By Lemma 2, we have that ȳt (θ1 ) ≥ ȳt (θ0 ).
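(⇐) For θ0 ≤ θ1, we have
\[
v(\theta_1) - v(\theta_0) = \int_{\theta_0}^{\theta_1} b_\theta(\bar{y}_t(x),x)\,dx \ge \int_{\theta_0}^{\theta_1} b_\theta(\bar{y}_t(\theta_0),x)\,dx = b(\bar{y}_t(\theta_0),\theta_1) - b(\bar{y}_t(\theta_0),\theta_0),
\]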
where the first equality follows from local incentive compatibility constraint and the inequality follows from Lemma 2 and the monotonicity of ȳt (θ ). Hence,
c̄(θ1 ) + b(ȳt (θ1 ), θ1 ) − [c̄(θ0 ) + b(ȳt (θ0 ), θ0 )] ≥ b(ȳt (θ0 ), θ1 ) − b(ȳt (θ0 ), θ0 ),
which implies,
v(θ1 ) = c̄(θ1 ) + b(ȳt (θ1 ), θ1 ) ≥ c̄(θ0 ) + b(ȳt (θ0 ), θ1 ) = ut (c̄t (θ0 ), ȳt (θ0 ), θ1 ).
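Similarly, we have
\[
v_t(\theta_1) - v_t(\theta_0) = \int_{\theta_0}^{\theta_1} b_\theta(\bar{y}_t(x),x)\,dx \le \int_{\theta_0}^{\theta_1} b_\theta(\bar{y}_t(\theta_1),x)\,dx = b(\bar{y}_t(\theta_1),\theta_1) - b(\bar{y}_t(\theta_1),\theta_0),
\]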
which implies
vt (θ0 ) = c̄(θ0 ) + b(ȳt (θ0 ), θ0 ) ≥ c̄(θ1 ) + b(ȳt (θ1 ), θ0 ) = ut (c̄t (θ1 ), ȳ(θ1 ), θ0 ).
Lemma 2. The preferences defined in (4) exhibit the single crossing property, i.e. ∂u/∂θ is non-decreasing
in ȳ.
Proof. By the envelope theorem, we have
\[
\frac{\partial u}{\partial \theta} = \frac{\tilde{y}(\theta,\bar{y}) + \bar{y}}{\theta^2}\,\psi'\!\left(\frac{\tilde{y}(\theta,\bar{y}) + \bar{y}}{\theta}\right).
\]
Applying the implicit function theorem on the FOC of (3), we see that ỹ(θ, ȳ) + ȳ is increasing in ȳ. Since both ψ′(·) and total income are non-negative and ψ′(·) is increasing, ∂u/∂θ
is non-decreasing in ȳ.
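The two facts used in Lemmas 1 and 2 (the envelope formula for b_θ and the monotonicity of ∂u/∂θ in ȳ) can be checked numerically. The sketch below is a hedged illustration, not the paper's procedure: it assumes iso-elastic costs ψ(x) = x^γ/γ and φ(ỹ) = |ỹ|^δ/δ with γ, δ > 1, solves the household's first-order condition for ỹ by bisection, and uses illustrative parameter values. All function names are ours.

```python
# Numerical check of the envelope formula and single crossing, under
# ASSUMED iso-elastic costs psi(x) = x**gamma/gamma, phi(m) = |m|**delta/delta.

def solve_misreport(theta, t, y_rep, gamma, delta):
    """Bisection on the FOC (1-t) - psi'((y_rep+m)/theta)/theta - phi'(m) = 0.

    The left-hand side is strictly decreasing in m when gamma, delta > 1
    and is positive at m = -y_rep, so the root lies above -y_rep.
    """
    def foc(m):
        sign = 1.0 if m >= 0 else -1.0
        x = max((y_rep + m) / theta, 0.0)
        return (1.0 - t) - x ** (gamma - 1.0) / theta - sign * abs(m) ** (delta - 1.0)

    lo, hi = -y_rep, 1.0
    while foc(hi) > 0:                 # expand until the root is bracketed
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if foc(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def b(theta, t, y_rep, gamma, delta):
    """b(y_rep, theta) = (1-t) m - psi((y_rep+m)/theta) - phi(m) at the optimal m."""
    m = solve_misreport(theta, t, y_rep, gamma, delta)
    x = (y_rep + m) / theta
    return (1.0 - t) * m - x ** gamma / gamma - abs(m) ** delta / delta


def du_dtheta(theta, t, y_rep, gamma, delta):
    """Envelope expression (y / theta**2) * psi'(y / theta), y = total income."""
    y = y_rep + solve_misreport(theta, t, y_rep, gamma, delta)
    return (y / theta**2) * (y / theta) ** (gamma - 1.0)


if __name__ == "__main__":
    theta, t, gamma, delta = 1.2, 0.2, 2.5, 1.8   # illustrative values
    for k in range(0, 21, 5):
        y_rep = k / 10.0
        print(y_rep, du_dtheta(theta, t, y_rep, gamma, delta))
```

A finite-difference derivative of `b` with respect to `theta` can be compared with `du_dtheta` to confirm the envelope step, and evaluating `du_dtheta` on an increasing grid of reported incomes illustrates the single crossing property for these functional forms.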
Proof of Proposition 1
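Proof. We start by rewriting the constraint (7) as
\[
\int_{\underline{\theta}}^{\bar{\theta}} \left[ \bar{y}(\theta) + \tilde{y}_t(\theta,\bar{y}(\theta)) - \psi\!\left(\frac{\tilde{y}_t(\theta,\bar{y}(\theta)) + \bar{y}(\theta)}{\theta}\right) - \phi\big(\tilde{y}_t(\theta,\bar{y}(\theta))\big) - v_t(\theta) \right] f(\theta)\,d\theta \ge 0,
\]
using (5).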
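Using integration by parts, we have
\[
\int_{\underline{\theta}}^{\bar{\theta}} v_t(\theta)\,\tilde{f}(\theta)\,d\theta = \int_{\underline{\theta}}^{\bar{\theta}} \frac{\tilde{y}_t(\theta,\bar{y}(\theta)) + \bar{y}(\theta)}{\theta^2}\,\psi'\!\left(\frac{\tilde{y}_t(\theta,\bar{y}(\theta)) + \bar{y}(\theta)}{\theta}\right) \frac{1 - \tilde{F}(\theta)}{f(\theta)}\,f(\theta)\,d\theta + v_t(\underline{\theta}), \tag{16}
\]
and
\[
\int_{\underline{\theta}}^{\bar{\theta}} v_t(\theta)\,f(\theta)\,d\theta = \int_{\underline{\theta}}^{\bar{\theta}} \frac{\tilde{y}_t(\theta,\bar{y}(\theta)) + \bar{y}(\theta)}{\theta^2}\,\psi'\!\left(\frac{\tilde{y}_t(\theta,\bar{y}(\theta)) + \bar{y}(\theta)}{\theta}\right) \frac{1 - F(\theta)}{f(\theta)}\,f(\theta)\,d\theta + v_t(\underline{\theta}). \tag{17}
\]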
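Using the above expressions, we rewrite the integrand of the objective function as³:
\[
\bar{y}(\theta) + \tilde{y}_t(\theta,\bar{y}(\theta)) - \psi\!\left(\frac{\tilde{y}_t(\theta,\bar{y}(\theta)) + \bar{y}(\theta)}{\theta}\right) - \phi\big(\tilde{y}_t(\theta,\bar{y}(\theta))\big) + \frac{\tilde{y}_t(\theta,\bar{y}(\theta)) + \bar{y}(\theta)}{\theta^2}\,\psi'\!\left(\frac{\tilde{y}_t(\theta,\bar{y}(\theta)) + \bar{y}(\theta)}{\theta}\right) \frac{F(\theta) - \tilde{F}(\theta)}{f(\theta)}. \tag{18}
\]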
Hence, the social planner’s solution involves a point-wise maximization of the above objective function. The objective function of (18) often arises in screening problems, with the
first line as the household's first best utility and the second line as the information rent the
social planner should give to each household in order for it to reveal its type.
The maximization problem (18) also suggests that the optimal allocation may involve households not complying in their tax reports, i.e. ỹ_t(θ, ȳ_t(θ)) ≠ 0.
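Note that the FOC of the household problem with respect to ỹ implies that
\[
(1-t) - \frac{1}{\theta}\,\psi'\!\left(\frac{\bar{y} + \tilde{y}}{\theta}\right) - \phi'(\tilde{y}) = 0.
\]
Thus the FOC of (18) with respect to ȳ is
\[
1 - \frac{1}{\theta}\,\psi'\!\left(\frac{\tilde{y}_t(\theta,\bar{y}) + \bar{y}}{\theta}\right) + \frac{\partial \tilde{y}_t}{\partial \bar{y}}\,t + \lambda_t(\theta) = \frac{\tilde{F}(\theta) - F(\theta)}{f(\theta)}\,\frac{1}{\theta^2}\left(\frac{\partial \tilde{y}_t}{\partial \bar{y}} + 1\right)\left[ \psi'\!\left(\frac{\tilde{y}_t(\theta,\bar{y}) + \bar{y}}{\theta}\right) + \frac{\tilde{y}_t(\theta,\bar{y}) + \bar{y}}{\theta}\,\psi''\!\left(\frac{\tilde{y}_t(\theta,\bar{y}) + \bar{y}}{\theta}\right) \right], \tag{19}
\]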
where λt (θ ) represents the Lagrange multiplier on the constraint that ȳ ≥ 0.
In order to determine the optimal income tax schedule, remember that the household
problem is,
\[
\max_{\bar{y} \ge 0,\; \tilde{y} + \bar{y} \ge 0} \; (1-t)\big[\tilde{y} + \bar{y} - T(\bar{y})\big] - \psi\!\left(\frac{\tilde{y} + \bar{y}}{\theta}\right) - \phi(\tilde{y}). \tag{20}
\]
³ Note that the Lagrangian multiplier on the resource constraint is one because of the quasi-linear specification of the utility function.
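Hence the first order condition with respect to ȳ is
\[
\big(1 - T'(\bar{y})\big)(1-t) - \frac{1}{\theta}\,\psi'\!\left(\frac{\tilde{y} + \bar{y}}{\theta}\right) = 0, \tag{21}
\]
and the first order condition with respect to ỹ is
\[
1 - t - \frac{1}{\theta}\,\psi'\!\left(\frac{\tilde{y} + \bar{y}}{\theta}\right) = \phi'(\tilde{y}). \tag{22}
\]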
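The above two first order conditions implicitly define ȳ(θ) and ỹ(θ). Using the fact that
\[
\frac{\partial \tilde{y}_t}{\partial \bar{y}} + 1 = 1 + \frac{\tilde{y}\,E_{\tilde{y},1-T'}}{\bar{y}\,E_{\bar{y},1-T'}}
\qquad \text{and} \qquad
\frac{(1-t)(1-T')\,\theta^2}{y\,\psi''\!\left(\frac{y}{\theta}\right)} = E_{y,1-T'},
\]
rearranging (19) gives us the first expression.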
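The second identity can be sanity-checked in a special case. The sketch below assumes ψ(x) = x^γ/γ and treats the marginal keep rate 1 − T′ as a local parameter (a linearized schedule), so that (21) can be solved for total income in closed form; under these assumptions the elasticity should equal 1/(γ − 1). The function names and parameter values are hypothetical.

```python
import math

# Elasticity of total income with respect to the keep rate 1 - T',
# under the ASSUMED disutility psi(x) = x**gamma / gamma.

def total_income(theta, t, keep_rate, gamma):
    """Total income y solving (21): keep_rate * (1 - t) = psi'(y / theta) / theta."""
    return theta * (theta * (1.0 - t) * keep_rate) ** (1.0 / (gamma - 1.0))


def elasticity_formula(theta, t, keep_rate, gamma):
    """(1 - t)(1 - T') theta**2 / (y * psi''(y / theta))."""
    y = total_income(theta, t, keep_rate, gamma)
    psi_second = (gamma - 1.0) * (y / theta) ** (gamma - 2.0)
    return (1.0 - t) * keep_rate * theta**2 / (y * psi_second)


def elasticity_numeric(theta, t, keep_rate, gamma, h=1e-6):
    """Central log-difference estimate of d ln y / d ln(1 - T')."""
    up = total_income(theta, t, keep_rate * (1.0 + h), gamma)
    down = total_income(theta, t, keep_rate * (1.0 - h), gamma)
    return (math.log(up) - math.log(down)) / (math.log(1.0 + h) - math.log(1.0 - h))


if __name__ == "__main__":
    theta, t, keep_rate, gamma = 1.5, 0.2, 0.7, 3.0   # illustrative values
    print(elasticity_formula(theta, t, keep_rate, gamma))
    print(elasticity_numeric(theta, t, keep_rate, gamma))
    print(1.0 / (gamma - 1.0))
```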
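To get the second expression, we need to transform the skill distribution into the reported
income distribution. Differentiating (21) with respect to θ gives us,
\[
\bar{y}'(\theta) = \frac{\dfrac{1}{\theta^2}\left[ \psi'\!\left(\dfrac{\tilde{y}(\theta) + \bar{y}(\theta)}{\theta}\right) + \dfrac{\tilde{y}(\theta) + \bar{y}(\theta)}{\theta}\,\psi''\!\left(\dfrac{\tilde{y}(\theta) + \bar{y}(\theta)}{\theta}\right) \right]}{(1-t)\,T''(\bar{y}(\theta)) + \dfrac{1}{\theta^2}\,\psi''\!\left(\dfrac{\tilde{y}(\theta) + \bar{y}(\theta)}{\theta}\right)\left[ \dfrac{(1-t)\,T''(\bar{y}(\theta))}{\phi''(\tilde{y}(\theta))} + 1 \right]}, \tag{23}
\]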
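which we use to transform the type pdf f(θ) into the reported income pdf h(ȳ). Noting that
\[
1 - \frac{1}{\theta}\,\psi'\!\left(\frac{\tilde{y}_t(\theta,\bar{y}) + \bar{y}}{\theta}\right) = t + T'(\bar{y}(\theta))(1-t),
\]
we rewrite the interior case of (19) as
\[
T'(\bar{y})(1-t) + t\,\frac{dy}{d\bar{y}} = \frac{dy}{d\bar{y}}\,\frac{\tilde{H}(\bar{y}) - H(\bar{y})}{h(\bar{y})}\left\{ (1-t)\,T''(\bar{y}) + \frac{1}{\theta^2}\,\psi''\!\left(\frac{\tilde{y} + \bar{y}}{\theta}\right)\left[ \frac{(1-t)\,T''(\bar{y})}{\phi''(\tilde{y})} + 1 \right] \right\}.
\]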
Note that differentiating (22) with respect to ȳ gives us
\[
\frac{dy}{d\bar{y}} = \frac{\phi''(\cdot)}{\dfrac{\psi''(\cdot)}{\theta^2} + \phi''(\cdot)} = \frac{(1-t)\big(1 - T'(\bar{y})\big)}{\dfrac{1}{\theta^2}\,\psi''(\cdot)\,E_{\bar{y},1-T'}\,\bar{y}}. \tag{24}
\]
Finally, from the equalities above and rearranging (24), we get the result.
Proof of Proposition 2
Proof. First, we rewrite the optimal social welfare as
\[
\max_{t} \; W(t),
\]
or
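\[
\max_{t} \left\{ \int_{\underline{\theta}}^{\bar{\theta}} \max_{\bar{y} \ge 0} \left\{ \bar{y} + \tilde{y}_t(\theta,\bar{y}) - \psi\!\left(\frac{\tilde{y}_t(\theta,\bar{y}) + \bar{y}}{\theta}\right) - \phi\big(\tilde{y}_t(\theta,\bar{y})\big) + \frac{\tilde{y}_t(\theta,\bar{y}) + \bar{y}}{\theta^2}\,\psi'\!\left(\frac{\tilde{y}_t(\theta,\bar{y}) + \bar{y}}{\theta}\right) \frac{F(\theta) - \tilde{F}(\theta)}{f(\theta)} \right\} f(\theta)\,d\theta \right\}. \tag{25}
\]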
We introduce a new function x̃_t(θ, y), implicitly defined as the solution to (1 − t) − (1/θ)ψ′(y/θ) =
φ′(x̃). This function x̃_t(θ, y) represents the optimal non-compliance when the household
must produce total output y facing commodity tax t. Then, we can rewrite the integrand
of the objective function as
\[
\max_{y} \left\{ y - \psi\!\left(\frac{y}{\theta}\right) - \phi\big(\tilde{x}_t(\theta,y)\big) + \frac{y}{\theta^2}\,\psi'\!\left(\frac{y}{\theta}\right) \frac{F(\theta) - \tilde{F}(\theta)}{f(\theta)} + \lambda^y_t(\theta)\big[ y - \tilde{x}_t(\theta,y) \big] \right\}, \tag{26}
\]
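with y_t(θ) as the argmax. Using the envelope theorem and interchanging differentiation
and integration, we get
\[
\frac{dW(t)}{dt} = -\int_{\underline{\theta}}^{\bar{\theta}} \big[ \phi'\big(\tilde{x}_t(\theta,y_t(\theta))\big) + \lambda^y_t(\theta) \big]\,\frac{\partial \tilde{x}_t}{\partial t}\big(\theta,y_t(\theta)\big)\,dF(\theta), \tag{27}
\]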
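where λ^y_t(θ) is the Lagrange multiplier on the constraint that y ≥ x̃_t(θ, y). Noticing from the
FOC of (26) that
\[
-\lambda^y_t(\theta)\left(1 - \frac{\partial \tilde{x}_t}{\partial y}\right) = t + \left(1 - t - \frac{\psi'\big(y_t(\theta)/\theta\big)}{\theta}\right)\left(1 - \frac{\partial \tilde{x}_t}{\partial y}\right) + \frac{1}{\theta^2}\,\frac{F(\theta) - \tilde{F}(\theta)}{f(\theta)}\left[ \psi'\big(y_t(\theta)/\theta\big) + \frac{y_t(\theta)\,\psi''\big(y_t(\theta)/\theta\big)}{\theta} \right], \tag{28}
\]
and that 1 − ∂x̃_t/∂y = 1/(∂y/∂ȳ), we see that λ^y_t(θ) = λ_t(θ). Finally, noting that ∂x̃_t/∂t = −1/φ″(x̃_t)
by implicitly differentiating x̃_t(θ, y) with respect to t, setting (27) equal to 0 gives us the
result.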
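The last step, ∂x̃_t/∂t = −1/φ″(x̃_t), can be verified numerically in a special case. The sketch below assumes iso-elastic forms ψ(x) = x^γ/γ and φ(x̃) = |x̃|^δ/δ, for which the defining condition of x̃_t(θ, y) can be inverted in closed form. It is an illustration with hypothetical parameter values and function names, not part of the proof.

```python
# Check of d x_tilde / d t = -1 / phi''(x_tilde), under ASSUMED iso-elastic
# costs psi(x) = x**gamma / gamma and phi(x) = |x|**delta / delta.

def x_tilde(theta, t, y, gamma, delta):
    """Non-compliance solving (1 - t) - psi'(y / theta) / theta = phi'(x)."""
    r = (1.0 - t) - (y / theta) ** (gamma - 1.0) / theta   # target value of phi'(x)
    sign = 1.0 if r >= 0 else -1.0
    return sign * abs(r) ** (1.0 / (delta - 1.0))


def phi_second(x, delta):
    """Second derivative of phi, defined for x != 0."""
    return (delta - 1.0) * abs(x) ** (delta - 2.0)


def dx_dt_numeric(theta, t, y, gamma, delta, h=1e-6):
    """Central finite difference of x_tilde with respect to t."""
    up = x_tilde(theta, t + h, y, gamma, delta)
    down = x_tilde(theta, t - h, y, gamma, delta)
    return (up - down) / (2.0 * h)


if __name__ == "__main__":
    theta, t, gamma, delta = 1.0, 0.2, 2.0, 3.0   # illustrative values
    for y in (0.5, 1.0):                          # under- and over-reporting cases
        x = x_tilde(theta, t, y, gamma, delta)
        print(y, x, dx_dt_numeric(theta, t, y, gamma, delta), -1.0 / phi_second(x, delta))
```

Substituting this derivative into (27) turns the integrand into [φ′(x̃_t) + λ_t(θ)]/φ″(x̃_t), the expression used in the proof of Corollary 1.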
Proof of Corollary 1
Proof. Suppose that F̃(θ) ≥ F̃′(θ) for all θ ∈ [θ̲, θ̄], so that F̃(θ) is more redistributive
than F̃′(θ). Since (18) exhibits increasing differences in (−F̃(θ), ȳ), when the Pareto weights
change from F̃(θ) to F̃′(θ), the corresponding optimal reported income allocation does
not decrease, i.e. ȳ_t(θ) ≤ ȳ′_t(θ) for every type. Since misreported income is decreasing in
reported income, ỹ_t(θ, ȳ_t(θ)) ≥ ỹ_t(θ, ȳ′_t(θ)) for all θ and t. Also, if we denote λ_t(θ) and
λ′_t(θ) as the Lagrange multipliers for the Pareto weights F̃(θ) and F̃′(θ), we have that
λ_t(θ) ≥ λ′_t(θ). To see this, suppose that λ_t(θ) < λ′_t(θ). Since λ_t(θ) ≥ 0, we have λ′_t(θ) > 0,
which implies that both ȳ′_t(θ) and ȳ_t(θ) are 0, i.e. the optimal allocations for the two Pareto
weights are the same. Then, from (19), we have
\[
\lambda_t(\theta) - \lambda'_t(\theta) = \frac{1}{\theta^2}\,\frac{\tilde{F}(\theta) - \tilde{F}'(\theta)}{f(\theta)}\left[ \psi'\big(\tilde{y}_t(\theta,0)/\theta\big) + \frac{\tilde{y}_t(\theta,0)\,\psi''\big(\tilde{y}_t(\theta,0)/\theta\big)}{\theta} \right]\frac{\partial y}{\partial \bar{y}} \ge 0, \tag{29}
\]
which implies that λ_t(θ) ≥ λ′_t(θ), a contradiction.
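Let W(t) and W′(t) be the social welfare and t_opt and t′_opt be the optimal linear tax for
Pareto weights F̃(θ) and F̃′(θ), respectively. Suppose by contradiction that t_opt < t′_opt.
Since
\[
\frac{dW(t)}{dt} = \int_{\underline{\theta}}^{\bar{\theta}} \frac{\phi'\big(\tilde{y}_t(\theta,\bar{y}_t(\theta))\big) + \lambda_t(\theta)}{\phi''\big(\tilde{y}_t(\theta,\bar{y}_t(\theta))\big)}\,f(\theta)\,d\theta \tag{30}
\]
and
\[
\frac{dW'(t)}{dt} = \int_{\underline{\theta}}^{\bar{\theta}} \frac{\phi'\big(\tilde{y}_t(\theta,\bar{y}'_t(\theta))\big) + \lambda'_t(\theta)}{\phi''\big(\tilde{y}_t(\theta,\bar{y}'_t(\theta))\big)}\,f(\theta)\,d\theta \quad \text{for all } t, \tag{31}
\]
we have
\[
\frac{dW(t)}{dt} \ge \frac{dW'(t)}{dt} \quad \text{for all } t. \tag{32}
\]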
To see this inequality, first note from our assumption that φ′(·)/φ″(·) is non-decreasing, and
that ỹ_t(θ, ȳ_t(θ)) ≥ ỹ_t(θ, ȳ′_t(θ)), so we have
\[
\frac{\phi'\big(\tilde{y}_t(\theta,\bar{y}_t(\theta))\big)}{\phi''\big(\tilde{y}_t(\theta,\bar{y}_t(\theta))\big)} \ge \frac{\phi'\big(\tilde{y}_t(\theta,\bar{y}'_t(\theta))\big)}{\phi''\big(\tilde{y}_t(\theta,\bar{y}'_t(\theta))\big)}
\]
for all θ and t. Then notice that
\[
\frac{\lambda_t(\theta)}{\phi''\big(\tilde{y}_t(\theta,\bar{y}_t(\theta))\big)} \ge \frac{\lambda'_t(\theta)}{\phi''\big(\tilde{y}_t(\theta,\bar{y}'_t(\theta))\big)}
\]
for all θ and t. To see this inequality, consider the
three possible cases: λ_t(θ) = λ′_t(θ) = 0, λ_t(θ) > 0 and λ′_t(θ) = 0, and λ_t(θ) ≥ λ′_t(θ) > 0.
Because we assume that φ″(·) > 0, the first two cases satisfy the inequality trivially. And
in the third case, ȳ_t(θ) = ȳ′_t(θ) = 0, so the denominators are the same.
Since we assume that W(t) is single-peaked and differentiable in t, dW(t)/dt crosses
0 once, which implies that there exists ε > 0 such that dW(t)/dt > 0 for t ∈ (t_opt − ε, t_opt) and
dW(t)/dt < 0 for t ∈ (t_opt, t_opt + ε). Also, for all t ≤ t′_opt, dW′(t)/dt ≥ 0. Hence there exists
a t″ ∈ (t_opt, t′_opt) such that dW(t″)/dt < 0 and dW′(t″)/dt ≥ 0, which contradicts (32).