Optimal Taxation - University of Warwick

Lecture 9: Optimal Taxation
Extra slides
EC981 - Topics in Public Economics
Lecture 9: Optimal Taxation
Miguel Almunia
MSc in Economics - University of Warwick
Thursday, 6th March, 2014
Optimal Taxation
“Optimal” tax theory must combine lessons from excess burden
and tax incidence:
Optimal size of the pie? - Efficiency
How is the pie distributed? - Equity
What is the best way to design taxes given equity and
efficiency concerns?
Optimal Taxation
Efficiency perspective: finance the govt through lump-sum
taxation
Fixed amount per person regardless of individual characteristics
Does not induce behavioural responses, so no inefficiency
Equity perspective: individual lump-sum taxes
Tax higher-ability individuals a larger lump sum
Problem: we cannot observe individual abilities
Hence we tax outcomes, such as income or consumption
Creates distortions & inefficiency
Outline of Today’s Lecture
Traditional Optimal Taxation Theory:
Ramsey (1927): Inverse elasticity rule for taxation of goods
Mirrlees (1971): Optimal nonlinear income tax
Modern Optimal Taxation Theory:
Saez (2001): optimal tax rate for top earners
Borrows some results from Diamond (1998)
* Diamond & Saez (2011): survey of the literature
Traditional Approaches
Ramsey (1927): linear tax systems
T (z) = t · z
Rules out lump-sum taxes by assumption
Homogeneous agents
Mirrlees (1971): nonlinear tax systems
T (z) = f (z)
Permits lump sum taxes
Heterogeneity in agents’ skills
Ramsey (1927): Tax Problem
Government levies taxes on goods in order to accomplish two goals:
1. Raise total revenue (R) of amount E
2. Minimize utility loss for agents in the economy
Representative individual
No redistributive concerns
As in efficiency analysis, assume that individual does not
internalize the effect of τi on govt budget
Ramsey (1927): Tax Problem
Frank P. Ramsey (United Kingdom, 1903-1930)
Mathematician & Philosopher
Cambridge University
Ramsey (1927): Assumptions
1. No lump-sum or nonlinear taxes: only proportional taxes
2. Cannot tax all commodities (e.g., leisure untaxed)
3. Production prices fixed (normalized to one):
$$p_i = 1 \;\Rightarrow\; q_i = 1 + \tau_i$$
Note: full mathematical derivation in “extra slides” at the end
Ramsey (1927): Graphical Intuition
Taxation of elastic vs. inelastic goods
Ramsey (1927): Interpretation & Limitations
Intuition: tax inelastic goods to minimize efficiency costs
Limitations:
Does not take into account redistributive motives
Necessities usually more inelastic than luxuries
Thus, optimal Ramsey tax system is regressive
Restricted to linear tax systems
Application of Ramsey to Taxation of Savings
Standard lifecycle model:
$$\max \sum_t u_t(c_t) \quad \text{subj. to} \quad \sum_t q_t c_t = W$$
where q_t = (1 + τ_t) p_t
Consumption in each period treated like different commodities
Let capital income tax be θ, levied on interest rate:
$$q_t = \frac{1}{\left(1 + (1 - \theta)\, r\right)^t}$$
Application of Ramsey to Taxation of Savings
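For any θ > 0, the implied tax on period-t consumption goes to infinity:
$$\frac{q_t}{p_t} = 1 + \tau_t = \left[\frac{1 + r}{1 + (1 - \theta)\, r}\right]^t \;\Rightarrow\; \lim_{t \to \infty} \tau_t = \infty$$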
But Ramsey formula implies that optimal tax τ ∗ cannot be ∞
for any good
Therefore, optimal capital income tax must be zero in long run
(Judd 1985; Chamley 1986)
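To see how quickly the implied wedge grows, the formula above can be evaluated numerically. This is a minimal sketch: the function name `implied_consumption_tax` and the values θ = 0.3 and r = 0.05 are illustrative assumptions, not taken from the lecture.

```python
def implied_consumption_tax(theta: float, r: float, t: int) -> float:
    """Implied tax tau_t on period-t consumption.

    Uses 1 + tau_t = [(1 + r) / (1 + (1 - theta) * r)] ** t,
    where theta is the capital income tax rate and r the interest rate.
    """
    return ((1.0 + r) / (1.0 + (1.0 - theta) * r)) ** t - 1.0


if __name__ == "__main__":
    theta, r = 0.3, 0.05  # illustrative values (assumption)
    for t in (1, 10, 50, 100, 200):
        # The wedge compounds with the horizon t and is unbounded as t grows.
        print(t, round(implied_consumption_tax(theta, r, t), 3))
```

With θ = 0 the wedge is zero at every horizon, which is the only capital tax rate that keeps the implied consumption tax bounded.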
Zero Capital Taxation Result: Limitations
The zero result within Ramsey model no longer holds when
assumptions are relaxed, for example:
Allowing for progressive income taxation
Allowing for credit market imperfections
Finitely-lived agents with finite bequest elasticities
Allowing for myopic agents
Mirrlees (1971): Incorporating Behavioural Responses
Ramsey’s approach ignored equity-efficiency tradeoff
Mirrlees (1971) incorporates behavioural responses:
1. Standard labour supply model
2. Individuals differ in ability w
3. Govt maximizes social welfare function (SWF) subj. to:
   Budget constraint
   Behavioural responses to taxation
Mirrlees (1971): Incorporating Behavioural Responses
James Mirrlees (born in Scotland, 1936)
Professor of Economics
Cambridge University
1996 Nobel Prize Winner
Mirrlees (1971): Model
Individual maximizes:
u (c, l ) subj. to c = wl − T (wl )
Ability distribution given by density f (w )
Government maximizes:
$$\text{SWF} = \int G[u(c, l)]\, f(w)\, dw$$
subj. to budget:
$$\int T(wl)\, f(w)\, dw \ge E$$
subj. to indiv FOC:
$$w \left[1 - T'(\cdot)\right] u_c - u_l = 0$$
where G (·) is increasing and concave
Mirrlees (1971): Model
Govt maximizes weighted sum of utilities of ex-post
consumption
With equal weights and diminishing marginal utility, we would
equalize everyone’s income
Utilitarianism leads to communism!
Is maximizing total ex-post utility the right objective function?
Deep debate dating back to Rawls, Nozick, Sen...
Mirrlees (1971): Results
Mirrlees formulas are complicated, only a few general results:
1. T′(·) ≤ 1: Obvious, because otherwise no one works
2. T′(·) ≥ 0: Non-trivial. Rules out EITC (incentives for labour force participation with tax subsidies)
3. T′(·) = 0 at the bottom of the skill distribution (assuming everyone works)
4. T′(·) = 0 at the top of the skill distribution (if skill distribution is bounded)
Mirrlees (1971): Results
Mirrlees model had huge impact in fields like contract theory
Models with asymmetric information
But little impact on practical tax policy
Recently, connected to empirical tax literature:
Diamond (AER, 1998), Saez (REStud, 2001)
Sufficient statistic formulas in terms of elasticities
Optimal Income Taxation: Sufficient Statistic Formulas
Revenue-maximizing linear tax: Laffer curve
Top income tax rate (Saez 2001)
Overview of this literature: *Diamond and Saez (JEL, 2011)
Laffer Curve: Revenue Maximizing Rate
Useful benchmark for optimal rate
Let tax revenue be R(τ) = τ · z(1 − τ)
Notice: reported income z is a function of the net-of-tax rate (1 − τ)
R (τ ) has an inverse U-shape:
No taxes: R (τ = 0) = 0
Confiscatory taxes: R (τ = 1) = 0
Laffer Curve: Revenue Maximizing Rate
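Revenue maximizing rate, τ*:
$$R'(\tau^*) = 0 \;\Longleftrightarrow\; z - \tau \frac{dz}{d(1 - \tau)} = 0$$
Multiplying through by (1 − τ)/z:
$$\frac{(1 - \tau)}{z}\, z - \tau \underbrace{\frac{(1 - \tau)}{z} \frac{dz}{d(1 - \tau)}}_{\varepsilon} = 0$$
$$1 - \tau - \tau \varepsilon = 0 \;\Rightarrow\; \tau^* = \frac{1}{1 + \varepsilon}$$
Strictly inefficient to have τ > τ* (Why?)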
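The formula τ* = 1/(1 + ε) can be checked numerically. The sketch below assumes a constant-elasticity income function z = z0 · (1 − τ)^ε, which is one convenient functional form and not something the slides specify; the names `revenue` and `revenue_maximizing_rate` are hypothetical.

```python
def revenue(tau: float, eps: float, z0: float = 1.0) -> float:
    """R(tau) = tau * z(1 - tau), with assumed z = z0 * (1 - tau) ** eps."""
    return tau * z0 * (1.0 - tau) ** eps


def revenue_maximizing_rate(eps: float, grid_size: int = 100_001) -> float:
    """Grid search for the tax rate in [0, 1] that maximizes revenue."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    return max(grid, key=lambda tau: revenue(tau, eps))


if __name__ == "__main__":
    for eps in (0.2, 0.5, 1.0):
        # Compare the grid-search optimum with the closed form 1 / (1 + eps).
        print(eps, revenue_maximizing_rate(eps), 1.0 / (1.0 + eps))
```

Under this functional form, revenue is zero at τ = 0 and τ = 1 and falls for any rate above τ*, so a rate above τ* lowers both revenue and taxpayers' after-tax income.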
Laffer Curve: Revenue Maximizing Rate
Graphical representation
Saez (2001): Using Elasticities to Derive Optimal Tax Rates
Derive optimal tax rate τ using perturbation argument
Assume there are no income effects, so compensated and uncompensated elasticities coincide: ε^c = ε^u = ε
Diamond (1998) shows this is a key theoretical simplification
Assume that there are N individuals above earnings z̄
Let z^m(1 − τ) be the average income of these individuals, as a function of the net-of-tax rate
Saez (2001): Optimal Income Tax Rate
Three effects of small ∆τ > 0 reform above z̄:
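1. Mechanical increase in tax revenue:
$$\Delta M = N \cdot [z^m - \bar z]\, \Delta\tau$$
2. Behavioural response:
$$\Delta B = N \tau\, \Delta z^m = N \tau\, \frac{\Delta z^m}{\Delta (1 - \tau)}\, (-\Delta\tau) = -N\, \frac{\tau}{1 - \tau}\, \bar\varepsilon\, z^m\, \Delta\tau$$
3. Welfare effect: if govt values rich consumption at ḡ ∈ (0, 1):
$$\Delta W = -\bar g\, \Delta M$$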
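The three effects can be computed directly for a candidate reform. This is a rough sketch with made-up inputs (N = 1000, z^m = 300,000, z̄ = 100,000, ε̄ = 0.25, ḡ = 0.5 are assumptions chosen for illustration); the function name `reform_effects` is hypothetical.

```python
def reform_effects(tau, d_tau, n, z_mean, z_bar, eps, g):
    """Return (mechanical, behavioural, welfare) effects of raising tau by d_tau.

    mechanical  =  N * (z_m - z_bar) * d_tau
    behavioural = -N * tau / (1 - tau) * eps * z_m * d_tau
    welfare     = -g * mechanical
    """
    mechanical = n * (z_mean - z_bar) * d_tau
    behavioural = -n * tau / (1.0 - tau) * eps * z_mean * d_tau
    welfare = -g * mechanical
    return mechanical, behavioural, welfare


if __name__ == "__main__":
    # Illustrative inputs (assumptions, not from the lecture).
    d_m, d_b, d_w = reform_effects(
        tau=0.4, d_tau=0.01, n=1000, z_mean=300_000, z_bar=100_000, eps=0.25, g=0.5
    )
    # A positive sum means the small tax increase raises social welfare.
    print(d_m, d_b, d_w, d_m + d_b + d_w)
```

The sign of the sum tells us whether the starting rate is below or above the optimum: a positive total means raising τ further is worthwhile.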
Saez (2001): Optimal Income Tax Rate
[Figure: derivation of the optimal top tax rate. Horizontal axis: pre-tax income z; vertical axis: disposable income c = z − T(z). The top bracket has slope 1 − τ above the threshold and the reform lowers it to 1 − τ − ∆τ; the figure marks the mechanical tax increase and the behavioural response tax loss.]
Source: Diamond and Saez (JEL, 2011)
Saez (2001): Optimal Income Tax Rate
Optimal tax rate solves:
$$\Delta M + \Delta W + \Delta B = 0$$
After some algebra:
$$\frac{\tau^*_{top}}{1 - \tau^*_{top}} = \frac{(1 - \bar g)\, [(z^m / \bar z) - 1]}{\bar\varepsilon\, z^m / \bar z}$$
Top tax rate τ*_top is higher when:
↓ ḡ: less weight on welfare of the rich
↓ ε̄: lower elasticity of taxable income
↑ z^m/z̄: higher income inequality
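The "some algebra" step can be spelled out as follows. This is a sketch that only substitutes the three effects defined earlier (∆M, ∆B and ∆W = −ḡ ∆M) into the optimality condition; no additional assumptions are introduced.

```latex
\begin{align*}
0 &= \Delta M + \Delta W + \Delta B \\
  &= (1 - \bar g)\, N\, [z^m - \bar z]\, \Delta\tau
     \;-\; N\, \frac{\tau}{1 - \tau}\, \bar\varepsilon\, z^m\, \Delta\tau \\
\Rightarrow\quad
\frac{\tau}{1 - \tau}\, \bar\varepsilon\, z^m &= (1 - \bar g)\, [z^m - \bar z] \\
\Rightarrow\quad
\frac{\tau^*_{top}}{1 - \tau^*_{top}}
  &= \frac{(1 - \bar g)\, [(z^m / \bar z) - 1]}{\bar\varepsilon\, z^m / \bar z}
\end{align*}
```

The last line divides the numerator and denominator by z̄, so the formula depends on incomes only through the ratio z^m/z̄.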
Saez (2001): Optimal Income Tax Rate
With Pareto distribution, z^m/z̄ converges to a/(a − 1), where a is the Pareto distn. parameter
Simplified formula is then:
$$\tau^*_{top} = \frac{1 - \bar g}{1 - \bar g + \bar\varepsilon a}$$
In the US, z^m/z̄ ≈ 3 and quite stable over time
So Pareto distn. parameter is:
$$\frac{z^m}{\bar z} = 3 = \frac{a}{a - 1} \;\Rightarrow\; a = 1.5$$
We can estimate ε
Society decides value of ḡ (relative weight of rich on SWF)
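As a quick consistency check, the sketch below backs out a from the income ratio and confirms that the general formula (in terms of z^m/z̄) and the simplified Pareto formula give the same top rate. Function names are hypothetical, and the values ḡ = 0.5 and ε̄ = 0.5 are illustrative assumptions.

```python
def pareto_parameter(ratio: float) -> float:
    """Invert z_m / z_bar = a / (a - 1) for a; requires ratio > 1."""
    if ratio <= 1.0:
        raise ValueError("ratio z_m / z_bar must exceed 1")
    return ratio / (ratio - 1.0)


def top_rate_general(g: float, eps: float, ratio: float) -> float:
    """Top rate from tau / (1 - tau) = (1 - g) * (ratio - 1) / (eps * ratio)."""
    odds = (1.0 - g) * (ratio - 1.0) / (eps * ratio)
    return odds / (1.0 + odds)


def top_rate_pareto(g: float, eps: float, a: float) -> float:
    """Simplified top rate: (1 - g) / (1 - g + eps * a)."""
    return (1.0 - g) / (1.0 - g + eps * a)


if __name__ == "__main__":
    a = pareto_parameter(3.0)  # US ratio quoted in the slides
    print(a, top_rate_general(0.5, 0.5, 3.0), top_rate_pareto(0.5, 0.5, a))
```

A thicker upper tail (smaller a, larger z^m/z̄) raises the top rate for given ḡ and ε̄.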
US Income Distribution approximated by Pareto Distribution
[Figure: empirical Pareto coefficient (vertical axis, 1 to 2.5) against z* = Adjusted Gross Income in current 2005 $ (horizontal axis, 0 to 1,000,000). Series shown: a = zm/(zm − z*) with zm = E(z | z > z*), and alpha = z* h(z*)/(1 − H(z*)).]
Source: Diamond and Saez (JEL, 2011)
Zero Top Rate with Bounded Skill Distribution
Suppose the top earner makes z^T and the second-highest earner makes z^S
Tax is raised only on the top earner, so z^m = z^T; as z̄ → z^T:
$$\Delta M = N \Delta\tau\, [z^m - \bar z] \to 0 \;<\; |\Delta B| = N \Delta\tau \cdot \bar\varepsilon \cdot \frac{\tau}{1 - \tau}\, z^m$$
Optimal τ = 0 for top earner
Standard Mirrleesian result
But the result applies only to the top earner
No practical relevance
Connection to Revenue Maximizing Tax Rate
Revenue-maximizing top tax rate can be calculated by setting
ḡ = 0
Utilitarian SWF: ḡ = u_c(z^m) → 0 when z̄ → ∞
Rawlsian SWF: ḡ = 0 for any z̄ > min(z)
If ḡ = 0, we obtain:
$$\tau^*_{top} = \tau_{max} = \frac{1}{1 + \bar\varepsilon a}$$
Assuming a = 1.5:
           ḡ = 0    ḡ = 0.5
ε = 0.2    0.77     0.62
ε = 0.5    0.57     0.40
ε = 1      0.40     0.25
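The table entries follow from the simplified formula τ*_top = (1 − ḡ)/(1 − ḡ + ε̄a). A minimal sketch that recomputes them, with hypothetical function names `top_rate` and `rate_table`:

```python
def top_rate(g: float, eps: float, a: float = 1.5) -> float:
    """Top tax rate (1 - g) / (1 - g + eps * a) for Pareto parameter a."""
    return (1.0 - g) / (1.0 - g + eps * a)


def rate_table(g_values=(0.0, 0.5), eps_values=(0.2, 0.5, 1.0), a=1.5):
    """Map each (eps, g) pair to the implied top tax rate."""
    return {(eps, g): top_rate(g, eps, a) for eps in eps_values for g in g_values}


if __name__ == "__main__":
    for (eps, g), tau in rate_table().items():
        print(f"eps={eps:<4} g={g:<4} tau_top={tau:.3f}")
```

The entry for ε = 0.2 and ḡ = 0.5 is exactly 0.625, which the slide reports as 0.62.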
Extra slides
Mathematical Derivation of Ramsey (1927)
Note: the following slides draw heavily on Raj Chetty’s lecture
slides for his Public Economics course at Harvard University,
available at www.rajchetty.com
Ramsey (1927): Individual’s problem
Individual maximizes utility over goods (leisure untaxed):
$$\max_{\{x_1, \dots, x_N, l\}} u(x_1, \dots, x_N, l) \quad \text{subj. to} \quad q_1 x_1 + \dots + q_N x_N = wl + Y$$
Lagrangian:
$$L = u(x_1, \dots, x_N, l) + \alpha\, (wl + Y - q_1 x_1 - \dots - q_N x_N)$$
Solve and obtain indirect utility V (q, Y )
Ramsey (1927): Government’s problem
Then government maximizes welfare of representative
individual:
$$\max V(q, Y)$$
subject to the tax revenue requirement:
$$\sum_{i=1}^N \tau_i x_i(q, Y) \ge E$$
Government’s Lagrangian:
$$L = V(q, Y) + \lambda \left[\sum_{i=1}^N \tau_i x_i(q, Y) - E\right]$$
Ramsey (1927): Government’s problem
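$$L = V(q, Y) + \lambda \left[\sum_{i=1}^N \tau_i x_i(q, Y) - E\right]$$
$$\Rightarrow\; \frac{\partial L}{\partial q_i} = \underbrace{\frac{\partial V}{\partial q_i}}_{\text{Loss of indiv. welfare}} + \lambda \Bigg[\underbrace{x_i}_{\text{Mech. Effect}} + \underbrace{\sum_j \tau_j \frac{\partial x_j}{\partial q_i}}_{\text{Behav. Response}}\Bigg] = 0$$
Using Roy’s Identity, ∂V/∂q_i = −α x_i:
$$(\lambda - \alpha)\, x_i + \lambda \sum_j \tau_j \frac{\partial x_j}{\partial q_i} = 0$$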
Ramsey (1927): Perturbation Argument
Equivalent but more intuitive than original Ramsey analysis
Effect of tax increase (τi to τi + d τi ) on social welfare:
Marginal effect on govt revenue:
$$dR = \underbrace{x_i\, d\tau_i}_{\text{mech. effect}} + \underbrace{\sum_j \tau_j\, dx_j}_{\text{behav. response}}$$
Marginal effect on private surplus:
$$dU = \frac{\partial V}{\partial q_i}\, d\tau_i = -\alpha x_i\, d\tau_i$$
Optimum characterized by balancing the two marginal effects:
dU + λdR = 0
Ramsey (1927): Equilibrium
After some algebra, arrive at the following expression for the
optimal tax rates, τj∗ :
$$\sum_j \tau_j \frac{\partial x_j}{\partial q_i} = -\frac{x_i}{\lambda}\, (\lambda - \alpha)$$
N equations in N unknowns
Notice the relevance of cross-price elasticities
λ = Lagrange multiplier in the govt’s problem
α = Lagrange multiplier in the individual’s problem (mgl utility
of money)
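The algebra linking the perturbation condition to this expression is short. The sketch below uses only dU = −α x_i dτ_i, the expression for dR, and the fact that a change in τ_i moves each demand by dx_j = (∂x_j/∂q_i) dτ_i.

```latex
\begin{align*}
0 &= dU + \lambda\, dR \\
  &= -\alpha x_i\, d\tau_i
     + \lambda \Big[ x_i\, d\tau_i + \sum_j \tau_j \frac{\partial x_j}{\partial q_i}\, d\tau_i \Big] \\
\Rightarrow\quad
0 &= (\lambda - \alpha)\, x_i + \lambda \sum_j \tau_j \frac{\partial x_j}{\partial q_i} \\
\Rightarrow\quad
\sum_j \tau_j \frac{\partial x_j}{\partial q_i} &= -\frac{x_i}{\lambda}\, (\lambda - \alpha)
\end{align*}
```

There is one such condition for each taxed good i, which gives the N equations in the N unknown tax rates.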
Ramsey (1927): Compensated Elasticities
Starting from the perturbation argument formula:
dU + λdR = 0
Rewrite in terms of Hicksian elasticities and use Slutsky
equation:
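$$\frac{\partial x_j}{\partial q_i} = \frac{\partial h_j}{\partial q_i} - x_i \frac{\partial x_j}{\partial Y}$$
Combining the two equations:
$$(\lambda - \alpha)\, x_i + \lambda \sum_j \tau_j \left[\frac{\partial h_j}{\partial q_i} - x_i \frac{\partial x_j}{\partial Y}\right] = 0$$
$$\Rightarrow\; \frac{1}{x_i} \sum_j \tau_j \frac{\partial h_j}{\partial q_i} = -\frac{\theta}{\lambda}, \qquad \theta = \lambda - \alpha - \lambda\, \frac{\partial \left(\sum_j \tau_j x_j\right)}{\partial Y}$$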
Ramsey (1927): Compensated Elasticities
θ is independent of i and measures the value for the govt of
introducing a $1 lump sum tax:
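$$\theta = \lambda - \alpha - \lambda\, \frac{\partial \left(\sum_j \tau_j x_j\right)}{\partial Y}$$
Three effects of introducing a $1 lump sum tax:
1. Direct value of λ for the govt
2. Loss in welfare of α for the individual
3. Behavioural effect → Loss in tax revenue:
$$\frac{\partial \left(\sum_j \tau_j x_j\right)}{\partial Y}$$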
Ramsey (1927): Index of Discouragement
$$\frac{1}{x_i} \sum_j \tau_j \frac{\partial h_j}{\partial q_i} = -\frac{\theta}{\lambda}$$
Suppose revenue requirement E is small, so all taxes are small
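Then tax τ_j on good j reduces consumption of good i (holding utility constant) by approx
$$dh_i = \tau_j \frac{\partial h_i}{\partial q_j}$$
By Slutsky symmetry, ∂h_i/∂q_j = ∂h_j/∂q_i, so the sum in the top equation adds up these changes over all taxes
Numerator of LHS (top equation): total reduction in cons of good i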
Ramsey (1927): Index of Discouragement (cont’d)
Dividing by x_i yields % reduction in consumption of each good i: the “index of discouragement” of the tax system on good i
Ramsey tax formula says the indices of discouragement must
be equal across goods at the optimum
Ramsey (1927): Inverse Elasticity Rule
Introducing elasticities, we can write Ramsey formula as:
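$$\sum_{j=1}^N \frac{\tau_j}{1 + \tau_j}\, \varepsilon^c_{ij} = \frac{\theta}{\lambda}$$
where θ = λ − α − λ ∂(Σ_j τ_j x_j)/∂Y is the value for the govt of introducing a $1 lump sum tax (own-price compensated elasticities ε^c_ii are taken as positive, i.e. in absolute value)
Consider special case where ε^c_ij = 0 for all i ≠ j
Slutsky matrix is diagonal
Obtain classic inverse elasticity rule:
$$\frac{\tau_i}{1 + \tau_i} = \frac{1}{\varepsilon^c_{ii}}\, \frac{\theta}{\lambda}$$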
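To illustrate the rule, the sketch below solves τ_i/(1 + τ_i) = k/ε_ii for τ_i, where k stands for θ/λ. The value k = 0.1 and the two goods with their elasticities are hypothetical numbers chosen for illustration; in the model k is pinned down by the revenue requirement E.

```python
def inverse_elasticity_tax(eps_own: float, k: float) -> float:
    """Solve tau / (1 + tau) = k / eps_own for tau, where k = theta / lambda.

    eps_own is the own-price compensated elasticity, taken as positive.
    """
    share = k / eps_own
    if not 0.0 <= share < 1.0:
        raise ValueError("need 0 <= k / eps_own < 1 for a finite tax rate")
    return share / (1.0 - share)


if __name__ == "__main__":
    k = 0.1  # hypothetical value of theta / lambda
    goods = {"necessity": 0.4, "luxury": 1.2}  # hypothetical elasticities
    for name, eps in goods.items():
        # The less elastic good carries the higher tax rate.
        print(name, round(inverse_elasticity_tax(eps, k), 3))
```

The less elastic good gets the higher rate, which is the source of the regressivity noted in the limitations of the Ramsey approach.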