Conditional and Unconditional Quantile Estimation of
Telecommunications Engel Curves∗
Alan Ker (University of Guelph)
Preliminary Draft – Comments Welcome
Abstract
The standard conditional quantile estimates of telecommunication Engel curves are
compared to the unconditional quantile estimates of the same curves. Conditional
quantile regression was introduced in the seminal paper of Koenker and Bassett
(Econometrica, 1978), and Koenker has been a force in this literature, exploring areas of conditional quantile regression related to inference, asymptotics, time-series,
nonlinearities, and nonparametrics. Empirical investigations using this methodology
have been growing steadily in the economics literature. Taylor and Houthakker
(Springer, 2009) use conditional quantile regression methods extensively throughout
their demand book in response to the prevalence of non-normal errors. A less appealing aspect of conditional quantile regression is that the quantiles are with respect
to the error distribution, that is, the distribution of Y|X; these quantiles are not easily interpreted and thus not
of great economic interest. Conversely, the unconditional or marginal quantiles of Y
are both easily interpreted and of great economic interest. A recent paper by Firpo,
Fortin, and Lemieux (Econometrica, 2009) introduces unconditional quantile regression. This manuscript presents and contrasts estimates of telecommunication Engel
curves using mean regression, conditional quantile regression, and unconditional quantile regression methods. Conditional and unconditional quantiles are compared not
only at specific time periods but across time as well. BLS data from 1996 quarter
1 to 1999 quarter 4 are used, allowing comparisons with Taylor and Houthakker (2009).
Keywords: Telecommunications Engel curves, conditional quantile regression, unconditional quantile regression
JEL Classification: C14, C25
August 2011
Working Paper Series - 11-04
Institute for the Advanced Study of Food and Agricultural Policy
Department of Food, Agricultural and Resource Economics
OAC
University of Guelph
∗ Alan Ker, Professor and Chair, Department of Food, Agricultural and Resource Economics, University of Guelph ([email protected]). The author would like to thank Lester Taylor for supplying
the data. The author would also like to thank the Institute for the Advanced Study of Food and
Agricultural Policy (Department of Food, Agricultural and Resource Economics, OAC, University
of Guelph) for its generous financial support.
1.0 Introduction
In this manuscript we estimate a series of telecommunication Engel curves using
BLS (Bureau of Labor Statistics) survey data. What is unique is that we estimate the
curves at both the conditional and unconditional quantiles (0.1,...,0.9). Conditional
quantile regression was introduced in the seminal paper of Koenker and Bassett (1978)
and Koenker has been a force in this literature exploring areas of inference, asymptotics, time-series, and nonparametrics, all related to conditional quantile regression
(see for example, Bassett and Koenker (1982), Koenker and Park (1994), Koenker and
Machado (1999), Koenker and Xiao (2002), Koenker (2008)). Conditional quantile
regression methods have been applied in many empirical settings and telecommunications demand is no exception. Taylor and Houthakker (2009) use conditional quantile
regression methods extensively in response to the prevalence of non-normal errors.
While the robustness of conditional quantile regression is very appealing, one may
be left wanting to interpret the results with respect to the quantiles of the marginal
distribution of the dependent variable (Y ) rather than the marginal distribution of
the error term (or the conditional distribution of the dependent variable (Y |X)).
Firpo et al. (2009) introduce unconditional quantile regression in which one can
interpret the results with respect to the quantiles of the marginal distribution of the
dependent variable. They substitute the influence function (recentered) at a given
quantile of the marginal distribution for the dependent variable (Y ) in a simple OLS
regression on the explanatory variables (X). This allows them to identify different
marginal effects at different quantiles of the dependent variable rather than the error
term. In our empirical case study, we are able to identify income marginal effects
at different quantiles of the telecommunications expenditure distribution – a very
appealing notion.
The remainder of the manuscript proceeds as follows. In the next section we
briefly describe conditional and unconditional quantile regression. In the third section we discuss the data used in the empirical analysis. The fourth section presents
the findings while the final section summarizes the manuscript.
2.0 Quantile Regression
Conditional quantile regression, introduced with the seminal article of Koenker
and Bassett (1978), was motivated on the following basis: (i) as an alternative to least
squares when the normality assumption does not hold; and (ii) as a complement to least
squares, allowing one to look beyond the mean effects and complete the regression
picture (Koenker (2005)). While conditional quantile regression allows one to recover
the marginal effects at quantiles of the conditional distribution of Y given X, unconditional quantile regression allows one to recover marginal effects at quantiles of the
marginal distribution of Y. The latter are quantiles of F(Y) and the former
quantiles of F(Y|X) = F(ε). Unconditional quantile regression was motivated by the
need to estimate unconditional partial effects (Firpo et al. (2009)). One could argue
that unconditional quantile estimation also completes the regression picture.
2.1 Conditional Quantile Regression
A very brief description of conditional quantile regression is provided here and
follows from Koenker (2005) where the interested reader is directed for a thorough
description and history of the conditional quantile regression.
Assume real-valued random variable Y has the following data generating process
$$y = X\beta + \varepsilon \qquad (1)$$
where X ∈ R^K is a vector of explanatory variables inclusive of the intercept, β ∈
R^K is a vector of unknown parameters, and ε is an unknown independently and
identically distributed error term. The least squares estimator of β is defined as
β_LS = (X^T X)^{-1} X^T y and is found as argmin_β ||y − Xβ||².
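For conditional quantile regression at quantile τ, we define β̂τ as the solution to
$$\hat{\beta}_\tau = \arg\min_{\beta} \sum_{i=1}^{n} \rho_\tau \, |y_i - x_i^T \beta| \qquad (2)$$
where x_i^T is the ith row of X and ρτ = τ if y_i − x_i^T β > 0 and 1 − τ if y_i − x_i^T β ≤ 0, so that positive residuals receive weight τ and negative residuals receive weight 1 − τ. This is generally reformulated
into the following linear programming problem:
$$\min_{\beta, u, v} \left\{ \tau 1_n^T u + (1 - \tau) 1_n^T v \mid X\beta + u - v = y \right\} \qquad (3)$$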
where 1n represents a vector of ones with dimension n and the residual vector is
split into its positive part u and its negative part v, both nonnegative. Interestingly, the solution to the quantile
regression produces K residuals that are exactly equal to zero (barring degeneracy). This does not mean
that only those K realizations are used in the analysis, as all realizations are used in
determining which will have zero residuals. This is analogous to the fact that the
median is equal to a single point but all points determine which point is that median.
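The linear program in equation (3) can be sketched in a few lines. The following is a minimal illustration on simulated data, not the code used for the estimates in this manuscript; the function names, the simulated design, and the choice of SciPy's HiGHS dual simplex solver are all assumptions made for the example.

```python
import numpy as np
from scipy.optimize import linprog


def check_loss(X, y, beta, tau):
    """Objective of (3): tau on positive residuals, (1 - tau) on negative ones."""
    r = y - X @ beta
    return float(np.sum(tau * np.maximum(r, 0.0) + (1.0 - tau) * np.maximum(-r, 0.0)))


def quantile_regression_lp(X, y, tau):
    """Solve min tau*1'u + (1-tau)*1'v  subject to  X b + u - v = y, u >= 0, v >= 0."""
    n, k = X.shape
    # Decision vector: [b (k free entries), u (n entries), v (n entries)].
    cost = np.concatenate([np.zeros(k), tau * np.ones(n), (1.0 - tau) * np.ones(n)])
    A_eq = np.hstack([X, np.eye(n), -np.eye(n)])
    bounds = [(None, None)] * k + [(0.0, None)] * (2 * n)
    res = linprog(cost, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs-ds")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:k]


# Simulated data (hypothetical): intercept plus one regressor, heavy-tailed errors.
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0.0, 10.0, size=n)
X = np.column_stack([np.ones(n), x])              # K = 2
y = 1.0 + 0.5 * x + rng.standard_t(df=3, size=n)

tau = 0.5
beta_qr = quantile_regression_lp(X, y, tau)
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

# Count residuals that are numerically zero at the LP solution.
n_zero = int(np.sum(np.abs(y - X @ beta_qr) < 1e-6))
print(beta_qr, beta_ols, n_zero)
```

Because a simplex solver returns a vertex of the feasible set, the fitted hyperplane passes through at least K observations, which is the zero-residual property described above.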
2.2 Unconditional Quantile Regression
Unconditional quantile regression is introduced in Firpo et al. (2009) and much
of this section is taken from that article. Readers further interested in unconditional
quantile regression are directed there as well as their companion papers (Firpo et al.
(2007a, 2007b, 2009)).
The influence function has been used in robust statistics for some time although
it is not well known in the economics literature. The influence function of a distributional statistic, termed µ(FY), represents the influence of an individual realization on
that distributional statistic. Adding back the distributional statistic µ(FY) yields what
Firpo et al. (2009) call the recentered influence function. Because the expectation
of the influence function is zero, the expectation of the recentered influence function
is µ(FY). Firpo et al. (2009) denote the recentered influence function as a triplet,
R(Y; µ, FY), where Y is the random variable of interest, µ is the statistic of interest,
and FY is the distribution of random variable Y. The distributional statistic we are interested in is the unconditional quantile, denoted qτ for the τth
quantile. Therefore, we have
$$R(Y; q_\tau, F_Y) = q_\tau + \frac{\tau - I_{(-\infty, q_\tau)}(y)}{f_Y(q_\tau)} \qquad (4)$$
where I_(a,b)(x) is an indicator function that takes on a value of 1 if x lies in (a, b)
and 0 otherwise, and fY(qτ) is the marginal density of random variable Y evaluated
at point qτ. Defining Wτ = R(Y; qτ, FY) and undertaking the following unconditional
quantile least squares regression:
$$\tilde{\beta}_\tau = (X^T X)^{-1} X^T W_\tau \qquad (5)$$
yields the marginal effects (with the appropriate transformations) of X on the unconditional quantile τ of Y .
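Equations (4) and (5) can be illustrated with a short sketch. It is only an illustration under stated assumptions: the data are simulated, the function name is hypothetical, and the density at the quantile is taken from the known simulated distribution rather than estimated.

```python
import numpy as np


def rif_quantile(y, tau, q_tau, f_q):
    """Recentered influence function of the tau-th quantile, as in equation (4)."""
    below = (y < q_tau).astype(float)      # indicator I_(-inf, q_tau)(y)
    return q_tau + (tau - below) / f_q


# Simulated location-shift design (hypothetical): Y = 1 + 2 x + e, so Y ~ N(1, 5).
rng = np.random.default_rng(1)
n = 20000
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + rng.normal(size=n)

tau = 0.5
q_tau = np.quantile(y, tau)                # sample quantile
sigma_y = np.sqrt(5.0)
# Assumption for this sketch: use the known N(1, 5) density at q_tau.
f_q = np.exp(-0.5 * ((q_tau - 1.0) / sigma_y) ** 2) / (sigma_y * np.sqrt(2.0 * np.pi))

w = rif_quantile(y, tau, q_tau, f_q)       # W_tau
beta_tilde, *_ = np.linalg.lstsq(X, w, rcond=None)   # equation (5)
print(q_tau, w.mean(), beta_tilde)
```

Two features are worth noting. The transformed variable W takes only two values, one for realizations below the quantile and one for the rest, and its sample mean is essentially the sample quantile. In this particular simulated design a unit shift in x moves every quantile of Y by 2, so the estimated slope should be close to 2.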
To compute Wτ we use the sample quantile for qτ, and an estimate of fY(qτ) is
obtained using standard nonparametric kernel methods with Silverman’s rule of thumb
for the smoothing parameter.1 Firpo et al. (2009) suggest two other methods that may
be used to estimate the marginal effects but find that their results change very little
across the methodologies. The first alternative also uses the recentered influence
function but estimates the regression nonparametrically using a series expansion.
The second alternative estimates a logit model in which the dependent variable
takes a value of 1 if the realization is below the quantile of interest and 0
if it is above. Given their findings, we employ the least squares approach using the
recentered influence function as described above.
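The density estimate at the sample quantile can be sketched as follows. The manuscript does not state which kernel or which variant of Silverman's rule was used, so the Gaussian kernel and the 0.9 min(sd, IQR/1.34) n^(-1/5) form below are assumptions, as are the function names and the simulated data.

```python
import numpy as np


def silverman_bandwidth(y):
    """One common form of Silverman's rule: 0.9 * min(sd, IQR / 1.34) * n^(-1/5)."""
    n = y.size
    sd = np.std(y, ddof=1)
    q75, q25 = np.percentile(y, [75, 25])
    return 0.9 * min(sd, (q75 - q25) / 1.34) * n ** (-0.2)


def gaussian_kde_at(y, point, h):
    """Gaussian kernel density estimate of f_Y evaluated at a single point."""
    z = (point - y) / h
    return float(np.exp(-0.5 * z**2).sum() / (y.size * h * np.sqrt(2.0 * np.pi)))


# Simulated data (hypothetical): standard normal draws, density at the median.
rng = np.random.default_rng(2)
y = rng.normal(size=20000)
tau = 0.5
q_tau = np.quantile(y, tau)
h = silverman_bandwidth(y)
f_hat = gaussian_kde_at(y, q_tau, h)
print(h, f_hat)   # the true density at the median is 1 / sqrt(2 * pi)
```

Since the density enters equation (4) in the denominator, estimation error in the density matters most where the density is small, that is, at quantiles in the tails.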
3.0 Data
The Bureau of Labor Statistics (BLS) Consumer Expenditure Survey data have
been used in a number of empirical studies. The data used here are quarterly, from 1996 Q1
to 1999 Q4. The number of observations per quarter is given in Table 1.
Table 1: Number of Realizations by Year and Quarter
Year    Quarter 1    Quarter 2    Quarter 3    Quarter 4
1996    2346         3594         3540         3658
1997    3709         3743         3789         3808
1998    3795         3730         3678         3795
1999    4587         5004         4855         4846
Unfortunately, these data do not allow one to estimate demand curves because
there are no accompanying price data.2 The data contain income and expenditure
data as well as a number of demographic variables on the survey respondents (see
Table 2 below).
1 Least squares cross validation and likelihood cross validation methods were also used with no
discernible change in the results.
2 For an exception see Taylor and Houthakker (2005), where significant time and energy were
devoted to matching price data from the American Chambers of Commerce Research Association
(ACCRA).
Table 2: Explanatory Variables
Type                 Demographic Variable
Household (hh)       number of income earners in hh
                     age of head of hh
                     size of hh
                     dummy (d) for single hh
                     d for owned home
                     d for hh receiving food stamps
                     d for no children in hh
                     d for children in hh under age 4
                     d for oldest child ∈ [12,17] and at least one child < 12
                     d for oldest > 64
Head of Household    age of head of hh
                     d for head of hh education: grades 1-8
                     d for head of hh education: some high school no diploma
                     d for head of hh education: high school diploma
                     d for head of hh education: some college no diploma
                     d for head of hh education: bachelor’s degree
                     d for head of hh education: post-graduate degree
                     d for head of hh white
                     d for head of hh black
                     d for head of hh male
Region               d for residence in northwest
                     d for residence in midwest
                     d for residence in south
                     d for residence in west
                     d for rural residence
Seasonal             d for quarterly seasons
Figure 1 presents kernel density and normal estimates of the marginal density of
telecommunication expenditures and the log of telecommunication expenditures. It
is clear that neither is normal, as pointed out by Taylor and Houthakker (2009).
[Figure 1: Kernel Density and Normal Maximum Likelihood Estimates of Telecommunications Expenditures. Panel (a) Expenditures, horizontal axis Demand; panel (b) Log(Expenditures), horizontal axis Log(Demand). Each panel plots the Kernel and Normal density estimates.]
4.0 Estimation Results
In this section we present the findings for the various regressions. We estimate
the Engel equation in its double-log form:
$$\log(\text{expenditures}) = \alpha + \beta \log(\text{income}) + \gamma(\text{demographic variables}) + \varepsilon. \qquad (6)$$
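In the double-log form of equation (6), the coefficient β is the income elasticity of expenditure. The following sketch shows the mean regression on simulated data only; the parameter values, the single demographic dummy, and the variable names are hypothetical and are not taken from the BLS data.

```python
import numpy as np

# Simulated data (hypothetical values throughout).
rng = np.random.default_rng(3)
n = 5000
log_income = rng.normal(loc=10.0, scale=0.7, size=n)
owner = rng.integers(0, 2, size=n).astype(float)     # one illustrative demographic dummy
log_expend = 2.0 + 0.4 * log_income + 0.1 * owner + rng.normal(scale=0.5, size=n)

# Least squares fit of the double-log Engel equation.
X = np.column_stack([np.ones(n), log_income, owner])
coef, *_ = np.linalg.lstsq(X, log_expend, rcond=None)
alpha_hat, beta_hat, gamma_hat = coef

# beta_hat estimates the income elasticity: a 1% rise in income is associated
# with roughly a beta_hat% rise in expenditure at the mean.
print(alpha_hat, beta_hat, gamma_hat)
```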
Figure 2 presents the estimated β coefficients for least squares (βLS), conditional
quantile regression (β̂τ), and unconditional quantile regression (β̃τ) for each quarter
of 1996.3 The least squares coefficient is, by construction, constant across the quantiles. The conditional quantile regressions yield very little additional economic information, as the estimates
vary tightly around the least squares estimate. Conversely, the unconditional quantile regressions indicate that the income elasticity decreases as one moves up
the unconditional quantiles of demand. While this result is not surprising (indeed, it is quite intuitive), it can only be deduced from the unconditional
quantile regressions.
3 The figures for 1997, 1998, and 1999 are in the Appendix.
[Figure 2: Least Squares, Conditional Quantile, and Unconditional Quantile Estimated Income Coefficient. Panels: (a) 1996 Quarter 1, (b) 1996 Quarter 2, (c) 1996 Quarter 3, (d) 1996 Quarter 4. Horizontal axis: Quantile; vertical axis: Beta. Series: Unconditional Quantile, Conditional Quantile, Least Squares.]
In Figure 3 we present the estimated β coefficients using all three methodologies
for the 0.2, 0.5 and 0.8 quantiles. Note that the quantiles from the conditional and
unconditional regressions are not directly comparable in the sense that for the conditional quantile regression they represent the quantiles of F(Y|X) = F(ε) whereas
for the unconditional quantile regression they represent the quantiles of F(Y). The
least squares coefficient remains relatively constant over the time period. The conditional quantile regression coefficient remains close to the least squares estimate at all
three quantiles. Conversely, the unconditional quantile regression coefficients exhibit
two interesting properties. First, they are above the least squares and conditional
quantile estimates for the lower quantile 0.2, are in the same neighborhood as the least
squares and conditional quantile estimates for the median quantile 0.5, and are below
the least squares and conditional quantile estimates for the upper quantile 0.8. Second, the unconditional estimates are more volatile for quantiles corresponding to the
tails of the marginal distribution. The first result is not surprising and again illustrates that households with low telecommunications expenditures have a higher income elasticity
than households with high telecommunications expenditures. The second result is an artifact of
higher estimation error in the quantiles representing the tails of the marginal distribution F(Y).
[Figure 3: Least Squares, Conditional Quantile, and Unconditional Quantile Estimated Income Coefficient Across Time. Panels: (a) Quantile 0.2, (b) Quantile 0.5, (c) Quantile 0.8. Horizontal axis: Time; vertical axis: Beta. Series: Unconditional Quantile, Conditional Quantile, Least Squares.]
5.0 Summary
Conditional quantile regression, introduced with the seminal article of Koenker
and Bassett (1978), allows one to recover the marginal effects at quantiles of the
conditional distribution of Y given X. Conversely, unconditional quantile regression
allows one to recover marginal effects at quantiles of the marginal distribution of
Y . In different senses, conditional and unconditional quantile regression complete
the regression picture with least squares. In this manuscript we estimated a series of
telecommunication Engel curves using BLS quarterly survey data from 1996-99. What
is unique is that we estimated the curves at both the conditional and unconditional
quantiles (0.1,...,0.9).
We found that the conditional quantile estimation added very little to completing the regression picture because the estimated income coefficient tended to be in
the same neighborhood as the least squares estimate for all quantiles and, as argued
earlier, it is very difficult to interpret the quantiles of the conditional distribution
F(Y|X) = F(ε). Conversely, the unconditional quantile estimation seems to complete the regression picture, as the quantiles of the marginal distribution F(Y) are
easily interpreted and, not surprisingly, added significantly to the story drawn from
the least squares or conditional quantile estimates. The unconditional quantile estimates illustrate that the income elasticity of telecommunications demand
decreases significantly as expenditures increase.
References
Bassett, G. and R. Koenker (1982): “Tests of Linear Hypotheses and L1 Estimation,” Econometrica, 50, 1577-83.
Firpo, S., N. Fortin, and T. Lemieux (2009): “Unconditional Quantile Regressions,” Econometrica, 77, 953-973.
Firpo, S., N. Fortin, and T. Lemieux (2007a): “Unconditional Quantile Regressions,” Technical Working Paper 339, National Bureau of Economic Research, Cambridge, MA.
Firpo, S., N. Fortin, and T. Lemieux (2007b): “Decomposing Wage Distributions Using Recentered Influence Function Regressions,” Unpublished Manuscript, Department of Economics, University of British Columbia.
Firpo, S., N. Fortin, and T. Lemieux (2009): “Supplement to ’Unconditional Quantile Regressions’,” Econometrica Supplemental Material, 77, http://www.econometricsociety.org/ecta/Supmat/6822_extensions.pdf
Koenker, R. (2005): Quantile Regression, New York, Cambridge University Press.
Koenker, R. (2008): “Censored Quantile Regression Redux,” Journal of Statistical Software, 27, http://www.jstatsoft.org/v27/i06.
Koenker, R. and G. Bassett (1978): “Regression Quantiles,” Econometrica, 46, 33-50.
Koenker, R. and B.J. Park (1994): “An Interior Point Algorithm for Nonlinear Quantile Regression,” Journal of Econometrics, 71(1-2), 265-283.
Koenker, R. and J.A.F. Machado (1999): “Goodness of Fit and Related Inference Processes for Quantile Regression,” Journal of the American Statistical Association, 94, 1296-1310.
Koenker, R. and Z. Xiao (2002): “Inference on the Quantile Regression Process,” Econometrica, 70, 1583-1612.
Appendix
[Figure 3: Least Squares, Conditional Quantile, and Unconditional Quantile Estimated Income Coefficient. Panels: (a) 1997 Quarter 1, (b) 1997 Quarter 2, (c) 1997 Quarter 3, (d) 1997 Quarter 4. Horizontal axis: Quantile; vertical axis: Beta. Series: Unconditional Quantile, Conditional Quantile, Least Squares.]
[Figure 4: Least Squares, Conditional Quantile, and Unconditional Quantile Estimated Income Coefficient. Panels: (a) 1998 Quarter 1, (b) 1998 Quarter 2, (c) 1998 Quarter 3, (d) 1998 Quarter 4. Horizontal axis: Quantile; vertical axis: Beta. Series: Unconditional Quantile, Conditional Quantile, Least Squares.]
[Figure 5: Least Squares, Conditional Quantile, and Unconditional Quantile Estimated Income Coefficient. Panels: (a) 1999 Quarter 1, (b) 1999 Quarter 2, (c) 1999 Quarter 3, (d) 1999 Quarter 4. Horizontal axis: Quantile; vertical axis: Beta. Series: Unconditional Quantile, Conditional Quantile, Least Squares.]