USPS-RT-7

BEFORE THE
POSTAL RATE COMMISSION
WASHINGTON, D.C. 20268-0001

POSTAL RATE AND FEE CHANGES, 2000                    Docket No. R2000-1

REBUTTAL TESTIMONY
OF
WILLIAM H. GREENE
ON BEHALF OF THE
UNITED STATES POSTAL SERVICE
Table of Contents

I. STATEMENT OF QUALIFICATIONS
II. PURPOSE AND SCOPE OF MY TESTIMONY
III. CONCLUSIONS DRAWN FROM MY EXAMINATION OF THE STUDIES
IV. THE VOLUME VARIABILITY MODELS
   A. Dr. Bradley's Cost Equation Model
   B. Dr. Bozzo's Updated Version of the Bradley Model
V. ECONOMETRIC ISSUES RAISED BY THE INTERVENORS
   A. Sample Selection and the Data 'Scrubbing' Procedure
   B. The Issue of Measurement Error
   C. Panel Data Treatments
      1. The Fixed Effects Model vs. a Group Means Model
      2. Measurement Error
      3. A Pooled Time Series Regression
   D. Alternative Estimates Based on a Reverse Regression
   E. Visually Compelling Plots
I. STATEMENT OF QUALIFICATIONS

My name is William H. Greene. I am a professor of Econometrics at the Stern School of Business at New York University and, since 1995, chairman of Stern's Economics Department. I have taught at Stern since 1981. Prior to that I taught Econometrics at Cornell University from 1976 to 1981. I received Masters and Ph.D. degrees from the University of Wisconsin at Madison in 1974 and 1976, respectively. I worked briefly as an economic consultant in the private sector in 1980-1981 at National Economic Research Associates and have also provided consultation to numerous corporations, including American Express Corp, Ortho Biotech, Inc., and The Reader's Digest. I have published numerous works in econometrics, including roughly 40 articles, one of the world's most widely used computer programs for econometric computation, LIMDEP, and, notably for this proceeding, the widely used textbook Econometric Analysis, which several of the witnesses in this and the prior omnibus rate proceeding, including Neels, Smith, Bradley, Higgins, and Bozzo, have all cited in their testimonies.

I do note that this is my first appearance before the Postal Rate Commission. I have no knowledge of the details of Postal Service operations or data systems beyond that contained in the testimonies that I reviewed. The scope and nature of my testimony will be limited to econometric technique and methodology, about which I have written extensively. I will discuss this further in Section II.
II. PURPOSE AND SCOPE OF MY TESTIMONY

I have been asked by the United States Postal Service, the sponsor of Dr. Bozzo's testimony, to comment on the testimonies of Kevin Neels and J. Edward Smith, both of which seek to rebut Dr. Bozzo's testimony and its predecessor by Michael Bradley in the 1997 counterpart to this proceeding. In particular, a number of issues have been raised regarding the econometric techniques used by Drs. Bradley and Bozzo to estimate 'volume variability factors' for labor costs in mail processing. (Volume variability is a measure of the elasticity of mail processing costs with respect to volume.)

I have learned through my reading of the various testimonies that I have reviewed that the Postal Rate Commission has traditionally assumed that this cost elasticity is 1, or 100 percent. So far as I have been able to discern from the work I have read-there is a summary in Dr. Bozzo's testimony[1]-this value is based essentially on judgment, impression, 'common sense,' and intuition. No rigorous statistical procedures were ever used to arrive at this parameter. Drs. Bradley and Bozzo have used quite complex multiple regression methods and a large data base generated within the U.S. Postal Service system to measure this effect, and have found a large amount of evidence that disagrees with the traditional assumption. They found consistent evidence that the volume variability factors for a large number of specific activities in the mail processing chain are considerably less than 100 percent.

[1] USPS-T-15 at 4-13.

Witnesses Neels and Smith have raised a large number of criticisms of the data, methods and models used by Drs. Bradley and Bozzo and, by implication, their results. Primary among these issues are:
• Data quality problems and the issue of measurement error;
• Problems of nonrandom sampling that they suggest arose as a consequence of the methods by which the data were purged of possibly incorrect, misreported, or missing values;
• The issue of functional form relating to the use of certain 'panel data' style models-the 'fixed effects' model in particular;
• Certain other issues concerning the ways in which the regression model was formulated-among them the problem of missing variables.

I intend in my testimony to limit my attention to issues of econometric technique and model building. There is an enormous amount of fine detail in all the testimonies I read about the specifics of and technical aspects of mail processing procedures and costs, work flow, and technology. Many of these details are advanced by Drs. Neels and Smith as severe complications that cast doubt on the econometric results. Although I believe that some of their comments in this regard are superfluous to the questions at hand, I will nonetheless not be addressing any of this material, and offer no testimony as to their relevance to the econometric modeling. Some of my testimony will be somewhat technical. Unfortunately, this is unavoidable. Some of the issues that the intervenors have raised, such as the problem of 'sample selection,' are, themselves, fairly esoteric. My testimony will be related to the following general topic areas:

• The criticisms of the methods by which the data set was 'scrubbed' miss some important points about sampling, random sampling in particular, the nature of model building, and, very importantly, the issue of 'sample selection.'
• The discussions in the Neels and Smith testimonies relating to issues of measurement error rely upon some widely held misconceptions about the topic. Most of their discussion on this issue is incomplete, and some of it is incorrect.
• Much of the discussion of the 'fixed effects' model for panel data, particularly the claim that it is inferior to a pooled regression or a regression based on group means, is incorrect.

My testimony will briefly review the models developed by Dr. Bradley and Dr. Bozzo, and the criticisms of them raised by Drs. Neels and Smith. A more detailed summary appears in Dr. Bozzo's testimony. I will then turn to the specific econometric issues listed above. To summarize my conclusions, I find that while some of the criticisms raised by Drs. Neels and Smith might provide useful guidance for refinement of the data used for estimating models for volume variability, many of the more methodological among their comments are exaggerated and/or misleading. I disagree with the suggestion that virtually all of the flaws suggested by the intervenors would have acted systematically to bias Bradley's and Bozzo's estimates of volume variability downward. On the contrary, from what I have read, I believe that the Bradley and Bozzo studies provide strong evidence that the 100% volume variability assumption should be reconsidered. While I am not prepared to commit to any specific value for any activity, I do believe that the two studies combined provide a strong suggestion that the right results will be substantially less than one.

III. CONCLUSIONS DRAWN FROM MY EXAMINATION OF THE STUDIES

I would not say at this juncture that every econometric or modeling issue that could possibly be addressed by Dr. Bradley or Dr. Bozzo has been addressed. I would definitely conclude that they have provided a substantial amount of evidence that the Commission should take very seriously.
The Commission should have taken a much more favorable view in 1997, and should at this time consider the panel data, fixed effects form of econometric analysis an appropriate platform for continuing work on developing a model for mail processing costs. The aggregate means models and time series regressions advocated by Drs. Smith and Neels discard far more useful information than the data scrubbing operation of which they have been so critical. Dr. Smith is simply wrong that the simple regression of group means on each other is the "least bad" model. Given the data set at hand, the simple regression of group means on each other is not the 'least bad' model; it is the second most bad model. The worst is the grossly aggregated time series regression proposed by Dr. Neels, followed by Smith's site means model, and the best of the lot is the fixed effects model. The arguments advanced by Smith and Neels in favor of their alternatives are based on flawed statistical reasoning, and should be rejected on this basis alone. The same conclusion applies to the visual devices advocated by Dr. Smith. I do not believe that the Commission should accept this kind of visual approximation as a substitute for careful econometric analysis.

The MODS and PIRS data are obviously far from perfect. But, from my vantage point, they appear to be quite good, and in the absence of a well designed and sharply focused data set created specifically for studying volume variability, they are as good as an analyst of mail processing costs could hope for. What is important is for the Commission and other researchers to evaluate these data using appropriate criteria. The criticisms raised in the Neels and Smith testimonies are, in many cases, inappropriate. Likewise, it sometimes happens that intuitively appealing evidence is misleading. For example, the standard deviations of the measurement error cited by the Commission in its Opinion (discussed below), which suggest an alarming amount of measurement error, appear to be much more discouraging than they really are. The intervenors in this case have thrown up an array of criticisms of the data set that raise a standard that could never be met. Apparently, the MODS data were not created for the purpose for which they were used in this proceeding. But that is usually the case with large micro level data sets. Nonetheless, it does seem reasonable to assert that there is useful information in the MODS data for the determination of volume variability. I would suggest that the Commission take the view that researchers should extract from these data what useful information they contain, not go to great lengths to discredit the data, and then discard them and the analysis based on them.

On the other hand, if the Commission desires to pursue the line of research begun in these studies of volume variability, then continued development of micro level data should be undertaken. In that connection, it is a maxim in econometrics that micro level data are always better than aggregates. The reason is almost self-evident. Aggregation almost always discards information contained in micro level data, and never creates new information. On the other hand, if it is genuinely believed that the micro level data contain no useful independent information, then they can be aggregated. This process cannot be reversed. By this construction, I am unable to agree with Drs. Neels and Smith that analysis of the MODS data should be done using site means of the same data set that could be used in disaggregated form.

Finally, what kind of model should be developed? It is clear that it is appropriate to use multiple regression to model the response of labor costs to output-the appropriate definitions of these two variables and how to measure them is an issue to be settled elsewhere. A simple regression of hours (or its logarithm) on output of any sort (or its logarithm) will surely ignore many other factors that should be in the equation, including the site specific differences that Dr. Bozzo has analyzed. I also assume that the various models proposed will be based on the same data that have been used in this set of studies. In this instance, given the availability of micro level data, the fixed effects models proposed by Drs. Bradley and Bozzo are appropriate. At a minimum, they can do no worse, and will generally do better, than the site means ('between groups') models suggested by Drs. Neels and Smith. Aggregation discards information. The more crude the aggregate, the more information is discarded. At the very worst, if the disaggregated data really do not contain information beyond that which is contained in the group means, then a model based on the disaggregated data would simply mimic the model based on aggregated data.

Lastly, there is the question of econometric practice. The worst extreme I see here is Dr. Smith's willingness to rely on gross and misleading, crude two-dimensional scatter plots to defend a specific estimate of a parameter. Between this and the appropriate model lie the pooled regressions suggested by the intervenors, in which they impose restrictions on a regression model, then argue, in direct contradiction to long established results, that these restrictions have improved the estimates. In particular, the suggestion that a pooled regression that imposes the restriction that there are 'no site specific effects' somehow removes a bias inherent in the fixed effects model is simply not true-exactly the opposite is the case. Imposing restrictions can induce biases; relaxing them cannot. At the other end of the scale are Drs. Bradley's and Bozzo's carefully developed econometric models that embody current practice using an elaborate panel data set. The models have been subjected to numerous specification and sensitivity tests, and include features such as dynamic structure, time and site effects, models for autocorrelation, and a flexible functional form for the estimated equation. As I noted earlier, I believe that this is the appropriate framework within which the Postal Service should be analyzing mail processing costs.
IV. THE VOLUME VARIABILITY MODELS

A. Dr. Bradley's Cost Equation Model

Dr. Bradley's model was estimated using a data set that provided for a large number of specific installations at numerous points in time. The three crucial variables were:[2]

HRSit = hours of labor at site i in period t
TPHit = total pieces handled at site i in period t
MANRit = manual ratio, a site specific measure of one aspect of the technology at site i in period t.

The specific equation estimated for 'Direct Activities' (sorting, etc.) includes linear, quadratic, and all cross products of these three variables, time effects contained in a time trend which allows for a discrete change in the time effect at a midpoint in the period of the analysis, one period lagged terms for the logTPH variable and its square, and a site specific dummy variable which allows for the site specific constant, or 'fixed effect.' All told, the equation includes 15 variables plus seasonal dummy variables, plus the site specific constants, so it is quite large. Additional lagged effects are introduced into the model through the use of a correction for autocorrelation in the disturbances. A similar, but slightly more involved, model was specified for the 'Allied Activities.'

The data used for the study contained numerous obvious flaws, and as a consequence, they were 'scrubbed' by a procedure that removed from the sample all observations:[3]

(1) that were not part of a continuous sequence of 39 consecutive observations that were otherwise 'clean;'
(2) for which variables were obviously missing or erroneously coded as zeros;
(3) that were 'outliers,' in that they were in the top or bottom one percent of the distribution of the variable HRS/TPH.

[2] See Docket No. R97-1, USPS-T-14 at 12-22.
[3] Id. at 30-37; see also USPS-LR-H-148.
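To make the structure concrete, the following is a minimal sketch, on simulated data, of a fixed effects, log quadratic regression of this general kind. It is not Dr. Bradley's actual program; the variable names, the data generating values, and the use of Python with statsmodels are all illustrative assumptions. Because the specification is quadratic in logTPH, the elasticity of hours with respect to TPH is recovered as a function of the coefficients rather than read off a single coefficient:

```python
# Minimal sketch of a fixed effects, log quadratic regression of log hours
# on log TPH, with site specific constants (simulated data, not MODS).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_sites, n_periods = 50, 40
site = np.repeat(np.arange(n_sites), n_periods)
ltph = rng.normal(10 + 0.05 * site, 0.3, n_sites * n_periods)
alpha = rng.normal(2.0, 0.5, n_sites)           # site specific 'fixed effects'
lhrs = alpha[site] + 0.8 * ltph - 0.005 * ltph ** 2 \
       + rng.normal(0, 0.1, len(ltph))

df = pd.DataFrame({"lhrs": lhrs, "ltph": ltph, "site": site})
fe = smf.ols("lhrs ~ ltph + I(ltph ** 2) + C(site)", data=df).fit()

# Elasticity of hours with respect to TPH, evaluated at the mean of log TPH:
elasticity = fe.params["ltph"] + 2 * fe.params["I(ltph ** 2)"] * df["ltph"].mean()
print(round(elasticity, 3))                     # roughly 0.69 for these data
```

The equations actually at issue add the cross products, the MANR variable, seasonal dummies, lagged terms, and an autocorrelation correction, but the mechanics of the site specific constants and of the elasticity computation are the same.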
Dr. Bradley subjected this model to numerous specification tests, including tests for whether a fixed or random effects model was appropriate-the latter rarely survives this test-tests for the presence of any site specific effects at all, and a test for autocorrelation.[4] Estimates of the crucial elasticity of hours with respect to TPH for the direct activities ranged from 0.395 to 0.945; none of the estimates exceeded one.[5] The counterparts for the Allied Activities ranged from 0.720 to 0.829.[6] A number of other regression results were presented for other activities, all with similar results. The consistent outcome was that the volume variability varied across operations, rarely approached one, and almost never exceeded it. The equations were subjected to various specification tests, as noted, and estimated using several different methods, for example without the autocorrelation correction. The elasticity estimates were quite robust to the changes in the estimation methods.

[4] Id. at 41-51.
[5] Id. at 54.
[6] Id. at 63.

Dr. Bradley conducted an analysis of the effect of measurement error in the TPH variable as well, using a method suggested in Hsiao's monograph on panel data.[7] As he points out, with panel data one can compute two different, albeit inconsistent, estimators of the slope coefficient and, at the same time, two inconsistent estimators of the measurement error variance.
Solving two equations in two unknowns, it is possible to obtain one consistent estimator of each of these two parameters. Dr. Bradley carried out the analysis in a restricted model, and found that the consistent estimator was quite close to the fixed effects estimator. As Dr. Neels pointed out, Hsiao's method can (and in this case, does) produce a negative variance estimator.[8] I confess some skepticism of this procedure, not over Hsiao's analytical results, which are correct, but over whether this is the best way to approach the analysis. Hsiao's result applies in a very narrow specification, and produces the disturbing variance result in a very ordinary circumstance. This is a small sample issue-Hsiao's results are based on infinite sample results, and the procedure is prone to this finite sample problem. I emphasize, the test is not biased and is not misleading. It is simply one possible test and, I suspect, not the most robust one that could be constructed.

[7] Cheng Hsiao, Analysis of Panel Data, Cambridge University Press 1986, at 63-65.
[8] Docket No. R97-1, TR. 28/15637.
B. Dr. Bozzo's Updated Version of the Bradley Model

Dr. Bozzo's model is similar to Dr. Bradley's. In constructing it, Dr. Bozzo attempted to remedy some of the defects in Dr. Bradley's model that were argued by the intervenors and by the Commission, including the use of the data scrubbing procedure, and the absence of several other variables, including one relating to the capital stock and another relating to wage rates. As before, the model fit was a translog (log quadratic) model with site specific intercepts (fixed effects). The translog model was specified with four lags of the logTPH variable and its square, as opposed to one in the earlier study.[9] The data preparation for Dr. Bozzo's model is considerably more elaborate than Dr. Bradley's. The equation is also considerably more elaborate, involving the following variables:

HRSit = the log of hours
TPHit = the output (volume) variable (enters with four lags)
CAPit = the capital stock variable
DELit = deliveries, to capture network and density effects
WAGEit = the wage variable
TRENDit = trend variable to capture smooth time effects
MANRit = the manual ratio
QTRt = specific quarterly dummy variables.

[9] USPS-T-15 at 117-118. Note that since Bozzo also changed from AP level to quarterly data, his model embodies a lag structure that is effectively 13 times longer than Bradley's.

Dr. Bozzo estimated the model without transforming the data to overall mean deviations, unlike Dr. Bradley. The point is important as, in the current case, all relevant elasticities become lengthy functions of the parameters and the variables. The estimated elasticities obtained are similar to Dr. Bradley's, ranging from 0.522 to 0.964 (USPS-T-15 at 119-120, 126). Since considerable attention has been paid to the effects of different methods of estimation and forms of the equations estimated on the quantitative results, it is worth noting that Dr. Bozzo examined his results for sensitivity to different methods of estimation and computation of the elasticities, and found that the various computations produced very similar results. Id. at 130-131, 140-141, 151-160. See also USPS-LR-I-107.
V. ECONOMETRIC ISSUES RAISED BY THE INTERVENORS

As noted earlier, an extremely long list of problems with the preceding analyses was raised by intervenors Neels and Smith. Many of these related to whether the variables used in the analyses were appropriate or accurate measures of the activity being analyzed, e.g., whether total pieces handled (TPH)[10] was an appropriate volume measure and whether the Postal Service's plant and equipment data really contain useful information about the capital stock. I will not be commenting on these concerns, as I do not have the necessary background or knowledge about the specifics of the Postal Service. However, they did as well raise several issues related to the econometrics.

[10] Total pieces fed (TPF) was used in place of TPH in the automated and mechanized operations. See USPS-T-15 at 50-52.

• Dr. Neels: Most of Dr. Neels's rebuttal focused on the data issues mentioned above. He did make a number of points about the effects of measurement error that he feels persistently bias the estimated elasticities toward zero-that is, toward a result less than one. He was also critical of the screening of the data which produced the estimating sample. I will address this below.

• Dr. Smith: Dr. Smith has raised a daunting litany of criticisms of both the Bradley and Bozzo studies. I will focus my testimony on only a few of these:
(1) He, as does Dr. Neels, criticizes the data scrubbing procedure.
(2) He feels that the analysis is 'short run' in nature, and is therefore inappropriate for the phenomenon being studied.
(3) He feels that observable (with his eyes) evidence contradicts the results of Dr. Bozzo's analysis.
(4) He is critical of the panel data, fixed effects model that Dr. Bozzo used.

A fair amount of Dr. Smith's testimony is centered on issues of underlying microeconomic theory. Some of this is used to criticize the theoretical underpinnings of Dr. Bozzo's study. It is not my intention in this testimony to address issues of the underlying theory of any of this analysis; I come to this proceeding as an econometrician. However, I am familiar with the economics and econometrics of the estimation of cost and production functions. My doctoral dissertation and a subsequent paper published in the Journal of Political Economy with Laurits Christensen have long been regarded as pioneering studies (they were among the first) of the marriage of theory and empirical estimation of cost and factor demand equations-they are, for example, a standard application presented in microeconomics textbooks. While there are valid points in Dr. Smith's discussion of the theory behind an appropriate cost function, there are also some noteworthy errors. For example, Dr. Smith states that "Dr. Bozzo's treatment of homotheticity appears to lead to incorrect conclusions." Tr. 27/13196. He then states:

   In his testimony, Dr. Bozzo asserts that "... capital and labor variabilities will be identical in equilibrium under the assumption that cost-pool-level production (or cost) functions are 'homothetic' ..." Homotheticity implies that changing the level of output of the operation will not alter relative factor demands such as the capital/labor ratio, in equilibrium (and other things equal). However, the Postal Service testimony is replete with examples of the implementation of major investments designed to reduce costs. ... The focus is on the elimination of major labor costs via capital investment to achieve an overall reduction of total costs. Accordingly, the application of a homotheticity assumption appears to be an inappropriate assumption. (Id.)

Nowhere does the theory state that the firm in equilibrium will never adjust its capital labor ratio in response to changes in relative prices. Even if the technology is homothetic, the firm will respond to a change in relative prices by substituting away from the factor that is becoming more expensive, unless it is unable to. This has nothing to do with whether the production function is homothetic or not. It is a question of factor substitution, and I do not believe that either Dr. Bozzo or Dr. Smith argued that the Postal Service operates under conditions of fixed coefficients, in which it would never substitute capital for labor in the face of increasing wage rates. The wages that appear in the labor demand functions estimated by Dr. Bozzo allow for adjustment in response to changes in wages over time, and are not consistent with a fixed coefficients assumption. Dr. Smith seems, as well, to suggest that the technology of production in the Postal Service is nonhomothetic, which it may well be. But no empirical evidence for this has been presented, and the mere fact that the Postal Service has invested in labor saving capital does not say anything on the subject one way or the other.
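The theoretical point can be stated compactly. The following short derivation is standard cost function theory, summarized here by me rather than taken from any testimony in this record: under homotheticity the cost function factors into an output term and a price term, and Shephard's lemma then shows that relative factor demands are independent of output yet still respond to relative prices.

```latex
% Homotheticity implies the cost function factors as C(q, \mathbf{w}) = h(q)\, c(\mathbf{w}).
% Shephard's lemma gives factor demands x_i = h(q)\, \partial c(\mathbf{w}) / \partial w_i, so
\[
\frac{x_K}{x_L}
  = \frac{\partial c(\mathbf{w}) / \partial w_K}{\partial c(\mathbf{w}) / \partial w_L},
\]
% a ratio that does not involve q, but does change when w_K / w_L changes.
```

Labor saving investment in response to rising relative wages is therefore fully consistent with homotheticity; what homotheticity rules out is a change in the factor ratio driven by the level of output alone.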
A. Sample Selection and the Data 'Scrubbing' Procedure

As noted, Dr. Bradley subjected the MODS data to a screening process denoted 'scrubbing' that was intended to remove observations that were obviously erroneous and unrepresentative of the underlying relationship he was attempting to uncover. This data scrubbing-systematic removal of observations from the sample-is the subject of a considerable amount of discussion in this proceeding. There are two issues that are brought forth by such a procedure. The first is biases. Under certain circumstances, selection of observations based on specific criteria (as opposed to randomly) can induce biases in the estimates obtained with the resulting sample. The second concerns the issue of efficient use of sample data-the problem of 'throwing away information.' In point of fact, efficiency has not been an issue in this proceeding. However, at some points, comments that are related to this issue have nonetheless been made, evidently by the intervenors to cast doubt on the Bradley and Bozzo studies. This section will discuss these issues.

To review, Dr. Bradley's screening procedure involved the following steps (a code sketch follows the list):

(1) He removed observations with obviously missing values, zeros coded for certain activities known to be taking place at the facilities observed, and observations for which reported output was below a specified threshold.
(2) He imposed a continuity requirement that the remaining data set for a site contain at least 39 useable consecutive observations.
(3) He removed the top and bottom 1% of observations based on productivity-the ratio of pieces handled to labor hours.

The productivity screen could disrupt the 'continuity' of the data series for some sites, so his data scrub was iterative in that after step (3) it was necessary to revisit step (2).
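The following is a minimal, illustrative sketch of this style of screen. It is not Dr. Bradley's actual program, and the column names (tph, hrs) are assumptions made for the example; it implements the logic of steps (1) and (3). Step (2) and the iteration, which operate on runs of consecutive periods within each site, are omitted for brevity:

```python
# Illustrative data scrub: drop missing/zero records, then trim the top
# and bottom 1% of the productivity (TPH per hour) distribution.
import pandas as pd

def scrub(df: pd.DataFrame, pct: float = 0.01,
          min_output: float = 1.0) -> pd.DataFrame:
    # Step (1): missing values, zeros, and output below a threshold.
    df = df.dropna(subset=["tph", "hrs"])
    df = df[(df["hrs"] > 0) & (df["tph"] >= min_output)]
    # Step (3): symmetric 1% trim on productivity.
    prod = df["tph"] / df["hrs"]
    lo, hi = prod.quantile([pct, 1.0 - pct])
    return df[(prod >= lo) & (prod <= hi)]
```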
Among the issues raised by the intervenors was that this screening process removed an extraordinary amount of data from the sample.[11] The response to this point has been made in passing by Dr. Bozzo, but it bears repeating.[12] The samples involved in these analyses are large, even after the data scrub. However, irrespective of the size of the samples, if we are agreed at the outset that the underlying model that we seek to discover applies to all the data points, then absent the problem of nonrandom selection discussed in the next paragraph, the amount of data discarded has no bearing on whether the results obtained with the remainder are biased or not. Under that assumption, Dr. Bradley could have simply randomly thrown away three quarters of the observations, and safely based his results on the remaining quarter. Certainly, intuition would correctly suggest that this waste of information would be costly. But the cost is that of having a smaller sample, which leads to less precise estimates than one might otherwise obtain-i.e., larger standard errors. It has no relation at all to whether or not those estimates are systematically biased in one direction or another. The issue of how many data were discarded is a red herring.

[11] E.g., Docket No. R97-1, TR. 28/15609-15619, 15632-15633, 15853. In the present docket, see Tr. 27/13163, 13172 and TR. 27/12796-12800.
[12] USPS-T-15 at 20-22, 95-102.
There is a substantive issue concerning how the data were removed from the sample. The overriding issue is whether the criteria used to discard data points were themselves related to the quantitative measure being studied, in this case, the log of the hours variable. This immediately raises a consideration that does not appear to have been noted by the intervenors or by Drs. Bradley or Bozzo. In particular, the missing or questionable values in the data set upon which the scrubs were based were the output variable, an independent variable, and the hours variable, the dependent variable. In the former case, once again, removal of data from the sample need not impart any particular bias to the results. Removal of observations from the sample because the output variable is missing or miscoded simply makes the sample smaller. Once again, the underlying relationship still applies to, and is discernible from, the observations which remain. Discarding observations based on values of the output variable is similar in its impact to throwing away observations randomly. On one hand, it may amount simply to wasting possibly useful information. On the other, if the output variable is erroneous while the hours variable is correctly coded, then my rule that the model must apply to all the data points would not hold, and the observation should be discarded. For an obvious example, positive output coded with zero hours makes no sense.

That leaves the missing or badly coded data on the dependent variable, hours. Bradley and Bozzo note a few cases.[13] Zero values recorded within a sequence of positive values are obviously erroneous. These once again violate the assumption that the model applies to all data in the sample, and they should be discarded.

[13] E.g., USPS-T-15 at 109-110. See also Docket No. R97-1, USPS-T-14 at 30.
Bradley identifies another case, that of a 'ramping up' period, in which the hours data would be unrepresentative.[14] As I noted in my introduction, I am not able to comment on specific technical aspects of the production process in mail handling. As such, I leave it to the others involved in this discussion to settle whether this is an appropriate omission. My own reading of the testimony suggests to me that it is.

The final case, and the one that does merit some attention, is the trimming operation. Dr. Bradley eliminated the extreme values of hours per piece handled, productivity, from his sample, reasoning that these were unrepresentative and should be treated as erroneous data.[15] This is a specific sample selection rule that could, in principle, color the results obtained with the remaining sample. Dr. Bradley removed the top and bottom 1% of the distribution with this rule. Dr. Bozzo used a more detailed screen of this sort.[16] This productivity screen has called forth a criticism about "sample selection." Dr. Bozzo has commented on the issue, but I believe there is a consideration that should be added. First, sample selection has become something of a bugaboo in econometric analysis, so we should be precise in our use of the term. What the productivity screen could potentially induce is a truncation bias. The distinction is important in this context because not very much is known about sample selection bias-except that it is bad-but a fair amount is known about truncation, and some of what is known has direct bearing on this case.

Dr. Bradley's productivity scrub of the data amounts to a trimming operation. Although the term 'selection bias' has been used in this context, the proper term is 'truncation.' Extracting data based on values of the dependent variable does have the potential to do some mischief. The pure theory of the issue (discussed at various points in Chapter 20 of my text) does suggest that trimming the tails of the distribution would bias the least squares regression estimator toward zero. There are two crucial variables here, the asymmetry of the trim and the amount of the distribution that remains after the culling. The lesser the former and the greater the latter, the less 'damage' is done. In this regard, Dr. Bradley's productivity scrub scores well on both counts, in that he removed a fixed and very small percentage-one percent-from each tail. Dr. Bozzo's scrub was more complicated, in that he did not symmetrically cull observations from the tails of the productivity distribution as per Bradley, but rather used cutoffs based on a priori information on maximum and minimum TPH per hour. It is impossible to tell what if any truncation bias would result from this. But, in any event, looking at his Table 3 (USPS-T-15 at 107), we see that, with two exceptions, the numbers of observations removed from the sample by the productivity scrub are generally so small that it would be hard to argue that the truncation effect would be substantial. His Appendix A (id. at 140) is suggestive. By foregoing the productivity screen and using "All Usable Observations," he obtains largely similar results. What I find surprising, and encouraging, about these results is that the theory suggests the estimates should rise, not fall, when the suspect observations are put back in the sample. In fact, most of the estimates fall, some substantially. Dr. Bozzo's type of screen does not conform to the assumptions in my text, so I don't see this as a contradiction. I do conclude that concerns about attenuation due to truncation of the data set are probably misplaced.

[14] Docket No. R97-1, USPS-T-15 at 30.
[15] Id. at 32.
[16] USPS-T-15 at 101-102, 110-112.
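How much attenuation can a small, symmetric trim of this kind cause? A rough simulation, on made up data that are not from this record, suggests very little: trimming 1% from each tail of the ratio of the two (logged) variables leaves the estimated slope essentially unchanged.

```python
# Effect of a symmetric 1% tail trim on a least squares slope (simulated).
import numpy as np

rng = np.random.default_rng(3)
n, beta = 50_000, 0.8
x = rng.normal(10.0, 0.5, n)                    # stand-in for log TPH
y = 1.0 + beta * x + rng.normal(0, 0.3, n)      # stand-in for log HRS

ratio = y - x                                   # log(HRS/TPH), the trim variable
lo, hi = np.quantile(ratio, [0.01, 0.99])
keep = (ratio >= lo) & (ratio <= hi)

b_full = np.cov(x, y)[0, 1] / np.var(x)
b_trim = np.cov(x[keep], y[keep])[0, 1] / np.var(x[keep])
print(b_full, b_trim)                           # nearly identical slopes
```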
This leaves an important consideration, which entered both the Bradley and Bozzo discussions, the data continuity issue. Dr. Bradley imposed a 39 contiguous observation threshold on his sample.[17] Since he was fitting models with autocorrelation, this was primarily a practical consideration. Dr. Bozzo used tools (the econometric software package TSP) in which time series with gaps are permissible, so the continuity requirement becomes a nonissue. But, in either case, it would be a question of sample size, not systematic coloring of the sample.

I am reluctant to generalize from narrow results to sweeping conclusions (though, in fact, both Dr. Neels and Dr. Smith have done so, using results taken from my book). But I do believe, based on the considerations discussed above, that the attention paid to the criticisms raised by Neels and Smith concerning the data scrubbing procedures has been exaggerated. Data that contain recording errors and other obvious flaws must be cleaned before being used. The samples used were large to begin with, and remained so after the trimming operations. By and large, the trimming operations were innocent. To the extent they were not innocent, the received theory suggests that the problems created are likely to be very small.

The foregoing is not meant to be glib. Data cleaning of this sort must be done carefully, particularly when the end result of the statistical process will be an input into a process as important as setting postal rates. Moreover, I do not dispute the possibility that the data scrubbing procedures used by Dr. Bradley were less than ideal, perhaps even less perfect than they could have been had the cleaning been done differently at the time. Dr. Neels has raised some valid criticisms of the procedures; his observation that "unusual observations ... may also provide the clearest possible picture of how processing costs vary with volume" is well taken.[18]

[17] Docket No. R97-1, USPS-T-14 at 31.
[18] See Docket No. R97-1, TR. 28/15613.
In his update of Dr. Bradley's model, Dr. Bozzo backed away from some of Dr. Bradley's procedures. But in its Opinion and Recommended Decision from Docket No. R97-1 (PRC Op., R97-1, Volume 2, Appendix F), in the discussion of the cost models, the Commission devoted 11 of 45 pages (pp. 24-34) to this issue, and the conclusions it reached were quite dire. I believe that while many of the issues raised were appropriate, the conclusions were unduly pessimistic. After reviewing the procedures, the Commission stated "Witness Bradley's productivity scrub is exactly the kind of data elimination that econometricians try to avoid. Since the scrub eliminates values that are accurate as well as those that are erroneous, it leaves a sample that cripples the econometrics." Id. at 26-27. This is not true. Notwithstanding the truncation issue I raised above, discarding the extreme, though still valid, observations will indeed reduce the quality of the sample; it will do so by producing a model that is less precise (in terms of statistical measures such as standard errors) than it might be otherwise. But "cripples" overstates the case. The screen left more than adequate variation in the sample to allow econometric estimation of the model. Discarding anywhere from a quarter to a half of a sample might seem extreme, but it must be recalled that the sample that remained contained thousands of observations, not dozens, and the analysis attempted to estimate only a relative handful of coefficients. Faced with a need either to use obviously erroneous data or to discard with those data some observations that might have improved his estimates, I feel that Bradley took the right course. In order to argue that this data scrubbing "crippled the econometrics," one would have to argue that all or nearly all the data were bad, not just some of them.
The Commission makes one final argument about the data scrubbing process, that the process did not truly purge the sample of erroneous data. Id. at 33-34. This may well be true, but it is a side issue-the screen was not intended for this purpose. They cite certain values derived by Dr. Bradley to illustrate the extent of measurement error remaining in the data. Two aspects of this observation should be noted. The first is already made above. The screen was intended to provide complete and appropriate data, not data free of measurement error. Whether or not TPH is an appropriate measure of the output, or whether errors are introduced by the conversion of some other measure to the TPH, are valid concerns, but they are separate issues from the screening of the data discussed here. The second point concerns two numerical estimates of the extent of the measurement error that are given. These measures are interesting, but are prone to misinterpretation, as I discuss in the next section.
B. The Issue of Measurement Error

A large amount of the criticism leveled at the Bradley and Bozzo studies concerns the issue of measurement error. Specifically, Dr. Neels argues that the output measure used, pieces handled in Dr. Bradley's case and pieces "fed" in Dr. Bozzo's case, do not correspond to the true measure of output that should enter the calculation of volume variability.[19] He concludes that the output variable which appears on the right hand sides of both regression models is measured with error. From this, he concludes:

(1) "It is a well established econometric principle that measurement error in an independent variable causes downward bias in coefficient estimates." (Docket No. R97-1, Tr. 28/15604. He goes on to quote from page 437 of the third edition of my text.)

(2) "Measurement error in an explanatory variable of a linear regression model renders the estimator inconsistent and frequently biases coefficient estimates towards zero." (Tr. 27/12800. In this instance, he does not invoke my text.)

[19] Tr. 27/12792-12793, 12802 et seq. See also Docket No. R97-1, Tr. 28/15598-600.
estimates towards zero.”
The statements
In this instance,
above are based on a widely cited, fairly simple result
4
from econometric
5
parameter in a regression
6
(Tr. 27/12800.
theory.
Suppose that the analyst wishes to estimate the slope
model:
y = CY + px* + E
7
where x’ is the desired independent
8
denote the observed
9
We further assume that x deviates from x* by a random measurement
independent
variable, volume in this instance.
variable,
pieces handled, however
Let x
measured.
error,
10
denoted u, so that x = x* + u. In order to obtain the results that form the
11
backbone of Dr. Neels’s criticism, it must now be assumed that (a) the
12
measurement
13
strictly uncorrelated
14
random samples-(c)
15
respectively.
16
estimator in a least squares regression of y on x (the observable
17
estimate consistently,
ia
error and the true variable are uncorrelated,
across time and with other observations-i.e.,
we are using
the variances of u and x* are denoted 8’ and h2,
With these in place, we obtain the fundamental
y = p x 1 /(I
(b) all variables are
result that the slope
data) will
not p, but
+ e%*).
19
Two important points to note are, first, that if there.is no measurement
20
8 is zero and least squares does what it should (it estimates
21
when 6 is not zero, least squares estimates
22
This is the source of Neels’s result stated above.
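This result is easy to verify numerically. In the following sketch (simulated data, purely illustrative), the true slope is β = 0.8 with λ² = 1 and θ² = 0.25, so the least squares slope should converge to 0.8/1.25 = 0.64:

```python
# Attenuation bias from measurement error in a regressor (simulated).
import numpy as np

rng = np.random.default_rng(1)
n, beta = 100_000, 0.8
lam2, theta2 = 1.0, 0.25                    # var(x*) and var(u)
x_star = rng.normal(0.0, np.sqrt(lam2), n)
x = x_star + rng.normal(0.0, np.sqrt(theta2), n)   # x = x* + u
y = 1.0 + beta * x_star + rng.normal(0.0, 0.1, n)

b = np.cov(x, y)[0, 1] / np.var(x)          # least squares slope of y on x
print(b, beta / (1.0 + theta2 / lam2))      # both approximately 0.64
```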
There are quite a few misconceptions about measurement error in the discussions on the subject that I have seen in this case.

(1) The suggestion that measurement error biases all coefficients downward is generally not correct. The preceding statement is true under the circumstances assumed. However, none of the models discussed in the present docket or the preceding one involve a simple regression of a dependent variable on a single independent variable measured with additive error. In a multiple regression in which one variable is measured with error in this fashion, the coefficient on the badly measured variable is indeed biased downward, though not by the same amount as in the simple regression case. Also, other coefficients in the regression are affected as well, in unknown directions. There is one easy case to analyze. In the preceding example, with measurement error, the constant term is biased upward, not downward. The effect of the measurement error is to tilt the regression line, not to push it down. This observation is important in this case because all models are multiple regression models, not simple ones. (This result appears in my text four pages after the familiar one cited by Neels.)

(2) Whether or not the bias in the coefficients carries through to biases in functions of those coefficients, such as the volume variability factors, is unknown. Any function of the coefficients in a multiple regression in which a variable is badly measured is a mixture of coefficients, some of which may be biased downward and others of which might be biased upward. The end result is not knowable.

(3) In time series data with autocorrelation in the variables, the effect of the measurement error will be mitigated if the underlying variables are correlated across time and the measurement errors are not. This has particular relevance here because lagged values of the output variable appeared in the model, through the estimation of the autocorrelation model.
(4) In a model in which more than one variable is measured with error, essentially all bets are off. The familiar attenuation result can no longer be shown. The directions of biases, if any, are unknown. Since the models fit by Drs. Bradley and Bozzo are translog, quadratic in the logs of the output variable, this result applies here. In addition, note that the square of the erroneously measured variable appears in the models estimated by Drs. Bradley and Bozzo, so the assumption of additive error which enabled the derivation of the multiple regression case in my text is also lost.

(5) If the original data were measured with additive error, surely the logs of them are not. This means that the blanket statements made by Neels cited above are incorrect. The results would obtain if the logs were measured with additive error, which would be the case if the original data were measured with multiplicative error. Thus, the analytic results above have to be qualified, in ways that are not obvious.

Lost in this discussion is an assessment of the likely magnitude of the quantitative impact of measurement error. Without a large amount of very high quality data, we cannot say much with any precision on this subject. We can form some impressions, though. First, the familiar result above can be written in the form

γ = βρ²

where ρ is the correlation between the true variable and the one measured with error. As noted, I am not able to make a judgment on this sort of calculation. I do note that the R²s in the regressions reported by the various authors are exceedingly high, sometimes above 0.99. Another effect of measurement error is to bias the fit of the model downward. Given values this high, I suspect that measurement error is not a major factor here. There is another way to approach this. Suppose the measurement error were multiplicative. It is possible to show that in this instance, the result becomes a bit cleaner:

γ = β/(1 + θ²).
Now, what value do we use for the measurement error variance? Suppose that the pieces handled variable varied in either direction by as much as 20 percent from its true value. This would imply a relative standard deviation of the measurement error of about 0.1, or a relative measurement error variance of about θ² = 0.01. This is trivial. While a 20 percent error rate in the reporting of piece handlings seems fairly large, it implies only a 1% bias in the estimated coefficient, since with these values, γ = 0.99β.

All of these results are narrow theoretical conclusions based on a hypothetical situation. But I do believe that they have relevance here. The overriding result, which will fall out of any analysis, is that the damage done by measurement error will be a function of the 'reliability ratio':

reliability ratio = variance of true variable / variance of measured variable.

This, in turn, is a function of the correlation between the true and the measured variables. In cross sections, in which researchers attempt to measure such things as education, the reliability of self reported data can be extremely low. In this setting, by contrast, we are considering very stable flow variables which evolve reasonably smoothly through time. I would strongly predict that the reliability of output data in this setting is exceedingly high. Consequently, I would argue that criticisms of the models based on measurement error, while certainly to be taken seriously, are greatly exaggerated. Moreover, isolated examples in which an observed flow rate of pieces handled differs noticeably from some known true value are not really informative. What matters is the correlation between the observed measure and what we are trying to measure. Even in the face of a few egregious reporting errors, this seems likely to be very high for these data sets.

Interestingly, there are a couple of estimates in the Commission's Docket No. R97-1 Opinion. They state, citing a table in Dr. Bradley's rebuttal testimony (which I have not seen):

   The standard deviations for total piece handlings (TPH) derived from the variances in Table 3 are 0.268 for manual letters and 0.297 for manual flats. The corresponding standard deviations for the measurement error are 0.123 for manual letters and 0.068 for flats. These results do not support the conclusion reached by witness Bradley that large and material measurement errors are absent from the piece handling data for these activities. ... In the Commission's opinion these results are inconclusive but tend to support exactly the opposite finding, that large measurement errors remain in the sample after witness Bradley's scrubs. (PRC Op., Docket No. R97-1, Volume 2, Appendix F at 34.)

This seems true. However, one must be careful. While standard deviations are in natural units, the crucial variables for our analysis are variances (squared standard deviations). For the values cited by the Commission, the reliability ratios are 0.268²/(0.268² + 0.123²) = 0.826 for manual letters and 0.297²/(0.297² + 0.068²) = 0.950 for manual flats. Whether these are large or small is debatable, but the impression one gets from the reliability ratio is certainly different from the raw data.
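The arithmetic is easy to reproduce; here is a short check using the values quoted above:

```python
# Reliability ratios implied by the Commission's standard deviations
# (PRC Op., Docket No. R97-1, Volume 2, Appendix F at 34).
for name, sd_true, sd_err in [("manual letters", 0.268, 0.123),
                              ("manual flats", 0.297, 0.068)]:
    ratio = sd_true ** 2 / (sd_true ** 2 + sd_err ** 2)
    print(name, round(ratio, 3))    # 0.826 and 0.95
```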
C. Panel Data Treatments

1. The Fixed Effects Model vs. a Group Means Model

For the purpose of this discussion, I will focus on Dr. Bozzo's model and the criticisms raised by Drs. Neels and Smith in this proceeding. Some of the discussion would apply to the earlier results, but it seems more useful to concentrate on the current efforts. Dr. Bozzo estimated a labor demand equation of the form

y_it = α_i + β′x_it + ε_it

where α_i is a site specific constant and the remaining terms are the components of a conventional regression (though one that involves linear, quadratic, and cross terms in logs of the independent variables, time effects, lags of the output variable, and autocorrelation, which make it quite complicated). A central issue concerns the use of the fixed effects regression model.

Dr. Smith takes issue with the fixed effects approach altogether. TR. 27/13163-65, 13189-90, 13207-214. He argues, through the devices of a graphical and logical demonstration, that the fixed effects model is inappropriate, and that the appropriate model is a regression on means that does not have fixed site effects. To focus ideas, I use a caricature of his main figure, his Figure 4 (TR. 27/13210). I assume he would agree with this approximation to his depiction.
[Figure: a caricature of Dr. Smith's Figure 4 - three boxes of site data, each with its own flatter fitted line, and a steeper heavy line passing through the centers of the boxes.]

The three boxes in the figure represent the observed data for three sites. The line of best fit that passes through each box is that produced by a fixed effects regression. But, so Smith's argument goes, each of these is a short run regression, and the long run regression is the one that passes through the center of the data, which is the heavy line in the figure, and which has a slope much greater than the individual lines. The logic behind the demonstration is that the data inside the box embody a short run relationship that the site obeys, but in the long run the site obeys the steeper sloped long run relationship. Missing from this demonstration is just when the short run becomes the long run. At some point, so the argument goes, the site in the lower box is transformed to the one in the middle box, as it is then that it climbs up this curve, and obeys the long run relationship. The problem with this discussion is that within each box (at least figuratively-in actuality within the MODS data) the sites' operations are observed for several years. What Smith characterizes as the "long run" regression relationship certainly should manifest itself at some point.

Dr. Smith claims that the fixed effects model assumes that the capital stock is constant within a site for the full period of the observations, but this is not true.[20] Dr. Bozzo's model contains both a measure of the site's capital stock in each period (imperfect though it may be, it does not appear to be devoid of information) and a time trend. There is evolution of the technology built into the model.

[20] TR. 27/13190-92.
Now, let's consider this argument statistically. Dr. Smith argues that the appropriate model is the group means regression. Let's suppose that it is. We'll make the assumption that the site means regression that Dr. Smith advocates is correct. That means that the linear regression model using the site means,

ȳ_i = α + βx̄_i + ε̄_i,

is appropriate. Suppose it is. Then it must be true that the disaggregated data obey the same regression:

y_it = α + βx_it + ε_it.

Note the common constant term. It is there of necessity, because if this were not the case, then the group means regression suggested cannot be right. The only way there can be a single constant term in the group means regression is if there is a single constant term in the disaggregated data. Surely the data would have something to say about this. If the group means regression were appropriate, then when the fixed effects regression is fit, the site specific constants would all be the same, at least statistically so. But this is decidedly not the case. All the tests of this hypothesis decisively reject the hypothesis of no site effects.[21] Something is wrong here. The upshot is that it must be the group means regression, which inappropriately restricts the regression model, that is biased-not the fixed effects model.

[21] See USPS-T-15 at 123; see also Docket No. R97-1, USPS-T-14 at 41-43, Tr. 33/18021-22.

There is another point of view here. If Dr. Smith is correct, then surely his group means regression could be fit using the disaggregated data. Disaggregating the data always relaxes restrictions, in this case, the restriction of equal constant terms for each site. It is a fundamental principle of statistics that when one relaxes a restriction, the very worst one can do is what one did before. Here, what this means is that if we don't assume the constants are the same, and they really are, then the regression on the disaggregated data should do no worse than the group means regression, and the site specific constants should resemble each other. Otherwise, the restrictions will appear to be incorrect. In fact, the MODS data are speaking loudly and clearly here. Statistically, Dr. Smith's argument in favor of the group means regression is not correct. Logically, it is weak as well, but whether that is the case or not, his graphical device cannot be used to support it, and his statistical interpretation is incorrect.
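The statistical point is easy to demonstrate by simulation. In the sketch below (simulated data; whether and how strongly the site constants are correlated with the regressor in the MODS data is an empirical question), the group means regression badly overstates the slope, while the within, or fixed effects, estimator recovers it:

```python
# Group means vs. fixed effects when site intercepts correlate with x.
import numpy as np

rng = np.random.default_rng(2)
n_sites, T, beta = 200, 40, 0.7
alpha = rng.normal(0.0, 1.0, n_sites)                    # site effects
x = alpha[:, None] + rng.normal(0, 0.3, (n_sites, T))    # x tied to alpha
y = alpha[:, None] + beta * x + rng.normal(0, 0.2, (n_sites, T))

# Group (site) means regression, as Dr. Smith proposes:
xm, ym = x.mean(axis=1), y.mean(axis=1)
b_means = np.cov(xm, ym)[0, 1] / np.var(xm)

# Fixed effects (within) estimator: demean within each site.
xw, yw = x - xm[:, None], y - ym[:, None]
b_fe = (xw * yw).sum() / (xw ** 2).sum()

print(b_means, b_fe)   # roughly 1.7 (biased upward) vs. 0.7 (the true slope)
```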
I would like to make one additional point at this juncture. The term "between" has been used at several points in the discussion, and it has been argued that the "between" regression (using group means) is more appropriate than the fixed effects model. The preceding addresses the site means issue. But it should be noted that there is an important distinction between the group means regression and the "between groups" regression. The fixed effects model is the "within groups" regression. The "between groups" regression is a weighted regression of the deviation of site means of the dependent variable from the overall mean on the same transformation of the independent variables, without a constant term, and weighted by the number of observations made for each site. It is easy to show (it is done on page 619 in my aforementioned text, for example) that the same regression model applies to this data set as to the original data set. But this is not the group means regression that Dr. Smith has suggested. Dr. Smith's group means regression involves a simple regression of site means on site means, with a single constant term. This regression is only appropriate if the original model with no site specific effects is correct. Otherwise, the between groups estimator and the within groups estimator both estimate the same parameters, while Dr. Smith's regression produces results that are biased.
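To make the computational distinction concrete, here is a small sketch with simulated, unbalanced data (the site counts, means, noise, and seed are invented for illustration). The "between groups" slope uses weighted deviations from the overall mean with no constant term; the simple means regression ignores the weights and fits a constant, and with unequal numbers of observations per site the two slopes differ:

    # Illustrative simulation only; Ti, the site means, and the noise are invented.
    import numpy as np

    rng = np.random.default_rng(1)
    N = 40
    Ti = rng.integers(2, 40, N).astype(float)     # unequal observations per site
    xb = rng.normal(0.0, 1.0, N)                  # site means of the regressor
    yb = 1.0 + 0.7 * xb + rng.normal(0.0, 1.0, N) / np.sqrt(Ti)  # small sites are noisier

    # "Between groups": weighted deviations from the overall mean, no constant.
    xd = xb - np.average(xb, weights=Ti)
    yd = yb - np.average(yb, weights=Ti)
    b_between = ((Ti * xd) @ yd) / ((Ti * xd) @ xd)

    # Simple group means regression with a single constant (unweighted).
    b_means = np.linalg.lstsq(np.column_stack([np.ones(N), xb]), yb, rcond=None)[0][1]
    print(f"between groups slope {b_between:.3f}, simple means slope {b_means:.3f}")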
The preceding has addressed some econometric fine points. There is a substantive conclusion. Dr. Smith has advocated the group (site) means regression, with means constructed from the underlying data used to fit the fixed effects model, as somehow superior to the fixed effects model. Logically, this makes little sense. Statistically, it is simply incorrect. It is the group means regression which imposes the improper restriction, not the fixed effects regression. A fortiori, if Dr. Smith were correct about the means regression, then the worst the fixed effects model could do would be to mimic it. The fact that it does not is strong evidence that the assumption underlying the means regression is incorrect. His statement that the "between model" is the least bad model available is not correct either, even if he had fit the appropriate between groups regression.

2. Measurement Error

Dr. Neels has suggested that aggregating the data to group means helps to ameliorate the measurement error problem.22 The logic is that averaging tends to average out the measurement error. It's an intriguing argument, and would be a very useful one if it were true. Unfortunately, it is not. Once again, the statistical rebuttal is simple. It is true that in the averaged data, the variance of the measurement error is now divided by the number of observations. However, the variance of the mean of the systematic component is likewise. That leaves the crucial reliability ratio that I discussed earlier unchanged. If there is measurement error, it will exert the same influence on a model fit with site means as it would on the underlying disaggregated data.

22 Docket No. R97-1, Tr. 28/15626-15630.
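A short simulated sketch of this point follows; the sample sizes, variances, and seed are purely illustrative. Averaging divides both the signal variance and the measurement error variance by T, so the reliability ratio, and with it the attenuation of the slope, is unchanged:

    # Illustrative simulation only; the 0.5 attenuation below follows from
    # the chosen variances: beta * lambda^2 / (lambda^2 + sigma_u^2) = 0.5.
    import numpy as np

    rng = np.random.default_rng(2)
    N, T, beta = 500, 8, 1.0
    xstar = rng.normal(0.0, 1.0, (N, T))       # systematic component, variance 1
    x = xstar + rng.normal(0.0, 1.0, (N, T))   # measurement error, variance 1
    y = beta * xstar + rng.normal(0.0, 0.5, (N, T))

    def slope(a, b):
        ad, bd = a - a.mean(), b - b.mean()
        return (ad @ bd) / (ad @ ad)

    b_disagg = slope(x.ravel(), y.ravel())
    b_means = slope(x.mean(axis=1), y.mean(axis=1))
    print(f"disaggregated slope {b_disagg:.2f}, site means slope {b_means:.2f}")
    # Both sit near 0.5, not 1.0: averaging rescales signal and noise alike.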
3. A Pooled Time Series Regression

Dr. Neels has offered a pooled, aggregate yearly time series regression based on system wide costs and volumes as an appropriate tool for analyzing volume variability.23 If it could be argued that there were no systematic variation in volume variability factors across sites or activities, no adjustment within calendar years in response to changes in output, and no long run adjustment in response to technological advances, this might be appropriate. None of these assumptions seems warranted. And whether they are or not, assuming them at the outset discards nearly all the useful information to be had from the disaggregated data set.

The (lack of) usefulness of the time series regression suggested is the same as that for the group means regression. Once again, the statistical result is clear. If it were appropriate to aggregate the data (in this case, that would mean no site specific and no period specific effects), then the aggregate and the disaggregated approaches would produce similar estimates of the same parameters. The disaggregated approach cannot make things worse. When they differ substantially, as they do here, the right conclusion to draw is that the aggregated approach is causing the problem. To reiterate, the disaggregated data will give the right answer whether or not Dr. Neels' approach is correct.

23 Tr. 27/12835-12843.
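The point about aggregation can be seen in a few lines of simulation; all numbers below are invented. When there truly are no site or period effects, the yearly aggregate regression and the disaggregated panel regression estimate the same slope; the aggregate version simply throws away almost all of the information:

    # Illustrative simulation only: no site effects, no period effects.
    import numpy as np

    rng = np.random.default_rng(3)
    N, T, beta = 50, 10, 0.7
    x = rng.normal(0.0, 1.0, (N, T))
    y = 2.0 + beta * x + rng.normal(0.0, 0.5, (N, T))

    def slope(a, b):
        ad, bd = a - a.mean(), b - b.mean()
        return (ad @ bd) / (ad @ ad)

    b_panel = slope(x.ravel(), y.ravel())             # all N*T observations
    b_yearly = slope(x.mean(axis=0), y.mean(axis=0))  # only T yearly aggregates
    print(f"disaggregated {b_panel:.3f}, yearly aggregate {b_yearly:.3f}")
    # Both center on 0.7, but the aggregate estimate rests on 10 points and is
    # far less precise; when the two diverge sharply in real data, the
    # aggregation assumptions are the natural suspect.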
D. Alternative Estimates Based on a Reverse Regression

In a demonstration intended to show that piece handlings rise faster than volume, Dr. Neels presents a set of results (Tr. 27/12805-12810) using a technique known as reverse regression. This technique originated roughly two decades ago in the sex discrimination literature. (See the Journal of Business and Economic Statistics, April 1984, a symposium in which I have a contributed paper.) The logic of the technique is suggested by the following: In a regression of wages on qualifications, job characteristics, a dummy variable for sex, and other control variables, if there is discrimination, the coefficient on the dummy variable will be positive and the coefficient on qualifications will be too small. If so, then in a regression of qualifications on wages, the sex dummy, and the other variables, the coefficient on the dummy variable will be too high. In essence, if women are underpaid, then they are overqualified for the jobs they have.

Dr. Neels has extended this methodology, using an alternative approach to regression in the presence of measurement error. In particular, he states, "[t]o avoid the pitfalls of errors-in-variables bias, I estimate the elasticity of TPH/F with respect to FHP using the reverse regression of FHP on TPH/F and other variables...." Tr. 27/12806. Then, to obtain the desired result, he takes the reciprocal of the elasticity of FHP with respect to TPH/F derived from the reverse regression. Id. The reasoning appears to be that in reversing the dependent and independent variables, he can pile the measurement error into the equation error on the right hand side and mitigate the measurement error bias that might affect the direct regression. Once again, he cites my text: "It is a well known result that measurement error in the dependent variable is absorbed in the error term and can be ignored." Id. (footnote omitted).
The quotation is right (well, close enough), but the regression result is not. The reason is that, even with the rearranged equation, the measurement error is still in the independent variable, and the estimator remains inconsistent, as I now show. The prototype for the original regression is

    y = βx* + ε
    x = x* + u,

exactly as specified earlier. The thinking in Dr. Neels's reverse regression, apparently, is embodied in

    x* = (1/β)y − (1/β)ε,

so that

    x = x* + u = (1/β)y − (1/β)ε + u,

which is a conventional regression which appears to obey the claim from my text that was quoted earlier. We just absorb u in the disturbance, compute the regression, then take the reciprocal of the coefficient. It doesn't work. Relying on conventional results, and to avoid technical details, I go directly to the result. The least squares slope estimator in the reverse regression of x on y is an estimator of the quantity

    δ = Cov[x,y] / Var[y].

We can derive this just by going back to the original specification: with λ² the variance of x* and σ² the variance of ε,

    δ = βλ² / (β²λ² + σ²),

and the estimator b = 1/d, where d is the least squares estimate of δ, estimates not β but

    1/δ = β[1 + σ²/(β²λ²)].

Whether or not there is measurement error (indeed, independently of the measurement error variance), the Neels estimator will overestimate the true coefficient that he seeks. His elasticity estimates are biased upwards.

What went wrong? What went wrong is that this manipulation simply trades one misspecification for another. Looking back at the rearranged equation,

    x = (1/β)y − (1/β)ε + u,

what we find by manipulating it a bit is that the "independent variable," y, is correlated with the "disturbance," −(1/β)ε + u; the covariance is −σ²/β. This violates another assumption of the regression model, and renders least squares estimates from the reverse regression inconsistent. In fact, the accepted result on reverse regression in the presence of measurement error is that the reverse and direct regression estimators bracket the correct result, which is what is shown above.
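A brief simulation of this bracketing result follows; the parameter values, sample size, and seed are arbitrary illustrations chosen so the limits can be verified by hand. With β = 0.8, λ² = 1, σ² = 0.36, and measurement error variance 0.25, the direct slope converges to 0.64 and the reciprocal of the reverse slope to 1/δ = 1.25, bracketing the truth:

    # Illustrative simulation only; values chosen so the plims are easy to check.
    import numpy as np

    rng = np.random.default_rng(4)
    n, beta = 100_000, 0.8
    xstar = rng.normal(0.0, 1.0, n)               # lambda^2 = 1
    x = xstar + rng.normal(0.0, 0.5, n)           # measurement error variance 0.25
    y = beta * xstar + rng.normal(0.0, 0.6, n)    # sigma^2 = 0.36

    def slope(a, b):
        ad, bd = a - a.mean(), b - b.mean()
        return (ad @ bd) / (ad @ ad)

    b_direct = slope(x, y)          # attenuated: 0.8 * 1 / (1 + 0.25) = 0.64
    b_reverse = 1.0 / slope(y, x)   # 1/delta = 0.8 * (1 + 0.36/0.64) = 1.25
    print(f"direct {b_direct:.2f} < beta {beta} < reverse {b_reverse:.2f}")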
I hesitate to generalize from narrow results. The reverse regression estimator is generally going to be vastly more complicated than I have made it, because it is usually embedded in a multiple, not simple, regression model, and at least in this case, we are not completely agreed on the nature of the measurement error in any event. I do believe, however, that a firm conclusion is safe here. Reverse regression is not an appropriate way of "avoiding the pitfalls of errors-in-variables bias." This method and the estimates presented should not be accepted. Rather, if the problem is to be analyzed and solved, it should be done directly.
E. Visually Compelling Plots
In both his 1997 and 2000 testimonies, witness Smith has relied on some low resolution figures generated by the SAS computer program to bolster his suggestion that the data and evidence in hand are consistent with 100 percent volume variability. Indeed, in the 1997 testimony he characterizes the figures as "visually compelling in demonstrating a variability approaching 100 percent between labor hours and mail volume."24 Irrespective of any other evidence or results discussed in this proceeding, I would caution the Commission against relying on any visual devices of this sort. I do not believe that the eye is capable of resolving such evidence at a level approaching "compelling" or even convincing. I offer the figure below as motivation for this belief.

24 Docket No. R97-1, Tr. 28/15847.
[Figure: scatter plot of 500 simulated points, both axes running from roughly 5,000 to 25,000, with a solid 45 degree reference line.]
The figure contains a plot of 500 points that, as those in Dr. Smith's graphs do, certainly appear to be consistent with a 100% variability relationship. The solid 45 degree line plotted in the figure is intended to aid the eye in reaching its conclusion. However, they are not consistent with such a relationship; by what appear to be the proportions of interest in this proceeding, not even closely. The points in the figure were cleanly produced with a random number generator so that the values of X are drawn from a normal distribution with a mean of 15,000 and a standard deviation of 3,000, while Y was produced so as to be equal to 2000 + 0.85 times X plus a random normal draw with mean zero and standard deviation X/100. In the terminology of this proceeding, the data were constructed with an 85 percent volume variability. (The standard deviation is intended to produce the kind of fanning effect in the figure that is typical in Dr. Smith's figures. This feature will not produce any effect on the slope of an underlying relationship; it will only produce systematic variation around that relationship.)
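The construction just described can be reproduced in a few lines; the random seed and the commented-out plotting calls below are my own additions, while the distributional choices are those stated above:

    # Reproduces the construction described in the text (seed is arbitrary).
    import numpy as np

    rng = np.random.default_rng(5)
    X = rng.normal(15_000, 3_000, 500)
    Y = 2_000 + 0.85 * X + rng.normal(0.0, X / 100)  # sd grows with X: the "fanning"

    Xd = X - X.mean()
    slope = (Xd @ (Y - Y.mean())) / (Xd @ Xd)
    print(f"fitted slope = {slope:.3f}")  # near 0.85, despite the 45 degree look
    # import matplotlib.pyplot as plt
    # plt.scatter(X, Y, s=5)
    # plt.plot([5_000, 25_000], [5_000, 25_000])  # the 45 degree reference line
    # plt.show()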
In the context of the results I have reviewed in the various testimonies, the difference between 0.85 and 1.00 is substantial.

I realize that this little demonstration is simplistic. The data were carefully constructed so as to produce an impression, not to mimic any real outcome that an analyst might observe. The purpose is to suggest that visual devices such as this, which could be based on such real data, could be very misleading. I do not believe that one could rely on a visual inspection of such a figure as this, which is itself of considerably higher quality than Dr. Smith's, to draw a conclusion about a precise quantity such as the slope of a regression. Graphs such as those in Dr. Smith's testimony should not be substituted for careful statistical analysis, and should not be accepted as evidence that casts doubt on any such analysis.