USPS-RTJ

BEFORE THE POSTAL RATE COMMISSION
WASHINGTON, D.C. 20268-0001

POSTAL RATE AND FEE CHANGES, 2000                    Docket No. R2000-1

REBUTTAL TESTIMONY OF WILLIAM H. GREENE
ON BEHALF OF THE UNITED STATES POSTAL SERVICE

Table of Contents

I. STATEMENT OF QUALIFICATIONS
II. PURPOSE AND SCOPE OF MY TESTIMONY
III. CONCLUSIONS DRAWN FROM MY EXAMINATION OF THE STUDIES
IV. THE VOLUME VARIABILITY MODELS
   A. Dr. Bradley's Cost Equation Model
   B. Dr. Bozzo's Updated Version of the Bradley Model
V. ECONOMETRIC ISSUES RAISED BY THE INTERVENORS
   A. Sample Selection and the Data 'Scrubbing' Procedure
   B. The Issue of Measurement Error
   C. Panel Data Treatments
      1. The Fixed Effects Model vs. a Group Means Model
      2. Measurement Error
      3. A Pooled Time Series Regression
   D. Alternative Estimates Based on a Reverse Regression
   E. Visually Compelling Plots

I. STATEMENT OF QUALIFICATIONS

My name is William H. Greene. I am a professor of Econometrics at the Stern School of Business at New York University and, since 1995, chairman of Stern's Economics Department. I have taught at Stern since 1981. Prior to that I taught Econometrics at Cornell University from 1976 to 1981. I received Masters and Ph.D. degrees from the University of Wisconsin at Madison in 1974 and 1976, respectively. I worked briefly as an economic consultant in the private sector in 1980-1981 at National Economic Research Associates, and have also provided consultation to numerous corporations, including American Express Corp, Ortho Biotech, Inc., and The Reader's Digest. I have published numerous works in econometrics, including roughly 40 articles; one of the world's most widely used computer programs for econometric computation, LIMDEP; and, notably for this proceeding, the widely used textbook Econometric Analysis, which several of the witnesses in this and the prior omnibus rate proceeding, including Neels, Smith, Bradley, Higgins, and Bozzo, have cited in their testimonies.

I do note that this is my first appearance before the Postal Rate Commission. I have no knowledge of the details of Postal Service operations or data systems beyond that contained in the testimonies that I reviewed. The scope and nature of my testimony will be limited to econometric technique and methodology, about which I have written extensively. I will discuss this further in Section II.

II. PURPOSE AND SCOPE OF MY TESTIMONY

I have been asked by the United States Postal Service, the sponsor of Dr. Bozzo's testimony, to comment on the testimonies of Kevin Neels and J. Edward Smith, both of which seek to rebut Dr. Bozzo's testimony and its predecessor by Michael Bradley in the 1997 counterpart to this proceeding. In particular, a number of issues have been raised regarding the econometric techniques used by Drs.
Bradley and Bozzo to estimate 'volume variability factors' for labor costs in mail processing. (Volume variability is a measure of the elasticity of mail processing costs with respect to volume.)

I have learned through my reading of the various testimonies that the Postal Rate Commission has traditionally assumed that this cost elasticity is 1, or 100 percent. So far as I have been able to discern from the work I have read (there is a summary in Dr. Bozzo's testimony¹), this value is based essentially on judgment, 'common sense,' and intuition. No rigorous statistical procedures were ever used to arrive at this parameter. Drs. Bradley and Bozzo have used quite complex multiple regression methods and a large data base generated within the U.S. Postal Service system to measure this effect, and have found a large amount of evidence that disagrees with the traditional assumption. They found consistent evidence that the volume variability factors for a large number of specific activities in the mail processing chain are considerably less than 100 percent.

¹ USPS-T-15 at 4-13.

Witnesses Neels and Smith have raised a large number of criticisms of the data, methods, and models used by Drs. Bradley and Bozzo and, by implication, their results. Primary among these issues are:

- Data quality problems and the issue of measurement error;
- Problems of nonrandom sampling that they suggest arose as a consequence of the methods by which the data were purged of possibly incorrect, misreported, or missing values;
- The issue of functional form relating to the use of certain 'panel data' style models, the 'fixed effects' model in particular;
- Certain other issues concerning the ways in which the regression model was formulated, among them the problem of missing variables.

I intend in my testimony to limit my attention to issues of econometric technique and model building. There is an enormous amount of fine detail in all the testimonies I read about the specifics and technical aspects of mail processing, costs, work flow, and technology. Many of these details are advanced by Drs. Neels and Smith as severe complications that cast doubt on the econometric results. Although I believe that some of their comments in this regard are superfluous to the questions at hand, I will nonetheless not be addressing any of this material, and offer no testimony as to its relevance to the econometric modeling. Some of my testimony will be somewhat technical. Unfortunately, this is unavoidable. Some of the issues that the intervenors have raised, such as the problem of 'sample selection,' are, themselves, fairly esoteric.

My testimony will be related to the following general topic areas:

- The criticisms of the methods by which the data set was 'scrubbed' miss some important points about sampling, random sampling in particular, the nature of model building, and, very importantly, the issue of 'sample selection.'
- The discussions of measurement error in the Neels and Smith testimonies rely upon some widely held misconceptions about the topic. Most of their discussion on this issue is incomplete, and some of it is incorrect.
- Much of the discussion of the 'fixed effects' model for panel data, particularly the claim that it is inferior to a pooled regression or a regression based on group means, is incorrect.

My testimony will briefly review the models developed by Dr. Bradley and Dr. Bozzo, and the criticisms of them raised by Drs. Neels and Smith. A more detailed summary appears in Dr. Bozzo's testimony. I will then turn to the specific econometric issues listed above. To summarize my conclusions, I find that while some of the criticisms raised by Drs. Neels and Smith might provide useful guidance for refinement of the data used for estimating models for volume variability, many of the more methodological among their comments are exaggerated and/or misleading. I disagree with the suggestion that virtually all of the flaws suggested by the intervenors would have acted systematically to bias Bradley's and Bozzo's estimates of volume variability downward. On the contrary, from what I have read, I believe that the Bradley and Bozzo studies provide strong evidence that the 100% volume variability assumption should be reconsidered. While I am not prepared to commit to any specific value for any activity, I do believe that the two studies combined provide a strong suggestion that the right results will be substantially less than one.

III. CONCLUSIONS DRAWN FROM MY EXAMINATION OF THE STUDIES

I would not say at this juncture that every econometric or modeling issue that could possibly be addressed by Dr. Bradley or Dr. Bozzo has been addressed. I would definitely conclude that they have provided a substantial amount of evidence that the Commission should take very seriously.

The Commission should have taken a much more favorable view in 1997, and should at this time consider the panel data, fixed effects form of econometric analysis an appropriate platform for continuing work on developing a model for mail processing costs. The aggregate means models and time series regressions advocated by Drs. Smith and Neels discard far more useful information than the data scrubbing operation of which they have been so critical. Dr. Smith is simply wrong that the simple regression of group means on each other is the "least bad" model. Given the data set at hand, it is the second most bad model. The worst is the grossly aggregated time series regression proposed by Dr. Neels, followed by Smith's site means model, and the best of the lot is the fixed effects model. The arguments advanced by Smith and Neels in favor of their alternatives are based on flawed statistical reasoning, and should be rejected on this basis alone. The same conclusion applies to the visual devices advocated by Dr. Smith. I do not believe that the Commission should accept this kind of visual approximation as a substitute for careful econometric analysis.

The MODS and PIRS data are obviously far from perfect.
But, from my vantage point, they appear to be quite good and, in the absence of a well designed and sharply focused data set created specifically for studying volume variability, they are as good as an analyst of mail processing costs could hope for. What is important is for the Commission and other researchers to evaluate these data using appropriate criteria. The criticisms raised in the Neels and Smith testimonies are, in many cases, inappropriate. Likewise, it sometimes happens that intuitively appealing evidence is misleading. For example, the standard deviations of the measurement error cited by the Commission in its Opinion (discussed below), which suggest an alarming amount of measurement error, appear to be much more discouraging than they really are. The intervenors in this case have thrown up an array of criticisms of the data set that raise a standard that could never be met. Apparently, the MODS data were not created for the purpose for which they were used in this proceeding. But that is usually the case with large micro level data sets. Nonetheless, it does seem reasonable to assert that there is useful information in the MODS data for the determination of volume variability. I would suggest that the Commission take the view that researchers should extract from these data what useful information they contain, not go to great lengths to discredit the data, and then discard them and the analysis based on them.

On the other hand, if the Commission desires to pursue the line of research begun in these studies of volume variability, then continued development of micro level data should be undertaken. In that connection, it is a maxim in econometrics that micro level data are always better than aggregates. The reason is almost self-evident. Aggregation almost always discards information contained in micro level data, and never creates new information. If, on the other hand, it is genuinely believed that the micro level data contain no useful independent information, then they can be aggregated. This process cannot be reversed. By this construction, I am unable to agree with Drs. Neels and Smith that analysis of the MODS data should be done using site means of the same data set that could be used in disaggregated form.

Finally, what kind of model should be developed? It is clear that it is appropriate to use multiple regression to model the response of labor costs to output; the appropriate definitions of these two variables and how to measure them is an issue to be settled elsewhere. A simple regression of hours (or its logarithm) on output of any sort (or its logarithm) will surely ignore many other factors that should be in the equation, including the site specific differences that Dr. Bozzo has analyzed. I also assume that the various models proposed will be based on the same data that have been used in this set of studies. In this instance, given the availability of micro level data, the fixed effects models proposed by Drs. Bradley and Bozzo are appropriate. At a minimum, they can do no worse, and will generally do better, than the site means ('between groups') models suggested by Drs. Neels and Smith. Aggregation discards information, and the more crude the aggregate, the more information is discarded. At the very worst, if the disaggregated data really do not contain information beyond that which is contained in the group means, then a model based on the disaggregated data would simply mimic the model based on aggregated data.
Lastly, there is the question of econometric practice. The worst extreme I see here is Dr. Smith's willingness to rely on gross and misleading, crude two-dimensional scatter plots to defend a specific estimate of a parameter. Between this and the appropriate model lie the pooled regressions suggested by the intervenors, in which they impose restrictions on a regression model, then argue, in direct contradiction to long established results, that these restrictions have improved the estimates. In particular, the suggestion that a pooled regression that imposes the restriction that there are no site specific effects somehow removes a bias inherent in the fixed effects model is simply not true; exactly the opposite is the case. Imposing restrictions can induce biases, relaxing them cannot. At the other end of the scale are Drs. Bradley's and Bozzo's carefully developed econometric models that embody current practice using an elaborate panel data set. The models have been subjected to numerous specification and sensitivity tests, and include features such as dynamic structure, time and site effects, models for autocorrelation, and a flexible functional form for the estimated equation. As I noted earlier, I believe that this is the appropriate framework within which the Postal Service should be analyzing mail processing costs.

IV. THE VOLUME VARIABILITY MODELS

A. Dr. Bradley's Cost Equation Model

Dr. Bradley's model was estimated using a data set that provided observations for a large number of specific installations at numerous points in time. The three crucial variables were:²

HRSit = hours of labor at site i in period t
TPHit = total pieces handled at site i in period t
MANRit = manual ratio, a site specific measure of one aspect of the technology at site i in period t.

² See Docket No. R97-1, USPS-T-14 at 12-22.

The specific equation estimated for 'Direct Activities' (sorting, etc.) includes linear, quadratic, and all cross products of these three variables; time effects contained in a time trend which allows for a discrete change in the time effect at a midpoint in the period of the analysis; one period lagged terms for the logTPH variable and its square; and a site specific dummy variable which allows for the site specific constant, or 'fixed effect.' All told, the equation includes 15 variables plus seasonal dummy variables, plus the site specific constants, so it is quite large. Additional lagged effects are introduced into the model through the use of a correction for autocorrelation in the disturbances. A similar, but slightly more involved, model was specified for the 'Allied Activities.'

The data used for the study contained numerous obvious flaws, and as a consequence, they were 'scrubbed' by a procedure that removed from the sample all observations:³

(1) that were not part of a continuous sequence of 39 consecutive observations that were otherwise 'clean;'
(2) for which variables were obviously missing or erroneously coded as zeros;
(3) that were 'outliers,' in that they were in the top or bottom one percent of the distribution of the variable HRS/TPH.

³ Id. at 30-37; see also USPS-LR-H-148.
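To make the mechanics of such a screen concrete, the sketch below shows how a Bradley-style scrub could be coded. It is purely illustrative, not the program actually used in the study: it assumes a hypothetical pandas DataFrame with columns site, period, hrs, and tph (periods numbered consecutively within each site), and it performs a single pass of the trim-then-continuity iteration, since the trim in step (3) can break a previously clean run.

```python
# Illustrative sketch only; column names and thresholds are assumptions,
# not the actual USPS data layout or code.
import pandas as pd

MIN_RUN = 39   # continuity: at least 39 consecutive clean periods per site
TRIM = 0.01    # productivity screen: drop 1% from each tail of HRS/TPH

def scrub(df: pd.DataFrame) -> pd.DataFrame:
    # Step (2) of the list above: drop missing values and erroneous zeros.
    clean = df.dropna(subset=["hrs", "tph"])
    clean = clean[(clean["hrs"] > 0) & (clean["tph"] > 0)]

    # Step (3): remove the top and bottom 1% of the HRS/TPH distribution.
    ratio = clean["hrs"] / clean["tph"]
    lo, hi = ratio.quantile([TRIM, 1.0 - TRIM])
    clean = clean[ratio.between(lo, hi)]

    # Step (1), applied last because the trim can break a run: keep each
    # site's longest run of consecutive periods, if it is long enough.
    def longest_run(g: pd.DataFrame) -> pd.DataFrame:
        runs = g.groupby(g["period"].diff().ne(1).cumsum())
        best = max(runs, key=lambda kv: len(kv[1]))[1]
        return best if len(best) >= MIN_RUN else g.iloc[0:0]

    return clean.groupby("site", group_keys=False).apply(longest_run)
```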
Dr. Bradley subjected this model to numerous specification tests,⁴ including tests for whether a fixed or random effects model was appropriate (the latter rarely survives this test), tests for the presence of any site specific effects at all, and a test for autocorrelation. Estimates of the crucial elasticity of hours with respect to TPH for the direct activities ranged from 0.395 to 0.945; none of the estimates exceeded one.⁵ The counterparts for the Allied Activities ranged from 0.720 to 0.829.⁶ A number of other regression results were presented for other activities, all with similar results. The consistent outcome was that the volume variability varied across operations, rarely approached one, and almost never exceeded it. The equations were subjected to various specification tests, as noted, and estimated using several different methods, for example without the autocorrelation correction. The elasticity estimates were quite robust to the changes in the estimation methods.

⁴ Id. at 41-51.
⁵ Id. at 54.
⁶ Id. at 63.

Dr. Bradley also conducted an analysis of the effect of measurement error in the TPH variable, using a method suggested in Hsiao's monograph on panel data.⁷ As he points out, with panel data one can compute two different, albeit inconsistent, estimators of the slope coefficient and, at the same time, two inconsistent estimators of the measurement error variance. Solving two equations in two unknowns, it is possible to obtain one consistent estimator of each of these two parameters. Dr. Bradley carried out the analysis in a restricted model, and found that the consistent estimator was quite close to the fixed effects estimator. As Dr. Neels pointed out, Hsiao's method can (and in this case, does) produce a negative variance estimator.⁸ This is a small sample issue; Hsiao's results are based on infinite sample results. I confess some skepticism of this procedure, not over Hsiao's analytical results, which are correct, but over whether this is the best way to approach the analysis. Hsiao's result applies in a very narrow specification, and produces the disturbing variance result in a very ordinary circumstance. It is prone to this finite sample problem. I emphasize, the test is not biased and is not misleading. It is simply one possible test and, I suspect, not the most robust one that could be constructed.

⁷ Cheng Hsiao, Analysis of Panel Data, Cambridge University Press 1986, at 63-65.
⁸ Docket No. R97-1, Tr. 28/15637.

B. Dr. Bozzo's Updated Version of the Bradley Model

Dr. Bozzo's model is similar to Dr. Bradley's. In constructing it, Dr. Bozzo attempted to remedy some of the defects in Dr. Bradley's model that were argued by the intervenors and by the Commission, including the use of the data scrubbing procedure and the absence of several other variables, including one relating to the capital stock and another relating to wage rates. As before, the model fit was a translog (log quadratic) model with site specific intercepts (fixed effects). The translog model was specified with four lags of the logTPH variable and its square, as opposed to one in the earlier study.⁹ The data preparation is also considerably more elaborate than Dr. Bradley's, for Dr. Bozzo's model is considerably more elaborate, involving the following variables:

HRSit = the log of hours
TPHit = the output (volume) variable (enters with four lags)
CAPit = the capital stock variable
DELit = deliveries, to capture network and density effects
WAGEit = the wage variable
TRENDit = trend variable to capture smooth time effects
MANRit = the manual ratio
QTRt = specific quarterly dummy variables.

⁹ USPS-T-15 at 117-118. Note that since Bozzo also changed from AP level to quarterly data, his model embodies a lag structure that is effectively 13 times longer than Bradley's.

Dr. Bozzo estimated the model without transforming the data to overall mean deviations, unlike Dr. Bradley. The point is important because, in the current case, all relevant elasticities become lengthy functions of the parameters and the variables. The estimated elasticities obtained are similar to Dr. Bradley's, ranging from 0.522 to 0.964 (USPS-T-15 at 119-120, 126). Since considerable attention has been paid to the effects of different methods of estimation and forms of the equations on the quantitative results, it is worth noting that Dr. Bozzo examined his results for sensitivity to different methods of estimating and computing the elasticities, and found that the various computations produced very similar results. Id. at 130-131, 140-141, 151-160. See also USPS-LR-I-107.
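To see concretely why the elasticities become functions of the data, consider a stylized translog labor demand equation; this is a compressed caricature of the specification, not Dr. Bozzo's exact equation:

log HRSit = αi + Σk βk log xk,it + ½ Σk Σm γkm (log xk,it)(log xm,it) + εit.

Differentiating, the elasticity of hours with respect to output is

∂ log HRSit / ∂ log TPHit = βTPH + Σm γTPH,m log xm,it,

which varies with the values of the variables at which it is evaluated. If the data are first expressed as deviations from their overall means, the elasticity evaluated at the means collapses to the single coefficient βTPH; estimated in natural units, it must instead be computed as a function of the estimated parameters and the data.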
V. ECONOMETRIC ISSUES RAISED BY THE INTERVENORS

As noted earlier, an extremely long list of problems with the preceding analyses was raised by intervenors Neels and Smith. Many of these related to whether the variables used in the analyses were appropriate or accurate measures of the activity being analyzed, e.g., whether total pieces handled (TPH)¹⁰ was an appropriate volume measure and whether the Postal Service's plant and equipment data really contain useful information about the capital stock. I will not be commenting on these concerns, as I do not have the necessary background or knowledge about the specifics of the Postal Service. However, they did, as well, raise several issues related to the econometrics.

¹⁰ Total pieces fed (TPF) was used in place of TPH in the automated and mechanized operations. See USPS-T-15 at 50-52.

- Dr. Neels: Most of Dr. Neels's rebuttal focused on the data issues mentioned above. He did make a number of points about the effects of measurement error that he feels persistently bias the estimated elasticities toward zero, that is, toward a result less than one. He was also critical of the screening of the data which produced the estimating sample. I will address this below.

- Dr. Smith: Dr. Smith has raised a daunting litany of criticisms of both the Bradley and Bozzo studies. I will focus my testimony on only a few of these:
  (1) He, as does Dr. Neels, criticizes the data scrubbing procedure.
  (2) He feels that the analysis is 'short run' in nature, and is therefore inappropriate for the phenomenon being studied.
  (3) He feels that observable (with his eyes) evidence contradicts the results of Dr. Bozzo's analysis.
  (4) He is critical of the panel data, fixed effects model that Dr. Bozzo used.

A fair amount of Dr. Smith's testimony is centered on issues of underlying microeconomic theory. Some of this is used to criticize the theoretical underpinnings of Dr. Bozzo's study. It is not my intention in this testimony to address issues of the underlying theory of any of this analysis; I come to this proceeding as an econometrician. However, I am familiar with the economics and econometrics of the estimation of cost and production functions. My doctoral dissertation and a subsequent paper published in the Journal of Political Economy with Laurits Christensen have long been regarded as pioneering studies (they were among the first) of the marriage of theory and empirical estimation of cost and factor demand equations; they are, for example, a standard application presented in microeconomics textbooks. While there are valid points in Dr. Smith's discussion of the theory behind an appropriate cost function, there are also some noteworthy errors. For example, Dr. Smith states that "Dr. Bozzo's treatment of homotheticity appears to lead to incorrect conclusions." Tr. 27/13196. He then states:

   In his testimony, Dr. Bozzo asserts that "... capital and labor variabilities will be identical in equilibrium under the assumption that cost-pool-level production (or cost) functions are 'homothetic' ..." Homotheticity implies that changing the level of output of the operation will not alter relative factor demands such as the capital/labor ratio, in equilibrium (and other things equal). However, the Postal Service testimony is replete with examples of the implementation of major investments designed to reduce costs. ... The focus is on the elimination of major labor costs via capital investment to achieve an overall reduction of total costs. Accordingly, the application of a homotheticity assumption appears to be an inappropriate assumption. (Id.)

Nowhere does the theory state that the firm in equilibrium will never adjust its capital labor ratio in response to changes in relative prices. Even if the technology is homothetic, the firm will respond to a change in relative prices by substituting away from the factor that is becoming more expensive, unless it is unable to. This has nothing to do with whether the production function is homothetic or not. It is a question of factor substitution, and I do not believe that either Dr. Bozzo or Dr. Smith argued that the Postal Service operates under conditions of fixed coefficients, in which it would never substitute capital for labor in the face of increasing wage rates. The wages that appear in the labor demand functions estimated by Dr. Bozzo allow for adjustment in response to changes in wages over time, and are not consistent with a fixed coefficients assumption. Dr. Smith seems, as well, to suggest that the technology of production in the Postal Service is nonhomothetic, which it may well be. But no empirical evidence for this has been presented, and the mere fact that the Postal Service has invested in labor saving capital does not say anything on the subject one way or the other.
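A compact statement of the standard definition may help frame this point; it is offered as textbook background, not as a characterization of Dr. Bozzo's particular specification. A production function f is homothetic if it can be written

f(K, L) = g(h(K, L)), with h homogeneous of degree one and g increasing.

Cost minimization equates the marginal rate of technical substitution to the factor price ratio,

fK / fL = hK / hL = r / w,

and because hK/hL is homogeneous of degree zero, it depends only on the ratio K/L. The tangency condition therefore determines the capital/labor ratio as a function of relative factor prices alone; the output level drops out of the condition, but the price ratio does not. A firm with homothetic technology still substitutes capital for labor when wages rise relative to the cost of capital.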
A. Sample Selection and the Data 'Scrubbing' Procedure

As noted, Dr. Bradley subjected the MODS data to a screening process, denoted 'scrubbing,' that was intended to remove observations that were obviously erroneous and unrepresentative of the underlying relationship he was attempting to uncover.
This data scrubbing, the systematic removal of observations from the sample, has been the subject of a considerable amount of discussion in this proceeding. There are two issues that are brought forth by such a procedure. The first is biases. Under certain circumstances, selection of observations based on specific criteria (as opposed to randomly) can induce biases in the estimates obtained with the resulting sample. The second concerns the issue of efficient use of sample data, the problem of 'throwing away information.' In point of fact, efficiency has not been an issue in this proceeding. However, at some points, comments that are related to this issue have nonetheless been made, evidently by the intervenors to cast doubt on the Bradley and Bozzo studies. This section will discuss these issues.

To review, Dr. Bradley's screening procedure involved the following steps:

(1) He removed observations with obviously missing values, zeros coded for certain activities known to be taking place at the facilities observed, and observations for which reported output was below a specified threshold.
(2) He imposed a continuity requirement that the remaining data set for a site contain at least 39 useable consecutive observations.
(3) He removed the top and bottom 1% of observations based on productivity, the ratio of pieces handled to labor hours.

The productivity screen could disrupt the 'continuity' of the data series for some sites, so his data scrub was iterative in that after step (3) it was necessary to revisit step (2).

Among the issues raised by the intervenors was that this screening process removed an extraordinary amount of data from the sample.¹¹ The response to this point has been made in passing by Dr. Bozzo, but it bears repeating. The samples involved in these analyses are large, even after the data scrub.¹² However, irrespective of the size of the samples, if we are agreed at the outset that the underlying model that we seek to discover applies to all the data points, then, absent the problem of nonrandom selection discussed in the next paragraph, the amount of data discarded has no bearing on whether the results obtained with the remainder are biased or not. Under the assumption, Dr. Bradley could have simply randomly thrown away three quarters of the observations, and safely based his results on the remaining quarter. Certainly, intuition would correctly suggest that this waste of information would be costly. But the cost is that of having a smaller sample, which leads to less precise estimates than one might otherwise obtain, i.e., larger standard errors. It has no relation at all to whether or not those estimates are systematically biased in one direction or another. The issue of how many data were discarded is a red herring.

¹¹ E.g., Docket No. R97-1, Tr. 28/15609-15619, 15632-15633, 15853. In the present docket, see Tr. 27/13163, 13172 and Tr. 27/12796-12800.
¹² USPS-T-15 at 20-22, 95-102.
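The effect of randomly discarding data can be illustrated directly. The simulation below is purely hypothetical (made-up parameter values, a simple linear model): fitting the same regression on a full sample and on a random quarter of it leaves the slope estimate centered on the same value, while its standard error roughly doubles.

```python
# Purely illustrative; all values are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
n, beta = 40_000, 0.8
x = rng.normal(size=n)
y = 1.0 + beta * x + rng.normal(scale=0.5, size=n)

def ols(x, y):
    X = np.column_stack([np.ones_like(x), x])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    s2 = resid @ resid / (len(y) - 2)
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
    return b[1], se

keep = rng.random(n) < 0.25          # randomly keep one quarter of the data
print(ols(x, y))                     # ~ (0.800, 0.0025): full sample
print(ols(x[keep], y[keep]))         # ~ (0.800, 0.0050): same slope, larger SE
```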
14 similar in its impact to throwing away observations 15 may amount simply to wasting possibly useful information. 16 output variable is erroneous while the hours variable is correctly coded, then my 17 rule that the model must apply to all the data points would not hold, and the ia observation 19 with zero hours makes no sense. 20 Removal of observations measure simply makes the sample smaller. relationship still applies to, and is discernible Discarding observations should be discarded. the output variable is Once again, the from, the observations based on values of the output variable is randomly. For an obvious example, On one hand, it On the other, if the positive output coded That leaves the missing or badly coded data on the dependent variable, 21 hours. Bradley and Bozzo note a few cases.13 Zero values recorded within a 22 sequence of positive values are obviously 23 the assumption 24 be discarded. erroneous. These once again violate that the model applies to all data in the sample, and they should Bradley identifies another case, that of a ‘ramping up’ period, in I3 E.g., USPS-T-15 30. at 109-110. See also Docket No. R97-1, USPS-T-14 at 16 1 which the hours data would be unrepresentative.14 2 I am not able to comment on specific technical 3 in mail handling. 4 settle whether this is an appropriate omission. 5 suggests to me that it is. aspects of the production process As such, I leave it to the others involved in this discussion to My own reading of the testimony The final case, and the one that does merit some attention 6 P As I noted in my introduction, 7 operation. a from his sample, reasoning 9 treated as erroneous is the trimming Dr. Bradley eliminated the extreme values of hours per piece handled, that these were unrepresentative and should be data.15 This is a specific sample selection rule that could, in Dr. Bradley 10 principle, color the results obtained with the remaining sample. 11 removed the top and bottom 1% of the distribution 12 a more detailed screen of this sort.16 This productivity screen has called forth a 13 criticism about “sample selection.” 14 issue, but I believe there is a consideration 15 selection has become something of a bugaboo 16 should be precise in our use of the term. 17 potentially induce is a truncation bias. The distinction is important ia because not very much is known about sample selection bias-except 19 bad-but 20 has direct bearing on this case. 21 Dr. Bozzo has commented productivity in econometric What the productivity on the First, sample analysis, so we screen could in this context that it is and some of what is known scrub of the data amounts to a trimming 22 operation. 23 proper term is ‘truncation.’ 24 variable does have the potential to do some mischief. Although specifically that should be added. a fair amount is known about truncation, Dr. Bradley’s with this rule. Dr. Bozzo used the term ‘selection bias’ has been used in this context, the Extracting data based on values of the dependent The pure theory of the l4 Docket No. R97-I, USPS-T-i 5 at 30. l5 Id. at 32. l6 USPS-T-15 at IOI-102,110-112. 17 1 issue (discussed at various points in Chapter 20 of my text) does suggest that 2 trimming the tails of the distribution 3 estimator toward zero. There are two crucial variables 4 the trim and the amount of the distribution 5 lesser the former and the greater the latter, the less ‘damage’ is done. 6 regard, Dr. 
Bradley’s productivity scrub scores well on both counts, in that he 7 removed a fixed and very small percentag-ne a Bozzo’s scrub was more complicated, 9 observations would bias the least squares regression here, the asymmetry that remains after the culling. percent-from The In this each tail. Dr. in that he did not symetrically from the tails of the productivity distribution of cull as per Bradley, but 10 rather used cutoffs based on a priori information on maximum and minimum TPH 11 per hour. 12 But, in any event, looking at his Table 3 (USPS-T-15 13 two exceptions, 14 productivity 15 truncation 16 By foregoing 17 obtains largely similar results. What I find surprising, 48 these results is that the theory suggests the estimates should rise, not fall, when 19 the suspect observations In fact, most of the 20 estimates fall, some substantially. 21 to the assumptions 22 conclude that concerns about attenuation 23 probably misplaced. 24 25 It is impossible to tell what if any truncation the numbers of observations bias would result from this. at 107) we see that, with removed from the sample by the scrub are generally so small that it would be hard to argue that the effect would be substantial. His Appendix A (id. at 140) is suggestive. the productivity screen and using “All Usable Observations,” are put back in the sample. and encouraging, about Dr. Bozzo’s type of screen does not conform in my text, so I don’t see this as a contradiction. due to truncation This leaves an important consideration, and Bozzo discussions, he I do of the data set are which entered both the Bradley the data continuity issue. Dr. Bradley imposed a 39 ia . .c 1 contiguous 2 with autocorrelation, 3 tools (the econometric 4 are permissible, 5 either case, it would be a question of sample size, not systematic 6 sample. 7 observation threshold on his sample.” Since he was fitting models this was primarily a practical consideration. software package, TSP) in which time series with gaps so the continuity requirement I am reluctant to generalize Dr. Bozzo used becomes a nonissue. But, in coloring of the from narrow results to sweeping conclusions a (though, in fact, both Dr. Neels and Dr. Smith have done so, using results taken 9 from my book). But I do believe, based on the considerations IO that the attention 11 data scrubbing 12 errors and other obvious flaws must be cleaned before being used. The samples 13 used were large to begin with, and remained so after the trimming operations. 14 By and large, the trimming operations 15 not innocent, 16 be very small. 17 paid to the criticisms discussed above, procedures raised by Neels and Smith concerning the has been exaggerated. were innocent. Data that contain recording To the extent they were the received theory suggests that the problems The foregoing created are likely to is not meant to be glib. Data cleaning of this sort must be ia done carefully, particularly when the end result of the statistical 19 an input into a process as important 20 dispute the possibility that the data scrubbing procedures 21 were less than ideal, perhaps even less perfect than it potentially 22 been had it been done differently at the time. Dr. Neels has raised some valid 23 criticisms of the procedures; 24 also provide the clearest possible picture of how processing volume” is well taken.” as setting postal rates. his observation process will be Moreover, I do not used by Dr. Bradley that “unusual could have observations . . . may costs vary with In his update of Dr. Bradley’s model, Dr. 
Bozzo backed ” Docket No. R97-I, USPS-T-14 at 31. ‘* See Docket No. R97-1, TR. 28/15613. 19 1 away from some of Dr. Bradley’s procedures. But in its Opinion and 2 Recommended 3 Appendix 4 45 pages (pp. 24-34) to this issue, and the conclusions 5 dire. 6 conclusions were unduly pessimistic. 7 Commission stated ‘Witness a data elimination that econometricians 9 values that are accurate as well as those that are erroneous, Decision from Docket No. 97-l F), in the discussion (PRC Op., R97-1, of the cost models, the Commission Volume 2, devoted 11 of it reached were quite I believe that while many of the issues raised were appropriate, the After reviewing the procedures, Bradley’s productivity try to avoid. Id. at 26-27. the scrub is exactly the kind of Since the scrub eliminates it leaves a sample 10 that cripples the econometrics,” This is not true. Notwithstanding 11 the truncation 12 observations 13 producing 14 standard 15 The screen left more than adequate variation in the sample to allow econometric 16 estimation 17 sample might seem extreme, 18 remained contained thousands 19 attempted to estimate only a relative handful of coefficients. 20 either to use obviously erroneous 21 observations that might have improved his estimates, 22 right course. In order to argue that this data scrubbing 23 econometrics,” 24 not just some of them. 25 The Commission issue I raised above, discarding the extreme, though still valid, will indeed reduce the quality of the sample; it will do so by a model that is less precise (in terms of statistical measures errors) than it might be otherwise. of the model. But “cripples” overstates such as‘ the case. Discarding anywhere from a quarter to a half of a but it must be recalled that the sample that of observations, not dozens, and the analysis Faced with a need data or to discard with those data some I feel that Bradley took the “crippled the one would have to argue that all or nearly all the data were bad, makes one final argument about the data scrubbing 26 process, that the process did not truly purge the sample of erroneous 27 3334. This may well be true, but it is a side issue-the data. Id. at screen was not intended 20 1 for this purpose. They cite certain values derived by Dr. Bradley to illustrate the 2 extent of measurement 3 observation 4 intended to provide complete and appropriate 5 measurement or not TPH is an appropriate 6 or whether errors are introduced 7 TPH are valid concerns, a data discussed 9 extent of measurement error remaining in the data. Two aspects of this should be noted. The first is already made above. The screen was error. Whether by the conversion measure of the output of some other measure to the but they are separate issues from the screening here. The second point concerns two numerical estimates of the of the error that are given. These measures are interesting, 10 are prone to misinterpretation, 11 B. The Issue of Measurement 12 data, not data free of as I discuss in the next section. Error A large amount of the criticism leveled at the Bradley and Bozzo studies 13 concerns the issue of measurement 14 output measure used, pieces handled in Dr. Bradley’s 15 Dr. Bozzo’s case, do not correspond 16 enter the calculation of volume variability.‘g 17 variable which appears on the right hand sides of both regression models is 18 measured with error. From this, he concludes: error. Specifically, Dr. 
Neels argues that the case and pieces “fed” in to the true measure of output that should He concludes that the output 19 (1) “It is a well established 20 in an independent 21 estimates.” 22 quote from page 437 of the third edition of my text.) 23 24 but econometric principle that measurement variable causes downward error bias in coefficient (Docket No. R97-1, Tr. 28/l 5604. He goes on to state a (2) “Measurement error in an explanatory variable of a linear regression model renders the estimator inconsistent I9 Tr. 27/12792-12793,12802 28/15598-600. and frequently biases et seq. See Also Docket No. R97-1, Tr. 21 1 coefficient 2 he does not invoke my text.) 3 estimates towards zero.” The statements In this instance, above are based on a widely cited, fairly simple result 4 from econometric 5 parameter in a regression 6 (Tr. 27/12800. theory. Suppose that the analyst wishes to estimate the slope model: y = CY + px* + E 7 where x’ is the desired independent 8 denote the observed 9 We further assume that x deviates from x* by a random measurement independent variable, volume in this instance. variable, pieces handled, however Let x measured. error, 10 denoted u, so that x = x* + u. In order to obtain the results that form the 11 backbone of Dr. Neels’s criticism, it must now be assumed that (a) the 12 measurement 13 strictly uncorrelated 14 random samples-(c) 15 respectively. 16 estimator in a least squares regression of y on x (the observable 17 estimate consistently, ia error and the true variable are uncorrelated, across time and with other observations-i.e., we are using the variances of u and x* are denoted 8’ and h2, With these in place, we obtain the fundamental y = p x 1 /(I (b) all variables are result that the slope data) will not p, but + e%*). 19 Two important points to note are, first, that if there.is no measurement 20 8 is zero and least squares does what it should (it estimates 21 when 6 is not zero, least squares estimates 22 This is the source of Neels’s result stated above. 23 There are quite a few misconceptions 24 discussions error, then p), and, second, p with a persistent downward about measurement bias. error in the on the subject that I have seen in this case. 22 1 (1) The suggestion that measurement error biases all coefficients The preceding downward 2 is generally not correct. 3 circumstances 4 present docket or the preceding one involve a simple regression 5 dependent 6 additive error. 7 with error in this fashion, the coefficient 8 is indeed biased downward, 9 simple regression assumed. However, statement is true under the none of the models discussed variable on a single independent In a multiple regression affected as well, in unknown directions. II analyze. 12 constant term is biased upward, not downward. 13 measurement 14 observation 15 regression models, not simple ones. 16 pages after the familiar one cited by Neels.) (2) are There is one easy case to error is to tilt the regression error, the The effect of the line, not to push it down. This is important in this case because all models are multiple (This result appears Whether or not the bias in the coefficients functions of those coefficients, 19 unknown. 20 which a variable is badly measured 21 which may be biased downward 22 upward. (3) variable in the regression example, with measurement ia 23 is measured though not by the same amount as in the 10 17 with on the badly measured case. 
Also, other coefficients In the preceding of a variable measured in which one variable in the in my text four carries through to biases in such as the volume-variability Any function of the coefficients factors, is in a multiple regression is a mixture of coefficients, and others in some of of which might be biased The end result is not knowable. In time series data with autocorrelation error will be mitigated in the variables, the effect of the 24 measurement if the underlying 25 correlated across time and the measurement 26 particular relevance variables are errors are not. This has here because lagged values of the output variable 23 1 appeared 2 model. 3 In a model in which more than one variable is measured with error, 4 essentially all bets are off. The familiar attenuation 5 be shown. The directions of biases, if any, are unknown. 6 models fit by Drs. Bradley and Bozzo are translog, quadratic in the logs 7 of the output variable, this result applies here. In addition, note that the a square of the erroneously 9 estimated result can no longer measured variable appears which enabled the derivation of the multiple regression 11 also lost. (5) If the original data were measured of additive error case in my text is with additive error, surely the logs of 13 them are not. This means that the blanket statements 14 cited above are incorrect. 15 measured 16 were measured 17 have to be qualified, in ways that are not obvious. 18 Since the in the models by Drs. Bradley and Bozzo, so the assumption IO 12 -~ (4) in the model, through the estimation of the autocorrelation made by Neels The results would obtain if the logs were with additive error, which would be the case if the original data with multiplicative Lost in this discussion error. Thus, the analytic results above is an assessment impact of measurement of the likely magnitude of the 19 quantitative 20 quality data, we cannot say much with any precision on this subject. 21 form some impressions, 22 can be written in the form though. error. Without a large amount of very high We can First, the familiar result on the previous page 23 Y = PXP2 24 where p is the correlation 25 error. As noted, I am not able to make a judgment on this sort of calculation. 26 do note that the R*s in the regressions between the true variable and the one measured with I reported by the various authors are 24 .1 exceedingly 2 is to bias the fit of the model downward. 3 measurement 4 this. Suppose the measure were multiplicative. 5 instance, the result becomes a bit cleaner, 6 high, sometimes above 0.99. Another effect of measurement error Given values this high, I suspect that error is not a major factor here. There is another way to approach It is possible to show that in this y = p/(1 + e*). 7 Now, what value do we use for the measurement error variance? a the pieces handled variable varied in either direction by as much as 20 percent 9 from its true value. This would imply a relative standard deviation Suppose that of the 10 measurement 11 about 8’ = 0.01. This is trivial. While a 20 percent error rate in the reporting of 12 piece handlings seems fairly large, it implies only a 1% bias in the estimated 13 coefficient, since with these values, y = 0.99p. 14 All of these results are narrow theoretical 15 hypothetical 16 overriding 17 measurement ia error of about 0.1, or a relative measurement situation. conclusions error variance of based on a But I do believe that they have relevance here. 
The overriding result, which will fall out of any analysis, is that the damage done by measurement error will be a function of the 'reliability ratio':

reliability ratio = variance of true variable / variance of measured variable.

This, in turn, is a function of the correlation between the true and the measured variables. In cross sections, in which researchers attempt to measure such things as education, the reliability of self reported data can be extremely low. In this setting, by contrast, we are considering very stable flow variables which evolve reasonably smoothly through time. I would strongly predict that the reliability of output data in this setting is exceedingly high. Consequently, I would argue that criticisms of the models based on measurement error, while certainly to be taken seriously, are greatly exaggerated. Isolated examples in which an observed flow rate of pieces handled differs noticeably from some known true value are not really informative. What matters is the correlation between the observed measure and what we are trying to measure. Even in the face of a few egregious reporting errors, this seems likely to be very high for these data sets.

Interestingly, there are a couple of estimates in the Commission's Docket No. R97-1 Opinion. They state, citing a table in Dr. Bradley's rebuttal testimony (which I have not seen):

   The standard deviations for total piece handlings (TPH) derived from the variances in Table 3 are 0.268 for manual letters and 0.297 for manual flats. The corresponding standard deviations for the measurement error are 0.123 for manual letters and 0.068 for flats. These results do not support the conclusion reached by witness Bradley that large and material measurement errors are absent from the piece handling data for these activities. ... In the Commission's opinion these results are inconclusive but tend to support exactly the opposite finding, that large measurement errors remain in the sample after witness Bradley's scrubs. (PRC Op., Docket No. R97-1, Volume 2, Appendix F at 34.)

This seems true. However, one must be careful. While standard deviations are in natural units, the crucial variables for our analysis are variances (squared standard deviations). For the values cited by the Commission, the reliability ratios are 0.268²/(0.268² + 0.123²) = 0.826 for manual letters and 0.297²/(0.297² + 0.068²) = 0.950 for manual flats. Whether these are large or small is debatable, but the impression one gets from the reliability ratio is certainly different from the impression one gets from the raw data.
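The computation is a one-liner; the snippet below simply reproduces the two reliability ratios from the standard deviations quoted by the Commission.

```python
# Reliability ratio = true variance / (true variance + error variance),
# using the standard deviations quoted in the Commission's Opinion.
for name, sd_true, sd_err in [("manual letters", 0.268, 0.123),
                              ("manual flats",   0.297, 0.068)]:
    r = sd_true**2 / (sd_true**2 + sd_err**2)
    print(name, round(r, 3))   # letters: 0.826, flats: 0.950
```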
C. Panel Data Treatments

1. The Fixed Effects Model vs. a Group Means Model

For the purpose of this discussion, I will focus on Dr. Bozzo's model and the criticisms raised by Drs. Neels and Smith in this proceeding. Some of the discussion would apply to the earlier results, but it seems more useful to concentrate on the current efforts. Dr. Bozzo estimated a labor demand equation of the form:

yit = αi + β′xit + εit

where αi is a site specific constant and the remaining terms are the components of a conventional regression (though one that involves linear, quadratic, and cross terms in logs of the independent variables, time effects, lags of the output variable, and autocorrelation, which make it quite complicated). A central issue concerns the use of the fixed effects regression model.

Dr. Smith takes issue with the fixed effects approach altogether. Tr. 27/13163-65, 13189-90, 13207-214. He argues, through the devices of a graphical and logical demonstration, that the fixed effects model is inappropriate, and that the appropriate model is a regression on means that does not have fixed site effects. To focus ideas, I use a caricature of his main figure, his Figure 4 (Tr. 27/13210). I assume he would agree with this approximation to his depiction.

[Figure 4 caricature: three boxes of site level data, each with its own shallow fitted line, and a steeper heavy line passing through the centers of the boxes.]

The three boxes in the figure represent the observed data for three sites. The line of best fit that passes through each box is that produced by a fixed effects regression. But, so Smith's argument goes, each of these is a short run regression, and the long run regression is the one that passes through the center of the data, which is the heavy line in the figure, and which has a slope much greater than the individual lines. The logic behind the demonstration is that the data inside the box embody a short run relationship that the site obeys, but in the long run the site obeys the steeper sloped long run relationship.

Missing from this demonstration is just when the short run becomes the long run. At some point, so the argument goes, the site in the lower box is transformed to the one in the middle box, as it is then that it climbs up this curve and obeys the long run relationship. The problem with this discussion is that within each box (at least figuratively; in actuality within the MODS data) the sites' operations are observed for several years. What Smith characterizes as the "long run" regression relationship certainly should manifest itself at some point. Dr. Smith claims that the fixed effects model assumes that the capital stock is constant within a site for the full period of the observations, but this is not true.²⁰ Dr. Bozzo's model contains both a measure of the site's capital stock in each period (imperfect though it may be, it does not appear to be devoid of information) and a time trend. There is evolution of the technology built into the model.

²⁰ Tr. 27/13190-92.

Now, let's consider this argument statistically. Dr. Smith argues that the appropriate model is the group means regression. Let's suppose that it is. We'll make the assumption that the site means regression that Dr. Smith advocates is correct. That means that the linear regression model using the site means,

ȳi = α + βx̄i + ε̄i,

is appropriate. Suppose it is. Then it must be true that the disaggregated data obey the same regression:

yit = α + βxit + εit.

Note the common constant term. It is there of necessity, because if this were not the case, then the group means regression suggested cannot be right. The only way there can be a single constant term in the group means regression is if there is a single constant term in the disaggregated data. Surely the data would have something to say about this. If the group means regression were appropriate, then when the fixed effects regression is fit, the site specific constants would all be the same, at least statistically so. But this is decidedly not the case. All the tests decisively reject the hypothesis of no site effects.²¹ Something is wrong here. The upshot is that it must be the group means regression, which inappropriately restricts the regression model, that is biased, not the fixed effects model.

²¹ See USPS-T-15 at 123; see also Docket No. R97-1, USPS-T-14 at 41-43, Tr. 33/18021-22.
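The statistical point can be seen in a small simulation, sketched below with entirely hypothetical values. Each site has its own intercept, and the intercepts are correlated with the regressor across sites, which is precisely the configuration depicted in the figure: the group means regression recovers something close to the steep 'long run' line, while the fixed effects (within) estimator recovers the true slope.

```python
# Illustrative only; all parameter values are made up.
import numpy as np

rng = np.random.default_rng(3)
n_sites, T, beta = 200, 40, 0.5
c = rng.normal(size=n_sites)                   # site 'scale'
alpha = 2.0 * c                                # site effects, correlated with x
x = c[:, None] + rng.normal(scale=0.5, size=(n_sites, T))
y = alpha[:, None] + beta * x + rng.normal(scale=0.2, size=(n_sites, T))

def slope(a, b):
    cov = np.cov(a.ravel(), b.ravel())
    return cov[0, 1] / cov[0, 0]

xm, ym = x.mean(axis=1), y.mean(axis=1)        # site means
print(slope(x, y))                             # ~2.1: pooled, no site effects, biased
print(slope(xm, ym))                           # ~2.5: group means, the steep line
print(slope(x - xm[:, None], y - ym[:, None])) # ~0.5: fixed effects, the true beta
```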
Smith is correct, then surely his 4 group means regression 5 Disaggregating 6 equal constant terms for each site. It is a fundamental 7 when one relaxes a restriction, the very worst one can do is what one did before. 8 Here, what this means is that if we don’t assume the constants 9 they really are, then the regression on the disaggregated could be fit using the disaggregated the data always relaxes restrictions, data. in this case, the restriction of principle of statistics that are the same, and d,ata should do no 10 worse than the group means regression, and the site specific constants 11 resemble each other. 12 fact, the MODS data are speaking loudly and clearly here. Statistically, 13 Smith’s argument in favor of the group means regression is not correct. 14 Logically, it is weak as well, but whether that is the case or not, his graphical 15 device cannot be used to support it, and his statistical interpretation 16 - which inappropriately Otherwise, the restrictions will appear to be incorrect. I would like to make one additional point at this juncture. regression should In Dr. is incorrect. The term 17 “between” has been used at several points in the discussion, 18 has been argued that the “between” regression 19 appropriate 20 means issue. But it should be noted that there is an important 21 between the group means regression and the “between groups” 22 fixed effects model is the ‘within groups’ regression. 23 regression is a weighted 24 dependent variable from the overall mean on the same transformation 25 independent variables, 26 observations made for each site. It is easy to show-it 27 my aforementioned and it (using group means) is more than the fixed effects model. The preceding addresses the site distinction regression. The “between The groups” regression of the deviation of site means of the without a constant term, and weighted text, for example-that of the by the number of is done on page 619 in the same regression model applies to 30 1 this data set as to the original data set. But this is not the group means 2 regression 3 involves a simple regression 4 term. 5 effects is correct. 6 groups estimator both estimate the same parameters, 7 regression 8 C~ that Dr. Smith has suggested. This regression Dr. Smith’s group means regression of site means on site means, with a single constant is only appropriate Otherwise, if the original model with no site specific the between groups estimator and the within while Dr. Smith’s produces results that are biased. The preceding has addressed some econometric There is a 9 substantive conclusion. IO regression, with means constructed 11 effects model, as somehow 12 makes little sense. 13 regression 14 regression. 15 the worst the fixed effects model could do would be to mimic it. The fact that it 16 does not is strong evidence 17 is incorrect. 18 available 19 regression. 20 2. Measurement Error 21 Dr. Smith has advocated fine points. the group (site) means from the underlying data used to tit the fixed superior to the fixed effects model. Logically, this Statistically, it is simply incorrect. which imposes the improper restriction, It is the group means not the fixed effects A fortiori, if Dr. Smith were correct about the means regression, His statement that the assumption that the “between underlying the means regression model” is the least bad model is not correct either, even if he had fit the appropriate Dr. 
2. Measurement Error

Dr. Neels has suggested that aggregating the data to group means helps to ameliorate the measurement error problem.[22] The logic is that averaging tends to average out the measurement error. It is an intriguing argument, and it would be a very useful one if it were true. Unfortunately, it is not. Once again, the statistical rebuttal is simple. It is true that in the averaged data, the variance of the measurement error is now divided by the number of observations. However, the variance of the mean of the systematic component is divided likewise. That leaves the crucial reliability ratio that I discussed earlier unchanged. If there is measurement error, it will exert the same influence on a model fit with site means as it would on the underlying disaggregated data.

[22] Docket No. R97-1, Tr. 28/15626-15630.
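The invariance of the reliability ratio to averaging can be checked in a few lines. The sketch below (Python; the variances, like everything else in it, are invented for illustration, and the systematic component is assumed to vary independently across observations, as in the argument above) fits the same errors-in-variables regression on the disaggregated observations and on the site means. Both slopes are attenuated by essentially the same factor.

    import numpy as np

    rng = np.random.default_rng(2)
    n_sites, n_periods, beta = 200, 10, 1.0   # invented values, illustration only
    lam2, sig_u2 = 4.0, 1.0                   # Var[x*] and Var[u]
    print(lam2 / (lam2 + sig_u2))             # reliability ratio = 0.8

    xstar = rng.normal(0.0, np.sqrt(lam2), (n_sites, n_periods))
    x = xstar + rng.normal(0.0, np.sqrt(sig_u2), (n_sites, n_periods))  # observed with error
    y = beta * xstar + rng.normal(0.0, 0.5, (n_sites, n_periods))

    def slope(a, b):
        return np.cov(a, b)[0, 1] / np.var(a, ddof=1)

    print(slope(x.ravel(), y.ravel()))            # disaggregated: about 0.8, not 1.0
    print(slope(x.mean(axis=1), y.mean(axis=1)))  # site means: attenuated the same way

Averaging shrinks the error variance by the number of periods, but it shrinks the variance of the systematic component by the same factor, so the attenuation survives intact.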
3. A Pooled Time Series Regression

Dr. Neels has offered a pooled, aggregate yearly time series regression based on system wide costs and volumes as an appropriate tool for analyzing volume variability.[23] If it could be argued that there were no systematic variation in volume variability factors across sites or activities, no adjustment within calendar years in response to changes in output, and no long run adjustment in response to technological advances, this might be appropriate. None of these assumptions seems warranted. And whether they are or not, assuming them at the outset discards nearly all the useful information to be had from the disaggregated data set.

The (lack of) usefulness of the time series regression is the same as that suggested for the group means regression. Once again, the statistical result is clear. If it were appropriate to aggregate the data (in this case, that would mean no site specific and no period specific effects), then the aggregate and the disaggregated approaches would produce similar estimates of the same parameters. The disaggregated approach cannot make things worse. When the estimates differ substantially, as they do here, the right conclusion to draw is that the aggregated data is causing the problem. To reiterate, the disaggregated data will give the right answer whether or not Dr. Neels' approach is correct.

[23] Tr. 27/12835-12843.

D. Alternative Estimates Based on a Reverse Regression

In a demonstration intended to show that piece handlings rise faster than volume, Dr. Neels presents a set of results (Tr. 27/12805-12810) using a technique known as reverse regression. This technique originated roughly two decades ago in the sex discrimination literature. (See the Journal of Business and Economic Statistics, April 1984, symposium, in which I have a contributed paper.) The logic of the technique is suggested by the following: In a regression of wages on qualifications, job characteristics, other control variables, and a dummy variable for sex, if there is discrimination, the coefficient on the qualifications variable will be positive and the coefficient on the dummy will be too small. If so, then in a regression of qualifications on wages, the sex dummy, and the other variables, the coefficient on the dummy variable will be too high. In essence, if women are underpaid, then they are overqualified for the jobs they have.

Dr. Neels has extended this methodology, using an alternative approach to regression in the presence of measurement error. In particular, he states, "[t]o avoid the pitfalls of errors-in-variables bias, I estimate the elasticity of TPH/F with respect to FHP using the reverse regression of FHP on TPH/F and other variables ...". Tr. 27/12806. Then, to obtain the desired result, he takes the reciprocal of the elasticity of FHP with respect to TPH/F derived from the reverse regression. Id. The reasoning appears to be that in reversing the dependent and independent variables, he can pile the measurement error into the equation error on the right hand side and so mitigate the measurement error bias that might affect the direct regression. Once again, he cites my text: "It is a well known result that error in the dependent variable is absorbed in the error term and can be ignored." Id. (footnote omitted). The quotation is right (well, close enough), but the regression result is not. The reason is that, even with the rearranged equation, the measurement error is still in the independent variable, and the estimator remains inconsistent, as I now show.

The prototype for the original regression is:

    y = β·x* + ε,
    x = x* + u,

exactly as specified earlier. The thinking in Dr. Neels's reverse regression, apparently, is embodied in:

    x* = (1/β)·y − (1/β)·ε,

so that:

    x = x* + u = (1/β)·y − (1/β)·ε + u,

which is a conventional regression which appears to obey the claim from my text that was quoted earlier. We just absorb u in the disturbance, compute the regression, then take the reciprocal of the coefficient. It does not work. Relying on conventional results, and to avoid technical details, I go directly to the result. The least squares slope estimator in the reverse regression of x on y is an estimator of the quantity

    δ = Cov[x, y] / Var[y].

We can derive this just by going back to the original specification:

    δ = β·λ² / (β²·λ² + σ²),

where λ² is the variance of x* and σ² is the variance of ε. Whether or not there is measurement error, and independently of the measurement error variance, Dr. Neels' estimator,

    b = 1/δ,

estimates not β but

    1/δ = β·[1 + σ²/(β²·λ²)].

Dr. Neels' estimator will overestimate the true coefficient that he seeks. His elasticity estimates are biased upwards.

What went wrong? What went wrong is that this manipulation simply trades one misspecification for another. Looking back at the rearranged equation,

    x = (1/β)·y − (1/β)·ε + u,

what we find by manipulating it a bit is that the "independent variable," y, is correlated with the "disturbance," ε* = −(1/β)·ε + u; the covariance is −σ²/β. This violates another assumption of the regression model, and renders the least squares estimates from the reverse regression inconsistent. In fact, the accepted result on reverse regression in the presence of measurement error is that the reverse and direct regression estimators bracket the correct result, which is what is shown above.

I hesitate to generalize from narrow results. The reverse regression estimator is generally going to be vastly more complicated than I have made it, because it is usually embedded in a multiple, not simple, regression model, and at least in this case, we are not completely agreed on the nature of the measurement error in any event. I do believe, however, that a firm conclusion is safe here. Reverse regression is not an appropriate way of "avoiding the pitfalls of errors-in-variables bias." This method and the estimates presented should not be accepted. Rather, if the problem is to be analyzed and solved, it should be done directly.
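The bracketing result is easy to verify numerically. In the sketch below (Python; the slope and variances are invented for illustration, not estimates of anything in the record), the direct regression understates the true slope by the reliability ratio, and the reciprocal of the reverse regression overstates it by exactly the factor derived above.

    import numpy as np

    rng = np.random.default_rng(3)
    n, beta = 100_000, 0.75                  # invented values, illustration only
    lam2, sig2, sig_u2 = 4.0, 1.0, 1.0       # Var[x*], Var[eps], Var[u]

    xstar = rng.normal(0.0, np.sqrt(lam2), n)
    y = beta * xstar + rng.normal(0.0, np.sqrt(sig2), n)
    x = xstar + rng.normal(0.0, np.sqrt(sig_u2), n)   # regressor observed with error

    def slope(a, b):
        return np.cov(a, b)[0, 1] / np.var(a, ddof=1)

    b_direct = slope(x, y)         # near beta * lam2 / (lam2 + sig_u2) = 0.60
    b_reverse = 1.0 / slope(y, x)  # near beta * (1 + sig2 / (beta**2 * lam2)) = 1.08

    print(b_direct, beta, b_reverse)   # direct < true < reciprocal of reverse

Note that b_reverse exceeds the true slope even if sig_u2 is set to zero; the upward bias comes from ε, not from the measurement error, which is the point of the derivation.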
E. Visually Compelling Plots

In both his 1997 and 2000 testimonies, witness Smith has relied on some low resolution figures generated by the SAS computer program to bolster his suggestion that the data and evidence in hand are consistent with 100 percent volume variability. Indeed, in the 1997 testimony he characterizes the figures as "visually compelling in demonstrating a variability approaching 100 percent between labor hours and mail volume."[24] Irrespective of any other evidence or results discussed in this proceeding, I would caution the Commission against relying on any visual devices of this sort. I do not believe that the eye is capable of resolving such evidence at a level approaching "compelling" or even convincing. I offer the figure below as motivation for this belief.

[24] Docket No. R97-1, Tr. 28/15847.

[Figure: a scatter plot of 500 simulated points, with both axes running from roughly 5,000 to 25,000 and a solid 45 degree line drawn through the data.]

The figure contains a plot of 500 points that, as those in Dr. Smith's graphs do, certainly appear to be consistent with a 100% variability relationship. The solid 45 degree line plotted in the figure is intended to aid the eye in reaching its conclusion. However, the points are not consistent with such a relationship; by what appear to be the proportions of interest in this proceeding, not even closely. The points in the figure were cleanly produced with a random number generator, so that the values of X are drawn from a normal distribution with a mean of 15,000 and a standard deviation of 3,000, while Y was produced so as to be equal to 2,000 + 0.85 times X plus a random normal draw with mean zero and standard deviation X/100. In the terminology of this proceeding, the data were constructed with an 85 percent volume variability. (The standard deviation is intended to produce the kind of fanning effect in the figure that is typical in Dr. Smith's figures. This feature will not produce any effect on the slope of an underlying relationship; it will only produce systematic variation around that relationship.) In the context of the results I have reviewed in the various testimonies, the difference between 0.85 and 1.00 is substantial.

I realize that this little demonstration is simplistic. The data were carefully constructed so as to produce an impression, not to mimic any real outcome that an analyst might observe. The purpose is to suggest that visual devices such as this, which could be based on such real data, could be very misleading. I do not believe that one could rely on a visual inspection of such a figure as this, which is itself of considerably higher quality than Dr. Smith's, to draw a conclusion about a precise quantity such as the slope of a regression. Graphs such as those in Dr. Smith's testimony should not be substituted for careful statistical analysis, and should not be accepted as evidence that casts doubt on any such analysis.
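The construction just described can be reproduced directly. The short sketch below (Python; the seed is arbitrary) generates 500 points exactly as the text specifies and fits the least squares slope, confirming that data which look consistent with the 45 degree line in fact embody a slope of roughly 0.85.

    import numpy as np

    rng = np.random.default_rng(4)

    # X ~ N(15000, 3000^2); Y = 2000 + 0.85*X + N(0, (X/100)^2), as described above
    x = rng.normal(15_000, 3_000, 500)
    y = 2_000 + 0.85 * x + rng.normal(0.0, x / 100.0)

    slope = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
    print(slope)   # about 0.85, despite the 45 degree appearance of the scatter

Plotting x against y with a 45 degree reference line reproduces the visual impression described above; it is the fitted slope, not the eye, that reveals the 0.85.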