Econometrics - Questions and selected answers
Juergen Bracht (Ph.D. Economics, Pittsburgh, U.S.A.)
24 February 2009

Abstract

Tutorial 1 Problems

Problem 1) Suppose that you are asked to conduct a study to determine whether smaller class sizes improve the performance of fourth graders in Scotland on standardized tests.
(a) If you could conduct any experiment you want, what would you do? Be specific.
(b) More realistically, suppose you can collect observational data on several thousand fourth graders. You can obtain the size of their fourth-grade class and a standardized test score taken at the end of fourth grade. Why might you expect a negative correlation between class size and test score?
(c) Would a negative correlation necessarily show that smaller class sizes cause better performance? Explain.

Problem 2) Suppose a secondary-school student is preparing to take a university-entrance exam. Explain why her eventual score is properly viewed as a random variable.

Problem 3) Let $X$ be a random variable distributed as Normal(5,4). Find the probabilities of the following events:
a) $P(X \le 6)$.
b) $P(X > 4)$.
c) $P(|X - 5| > 1)$.

Tutorial 2 Problems

Problem 1) Let $Y_1, Y_2, Y_3, Y_4$ be independent, identically distributed random variables from a population with mean $\mu$ and variance $\sigma^2$. Let $\bar{Y} = \frac{1}{4}(Y_1 + Y_2 + Y_3 + Y_4)$ denote the average of these four random variables.
a) What are the expected value and variance of $\bar{Y}$ in terms of $\mu$ and $\sigma^2$?
b) Now consider a different estimator of $\mu$: $W = \frac{1}{8} Y_1 + \frac{1}{8} Y_2 + \frac{1}{4} Y_3 + \frac{1}{2} Y_4$. This is an example of a weighted average of the $Y_i$. Show that $W$ is also an unbiased estimator of $\mu$. Find the variance of $W$.
c) Based on your answers to parts (a) and (b), which estimator do you prefer, $\bar{Y}$ or $W$?

Problem 2) This is a more general version of Problem 1). Let $Y_1, Y_2, \dots, Y_n$ be $n$ pairwise uncorrelated random variables with common mean $\mu$ and common variance $\sigma^2$. Let $\bar{Y}$ denote the sample average.
a) Define the class of linear estimators of $\mu$ by $W_a = a_1 Y_1 + a_2 Y_2 + \dots + a_n Y_n$, where the $a_i$ are constants. What restriction on the $a_i$ is needed for $W_a$ to be an unbiased estimator of $\mu$?
b) Find $\mathrm{Var}(W_a)$.
c) For any numbers $a_1, a_2, \dots, a_n$, the following inequality holds: $(a_1 + a_2 + \dots + a_n)^2 / n \le a_1^2 + a_2^2 + \dots + a_n^2$. Use this along with parts (a) and (b) to show that $\mathrm{Var}(W_a) \ge \mathrm{Var}(\bar{Y})$ whenever $W_a$ is unbiased, so that $\bar{Y}$ is the best linear unbiased estimator. Hint: What does the inequality become when the $a_i$ satisfy the restriction from part (a)?

Tutorial 3

Problem (Difficult) Consider the standard simple regression model $y = \beta_0 + \beta_1 x + u$ under the Gauss-Markov assumptions. The usual OLS estimators $\hat{\beta}_0$ and $\hat{\beta}_1$ are unbiased for their respective population parameters. Let $\tilde{\beta}_1$ be the estimator of $\beta_1$ obtained by assuming the intercept is zero.
a) Find $E(\tilde{\beta}_1)$ in terms of the $x_i$, $\beta_0$ and $\beta_1$. Verify that $\tilde{\beta}_1$ is unbiased for $\beta_1$ when the population intercept ($\beta_0$) is zero. Are there other cases when $\tilde{\beta}_1$ is unbiased?
b) Find the variance of $\tilde{\beta}_1$. (Hint: The variance does not depend on $\beta_0$.)
c) Show that $\mathrm{Var}(\tilde{\beta}_1) \le \mathrm{Var}(\hat{\beta}_1)$. Hint: For any sample of data, $\sum x_i^2 \ge \sum (x_i - \bar{x})^2$, with strict inequality unless $\bar{x} = 0$.
d) Comment on the trade-off between bias and variance when choosing between $\tilde{\beta}_1$ and $\hat{\beta}_1$.

Tutorial 4 (Computer Problem)

Use the data in SLEEP75.wf1 from Biddle and Hamermesh (1990), Sleep and the Allocation of Time, Journal of Political Economy 98, 922-943.

1) We study whether there is a trade-off between time spent sleeping per week and the time spent in paid work. We could use either variable as the dependent variable. For concreteness, estimate the model $\mathit{sleep} = \beta_0 + \beta_1 \mathit{totwork} + u$, where sleep is minutes spent sleeping at night per week and totwork is total minutes worked during the week.
(1a) Report your results in equation form along with the number of observations and $R^2$. What does the intercept in this equation mean?
(1b) If totwork increases by 2 hours, by how much is sleep estimated to fall? Do you find this to be a large effect?

2) The following model is a simplified version of the multiple regression model used in Biddle and Hamermesh (1990) to study the trade-off between time spent sleeping and working and to look at other factors affecting sleep: $\mathit{sleep} = \beta_0 + \beta_1 \mathit{totwork} + \beta_2 \mathit{educ} + \beta_3 \mathit{age} + u$, where sleep and totwork are measured in minutes per week and educ and age are measured in years.
(2a) If adults trade off sleep for work, what is the sign of $\beta_1$?
(2b) What signs do you think $\beta_2$ and $\beta_3$ will have?
(2c) Using the data, the estimated equation is $\widehat{\mathit{sleep}} = 3638.25 - 0.148\,\mathit{totwork} - 11.13\,\mathit{educ} + 2.20\,\mathit{age}$, where $n = 706$ and $R^2 = 0.113$. If someone works five more hours per week, by how many minutes is sleep predicted to fall? Is this a large trade-off?
(2d) Discuss the sign and magnitude of the estimated coefficient on educ.
(2e) Would you say totwork, educ and age explain much of the variation in sleep? What other factors might affect the time spent sleeping? Are these likely to be correlated with totwork?

3) We now report the standard errors along with the estimates: $\widehat{\mathit{sleep}} = 3638.25 - 0.148\,\mathit{totwork} - 11.13\,\mathit{educ} + 2.20\,\mathit{age}$, with standard errors 112.28 (intercept), 0.017 (totwork), 5.88 (educ) and 1.45 (age).
(3a) Is either educ or age individually significant at the 5% level against a two-sided alternative?
(3b) Drop educ and age from the equation. Are educ and age jointly significant in the original equation at the 5% level? Justify your answer.
(3c) Does including educ and age in the model greatly affect the estimated trade-off between sleeping and working?
(3d) Suppose that the sleep equation contains heteroskedasticity. What does this mean about the tests computed in parts (3a) and (3b)?

Tutorial 1 Solutions

Solution 1)
a) Ideally, we could randomly assign students to classes of different sizes.
That is, each student is assigned to a class size without regard to any student characteristics such as ability and family background. We also would like substantial variation in class sizes.
b) A negative correlation means that larger class size is associated with lower performance. We might find a negative correlation because larger class size actually hurts performance. However, with observational data, there are other reasons we might find a negative relationship. For example, children from more affluent families might be more likely to attend schools with smaller class sizes, and affluent children generally score better on standardized tests. Another possibility is that, within a school, a head teacher might assign the better students to smaller classes.
c) Given the potential for confounding factors, some of which are listed in (b), finding a negative correlation would not be strong evidence that smaller class sizes actually lead to better performance. Some way of controlling for the confounding factors is needed, and this is the subject of multiple regression analysis.

Solution 2) Before the student takes the exam, we do not know, nor can we predict with certainty, what the score will be. The actual score depends on numerous factors, many of which we, as observers, cannot even list, let alone know ahead of time. (The student's innate ability, how the student feels on exam day, and which particular questions are asked are just a few.) The eventual exam score clearly satisfies the requirements of a random variable.

Solution 3)
a) $P(X \le 6) = P\left[\frac{X - 5}{2} \le \frac{6 - 5}{2}\right] = P(Z \le 0.5) \approx 0.6915$, where $Z$ denotes a Normal(0, 1) random variable.
b) $P(X > 4) = P\left[\frac{X - 5}{2} > \frac{4 - 5}{2}\right] = P(Z > -0.5) = P(Z < 0.5) \approx 0.6915$.
c) $P(|X - 5| > 1) = P(X - 5 > 1) + P(X - 5 < -1) = P(X > 6) + P(X < 4) \approx (1 - 0.6915) + (1 - 0.6915) = 0.617$. We used the answers from parts (a) and (b).
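The three probabilities in Solution 3 can be rechecked numerically. The sketch below is only an illustration, not part of the original solution: it assumes that the second argument in Normal(5,4) is the variance (so the standard deviation is 2, as in the standardization above), and the helper name `normal_cdf` is hypothetical.

```python
import math


def normal_cdf(x, mean=0.0, sd=1.0):
    """CDF of a Normal(mean, sd^2) variable, computed via the error function."""
    z = (x - mean) / sd
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Assumption: Normal(5, 4) means mean 5 and variance 4, so sd = 2.
MEAN = 5.0
SD = math.sqrt(4.0)

p_a = normal_cdf(6, MEAN, SD)                                # P(X <= 6)
p_b = 1.0 - normal_cdf(4, MEAN, SD)                          # P(X > 4)
p_c = (1.0 - normal_cdf(6, MEAN, SD)) + normal_cdf(4, MEAN, SD)  # P(|X - 5| > 1)

print(round(p_a, 4), round(p_b, 4), round(p_c, 3))
```

Using the error function from the standard library avoids any dependence on a statistics package; a table of the standard normal distribution gives the same values to four decimals.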
Tutorial 2 Solutions

Solution 1)
a) This is a special case of what is covered in the text, with $n = 4$: $E(\bar{Y}) = \mu$ and $\mathrm{Var}(\bar{Y}) = \sigma^2/4$.
b) $E(W) = E(Y_1)/8 + E(Y_2)/8 + E(Y_3)/4 + E(Y_4)/2 = \mu[(1/8) + (1/8) + (1/4) + (1/2)] = \mu(1 + 1 + 2 + 4)/8 = \mu$, which shows that $W$ is unbiased. Because the $Y_i$ are independent, $\mathrm{Var}(W) = \mathrm{Var}(Y_1)/64 + \mathrm{Var}(Y_2)/64 + \mathrm{Var}(Y_3)/16 + \mathrm{Var}(Y_4)/4 = \sigma^2[(1/64) + (1/64) + (4/64) + (16/64)] = \sigma^2(22/64) = \sigma^2(11/32)$.
c) Because $11/32 > 8/32 = 1/4$, $\mathrm{Var}(W) > \mathrm{Var}(\bar{Y})$ for any $\sigma^2 > 0$. Since both estimators are unbiased, $\bar{Y}$ is preferred to $W$.

Solution 2)
a) $E(W_a) = a_1 E(Y_1) + a_2 E(Y_2) + \dots + a_n E(Y_n) = (a_1 + a_2 + \dots + a_n)\mu$. Therefore, we must have $a_1 + a_2 + \dots + a_n = 1$ for unbiasedness.
b) $\mathrm{Var}(W_a) = a_1^2 \mathrm{Var}(Y_1) + a_2^2 \mathrm{Var}(Y_2) + \dots + a_n^2 \mathrm{Var}(Y_n) = (a_1^2 + a_2^2 + \dots + a_n^2)\sigma^2$.
c) From the hint, when $a_1 + a_2 + \dots + a_n = 1$ (the condition needed for unbiasedness of $W_a$) we have $1/n \le a_1^2 + a_2^2 + \dots + a_n^2$. But then $\mathrm{Var}(\bar{Y}) = \sigma^2/n \le \sigma^2(a_1^2 + a_2^2 + \dots + a_n^2) = \mathrm{Var}(W_a)$.

Tutorial 3 Solutions

a) Textbook Equation 2.66: $\tilde{\beta}_1 = \frac{\sum x_i y_i}{\sum x_i^2}$. Plugging in $y_i = \beta_0 + \beta_1 x_i + u_i$ gives $\tilde{\beta}_1 = \frac{\sum x_i(\beta_0 + \beta_1 x_i + u_i)}{\sum x_i^2}$. The numerator can be written as $\beta_0 \sum x_i + \beta_1 \sum x_i^2 + \sum x_i u_i$. Plug in: $\tilde{\beta}_1 = \beta_0 \frac{\sum x_i}{\sum x_i^2} + \beta_1 + \frac{\sum x_i u_i}{\sum x_i^2}$. Conditional on the $x_i$, we have $E(\tilde{\beta}_1) = \beta_0 \frac{\sum x_i}{\sum x_i^2} + \beta_1$ because $E(u_i) = 0$ for all $i$. Therefore, the bias in $\tilde{\beta}_1$ is given by the first term in the equation. The bias is zero when $\beta_0 = 0$. It is also zero when $\sum x_i = 0$ (hence $\bar{x} = 0$). In the latter case, regression through the origin is identical to regression with an intercept.
b) From the last expression for $\tilde{\beta}_1$ we have, conditional on the $x_i$, $\mathrm{Var}(\tilde{\beta}_1) = \left(\sum x_i^2\right)^{-2} \mathrm{Var}\left(\sum x_i u_i\right) = \left(\sum x_i^2\right)^{-2} \left(\sum x_i^2 \mathrm{Var}(u_i)\right) = \left(\sum x_i^2\right)^{-2} \left(\sigma^2 \sum x_i^2\right) = \frac{\sigma^2}{\sum x_i^2}$.
c) From (2.57), $\mathrm{Var}(\hat{\beta}_1) = \frac{\sigma^2}{\sum (x_i - \bar{x})^2}$. From the hint, $\sum x_i^2 \ge \sum (x_i - \bar{x})^2$, so $\mathrm{Var}(\tilde{\beta}_1) \le \mathrm{Var}(\hat{\beta}_1)$.
d) For fixed $n$, the bias of $\tilde{\beta}_1$ increases as $\bar{x}$ increases (holding the sum of the $x_i^2$ fixed). But as $\bar{x}$ increases, the variance of $\hat{\beta}_1$ increases relative to $\mathrm{Var}(\tilde{\beta}_1)$. The bias in $\tilde{\beta}_1$ is also small when $\beta_0$ is small. Therefore, whether we prefer $\tilde{\beta}_1$ or $\hat{\beta}_1$ on a mean squared error basis depends on the sizes of $\beta_0$, $\bar{x}$ and $n$ (in addition to the size of $\sum x_i^2$).

Tutorial 4 Solutions

(1a) The estimated equation is $\widehat{\mathit{sleep}} = 3586.4 - 0.151\,\mathit{totwork}$ with $n = 706$, $R^2 = 0.103$. The intercept implies that the estimated amount of sleep per week for someone who does not work is 3586.4 minutes, or about 59.77 hours per week, or about 8.5 hours per night.
(1b) If someone works two more hours per week then $\Delta \mathit{totwork} = 120$ and so $\Delta \widehat{\mathit{sleep}} = -0.151 \times 120 = -18.12$ (minutes). This is only a few minutes a night.
(2a) If adults trade off sleep for work, more work implies less sleep (other things equal), so $\beta_1 < 0$.
(2b) The signs of $\beta_2$ and $\beta_3$ are not obvious.
(2c) Five more hours of work is $\Delta \mathit{totwork} = 300$ minutes, so $\Delta \widehat{\mathit{sleep}} = -0.148 \times 300 = -44.4$ (minutes). For a week, this is not an overwhelming change.
(2d) If we assume the difference between college and high school is four years, the college graduate sleeps about $11.13 \times 4 = 44.52$ (minutes) less per week. The effect is quite small.
(2e) Not surprisingly, the three explanatory variables explain only about 11.3% of the variation in sleep. One important factor in the error term is general health. Another is marital status, and whether the person has children. Health, for example, would be correlated with totwork.
(3a) $df = 706 - 4 = 702$. The standard critical value ($df = \infty$) is 1.96 for a two-tailed test at the 5% level. Now $t_{educ} = -11.13/5.88 = -1.8929$. We fail to reject the null hypothesis at the 5% level. Also, $t_{age} = 2.20/1.45 = 1.5172$. Age is also statistically insignificant at the 5% level.
(3b) We can compute the $R^2$ form of the F statistic for joint significance: $F = \frac{0.113 - 0.103}{1 - 0.113} \cdot \frac{702}{2} = 3.9572$. The 5% critical value in the $F_{2,702}$ distribution can be approximated using denominator $df = \infty$: 3.00.
Therefore, educ and age are jointly significant at the 5% level. (In fact, the p value is about 0.019, and so educ and age are jointly significant at the 2% level.)
(3c) Not really. These variables are jointly significant, but including them only changes the coefficient on totwork from −0.151 to −0.148.
(3d) The t and F statistics that we used assume homoskedasticity. If there is heteroskedasticity in the equation, the tests are no longer valid.

Exam #1 Econometrics

1) A justification for job training programs is that they improve worker productivity. Suppose that you are asked to evaluate whether more job training makes workers more productive. However, rather than having data on individual workers, you have access to data on manufacturing firms in Scotland. In particular, for each firm, you have information on hours of job training per worker (training) and number of nondefective items produced per worker hour (output).
1a) Carefully state the ceteris paribus thought experiment underlying this policy question. (5 marks)
1b) Does it seem likely that a firm's decision to train its workers will be independent of worker characteristics? What are some of those measurable and unmeasurable worker characteristics? (5 marks)
1c) Name a factor other than worker characteristics that can affect worker productivity. (3 marks)
1d) If you find a positive correlation between output and training, would you have convincingly established that job training makes workers more productive? Explain. (7 marks)

2) Briefly explain these terms: Experiment, Binary Random Variable, Normal Distribution, Standard Normal Distribution, Cumulative Distribution Function, Random Sample, Asymptotic Normality, Central Limit Theorem, Sampling Distribution, Rejection Region, p value, Sample Average, Sample Correlation Coefficient, Sample Standard Deviation, Sample Variance, Sampling Variance.
(20 marks)

3a) Let $Y_1, Y_2, \dots, Y_n$ be $n$ pairwise uncorrelated random variables with common mean $\mu$ and common variance $\sigma^2$. Let $\bar{Y}$ denote the sample average. Show that $\bar{Y}$ is an unbiased estimator of the population mean $\mu$. Verify that $\mathrm{Var}(\bar{Y}) = \sigma^2/n$. (10 marks)
3b) Why is "unbiasedness" an appealing property for an estimator? (5 marks)
3c) What are the weaknesses of "unbiasedness" as a property for an estimator? (5 marks)

4) A researcher investigates what factors affect chief executive officer (CEO) salaries. Her data set contains information on 177 chief executives of U.S. corporations from 1990. The variable salary is annual compensation, in thousands of dollars; the variable ceoten is prior number of years as company CEO; the variable comten is years with the company; the variable sales is firm sales, in millions; the variable val is market value, in millions; the variable marg is profits as % of sales.

Dependent Variable: log(salary)
Method: Least Squares
Included observations: 177

Model 1: $\log(\mathit{salary}) = \beta_0 + \beta_1 \log(\mathit{sales})$

Variable      Coefficient   Std. Error   t-Statistic   Prob.
C             4.961077      0.199960     24.81039      0.0000
LOG(SALES)    0.224279      0.027129     8.267132      0.0000
R-squared     0.280858

Model 2: $\log(\mathit{salary}) = \beta_0 + \beta_1 \log(\mathit{sales}) + \beta_2 \log(\mathit{val}) + \beta_3 \mathit{marg}$

Variable      Coefficient   Std. Error   t-Statistic   Prob.
C             4.620690      0.254344     18.16709      0.0000
LOG(SALES)    0.158483      0.039814     3.980590      0.0001
LOG(VAL)      0.112261      0.050393     2.227701      0.0272
MARG          -0.002259     0.002165     -1.043124     0.2983
R-squared     0.303494

Model 3: $\log(\mathit{salary}) = \beta_0 + \beta_1 \log(\mathit{sales}) + \beta_2 \log(\mathit{val}) + \beta_3 \mathit{marg} + \beta_4 \mathit{ceoten} + \beta_5 \mathit{comten}$

Variable      Coefficient   Std. Error   t-Statistic   Prob.
C             4.571977      0.253466     18.03781      0.0000
LOG(SALES)    0.187787      0.040003     4.694340      0.0000
LOG(VAL)      0.099872      0.049214     2.029345      0.0440
MARG          -0.002211     0.002105     -1.050132     0.2951
CEOTEN        0.017104      0.005540     3.087309      0.0024
COMTEN        -0.009238     0.003337     -2.767983     0.0063
R-squared     0.352537

4a) Comment on the effect of marg on CEO salary.
Would you include marg in a final model explaining CEO compensation in terms of firm performance? Explain. (4 marks)
4b) Does market value have a significant effect? Explain. (4 marks)
4c) Interpret the coefficients on ceoten and comten. Are these explanatory variables statistically significant? What do the estimates imply? (4 marks)
4d) What do you make of the fact that longer tenure with the company, holding the other factors fixed, is associated with a lower salary? (4 marks)
4e) What is the parameter $\beta_1$? What do the estimates mean? (4 marks)

Selected answers - Exam #1 Econometrics

1a) One way to pose the question: If two firms, say A and B, are identical in all respects except that firm A supplies one more hour of job training per worker than firm B, by how much would firm A's output differ from firm B's?
1b) Firms are likely to choose job training depending on the characteristics of workers. Some observed characteristics are years of schooling, years in the workforce and experience in a particular job. Firms might even discriminate based on age, gender or race. Perhaps firms choose to offer training to more or less able workers, where "ability" might be difficult to quantify but where a manager has some idea about the relative abilities of different employees. Moreover, different kinds of workers might be attracted to firms that offer more job training on average, and this might not be evident to employers.
1c) The amount of capital and technology available to workers would also affect output. So, two firms with exactly the same kinds of employees would generally have different outputs if they use different amounts of capital or technology. The quality of managers would also have an effect.
1d) No, unless the amount of training is randomly assigned. The many factors listed in parts (b) and (c) can contribute to finding a positive correlation between output and training even if job training does not improve worker productivity.
3a) $E(\bar{Y}) = E\left(\frac{1}{n}\sum Y_i\right) = \frac{1}{n} E\left(\sum Y_i\right) = \frac{1}{n}\sum E(Y_i) = \frac{1}{n}\sum \mu = \frac{1}{n} n\mu = \mu$. For the variance, because the $Y_i$ are pairwise uncorrelated, $\mathrm{Var}(\bar{Y}) = \frac{1}{n^2}\sum \mathrm{Var}(Y_i) = \frac{1}{n^2} n\sigma^2 = \sigma^2/n$.
4a) In Models 2 and 3, the coefficient on marg is negative, although its t statistic is only about −1. It appears that, once firm sales and market value have been controlled for, profit margin has no effect on CEO salary.
4b) Model 3 controls for the most factors affecting salary. The t statistic on log(val) is about 2.03. The standard critical value is 1.96. So log(val) is just significant at the 5% level against a two-sided alternative. Because the coefficient $\beta_2$ is an elasticity, a ceteris paribus 10% increase in market value is predicted to increase salary by about 1%.
4c-d) These variables are individually significant at a low significance level. Another year with the company, but not as CEO, lowers salary by about 0.92%. This finding at first seems surprising but could be related to the superstar effect: firms that hire CEOs from outside the company often go after a small pool of highly regarded candidates, and the salaries of these people are bid up. More non-CEO years with the company make it less likely that the person was hired as an outside manager. Related case: regression of log(wage) on experience and tenure.
4e) $\beta_1$ is the elasticity of salary with respect to sales. In Model 3, a 1% increase in sales is associated with an increase in salary of about 0.19%, holding the other factors fixed.
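The arithmetic behind answers 4a to 4e can be rechecked from the Model 3 output in the exam. The sketch below is only an illustration under stated assumptions: the coefficients and standard errors are copied from the Model 3 table, the large-sample critical value 1.96 is used as in the answers, and the dictionary and variable names are hypothetical. The percentage effects use the usual approximation that a coefficient in a log-level equation, multiplied by 100, is a percentage change.

```python
# Coefficients and standard errors copied from the Model 3 table in the exam.
model3 = {
    "log(sales)": (0.187787, 0.040003),
    "log(val)": (0.099872, 0.049214),
    "marg": (-0.002211, 0.002105),
    "ceoten": (0.017104, 0.005540),
    "comten": (-0.009238, 0.003337),
}

# Assumption: large-sample two-sided 5% critical value, as used in the answers.
CRITICAL_5PCT = 1.96

# t statistic for H0: coefficient = 0 is the estimate divided by its standard error.
t_stats = {name: coef / se for name, (coef, se) in model3.items()}
significant = {name: abs(t) > CRITICAL_5PCT for name, t in t_stats.items()}

# Log-log coefficients are elasticities: % change in salary per 1% change in x.
pct_salary_per_1pct_sales = model3["log(sales)"][0]
pct_salary_per_10pct_val = 10 * model3["log(val)"][0]

# Log-level coefficients: approximate % change in salary per extra year.
pct_salary_per_year_ceoten = 100 * model3["ceoten"][0]
pct_salary_per_year_comten = 100 * model3["comten"][0]

for name, t in t_stats.items():
    print(f"{name:12s} t = {t:7.3f}  significant at 5%: {significant[name]}")
print(f"1% more sales        -> about {pct_salary_per_1pct_sales:.2f}% more salary")
print(f"10% more market value -> about {pct_salary_per_10pct_val:.2f}% more salary")
print(f"one more year as CEO  -> about {pct_salary_per_year_ceoten:.2f}% change in salary")
print(f"one more year at firm -> about {pct_salary_per_year_comten:.2f}% change in salary")
```

With 171 degrees of freedom the exact t critical value is slightly above 1.96, so the conclusion for log(val) is borderline either way, which matches the reported p value of 0.0440.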