Midterm Examination

IOMS Department
Regression and Forecasting Models
Professor William Greene
Phone: 212.998.0876
Office: KMC 7-90
Home page: people.stern.nyu.edu/wgreene
Email: [email protected]
Course web page: people.stern.nyu.edu/wgreene/regression/outline.htm

Midterm Examination: Fall 2013, Section 1

This examination contains 10 questions, each worth 10 points. Where questions have more than one part, point values for the parts are shown. Please answer all questions. All answers are to be given in this test booklet. No supplementary materials will be accepted.

This is an open book test. You may use any notes, books, or other materials that you like. You may not use a telephone, iPad, or any other device that is capable of sending or receiving a signal during the exam.

In cases in which a computation involving several values is requested, you may report the values in the appropriate formula. For example, if the answer were the result of (1+1)/2, you may report the exact formula, (1+1)/2, rather than 1.

The Stern honor code applies to your participation in this examination.

YOUR NAME __________________

1. The following shows the results of regression of household income (HHNINC) on years of education based on a survey of a large sample of German households.

(2)a. How large is the sample?
27326 = 27325+1

(2)b. What is the reported value of R²?
0.069

(3)c. What is the value of the sum of squared residuals?
796.319

(3)d. What are the constant term and slope of the estimated regression equation?
0.12609 and 0.019963.

(10)2. The regression model states that the relationship between a dependent variable y and an independent variable x is y = β0 + β1x + ε. If I want to estimate β1, I will compute the regression of y on x (as we discussed in class). But, just using some simple algebra, I can also write x = -β0/β1 + (1/β1)y - (1/β1)ε.
This means that I could just compute the regression of x on y, instead, and take the reciprocal of the slope coefficient that I compute; I will get the identical answer. True or false? Justify your answer.

If you regress y on x, your b1 will be

$$b_1 = \frac{\sum_{i=1}^{N}(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{N}(x_i-\bar{x})^2}.$$

If you regress x on y, your b1 will be

$$b_1 = \frac{\sum_{i=1}^{N}(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{N}(y_i-\bar{y})^2}.$$

(Note different denominators.) Now take the reciprocal and your 1/b1 is

$$\frac{1}{b_1} = \frac{\sum_{i=1}^{N}(y_i-\bar{y})^2}{\sum_{i=1}^{N}(x_i-\bar{x})(y_i-\bar{y})}.$$

This is obviously different from the slope in the regression of y on x. So false. Note this is not about the theory of the model. The question asked about what you would get if you did the computations with your data.

3. One of the interesting features of a country’s economy, or the world economy, is the rate at which health expenditure increases in response to the size of the economy. In the regression below, using the World Health Organization data that we discussed in class, I have regressed the log of per capita health expenditure on the log of the GDP for the 191 countries in their sample.

(5)a. What is the estimate of the elasticity of health expenditure with respect to income (GDP)?
1.18226

(5)b. How do you interpret the value of the elasticity?
The elasticity implies that if GDP increases by 1%, expenditures increase by about 1.18%.

4. The following sample of 11 observations reports the sales in units for electronics stores in the UK. To begin my study of how these businesses work, I computed a regression of the camera sales on the number of people working the floor (Staff).

(5)a. The equation states that if staff equals zero, sales of cameras will equal -86.7. This is obviously nonsense. If there is no one working, sales should be zero. There is obviously something wrong with this equation. True or false? Explain.
There is nothing wrong with the equation.
The -86.75 is the constant term that is needed to make the line pass through the middle of the data. No store has staff of zero. The line predicts the outcome in the range of the data.

(5)b. What is the economic interpretation of the coefficient on Staff in this equation?
Each additional person added to the staff is associated with an increase in camera sales of 58.427.

5. This question is based on the regression in Question 4. The average store size is a little under 5 people. We will use 5 for the average value of staff. I want to predict the sales of Cameras for a store that has Staff = 10 employees.

(3)a. What is the prediction of Camera sales for a store with Staff = 10?
The prediction is cameras = -86.75 + 58.427(10) = 497.52

(5)b. What are the lower and upper limits of a prediction interval for the sales of this store?

(2)c. It’s OK if you use 2.0 or 1.96 for calculating the width of your interval. But, this small sample has only 11 observations. The appropriate critical value is somewhat larger than 2. The table you need is on slide 37 of Notes Part 2. What is the correct value?

b. and c. The standard error to use is

$$\sqrt{s_e^2\left(1+\frac{1}{N}\right) + (x^*-\bar{x})^2\,SE_b^2} = \sqrt{49.9738^2\left(1+\frac{1}{11}\right) + (10-5)^2\,5.3832^2} = 58.726$$

The bounds of the interval are 497.52 ± t*(58.726). It is typical to use 1.96 or 2.00 for t*. This is a very small sample. The actual appropriate value from the table mentioned in part c is the t with N-2 = 9 degrees of freedom, 2.262.

6. Being an economic philosopher, I decided to use the production data in Question 4 to test a couple of hypotheses. The following shows the results of my regression of the log of video sales on the log of floor space. I interpret this as a regression of the log of output on the log of capital.

(5)a. Theory 1 (Marxist) holds that capital is not productive. The null hypothesis is that the coefficient on log capital (logfloor) is zero. Test this hypothesis.
t = (1.9352 - 0)/.3651 = 5.30.
This is much larger than 2.262, so the hypothesis is rejected.

(5)b. Theory 2 (bland and noncommittal) holds that capital might be productive, but there are constant returns to scale. The null hypothesis is that the coefficient is one. Test this hypothesis.
t = (1.9352 - 1)/.3651 = 2.561. Still larger than 2.262, so this is also rejected.

(10)7. The P value reported in the Analysis of variance table in the regression in Question 6 is 0.000. Since this is a probability, the program is reporting that the probability of the model is 0.000, i.e., impossible. Therefore we should discard the model as worthless and build a new model. True or false? Explain.
The P value reports the probability of observing a coefficient as large (far from zero) as what we observed if the actual coefficient really were zero. Since P = 0.000, we reject that assumption. The P value says it is (essentially) impossible to observe this value b1 = 1.9352 if the true value of β1 really were zero.

(10)8. Using the data from Question 1, I decided to regress the log of income on age. The results are shown below. Notice that Minitab reports that R² equals 0.0%. This is nonsense. This is sloppiness on the part of the people who wrote the software. Luckily, you can compute the R² for this regression using other numbers that are reported in the results. What is the right value for R²?
R² = 2.9503/6599.6116 = 0.000447.

9. The figure below shows a scatter plot of the monthly returns on Microsoft stock vs. Walmart stock. There are 70 observations in the sample.

(5)a. Based on this figure, is the correlation between these two variables positive or negative? Justify your answer.
Positive. The slope of the line is positive. This is the same as the sign of the correlation.

(5)b. Which is a good guess of the correlation between these two variables? Explain.
0.01    0.25    0.95    1.00
If you answered “negative” in part a, then put a minus sign on the guesses above.
0.01 is close to zero, which is what an unorganized blob of points would show. But, the regression line slopes up, so this is not reasonable. .95 and .99 are extremely high. The points are not that organized around the line. That leaves 0.25. (The actual value is .2486.)

(10)10. Using the data in the figure in Question 9, I computed a linear regression of the Microsoft price on the Walmart price. I then computed the residuals and plotted them in the figure below. Looking at these results, I don’t see any patterns that would make me concerned about the model; they look like random noise to me. What kinds of patterns would make me (an analyst) question whether my model is an appropriate regression model?
Long streaks of positive and/or negative values would suggest nonrandomness. Large numbers of very large residuals might also be suggestive. This scatter of residuals looks like an unorganized blob of points that swings randomly between positive and negative.
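
The arithmetic in Questions 5, 6, and 8 can be checked with a few lines of Python. This is only a sketch that plugs in the values reported in the answers above; the variable names are illustrative assumptions, and the critical value 2.262 is typed in from the t table rather than computed.

```python
import math

# Question 5: prediction interval for a store with Staff = 10.
# All inputs are the values reported in the answers; names are illustrative.
b0, b1 = -86.75, 58.427   # constant and slope from the Question 4 regression
s_e = 49.9738             # standard error of the regression (s_e)
se_b1 = 5.3832            # standard error of the slope (SE_b)
n = 11                    # number of stores in the sample
x_bar = 5.0               # average Staff, rounded to 5 as stated in Question 5
x_star = 10.0             # Staff value for the store being predicted
t_crit = 2.262            # t table value for N - 2 = 9 degrees of freedom

prediction = b0 + b1 * x_star
se_pred = math.sqrt(s_e**2 * (1 + 1 / n) + (x_star - x_bar)**2 * se_b1**2)
lower = prediction - t_crit * se_pred
upper = prediction + t_crit * se_pred

# Question 6: t statistics for the two null hypotheses on the logfloor coefficient.
b, se_b = 1.9352, 0.3651
t_zero = (b - 0) / se_b   # H0: coefficient = 0
t_one = (b - 1) / se_b    # H0: coefficient = 1

# Question 8: R-squared as regression sum of squares over total sum of squares.
r_squared = 2.9503 / 6599.6116

print(f"Q5 prediction = {prediction:.2f}, standard error = {se_pred:.3f}")
print(f"Q5 interval   = [{lower:.2f}, {upper:.2f}]")
print(f"Q6 t (H0: 0)  = {t_zero:.3f}, t (H0: 1) = {t_one:.3f}")
print(f"Q8 R-squared  = {r_squared:.6f}")
```

With t* = 2.262 the Question 5 interval runs from about 365 to 630 cameras. Replacing t_crit with 1.96 or 2.0 shows how much narrower the interval looks when the small sample size is ignored.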