Final - Form A
Spring 2003
Economics 173
Instructor: Petry

Name_____________ SSN______________

Before beginning the exam, please verify that you have 18 pages with 50 questions in your exam booklet. You should also have a decision-tree and formula sheet provided by your TA. Please include your full name, social security number and Net-ID on your bubble sheets. Good luck!

Use the following information to answer the next eight questions (#1-8).

You are interested in understanding the home run hitting ability of young major league baseball players. You decide to run a regression with the dependent variable:

HRs: Number of Home Runs Hit by the Player in the Most Recently Completed Major League Season.

You identify three independent variables:

Minor HR: Number of Home Runs the Player Hit in His Last Season as a Minor Leaguer.
Age: The player’s age.
Years Pro: Number of Years the Player Has Been a Professional Ball Player.

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.59256
R Square             0.351128
Adjusted R Square    0.335172
Standard Error       6.992105
Observations         126

ANOVA
              df     SS             MS          F     Significance F
Regression    3      3227.612245    1075.871          1.85592E-11
Residual             5964.522676    48.88953
Total         125    9192.134921

              Coefficients   Standard Error   t Stat      P-value     Lower 95%        Upper 95%
Intercept     -1.96998       9.547049398      -0.20634    0.836866    -20.86933228     16.92938
Minor HR       0.665838      0.087149184       7.640212   5.46E-12      0.493317598     0.838359
Age            0.135728      0.524087215       0.258979   0.796088     -0.901756157     1.173212
Years Pro      1.176371      0.670625334       1.75414    0.081917     -0.151200086     2.503942

1. The test statistic for testing the model’s overall significance is:
   a. 0.0454
   b. 22.006
   c. 7.640
   d. 0.524
   e. 0.671

2. The degrees of freedom for the test statistic named above are:
   a. 3
   b. 122
   c. 3 and 122
   d. 122 and 125
   e. 122 and 126

3. The conclusion from the test performed above is:
   a. none of the three independent variables are significant
   b. all three independent variables are significant
   c. the dependent variable is significant
   d. at least one of the independent variables is significant
   e. all of the above

4. Based on the t-tests for individual significance, and at a 5% level of significance, the variable(s) that DO have a significant impact on home run hitting ability is (are):
   a. Minor HR
   b. Age
   c. Years Pro
   d. All the above
   e. Both a and c

5. Ignoring the results from any significance tests conducted on the model, the estimated number of HRs hit by a player who hit 22 HRs in his last season in the minor leagues, is 20 years old and has 3 years of professional ball playing experience is:
   a. 19
   b. 15
   c. 12
   d. 10
   e. 5

6. Suppose that in this study, you generated a correlation matrix for the 3 independent variables, as given below. Based SOLELY on this correlation matrix, which of these problems would you suspect?

              Minor HR    Age         Years Pro
Minor HR      1
Age           0.035416    1
Years Pro     -0.03916    0.837398    1

   a. non-normality
   b. autocorrelation
   c. heteroskedasticity
   d. multicollinearity
   e. none of the above

7. In order to fix the problem identified in the previous question, you would:
   a. drop either Minor HR or Age from the model
   b. drop either Minor HR or Years Pro from the model
   c. drop either Years Pro or Age from the model
   d. include the log of age in the model
   e. do nothing since there is no problem

8. Assume that it was appropriate to conduct a Durbin-Watson test on the regression output, and that the DW test statistic was 2.36. The DW critical values are: dL=1.61 and dU=1.74. What should you conclude from this test?
   a. heteroskedasticity is present
   b. homoskedasticity is present
   c. multicollinearity is present
   d. autocorrelation is present
   e. the test proves inconclusive

Use the following information to answer the next two questions (#9-10).

In a study of income differentials, data was collected for 100 subjects on their Incomes (in thousands of dollars), Years of Education, Age, and Number of Children They Had.
Then, the natural log of income was taken as the y-variable, and regressed on the three independent variables named above. The output is given below:

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.759978
R Square             0.577566
Adjusted R Square    0.564365
Standard Error       0.264954
Observations         100

ANOVA
              df    SS          MS          F
Regression    3     9.214115    3.071372    43.75147
Residual      96    6.739241    0.0702
Total         99    15.95336

              Coefficients   Standard Error   t Stat      P-value
Intercept      2.189232      0.156791         13.9627     7.58E-25
Education      0.092041      0.008131         11.31922    2.26E-19
Age            0.001391      0.002276          0.611137   0.542553
Children      -0.01082       0.020423         -0.5299     0.597402

9. According to this output, for every additional child, the impact on income is:
   a. decreases by 0.01082 dollars
   b. decreases by 10.82 dollars
   c. decreases by 0.989 dollars
   d. decreases by 989 dollars
   e. increases by 989 dollars

10. Disregarding any tests on the significance of individual independent variables, the estimated income, in thousands of dollars, for someone with 17 years of education, 42 years of age, with 3 children would be:
   a. 3.78
   b. 3779.9
   c. 43.81
   d. 24.6
   e. 51.12

11. In a multiple regression model the subjects’ ethnicities are to be represented as independent variables. All subjects fall into five ethnic groups: Caucasian, African-American, Asian, Native-American and Hispanic. How many dummy variables must be constructed to adequately represent all five groups?
   a. 1
   b. 2
   c. 3
   d. 4
   e. 5

Use the following information to answer the next two questions (#12-13).

Following is the output from a regression of Used Car Prices on Car Color and Odometer Reading. Car Color is a qualitative variable, with levels White, Silver and Other Colors, and is therefore represented in the model via dummy variables.
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.8354822
R Square             0.6980304
Adjusted R Square    0.6885939
Standard Error       142.27105
Observations         100

ANOVA
              df    SS             MS          F
Regression    3     4491749.241    1497250     73.97095
Residual      96    1943140.949    20241.05
Total         99    6434890.19

              Coefficients   Standard Error   t Stat      P-value
Intercept     6350.3231      92.16652879      68.90053    1.5E-83
White         45.240979      34.08443045      1.327321    0.187551
Silver        -147.73801     38.18498973      3.869007    0.000199
Odometer      -0.0277698     0.002368579      -11.7242    3.14E-20

12. On average, and odometer readings being the same, a White colored car would sell for how much more (or less) than a Silver colored car?
   a. 192.98
   b. -192.98
   c. 102.50
   d. -102.50
   e. 45.24

13. According to this model, the estimated average selling price of a car that is of a Color other than White or Silver (neither White nor Silver), and has an Odometer reading of 27,125 miles would be:
   a. 6350.32
   b. 5642.31
   c. 5597.07
   d. 5449.33
   e. 4200.91

Use the following information to answer the next five questions (#14-18).

These questions are based on Project 2. You are expected to be able to recall that entire scenario. Provided below are the ANOVA tables from the full and reduced model regressions, respectively:

From full model:

ANOVA
              df     SS         MS          F           Significance F
Regression    15     2386035    159069      24.08315    3.63E-42
Residual      284    1875818    6604.991
Total         299    4261852

From reduced model:

ANOVA
              df     SS             MS          F           Significance F
Regression    8      2318693.212    289836.7    43.40481    2.06E-45
Residual      291    1943159.21     6677.523
Total         299    4261852.422

14. Going from the full to the reduced model, how many variables get dropped?
   a. 6
   b. 7
   c. 8
   d. 9
   e. 10

15. In order to test if the variables dropped are significant as a group, which statistical test should be conducted?
   a. t-test, two sample, assuming equal variances
   b. t-test, two sample, assuming unequal variances
   c. chi square test for variance
   d. F-test for overall significance
   e. Partial F-test

16. The calculated value of the test statistic for the test referred to above is:
   a. 1.227
   b. 1.117
   c. 1.025
   d. 1.457
   e. cannot be calculated due to insufficient information

17. The degrees of freedom for the test statistic calculated above is (are):
   a. 6 and 284
   b. 7 and 284
   c. 6
   d. 7
   e. 284

18. Given that the relevant critical value for this test is 2.04, your conclusion should be:
   a. fail to reject the null hypothesis, therefore choose the reduced model
   b. fail to reject the null hypothesis, therefore choose the full model
   c. reject the null hypothesis, therefore choose the reduced model
   d. reject the null hypothesis, therefore choose the full model
   e. the test proves inconclusive

19. The range of values that R² can possibly take is from -1 to 1.
   a. True
   b. False

20. While R² can go down if irrelevant variables are included in the model, adjusted R² always goes up upon the inclusion of new variables.
   a. True
   b. False

Use the following information to answer the next two questions (#21-22).

MON    TUE    WED    THU    FRI    SAT    SUN
35     42     56     46     67     51     39

21. After doing a centered 2 period moving average on this series, the moving averages for Friday, Saturday and Sunday are (in that order):
   a. 50, 53.75, 57.75
   b. 53.75, 57.75, 52
   c. 57.75, 52, not available
   d. 52, not available, not available
   e. 67, 51, 39

22. The exponentially smoothed value (use a smoothing constant of 0.4) for Sunday is:
   a. not available
   b. 54.07
   c. 52.84
   d. 47.30
   e. 39

Use the following information to answer the next two questions (#23-24).

The following table gives you the actual observations from a time series (y) and the corresponding residuals obtained after fitting a trend to the series.

Observation    Y       Residuals
1              6.9     -1.89763
2              7.6     -1.59915
3              8.5     -1.10068
4              11.3     1.297798
5              12.7     2.296273
6              10.9     0.094749
7              11.9     0.693224
8              11.6    -0.0083
9              10.2    -1.80982
10             11.5    -0.91135

23. The percent of trend value for period 7 is:
   a. 1.06
   b. 0.94
   c. 17.17
   d. 8.25
   e. cannot be calculated from the information provided

24. After obtaining the percent of trend values for all periods, you plotted them against time. This procedure is designed to reveal the presence of which time-series component?
   a. trend
   b. seasonal
   c. cyclical
   d. random
   e. irregular

Use the following information to answer the next five questions (#25-29).

Following is some output you might expect to be generated when dealing with a seasonal time series. In this case we have quarterly data, for which the trend regression output is given below:

              Coefficients   Standard Error   t Stat      P-value
Intercept     23.67601329    2.256404591      10.4928     3.51E-13
Time          0.310413772    0.089331913      3.474836    0.001221

Based on this model, percent of trend values were calculated, and an attempt was made to construct seasonal indices. The results from this initial attempt are given below:

Q1    0.78
Q2    1.01
Q3    0.88
Q4    1.36

25. Making any necessary adjustments to these initial results, the final seasonal index for the third quarter would be:
   a. 0.774
   b. 1.002
   c. 1.350
   d. 0.873
   e. 0.88

26. Given that the actual value of the series (y) for period 19, which happens to be a third quarter, is 23.43, what is the seasonal plus trend based forecast (that is, the forecast that takes into account BOTH the trend and the seasonal components) for that period?
   a. 29.57
   b. 25.82
   c. 20.46
   d. 26.03
   e. 26.82

27. Relying on all the information above, what should the seasonally adjusted (deseasonalized) value of the series be for period 19?
   a. 29.57
   b. 25.83
   c. 20.46
   d. 26.03
   e. 26.84

The SAME time series discussed above is now analyzed by representing the quarters by indicator variables.
The output is given below:

ANOVA
              df    SS             MS          F
Regression    4     2438.89821     609.7246    63.33698
Residual      38    365.8136645    9.626675
Total         42    2804.711874

              Coefficients   Standard Error   t Stat      P-value
Intercept     34.6197333     1.291752195      26.8006     2.76E-26
Time          0.30514848     0.038191454      7.989968    1.17E-09
Q1            -17.5114879    1.356199997      -12.9122    1.8E-15
Q2            -10.4739091    1.355662143      -7.72605    2.62E-09
Q3            -14.3417848    1.356199997      -10.575     7.06E-13

28. Given that this is sales data, name the quarter that clearly outperforms ALL other quarters:
   a. Quarter 1
   b. Quarter 2
   c. Quarter 3
   d. Quarter 4
   e. Cannot be determined from the given information

29. Forecast the sales value for period 20, according to this indicator variable model:
   a. 34.62
   b. 40.72
   c. 26.08
   d. 42.10
   e. Cannot be determined from the given information

30. Which of the following statements is FALSE?
   a. SSE should be used for model selection when it is important to avoid any large errors.
   b. Autoregressive models are based on regressing a time series on its past values.
   c. In autoregressive models some observations are lost, more so if the order of the model is high.
   d. MAD is a criterion for model selection when several forecasting techniques are available.
   e. When using MAD for model selection, the model with the largest MAD statistic should be chosen.

Use the following information to answer the next four questions (#31-34).

Recently a member of the United Nations was sent to Baghdad. He was assigned to compare the prices that looters are charging for hospital equipment on the black market (population 1) as compared to the prices that are charged by the regular manufacturers (population 2). The United Nations representative claims that the black market prices are lower than the regular market prices. Assume that because of the volatile situation in Baghdad, the looters have a drastically higher variation in prices as compared to manufacturers.
You take a sample of 50 looted blood pressure cuffs and a sample of 300 blood pressure cuffs from various regular manufacturers. You find that the average price for the sample of looted cuffs was $65 with a standard deviation of $20.50 as compared to the $75 charged by cuff manufacturers with a standard deviation of $3.25.

31. What are the null and alternative hypotheses to test the UN representative’s claim?
   a. H0: μ2 = μ1; H1: μ2 < μ1
   b. H0: μ1 - μ2 = 0; H1: μ1 - μ2 > 0
   c. H0: μ1 - μ2 = 0; H1: μ1 - μ2 < 0
   d. H0: μ1 - μ2 = 0; H1: μ1 - μ2 ≠ 0
   e. H0: μ1 = μ2; H1: μ1 ≠ μ2

32. The test statistic for the test above is:
   a. -3.44
   b. -7.92
   c. -5.39
   d. 5.39
   e. unable to determine because the variances are not known and cannot be assumed equal or not equal.

33. What is the correct formula in Excel to calculate the p-value for the test statistic in the previous question?
   a. FDIST(abs(test statistic),df1,df2,,300)
   b. TDIST(test statistic,df,2)
   c. FDIST(test statistic,df,50)
   d. NORMSDIST(test statistic)
   e. TDIST(abs(test statistic),df,1)

34. Which shaded area (designated by the arrow) is the appropriate p-value for the test designated above?
   [The five answer figures, a through e, are not reproduced here.]

35. Suppose the United Nations wants to make an inference about the population price of black-market blood pressure cuffs (population 1 parameter) based upon the sample using a confidence interval. What is the 95% confidence interval for blood pressure cuff prices on the black market?

   TINV(0.025,49) = 2.311
   TINV(0.05,49) = 2.009
   TINV(0.025,299) = 2.252
   TINV(0.05,300) = 1.968

   a. 74.63, 75.37
   b. 74.58, 75.42
   c. 59.29, 70.71
   d. 59.18, 70.82
   e. 58.30, 71.70

36. Suppose the United Nations wants to test the claim that the population variance of black-market blood pressure cuff prices is equal to $20. What is the correct test to use?
   a. Pooled-variance t test for the difference of two means
   b. F test for the difference in variances
   c. T-test for a single population mean
   d. Z-test for a single population mean
   e. Chi-square test for a single population variance

37. What sequence of commands is necessary to reach the following menu?

   Moving Average
   Rank and Percentile
   Random Number Generation
   Regression
   Sampling
   t-Test: Paired Two Sample for Means
   t-Test: Two Sample Assuming Equal Variances
   t-Test: Assuming Unequal Variances
   z-Test: Two Sample for Means
   [Buttons: OK, Cancel, Help]

   a. Data --> Tools --> Descriptive Statistics
   b. Tools --> Data Analysis
   c. Tools --> Descriptive Statistics --> Data Analysis
   d. Descriptive Statistics --> Tools
   e. Regression --> Descriptive Statistics

38. The Central Limit Theorem states:
   a. Only when a random sample is drawn from a normally distributed population will the sampling distribution of the sample mean be approximately normal for a sufficiently large sample size.
   b. If a random sample is drawn from any population, the sampling distribution of the sample mean is approximately normal even with a small sample size.
   c. If a random sample is drawn from any population, the sampling distribution of the sample mean is approximately normal for a sufficiently large sample size.
   d. It is impossible to go from a sampling distribution to a standardized distribution regardless of the population distribution.
   e. None of the above

39. Which of the following is FALSE regarding Hypothesis Testing?
   a. When the p-value for the test is less than alpha for the test, the null hypothesis is always rejected.
   b. The currently accepted hypothesis or the status quo is designated as the null hypothesis, whereas the claim being tested is the alternative hypothesis.
   c. The equal sign is never in the alternative hypothesis.
   d. The alpha or significance level is associated with a Type I error.
   e. The p-value for a one-tailed test is always equal to the p-value of the two-tailed test divided in half.

40. Mark claims that students majoring in life sciences are particularly “slow” in understanding statistics and that their final exam grades are lower than 78.
Tonight he finds out that students in the life sciences scored an average of 81. The p-value for the two-sided test is .04. What is your conclusion for this test based upon a 5% level of significance?
   a. There is insufficient evidence to reject the null hypothesis
   b. H0: μ > 78
   c. H1: μ = 78
   d. The null hypothesis is rejected
   e. None of the above

Use the following information to answer the next two questions (#41-42).

Based on the summary regression outputs between the S&P 500 index (the market) and five individual stocks, and your recollection of Project 1, answer the following two questions.

Theragenics
              Coefficients    Standard Error   t Stat          P-value
Intercept     2.321611303     2.89727262       0.801309234     0.426223
S & P 500     -0.48015053     0.62430997       0.769089974     0.444961

CDN
              Coefficients    Standard Error   t Stat          P-value
Intercept     -4.5450303      2.22063883       -2.04672198     0.045226
S & P 500     1.289943573     0.4785076        2.695764034     0.009175

Xerox
              Coefficients    Standard Error   t Stat          P-value
Intercept     -2.14726773     2.17235151       -0.98845317     0.327037
S & P 500     1.82500000      0.468100         3.8987400000    0.000253

Mattel
              Coefficients    Standard Error   t Stat          P-value
Intercept     -0.1667143      1.58542855       -0.10515409     0.916616
S & P 500     0.152880037     0.34163124       0.447500161     0.656181

Walmart
              Coefficients    Standard Error   t Stat          P-value
Intercept     1.53009625      1.04315895       1.466791088     0.147834
S & P 500     -1.0523076      0.22478193       4.681460317     1.76E-05

41. Suppose that several large companies are posting their expected earnings tomorrow and that investors expect the market to go DOWN significantly. Where would you invest your money, i.e. which stock would you buy?
   a. Theragenics
   b. CDN
   c. Xerox
   d. Mattel
   e. Walmart

42. Suppose that you want to test whether Xerox stock moves perfectly with the market (S&P 500 index). Use a 5% significance level. What is your test statistic and what is your conclusion (the sample size for Project 1 was 60)? (t0.025,59 = 2.000, t0.05,59 = 1.671, t0.10,59 = 1.296)
   a. 3.89874, conclude that you cannot reject the claim that Xerox and S&P 500 move perfectly together
   b. 3.89874, conclude that Xerox and S&P 500 move perfectly together
   c. 1.762444, conclude that you cannot reject the claim that Xerox and S&P 500 move perfectly together
   d. 1.762444, conclude that Xerox and S&P 500 move perfectly together
   e. 1.762444, conclude that you cannot reject the claim that Xerox and S&P 500 move perfectly together, but at 1% level of significance, you would be able to conclude that they move together

43. The test statistic for testing whether there is a linear relationship between Xerox and Mattel is? Note: there are 60 observations in the sample.

                Theragenics   CDN Beverage   Xerox       Mattel      Wal-Mart
Theragenics     1
CDN Beverage    -0.00642      1
Xerox           0.340769      0.293372       1
Mattel          0.074253      0.211244       0.076418    1
Wal-Mart        -0.04206      0.19027        0.21453     0.142672    1

   a. 0.076418
   b. 0.583686
   c. 0.001274
   d. 1
   e. 4.585059

Use the following information to answer the next two questions (#44-45).

Suppose that you are a big baseball enthusiast and that you like both Sammy Sosa and Barry Bonds. You are interested in testing whether there is a difference in batting averages between the two of them. You are given the following summary data, where SS indicates Sammy Sosa and BB indicates Barry Bonds.

                    SS       BB
Sample mean (x)     0.341    0.352
Sample std dev (s)  0.144    0.221
Sample size (n)     15       18

44. First, you have to decide which test to perform. Your initial strategy is to perform an F-test to see whether variances of the two samples are equal or not. What is the value of your F statistic?
   a. 0.651584
   b. 0.424561
   c. 0.96875
   d. 0.7819
   e. 0.509474

45. Given that the upper critical value is F0.025,14,17 = 2.752, what would you do to test whether there is a difference in batting averages between Sammy Sosa and Barry Bonds?
   a. perform a difference in means test assuming equal variances
   b. perform a matched pairs test for difference in means
   c. perform a difference in means test assuming unequal variances
   d. perform a ratio of variances test to check whether variances of the two samples are equal
   e. perform a z test for difference in means

Use the following information to answer the next two questions (#46-47).

It is said that group studying improves your chances of receiving an A on the final exam. To test this claim, you gather grades from the UIUC students. More specifically, you separate students who study in groups and the ones who don’t. Out of 1,700 observations, you found that 800 of them study in groups and that 340 of those got A’s. Out of the people who don’t study in groups, 345 got A’s.

46. At 5% significance, what is the test statistic and your conclusion? (Z0.025 = 1.96, Z0.05 = 1.645, Z0.1 = 1.28)
   a. -0.252, students studying in groups don’t have a better chance of getting an A
   b. 1.748, students studying in groups don’t have a better chance of getting an A
   c. -0.175, students studying in groups don’t have a better chance of getting an A
   d. 1.748, students studying in groups have a better chance of getting an A
   e. -0.252, students studying in groups have a better chance of getting an A

47. You are given summary data of study time in hours (x) and exam grades out of 100 (y).

   n = 20, x̄ = 10, ȳ = 82, Σxᵢyᵢ = 16,000, Σxᵢ² = 1900

Using the given specifications, predict the expected grade of a student who studies 13 hours.
   a. 42
   b. 4
   c. 94
   d. 13
   e. 81.333

48. Suppose that you ran a regression between watching tv and eating chips, where the dependent variable is bags of chips eaten during a week and the independent variable is hours of watching tv during the week. You are given the summary of the regression output below.

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.539579
R Square
Adjusted R Square    0.26583
Standard Error
Observations         30

ANOVA
              df    SS          MS          F    Significance F
Regression    1     94.93884    94.93884
Residual      28    231.148     8.255287
Total         29

Calculate the coefficient of determination and interpret it.
   a. 0.2911, 29.11% of variation in hours of tv watching is explained by variation in the number of bags of chips eaten
   b. 2.873, 2.873% of variation in the number of bags of chips eaten is explained by variation in hours of tv watching
   c. 11.5, 11.5% of variation in the number of bags of chips eaten is explained by variation in hours of tv watching
   d. 0.2911, 29.11% of variation in the number of bags of chips eaten is explained by variation in hours of tv watching
   e. 2.873, 2.873%, of variation in the number of bags of chips eaten is explained by variation in hours of tv watching

Use the following information to answer the next two questions (#49-50).

You are given the summary output of the regression between age and hours of reading newspapers during a week. Age is the independent variable, while Hours of Reading Newspapers During a Week is the dependent variable.

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.333683
R Square             0.111345
Adjusted R Square    0.096023
Standard Error       16.66673
Observations         60

ANOVA
              df    SS          MS          F    Significance F
Regression    1     2018.665    2018.665         0.009175
Residual      58    16111.22    277.7797
Total         59    18129.89

              Coefficients   Standard Error   t Stat    P-value
Intercept     -4.54503       2.220639                   0.045226
AGE           1.289944       0.478508

49. If you want to formally test the validity of the model, which test would you perform and what would be your test statistic?
   a. F-test, 0.009175
   b. t-test, -2.04672
   c. Z-test, 0.111345
   d. t-test, 1.289944
   e. F-test, 7.267144

50. Suppose you want to test whether variable AGE is a significant variable in the above model. What is your test statistic and the p-value for that test?
   a. test stat: 7.267144; p-value: 0.045226
   b. test stat: 7.267144; p-value: 0.096023
   c. test stat: 2.695764; p-value: 0.009175
   d. test stat: -2.04672; p-value: 0.045226
   e. test stat: 2.695764; p-value: 0.045226

Answer Key:

 1. B    2. C    3. D    4. A    5. A    6. D    7. C    8. E    9. E   10. C
11. D   12. A   13. C   14. B   15. E   16. D   17. B   18. A   19. B   20. B
21. C   22. D   23. A   24. C   25. D   26. B   27. E   28. D   29. B   30. E
31. C   32. A   33. E   34. E   35. E   36. E   37. B   38. C   39. E   40. A
41. E   42. C   43. B   44. B   45. A   46. D   47. C   48. D   49. E   50. C
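
Worked check of selected keyed answers. The following is a minimal Python sketch, assuming the standard textbook formulas for each test statistic, that recomputes the keyed values for questions 1, 16, 22, 32 and 46 from the figures printed in those questions. The function names are hypothetical and chosen only for illustration; the group size of 900 in question 46 is derived as 1,700 minus 800.

```python
import math


def overall_f(ms_regression, ms_residual):
    """Overall-significance F statistic: MS regression / MS residual."""
    return ms_regression / ms_residual


def partial_f(sse_reduced, sse_full, dropped, mse_full):
    """Partial F statistic for a group of variables dropped from the full model."""
    return ((sse_reduced - sse_full) / dropped) / mse_full


def unequal_variance_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic that does not pool the variances."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


def two_proportion_z(successes1, n1, successes2, n2):
    """z statistic for H0: p1 = p2, using the pooled sample proportion."""
    p1, p2 = successes1 / n1, successes2 / n2
    pooled = (successes1 + successes2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error


def exponential_smoothing(series, w):
    """Smoothed series with S_1 = y_1 and S_t = w * y_t + (1 - w) * S_(t-1)."""
    smoothed = [series[0]]
    for value in series[1:]:
        smoothed.append(w * value + (1 - w) * smoothed[-1])
    return smoothed


# Question 1: MS values from the home run regression ANOVA table.
print("Q1 ", round(overall_f(1075.871, 48.88953), 3))

# Question 16: residual SS of the reduced and full models, 7 dropped variables.
print("Q16", round(partial_f(1943159.21, 1875818, 7, 6604.991), 3))

# Question 22: daily series MON..SUN with smoothing constant 0.4.
print("Q22", round(exponential_smoothing([35, 42, 56, 46, 67, 51, 39], 0.4)[-1], 2))

# Question 32: looted cuffs (population 1) versus manufacturers (population 2).
print("Q32", round(unequal_variance_t(65, 20.50, 50, 75, 3.25, 300), 2))

# Question 46: 340 of 800 group studiers and 345 of the other 900 got A's.
print("Q46", round(two_proportion_z(340, 800, 345, 900), 3))
```

Each printed value can be compared with the option selected in the answer key for that question; small differences in the last digit can arise from rounding of the inputs.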