NCSU ST512 TAKE HOME QUIZ SUM 2 2011

1. A friend has asked you for help in fitting a line for a class project. He has collected runners' world record times at ten distances for men and women. He needs to fit a single linear trend over all data points to show the relationship between time and distance, ignoring the gender of the runner.

The world records at ten distances (outdoor running) are listed for men and women in the table below. The records were taken from the website of the International Association of Athletics Federations (IAAF), http://www.iaaf.org, on July 23, 2011. The marathon is a long-distance running event with an official distance of 42.195 kilometers (26 miles and 385 yards).

               Male                       Female
Distance (m)   time (sec)   date          time (sec)   date
100            9.77         14/06/2005    10.49        16/07/1988
200            19.32        01/08/1996    21.34        29/09/1988
400            43.18        26/08/1999    47.60        06/10/1985
800            101.11       24/08/1997    113.28       26/07/1983
1500           206.00       14/07/1998    230.46       11/09/1993
3000           440.67       01/09/1996    486.11       13/09/1993
5000           757.35       31/05/2004    864.53       03/06/2006
10000          1577.53      26/08/2005    1771.78      08/09/1993
21097.5        3535.00      15/01/2006    4004.00      15/01/1999
42195          7495.00      28/09/2003    8125.00      13/04/2003

As requested, you run a simple linear regression on this dataset and present your friend with the results.

a) Write the estimated simple linear regression equation and test whether the linear regression coefficient is significantly different from 0.

b) As learned in class, you run a lack-of-fit test on this data to check that the linear fit is adequate. Test the hypothesis that a higher-degree polynomial may be needed.

Based on the lack-of-fit test, you decided that the linear trend is fine, and prepared a plot of the observed records and the linear trend.

c) Include the plot of predicted vs. distance. Still, a look at the residuals would not be a bad idea: present a plot of residuals against predicted values and discuss the linear regression fit. Does the plot of predicted against distance fit the data well?
Does the residual plot show whether the linear fit was adequate?

July 23, 2011

d) After looking at the residuals, you decided to try a power function for this relationship. This power function is expressed as a linear function after a log transformation of both the x- and y-variables, as shown next:

    $\text{time} = M \cdot \text{distance}^{b_1}$
    $\log_{10}(\text{time}) = \log_{10}(M) + b_1 \log_{10}(\text{distance})$
    $y = b_0 + b_1 x$

where $b_0 = \log_{10}(M)$, $y = \log_{10}(\text{time})$, and $x = \log_{10}(\text{distance})$.

e) Write the power function equation estimated for this data. What are the estimated values of $M$ and $b_1$?

f) Compute the predicted mean time for a distance of 100 meters, $\text{time}_{100}$. Repeat the computation for a distance of 1000 meters, $\text{time}_{1000}$. Note that $\text{time}(x_{new}) = 10^{\hat{y}}$, where $\hat{y} = b_0 + b_1 x_{new}$.

g) Find the ratio $\text{time}_{1000} / \text{time}_{100}$. Interpret $b_1$.

h) Estimate the mean record time for a distance of 25 km. Calculate the 95% confidence interval for this predicted mean.

i) A plot of residuals against the predicted values is presented below. Discuss whether a separate fit is needed for males and females.

[Figure: Residual plot. Residual (vertical axis, -0.06 to 0.06) against Predicted Value of LOG_TIME (horizontal axis, 1 to 4), with points marked by gender (F, M).]

2. The following data present the results of a study of the effect of ambient temperature and liquid viscosity on the amount of energy (joules/sec) honeybees spend while drinking. Temperature levels were 20 and 30 °C. Levels of liquid viscosity refer to the percentage of sucrose in the total solids dissolved in the liquid. There were two levels of sucrose, 20% and 40%. Each of the 4 combinations of temperature and viscosity was repeated three times under controlled conditions, with bees randomly assigned to each of the four experimental groups.
The following variables were used in the analysis to simplify calculations:

    $X_1 = \frac{\text{Temperature} - 25}{5}$
    $X_2 = \frac{\text{Sucrose} - 30}{10}$
    $X_3 = X_1 X_2$

Note:

Temperature   X1        Sucrose   X2
20            -1        20        -1
30             1        40         1

Temperature   Sucrose   X3
20            20         1
20            40        -1
30            20        -1
30            40         1

Data.

Obs i   temperature   sucrose   rep   energy   x1   x2   x3
1       20            20        1     3.1      -1   -1    1
2       20            20        2     3.7      -1   -1    1
3       20            20        3     4.7      -1   -1    1
4       20            40        1     5.5      -1    1   -1
5       20            40        2     6.7      -1    1   -1
6       20            40        3     7.3      -1    1   -1
7       30            20        1     6.0       1   -1   -1
8       30            20        2     6.9       1   -1   -1
9       30            20        3     7.5       1   -1   -1
10      30            40        1     11.5      1    1    1
11      30            40        2     12.9      1    1    1
12      30            40        3     13.4      1    1    1

a) The following regression model was fit to study the effect of temperature, sucrose, and their interaction on the amount of energy spent:

    $y_j = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + e_j, \quad j = 1, \dots, 12$

b) Test the following hypothesis:

    $H_0: \beta_0 = \beta_1 = \beta_2 = \beta_3 = 0$
    $H_1: \text{not all } \beta_i = 0, \quad i = 1, 2, 3$

c) Write the estimated regression equation (replace each regression coefficient by its estimated value).

d) Write the test hypothesis for each parameter, and the conclusion.

e) How much does the energy change when the temperature increases from 20 to 40 and the viscosity is 20%?

The table of means for each experimental group is presented next.

Group mean
                Temperature
Sucrose         20       30       Mean
20              3.8      6.8      5.3
40              6.5      12.6     9.6
Mean            5.2      9.7      7.4

f) Show that the predicted mean for Temperature = 20 when sucrose is at its average value is equal to the observed mean for this temperature over both sucrose levels.

g) Show that the predicted mean for Sucrose = 40 when temperature is at its average value is equal to the observed mean for that sucrose level over both temperature levels.

h) Show that the predicted value for Temperature = 20 and Sucrose = 30 is equal to the observed mean for the corresponding experimental group.

i) Use the following graph to explain the significance of X3.
[Figure: Energy against temperature by sucrose level. Mean energy (vertical axis, 0 to 14) against temperature (horizontal axis, 20 to 30), grouped by sucrose level (20, 40).]
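
The coded-variable model in Question 2 can be checked numerically. The sketch below is only an illustration, not the method the course prescribes: it assumes NumPy is available, uses ordinary least squares, and takes the twelve observations from the data table. The names `DATA`, `fit_coded_model`, and `predict` are hypothetical helpers introduced here.

```python
import numpy as np

# Observations from the Question 2 data table: (temperature, sucrose, energy).
DATA = [
    (20, 20, 3.1), (20, 20, 3.7), (20, 20, 4.7),
    (20, 40, 5.5), (20, 40, 6.7), (20, 40, 7.3),
    (30, 20, 6.0), (30, 20, 6.9), (30, 20, 7.5),
    (30, 40, 11.5), (30, 40, 12.9), (30, 40, 13.4),
]


def fit_coded_model(data):
    """Least-squares fit of energy on the coded variables x1, x2, x3 = x1 * x2.

    Returns the estimates (b0, b1, b2, b3) as a NumPy array.
    """
    temp = np.array([row[0] for row in data], dtype=float)
    sucrose = np.array([row[1] for row in data], dtype=float)
    energy = np.array([row[2] for row in data], dtype=float)

    x1 = (temp - 25.0) / 5.0       # -1 at temperature 20, +1 at temperature 30
    x2 = (sucrose - 30.0) / 10.0   # -1 at sucrose 20, +1 at sucrose 40
    x3 = x1 * x2                   # interaction term

    # Design matrix with an intercept column followed by x1, x2, x3.
    design = np.column_stack([np.ones_like(x1), x1, x2, x3])
    coef, *_ = np.linalg.lstsq(design, energy, rcond=None)
    return coef


def predict(coef, temperature, sucrose):
    """Predicted energy at a given temperature and sucrose level (uncoded units)."""
    x1 = (temperature - 25.0) / 5.0
    x2 = (sucrose - 30.0) / 10.0
    return coef[0] + coef[1] * x1 + coef[2] * x2 + coef[3] * x1 * x2


if __name__ == "__main__":
    b = fit_coded_model(DATA)
    print("b0, b1, b2, b3 =", np.round(b, 4))
    # Setting sucrose to its average (30) makes x2 = x3 = 0 in the prediction.
    print("predicted energy at temperature 20, sucrose 30:",
          round(float(predict(b, 20, 30)), 4))
```

Because the design is balanced and the model has as many coefficients as experimental groups, the fitted values at the four design points coincide with the group means, so the output can be compared directly against the table of means. This sketch gives point estimates only; the tests asked for in b) and d) additionally need standard errors from a regression routine.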