ST512
HOMEWORK 3
SSII10
Q3.1
A medical study was conducted to examine the relationship between systolic blood
pressure and two explanatory variables, weight (kg) and age (days), for infants. The
data for 25 infants are shown here.
Systolic Blood Pressure, Weight and Age relationship in infants
July 18, 2010
Obs  Infant  Age  Weight  SystolicBP  sqrd_age  sqrd_weight  age_weight
  1       1    3    2.61          80         4       0.7225        1.70
  2       2    4    2.67          90         1       0.6241        0.79
  3       3    5    2.98          96         0       0.2304        0.00
  4       4    6    3.98         102         1       0.2704        0.52
  5       5    3    2.87          81         4       0.3481        1.18
  6       6    4    3.41          96         1       0.0025        0.05
  7       7    5    3.49          99         0       0.0009        0.00
  8       8    6    4.03         109         1       0.3249        0.57
  9       9    3    3.41          88         4       0.0025        0.10
 10      10    4    2.81          90         1       0.4225        0.65
 11      11    5    3.24         100         0       0.0484        0.00
 12      12    6    3.75         102         1       0.0841        0.29
 13      13    3    3.18          86         4       0.0784        0.56
 14      14    4    3.13          93         1       0.1089        0.33
 15      15    5    3.98         101         0       0.2704        0.00
 16      16    6    4.55         100         1       1.1881        1.09
 17      17    3    3.41          86         4       0.0025        0.10
 18      18    4    3.35          91         1       0.0121        0.11
 19      19    5    3.75         100         0       0.0841        0.00
 20      20    6    3.83         105         1       0.1369        0.37
 21      21    3    3.18          84         4       0.0784        0.56
 22      22    4    3.52          91         1       0.0036       -0.06
 23      23    5    3.49          95         0       0.0009        0.00
 24      24    6    3.81         104         1       0.1225        0.35
 25      25    6    4.03         109         1       0.3249        0.57
Mean Age = 4.56
Mean Weight = 3.4584
sqrd_age=(age-5)*(age-5);
sqrd_weight=(weight- 3.46)*(weight- 3.46);
age_weight=(age-5) *(weight- 3.46);
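The three SAS assignments above can be mirrored outside SAS. The following is a minimal Python sketch (Python is an assumption here; the course itself uses SAS) that recomputes the sample means and the centered columns from the Age and Weight values in the data table. Note that the centering constants are 5 and 3.46, as in the SAS statements, not the exact means 4.56 and 3.4584. The function name `centered_terms` is hypothetical.

```python
# Age (days) and Weight (kg) for the 25 infants, copied from the data table.
AGE = [3, 4, 5, 6, 3, 4, 5, 6, 3, 4, 5, 6, 3, 4, 5, 6, 3, 4, 5, 6, 3, 4, 5, 6, 6]
WEIGHT = [2.61, 2.67, 2.98, 3.98, 2.87, 3.41, 3.49, 4.03, 3.41, 2.81,
          3.24, 3.75, 3.18, 3.13, 3.98, 4.55, 3.41, 3.35, 3.75, 3.83,
          3.18, 3.52, 3.49, 3.81, 4.03]

AGE_CENTER = 5.0      # constant used in the SAS statements (not the mean 4.56)
WEIGHT_CENTER = 3.46  # mean weight rounded to two decimals


def centered_terms(age, weight):
    """Return (sqrd_age, sqrd_weight, age_weight) for one infant."""
    a = age - AGE_CENTER
    w = weight - WEIGHT_CENTER
    return a * a, w * w, a * w


mean_age = sum(AGE) / len(AGE)
mean_weight = sum(WEIGHT) / len(WEIGHT)
terms = [centered_terms(a, w) for a, w in zip(AGE, WEIGHT)]

if __name__ == "__main__":
    print(f"Mean Age = {mean_age:.2f}, Mean Weight = {mean_weight:.4f}")
    for obs, (sa, sw, aw) in enumerate(terms, start=1):
        # Adding 0.0 turns a negative zero into 0.0 for display.
        print(obs, round(sa, 4), round(sw, 4), round(aw, 2) + 0.0)
```

The rounded values can be compared row by row with the sqrd_age, sqrd_weight and age_weight columns of the data table.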
1. Run the procedure PROC CORR for Age, Weight and SystolicBP.
   a. Which explanatory variable shows the highest linear correlation with
   SystolicBP?
   b. Are the explanatory variables linearly correlated?
2. Run a regression analysis with both explanatory variables and their interaction in the
model and write the estimated prediction equation of SystolicBP.
3. Interpret the regression coefficient for weight.
4. A 5-day-old infant was found to weigh 3.1 kg. Compute a 95% confidence
interval for her SystolicBP.
5. Compute a 95% confidence interval for 5-day-old infants with a weight of 3.1 kg.
6. Interpret the regression coefficient for the interaction of age and weight. What is the
effect on SystolicBP of a one-kilogram increase in the weight of 3-day-old infants?
What would the effect be if the infant is 6 days old?
7. Can we assume that the assumptions for the residuals are valid?
8. A residual diagnostic analysis was run. Should observation 16 be dropped from the
analysis?
9. If your answer to 8) is affirmative, rerun the analysis without observation 16. Explain
the changes in the regression coefficients and R Squared.
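For the interaction question, it may help to write the model out. The following is a sketch, assuming the interaction enters through the centered product age_weight defined above; the coefficient symbols $\beta_0,\dots,\beta_3$ are notation introduced here, not taken from the assignment.

```latex
\begin{align*}
E[\text{SystolicBP}] &= \beta_0 + \beta_1\,\text{Age} + \beta_2\,\text{Weight}
  + \beta_3\,(\text{Age}-5)(\text{Weight}-3.46) \\
\frac{\partial\,E[\text{SystolicBP}]}{\partial\,\text{Weight}}
  &= \beta_2 + \beta_3\,(\text{Age}-5) \\
\text{effect at Age}=3 &= \beta_2 - 2\beta_3 \\
\text{effect at Age}=6 &= \beta_2 + \beta_3
\end{align*}
```

Under this parameterization, $\beta_2$ alone is the effect of one additional kilogram for 5-day-old infants, since the centered age term vanishes at Age = 5. If the uncentered product Age × Weight is used instead, the weight effect becomes $\beta_2 + \beta_3\,\text{Age}$.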
Q3.2
A construction science project compared the daily gas consumption of 20
homes with a new form of insulation to that of 20 similar homes with standard
insulation. Instruments were set up to record the temperature both inside and
outside the homes over a six-month period (October-March). The
average differences in these values are given below. The average daily gas
consumption (in kilowatt-hours) was also obtained. All the homes were heated
with gas.
1. Run a regression analysis of gas consumption on temperature
differences separately for each type of insulation.
2. Run a joint regression analysis of gas consumption on temperature
differences for both types of insulation, fitting a separate intercept
and a separate slope for each type of insulation (Full Model).
3. Run an overall regression analysis of gas consumption on temperature
differences for both types of insulation jointly, with a single regression
line for both types of insulation (Reduced Model 1).
4. Run a joint regression analysis of gas consumption on temperature
differences for both types of insulation, fitting a separate intercept
and a common slope for each type of insulation (Reduced Model 2).
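One common way to set up these three models is with an indicator variable for insulation type, and to compare a reduced model with the full model through the extra-sum-of-squares F statistic. The sketch below is in Python rather than SAS and assumes that setup; the indicator coding (1 = new, 0 = standard) and the function names are hypothetical, and no data from the assignment are used.

```python
def design_row(temp_diff, is_new, model):
    """Return one row of the design matrix for the chosen model.

    temp_diff: inside-outside temperature difference for a home
    is_new:    1 for new insulation, 0 for standard (assumed coding)
    model:     "full", "reduced1" (single line) or "reduced2" (common slope)
    """
    if model == "full":       # separate intercepts and separate slopes
        return [1.0, temp_diff, float(is_new), temp_diff * is_new]
    if model == "reduced1":   # one line for both insulation types
        return [1.0, temp_diff]
    if model == "reduced2":   # separate intercepts, common slope
        return [1.0, temp_diff, float(is_new)]
    raise ValueError(f"unknown model: {model}")


def residual_df(n_obs, model):
    """Residual degrees of freedom: observations minus fitted parameters."""
    n_params = len(design_row(0.0, 0, model))
    return n_obs - n_params


def partial_f(sse_reduced, df_reduced, sse_full, df_full):
    """Extra-sum-of-squares F statistic for a reduced model nested in a full one."""
    numerator = (sse_reduced - sse_full) / (df_reduced - df_full)
    denominator = sse_full / df_full
    return numerator / denominator
```

With 40 homes, `residual_df(40, "full")` is 36, and Reduced Models 1 and 2 leave 38 and 37 residual degrees of freedom. The value returned by `partial_f` is compared with an F distribution on (df_reduced - df_full, df_full) degrees of freedom.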
Answer the following questions:
1. Obtain the estimated regression lines for the two types of
insulation.
2. Compare the fit of the two lines.
3. Is the rate of increase in gas consumption as the temperature difference
increases smaller for the new type of insulation? Justify your answer by
using 95% confidence intervals.
4. If the rates are comparable, describe how the two lines differ.
5. Predict the average gas consumption for homes using new insulation and
for homes using standard insulation when the temperature difference is °F.
6. Place 95% confidence intervals on your predicted values in part 5).
7. Is there evidence that average gas consumption has been reduced by
using the new form of insulation?
Q3.3
The data give the normal average January minimum temperature in degrees
Fahrenheit, with the latitude and longitude, of 56 U.S. cities. (For each year
from 1931 to 1960, the daily minimum temperatures in January were added
together and divided by 31. Then, the averages for each year were averaged
over the 30 years.)
Ref: Peixoto, J.L. (1990). A property of well-formulated polynomial
regression models. American Statistician, 44, 26-30.
Observed facts
1. A partial regression plot of Lat vs JanTemp shows that the relationship
between JanTemp and Lat, after removing the effects of Long, is linear
and negative.
2. A partial regression plot of Long shows that the relationship between
JanTemp and Long, after removing the effects of Lat, is not linear.
3. The regression model assumes that the relationship between JanTemp and
Long is linear. This plot shows that this assumption is clearly
violated.
4. Peixoto (1990) reports a study in which a linear relationship is
assumed between JanTemp and Lat; then, after removing the effects of
Lat, a cubic polynomial in Long is used to predict JanTemp.
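Written out, the model described in fact 4 might look like the following. This is a sketch; the coefficient symbols and the error term $\varepsilon$ are notation introduced here, and Peixoto's article should be consulted for the exact form used there.

```latex
\[
\text{JanTemp} = \beta_0 + \beta_1\,\text{Lat}
  + \beta_2\,\text{Long} + \beta_3\,\text{Long}^2 + \beta_4\,\text{Long}^3
  + \varepsilon
\]
```

The model keeps the linear and quadratic terms in Long alongside the cubic term, so it is hierarchical: every lower-order power of Long is retained.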
Run the analyses that support the facts above:
1. Run a multiple linear regression with Latitude and Longitude as
explanatory variables.
   a. Do a residual diagnostic analysis.
   b. Is there evidence that multiple linear regression is not the
   best fit?
2. Run a simple regression model of JanTemp on longitude,
   a. output the residuals,
   b. plot Latitude (X axis) vs residuals (Y axis),
   c. describe the type of relationship between latitude and
   residuals.
3. Run a simple regression model of JanTemp on latitude,
   a. output the residuals,
   b. plot Longitude (X axis) vs residuals (Y axis),
   c. describe the type of relationship between longitude and
   residuals.
4. Run a multiple regression model with a linear term in Latitude and a cubic
polynomial in Longitude. Use Type I SS to test the significance of the
cubic coefficients.
5. Write the regression equation and indicate R Squared.
6. Plot the residuals against Longitude and describe the changes.
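Steps 2 and 3 follow the same pattern: fit a simple regression, keep the residuals, and plot them against the other variable. A minimal Python sketch of the fit-and-residual step follows (the course uses SAS; Python is used here only for illustration). The function name `simple_ols_residuals` and the small arrays in the usage lines are hypothetical and are not the 56-city data.

```python
def simple_ols_residuals(x, y):
    """Fit y = b0 + b1*x by least squares; return (b0, b1, residuals)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx               # slope
    b0 = mean_y - b1 * mean_x    # intercept
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    return b0, b1, residuals


# Made-up illustration values, NOT the assignment's city data.
demo_x = [1.0, 2.0, 3.0, 4.0]
demo_y = [2.0, 4.1, 5.9, 8.0]
b0, b1, res = simple_ols_residuals(demo_x, demo_y)
```

For step 2, assuming JanTemp is the response, `x` would hold longitude and `y` JanTemp, and the returned residuals go on the Y axis against latitude; step 3 swaps the roles of the two explanatory variables.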
From Peixoto's article:
Polynomial models
Well-formulated model