April 26

M243
Multiple Regression
Observation = Model + Error
April 26
Observation = Fitted + Residual
1. How to choose b0 and b1 in logistic regression model. Model: the probability of success p for a given x is given by
p
= β0 + β1 x
ln
1−p
> gj= glm( Made~Distance,data=janzen,family=binomial)
> x=data.frame(Distance=c(0:40))
> p=predict(gj,x,type=’response’)
2. More than one predictor.
y = β0 + β1 x 1 + · · · + βk x k + ε
3. Same model for ε.
4. Example on other side.
5. Interpretation of coefficients.
6. The really simple null hypothesis and the F -test
H 0 : β1 = · · · = βk = 0
7. Assumptions and conditions
Assumption
Linearity
Independence
Equal variance
Normal population
Condition
Straight enough
Randomization
Plot thickens
Nearly normal, Outlier
Homework
1. Read Chapter 30, pages 783–791.
2. For Tuesday, May 4, do problem 30.12.
Tool
plot residuals against predictors and fitted
residuals against fitted
qqnorm(residuals())
1. Data on 50 states educational systems. Expenditures
per student, salaries of teachers, student-faculty ratio,
SAT scores, and percentage of eligible students taking
the SAT.
1053.32
> lm( PctSAT~Exp,data=EducExp)
Call:
lm(formula = PctSAT ~ Exp, data = EducExp)
> EducExp[1:3,]
State
Exp SFR Salaries PctSAT SATV SATM SAT
1 Alabama 4.405 17.2
31.144
8 491 538 1029
2 Alaska 8.963 17.6
47.951
47 445 489 934
3 Arizona 4.778 19.3
32.175
27 448 496 944
...................................................
> pairs(EducExp[,2:8])
Coefficients:
(Intercept)
-33.48
> lsep = lm (SAT ~ Exp + PctSAT ,data =EducExp)
> summary(lsep)
> lm( SAT~Exp, data=EducExp)
Call:
lm(formula = SAT ~ Exp, data = EducExp)
Call:
lm(formula = SAT ~ Exp + PctSAT, data = EducExp)
Residuals:
Min
1Q
-88.400 -22.884
Exp
-20.89
3. But SAT scores are also related negatively to the percentage
of students taking the SAT and that percentage is positively
related to expenditures.
Call:
lm(formula = SAT ~ PctSAT, data = EducExp)
●
18
PctSAT
22
20
40
60
80
●●
●
●
450
●●
● ●
●
500
●
●
550
●●
● ●
●
●●
●●
● ●
●
● ●●
●
●
● ●●
●
●●
● ● ●
●● ● ●
●
●
● ●
●●
●●
●
● ●●
● ●
● ●●
●
●
● ●
●
●
●
● ●● ●
● ●
●
●
●●
●● ●●
●
● ●● ● ●●
●
●● ●●
●
●
●
●
●
●
●●
●
● ●●
● ●
●
●
●
●●
●●● ●
●
●
●● ● ●
●
●
●
●●●
●●●
● ●
●
●
●
●●●
●
●
●
●
●
●
● ● ●
●
●● ●● ●
●
●●
● ●
● ●
●
●
●
● ● ● ●● ●
●
●
●
●●
●
●
●
● ● ●
●● ● ●●
● ●
●
●● ● ●
●●●●
●
●
●
●
● ●
●● ●
● ●
●
●
●
●● ●●
●
●● ●
●
● ●
● ●
●
●●
●
●● ●● ●
●●
●
●
●
●
●●●●
●
●
●
●
●
●●
● ●●
● ●
●
●
●
● ●
●
●
●
●● ●
●
●
●●
●
●
●●
●
●●
●● ●
● ●
●
●● ● ●
●●●●
●
●
●
●
●
●
●
22
●
●
●
4
●
●
Exp
6
8
●
60
40
20
●
● ●
●●
●
●
●● ● ●
● ●●
●
●●
●●
● ● ●● ●
●
450
500
550
●
●
●
● ●
●
●
●
●●
●
●
●●
●●
● ●
●
● ●● ●
●
●
●
●
●
●
●● ●
● ● ●● ●
●●
●
●●
●
●●
●
●
●●
●
● ●●
●
●
●
●
●
●
4 5 6 7 8 9
●●
●
●●
●●
●
●
●
●
●
●●
●
●●
●
● ●● ●
●
● ●●
●
●
●
●
●
●
●
●
●●
●
●
●
● ●●●●
●
●
● ●
●
●
●
●●●
●
●
●
●●●●● ● ● ●
●
●●
● ● ●
●
●
●
●
●
●●
●
●● ●
● ●
● ●
●
● ●●●
●
●
● ●●
●●
●
●●●
●
●
●
●
●
●●
●
●
●●
●●● ●●● ●
●
●
●
● ●● ● ●
●●
●
●●
●●
●
●● ●
●
●
●
●
●
● ● ●
●
●
●
●
●
●
●●
●
● ●●
●
●●
●●●
● ●
●
●
●●
● ●
●
●
●
●●
●
●
●
●
●
● ●
●
●
●●● ●●
●
● ●
● ● ●
●
●
● ●● ●
●
●
● ● ● ●
●● ●
●●● ●
● ● ●●
●
●
● ●
●
●
●
●
●
●
●
●
●●
●
●
●
●●
●
●
● ●●
● ●
●
● ●● ●
●
●
●
●
●
●
●
●●
●
●
●
●●
●● ● ●
●
●
●
●
30
40
●
●
●
●
●
●●●
●
●
●●
●
●●
●●
50
●
●
●
● ●
●●
●
●
●●
50
●
●
●
●
●●●● ●
● ●●
●● ● ●
●
●● ●●
●
●
●
●●●
●
● ●
● ●●
● ● ●●
●
●
●
●● ●
●
● ●
● ●
●
●
●●
●●
●
●
●●
●●
●●
● ●● ● ●
●
●
●
●
●
●
●
●
●
●
● ●
● ●● ●
●
● ●
●
●
●● ●
●●
●
●●
●
●
●
●●
●
●
●
● ●
●
●
●
●●
●●●●
●
●●●●
● ●● ●● ●●
●
●
●
● ●●
●● ●
●● ●
●
●
●●
●
●●●●
●
●●
●
●
●
●
●
●●
●●
●●
●●
●●
●
●●
● ●
●
●
●●
●
●●
●
●●
●●
●
●●
●●●
●
●
●
●
●
●
●
●
●●
●
●
●
●● ●
●
●●
●
●
● ●●
●
●●●●
●
440
●
SATM
●
●●
●
●●●
●●
●
●●●
●●●
●●
●●
● ●●
●●
●●●
●●
●
●
●
●●
●
●●
●
●
●●
●●
●
400
●
●●
●
●●
●
● ●
●
●
●
●●●●●
●● ●●
● ●
● ● ●●
●●
●
●
●
●
●
●●
●●●
●●●●●
● ● ●
●
●
●
● ●
●
●●
●● ●
● ●
●
● ●●●
●
●
●●
●
●●
●
●●
●● ●
●
●
● ●●
●●●
●
●
●
●
●
● ●
●
●
● ●●
●● ●●
●
●
●
● ●●●
●●● ●
●
●●
●
●
●
SATV
●
●
●
●●
●● ●
●●
●
●
●
●
●● ●
●
●●
●●●
● ●
●●
● ●
●
●
●
●
●
●●●●● ●
●
●
●
●
●●
●●
● ●
●
●
●
●●●●
●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
● ●
●
●
● ●●
● ●● ●●
● ●
●
● ●●
●
●
●●
● ●
●
●●●
●● ● ●
●
●
●
●
●●
●
●●
●
●
●●
●
●
●●
●
●●
●
● ●●
●
●
●
●
●
● ●
●●● ●
● ●
●
●
●●●
●
●
●
●
●
● ●
●●
●
● ● ●
●
●
●
●
● ●
●
●
● ●●
● ●●
● ●
● ●
●
● ● ●●
●
●●
●●
●
●
●●
●● ●
●
●
●● ●
●
●●
●
●● ●
●● ●
●
●
● ● ●●
●●
●
●
● ●
●
●
●
●●● ●
● ●
●
●●
● ●
●
●
●
●●
●
●
●
●
●
●
● ●●
●
●
●
●
●
●
●
● ●
●
●● ● ●
●
● ●●
●
●
●
● ●●
●
●
● ●
● ●
●
● ●●
● ●● ●
●
●
●●
●
●
● ●
●
●●
●
●●
● ●
●
●●
●
●
● ●
●
●
●
●
●
● ●
● ●
●●● ●
● ●
●
●
●● ●
●
●
●●
●
●
●
SAT
●
●
●
●
●●
●●
●
●●
●
●●●●
●●
●
●
●
●
480
520
40
●
PctSAT
●
●
●
●
●●
●
●
● ● ●
●
●● ●
●
●●
● ●
●
● ●
● ●
●
● ●●
●
● ●
●
● ●● ● ●
●
●●
●
●
●
●
●
●
●●
● ●
●
●●
●
● ●
●
●
● ●
● ●
●
● ●
●●
●●
●● ●
●
●
●● ●
● ●
●
●
●
●
●
●
●
●
●
●
● ●
● ●
●●●●●
● ● ●
●●● ●
● ●
●
●
●●
●
● ●
●
● ●
●●
● ●
●
●●
●
●
30
Salaries
●
●
● ●
●●
●● ● ●
● ●
● ●
●
●
●
●●
●
●
● ●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
● ● ●
●
●● ●● ●
●
●
●
●
●
●●
●
● ● ● ●●
●
●
●●
●
● ●●
●●●
●●
●
● ●● ●
●
●
●●
●●● ●
●●
● ●● ●●
●
●
●●
●
●
●
●●
● ● ●●●● ●
●
●
●
●
●
●
●
● ●●
●
● ●
●
●
● ●●
●●
●
● ● ●●
●
● ●
●●
●
●●●
●●
●
●
●●●
● ● ●●
●
●
●
● ●
●
●● ●
●●
● ● ● ●●
●
●●●
●●●
●
● ●
●
●●
●
●
●
●
●●●●●
●
●● ●
● ●●
●
●
● ●
●
●
●
● ●●
●●
●●
●
●
● ●●●
● ●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●● ●●
●
●
●●
●●
●
●
●
●
520
80
●
● ●●●
●●
●●
●
●● ●●
●
●●
●●
●
●
480
●
● ●
●
●
●
●●
●
●
●
●
440
●
●●
●
●
400
●
●●
●
1050
SFR
●
●
●
● ●
●
●
●
●●
●●
●●●
●
●●
●●
●●
●
● ●
●●
950
●
●
●●
● ●
●● ●
●
●
●●
●●
●● ●
● ● ● ● ●●
● ●●
● ●●●●
● ●●●
● ●
● ●
●
● ● ●
● ●
●
●
● ●
●
●
●● ●
●● ●
● ●
● ● ●
● ●
● ●● ●
● ● ●
●
●● ● ●
●
●
●
●● ●
●
●
● ●
●
850
14
18
● ● ●
● ● ●
● ●●
●
●●●
●●
●● ● ● ●●
●●
●●● ●
●
●●
● ●●
●
●●
●
●●
● ● ●
●●
3Q
19.142
Max
68.755
Residual standard error: 32.46 on 47 degrees of freedom
Multiple R-squared: 0.8195,
Adjusted R-squared: 0.8118
F-statistic: 106.7 on 2 and 47 DF, p-value: < 2.2e-16
10
14
Median
1.968
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 993.8317
21.8332 45.519 < 2e-16 ***
Exp
12.2865
4.2243
2.909 0.00553 **
PctSAT
-2.8509
0.2151 -13.253 < 2e-16 ***
--Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1
1
> lm (SAT~PctSAT,data=EducExp)
Coefficients:
(Intercept)
Exp
11.64
4. The multiple regression.
2. SAT scores are related negatively to expenditures!
Coefficients:
(Intercept)
1089.29
-2.48
850
950
1050
2