Logistic Regression

The Logistic Curve for Probabilities
[Figure: logistic curve; x-axis: X (−4 to 4), y-axis: Probability (0.0 to 1.0)]
The Logistic Function
$p = \frac{e^{a+bx}}{1 + e^{a+bx}}$
The Log Odds
$a + bx = \log\frac{p}{1-p}$
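As a quick check that the logistic function and the log odds above are inverses of each other, here is a minimal R sketch. The values of a and b are made up purely for illustration; plogis() and qlogis() are base R's logistic and logit functions.

```r
# Hypothetical coefficients, chosen only for illustration
a <- -1
b <- 0.5
x <- seq(-4, 4, by = 2)

# Logistic function: p = e^(a + bx) / (1 + e^(a + bx))
p <- exp(a + b * x) / (1 + exp(a + b * x))

# Log odds: log(p / (1 - p)) recovers the linear predictor a + bx
log_odds <- log(p / (1 - p))

# Base R provides both transformations directly
all.equal(p, plogis(a + b * x))
all.equal(log_odds, qlogis(p))
all.equal(log_odds, a + b * x)
```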
The Odds
$\text{Odds} = \frac{p}{1-p}$
Cryptosporidium
Fraction of Mice Infected = Probability of Infection
[Figure: Fraction of Mice Infected (0 to 1) vs. Dose (0 to 400)]
Two Different Ways of Writing the Model
# 1) using Heads, Tails (successes and failures as a two-column response)
glm(cbind(Y, N - Y) ~ Dose, data = crypto, family = binomial)

# 2) using the proportion as the response, with weights as the size parameter for the Binomial
glm(Y / N ~ Dose, weights = N, data = crypto, family = binomial)
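To see that the two ways of writing the model give the same fit, here is a small sketch on simulated data. The data frame sim is a made-up stand-in for crypto (it is not the real data set), and the coefficients used to simulate it are assumptions chosen only for illustration.

```r
set.seed(1)

# Simulated stand-in for the crypto data (NOT the real data set):
# N mice per group, Y of them infected
sim <- data.frame(Dose = rep(c(0, 100, 200, 300, 400), each = 4), N = 20)
sim$Y <- rbinom(nrow(sim), size = sim$N,
                prob = plogis(-1.4 + 0.013 * sim$Dose))

# 1) successes and failures as a two-column response
fit1 <- glm(cbind(Y, N - Y) ~ Dose, data = sim, family = binomial)

# 2) proportion as the response, number of trials as weights
fit2 <- glm(Y / N ~ Dose, weights = N, data = sim, family = binomial)

# Both parameterizations give the same coefficient estimates
all.equal(coef(fit1), coef(fit2))
```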
The Fit Model
[Figure: the fit model plotted with the data; y-axis: Fraction of Mice Infected (0 to 1), x-axis: Dose (0 to 400)]
Call:
glm(formula = cbind(Y, N - Y) ~ Dose, family = binomial, data = crypto)

Deviance Residuals:
   Min      1Q  Median      3Q     Max
-3.953  -1.244   0.233   1.553   3.601

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.40777    0.14848   -9.48   <2e-16
Dose         0.01347    0.00105   12.87   <2e-16

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 434.34  on 67  degrees of freedom
Residual deviance: 200.51  on 66  degrees of freedom
AIC: 327

Number of Fisher Scoring iterations: 4

The Meaning of a Logit Coefficient
Logit Coefficient: A 1-unit increase in a predictor = an increase of β in the log odds of the response, so β is a log odds ratio.
$\text{Log Odds Ratio} = \log\left(\frac{p_1}{1-p_1} \Big/ \frac{p_2}{1-p_2}\right)$
We need to know both p1 and β to interpret this.
If p1 = 0.5, the odds are 1.
With β = 0.01347, we multiply 1 by $e^{0.01347}$.
The new odds are 1.013561, which means p = 0.5033674.
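The arithmetic on this slide can be reproduced in a few lines of R. This is only a sketch of the calculation; the variable names are made up for illustration, and the second starting probability (0.9) is an assumed value added to show why the starting point matters.

```r
beta <- 0.01347   # Dose coefficient from the fitted model
p1   <- 0.5       # starting probability

odds1 <- p1 / (1 - p1)         # odds at p1 = 0.5 are 1
odds2 <- odds1 * exp(beta)     # a 1-unit increase multiplies the odds by e^beta
p2    <- odds2 / (1 + odds2)   # convert the new odds back to a probability

round(odds2, 6)   # 1.013561
round(p2, 7)      # 0.5033674

# The same beta shifts the probability by a different amount
# from a different starting point (0.9 is an assumed value)
p1_alt <- 0.9
odds_alt <- p1_alt / (1 - p1_alt) * exp(beta)
p2_alt <- odds_alt / (1 + odds_alt)
c(change_from_0.5 = p2 - p1, change_from_0.9 = p2_alt - p1_alt)
```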
What if we Only Have 1’s and 0’s?
Seed Predators
[Figure: Predation (0 to 1) vs. log.seed.weight (0 to 8)]
http://denimandtweed.com
A Quick Note on Within and Transformation
seeds <- within(seeds, {
log.seed.weight <- log(seed.weight)
})
The GLM
seed.glm <- glm(Predation ~ log.seed.weight,
                data = seeds, family = binomial)
Binned Residuals Should Look Spread Out
[Figure: binned residual plot with 200 bins; x-axis: Fitted, y-axis: Residual]
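A binned residual plot averages the raw residuals within bins of the fitted values, because individual residuals from 0/1 data can only take two values for any given fitted probability. Below is a minimal base-R sketch of that idea on simulated data. The data frame sim, its coefficients, and the choice of 40 bins are all assumptions for illustration, not the seed data or the method used for the slide. The arm package's binnedplot() offers a packaged version; this sketch avoids the dependency.

```r
set.seed(42)

# Simulated 0/1 data standing in for the seed predation data (NOT the real data)
n <- 2000
sim <- data.frame(log.seed.weight = runif(n, 0, 8))
sim$Predation <- rbinom(n, size = 1,
                        prob = plogis(-3 + 0.3 * sim$log.seed.weight))

fit <- glm(Predation ~ log.seed.weight, data = sim, family = binomial)

# Response residuals are observed minus fitted probability
fitted_p  <- fitted(fit)
resid_raw <- residuals(fit, type = "response")

# Split observations into equal-sized bins ordered by fitted value
n_bins <- 40  # hypothetical choice; the slide uses 200 bins
bin <- cut(rank(fitted_p, ties.method = "first"),
           breaks = n_bins, labels = FALSE)

# Average fitted value and average residual within each bin
binned <- data.frame(
  fitted = as.numeric(tapply(fitted_p, bin, mean)),
  resid  = as.numeric(tapply(resid_raw, bin, mean))
)
head(binned)
```

Plotting binned$resid against binned$fitted, for example with plot(binned$fitted, binned$resid), should then show points scattered around zero with no trend if the model fits reasonably well.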
Diagnostics Look Odd Due to Binned Nature of the Data
[Figure: diagnostic panels for the fitted GLM: Residuals vs Fitted, Normal Q−Q, Scale−Location, and Residuals vs Leverage (with Cook's distance); observations 554, 572, and 1113 are flagged]
Fitted Seed Predation Plot
[Figure: Predation (0 to 1) vs. log.seed.weight]
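One way to get the fitted probabilities for a plot like the Fitted Seed Predation Plot is to predict over a grid of predictor values. The sketch below uses simulated data; sim, grid, p_hat, and the simulation coefficients are made-up names and values for illustration, not the seed data.

```r
set.seed(7)

# Simulated 0/1 data (NOT the real seed predation data)
sim <- data.frame(log.seed.weight = runif(500, 0, 8))
sim$Predation <- rbinom(500, size = 1,
                        prob = plogis(-3 + 0.3 * sim$log.seed.weight))

fit <- glm(Predation ~ log.seed.weight, data = sim, family = binomial)

# Predict on an evenly spaced grid; type = "response" returns
# probabilities rather than values on the log-odds scale
grid <- data.frame(log.seed.weight = seq(0, 8, length.out = 100))
grid$p_hat <- predict(fit, newdata = grid, type = "response")

head(grid)
```

The grid can then be drawn as a curve over the raw 0/1 points, for example with plot() followed by lines(grid$log.seed.weight, grid$p_hat).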