Influential Points

By Noelle Hodge
Does the age at which a child begins to talk predict a later score on a test of mental ability? A study of
the development of young children recorded, for each of 21 children, the age in months at which the child
spoke their first word and the Gesell Adaptive Score, the result of an aptitude test taken much later. The data
appear below.
Child   1    2    3    4    5    6    7    8    9   10   11
Age    15   26   10    9   15   20   18   11    8   20    7
Score  95   71   83   91  102   87   93  100  104   94  113

Child  12   13   14   15   16   17   18   19   20   21
Age     9   10   11   11   10   12   42   17   11   10
Score  83   84   84  102  100  105   57  121   86  100

Enter the data into the calculator, List 1 and List 2.
Calculate the LSRL for the data.
• Sketch a scatter plot with the LSRL.
• Sketch a residual plot of the data.
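For readers without a graphing calculator, the same LSRL and residuals can be sketched in plain Python. This is only an illustration: the function name fit_lsrl is made up for this example, and the ages and scores are typed in as they appear in the data table.

```python
# Ages (months at first word) and Gesell scores for children 1-21,
# copied from the data table in this handout.
ages = [15, 26, 10, 9, 15, 20, 18, 11, 8, 20, 7,
        9, 10, 11, 11, 10, 12, 42, 17, 11, 10]
scores = [95, 71, 83, 91, 102, 87, 93, 100, 104, 94, 113,
          83, 84, 84, 102, 100, 105, 57, 121, 86, 100]


def fit_lsrl(xs, ys):
    """Return (slope, intercept) of the least-squares regression line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


slope, intercept = fit_lsrl(ages, scores)

# Residual = observed score minus the score predicted by the line.
residuals = [y - (intercept + slope * x) for x, y in zip(ages, scores)]

print(f"predicted score = {intercept:.2f} + ({slope:.3f}) * age")
for child, (x, r) in enumerate(zip(ages, residuals), start=1):
    print(f"child {child:2d}  age {x:2d}  residual {r:7.2f}")
```

Plotting the residuals against age gives the residual plot asked for above; the printed list is enough to see which children sit far from the line.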
Is there a point that seems like an outlier in
the y-direction?
• Circle it.
• Which child is it?
Child 19
Is there a point that seems like an outlier in
the x-direction?
• Circle it.
• Which child is it?
Child 18
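The two answers above can be checked numerically. The sketch below assumes that "outlier in the y-direction" means the largest residual in absolute value, and that "outlier in the x-direction" means the highest leverage, computed here with the usual simple-regression formula 1/n + (x - x_bar)^2 / Sxx. Both readings are assumptions made for illustration, and the data are typed in from the table.

```python
# Ages and Gesell scores for children 1-21, copied from the data table.
ages = [15, 26, 10, 9, 15, 20, 18, 11, 8, 20, 7,
        9, 10, 11, 11, 10, 12, 42, 17, 11, 10]
scores = [95, 71, 83, 91, 102, 87, 93, 100, 104, 94, 113,
          83, 84, 84, 102, 100, 105, 57, 121, 86, 100]

n = len(ages)
x_bar = sum(ages) / n
y_bar = sum(scores) / n
sxx = sum((x - x_bar) ** 2 for x in ages)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ages, scores))

# Least-squares line fitted to all 21 children.
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Vertical distance of each child from the line.
residuals = [y - (intercept + slope * x) for x, y in zip(ages, scores)]

# Leverage grows as a child's age moves away from the mean age.
leverages = [1 / n + (x - x_bar) ** 2 / sxx for x in ages]

# Child numbers are 1-based, list positions are 0-based.
y_outlier = max(range(n), key=lambda i: abs(residuals[i])) + 1
x_outlier = max(range(n), key=lambda i: leverages[i]) + 1

print("largest residual: child", y_outlier)
print("highest leverage: child", x_outlier)
```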
Remove the point you chose for the outlier in
the y-direction.
• Sketch the scatter plot with the LSRL.
• Sketch the residual plot of the data.
How do this LSRL and these plots differ from the original?
With point removed:
Original:
Insert this data point back into your data.
• STAT -> edit
• 2nd -> DEL (Insert) -> (enter 17)
• Cursor over to the y column -> 2nd -> DEL (Insert) -> (enter 121)
Remove the point you chose for the outlier in
the x-direction.
• Sketch the scatter plot with the LSRL.
• Sketch the residual plot of the data.
How do this LSRL and these plots differ from the original?
With point removed:
Original:
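The two "remove a point and refit" comparisons can also be sketched in Python. The helper names fit_lsrl and without_child are made up for this example, and the data are typed in from the table; treat the output as a check on your calculator work, not a replacement for the sketches.

```python
# Ages and Gesell scores for children 1-21, copied from the data table.
ages = [15, 26, 10, 9, 15, 20, 18, 11, 8, 20, 7,
        9, 10, 11, 11, 10, 12, 42, 17, 11, 10]
scores = [95, 71, 83, 91, 102, 87, 93, 100, 104, 94, 113,
          83, 84, 84, 102, 100, 105, 57, 121, 86, 100]


def fit_lsrl(xs, ys):
    """Return (slope, intercept) of the least-squares regression line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def without_child(child):
    """Return copies of the data with one child (1-based number) removed."""
    i = child - 1
    return ages[:i] + ages[i + 1:], scores[:i] + scores[i + 1:]


full_slope, full_intercept = fit_lsrl(ages, scores)
slope_no19, intercept_no19 = fit_lsrl(*without_child(19))  # y-direction outlier
slope_no18, intercept_no18 = fit_lsrl(*without_child(18))  # x-direction outlier

print(f"all 21 children : score = {full_intercept:.2f} + ({full_slope:.3f}) * age")
print(f"child 19 removed: score = {intercept_no19:.2f} + ({slope_no19:.3f}) * age")
print(f"child 18 removed: score = {intercept_no18:.2f} + ({slope_no18:.3f}) * age")
```

With the data as listed, removing Child 19 barely moves the slope, while removing Child 18 changes it much more.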
Influential points
• Influence depends on both leverage and residual. A case with high
leverage whose y-value sits right on the line fit to the rest of the data
is not influential: removing that case won’t change the slope, even if
it does affect R². A case with modest leverage but a very large
residual can be influential.
• If a point has enough leverage, it can pull the line right to it.
Such a point is highly influential, yet its residual is small.
• The only way to be sure is to fit both regressions.
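The bullets above can be tried out on a small made-up data set. Everything in this sketch is hypothetical: the ten base points, the extra x-values 30 and 9, and the drop of 40 units were chosen only to make the effect easy to see. Each case fits both regressions, with and without the extra point, and compares slopes.

```python
def fit_lsrl(xs, ys):
    """Return (slope, intercept) of the least-squares regression line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Hypothetical base data: ten points scattered tightly around a line.
base_x = list(range(1, 11))
base_y = [2 * x + 1 + (1 if x % 2 else -1) for x in base_x]
base_slope, base_intercept = fit_lsrl(base_x, base_y)

# Case 1: high leverage (x far from the rest), y exactly on the base line.
x_far = 30
y_on_line = base_intercept + base_slope * x_far
slope_on_line, _ = fit_lsrl(base_x + [x_far], base_y + [y_on_line])

# Case 2: same high-leverage x, but y far below the base line.
slope_off_line, _ = fit_lsrl(base_x + [x_far], base_y + [y_on_line - 40])

# Case 3: x inside the range of the other points, but a very large residual.
x_near = 9
y_low = base_intercept + base_slope * x_near - 40
slope_big_residual, _ = fit_lsrl(base_x + [x_near], base_y + [y_low])

print(f"base slope                     {base_slope:.3f}")
print(f"high leverage, on the line     {slope_on_line:.3f}")
print(f"high leverage, off the line    {slope_off_line:.3f}")
print(f"modest leverage, big residual  {slope_big_residual:.3f}")
```

In Case 1 the slope does not move, so the point is not influential despite its leverage. In Cases 2 and 3 the slope changes noticeably.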
• Unusual points in a regression often tell us more about the data and
the model than any other points.
• Whenever you have influential points, you should fit the linear model
to the other points alone and then compare the two regression
models to understand how they differ.
Just Checking:
• For each of the three scatter plots, tell whether the point indicated is
a HIGH LEVERAGE POINT, would have a LARGE RESIDUAL, or IS
INFLUENTIAL.
[Figure: three scatter plots of y against x, each with one point indicated.]
Not high leverage, not influential, large residual
High leverage, not influential, small residual
High leverage, influential, not large residual