SMART Notebook - Kenston Local Schools

Chapter 3 Section 2 day 7 2016s Notes.notebook
March 03, 2016
Honors Statistics
Thursday March 3, 2016
Daily Agenda
3. Review OTL C3#10
4. Review Sprint data
5. Space Shuttle activity
r
(0 , a)
residual = observed y - predicted y
can be explained by
least squares regression line of
Study the formulas sheet
for a quiz tomorrow.
A skip none
pg 198: 70 use the LSRL worksheet
examine the point (116, 41)
determine its influence.
I made the Lists LRUSH,LPTSC
PLEASE CHECK YOUR DATA LISTS BEFORE CONTINUING WITH THE WORKSHEET.
Below are the results for the statistics.....
rushing yards statistics
points scored statistics
[Scatterplot: Points scored (vertical axis) vs. Rushing yards (horizontal axis)]
The scatterplot shows a weak positive linear association between total rushing
yards and points scored in each game of the 2011
Jacksonville NFL season.
11.39 + 0.03x
y = 11.39 + 0.03x
I pick x = 250 yards rushing
y = 11.39 + 0.03(250) = 18.89
This seems feasible. Scoring 19 points is fairly difficult,
but Jacksonville did score 41 in one game.
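The prediction above can be checked with a short sketch. The function name predict_points is only an illustrative choice, and the intercept and slope are the rounded values from these notes, so results from the full-precision calculator output may differ slightly.

```python
# Rounded LSRL coefficients from the notes: y = 11.39 + 0.03x
INTERCEPT = 11.39  # predicted points when rushing yards = 0
SLOPE = 0.03       # predicted change in points per additional rushing yard


def predict_points(rushing_yards):
    """Return the predicted points scored for a given rushing total (illustrative helper)."""
    return INTERCEPT + SLOPE * rushing_yards


# Prediction for the chosen value x = 250 rushing yards
print(round(predict_points(250), 2))  # 18.89
```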
s = 8.26
[Plot; horizontal axis: Rushing yards]
y = 7.23 + 0.05x
This point's position suggests that it is a residual outlier (also called a regression
outlier). Without the point, the association between rushing yards and
points scored becomes stronger (r rises from 0.10 to 0.32), and the standard deviation of the
residuals drops from s = 8.26 to s = 4.18. Therefore, this point is a residual outlier (or regression outlier).
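One way to see why the point stands out is to compute its residual. This sketch assumes the point in question is (116, 41), the one the worksheet asks about. It uses the rounded full-data line y = 11.39 + 0.03x and s = 8.26 from these notes, and the function names are illustrative.

```python
def predicted_points(rushing_yards):
    """Predicted points from the rounded full-data LSRL in the notes."""
    return 11.39 + 0.03 * rushing_yards


def residual(observed_y, predicted_y):
    """residual = observed y - predicted y"""
    return observed_y - predicted_y


S_FULL_DATA = 8.26  # standard deviation of the residuals with all games included

# Assumed point of interest: 116 rushing yards, 41 points scored
res = residual(41, predicted_points(116))

print(round(res, 2))                # 26.13 points above the line
print(round(res / S_FULL_DATA, 1))  # 3.2, i.e. about three times the typical prediction error
```

A residual this large compared with s is consistent with calling the point a regression outlier.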
Final Conclusion:
A linear model is appropriate for this data set. The
residual plot and scatterplot have no curved pattern.
The correlation coefficient r = 0.10 and coefficient of
determination r² = 0.01 show a very weak association. The standard
deviation of the residuals is s = 8.26, so predictions of
the number of points scored are typically off by about 8.26 points.
There is one data point that is a regression outlier.
Without this point the association becomes stronger.
So while a linear model is the right form, the model is not
very accurate and should be used with extreme caution,
or perhaps not at all.
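The link between r and r² in this conclusion can be sketched as follows. The helper name r_squared is illustrative, and the r values are the rounded ones from these notes, so the results are approximate.

```python
def r_squared(r):
    """Coefficient of determination for a least squares line: the square of r."""
    return r ** 2


# With all games included, r = 0.10
print(round(r_squared(0.10), 2))  # 0.01, so about 1% of the variation in points is explained

# With the outlier removed, r = 0.32
print(round(r_squared(0.32), 2))  # 0.1, so still only about 10% of the variation is explained
```

Even after the outlier is removed, the line accounts for only a small share of the variation, which supports the advice to use the model with extreme caution.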
3.1
calc 3.2
Sentence 3.3
Sentence 3.4
3.5
calc 3.6
3.7
Sentence 3.8
3.9
calc
3.10
3.11
calc
3.12
Sentence calc 3.13
Sentence
3.14
3.15
3.16
The Final conclusion is like the summary paragraph of a term paper.
Address the following statements.
1. Should a linear model be used to model this data set?
How do you know?
What is your evidence?
Look back at the original data plot and the evidence of the residual plot.
2. If a line is appropriate, what is the strength of the model?
How do you know?
What is your evidence?
Look back at the r, r², and s values. What strength do they show?
3. Are there any special points that affect the model?
What are they, and do they make the model stronger or
weaker when they are removed?
Look back at the last pattern deviations investigation.
What does it show?
Final Conclusion:
regression to mean video
https://youtu.be/B98XzmOA7eg
A music video called Regression to the mean
https://youtu.be/7Td0kSVXoI0
Should we use a line?
How good is the line at predicting y from x?
How do we calculate the "error" of the prediction?
REGRESSION EXAMPLE
Data is collected from a small statistics class.
Members participated in a 40-yard sprint and the long jump.
Data Lists are called ....
(7.25, 110)
This scatterplot displays a strong negative linear
association between a student's 40-yard sprint time in
seconds and their long jump distance in inches.
The correlation coefficient r = -0.84 confirms a strong
negative linear association between a student's 40-yard
sprint time in seconds and their long jump distance in
inches.
Track and Field Day
x = sprint time
y = long jump distance
y = 414.79 - 45.74x
When the sprint time is 0 seconds, the predicted
long jump distance is 414.79 inches.
As the sprint time increases by 1 second, the long jump
distance is predicted to decrease by 45.74 inches.
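The intercept and slope interpretations can be checked numerically. In this sketch the function name and the sprint times of 5 and 6 seconds are assumptions chosen only to show a one-second increase; the coefficients are the rounded values from these notes.

```python
def predict_jump(sprint_seconds):
    """Predicted long jump distance (inches) from the rounded LSRL y = 414.79 - 45.74x."""
    return 414.79 - 45.74 * sprint_seconds


# Intercept: the prediction when sprint time is 0 seconds
print(round(predict_jump(0), 2))  # 414.79

# Slope: the change in the prediction when sprint time rises by 1 second
# (5 and 6 seconds are hypothetical times, not values from the class data)
change = predict_jump(6) - predict_jump(5)
print(round(change, 2))  # -45.74
```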
Track and Field
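long jump = 414.79 - 45.74(15 seconds) = -271.31 inches
A negative jump distance is impossible, so the line should not be used for a sprint time this slow.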
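A quick sketch of the calculation for a 15-second sprint, using the rounded line from these notes. The function name and the is_plausible check are illustrative additions, not part of the worksheet.

```python
def predicted_long_jump(sprint_seconds):
    """Predicted long jump distance (inches) from y = 414.79 - 45.74x."""
    return 414.79 - 45.74 * sprint_seconds


jump_at_15 = predicted_long_jump(15)
print(round(jump_at_15, 2))  # -271.31

# A jump distance cannot be negative, so this prediction is a warning sign
# that the line is being used far outside the sprint times it was built from.
is_plausible = jump_at_15 >= 0
print(is_plausible)  # False
```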
Track and Field Day
r² = 0.70
70% of the variation in the long jump distances can be explained by the
variation in the 40-yard sprint times as calculated by the LSRL of
inches on seconds. (or jump distances on sprint times)
s = 22.38 inches
is the standard deviation of the residuals. It is the
typical amount by which an observed jump distance, in inches, differs
from the jump distance predicted by the LSRL.
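For reference, the standard deviation of the residuals for a least squares line is the square root of the sum of squared residuals divided by n - 2. The sketch below shows that calculation. The residuals in it are made-up toy numbers, not the class data, and the function name is an assumption.

```python
import math


def residual_std(residuals):
    """Standard deviation of the residuals: sqrt(sum of squared residuals / (n - 2))."""
    n = len(residuals)
    if n <= 2:
        raise ValueError("need more than two residuals")
    return math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))


# Toy residuals for illustration only (NOT the sprint and long jump data)
toy_residuals = [1.0, -1.0, 2.0, -2.0]
print(round(residual_std(toy_residuals), 3))  # sqrt(10 / 2) = 2.236
```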
7.47
The residual plot shows a scatter of
points so a linear model is appropriate.
The residuals are rather large so the model
will not be very accurate.
(7.25, 110)
With the point (7.25, 110)        Without the point (7.25, 110)
y = 414.79 - 45.74x               y = 447.33 - 51.33x
r = -0.84                         r = -0.86
r² = 0.70                         r² = 0.75
s = 22.38                         s = 21.27
Removing the point (7.25, 110) changed the y-intercept a little and the slope a little,
so it is an influential observation.
Removing it made the association slightly stronger, but not enough to
call the point a regression outlier.
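To get a feel for how far the line moves, the two fitted lines can be compared at the sprint time of the point itself. This sketch uses the rounded equations from these notes (414.79 - 45.74x with the point, 447.33 - 51.33x without it); the function names are illustrative.

```python
def line_with_point(x):
    """LSRL fitted with (7.25, 110) included."""
    return 414.79 - 45.74 * x


def line_without_point(x):
    """LSRL fitted with (7.25, 110) removed."""
    return 447.33 - 51.33 * x


x = 7.25  # sprint time of the point being examined, in seconds

pred_with = line_with_point(x)
pred_without = line_without_point(x)

print(round(pred_with, 1))                 # 83.2 inches
print(round(pred_without, 1))              # 75.2 inches
print(round(pred_with - pred_without, 1))  # 8.0 inches: including the point pulls the line up toward it
```

A shift of about 8 inches is small next to s, which is a little over 20 inches for either line, in keeping with the conclusion that the point changes the model only slightly.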
Final Conclusion:
A linear model is the correct model for this field day data because the original
scatterplot of the data does not show any curving and the residual plot
appears scattered (it has no pattern).
The linear model for the sprint time vs. long jump data is a strong
model. The r = -0.84 and r² = 0.70 values are strong. The standard deviation of the residuals is
22.38 inches, so our predictions for long jumps are typically off by about 22.38 inches.
The point (7.25, 110) changes the line and correlation only slightly.
Use the model with a bit of caution because the predictions are only somewhat
accurate.
Managing Diabetes
Fasting Plasma
406.77
30.23
4.78
HbA
20.62
This scatterplot displays a moderately weak positive linear
association between HbA and fasting plasma.
r = 0.48
The correlation coefficient verifies the moderately weak positive
linear association between HbA and fasting plasma.
That's enough for you to get
started, we will go over the
rest in class tomorrow!