25 Notes - Inferences with Matched Pairs

INFERENCE WITH
MATCHED PAIRS
a special type of t-inference
AP Statistics
Chapter 25
Which situation requires a 2-sample t procedure, and which requires matched pairs?
A researcher wishes to determine whether listening to music
affects students' performance on a memory test.
a)
He randomly selects 50 students and has each student perform a memory test once while listening to
music and once without listening to music. He then compares the two scores for each student, “with
music” vs. “without music”…
b)
He takes 50 students, and has half of them perform a memory test without listening to music, and the
other half perform the memory test while listening to music. He then obtains the means and standard
deviations of the “with music” and the “without music” groups…
Which situation requires a 2-sample t procedure, and which requires matched pairs?
A manufacturer has designed athletic footwear which it hopes
will improve the performance of athletes running the 100-meter
sprint.
a)
30 athletes are selected for this study. One group of 15 runs the sprint wearing the new footwear, and
the other group of 15 runs with their normal footwear. He then compares the mean sprint times
between the two groups…
b)
30 athletes are selected, each of them runs one sprint wearing the new footwear, and also one sprint
with their normal footwear. Randomization (flipping a coin for each athlete) determines which
footwear they run with first. The two times for each athlete are compared…
DESIGNING STUDIES in tonight’s HW!!!
A couple of tips (reminders?):
• For an experiment, you don’t need a random sample (gasp!) – volunteers are
okay! But use randomization to split subjects into groups
(use an RNG… or flip a coin for each person…
it is OKAY for groups to be DIFFERENT sizes)
• When designing a matched-pairs procedure, EVERYONE gets both
“treatments” – so randomize the order!!!
(if practical. For “before-after” scenarios, you can’t really do this…)
30 athletes are selected… randomization (flipping a coin for each athlete) determines which footwear
they run with first. The two times for each athlete are compared…
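The two randomization tips above can be sketched in Python. This is only an illustration: the athlete IDs, the seed, and the function names `assign_groups` and `assign_orders` are assumptions made for the example, not part of the notes.

```python
import random

def assign_groups(subjects, rng):
    """Two-group design: flip a fair coin for each subject."""
    groups = {"new": [], "normal": []}
    for subject in subjects:
        side = "new" if rng.random() < 0.5 else "normal"
        groups[side].append(subject)
    return groups  # the two groups may end up different sizes, which is okay

def assign_orders(subjects, rng):
    """Matched pairs: every subject gets both treatments, in a random order."""
    orders = {}
    for subject in subjects:
        first = "new" if rng.random() < 0.5 else "normal"
        second = "normal" if first == "new" else "new"
        orders[subject] = (first, second)
    return orders

rng = random.Random(25)        # hypothetical seed so the run is repeatable
athletes = list(range(1, 31))  # hypothetical athlete IDs 1..30

groups = assign_groups(athletes, rng)
orders = assign_orders(athletes, rng)

print(len(groups["new"]), len(groups["normal"]))  # group sizes (may differ)
print(orders[1])                                  # footwear order for athlete 1
```

In the first design each athlete lands in exactly one group; in the second, every athlete appears once with both footwear types and only the order is random.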
Why matched pairs?
(why not stick with 2-sample t and $\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$ ?)

Student      1    2    3    4    5    6    7    8
Before       550  520  500  600  640  615  730  590    s1 = 73.04
After        600  570  550  650  690  665  780  640    s2 = 73.04
Improvement  +50  +50  +50  +50  +50  +50  +50  +50    sd = 0

• Is there variance in the “before” scores?
• Is there variance in the “after” scores?
• Is there variance in the improvements?

So instead of $\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$ we use $\frac{s_d}{\sqrt{n}}$
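To see the contrast numerically, here is a short Python sketch that computes both standard errors from the before/after scores in the table. It uses only the standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import stdev

before = [550, 520, 500, 600, 640, 615, 730, 590]
after = [600, 570, 550, 650, 690, 665, 780, 640]

# Improvement for each student: after - before
diffs = [a - b for a, b in zip(after, before)]

s1 = stdev(before)  # sample standard deviation of the "before" scores
s2 = stdev(after)   # sample standard deviation of the "after" scores
sd = stdev(diffs)   # sample standard deviation of the improvements
n = len(diffs)      # number of pairs

se_two_sample = sqrt(s1**2 / n + s2**2 / n)  # what a 2-sample t would use
se_paired = sd / sqrt(n)                     # what matched pairs uses

print(round(s1, 2), round(s2, 2))  # the notes report 73.04 for both
print(round(se_two_sample, 2), se_paired)
```

The 2-sample standard error is large because the students differ a lot from one another, while the paired standard error is 0 here because every student improved by exactly the same amount.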
Last flap of our “means” foldable (outside)
matched pairs!
paired t-interval
and
paired t-test
Update your foldables (inside, top half)
Define μd (“true mean difference…”)
***define which way you are subtracting!!!
Conditions:
• Paired data???
• Random sample (pairs)
• 10% condition
• Nearly Normal Condition
o n > 30 (number of PAIRS!!!)
o boxplots/histogram of DIFFERENCES!!!
***do NOT graph BOTH boxplots!!!
So we may use a t-distribution,
df = n – 1 (“n” is the number of pairs)
Update your foldables (inside, bottom half)
on calculator, just do “t-test” or
“t-interval” with the differences
paired t-interval:
$\bar{x}_d \pm t^*_{df} \left( \frac{s_d}{\sqrt{n_d}} \right)$
paired t-test:
$H_0: \mu_d = 0$
$H_A: \mu_d > 0$, $\mu_d < 0$, or $\mu_d \neq 0$
$t = \frac{\bar{x}_d - 0}{s_d / \sqrt{n_d}}$
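The test-statistic formula above can be sketched as a small Python function. The function name `paired_t` and the five before/after values are hypothetical, made up only to show the arithmetic; the subtraction direction (second minus first) is likewise an assumption of this sketch.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(first, second):
    """Return (mean diff, sd of diffs, SE, t, df) for H0: mu_d = 0."""
    if len(first) != len(second):
        raise ValueError("paired data need the same number of values")
    diffs = [b - a for a, b in zip(first, second)]  # direction: second - first
    n = len(diffs)              # n is the number of PAIRS
    d_bar = mean(diffs)
    s_d = stdev(diffs)
    se = s_d / sqrt(n)
    t = (d_bar - 0) / se        # raises ZeroDivisionError if s_d is 0
    return d_bar, s_d, se, t, n - 1

# Hypothetical data, for illustration only
before = [12, 15, 11, 14, 13]
after = [14, 18, 12, 17, 14]

d_bar, s_d, se, t, df = paired_t(before, after)
print(d_bar, s_d, round(se, 4), round(t, 4), df)
```

Everything is computed from the single list of differences, which is why the calculator's ordinary 1-sample “t-test” on the differences gives the same result.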
SOME TIPS ON HOW TO TELL MATCHED PAIRS…
(WARNING: THIS IS NOT A COMPLETE LIST)
• The two sets of data MUST have the
same number of elements…
• HOWEVER, just because both sets of
data have the same count does NOT
NECESSARILY make it matched pairs (so
be careful!)
• Is each PAIR of numbers linked
somehow? (sometimes this is very difficult to determine)
BIG PICTURE:
BLOCKING/STRATIFYING AND INFERENCE
• Reduces the variability (spread) of our
data (sampling model)
• With LESS variability,
we are MORE likely to reject Ho.
• Makes it easier to detect an “effect”
(a “change” or “difference” or “improvement”, etc.)
2-sample t (no blocking): larger p-value
Matched pairs (blocking): smaller p-value
A whale-watching company noticed that many customers
wanted to know whether it was better to book an
excursion in the morning or the afternoon. To test this
question, the company collected the following data
(number of whales sighted) on 8 randomly selected days
over the past month. (Note: days were not consecutive)
Day        1   2   3   4   5   6   7   8
Morning    9   7   10  10  2   5   7   6
Afternoon  10  9   9   8   4   7   9   6
Since you have two values for
each day, they are dependent
on the day – making this data
matched pairs
You may subtract either
way – just be careful
when writing Ha
Day          1   2   3   4   5   6   7   8
Morning      9   7   10  10  2   5   7   6
Afternoon    10  9   9   8   4   7   9   6
Differences  -1  -2  1   2   -2  -2  -2  0
Conditions:
You need to state assumptions
using the differences!
• The data are paired by day since whale-watching
conditions may change from day to day
• We have a random sample of days for whale-watching
• n = 8 days is certainly less than 10% of all whale-watching days
Day          1   2   3   4   5   6   7   8
Differences  -1  -2  1   2   -2  -2  -2  0
Nearly Normal Condition:
• The box plot of differences is
skewed, but has no outliers, so
normality is plausible (especially with
this small a sample size)
(remember, you can also do a dot plot!!!)
We may use a t-distribution w/ df = 7
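The outlier check behind the box plot can be sketched in Python. This assumes one common textbook convention for quartiles (the median of each half of the sorted data) and the usual 1.5 × IQR fences; other software may use slightly different quartile rules. The function name is illustrative.

```python
from statistics import median

morning = [9, 7, 10, 10, 2, 5, 7, 6]
afternoon = [10, 9, 9, 8, 4, 7, 9, 6]

# Differences, subtracting morning - afternoon as in the notes
diffs = [m - a for m, a in zip(morning, afternoon)]

def five_number_summary(values):
    """Min, Q1, median, Q3, max using the median-of-halves convention."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    # For an odd count, the middle value is left out of both halves
    upper = data[half:] if len(data) % 2 == 0 else data[half + 1:]
    return data[0], median(lower), median(data), median(upper), data[-1]

low, q1, med, q3, high = five_number_summary(diffs)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [d for d in diffs if d < lower_fence or d > upper_fence]

print(diffs)
print(low, q1, med, q3, high)
print(outliers)  # an empty list means no outliers by the 1.5 * IQR rule
```

The median sits much closer to Q1 than to Q3, which is the skew mentioned above, and no difference falls outside the fences.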
At the 5% significance level, is there evidence that
more whales are sighted in the afternoon?
H0: μD = 0
Ha: μD < 0
μD = true mean difference in whale sightings, morning – afternoon
(define which way you are subtracting!!!)
Be careful writing your Ha! Think about how you subtracted: M – A.
If afternoon is more, should the differences be + or -?
(Don’t look at numbers!!!!)
If you subtract afternoon – morning, then Ha: μD > 0
finishing the hypothesis test:
$t = \frac{\bar{x}_d - 0}{s_d / \sqrt{n}} = \frac{-0.75 - 0}{1.5811 / \sqrt{8}} = -1.3416$
p = .1108, df = 7, α = .05
In your calculator, perform a t-test using the differences (L3)
Notice that if you subtracted A-M, then your test statistic would be t = +1.3416, but the p-value would be the same
Since p-value (.1108) > α (.05), we fail to reject H0.
We lack sufficient evidence to suggest that more
whales are sighted in the afternoon than in the
morning.
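For anyone who wants to check the calculator output, here is a Python sketch of the same test using only the standard library. Since the standard library has no t-distribution, the p-value is approximated by integrating the t density with Simpson's rule; the helper names `t_pdf` and `t_cdf` are assumptions of this sketch, not a library API.

```python
import math
from statistics import mean, stdev

morning = [9, 7, 10, 10, 2, 5, 7, 6]
afternoon = [10, 9, 9, 8, 4, 7, 9, 6]
diffs = [m - a for m, a in zip(morning, afternoon)]  # morning - afternoon

n = len(diffs)
df = n - 1
d_bar = mean(diffs)
s_d = stdev(diffs)
t = (d_bar - 0) / (s_d / math.sqrt(n))

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x), via Simpson's rule on [0, |x|] plus symmetry (steps is even)."""
    b = abs(x)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

p_value = t_cdf(t, df)  # Ha: mu_D < 0, so the p-value is the lower tail

print(round(d_bar, 2), round(s_d, 4), round(t, 4), round(p_value, 4))
```

Reversing the subtraction to afternoon – morning flips the sign of t and switches the test to the upper tail, so the p-value is unchanged.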
…and now, a paired t-interval
Develop a 90% confidence interval for the true average
difference in number of whales sighted (morning – afternoon)
statistic ± (critical value)(SE)
$\bar{x}_d \pm t^*_{df} \left( \frac{s_d}{\sqrt{n_d}} \right)$, df = 8 – 1 = 7
$(-0.75) \pm 1.895 \left( \frac{1.5811}{\sqrt{8}} \right)$
(-1.809, 0.3091)
(since this is really a 1-sample interval, get the t* value from the t-table)
We are 90% confident that the true mean difference in whale
sightings is from 1.809 fewer in the morning to 0.3091 more in
the morning.
We can’t really say that it matters when you go whale watching!
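The interval arithmetic can be reproduced with a few lines of Python. This sketch takes the critical value t* = 1.895 straight from the notes (the t-table value for df = 7 at 90% confidence) rather than computing it; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

morning = [9, 7, 10, 10, 2, 5, 7, 6]
afternoon = [10, 9, 9, 8, 4, 7, 9, 6]
diffs = [m - a for m, a in zip(morning, afternoon)]  # morning - afternoon

n = len(diffs)
d_bar = mean(diffs)
se = stdev(diffs) / sqrt(n)

t_star = 1.895           # table value used in the notes (df = 7, 90%)
margin = t_star * se     # (critical value)(SE)

low = d_bar - margin
high = d_bar + margin

print(round(low, 3), round(high, 3))
print(low < 0 < high)    # True when the interval contains 0
```

With the rounded table value, the endpoints can differ from the notes' (-1.809, 0.3091) in the last digit; that is most likely just rounding of t*. Because 0 is inside the interval, this agrees with failing to reject H0 in the test.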
(here is the problem we did in class, but
without the work, if you wish to give it a
shot and check with someone later)
IS CAFFEINE DEPENDENCE REAL?
The table below contains data on the subjects’ scores
on a depression test. Higher scores show more
symptoms of depression.
Subject   1   2   3   4   5   6   7   8   9   10  11
Caffeine  5   5   4   3   8   5   0   0   2   11  1
Placebo   16  23  5   7   14  24  6   3   15  12  0
a) Do the data from this study provide
statistical evidence at the 5% level of
significance that caffeine deprivation
leads to an increase in depression?
b) Use a 90% confidence interval to
estimate the true mean increase in
depression scores that results from
being deprived of caffeine.