Matched Pairs t Procedures

One-sample inference is much less common than comparative inference. However, one common design
for comparing the effects of two treatments on a response makes use of one-sample procedures. In a
matched pairs design, subjects are matched in pairs and each treatment is given to one subject
in each pair. The experimenter can assign the two treatments to the two subjects in a pair by
tossing a coin. A common situation calling for a matched pairs design involves measurements
taken on the same individual before and after some treatment is applied.
To compare the responses to the two treatments in a matched pairs design, apply the one-sample
t procedures to the differences in the responses within the matched pairs.
Or, more concisely,
For paired data, analyze the differences.
Example:
A football coach wants to know whether his punter can kick a football farther if it is filled
with helium instead of air. Because helium is much lighter than air, the coach decides to
set up an experiment to see whether helium-filled balls can be kicked farther.
An experiment like this was conducted at Ohio State University. Two identical
footballs, one air-filled and one helium-filled, were used outdoors on a windless day
at The Ohio State University's athletic complex. The kicker was a novice punter and was not
informed which football contained the helium. Each football was kicked 39 times. The kicker
changed footballs after each kick so that his leg would play no favorites if he tired or improved
with practice.
The results of the experiment (distances recorded in yards):
The Data:
Trial   Air   Helium   Difference (Helium - Air)
    1    25       25      0
    2    23       16     -7
    3    18       25      7
    4    16       14     -2
    5    35       23    -12
    6    15       29     14
    7    26       25     -1
    8    24       26      2
    9    24       22     -2
   10    28       26     -2
   11    25       12    -13
   12    19       28      9
   13    27       28      1
   14    25       31      6
   15    34       22    -12
   16    26       29      3
   17    20       23      3
   18    22       26      4
   19    33       35      2
   20    29       24     -5
   21    31       31      0
   22    27       34      7
   23    22       39     17
   24    29       32      3
   25    28       14    -14
   26    29       28     -1
   27    22       30      8
   28    31       27     -4
   29    25       33      8
   30    20       11     -9
   31    27       26     -1
   32    26       32      6
   33    28       30      2
   34    32       29     -3
   35    28       30      2
   36    25       29      4
   37    31       29     -2
   38    28       30      2
   39    28       26     -2
Descriptive Statistics from Minitab on the Difference Variable:
Variable      N   Mean  Median  TrMean  StDev  SE Mean
Difference   39   0.46    1.00    0.40   6.87     1.10

Variable    Minimum  Maximum     Q1     Q3
Difference   -14.00    17.00  -2.00   4.00
As you can see, the mean of the differences is 0.46. That is, on average the helium-filled ball
was kicked 0.46 yards farther than the air-filled ball.
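The summary statistics above can be reproduced directly from the paired data. The following is a minimal sketch using only the Python standard library; the variable names (air, helium, diffs) are illustrative choices, and the trimmed mean and quartiles are left out because their exact definitions depend on the software used.

```python
import math
import statistics

# Distances in yards for trials 1-39, taken from the data above.
air = [25, 23, 18, 16, 35, 15, 26, 24, 24, 28, 25, 19, 27,
       25, 34, 26, 20, 22, 33, 29, 31, 27, 22, 29, 28, 29,
       22, 31, 25, 20, 27, 26, 28, 32, 28, 25, 31, 28, 28]
helium = [25, 16, 25, 14, 23, 29, 25, 26, 22, 26, 12, 28, 28,
          31, 22, 29, 23, 26, 35, 24, 31, 34, 39, 32, 14, 28,
          30, 27, 33, 11, 26, 32, 30, 29, 30, 29, 29, 30, 26]

# For paired data, analyze the differences (helium minus air).
diffs = [h - a for a, h in zip(air, helium)]

n = len(diffs)
mean_d = statistics.mean(diffs)       # sample mean of the differences
median_d = statistics.median(diffs)   # sample median
sd_d = statistics.stdev(diffs)        # sample standard deviation (n - 1 divisor)
se_d = sd_d / math.sqrt(n)            # standard error of the mean difference

print(f"N = {n}, mean = {mean_d:.2f}, median = {median_d:.2f}")
print(f"StDev = {sd_d:.2f}, SE mean = {se_d:.2f}")
print(f"min = {min(diffs)}, max = {max(diffs)}")
```

Once the differences are formed, the air and helium columns are no longer needed: everything that follows is a one-sample analysis of diffs.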
Solution:
Now we need to ask ourselves: is this difference between kicks convincing evidence that helium-filled
balls will go farther? Let's do a formal hypothesis test on μ, the mean difference in distance
(helium minus air). The null hypothesis assumes there is no difference.
H0: μ = 0
Ha: μ > 0
We cannot simply assume that the differences are normally distributed. Let's graph the variable and
check for normality. The graph below shows the differences to be more or less normally distributed,
so we can proceed with our hypothesis test.
Calculate the test statistic, using the mean and standard error of the differences:
t = (x̄ - 0) / (s / √n) = 0.46 / 1.10 ≈ 0.42
The p-value for the above test statistic with 38 degrees of freedom is 0.339 (obtained from
Minitab).
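As a rough check on the reported p-value, the one-sided tail area can be approximated without statistical software. The sketch below starts from the rounded summary values in the Minitab output (mean 0.46, SE mean 1.10, N = 39) and integrates the t density numerically with Simpson's rule. The helper names t_pdf and t_upper_tail are hypothetical, and this is not a description of how Minitab computes the value.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) using Simpson's rule on [0, |t|] (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3          # area under the density between 0 and |t|
    return 0.5 - area if t >= 0 else 0.5 + area

# Rounded summary statistics from the Minitab output above.
mean_d, se_d, n = 0.46, 1.10, 39

t_stat = (mean_d - 0) / se_d               # test statistic under H0: mu = 0
p_value = t_upper_tail(t_stat, df=n - 1)   # one-sided, because Ha: mu > 0

print(f"t = {t_stat:.2f}, df = {n - 1}, one-sided p-value = {p_value:.3f}")
```

The result should agree with the Minitab figure to about three decimal places; small discrepancies can arise because the inputs here are rounded.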
With a p-value of 0.339, we fail to reject the null hypothesis: the data do not provide convincing
evidence that the helium-filled footballs travel farther than the air-filled footballs.
P-values are more informative than the reject-or-not result of a fixed α level. Beware of placing
too much weight on traditional values of α, such as α = .05.
Robustness of t procedures
The t confidence interval and test are exactly correct when the distribution of the population is
exactly normal. No data are exactly normal. The usefulness of the t procedures therefore
depends on how strongly they are affected by lack of normality.
Robust Procedures
A statistical inference procedure is called robust if the probability calculations required are
insensitive to violations of the assumptions made.
The assumption (behind t-based confidence intervals and t-tests) that the population is normal
rules out outliers, so the presence of outliers shows that this assumption is not valid. The t
procedures are not robust against outliers, because the sample mean and standard deviation are
not resistant to outliers. On the other hand, t procedures are quite robust against nonnormality of
the population when no outliers are present and the distribution is roughly symmetric. As we
have seen throughout the course, large samples also improve the accuracy of p-values when the
population is not normal. The main reason is the Central Limit Theorem. Here is a
checklist for inference on a single mean:
Rules for using the t-test:
- Ideally, the sample comes from a normal distribution.
- The assumption of an SRS is important.
- Make a plot and check for skewness and outliers.
- For sample sizes less than 15, the t-procedures can be used if the data are close to normal. Do
not use t-procedures if the data are clearly nonnormal or if outliers are present.
- For sample sizes 15 or greater, t-procedures can be safely used except in the presence of
outliers or strong skewness.
- For sample sizes 40 or greater, t-procedures can be used even if the data are heavily skewed.
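The sample-size rules above can be written as a small decision helper. This is only an illustrative sketch: the function name and its boolean flags are hypothetical, the judgments behind the flags (outliers, strong skewness, clear nonnormality) still have to come from a plot of the data, and the SRS assumption is not checked.

```python
def t_procedures_reasonable(n, outliers, strongly_skewed, clearly_nonnormal):
    """Apply the rule-of-thumb checklist for using t procedures.

    n                 -- sample size
    outliers          -- True if a plot of the data shows outliers
    strongly_skewed   -- True if the plot shows strong skewness
    clearly_nonnormal -- True if the data are clearly not close to normal
    """
    if n < 15:
        # Small samples: the data must look close to normal, with no outliers.
        return not (clearly_nonnormal or outliers)
    if n < 40:
        # Moderate samples: safe except with outliers or strong skewness.
        return not (outliers or strongly_skewed)
    # Large samples: the checklist allows even heavily skewed data.
    # It does not mention outliers at this size, so this sketch does not
    # test for them; a plot is still worth inspecting.
    return True

# The football differences: n = 39, roughly normal, judged free of outliers.
print(t_procedures_reasonable(39, outliers=False,
                              strongly_skewed=False, clearly_nonnormal=False))
```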
The power of the t test
The power of a hypothesis test against a specific alternative hypothesis is the chance that the
test correctly rejects the null hypothesis when that alternative hypothesis is true; that is, the
power is 100% minus the chance of a Type II error when that alternative hypothesis is true. The
chance of a Type II error is often denoted by the lowercase Greek letter beta (β), so the power is
(100% - β).
Calculating the power of the t-test must take into account the approximation of σ by s and is a bit
complex. But an approximation that acts as if σ were known is adequate for planning a study.
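A minimal sketch of that planning approximation follows, treating σ as known and using the normal distribution for a one-sided test of H0: μ = 0 against Ha: μ > 0. The function name is hypothetical, and the planning values in the usage lines (an alternative of μ = 2 yards, σ = 6.87 borrowed from the sample standard deviation, α = 0.05) are assumptions chosen for illustration, not values from the study.

```python
import math
from statistics import NormalDist

def approx_power_one_sided(mu_alt, sigma, n, alpha=0.05):
    """Approximate power of a one-sided test of H0: mu = 0 vs Ha: mu > 0,
    acting as if sigma were known (normal approximation)."""
    z = NormalDist()                         # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha)            # reject H0 when z exceeds this
    shift = mu_alt * math.sqrt(n) / sigma    # mean of the z statistic under Ha
    return 1 - z.cdf(z_crit - shift)         # P(reject H0 | mu = mu_alt)

# Assumed planning values: detect a true mean difference of 2 yards.
power = approx_power_one_sided(mu_alt=2.0, sigma=6.87, n=39)
print(f"approximate power with n = 39: {power:.2f}")

# Power grows with the sample size, which is what makes this useful for planning.
for n in (39, 80, 160):
    print(n, round(approx_power_one_sided(2.0, 6.87, n), 2))
```

When the alternative equals the null value (mu_alt = 0), the function returns α, as it should: the chance of rejecting a true null hypothesis is the significance level.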