
Chapter 4
Alternatives to the t-Tools
STAT 3022
School of Statistics, University of Minnesota
Spring 2014
Space Shuttle O-Ring Failures
On Jan 27, 1986, the night before the space shuttle Challenger
exploded, engineers at the company that built the shuttle warned
NASA scientists that the shuttle should not be launched because
of predicted cold weather.
Data: Numbers of O-ring incidents on previous shuttle flights.
> case0401$Incidents[case0401$Launch=='Cool']
[1] 1 1 1 3
> case0401$Incidents[case0401$Launch=='Warm']
[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 2
Question: Is there a higher risk of O-ring incidents at lower launch
temperature?
t-Tools Not Valid
Normality? No: the distributions are skewed.
Similar shape?
n1 ≈ n2? (Here n1 = 4 and n2 = 20.)
σ1 ≈ σ2?
The t-tools are not valid here, and no transformation helps.
One-sided p-value = 0.0099 from a Permutation test on the
t-statistic.
Parametric vs. Nonparametric
Parametric methods assume that the samples come from a particular family of distributions, such as the normal, so there are parameters to estimate (such as µ and σ for a normal distribution).
Nonparametric methods do not assume that the samples come from any particular distribution family.
Examples of nonparametric alternatives:
For the two-sample t-test: rank-sum test (optional); permutation test
For the paired t-test: sign test; Wilcoxon signed-rank test (optional)
General Idea of Permutation Test
Under the null hypothesis, the two population distributions are the same, so the group labels carry no information.
The test statistic can be calculated for each of the C_{n1+n2, n1} regroupings (C_{n1+n2, n1} will be defined later), where C_{n1+n2, n1} is the number of ways of regrouping the n1 + n2 numbers into two subgroups of sizes n1 and n2.
The position of the observed test statistic among all these possible test statistics shows how extreme the observation is.
Example of Regroupings
Suppose we have two groups of data: Group 1: 1.0 and 1.3; Group
2: 2.9 and 2.0.
Below is a list of all the possible regroupings:
Regrouping 1: Group 1: 1.0, 2.9; Group 2: 1.3, 2.0
Regrouping 2: Group 1: 1.0, 2.0; Group 2: 1.3, 2.9
Regrouping 3: Group 1: 1.3, 2.9; Group 2: 1.0, 2.0
Regrouping 4: Group 1: 1.3, 2.0; Group 2: 1.0, 2.9
Regrouping 5: Group 1: 2.9, 2.0; Group 2: 1.0, 1.3
Including the original grouping, there are 6 possible outcomes.
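As a check, here is a small R sketch (not from the original slides; the object names vals and idx are illustrative) that lists all choose(4, 2) = 6 groupings, including the original one:
vals <- c(1.0, 1.3, 2.9, 2.0)        # the four observations pooled together
idx <- combn(4, 2)                   # each column = indices assigned to Group 1
for (j in 1:ncol(idx)) {
  cat("Group 1:", vals[idx[, j]], " Group 2:", vals[-idx[, j]], "\n")
}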
Number of Regroupings
C_{n,k} = [n(n − 1) · · · (n − k + 1)] / [k(k − 1) · · · 1],
read "n choose k", is the number of different ways to choose k items from a list of n items.
The total number of regroupings for a two-sample problem is C_{n1+n2, n1}, which is equal to C_{n1+n2, n2}.
> choose(4, 2)
[1] 6
> choose(10, 5)
[1] 252
> choose(20, 10)
[1] 184756
> choose(30, 15)
[1] 155117520
>
> choose(9, 3)
[1] 84
> choose(9, 6)
[1] 84
Exact Calculation
Steps:
1. Decide on a test statistic, such as the two-sample t-statistic, and compute its value from the original two samples.
2. List all regroupings of the n1 + n2 numbers into groups of sizes n1 and n2, and recompute the test statistic for each.
3. Count the number of regroupings that produce test statistics at least as extreme as the observed statistic from step 1 (one-sided or two-sided, as appropriate).
4. The p-value is the count from step 3 divided by the total number of regroupings.
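A minimal R sketch of these four steps for the O-ring data (assuming, as on the later slide, that the first four entries of case0401$Incidents are the Cool launches; the object names are illustrative):
data <- case0401$Incidents                    # 4 Cool launches followed by 20 Warm launches
n1 <- 4; n2 <- 20; n <- n1 + n2

# Step 1: observed two-sample t-statistic (equal-variance version)
t_obs <- t.test(data[1:n1], data[(n1+1):n], var.equal=TRUE)$statistic

# Step 2: recompute the t-statistic for every regrouping of size n1
groupings <- combn(n, n1)                     # choose(24, 4) = 10626 columns
t_all <- apply(groupings, 2, function(idx)
  t.test(data[idx], data[-idx], var.equal=TRUE)$statistic)

# Steps 3 and 4: proportion of regroupings at least as extreme (one-sided)
mean(t_all >= t_obs)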
Approximate Calculation
For moderate sample sizes n1 and n2, it is extremely time-consuming to calculate the test statistic for ALL regroupings.
> choose(50, 25)
[1] 1.264106e+14
So instead of listing all regroupings of the n1 + n2 numbers, we randomly select a sufficiently large subset of regroupings and compute the test statistic only for those.
Permutation Test: Space Shuttle O-Ring Failures
Exact test gives p-value: 0.00988. (Section 4.3.1)
Approximate test:
> data <- case0401$Incidents
> n1 <- 4; n2 <- 20; n <- n1 + n2
> tt <- t.test(data[1:n1], data[(n1+1):n], var.equal=T)
> t_stat <- tt$statistic
>
> B <- 100000; statistics <- numeric(B)
> for (i in 1:B) {
+   idx <- sample(n, n1)
+   d1 <- data[idx]; d2 <- data[-idx]
+   mean1 <- mean(d1); mean2 <- mean(d2)
+   var1 <- var(d1); var2 <- var(d2)
+   sp <- sqrt(((n1-1)*var1+(n2-1)*var2)/(n-2))
+   se <- sp*sqrt(1/n1+1/n2)
+   statistics[i] <- (mean1-mean2)/se
+ }
> sum(t_stat <= statistics) / length(statistics)
[1] 0.01063
Summary
The permutation test is an alternative to the two-sample t-test.
When the two samples are highly skewed or contain potential outliers, nonparametric alternatives perform better than the two-sample t-test.
For the permutation test, the exact computation can be very costly, which is why the approximate (random-sampling) version is used in practice.
What Is the Sign Test
The sign test is a quick test of the hypothesis that the mean
difference of a population of pairs is zero.
It is a resistant and distribution-free alternative to the paired t-test.
Side Notes - Bernoulli Distribution
Bernoulli distribution: in a single-trial experiment, a binary outcome with success probability p follows a Bernoulli distribution.
A classical example of a Bernoulli experiment is a single toss of a
coin. The coin might come up heads with probability p and tails
with probability 1 − p. The experiment is called fair if p = 0.5.
Side Notes - Binomial Distribution
Binomial distribution: the discrete probability distribution of the number of successes in a sequence of n independent yes/no (1 for yes, 0 for no) experiments, each of which yields success with probability p.
A binomial random variable is the sum of n independent Bernoulli trials with the same success probability.
Suppose X follows a binomial distribution with size n and success probability p; then
Probability(X = k) = C_{n,k} p^k (1 − p)^{n−k}
where the possible values for k are 0, 1, 2, . . . , n.
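As a quick check in R (the values of n, p, and k below are arbitrary), the formula agrees with dbinom():
n <- 5; p <- 0.3; k <- 2
choose(n, k) * p^k * (1 - p)^(n - k)   # 0.3087
dbinom(k, size=n, prob=p)              # 0.3087, the same value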
Side Notes - Binomial/Bernoulli in R
# Bernoulli
> rbinom(10, size=1, prob=0.3)
[1] 0 0 0 1 1 0 0 0 0 0
> dbinom(0, size=1, prob=0.3)
[1] 0.7
> dbinom(0.5, size=1, prob=0.3)
[1] 0
Warning message:
In dbinom(0.5, size = 1, prob = 0.3) : non-integer x = 0.500000
>
> # binomial #
> rbinom(10, size=3, prob=0.3)
[1] 0 1 2 2 1 1 3 1 1 2
> dbinom(0, size=3, prob=0.3)
[1] 0.343
> dbinom(0.5, size=3, prob=0.3)
[1] 0
Warning message:
In dbinom(0.5, size = 3, prob = 0.3) : non-integer x = 0.500000
Sign Test - Basic Idea
It counts the number K of pairs, out of all n pairs, in which the first group's measurement exceeds the second's.
If the null hypothesis is true, the distribution of K is binomial with n trials and p = 1/2.
When n is large, the sample proportion K/n approximately follows a normal distribution N(1/2, var = 1/(4n)); then
(K/n − 1/2) / √(1/(4n)) = (K − n/2) / √(n/4) ∼ N(0, 1)
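When n is small, the normal approximation can be avoided by using the binomial distribution of K directly. A minimal sketch (the values of n and K are illustrative):
n <- 10; K <- 7                        # n pairs, K positive differences
1 - pbinom(K - 1, size=n, prob=0.5)    # exact one-sided p-value: P(Binomial(n, 1/2) >= K)
binom.test(K, n, p=0.5, alternative="greater")$p.value   # same value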
Hypothesis Tests
If the null hypothesis is true, then K should be close to n/2.
To standardize the test procedure, we define the difference for each pair as sample 1 minus sample 2.
K is the number of positive differences.
If K is far away from n/2, that is evidence that the null hypothesis is not true.
More specifically, if K − n/2 is larger than some cutoff, we conclude that population 1 has a larger mean than population 2; if K − n/2 is smaller than some (negative) cutoff, we conclude that population 1 has a smaller mean than population 2.
Hypothesis Tests (II)
Define z = (K − n/2)/√(n/4). Calculate the p-value:
Alternative µ1 > µ2: Probability(Z > z)
Alternative µ1 < µ2: Probability(Z < z)
Alternative µ1 ≠ µ2: Probability(Z > |z|) × 2
where Z follows the standard normal distribution.
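A minimal R sketch of this procedure (sign_test is an illustrative function name, not a built-in; it assumes no ties within pairs):
sign_test <- function(x, y) {
  K <- sum(x > y)                      # number of positive differences (sample 1 minus sample 2)
  n <- length(x)                       # number of pairs
  z <- (K - n/2) / sqrt(n/4)           # standardized statistic
  c(z = z,
    p_greater = 1 - pnorm(z),          # alternative mu1 > mu2
    p_less    = pnorm(z),              # alternative mu1 < mu2
    p_two     = 2 * min(pnorm(z), 1 - pnorm(z)))   # two-sided alternative
}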
An Example
> ## generate 10 pairs of data
> data <- round(matrix(rnorm(20), nrow=10, ncol=2), 2)
> print(data <- cbind(data, (data[,1] > data[,2])))
       [,1]  [,2] [,3]
 [1,] -0.09  0.13    0    ## if group 1 < group 2, then 0
 [2,]  0.79 -1.44    1    ## if group 1 > group 2, then 1
 [3,] -0.67 -0.61    0
 [4,]  1.36 -1.01    1
 [5,]  1.51  0.88    1
 [6,] -2.61 -0.13    0
 [7,]  0.69 -2.51    1
 [8,]  0.32 -1.41    1
 [9,]  2.02 -1.95    1
[10,] -1.42 -1.26    0
> print(z <- (sum(data[,3]) - 5) / sqrt(10/4))
[1] 0.6324555
> round(cbind(pnorm(z), 1-pnorm(z), 2*min(pnorm(z), 1-pnorm(z))), 2)
     [,1] [,2] [,3]
[1,] 0.74 0.26 0.53    ## p-values for the (<), (>), and two-sided alternatives