biostat_lecture3_b

BIOSTATISTICS
Statistical tests part IV: nonparametric tests
INTRODUCTION
1. Mann-Whitney test
   • Applicability
2. Wilcoxon test
   • Definition
3. Kruskal-Wallis test
   • Example
Copyright ©2012, Joanna Szyda
INTRODUCTION
[Overview table; column headings: TEST, HYPOTHESES, SAMPLE STRUCTURE]
MANN-WHITNEY TEST
1. Comparison of means
2. Quantitative or ordered data (ranks)
3. No normal distribution required
4. Two independent samples
DATA SET
MEDIUM   HIGH
5.5      6.0
6.0      7.0
5.0      7.5
7.0      6.0
5.5      7.5
6.0      8.0
7.0      11.0
6.0      9.0
8.0      8.0
7.0      11.0
6.0      8.0
7.0      8.0
6.0      7.0
8.0      7.0
6.0      7.0
7.0      9.0
1. Shrimp length in different water salinities
3. Shrimp length [mm] at 4 weeks of age
[Figure: two histograms of shrimp length, one per salinity group (x-axis: LENGTH, y-axis: N)]
1. Formulate hypotheses H0 and H1
H0: shrimp length does not depend on water salinity
H1: shrimp length depends on water salinity
H0: $\mu_H = \mu_M$
H1: $\mu_H \neq \mu_M$
2. Set the significance level
$\alpha_{MAX} = 0.05$
3. Choose the statistical test and calculate test value
$$U = \min\left( n_1 n_2 + \frac{n_2(n_2+1)}{2} - \sum_{i=1}^{n_2} r_{2i},\;\; n_1 n_2 + \frac{n_1(n_1+1)}{2} - \sum_{i=1}^{n_1} r_{1i} \right)$$
Excel: example
$$U = \min\left( 16 \cdot 16 + \frac{16 \cdot 17}{2} - 182,\;\; 16 \cdot 16 + \frac{16 \cdot 17}{2} - 346 \right) = \min(210, 46) = 46$$
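The rank sums 182 and 346 and the resulting U can be reproduced with a short script. This is a minimal sketch in Python, not the lecture's Excel example: the function names `average_ranks` and `mann_whitney_u` are illustrative, tied lengths are assumed to share the mean of their ranks, and the MEDIUM and HIGH columns are assumed to hold 16 values each as read from the data set, with HIGH taken as sample 1.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # sorted positions are 0-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def mann_whitney_u(sample1, sample2):
    """Return (U, rank sum of sample 1, rank sum of sample 2)."""
    n1, n2 = len(sample1), len(sample2)
    ranks = average_ranks(list(sample1) + list(sample2))
    r1 = sum(ranks[:n1])
    r2 = sum(ranks[n1:])
    u_from_r2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    u_from_r1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    return min(u_from_r2, u_from_r1), r1, r2


# Assumed column assignment of the data set (16 values per group).
high = [6.0, 7.0, 7.5, 6.0, 7.5, 8.0, 11.0, 9.0,
        8.0, 11.0, 8.0, 8.0, 7.0, 7.0, 7.0, 9.0]
medium = [5.5, 6.0, 5.0, 7.0, 5.5, 6.0, 7.0, 6.0,
          8.0, 7.0, 6.0, 7.0, 6.0, 8.0, 6.0, 7.0]

u, r_high, r_medium = mann_whitney_u(high, medium)
print(u, r_high, r_medium)  # 46.0 346.0 182.0
```

The smaller rank sum (MEDIUM, 182) yields the larger candidate value 210, so the minimum 46 comes from the HIGH sample.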
4. Determine distribution of the test
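• Nonparametric test – no known distribution
• For $n_1 n_2 > 20$ – approximated by a normal distribution:
$$U \sim N(\mu_U, \sigma_U^2), \qquad \mu_U = \frac{n_1 n_2}{2}, \qquad \sigma_U^2 = \frac{n_1 n_2 (n_1 + n_2 + 1)}{12} \qquad \rightarrow \text{no tables}$$
$$z = \frac{U - \mu_U}{\sqrt{\sigma_U^2}} \sim N(0, 1) \qquad \rightarrow \text{tables}$$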
$$z = \frac{U - \frac{n_1 n_2}{2}}{\sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}} = \frac{46 - \frac{16 \cdot 16}{2}}{\sqrt{\frac{16 \cdot 16 \cdot 33}{12}}} = -3.09 \sim N(0, 1)$$
5. Determine $\alpha_t$: $\alpha_t = 0.002$
Excel: example
or compare with a critical value:
$U_{\alpha = 0.05,\, n_1 = 16,\, n_2 = 16} = 181$
$U_t = 46$
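The standardised value and the level $\alpha_t$ can be checked numerically. A minimal sketch, assuming a two-sided test and no continuity or tie correction (neither is mentioned in the text); the helper name `normal_approx_u` is illustrative.

```python
import math


def normal_approx_u(u, n1, n2):
    """Return (z, two-sided p) for U under the normal approximation."""
    mu_u = n1 * n2 / 2
    var_u = n1 * n2 * (n1 + n2 + 1) / 12
    z = (u - mu_u) / math.sqrt(var_u)
    # Two-sided tail probability of N(0, 1): P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


z_u, alpha_t_u = normal_approx_u(46, 16, 16)
print(round(z_u, 2), round(alpha_t_u, 3))  # -3.09 0.002
```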
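6. Decision
$\alpha_t < \alpha_{MAX}$ → H1
$U_t < U_{\alpha}$ → H0
ATTENTION !!! The tabulated value 181 appears to be an upper critical value, to be compared with the larger of the two candidate U values (210 > 181, H1) rather than with $U_t = 46$, so both criteria lead to the same decision.
shrimp length depends on water salinity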
WILCOXON TEST
1. Nonparametric test
2. Quantitative or ordered data (ranks)
3. No normal distribution required
4. Comparison of two paired samples
DATA SET
IID   NO LAMB   LAMB
1     72.00     55.50
2     62.35     43.80
3     55.77     66.80
4     59.98     68.00
5     51.60     57.88
6     61.48     61.90
7     52.57     45.40
8     52.50     56.67
9     56.43     73.30
10    60.13     77.50
11    48.60     63.53
12    42.90     54.50
13    53.50     55.58
14    70.43     91.10
15    47.10     64.05
16    50.08     71.40
1. Feeding behaviour of sheep
2. Data collected in 1994-1996 in Canada, Rocky Mountains region
3. Differences in time of feeding with / without lamb
4. % time spent on feeding
[Figure: histogram of the paired differences (x-axis: DIFFERENCE IN TIME, y-axis: N)]
1. Formulate hypotheses H0 and H1
H0: feeding time does not depend on the presence of a lamb
H1: feeding time depends on the presence of a lamb
H0: $\mu_{LAMB} = \mu_{NO\,LAMB}$
H1: $\mu_{LAMB} \neq \mu_{NO\,LAMB}$
2. Set the significance level
$\alpha_{MAX} = 0.05$
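3. Choose the statistical test and calculate test value
$$W = \min\left( \sum_{i=1}^{n_1} r_i,\;\; \sum_{i=n_1+1}^{N} r_i \right)$$
where $r_i$ are the ranks of the absolute paired differences, ordered so that the first $n_1$ belong to differences of one sign and the remaining $N - n_1$ to differences of the opposite sign.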
Excel: example
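The statistic can be sketched in Python as follows. This is an illustrative sketch, not the lecture's Excel example: the function names are hypothetical, zero differences are assumed to be dropped, tied absolute differences are assumed to share the mean rank, and the first value of each pair is assumed to be the NO LAMB measurement and the second the LAMB measurement.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # sorted positions are 0-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def wilcoxon_w(first, second):
    """Return (W, rank sum of positive differences, rank sum of negative ones)."""
    # Differences are second - first; pairs with no difference are dropped.
    diffs = [b - a for a, b in zip(first, second) if a != b]
    ranks = average_ranks([abs(d) for d in diffs])
    w_pos = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_neg = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return min(w_pos, w_neg), w_pos, w_neg


# Assumed column assignment of the data set (% time spent on feeding).
no_lamb = [72.00, 62.35, 55.77, 59.98, 51.60, 61.48, 52.57, 52.50,
           56.43, 60.13, 48.60, 42.90, 53.50, 70.43, 47.10, 50.08]
lamb = [55.50, 43.80, 66.80, 68.00, 57.88, 61.90, 45.40, 56.67,
        73.30, 77.50, 63.53, 54.50, 55.58, 91.10, 64.05, 71.40]

w, w_pos, w_neg = wilcoxon_w(no_lamb, lamb)
print(w, w_pos, w_neg)  # 29.0 107.0 29.0
```

Swapping the two columns only exchanges the two rank sums, so W itself does not depend on the assumed column order.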
$$W = \min(107, 29) = 29$$
4. Determine distribution of the test
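• Nonparametric test – no known distribution
• For N > 15 – approximated by a normal distribution:
$$W \sim N(\mu_W, \sigma_W^2), \qquad z = \frac{W - \mu_W}{\sqrt{\sigma_W^2}} \sim N(0, 1)$$
$$z = \frac{W - \frac{N(N+1)}{4}}{\sqrt{\frac{N(N+1)(2N+1)}{24}}} \sim N(0, 1)$$
$$z = \frac{29 - \frac{16 \cdot 17}{4}}{\sqrt{\frac{16 \cdot 17 \cdot 33}{24}}} = -2.02 \sim N(0, 1)$$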
5. Determine $\alpha_t$: $\alpha_t = 0.0437$
Excel: example
or compare with a critical value:
$W_{\alpha = 0.05,\, N = 16} = 29$
$W_t = 29$
6. Decision
$\alpha_t < \alpha_{MAX}$
$W_t = W_{\alpha}$
H0 ? H1 ?
feeding time depends on a lamb
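The standardised value and $\alpha_t$ follow directly from W and N. A minimal sketch, assuming a two-sided test without continuity or tie correction (neither is mentioned in the text); `normal_approx_w` is an illustrative name.

```python
import math


def normal_approx_w(w, n):
    """Return (z, two-sided p) for the Wilcoxon W under the normal approximation."""
    mu_w = n * (n + 1) / 4
    var_w = n * (n + 1) * (2 * n + 1) / 24
    z = (w - mu_w) / math.sqrt(var_w)
    # Two-sided tail probability of N(0, 1).
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


z_w, alpha_t_w = normal_approx_w(29, 16)
print(round(z_w, 2), round(alpha_t_w, 4))  # -2.02 0.0437
```

With $\alpha_t$ this close to 0.05 and $W_t$ equal to the tabulated value, the decision is a borderline case.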
KRUSKAL-WALLIS TEST
1. Comparing variability
2. Quantitative or ordered data (ranks)
3. No normal distribution required
4. Nonparametric counterpart of one-way analysis of variance
DATA SET
1. Height of adult women in the USA
2. Three age groups
20-29     30-39     40-49
161.925   164.465   173.990
173.355   171.450   175.260
158.115   173.355   167.640
170.815   175.260   166.370
179.705   164.465   168.910
1. Formulate hypotheses H0 and H1
H0: women's height is the same in each age interval
H1: women's height differs across age intervals
H0: $\mu_{20-29} = \mu_{30-39} = \mu_{40-49}$
H1: not all $\mu$ are equal
2. Set the significance level
$\alpha_{MAX} = 0.05$
3. Choose the statistical test and calculate test value
$$H = \frac{12}{N(N+1)} \sum_{i=1}^{N_A} n_i \left( \bar{R}_i - \bar{R} \right)^2 \sim \chi^2_{N_A - 1}$$
$N$: total number of observations
$N_A$: number of groups
$n_i$: number of observations in group i
$\bar{R}_i$: mean rank within group i
$\bar{R}$: mean overall rank
$$H = \frac{12}{N(N+1)} \sum_{i=1}^{N_A} n_i \left( \bar{R}_i - \bar{R} \right)^2 = \frac{12}{15(15+1)} \left[ 5(7.2 - 8)^2 + 5(8 - 8)^2 + 5(8 - 8)^2 \right] = 6.45$$
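The formula for H can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the function names are hypothetical, tied heights share the mean of their ranks, no tie correction is applied, and the three columns of the data set are taken as the three age groups. On the 15 heights as tabulated, the sketch yields mean ranks 7.3, 8.0 and 8.7 and H ≈ 0.245, which does not reproduce the 6.45 reported in the text; the reported value may rest on a different data set or calculation, so the comparison should be treated with caution.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # sorted positions are 0-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def kruskal_wallis_h(groups):
    """Return (H, list of mean ranks per group), without tie correction."""
    pooled = [x for group in groups for x in group]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    overall_mean_rank = (n_total + 1) / 2
    weighted_sq_dev = 0.0
    mean_ranks = []
    position = 0
    for group in groups:
        group_ranks = ranks[position:position + len(group)]
        position += len(group)
        mean_rank = sum(group_ranks) / len(group)
        mean_ranks.append(mean_rank)
        weighted_sq_dev += len(group) * (mean_rank - overall_mean_rank) ** 2
    return 12 / (n_total * (n_total + 1)) * weighted_sq_dev, mean_ranks


# Assumed column assignment of the data set (age groups 20-29, 30-39, 40-49).
age_20_29 = [161.925, 173.355, 158.115, 170.815, 179.705]
age_30_39 = [164.465, 171.450, 173.355, 175.260, 164.465]
age_40_49 = [173.990, 175.260, 167.640, 166.370, 168.910]

h, mean_ranks = kruskal_wallis_h([age_20_29, age_30_39, age_40_49])
print(round(h, 3), [round(m, 1) for m in mean_ranks])  # 0.245 [7.3, 8.0, 8.7]
```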
4. Determine distribution of the test: $H \sim \chi^2_{3-1}$
5. Determine $\alpha_t$: $\alpha_t = 0.0398$
Excel: example
6. Decision: $\alpha_t < \alpha_{MAX}$, so H0 is rejected in favour of H1
women's height differs across age intervals
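For two degrees of freedom the chi-square tail probability has a closed form, which gives a quick check of the level quoted in step 5. The worked line below takes the reported H = 6.45 at face value:

$$\alpha_t = P\left( \chi^2_2 \ge H \right) = e^{-H/2} = e^{-6.45/2} = e^{-3.225} \approx 0.0398$$

This shortcut holds only for two degrees of freedom, that is, for three groups; other numbers of groups need chi-square tables or software.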
NONPARAMETRIC
TESTS