2.3 Nonparametric methods

1.
Nonparametric Methods:
Both the variance test and t-test used in the previous section depend heavily on the
common normal assumption of the observations. In addition, these tests are very
sensitive to some unusual observations, such as outliers. However, the conclusions
drawn from the nonparametric methods are not much affected if the normal
assumption holds or some outliers exist.
(i) One sample :problem:
X 1 , X 2 , …, X n are random variables with common distribution (not necessarily
normal distribution). Suppose M is the median of the distribution. We want to test if
H 0 : M  M 0 , where M 0 is some number. We now introduce the Wilcoxon sign
rank test.
Wilcoxon sign rank test:
Let Z i  X i  M 0 and Ri is the rank of Z i . Let T  
the ranks corresponding to X i  M 0 >0 while T  
R
i
( X i  M 0 ) 0
R
i
( X i  M 0 )0
be the sum of all
is the sum of all the
ranks corresponding to X i  M 0 <0.
Example:
H0 : M  M0  8
Xi
7.1
13.5
5.2
4.2
6.4
10.4
5.7
5.4
6.3
7.5
Zi
0.9
5.5
2.8
3.8
1.6
2.4
2.3
2.6
1.7
0.5
1
Ri
2
10
8
9
3
6
5
7
4
1
Sign of
_
+
_
_
_
+
_
_
_
_
Xi 8
Then, T   10  6  16, T   2  8  9  3  5  7  4  1  39 .
The following are the Wilcoxon sign rank tests under three alternatives:
(i) H 0 : M  M 0 vs H 1 : M  M 0
 T   Wn, , then reject H 0 , where W n , can be found by table.
(ii) H 0 : M  M 0 vs H 1 : M  M 0
 T   Wn, , then reject H 0 , where W n , can be found by table.
(iii) H 0 : M  M 0 vs H 1 : M  M 0
 T  min( T  , T  )  Wn, , then reject H 0 , where W n , can be found by table.
Note: T   T  
n(n  1)
.
2
Example (continue):
Test H 0 : M  M 0  8 vs H 1 : M  8 ,   0.05.
[sol]:
T   16,W10,0.05  11  T   16  11  not reject H 0 .
Example (Splus):
>s<-c(7.1,13.5,5.2,4.2,6.4,10.4,5.7,5.4,6.3,7.5)
>wilcox.test(s,mu=8,alt=”less”)
>help(wilcox.test)
As the sample size is large, p-value can be obtained vis a normal approximation. As n
2
is large, T   N (
n(n  1) n(n  1)( 2n  1)
,
)
4
24
n(n  1)
4
 N (0,1). Thus, as
n(n  1)( 2n  1)
24
T 
H 0 : M  M 0 vs H 1 : M  M 0 and T   c, then
p-value= P( Z 
n(n  1)
n(n  1)
c
4
4
)  (
) . The p-values under the
n(n  1)( 2n  1)
n(n  1)( 2n  1)
24
24
c
other alternatives can be obtained similarly.
(ii) Two sample problem:
X 1 , X 2 , …, X n ~ F1 ( x) and Y 1 , Y 2 ,…, Ym ~ F2 ( x) , where F1 ( x) and F2 ( x)
are some distributions. Let M 1 and M 2 be the medians of F1 ( x) and F2 ( x) ,
respectively. We want to test H 0 : M 1 = M 2 ( F1 ( x) = F2 ( x) ).
Wilcoxon rank sum test:
I.
Combine X 1 , X 2 , …, X n , Y 1 , Y 2 ,…, Ym and find all the ranks of these
data.
II.
Find the sum of the ranks W corresponding to X 1 , X 2 , …, X n .
III.
The following are the Wilcoxon rank sum tests under three alternatives:
i.
H 0 : M 1  M 2 vs H 1 : M 1  M 2
 reject H 0 as W  Wu , where Wu can be found by table.
ii.
H 0 : M 1  M 2 vs H 1 : M 1  M 2
 reject H 0 as W  WL , where WL can be found by table.
iii.
H 0 : M 1  M 2 vs H 1 : M 1  M 2
 reject H 0 as W  Wu or W  WL .
3
Example:
South
60
100
150
290
North
110
130
140
170
200
The above data are the incomes in both North and South areas. Test
310
H 0 : M1  M 2
vs H 1 : M 1  M 2 with   0.05.
[sol]:
I.
Income
60
100
150
290
110
130
140
170
200
310
Rank
1
2
6
9
3
4
5
7
8
10
II.
n1  4, n2  6,W  1  2  6  9  18,WL  12,Wu  32.
III.
Since 12  W  18  32  not reject H 0 .
Conclusion: there is no difference in income between the two areas.
Example (Splus):
>in1<-c(60,100,150,290)
>in2<-c(110,130,140,170,200,310)
>wilcox.test(in1,in2)
4