Non-parametric methods
The t-test (and similar tests) tests hypotheses about
parameters of a distribution (in the t-test, about
μ as a parameter of the normal distribution);
there are other approaches too.
What to do if the data are not normally
distributed?
(and the departure from normality is so large that I cannot rely on the
robustness of the test)
• There are transformations that improve
normality and homoscedasticity [we will go
through them later]
• If the data follow a distribution that can be
approximated by one of a selected set of
distribution types, then special methods
developed for them can be used (generalized linear models)
• We use non-parametric tests
Non-parametric methods
• Most often used:
• Permutation [commonly called randomization] tests
• Rank-based tests
Permutation tests
• Basic idea (for the t-test):
• The achieved significance level is the probability
of getting samples this different just by
chance if they come from one population. So I can try
it: I pool all the observations from both
groups and then randomly reassign
their group membership (e.g. by drawing
from a hat or with a computer random number
generator):
Plant height  OrigGroup  Permut. 1  Permut. 2  Permut. 3  Permut. 4  Permut. 5
12            1          1          2          2          1          2
15            1          2          1          2          2          2
13            1          1          1          2          2          2
14            1          1          2          2          2          2
10            1          2          2          1          1          1
17            2          2          1          1          2          1
19            2          2          2          1          2          2
16            2          1          2          2          1          2
19            2          2          2          2          1          2
22            2          2          2          1          1          1
20            2          2          1          1          2          1
19            2          1          1          2          2          1
t             -5.331     -1.272     0.362      1.025      -0.414     1.02493
P             0.000333

OrigGroup: I don't believe this P, as I don't know whether the assumptions are fulfilled.
Permut. 1 to 5: so I try to simulate it here, and so on, at least a thousand times.
I look at how often |t| from the randomly generated groups is bigger than |t| from the data.
The achieved significance level (P) is then computed as
$$P = \frac{x + 1}{n + 1}$$
where x is the number of random permutations where "it was better" than in the data (that is, where |t_permut| > |t_data|) and n is the total number of random permutations.
Attention
• I test the hypothesis that both samples come from
one and the same population. If I want to
interpret the test as a location test, I have
to add the assumption that both populations
have the same distribution shape. If they
then differ, they can differ only in the
location parameter.
Rank-based tests
• Basic idea: we don't know what the
distribution is, so we forget the real values and
replace them with their ranks
• Many parametric methods have non-parametric counterparts
Mann-Whitney test
non-parametric analogue of the two-sample t-test
• All values from both samples are ordered together
(and so they get ranks from 1 to n, where
n = n1 + n2)
• It doesn't matter whether the ranking is made
from the top or from the bottom, but I must pay
attention to it if one-tailed tests are used.
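Compute
$$U = n_1 n_2 + \frac{n_1(n_1 + 1)}{2} - R_1$$
where R1 is the sum of ranks in the first group; U is especially high
if the ranks in the first group are low,
or
$$U' = n_1 n_2 + \frac{n_2(n_2 + 1)}{2} - R_2$$
where R2 is the sum of ranks in the second group; U' is especially high
if the ranks in the second group are low.
It holds that U + U' = n1n2.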
H0: Male and female students are the same height.
HA: Male and female students are not the same height.
α = 0.05

Height of males  Height of females  Rank of male heights  Rank of female heights
---------------------------------------------------------------------------------
193              175                1                     7
188              173                2                     8
185              168                3                     10
183              165                4                     11
180              163                5                     12
178                                 6
170                                 9

n1 = 7, n2 = 5
R1 = 30, R2 = 48
U = n1n2 + n1(n1+1)/2 - R1 = (7)(5) + (7)(8)/2 - 30 = 35 + 28 - 30 = 33
U' = n1n2 - U = (7)(5) - 33 = 2
U0.05(2),7,5 = U0.05(2),5,7 = 30
As 33 > 30, we reject H0.
0.01 < P(U ≥ 33 or U' ≤ 2) < 0.02
Mann-Whitney test for non-parametric testing of the two-tailed hypothesis that there is no difference between the heights of male and female students.
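The worked example can be reproduced with a short Python sketch. It assumes, as in the example, that ranks are assigned from the largest value (rank 1) downwards and that there are no tied values; the function name is hypothetical.

```python
# Heights from the worked example above
males = [193, 188, 185, 183, 180, 178, 170]
females = [175, 173, 168, 165, 163]


def mann_whitney_u(sample1, sample2):
    """Return (U, U') with ranks assigned from the largest value (rank 1) down."""
    pooled = sorted(sample1 + sample2, reverse=True)
    # Assumption of this sketch: no tied values, so every value has a unique rank
    rank = {value: position + 1 for position, value in enumerate(pooled)}
    n1, n2 = len(sample1), len(sample2)
    r1 = sum(rank[v] for v in sample1)          # sum of ranks in the first group
    u = n1 * n2 + n1 * (n1 + 1) // 2 - r1       # U = n1n2 + n1(n1+1)/2 - R1
    u_prime = n1 * n2 - u                       # U + U' = n1n2
    return u, u_prime


u, u_prime = mann_whitney_u(males, females)
print(u, u_prime)  # 33 2
```

The larger of U and U' is then compared with the tabulated critical value, as in the example.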
Attention
All sorts of values are tabulated, so pay attention to
what is tabulated and how.
Statistica prints "2*1sided exact p" (if I want a one-tailed
test and the deviation goes in the right direction, I divide
it by two).
Normal approximation: if there
is a large number of observations,
it holds that
$$\mu_U = \frac{n_1 n_2}{2}, \qquad \sigma_U = \sqrt{\frac{n_1 n_2 (N + 1)}{12}}$$
where N = n1 + n2.
Z = (U - μ_U)/σ_U has an approximately normal distribution, and it is then easy
to find the corresponding p (Statistica prints it). Attention: if
I have the exact p, this approximate value is no longer of interest.
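As a rough illustration of the normal approximation, the sketch below computes Z = (U - n1n2/2) / sqrt(n1n2(N+1)/12) and a two-tailed p from the standard normal distribution. The values U = 33, n1 = 7, n2 = 5 are taken from the student-height example only to show the arithmetic; that sample is small, so the exact tabulated p would be preferred there. No continuity or tie correction is applied, and the function name is hypothetical.

```python
from math import erfc, sqrt


def mann_whitney_z(u, n1, n2):
    """Normal approximation for U: returns (Z, two-tailed p)."""
    n_total = n1 + n2
    mu_u = n1 * n2 / 2
    sigma_u = sqrt(n1 * n2 * (n_total + 1) / 12)
    z = (u - mu_u) / sigma_u
    # Two-tailed p of a standard normal variable: P(|Z| >= |z|) = erfc(|z| / sqrt(2))
    p_two_tailed = erfc(abs(z) / sqrt(2))
    return z, p_two_tailed


z, p = mann_whitney_z(33, 7, 5)
print(f"Z = {z:.3f}, two-tailed p = {p:.4f}")
```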
Similar to the permutation test
• even the M-W test has its assumptions:
• It is either a test of the null hypothesis that the
samples come from the same population
• or, if it is formulated as a location test,
there is an assumption that the samples have the
same distribution shape
It is thus absurd to write
• "As we did not have homogeneity of variances,
we had to use a non-parametric test."
• 1. Testing whether it is the same population when I
have previously demonstrated inhomogeneity of variance
does not make any sense
• 2. For a location test, inhomogeneity of
variance is the same problem for M-W as for the
t-test.
Another assumption: the data can
be ranked

Plant height  OrigGroup  Position in order  Assigned rank
12            1          2                  2
15            1          5                  5
13            1          3                  3
14            1          4                  4
10            1          1                  1
17            2          7                  7
19            2          8 - 10             9
16            2          6                  6
19            2          8 - 10             9
22            2          12                 12
20            2          11                 11
19            2          8 - 10             9

Ties are averaged. This deviation from the
original assumption can cause problems;
some tests use a correction for ties.
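Averaging ranks over ties can be sketched as follows. This is a minimal illustration using the plant heights from the table; the function name is hypothetical, and ranking from the smallest value upwards is assumed, as in the table.

```python
def average_ranks(values):
    """Rank values from the smallest (rank 1) up; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend the run while the next value in sorted order is tied with the current one
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + 1 + end + 1) / 2  # mean of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


heights = [12, 15, 13, 14, 10, 17, 19, 16, 19, 22, 20, 19]
ranks = average_ranks(heights)
print(ranks)  # the three values of 19 occupy positions 8 to 10 and each get rank 9
```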
Median test
• I compute the median of all observations and
count how many observations in each group lie
above and how many below this median. I
then analyse the counts with a classic 2 x 2 table. So it
is a test about the overall median; it has no
further assumptions, but it is very weak.
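A sketch of the counting step, reusing the plant-height data from the permutation example purely for illustration. Two choices here are assumptions, not taken from the slides: values equal to the overall median are counted as "not above", and the 2 x 2 table is summarised with the Pearson chi-square statistic without continuity correction (with counts this small, an exact test of the table would usually be preferred). The function names are hypothetical.

```python
from statistics import median

heights = [12, 15, 13, 14, 10, 17, 19, 16, 19, 22, 20, 19]
groups = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2]


def median_test_table(values, labels):
    """Return the overall median and a 2 x 2 table: rows = groups 1 and 2,
    columns = (above the median, not above the median)."""
    m = median(values)
    table = []
    for g in (1, 2):
        in_group = [v for v, lab in zip(values, labels) if lab == g]
        above = sum(v > m for v in in_group)
        table.append([above, len(in_group) - above])
    return m, table


def chi_square_2x2(table):
    """Pearson chi-square statistic for a 2 x 2 table (no continuity correction)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


m, table = median_test_table(heights, groups)
chi2 = chi_square_2x2(table)
print(m, table, round(chi2, 3))
```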
Wilcoxon test
• Analogue of the paired t-test
• Attention: several tests are called Wilcoxon,
so this one is sometimes referred to as the Wilcoxon test for
paired observations
Wilcoxon test
First, we compute the differences between the paired observations,
then we rank them by the size of their
absolute value, from the smallest to the largest.
After that we sum the ranks of the positive differences
and the ranks of the negative differences
(denoted T+ and T-). (As the sum of the
integers from 1 to n is n(n+1)/2, we can easily
compute T+ = n(n+1)/2 - T-.)
Thus, the test reflects the number as well as the size of the positive and
negative differences.
H0: The length of the foreleg and the hind leg is the same in roe deer.
HA: The length of the foreleg and the hind leg is not the same in roe deer.
α = 0.05

Roe deer  Hind leg length  Foreleg length  Difference        Rank of  Signed
(j)       (cm) X1j         (cm) X2j        (dj = X1j - X2j)  |dj|     rank
------------------------------------------------------------------------------
1         142              138              4                4.5       4.5
2         140              136              4                4.5       4.5
3         144              147             -3                3        -3
4         144              139              5                7         7
5         142              143             -1                1        -1
6         146              141              5                7         7
7         149              143              6                9.5       9.5
8         150              145              5                7         7
9         142              136              6                9.5       9.5
10        148              146              2                2         2

n = 10
T+ = 4.5 + 4.5 + 7 + 7 + 9.5 + 7 + 9.5 + 2 = 51
T- = 3 + 1 = 4
T0.05(2),10 = 8
As T- < T0.05(2),10, H0 is rejected.
0.01 < P(T- or T+ ≤ 4) < 0.02
Wilcoxon paired test applied to data on roe deer leg lengths.
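The computation of T+ and T- in this example can be sketched in Python. The function name is hypothetical; dropping zero differences is a common convention assumed here (no zero differences occur in these data), and tied absolute differences receive their average rank.

```python
# Leg lengths (cm) from the roe deer example above
hind = [142, 140, 144, 144, 142, 146, 149, 150, 142, 148]
fore = [138, 136, 147, 139, 143, 141, 143, 145, 136, 146]


def wilcoxon_t(x1, x2):
    """Return (T+, T-): rank sums of the positive and negative paired differences."""
    # Assumption of this sketch: pairs with zero difference are dropped
    diffs = [a - b for a, b in zip(x1, x2) if a != b]
    abs_sorted = sorted(abs(d) for d in diffs)
    rank_of = {}
    for value in set(abs_sorted):
        first = abs_sorted.index(value) + 1   # first position occupied by this value
        count = abs_sorted.count(value)       # number of tied values
        rank_of[value] = first + (count - 1) / 2  # average rank over the tie
    t_plus = sum(rank_of[abs(d)] for d in diffs if d > 0)
    t_minus = sum(rank_of[abs(d)] for d in diffs if d < 0)
    return t_plus, t_minus


t_plus, t_minus = wilcoxon_t(hind, fore)
print(t_plus, t_minus)  # 51.0 4.0
```

The smaller of the two sums is then compared with the tabulated critical value, as in the example.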
The normal approximation can be used again
(for large samples):
$$\mu_T = \frac{n(n + 1)}{4}, \qquad \sigma_T = \sqrt{\frac{n(n + 1)(2n + 1)}{24}}$$
and from these compute Z.
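To show the arithmetic only, the approximation can be applied to the roe deer example (n = 10, T- = 4). This is an illustrative calculation under the assumption that Z is formed as (T - μ_T)/σ_T without continuity or tie correction; n = 10 is too small for the approximation to be relied on, so the tabulated critical value remains the appropriate reference there.

```latex
\mu_T = \frac{n(n+1)}{4} = \frac{10 \cdot 11}{4} = 27.5
\qquad
\sigma_T = \sqrt{\frac{n(n+1)(2n+1)}{24}} = \sqrt{\frac{10 \cdot 11 \cdot 21}{24}} = \sqrt{96.25} \approx 9.81

Z = \frac{T_- - \mu_T}{\sigma_T} = \frac{4 - 27.5}{9.81} \approx -2.40
```

Using T+ = 51 instead gives the same value with the opposite sign, because T+ and T- are symmetric around μ_T.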
Attention: Statistica shows only the normal approximation and does not print
the exact p; look it up in tables if needed.
Tables can be found here:
http://fsweb.berry.edu/academic/education/vbissonnette/tables/wilcox_t.pdf
The test assumes a symmetric distribution of the differences.
Sign test
Compares the numbers of positive and negative differences
Has no assumptions, but is very weak
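A minimal sketch of the sign test as a two-tailed binomial test with success probability 0.5. The counts of 8 positive and 2 negative differences come from the roe deer example and are used only for illustration; the function name is hypothetical, and zero differences are assumed to have been discarded beforehand.

```python
from math import comb


def sign_test_p(n_positive, n_negative):
    """Two-tailed sign test: binomial probability of a split at least this uneven."""
    n = n_positive + n_negative
    k = min(n_positive, n_negative)
    # One tail: P(X <= k) for X ~ Binomial(n, 0.5)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


p_sign = sign_test_p(8, 2)
print(p_sign)  # 0.109375
```

With these counts the sign test gives p of about 0.11, whereas the Wilcoxon test on the same data rejected H0 at α = 0.05, which illustrates how much weaker the sign test is.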
Non-parametric tests
• If the assumptions of a parametric test are fulfilled,
non-parametric tests are weaker than the
corresponding parametric test.
• The common idea that non-parametric tests have
no assumptions is not true.
• Generally, the more observations I have, the
more robust parametric tests tend to be to
violations of their assumptions
• The stronger the assumptions that are fulfilled, the more
powerful the test I can usually use