
NONPARAMETRIC TESTS
Lectures delivered by
Prof. K. K. Achary
Ph.D. course work – Jan. 2015 batch
PARAMETRIC TESTS v/s NONPARAMETRIC TESTS
Statistical tests can be classified into two categories: parametric and nonparametric.
• Parametric tests rely on distributional assumptions (such as normality) about the population from which the sample is drawn, and the hypotheses are formulated about the population parameters. Hence the tests are appropriately called 'parametric tests'.
• The data should be on an interval or ratio scale and generally continuous.
• When these assumptions fail, parametric tests perform poorly, resulting in errors/incorrect analysis.

• Nonparametric tests deal with hypotheses which are not statements about population parameters. Tests which do not make assumptions about the distribution of the sampled population are called distribution-free tests.
• Despite the distinction, the terms 'nonparametric' and 'distribution-free' are used interchangeably.
• We will study some important nonparametric tests only.

• Most of the parametric tests, like the one-sample t-test, independent samples t-test, paired t-test, and one-way ANOVA, are robust tests, i.e., they are not very sensitive to lack of normality. When we consider fairly large sample sizes, these tests perform very well.
• However, when distributional assumptions cannot be made, when distributions are skewed, or when the hypotheses are not statements about parameters, we use nonparametric tests.

Advantages of nonparametric tests
• They allow for testing hypotheses that are not statements about population parameters.
• Used when the form of the distribution is not known
• Used when the data are asymmetric (skewed)
• Computationally easier procedures
• Used when the data are presented as ranks
• Fewer assumptions required

Some disadvantages
• The power efficiency of a nonparametric test is lower than that of the corresponding parametric test. Hence it may fail to detect a difference/the effect of the independent variable on the dependent variable.
• To detect a specified effect at a given significance level, nonparametric tests need larger sample sizes. But with large sample sizes, using the Central Limit Theorem/large-sample theory, parametric tests can be used.

Some important nonparametric tests
• Chi-square test, Fisher's exact test
• Sign test, run test
• Wilcoxon signed ranks test
• Mann-Whitney-Wilcoxon test
• Kolmogorov-Smirnov test
• McNemar's test
• Kruskal-Wallis test
• Friedman's test

CHI SQUARE TEST
• Statistical tests based on the chi square distribution are called chi square tests.
• It is the distribution of the test statistic that follows the chi square distribution, not the distribution of the population from which the data are sampled.
• Three types of chi square tests: goodness of fit test, test for independence, test for homogeneity
• A goodness of fit test makes a statement about the nature of the whole population

• The test for independence is used to test the independence (association) of two categorical variables.
• Example: checking the independence of gender and job category.
• The test for homogeneity is used for testing equality of proportions across two or more categories.
• The chi square test is an important tool in the analysis of categorical/rank data.

Goodness-of-fit test
• We wish to test whether our sample is drawn from a specified (hypothesized) distribution, i.e., if the variable is continuous, has the sample come from a normal/exponential/Pareto distribution?
• Or, if the data are discrete valued (integer valued), have the data come from a binomial/Poisson/negative binomial distribution?

Test procedure
• Assume that a simple random sample is given.
• The sample data can be arranged into a frequency distribution. The frequencies are called the 'observed frequencies'.
• Hypotheses:
H0: the sample is drawn from a distribution D
H1: the sampled population does not follow D

Test statistic:
$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$
This statistic has a chi square distribution with d.f. = No. of classes − No. of restrictions and parameters estimated from the sample data.
• Decision: Reject the null hypothesis if the computed value of chi-square is greater than the critical value of chi-square. Equivalently, reject the null hypothesis if the p-value is smaller than the level of significance.

Example
Age group   Freq.   Popn. proportion (%)
20–24       103     18
25–34       216     50
35–44       171     32
Total       490     100

• As per the population proportions, the expected frequencies are to be calculated.
• Null hypothesis???

• Expected freq. for age group 20–24: (18/100) × 490 = 88.2
• For 25–34: (50/100) × 490 = 245.0
• For 35–44: (32/100) × 490 = 156.8
• The chi-square value computed is:
$\chi^2 = \frac{(103 - 88.2)^2}{88.2} + \frac{(216 - 245)^2}{245} + \frac{(171 - 156.8)^2}{156.8} = 2.483 + 3.433 + 1.286 = 7.202$
• Level of significance = 0.05
• Degrees of freedom = k − 1 = 2
• Critical value = 5.991 (from tables of the chi square distribution/Excel)
• Decision: reject the null hypothesis
• What is your conclusion?
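For reference, this example can be verified in Python; a minimal sketch, assuming SciPy is installed (the observed frequencies and proportions are taken from the example above):

```python
from scipy.stats import chisquare

# Observed frequencies for the three age groups (from the example)
observed = [103, 216, 171]
# Expected frequencies under H0: population proportions 18%, 50%, 32%
expected = [0.18 * 490, 0.50 * 490, 0.32 * 490]  # 88.2, 245.0, 156.8

stat, p = chisquare(f_obs=observed, f_exp=expected)
print(f"chi-square = {stat:.3f}, p-value = {p:.4f}")
# chi-square = 7.202, p ≈ 0.027 < 0.05, so the null hypothesis is rejected,
# agreeing with the critical-value comparison (7.202 > 5.991 at d.f. = 2).
```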

Small Expected frequencies
• When the expected frequencies for some classes are small, some corrections are suggested.
• Cochran's correction:
◦ For unimodal distributions, the minimum expected frequency can be as low as one. If expected frequencies are less than one, combine the adjacent categories to achieve the required minimum expected frequency.
Chi-square Test for independence
• In the case of two continuous variables, we use correlation as a measure of association or dependence.
• To study association/dependence between attributes/categorical variables, we can use the chi-square test.
• The data in this type of example are arranged in the form of a contingency table, with the cell entries in the table denoting the frequency counts.

CONTINGENCY TABLE
• Consider the cross tabulation of two attributes, say A (income) and B (health status), based on a family survey. Suppose we consider three levels of income and four levels of family health status.
• This gives a 3 x 4 contingency table as follows.

Distribution of families with health status v/s income

Income level   Poor   Average   Good   Excellent   Total
Low            380    156       67     12          615
Middle         168    657       369    35          1229
High           54     650       156    124         984
Total          602    1463      592    171         2828
• The numbers in the cells denote the number of families with the different income levels possessing the different health statuses.
• In general, if the row attribute is at 'r' levels and the column attribute is at 'c' levels, cross tabulation results in an r x c contingency table.

• We are, in general, interested in testing the association/independence of the two attributes.
• For example, in the family survey stated above, we want to see whether income level and health status are independent.
• For this we use the chi-square test.
• First, we consider a 2 x 2 contingency table.

2 x 2 contingency table

• Let the row attribute (factor) be at two levels, say a1 and a2, and the column attribute be at two levels, b1 and b2.
• Then the 2 x 2 table with the cell frequencies and totals (marginal totals and grand total) will be as follows.
         b1      b2      Total
a1       a       b       a+b
a2       c       d       c+d
Total    a+c     b+d     a+b+c+d

The chi-square statistic for the 2 x 2 contingency table (without correction) is given below:
$\chi^2 = \frac{(a+b+c+d)(ad - bc)^2}{(a+c)(b+d)(a+b)(c+d)}$
This chi square statistic has 1 degree of freedom (d.f.).
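The 2 x 2 formula translates directly into code. A minimal sketch (the function name and the illustrative counts below are hypothetical, not from the lecture):

```python
def chi2_2x2(a: float, b: float, c: float, d: float) -> float:
    """Chi-square statistic (without continuity correction) for a 2 x 2
    table with first row (a, b) and second row (c, d)."""
    n = a + b + c + d  # grand total
    return n * (a * d - b * c) ** 2 / ((a + c) * (b + d) * (a + b) * (c + d))

# Illustrative counts; compare the result with the chi-square critical
# value for 1 d.f. (3.841 at the 0.05 level).
print(chi2_2x2(30, 10, 20, 40))  # ≈ 16.67, so independence would be rejected
```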
Test for independence in r x c table
• Suppose we have two attributes/factors, with r levels for the row attribute and c levels for the column factor. We can use the chi square test for testing the independence of the two attributes.
• The test statistic is
$\chi^2 = \sum_{i,j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$, where $O_{ij}$ and $E_{ij}$ are the observed and expected frequencies of cell (i, j).

• To compute the expected frequencies, consider the following r x c contingency table, where $R_i$ is the i-th row total, $C_j$ is the j-th column total, and GT is the grand total:

         1     2     ..    j     ..    c     Total
1                                             R1
..
i                                             Ri
..
r                                             Rr
Total    C1    C2    ..    Cj    ..    Cc     GT

• The expected frequency for cell (i, j) is given by
$E_{ij} = (R_i \times C_j) / GT$


• The chi square statistic has (r − 1) × (c − 1) d.f.
• Decision: If the computed value of the test statistic is greater than or equal to the critical value at the given level of significance (alpha), then reject the null hypothesis.
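For the family survey table above, the whole procedure is available in Python; a minimal sketch assuming SciPy is installed (scipy.stats.chi2_contingency computes the expected frequencies E_ij and the degrees of freedom internally):

```python
from scipy.stats import chi2_contingency

# 3 x 4 family survey table: rows = income (Low, Middle, High),
# columns = health status (Poor, Average, Good, Excellent)
table = [
    [380, 156,  67,  12],
    [168, 657, 369,  35],
    [ 54, 650, 156, 124],
]

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi-square = {chi2:.1f}, d.f. = {dof}, p-value = {p:.3g}")
# d.f. = (3 - 1)(4 - 1) = 6; reject independence of income and
# health status if the p-value is below the chosen alpha.
```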
Chi square test for homogeneity


• When we test for independence, we assume that a single sample was drawn from a population and the data are cross-tabulated to get a contingency table. In this case, the row and column totals are chance quantities, not under the control of the researcher.
• There are situations when the row or column totals are under the control of the researcher. This can be thought of as a situation where the row (column) observations are drawn from different populations. Here, one set of marginal totals is fixed while the other set of marginal totals is random.
• In situations like this, we would like to know whether the different populations giving the row (column) samples are homogeneous.
• Hence we apply the chi square test of homogeneity.

• The homogeneity test is concerned with the question: are the samples drawn from populations that are homogeneous with respect to some criterion of classification?
• Hypotheses:
Null: the populations are homogeneous
Alternative: the populations are not homogeneous


• Test statistic & decision:
$\chi^2 = \sum_{i,j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$, which has (r − 1)(c − 1) d.f.
SIGN TEST






• The one sample t-test or matched pair t-test depends on the normality assumption.
• In the absence of normality, with small sample sizes, or with rank data, the sign test is used instead of the t-test.
• Applicable to ordinal data as well as interval or ratio scale continuous data.
• The test gets its name from the '+' and '−' signs used, without considering the numeric values.
• Very simple to execute.
• Focuses on the median as a measure of central tendency.


• We assume that the measurements are taken on a continuous variable.
• The null hypothesis is the statement that the population median is equal to a specified value:
H0: the population median is M


Level of significance be α = 0.05
Compute the deviations of the observations
from the hypothesized median M. Record
only the sign of the difference( + or - )
The test statistic for the sign test is either
the observed number of + signs or – signs
, which depends on the type of alternative
hypothesis.
 Let N(+) denote the number of ‘+’ signs
and N( - ) denote the number of ‘-’ signs.
 Then the null hypothesis and the different
alternative hypothesis are given by:

H0: N(+) = N(−)
H1: N(+) ≠ N(−) (two sided)
H1: N(+) > N(−) (one sided)
H1: N(+) < N(−) (one sided)
• In a given test, any one of the above three alternatives is possible.
• A sufficiently small number of + (or −) signs will be taken as the test statistic, depending on the alternative hypothesis.
• The distribution of the test statistic is binomial, so the p-value can be easily computed.





• Based on the alternative hypothesis, the decision rules can be stated as follows.
• If the alternative hypothesis is N(+) > N(−) (or N(+) < N(−)), then reject the null hypothesis if the probability of getting less than or equal to k negative (or positive) signs (the p-value) is less than or equal to α. Here k is the value of the test statistic.
• For a two sided test, the p-value can be computed easily by taking the one sided probability and comparing it with α/2.
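Because the test statistic is binomial with p = 0.5 under the null hypothesis, the p-value is just a binomial tail probability. A minimal sketch in Python (the data and the hypothesized median M are hypothetical; scipy.stats.binomtest requires SciPy 1.7 or later):

```python
from scipy.stats import binomtest

# Hypothetical sample; H0: the population median M is 50
data = [52, 47, 58, 61, 49, 55, 63, 51, 60, 46]
M = 50

diffs = [x - M for x in data if x != M]    # drop observations equal to M
n_plus = sum(1 for d in diffs if d > 0)    # N(+)
n = len(diffs)                             # N(+) + N(-)

# Under H0, N(+) ~ Binomial(n, 0.5); two sided sign test
result = binomtest(n_plus, n, p=0.5, alternative='two-sided')
print(f"N(+) = {n_plus} out of {n}, p-value = {result.pvalue:.4f}")
```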
Sign Test for Paired Data
• When the paired t-test assumptions fail for paired data, we can use the sign test.
• From the matched pair data, say (X, Y), find the differences X − Y. Record only the signs of the differences, i.e., + or −.
• Set the null hypothesis N(+) = N(−) (or: the median of the differences is zero) against a one sided or two sided alternative.
• Execute the sign test.

Mood’s Median Test
• Mood's median test (or simply the median test) compares the medians of two or more samples. It is closely related to the one sample sign test, and shares the latter's properties of robustness and (generally low) power.
• Assumptions:
◦ Observations are independent both within and between samples, i.e., the data are independent simple random samples or the equivalent.
◦ The data are at least on an ordinal scale.


◦ The distributions of the populations the samples were drawn from all have the same shape.
• These assumptions are the same as for the sign test.
• The median test is very robust against outliers, and fairly robust against differences in the shapes of the distributions.
• The median test has poor power for normally distributed data, even worse power for short-tailed distributions, but good relative power for heavy-tailed (outlier-rich) distributions.

• Hypotheses:
H0: the population medians are all equal
H1: the population medians are not all equal
• Rationale:
• If the null hypothesis is true, any given observation has probability 0.5 of being greater than the shared median, by definition and regardless of which population it is from.

• Therefore, the number of observations greater than the shared median has a binomial distribution with p = 0.5.
• Even if the null hypothesis is true, the shared population median is of course not known. It can be estimated, however, by the median of all the observations (i.e., the sample median when all the samples are combined into one).

• Procedure:
1. Determine the overall median.
2. For each sample, count how many observations are greater than the overall median, and how many are equal to or less than it.
3. Put the counts from step 2 into a 2 x k contingency table:

Sample no.    No. of obs. greater than overall median    No. of obs. less than or equal
1
2
3
..
k
4. Perform a chi square test on this contingency table, testing the hypothesis that the probability of an observation being greater than the overall median is the same for all populations.
• Degrees of freedom: (r − 1)(c − 1), here (2 − 1)(k − 1) = k − 1.
• Decision: If the calculated value of chi square is greater than the critical value, then reject the null hypothesis.
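SciPy implements this whole procedure as scipy.stats.median_test; a minimal sketch with hypothetical samples:

```python
from scipy.stats import median_test

# Three hypothetical samples to compare
g1 = [12, 15, 14, 18, 21, 13]
g2 = [19, 22, 25, 24, 17, 20]
g3 = [14, 16, 15, 19, 18, 13]

# Builds the 2 x k table of counts above vs. at-or-below the overall
# median (default tie handling) and runs the chi square test on it
stat, p, grand_median, table = median_test(g1, g2, g3)
print(f"chi-square = {stat:.3f}, p-value = {p:.4f}")
print(f"overall median = {grand_median}")
print(table)  # the 2 x 3 table of counts
```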
MANN-WHITNEY TEST
• In the median test we compare the observations in the individual samples with the combined median. This results in a loss of information from the individual samples.
• To overcome this drawback of the median test, we can use the Mann-Whitney test (also called the Mann-Whitney-Wilcoxon test or the Wilcoxon rank sum test).

Assumptions
• The two samples are independently drawn from their respective populations and may be of different sizes.
• The measurement scale is at least ordinal.
• The variable of interest is continuous.
• If the populations differ at all, they differ in their locations (median values).
• The parametric test equivalent to the Mann-Whitney-Wilcoxon test is the independent samples t-test.

Hypotheses
• H0: the two populations have equal medians
• H1: the medians are not equal (two-sided test)
• Other alternatives:
◦ The median of population 1 is larger than the median of population 2
◦ The median of population 1 is smaller than the median of population 2

Test statistic
• Combine the two samples and rank all observations from smallest to largest, while keeping track of the sample to which each observation belongs.
• Tied observations are assigned ranks equal to the mean of the rank positions they occupy.
• The test statistic is given below.

Mann-Whitney test statistic
$T = S - \frac{n(n+1)}{2}$, where
S = sum of the ranks of the sample observations from the first (or X) population,
n = number of observations in the first sample.
• The critical value for a specified level of significance can be obtained from tables.
• Decision: If the computed value of T exceeds the table value, reject the null hypothesis.
• Using a normal approximation to the test statistic, we can perform a z-test instead. Some authors/texts use this approach.
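In Python the test is available as scipy.stats.mannwhitneyu, whose U statistic corresponds to T = S − n(n + 1)/2 above; a minimal sketch with hypothetical samples:

```python
from scipy.stats import mannwhitneyu

# Hypothetical independent samples
x = [3.1, 4.5, 2.8, 5.0, 4.2, 3.9]
y = [5.5, 6.1, 4.8, 7.0, 5.9]

# U for x is its rank sum minus n(n + 1)/2, i.e. the statistic T above
u_stat, p = mannwhitneyu(x, y, alternative='two-sided')
print(f"U = {u_stat}, p-value = {p:.4f}")
# Reject H0 (equal medians) if the p-value is below alpha.
```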

Wilcoxon signed rank test
• This nonparametric test is an alternative to the paired t-test when the normality assumption fails.
• Assumptions:
◦ Paired data come from the same population
◦ Each pair is randomly chosen
◦ Data are at least on an ordinal scale
◦ The distribution of the differences should be symmetric around the median
• Hypotheses:
H0: the two samples come from the same distribution (equal medians)
H1: the distributions of the two populations systematically differ
• Test procedure:
• Given an SRS of size n, compute the differences between the paired observations.

• Consider the absolute differences (magnitudes only) and arrange them in increasing order. Keep track of the positive differences.
• Rank the ordered differences. Tied differences are assigned the average rank.
• Consider the test statistic
$W^+$ = sum of the ranks of the positive differences

• Under the null hypothesis the test statistic has mean
$\mu_{W^+} = \frac{n(n+1)}{4}$
and standard deviation
$\sigma_{W^+} = \sqrt{\frac{n(n+1)(2n+1)}{24}}$
• The test rejects the null hypothesis when the computed value of the statistic is far from its mean (equivalently, when the p-value is smaller than the level of significance).
• We can use the normal approximation and perform a z-test using the z-score/statistic
$z = \frac{W^+ - \mu_{W^+}}{\sigma_{W^+}}$




• The p-value for this test can be computed from the sampling distribution of the test statistic under the null hypothesis.
• Exact computation of the p-value is difficult. Table values are available, and statistical software provides the p-value.
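In practice the test is one call in Python; a minimal sketch with hypothetical paired measurements (scipy.stats.wilcoxon reports, for the two sided test, the smaller of the positive and negative rank sums together with the p-value):

```python
from scipy.stats import wilcoxon

# Hypothetical paired (before, after) measurements
before = [8.2, 7.5, 9.1, 6.8, 7.9, 8.5, 7.2, 9.0]
after  = [7.8, 7.9, 8.4, 6.1, 7.5, 8.0, 7.4, 8.2]

# Signed rank test on the paired differences before - after;
# zero differences are dropped by default
stat, p = wilcoxon(before, after)
print(f"W = {stat}, p-value = {p:.4f}")
# Reject H0 (differences symmetric about zero) if p < alpha.
```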

Kruskal-Wallis Test (K-W Test)
• In ANOVA, we compare the population means of several (more than two) groups.
• Important assumptions for ANOVA are normality and homogeneity of variances.
• If these assumptions are violated, we use the K-W test instead.
• It is also called Kruskal-Wallis ANOVA.

• In the K-W test we apply one-way ANOVA to the ranks of the pooled data from all the groups rather than to the original observations.
• Hypotheses:
H0: all groups have the same distribution
H1: at least some groups have different distributions
• Test procedure:
◦ Let there be k groups of sizes n1, n2, ..., nk, with N = n1 + n2 + ... + nk.
◦ Rank all the observations together and let Ri be the total of the ranks for the i-th group.
◦ The Kruskal-Wallis test statistic is:
$H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N+1)$
• Decision: Reject the null hypothesis if the p-value is less than α.
• Exact computation of the p-value is complicated.
• The test statistic H approximately follows a chi-square distribution with (k − 1) d.f., so the p-value can be computed using the chi-square distribution.
• If the null hypothesis is rejected, we have to perform a post-hoc analysis.
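A minimal sketch of the K-W test in Python with hypothetical groups (scipy.stats.kruskal computes H, with a correction for ties, and the chi-square-based p-value):

```python
from scipy.stats import kruskal

# Hypothetical observations for k = 3 groups
g1 = [27, 31, 29, 35, 33]
g2 = [22, 25, 24, 28, 21]
g3 = [30, 36, 34, 38, 32]

# H is referred to a chi-square distribution with k - 1 = 2 d.f.
h_stat, p = kruskal(g1, g2, g3)
print(f"H = {h_stat:.3f}, p-value = {p:.4f}")
# If H0 is rejected, follow up with post-hoc pairwise comparisons.
```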

KAPPA STATISTIC