Chi Square Test

Dr. Asif Rehman

Outline
• Types of Variables
• Quantitative Data Assessment (parametric)
  • Descriptive assessment
  • T-test
• Qualitative Data Assessment (non-parametric)
  • Descriptive assessment
  • Chi-square test (Fisher exact test)
Types of Data
• Quantitative or numerical data
• Qualitative or categorical data
  • Nominal data (unordered; categories do not represent any amount)
    • Sex (male, female)
    • Marital status (married, unmarried)
    • Blood group (O, A, AB, B)
    • Color of eyes (blue, green, brown, black)
    • Nationality of a person (Pakistani, American, Turkish)
  • Ordinal data (ordered)
    • Height (tall, medium, short)
    • Degree of pain (mild, moderate, severe)
    • Size of garment (large, medium, small)
The chi-square test is used when:
• Both variables are measured on a nominal scale.
• It can also be applied to interval or ratio data that have been
categorized into a small number of groups.
• The observations are randomly sampled from the population.
• All observations are independent (an individual can appear only
once in the table, and the categories do not overlap).
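As a rough illustration of categorizing ratio data before a chi-square test, the sketch below groups ages into three bands and counts each person exactly once. The records, the cut-offs, and the name age_group are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical records: (age in years, outcome). Illustrative values only.
records = [(23, "yes"), (37, "no"), (45, "no"), (61, "yes"), (70, "yes"), (18, "no")]

def age_group(age):
    """Collapse a ratio-scale age into three categories (cut-offs are assumptions)."""
    if age < 30:
        return "under 30"
    if age < 60:
        return "30-59"
    return "60 and over"

# Each record falls into exactly one cell, so no individual is counted twice
# and the categories do not overlap.
counts = Counter((age_group(age), outcome) for age, outcome in records)
```

The resulting counts form the cells of a contingency table; the total across cells equals the number of individuals.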
Categorical data assessment
• The chi-square test (χ²) compares observed and expected
frequencies.
• It is applied to compare two or more proportions, to test
whether there is a significant association between two variables
or not.
• It is a non-parametric test, although it is traditionally presented
alongside the parametric tests.
Chi Square test
• The chi-square test always tests what scientists call the null
hypothesis, which states that there is no significant difference
between the expected and observed results.
• It is used for estimating how closely an observed distribution
matches an expected distribution (goodness of fit).
• It is used for estimating whether two random variables are
independent.
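The first use, matching an observed distribution to an expected one, can be sketched as follows. The die-roll counts are hypothetical, and the degrees of freedom for this kind of test are taken as the number of categories minus one.

```python
# Hypothetical counts from 60 rolls of a die (illustrative values only).
observed = [8, 12, 9, 11, 10, 10]

# Under the null hypothesis of a fair die, every face is expected equally often.
expected = [sum(observed) / len(observed)] * len(observed)

# Sum of (O - E)^2 / E over all categories.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

df = len(observed) - 1       # categories minus one for a goodness-of-fit test
critical_value = 11.07       # chi-square table value for df = 5, alpha = 0.05
reject_h0 = chi_square > critical_value
```

With these counts the calculated value falls well below the table value, so the null hypothesis of a fair die would not be rejected.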
Conducting Chi-Square Analysis
1) Make a hypothesis based on your basic research question.
2) Determine the expected frequencies.
3) Create a table with observed frequencies, expected frequencies,
and chi-square values, using the formula
$\chi^2 = \sum \frac{(O - E)^2}{E}$
4) Find the degrees of freedom: (C - 1)(R - 1), where C is the
number of columns and R is the number of rows.
5) Find the critical chi-square value in the Chi-Square Distribution table.
6) If the critical (table) value > your calculated chi-square value, you do
not reject your null hypothesis; if your calculated value is greater
than the table value, you reject it.
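Steps 2 to 6 above can be sketched in a few lines of Python. The function name chi_square_statistic and the 2 x 3 table are hypothetical, introduced here only for illustration; the critical value is the usual table entry for 2 degrees of freedom at the 0.05 level.

```python
def chi_square_statistic(observed):
    """Return (chi-square value, degrees of freedom) for an r x c table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Step 2: expected frequency for this cell.
            e = row_totals[i] * col_totals[j] / grand_total
            # Step 3: add this cell's (O - E)^2 / E.
            statistic += (o - e) ** 2 / e

    # Step 4: (R - 1)(C - 1).
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df

# Hypothetical 2 x 3 table of observed counts (illustrative values only).
table = [[10, 20, 30], [20, 20, 20]]
stat, df = chi_square_statistic(table)

# Steps 5 and 6: compare with the table value for df = 2, alpha = 0.05.
critical_value = 5.991
reject_h0 = stat > critical_value
```

For this made-up table the calculated value is smaller than the table value, so the null hypothesis is not rejected.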
Chi Square Test steps
The 5 Steps in a Chi-Square Test:

Step 1: Write the null and alternative hypotheses.
H0: There is no relationship between the variables.
Ha: There is a relationship between the variables.

Step 2: Check conditions.
A) All expected counts should be > 1.
B) At least 80% of expected counts should be > 5.
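A minimal sketch of the Step 2 check, assuming the expected counts are already available as a list of rows. The name expected_counts_ok and both example tables are hypothetical.

```python
def expected_counts_ok(expected):
    """Check conditions A and B on a table of expected counts (list of rows)."""
    cells = [e for row in expected for e in row]
    # Condition A: every expected count is greater than 1.
    all_above_one = all(e > 1 for e in cells)
    # Condition B: at least 80% of expected counts are greater than 5.
    share_above_five = sum(e > 5 for e in cells) / len(cells)
    return all_above_one and share_above_five >= 0.80

# Hypothetical tables of expected counts (illustrative values only).
large_sample = expected_counts_ok([[15.0, 20.0, 25.0], [15.0, 20.0, 25.0]])
sparse_sample = expected_counts_ok([[0.8, 9.2], [7.2, 82.8]])
```

When the check fails, an exact method such as the Fisher exact test listed in the outline is the usual alternative.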

Step 3: Calculate the test statistic and p-value.
The test statistic measures the difference between the
observed counts and the expected counts assuming
independence.

Step 3 (cont.): Find the p-value.

If the χ² statistic is large, it implies that the observed counts are not close to the
counts we would expect to see if the two variables were independent. Thus, a
"large" χ² gives evidence against the null hypothesis and supports the
alternative.

The p-value of the chi-square test is the probability that the χ² statistic is as
large as or larger than the value we obtained, if H0 is true. Also, if H0 is true, the
χ² statistic has a chi-square distribution with (r-1)x(c-1) df.

Thus, the p-value for the chi-square test is ALWAYS the area to the right of the test
statistic under the curve, i.e. p-value = P(X > χ²), where X has a chi-square
distribution with (r-1)x(c-1) df.

To get this probability we need to use a chi-square distribution with (r-1)x(c-1)
df (chi-square table). Using Minitab, or any other statistical software, you can obtain
the p-value from the output. Otherwise, you can report a range for the p-value using
the table, since usually you will not be able to find the exact p-value in the table.
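Where no statistical package is at hand, the right-tail area can be sketched with the Python standard library alone. The name chi2_sf is an assumption of this sketch and not part of Minitab or any package mentioned here. It handles whole-number df only, using the recurrence Q(a+1, z) = Q(a, z) + z^a e^(-z) / Γ(a+1) for the upper regularized gamma function.

```python
import math

def chi2_sf(x, df):
    """Right-tail area P(X > x) for a chi-square distribution with integer df >= 1."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    if df % 2 == 1:
        q = math.erfc(math.sqrt(z))   # closed form for df = 1
        a = 0.5
    else:
        q = math.exp(-z)              # closed form for df = 2
        a = 1.0
    # Step up two degrees of freedom at a time until the requested df is reached.
    while 2 * a < df:
        q += z ** a * math.exp(-z) / math.gamma(a + 1)
        a += 1.0
    return q

# The familiar table value 3.84 for df = 1 leaves roughly 5% in the right tail.
p_at_table_value = chi2_sf(3.84, 1)
```

A statistic larger than the table value gives a smaller right-tail area, which is why "large" χ² values correspond to small p-values.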


Step 4: Decide whether or not the result is statistically significant.
• The results are statistically significant if the p-value is less than alpha,
where alpha is the significance level (usually α = 0.05).
Step 5: Report the conclusion in the context of the situation.
• The p-value is ______, which is < α, so this result is statistically significant.
Reject H0. Conclude that (the two variables) are related.
• The p-value is ______, which is > α, so this result is NOT statistically
significant. We cannot reject H0. We cannot conclude that (the two
variables) are related.
Example
To assess the prophylactic value of chloroquine, a study was
conducted on 3540 persons. Out of 606 persons who were given
chloroquine prophylactically, only 19 contracted malaria. Among
those who were not given prophylactic treatment, 193 contracted
malaria. Comment on the prophylactic value of chloroquine.
Descriptive frequencies
Total study population (n) = 3540
Chloroquine given = 606
1. Contracted malaria = 19
2. Did not contract malaria = 587
Chloroquine not given = 2934
1. Contracted malaria = 193
2. Did not contract malaria = 2741
2x2 contingency table
                         Contracted malaria    Did not contract malaria    Total
Chloroquine given                19                      587                 606
Chloroquine not given           193                     2741                2934
Total                           212                     3328              n=3540
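The cells and margins of this table follow from the four figures given in the problem statement. A short sketch of that arithmetic, with variable names chosen here for illustration:

```python
# Figures stated in the example.
n_total = 3540          # persons in the study
n_given = 606           # given chloroquine
cases_given = 19        # contracted malaria despite chloroquine
cases_not_given = 193   # contracted malaria without chloroquine

n_not_given = n_total - n_given

# Rows: chloroquine given / not given. Columns: contracted / did not contract.
observed = [
    [cases_given, n_given - cases_given],
    [cases_not_given, n_not_given - cases_not_given],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
```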
• Null hypothesis (H0)
  • Chloroquine has no role in the prevention of malaria.
At the end we have to either reject or fail to reject this hypothesis.
Calculation of expected values
• E = (Row total x Column total) / Grand total
• E1 = 606 x 212 / 3540 ≈ 36
• E2 = 606 x 3328 / 3540 ≈ 570
• E3 = 2934 x 212 / 3540 ≈ 176
• E4 = 2934 x 3328 / 3540 ≈ 2758
• The greater the difference between observed and expected
numbers (values), the larger the value of χ² and the less likely it is
that the difference is due to chance.
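The four expected values can be checked with a short sketch. The list layout and variable names are assumptions made here; the figures come from the contingency table.

```python
observed = [[19, 587], [193, 2741]]

row_totals = [sum(row) for row in observed]         # 606 and 2934
col_totals = [sum(col) for col in zip(*observed)]   # 212 and 3328
grand_total = sum(row_totals)                       # 3540

# E = row total x column total / grand total, for every cell.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# The slides round each expected value to the nearest whole number.
expected_rounded = [[round(e) for e in row] for row in expected]
```

The unrounded values are close to 36.29, 569.71, 175.71 and 2758.29, which round to the figures above.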
2x2 contingency table
Observed
                         Contracted malaria    Did not contract malaria    Total
Chloroquine given                19                      587                 606
Chloroquine not given           193                     2741                2934
Total                           212                     3328              n=3540
Expected
                         Contracted malaria    Did not contract malaria    Total
Chloroquine given                36                      570
Chloroquine not given           176                     2758
Total
Calculation of χ² value
Observed value (O)    Expected value (E)    O-E    (O-E)²    (O-E)²/E
O1 = 19               E1 = 36               -17     289       8.02
O2 = 587              E2 = 570              +17     289       0.50
O3 = 193              E3 = 176              +17     289       1.64
O4 = 2741             E4 = 2758             -17     289       0.10
Total                                         0               ∑ = 10.26
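The column arithmetic can be reproduced in a few lines. The sketch below is an illustration, not the lecturer's own working: it shows that the total of 10.26 results from cutting each term to two decimals, that the same rounded expected values give about 10.28 without truncation, and that unrounded expected counts give about 10.57. All three are far above the table value used for the decision.

```python
import math

observed = [19, 587, 193, 2741]
expected_rounded = [36, 570, 176, 2758]   # whole-number expected values from the table

# (O - E)^2 / E for each cell, using the rounded expected values.
terms = [(o - e) ** 2 / e for o, e in zip(observed, expected_rounded)]

# Cutting each term to two decimals reproduces the figures in the table.
chi2_truncated = sum(math.floor(t * 100) / 100 for t in terms)

# Same expected values, no truncation of the individual terms.
chi2_rounded_e = sum(terms)

# Unrounded expected counts: row total x column total / grand total.
row_totals = [606, 606, 2934, 2934]
col_totals = [212, 3328, 212, 3328]
n = 3540
expected_exact = [r * c / n for r, c in zip(row_totals, col_totals)]
chi2_exact = sum((o - e) ** 2 / e for o, e in zip(observed, expected_exact))
```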
Calculation of degrees of freedom
Degrees of freedom = (R - 1) x (C - 1)
= (Rows - 1)(Columns - 1)
= (2 - 1)(2 - 1)
= 1
*If the table (critical) value > your calculated value, then you do not
reject your null hypothesis. If your calculated value is greater than the
table value, you reject it: there is a significant difference that is
not due to chance.
Interpretation of results by consulting χ² Table
• The table value of χ² with 1 degree of freedom, at the significance
level of 0.05, is 3.84.
• Our calculated value of χ² is 10.26, which is more than the table
value of 3.84.
• So we reject the null hypothesis and conclude that
chloroquine does have a prophylactic role in malaria (P < 0.05).
• That is, the probability that a difference this large between the two
groups of persons arises by chance alone is < 0.05, or 5%.
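The same conclusion can be checked numerically. This sketch uses the fact that for 1 degree of freedom the right-tail area reduces to erfc(sqrt(χ²/2)); the variable names are chosen here for illustration. It also compares the malaria rates in the two groups, since the test itself does not say which group did better.

```python
import math

chi_square = 10.26       # calculated value from the example
critical_value = 3.84    # table value for df = 1, alpha = 0.05
alpha = 0.05

# Right-tail area for df = 1.
p_value = math.erfc(math.sqrt(chi_square / 2))

# Both routes lead to the same decision.
reject_by_table = chi_square > critical_value
reject_by_p_value = p_value < alpha

# Proportion contracting malaria in each group.
rate_given = 19 / 606          # about 3.1%
rate_not_given = 193 / 2934    # about 6.6%
```

The p-value comes out between 0.001 and 0.002, well under 0.05, and the malaria rate is lower in the chloroquine group, which supports a protective effect.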
THANK YOU