Lecture 13

Testing the Difference between
Means and Variances
Chi-square tests
© The McGraw-Hill Companies, Inc., 2000
Outline
10-1 Introduction
10-2 Testing the Difference
between Two Means: Large
Samples
10-3 Testing the Difference
between Two Variances
10-4 Testing the Difference
between Two Means: Small
Independent Samples
10-5 Testing the Difference
between Two Means: Small
Dependent Samples
10-1 Introduction
When comparing two means using the t test,
the researcher must decide if the two
samples are independent or dependent.
When the samples are independent, there are
two different formulas that can be used
depending on whether or not the variances
are equal (F test).
Researchers wish to compare two
sample means using experimental and
control groups.
Example: two brands of cough syrup
might be tested to see whether one
brand is more effective than the other.
10-2 Testing the Difference between
Two Means: Large Samples
Assumptions for this test:
The samples are independent.
The populations from which the samples are drawn must
be normally distributed.
The population standard deviations are known,
or both sample sizes are at least 30.
[Figure: sampling situations (Situation 1, Situation 2) for comparing two means. Each population has parameters µ, σ² (subscripts 1 and 2), and each sample has statistics n, s² (subscripts 1 and 2).]
10-2 Formula for the z Test for Comparing
Two Means from Independent Populations
$$z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
10-2 z Test for Comparing Two Means
from Independent Populations - Example
A survey found that the average hotel
room rate in New Orleans is $88.42 and
the average room rate in Phoenix is
$80.61. Assume that the data were
obtained from two samples of 50 hotels
each and that the standard deviations
were $5.62 and $4.83, respectively. At
α = 0.05, can it be concluded that there is
no significant difference in the rates?
Step 1: State the hypotheses and
identify the claim.
H0: µ1 = µ2 (claim) H1: µ1 ≠ µ2
Step 2: Find the critical values. Since
α = 0.05 and the test is a two-tailed test,
the critical values are z = ±1.96.
Step 3: Compute the test value.
$$z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{(88.42 - 80.61) - 0}{\sqrt{\dfrac{5.62^2}{50} + \dfrac{4.83^2}{50}}} = 7.45$$
Step 4: Make the decision. Reject the
null hypothesis at α = 0.05, since
7.45 > 1.96.
Step 5: Summarize the results. There is
enough evidence to reject the claim that
the means are equal. Hence, there is a
significant difference in the rates.
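The hotel-rate computation can be checked with a short Python sketch. It uses only the standard library; the function name `two_sample_z` is a hypothetical helper chosen for illustration, and the sample standard deviations stand in for σ1 and σ2 because both samples are large.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, mean2, sd1, sd2, n1, n2, hypothesized_diff=0.0):
    """z test value for the difference between two independent means."""
    standard_error = sqrt(sd1**2 / n1 + sd2**2 / n2)
    return ((mean1 - mean2) - hypothesized_diff) / standard_error


# Hotel-rate example: New Orleans vs. Phoenix, 50 hotels each.
z = two_sample_z(88.42, 80.61, 5.62, 4.83, 50, 50)

alpha = 0.05
critical = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value
p_value = 2 * (1 - NormalDist().cdf(abs(z)))     # two-tailed P-value
reject = abs(z) > critical

print(round(z, 2), round(critical, 2), reject)   # 7.45 1.96 True
```

The P-value computed this way is so small that it rounds to 0 at any usual number of decimal places.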
10-2 P-Values
The P-values for the tests can be
determined using the same procedure
as shown in Section 9-3.
The P-value for the previous example
is P-value = 2 × P(z > 7.45) ≈ 2(0) = 0.
Reject the null hypothesis, since
the P-value ≈ 0 < α = 0.05.
10-2 Formula for Confidence Interval for
Difference Between Two Means: Large
Samples
$$(\bar{X}_1 - \bar{X}_2) - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{X}_1 - \bar{X}_2) + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
10-2 Confidence Interval for Difference of Two
Means: Large Samples - Example
Find the 95% confidence interval for the
difference between the means for the
data in the previous example.
Substituting in the formula, one gets
(verify) 5.76 < µ1 − µ2 < 9.86.
Since the confidence interval does not
contain zero, one would reject the null
hypothesis in the previous example.
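The "(verify)" step for the confidence interval can be done with a few lines of Python. This is a minimal sketch using only the standard library; `z_interval` is a hypothetical helper name, and the sample standard deviations are again used in place of σ1 and σ2.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(mean1, mean2, sd1, sd2, n1, n2, confidence=0.95):
    """Large-sample confidence interval for mu1 - mu2."""
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * sqrt(sd1**2 / n1 + sd2**2 / n2)
    diff = mean1 - mean2
    return diff - margin, diff + margin


# Hotel-rate data from the previous example.
low, high = z_interval(88.42, 80.61, 5.62, 4.83, 50, 50)
print(round(low, 2), round(high, 2))   # 5.76 9.86
```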
10-3 Testing the Difference Between
Two Variances (F test)
For the comparison of two
variances or standard deviations,
an F test is used.
The sampling distribution of the
ratio of the two sample variances is called the
F distribution.
10-3 Characteristics of the
F Distribution
The values of F cannot be negative.
The distribution is positively skewed.
The mean value of F is approximately
equal to 1.
The F distribution is a family of curves
based on the degrees of freedom of the
variances in the numerator and
denominator.
[Figure: Curves for the F Distribution]
10-3 Formula for the F Test
$$F = \frac{s_1^2}{s_2^2}$$
where s1² is the larger of the two variances.
numerator degrees of freedom = n1 − 1
denominator degrees of freedom = n2 − 1
n1 is the sample size from which the larger
variance was obtained.
10-3 Assumptions for Testing the
Difference between Two Variances
The populations from which the
samples were obtained must be
normally distributed.
The samples must be independent
of each other.
10-3 Testing the Difference between
Two Variances - Example
A researcher wishes to see whether the
variance of the heart rates (in beats per
minute) of smokers is different from
the variance of the heart rates of people
who do not smoke. Two samples are
selected, and the data are given below.
Using α = 0.05, is there
enough evidence to support the claim?
For smokers n1 = 26 and s1² = 36; for
nonsmokers n2 = 18 and s2² = 10.
Step 1: State the hypotheses and
identify the claim.
H0: σ1² = σ2² H1: σ1² ≠ σ2² (claim)
Step 2: Find the critical value. Since
α = 0.05 and the test is a two-tailed test,
use the 0.025 table. Here d.f.N. = 26 – 1
= 25, and d.f.D. = 18 – 1 = 17. The
critical value is F = 2.56.
Step 3: Compute the test value.
F = s1²/s2² = 36/10 = 3.6.
Step 4: Make the decision. Reject the
null hypothesis, since 3.6 > 2.56.
Step 5: Summarize the results. There is
enough evidence to support the claim
that the variances are different.
[Figure: F distribution showing the critical value 2.56 and the test value 3.6]
10-3 Testing the Difference between
Two Variances - Example
An instructor hypothesizes that the
standard deviation of the final exam
grades in her statistics class is larger
for the male students than it is for the
female students. The data from the final
exam for the last semester are: males
n1 = 16 and s1 = 4.2; females n2 = 18 and
s2 = 2.3.
Is there enough evidence to support her
claim, using α = 0.01?
Step 1: State the hypotheses and
identify the claim.
H0: σ1² ≤ σ2² H1: σ1² > σ2² (claim)
Step 2: Find the critical value. Here,
d.f.N. = 16 – 1 = 15, and
d.f.D. = 18 – 1 = 17.
From the α = 0.01 table, the critical value is
F = 3.31.
Step 3: Compute the test value.
F = (4.2)²/(2.3)² = 3.33.
Step 4: Make the decision. Reject the
null hypothesis, since 3.33 > 3.31.
Step 5: Summarize the results. There is
enough evidence to support the claim
that the standard deviation of the final
exam grades for the male students is
larger than that for the female students.
[Figure: F distribution showing the critical value 3.31 and the test value 3.33]
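Both F test examples can be reproduced with a small Python sketch. The helper `f_test` is hypothetical and simply places the larger variance in the numerator, as the formula requires. The critical values 2.56 and 3.31 are the table values quoted in the examples and are entered as constants rather than computed.

```python
def f_test(var1, n1, var2, n2):
    """Return (F, d.f.N., d.f.D.) with the larger variance in the numerator."""
    if var1 >= var2:
        return var1 / var2, n1 - 1, n2 - 1
    return var2 / var1, n2 - 1, n1 - 1


# Example 1: heart-rate variances, smokers (36, n = 26) vs. nonsmokers (10, n = 18).
f1, dfn1, dfd1 = f_test(36, 26, 10, 18)
reject1 = f1 > 2.56          # table value for the 0.025 table

# Example 2: exam standard deviations, males (4.2, n = 16) vs. females (2.3, n = 18).
f2, dfn2, dfd2 = f_test(4.2**2, 16, 2.3**2, 18)
reject2 = f2 > 3.31          # table value for alpha = 0.01

print(f1, dfn1, dfd1, reject1)             # 3.6 25 17 True
print(round(f2, 2), dfn2, dfd2, reject2)   # 3.33 15 17 True
```

In the second example the claim is directional, so the helper works only because the group claimed to be larger (males) also has the larger sample variance.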
10-4 Testing the Difference between
Two Means: Small Independent Samples
When the sample sizes are small (< 30)
and the population variances are
unknown, a t test is used to test the
difference between means.
The two samples are assumed to be
independent and the sampling
populations are normally or
approximately normally distributed.
There are two options for the use of the
t test: one for when the variances of the
populations are equal and one for when
they are not equal.
The F test can be used to establish
whether the variances are equal or not.
10-4 Testing the Difference between
Two Means: Small Independent Samples - Test Value Formula
Unequal Variances
$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
d.f. = smaller of n1 − 1 or n2 − 1
Equal Variances
$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}\,\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$
d.f. = n1 + n2 − 2
10-4 Difference between Two Means:
Small Independent Samples - Example
The average size of a farm in Greene County,
PA, is 199 acres, and the average size of a farm
in Indiana County, PA, is 191 acres. Assume
the data were obtained from two samples with
standard deviations of 12 acres and 38 acres,
respectively, and the sample sizes are 10 farms
from Greene County and 8 farms in Indiana
County. Can it be concluded at α = 0.05 that
the average size of the farms in the two
counties is different?
Assume the populations are normally
distributed.
First we need to use the F test to
determine whether or not the variances
are equal.
The critical value for the F test for
α = 0.05 is 4.20.
The test value = 38²/12² = 10.03.
Since 10.03 > 4.20, the decision is to
reject the null hypothesis and conclude
the variances are not equal.
Step 1: State the hypotheses and
identify the claim for the means.
H0: µ1 = µ2
H1: µ1 ≠ µ2 (claim)
Step 2: Find the critical values. Since
α = 0.05 and the test is a two-tailed test,
the critical values are t = –2.365 and
+2.365 with d.f. = 8 – 1 = 7.
Step 3: Compute the test value.
Substituting in the formula for the test
value when the variances are not equal
gives t = 0.57.
Step 4: Make the decision. Do not reject
the null hypothesis, since 0.57 < 2.365.
Step 5: Summarize the results. There is
not enough evidence to support the
claim that the average size of the farms
is different.
Note: If the variances were equal, use the other test value formula.
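The farm-size example can be traced in Python as a sketch of the two-stage procedure: an F test on the variances, followed by the matching t formula. The function names `t_unequal` and `t_equal` are hypothetical, and the critical values 4.20 and 2.365 are the table values quoted in the example rather than computed ones.

```python
from math import sqrt


def t_unequal(mean1, mean2, var1, var2, n1, n2):
    """t test value and d.f. when the population variances are not equal."""
    t = (mean1 - mean2) / sqrt(var1 / n1 + var2 / n2)
    return t, min(n1 - 1, n2 - 1)


def t_equal(mean1, mean2, var1, var2, n1, n2):
    """t test value and d.f. when the population variances are equal (pooled)."""
    pooled = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    t = (mean1 - mean2) / (sqrt(pooled) * sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


# Greene County (sample 1) and Indiana County (sample 2).
mean_g, s_g, n_g = 199, 12, 10
mean_i, s_i, n_i = 191, 38, 8

# Stage 1: F test on the variances, larger variance in the numerator.
f_value = max(s_g, s_i) ** 2 / min(s_g, s_i) ** 2
F_CRITICAL = 4.20                       # table value quoted in the example
variances_differ = f_value > F_CRITICAL

# Stage 2: pick the t formula that matches the F test decision.
formula = t_unequal if variances_differ else t_equal
t, df = formula(mean_g, mean_i, s_g**2, s_i**2, n_g, n_i)
T_CRITICAL = 2.365                      # table value for d.f. = 7, two-tailed
reject = abs(t) > T_CRITICAL

print(round(f_value, 2), round(t, 2), df, reject)   # 10.03 0.57 7 False
```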
10-5 Testing the Difference between
Two Means: Small Dependent Samples
When the values are dependent,
employ a t test on the differences.
Denote the differences with the
symbol D, the mean of the population
of differences with µD, and the sample
standard deviation of the differences
with sD.
10-5 Testing the Difference between Two
Means: Small Dependent Samples - Formula for the test value.
$$t = \frac{\bar{D} - \mu_D}{s_D / \sqrt{n}}$$
where
$\bar{D}$ = sample mean of the differences
degrees of freedom = n − 1
Note: This test is similar to a
one sample t test, except it is
done on the differences when
the samples are dependent.
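The slides give no data for the dependent-samples test, so the sketch below uses hypothetical before-and-after measurements chosen only to illustrate the formula. The helper name `paired_t` is also an assumption, and n is taken to be the number of pairs.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second, hypothesized_mean_diff=0.0):
    """t test value and d.f. for dependent samples, computed on the differences D."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    d_bar = mean(diffs)      # sample mean of the differences
    s_d = stdev(diffs)       # sample standard deviation of the differences
    t = (d_bar - hypothesized_mean_diff) / (s_d / sqrt(n))
    return t, n - 1


# Hypothetical paired measurements (not taken from the lecture).
before = [12, 15, 11, 14, 13]
after = [10, 14, 12, 11, 12]

t, df = paired_t(before, after)
print(round(t, 3), df)
```

The resulting t would then be compared with the table value for n − 1 degrees of freedom, exactly as in a one sample t test.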
Outline
Chi-Square Tests
12-1 Introduction
12-2 Test for Goodness of Fit
12-3 Tests Using Contingency
Tables
Introduction
The chi-square distribution can be used to:
Test hypotheses about frequency distributions, such as
“If a sample of buyers is given a choice
of automobile colors, will each color be
selected with the same frequency?”
Test the independence of two variables.
For example, “Are senators’ opinions
on gun control independent of party
affiliations?”
Chi-square distribution
It is a family of curves.
Each curve depends on the degrees of
freedom.
A chi-square variable cannot be
negative.
Curves are asymmetric.
12-2 Test for Goodness of Fit
When one is testing to see
whether a frequency
distribution fits a specific
pattern, the chi-square
goodness-of-fit test is used.
12-2 Test for Goodness of Fit Example
Suppose a market analyst wished to see
whether consumers have any preference
among five flavors of a new fruit soda. A
sample of 100 people provided the
following data:
Cherry   Strawberry   Orange   Lime   Grape
32       28           16       14     10
If there were no preference, one
would expect that each flavor would
be selected with equal frequency.
In this case, the equal frequency is
100/5 = 20.
That is, approximately 20 people
would select each flavor.
The frequencies obtained from the
sample are called observed
frequencies.
The frequencies obtained from
calculations are called expected
frequencies.
The table for the test is shown below.
Freq.      Cherry   Strawberry   Orange   Lime   Grape
Observed   32       28           16       14     10
Expected   20       20           20       20     20
The observed frequencies will almost
always differ from the expected
frequencies due to sampling error.
Question: Are these differences
significant, or are they due to chance?
The chi-square goodness-of-fit test will
enable one to answer this question.
12-2 Test for Goodness of Fit Formula
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
d.f. = number of categories − 1
O = observed frequency
E = expected frequency
12-2 Test for Goodness of Fit Example
Is there enough evidence to reject the
claim that there is no preference in the
selection of fruit soda flavors? Let
α = 0.05.
Step 1: State the hypotheses and
identify the claim.
H0: Consumers show no preference for
flavors of the fruit soda (claim).
H1: Consumers show a preference.
The d.f. for this test is equal to the
number of categories minus 1.
Step 2: Find the critical value. The d.f.
are 5 – 1 = 4 and α = 0.05. Hence, the
critical value = 9.488.
Step 3: Compute the test value.
χ² = (32 – 20)²/20 + (28 – 20)²/20 + … +
(10 – 20)²/20 = 18.0.
Step 4: Make the decision. The
decision is to reject the null hypothesis,
since 18.0 > 9.488.
[Figure: chi-square distribution showing the critical value 9.488 and the test value 18.0]
Step 5: Summarize the results. There is
enough evidence to reject the claim that
consumers show no preference for the
flavors.
12-2 Test for Goodness of Fit Example
The advisor of an ecology club at a large
college believes that the group consists
of 10% freshmen, 20% sophomores, 40%
juniors, and 30% seniors. The
membership for the club this year
consisted of 14 freshmen, 19
sophomores, 51 juniors, and 16 seniors.
At α = 0.10, test the advisor’s conjecture.
Step 1: State the hypotheses and
identify the claim.
H0: The club consists of 10% freshmen,
20% sophomores, 40% juniors, and
30% seniors (claim).
H1: The distribution is not the same as
stated in the null hypothesis.
Step 2: Find the critical value. The d.f.
are 4 – 1 = 3 and α = 0.10. Hence, the
critical value = 6.251.
Step 3: Compute the test value.
χ² = (14 – 10)²/10 + (19 – 20)²/20 + … +
(16 – 30)²/30 = 11.208.
Step 4: Make the decision. The
decision is to reject the null hypothesis,
since 11.208 > 6.251.
Step 5: Summarize the results. There is
enough evidence to reject the advisor’s
claim.
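Both goodness-of-fit test values can be recomputed with a short Python sketch. The helper name `chi_square_gof` is hypothetical, and the critical values 9.488 and 6.251 are the table values quoted in the examples, entered as constants.

```python
def chi_square_gof(observed, expected):
    """Chi-square goodness-of-fit test value: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Fruit soda example: equal expected frequencies, 100 / 5 = 20 each.
soda_observed = [32, 28, 16, 14, 10]
soda_expected = [sum(soda_observed) / len(soda_observed)] * len(soda_observed)
soda_chi2 = chi_square_gof(soda_observed, soda_expected)
soda_reject = soda_chi2 > 9.488      # table value, d.f. = 4, alpha = 0.05

# Ecology club example: expected frequencies from the claimed percentages.
club_observed = [14, 19, 51, 16]
club_percent = [10, 20, 40, 30]
club_total = sum(club_observed)
club_expected = [p * club_total / 100 for p in club_percent]
club_chi2 = chi_square_gof(club_observed, club_expected)
club_reject = club_chi2 > 6.251      # table value, d.f. = 3, alpha = 0.10

print(round(soda_chi2, 1), soda_reject)   # 18.0 True
print(round(club_chi2, 3), club_reject)   # 11.208 True
```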
12-3 Tests Using Contingency Tables
When data can be tabulated in table
form in terms of frequencies, several
types of hypotheses can be tested
using the chi-square test.
Two such tests are the independence of
variables test and the homogeneity of
proportions test.
The test of independence of variables is
used to determine whether two variables
are independent when a single sample is
selected.
The test of homogeneity of proportions is
used to determine whether the proportions
for a variable are equal when several
samples are selected from different
populations.
12-3 Test for Independence Example
Suppose a new postoperative procedure
is administered to a number of patients
in a large hospital.
Question: Do the doctors feel
differently about this procedure from
the nurses, or do they feel basically the
same way?
The data are shown below.
Group     Prefer new procedure   Prefer old procedure   No preference
Nurses    100                    80                     20
Doctors   50                     120                    30
The null and the alternative hypotheses
are as follows:
H0: The opinion about the procedure is
independent of the profession.
H1: The opinion about the procedure is
dependent on the profession.
If the null hypothesis is not rejected, the
test means that both professions feel
basically the same way about the
procedure, and the differences are due to
chance.
If the null hypothesis is rejected, the test
means that one group feels differently
about the procedure from the other.
Note: The rejection of the null hypothesis
does not mean that one group favors the
procedure and the other does not.
The test value is the χ² value (computed with
the same formula as the goodness-of-fit test value).
The expected values are computed from:
(row sum) × (column sum)/(grand total).
From the MINITAB output, the
P-value ≈ 0. Hence, the null hypothesis
will be rejected.
If the critical value approach is used, the
degrees of freedom for the chi-square
critical value will be (number of
columns – 1) × (number of rows – 1).
d.f. = (3 – 1)(2 – 1) = 2.
12-3 Test for Homogeneity of
Proportions
Here, samples are selected from several
different populations and one is
interested in determining whether the
proportions of elements that have a
common characteristic are the same for
each population.
The sample sizes are specified in
advance, making either the row totals or
column totals in the contingency table
known before the samples are selected.
The hypotheses will be:
H0: p1 = p2 = … = pk
H1: At least one proportion is different
from the others.
The computations for this test
are the same as those for the test
of independence.
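Since the computations are the same for both contingency-table tests, the nurses and doctors data can serve as a worked sketch in Python. The helper names are hypothetical. The P-value line relies on the fact that a chi-square variable with exactly 2 degrees of freedom has upper-tail probability exp(−x/2), so it applies to this 2 × 3 table only and is not a general-purpose formula.

```python
from math import exp


def expected_counts(table):
    """Expected frequencies: (row sum) * (column sum) / (grand total)."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    total = sum(row_sums)
    return [[r * c / total for c in col_sums] for r in row_sums]


def chi_square_table(table):
    """Chi-square test value and d.f. for a contingency table."""
    expected = expected_counts(table)
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Rows: nurses, doctors. Columns: prefer new, prefer old, no preference.
opinions = [[100, 80, 20],
            [50, 120, 30]]

chi2, df = chi_square_table(opinions)
p_value = exp(-chi2 / 2)     # valid only because d.f. = 2 here
print(round(chi2, 2), df, p_value < 0.05)   # 26.67 2 True
```

The P-value is on the order of 10⁻⁶, which agrees with the MINITAB output reporting it as 0.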