Independent Samples: Comparing Means
Lecture 39, Section 11.4
Fri, Apr 1, 2005

Independent Samples
In a paired study, two observations are made on each subject, producing one sample of bivariate data. Or we could think of it as two samples of paired data. Often these are "before" and "after" observations. By comparing the "before" mean to the "after" mean, we can determine whether the intervening treatment had an effect.

Independent Samples
On the other hand, with independent samples, there is no logical way to "pair" the data. One sample might be from a population of males and the other from a population of females. Or one might be the treatment group and the other the control group. The samples could be of different sizes.

Independent Samples
We wish to compare population means $\mu_1$ and $\mu_2$. We do so by comparing sample means $\bar{x}_1$ and $\bar{x}_2$. More specifically, we will use $\bar{x}_1 - \bar{x}_2$ as an estimator of $\mu_1 - \mu_2$. If we want to know whether $\mu_1 = \mu_2$, we test to see whether $\mu_1 - \mu_2 = 0$ by computing $\bar{x}_1 - \bar{x}_2$.

The Distributions of $\bar{x}_1$ and $\bar{x}_2$
Let $n_1$ and $n_2$ be the sample sizes. If the samples are large, then $\bar{x}_1$ and $\bar{x}_2$ have (approx.) normal distributions. However, if either sample is small, then we will need an additional assumption: the populations are normal.

Further Assumption
We will also assume that the two populations have the same standard deviation. Call it $\sigma$. If this assumption is not supported by the evidence, then it should not be made. This assumption is often not warranted, but without it, the formulas become much more complicated. See p. 658.

The t Distribution
Let $s_1$ and $s_2$ be the sample standard deviations. Whenever we use $s_1$ and $s_2$ instead of $\sigma$, we will have to use the t distribution instead of the standard normal distribution, unless the sample sizes are large.

The Distribution of $\bar{x}_1 - \bar{x}_2$
Suppose that $\bar{x}_1$ and $\bar{x}_2$ have normal distributions with means $\mu_1$ and $\mu_2$ and standard deviations $\sigma_1/\sqrt{n_1}$ and $\sigma_2/\sqrt{n_2}$ (according to the Central Limit Theorem, p. 500).
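Then $\bar{x}_1 - \bar{x}_2$ is a normal random variable with the following properties: the mean is $\mu_1 - \mu_2$, and the standard deviation is $\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$.

The Distribution of $\bar{x}_1 - \bar{x}_2$
If we assume that $\sigma_1 = \sigma_2 = \sigma$, then the standard deviation may be simplified to
$$\sqrt{\frac{\sigma^2}{n_1} + \frac{\sigma^2}{n_2}} = \sigma\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}.$$
That is, $\bar{x}_1 - \bar{x}_2$ is $N\left(\mu_1 - \mu_2,\ \sigma\sqrt{1/n_1 + 1/n_2}\right)$.

The Distribution of $\bar{x}_1 - \bar{x}_2$
$\bar{x}_1$ is $N(\mu_1,\ \sigma/\sqrt{n_1})$.
$\bar{x}_2$ is $N(\mu_2,\ \sigma/\sqrt{n_2})$.
$\bar{x}_1 - \bar{x}_2$ is $N\left(\mu_1 - \mu_2,\ \sigma\sqrt{1/n_1 + 1/n_2}\right)$.
[Figure: normal curves for $\bar{x}_1$, $\bar{x}_2$, and $\bar{x}_1 - \bar{x}_2$, centered at $\mu_1$, $\mu_2$, and $\mu_1 - \mu_2$ on a common axis starting at 0.]

The Distribution of $\bar{x}_1 - \bar{x}_2$
If $\bar{x}_1 - \bar{x}_2$ is $N\left(\mu_1 - \mu_2,\ \sigma\sqrt{1/n_1 + 1/n_2}\right)$, then it follows that
$$Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sigma\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}.$$

Estimating $\sigma$
Individually, $s_1$ and $s_2$ estimate $\sigma$. However, we can get a better estimate than either one if we "pool" them together. The pooled estimate is
$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}.$$

$\bar{x}_1 - \bar{x}_2$ and the t Distribution
If we use $s_p$ instead of $\sigma$, and the sample sizes are small, then we should use t instead of Z. The number of degrees of freedom is df = df1 + df2 = $n_1 + n_2 - 2$. That is,
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$
has the t distribution with $n_1 + n_2 - 2$ degrees of freedom.

Hypothesis Testing
See Example 11.4, p. 647: Comparing Two Headache Treatments.
State the hypotheses. $H_0$: $\mu_1 = \mu_2$. $H_1$: $\mu_1 > \mu_2$.
State the level of significance. $\alpha = 0.05$.

The t Statistic, a.k.a. Our Second Bad Formula
Compute the value of the test statistic. Under $H_0$, $\mu_1 - \mu_2 = 0$, so the test statistic is
$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$
with df = $n_1 + n_2 - 2$.

Computations
$$s_p = \sqrt{\frac{9s_1^2 + 9s_2^2}{18}} = 5.052.$$
$$t = \frac{22.6 - 19.4}{5.052\sqrt{\dfrac{1}{10} + \dfrac{1}{10}}} = 1.416.$$

Hypothesis Testing
Calculate the p-value. The number of degrees of freedom is df = df1 + df2 = 18.
p-value = P(t > 1.416) = tcdf(1.416, E99, 18) = 0.0869.

Hypothesis Testing
State the conclusion. Since p-value > $\alpha$, we do not reject $H_0$. At the 5% level of significance, the data do not support the claim that Treatment 1 is more effective than Treatment 2.

Confidence Intervals
Confidence intervals for $\mu_1 - \mu_2$ use the same theory. The point estimate is $\bar{x}_1 - \bar{x}_2$. The standard deviation of $\bar{x}_1 - \bar{x}_2$ is approximately $s_p\sqrt{1/n_1 + 1/n_2}$.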
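
The hypothesis test for Example 11.4 can be checked numerically. The following is a minimal Python sketch, not part of the lecture: the function names (`pooled_sd`, `t_statistic`, `t_upper_tail`) are hypothetical, and the tail probability is approximated by Simpson's rule on the t density rather than by a calculator's `tcdf`. The notes report only $s_p = 5.052$, not $s_1$ and $s_2$ themselves, so `pooled_sd` is shown for completeness and the test statistic is computed from the reported $s_p$.

```python
import math

def pooled_sd(s1, s2, n1, n2):
    """Pooled standard deviation under the equal-sigma assumption."""
    return math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

def t_statistic(xbar1, xbar2, sp, n1, n2):
    """Pooled two-sample t statistic under H0: mu1 = mu2."""
    return (xbar1 - xbar2) / (sp * math.sqrt(1 / n1 + 1 / n2))

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 using Simpson's rule (steps must be even)."""
    const = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return const * (1 + x * x / df) ** (-(df + 1) / 2)

    h = t / steps
    total = pdf(0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    # The density is symmetric about 0, so P(T > t) = 0.5 - P(0 < T < t).
    return 0.5 - total * h / 3

# Summary values reported in the notes for Example 11.4.
n1 = n2 = 10
t_stat = t_statistic(22.6, 19.4, 5.052, n1, n2)
df = n1 + n2 - 2
p_value = t_upper_tail(t_stat, df)
print(f"t = {t_stat:.3f}, df = {df}, p-value = {p_value:.4f}")
```

The printed values should land close to the t = 1.416 and p-value = 0.0869 given above. If SciPy is available, `scipy.stats.t.sf(t_stat, df)` returns the same upper-tail probability without the hand-written integration.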
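Confidence Intervals
The confidence interval is
$$(\bar{x}_1 - \bar{x}_2) \pm z\,\sigma\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
($\sigma$ known, large samples), or
$$(\bar{x}_1 - \bar{x}_2) \pm z\,s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
($\sigma$ unknown, large samples), or
$$(\bar{x}_1 - \bar{x}_2) \pm t\,s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
($\sigma$ unknown, normal populations, small samples).

Confidence Intervals
The choice depends on whether $\sigma$ is known, whether the populations are normal, and whether the sample sizes are large.

Example
Find a 95% confidence interval for $\mu_1 - \mu_2$ in Example 11.4.
$\bar{x}_1 - \bar{x}_2 = 3.2$ and $s_p = 5.052$. With 18 degrees of freedom, use t = 2.101. The standard deviation of $\bar{x}_1 - \bar{x}_2$ is approximately $5.052\sqrt{1/10 + 1/10} = 2.259$.
The confidence interval is $3.2 \pm (2.101)(2.259) = 3.2 \pm 4.75$.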
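
The interval arithmetic in this example can be reproduced with a few lines of Python. This is only a sketch: the helper name `pooled_t_interval` is hypothetical, and the critical value 2.101 is taken from the notes (a t table with 18 degrees of freedom) rather than computed.

```python
import math

def pooled_t_interval(xbar1, xbar2, sp, n1, n2, t_crit):
    """Return (lower, upper, margin) for mu1 - mu2 using the pooled standard error."""
    se = sp * math.sqrt(1 / n1 + 1 / n2)  # estimated std. dev. of xbar1 - xbar2
    margin = t_crit * se
    diff = xbar1 - xbar2
    return diff - margin, diff + margin, margin

# Values from Example 11.4 as reported in the notes.
lower, upper, margin = pooled_t_interval(22.6, 19.4, 5.052, 10, 10, t_crit=2.101)
print(f"margin = {margin:.2f}, interval = ({lower:.2f}, {upper:.2f})")
# prints: margin = 4.75, interval = (-1.55, 7.95)
```

The interval contains 0, which is in line with the test's failure to find a significant difference between the two treatments.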