Unit 8: Inference for Two Samples
Statistics 571: Statistical Methods
Ramón V. León
7/14/2004, Unit 8 - Stat 571 - Ramón V. León

Introductory Remarks
• A majority of statistical studies, whether experimental or observational, are comparative
• The simplest type of comparative study compares two populations
• Two principal designs for comparative studies
  – Using independent samples
  – Using matched pairs
• Graphical methods for informal comparisons
• Formal comparisons of means and variances of normal populations
  – Confidence intervals
  – Hypothesis tests

Independent Samples Design
Example: Compare Control Group to Treatment Group
• Children are randomly assigned to two groups. One group is taught using an old teaching method and the other group using a new teaching method. To avoid one possible source of confounding, both methods are taught by the same teacher.
• Independent samples design (the sample sizes $n_1$ and $n_2$ are possibly different):
  Sample 1: $x_1, x_2, \ldots, x_{n_1}$
  Sample 2: $y_1, y_2, \ldots, y_{n_2}$
• The two samples are independent
• The independent samples design relies on random assignment to make the two groups equal (on the average) on all attributes except for the treatment used (the treatment factor).

Matched Pairs Design
Example: Comparison of Two Methods of Teaching Spelling
• Pretest the children and match pairs of children with similar test scores.
• Randomly assign one child of each pair to the control group and the other to the treatment group.
• The pretest score is called a blocking factor.

Pair    Sample 1    Sample 2
1       $x_1$       $y_1$
2       $x_2$       $y_2$
…       …           …
n       $x_n$       $y_n$
(Both samples have the same number of observations.)

The x's are the changes in the test scores of the children taught by the old method. The y's are the changes in the test scores of the children taught by the new method.
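The two randomization schemes described above can be sketched in a few lines of Python. This is only an illustration: the function names, the children's names, and the pretest scores are hypothetical, and pairing children with adjacent pretest ranks is one simple matching rule assumed here, not a rule stated in the slides.

```python
import random

def assign_independent(children, seed=0):
    """Independent samples design: randomly split the children into two groups."""
    rng = random.Random(seed)
    shuffled = list(children)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]   # (control, treatment)

def assign_matched_pairs(pretest, seed=0):
    """Matched pairs design: pair children with adjacent pretest ranks,
    then randomly assign one child of each pair to each group."""
    rng = random.Random(seed)
    ranked = sorted(pretest, key=pretest.get)  # names ordered by pretest score
    control, treatment = [], []
    # With an odd number of children the last one is left unpaired.
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]
        rng.shuffle(pair)                      # coin flip within the pair
        control.append(pair[0])
        treatment.append(pair[1])
    return control, treatment

# Hypothetical pretest scores (the blocking factor), for illustration only
pretest = {"Ana": 52, "Ben": 71, "Cal": 55, "Dee": 90, "Eli": 88, "Fay": 69}

ind_control, ind_treatment = assign_independent(pretest)
mp_control, mp_treatment = assign_matched_pairs(pretest)
```

In the matched version, position i of the two lists always holds the two members of pair i, so the groups are balanced on the pretest score by construction rather than only on the average.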
Pros and Cons of Each Design
• In the independent samples design the two groups may not be quite equal on some attribute, especially if the sample sizes are small, because randomization assures equality only on the average. Does one group have a higher pretest score than the other? A difference between the groups could then be the result of this imbalance rather than of the treatment factor.
• The matched pairs design enables more precise comparisons between treatment groups because of smaller experimental error.
• However, it can be difficult to form matched pairs, especially when dealing with human subjects.
• Both types of design can be used in observational studies, where the two groups are not formed by randomization. In that case a cause-and-effect relationship cannot be established.

Hospitalization Cost: Independent Sample Design
[Slide image not reproduced]

Graphical Methods for Comparing Independent Samples: Side-by-Side Box Plots
[Slide image not reproduced]
Outliers are present and the variation seems different in the two groups.

Side-by-Side Box Plots: Log Costs
[Slide image not reproduced]
• No outliers for LogeCost
• The assumption of equal variance appears reasonable for the two groups in the log scale

Another Graphical Method for Comparing Two Independent Samples: Q-Q Plot
[Slide image not reproduced]
• Plot of the ordered pairs of order statistics $(x_{(i)}, y_{(i)})$, which are the $i/(n+1)$ quantiles of the respective samples
• The plot suggests that treatment group costs are less than control group costs. But is this true?
• The book discusses how to prepare this graph when the two samples are not of the same size.

Graphical Displays of Data from Matched Pairs
• Plot the pairs $(x_i, y_i)$ in a scatter plot.
  Using the 45° line as a reference, one can judge whether the two sets of values are similar or whether one set tends to be larger than the other.
• Plots of the differences or the ratios of the pairs may prove useful
• A JMP plot that is very informative will be shown
• A Q-Q plot is meaningless for paired data because the same quantiles, being based on the ordered observations, do not in general come from the same pair.

Comparing Means of Two Populations: Independent Samples Design (Large Samples Case)
Suppose that the observations $x_1, x_2, \ldots, x_{n_1}$ and $y_1, y_2, \ldots, y_{n_2}$ are independent random samples from two populations with means $\mu_1$ and $\mu_2$ and variances $\sigma_1^2$ and $\sigma_2^2$. Both means and variances are assumed to be unknown. The goal is to compare $\mu_1$ and $\mu_2$ in terms of their difference $\mu_1 - \mu_2$. We assume that $n_1$ and $n_2$ are large (say > 30).

Comparing Means of Two Populations: Independent Samples Design
$$E(\bar X - \bar Y) = E(\bar X) - E(\bar Y) = \mu_1 - \mu_2$$
$$\mathrm{Var}(\bar X - \bar Y) = \mathrm{Var}(\bar X) + \mathrm{Var}(\bar Y) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$$
Therefore the standardized r.v.
$$Z = \frac{\bar X - \bar Y - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$$
has mean 0 and variance 1.
If $n_1$ and $n_2$ are large, then Z is approximately N(0,1) by the Central Limit Theorem, even though we did not assume the samples came from normal populations. (We also use the fact that the difference of independent normal r.v.'s is also normal.)

Large Sample (Approximate) 100(1-α)% CI for $\mu_1 - \mu_2$
$$(\bar x - \bar y) - z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \le \mu_1 - \mu_2 \le (\bar x - \bar y) + z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
Note: $s_i^2$ has been substituted for $\sigma_i^2$ because the samples are large, i.e., bigger than 30.
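The large-sample interval above can be sketched with the Python standard library alone. The function name `large_sample_ci` is a label chosen for this illustration; the inputs in the call are the summary statistics of Example 8.2 in these slides (means 75 and 70, standard deviations 15 and 12, 150 students per class, 99% confidence).

```python
from math import sqrt
from statistics import NormalDist

def large_sample_ci(xbar, ybar, s1, s2, n1, n2, conf=0.95):
    """Approximate CI for mu1 - mu2 when both samples are large."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # upper alpha/2 critical point
    se = sqrt(s1**2 / n1 + s2**2 / n2)             # estimated SE of xbar - ybar
    diff = xbar - ybar
    return diff - z * se, diff + z * se

# Summary statistics from Example 8.2
lo, hi = large_sample_ci(75, 70, 15, 12, 150, 150, conf=0.99)
print(f"99% CI: [{lo:.3f}, {hi:.3f}]")
```

Because the sample standard deviations stand in for the unknown population values, the interval is only approximate; with small samples the t-based methods later in this unit apply instead.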
Example 8.2: Two large calculus classes of 150 students each, taught by two different professors (99% CI for the difference of means)
$$\bar x - \bar y \pm z_{.005}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (75 - 70) \pm 2.576\sqrt{\frac{(15)^2}{150} + \frac{(12)^2}{150}} = [0.960, 9.040]$$
Note: the interval does not contain zero, so there is a significant difference between the two professors at the α = .01 level.

Large Sample (Approximate) Test of Hypothesis
$H_0: \mu_1 - \mu_2 = \delta_0$ vs. $H_1: \mu_1 - \mu_2 \ne \delta_0$ (typically $\delta_0 = 0$)
Test statistic:
$$z = \frac{(\bar x - \bar y) - \delta_0}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$$
Example 8.2: Final exam grades for large calculus classes of 150 students taught by two different professors. (Same test used on both classes.)
$$z = \frac{(\bar x - \bar y) - \delta_0}{\sqrt{s_1^2/n_1 + s_2^2/n_2}} = \frac{(75 - 70) - 0}{\sqrt{(15)^2/150 + (12)^2/150}} = 3.188$$
$$P\text{-value} = P(|Z| \ge 3.188) = 2[1 - \Phi(3.188)] = .0014$$
The result is significant at the α = .01 level.

Inference for Small Samples
Case 1: Variances $\sigma_1^2$ and $\sigma_2^2$ assumed equal.
The assumption of normal populations is important since we cannot invoke the CLT.
Pooled estimate of the common variance:
$$S^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)} = \frac{\sum (X_i - \bar X)^2 + \sum (Y_i - \bar Y)^2}{n_1 + n_2 - 2}$$
Note: $S^2 = (S_1^2 + S_2^2)/2$ if the sample sizes are equal.
$$T = \frac{\bar X - \bar Y - (\mu_1 - \mu_2)}{S\sqrt{1/n_1 + 1/n_2}}$$
has a t-distribution with $n_1 + n_2 - 2$ d.f.

Inference for Small Samples: Confidence Intervals and Hypothesis Tests
Case 1: Variances $\sigma_1^2$ and $\sigma_2^2$ assumed equal.
Two-sided 100(1-α)% CI:
$$\bar x - \bar y - t_{n_1+n_2-2,\,\alpha/2}\; s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \le \mu_1 - \mu_2 \le \bar x - \bar y + t_{n_1+n_2-2,\,\alpha/2}\; s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
Test of hypothesis: $H_0: \mu_1 - \mu_2 = \delta_0$ vs. $H_1: \mu_1 - \mu_2 \ne \delta_0$
Test statistic:
$$t = \frac{\bar x - \bar y - \delta_0}{s\sqrt{1/n_1 + 1/n_2}}$$
Reject $H_0$ if $|t| > t_{n_1+n_2-2,\,\alpha/2}$.

JMP: Checking Assumptions for Log Costs
[Slide image not reproduced]
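The pooled-variance t statistic for Case 1 can be sketched from raw data as follows. The function name `pooled_t` and the two small samples are hypothetical and chosen only so the arithmetic is easy to follow; they are not data from the course.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(x, y, delta0=0.0):
    """Pooled variance, t statistic and d.f. for H0: mu1 - mu2 = delta0,
    assuming normal populations with equal variances."""
    n1, n2 = len(x), len(y)
    # statistics.variance uses the n - 1 divisor, i.e. it returns S_1^2 and S_2^2
    s2 = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    se = sqrt(s2) * sqrt(1 / n1 + 1 / n2)
    t = (mean(x) - mean(y) - delta0) / se
    return s2, t, n1 + n2 - 2

# Hypothetical samples, for illustration only
x = [12.0, 15.0, 11.0, 14.0, 13.0]
y = [10.0, 9.0, 12.0, 11.0, 8.0]

s2, t, df = pooled_t(x, y)
print(f"pooled variance = {s2:.2f}, t = {t:.3f}, d.f. = {df}")
```

Here both sample variances equal 2.5, so the pooled variance is their simple average, as the equal-sample-size note states. The statistic is then compared with the tabled critical value $t_{8,.025} = 2.306$ for a two-sided test at α = .05.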
Normal Plots by Group
[Slide image not reproduced]
After log transformation, the control group data seem normal.

Normal Plots by Group
[Slide image not reproduced]
After log transformation, the treatment group data seem normal.

Testing for Normality
[JMP output for the control and treatment groups not reproduced]

Hospitalization Cost Example
• Work with log cost, since then the normality assumption holds
• The assumption of equal variance is also reasonable, as seen in the side-by-side box plots of $\log_e$ cost (Slide 8)
Pooled estimate of the common standard deviation:
$$s = \sqrt{\frac{29(1.167)^2 + 29(0.905)^2}{58}} = 1.045 \text{ with 58 d.f.}$$
$$SE(\bar x - \bar y) = 0.2698$$
95% CI:
$$\bar x - \bar y \pm t_{58,.025}\; s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = (8.251 - 8.084) \pm 2.002 \times 1.045\sqrt{\frac{1}{30} + \frac{1}{30}} = [-0.373, 0.707]$$
The interval contains zero, so the difference in cost on the logarithmic scale is not significant. Contrast this conclusion with the apparent difference seen on the Q-Q plot in Figure 8.1 (Slide 9).

Interpretation of Difference in Means on the Log Scale
(This interpretation is not in your textbook.)
Mean(log Cost) = Median(log Cost) = log(Median Cost)
  – The first equality holds because the distribution of log cost is symmetric
  – The second equality holds because the log preserves ordering
$$-0.373 \le \mathrm{Mean}(\log \mathrm{Cost}_C) - \mathrm{Mean}(\log \mathrm{Cost}_T) \le 0.707$$
$$-0.373 \le \log(\mathrm{Median\ Cost}_C) - \log(\mathrm{Median\ Cost}_T) \le 0.707$$
$$-0.373 \le \log\frac{\mathrm{Median\ Cost}_C}{\mathrm{Median\ Cost}_T} \le 0.707$$
$$0.689 = \exp(-0.373) \le \frac{\mathrm{Median\ Cost}_C}{\mathrm{Median\ Cost}_T} \le \exp(0.707) = 2.028$$
This is a 95% confidence interval for the ratio of median costs.

JMP Analysis of Hospitalization Cost Example
[Three slides of JMP output not reproduced]
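The pooled-variance interval for the hospitalization cost example, and its back-transformation to a ratio of medians, can be reproduced from the summary statistics quoted on the slides. This is a sketch: the variable names are chosen for illustration, and the critical value $t_{58,.025} = 2.002$ is taken from the slide rather than computed. The pooled standard deviation and standard error come out about 0.001 below the slide's 1.045 and 0.2698, presumably because the inputs here are already rounded.

```python
from math import exp, sqrt

n1 = n2 = 30
xbar, ybar = 8.251, 8.084    # mean log cost: control, treatment
s1, s2 = 1.167, 0.905        # SDs of log cost: control, treatment
t_crit = 2.002               # t_{58,.025}, as quoted on the slide

# Pooled SD and standard error of the difference in means
s_pooled = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
se = s_pooled * sqrt(1 / n1 + 1 / n2)

# 95% CI for the difference in mean log cost
lo = (xbar - ybar) - t_crit * se
hi = (xbar - ybar) + t_crit * se

# Exponentiating gives a 95% CI for Median Cost_C / Median Cost_T
ratio_lo, ratio_hi = exp(lo), exp(hi)

print(f"log scale:        [{lo:.3f}, {hi:.3f}]")
print(f"ratio of medians: [{ratio_lo:.3f}, {ratio_hi:.3f}]")
```

An interval for the ratio that contains 1 corresponds exactly to an interval for the log-scale difference that contains 0, so both scales lead to the same conclusion.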
Means Diamonds and x-Axis Proportional
A means diamond illustrates a sample mean and 95% confidence interval, as shown by the schematic in "Means Diamonds and XProportional Options". The line across each diamond represents the group mean. The vertical span of each diamond represents the 95% confidence interval for each group. Overlap marks are drawn above and below the group mean. For groups with equal sample sizes, overlapping marks indicate that the two group means are not significantly different at the 95% confidence level.
If the X-Axis Proportional display option is in effect, the horizontal extent of each group along the x-axis (the horizontal size of the diamond) is proportional to the sample size of each level of the x variable. It follows that the narrower diamonds are usually the taller ones because fewer data points yield a less precise estimate of the group mean.
The confidence interval computation assumes that variances are equal across observations. Therefore, the height of the confidence interval (diamond) is proportional to the reciprocal of the square root of the number of observations in the group.

JMP Analysis of Hospitalization Cost Example
[Slide image not reproduced]

Inference for Small Samples
Case 2: Variances $\sigma_1^2$ and $\sigma_2^2$ unequal.
$$T = \frac{\bar X - \bar Y - (\mu_1 - \mu_2)}{\sqrt{S_1^2/n_1 + S_2^2/n_2}}$$
does not have a Student t-distribution.
T is not a pivotal quantity, since its distribution depends on the ratio of the σ's. However, when $n_1$ and $n_2$ are large, T has an approximate N(0,1) distribution.

Inference for Small Samples
Case 2: Variances $\sigma_1^2$ and $\sigma_2^2$ unequal.
For small samples,
$$T = \frac{\bar X - \bar Y - (\mu_1 - \mu_2)}{\sqrt{S_1^2/n_1 + S_2^2/n_2}}$$
has approximately a t-distribution with
$$\nu = \frac{(w_1 + w_2)^2}{w_1^2/(n_1 - 1) + w_2^2/(n_2 - 1)}$$
degrees of freedom, where
$$w_1 = (\mathrm{SEM}(\bar x))^2 = \frac{s_1^2}{n_1} \quad\text{and}\quad w_2 = (\mathrm{SEM}(\bar y))^2 = \frac{s_2^2}{n_2}$$
Note: The d.f. are estimated from the data and are not a function of the sample sizes alone.
Note: ν is usually not an integer; it is rounded down to the nearest integer.

Inference for Small Samples
Case 2: Variances $\sigma_1^2$ and $\sigma_2^2$ unequal.
Approximate 100(1-α)% two-sided CI for $\mu_1 - \mu_2$:
$$\bar x - \bar y - t_{\nu,\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \le \mu_1 - \mu_2 \le \bar x - \bar y + t_{\nu,\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
The test statistic for $H_0: \mu_1 - \mu_2 = \delta_0$ vs. $H_1: \mu_1 - \mu_2 \ne \delta_0$ is
$$t = \frac{\bar x - \bar y - \delta_0}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$$
Reject $H_0$ if $|t| > t_{\nu,\alpha/2}$.

Hospitalization Costs: Inference Using Separate Variances
(Recall that we are working with log cost.)
Since $n_1 = n_2 = n$ we have
$$\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2} = \frac{2}{n} \times \frac{s_1^2 + s_2^2}{2} = \frac{2}{n}s^2 = \frac{2s^2}{n}$$
where $s^2$ is the pooled sample variance. Hence
$$t = \frac{\bar x - \bar y - \delta_0}{\sqrt{s_1^2/n_1 + s_2^2/n_2}} = \frac{\bar x - \bar y - \delta_0}{\sqrt{2s^2/n}} = \frac{\bar x - \bar y - \delta_0}{s\sqrt{1/n + 1/n}} = 0.619$$
That is, with equal sample sizes this t is the same as the t with pooled variance.
• The textbook calculates the d.f. to be ν = 54.6, rounded down to 54
• Recall that, assuming equal variances, the d.f. were $n_1 + n_2 - 2 = 58$
• Since the t distributions with 54 and 58 d.f. do not differ by much, the results not assuming equal variances differ little from those obtained assuming equal variances.

Hospitalization Costs: Inference Using Separate Variances – Conclusions
• Results with equal variances and unequal variances do not differ by much because
  – The two sample variances are not too different
  – The two sample sizes are equal
• In other cases there can be a considerable loss in d.f. as a result of using separate variance estimates
• If there is any doubt about the equality of variances, do not use the pooled variance method, since this method is not robust against violation of the assumption of equal variances. This is particularly so when the sample sizes are unequal.
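The estimated degrees of freedom for the separate-variances method can be checked numerically. The sketch below uses the log-cost summary statistics quoted earlier in this unit (standard deviations 1.167 and 0.905, means 8.251 and 8.084, 30 patients per group); the function name `welch_df` is a label chosen for this illustration.

```python
from math import floor, sqrt

def welch_df(s1, s2, n1, n2):
    """Estimated d.f. nu for the separate-variances t statistic."""
    w1 = s1**2 / n1    # squared SEM of the first sample mean
    w2 = s2**2 / n2    # squared SEM of the second sample mean
    return (w1 + w2) ** 2 / (w1**2 / (n1 - 1) + w2**2 / (n2 - 1))

# Hospitalization cost example, log scale
nu = welch_df(1.167, 0.905, 30, 30)
t_sep = (8.251 - 8.084) / sqrt(1.167**2 / 30 + 0.905**2 / 30)

print(f"nu = {nu:.1f}, rounded down to {floor(nu)}")
print(f"t  = {t_sep:.3f}")
```

With equal sample sizes the statistic agrees with the pooled-variance value, and only the reference distribution changes, from 58 d.f. to 54 d.f.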
Testing for the Equality of Variances
Section 8.4 covers the classical F test for the equality of two variances and associated confidence intervals. However, this method is not robust against departures from normality. For example, p-values can be off by a factor of 10 if the distributions have shorter or longer tails than the normal.
A robust alternative is Levene’s Test. It applies the two-sample t-test to the absolute values of the differences between each observation and its group mean:
$$|Y_{1i} - \bar Y_1|,\ i = 1, 2, \ldots, n_1 \qquad |Y_{2i} - \bar Y_2|,\ i = 1, 2, \ldots, n_2$$
This method works well even though these absolute deviations are not independent. In the Brown-Forsythe test the response is the absolute value of the difference between each observation and its group median.

JMP Analysis Without Assuming Equal Variances
[Slide image not reproduced]

JMP 4 Analysis Without Assuming Equal Variances
[JMP output not reproduced; the slide annotations follow]
• The classical test for equality of several variances is not robust against departures from the normality assumption
• ν = 54.6, rounded down to 54
• These tests can be used when there are more than two populations
• Note that $(.6181)^2 = 0.3820$

Independent Sample Design: Sample Size Determination Assuming Equal Variances
$H_0: \mu_1 - \mu_2 = 0$ vs. $H_1: \mu_1 - \mu_2 \ne 0$
$$n = n_1 = n_2 = 2\left[\frac{\sigma(z_{\alpha/2} + z_\beta)}{\delta}\right]^2$$
• δ is the smallest difference of practical importance that we want to detect
• Because we assume a known variance, this n is a slight underestimate of the sample size
Hospital cost example:
• Assume σ = 1 for the $\log_e$ scale costs (consistent with the data)
• Assume we want to detect a change of a factor of two in the median cost with power at least .90. The α level is .05.
Then
$$\delta = E(\log Y_1) - E(\log Y_2) = \log\frac{\mathrm{median\ Cost}_1}{\mathrm{median\ Cost}_2} = \log 2 = 0.693$$
$$\sigma = 1,\quad z_{\alpha/2} = z_{.025} = 1.960,\quad z_\beta = z_{.10} = 1.282$$
$$n = 2\left[\frac{1.0(1.960 + 1.282)}{.693}\right]^2 = 43.75 \approx 44$$
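The sample size calculation above can be sketched with the standard library. The function name `n_per_group` is a label chosen for this illustration; the inputs are the planning values from the hospital cost example (σ = 1, δ = log 2, α = .05, power .90).

```python
from math import ceil, log
from statistics import NormalDist

def n_per_group(sigma, delta, alpha=0.05, power=0.90):
    """Per-group sample size for a two-sided test of mu1 = mu2, treating the
    common standard deviation sigma as known."""
    std_normal = NormalDist()
    z_alpha2 = std_normal.inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    z_beta = std_normal.inv_cdf(power)             # z_beta, with beta = 1 - power
    return 2 * (sigma * (z_alpha2 + z_beta) / delta) ** 2

n_exact = n_per_group(sigma=1.0, delta=log(2))
n = ceil(n_exact)                                  # round up to a whole subject

print(f"n per group = {n_exact:.2f}, rounded up to {n}")
```

The unrounded result differs from the slide's 43.75 only in the second decimal because the critical values are not rounded to three decimals here; either way 44 subjects per group are needed.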
Sample Size Determination With JMP
[Slide image not reproduced]

Sample Size Determination With JMP: Method 1
[Slide image not reproduced]
Annotation: .693/2 = δ/2

Sample Size Determination With JMP
[Slide image not reproduced]
Annotation: the sample size here is 2n

Sample Size Determination With JMP: Exploring a Range of Standard Deviations
[Two slides of JMP output not reproduced]

Sample Size Determination With JMP: A Simpler Way
[Two slides of JMP output not reproduced]
• Extra Params is only for multi-factor designs. Leave this field zero in simple cases.
• Annotation: 2n

Matched Pairs Design
Example: Clinical study to compare two methods of measuring cardiac output, discussed in Example 4.11. Method A is invasive while Method B is noninvasive. To control for differences between persons, both methods are used on each person participating in the study. The purpose is to compare the two methods in terms of the mean difference.

Cardiac Output Data
[Slide image not reproduced]

Cardiac Output JMP File
• You can download this file from the course home page by following the Textbook Data Sets link.
• The methods used for matched pairs are the one-sample methods already learned, applied to the A−B data.

Cardiac Output JMP Analysis
[Slide image not reproduced]

Testing Equality of Means
[JMP output not reproduced]
• JMP platform: Distribution
• Test that the mean difference is zero
• Example 8.6 in the textbook has a slight rounding error in the calculation of the mean difference

Confidence Interval for Difference of Means
[Slide image not reproduced]
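The matched pairs analysis reduces to one-sample methods applied to the A−B differences, which can be sketched as follows. The cardiac output data are not reproduced in this text, so the measurements below are hypothetical and serve only to show the mechanics; the function name `paired_t` is likewise chosen for this illustration.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(a, b, delta0=0.0):
    """One-sample t statistic on the within-pair differences a_i - b_i."""
    d = [ai - bi for ai, bi in zip(a, b)]
    n = len(d)
    se = stdev(d) / sqrt(n)      # standard error of the mean difference
    return mean(d), (mean(d) - delta0) / se, n - 1

# Hypothetical paired measurements (NOT the cardiac output data)
method_a = [5.0, 7.0, 9.0, 6.0, 8.0]
method_b = [4.0, 5.0, 6.0, 4.0, 6.0]

dbar, t, df = paired_t(method_a, method_b)
print(f"mean difference = {dbar:.2f}, t = {t:.3f}, d.f. = {df}")
```

The statistic is referred to a t-distribution with n − 1 d.f., where n is the number of pairs; for these five hypothetical pairs the tabled two-sided 5% critical value is $t_{4,.025} = 2.776$.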
JMP’s Matched Pairs Platform: Another Way to Compare Matched Pairs
[Slide image not reproduced]

JMP’s Matched Pairs Platform
[Slide image not reproduced]
• There is a significant difference between the methods because the significance bands do not contain 0.
• The x-axis measures how different the pairs are. The more scatter along this axis, the more likely the results are to generalize to other persons outside the sample.

Statistical Justification of Matched Pairs Design
[Slide image not reproduced]

Sample Size Determination
$$n = \left[\frac{(z_\alpha + z_\beta)\,\sigma_D}{\delta}\right]^2 \quad \text{(One-Sided Test)}$$
$$n = \left[\frac{(z_{\alpha/2} + z_\beta)\,\sigma_D}{\delta}\right]^2 \quad \text{(Two-Sided Test)}$$
• One needs a planning value for $\sigma_D$
• These formulas come from the one-sample formulas applied to the differences

Comparing Variances of Two Populations
• Applications arise when comparing instrument precision or uniformity of products, and also when checking assumptions.
• The method discussed in the book (the F test) is applicable only under the assumption of normality of the data. The F test is highly sensitive to even modest departures from normality.
• For nonnormal data there are nonparametric and other robust methods for comparing data dispersion. We recall some supported by JMP.

Example 8.7 (Hospitalization Costs: Test of Equality of Variances)
[Slide image not reproduced]

Comparing Variances with JMP
[JMP output not reproduced]
• These tests are alternatives to the F test used in Example 8.7 on page 288.
• These tests can be used with more than two samples.
• Levene’s test is relatively robust against violations of the normality assumption. See JMP’s help for details.
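Levene's test and the Brown-Forsythe variant, as described earlier in this unit, amount to a two-sample t-test on absolute deviations from the group mean or median. The sketch below follows that description, not JMP's implementation; it assumes the pooled-variance form of the t-test, and the two samples are hypothetical.

```python
from math import sqrt
from statistics import mean, median, variance

def abs_dev_t(y1, y2, center=mean):
    """Two-sample pooled t statistic applied to |y - center(group)|.
    center=mean follows the Levene description; center=median follows
    the Brown-Forsythe description."""
    c1, c2 = center(y1), center(y2)
    d1 = [abs(v - c1) for v in y1]
    d2 = [abs(v - c2) for v in y2]
    n1, n2 = len(d1), len(d2)
    s2 = ((n1 - 1) * variance(d1) + (n2 - 1) * variance(d2)) / (n1 + n2 - 2)
    t = (mean(d1) - mean(d2)) / sqrt(s2 * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

# Hypothetical samples; the second has twice the spread of the first
group1 = [1.0, 2.0, 3.0, 4.0, 5.0]
group2 = [2.0, 4.0, 6.0, 8.0, 10.0]

t_levene, df = abs_dev_t(group1, group2, center=mean)
t_bf, _ = abs_dev_t(group1, group2, center=median)
print(f"Levene-type t = {t_levene:.3f}, Brown-Forsythe-type t = {t_bf:.3f}, d.f. = {df}")
```

The two versions coincide here because each hypothetical sample is symmetric, so its mean equals its median; with skewed data they generally differ.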
Comparing Variances of Two Populations: F Test
Independent samples design:
Sample 1: $x_1, x_2, \ldots, x_{n_1}$ is a random sample from $N(\mu_1, \sigma_1^2)$
Sample 2: $y_1, y_2, \ldots, y_{n_2}$ is a random sample from $N(\mu_2, \sigma_2^2)$
$$F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2}$$
has an F distribution with $n_1 - 1$ and $n_2 - 1$ d.f.

An Important Industrial Application: Example 8.8
[Slide image not reproduced]
Do the two labs have equal measurement precision?

Example 8.8: Calculation Details
[Calculation not reproduced; the slide annotations follow]
• d.f. = (12 − 1) + (12 − 1) + (12 − 1) = 33
• d.f. = (6 − 1) + (6 − 1) + (6 − 1) = 15
• Used the fact that $f_{n,m,1-\alpha} = \dfrac{1}{f_{m,n,\alpha}}$
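The reciprocal relation between F critical points used in Example 8.8 follows in a few steps. The sketch assumes that $f_{m,n,\alpha}$ denotes the upper α critical point, i.e. $P(F > f_{m,n,\alpha}) = \alpha$ for $F \sim F_{m,n}$, in line with how $z_\alpha$ and $t_{\nu,\alpha}$ are used in this unit.

```latex
\begin{align*}
F \sim F_{m,n} \;&\Longrightarrow\; \frac{1}{F} \sim F_{n,m} \\
\alpha = P\left(F > f_{m,n,\alpha}\right)
  &= P\left(\frac{1}{F} < \frac{1}{f_{m,n,\alpha}}\right) \\
\Longrightarrow\quad
P\left(\frac{1}{F} > \frac{1}{f_{m,n,\alpha}}\right) &= 1 - \alpha \\
\Longrightarrow\quad
f_{n,m,1-\alpha} &= \frac{1}{f_{m,n,\alpha}}
\end{align*}
```

This is why a table of upper critical points suffices: a lower critical point is the reciprocal of an upper one with the two degrees of freedom interchanged.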