Simple Comparative Experiments – Introduction to ANOVA

Read Sections 3.2 – 3.4 in the text.

Note: These notes were modified from lecture notes created by Tisha Hooks and Christopher Malone.

In the last set of notes we looked at the Tension Bond Strength example with both a sample of the data and the entire data set. Another important feature to examine when analyzing a data set is the _________________ of difference due to a particular factor.

Confidence Interval

A confidence interval can be used to ______________________ measure the amount of difference between two factor levels. That is, we can use a confidence interval to measure the __________ effect of a factor across its levels.

Let’s look at the sample of 4 observations from the Tension Bond Strength example. Recall the data values used:

To obtain a confidence interval in Minitab, choose Stat > ANOVA > General Linear Model > Fit General Linear Model and then enter the variables as follows:

Next, choose Stat > ANOVA > General Linear Model > Comparisons and then specify the following:

Note: you need to double-click on “Group” so that it gets highlighted as shown above.

Next, click on the Results… tab and make sure both boxes are checked as shown below. By default, 95% confidence intervals will be given. If you want to change the confidence level, click on the Options… tab and enter a new level.

Click OK twice and you should get the following output. The output is divided into two sections: one for the confidence interval and one for the hypothesis test. The following output is given in the output window.

In addition, a second output window should open with the following:

Confidence Interval

The 95% confidence interval for the difference in factor level means is circled in the output above. Additionally, the confidence interval is given graphically in the second output window.

Questions:
1. Identify the 95% confidence interval.
2. Interpret the 95% confidence interval.
3. Does it make sense that this confidence interval includes 0? Explain your reasoning.
4. Sketch a confidence interval that would suggest the tension bond strength of the unmodified group is statistically higher than the tension bond strength of the modified group.

Hypothesis Test

We might also be interested in conducting a hypothesis test to see if there is a significant difference between the factor levels. First, let’s write out what is being tested.

H0:
Ha:

The test statistic and p-value for the hypothesis test can be found in the output given above.

Questions:
5. What is the outcome of this test?
6. Note that the adjusted p-value here is exactly the same as the p-value we found in the last set of notes. Why are these two tests identical?

Another way to compare groups

Towards the top of the output, you’ll see the output given above. This is another way to determine whether or not the groups are significantly different from one another. If the two groups have _________________ letters, then they are statistically different from one another. If they have the same letter (as we see here), then the groups are not statistically different from one another. This part of the output is useful when comparing more than two groups.

Example Revisited:

Now, let’s take another look at the complete data set for this example, found in the file cement_mortar.mpj on the course website.

Questions:
7. Based on the ANOVA conducted in the last set of notes, would you expect this confidence interval to contain 0 or not? Explain your reasoning.
8. Using Minitab, find and interpret the 95% confidence interval.
9. Verify that the F-statistic from the ANOVA is equivalent to the t-statistic found in the comparisons output.
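The equivalence asked about in Question 9 can be explored numerically. The sketch below is written in Python rather than Minitab, which is an assumption of this example and not part of the course workflow. It uses the four-observation sample that appears later in these notes, not the complete data set, and the names `g1` and `g2` are placeholder labels for the two groups. It assumes the pooled-variance form of the two-sample t-statistic and compares its square with the one-way ANOVA F-statistic.

```python
import math

# Four-observation sample from these notes; g1 and g2 are placeholder labels.
g1 = [16.52, 16.40]
g2 = [16.62, 16.75]

def mean(x):
    return sum(x) / len(x)

def sum_sq(x):
    """Sum of squared deviations from the group mean."""
    m = mean(x)
    return sum((v - m) ** 2 for v in x)

n1, n2 = len(g1), len(g2)
df_error = n1 + n2 - 2
mse = (sum_sq(g1) + sum_sq(g2)) / df_error   # pooled variance, same as MSE

# Pooled two-sample t-statistic for the difference in group means
t_stat = (mean(g2) - mean(g1)) / math.sqrt(mse * (1 / n1 + 1 / n2))

# One-way ANOVA F-statistic: MS(treatment) / MS(error)
grand = mean(g1 + g2)
ss_trt = n1 * (mean(g1) - grand) ** 2 + n2 * (mean(g2) - grand) ** 2
df_trt = 1                                   # two groups give one treatment df
f_stat = (ss_trt / df_trt) / mse

print("t   =", round(t_stat, 4))
print("t^2 =", round(t_stat ** 2, 4))
print("F   =", round(f_stat, 4))
```

Running the same comparison on the complete data set would require the values in cement_mortar.mpj, which are not reproduced here.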
We’ve just examined the Tension Bond Strength data, carrying out the ANOVA while intuitively obtaining the measures of error by computing the sums of squares “by hand.” These calculations are made much simpler using the framework for a _______________________________________________________________________ (GLM). The GLM approach does require the use of matrices and some linear algebra operations.

The Model

In general, we wish to compare _____ different levels of a single factor. Also, there are ______ observations under each factor level. One way to write the statistical model for the Tension Bond Strength example is given below.

\[ y_{ij} = \mu + \tau_i + \varepsilon_{ij} \]

where i = 1, 2 identifies the _________________ and j = 1, 2 identifies the _________________

Let’s identify the meaning of each term in the model.

$y_{ij}$:
$\mu$:
$\tau_i$:
$\varepsilon_{ij}$:

After the data are collected and used to estimate the model terms (parameters), statisticians typically place a “hat” over the model terms to indicate they have been __________________ from the data. For example, the observed overall mean of the response is denoted by ______ instead of ______.

Using your intuition, sketch the estimated model parameters on the dotplot below for our data.

\[ \hat{y}_{ij} = \hat{\mu} + \hat{\tau}_i + \hat{\varepsilon}_{ij} \]

Questions:
10. Using the model parameters, what is the mean of the modified group?
11. Using the model parameters, what is the mean of the unmodified group?

The Model in Matrix Notation

We can look at our statistical model in matrix notation. The model for our simple example is given in matrix notation below.

\[
\begin{bmatrix} y_{11} \\ y_{12} \\ y_{21} \\ y_{22} \end{bmatrix}
=
\begin{bmatrix}
\mu + \tau_1 + \varepsilon_{11} \\
\mu + \tau_1 + \varepsilon_{12} \\
\mu + \tau_2 + \varepsilon_{21} \\
\mu + \tau_2 + \varepsilon_{22}
\end{bmatrix}
\]

The above model is equivalent to the one given below.

\[
\underbrace{\begin{bmatrix} y_{11} \\ y_{12} \\ y_{21} \\ y_{22} \end{bmatrix}}_{Y}
=
\underbrace{\begin{bmatrix} 1 & 1 & 0 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \\ 1 & 0 & 1 \end{bmatrix}}_{X}
\begin{bmatrix} \mu \\ \tau_1 \\ \tau_2 \end{bmatrix}
+
\begin{bmatrix} \varepsilon_{11} \\ \varepsilon_{12} \\ \varepsilon_{21} \\ \varepsilon_{22} \end{bmatrix}
\]

Using the above notation, Y is the _________________ vector and X is the __________________ matrix.
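The bookkeeping in the matrix form can be checked with a few lines of code. The sketch below uses Python with NumPy, which is an assumption of this example (the notes themselves use Minitab). The parameter values in `beta` are made up purely for illustration and are not estimates from the data. It confirms that each row of X times the parameter vector gives μ + τi, and it reports how many columns of X are linearly independent.

```python
import numpy as np

# Design matrix for y_ij = mu + tau_i + eps_ij with 2 groups, 2 observations each.
# Columns: intercept (mu), indicator for group 1 (tau_1), indicator for group 2 (tau_2).
X = np.array([[1, 1, 0],
              [1, 1, 0],
              [1, 0, 1],
              [1, 0, 1]], dtype=float)

# Hypothetical parameter values, chosen only to illustrate the arithmetic.
beta = np.array([10.0, -1.0, 2.0])   # mu, tau_1, tau_2

mean_part = X @ beta                 # each row is mu + tau_i for that observation
print(mean_part)

# The intercept column equals the sum of the two indicator columns,
# so only 2 of the 3 columns are linearly independent.
rank = np.linalg.matrix_rank(X)
print("columns:", X.shape[1], "rank:", rank)
```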
The estimated model parameters can be obtained as follows:

Model Estimates = _______________________________

Note: This approach to estimating the model parameters is _____________ general and in fact works for any linear model.

Let’s return to our example:

\[
X = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \\ 1 & 0 & 1 \end{bmatrix}
\quad \text{and} \quad
Y = \begin{bmatrix} 16.52 \\ 16.40 \\ 16.62 \\ 16.75 \end{bmatrix}.
\]

Problem: The columns of X’X are NOT linearly independent; therefore, the inverse CANNOT be computed.

Solution: We need to re-parameterize the model so the model parameters can be estimated.

Consider the following re-parameterization.

\[
\underbrace{\begin{bmatrix} y_{11} \\ y_{12} \\ y_{21} \\ y_{22} \end{bmatrix}}_{Y}
=
\underbrace{\begin{bmatrix} 1 & 1 \\ 1 & 1 \\ 1 & 0 \\ 1 & 0 \end{bmatrix}}_{X}
\begin{bmatrix} \mu \\ \tau_1 \end{bmatrix}
+
\begin{bmatrix} \varepsilon_{11} \\ \varepsilon_{12} \\ \varepsilon_{21} \\ \varepsilon_{22} \end{bmatrix}
\]

This re-parameterization uses the ________________________ restriction where τ2 = ______.

Let’s use this parameterization to estimate the model parameters where

\[
X = \begin{bmatrix} 1 & 1 \\ 1 & 1 \\ 1 & 0 \\ 1 & 0 \end{bmatrix}
\quad \text{and} \quad
Y = \begin{bmatrix} 16.52 \\ 16.40 \\ 16.62 \\ 16.75 \end{bmatrix}.
\]

Let’s identify the following values:

$\hat{\mu}$ = _____________
$\hat{\tau}_1$ = _____________
$\hat{\tau}_2$ = _____________

Estimated mean for the modified group = _________________________________
Estimated mean for the unmodified group = _________________________________

Another possible parameterization uses the ___________________________ restriction where _________________. For this parameterization, we will use the following model:

\[
\underbrace{\begin{bmatrix} y_{11} \\ y_{12} \\ y_{21} \\ y_{22} \end{bmatrix}}_{Y}
=
\underbrace{\begin{bmatrix} 1 & 1 \\ 1 & 1 \\ 1 & -1 \\ 1 & -1 \end{bmatrix}}_{X}
\begin{bmatrix} \mu \\ \tau_1 \end{bmatrix}
+
\begin{bmatrix} \varepsilon_{11} \\ \varepsilon_{12} \\ \varepsilon_{21} \\ \varepsilon_{22} \end{bmatrix}
\]

Therefore, to estimate the model parameters we’ll need

\[
X = \begin{bmatrix} 1 & 1 \\ 1 & 1 \\ 1 & -1 \\ 1 & -1 \end{bmatrix}
\quad \text{and} \quad
Y = \begin{bmatrix} 16.52 \\ 16.40 \\ 16.62 \\ 16.75 \end{bmatrix}.
\]

Let’s identify the following values:

$\hat{\mu}$ = _____________
$\hat{\tau}_1$ = _____________
$\hat{\tau}_2$ = _____________

Estimated mean for the modified group = _________________________________
Estimated mean for the unmodified group = _________________________________

Using Minitab to obtain parameter estimates

We can obtain the above estimates using Minitab. Choose Stat > ANOVA > General Linear Model. Then click on Results… shown in the figure below.

Click OK twice and you should get the following output.

Question:
12. What parameterization does Minitab use?
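The two re-parameterizations above can be compared side by side in code. The sketch below uses Python with NumPy (an assumption; the notes use Minitab) and the usual least-squares solution of the normal equations, which is assumed here to be the estimation method the notes have in mind. The names `X_ref`, `X_sum`, and `least_squares` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

Y = np.array([16.52, 16.40, 16.62, 16.75])

# First re-parameterization: the tau_2 column is dropped, so columns are (mu, tau_1).
X_ref = np.array([[1, 1], [1, 1], [1, 0], [1, 0]], dtype=float)

# Second re-parameterization: group 2 rows carry -1, so columns are (mu, tau_1).
X_sum = np.array([[1, 1], [1, 1], [1, -1], [1, -1]], dtype=float)

def least_squares(X, y):
    """Solve the normal equations (X'X) b = X'y for b."""
    return np.linalg.solve(X.T @ X, X.T @ y)

b_ref = least_squares(X_ref, Y)
b_sum = least_squares(X_sum, Y)

# The coefficients differ between codings, but the fitted group means should not.
fitted_ref = X_ref @ b_ref
fitted_sum = X_sum @ b_sum

print("first coding  (mu, tau_1):", np.round(b_ref, 4))
print("second coding (mu, tau_1):", np.round(b_sum, 4))
print("fitted values, first coding: ", np.round(fitted_ref, 4))
print("fitted values, second coding:", np.round(fitted_sum, 4))
```

Comparing the printed coefficients with the Minitab output is one way to approach Question 12.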
We can also obtain information about the group means by selecting Options… and entering the following information in the dialogue box.

Once you click OK twice, the following output should be displayed.

Error via the General Linear Model Approach

Define $\hat{y}$ as the ________________ response vector. In our model, this vector simply contains the _______________ for each group. We can compute the predicted response vector for our data in the following manner:

\[
\hat{y} =
\begin{bmatrix} 1 & 1 \\ 1 & 1 \\ 1 & -1 \\ 1 & -1 \end{bmatrix}
\begin{bmatrix} \hat{\mu} \\ \hat{\tau}_1 \end{bmatrix}
=
\]

The amount of error present in the model is simply the difference between the ___________________ vector and the ___________________ response vector.

Let’s compute the error for our small example.

\[
\text{Error} = (y - \hat{y}) =
\begin{bmatrix} 16.52 \\ 16.40 \\ 16.62 \\ 16.75 \end{bmatrix}
-
\begin{bmatrix} 16.46 \\ 16.46 \\ 16.685 \\ 16.685 \end{bmatrix}
=
\]

Next, obtain SSE:

SSE = Error’Error =

The average amount of ____________ error is

\[ \hat{\sigma}^2 = MSE = \frac{SSE}{df_{error}} = \]

Thus, the average amount of _____________ is

\[ \hat{\sigma} = \sqrt{MSE} = \]

Question:
13. Where is this quantity in the Minitab output?

The Standard Error of the Model Estimates

The estimate $\hat{\sigma}^2$ is used in computing the standard errors for our parameter estimates. The standard errors of the model estimates are necessary for conducting ________________________ tests and ________________________ intervals.

The standard error for our parameter estimates is obtained in the following manner:

Variance of parameter estimates =

Compute the variance for the parameter estimates in our example.

Next, let’s compute the standard error of the parameter estimates.

Standard error of parameter estimates =

Question:
14. Where is this information given in the Minitab output?

Note: If we want to test
H0: τ1 = 0
Ha: τ1 ≠ 0

When comparing two groups, the amount of difference between the two group averages is of interest.
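The error and standard-error calculations in this section can be traced step by step in code. The sketch below uses Python with NumPy (an assumption; the notes use Minitab) and the design matrix with 1 and -1 entries shown in the predicted-response calculation. The variance of the parameter estimates is computed as MSE times the inverse of X'X, which is the standard least-squares result and is assumed here to be the formula the notes intend.

```python
import numpy as np

Y = np.array([16.52, 16.40, 16.62, 16.75])
X = np.array([[1, 1], [1, 1], [1, -1], [1, -1]], dtype=float)  # columns: mu, tau_1

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ Y              # parameter estimates (mu-hat, tau_1-hat)
y_hat = X @ b                      # predicted response vector
error = Y - y_hat                  # observed minus predicted

sse = error @ error                # Error'Error
df_error = len(Y) - X.shape[1]     # observations minus estimated parameters
mse = sse / df_error               # sigma-hat squared
sigma_hat = np.sqrt(mse)

cov_b = mse * XtX_inv              # assumed formula: MSE * (X'X)^{-1}
se_b = np.sqrt(np.diag(cov_b))     # standard errors of mu-hat and tau_1-hat

print("error vector:", np.round(error, 4))
print("SSE:", round(sse, 6), " df:", df_error, " MSE:", round(mse, 6))
print("sigma-hat:", round(sigma_hat, 5))
print("standard errors:", np.round(se_b, 5))
```

The printed values can be checked against the hand calculations and against the corresponding entries in the Minitab output.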
Recall the following information we’ve already found:

Average bond tension strength for the modified group: ____________________
Average bond tension strength for the unmodified group: ____________________
The difference in averages for the two groups: _________________________________________

The variance of the difference is found in the following manner:

\[ Var(\hat{\tau}_2 - \hat{\tau}_1) = \]

Therefore,

\[ se(\hat{\tau}_2 - \hat{\tau}_1) = \]

Questions:
15. Find this value in the Minitab output given below.
16. What are the hypotheses being tested?
17. How is the t-value computed?
18. How is the adjusted p-value computed?
19. What is the outcome of this test?

Confidence Intervals for a Difference in Means

Finally, let’s look at how a confidence interval for the difference in means is calculated.

Lower limit: $(\hat{\tau}_2 - \hat{\tau}_1)$ − ____________________ × $\sqrt{Var(\hat{\tau}_2 - \hat{\tau}_1)}$
Center: $\hat{\tau}_2 - \hat{\tau}_1$ =
Upper limit: $(\hat{\tau}_2 - \hat{\tau}_1)$ + ____________________ × $\sqrt{Var(\hat{\tau}_2 - \hat{\tau}_1)}$

The missing quantity in the interval is obtained from the t-distribution with df = __________. To obtain this value in Minitab, choose Calc > Probability Distributions > t… and enter the values as shown below.

Question:
20. Verify the lower and upper limits of the confidence interval given in the Minitab output above.
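The interval calculation can be sketched in code for the four-observation sample. The example below uses Python with NumPy (an assumption; the notes use Minitab). It treats the difference as a linear combination of the estimates from the design matrix with 1 and -1 entries, and the helper `t_quantile_df2` is a hypothetical function that uses the closed-form quantile of the t-distribution, which is valid only when the error degrees of freedom equal 2. For any other degrees of freedom, the critical value would need to come from Minitab or a statistics library.

```python
import numpy as np

Y = np.array([16.52, 16.40, 16.62, 16.75])
X = np.array([[1, 1], [1, 1], [1, -1], [1, -1]], dtype=float)  # columns: mu, tau_1

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ Y
resid = Y - X @ b
df_error = len(Y) - X.shape[1]
mse = (resid @ resid) / df_error

# Under this coding tau_2 = -tau_1, so tau_2 - tau_1 = -2 * tau_1.
c = np.array([0.0, -2.0])
diff = c @ b                                   # center of the interval
se_diff = np.sqrt(mse * (c @ XtX_inv @ c))     # standard error of the difference
t_stat = diff / se_diff

def t_quantile_df2(p):
    """Quantile of the t-distribution with exactly 2 df (closed form)."""
    return (2 * p - 1) / np.sqrt(2 * p * (1 - p))

assert df_error == 2                           # the closed form applies only here
t_crit = t_quantile_df2(0.975)                 # two-sided 95% interval

lower = diff - t_crit * se_diff
upper = diff + t_crit * se_diff
print("difference:", round(diff, 4), " se:", round(se_diff, 5))
print("t statistic:", round(t_stat, 4), " critical value:", round(t_crit, 4))
print("95% CI:", (round(lower, 4), round(upper, 4)))
```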