Chapter 12

Section 12.1 – Goodness of Fit

We used the chi-square distribution to test variance and standard deviation, but we will also use it to test the independence of two variables. When one is testing whether a frequency distribution fits a specific pattern, the chi-square goodness of fit test is used.

Example #1
Suppose a market analyst wished to see whether consumers have any preference among five flavors of a new soda. The following data were obtained:

Cherry   Strawberry   Orange   Lime   Grape
32       28           16       14     10

If there were NO preference, one would expect each flavor to be selected with equal frequency. This is referred to as the ______________________ distribution.

            Cherry   Strawberry   Orange   Lime   Grape
OBSERVED    32       28           16       14     10
EXPECTED

The observed and expected frequencies will almost always be different. The question is: "Are these differences significant, or are they due to chance?"

Let's start.

Step 1 – State the hypotheses
Remember that the null hypothesis should state that there is NO difference or NO change.
H0: Consumers show no preference for flavors of the fruit soda.
H1: Consumers show a preference.

Step 2 – Find the critical value (CV) using the χ² distribution with df = 5 - 1 = 4 (number of categories - 1)

Step 3 – Compute the test value using the goodness of fit formula

$\chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$

            Cherry   Strawberry   Orange   Lime   Grape
OBSERVED    32       28           16       14     10
EXPECTED    20       20           20       20     20

Using the FORMULA …

Step 4 – Make a decision

Summarize

Using Commands in CALCULATOR

Example #2
Randomly selected deaths from car crashes were obtained, and the results are included in the table below. Use a 0.05 significance level to test the claim that car crash fatalities occur with equal frequency on the different days of the week.

Day of the week   Frequency (observed)   EXPECTED
Sunday            132
Monday            98
Tuesday           95
Wednesday         98
Thursday          105
Friday            133
Saturday          158

Example #3
A researcher has developed a theoretical model for predicting eye color.
After examining a random sample of parents, she predicts the eye color of the first child. The table below lists the eye color of offspring. Based on her theory, she predicted that 87% of the offspring would have brown eyes, 8% would have blue eyes, and 5% would have green eyes. Use a significance level of 0.05 to test the claim that the actual frequencies correspond to her predicted distribution.

             Brown eyes   Blue eyes   Green eyes
Frequency    132          17          0

Section 12.2 – Contingency Tables and Association

Definitions
1) A contingency table relates two categories of data.
2) A marginal distribution is a frequency or relative frequency distribution of either the row or column variable.
3) A conditional distribution lists the relative frequency for each CELL, computed within a single row or column (that is, relative to that row's or column's total).

Example #1
Suppose a sociologist wishes to see whether the number of years of college a person has completed is related to her or his place of residence. A sample of 88 people is selected and classified as shown.

Location    No College   4 Year Degree   Advanced Degree   Total
Urban       15           12              8
Suburban    8            15              9
Rural       6            8               7
TOTAL

a) Construct a frequency marginal distribution.
b) Construct a relative frequency marginal distribution.

Section 12.3 – Test for Independence and Homogeneity

a) The χ² test for independence is used to determine whether there is an association between the row variable and the column variable in a contingency table.
   1. Null hypothesis: the variables are NOT associated, i.e., the variables are independent.
   2. Alternative hypothesis: the variables are dependent.

Example #1
Suppose a sociologist wishes to see whether the number of years of college a person has completed is related to her or his place of residence. A sample of 88 people is selected and classified as shown. At a significance level of 0.05, can the sociologist conclude that a person's location is dependent on the number of years of college?
Location    No College   4 Year Degree   Advanced Degree   Total
Urban       15           12              8
Suburban    8            15              9
Rural       6            8               7
TOTAL

Step 1 – State the hypotheses
H0: A person's place of residence is independent of the number of years of college completed.
H1: A person's place of residence is dependent on the number of years of college completed.

Step 2 – Find the CV, where df = (rows - 1)(columns - 1) = (3 - 1)(3 - 1) = (2)(2) = 4

Step 3 – Find the expected value of each cell using the formula below, and write it in the contingency table.

$\text{Expected} = \frac{(\text{row sum})(\text{column sum})}{\text{grand total}}$

After you have all EXPECTED VALUES, compute the test value using

$\chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$

Make a decision and summarize.

Example #1
A random sample of 90 adults is classified according to gender and the number of hours they watch television during a week.

Hours/Week           Male   Female
More than 25 hours   15     29
Less than 25 hours   27     19

Use a 0.01 significance level to test the claim that the time spent watching television and the gender of a viewer are not related.

Example #2
If you're using a graphing calculator, please write the function you're using. Don't forget to include the equation I gave in class. Follow the format we went over in class!!

Suppose a study of speeding violations and drivers who use car phones produced the following fictional data:

                        Speeding violation   No speeding violation   Total
                        in the last year     in the last year
Car phone user          25                   280                     305
Not a car phone user    45                   405                     450
Total                   70                   685                     755

a) Compute the MARGINAL frequency distributions.
b) P(person is a car phone user) =
c) P(person had no violation in the last year) =
d) P(person had no violation in the last year AND was a car phone user) =
e) P(person is a car phone user OR person had no violation in the last year) =
h) P(person is a car phone user GIVEN person had a violation in the last year) =
i) P(person had no violation last year GIVEN person was not a car phone user) =
j) Is using a "car phone" independent of receiving a speeding violation? Use the 5-Step hypothesis test to answer this question.
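
To check the hand computation in Section 12.1, Example #1 (soda flavors), the goodness of fit steps can be sketched in Python. This is only a minimal sketch, not the calculator procedure from class. The function name chi_square_statistic is made up for illustration. The example does not state a significance level, so 0.05 is assumed here, and the critical value 9.488 is read from a standard χ² table for df = 4.

```python
def chi_square_statistic(observed, expected):
    """Sum of (Observed - Expected)^2 / Expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Observed counts: Cherry, Strawberry, Orange, Lime, Grape
observed = [32, 28, 16, 14, 10]

# Under "no preference", every flavor is expected equally often.
total = sum(observed)
expected = [total / len(observed)] * len(observed)

# Degrees of freedom = number of categories - 1
df = len(observed) - 1

test_value = chi_square_statistic(observed, expected)

# ASSUMPTION: alpha = 0.05 (not stated in the example).
# Critical value taken from a standard chi-square table for df = 4.
CRITICAL_VALUE = 9.488

reject_null = test_value > CRITICAL_VALUE

print(f"expected = {expected}")
print(f"df = {df}, test value = {test_value:.3f}")
print("Reject H0" if reject_null else "Do not reject H0")
```

With these counts the test value is 360/20 = 18.0, which exceeds 9.488, so under the assumed 0.05 level the null hypothesis of no preference would be rejected.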
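
The marginal distributions asked for in Section 12.2, Example #1 (parts a and b) come from the row and column totals of the contingency table. The sketch below shows one way to compute them in plain Python. The variable names are illustrative only and are not part of the class format.

```python
# Sociologist data from Section 12.2, Example #1.
# Rows: Urban, Suburban, Rural
# Columns: No College, 4 Year Degree, Advanced Degree
locations = ["Urban", "Suburban", "Rural"]
education = ["No College", "4 Year Degree", "Advanced Degree"]
table = [
    [15, 12, 8],
    [8, 15, 9],
    [6, 8, 7],
]

# a) Frequency marginal distributions: totals of each row and each column.
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand_total = sum(row_totals)

# b) Relative frequency marginal distributions: each total / grand total.
row_relative = [t / grand_total for t in row_totals]
col_relative = [t / grand_total for t in col_totals]

for name, freq, rel in zip(locations, row_totals, row_relative):
    print(f"{name:<16} {freq:>3}  {rel:.3f}")
for name, freq, rel in zip(education, col_totals, col_relative):
    print(f"{name:<16} {freq:>3}  {rel:.3f}")
print(f"{'Grand total':<16} {grand_total:>3}")
```

The grand total should match the sample size of 88 given in the example, which is a quick check that the table was entered correctly.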
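
The test for independence in Section 12.3, Example #1 (the sociologist data) follows the same pattern: expected count = (row sum)(column sum) / grand total for each cell, then the sum of (Observed - Expected)^2 / Expected. The Python below is a hedged sketch of those steps. The helper name expected_counts is hypothetical, and the critical value 9.488 is taken from a standard χ² table for df = 4 at the 0.05 level stated in the example.

```python
def expected_counts(table):
    """Expected count for each cell: row sum * column sum / grand total."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    grand_total = sum(row_sums)
    return [[r * c / grand_total for c in col_sums] for r in row_sums]


# Rows: Urban, Suburban, Rural
# Columns: No College, 4 Year Degree, Advanced Degree
observed = [
    [15, 12, 8],
    [8, 15, 9],
    [6, 8, 7],
]

expected = expected_counts(observed)

# Test value: sum over every cell of (Observed - Expected)^2 / Expected
test_value = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# df = (rows - 1)(columns - 1)
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Critical value from a standard chi-square table, alpha = 0.05, df = 4.
CRITICAL_VALUE = 9.488

reject_null = test_value > CRITICAL_VALUE

for row in expected:
    print([round(e, 3) for e in row])
print(f"df = {df}, test value = {test_value:.3f}")
print("Reject H0" if reject_null else "Do not reject H0")
```

For this table the test value comes out near 3.01, which is below 9.488, so at the 0.05 level there is not enough evidence to conclude that location and years of college are dependent.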