Homework 9 Solutions (1) What is the purpose of an ANOVA? (A) To test if the means of two populations are the same against the alternative that they are different. (B) To test if the means of more than two populations are same against the alternative that at least one is different. (C) To test if the variances of more than two populations are the same against the alternative that at least one is different. B (2) A meterologist wants to understand whether the mean temperature in College Station is the different to the mean temperature in Snook (which is about 20 miles away from College Station). In order to test this hypothesis she measures the temperature at 10am in College Station and Snook between April 6th and April 15th (10 readings from both towns). (a) What test would you recommend she does to test equality of the mean temperatures against the alternative that they are different? (A) (B) (C) (D) (E) A two sample independent t-test. A paired t-test. A Wilcoxon sign rank test. A Wilcoxon sum rank test. ANOVA. – C - Since College Station and Snook are close to each other there is likely to be a dependence in their daily data, thus a Paired test is required. However, the sample size (10 observations) is small, so it seems sensible to do a Wilcoxon sign rank test. (b) Considering how the data was collected, what fundemental assumption (that all the above tests require) is likely to have been violated. These observations are take over a period of 10 days, thus it is highly likely that there is a dependency between the observations (today’s temperature has an influence on tomorrow’s temperature). Therefore the assumption of independence of the observations (a major assumption in all the tests we have been done so far) is unlikely to hold. (3) In a few weeks time local elections will take place in Sutown. Mr. Big is contesting against Dr. Small. In a recent poll of 700 people, it was found that 380 were in favour of Dr. Small and 320 were in favour of Mr. Big. (i) Test the hypothesis that Dr. Small will win the election (this will be the alternative hypothesis) (state precisely the null and alternative and do the test at the 5% level). Let p = the proportion of the vote Dr. Small will q get. H0 : p ≤ 0.5 2 against HA : p > 0.5. p̂ = 380/700 = 0.542, the s.e. = 0.5 = 0.0188. The 700 z-transform is Z = (0.542 − 0.5)/0.0188 = 2.23. Therefore P(Z > 2.23) = 0.0127 = 1.27%. The p-value is 1.27%, as this is smaller than 5% we reject the null at the 5% level. Therefore, given the data is seems likely that Dr. Small will win the election. (ii) How many people need to be in the sample such that the 95% confidence interval for the proportion who will vote for Dr. Small will have length 3% (look at lecture 28)? q . We want this length to be The length of the CI is 2 × 1.96 × π(1−π) n q 0.03. Thus we solve 2 × 1.96 × π(1−π) = 0.03. However, the proportions n are unknown, however as a polling company we want to be sure at least 95% confident this interval contains the mean, so we use the π 0 s that make this CI the longest, that is π = 0.05 (a different π and the interval will be narrower) - note that this choice of π has nothing to do with part (i) of the question, always in the case of proportions you should use π = 0.5 in constructing the CI to construct a confidence interval which is at least at least a 95% CI. p Solving 2 × 1.96 × 0.25/n = 0.03 gives n = 4269. (4) Castaneda v Partida is an important court case in which statistical methods were used as part of the legal argument. In Castaneda the plantiffs alleged that the method for selecting juries in a county in Texas as biased against Mexican Americans. For the period of time of issue, there were 181,535 persons eligible for jury duty, of whom 143,611 were Mexican Americans. Of the 870 people selected for jury duty, 339 were Mexican American. (a) We can do a one-sample proportion test, by using p = 143,611 = 0.79 as the overall 181,535 proportion of mexicans in the county and comparing the proportion of Mexican american on jury to this. Figure 1: (i) Explain why we are testing H0 : p = 0.79 against HA : p < 0.79? The platiff alleges that the method of selection is biased against Mexican Americans. This means they want to see if there is evidence to suggest that Mexican Americans are under-represented in Juries. We articulate this in the alternative as the proportion of Mexican Americans doing jury service being less than the total proportion in the Mexican Americans in the county, ie. HA : p < 0.79. (ii) Using the output in Figure 1 calculate the z-transform and the the corresponding p-value. = −29.7. The alternative is pointing to the left, so the z = 0.38−0.79 0.138 p-value is the area to the left of z = −29.7. If we use the tables we see that it only goes up to z = −3.49, which corresponds to the area of 0.02%. Therefore the p-value is a lot less than 0.02% and there is evidence to suggest that the proportion of Mexican Americans who did jury service is less than 0.79.z = 0.38−0.79 = −29.7. The 0.138 alternative is pointing to the left, so the p-value is the area to the left of z = −29.7. If we use the tables we see that it only goes up to z = −3.49, which corresponds to the area of 0.02%. Therefore the p-value is a lot less than 0.02% and there is evidence to suggest that the proportion of Mexican Americans who did jury service is less than 0.79. (iii) Is there any evidence to support the view that Mexican Americans are being underrepresented in jury duty? Yes, since the p-value is a lot smaller than 5%. If the p-value was quite large the defence would argue that the proportion of mexican americans on juries in less than the proportion of non-Mexican Americans, however this difference can be explained by random chance. (b) An alternative method is to do a two-sided test and compare proportions of Mexican Americans who did jury duty with the proportion of non-Mexican American who did jury duty. Figure 2: (i) What hypothesis are you testing using this method? H0 : pA − pM = 0 against HA : pA − pM > 0 (where pM = proportion of Mexican Americans and pA = proportion of non-Mexican Americans). (ii) Using the output in Figure 2 calculate the z-transform and the the corresponding p-value. The z-transform is z = 0.0116/(3.98 × 10−4 ) = 29.1. Since the alternative is pointing to the right, the p-value is the area to the right of 29.1. Using the same arguments as those in (aii) this corresponds to the a p-value which is less than 0.02%. (iii) Is there any evidence to support the view that Mexican Americans are being underrepresented in jury duty? As the p-value is less than 0.02%, which is less than the 5% significance level, there is evidence to suggest that the proportion of Mexican Americans who did jury service is less than the proportion of non-Mexican Americans who did jury service. (5) Three instructors gave STAT 651 exams last week. The number of As and Bs are recorded below. Students want to know whether the instructor has an influence on the grade a student obtains. (a) State the null and alternative hypothesis to investigate. H0 :There is no association between grade and instructor HA :There is an association between grade and instructor. (b) State the test you would do and why? A chi-squared test for independence. A B Instructor 1 20 30 50 Instructor 2 40 30 70 Instructor 3 20 30 50 80 90 170 (c) Do the test at the 5% level, and report your findings. Since χ2 = 4.85 ≯ 5.99 = χ20.05,2 , we cannot reject H0 . There is no evidence to assume the dependence between grade and instructor. A B Instructor 1 80 × 50 170 90 × 50 170 50 Instructor 2 80 × 70 170 90 × 70 170 70 Instructor 3 80 × 50 170 90 × 50 170 50 80 90 170 (6) A gaming magazine is conducting a survey on the number of different gaming consoles an individual owns. There are currrently three different gaming consoles on the market (Nintendo, Playstation and Xbox) and it thought that the probability an individual should own any one of these consoles is 0.4 (ie, If I randomly pick a person and ask them do you own a Nintendo the probability is 40%. If I randomly pick a person and ask do you own a Playstation the probability is 40%. I randomly pick a person and ask them if they own a Xbox it is 40%). (a) Assuming that ownership of one system is completely independent of ownership of another system calculate the probability that an individual owns none, one, two or all three systems (hint: use the Binomial distribution). No consoles 1 console 2 consoles 3 consoles 0.216 0.432 0.288 0.064 Probability (b) The gaming magazine interviews 220 people, and asks them how many consoles they own (if any). The results are given below. Based on the above survey data, 0 consoles Number 120 0.545 1 console 70 0.318 2 consoles 20 0.091 3 consoles 10 0.046 is there evidence to suggest that the probabilities derived in part A are not the correct probabilities? State the test you will do, do the test at the 5% level and give the conclusions of the test. Use Chi-squared test. Based on the expected and observed probability we can get χ2 = 148.01 > 7.815 = χ20.05,3 and therefore we can reject the null. That is, there is evidence to suggest the probabilities derived in part A are not correct.
© Copyright 2026 Paperzz