Section 10: Hypothesis/Significance Testing for Proportions (Major Concept Review)

Example 1: Count Buffon (Georges-Louis Leclerc, Comte de Buffon, 1707-1788) tosses a coin 4040 times and gets 2048 heads. One story has it that he was in prison at the time. Another story is that he paid an orphan boy to do it. Either way, he was either very bored, very rich, or both.

Are his results unusual? That is, is the coin loaded to favor heads?

For a fair coin, the sampling distribution for the proportion of heads in 4040 flips would be

$$\hat{p} \approx N\left(0.5,\ \sqrt{\frac{0.5(1-0.5)}{4040}}\right)$$

What is the probability of a fair coin resulting in so many heads (the right tail)?

$$P\left(\hat{p} \ge \frac{2048}{4040}\right) = P\left(Z \ge \frac{\frac{2048}{4040} - 0.5}{\sqrt{\frac{0.5(1-0.5)}{4040}}}\right) = P(Z \ge 0.88) = 0.1894$$

If a fair coin is flipped 4040 times, 19% of the time you'd get at least as many heads as he got. That is, 19% of samples result in at least as many heads as he got.

Judgment call: It is simply not that unusual to get this many heads. It is still plausible that the coin is fair. We have no evidence that the coin is loaded in favor of heads. We have no evidence that p (the probability of a head for this coin) is more than 0.5, so do not reject the theory that p = 0.5.

Assuming the coin is fair, we run the numbers and conclude that it wouldn't be unusual for a fair coin to do such a thing. So it's still plausible that the coin is fair.

Example 2: Kaktus Fabric Softener claims that 60% of customers prefer their brand. An SRS of 2000 customers reveals that 1140 prefer their brand. Does this sample constitute evidence that they're exaggerating?

Let p = the proportion of all customers who prefer their brand. One theory is that p is 0.6. Another theory is that p is less than 0.6. (It is also possible that p is more than 0.6, but we're not interested in this case.)

The null hypothesis $H_0$ is a theory about what p equals, a potential value for the purpose of running the numbers. The alternative hypothesis $H_a$ is a range of possible alternatives for p (less than, greater than, or not equal to a number).
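Returning to Example 1, the right-tail calculation can be reproduced in a few lines. This is only a sketch using Python's standard library. The names `n_flips`, `heads_seen`, and `simulate_share` are made up for this illustration, and the simulation step is an added sanity check that does not appear in the original notes.

```python
import random
from math import sqrt
from statistics import NormalDist

n_flips = 4040      # Buffon's number of tosses
heads_seen = 2048   # heads he observed
p0 = 0.5            # a fair coin, the theory we assume

# Normal approximation: standardize p-hat and take the right tail.
p_hat = heads_seen / n_flips
std_error = sqrt(p0 * (1 - p0) / n_flips)
z = (p_hat - p0) / std_error
right_tail = 1 - NormalDist().cdf(z)


def simulate_share(trials=20000, seed=0):
    """Share of simulated fair-coin runs with at least `heads_seen` heads."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # 4040 random bits stand in for 4040 fair flips; each 1 bit is a head.
        heads = bin(rng.getrandbits(n_flips)).count("1")
        if heads >= heads_seen:
            hits += 1
    return hits / trials


simulated = simulate_share()
print(f"z = {z:.2f}, normal right tail = {right_tail:.4f}")
print(f"simulated share with >= {heads_seen} heads: {simulated:.3f}")
```

Both numbers should land near 19%. They need not agree exactly, because the normal curve is an approximation to the exact coin-flip distribution and the simulation has its own random noise.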
$$H_0 : p = 0.6 \qquad H_a : p < 0.6$$

If $H_0$ is the truth, then the company is telling the truth. If $H_a$ is the truth, then they're exaggerating. Notice that if $H_0$ is the truth, we know what p is, but if $H_a$ is the truth, we don't know what p is.

We always assume $H_0$ (the null hypothesis) for the purpose of running the numbers. In the end, it's either plausible in light of the sample (do not reject the theory, no evidence of the alternative), or it's not plausible in light of the sample (reject the theory, evidence of the alternative). If the theory being true would make the sample very unlikely, then we will reject the theory. $H_0$ is the foundation of the calculation; if it produces something very unlikely, the foundation crumbles.

Assuming they're telling the truth ($H_0$), what is the probability of a sample as extreme as ours?

What does "as extreme as" mean in context? For a coin which might be loaded in favor of heads, high proportions of heads incline you toward that explanation. For a fabric softener company which might be exaggerating, low proportions of customers liking their brand incline you to conclude that they're exaggerating. In our case, low $\hat{p}$'s constitute evidence of $H_a$. (In our previous example, high $\hat{p}$'s did: $H_0 : p = 0.5$, $H_a : p > 0.5$.)

Assuming $H_0$, the probability of a $\hat{p}$ as low as ours ($\hat{p} = 1140/2000 = 0.57$) is

$$P(\hat{p} \le 0.57) = P\left(Z \le \frac{0.57 - 0.6}{\sqrt{\frac{0.6(1-0.6)}{2000}}}\right) = P(Z \le -2.74) = 0.0031$$

If they're telling the truth, this sample is very unusual. If they're telling the truth, this sample is very unlucky for them. Unbelievably unlucky. So don't believe.

Judgment call: If 60% of customers prefer their brand, it would be very unusual to get a sample result this low. $H_0$ is no longer plausible. We have evidence of $H_a$. We have evidence that p is less than 0.6. Reject the theory that p = 0.6.

The probability of getting a sample as extreme as ours (as low, as high, as different) on the current understanding of the facts ($H_0$) is called the p-value.
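For the fabric softener example, the p-value is a left tail, and it can be sketched as a small helper. The function name `left_tail_p_value` and its arguments are hypothetical, chosen only for this illustration. It uses the same normal approximation as the notes.

```python
from math import sqrt
from statistics import NormalDist


def left_tail_p_value(successes, n, p0):
    """Return (z, p-value) for H0: p = p0 against Ha: p < p0."""
    p_hat = successes / n
    # The standard error is built from p0, because we assume H0 to run the numbers.
    std_error = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / std_error
    # Low p-hats count as evidence for Ha, so the p-value is the area to the left of z.
    return z, NormalDist().cdf(z)


z, p_value = left_tail_p_value(successes=1140, n=2000, p0=0.6)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

The result should be close to the 0.0031 found above. Any small difference comes from rounding z to two decimals before using a table.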
If the p-value is "very low", reject $H_0$ (it is no longer plausible in light of the sample) and find evidence of $H_a$.

If the p-value is "not that low", do not reject $H_0$ (it is still plausible in light of the sample) and do not find evidence of $H_a$.

At no point do we use the word "accept". $H_0$ doesn't have to be accepted; it already has the benefit of the doubt, until we reach the breaking point. Nor does it make sense to "accept" $H_a$, because it isn't a specific explanation of the facts. If the company is exaggerating, we still don't know what proportion of people prefer their brand, only that it's less than 60%.

Like a court of law, $H_0$ is guilty (low p-value) or not guilty (p-value not that low). The sample either convicts it or fails to convict it. The alternative $H_a$ comes into it (if at all) only by the process of elimination.

In light of the sample we have to work with, $H_0$ is either plausible (do not reject) or implausible (reject). We either have evidence of the alternative $H_a$ or we don't have evidence of it.

Example 3: Claim: 30% of jellybeans in a large vat are licorice. SRS of n = 1000, 270 licorice.

(a) I like licorice. Am I being cheated?

The logical alternatives are

$$H_0 : p = 0.3 \qquad H_a : p < 0.3$$

Assuming they're telling the truth ($H_0$), what is the probability of a sample as extreme as ours? Low $\hat{p}$'s constitute evidence of $H_a$.

$$P(\hat{p} \le 0.27) = P\left(Z \le \frac{0.27 - 0.3}{\sqrt{\frac{0.3(1-0.3)}{1000}}}\right) = P(Z \le -2.07) = 0.0192$$

If the claim is right, 1.92% of samples result in a value this low.

Do I consider this to be "unbelievably unlikely"? Well, unlike the previous examples (which were clear cut), it comes down to what I consider to be "unbelievably unlikely."

The 5% significance level means that a result occurring less than 5% of the time through random chance is too unlikely to believe in. Having a pre-set standard of what you'll consider too unlikely to believe in (before running the test) renders our results scientific and no longer subjective.
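The idea of fixing the standard before looking at the p-value can be written as a tiny decision rule. This is a sketch only. The function `decide` and the wording of its return values are assumptions for illustration, applied here to the jellybean numbers from part (a).

```python
from math import sqrt
from statistics import NormalDist


def decide(p_value, alpha):
    """Compare a p-value with a significance level chosen in advance."""
    return "reject H0" if p_value < alpha else "do not reject H0"


# Example 3(a): claim p = 0.3, sample of 1000 with 270 licorice, Ha: p < 0.3.
p0, n, successes = 0.3, 1000, 270
z = (successes / n - p0) / sqrt(p0 * (1 - p0) / n)
p_value = NormalDist().cdf(z)  # left tail, since low p-hats support Ha

for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: {decide(p_value, alpha)}")
```

The same p-value leads to different verdicts depending on the standard chosen, which is why the standard has to be set before the test is run.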
At the 5% level, 0.0192 < 0.05, so yes: reject $H_0$, evidence of $H_a$.
    If occurring less than 5% of the time through chance is my standard, then this sample is too unlikely to believe.

At the 1% level, 0.0192 ≮ 0.01, so no: do not reject $H_0$, no evidence of $H_a$.
    If occurring less than 1% of the time through chance is my standard, then this sample is still believable.

It comes down to what you consider "unbelievably unlikely." If you have to be among the 5% weirdest samples to be considered "unacceptably weird", it is. If you have to be among the 1% weirdest samples to be considered "unacceptably weird", it isn't.

(b) Suppose that I have no opinion of licorice one way or the other. In this case, the claim being advanced is just a number which might be true or might not be true. Do I have evidence that the claimant is mistaken? (That is, that it is not the case that 30% of the vat is licorice.)

$$H_0 : p = 0.3 \qquad H_a : p \ne 0.3$$

Assuming they're telling the truth ($H_0$), what is the probability of a sample as extreme as ours? $\hat{p}$'s far from 0.3 in either direction constitute evidence of $H_a$. In other words, what is the probability of getting a $\hat{p}$ as far from 0.3 as ours is?

$$\hat{p} \le 0.27 \quad \text{or} \quad \hat{p} \ge 0.33$$

Note that you don't actually need the other number (0.33). These are two identical tails, one of which we're set up to compute (or in this case, already know).

$$P(\hat{p} \le 0.27) = 0.0192$$
$$P(\hat{p} \ge 0.33) = 0.0192 \quad \text{(same tail)}$$
$$\text{Both tails} = 0.0192 \times 2 = 0.0384$$

In the two-sided hypothesis test (when you have the ≠ alternative), double the tail you look up.

If 30% of the vat is licorice, the probability of getting a sample as far away from 30% as ours is 0.0384. That is, 3.84% of samples would be as far from the truth as ours is.

Do I consider this to be "unbelievably unlikely"?

At the 5% level, 0.0384 < 0.05, so yes: reject $H_0$, evidence of $H_a$.

At the 1% level, 0.0384 ≮ 0.01, so no: do not reject $H_0$, no evidence of $H_a$.

When evaluating a p-value, ask the question, is it "suspiciously small"?
If so, it renders the current theory (𝐻0 ) suspect. It counts against the current theory. Otherwise 𝐻0 is still plausible. In conclusion, don’t make it more complicated than it is. It is unlikely conclusions which cast doubt on our assumptions.
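As a check on the "double the tail" rule from Example 3(b), the two tails can be computed separately and added. This is a minimal sketch with Python's standard library. The variable names are illustrative, and the sampling distribution is the same normal approximation used throughout these notes.

```python
from math import sqrt
from statistics import NormalDist

p0, n = 0.3, 1000   # claimed proportion and sample size
p_hat = 270 / n     # observed proportion of licorice

# Sampling distribution of p-hat, assuming H0 is true.
sampling_dist = NormalDist(mu=p0, sigma=sqrt(p0 * (1 - p0) / n))

# How far our sample is from the claim, in either direction.
distance = abs(p_hat - p0)

lower_tail = sampling_dist.cdf(p0 - distance)       # P(p-hat <= 0.27)
upper_tail = 1 - sampling_dist.cdf(p0 + distance)   # P(p-hat >= 0.33)
two_sided = lower_tail + upper_tail

print(f"lower tail = {lower_tail:.4f}, upper tail = {upper_tail:.4f}")
print(f"two-sided p-value = {two_sided:.4f}")
```

Because the normal curve is symmetric, the two tails come out equal, so adding them gives the same answer as doubling the one tail you already looked up.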