An Explanation of the Concept of Independent Events

The city of Metropolis is about to have an election. Let us say that 46% of its citizens support candidate Larch for Mayor. To simplify matters, the population is divided into three racial groups: White, Asian, and Latino.

Question 1 – Is being white independent of support for candidate Larch for Mayor?

What this is asking is: does the white support for candidate Larch differ from that of the population as a whole? The question of independence always asks whether the proportion within a subgroup of the population differs from the proportion in the population as a whole. Suppose that the white support for candidate Larch is 46%. Then the two events “a person supports Larch” and “a person is white” are independent. Suppose instead that it is not 46%. Then the two events are not independent, which means the proportion changes when we look at the subgroup “White” instead of the entire population.

Why is independence such an important concept? For our class the answer is two-fold. In this chapter, the main reason is that if we have independence we can calculate certain probabilities more easily. In statistics as a whole, when variables are independent, no new information can be gained by associating the variables; we will make this clear in the examples that follow.

The table below shows the population of Metropolis, in 100’s, divided by support for the candidate and by racial group. Metropolis holds a million people of voting age. Questions 2 through 5 refer to this table.

            For    Against    Total
Asian       805        945     1750
Latino     1311       1539     2850
White      2484       2916     5400
Total      4600       5400    10000

Question 2 – How many people are classified as “For” candidate Larch?

Answer: 4600 (100’s), or 460,000 people.

Question 3 – If a person is chosen at random, what is the probability that they are “For” candidate Larch?
Answer: I will use function notation (this is important given what is about to follow) to represent the question: P(For). When this question is asked, I am assuming it is answered with respect to the population in question, which is the voting population of Metropolis.

P(For) = 4600/10000 = 0.46

Question 4 – If a Latino voter is chosen at random, what is the probability that this person is for candidate Larch?

In order to answer this question I am going to introduce another notation that represents the question’s situation. What is the situation? Read the sentence again, and ask yourself: what is the population being discussed here? If you say the voting citizens of Metropolis, you would be wrong. The population being discussed is the Latinos living in Metropolis. Because this is a subset of the Metropolis group, I need notation depicting this. THIS IS IMPORTANT!! Do not bypass this, unless your goal is to fail the next exam. Practice using and writing this notation whenever it is called for.

Let A and B be two events. The notation P(A | B) reads “the probability of A given B.” The vertical line “|” is read “given.” Let me restate the interpretation: what is the probability of event A, given that we only consider event A within the context of the subpopulation called B INSTEAD of the original population?

So Question 4 I would write as P(For | Latino), which says I want to calculate the probability of people who are “for” candidate Larch, but only considering the subgroup of Metropolis called Latino; that is, what proportion of Latinos are for candidate Larch? Now look at which numbers I am going to use to answer the question, and hopefully it will make the meaning absolutely clear.

            For    Against    Total
Asian       805        945     1750
Latino     1311       1539     2850
White      2484       2916     5400
Total      4600       5400    10000

P(For | Latino) = 1311/2850 = 0.46

Notice the numbers I used to answer the question. I had to use 2850 for the denominator because that is how many people are Latino. Of those, 1311 are for Larch, so 1311 is the numerator. This is a different calculation from P(For) = 4600/10000 = 0.46, which uses the entire voting population. The fact that both produced the same proportion, 0.46, says that the events “a person is for the candidate” and “a person is Latino” are independent; that is, the proportion of Latinos in support of Larch is no different than that of the population as a whole.

If event A and event B are independent, then P(A | B) = P(A) and P(B | A) = P(B).

Question 5 – If a White person is chosen at random, what is the probability that this person is “for” candidate Larch? If an Asian person is chosen at random, what is the probability that this person is “for” candidate Larch?

P(For | White) = 2484/5400 = 0.46

P(For | Asian) = 805/1750 = 0.46

The results show that, overall, race is independent of support for the candidate. That is, race is not a factor in candidate support, since each group’s proportional support is the same as that of the whole, 46%.

Now we divide the citizens of Metropolis according to age. Notice that 46% of citizens are in favor of candidate Larch. Look carefully at how the questions are asked and how the answer relates to the question.

Age Group (years)    For    Against    Total
18 - 27             1058       1242     2300
27 - 38             1400       1900     3300
over 38             1800       2600     4400
Total               4600       5400    10000

Question 6 – What is the probability that we choose someone at random who is 27 – 38 years of age?

P(27 – 38) = 3300/10000 = 0.33

Question 7 – Given that a person is 27 – 38 years of age, what is the probability that they are for the candidate?

P(For | 27 – 38) = 1400/3300 = 0.42

Question 8 – Are the events “a person is for candidate Larch” and “a person is 27 – 38 years of age” independent?

No, since the proportion of 27 – 38 year olds who are for candidate Larch does not match that of the population as a whole.
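The comparison used in Questions 4 through 8 can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical (they do not come from the handout), the counts are copied from the two Metropolis tables, and the sketch assumes, as the handout does, that probabilities are compared after rounding to two decimal places.

```python
def conditional_probability(count_in_both, count_in_group):
    """P(A | B): count in both A and B, divided by the count in subgroup B."""
    return count_in_both / count_in_group


def same_to_two_decimals(p, q):
    """Assumption: compare at two decimals, the precision used in the handout."""
    return round(p, 2) == round(q, 2)


# P(For) over the whole voting population (counts in 100's).
p_for = 4600 / 10000

# (number "For" within the group, group total), copied from the tables.
groups = {
    "Asian": (805, 1750),
    "Latino": (1311, 2850),
    "White": (2484, 5400),
    "27 - 38": (1400, 3300),
}

results = {}
for name, (count_for, group_total) in groups.items():
    p_given = conditional_probability(count_for, group_total)
    # Independence check: does P(For | group) match P(For)?
    results[name] = same_to_two_decimals(p_given, p_for)
    print(f"P(For | {name}) = {p_given:.2f}  P(For) = {p_for:.2f}  "
          f"independent: {results[name]}")
```

Under these assumptions the three racial groups match P(For) and the 27 – 38 age group does not, in line with the answers to Questions 4 through 8.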
In symbols, P(For | 27 – 38) = 0.42 ≠ 0.46 = P(For).

There are two formulas that are true statements only when the events are independent. One we have been discussing: P(A | B) = P(A). The other appears in section 4.1: P(A and B) = P(A)P(B), which holds only if the events are independent. Thus, if the question asks whether two events are independent, you must use one of these formulas to determine whether they are or are not.

Let’s consider the Metropolis example to see how the P(A and B) = P(A)P(B) formula works.

Age Group (years)    For    Against    Total
18 - 27             1058       1242     2300
27 - 38             1400       1900     3300
over 38             1800       2600     4400
Total               4600       5400    10000

Question 9 – Are the events “a person is 18 – 27 years of age” and “a person is for candidate Larch” independent?

One way to make the determination is to use the formula P(A and B) = P(A)P(B). I will need to calculate each side separately.

P(18 – 27 AND For) = 1058/10000 = 0.1058 (To get my answer I need only look at the appropriate slot in the table.)

Now I will calculate the other side.

P(18 – 27) = 2300/10000 = 0.2300, P(For) = 4600/10000 = 0.4600

Lastly, we need to see if we have equality.

P(18 – 27 AND For) = 0.1058
P(18 – 27)P(For) = (0.2300)(0.4600) = 0.1058

Since both sides are equal, we see that the two events are independent.

Question 10 – Are the events “a person is against candidate Larch” and “a person is over 38” independent?

P(Against AND over 38) = 2600/10000 = 0.26

P(Against) = 5400/10000 = 0.54, P(over 38) = 4400/10000 = 0.44

P(Against)P(over 38) = (0.54)(0.44) = 0.2376 ≠ 0.26 = P(Against AND over 38)

Since we do not have equality, we do not have independence.

Homework

Use the following table to answer the questions below. A poll asked 10,000 people selected at random in the State of Texas to rate their job happiness. Assume that the table exactly represents the correct proportions for Texas.
Age Group    Happy    Neutral    Not Happy    Total
18 - 27       1200       1026          474     2700
27 - 38       1722       1596          882     4200
over 38       1178       1178          744     3100
Total         4100       3800         2100    10000

a. A person is chosen at random from the state of Texas. What is the probability that this person is happy at work?
b. A person within the age group 18 – 27 is chosen at random from the state of Texas. What is the probability that this person is happy at work?
c. Are the two events “a person is Happy at work” and “a person is in the age group 18 – 27” independent? It is assumed that I am only considering the Texas population.
d. What is the probability that someone in the state of Texas is in the age group 27 – 38?
e. A person is chosen at random from the state of Texas. That person is “Not Happy” at work. What is the probability that they are in the age group 27 – 38?
f. Are the two events “a person is in the age group 27 – 38” and “a person is Not Happy at work” independent?
g. A person is chosen at random from the state of Texas. That person is “over 38.” What is the probability that they are “Neutral”?
h. Are the two events “a person is in the age group over 38” and “a person is Neutral at work” independent? Use both formulas to check your result.
i. Use the formula P(A and B) = P(A)P(B) to see if the events “a person is over 38” and “a person is Not Happy at work” are independent.

Answers

a. P(Happy) = 0.41
b. P(Happy | 18 – 27) = 0.44
c. The two events are not independent.
d. P(27 – 38) = 0.42
e. P(27 – 38 | Not Happy) = 0.42
f. The two events are independent.
g. P(Neutral | over 38) = 0.38
h. Yes, the two events are independent.
i. The two events are not independent.
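The product-rule check from Questions 9 and 10 can also be sketched in Python. Again this is only an illustration under stated assumptions: the function name is hypothetical, the counts are the ones the handout uses in those two questions, and the comparison is done with exact fractions rather than rounded decimals, which is a stricter test than comparing two-decimal values.

```python
from fractions import Fraction


def product_rule_holds(joint_count, count_a, count_b, total):
    """Check whether P(A and B) equals P(A)P(B) using exact fractions.

    joint_count: number of people in both A and B
    count_a, count_b: number of people in A and in B (the table's totals)
    total: size of the whole population
    """
    p_joint = Fraction(joint_count, total)
    p_product = Fraction(count_a, total) * Fraction(count_b, total)
    return p_joint == p_product


# Question 9: "18 - 27" and "For" (counts in 100's, as printed in the table).
q9_independent = product_rule_holds(1058, 2300, 4600, 10000)

# Question 10: "Against" and "over 38".
q10_independent = product_rule_holds(2600, 5400, 4400, 10000)

print("Question 9 independent:", q9_independent)
print("Question 10 independent:", q10_independent)
```

The same function could be pointed at a cell of the homework table, passing that cell's count together with its row total, column total, and the grand total of 10000.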