Solution - Julio Herrera

Math 140: Introductory Statistics
Instructor: Julio C. Herrera
Exam 2 Review
January 2, 2016
Instructions: This exam review covers the material from chapters 5 through 7. Please read
each question carefully before you attempt to solve it. Remember that these reviews are
for your benefit of learning, practicing, and preparing for the exam. Hence, you will not
get much out of these reviews if you just read the solutions. You have to try to solve the
problems. It is the process of solving the problems that will help you learn the concepts and
procedures that you need to do well on the exam.
Problem 1: In a hotly contested U.S. election, two candidates for president, a Democrat
and a Republican, are running neck and neck; each candidate has 50% of the vote. Suppose
a random sample of 1000 voters is asked whether they will vote for the Republican
candidate.
What percentage of the sample should be expected to express support for the
Republican? What is the standard error for this sample proportion? Does the Central
Limit Theorem apply? If so, what is the approximate probability that the sample
proportion will fall within two standard errors of the population value of p = 0.50?
Solution:
We took a random sample, so the expected value of our estimator p̂ is equal to the true
population proportion p. In other words, there is no bias in the estimation procedure.
Therefore, we expect that 50% of our sample supports the Republican candidate. The
standard error for the sample proportion can be obtained as follows:
$$SE = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{0.5(0.5)}{1000}} = 0.0158$$
Simply put, we expect our sample proportion to be 50%, give or take 1.58 percentage
points. To check whether the CLT applies, we verify that np ≥ 10, n(1 − p) ≥ 10, and N ≥ 10n;
the first two quantities both equal 500, and N (all US voters) is certainly greater than 10n.
Hence, the CLT does apply. The CLT tells us that we can use the Normal Distribution,
N (0.50, 0.0158). To calculate the probability that the sample proportion will fall within
two standard errors of the population value of p = 0.50, you can use the computer or the
empirical rule. The empirical rule states that the probability we are looking for is
approximately 0.95.
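The "use the computer" route can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the variable names are chosen here and are not from the class materials.

```python
from math import sqrt
from statistics import NormalDist

p = 0.50   # population proportion given in the problem
n = 1000   # sample size

# Standard error of the sample proportion
se = sqrt(p * (1 - p) / n)

# CLT approximation for the sampling distribution of p-hat
sampling_dist = NormalDist(mu=p, sigma=se)

# Probability that p-hat falls within two standard errors of p
within_two_se = sampling_dist.cdf(p + 2 * se) - sampling_dist.cdf(p - 2 * se)

print(f"SE = {se:.4f}")
print(f"P(within 2 SE) = {within_two_se:.4f}")
```

The computed probability is slightly above 0.95 because the empirical rule rounds the exact area within two standard deviations.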
Problem 2: The excel file named Tee-Times has data on a national survey of 900 women
golfers. The survey was conducted to learn how women golfers view their treatment at golf
courses in the United States. The survey found that 396 of the women golfers were satisfied
with the availability of Tee-Times. Estimate the proportion of the population of women
golfers who are satisfied with the availability of Tee-Times. Find the margin of error and a
95% confidence interval estimate of the population proportion. INTERPRET your results
within the context of the given problem.
Solution:
The sample proportion is $\hat{p} = \frac{396}{900} = 0.44$. This is the point estimate for our population
proportion. The margin of error ($z^* \cdot SE$) you should have calculated is
$$z^* \cdot SE = 1.96\sqrt{\frac{0.44(1 - 0.44)}{900}} = 0.0324$$
The 95% confidence interval estimate for the population proportion is
95% CI = 0.44 ± 0.0324 = (0.4076, 0.4724). The survey results enable us to state with 95%
confidence that between 40.76% and 47.24% of all women golfers in the nation are satisfied
with the availability of Tee-Times.
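The same interval can be checked with a short Python sketch. It uses only the counts stated in the problem (396 of 900), not the Excel file, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

successes = 396   # women satisfied with tee-time availability
n = 900           # women surveyed

p_hat = successes / n                   # point estimate
z_star = NormalDist().inv_cdf(0.975)    # critical value for 95% confidence, about 1.96
se = sqrt(p_hat * (1 - p_hat) / n)      # estimated standard error
margin = z_star * se                    # margin of error

ci = (p_hat - margin, p_hat + margin)
print(f"p-hat = {p_hat:.2f}, margin = {margin:.4f}")
print(f"95% CI = ({ci[0]:.4f}, {ci[1]:.4f})")
```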
Problem 3: A polling agency is deciding how many voters to poll. The agency wants to
estimate the percentage of voters in favor of extending tax cuts, and it wants to provide a
margin of error of no more than 1.8 percentage points.
a. Using 95% confidence, how many respondents must the agency poll?
b. If the margin of error is to be no more than 1.7%, with 95% confidence, should the
sample be larger or smaller than that determined in part a? Explain your reasoning.
Solution:
a. To solve this problem you simply needed to use the formula $n = \frac{1}{m^2}$, where m is the
margin of error. We are given that m = 0.018, so $n = \frac{1}{(0.018)^2} = 3086$
is the sample size we need for the given conditions.
b. We use the same formula as in part a. However, this time just use m = 0.017 as the
margin of error. The sample size that we need is 3460, which is a larger sample than
that of part a.
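The arithmetic for both parts can be sketched as follows. The formula n = 1/m² is the one used in the solution; it presumably comes from setting z* ≈ 2 and the worst-case p = 0.5 in the margin-of-error formula, which gives m ≈ 1/√n. The function name is illustrative.

```python
from math import ceil

def conservative_sample_size(margin):
    """Raw value of n = 1 / m^2 for a 95% margin of error m (as a proportion)."""
    return 1 / margin ** 2

n_a = conservative_sample_size(0.018)   # part a: 1.8 percentage points
n_b = conservative_sample_size(0.017)   # part b: 1.7 percentage points

# The solution reports the raw value rounded to a whole number; rounding up
# to the next whole respondent is the stricter convention, shown alongside.
print(f"part a: n = {n_a:.1f}, rounded up: {ceil(n_a)}")
print(f"part b: n = {n_b:.1f}, rounded up: {ceil(n_b)}")
```

Because m appears squared in the denominator, any reduction in the margin of error increases the required sample size, which is the reasoning part b asks for.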
Problem 4: Dividend yield is the annual dividend per share a company pays divided by
the current market price per share expressed as a percentage. A sample of 10 large
companies provided dividend yield data found in the excel file named Dividend.
a. What are the mean and median dividend yields?
b. What are the variance and standard deviation?
c. Which company provides the highest dividend yield?
d. What is the z-score for McDonald’s? Interpret this z-score.
e. What is the z-score for General Motors? Interpret this z-score.
f. Based on z-scores, do the data contain any outliers?
Solution:
a. The mean dividend yield is 2.3 and the median dividend yield is 1.85.
b. The sample variance is 1.9, so the sample standard deviation is 1.38.
c. Altria Group provides the highest dividend yield, at 5%.
d. The z-score for McDonald’s is z = −0.51. This shows that McDonald’s provides a below-average
dividend yield, about half a standard deviation below the mean. Investors hoping for good
dividend yields will be disappointed.
e. The z-score for General Motors is 1.02. This shows that General Motors provides an above-average
dividend yield, about one standard deviation above the mean.
f. Wal-Mart has the lowest dividend yield and Altria Group has the highest dividend
yield, with z-scores of z = −1.16 and z = 1.96, respectively. Both of these
z-scores are within two standard deviations of the mean, so we do not have any outliers.
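The z-score calculation can be sketched with the summary statistics from parts a and b. The Excel file is not reproduced here, so the only data value used is Altria Group's 5% yield from part c; the function names are illustrative.

```python
mean_yield = 2.3    # sample mean from part a
sd_yield = 1.38     # sample standard deviation from part b

def z_score(x, mean=mean_yield, sd=sd_yield):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

def is_outlier(z, cutoff=2.0):
    """Flag a z-score beyond the cutoff; 2 matches the rule used in part f."""
    return abs(z) > cutoff

altria_z = z_score(5.0)   # Altria Group's yield of 5%
print(f"Altria z = {altria_z:.2f}, outlier: {is_outlier(altria_z)}")
```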
Problem 5: A recent study conducted by the personnel manager of a major computer
software company showed that 30% of the employees who left the firm within two years did
so primarily because they were dissatisfied with their salary. 20% left because they were
dissatisfied with their work assignments and 12% of the former employees indicated
dissatisfaction with both their salary and their work assignments. What is the probability
that an employee who leaves within two years does so because of dissatisfaction with
salary, dissatisfaction with the work assignment, or both? Find the previous probabilities
again, but this time assuming that no employees left because they were both dissatisfied
with their salary and their work (Hint: Think of mutually exclusive events.).
Solution:
Let
S = the event that the employee leaves because of salary
W = the event that the employee leaves because of work assignment
We have P (S) = 0.30, P (W ) = 0.20, and P (S ∩ W ) = 0.12. Using the addition law we get
P (S ∪ W ) = P (S) + P (W ) − P (S ∩ W ) = 0.30 + 0.20 − 0.12 = 0.38
We find a 0.38 probability that an employee leaves for salary or work assignment reasons.
If no employees left because they were both dissatisfied with their salary and their work,
then P (S ∩ W ) = 0. Then the probability in question becomes
P (S ∪ W ) = P (S) + P (W ) − P (S ∩ W ) = 0.30 + 0.20 − 0 = 0.5
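Both cases of the addition law can be sketched with one small function. The function name is illustrative; the default of zero for the overlap corresponds to mutually exclusive events.

```python
def prob_union(p_a, p_b, p_both=0.0):
    """Addition law: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_both

p_salary = 0.30   # P(S)
p_work = 0.20     # P(W)

overlapping = prob_union(p_salary, p_work, p_both=0.12)   # events overlap
exclusive = prob_union(p_salary, p_work)                  # mutually exclusive

print(f"P(S or W) with overlap: {overlapping:.2f}")
print(f"P(S or W) mutually exclusive: {exclusive:.2f}")
```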
Problem 6: Consider the promotion status of male and female officers of a major
metropolitan police force in the eastern United States. The police force consists of 1200
officers: 960 men and 240 women. Over the past two years, 324 officers on the police force
received promotions.
After reviewing the promotion record, a committee of female officers raised a discrimination case
on the basis that 288 male officers had received promotions but only 36 female officers had
received promotions. The police administration argued that the relatively low number of
promotions for female officers was due not to discrimination, but to the fact that relatively
few females are members of the police force. Let
M = event an officer is a man.
W = event an officer is a woman.
A = event an officer is promoted.
A^c = event an officer is not promoted.
Does the discrimination charge have any merit? That is, is P(A|M) significantly
greater than P(A|W)?
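Here is a summary of the promotion status of police officers over the past two years
(counts follow from the figures given above):
                Men   Women   Total
Promoted        288      36     324
Not promoted    672     204     876
Total           960     240    1200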
Solution:
We know that $P(A|M) = \frac{P(A \cap M)}{P(M)}$ and $P(A|W) = \frac{P(A \cap W)}{P(W)}$. Let’s find these
probabilities individually first:
P (W ) = 240/1200 = 0.2 and P (M ) = 960/1200 = 0.8
P (A ∩ W ) = 36/1200 = 0.03 and P (A ∩ M ) = 288/1200 = 0.24
Now let’s plug these values into the conditional probability equations:
$$P(A|W) = \frac{P(A \cap W)}{P(W)} = \frac{0.03}{0.2} = 0.15$$
$$P(A|M) = \frac{P(A \cap M)}{P(M)} = \frac{0.24}{0.8} = 0.30$$
These numbers suggest that the probability that an officer is promoted given that the
officer is a woman is 0.15. On the other hand, the probability that an officer is promoted
given that the officer is a man is 0.30. That is, the probability of a promotion given that
the officer is a man is twice the probability of promotion given that the officer is a woman.
While this alone is not enough to conclude that discrimination exists, the conditional
probability values support the argument of discrimination.
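The conditional probabilities can be verified directly from the counts in the problem. This is a minimal sketch with illustrative variable names.

```python
total = 1200
men, women = 960, 240
promoted_men, promoted_women = 288, 36

# Marginal and joint probabilities
p_m = men / total                 # P(M)
p_w = women / total               # P(W)
p_a_and_m = promoted_men / total      # P(A and M)
p_a_and_w = promoted_women / total    # P(A and W)

# Conditional probabilities: P(A|M) = P(A and M) / P(M), likewise for W
p_a_given_m = p_a_and_m / p_m
p_a_given_w = p_a_and_w / p_w

print(f"P(A|M) = {p_a_given_m:.2f}, P(A|W) = {p_a_given_w:.2f}")
print(f"ratio = {p_a_given_m / p_a_given_w:.1f}")
```

Dividing the counts directly (288/960 and 36/240) gives the same conditional probabilities, since the total of 1200 cancels.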
Problem 7: Roll a fair six-sided die. A fair die is one in which each side is equally likely
to end up on top. You will win $4 if you roll a 5 or a 6. You will lose $5 if you roll a 1. For
any other outcome, you will win or lose nothing. Give a table that shows the probability
distribution for the amount of money you will win. Draw a graph of this probability
distribution function. For example, you will win $0 if you roll a 2, 3, or 4, so the probability
is 3/6 = 1/2. Remember to graph probabilities on the y-axis and winnings on the x-axis.
Solution:
This is practically example 2 on pg. 246 of the class text.
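For checking your own table, the distribution can be built directly from the rules in the problem statement. This is a sketch derived from the problem as stated, not a reproduction of the textbook example; the function and variable names are illustrative.

```python
from collections import defaultdict
from fractions import Fraction

def winnings(face):
    """Dollar amount won for a given die face, per the problem statement."""
    if face in (5, 6):
        return 4
    if face == 1:
        return -5
    return 0

# Each of the six faces is equally likely, so add 1/6 per face
dist = defaultdict(Fraction)
for face in range(1, 7):
    dist[winnings(face)] += Fraction(1, 6)

print("winnings  probability")
for amount in sorted(dist):
    print(f"{amount:>8}  {dist[amount]}")
```

The three probabilities sum to 1, which is a quick check that no outcome was missed.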
Problem 8: For borrowers with good credit scores, the mean debt is $15,015. Assume the
standard deviation is $3,540 and that debt amounts are normally distributed.
a. What is the probability that the debt for a borrower with good credit is more than
$18,000?
b. What is the probability that the debt for a borrower with good credit is less than
$10,000?
c. What is the probability that the debt for a borrower with good credit is between
$12,000 and $18,000?
d. What is the probability that the debt for a borrower with good credit is no more
than $14,000?
Solution:
a. The debt amounts for borrowers with good credit scores follow a normal distribution
with µ = $15,015 and standard deviation σ = $3,540. Then the probability that the
debt for a borrower with good credit is more than $18,000 is given by:
$$
\begin{aligned}
P(x > 18{,}000) &= P(x - \mu > 18{,}000 - 15{,}015) \\
&= P\left(\frac{x - \mu}{\sigma} > \frac{18{,}000 - 15{,}015}{3{,}540}\right) \\
&= P(z > 0.84) \\
&= 1 - P(z < 0.84) \quad \text{(by the complement rule)} \\
&= 1 - 0.7995 \\
&= 0.2005
\end{aligned}
$$
The computer will give you this output with the press of a button, but you should
still understand the intuition behind this process.
b. Following a similar process to part a, P(x < 10,000) = P(z < −1.42) = 0.0778.
c. Following the same process as in part a,
P(12,000 < x < 18,000) = P(−0.85 < z < 0.84) = P(z < 0.84) − P(z < −0.85) =
0.7995 − 0.1977 = 0.6018.
You really should have the graph of the standard normal distribution in mind when
calculating this probability. Believe me, it will make it way easier to understand why
the area between the two z values is the left-tail area up to 0.84 minus the left-tail
area up to −0.85:
P(−0.85 < z < 0.84) = P(z < 0.84) − P(z < −0.85).
d. The probability that we want is P(x ≤ 14,000) = P(z ≤ −0.29) = 0.3859. Keep in
mind that for continuous distributions there is no difference between using P(z < z_0)
and P(z ≤ z_0) because there is no area under the curve for a specific z value.
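All four parts can be checked with the standard library's normal distribution. This sketch rounds each z-score to two decimals by default, as a printed z-table does, so the results line up with the hand calculations; passing `round_z=False` uses the unrounded z and shifts the answers slightly in the third decimal place. The helper name is illustrative.

```python
from statistics import NormalDist

MU, SIGMA = 15015, 3540
std_normal = NormalDist()   # standard normal: mean 0, standard deviation 1

def left_tail(x, round_z=True):
    """P(X < x); optionally round z to two decimals, as a z-table lookup does."""
    z = (x - MU) / SIGMA
    if round_z:
        z = round(z, 2)
    return std_normal.cdf(z)

p_a = 1 - left_tail(18000)                   # more than $18,000
p_b = left_tail(10000)                       # less than $10,000
p_c = left_tail(18000) - left_tail(12000)    # between $12,000 and $18,000
p_d = left_tail(14000)                       # no more than $14,000

for label, prob in zip("abcd", (p_a, p_b, p_c, p_d)):
    print(f"{label}. {prob:.4f}")
```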
Problem 9: When an opinion poll calls residential telephone numbers at random, only
20% of the calls reach a live person. You watch the random digit dialing machine make 15
calls.
a. What are the mean and standard deviation of the number of calls that reach a live person?
b. What is the probability that exactly 3 calls reach a person?
c. What is the probability that at most 3 calls reach a person?
d. What is the probability that fewer than 3 calls reach a person?
Solution:
Check on your own that this situation meets the criteria of a binomial distribution. In this
case, a success is an event in which a random call reaches a live person. We are told that
20% of calls reach a live person, so the proportion of success is p = 0.20. Now we can begin
to answer the questions at hand.
a. The mean is $\mu = np = 15(0.20) = 3$ calls and the standard deviation is
$\sigma = \sqrt{np(1-p)} = \sqrt{3(1 - 0.20)} = \sqrt{3(0.80)} = 1.549$.
b. Using the binomial distribution we find that P(X = 3) = 0.2501. The probability
that exactly 3 calls reach a person is 0.2501.
c. Using the binomial distribution we find that
P(X ≤ 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) = 0.6482. The
probability that at most 3 calls reach a person is 0.6482.
d. Using the binomial distribution we find that
P(X < 3) = P(X = 0) + P(X = 1) + P(X = 2) = 0.398. We did not include
P(X = 3) because we wanted strictly fewer than 3 calls. The final answer is that the
probability that fewer than 3 calls reach a person is 0.398.
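The binomial calculations can be sketched with the standard library, assuming the binomial formula P(X = k) = C(n, k) p^k (1 − p)^(n − k). The helper name is illustrative.

```python
from math import comb, sqrt

n, p = 15, 0.20   # 15 calls, each reaching a live person with probability 0.20

def binom_pmf(k, n=n, p=p):
    """P(X = k) for a binomial random variable with parameters n and p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

mean = n * p                       # part a: mean
sd = sqrt(n * p * (1 - p))         # part a: standard deviation

p_exactly_3 = binom_pmf(3)                              # part b
p_at_most_3 = sum(binom_pmf(k) for k in range(4))       # part c: k = 0, 1, 2, 3
p_fewer_than_3 = sum(binom_pmf(k) for k in range(3))    # part d: k = 0, 1, 2

print(f"mean = {mean:.0f}, sd = {sd:.3f}")
print(f"P(X = 3) = {p_exactly_3:.4f}")
print(f"P(X <= 3) = {p_at_most_3:.4f}")
print(f"P(X < 3) = {p_fewer_than_3:.4f}")
```

The only difference between parts c and d is whether k = 3 is included in the sum, which the two `range` calls make explicit.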