Normal Approximation for a Binomial Distribution

McGraw-Hill Ryerson
Data Management 12
7.5 Connections to Discrete Random Variables
I am learning to
• make connections between a normal distribution and a binomial distribution
• make connections between a normal distribution and a hypergeometric
distribution
• recognize the role of the number of trials in these connections
Success Criteria
I will know I am successful when I can
• use a normal distribution to approximate probabilities associated with a binomial distribution
• use a normal distribution to approximate probabilities associated with a hypergeometric distribution
• tell whether it is appropriate to use a normal approximation for a given number of trials
• apply a continuity correction to determine the probability associated with a discrete distribution using
a normal distribution
• describe the connection between the number of trials and the fit of a normal approximation to a
binomial or hypergeometric distribution
What are some other success criteria?
Recall from Chapter 4 that tossing a coin several times and recording the
number of heads obtained is an example of a binomial distribution.
How does the shape of the distribution depend on the number of times the
experiment is tried?
Example: As the number of trials increases, the shape of the distribution
becomes closer to the normal distribution.
Predict the form of a graph of the number of heads possible when a coin is
flipped five times. Make a sketch of your prediction.
Example: The graph will be symmetric and roughly bell-shaped, resembling a normal distribution.
Investigate 1 Compare the Binomial Distribution to the Normal Distribution
1. Click on the icon to open the Fathom™ file.
The file shows the probability distribution for the number of heads flipped on five trials.
To approximate the binomial distribution with a normal distribution, you can calculate the
mean using the formula $\mu = np$, and the standard deviation using the formula $\sigma = \sqrt{npq}$.
Check some of the probabilities by calculating them yourself.
2. How well does the normal distribution match the binomial distribution?
3. Right click on the collection box, and add 5 new cases. Adjust the scales on the axes if
necessary. How well do the distributions match with 10 tosses of the coin?
4. Right click on the collection box, and add 10 new cases. Adjust the scales on the axes if
necessary. How well do the distributions match with 20 tosses of the coin?
5. Reflect How does the fit of the normal distribution to the binomial
distribution depend on the number of trials? Use your Fathom™
simulation to try a larger number of trials, such as 100 and then 1000.
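For readers without Fathom™, the comparison can be sketched in Python. This is only an illustration, assuming a fair coin (p = 0.5); the function name `max_fit_error` is hypothetical and not part of the lesson files. For each number of tosses, it finds the largest difference between an exact binomial probability and the area under the normal curve over the matching bar.

```python
from math import comb, sqrt
from statistics import NormalDist


def max_fit_error(n, p=0.5):
    """Largest gap between an exact binomial probability and its normal approximation."""
    q = 1 - p
    normal = NormalDist(mu=n * p, sigma=sqrt(n * p * q))
    worst = 0.0
    for k in range(n + 1):
        exact = comb(n, k) * p**k * q**(n - k)
        # Area under the normal curve over the bar centred on k (width 1).
        approx = normal.cdf(k + 0.5) - normal.cdf(k - 0.5)
        worst = max(worst, abs(exact - approx))
    return worst


errors = {n: max_fit_error(n) for n in (5, 10, 20, 100, 1000)}
for n, err in errors.items():
    print(f"n = {n:4d}: largest difference = {err:.5f}")
```

The printed differences shrink as the number of tosses grows, which is the pattern the Reflect question asks about.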
Investigate 2 Compare the Hypergeometric Distribution to the Normal Distribution
A committee of 4 is chosen from a group of 200 people. How many males are on the committee?
How does the distribution change as the size of the committee is increased? Does the population
size have any effect?
1. Click on the icon to open the Fathom™ file.
The file shows the probability distribution for the number of males selected in a committee of
4 people from a population of 100 males and 100 females.
If the sample size is small compared to the population size, the probability of selecting a
male is approximately equal to the number of males divided by the population size.
To approximate the hypergeometric distribution with a normal
distribution, you can calculate the mean using the formula $\mu = np$ and
the standard deviation using the formula
$\sigma = \sqrt{npq\left(\frac{N_P - n}{N_P - 1}\right)}$, where $N_P$ is the population size.
2. How well does the normal distribution match the hypergeometric distribution?
3. Change the committee membership from 4 to 10. Right click on the collection box, and add 6 new
cases. Adjust the scales on the axes of the graph if necessary. How is the fit with 10 members on the
committee?
4. Change the committee membership to 20. Right click on the collection box, and add 10 new cases.
How is the fit with 20 members on the committee?
5. Reflect The committee membership must remain a small fraction of the population size, typically
less than one-tenth. Why is this necessary? Consider the values of p and q in the above investigation
in your response.
6. Extend Your Understanding Suppose that a committee of 4 were
chosen in a random selection from a very small population, say a club
with 6 male members and 2 female members. Can you see any
problems with calculating the number of males on the committee?
Consider some extreme cases. Use the Fathom™ simulation from the
investigation to explore different scenarios.
When is a normal approximation reasonable?
Binomial Distribution: usually considered reasonable if np > 5 and nq > 5
Hypergeometric Distribution: usually considered reasonable if n < 0.1 N_P, where n is the sample size and N_P is the population size
Why is it not considered reasonable to use a normal approximation outside these parameters?
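The two rules of thumb can be written as short checks. This is a minimal sketch; the function names are hypothetical, and the thresholds are the ones stated in this section.

```python
def binomial_normal_ok(n, p):
    """Rule of thumb from this section: np > 5 and nq > 5."""
    q = 1 - p
    return n * p > 5 and n * q > 5


def hypergeometric_normal_ok(n, population):
    """Rule of thumb from this section: sample size below one-tenth of the population."""
    return n < 0.1 * population


print(binomial_normal_ok(25, 0.25))      # 25 questions, 4 choices each
print(binomial_normal_ok(10, 0.6))       # nq = 4, so the check fails
print(hypergeometric_normal_ok(4, 200))  # committee of 4 from 200 people
```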
Continuity Correction
Suppose you want to use a normal approximation to determine the probability of flipping a coin 5 times
and getting exactly 1 head. You cannot simply determine the area under the normal curve for that
particular point, as there is no area under a point (a point has a width of zero). We apply a continuity
correction in cases like this to determine the probability of discrete outcomes using a (continuous)
normal distribution.
To understand a continuity correction, it might help to envision
a histogram overlaid upon the normal approximation—think of
the binomial distribution it is approximating.
The area we want is the area of the bar centred on the
point at (1,f(1)), with a width of 1 unit. Its left bound is
x = 0.5 and its right bound is x = 1.5.
This rectangular area can be approximated by considering the area under the normal curve over the
same interval. So, the approximate probability is given by
P(1 head) ≈ P(0.5 < X < 1.5).
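As a quick check of the continuity correction, the following sketch compares the exact binomial probability of 1 head in 5 tosses with the area under the normal curve from 0.5 to 1.5. It assumes a fair coin and uses only the Python standard library; the variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 5, 0.5
q = 1 - p

# Exact binomial probability of exactly 1 head.
exact = comb(n, 1) * p**1 * q**(n - 1)

# Normal approximation with mean np and standard deviation sqrt(npq),
# taking the area over the bar from 0.5 to 1.5.
normal = NormalDist(mu=n * p, sigma=sqrt(n * p * q))
approx = normal.cdf(1.5) - normal.cdf(0.5)

print(f"exact binomial P(1 head) = {exact:.4f}")
print(f"normal approximation     = {approx:.4f}")
```

The two values are close (about 0.156 and 0.149) but not equal. Note that 5 tosses does not meet the usual np > 5 guideline.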
Example 1
Normal Approximation for a Binomial Distribution
A data management quiz consists of 25 multiple choice questions with 4 choices per question. Charlie
didn’t study for the quiz, so he guesses an answer for each question.
a) Is it reasonable to approximate this distribution with a normal distribution?
Give a reason.
b) What values need to be determined in order to use a normal
approximation? Determine these values.
c) What is the probability that Charlie will get a passing grade
(50% or more) on this quiz?
Hints
To use a normal approximation, you need to determine the mean and standard deviation: $\mu = np$ and $\sigma = \sqrt{npq}$.
Use a continuity correction for answering part c).
Example 2
Normal Approximation for a Hypergeometric Distribution
Lizzie deals 5 cards from a standard deck of 52 cards. She would
like to deal as many face cards as possible.
a) Is it reasonable to approximate this distribution with a normal distribution? Give a reason.
b) What values need to be determined in order to use a normal approximation? Determine these values.
c) What is the probability that Lizzie will deal 3 or more face cards?
Hints
There are 3 face cards (J, Q, K) in each of 4 suits for a total of 12 face cards per deck.
To use a normal approximation, you need to determine the mean and standard deviation: $\mu = np$ and $\sigma = \sqrt{npq\left(\frac{N_P - n}{N_P - 1}\right)}$.
Use a continuity correction when answering part c).
Reflect
In what circumstances would you prefer not to use a normal distribution to approximate a binomial or
hypergeometric distribution?
Example: When looking for the probability of a single event, it may be faster to calculate the
probability using the binomial or hypergeometric formula.
1. True or false?
It is always better to approximate binomial and hypergeometric distributions with normal distributions.
Answer: False
2. True or false?
It is appropriate to use a normal approximation for a binomial distribution with p = 0.6 and n = 10.
Answer: False. Here np = (10)(0.6) = 6 > 5, but nq = (10)(0.4) = 4, which is not greater than 5.
3. True or false?
A continuity correction is used when using the area under a normal distribution to model the probability of
discrete events.
Answer: True
4. Select the best answer.
When approximating a binomial distribution with a normal distribution, increasing the
sample size will make the approximation
A better
B better as long as the sample does not exceed 10% of the population size
C perfect
D worse
Answer: A
5. Select the best answer.
The graphing calculator entry to approximate the probability of 5 or more discrete events,
using a normal approximation with a mean of 6 and a standard deviation of 2.3 is
A normalcdf(5, 999999, 6, 2.3)
B normalcdf(5, infinity, 6, 2.3)
C normalcdf(4.5, 999999, 6, 2.3)
D normalcdf(4.5, infinity, 2.3, 6)
Answer: C
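For readers working without a graphing calculator, the same entry can be imitated in Python. The `normalcdf` function below is a hypothetical helper written for this sketch, not a built-in; it assumes the argument order (lower bound, upper bound, mean, standard deviation) used in the answer choices.

```python
from statistics import NormalDist


def normalcdf(lower, upper, mean, sd):
    """Area under the normal curve between lower and upper (hypothetical helper)."""
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(upper) - dist.cdf(lower)


# "5 or more" discrete events: the continuity correction moves the lower bound to 4.5.
probability = normalcdf(4.5, 999999, 6, 2.3)
print(f"P(X > 4.5) = {probability:.4f}")
```

The large upper bound 999999 stands in for infinity, as in the calculator entry.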
The following pages contain solutions for the previous questions.
Solutions
Investigate 1 Compare the Binomial Distribution to the Normal Distribution
1. Click on the icon to open the Fathom™ file.
To approximate the binomial distribution with a normal distribution, you can calculate the mean using
the formula $\mu = np$, and the standard deviation using the formula $\sigma = \sqrt{npq}$. Check some of the
probabilities by calculating them yourself.
Using the normal distribution to approximate the probability of tossing 2 heads
[normalcdf(1.5, 2.5, 2.5, 1.118)] gives a value of about 0.3144.
Using the normal approximation to find the probability of tossing 4 heads
[normalcdf(3.5, 4.5, 2.5, 1.118)] gives a value of about 0.1487.
2. How well does the normal distribution match the binomial distribution?
The normal approximation is a smooth, continuous curve, whereas the
binomial distribution consists of a small number of discrete points. Its
graph looks like a jagged curve that is fairly close to the normal
approximation, especially at the integer points.
3. Right click on the collection box, and add 5 new cases. Adjust the scales on the axes if necessary.
How well do the distributions match with 10 tosses of the coin?
The normal approximation fits slightly better for 10 tosses of the coin.
4. Right click on the collection box, and add 10 new cases. Adjust the
scales on the axes if necessary. How well do the distributions match
with 20 tosses of the coin?
The normal approximation fits better still for 20 tosses of the coin.
5. Reflect How does the fit of the normal distribution to the binomial distribution depend on the
number of trials? Use your Fathom™ simulation to try a larger number of trials, such as 100 and then
1000.
The normal approximation fits better as the number of trials increases.
Investigate 2 Compare the Hypergeometric Distribution to the Normal Distribution
A committee of 4 is chosen from a group of 200 people. How many males are on the committee?
How does the distribution change as the size of the committee is increased? Does the population
size have any effect?
1. Click on the icon to open the Fathom™ file.
To approximate the hypergeometric distribution with a normal distribution, you can calculate the mean
using the formula $\mu = np$ and the standard deviation using the formula
$\sigma = \sqrt{npq\left(\frac{N_P - n}{N_P - 1}\right)}$, where $N_P$ is the population size.
2. How well does the normal distribution match the
hypergeometric distribution?
The normal distribution matches the hypergeometric
distribution reasonably well.
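One way to check this answer without Fathom™ is to compute both distributions directly. The sketch below assumes the setup from the investigation (a committee of 4 drawn from 100 males and 100 females); the variable names are illustrative. It compares each exact hypergeometric probability with the corresponding area under the normal curve.

```python
from math import comb, sqrt
from statistics import NormalDist

males, females, committee = 100, 100, 4
population = males + females

p = males / population
q = 1 - p
mean = committee * p
sd = sqrt(committee * p * q * (population - committee) / (population - 1))
normal = NormalDist(mu=mean, sigma=sd)

differences = []
for k in range(committee + 1):
    # Exact probability of k males on the committee.
    exact = comb(males, k) * comb(females, committee - k) / comb(population, committee)
    # Normal area over the bar centred on k (continuity correction).
    approx = normal.cdf(k + 0.5) - normal.cdf(k - 0.5)
    differences.append(abs(exact - approx))
    print(f"{k} males: exact = {exact:.4f}, normal = {approx:.4f}")

print(f"mean = {mean:.3f}, standard deviation = {sd:.3f}")
```

Here the mean is 2 and the standard deviation is about 0.992, and each pair of probabilities differs by less than 0.01.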
3. Change the committee membership from 4 to 10. Right click on the collection box, and add 6 new
cases. Adjust the scales on the axes of the graph if necessary. How is the fit with 10 members on the
committee?
The fit is slightly better with 10 members on the committee.
4. Change the committee membership to 20. Right click on the collection box, and add 10 new cases.
How is the fit with 20 members on the committee?
The fit is even better with 20 members on the committee.
5. Reflect The committee membership must remain a
small fraction of the population size, typically less than
one-tenth. Why is this necessary? Consider the values
of p and q in the above investigation in your response.
We are approximating p by taking the number of males
in the population and dividing by the size of the
population. Once a sample is taken, for example when
the first person is selected for the committee, the value
of p will change, since we are not replacing each
committee member before picking the next one. If the
value of p changes too much, the approximation for the
mean and standard deviation will not be very good.
6. Extend Your Understanding Suppose that a committee of 4 were chosen in a random selection
from a very small population, say a club with 6 male members and 2 female members. Can you see
any problems with calculating the number of males on the committee? Consider some extreme cases.
Use the Fathom™ simulation from the investigation to explore different scenarios.
If the number of males, or females as in this example, is less than the sample size, it is possible the
normal approximation will predict a non-zero probability for a committee with more males or females
than exist in the whole population, which is clearly impossible.
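The extreme case can be illustrated numerically. This sketch assumes the club described in the question (6 males, 2 females, committee of 4) and applies the mean and standard deviation formulas from the investigation. A committee with fewer than 2 males would need at least 3 females, which the club does not have.

```python
from math import comb, sqrt
from statistics import NormalDist

males, females, committee = 6, 2, 4
population = males + females

p = males / population
q = 1 - p
mean = committee * p
sd = sqrt(committee * p * q * (population - committee) / (population - 1))
normal = NormalDist(mu=mean, sigma=sd)

# Exact probability of 0 or 1 males: comb(2, 4) and comb(2, 3) are both 0.
exact_impossible = sum(
    comb(males, k) * comb(females, committee - k) for k in (0, 1)
) / comb(population, committee)

# Normal area below 1.5, which covers the outcomes "0 males" and "1 male".
approx_impossible = normal.cdf(1.5)

print(f"exact P(fewer than 2 males) = {exact_impossible:.4f}")
print(f"normal approximation        = {approx_impossible:.4f}")
```

The exact probability is zero, while the normal approximation assigns these impossible committees a small positive probability (roughly 1%).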
Example 1
Normal Approximation for a Binomial Distribution
a) Is it reasonable to approximate this distribution with a normal distribution? Give a reason.
Yes. The probability of guessing correctly on each question is 25%, so p = 0.25 and q = 0.75. Since
np = (25)(0.25) = 6.25 > 5 and nq = 18.75 > 5, it is reasonable to approximate this binomial distribution
with a normal distribution.
b) What values need to be determined in order to use a normal approximation? Determine these
values.
In order to use a normal approximation, we need to determine the mean and standard deviation: $\mu = np = (25)(0.25) = 6.25$ and $\sigma = \sqrt{npq} = \sqrt{(25)(0.25)(0.75)} \approx 2.165$.
c) What is the probability that Charlie will get a passing grade (50% or more) on this quiz?
The lowest passing grade would be 13 out of 25. We should use a
continuity correction and calculate P(X > 12.5).
P(X > 12.5) ≈ 0.00195
The probability of Charlie passing this quiz is less than 0.2%.
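The calculation for Example 1 can be reproduced as follows. This is a sketch using the Python standard library in place of a graphing calculator; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, p = 25, 0.25   # 25 questions, 1 chance in 4 of guessing correctly
q = 1 - p

mean = n * p
sd = sqrt(n * p * q)

# Passing means 13 or more correct; the continuity correction starts the area at 12.5.
p_pass = 1 - NormalDist(mu=mean, sigma=sd).cdf(12.5)

print(f"mean = {mean:.2f}, standard deviation = {sd:.3f}")
print(f"P(X > 12.5) = {p_pass:.5f}")
```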
Example 2
Normal Approximation for a Hypergeometric Distribution
a) Is it reasonable to approximate this distribution with a normal distribution? Give a reason.
Yes. Since n = 5 is slightly less than 0.1 N_P = 0.1(52) = 5.2, it is considered reasonable to approximate
this scenario with a normal distribution.
b) What values need to be determined in order to use a normal approximation? Determine these
values.
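In order to use a normal approximation, we need to determine the mean and standard deviation. With n = 5, $N_P = 52$, p = 12/52, and q = 40/52: $\mu = np \approx 1.154$ and $\sigma = \sqrt{npq\left(\frac{N_P - n}{N_P - 1}\right)} \approx 0.904$.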
c) What is the probability that Lizzie will deal 3 or more face cards?
We should use a continuity correction and calculate P(X > 2.5). With μ ≈ 1.154 and σ ≈ 0.904, P(X > 2.5) ≈ 0.068.
There is about a 7% probability that Lizzie will deal 3 or more face cards.
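The working for part c) can be sketched in Python using the mean and standard deviation formulas from this section. The variable names are illustrative. The exact hypergeometric probability is computed alongside for comparison; that comparison is an addition made here and is not part of the original solution.

```python
from math import comb, sqrt
from statistics import NormalDist

n, face_cards, deck = 5, 12, 52

p = face_cards / deck
q = 1 - p
mean = n * p
sd = sqrt(n * p * q * (deck - n) / (deck - 1))

# "3 or more" face cards: the continuity correction starts the area at 2.5.
approx = 1 - NormalDist(mu=mean, sigma=sd).cdf(2.5)

# Exact hypergeometric probability of 3, 4, or 5 face cards.
exact = sum(
    comb(face_cards, k) * comb(deck - face_cards, n - k) for k in range(3, n + 1)
) / comb(deck, n)

print(f"mean = {mean:.3f}, standard deviation = {sd:.3f}")
print(f"normal approximation P(X > 2.5) = {approx:.4f}")
print(f"exact hypergeometric P(X >= 3)  = {exact:.4f}")
```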
Attachments
7.5 CoinToss.ftm
7.5 Committee.ftm