Chapter 3: Displaying and Describing Categorical Data

Bernoulli Trials
A trial is a Bernoulli trial if
1. there are two possible outcomes
2. the probability of success is constant
3. the trials are independent
Common examples of Bernoulli trials:
• Tossing a coin
• Looking for defective products off an assembly line
• Shooting free throw shots in a basketball game
Back to the McDonalds/Cars Example (Chapter 11)
McDonalds, as we all know, puts toys in
Happy Meals. Suppose their most recent
promotion has 3 toys from the movie “Cars”:
Lightning McQueen, Mater, and Doc
Hudson. McDonalds announces that 20% of
Happy Meals contain a toy Lightning
McQueen, 30% contain a toy Mater, and the
remaining 50% contain Doc Hudson.
I NEED a McQueen!
What’s the probability that you find your first Lightning McQueen in
1. the first happy meal you open? $0.20$
2. the second happy meal you open? $(0.8)(0.2) = 0.16$
3. the fifth happy meal? $(0.8)^4 (0.2) = 0.08192$
4. How many happy meals might you expect to open? What do you think?
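One way to form a guess for question 4 is to simulate it. The following is a minimal Python sketch, assuming each meal independently contains McQueen with probability 0.20; the function and variable names are illustrative only.

```python
import random

def meals_until_mcqueen(p, rng):
    """Open meals one at a time; return how many were opened to get the first McQueen."""
    count = 1
    while rng.random() >= p:  # this meal is a failure with probability 1 - p
        count += 1
    return count

rng = random.Random(0)  # fixed seed so the run is repeatable
p_mcqueen = 0.20        # chance a single meal contains McQueen (from the example)

runs = [meals_until_mcqueen(p_mcqueen, rng) for _ in range(100_000)]
average_meals = sum(runs) / len(runs)
share_first_on_fifth = runs.count(5) / len(runs)

print(round(average_meals, 2), round(share_first_on_fifth, 3))
```

The simulated share of runs ending on the fifth meal should land close to the 0.08192 computed above, and the average gives an empirical answer to "how many meals might you expect to open?"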
Geometric Probability Model
Geometric probability model for Bernoulli trials: Geom(p)
p = probability of success (and q = 1 – p = probability of failure)
X = number of trials until the first success occurs
$P(X = x) = q^{x-1} p$
Expected value: $\mu = \frac{1}{p}$
Standard deviation: $\sigma = \sqrt{\frac{q}{p^2}}$
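The Geom(p) formulas translate directly into code. This is a small Python sketch with hypothetical helper names (`geom_pmf`, `geom_mean`, `geom_sd`), checked against the McQueen example where p = 0.2.

```python
from math import sqrt

def geom_pmf(x, p):
    """P(X = x): the first success happens on trial x."""
    q = 1 - p
    return q ** (x - 1) * p

def geom_mean(p):
    """Expected number of trials until the first success."""
    return 1 / p

def geom_sd(p):
    """Standard deviation of the number of trials until the first success."""
    return sqrt((1 - p) / p ** 2)

# McQueen example: first success on meal 1, 2, and 5
print(geom_pmf(1, 0.2), geom_pmf(2, 0.2), geom_pmf(5, 0.2))
print(geom_mean(0.2), geom_sd(0.2))
```

With p = 0.2 the mean is 1/0.2 = 5 meals, which answers the "how many might you expect to open?" question.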
Independence

An important requirement for Bernoulli
trials is that (you guessed it!) the trials
be independent.

The 10% Condition: Bernoulli trials
must be independent. If that
assumption is violated, it is still okay to
proceed as long as the sample is
smaller than 10% of the population.
Example 1
People with O-negative blood are called
“universal donors” because O-negative blood
can be given to anyone else, regardless of
the recipient’s blood type. Only about 6% of
people have O-negative blood. If donors line
up at random for a blood drive, how many do
you expect to examine before you find
someone who has O-negative blood? What’s
the probability that the first O-negative donor
found is one of the first four people in line?
Example 1
Before we answer these questions, have we met
the conditions to use the Geometric model?
• Two outcomes: O-negative or not
• The probability of success for each person is 0.06,
because donors line up at random
• The trials are not strictly independent because the population is finite;
however, fewer than 10% of all possible donors are
lined up, so the 10% Condition is met
Example 1
A. How many do you expect to examine
before you find someone who has O-negative blood?
$E(X) = \frac{1}{0.06} \approx 16.7$
Blood drives like this should expect to examine an
average of 16.7 people to find a universal donor.
B. What’s the probability that the first O-negative donor found is one of the first
four people in line?
$P(X \leq 4) = P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4)$
$= 0.06 + (0.94)(0.06) + (0.94)^2 (0.06) + (0.94)^3 (0.06)$
$= 0.2193$
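Both parts of Example 1 can be checked in a few lines of Python. The complement shortcut $1 - q^4$ is an added cross-check, not part of the slide: the first O-negative donor is among the first four unless all four are failures.

```python
p = 0.06   # probability a donor is O-negative (from the example)
q = 1 - p

# Part A: expected number of donors examined until the first success
expected_donors = 1 / p

# Part B: add up P(X = 1) through P(X = 4), as on the slide
prob_sum = sum(q ** (x - 1) * p for x in range(1, 5))

# Cross-check: 1 minus the chance that the first four donors all fail
prob_complement = 1 - q ** 4

print(round(expected_donors, 1), round(prob_sum, 4), round(prob_complement, 4))
# prints: 16.7 0.2193 0.2193
```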
TI Tips
Your calculator knows the geometric model.
2nd, Vars (same place as normalcdf)
E: geometpdf( finds the probability of an individual outcome
F: geometcdf( calculates the probability of finding success on or before the xth trial
TI Tips

Look carefully: the top of the screen
shows whether you are in pdf or cdf!
[Calculator screen images: geometpdf and geometcdf]
geometpdf: enter the probability of success, then the trial you get your success on
The Binomial Model

You buy 5 happy meals. What’s the
probability you get exactly 3 McQueen toys?

These are the same Bernoulli trials, but now
we’re concerned with the number of
successes rather than the number of trials
until the first success.
The Binomial Model

Uses two parameters:
• The number of trials, n
• The probability of success, p

Denote: Binom(n, p)

In our McQueen example: Binom(5, 0.2)
The Binomial Model
Without getting too much into the derivation of
the formula, in general the probability of exactly
$k$ successes in $n$ trials is:
$P(X = k) = \binom{n}{k} p^k q^{n-k}$
The Binomial Model
Binomial probability model for Bernoulli trials: Binom(n,p)
n = number of trials
p = probability of success (and q = 1 – p = probability of failure)
x = number of successes in n trials
$P(X = x) = \binom{n}{x} p^x q^{n-x}$, where $\binom{n}{x} = \frac{n!}{x!\,(n-x)!}$
Mean: $\mu = np$
Standard deviation: $\sigma = \sqrt{npq}$
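As an illustration, the Binom(n, p) formulas can be written as a short Python sketch. The helper names are hypothetical; `math.comb` computes the binomial coefficient. The usage line answers the earlier McQueen question (exactly 3 McQueens in 5 happy meals).

```python
from math import comb, sqrt

def binom_pmf(x, n, p):
    """P(X = x): exactly x successes in n Bernoulli trials."""
    q = 1 - p
    return comb(n, x) * p ** x * q ** (n - x)

def binom_mean(n, p):
    """Expected number of successes in n trials."""
    return n * p

def binom_sd(n, p):
    """Standard deviation of the number of successes in n trials."""
    return sqrt(n * p * (1 - p))

# McQueen example, Binom(5, 0.2): exactly 3 McQueens in 5 meals
prob_three_mcqueens = binom_pmf(3, 5, 0.2)
print(round(prob_three_mcqueens, 4))
# prints: 0.0512
```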
Example 2
Suppose 20 donors come to the blood drive.
a. What are the mean and standard deviation of
the number of universal donors among them?
$E(X) = np = 20(0.06) = 1.2$
$SD(X) = \sqrt{npq} = \sqrt{20(0.06)(0.94)} = 1.06$
In groups of 20 randomly selected blood donors, I expect
to find an average of 1.2 universal donors with a standard
deviation of 1.06.
b. What is the probability that there are 2 or 3
universal donors?
$P(X = 2 \text{ or } 3) = P(X = 2) + P(X = 3)$
$= \binom{20}{2} (0.06)^2 (0.94)^{18} + \binom{20}{3} (0.06)^3 (0.94)^{17}$
$= 0.3106$
About 31% of the time, I’d find 2 or 3
universal donors among the 20 people.
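The numbers in Example 2 can be reproduced with a short Python check. This is a sketch with illustrative variable names, using Binom(20, 0.06) as in the example.

```python
from math import comb, sqrt

n, p = 20, 0.06   # 20 donors, 6% chance each is O-negative
q = 1 - p

# Part a: mean and standard deviation of the number of universal donors
mean_donors = n * p
sd_donors = sqrt(n * p * q)

# Part b: P(X = 2) + P(X = 3)
prob_2_or_3 = sum(comb(n, x) * p ** x * q ** (n - x) for x in (2, 3))

print(round(mean_donors, 2), round(sd_donors, 2), round(prob_2_or_3, 4))
```

The three printed values should match the 1.2, 1.06, and 0.3106 worked out above.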
TI Tips
Your calculator can also calculate Binomial
probabilities.
2nd, Vars
A: binompdf( for individual outcomes
B: binomcdf( for the probability of getting x
or fewer successes in n trials
For both calculations, enter n, p, x:
n = # of trials
p = probability of a success
x = # of successes
The Normal Model to the Rescue!
Suppose the Tennessee Red Cross anticipates
the need for at least 1850 units of O-negative
blood this year. It estimates that it will collect
blood from 32,000 donors. How great is the
risk that the Tennessee Red Cross will fall
short of meeting its need?
While we have calculators to do this, it would
be insane to try to calculate this by hand.
The Success/Failure Condition
A Binomial model is approximately Normal
if we expect at least 10 successes and 10
failures:
$np \geq 10$ and $nq \geq 10$
If this condition holds, we can use the
Normal model to approximate probabilities
(like the Tennessee Red Cross example).
Tennessee “Normalized”
Let’s readdress this example with the Normal model.
$\mu = np = 1920$
$\sigma = \sqrt{npq} = 42.48$
$z = \frac{x - \mu}{\sigma} = \frac{1850 - 1920}{42.48} = -1.65$
$P(z < -1.65) = 0.05$
The Tennessee Red Cross has about a 5% chance of
running short on O-Negative blood.
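The same Normal approximation can be sketched in Python. This assumes Python 3.8 or later for `statistics.NormalDist`, uses p = 0.06 from the blood-donor examples, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, p = 32_000, 0.06   # expected donors and the O-negative rate
q = 1 - p

# Success/Failure Condition: at least 10 expected successes and failures
assert n * p >= 10 and n * q >= 10

mu = n * p
sigma = sqrt(n * p * q)
z = (1850 - mu) / sigma

# Probability of falling below 1850 units under the standard Normal model
shortfall_risk = NormalDist().cdf(z)

print(round(mu), round(sigma, 2), round(z, 2), round(shortfall_risk, 2))
# prints: 1920 42.48 -1.65 0.05
```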
Continuous Random Variables
There’s a problem with approximating a Binomial
model with a Normal model:
• The Binomial model is discrete
• The Normal model is continuous

While we can use the Normal model to
approximate probabilities for intervals of values, we cannot
use it to find the probability of a single particular value
(for example, the probability of the Red Cross getting
exactly 1850 O-negative blood donors).
Example 4
As noted a few chapters ago, the Pew Research
Center reports that they are actually able to
contact only 76% of randomly selected
households drawn for a telephone survey.
a.
Explain why these phone calls can be
considered Bernoulli trials.
There are two outcomes (contact, no
contact), the probability of contact is 0.76,
and random calls should be independent.
Example 4
b.
Which of the models of this chapter
(Geometric, Binomial, or Normal) would you
use to model the number of successful
contacts from a list of 1000 sampled
households? Explain.
Binomial, with n = 1000 and p = 0.76.
For actual calculations, we could
approximate using a Normal model with
$\mu = np = 1000(0.76) = 760$ and
$\sigma = \sqrt{npq} = \sqrt{1000(0.76)(0.24)} = 13.5$
Example 4
c.
Pew further reports that even after they contacted a
household, only 38% agree to be interviewed, so
the probability of getting a completed interview for a
randomly selected household is only 0.29. Which
of the models of this chapter would you use to
model the number of households Pew has to call
before they get the first completed interview?
Geometric, with p = 0.29
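The arithmetic behind parts b and c can be checked with a short Python sketch. The variable names are illustrative, and it assumes the 0.29 in part c comes from multiplying the contact rate by the agreement rate.

```python
from math import sqrt

n = 1000           # sampled households
p_contact = 0.76   # chance of contacting a household
p_agree = 0.38     # chance a contacted household agrees to an interview

# Part b: Binomial parameters and the Success/Failure Condition
mean_contacts = n * p_contact
sd_contacts = sqrt(n * p_contact * (1 - p_contact))
normal_ok = n * p_contact >= 10 and n * (1 - p_contact) >= 10

# Part c: probability of a completed interview for a random household
p_complete = p_contact * p_agree

print(round(mean_contacts), round(sd_contacts, 1), normal_ok, round(p_complete, 2))
# prints: 760 13.5 True 0.29
```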
What Can Go Wrong?

Be sure you have Bernoulli trials.

Don’t confuse Geometric and Binomial
models.

Don’t use the Normal approximation with
small n.