The Normal Distribution

Chapter 5: Normal Probability Distributions
Elementary Statistics, Larson and Farber
Section 5.1
Introduction to Normal Distributions
Properties of a Normal Distribution

• The mean, median, and mode are equal.
• The curve is bell shaped and symmetric about the mean.
• The total area under the curve is 1 (100%).
Properties of a Normal Distribution
[Figure: normal curve over the x-axis with an inflection point marked on each side of the mean]
• As the curve extends farther and farther away from the mean, it gets
closer and closer to the x-axis but never touches it.
• The points at which the curvature changes are called inflection
points. The graph curves downward between the inflection points and
curves upward beyond them, to the left and to the right.
Means and Standard Deviations
Curves with different means, same standard deviation
[Figure: normal curves along an x-axis marked from 10 to 20]
Curves with different means, different standard deviations
[Figure: normal curves along an x-axis marked from 9 to 22]
Empirical Rule
• About 68% of the area lies within 1 standard deviation of the mean.
• About 95% of the area lies within 2 standard deviations of the mean.
• About 99.7% of the area lies within 3 standard deviations of the mean.
[Figure: normal curve with the x-axis marked from \(\mu - 3\sigma\) to \(\mu + 3\sigma\)]
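The three percentages can be checked numerically. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `area_within` is illustrative and not part of the original slides.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def area_within(k):
    """Area under the standard normal curve between -k and +k."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 2, 3):
    # Prints roughly 0.6827, 0.9545, and 0.9973.
    print(f"within {k} standard deviation(s): {area_within(k):.4f}")
```

The rule's 68%, 95%, and 99.7% are rounded versions of these areas.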
Determining Intervals
An instruction manual claims that the assembly time for a
product is normally distributed with a mean of 4.2 hours
and a standard deviation of 0.3 hours. Determine the
interval in which 95% of the assembly times fall.
\(\mu = 4.2\) hrs, \(\sigma = 0.3\) hrs
[Figure: normal curve with the x-axis marked at 3.3, 3.6, 3.9, 4.2, 4.5, 4.8, 5.1, corresponding to z = -3 through z = 3]
About 95% of the data fall within 2 standard deviations of the mean.
4.2 - 2(0.3) = 3.6 and 4.2 + 2(0.3) = 4.8
About 95% of the assembly times will be between 3.6 and 4.8 hrs.
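As a rough check on the interval above, the sketch below computes the empirical-rule interval and, for comparison, the exact central 95% interval of the same normal model. The function name `empirical_interval` is an assumption made for illustration.

```python
from statistics import NormalDist


def empirical_interval(mean, std_dev, k):
    """Interval within k standard deviations of the mean."""
    return mean - k * std_dev, mean + k * std_dev


# Assembly-time example: mean 4.2 hrs, standard deviation 0.3 hrs.
low, high = empirical_interval(4.2, 0.3, 2)
print(f"Empirical rule (k = 2): {low:.2f} to {high:.2f} hrs")

# Exact central 95% interval of the same normal distribution.
assembly = NormalDist(mu=4.2, sigma=0.3)
exact_low = assembly.inv_cdf(0.025)
exact_high = assembly.inv_cdf(0.975)
print(f"Exact central 95%:      {exact_low:.2f} to {exact_high:.2f} hrs")
```

The exact interval is slightly narrower because the central 95% corresponds to about 1.96 standard deviations, which the empirical rule rounds to 2.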
Section 5.2
The Standard Normal Distribution
The Standard Score
The standard score, or z-score, represents the number of
standard deviations a random variable x falls from the mean.
\(z = \frac{\text{value} - \text{mean}}{\text{standard deviation}} = \frac{x - \mu}{\sigma}\)
The test scores for a civil service exam are normally distributed
with a mean of 152 and standard deviation of 7. Find the
standard z-score for a person with a score of:
(a) 161
(b) 148
(c) 152
(a) \(z = \frac{161 - 152}{7} \approx 1.29\)
(b) \(z = \frac{148 - 152}{7} \approx -0.57\)
(c) \(z = \frac{152 - 152}{7} = 0\)
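The standard-score formula translates directly into code. This is a minimal sketch; the function name `z_score` is chosen here for illustration.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std_dev


# Civil service exam: mean 152, standard deviation 7.
for score in (161, 148, 152):
    print(score, round(z_score(score, 152, 7), 2))
```

A positive result means the score is above the mean, a negative result means it is below, and zero means it equals the mean.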
The Standard Normal Distribution
The standard normal distribution has a mean of 0 and a
standard deviation of 1.
Using z-scores, any normal distribution can be transformed into
the standard normal distribution.
[Figure: standard normal curve with the z-axis marked from -4 to 4]
Cumulative Areas
The total area under the curve is one.
[Figure: standard normal curve with the z-axis marked from -3 to 3]
• The cumulative area is close to 0 for z-scores close to -3.49.
• The cumulative area for z = 0 is 0.5000.
• The cumulative area is close to 1 for z-scores close to 3.49.
Cumulative Areas
Find the cumulative area for a z-score of -1.25.
[Figure: standard normal curve with the area 0.1056 shaded to the left of z = -1.25]
Read down the z column on the left to z = -1.2 and across to the
column under .05. The value in the cell is 0.1056, the cumulative
area.
The probability that z is at most -1.25 is 0.1056.
\(P(z \le -1.25) = 0.1056\)
Finding Probabilities
To find the probability that z is less than a given value, read the
cumulative area in the table corresponding to that z-score.
Find P( z < -1.45)
[Figure: standard normal curve with the area to the left of z = -1.45 shaded]
Read down the z-column to -1.4 and across to .05. The cumulative
area is 0.0735.
P ( z < -1.45) = 0.0735
Finding Probabilities
To find the probability that z is greater than a given
value, subtract the cumulative area in the table from 1.
Find P( z > -1.24)
[Figure: standard normal curve with area 0.1075 to the left of z = -1.24 and the required area 0.8925 to the right]
The cumulative area (area to the left) is 0.1075. So the area to the
right is 1 - 0.1075 = 0.8925.
P( z > -1.24) = 0.8925
Finding Probabilities
To find the probability z is between two given values, find the
cumulative areas for each and subtract the smaller area from the
larger.
Find P( -1.25 < z < 1.17)
[Figure: standard normal curve with the area between z = -1.25 and z = 1.17 shaded]
1. P(z < 1.17) = 0.8790
2. P(z < -1.25) = 0.1056
3. P(-1.25 < z < 1.17) = 0.8790 - 0.1056 = 0.7734
Summary
• To find the probability that z is less than a given value, read the
corresponding cumulative area.
• To find the probability that z is greater than a given value,
subtract the cumulative area in the table from 1.
• To find the probability that z is between two given values, find the
cumulative areas for each and subtract the smaller area from the
larger.
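The three rules summarized above can be sketched in code, with `statistics.NormalDist().cdf` standing in for the printed table. The helper names `p_less`, `p_greater`, and `p_between` are assumptions made for illustration.

```python
from statistics import NormalDist

standard = NormalDist()  # standard normal: mean 0, standard deviation 1


def p_less(z):
    """P(Z < z): the cumulative area to the left of z."""
    return standard.cdf(z)


def p_greater(z):
    """P(Z > z): one minus the cumulative area."""
    return 1 - standard.cdf(z)


def p_between(a, b):
    """P(a < Z < b): larger cumulative area minus the smaller one."""
    low, high = min(a, b), max(a, b)
    return standard.cdf(high) - standard.cdf(low)


print(f"P(z < -1.45)        = {p_less(-1.45):.4f}")
print(f"P(z > -1.24)        = {p_greater(-1.24):.4f}")
print(f"P(-1.25 < z < 1.17) = {p_between(-1.25, 1.17):.4f}")
```

Because the table rounds each area to four decimals before subtracting, the last digit of a computed result can differ by one from a table-based answer.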
Section 5.3
Normal Distributions: Finding Probabilities
Probabilities and Normal Distributions
If a random variable x is normally distributed, the probability
that x will fall within an interval is equal to the area under the
curve in the interval.
IQ scores are normally distributed with a mean of 100 and
standard deviation of 15. Find the probability that a person
selected at random will have an IQ score less than 115.
[Figure: normal curve centered at 100 with the area to the left of x = 115 shaded]
To find the area in this interval, first find the standard score
equivalent to x = 115.
\(z = \frac{115 - 100}{15} = 1\)
Probabilities and Normal Distributions
Normal Distribution: \(\mu = 100\), \(\sigma = 15\). Find P(x < 115).
Standard Normal Distribution: \(\mu = 0\), \(\sigma = 1\). Find P(z < 1).
[Figure: the two curves side by side; the shaded areas are the same]
P(z < 1) = 0.8413, so P(x < 115) = 0.8413
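The equality of the two areas can be checked numerically. This is a minimal sketch using `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)
standard = NormalDist()  # mu = 0, sigma = 1

z = (115 - iq.mean) / iq.stdev  # standard score for x = 115
p_x = iq.cdf(115)               # area to the left of x = 115
p_z = standard.cdf(z)           # area to the left of z = 1

print(z, round(p_x, 4), round(p_z, 4))
```

Both areas agree, which is why a single standard normal table is enough for every normal distribution.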
Application
Monthly utility bills in a certain city are normally distributed
with a mean of $100 and a standard deviation of $12. A utility
bill is randomly selected. Find the probability it is between $80
and $115.
Normal Distribution: \(\mu = 100\), \(\sigma = 12\)
\(z = \frac{80 - 100}{12} \approx -1.67\)
\(z = \frac{115 - 100}{12} = 1.25\)
P(80 < x < 115) = P(-1.67 < z < 1.25)
= 0.8944 - 0.0475 = 0.8469
The probability a utility bill is between
$80 and $115 is 0.8469.
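The utility bill example above can be reproduced in two ways: directly on the dollar scale, or table-style with each z-score rounded to two decimals first. The sketch assumes only the mean and standard deviation given in the problem; the variable names are illustrative.

```python
from statistics import NormalDist

bills = NormalDist(mu=100, sigma=12)

# Direct computation on the original scale, with no rounding of z.
p_direct = bills.cdf(115) - bills.cdf(80)

# Table-style computation: round each z-score to two decimals first.
standard = NormalDist()
z_low = round((80 - 100) / 12, 2)
z_high = round((115 - 100) / 12, 2)
p_table_style = standard.cdf(z_high) - standard.cdf(z_low)

print(f"direct:      {p_direct:.4f}")
print(f"table-style: {p_table_style:.4f}")
```

The two results differ slightly in the later decimals because rounding z = -1.666... to -1.67 shifts the lower boundary a little.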
Section 5.4
Normal Distributions: Finding Values
From Areas to z-scores
Find the z-score corresponding to a cumulative area of 0.9803.
[Figure: standard normal curve with cumulative area 0.9803 shaded to the left of z]
Locate 0.9803 in the area portion of the table. Read the values at
the beginning of the corresponding row and at the top of the
column. The z-score is 2.06.
z = 2.06 corresponds roughly to the 98th percentile.
Finding z-scores From Areas
Find the z-score corresponding to the 90th percentile.
[Figure: standard normal curve with cumulative area .90 to the left of z]
The closest table area is .8997. The row heading is 1.2
and column heading .08. This corresponds to z = 1.28.
A z-score of 1.28 corresponds to the 90th percentile.
Finding z-scores From Areas
Find the z-score with an area of .60 falling to its right.
[Figure: standard normal curve with area .40 to the left of z and area .60 to the right]
With .60 to the right, cumulative area is .40. The
closest area is .4013. The row heading is –0.2 and
column heading is .05. The z-score is –0.25.
A z-score of –0.25 has an area of .60 to its right.
It also corresponds to the 40th percentile.
Finding z-scores From Areas
Find the z-score such that 45% of the area under the curve
falls between –z and z.
[Figure: standard normal curve with area .45 between -z and z and area .275 in each tail]
The area remaining in the tails is .55. Half of this area is in each
tail, so .55/2 = .275 is the cumulative area for the negative z value,
and .275 + .45 = .725 is the cumulative area for the positive z value.
The closest table area to .275 is .2743, so the negative z-score is
-0.60. By symmetry, the positive z-score is 0.60.
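Reading the table "backwards" from an area to a z-score corresponds to the inverse cumulative distribution function. The sketch below uses `NormalDist.inv_cdf` from the Python standard library; the wrapper name `z_for_cumulative_area` is an assumption made for illustration.

```python
from statistics import NormalDist

standard = NormalDist()


def z_for_cumulative_area(area):
    """z-score whose area to the left equals `area` (0 < area < 1)."""
    return standard.inv_cdf(area)


# Cumulative area of 0.9803.
print(round(z_for_cumulative_area(0.9803), 2))

# 90th percentile.
print(round(z_for_cumulative_area(0.90), 2))

# Area .60 to the right means cumulative area .40 to the left.
print(round(z_for_cumulative_area(1 - 0.60), 2))

# 45% of the area between -z and z: split the remaining area into two tails.
tail = (1 - 0.45) / 2
print(round(z_for_cumulative_area(tail), 2),
      round(z_for_cumulative_area(tail + 0.45), 2))
```

Rounded to two decimals, these agree with the z-scores read from the table in the four examples above.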
From z-Scores to Raw Scores
To find the data value x when given a standard score z:
\(x = \mu + z\sigma\)
The test scores for a civil service exam are normally distributed
with a mean of 152 and standard deviation of 7. Find the test score
for a person with a standard score of
(a) 2.33
(b) -1.75
(c) 0
(a) x = 152 + (2.33)(7) = 168.31
(b) x = 152 + ( -1.75)(7) = 139.75
(c) x = 152 + (0)(7) = 152
Finding Percentiles or Cut-off values
Monthly utility bills in a certain city are normally distributed with
a mean of $100 and a standard deviation of $12. What is the
smallest utility bill that can be in the top 10% of the bills?
[Figure: standard normal curve with 90% of the area to the left of z and 10% to the right]
Find the cumulative area in the table that is closest to 0.9000 (the
90th percentile). The area 0.8997 corresponds to a z-score of 1.28.
To find the corresponding x-value, use \(x = \mu + z\sigma\):
x = 100 + 1.28(12) = 115.36
$115.36 is the smallest value for the top 10%.
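Converting a standard score back to a data value, and finding a cut-off value, can be sketched as follows. The function name `raw_score` is illustrative; `NormalDist.inv_cdf` is used only to show what the cut-off looks like when z is not rounded to two decimals.

```python
from statistics import NormalDist


def raw_score(z, mean, std_dev):
    """Convert a standard score to a data value: x = mean + z * std_dev."""
    return mean + z * std_dev


# Civil service exam: mean 152, standard deviation 7.
for z in (2.33, -1.75, 0):
    print(z, round(raw_score(z, 152, 7), 2))

# Utility bills: mean $100, standard deviation $12, cut-off for the top 10%.
table_cutoff = raw_score(1.28, 100, 12)           # z read from the table
exact_cutoff = NormalDist(100, 12).inv_cdf(0.90)  # z not rounded
print(round(table_cutoff, 2), round(exact_cutoff, 2))
```

The two cut-offs differ by about two cents, since the table's closest area (0.8997) is not exactly 0.9000.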
Section 5.5
The Central Limit Theorem
Sampling Distributions
A sampling distribution is the probability distribution of a sample
statistic that is formed when samples of size n are repeatedly taken
from a population. If the sample statistic is the sample mean, then
the distribution is the sampling distribution of sample means.
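[Figure: a population from which six samples are drawn, each with its own sample mean \(\bar{x}_1, \ldots, \bar{x}_6\)]
The sampling distribution consists of the values of the sample means,
\(\bar{x}_1, \bar{x}_2, \bar{x}_3, \bar{x}_4, \bar{x}_5, \bar{x}_6, \ldots\)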
The Central Limit Theorem
If samples of size \(n \ge 30\) are taken from a population with any
type of distribution that has mean \(\mu\) and standard deviation
\(\sigma\), the sample means will have an approximately normal
distribution with mean
\(\mu_{\bar{x}} = \mu\)
and standard deviation
\(\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\)
[Figure: dot plot of sample means forming a bell shape centered at \(\mu\)]
The Central Limit Theorem
If samples of any size n are taken from a population with a normal
distribution with mean \(\mu\) and standard deviation \(\sigma\), the
distribution of the sample means will be normal with mean
\(\mu_{\bar{x}} = \mu\)
and standard deviation
\(\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\)
[Figure: dot plot of sample means forming a bell shape centered at \(\mu\)]
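A small simulation can illustrate the theorem. The sketch below is an illustration under stated assumptions, not part of the original slides: the population is taken to be exponential with rate 1 (strongly skewed, with mean 1 and standard deviation 1), and the sample size of 36 and the 5000 repetitions are arbitrary choices.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

n = 36              # sample size (at least 30)
num_samples = 5000  # number of samples drawn

# Assumed population: exponential with rate 1 (mean 1, standard deviation 1).
sample_means = [
    statistics.fmean(random.expovariate(1.0) for _ in range(n))
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)

print(f"mean of sample means:    {mean_of_means:.3f} (theory: 1.000)")
print(f"std dev of sample means: {sd_of_means:.3f} (theory: {1 / n ** 0.5:.3f})")
```

Even though the population is far from normal, the simulated sample means should center near the population mean with a spread near the population standard deviation divided by the square root of n.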
Application
The mean height of American men (ages 20-29) is \(\mu = 69.2\) inches,
with standard deviation \(\sigma = 2.9\) inches. Random samples of 60
such men are selected. Find the mean and standard deviation (standard
error) of the sampling distribution.
The distribution of means of samples of size 60 will be approximately
normal.
[Figure: dot plot of sample means forming a bell shape centered at 69.2]
Mean: \(\mu_{\bar{x}} = \mu = 69.2\)
Standard deviation: \(\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{2.9}{\sqrt{60}} \approx 0.3744\)
Interpreting the Central Limit Theorem
The mean height of American men (ages 20-29) is \(\mu = 69.2\) inches.
If a random sample of 60 men in this age group is selected, what
is the probability that the mean height for the sample is greater
than 70 inches? Assume the standard deviation is 2.9 inches.
Since n > 30, the sampling distribution of \(\bar{x}\) will be
approximately normal, with
mean \(\mu_{\bar{x}} = 69.2\)
standard deviation \(\sigma_{\bar{x}} = \frac{2.9}{\sqrt{60}} \approx 0.3744\)
Find the z-score for a sample mean of 70:
\(z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \frac{70 - 69.2}{0.3744} \approx 2.14\)
Interpreting the Central Limit Theorem
[Figure: standard normal curve with the area to the right of z = 2.14 shaded]
\(P(\bar{x} > 70)\)
= P(z > 2.14)
= 1 - 0.9838
= 0.0162
There is a 0.0162 probability that a sample of 60 men
will have a mean height greater than 70 inches.
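The height example above can be sketched in a few lines. The variable names are illustrative; the numbers are those given in the problem.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 69.2, 2.9, 60

standard_error = sigma / sqrt(n)     # standard deviation of the sample means
z = (70 - mu) / standard_error       # z-score for a sample mean of 70
p_greater = 1 - NormalDist().cdf(z)  # area to the right of z

print(f"standard error: {standard_error:.4f}")
print(f"z-score:        {z:.2f}")
print(f"P(mean > 70):   {p_greater:.4f}")
```

The computed probability can differ from the table-based 0.0162 in the last decimal, because the code does not round z to 2.14 before looking up the area.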
Application Central Limit Theorem
During a certain week the mean price of gasoline in California was $1.164
per gallon. What is the probability that the mean price for a sample of 38
gas stations in California is between $1.169 and $1.179? Assume the
standard deviation is $0.049.
Since n > 30, the sampling distribution of \(\bar{x}\) will be
approximately normal, with
mean \(\mu_{\bar{x}} = \mu = 1.164\)
standard deviation \(\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{0.049}{\sqrt{38}} \approx 0.0079\)
Calculate the standard z-score for sample means of $1.169 and $1.179.
\(z = \frac{1.169 - 1.164}{0.0079} \approx 0.63\)
\(z = \frac{1.179 - 1.164}{0.0079} \approx 1.90\)
Application Central Limit Theorem
P( 0.63 < z < 1.90)
= 0.9713 - 0.7357
= 0.2356
[Figure: standard normal curve with the area between z = 0.63 and z = 1.90 shaded]
The probability is 0.2356 that the mean for the sample is
between $1.169 and $1.179.
Section 5.6
Normal Approximation to the Binomial
Binomial Distribution Characteristics
• There are a fixed number of independent trials, n.
• Each trial has 2 outcomes, success or failure.
• The probability of success on a single trial is p and the
probability of failure is q, where p + q = 1.
• We can find the probability of exactly x successes out of n trials,
where x = 0, 1, 2, …, n. Here x is a discrete random variable
representing a count of the number of successes in n trials.
• \(\mu = np\) and \(\sigma = \sqrt{npq}\)
Application
34% of Americans have type A+ blood. If 500 Americans are
sampled at random, what is the probability at least 300 have type A+
blood?
Using the techniques of Chapter 4, you could calculate the
probabilities that exactly 300, exactly 301, …, exactly 500 Americans
have type A+ blood and add them.
Alternatively, you could use normal curve probabilities to approximate
the binomial probabilities.
If \(np \ge 5\) and \(nq \ge 5\), the binomial random variable x is
approximately normally distributed with mean \(\mu = np\) and standard
deviation \(\sigma = \sqrt{npq}\).
Why do we require \(np \ge 5\) and \(nq \ge 5\)?
[Figure: three binomial probability histograms for p = 0.25, q = 0.75]
• n = 5: np = 1.25, nq = 3.75 (x from 0 to 5)
• n = 20: np = 5, nq = 15 (x from 0 to 20)
• n = 50: np = 12.5, nq = 37.5 (x from 0 to 50)
As n increases, the histogram becomes more symmetric and bell shaped.
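One way to see the effect of the np and nq conditions is to measure how lopsided each histogram is. The sketch below computes the skewness of the three binomial distributions directly from their probabilities; using skewness as the measure is a choice made here for illustration, not something stated in the slides.

```python
from math import comb


def binomial_pmf(n, p):
    """Probabilities of 0, 1, ..., n successes in n trials."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


def skewness(n, p):
    """Skewness of the binomial distribution, computed from its pmf."""
    pmf = binomial_pmf(n, p)
    mean = sum(k * pk for k, pk in enumerate(pmf))
    variance = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))
    third_moment = sum((k - mean) ** 3 * pk for k, pk in enumerate(pmf))
    return third_moment / variance**1.5


for n in (5, 20, 50):
    print(f"n = {n:2d}: np = {n * 0.25}, nq = {n * 0.75}, "
          f"skewness = {skewness(n, 0.25):.3f}")
```

A symmetric bell curve has skewness 0, so the shrinking values show the histogram moving toward the normal shape as n grows.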
Binomial Probabilities
The binomial distribution is discrete and can be graphed as a
probability histogram. The probability that a specific value of x will
occur is equal to the area of the rectangle with midpoint at x.
If n = 50 and p = 0.25, find \(P(14 \le x \le 16)\).
Add the areas of the rectangles with midpoints at
x = 14, x = 15, and x = 16.
[Figure: probability histogram with rectangles at x = 14, 15, 16 having areas 0.111, 0.089, 0.065]
\(P(14 \le x \le 16)\) = 0.111 + 0.089 + 0.065 = 0.265
Correction for Continuity
Use the normal approximation to the binomial to find
\(P(14 \le x \le 16)\) if n = 50 and p = 0.25.
Check that \(np = 12.5 \ge 5\) and \(nq = 37.5 \ge 5\).
Values for the binomial random variable x are
14, 15, and 16.
The interval of values under the normal curve is \(13.5 \le x \le 16.5\).
To ensure the boundaries of each rectangle are included in
the interval, subtract 0.5 from the left-hand boundary and
add 0.5 to the right-hand boundary.
Normal Approximation to the Binomial
Use the normal approximation to the binomial to find
\(P(14 \le x \le 16)\) if n = 50 and p = 0.25.
Find the mean and standard deviation using the binomial
distribution formulas.
\(\mu = np = 50(.25) = 12.5\)
\(\sigma = \sqrt{npq} = \sqrt{50(.25)(.75)} \approx 3.062\)
Adjust the endpoints to correct for continuity: \(P(13.5 \le x \le 16.5)\)
Convert each endpoint to a standard score.
\(z = \frac{13.5 - 12.5}{3.062} \approx 0.33\)
\(z = \frac{16.5 - 12.5}{3.062} \approx 1.31\)
\(P(0.33 \le z \le 1.31)\) = 0.9049 - 0.6293 = 0.2756
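To see how close the approximation is, the sketch below computes the exact binomial probability with `math.comb` and the normal approximation with the continuity correction. The variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 50, 0.25
q = 1 - p

# Exact binomial probability: add the rectangles at x = 14, 15, 16.
exact = sum(comb(n, k) * p**k * q ** (n - k) for k in range(14, 17))

# Normal approximation with the continuity correction (13.5 to 16.5).
approx_dist = NormalDist(mu=n * p, sigma=sqrt(n * p * q))
approx = approx_dist.cdf(16.5) - approx_dist.cdf(13.5)

print(f"exact binomial:       {exact:.4f}")
print(f"normal approximation: {approx:.4f}")
```

The two values agree to about one hundredth, which is typical when np and nq are comfortably above 5.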
Application
A survey of Internet users found that 75% favored government
regulations on “junk” e-mail. If 200 Internet users are randomly
selected, find the probability that fewer than 140 are in favor of
government regulation.
Since \(np = 150 \ge 5\) and \(nq = 50 \ge 5\), use the normal
approximation to the binomial.
\(\mu = np = 200(.75) = 150\)
\(\sigma = \sqrt{npq} = \sqrt{200(.75)(.25)} \approx 6.1237\)
The binomial phrase “fewer than 140” means
0, 1, 2, 3, …, 139.
Use the correction for continuity to translate to the continuous
variable in the interval \((-\infty, 139.5)\). Find P(x < 139.5).
Use the correction for continuity: P(x < 139.5)
\(z = \frac{139.5 - 150}{6.1237} \approx -1.71\)
P(z < -1.71) = 0.0436
The probability that fewer than 140 are in favor of government
regulation is approximately 0.0436.
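The survey example can be checked the same way. This sketch translates "fewer than 140" into the corrected bound 139.5 and compares the normal approximation with the exact binomial sum; the variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 200, 0.75
q = 1 - p

mu = n * p
sigma = sqrt(n * p * q)

# "Fewer than 140" means x = 0, 1, ..., 139, so the corrected bound is 139.5.
approx = NormalDist(mu, sigma).cdf(139.5)

# Exact binomial probability of 139 or fewer successes.
exact = sum(comb(n, k) * p**k * q ** (n - k) for k in range(140))

print(f"normal approximation: {approx:.4f}")
print(f"exact binomial:       {exact:.4f}")
```

The approximation is computed without rounding z, so it can differ slightly from the table-based 0.0436.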