Department of Statistics, Purdue University

4 Continuous Random Variables and Probability Distributions
Copyright © Cengage Learning. All rights reserved.
4.1 Probability Density Functions
Probability Density Functions
A discrete random variable (rv) is one whose possible
values either constitute a finite set or else can be listed in
an infinite sequence (a list in which there is a first element,
a second element, etc.).
A random variable whose set of possible values is an entire
interval of numbers is not discrete.
Probability Density Functions
Recall from Chapter 3 that a random variable X is
continuous if
(1) possible values comprise either a single interval on the
number line (for some A < B, any number x between A
and B is a possible value) or a union of disjoint intervals,
and
(2) P(X = c) = 0 for any number c that is a possible value of
X.
Probability Distributions for Continuous Variables
Definition
Let X be a continuous rv. Then a probability distribution or probability density function (pdf) of X is a function f(x) such that for any two numbers a and b with a ≤ b,
P(a ≤ X ≤ b) = ∫ from a to b of f(x) dx
That is, the probability that X takes on a value in the interval [a, b] is the area above this interval and under the graph of the density function.
Probability Distributions for Continuous Variables
P(a ≤ X ≤ b) = the area under the density curve between a and b
Figure 4.2
For f(x) to be a legitimate pdf, it must satisfy the following two conditions:
1. f(x) ≥ 0 for all x
2. ∫ from −∞ to ∞ of f(x) dx = area under the entire graph of f(x) = 1
Example 4.4
The direction of an imperfection with respect to a reference
line on a circular object such as a tire, brake rotor, or
flywheel is, in general, subject to uncertainty.
Consider the reference line connecting the valve stem on a tire to the center point, and let X be the angle measured clockwise to the location of an imperfection. One possible pdf for X is
f(x) = 1/360 for 0 ≤ x < 360, and f(x) = 0 otherwise
The pdf is graphed in Figure 4.3.
The pdf and probability from Example 4.4
Figure 4.3
Clearly f(x) ≥ 0. The area under the density curve is just the area of a rectangle:
(height)(base) = (1/360)(360) = 1
The probability that the angle is between 90° and 180° is
P(90 ≤ X ≤ 180) = ∫ from 90 to 180 of (1/360) dx = 90/360 = .25
The probability that the angle of occurrence is within 90° of the reference line is
P(0 ≤ X ≤ 90) + P(270 ≤ X < 360) = .25 + .25 = .50
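The interval probabilities above can be checked with a short sketch; the helper name `angle_prob` is ours, not the text's, and it simply uses the fact that uniform probabilities reduce to interval width over 360.

```python
# Sketch (not from the text): for f(x) = 1/360 on [0, 360),
# P(a <= X <= b) is just the interval width divided by 360.
def angle_prob(a, b):
    """P(a <= X <= b) for the imperfection angle X, uniform on [0, 360)."""
    return (b - a) / 360

p_quarter = angle_prob(90, 180)                    # a 90-degree slice
p_within_90 = angle_prob(0, 90) + angle_prob(270, 360)
```

Both values agree with the text: .25 for the quarter-circle and .50 for angles within 90° of the reference line.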
Probability Distributions for Continuous Variables
Because P(a ≤ X ≤ b) depends only on the width b − a of the interval whenever 0 ≤ a ≤ b ≤ 360 in Example 4.4, X is said to have a uniform distribution.
Definition
A continuous rv X is said to have a uniform distribution on the interval [A, B] if the pdf of X is
f(x; A, B) = 1/(B − A) for A ≤ x ≤ B, and f(x; A, B) = 0 otherwise
Probability Distributions for Continuous Variables
When X is a discrete random variable, each possible value
is assigned positive probability.
This is not true of a continuous random variable (that is, the second condition of the definition is satisfied), because the area under a density curve that lies above any single value is zero:
P(X = c) = ∫ from c to c of f(x) dx = lim as ε → 0 of ∫ from c − ε to c + ε of f(x) dx = 0
Probability Distributions for Continuous Variables
The fact that P(X = c) = 0 when X is continuous has an
important practical consequence: The probability that X lies
in some interval between a and b does not depend on
whether the lower limit a or the upper limit b is included in
the probability calculation:
P(a ≤ X ≤ b) = P(a < X < b) = P(a < X ≤ b) = P(a ≤ X < b)
(4.1)
If X is discrete and both a and b are possible values (e.g.,
X is binomial with n = 20 and a = 5, b = 10), then all four of
the probabilities in (4.1) are different.
Example 4.5
“Time headway” in traffic flow is the elapsed time between
the time that one car finishes passing a fixed point and the
instant that the next car begins to pass that point.
Let X = the time headway for two randomly chosen
consecutive cars on a freeway during a period of heavy
flow. The following pdf of X is essentially the one suggested
in “The Statistical Properties of Freeway Traffic” (Transp.
Res., vol. 11: 221–228):
f(x) = .15e^(−.15(x − .5)) for x ≥ .5, and f(x) = 0 otherwise
The graph of f(x) is given in Figure 4.4; there is no density
associated with headway times less than .5, and headway
density decreases rapidly (exponentially fast) as x
increases from .5.
The density curve for time headway in Example 4.5
Figure 4.4
Clearly, f(x) ≥ 0; to show that ∫ f(x) dx = 1, we use the calculus result ∫ from a to ∞ of e^(−kx) dx = (1/k)e^(−k·a). Then
∫ f(x) dx = ∫ from .5 to ∞ of .15e^(−.15(x − .5)) dx = .15e^(.075) ∫ from .5 to ∞ of e^(−.15x) dx = .15e^(.075) · (1/.15)e^(−(.15)(.5)) = 1
The probability that headway time is at most 5 sec is
P(X ≤ 5) = ∫ from −∞ to 5 of f(x) dx = ∫ from .5 to 5 of .15e^(−.15(x − .5)) dx
= .15e^(.075) ∫ from .5 to 5 of e^(−.15x) dx
= e^(.075)(−e^(−.75) + e^(−.075))
= 1.078(−.472 + .928)
= .491 = P(less than 5 sec) = P(X < 5)
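The headway calculation can be reproduced numerically; the integration helper below is a generic midpoint rule we supply for illustration, not anything from the text.

```python
import math

# Sketch of the Example 4.5 headway model: f(x) = .15 e^{-.15(x - .5)} for x >= .5.
def f(x):
    return 0.15 * math.exp(-0.15 * (x - 0.5)) if x >= 0.5 else 0.0

def integrate(g, a, b, n=100_000):
    # simple midpoint rule; adequate for this smooth, rapidly decaying pdf
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

total = integrate(f, 0.5, 60)       # tail beyond 60 sec is negligible
p_at_most_5 = integrate(f, 0.5, 5)  # should be close to .491
```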
4.2 Cumulative Distribution Functions and Expected Values
The Cumulative Distribution Function
The cumulative distribution function (cdf) F(x) for a discrete rv X gives, for any specified number x, the probability P(X ≤ x).
It is obtained by summing the pmf p(y) over all possible values y satisfying y ≤ x.
The cdf of a continuous rv gives the same probabilities P(X ≤ x) and is obtained by integrating the pdf f(y) between the limits −∞ and x.
The Cumulative Distribution Function
Definition
The cumulative distribution function F(x) for a continuous rv X is defined for every number x by
F(x) = P(X ≤ x) = ∫ from −∞ to x of f(y) dy
For each x, F(x) is the area under the density curve to the left of x.
A pdf and associated cdf
Figure 4.5
Example 4.6
Let X, the thickness of a certain metal sheet, have a
uniform distribution on [A, B].
The density function is shown in Figure 4.6.
The pdf for a uniform distribution
Figure 4.6
For x < A, F(x) = 0, since there is no area under the graph
of the density function to the left of such an x.
For x ≥ B, F(x) = 1, since all the area is accumulated to the left of such an x. Finally, for A ≤ x ≤ B,
F(x) = ∫ from A to x of 1/(B − A) dy = (x − A)/(B − A)
The entire cdf is
F(x) = 0 for x < A; (x − A)/(B − A) for A ≤ x < B; 1 for x ≥ B
The graph of this cdf appears in Figure 4.7.
The cdf for a uniform distribution
Figure 4.7
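The piecewise uniform cdf is easy to express directly; `uniform_cdf` is our own illustrative name.

```python
# Sketch of the uniform [A, B] cdf in its piecewise form.
def uniform_cdf(x, A, B):
    if x < A:
        return 0.0      # no area to the left of A
    if x >= B:
        return 1.0      # all area accumulated
    return (x - A) / (B - A)
```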
Using F(x) to Compute Probabilities
The importance of the cdf here, just as for discrete rv’s, is
that probabilities of various intervals can be computed from
a formula for or table of F(x).
Proposition
Let X be a continuous rv with pdf f(x) and cdf F(x). Then for any number a, P(X > a) = 1 − F(a), and for any two numbers a and b with a < b, P(a ≤ X ≤ b) = F(b) − F(a).
Using F(x) to Compute Probabilities
Figure 4.8 illustrates the second part of this proposition; the
desired probability is the shaded area under the density
curve between a and b, and it equals the difference
between the two shaded cumulative areas.
Computing P(a ≤ X ≤ b) from cumulative probabilities
Figure 4.8
This is different from what is appropriate for a discrete integer-valued random variable (e.g., binomial or Poisson): P(a ≤ X ≤ b) = F(b) − F(a − 1) when a and b are integers.
Example 4.7
Suppose the pdf of the magnitude X of a dynamic load on a
bridge (in newtons) is
For any number x between 0 and 2,
Thus
The graphs of f(x) and F(x) are shown in Figure 4.9.
The pdf and cdf for Example 4.7
Figure 4.9
The probability that the load is between 1 and 1.5 is
P(1 ≤ X ≤ 1.5) = F(1.5) − F(1)
The probability that the load exceeds 1 is
P(X > 1) = 1 − P(X ≤ 1) = 1 − F(1)
Once the cdf has been obtained, any probability involving X can easily be calculated without any further integration.
Obtaining f(x) from F(x)
For X discrete, the pmf is obtained from the cdf by taking
the difference between two F(x) values. The continuous
analog of a difference is a derivative.
The following result is a consequence of the Fundamental
Theorem of Calculus.
Proposition
If X is a continuous rv with pdf f(x) and cdf F(x), then at every x at which the derivative F′(x) exists, F′(x) = f(x).
Example 4.8
When X has a uniform distribution, F(x) is differentiable
except at x = A and x = B, where the graph of F(x) has
sharp corners.
Since F(x) = 0 for x < A and F(x) = 1 for x > B, F′(x) = 0 = f(x) for such x. For A < x < B,
f(x) = F′(x) = d/dx[(x − A)/(B − A)] = 1/(B − A)
Percentiles of a Continuous Distribution
When we say that an individual’s test score was at the 85th
percentile of the population, we mean that 85% of all
population scores were below that score and 15% were
above.
Similarly, the 40th percentile is the score that exceeds 40%
of all scores and is exceeded by 60% of all scores.
Percentiles of a Continuous Distribution
Proposition
Let p be a number between 0 and 1. The (100p)th percentile of the distribution of a continuous rv X, denoted by η(p), is defined by
p = F(η(p)) = ∫ from −∞ to η(p) of f(y) dy        (4.2)
According to Expression (4.2), η(p) is that value on the measurement axis such that 100p% of the area under the graph of f(x) lies to the left of η(p) and 100(1 − p)% lies to the right.
Percentiles of a Continuous Distribution
Thus η(.75), the 75th percentile, is such that the area under the graph of f(x) to the left of η(.75) is .75.
Figure 4.10 illustrates the definition.
The (100p)th percentile of a continuous distribution
Figure 4.10
Example 4.9
The distribution of the amount of gravel (in tons) sold by a
particular construction supply company in a given week is a
continuous rv X with pdf
f(x) = (3/2)(1 − x²) for 0 ≤ x ≤ 1, and f(x) = 0 otherwise
The cdf of sales for any x between 0 and 1 is
F(x) = ∫ from 0 to x of (3/2)(1 − y²) dy = (3/2)(x − x³/3)
The graphs of both f(x) and F(x) appear in Figure 4.11.
The pdf and cdf for Example 4.9
Figure 4.11
The (100p)th percentile of this distribution satisfies the equation
p = F(η(p)) = (3/2)(η(p) − η(p)³/3)
that is,
(η(p))³ − 3η(p) + 2p = 0
For the 50th percentile, p = .5, and the equation to be solved is η³ − 3η + 1 = 0; the solution is η = η(.5) = .347. If the distribution remains the same from week to week, then in the long run 50% of all weeks will result in sales of less than .347 ton and 50% in more than .347 ton.
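The cubic above has no convenient closed-form root, but any percentile can be found numerically by bisection on the cdf; `percentile` is our own illustrative helper.

```python
# Sketch: finding the (100p)th percentile of Example 4.9 by bisection
# on F(x) = (3x - x^3)/2 over [0, 1], where F is increasing.
def F(x):
    return (3 * x - x**3) / 2

def percentile(p, tol=1e-6):
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if F(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

median = percentile(0.5)   # should be near the text's value .347
```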
Percentiles of a Continuous Distribution
Definition
The median of a continuous distribution, denoted by μ̃, is the 50th percentile, so μ̃ satisfies .5 = F(μ̃). That is, half the area under the density curve is to the left of μ̃ and half is to the right of μ̃.
A continuous distribution whose pdf is symmetric—the
graph of the pdf to the left of some point is a mirror image
of the graph to the right of that point—has median equal
to the point of symmetry, since half the area under the
curve lies to either side of this point.
Percentiles of a Continuous Distribution
Figure 4.12 gives several examples. The error in a
measurement of a physical quantity is often assumed to
have a symmetric distribution.
Medians of symmetric distributions
Figure 4.12
Expected Values
For a discrete random variable X, E(X) was obtained by summing x · p(x) over possible X values. Here we replace summation by integration and the pmf by the pdf to get a continuous weighted average.
Definition
The expected or mean value of a continuous rv X with pdf f(x) is
μ_X = E(X) = ∫ from −∞ to ∞ of x · f(x) dx
Example 4.10
The pdf of weekly gravel sales X was
f(x) = (3/2)(1 − x²) for 0 ≤ x ≤ 1, and f(x) = 0 otherwise
So
E(X) = ∫ from 0 to 1 of x · (3/2)(1 − x²) dx = (3/2)(x²/2 − x⁴/4) evaluated from 0 to 1 = 3/8 = .375
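The expected-value integral can be checked numerically with a midpoint rule, a generic device we supply for illustration.

```python
# Sketch: E(X) for the gravel-sales pdf f(x) = 1.5(1 - x^2) on [0, 1],
# approximating the integral of x * f(x) by a midpoint rule.
def f(x):
    return 1.5 * (1 - x**2) if 0 <= x <= 1 else 0.0

n = 100_000
h = 1.0 / n
mean = sum((i + 0.5) * h * f((i + 0.5) * h) for i in range(n)) * h   # near 3/8
```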
Expected Values
When the pdf f(x) specifies a model for the distribution of values in a numerical population, then μ is the population mean, which is the most frequently used measure of population location or center.
Often we wish to compute the expected value of some
function h(X) of the rv X.
If we think of h(X) as a new rv Y, techniques from
mathematical statistics can be used to derive the pdf of Y,
and E(Y) can then be computed from the definition.
Expected Values
Fortunately, as in the discrete case, there is an easier way
to compute E[h(X)].
Proposition
If X is a continuous rv with pdf f(x) and h(X) is any function of X, then
E[h(X)] = μ_h(X) = ∫ from −∞ to ∞ of h(x) · f(x) dx
Example 4.11
Two species are competing in a region for control of a
limited amount of a certain resource.
Let X = the proportion of the resource controlled by species
1 and suppose X has pdf
f(x) = 1 for 0 ≤ x ≤ 1, and f(x) = 0 otherwise
which is a uniform distribution on [0, 1]. (In her book Ecological Diversity, E. C. Pielou calls this the “broken-stick” model for resource allocation, since it is analogous to breaking a stick at a randomly chosen point.)
Then the species that controls the majority of this resource controls the amount
h(X) = max(X, 1 − X) = 1 − X if 0 ≤ X < 1/2; X if 1/2 ≤ X ≤ 1
The expected amount controlled by the species having majority control is then
E[h(X)] = ∫ from −∞ to ∞ of max(x, 1 − x) · f(x) dx = ∫ from 0 to 1 of max(x, 1 − x) · 1 dx
= ∫ from 0 to 1/2 of (1 − x) · 1 dx + ∫ from 1/2 to 1 of x · 1 dx = 3/4
Expected Values
For h(X), a linear function, E[h(X)] = E(aX + b) = aE(X) + b.
In the discrete case, the variance of X was defined as the expected squared deviation from μ and was calculated by summation. Here again integration replaces summation.
Definition
The variance of a continuous random variable X with pdf f(x) and mean value μ is
σ²_X = V(X) = ∫ from −∞ to ∞ of (x − μ)² · f(x) dx = E[(X − μ)²]
The standard deviation (SD) of X is σ_X = √V(X).
Expected Values
The variance and standard deviation give quantitative measures of how much spread there is in the distribution or population of x values.
Again σ is roughly the size of a typical deviation from μ. Computation of σ² is facilitated by using the same shortcut formula employed in the discrete case.
Proposition
V(X) = E(X²) − [E(X)]²
Example 4.12
For weekly gravel sales, we computed E(X) = 3/8. Since
E(X²) = ∫ from −∞ to ∞ of x² · f(x) dx = ∫ from 0 to 1 of x² · (3/2)(1 − x²) dx = (3/2) ∫ from 0 to 1 of (x² − x⁴) dx = 1/5
V(X) = 1/5 − (3/8)² = 19/320 = .059 and σ_X = .244
When h(X) = aX + b, the expected value and variance of h(X) satisfy the same properties as in the discrete case: E[h(X)] = aμ + b and V[h(X)] = a² · σ².
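The shortcut formula V(X) = E(X²) − [E(X)]² can be verified numerically for the gravel-sales pdf; the `moment` helper is our own sketch.

```python
# Sketch of the variance shortcut for f(x) = 1.5(1 - x^2) on [0, 1].
def f(x):
    return 1.5 * (1 - x**2)

def moment(k, n=100_000):
    # midpoint-rule approximation of E(X^k) on [0, 1]
    h = 1.0 / n
    return sum(((i + 0.5) * h) ** k * f((i + 0.5) * h) for i in range(n)) * h

var = moment(2) - moment(1) ** 2    # E(X^2) - [E(X)]^2, near 19/320
sd = var ** 0.5                     # near .244
```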
4.3 The Normal Distribution
The Normal Distribution
The normal distribution is the most important one in all of
probability and statistics. Many numerical populations have
distributions that can be fit very closely by an appropriate
normal curve.
Examples include heights, weights, and other physical
characteristics (the famous 1903 Biometrika article “On the
Laws of Inheritance in Man” discussed many examples of
this sort), measurement errors in scientific experiments,
anthropometric measurements on fossils, reaction times in
psychological experiments, measurements of intelligence
and aptitude, scores on various tests, and numerous
economic measures and indicators.
The Normal Distribution
Definition
A continuous rv X is said to have a normal distribution with parameters μ and σ (or μ and σ²), where −∞ < μ < ∞ and σ > 0, if the pdf of X is
f(x; μ, σ) = (1/(σ√(2π))) e^(−(x − μ)²/(2σ²)) for −∞ < x < ∞        (4.3)
Again e denotes the base of the natural logarithm system and equals approximately 2.71828, and π represents the familiar mathematical constant with approximate value 3.14159.
The Normal Distribution
The statement that X is normally distributed with parameters μ and σ² is often abbreviated X ~ N(μ, σ²).
Clearly f(x; μ, σ) ≥ 0, but a somewhat complicated calculus argument must be used to verify that ∫ from −∞ to ∞ of f(x; μ, σ) dx = 1.
It can be shown that E(X) = μ and V(X) = σ², so the parameters are the mean and the standard deviation of X.
The Normal Distribution
Figure 4.13 presents graphs of f(x; μ, σ) for several different (μ, σ) pairs.
Two different normal density curves
Figure 4.13(a)
Visualizing  and  for a normal
distribution
Figure 4.13(b)
The Normal Distribution
Each density curve is symmetric about  and bell-shaped,
so the center of the bell (point of symmetry) is both the
mean of the distribution and the median.
The mean 𝜇 is a location parameter, since changing its value rigidly shifts the density curve to one side or the other; 𝜎 is referred to as a scale parameter, because changing its value stretches or compresses the curve horizontally without changing the basic shape.
The Normal Distribution
The inflection points of a normal curve (points at which the curve changes from turning downward to turning upward) occur at μ − σ and μ + σ. Thus the value of σ can be visualized as the distance from the mean to these inflection points.
A large value of σ corresponds to a density curve that is quite spread out about μ, whereas a small value yields a highly concentrated curve.
The larger the value of σ, the more likely it is that a value of X far from the mean may be observed.
The Standard Normal Distribution
The computation of P(a ≤ X ≤ b) when X is a normal rv with parameters μ and σ requires evaluating
∫ from a to b of (1/(σ√(2π))) e^(−(x − μ)²/(2σ²)) dx        (4.4)
None of the standard integration techniques can be used to accomplish this. Instead, for μ = 0 and σ = 1, Expression (4.4) has been calculated using numerical techniques and tabulated for certain values of a and b.
This table can also be used to compute probabilities for any other values of μ and σ under consideration.
The Standard Normal Distribution
Definition
The normal distribution with parameter values μ = 0 and σ = 1 is called the standard normal distribution. A random variable having a standard normal distribution is called a standard normal random variable and will be denoted by Z; its cdf P(Z ≤ z) is denoted by Φ(z).
The Standard Normal Distribution
The standard normal distribution almost never serves as a
model for a naturally arising population.
Instead, it is a reference distribution from which information
about other normal distributions can be obtained.
Appendix Table A.3 gives Φ(z) = P(Z ≤ z), the area under the standard normal density curve to the left of z, for z = −3.49, −3.48, ..., 3.48, 3.49.
The Standard Normal Distribution
Figure 4.14 illustrates the type of cumulative area
(probability) tabulated in Table A.3. From this table, various
other probabilities involving Z can be calculated.
Standard normal cumulative areas tabulated in Appendix Table A.3
Figure 4.14
Example 4.13
Let’s determine the following standard normal probabilities:
(a) P(Z ≤ 1.25),
(b) P(Z > 1.25),
(c) P(Z ≤ −1.25), and
(d) P(−.38 ≤ Z ≤ 1.25).
a. P(Z ≤ 1.25) = Φ(1.25), a probability that is tabulated in Appendix Table A.3 at the intersection of the row marked 1.2 and the column marked .05. The number there is .8944, so P(Z ≤ 1.25) = .8944.
Figure 4.15(a) illustrates this probability.
Normal curve areas (probabilities) for Example 4.13
Figure 4.15(a)
b. P(Z > 1.25) = 1 − P(Z ≤ 1.25) = 1 − Φ(1.25), the area under the z curve to the right of 1.25 (an upper-tail area). Then Φ(1.25) = .8944 implies that P(Z > 1.25) = .1056.
Since Z is a continuous rv, P(Z ≥ 1.25) = .1056. See Figure 4.15(b).
Normal curve areas (probabilities) for Example 4.13
Figure 4.15(b)
c. P(Z ≤ −1.25) = Φ(−1.25), a lower-tail area. Directly from Appendix Table A.3, Φ(−1.25) = .1056.
By symmetry of the z curve, this is the same answer as
in part (b).
d. P(−.38 ≤ Z ≤ 1.25) is the area under the standard normal curve above the interval whose left endpoint is −.38 and whose right endpoint is 1.25.
From Section 4.2, if X is a continuous rv with cdf F(x), then P(a ≤ X ≤ b) = F(b) − F(a).
Thus P(−.38 ≤ Z ≤ 1.25) = Φ(1.25) − Φ(−.38) = .8944 − .3520 = .5424
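All four table lookups can be reproduced without Table A.3, since Φ(z) = (1 + erf(z/√2))/2; the stdlib error function is exact enough that the results match the four-decimal table values.

```python
import math

# Sketch: the standard normal cdf from the stdlib error function,
# Phi(z) = (1 + erf(z / sqrt(2))) / 2, instead of Table A.3.
def phi(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

a = phi(1.25)                 # part (a), near .8944
b = 1 - phi(1.25)             # part (b), near .1056
c = phi(-1.25)                # part (c), near .1056
d = phi(1.25) - phi(-0.38)    # part (d), near .5424
```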
See Figure 4.16.
P(−.38 ≤ Z ≤ 1.25) as the difference between two cumulative areas
Figure 4.16
e. P(Z ≤ 5) = Φ(5), the cumulative area under the z curve to the left of 5. This probability does not appear in the table because the last row is labeled 3.4. However, the last entry in that row is Φ(3.49) = .9998. That is, essentially all of the area under the curve lies to the left of 3.49 (at most 3.49 standard deviations to the right of the mean). Therefore we conclude that P(Z ≤ 5) ≈ 1.
Percentiles of the Standard Normal Distribution
For any p between 0 and 1, Appendix Table A.3 can be
used to obtain the (100p)th percentile of the standard
normal distribution.
Example 4.14
The 99th percentile of the standard normal distribution is
that value on the horizontal axis such that the area under
the z curve to the left of the value is .9900.
Appendix Table A.3 gives for fixed z the area under the
standard normal curve to the left of z, whereas here we
have the area and want the value of z. This is the “inverse”
problem to P(Z ≤ z) = ?, so the table is used in an inverse fashion: find .9900 in the middle of the table; the row and column in which it lies identify the 99th z percentile.
Here .9901 lies at the intersection of the row marked 2.3
and column marked .03, so the 99th percentile is
(approximately) z = 2.33.
(See Figure 4.17.)
Finding the 99th percentile
Figure 4.17
By symmetry, the first percentile is as far below 0 as the
99th is above 0, so equals –2.33 (1% lies below the first
and also above the 99th).
(See Figure 4.18.)
The relationship between the 1st and 99th percentiles
Figure 4.18
Percentiles of the Standard Normal Distribution
In general, the (100p)th percentile is identified by the row
and column of Appendix Table A.3 in which the entry p is
found (e.g., the 67th percentile is obtained by finding .6700
in the body of the table, which gives z = .44).
If p does not appear, the number closest to it is often used,
although linear interpolation gives a more accurate answer.
Percentiles of the Standard Normal Distribution
For example, to find the 95th percentile, we look for .9500
inside the table.
Although .9500 does not appear, both .9495 and .9505 do,
corresponding to z = 1.64 and 1.65, respectively.
Since .9500 is halfway between the two probabilities that
do appear, we will use 1.645 as the 95th percentile and
–1.645 as the 5th percentile.
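The inverse lookup (from area p back to z) can also be done numerically by bisection on Φ; `z_percentile` is our own illustrative helper, and its answers land close to the table-interpolated values 1.645 and 2.33.

```python
import math

# Sketch: finding standard normal percentiles without the table by
# inverting Phi(z) = (1 + erf(z/sqrt(2))) / 2 with bisection.
def phi(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def z_percentile(p, lo=-6.0, hi=6.0, tol=1e-8):
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if phi(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

z95 = z_percentile(0.95)   # near 1.645 (95th percentile)
z99 = z_percentile(0.99)   # near 2.33 (99th percentile)
```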
z Notation for z Critical Values
In statistical inference, we will need the values on the
horizontal z axis that capture certain small tail areas under
the standard normal curve.
z notation Illustrated
Figure 4.19
75
z Notation for z Critical Values
For example, z.10 captures upper-tail area .10, and z.01 captures upper-tail area .01.
Since α of the area under the z curve lies to the right of zα, 1 − α of the area lies to its left. Thus zα is the 100(1 − α)th percentile of the standard normal distribution.
By symmetry the area under the standard normal curve to the left of −zα is also α. The zα's are usually referred to as z critical values.
z Notation for z Critical Values
Table 4.1 lists the most useful z percentiles and z values.
Standard Normal Percentiles and Critical Values
Table 4.1
77
Example 4.15
z.05 is the 100(1 – .05)th = 95th percentile of the standard
normal distribution, so z.05 = 1.645.
The area under the standard normal curve to the left of
–z.05 is also .05. (See Figure 4.20.)
Finding z.05
Figure 4.20
Nonstandard Normal Distributions
When X ~ N(μ, σ²), probabilities involving X are computed by “standardizing.” The standardized variable is (X − μ)/σ. Subtracting μ shifts the mean from μ to zero, and then dividing by σ scales the variable so that the standard deviation is 1 rather than σ.
Nonstandard Normal Distributions
Proposition
If X has a normal distribution with mean μ and standard deviation σ, then Z = (X − μ)/σ has a standard normal distribution. Thus
P(a ≤ X ≤ b) = P((a − μ)/σ ≤ Z ≤ (b − μ)/σ) = Φ((b − μ)/σ) − Φ((a − μ)/σ)
P(X ≤ a) = Φ((a − μ)/σ)        P(X ≥ b) = 1 − Φ((b − μ)/σ)
Nonstandard Normal Distributions
The key idea of the proposition is that by standardizing, any
probability involving X can be expressed as a probability
involving a standard normal rv Z, so that Appendix Table
A.3 can be used.
This is illustrated in Figure 4.21.
Equality of nonstandard and standard normal curve areas
Figure 4.21
Example 4.16
The time that it takes a driver to react to the brake lights on
a decelerating vehicle is critical in helping to avoid rear-end
collisions.
The article “Fast-Rise Brake Lamp as a Collision-Prevention Device” (Ergonomics, 1993: 391–395) suggests that reaction time for an in-traffic response to a brake signal from standard brake lights can be modeled with a normal distribution having mean value 1.25 sec and standard deviation of .46 sec.
What is the probability that reaction time is between 1.00 sec and 1.75 sec? If we let X denote reaction time, then standardizing gives
1.00 ≤ X ≤ 1.75 if and only if (1.00 − 1.25)/.46 ≤ Z ≤ (1.75 − 1.25)/.46
Thus
P(1.00 ≤ X ≤ 1.75) = P(−.54 ≤ Z ≤ 1.09) = Φ(1.09) − Φ(−.54)
= .8621 − .2946 = .5675
This is illustrated in Figure 4.22
Normal curves for Example 4.16
Figure 4.22
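The standardizing step generalizes to any normal interval probability; `normal_prob` is our own illustrative wrapper around the erf-based Φ, and with μ = 1.25 and σ = .46 it reproduces the table answer to about three decimals.

```python
import math

# Sketch of Example 4.16's standardization: P(a <= X <= b) for
# X ~ N(mu, sigma^2) via Phi((b - mu)/sigma) - Phi((a - mu)/sigma).
def phi(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def normal_prob(a, b, mu, sigma):
    return phi((b - mu) / sigma) - phi((a - mu) / sigma)

p = normal_prob(1.00, 1.75, 1.25, 0.46)   # close to the table answer .5675
```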
Similarly, if we view 2 sec as a critically long reaction time, the probability that actual reaction time will exceed this value is
P(X > 2) = P(Z > (2 − 1.25)/.46) = P(Z > 1.63) = 1 − Φ(1.63) = .0516
Nonstandard Normal Distributions
If the population distribution of a variable is (approximately) normal, then
1. Roughly 68% of the values are within 1 SD of the mean.
2. Roughly 95% of the values are within 2 SDs of the mean.
3. Roughly 99.7% of the values are within 3 SDs of the mean.
These results are often reported in percentage form and referred to as the empirical rule (because empirical evidence has shown that histograms of real data can very frequently be approximated by normal curves).
It is indeed unusual to observe a value from a normal population that is much farther than 2 standard deviations from μ. These results will be important in the development of hypothesis-testing procedures in later chapters.
Percentiles of an Arbitrary Normal Distribution
The (100p)th percentile of a normal distribution with mean μ and standard deviation σ is easily related to the (100p)th percentile of the standard normal distribution.
Proposition
(100p)th percentile for normal (μ, σ) = μ + [(100p)th percentile for standard normal] · σ
Another way of saying this is that if z is the desired percentile for the standard normal distribution, then the desired percentile for the normal (μ, σ) distribution is z standard deviations from μ.
Example 4.18
The authors of “Assessment of Lifetime of Railway Axle” (Intl. J. of Fatigue, 2013: 40–46) used data collected from
an experiment with a specified initial crack length and
number of loading cycles to propose a normal distribution
with mean value 5.496 mm and standard deviation .067
mm for the rv X = final crack depth.
For this model, what value of final crack depth would be
exceeded by only .5% of all cracks under these
circumstances? Let c denote the requested value. Then the
desired condition is that P(X > c) = .005, or, equivalently,
that P(X ≤ c) = .995.
Thus c is the 99.5th percentile of the normal distribution with µ = 5.496 and 𝜎 = .067. The 99.5th percentile of the standard normal distribution is 2.58, so
c = 5.496 + (2.58)(.067) = 5.496 + .173 = 5.67 mm
The Normal Distribution and Discrete Populations
The normal distribution is often used as an approximation
to the distribution of values in a discrete population.
In such situations, extra care should be taken to ensure
that probabilities are computed in an accurate manner.
The correction for discreteness of the underlying distribution in Example 4.19 is often called a continuity correction.
It is useful in the following application of the normal
distribution to the computation of binomial probabilities.
Example 4.19
IQ in a particular population (as measured by a standard test) is known to be approximately normally distributed with μ = 100 and σ = 15.
What is the probability that a randomly selected individual
has an IQ of at least 125?
Letting X = the IQ of a randomly chosen person, we wish P(X ≥ 125).
The temptation here is to standardize X ≥ 125 as in previous examples. However, the IQ population distribution is actually discrete, since IQs are integer-valued.
So the normal curve is an approximation to a discrete
probability histogram, as pictured in Figure 4.24.
A normal approximation to a discrete distribution
Figure 4.24
The rectangles of the histogram are centered at integers,
so IQs of at least 125 correspond to rectangles beginning
at 124.5, as shaded in Figure 4.24.
Thus we really want the area under the approximating
normal curve to the right of 124.5.
Standardizing this value gives P(Z ≥ 1.63) = .0516, whereas standardizing 125 results in P(Z ≥ 1.67) = .0475.
The difference is not great, but the answer .0516 is more
accurate. Similarly, P(X = 125) would be approximated by
the area between 124.5 and 125.5, since the area under
the normal curve above the single value 125 is zero.
Approximating the Binomial Distribution
Recall that the mean value and standard deviation of a binomial random variable X are μ_X = np and σ_X = √(npq) (where q = 1 − p), respectively.
Approximating the Binomial Distribution
Figure 4.25 displays a binomial probability histogram for the binomial distribution with n = 25, p = .6, for which
μ = 25(.6) = 15 and σ = √(25(.6)(.4)) = 2.449
Approximating the Binomial Distribution
For example,
P(X = 10) = B(10; 25, .6) − B(9; 25, .6) = .021
whereas the area under the normal curve between 9.5 and 10.5 is P(−2.25 ≤ Z ≤ −1.84) = .0207.
More generally, as long as the binomial probability
histogram is not too skewed, binomial probabilities can be
well approximated by normal curve areas.
It is then customary to say that X has approximately a
normal distribution.
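The n = 25, p = .6 comparison above can be reproduced directly: the exact binomial point probability against the normal-curve area from 9.5 to 10.5, with Φ again computed from the stdlib error function.

```python
import math

# Sketch: exact Bin(25, .6) probability of X = 10 versus the normal-curve
# area between 9.5 and 10.5 (mu = np = 15, sigma = sqrt(npq) = 2.449).
def phi(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

n, p = 25, 0.6
mu = n * p
sigma = math.sqrt(n * p * (1 - p))

exact = math.comb(n, 10) * p**10 * (1 - p)**15                 # near .021
approx = phi((10.5 - mu) / sigma) - phi((9.5 - mu) / sigma)    # near .0207
```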
Approximating the Binomial Distribution
Proposition
Let X be a binomial rv based on n trials with success probability p. Then if the binomial probability histogram is not too skewed, X has approximately a normal distribution with μ = np and σ = √(npq). In particular, for x = a possible value of X,
P(X ≤ x) = B(x; n, p) ≈ (area under the normal curve to the left of x + .5) = Φ((x + .5 − np)/√(npq))
In practice, the approximation is adequate provided that both np ≥ 10 and nq ≥ 10.
Example 4.20
Suppose that 25% of all students at a large public
university receive financial aid.
Let X be the number of students in a random sample of
size 50 who receive financial aid, so that p = .25.
Then μ = 12.5 and σ = 3.06.
Since np = 50(.25) = 12.5 ≥ 10 and nq = 50(.75) = 37.5 ≥ 10, the approximation can safely be applied.
The probability that at most 10 students receive aid is
P(X ≤ 10) = B(10; 50, .25) ≈ Φ((10.5 − 12.5)/3.06) = Φ(−.65) = .2578
Similarly, the probability that between 5 and 15 (inclusive) of the selected students receive aid is
P(5 ≤ X ≤ 15) = B(15; 50, .25) − B(4; 50, .25) ≈ Φ((15.5 − 12.5)/3.06) − Φ((4.5 − 12.5)/3.06) = .8320
The exact probabilities are .2622 and .8348, respectively,
so the approximations are quite good.
In the last calculation, the probability P(5 ≤ X ≤ 15) is being approximated by the area under the normal curve between 4.5 and 15.5—the continuity correction is used for both the upper and lower limits.
4.4 The Exponential and Gamma Distributions
The Exponential Distributions
The family of exponential distributions provides probability
models that are very widely used in engineering and
science disciplines.
Definition
X is said to have an exponential distribution with parameter λ (λ > 0) if the pdf of X is
f(x; λ) = λe^(−λx) for x ≥ 0, and f(x; λ) = 0 otherwise        (4.5)
The Exponential Distributions
Some sources write the exponential pdf in the form (1/β)e^(−x/β), so that β = 1/λ. The expected value of an exponentially distributed random variable X is
E(X) = ∫ from 0 to ∞ of x · λe^(−λx) dx
Obtaining this expected value necessitates doing an integration by parts. The variance of X can be computed using the fact that V(X) = E(X²) − [E(X)]².
The determination of E(X²) requires integrating by parts twice in succession.
The Exponential Distributions
The results of these integrations are μ = 1/λ and σ² = 1/λ². Both the mean and standard deviation of the exponential distribution equal 1/λ.
Graphs of several exponential
pdf’s are illustrated in Figure 4.26.
Exponential density curves
Figure 4.26
The exponential pdf is easily integrated to obtain the cdf:
F(x; λ) = 0 for x < 0, and F(x; λ) = 1 − e^(−λx) for x ≥ 0
Example 4.21
The article “Probabilistic Fatigue Evaluation of Riveted
Railway Bridges” (J. of Bridge Engr., 2008: 237–244)
suggested the exponential distribution with mean value 6
MPa as a model for the distribution of stress range in
certain bridge connections.
Let’s assume that this is in fact the true model. Then E(X) = 1/λ = 6 implies that λ = .1667.
The probability that stress range is at most 10 MPa is
P(X ≤ 10) = F(10; .1667) = 1 − e^(−(.1667)(10)) = 1 − .189 = .811
The probability that stress range is between 5 and 10 MPa is
P(5 ≤ X ≤ 10) = F(10; .1667) − F(5; .1667) = (1 − e^(−1.667)) − (1 − e^(−.8335)) = .246
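Both stress-range probabilities follow directly from the exponential cdf; `expon_cdf` is our own illustrative name, with λ = 1/6 as in the example.

```python
import math

# Sketch of Example 4.21: exponential with mean 6, so lam = 1/6 (.1667).
def expon_cdf(x, lam):
    return 1 - math.exp(-lam * x) if x >= 0 else 0.0

lam = 1 / 6
p_at_most_10 = expon_cdf(10, lam)                      # near .811
p_5_to_10 = expon_cdf(10, lam) - expon_cdf(5, lam)     # near .246
```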
The Exponential Distributions
The exponential distribution is frequently used as a model
for the distribution of times between the occurrence of
successive events, such as customers arriving at a service
facility or calls coming in to a switchboard.
The reason for this is that the exponential distribution is
closely related to the Poisson process discussed in
Chapter 3.
The Exponential Distributions
Proposition
Suppose that the number of events occurring in any time interval of length t has a Poisson distribution with parameter αt (where α, the rate of the event process, is the expected number of events occurring in 1 unit of time) and that numbers of occurrences in nonoverlapping intervals are independent of one another. Then the distribution of elapsed time between the occurrence of two successive events is exponential with parameter λ = α.
The Exponential Distributions
Although a complete proof is beyond the scope of the text, the result is easily verified for the time X₁ until the first event occurs:
P(X₁ ≤ t) = 1 − P(X₁ > t) = 1 − P[no events in (0, t)] = 1 − e^(−λt)
which is exactly the cdf of the exponential distribution.
Example 4.22
Suppose that calls to a rape crisis center in a certain
county occur according to a Poisson process with rate  =
.5 call per day.
Then the number of days X between successive calls has
an exponential distribution with parameter value .5, so the
probability that more than 2 days elapse between calls is
P(X > 2) = 1 − P(X ≤ 2) = 1 − F(2; .5) = e^(−(.5)(2))
= .368
The expected time between successive calls is 1/.5 = 2
days.
The Exponential Distributions
Another important application of the exponential distribution
is to model the distribution of component lifetime.
A partial reason for the popularity of such applications is
the “memoryless” property of the exponential
distribution.
Suppose component lifetime is exponentially distributed
with parameter .
The Exponential Distributions
After putting the component into service, we leave for a
period of t0 hours and then return to find the component
still working; what now is the probability that it lasts at least
an additional t hours?
In symbols, we wish P(X ≥ t + t₀ | X ≥ t₀). By the definition of conditional probability,
P(X ≥ t + t₀ | X ≥ t₀) = P[(X ≥ t + t₀) ∩ (X ≥ t₀)] / P(X ≥ t₀)
The Exponential Distributions
But the event X ≥ t₀ in the numerator is redundant, since both events can occur if and only if X ≥ t + t₀. Therefore,
P(X ≥ t + t₀ | X ≥ t₀) = P(X ≥ t + t₀)/P(X ≥ t₀) = e^(−λ(t + t₀))/e^(−λt₀) = e^(−λt)
This conditional probability is identical to the original probability P(X ≥ t) that the component lasted t hours.
The Exponential Distributions
Thus the distribution of additional lifetime is exactly the
same as the original distribution of lifetime, so at each point
in time the component shows no effect of wear.
In other words, the distribution of remaining lifetime is
independent of current age.
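The memoryless property can be checked numerically with the exponential survival function P(X ≥ x) = e^(−λx); the parameter values below (λ = .2, t = 3, t₀ = 7) are arbitrary choices of ours, not from the text.

```python
import math

# Sketch: numerical check of the memoryless property,
# P(X >= t + t0 | X >= t0) = P(X >= t), for an exponential lifetime.
def survival(x, lam):
    return math.exp(-lam * x)   # P(X >= x) for exponential(lam)

lam, t, t0 = 0.2, 3.0, 7.0
conditional = survival(t + t0, lam) / survival(t0, lam)
unconditional = survival(t, lam)   # the two should agree
```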
The Gamma Function
To define the family of gamma distributions, we first need to
introduce a function that plays an important role in many
branches of mathematics.
Definition
For α > 0, the gamma function Γ(α) is defined by
Γ(α) = ∫ from 0 to ∞ of x^(α−1) e^(−x) dx        (4.6)
The Gamma Function
The most important properties of the gamma function are the following:
1. For any α > 1, Γ(α) = (α − 1) · Γ(α − 1) [via integration by parts]
2. For any positive integer n, Γ(n) = (n − 1)!
3. Γ(1/2) = √π
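The three properties can be verified against the stdlib gamma function; the value 5.3 is an arbitrary test point of ours.

```python
import math

# Sketch: checking the three gamma-function properties with math.gamma.
recurrence = math.gamma(5.3) - (5.3 - 1) * math.gamma(4.3)   # property 1, ~0
factorial = math.gamma(6) - math.factorial(5)                # property 2: (6-1)!
half = math.gamma(0.5) - math.sqrt(math.pi)                  # property 3, ~0
```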
The Gamma Function
Now let
f(x; α) = x^(α−1) e^(−x)/Γ(α) for x ≥ 0, and f(x; α) = 0 otherwise        (4.7)
Then f(x; α) ≥ 0, and Expression (4.6) implies that ∫ from 0 to ∞ of f(x; α) dx = Γ(α)/Γ(α) = 1. Thus f(x; α) satisfies the two basic properties of a pdf.
The Gamma Distribution
Definition
A continuous random variable X is said to have a gamma distribution if the pdf of X is
f(x; α, β) = (1/(β^α Γ(α))) x^(α−1) e^(−x/β) for x ≥ 0, and f(x; α, β) = 0 otherwise        (4.8)
where the parameters satisfy α > 0 and β > 0. The standard gamma distribution has β = 1, so the pdf of a standard gamma rv is given by (4.7).
The Gamma Distribution
The exponential distribution results from taking α = 1 and
β = 1/λ. Figure 4.27(a) illustrates the graphs of the gamma
pdf f(x; α, β) in (4.8) for several (α, β) pairs, whereas
Figure 4.27(b) presents graphs of the standard gamma pdf.
Gamma density curves
Figure 4.27(a)
Standard gamma density curves
Figure 4.27(b)
122
The Gamma Distribution
For the standard pdf, when α ≤ 1, f(x; α) is strictly
decreasing as x increases from 0; when α > 1, f(x; α) rises
from 0 at x = 0 to a maximum and then decreases.
The parameter β in (4.8) is called the scale parameter
because values other than 1 either stretch or compress the
pdf in the x direction.
123
The Gamma Distribution
The mean and variance of a random variable X having the
gamma distribution f(x; α, β) are
E(X) = μ = αβ    V(X) = σ² = αβ²
When X is a standard gamma rv, the cdf of X,
F(x; α) = ∫0x [y^(α−1) e^(−y) / Γ(α)] dy for x > 0    (4.9)
is called the incomplete gamma function [sometimes the
incomplete gamma function refers to Expression (4.9)
without the denominator Γ(α) in the integrand].
124
The Gamma Distribution
There are extensive tables of F(x; α) available; in Appendix
Table A.4, we present a small tabulation for
α = 1, 2, …, 10 and x = 1, 2, …, 15.
125
Example 4.23
The article “The Probability Distribution of Maintenance
Cost of a System Affected by the Gamma Process of
Degradation” (Reliability Engr. and System Safety, 2012:
65–76) notes that the gamma distribution is widely used to
model the extent of degradation such as corrosion, creep,
or wear.
Let X represent the amount of degradation of a certain
type, and suppose that it has a standard gamma
distribution with α = 2. Since
P(a ≤ X ≤ b) = F(b) − F(a)
126
Example 4.23
cont’d
When X is continuous,
P(3 ≤ X ≤ 5) = F(5; 2) − F(3; 2) = .960 − .801
= .159
The probability that the amount of degradation exceeds 4 is
P(X > 4) = 1 − P(X ≤ 4) = 1 − F(4; 2)
= 1 − .908
= .092
127
The Gamma Distribution
The incomplete gamma function can also be used to
compute probabilities involving nonstandard gamma
distributions. These probabilities can also be obtained
almost instantaneously from various software packages.
Proposition
Let X have a gamma distribution with parameters α and β. Then for any x > 0, the cdf of X is given by
P(X ≤ x) = F(x; α, β) = F(x/β; α)
where F(·; α) is the incomplete gamma function.
128
Example 4.24
Suppose the survival time X in weeks of a randomly
selected male mouse exposed to 240 rads of gamma
radiation has a gamma distribution with α = 8 and β = 15.
(Data in Survival Distributions: Reliability Applications in the
Biomedical Sciences, by A. J. Gross and V. Clark, suggests
α ≈ 8.5 and β ≈ 13.3.)
The expected survival time is E(X) = (8)(15) = 120 weeks,
whereas V(X) = (8)(15)² = 1800 and
σX = √1800 = 42.43 weeks.
129
Example 4.24
cont’d
The probability that a mouse survives between 60 and 120
weeks is
P(60 ≤ X ≤ 120) = P(X ≤ 120) − P(X ≤ 60)
= F(120/15; 8) − F(60/15; 8)
= F(8; 8) − F(4; 8)
= .547 − .051
= .496
130
Example 4.24
cont’d
The probability that a mouse survives at least 30 weeks is
P(X ≥ 30) = 1 − P(X < 30)
= 1 − P(X ≤ 30)
= 1 − F(30/15; 8)
= 1 − F(2; 8)
= .999
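These probabilities can also be double-checked by simulation, since Python's `random.gammavariate(alpha, beta)` draws from the gamma distribution in (4.8). A stdlib-only sketch (the sample size and seed are arbitrary choices, not from the text):

```python
import random

random.seed(42)
alpha, beta = 8, 15  # parameters from Example 4.24
n = 200_000
sample = [random.gammavariate(alpha, beta) for _ in range(n)]

mean_est = sum(sample) / n                       # near E(X) = 120 weeks
p_mid = sum(60 <= x <= 120 for x in sample) / n  # near .496
p_30 = sum(x >= 30 for x in sample) / n          # near .999
```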
131
The Chi-Squared Distribution
Definition
Let ν be a positive integer. Then a random variable X is said to have a chi-squared distribution with parameter ν if the pdf of X is the gamma density with α = ν/2 and β = 2. The parameter ν is called the number of degrees of freedom (df) of X.
132
4.5
Other Continuous
Distributions--skip
Copyright © Cengage Learning. All rights reserved.
133
4.6
Probability Plots
Copyright © Cengage Learning. All rights reserved.
134
Probability Plots
An investigator will often have obtained a numerical sample
x1, x2,…, xn and wish to know whether it is plausible that it
came from a population distribution of some particular type
(e.g., from a normal distribution).
For one thing, many formal procedures from statistical
inference are based on the assumption that the population
distribution is of a specified type. The use of such a
procedure is inappropriate if the actual underlying
probability distribution differs greatly from the assumed
type.
135
Probability Plots
The essence of a probability plot is that if the distribution on
which the plot is based is correct, the points in the plot
should fall close to a straight line.
If the actual distribution is quite different from the one used
to construct the plot, the points will likely depart
substantially from a linear pattern.
The details involved in constructing probability plots differ a
bit from source to source. The basis for our construction is
a comparison between percentiles of the sample data and
the corresponding percentiles of the distribution under
consideration.
136
Sample Percentiles
This leads to the following general definition of sample
percentiles.
Definition
Order the n sample observations from smallest to largest. Then the ith smallest observation in the list is taken to be the [100(i − .5)/n]th sample percentile.
Once the percentage values 100(i − .5)/n (i = 1, 2, …, n)
have been calculated, sample percentiles corresponding to
intermediate percentages can be obtained by linear
interpolation.
137
Sample Percentiles
For example, if n = 10, the percentages corresponding to
the ordered sample observations are 100(1 – .5)/10 = 5%,
100(2 – .5)/10 = 15%, 25%,…, and 100(10 – .5)/10 = 95%.
The 10th percentile is then halfway between the 5th
percentile (smallest sample observation) and the 15th
percentile (second-smallest observation).
For our purposes, such interpolation is not necessary
because a probability plot will be based only on the
percentages 100(i – .5)/n corresponding to the n sample
observations.
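The percentages and the matching z percentiles for a normal probability plot can be generated with the standard library's `statistics.NormalDist` (Python 3.8+). A short sketch for the n = 10 case discussed above:

```python
from statistics import NormalDist

n = 10
# percentages 100(i - .5)/n paired with the ordered observations
percentages = [100 * (i - 0.5) / n for i in range(1, n + 1)]  # 5, 15, ..., 95

# corresponding standard normal (z) percentiles
z_percentiles = [NormalDist().inv_cdf(p / 100) for p in percentages]
```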
138
A Probability Plot
If the sample percentiles are close to the corresponding
population distribution percentiles, the first number in each
pair will be roughly equal to the second number. The
plotted points will then fall close to a 45° line.
Substantial deviations of the plotted points from a 45° line
cast doubt on the assumption that the distribution under
consideration is the correct one.
139
Example 4.29
The value of a certain physical constant is known to an
experimenter. The experimenter makes n = 10 independent
measurements of this value using a particular
measurement device and records the resulting
measurement errors (error = observed value – true value).
These observations appear in the accompanying table.
140
Example 4.29
cont’d
Figure 4.33 shows the resulting plot. Although the points
deviate a bit from the 45° line, the predominant impression
is that this line fits the points very well. The plot suggests
that the standard normal distribution is a reasonable
probability model for measurement error.
Plots of pairs (z percentile, observed value) for the data of Example 4.29:
Figure 4.33
141
Example 4.29
cont’d
Figure 4.34 shows a plot of pairs (z percentile, observation)
for a second sample of ten observations. The 45° line gives
a good fit to the middle part of the sample but not to the
extremes.
The plot has a well-defined
S-shaped appearance. The
two smallest sample
observations are considerably
larger than the corresponding
z percentiles (the points on
the far left of the plot are well
above the 45° line).
Plots of pairs (z percentile, observed value)
for the data of Example 4.29:
Figure 4.34
142
A Probability Plot
If each observation is exactly equal to the corresponding
normal (μ = 0, σ) percentile for some value of σ, the pairs
(σ · [z percentile], observation) fall on a 45° line, which has
slope 1.
This then implies that the (z percentile, observation) pairs
fall on a line passing through (0, 0) (i.e., one with y-intercept
0) but having slope σ rather than 1. The effect of a nonzero
value of μ is simply to change the y-intercept from 0 to μ.
143
A Probability Plot
A plot of the n pairs (z percentile, observation), with the smallest observation paired with the smallest z percentile and so on, is called a normal probability plot. If the observations are drawn from a normal distribution with mean value μ and standard deviation σ, the points should fall close to a straight line with slope σ and y-intercept μ.
144
Example 4.30
There has been recent increased use of augered cast-in-place (ACIP) and drilled displacement (DD) piles in the
foundations of buildings and transportation structures.
In the article “Design Methodology for Axially Loaded
Auger Cast-in-Place and Drilled Displacement Piles” (J. of
Geotech. Geoenviron. Engr., 2012: 1431–1441),
researchers propose a design methodology to enhance the
efficiency of these piles.
Here are length-diameter ratio measurements based on 17
static pile load tests on ACIP and DD piles from various
construction sites.
145
Example 4.30
The values of p for which z percentiles are needed are (1 − .5)/17 = .029, (2 − .5)/17 = .088, …, and (17 − .5)/17 = .971.
146
Example 4.30
cont’d
Figure 4.35 shows the resulting normal probability plot. The
pattern in the plot is quite straight, indicating it is plausible
that the population distribution of the length-diameter ratio
is normal.
Normal probability plot for the length-diameter ratio sample
Figure 4.35
147
A Probability Plot
There is an alternative version of a normal probability plot
in which the z percentile axis is replaced by a nonlinear
probability axis. The scaling on this axis is constructed so
that plotted points should again fall close to a line when the
sampled distribution is normal. Figure 4.36 shows such a
plot from Minitab for the length-diameter ratio data of
Example 4.30.
Normal probability plot of the length-diameter ratio data from Minitab
Figure 4.36
148
A Probability Plot
A nonnormal population distribution can often be placed in
one of the following three categories:
1. It is symmetric and has “lighter tails” than does a normal
distribution; that is, the density curve declines more
rapidly out in the tails than does a normal curve.
2. It is symmetric and heavy-tailed compared to a normal
distribution.
3. It is skewed.
149
A Probability Plot
For a light-tailed distribution, the result is an S-shaped
pattern of the type pictured in Figure 4.34.
Plots of pairs (z percentile, observed value)
for the data of Example 4.29:
Figure 4.34
150
A Probability Plot
A sample from a heavy-tailed distribution also tends to
produce an S-shaped plot. However, in contrast to the light-tailed case, the left end of the plot curves downward
(observed < z percentile), as shown in Figure 4.37(a).
Probability plots that suggest a nonnormal distribution:
(a) a plot consistent with a heavy-tailed distribution;
Figure 4.37(a)
151
A Probability Plot
If the underlying distribution is positively skewed (a short
left tail and a long right tail), the smallest sample
observations will be larger than expected from a normal
sample and so will the largest observations.
In this case, points on both ends of the plot will fall above a
straight line through the middle part, yielding a curved
pattern, as illustrated in Figure 4.37(b).
(b) a plot consistent with a positively skewed distribution
Figure 4.37(b)
152
A Probability Plot
A sample from a lognormal distribution will usually produce
such a pattern. A plot of (z percentile, ln(x)) pairs should
then resemble a straight line.
153