Lecture 6 on BST 631: Statistical Theory I – Kui Zhang, 09/07/2006
Review for the previous lecture
Definition: cdf, pmf, pdf, identically distributed
Theorem: How to determine the cdf, pmf, or pdf.
Example: how to calculate cdf, pmf, or pdf.
Chapter 2 – Transformations and Expectations
Chapter 2.1 – Distributions of Functions of a Random Variable
Problem: Let X be a random variable with cdf $F_X(x)$. If we define any function of X, say Y = g(X), then
Y is also a random variable whose distribution depends on $F_X$ and the function g. Specifically, for any set A,
$$P(Y \in A) = P(g(X) \in A).$$
Formally, if X takes values in the sample space $\mathcal{X}$ and Y takes values in the sample space $\mathcal{Y}$,
then g is a mapping from $\mathcal{X}$ to $\mathcal{Y}$, i.e., $g : \mathcal{X} \to \mathcal{Y}$.
To go from $\mathcal{Y}$ back to $\mathcal{X}$, we define the inverse mapping of g, denoted by $g^{-1}$, as
$$g^{-1}(A) = \{x \in \mathcal{X} : g(x) \in A\}.$$
If A is a set that contains only a single point, say $A = \{y\}$, then we write
$$g^{-1}(y) = \{x \in \mathcal{X} : g(x) = y\}.$$
Therefore, for the random variable Y = g(X),
$$P(Y \in A) = P(g(X) \in A) = P(\{x \in \mathcal{X} : g(x) \in A\}) = P(X \in g^{-1}(A)).$$
Discrete Case: Let X be a discrete random variable with pmf $f_X(x)$. In this case $\mathcal{X}$ is a countable set. Define
Y = g(X) for some function g, so that $\mathcal{Y} = \{y : y = g(x),\ x \in \mathcal{X}\}$ is also a countable set. Then the pmf of Y is given by
$$f_Y(y) = P(Y = y) = \sum_{x \in g^{-1}(y)} P(X = x) = \sum_{x \in g^{-1}(y)} f_X(x).$$
Example 2.1.1 (Binomial transformation): Let X be a discrete random variable with pmf (known as the
Binomial pmf):
$$f_X(x) = P(X = x) = \binom{n}{x} p^x (1-p)^{n-x}, \quad x = 0, 1, \ldots, n,$$
where n is a positive integer and 0 ≤ p ≤ 1. We call n and p the parameters of the Binomial distribution.
Consider Y = g(X) = n − X, so that $\mathcal{Y} = \{0, 1, \ldots, n\}$ is the same as the sample space of X. Then
$$f_Y(y) = P(Y = y) = P(n - X = y) = P(X = n - y) = \binom{n}{n-y} p^{n-y} (1-p)^{y}, \quad y = 0, 1, \ldots, n.$$
Since $\binom{n}{n-y} = \binom{n}{y}$, Y also has a Binomial distribution, but with parameters n and 1 − p.
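The discrete-case formula can be checked numerically. The following is a minimal Python sketch, not part of the original notes: the helper names `pmf_of_transform` and `binomial_pmf` and the values n = 5, p = 0.3 are assumptions chosen for illustration. It sums f_X(x) over each preimage g^{-1}(y) and compares the result with the Binomial(n, 1 − p) pmf.

```python
from collections import defaultdict
from math import comb

def pmf_of_transform(pmf_x, g):
    """Return the pmf of Y = g(X) by summing f_X(x) over each preimage g^{-1}(y)."""
    pmf_y = defaultdict(float)
    for x, prob in pmf_x.items():
        pmf_y[g(x)] += prob
    return dict(pmf_y)

def binomial_pmf(n, p):
    """Binomial pmf as a dictionary {x: P(X = x)} for x = 0, 1, ..., n."""
    return {x: comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)}

n, p = 5, 0.3  # illustrative values, not taken from the notes
pmf_y = pmf_of_transform(binomial_pmf(n, p), lambda x: n - x)
target = binomial_pmf(n, 1 - p)

# Largest pointwise gap between the transformed pmf and Binomial(n, 1 - p).
max_gap = max(abs(pmf_y[y] - target[y]) for y in range(n + 1))
print(max_gap)
```

Because g(x) = n − x is one-to-one here, each preimage holds a single point; the same helper also handles functions that map several x values to one y.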
Continuous Case: The cdf of Y = g ( X ) is:
$$F_Y(y) = P(Y \le y) = P(g(X) \le y) = P(\{x \in \mathcal{X} : g(x) \le y\}) = \int_{\{x \in \mathcal{X} : g(x) \le y\}} f_X(x)\,dx.$$
Sometimes it is difficult to find the set $\{x \in \mathcal{X} : g(x) \le y\}$.
Example 2.1.2 (Uniform transformation): Suppose X has a uniform distribution on the interval (0, 2π), that is,
$$f_X(x) = \begin{cases} 1/(2\pi), & 0 < x < 2\pi, \\ 0, & \text{otherwise.} \end{cases}$$
Consider $Y = \sin^2(X)$. Then
$$F_Y(y) = P(Y \le y) = P(g(X) \le y) = P(\{x \in \mathcal{X} : \sin^2(x) \le y\}).$$
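When the set {x : sin^2(x) ≤ y} is awkward to describe, its probability can still be approximated. The sketch below is an illustration only and is not in the original notes. Since f_X is constant, the probability equals the fraction of (0, 2π) on which sin^2(x) ≤ y, and a midpoint grid estimates that fraction. The closed form used for comparison, (2/π) arcsin(√y), is an assumption derived here from the fact that the set {x : |sin x| ≤ √y} has total length 4 arcsin(√y) on (0, 2π).

```python
import math

def cdf_sin_squared(y, grid_size=200_000):
    """Approximate F_Y(y) = P(sin^2(X) <= y) for X ~ Uniform(0, 2*pi).

    The pdf of X is constant, so the probability is the fraction of grid
    midpoints in (0, 2*pi) at which sin^2(x) <= y.
    """
    width = 2 * math.pi / grid_size
    hits = sum(
        1 for i in range(grid_size)
        if math.sin((i + 0.5) * width) ** 2 <= y
    )
    return hits / grid_size

def cdf_closed_form(y):
    """Closed form derived for this sketch (not stated in the notes)."""
    return 2 / math.pi * math.asin(math.sqrt(y))

for y in (0.1, 0.5, 0.9):
    print(y, cdf_sin_squared(y), cdf_closed_form(y))
```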
Definition: The support set (or support) of a distribution or a random variable X is defined as $\{x \in \mathcal{X} : f_X(x) > 0\}$.
Definition: A function g is increasing (or decreasing) if u > v implies g(u) > g(v) (or g(u) < g(v)). If g is
either increasing or decreasing, then g is said to be a monotone function.
Remark:
1. When we determine the set $\{x \in \mathcal{X} : g(x) \le y\}$, we only need to consider the support set of X.
2. If g is monotone, then g is one-to-one and onto from $\mathcal{X}$ to $\mathcal{Y}$.
3. If g is monotone, then $g^{-1}(y)$ is a single value for each $y \in \mathcal{Y}$, and $g^{-1}$ is a monotone function.
Theorem 2.1.3: Let X have cdf $F_X(x)$, let Y = g(X), and let $\mathcal{X} = \{x : f_X(x) > 0\}$ and
$\mathcal{Y} = \{y : y = g(x) \text{ for some } x \in \mathcal{X}\}$.
a. If g is an increasing function on $\mathcal{X}$, then $F_Y(y) = F_X(g^{-1}(y))$ for $y \in \mathcal{Y}$.
b. If g is a decreasing function on $\mathcal{X}$ and X is a continuous random variable, then $F_Y(y) = 1 - F_X(g^{-1}(y))$ for $y \in \mathcal{Y}$.
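Proof:
a. If g is an increasing function, then $g^{-1}$ is also increasing, so
$$\begin{aligned}
F_Y(y) &= P(Y \le y) = P(\{x \in \mathcal{X} : g(x) \le y\}) \\
&= P(\{x \in \mathcal{X} : g^{-1}(g(x)) \le g^{-1}(y)\}) \\
&= P(\{x \in \mathcal{X} : x \le g^{-1}(y)\}) \\
&= F_X(g^{-1}(y)).
\end{aligned}$$
b. If g is a decreasing function and X is a continuous random variable, then $g^{-1}$ is also decreasing, so
$$\begin{aligned}
F_Y(y) &= P(Y \le y) = P(\{x \in \mathcal{X} : g(x) \le y\}) \\
&= P(\{x \in \mathcal{X} : g^{-1}(g(x)) \ge g^{-1}(y)\}) \\
&= P(\{x \in \mathcal{X} : x \ge g^{-1}(y)\}) \\
&= 1 - P(\{x \in \mathcal{X} : x < g^{-1}(y)\}) = 1 - F_X(g^{-1}(y)) \quad \text{(by continuity of } X\text{)}.
\end{aligned}$$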
Example 2.1.4 (Uniform-exponential relationship-I): Obtain the cdf of Y = −log(X) if X has a uniform
distribution on (0, 1): $f_X(x) = 1$ if 0 < x < 1, and $f_X(x) = 0$ otherwise.
Solution:
1. The support set of X is (0, 1), and X is a continuous random variable.
2. Therefore, $\mathcal{Y} = (0, \infty)$.
3. $g(x) = -\log x$ is a decreasing function on (0, 1), and $g^{-1}(y) = e^{-y}$.
4. Since $F_X(x) = x$ on (0, 1), Theorem 2.1.3(b) gives $F_Y(y) = 1 - F_X(g^{-1}(y)) = 1 - e^{-y}$ for $y \in (0, \infty)$.
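The four steps above can be reproduced in a few lines of Python. This is a sketch under stated assumptions: the helper `cdf_monotone_transform` is a hypothetical name introduced here, and it simply encodes parts (a) and (b) of Theorem 2.1.3 for a user-supplied inverse.

```python
import math

def cdf_monotone_transform(cdf_x, g_inverse, increasing):
    """Build F_Y for Y = g(X) from F_X using Theorem 2.1.3.

    increasing=True  -> F_Y(y) = F_X(g^{-1}(y))
    increasing=False -> F_Y(y) = 1 - F_X(g^{-1}(y))   (X must be continuous)
    """
    if increasing:
        return lambda y: cdf_x(g_inverse(y))
    return lambda y: 1.0 - cdf_x(g_inverse(y))

def uniform_cdf(x):
    """cdf of the Uniform(0, 1) distribution."""
    return min(max(x, 0.0), 1.0)

# Y = -log(X): g is decreasing and g^{-1}(y) = exp(-y).
cdf_y = cdf_monotone_transform(uniform_cdf, lambda y: math.exp(-y), increasing=False)

for y in (0.5, 1.0, 2.0):
    print(y, cdf_y(y), 1 - math.exp(-y))
```

The two printed columns should agree for every y > 0, which matches step 4.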
Question: How do we get the pdf of Y = g(X) from the pdf of X?
Theorem 2.1.5: Let X have pdf $f_X(x)$ and let Y = g(X), where g is a monotone function. Let
$\mathcal{X} = \{x : f_X(x) > 0\}$ and $\mathcal{Y} = \{y : y = g(x) \text{ for some } x \in \mathcal{X}\}$. Suppose that $f_X(x)$ is continuous on $\mathcal{X}$ and that
$g^{-1}(y)$ has a continuous derivative on $\mathcal{Y}$. Then the pdf of Y is
$$f_Y(y) = \begin{cases} f_X(g^{-1}(y)) \left| \dfrac{d}{dy} g^{-1}(y) \right|, & y \in \mathcal{Y}, \\ 0, & \text{otherwise.} \end{cases}$$
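Proof: Differentiate the cdf given by Theorem 2.1.3 using the chain rule; the absolute value covers both the increasing and the decreasing case.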
Example 2.1.6 (Inverted gamma pdf): Let $f_X(x)$ be the gamma pdf
$$f_X(x) = \frac{1}{(n-1)!\,\beta^n}\, x^{n-1} e^{-x/\beta}, \quad 0 < x < \infty,$$
where β is a positive constant and n is a positive integer. Let Y = 1/X. Obtain the pdf $f_Y(y)$ of Y.
Solution: Here $\mathcal{Y} = (0, \infty)$, g is a decreasing function, and $g^{-1}(y) = 1/y$, so
$$\begin{aligned}
f_Y(y) &= f_X(g^{-1}(y)) \left| \frac{d}{dy} g^{-1}(y) \right| = f_X(1/y) \left| -\frac{1}{y^2} \right| \\
&= \frac{1}{(n-1)!\,\beta^n} \left(\frac{1}{y}\right)^{n-1} e^{-1/(\beta y)}\, \frac{1}{y^2} \\
&= \frac{1}{(n-1)!\,\beta^n} \left(\frac{1}{y}\right)^{n+1} e^{-1/(\beta y)}, \quad 0 < y < \infty.
\end{aligned}$$
In this case, we call $f_Y(y)$ the inverted gamma pdf.
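As a numerical check of this result, the sketch below applies the formula of Theorem 2.1.5 directly and compares it with the inverted gamma pdf. It is an illustration, not part of the original notes: the function names, the central-difference derivative, and the parameter values n = 3, β = 2 are all assumptions made for the example.

```python
import math

def pdf_monotone_transform(pdf_x, g_inverse, step=1e-6):
    """Return f_Y(y) = f_X(g^{-1}(y)) * |d/dy g^{-1}(y)| as in Theorem 2.1.5.

    The derivative is approximated by a central difference, so the result
    is numerical rather than exact.
    """
    def pdf_y(y):
        derivative = (g_inverse(y + step) - g_inverse(y - step)) / (2 * step)
        return pdf_x(g_inverse(y)) * abs(derivative)
    return pdf_y

def gamma_pdf(x, n, beta):
    """Gamma pdf with integer shape n and scale beta, for x > 0."""
    return x ** (n - 1) * math.exp(-x / beta) / (math.factorial(n - 1) * beta ** n)

def inverted_gamma_pdf(y, n, beta):
    """Closed-form pdf of Y = 1/X, for y > 0."""
    return (1 / y) ** (n + 1) * math.exp(-1 / (beta * y)) / (math.factorial(n - 1) * beta ** n)

n, beta = 3, 2.0  # illustrative parameter values
pdf_y = pdf_monotone_transform(lambda x: gamma_pdf(x, n, beta), lambda y: 1 / y)

for y in (0.2, 1.0, 3.0):
    print(y, pdf_y(y), inverted_gamma_pdf(y, n, beta))
```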
Question: What happens if g is not a monotone transformation?
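Example 2.1.7 (Square transformation): Let X be a continuous random variable with cdf $F_X(x)$ and pdf $f_X(x)$.
Find the cdf and pdf of $Y = X^2$.
Solution: For y > 0,
$$F_Y(y) = P(Y \le y) = P(X^2 \le y) = P(-\sqrt{y} \le X \le \sqrt{y}) = F_X(\sqrt{y}) - F_X(-\sqrt{y}),$$
where the last equality uses the continuity of X. Differentiating with respect to y gives
$$f_Y(y) = \frac{d}{dy} F_Y(y) = \frac{1}{2\sqrt{y}}\, f_X(\sqrt{y}) + \frac{1}{2\sqrt{y}}\, f_X(-\sqrt{y}).$$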
Theorem 2.1.8: Let X have pdf $f_X(x)$, let Y = g(X), and define the sample
space $\mathcal{X} = \{x : f_X(x) > 0\}$. Suppose there exists a partition $A_0, A_1, \ldots, A_k$ of $\mathcal{X}$ such that $P(X \in A_0) = 0$ and
$f_X(x)$ is continuous on each $A_i$. Further, suppose there exist functions $g_1(x), \ldots, g_k(x)$, defined on $A_1, \ldots, A_k$,
respectively, satisfying
i. $g(x) = g_i(x)$ for $x \in A_i$,
ii. $g_i(x)$ is monotone on $A_i$,
iii. the set $\mathcal{Y} = \{y : y = g_i(x) \text{ for some } x \in A_i\}$ is the same for each i = 1, 2, ..., k, and
iv. $g_i^{-1}(y)$ has a continuous derivative on $\mathcal{Y}$ for each i = 1, ..., k.
Then
$$f_Y(y) = \begin{cases} \displaystyle\sum_{i=1}^{k} f_X(g_i^{-1}(y)) \left| \dfrac{d}{dy} g_i^{-1}(y) \right|, & y \in \mathcal{Y}, \\ 0, & \text{otherwise.} \end{cases}$$
Note: If any of the conditions of Theorem 2.1.8 is not satisfied, it will be very difficult to find the distribution of
Y = g( X ) .
Example 2.1.9 (Normal-chi squared relationship): Let X have a standard normal distribution, i.e.,
$$f_X(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}, \quad -\infty < x < \infty.$$
Show that $Y = X^2$ has pdf
$$f_Y(y) = \frac{1}{\sqrt{2\pi}}\, \frac{1}{\sqrt{y}}\, e^{-y/2}, \quad 0 < y < \infty.$$
In this case, the pdf of Y is the pdf of a chi squared random variable with 1 degree of freedom.
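One way to see the claim is to evaluate the sum in Theorem 2.1.8 directly. The Python sketch below is illustrative and not from the notes: the helper `pdf_piecewise_monotone` is a hypothetical name, and the partition A_0 = {0}, A_1 = (−∞, 0), A_2 = (0, ∞) with inverses −√y and √y is the assumed choice for g(x) = x^2.

```python
import math

def pdf_piecewise_monotone(pdf_x, inverse_branches):
    """Return f_Y from Theorem 2.1.8.

    inverse_branches is a list of (g_i^{-1}, d/dy g_i^{-1}) pairs, one pair
    for each set A_i (i >= 1) of the partition.
    """
    def pdf_y(y):
        return sum(pdf_x(inv(y)) * abs(d_inv(y)) for inv, d_inv in inverse_branches)
    return pdf_y

def standard_normal_pdf(x):
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

# Branches of the inverse of g(x) = x^2 on A_1 = (-inf, 0) and A_2 = (0, inf).
branches = [
    (lambda y: -math.sqrt(y), lambda y: -1 / (2 * math.sqrt(y))),
    (lambda y: math.sqrt(y), lambda y: 1 / (2 * math.sqrt(y))),
]
pdf_y = pdf_piecewise_monotone(standard_normal_pdf, branches)

def chi_squared_1_pdf(y):
    """Target pdf from Example 2.1.9, for y > 0."""
    return math.exp(-y / 2) / math.sqrt(2 * math.pi * y)

for y in (0.1, 1.0, 4.0):
    print(y, pdf_y(y), chi_squared_1_pdf(y))
```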
Theorem 2.1.10 (Probability integral transform): Let X have continuous cdf FX ( x) and define the random
variable Y as Y = FX ( X ) . Then Y is uniformly distributed on (0, 1), i.e., P(Y ≤ y ) = y , 0 < y < 1 .
Proof: Define $F_X^{-1}(y) = \inf\{x : F_X(x) \ge y\}$ for 0 < y < 1; then $F_X^{-1}(y)$ is increasing (why?), and
$$\begin{aligned}
P(Y \le y) &= P(F_X(X) \le y) \\
&= P(F_X^{-1}[F_X(X)] \le F_X^{-1}(y)) && (F_X^{-1} \text{ is increasing}) \\
&= P(X \le F_X^{-1}(y)) \\
&= F_X(F_X^{-1}(y)) && (\text{definition of } F_X) \\
&= y && (\text{continuity of } F_X).
\end{aligned}$$
Note: We require that X is a continuous random variable. If X is not continuous, many properties of FX−1 ( y ) used
in the proof do not hold anymore.
Example: Suppose X has the cdf from Example 1.5.2. What is $F_X^{-1}(y)$?
Solution: The cdf is
$$F_X(x) = \begin{cases} 0, & -\infty < x < 0, \\ 1/8, & 0 \le x < 1, \\ 4/8, & 1 \le x < 2, \\ 7/8, & 2 \le x < 3, \\ 1, & 3 \le x < \infty, \end{cases}$$
so that
$$F_X^{-1}(y) = \begin{cases} 3, & 7/8 < y \le 1, \\ 2, & 4/8 < y \le 7/8, \\ 1, & 1/8 < y \le 4/8, \\ 0, & 0 < y \le 1/8, \\ -\infty, & y = 0. \end{cases}$$
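The definition F_X^{-1}(y) = inf{x : F_X(x) ≥ y} translates directly into a lookup over the jump points of the step cdf in this example. The sketch below is illustrative only; the names `JUMPS`, `quantile`, and `cdf` are assumptions introduced here. It also shows why the proof of Theorem 2.1.10 needs a continuous cdf: for this discrete X, F_X(F_X^{-1}(y)) need not equal y.

```python
from fractions import Fraction
import math

# Jump points of the step cdf in this example, with the cdf value at each jump.
JUMPS = [(0, Fraction(1, 8)), (1, Fraction(4, 8)), (2, Fraction(7, 8)), (3, Fraction(1))]

def cdf(x):
    """Step cdf F_X(x): the value at the largest jump point not exceeding x."""
    value = Fraction(0)
    for jump, cdf_value in JUMPS:
        if x >= jump:
            value = cdf_value
    return value

def quantile(y):
    """Return inf{x : F_X(x) >= y} for 0 <= y <= 1."""
    if y == 0:
        return -math.inf  # F_X(x) >= 0 holds for every x
    for jump, cdf_value in JUMPS:
        if cdf_value >= y:
            return jump
    raise ValueError("y must lie in [0, 1]")

for y in (Fraction(1, 8), Fraction(1, 4), Fraction(1, 2), Fraction(3, 4), Fraction(1)):
    print(y, quantile(y), cdf(quantile(y)))
```

For y = 1/4 the loop reports F_X^{-1}(y) = 1 but F_X(1) = 1/2, so the step "F_X(F_X^{-1}(y)) = y" fails without continuity.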
Note: Theorem 2.1.10 has a very important application. If one is interested in generating a random variable from a
population with cdf FX ( x) , one only needs to generate a uniform random number u , between 0 and 1, and solve
for x in the equation FX ( x) = u .
Example: Let X be a continuous random variable with cdf $F_X(x) = 1 - e^{-x}$ for x > 0. In this case, we say that X has an
exponential distribution. If $U = F_X(X)$, then U is uniformly distributed on (0, 1), and solving $u = F_X(x)$ for x gives
$$u = F_X(x) = 1 - e^{-x} \;\Rightarrow\; x = -\log(1 - u).$$
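The relation x = −log(1 − u) gives a simple way to simulate from this distribution. The following Python sketch is an illustration, not part of the notes: the function name `sample_exponential`, the seed, and the sample size are arbitrary assumptions. It draws uniform numbers, applies the inverse cdf, and compares two sample summaries with their theoretical values (mean 1 and F_X(1) = 1 − e^{-1}).

```python
import math
import random

def sample_exponential(rng):
    """Draw one value with cdf F_X(x) = 1 - exp(-x) by solving F_X(x) = u."""
    u = rng.random()            # uniform on [0, 1)
    return -math.log(1.0 - u)   # 1 - u lies in (0, 1], so the log is finite

rng = random.Random(631)  # fixed seed (arbitrary) so the run is reproducible
draws = [sample_exponential(rng) for _ in range(100_000)]

sample_mean = sum(draws) / len(draws)
share_below_one = sum(d <= 1.0 for d in draws) / len(draws)

print(sample_mean)       # should be close to 1, the mean of this distribution
print(share_below_one)   # should be close to 1 - exp(-1)
```

Using 1 − u rather than u avoids taking the logarithm of zero, because `random()` can return 0.0 but never 1.0.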