Module 5: Multiple Random Variables
Lecture – 1: Joint Probability Distribution
1. Introduction
The first lecture of this module presents the joint distribution functions of multiple random
variables, both discrete and continuous. The joint pdf and CDF of bivariate distributions are
discussed in particular, with the help of numerical examples.
2. Multiple Random Variables
The theoretical concepts of a single random variable have been discussed previously. In many
cases it is necessary to deal with more than one random variable within the same
experiment and the same sample space. In this lecture, the theory of a single random
variable is extended to two random variables and then to multiple random variables.
It may be recalled that a random variable is a function that maps each point of a sample
space to a numerical value on the real line. Let us consider two (or more) random variables,
both (or all) of them mapping from the same sample space. Accordingly, a multiple random
variable may be defined as follows.
An $n$-dimensional random vector (i.e., a vector of random variables) is a function that maps
each and every outcome from the sample space $S$ to $\mathbb{R}^n$ ($n$-dimensional Euclidean space).
(For any non-negative integer $n$, the space of all $n$-tuples of real numbers forms an
$n$-dimensional vector space over $\mathbb{R}$, called the $n$-dimensional Euclidean space and denoted by
$\mathbb{R}^n$, where $\mathbb{R}$ denotes the field of real numbers.)
Graphical representation of bivariate random variables requires a three-dimensional form in
which the two horizontal axes represent the two random variables and the pmf or pdf is
measured vertically.
Fig. 1. Graphical representation of bivariate random variables
For simplicity, two random variables (bivariate random variables) are considered first,
i.e., $n = 2$, so that the random vector becomes an ordered pair $(X, Y)$. Bivariate
random variables may be discrete or continuous.
3. Bivariate Random Variables
Many real-life situations in the field of civil engineering require consideration of two or more
random variables. An example is the average rainfall over a catchment area together with the
volume of streamflow passing through the outlet of the catchment over a period of time. If other
variables, such as the depth of the groundwater table, are also considered, then the problem
becomes multivariate.
4. Probability Distribution Function
The probabilities of the two events $A = \{X \le x\}$ and $B = \{Y \le y\}$, defined as functions of $x$ and $y$
respectively, are called Cumulative Distribution Functions (CDF).
$$F_X(x) = P[X \le x], \qquad F_Y(y) = P[Y \le y]$$
To consider the joint event $\{X \le x, Y \le y\}$, a concept called the joint distribution function is
used.
5. Joint Probability Distributions of Discrete Bivariate RV
If $X$ and $Y$ are two random variables, the joint probability distribution of $X$ and $Y$ is a
description of the set of points $(x, y)$ in the range of $(X, Y)$ along with the probability of each
point. The joint cumulative distribution function of $X$ and $Y$, denoted by $F_{X,Y}(x, y)$, is given by
$$F_{X,Y}(x, y) = P[(X \le x) \cap (Y \le y)]$$
The pair $(X, Y)$ is referred to as the bivariate random variable.
The joint probability distribution is also referred to as the Bivariate Probability Distribution or
Bivariate Distribution for the case of two random variables, and is generalized to any number of
random variables as the Multivariate Distribution.
5.1 Properties of Joint Distribution Function
As a probability function, $F_{X,Y}(x, y)$ has the following properties:
1. $0 \le F_{X,Y}(x, y) \le 1$ for $-\infty < x < \infty$, $-\infty < y < \infty$
2. $F_{X,Y}(x, y)$ is nonnegative and a nondecreasing function of $x$ and $y$
3. $F_{X,Y}(\infty, \infty) = 1$; $F_{X,Y}(-\infty, -\infty) = 0$
4. $F_{X,Y}(-\infty, y) = 0$; $F_{X,Y}(\infty, y) = F_Y(y)$
5. $F_{X,Y}(x, -\infty) = 0$; $F_{X,Y}(x, \infty) = F_X(x)$
5.2 Joint PMF and CDF of Discrete RV
The joint probability mass function of two discrete random variables $(X, Y)$ describes how
much probability mass is concentrated on each possible pair $(x, y)$. It is given by the
intersection probability
$$p_{X,Y}(x, y) = P[(X = x) \cap (Y = y)], \qquad (x, y) \in S$$
The joint cumulative distribution function is the sum of the probabilities associated with all point
pairs $(x_i, y_j)$ in the subset $\{x_i \le x,\ y_j \le y\}$. It is given by
$$F_{X,Y}(x, y) = \sum_{x_i \le x} \sum_{y_j \le y} p_{X,Y}(x_i, y_j)$$
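The double sum that defines the joint CDF of a discrete pair can be sketched in a few lines of Python. The pmf values below are hypothetical and chosen only for illustration (they are not taken from the lecture), and `joint_cdf` is an assumed helper name.

```python
from fractions import Fraction

# Hypothetical joint pmf on the points (x, y) with x, y in {0, 1}.
# These values are illustrative assumptions, not data from the lecture.
pmf = {
    (0, 0): Fraction(1, 10),
    (0, 1): Fraction(2, 10),
    (1, 0): Fraction(3, 10),
    (1, 1): Fraction(4, 10),
}


def joint_cdf(pmf, x, y):
    """Sum the pmf over all points (xi, yj) with xi <= x and yj <= y."""
    return sum(p for (xi, yj), p in pmf.items() if xi <= x and yj <= y)


# Evaluate the joint CDF at every point of the support.
for point in sorted(pmf):
    print(point, joint_cdf(pmf, *point))
```

Exact fractions are used so that the total probability sums to exactly 1 without floating-point rounding.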
5.3 Properties of Joint PMF
1. $0 \le p_{X,Y}(x, y) \le 1$
2. $\sum_{(x, y) \in S} p_{X,Y}(x, y) = 1$
3. $\sum_{\text{all } X \le x} \ \sum_{\text{all } Y \le y} p_{X,Y}(x, y) = F_{X,Y}(x, y)$
5.4 Problem on Joint PMF
Q. The joint pmf of two random variables $X$ and $Y$ is given by
$$p_{X,Y}(x, y) = \begin{cases} k(2x + 5y) & x = 1, 2;\ y = 1, 2 \\ 0 & \text{otherwise} \end{cases}$$
What is the value of $k$?
Soln.
From the properties of the joint pmf,
$$\sum_{(x, y) \in S} p_{X,Y}(x, y) = 1$$
$$\sum_{\text{all } x} \sum_{\text{all } y} p_{X,Y}(x, y) = \sum_{x=1}^{2} \sum_{y=1}^{2} k(2x + 5y) = k\left[(2 + 5) + (2 + 10) + (4 + 5) + (4 + 10)\right] = 42k = 1$$
Thus, $k = \dfrac{1}{42}$.
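The normalization step can be checked with a short Python sketch. The helper name `normalizing_constant` is an assumption introduced here for illustration; the weight $2x + 5y$ and the support $x, y \in \{1, 2\}$ are taken from the problem statement.

```python
from fractions import Fraction


def normalizing_constant(weight, support):
    """Return k such that k * weight(x, y) sums to 1 over the support."""
    total = sum(Fraction(weight(x, y)) for x, y in support)
    return 1 / total


def weight(x, y):
    """Unnormalized pmf from the problem statement: 2x + 5y."""
    return 2 * x + 5 * y


# Support of the pmf: x = 1, 2 and y = 1, 2.
support = [(x, y) for x in (1, 2) for y in (1, 2)]

k = normalizing_constant(weight, support)
print(k)  # 1/42
```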
Q. Streamflows at two gauging stations on two nearby tributaries are categorized into four
different states, i.e., 1, 2, 3 and 4. These categories are represented by two random variables,
$X$ and $Y$, one for each tributary. The joint pmf of the streamflow categories ($X$ and $Y$) is
shown in the table below. Calculate the probability of $X > Y$.
          Y = 1    Y = 2    Y = 3    Y = 4    p_X(x)
X = 1     0.310    0.060    0.000    0.000    0.370
X = 2     0.040    0.360    0.010    0.000    0.410
X = 3     0.010    0.025    0.114    0.030    0.179
X = 4     0.010    0.001    0.010    0.020    0.041
p_Y(y)    0.370    0.446    0.134    0.050    1.000
Soln.
Let $A$ denote the event $X > Y$. This event consists of the pairs $(2,1)$, $(3,2)$,
$(3,1)$, $(4,3)$, $(4,2)$ and $(4,1)$.
Thus, the probabilities of these pairs should be added up to obtain the required probability.
According to the joint probability mass function, the probability is given by:
$$P[X > Y] = \sum_{\text{all possible } x > y} p_{X,Y}(x, y)$$
$$= p_{X,Y}(2,1) + p_{X,Y}(3,2) + p_{X,Y}(3,1) + p_{X,Y}(4,3) + p_{X,Y}(4,2) + p_{X,Y}(4,1)$$
$$= 0.040 + 0.025 + 0.010 + 0.010 + 0.001 + 0.010 = 0.096$$
6. Joint Pdf and CDF of Continuous Random Variables
Let $X$ and $Y$ be continuous random variables. The probability of the event $(x_1 \le X \le x_2)$ and
$(y_1 \le Y \le y_2)$ is defined by the integration of the joint pdf over the region of interest in the sample
space.
$$P[(x_1 \le X \le x_2) \cap (y_1 \le Y \le y_2)] = \int_{x_1}^{x_2} \int_{y_1}^{y_2} f_{X,Y}(x, y)\, dy\, dx$$
Graphically, the equation represents the volume under the joint pdf, $f_{X,Y}(x, y)$, over the region
of interest (refer to Fig. 2).
Fig. 2. Joint pdf of continuous RVs
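The "volume under the joint pdf" interpretation can be illustrated numerically. The sketch below assumes a simple joint pdf, $f_{X,Y}(x, y) = 2e^{-x-2y}$ for $x, y \ge 0$, chosen here only because its rectangle probabilities have a closed form; the function names and the midpoint rule are likewise illustrative choices, not part of the lecture.

```python
import math


def joint_pdf(x, y):
    """Assumed joint pdf: 2*exp(-x - 2y) for x, y >= 0, zero elsewhere."""
    if x < 0 or y < 0:
        return 0.0
    return 2.0 * math.exp(-x - 2.0 * y)


def rectangle_probability(pdf, x1, x2, y1, y2, n=200):
    """Approximate P[x1 <= X <= x2, y1 <= Y <= y2] with a midpoint rule.

    The rectangle is split into n-by-n cells and the pdf is evaluated
    at the centre of each cell.
    """
    hx = (x2 - x1) / n
    hy = (y2 - y1) / n
    total = 0.0
    for i in range(n):
        x = x1 + (i + 0.5) * hx
        for j in range(n):
            y = y1 + (j + 0.5) * hy
            total += pdf(x, y)
    return total * hx * hy


approx = rectangle_probability(joint_pdf, 0.0, 1.0, 0.0, 1.0)
# Closed form for the assumed pdf: (1 - e^-1) * (1 - e^-2).
exact = (1 - math.exp(-1)) * (1 - math.exp(-2))
print(approx, exact)
```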
6.1 Properties of Joint pdf
1. $f_{X,Y}(x, y) \ge 0$
2. $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{X,Y}(x, y)\, dx\, dy = 1$
6.2 Joint CDF of continuous variables
The joint distribution of $(X, Y)$ can be completely described by their joint CDF:
$$F_{X,Y}(x, y) = P[(-\infty < X \le x) \cap (-\infty < Y \le y)] = \int_{-\infty}^{x} \int_{-\infty}^{y} f_{X,Y}(u, v)\, dv\, du$$
Fig. 3. Joint CDF of continuous RVs
The relationship between the joint pdf and the joint CDF is given by:
$$f_{X,Y}(x, y) = \frac{\partial^2 F_{X,Y}(x, y)}{\partial x\, \partial y}$$
It may be noted that partial derivatives are used in place of ordinary derivatives for bivariate /
multivariate cases.
6.3 Problem on Joint pdf
Q. A storm event occurring at a point in space is characterized by two variables, namely, the
duration $X$ of the storm, and its intensity $Y$, which is defined as the average rainfall rate. The
variables $X$ and $Y$ are taken to be distributed as follows:
$$F_X(x) = 1 - e^{-x}, \quad x \ge 0$$
$$F_Y(y) = 1 - e^{-2y}, \quad y \ge 0$$
The joint CDF of $X$ and $Y$ is assumed to follow the bivariate exponential distribution given by:
$$F_{X,Y}(x, y) = 1 - e^{-x} - e^{-2y} + e^{-x - 2y - cxy}, \quad x, y \ge 0$$
with $c$ denoting a parameter describing the joint variability of the two variates. Find the
possible values that $c$ can take.
Soln.
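The lower and upper bounds of $c$ have to be determined.
First, let us determine the lower bound of $c$. The joint event cannot be more probable than
either of its constituent events, so
$$P[X \le x, Y \le y] \le P[X \le x]$$
Thus,
$$1 - e^{-x} - e^{-2y} + e^{-x - 2y - cxy} \le 1 - e^{-x}$$
which gives $e^{-x - 2y - cxy} \le e^{-2y}$, i.e.,
$$x(1 + cy) \ge 0$$
Since $x$ and $y$ are always nonnegative, the inequality holds if and only if $1 + cy \ge 0$.
For this to hold for every $y \ge 0$, $c$ must be nonnegative, so the lower bound is $c \ge 0$.
Now, let us determine the upper bound of $c$.
We know
$$f_{X,Y}(x, y) = \frac{\partial^2 F_{X,Y}(x, y)}{\partial x\, \partial y}$$
Differentiating the CDF w.r.t. $x$, we have
$$\frac{\partial F}{\partial x} = \frac{\partial}{\partial x}\left(1 - e^{-x} - e^{-2y} + e^{-x - 2y - cxy}\right) = e^{-x} - (1 + cy)\, e^{-x - 2y - cxy}$$
Now differentiating the above equation w.r.t. $y$,
$$f_{X,Y}(x, y) = \frac{\partial^2 F}{\partial x\, \partial y} = \frac{\partial}{\partial y}\left[e^{-x} - (1 + cy)\, e^{-x - 2y - cxy}\right] = \left[(1 + cy)(2 + cx) - c\right] e^{-x - 2y - cxy}$$
For $x = y = 0$, the joint pdf at the origin is $f_{X,Y}(0, 0) = 2 - c$.
Since the pdf is a nonnegative function, the inequality $2 - c \ge 0$ must hold; hence, the upper
bound of the parameter $c$ is $c \le 2$. Combining both bounds, $0 \le c \le 2$.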
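The derivation can be cross-checked numerically. The sketch below codes the joint CDF from the problem statement and the pdf $\left[(1 + cy)(2 + cx) - c\right] e^{-x - 2y - cxy}$, compares the pdf with a central finite-difference estimate of the mixed partial derivative of the CDF, and scans a grid for negative pdf values. The function names, the grid, the step size and the test values of $c$ are illustrative assumptions.

```python
import math


def joint_cdf(x, y, c):
    """Joint CDF of the bivariate exponential model, valid for x, y >= 0."""
    return 1 - math.exp(-x) - math.exp(-2 * y) + math.exp(-x - 2 * y - c * x * y)


def joint_pdf(x, y, c):
    """Mixed partial derivative of the joint CDF, in closed form."""
    return ((1 + c * y) * (2 + c * x) - c) * math.exp(-x - 2 * y - c * x * y)


def mixed_difference(x, y, c, h=1e-3):
    """Central finite-difference estimate of d^2 F / (dx dy) at an interior point."""
    return (
        joint_cdf(x + h, y + h, c)
        - joint_cdf(x + h, y - h, c)
        - joint_cdf(x - h, y + h, c)
        + joint_cdf(x - h, y - h, c)
    ) / (4 * h * h)


def min_pdf_on_grid(c, upper=5.0, step=0.25):
    """Smallest pdf value on a square grid over [0, upper] x [0, upper]."""
    n = int(upper / step)
    return min(joint_pdf(i * step, j * step, c) for i in range(n + 1) for j in range(n + 1))


# The closed-form pdf should agree with the finite-difference estimate.
print(joint_pdf(0.7, 0.4, 1.0), mixed_difference(0.7, 0.4, 1.0))

# The pdf stays nonnegative for c in [0, 2] and turns negative at the origin for c > 2.
for c in (0.0, 1.0, 2.0, 2.5):
    print(c, min_pdf_on_grid(c))
```

A grid scan is not a proof, but it agrees with the analytical argument: for $c \ge 0$ and $x, y \ge 0$ the factor $(1 + cy)(2 + cx)$ is at least 2, so the pdf is smallest at the origin.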
7. Concluding Remarks
Joint pdfs and CDFs of bivariate random variables are discussed in this lecture. Example
problems on joint distributions are also presented here. The next lecture presents marginal
probability distributions of discrete and continuous random variables.