
Tutorial 2
Discrete random variable
Probability distribution of discrete random variable
Trials
Prepared by:
Wojciech Artichowicz, PhD Eng.
GUT Department of Hydro-Engineering
Summer 2014/15
DISCRETE RANDOM VARIABLE AND ITS PROBABILITY DISTRIBUTION
TRIALS
REFERENCES
DISCRETE RANDOM VARIABLE AND ITS PROBABILITY DISTRIBUTION
Exercise 1. (Source [12])
Make a plot of the probability distribution function (PDF) and cumulative distribution
function (CDF) of random variable X, for which: $P(X=0)=\frac{1}{10}$, $P(X=1)=\frac{9}{10}$. Find its
expected value and standard deviation.
Solution:
The plot is made by marking the given values on the horizontal axis (here X=0 and X=1) and
drawing the probabilities assigned to them (here 1/10 and 9/10) above these values.
The expected value is given by:
$$E(X) = \sum_{x \in D} x \cdot P(x)$$
thus
$$E(X) = 0 \cdot \frac{1}{10} + 1 \cdot \frac{9}{10} = \frac{9}{10}.$$
The variance (which is the squared standard deviation) is given by:
$$V(X) = E\left[(X - E(X))^2\right] = \sum_{x \in D} (x - E(X))^2 \cdot p(x)$$
$$V(X) = \left(0 - \frac{9}{10}\right)^2 \cdot \frac{1}{10} + \left(1 - \frac{9}{10}\right)^2 \cdot \frac{9}{10} = \frac{90}{1000} = \frac{9}{100}$$
The standard deviation is the square root of the variance:
$$s(X) = \sqrt{V(X)} = \sqrt{\frac{9}{100}} = \frac{3}{10}$$
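The formulas above can be checked numerically. The following is a minimal sketch in Python; the helper names `expected_value` and `variance` and the dictionary representation of the distribution are assumptions made for illustration and are not part of the original tutorial.

```python
from fractions import Fraction
from math import sqrt


def expected_value(dist):
    """E(X): sum of x * p(x) over all values x of the random variable."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """V(X): sum of (x - E(X))^2 * p(x) over all values x."""
    mean = expected_value(dist)
    return sum((x - mean) ** 2 * p for x, p in dist.items())


# Distribution from Exercise 1, stored as value -> probability
dist = {0: Fraction(1, 10), 1: Fraction(9, 10)}

mean = expected_value(dist)  # exact fraction 9/10
var = variance(dist)         # exact fraction 9/100
std = sqrt(var)              # floating-point square root, about 0.3

print(mean, var, std)
```

Exact fractions are used so that the results can be compared directly with the hand calculation; only the square root is evaluated in floating point.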
Exercise 2.
Specify a random variable of an experiment consisting of tossing a cubic die. Make a plot of
its probability distribution and cumulative probability distribution. Find its expected value and
variance.
Solution:
Assuming that the die is a standard one with dots on its faces, the simplest approach is to
define a random variable that assigns to each face the number of dots on it, that is, the
numbers 1, 2, ..., 6. The probability of obtaining any of these values is equal to 1/6.
The probability distribution in that case is:
x | 1   | 2   | 3   | 4   | 5   | 6
p | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 | 1/6
The expected value of a discrete random variable is given by
$$E(X) = \sum_{x \in D} x \cdot p(x)$$
$$E(X) = 1 \cdot \frac{1}{6} + 2 \cdot \frac{1}{6} + \ldots + 6 \cdot \frac{1}{6} = \frac{7}{2} = 3.5.$$
The variance of a discrete random variable is given by:
$$V(X) = E\left[(X - E(X))^2\right] = \sum_{x \in D} (x - E(X))^2 \cdot p(x).$$
$$V(X) = (1 - 3.5)^2 \cdot \frac{1}{6} + (2 - 3.5)^2 \cdot \frac{1}{6} + \ldots + (6 - 3.5)^2 \cdot \frac{1}{6} \approx 2.917.$$
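The cumulative distribution function requested in the exercise is the running sum of the probabilities. A short sketch of this in Python is given below; the variable names are chosen for illustration only.

```python
from fractions import Fraction
from itertools import accumulate

values = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6

# F(x_i) is the sum of the probabilities of all values up to and including x_i
cdf = list(accumulate(probs))

mean = sum(x * p for x, p in zip(values, probs))
var = sum((x - mean) ** 2 * p for x, p in zip(values, probs))

for x, p, f in zip(values, probs, cdf):
    print(x, p, f)
print(mean, var, float(var))
```

The exact variance is 35/12, which rounds to the value 2.917 quoted above.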
Exercise 3.
Find the probability distribution function, cumulative distribution function, expected value
and variance of a random variable X, the sum of the numbers of dots obtained when two
tetrahedral dice are tossed. Make a plot of the PDF and CDF.
Solution:
The easiest approach is to create a table, and write possible outcomes, numbers of favourable
events and their probabilities.
x        | 2    | 3        | 4             | 5                  | 6             | 7        | 8
outcomes | 1+1  | 1+2, 2+1 | 1+3, 2+2, 3+1 | 1+4, 2+3, 3+2, 4+1 | 2+4, 3+3, 4+2 | 3+4, 4+3 | 4+4
p        | 1/16 | 2/16     | 3/16          | 4/16               | 3/16          | 2/16     | 1/16
F        | 1/16 | 3/16     | 6/16          | 10/16              | 13/16         | 15/16    | 16/16
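Expected value:
$$E(X) = \sum_{x \in D} x \cdot p(x) = 2 \cdot \frac{1}{16} + 3 \cdot \frac{2}{16} + \ldots + 8 \cdot \frac{1}{16} = 5.$$
Variance:
$$V(X) = \sum_{x \in D} (x - E(X))^2 \cdot p(x) = (2 - 5)^2 \cdot \frac{1}{16} + (3 - 5)^2 \cdot \frac{2}{16} + \ldots + (8 - 5)^2 \cdot \frac{1}{16} = \frac{5}{2}.$$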
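The table of outcomes can also be generated by enumerating all 16 equally likely pairs of faces. The sketch below assumes the faces are numbered 1 to 4; the names `faces`, `counts` and `dist` are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = [1, 2, 3, 4]

# Count how many of the 16 ordered pairs (a, b) give each sum a + b
counts = Counter(a + b for a, b in product(faces, repeat=2))
total = len(faces) ** 2

# Probability of each sum = favourable outcomes / all outcomes
dist = {x: Fraction(n, total) for x, n in sorted(counts.items())}

mean = sum(x * p for x, p in dist.items())
var = sum((x - mean) ** 2 * p for x, p in dist.items())

print(dist)
print(mean, var)
```

Note that `Fraction` reduces its results, so 2/16 is printed as 1/8 and 4/16 as 1/4.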
Exercise 4. (Source [4])
A discrete random variable has the probability distribution defined in the table below.
Make a plot of its probability distribution and cumulative distribution function. Find its
expected value and variance.
x | 1   | 3   | 6   | 8
p | 0.2 | 0.1 | 0.4 | 0.3
Solution:
Cumulative distribution function:
x | 1   | 3   | 6   | 8
F | 0.2 | 0.3 | 0.7 | 1
Expected value:
$$E(X) = \sum_{x \in D} x \cdot p(x) = 1 \cdot 0.2 + 3 \cdot 0.1 + 6 \cdot 0.4 + 8 \cdot 0.3 = 5.3.$$
Variance:
$$V(X) = \sum_{x \in D} (x - E(X))^2 \cdot p(x) = (1 - 5.3)^2 \cdot 0.2 + (3 - 5.3)^2 \cdot 0.1 + (6 - 5.3)^2 \cdot 0.4 + (8 - 5.3)^2 \cdot 0.3 = 6.61$$
Exercise 5. (Source [4])
In a batch consisting of ten items, eight are of a standard type (the remaining two are of
non-standard types). Two items are chosen at random. Find the probability distribution of the
number of standard items among the two chosen. Find the expected value and variance. Make a plot
of the probability distribution function and cumulative distribution function.
Solution:
The first step is to find the probabilities of obtaining a certain number of standard items among
the chosen ones. There are the following possibilities:
- no standard items among the chosen ones (both are non-standard):
$$P(X=0) = \frac{2}{10} \cdot \frac{1}{9} = \frac{1}{45};$$
- one item is of the standard type and one is of a non-standard type:
$$P(X=1) = \frac{2}{10} \cdot \frac{8}{9} + \frac{8}{10} \cdot \frac{2}{9} = \frac{16}{45};$$
- two standard items were chosen:
$$P(X=2) = \frac{8}{10} \cdot \frac{7}{9} = \frac{28}{45}.$$
The values taken by the random variable are the numbers of standard items among the two
chosen, that is x = 0, 1, 2. The table below contains the random variable values, their
probabilities and the cumulative distribution function values.
x | 0    | 1     | 2
p | 1/45 | 16/45 | 28/45
F | 1/45 | 17/45 | 1
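Expected value:
$$E(X) = \sum_{x \in D} x \cdot p(x) = 0 \cdot \frac{1}{45} + 1 \cdot \frac{16}{45} + 2 \cdot \frac{28}{45} = \frac{8}{5} = 1.6$$
Variance:
$$V(X) = \sum_{x \in D} (x - E(X))^2 \cdot p(x) = \left(0 - \frac{8}{5}\right)^2 \cdot \frac{1}{45} + \left(1 - \frac{8}{5}\right)^2 \cdot \frac{16}{45} + \left(2 - \frac{8}{5}\right)^2 \cdot \frac{28}{45} = \frac{64}{225} \approx 0.284444$$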
Exercise 6. (Source [4])
There are six white and three black balls in a box. Two balls are chosen at random. Find the
probability distribution of the random variable describing the number of white balls among
the chosen ones. Make a plot of its probability distribution and cumulative distribution function.
Solution:
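The probabilities that none, one and two white balls will be chosen are equal to:
$$P(X=0) = \frac{3}{9} \cdot \frac{2}{8} = \frac{1}{12},$$
$$P(X=1) = \frac{3}{9} \cdot \frac{6}{8} + \frac{6}{9} \cdot \frac{3}{8} = \frac{1}{2},$$
$$P(X=2) = \frac{6}{9} \cdot \frac{5}{8} = \frac{5}{12}.$$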
Probability distribution and cumulative distribution function:
x | 0    | 1    | 2
p | 1/12 | 1/2  | 5/12
F | 1/12 | 7/12 | 1
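The probabilities in Exercises 5 and 6 were obtained by multiplying probabilities of consecutive draws. The same numbers follow from counting combinations, an approach usually called the hypergeometric distribution (this name and the function `hypergeom_pmf` below are not used in the original text and are introduced here only for illustration).

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(k, good, bad, draws):
    """P(exactly k 'good' items among `draws` items drawn without replacement)."""
    favourable = comb(good, k) * comb(bad, draws - k)
    return Fraction(favourable, comb(good + bad, draws))


# Exercise 6: six white ("good") and three black balls, two drawn
ex6 = [hypergeom_pmf(k, good=6, bad=3, draws=2) for k in range(3)]

# Exercise 5: eight standard ("good") and two non-standard items, two drawn
ex5 = [hypergeom_pmf(k, good=8, bad=2, draws=2) for k in range(3)]

print(ex6)
print(ex5)
```

Both lists sum to one, which is a quick sanity check that no case was omitted.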
Exercise 7. (Source [12])
Make a plot of the probability distribution function and cumulative distribution function F(x) of a
random variable X, for which:
a) $P(X=0) = P(X=1) = \frac{1}{2}$;
b) $P(X=1) = \frac{1}{4}$, $P(X=2) = \frac{3}{4}$;
c) $P(X=k) = \frac{1}{2^k}$, k = 1, 2, 3, ...
Find F(0), F(1), F(2). Mark those values on the plots. In c) find the expected value and variance.
Solution:
The value of cumulative distribution function in a given point x0 is a sum of probabilities for
values of x less than or equal to x0.
a)
$F(0) = \frac{1}{2}$
$F(1) = 1$
$F(2) = 1$
b)
$F(0) = 0$
$F(1) = \frac{1}{4}$
$F(2) = 1$
c) The probability that the random variable takes a given value is
$P(X=k) = 1/2^k$, for k = 1, 2, 3, ... . The cumulative distribution function sums the
probabilities of all values up to x, so it can be expressed as:
$$F(x) = \sum_{k=1}^{x} \frac{1}{2^k}.$$
Evaluating P(x) and F(x) for successive values one gets
$P(0) = 0$ and $F(0) = 0$,
$P(1) = \frac{1}{2}$ and $F(1) = \frac{1}{2}$,
$P(2) = \frac{1}{4}$ and $F(2) = \sum_{k=1}^{2} \frac{1}{2^k} = \frac{1}{2} + \frac{1}{4} = \frac{3}{4}$.
Remark: The plot does not include the whole range of values, only x = 0, 1, 2, ..., 10.
The expected value is given by the formula $E(X) = \sum_{x \in D} x \cdot p(x)$. Substituting the
probability distribution function yields:
$$E(X) = \sum_{x=1}^{\infty} x \cdot \frac{1}{2^x}.$$
The expected value is the sum of the series
$$E(X) = \sum_{x=1}^{\infty} \frac{x}{2^x} = 2.$$
Remark: The analysis of convergence of series is not a part of the statistics course. It is wise to use an encyclopaedia of
mathematics or software capable of solving such problems, for example Wolfram Alpha.
The variance is defined as $V(X) = \sum_{x \in D} (x - E(X))^2 \cdot p(x)$, thus:
$$V(X) = \sum_{x=1}^{\infty} \frac{(x - 2)^2}{2^x} = 2.$$
Remark: Wolfram Alpha outcome.
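Without proving convergence, the two series can at least be examined numerically by computing partial sums. The sketch below is only a plausibility check, not a proof; the function name `partial_sums` and the chosen numbers of terms are arbitrary.

```python
def partial_sums(n_terms):
    """Partial sums of the series for E(X) and V(X) with p(x) = 1 / 2**x."""
    mean = sum(x / 2**x for x in range(1, n_terms + 1))
    # The limit E(X) = 2 derived above is used inside the variance series
    var = sum((x - 2) ** 2 / 2**x for x in range(1, n_terms + 1))
    return mean, var


for n in (5, 10, 20, 60):
    print(n, partial_sums(n))
```

Both partial sums approach 2 as the number of terms grows, in agreement with the values quoted above.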
Exercise 8. (Source [4])
Discrete random variable X takes three possible values: x1=4 with probability p1=0.5, x2=6
with probability p2=0.3, x3 with p3. Find x3 and p3 knowing, that E(X)=8.
Solution:
To solve this exercise it is necessary to use the formula for the expected value:
$$E(X) = \sum_{i=1}^{n} x_i \cdot p_i.$$
Here n=3, so writing out the above formula yields:
$$E(X) = \sum_{i=1}^{3} x_i \cdot p_i = x_1 \cdot p_1 + x_2 \cdot p_2 + x_3 \cdot p_3.$$
The given values are E(X), p1, x1, p2, x2, so it can be written:
$$x_1 \cdot p_1 + x_2 \cdot p_2 + x_3 \cdot p_3 = E(X).$$
Substituting the values yields:
$$4 \cdot 0.5 + 6 \cdot 0.3 + x_3 \cdot p_3 = 8.$$
Rearranging this equation gives:
$$x_3 \cdot p_3 = 4.2.$$
There are two unknowns and one equation. The axiomatic definition of probability requires
the sum of probabilities to be equal to one:
$$p_1 + p_2 + p_3 = 1.$$
From this equation p3 can be found:
$$0.5 + 0.3 + p_3 = 1,$$
$$p_3 = 0.2.$$
Substituting this value into
$$x_3 \cdot p_3 = 4.2$$
gives x3:
$$x_3 \cdot 0.2 = 4.2,$$
$$x_3 = 21.$$
The answer is $x_3 = 21$ and $p_3 = 0.2$.
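The two steps of the solution (first the normalisation condition, then the expected value equation) can be reproduced in a few lines. This is a sketch with illustrative variable names, using exact fractions to avoid rounding.

```python
from fractions import Fraction

x1, p1 = 4, Fraction(1, 2)
x2, p2 = 6, Fraction(3, 10)
mean = 8  # given E(X)

# Step 1: probabilities must sum to one
p3 = 1 - p1 - p2

# Step 2: solve x1*p1 + x2*p2 + x3*p3 = E(X) for x3
x3 = (mean - x1 * p1 - x2 * p2) / p3

print(p3, x3)
```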
Exercise 9. (Source [4])
A random variable X takes the following values: x1=-1, x2=0, x3=1. Its expected value and
the expected value of its square are equal to $E(X)=0.1$ and $E(X^2)=0.9$. Find the probabilities p1, p2
and p3 assigned to the values x1, x2 and x3.
Solution:
There are three unknowns in the exercise. Two equations follow directly from the statement
of the exercise. The first one concerns the expected value:
$$E(X) = \sum_{i=1}^{n} x_i \cdot p_i.$$
$E(X^2)$ means that the values taken by the random variable are squared and multiplied by the
probabilities assigned to these values:
$$E(X^2) = \sum_{i=1}^{n} x_i^2 \cdot p_i.$$
Additionally it is known that $\sum_{i=1}^{n} p_i = 1$, which provides the missing equation. The following system of
equations is obtained:
$$\begin{cases} \sum_{i=1}^{n} x_i \cdot p_i = E(X) \\ \sum_{i=1}^{n} x_i^2 \cdot p_i = E(X^2) \\ \sum_{i=1}^{n} p_i = 1 \end{cases} \Rightarrow \begin{cases} x_1 \cdot p_1 + x_2 \cdot p_2 + x_3 \cdot p_3 = E(X) \\ x_1^2 \cdot p_1 + x_2^2 \cdot p_2 + x_3^2 \cdot p_3 = E(X^2) \\ p_1 + p_2 + p_3 = 1 \end{cases},$$
which after substitution of the numerical values takes the following form:
$$\begin{cases} -1 \cdot p_1 + 0 \cdot p_2 + 1 \cdot p_3 = 0.1 \\ (-1)^2 \cdot p_1 + 0^2 \cdot p_2 + 1^2 \cdot p_3 = 0.9 \\ p_1 + p_2 + p_3 = 1 \end{cases}.$$
The remaining step is to solve this system of equations:
$$\begin{cases} -p_1 + p_3 = 0.1 \\ p_1 + p_3 = 0.9 \\ p_1 + p_2 + p_3 = 1 \end{cases}$$
Adding the first two equations gives
$$2 p_3 = 1,$$
$$p_3 = 0.5,$$
$$p_1 = 0.9 - p_3 = 0.4,$$
$$p_2 = 1 - p_1 - p_3 = 0.1.$$
Finally the solution is obtained: $p_1 = 0.4$, $p_2 = 0.1$, $p_3 = 0.5$.
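The same system of three linear equations can be solved mechanically. The sketch below uses a small Gauss-Jordan elimination written with exact fractions; the function `solve_linear` is a hypothetical helper written for this illustration (it assumes the system has a unique solution) and is not the method used in the original solution, which eliminates the unknowns by hand.

```python
from fractions import Fraction


def solve_linear(a, b):
    """Solve a * x = b by Gauss-Jordan elimination (unique solution assumed)."""
    n = len(b)
    # Augmented matrix [a | b] with exact rational entries
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Find a row with a non-zero pivot in this column and move it into place
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        # Scale the pivot row so that the pivot equals 1
        m[col] = [v / m[col][col] for v in m[col]]
        # Eliminate this column from every other row
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return [row[-1] for row in m]


xs = [-1, 0, 1]
a = [
    [x for x in xs],       # E(X)   = sum of x_i * p_i
    [x**2 for x in xs],    # E(X^2) = sum of x_i^2 * p_i
    [1 for _ in xs],       # sum of p_i = 1
]
b = [Fraction(1, 10), Fraction(9, 10), 1]

p1, p2, p3 = solve_linear(a, b)
print(p1, p2, p3)
```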
Exercise 10. (Source [11])
Random variable X takes values $x_k$ with probabilities $p_k = c \cdot q^k$, where 0<q<1,
k = 0, 1, 2, ... . Find the constant c.
Solution:
The sum of probabilities over every k has to be equal to 1 (based on the definition of
probability):
$$\sum_{k=0}^{\infty} p_k = \sum_{k=0}^{\infty} c \cdot q^k = 1.$$
If 0<q<1 then $\sum_{k=0}^{\infty} q^k$ is a geometric series, with sum equal to:
$$\sum_{k=0}^{\infty} q^k = \frac{1}{1-q}.$$
Remark: Wolfram Alpha outcome.
Therefore
$$c \cdot \sum_{k=0}^{\infty} q^k = 1,$$
$$c \cdot \frac{1}{1-q} = 1,$$
$$c = 1 - q.$$
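The result c = 1 - q can be checked numerically by summing a large but finite number of terms. The sketch below is a plausibility check only; the function name `total_probability`, the truncation at 2000 terms and the sample values of q are arbitrary choices.

```python
def total_probability(q, n_terms=2000):
    """Truncated sum of p_k = c * q**k for k = 0, ..., n_terms - 1 with c = 1 - q."""
    c = 1 - q
    return sum(c * q**k for k in range(n_terms))


for q in (0.1, 0.5, 0.9):
    print(q, total_probability(q))
```

For each tested q the truncated sum is equal to 1 up to floating-point rounding.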
TRIALS
Exercise 11. (Source [12])
The probability that a typical student is not prepared for classes is equal to
p=1/3. The teacher randomly chooses 4 students. Let X denote the number of students among
those chosen who are not prepared for the classes. Find P(X=3).
Solution:
Among the four chosen students all can be prepared for classes (meaning that X=0), one can be
unprepared (X=1), etc. To find the probability P(X=3) that three of the four chosen students
are unprepared, every such situation has to be considered: the first, second and third
student are unprepared and the fourth is prepared; the first, second and fourth are unprepared
and the third is prepared; and so on. That is
$$P(X=3) = \frac{1}{3} \cdot \frac{1}{3} \cdot \frac{1}{3} \cdot \frac{2}{3} + \frac{1}{3} \cdot \frac{1}{3} \cdot \frac{2}{3} \cdot \frac{1}{3} + \frac{1}{3} \cdot \frac{2}{3} \cdot \frac{1}{3} \cdot \frac{1}{3} + \frac{2}{3} \cdot \frac{1}{3} \cdot \frac{1}{3} \cdot \frac{1}{3} = \frac{2}{81} + \frac{2}{81} + \frac{2}{81} + \frac{2}{81} = \frac{8}{81}.$$
It can be noticed that P(X=3) is the sum of four equal terms, one for each arrangement of
three unprepared students among the four:
$$P(X=3) = 4 \cdot \left(\frac{1}{3}\right)^3 \cdot \frac{2}{3} = \frac{8}{81}.$$
The solution above was obtained by applying the multiplication theorem.
This exercise can also be solved using a different approach: let us ask how many ways there are
to choose three out of four students (assuming that the order is irrelevant). It is the
combination $C_{3,4} = 4$. Multiplying the number of such possibilities $C_{3,4}$ by the probability that
a given three of the four chosen students are unprepared gives:
$$P(X=3) = C_{3,4} \cdot \left(\frac{1}{3}\right)^3 \cdot \frac{2}{3} = \frac{8}{81}.$$
It can be noticed that X turns out to be a binomial random variable
$$P(X=k) = C_{k,N} \cdot p^k \cdot q^{N-k},$$
in which N=4, k=3, p=1/3, q=2/3.
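The first approach (listing every arrangement and applying the multiplication theorem) can be automated by brute force. In the sketch below each student is represented by a boolean flag, which is an illustrative modelling choice rather than part of the original solution.

```python
from fractions import Fraction
from itertools import product

p = Fraction(1, 3)  # probability that a student is unprepared
q = 1 - p           # probability that a student is prepared
n_students = 4

prob_x3 = Fraction(0)
# Each outcome is a tuple of four flags; True means "unprepared"
for outcome in product([True, False], repeat=n_students):
    if sum(outcome) == 3:
        # Multiplication theorem: multiply the probabilities of independent events
        prob = Fraction(1)
        for unprepared in outcome:
            prob *= p if unprepared else q
        prob_x3 += prob

print(prob_x3)
```

Exactly four of the sixteen outcomes contain three unprepared students, and their probabilities add up to 8/81.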
Exercise 12. (Source [4])
Two equally skilled players are playing chess. What is more probable for each of them: to
win two out of four games, or three out of six? Let us assume that a draw is
impossible.
Solution:
If the players are equally skilled, the probability of winning a game is equal to p=1/2
for each of them. Likewise, the probability of losing a game is equal to q=1-p=1/2.
The binomial distribution makes it easy to compute the probability that a given event occurs k
times in N trials. The first question concerns two wins in four games, k=2, N=4:
$$P(X=k) = C_{k,N} \cdot p^k \cdot q^{N-k},$$
$$P(X=2) = C_{2,4} \cdot p^2 \cdot q^{4-2},$$
$$P(X=2) = \frac{4!}{2!(4-2)!} \cdot \left(\frac{1}{2}\right)^2 \cdot \left(\frac{1}{2}\right)^2 = \frac{3}{8}.$$
The second question concerns three wins in six games, k=3, N=6:
$$P(X=3) = \frac{6!}{3!(6-3)!} \cdot \left(\frac{1}{2}\right)^3 \cdot \left(\frac{1}{2}\right)^3 = \frac{5}{16}.$$
Winning two games out of four is more probable:
$$P(X=2) = \frac{6}{16} > \frac{5}{16} = P(X=3).$$
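The binomial formula used above translates directly into code. The following sketch assumes a helper named `binomial_pmf` (an illustrative name) and uses exact fractions so that the two probabilities can be compared without rounding.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) = C(k, N) * p^k * q^(N - k) with q = 1 - p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


half = Fraction(1, 2)
two_of_four = binomial_pmf(2, 4, half)
three_of_six = binomial_pmf(3, 6, half)

print(two_of_four, three_of_six, two_of_four > three_of_six)
```

Note that `math.comb(n, k)` takes the number of trials first, whereas the notation $C_{k,N}$ in the text writes the number of successes first.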
Exercise 13. (Source [4])
Prove that the expected value of a discrete random variable X, the number of occurrences of
event A (where p=P(A)) in N independent trials, is equal to E(X)=N·p.
Solution:
The number of occurrences of an event with constant probability in N independent trials is
described by the binomial distribution:
$$P(X=k) = C_{k,N} \cdot p^k \cdot q^{N-k}.$$
The term for k=0 is equal to zero, so the summation can start from k=1:
$$E(X) = \sum_{k=1}^{N} k \cdot P(X=k) = \sum_{k=1}^{N} k \cdot C_{k,N} \cdot p^k \cdot q^{N-k} = \sum_{k=1}^{N} k \cdot \frac{N!}{k!(N-k)!} \cdot p^k \cdot q^{N-k}.$$
Let us notice that
$$\frac{k}{k!} = \frac{1}{(k-1)!}$$
thus
$$E(X) = \sum_{k=1}^{N} \frac{N!}{(k-1)!(N-k)!} \cdot p^k \cdot q^{N-k},$$
$$E(X) = \sum_{k=1}^{N} \frac{N \cdot (N-1)!}{(k-1)!(N-k)!} \cdot p^k \cdot q^{N-k},$$
$$E(X) = \sum_{k=1}^{N} \frac{N \cdot (N-1)!}{(k-1)!(N-k)!} \cdot p \cdot p^{k-1} \cdot q^{N-k}.$$
Factoring N and p out of the summation:
$$E(X) = N \cdot p \cdot \sum_{k=1}^{N} \frac{(N-1)!}{(k-1)!(N-k)!} \cdot p^{k-1} \cdot q^{N-k}.$$
Next, the so-called binomial formula has to be used:
$$(a+b)^n = \sum_{i=0}^{n} \frac{n!}{i!(n-i)!} \cdot a^i \cdot b^{n-i}.$$
To do this it is necessary to change the range of the summation index from k = 1,
2, ..., N to k = 0, 1, 2, ..., N-1. The expression under the summation operator has to be
rearranged so that its value remains the same even though the index range has
changed. To achieve this, k has to be replaced with k+1, for example:
$$\sum_{k=1}^{N} \frac{1}{(k-1)!} = \sum_{k=0}^{N-1} \frac{1}{k!}.$$
Remark: Checking for N=3 that the sum above did not change:
$$\sum_{k=1}^{3} \frac{1}{(k-1)!} = \frac{1}{0!} + \frac{1}{1!} + \frac{1}{2!} = \sum_{k=0}^{2} \frac{1}{k!}.$$
Similarly, the expression $\sum_{k=1}^{N} \frac{(N-1)!}{(k-1)!(N-k)!} \cdot p^{k-1} \cdot q^{N-k}$ has to be transformed:
$$\sum_{k=1}^{N} \frac{(N-1)!}{(k-1)!(N-k)!} \cdot p^{k-1} \cdot q^{N-k} = \sum_{k=0}^{N-1} \frac{(N-1)!}{((k+1)-1)!(N-(k+1))!} \cdot p^{(k+1)-1} \cdot q^{N-(k+1)} = \sum_{k=0}^{N-1} \frac{(N-1)!}{k!(N-k-1)!} \cdot p^k \cdot q^{N-k-1}.$$
Let us introduce $\tilde{N} = N - 1$:
$$\sum_{k=0}^{\tilde{N}} \frac{\tilde{N}!}{k!(\tilde{N}-k)!} \cdot p^k \cdot q^{\tilde{N}-k}.$$
Now it can easily be seen that the obtained expression is identical to the binomial formula
and can be written as:
$$\sum_{k=0}^{\tilde{N}} \frac{\tilde{N}!}{k!(\tilde{N}-k)!} \cdot p^k \cdot q^{\tilde{N}-k} = (p+q)^{\tilde{N}}.$$
Knowing that $\tilde{N} = N - 1$ yields
$$(p+q)^{N-1}$$
thus
$$E(X) = N \cdot p \cdot (p+q)^{N-1}.$$
It is known that q=1-p, thus
$$E(X) = N \cdot p \cdot (p + (1-p))^{N-1},$$
$$E(X) = N \cdot p \cdot 1^{N-1},$$
$$E(X) = N \cdot p.$$
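The identity E(X) = N·p can be spot-checked by evaluating the defining sum directly for a few parameter values. This does not replace the proof; it is only a sanity check, and the function name `binomial_mean` and the tested pairs (N, p) are arbitrary.

```python
from fractions import Fraction
from math import comb


def binomial_mean(n, p):
    """E(X) computed from the definition: sum of k * P(X = k) for k = 0..N."""
    q = 1 - p
    return sum(k * comb(n, k) * p**k * q ** (n - k) for k in range(n + 1))


cases = [(4, Fraction(1, 3)), (6, Fraction(1, 2)), (10, Fraction(2, 7))]
results = [(n, p, binomial_mean(n, p), n * p) for n, p in cases]

for n, p, from_sum, from_formula in results:
    print(n, p, from_sum, from_formula)
```

With exact fractions, the value obtained from the sum coincides with N·p in every tested case.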
REFERENCES
[1] Buslenko N.P. Golenko D.I., Sobol I.M., Sragowicz W.G., Szrejder J.A. (1976), Metoda
Monte Carlo, Warszawa
[2] Deutsch R. (1969), Teoria Estymacji. Państwowe Wydawnictwo Naukowe. Warszawa.
[3] Devore J. L. (2012), Probability and Statistics for Engineering and the Sciences.
International edition. 8th edition. BROOKS/COLE.
[4] Gmurman W.J. (1976), Zbiór zadań z rachunku prawdopodobieństwa i statystyki
matematycznej. Wydawnictwa Naukowo-Techniczne. Warszawa.
[5] Hastie T., Tibshirani R., Friedman J. (2009), The Elements of Statistical Learning. Data
Mining, Inference, and Prediction. Second Edition. Springer
[6] Greń J. (1970), Statystyka matematyczna. modele i zadania. Wydanie czwarte uzupełnione.
Warszawa.
[7] Ligman J., Stachowski E., Zalewska A. (1996), Zbiór zadań z kombinatoryki i rachunku
prawdopodobieństwa dla uczniów szkół średnich. Oficyna wydawniczo-poligraficzna i
reklamowo-handlowa „ADAM” Warszawa.
[8] Ligman J. (1976), Zbiór zadań z kombinatoryki i rachunku prawdopodobieństwa dla szkół
średnich. Wydawnictwa szkolne i pedagogiczne. Warszawa.
[9] Medina M.A. (2011), Introduction To Probability and Statistics in Hydrology. Duke
University, Durham
[10] Pawłowski Z. (1976), Statystyka matematyczna. Państwowe Wydawnictwo naukowe.
Warszawa.
[11] Plucińska A., Pluciński E. (2000), Probabilistyka. Wydawnictwa Naukowo-Techniczne.
Warszawa.
[12] Plucińska A., Pluciński E.(1975) Zadania z rachunku prawdopodobieństwa i statystyki
matematycznej dla studentów politechnik. Wydawnictwa Naukowo-Techniczne. Warszawa.
[13] Kaczmarek Z. (1970), Metody statystyczne w hydrologii i meteorologii. Wydawnictwa
komunikacji i łączności. Warszawa.
[14] Koronacki J., Mielniczuk J. (2006), Statystyka dla studentów kierunków technicznych i
przyrodniczych. Wydawnictwa Naukowo-Techniczne. Warszawa.
[15] Korzyński M. (2006), Metodyka eksperymentu. Planowanie, realizacja i statystyczne
opracowanie wyników eksperymentów technologicznych. Wydawnictwa Naukowo-Techniczne. Warszawa.
[16] Szlenk W. (1970), Rachunek prawdopodobieństwa dla klasy IV liceum
ogólnokształcącego i technikum. Wydawnictwa szkolne i pedagogiczne. Warszawa.
[17] Taylor J.R. (2011), Wstęp do analizy błędu pomiarowego. Wydanie II zmienione.
Wydawnictwo Naukowe PWN Warszawa.
[18] Walesiak M., Gatnar E. (2009), Statystyczna analiza danych z wykorzystaniem programu
R. Wydawnictwo Naukowe PWN. Warszawa.
[19] Walpole R.E, Myers R.H., Myers S.L, Ye K., (2007), Probability & Statistics for
Engineers & Scientists. Pearson Education International
[20] Węglarczyk W. (2010), Statystyka w inżynierii środowiska. Politechnika Krakowska,
Kraków.
[21] Yevjevich V. (1972), Probability and statistics in hydrology. Water Resources
Publications, Fort Collins, Colorado, U.S.A
[22] Zaleski J. (2004) Modele stochastyczne i symulacja komputerowa. Zastosowanie do
systemów zaopatrzenia w wodę. Wydawnictwo Naukowe PWN. Warszawa.
[23] Zieliński R. (1970), Metody Monte Carlo. Wydawnictwa Naukowo-Techniczne.
Warszawa
[24] Zieliński R. (1974), Rachunek prawdopodobieństwa z elementami statystyki
matematycznej. Wydawnictwa Szkolne i Pedagogiczne. Warszawa.