Binomial Random Variable Approximations,
Conditional Probability Density Functions
and Stirling’s Formula
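Let X represent a binomial r.v. with parameters n and p, and let $q = 1 - p$. Then
\[
P(\text{``the number of occurrences of } A \text{ is between } k_1 \text{ and } k_2\text{''})
= P(X = k_1 \cup X = k_1 + 1 \cup \cdots \cup X = k_2)
= \sum_{k=k_1}^{k_2} P(X = k)
= \sum_{k=k_1}^{k_2} \binom{n}{k} p^k q^{n-k},
\]
that is,
\[
P(k_1 \le X \le k_2) = \sum_{k=k_1}^{k_2} P_n(k) = \sum_{k=k_1}^{k_2} \binom{n}{k} p^k q^{n-k}.
\]
This sum is laborious to evaluate directly for large n. In this context, two approximations are extremely useful.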
The Normal Approximation (De Moivre-Laplace Theorem)
Suppose $n \to \infty$ with p held fixed. Then for k in the $\sqrt{npq}$ neighborhood of np, we can approximate
\[
\binom{n}{k} p^k q^{n-k} \approx \frac{1}{\sqrt{2\pi npq}} \, e^{-(k-np)^2 / 2npq}.
\]
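And we have:
\[
P(k_1 \le X \le k_2) \approx \int_{k_1}^{k_2} \frac{1}{\sqrt{2\pi npq}} \, e^{-(x-np)^2 / 2npq} \, dx
= \int_{x_1}^{x_2} \frac{1}{\sqrt{2\pi}} \, e^{-y^2/2} \, dy,
\]
where
\[
x_1 = \frac{k_1 - np}{\sqrt{npq}}, \qquad x_2 = \frac{k_2 - np}{\sqrt{npq}}.
\]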
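As we know,
\[
P(k_1 \le X \le k_2) = \sum_{k=k_1}^{k_2} P_n(k) = \sum_{k=k_1}^{k_2} \binom{n}{k} p^k q^{n-k}.
\]
If $k_1$ and $k_2$ are within $(np - \sqrt{npq},\; np + \sqrt{npq})$, this sum is well approximated by
\[
P(k_1 \le X \le k_2) \approx \int_{x_1}^{x_2} \frac{1}{\sqrt{2\pi}} \, e^{-y^2/2} \, dy,
\qquad x_1 = \frac{k_1 - np}{\sqrt{npq}}, \quad x_2 = \frac{k_2 - np}{\sqrt{npq}}.
\]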
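The comparison between the exact binomial sum and the normal approximation can be sketched numerically. The function names and the test values n = 1000, p = 0.5, k1 = 490, k2 = 510 below are illustrative assumptions, not part of the slides; the exact sum is evaluated in the log domain so that large n does not overflow.

```python
import math

def binomial_log_pmf(n, k, p):
    """log of C(n,k) p^k q^(n-k), using lgamma to avoid overflow for large n."""
    q = 1.0 - p
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(q))

def binomial_interval_exact(n, p, k1, k2):
    """Exact P(k1 <= X <= k2) as a sum of binomial probabilities."""
    return sum(math.exp(binomial_log_pmf(n, k, p)) for k in range(k1, k2 + 1))

def normal_interval_approx(n, p, k1, k2):
    """Normal approximation: integral of the standard normal density from x1 to x2."""
    sigma = math.sqrt(n * p * (1.0 - p))
    x1 = (k1 - n * p) / sigma
    x2 = (k2 - n * p) / sigma
    # (1/sqrt(2*pi)) * integral_{x1}^{x2} exp(-y^2/2) dy via the standard error function
    return 0.5 * (math.erf(x2 / math.sqrt(2.0)) - math.erf(x1 / math.sqrt(2.0)))

# Illustrative values (assumed): both limits lie within np +/- sqrt(npq).
exact = binomial_interval_exact(1000, 0.5, 490, 510)
approx = normal_interval_approx(1000, 0.5, 490, 510)
print(f"exact = {exact:.4f}, normal approximation = {approx:.4f}")
```

The two numbers agree to within a few hundredths; the remaining gap comes mostly from replacing a discrete sum by an integral without a continuity correction.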
We can express this formula in terms of the normalized integral
\[
\operatorname{erf}(x) = \frac{1}{\sqrt{2\pi}} \int_0^x e^{-y^2/2} \, dy = -\operatorname{erf}(-x),
\]
that has been tabulated extensively.
   x     erf(x)       x     erf(x)       x     erf(x)       x     erf(x)
  0.05   0.01994     0.80   0.28814     1.55   0.43943     2.30   0.48928
  0.10   0.03983     0.85   0.30234     1.60   0.44520     2.35   0.49061
  0.15   0.05962     0.90   0.31594     1.65   0.45053     2.40   0.49180
  0.20   0.07926     0.95   0.32894     1.70   0.45543     2.45   0.49286
  0.25   0.09871     1.00   0.34134     1.75   0.45994     2.50   0.49379
  0.30   0.11791     1.05   0.35314     1.80   0.46407     2.55   0.49461
  0.35   0.13683     1.10   0.36433     1.85   0.46784     2.60   0.49534
  0.40   0.15542     1.15   0.37493     1.90   0.47128     2.65   0.49597
  0.45   0.17364     1.20   0.38493     1.95   0.47441     2.70   0.49653
  0.50   0.19146     1.25   0.39435     2.00   0.47726     2.75   0.49702
  0.55   0.20884     1.30   0.40320     2.05   0.47982     2.80   0.49744
  0.60   0.22575     1.35   0.41149     2.10   0.48214     2.85   0.49781
  0.65   0.24215     1.40   0.41924     2.15   0.48422     2.90   0.49813
  0.70   0.25804     1.45   0.42647     2.20   0.48610     2.95   0.49841
  0.75   0.27337     1.50   0.43319     2.25   0.48778     3.00   0.49865
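Note that the erf(x) tabulated here is not the error function found in most software libraries, which is defined as $(2/\sqrt{\pi}) \int_0^x e^{-t^2} dt$. A change of variable shows that the tabulated quantity equals one half of the library function evaluated at $x/\sqrt{2}$. The sketch below uses Python's `math.erf`; the name `slide_erf` is a hypothetical helper chosen for illustration.

```python
import math

def slide_erf(x):
    """erf as defined in the text: (1/sqrt(2*pi)) * integral_0^x exp(-y^2/2) dy."""
    return 0.5 * math.erf(x / math.sqrt(2.0))

# Print a few values for comparison with the table entries.
for x in (0.5, 0.7, 1.0, 2.0, 3.0):
    print(f"{x:4.2f}  {slide_erf(x):.5f}")
```

This helper can stand in for a table lookup whenever x falls between tabulated points.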
Example
A fair coin is tossed 5,000 times.
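Find the probability that the number of heads is between 2,475 and 2,525.
Solution
We need
\[
P(2{,}475 \le X \le 2{,}525).
\]
Since n is large we can use the normal approximation. Here
\[
p = \frac{1}{2}, \text{ so that } np = 2{,}500 \text{ and } \sqrt{npq} \approx 35.
\]
Since
\[
np - \sqrt{npq} \approx 2{,}465 \quad \text{and} \quad np + \sqrt{npq} \approx 2{,}535,
\]
the approximation is valid for $k_1 = 2{,}475$ and $k_2 = 2{,}525$.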
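Example - continued
\[
P(k_1 \le X \le k_2) \approx \int_{x_1}^{x_2} \frac{1}{\sqrt{2\pi}} \, e^{-y^2/2} \, dy.
\]
Here,
\[
x_1 = \frac{k_1 - np}{\sqrt{npq}} = -\frac{5}{7}, \qquad x_2 = \frac{k_2 - np}{\sqrt{npq}} = \frac{5}{7}.
\]
Using the table, $\operatorname{erf}(0.7) = 0.258$, so
\[
P(2{,}475 \le X \le 2{,}525) \approx \operatorname{erf}(x_2) - \operatorname{erf}(x_1)
= \operatorname{erf}(x_2) + \operatorname{erf}(|x_1|)
= 2 \operatorname{erf}\!\left(\frac{5}{7}\right) \approx 0.516.
\]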
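The coin example can be checked numerically. The sketch below computes the exact binomial probability with integer arithmetic and compares it with the normal approximation evaluated without rounding $\sqrt{npq}$ to 35 or x to 0.7; the variable names are illustrative assumptions.

```python
import math

n, k1, k2 = 5000, 2475, 2525

# Exact probability for a fair coin: count favourable sequences, divide by 2^n.
# Python integers are unbounded, and int / int division returns a correct float.
favourable = sum(math.comb(n, k) for k in range(k1, k2 + 1))
exact = favourable / 2**n

# Normal approximation with unrounded limits x1, x2.
mean = n * 0.5
sigma = math.sqrt(n * 0.5 * 0.5)
x1 = (k1 - mean) / sigma
x2 = (k2 - mean) / sigma
approx = 0.5 * (math.erf(x2 / math.sqrt(2.0)) - math.erf(x1 / math.sqrt(2.0)))

print(f"exact = {exact:.4f}, normal approximation = {approx:.4f}")
```

The unrounded approximation comes out near 0.52, close to the table-based value 0.516 obtained above, and the exact sum is only slightly larger.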
The Poisson Approximation
For large n, the Gaussian approximation of a binomial r.v. is valid only if p is fixed, i.e., only if $np \gg 1$ and $npq \gg 1$.
What if np is small, or if it does not increase with n?
- for example, $p \to 0$ as $n \to \infty$, such that $np = \lambda$ is a fixed number.
Consider random arrivals such as telephone calls over a line.
n: total number of calls in the interval (0, T). As $T \to \infty$ we have $n \to \infty$.
Suppose $n = \mu T$.
Δ: a small interval of duration Δ.
[Figure: time axis from 0 to T with call arrivals 1, 2, ..., n and a small interval Δ marked.]
p: probability of a single call (during 0 to T) occurring in Δ:
\[
p = \frac{\Delta}{T}, \qquad p \to 0 \text{ as } T \to \infty,
\]
\[
np = \mu T \cdot \frac{\Delta}{T} = \mu \Delta = \lambda.
\]
Normal approximation is invalid here.
For the interval Δ in the figure:
- (H) “success”: a call inside Δ,
- (T) “failure”: a call outside Δ.
$P_n(k)$: probability of obtaining k calls (in any order) in an interval of duration Δ,
\[
P_n(k) = \frac{n!}{(n-k)! \, k!} \, p^k (1-p)^{n-k}.
\]
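\[
P_n(k) = \frac{n(n-1)\cdots(n-k+1)}{n^k} \, \frac{(np)^k}{k!} \, \left(1 - \frac{np}{n}\right)^{n-k}
= \left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right)\cdots\left(1 - \frac{k-1}{n}\right)
\frac{\lambda^k}{k!} \, \frac{(1 - \lambda/n)^n}{(1 - \lambda/n)^k}.
\]
Thus,
\[
\lim_{n \to \infty,\; p \to 0,\; np = \lambda} P_n(k) = \frac{\lambda^k}{k!} \, e^{-\lambda},
\]
the Poisson p.m.f.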
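The limit can be observed numerically by holding $\lambda = np$ fixed while n grows. The sketch below assumes $\lambda = 2$ and $k = 3$ purely for illustration, and the function names are hypothetical.

```python
import math

def binomial_pmf(n, k, p):
    """C(n,k) p^k (1-p)^(n-k), evaluated in the log domain for numerical stability."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log1p(-p))
    return math.exp(log_pmf)

def poisson_pmf(k, lam):
    """Poisson p.m.f. e^(-lam) lam^k / k!."""
    return math.exp(-lam) * lam**k / math.factorial(k)

lam, k = 2.0, 3                      # assumed illustrative values
errors = []
for n in (10, 100, 1000, 10**6):
    p = lam / n                      # keeps np = lam fixed as n grows
    err = abs(binomial_pmf(n, k, p) - poisson_pmf(k, lam))
    errors.append(err)
    print(f"n = {n:>7}: |P_n(k) - Poisson| = {err:.2e}")
```

The gap between the binomial and Poisson probabilities shrinks as n increases, as the limit predicts.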
Example: Winning a Lottery
Suppose
- two million lottery tickets are issued
- with 100 winning tickets among them.
a) If a person purchases 100 tickets, what is the probability of winning?
Solution
The probability of buying a winning ticket is
\[
p = \frac{\text{No. of winning tickets}}{\text{Total no. of tickets}} = \frac{100}{2 \times 10^6} = 5 \times 10^{-5}.
\]
Winning a Lottery - continued
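X: number of winning tickets
n: number of purchased tickets
X has approximately a Poisson distribution with parameter
\[
\lambda = np = 100 \times 5 \times 10^{-5} = 0.005,
\]
\[
P(X = k) = e^{-\lambda} \, \frac{\lambda^k}{k!}.
\]
So, the probability of winning is:
\[
P(X \ge 1) = 1 - P(X = 0) = 1 - e^{-\lambda} \approx 0.005.
\]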
Winning a Lottery - continued
b) How many tickets should one buy to be 95% confident of having a winning
ticket?
Solution
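We need $P(X \ge 1) \ge 0.95$.
\[
P(X \ge 1) = 1 - e^{-\lambda} \ge 0.95 \quad \text{implies} \quad \lambda \ge \ln 20 \approx 3.
\]
But $\lambda = np = n \times 5 \times 10^{-5} \ge 3$, or
\[
n \ge 60{,}000.
\]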
Thus one needs to buy about 60,000 tickets to be 95% confident of having a
winning ticket!
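Both parts of the lottery example reduce to a few lines of arithmetic. The sketch below follows the Poisson model used above; the variable names are illustrative, and the ticket count is computed from $\ln 20$ without rounding it to 3, so it lands slightly below 60,000.

```python
import math

winning_tickets = 100
total_tickets = 2_000_000
p = winning_tickets / total_tickets      # probability a single ticket wins

# Part (a): 100 tickets purchased, Poisson parameter lambda = n * p.
lam_a = 100 * p
p_win = 1.0 - math.exp(-lam_a)           # P(X >= 1) = 1 - P(X = 0)

# Part (b): 1 - e^(-lambda) >= 0.95  <=>  lambda >= ln 20.
lam_needed = math.log(20.0)
n_needed = math.ceil(lam_needed / p)     # smallest n with n * p >= ln 20

print(f"P(win with 100 tickets) = {p_win:.6f}")
print(f"tickets needed for 95% confidence = {n_needed}")
```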
Example: Danger in Space Mission
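A spacecraft has 100,000 components ($n \to \infty$).
The probability of any one component being defective is $2 \times 10^{-5}$ ($p \to 0$).
The mission will be in danger if five or more components become defective.
Find the probability of such an event.
Solution
n is large and p is small, so we use the Poisson approximation with parameter $\lambda = np = 100{,}000 \times 2 \times 10^{-5} = 2$:
\[
P(X \ge 5) = 1 - P(X \le 4) = 1 - \sum_{k=0}^{4} e^{-\lambda} \, \frac{\lambda^k}{k!}
= 1 - e^{-2} \sum_{k=0}^{4} \frac{2^k}{k!}
= 1 - e^{-2} \left(1 + 2 + 2 + \frac{4}{3} + \frac{2}{3}\right) \approx 0.053.
\]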
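The tail probability in this example is one minus a short Poisson sum, which is easy to evaluate directly. The helper name `poisson_cdf` below is a hypothetical choice for illustration.

```python
import math

def poisson_cdf(k_max, lam):
    """P(X <= k_max) for a Poisson r.v. with parameter lam."""
    return math.exp(-lam) * sum(lam**k / math.factorial(k) for k in range(k_max + 1))

n_components = 100_000
p_defective = 2e-5
lam = n_components * p_defective         # lambda = np = 2

# Mission is in danger when five or more components are defective.
p_danger = 1.0 - poisson_cdf(4, lam)
print(f"P(X >= 5) = {p_danger:.4f}")
```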
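Conditional Probability Density Function
For any two events A and B, the conditional probability of A given B is
\[
P(A \mid B) = \frac{P(A \cap B)}{P(B)}, \qquad P(B) \ne 0.
\]
Since the distribution function of X is
\[
F_X(x) = P\{X(\xi) \le x\},
\]
the conditional distribution of X given the event B is
\[
F_X(x \mid B) = P\{X(\xi) \le x \mid B\} = \frac{P\{(X(\xi) \le x) \cap B\}}{P(B)}.
\]
In particular,
\[
F_X(+\infty \mid B) = \frac{P\{(X(\xi) \le +\infty) \cap B\}}{P(B)} = \frac{P(B)}{P(B)} = 1,
\]
\[
F_X(-\infty \mid B) = \frac{P\{(X(\xi) \le -\infty) \cap B\}}{P(B)} = \frac{P(\phi)}{P(B)} = 0.
\]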
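Further,
\[
P(x_1 < X(\xi) \le x_2 \mid B) = \frac{P\{(x_1 < X(\xi) \le x_2) \cap B\}}{P(B)} = F_X(x_2 \mid B) - F_X(x_1 \mid B),
\]
since for $x_2 \ge x_1$,
\[
(X(\xi) \le x_2) = (X(\xi) \le x_1) \cup (x_1 < X(\xi) \le x_2).
\]
The conditional density function is the derivative of the conditional distribution function:
\[
f_X(x \mid B) = \frac{dF_X(x \mid B)}{dx}, \qquad F_X(x \mid B) = \int_{-\infty}^{x} f_X(u \mid B) \, du,
\]
so that
\[
P(x_1 < X(\xi) \le x_2 \mid B) = \int_{x_1}^{x_2} f_X(x \mid B) \, dx.
\]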
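Example
Toss a coin and let X(T) = 0, X(H) = 1. Suppose $B = \{H\}$. Determine $F_X(x \mid B)$.
Solution
$F_X(x)$ has the staircase form shown in panel (a) of the figure. We need $F_X(x \mid B)$ for all x.
For $x < 0$, $\{X(\xi) \le x\} = \phi$, so that $\{(X(\xi) \le x) \cap B\} = \phi$ and $F_X(x \mid B) = 0$.
For $0 \le x < 1$, $\{X(\xi) \le x\} = \{T\}$, so that $\{(X(\xi) \le x) \cap B\} = \{T\} \cap \{H\} = \phi$ and $F_X(x \mid B) = 0$.
For $x \ge 1$, $\{X(\xi) \le x\} = \Omega$, so that $\{(X(\xi) \le x) \cap B\} = \Omega \cap B = B$ and
\[
F_X(x \mid B) = \frac{P(B)}{P(B)} = 1.
\]
[Figure: (a) $F_X(x)$, a staircase rising to q at x = 0 and to 1 at x = 1; (b) $F_X(x \mid B)$, a unit step at x = 1.]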
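Example
Given $F_X(x)$, suppose $B = \{X(\xi) \le a\}$. Find $f_X(x \mid B)$.
Solution
We will first determine $F_X(x \mid B)$:
\[
F_X(x \mid B) = \frac{P\{(X \le x) \cap (X \le a)\}}{P(X \le a)}.
\]
For $x < a$, $(X \le x) \cap (X \le a) = (X \le x)$, so that
\[
F_X(x \mid B) = \frac{P(X \le x)}{P(X \le a)} = \frac{F_X(x)}{F_X(a)}.
\]
For $x \ge a$, $(X \le x) \cap (X \le a) = (X \le a)$, so that $F_X(x \mid B) = 1$.
Thus
\[
F_X(x \mid B) = \begin{cases} \dfrac{F_X(x)}{F_X(a)}, & x < a, \\[2mm] 1, & x \ge a, \end{cases}
\]
and hence
\[
f_X(x \mid B) = \frac{d}{dx} F_X(x \mid B) = \begin{cases} \dfrac{f_X(x)}{F_X(a)}, & x < a, \\[2mm] 0, & \text{otherwise.} \end{cases}
\]
[Figure: (a) $F_X(x \mid B)$ and $F_X(x)$; (b) $f_X(x \mid B)$ and $f_X(x)$; the point a is marked on the x axis.]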
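The conditional distribution for $B = \{X \le a\}$ can be sketched as a small function that wraps any given distribution function. The exponential distribution with unit rate and the value a = 1 are assumptions made only to have a concrete case; the helper names are hypothetical.

```python
import math

def conditional_cdf_given_le(F, a):
    """Return F_X(x | X <= a) built from an unconditional distribution function F."""
    Fa = F(a)
    def F_cond(x):
        # F(x)/F(a) below the threshold a, and 1 at or above it.
        return F(x) / Fa if x < a else 1.0
    return F_cond

def exp_cdf(x):
    """Distribution function of an exponential r.v. with unit rate (assumed example)."""
    return 1.0 - math.exp(-x) if x > 0 else 0.0

F_cond = conditional_cdf_given_le(exp_cdf, 1.0)
for x in (-1.0, 0.5, 1.0, 2.0):
    print(f"F_X({x} | X <= 1) = {F_cond(x):.5f}")
```

Conditioning on $X \le a$ rescales the distribution function below a by $1/F_X(a)$ and saturates it at 1 from a onwards.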
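Example
Let B represent the event $\{a < X(\xi) \le b\}$ with $b > a$.
For a given $F_X(x)$, determine $F_X(x \mid B)$ and $f_X(x \mid B)$.
Solution
\[
F_X(x \mid B) = P\{X(\xi) \le x \mid B\}
= \frac{P\{(X(\xi) \le x) \cap (a < X(\xi) \le b)\}}{P(a < X(\xi) \le b)}
= \frac{P\{(X(\xi) \le x) \cap (a < X(\xi) \le b)\}}{F_X(b) - F_X(a)}.
\]
For $x < a$, we have $\{(X(\xi) \le x) \cap (a < X(\xi) \le b)\} = \phi$, and hence $F_X(x \mid B) = 0$.
For $a \le x < b$, we have $\{(X(\xi) \le x) \cap (a < X(\xi) \le b)\} = \{a < X(\xi) \le x\}$, and hence
\[
F_X(x \mid B) = \frac{P(a < X(\xi) \le x)}{F_X(b) - F_X(a)} = \frac{F_X(x) - F_X(a)}{F_X(b) - F_X(a)}.
\]
For $x \ge b$, we have $\{(X(\xi) \le x) \cap (a < X(\xi) \le b)\} = \{a < X(\xi) \le b\}$, so that $F_X(x \mid B) = 1$.
Thus,
\[
f_X(x \mid B) = \begin{cases} \dfrac{f_X(x)}{F_X(b) - F_X(a)}, & a < x \le b, \\[2mm] 0, & \text{otherwise.} \end{cases}
\]
[Figure: $f_X(x \mid B)$ and $f_X(x)$, with a and b marked on the x axis.]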
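A quick numerical check confirms that the conditional density on $(a, b]$ integrates to one. The sketch below assumes a standard normal X with a = -1 and b = 2, chosen only for illustration, and uses a simple midpoint rule; the helper names are hypothetical.

```python
import math

def normal_pdf(x):
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def conditional_pdf_on_interval(f, F, a, b):
    """Return f_X(x | a < X <= b) = f(x) / (F(b) - F(a)) on (a, b], zero elsewhere."""
    mass = F(b) - F(a)
    def f_cond(x):
        return f(x) / mass if a < x <= b else 0.0
    return f_cond

a, b = -1.0, 2.0                          # assumed illustrative interval
f_cond = conditional_pdf_on_interval(normal_pdf, normal_cdf, a, b)

# Midpoint-rule integration of the conditional density over (a, b).
steps = 20000
h = (b - a) / steps
area = sum(f_cond(a + (i + 0.5) * h) for i in range(steps)) * h
print(f"integral of f_X(x | B) over (a, b] = {area:.6f}")
```

Inside the interval the conditional density is the original density scaled up by $1/(F_X(b) - F_X(a))$, which is what restores unit area.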
Conditional p.d.f & Bayes’ Theorem
First, we extend the conditional probability results to random variables:
We know that if $U = [A_1, A_2, \ldots, A_n]$ is a partition of S and B is an arbitrary event, then
\[
P(B) = P(B \mid A_1) P(A_1) + \cdots + P(B \mid A_n) P(A_n).
\]
By setting $B = \{X \le x\}$ we obtain
\[
P(X \le x) = P(X \le x \mid A_1) P(A_1) + \cdots + P(X \le x \mid A_n) P(A_n).
\]
Hence, when $A_1, \ldots, A_n$ form a partition of S,
\[
F(x) = F(x \mid A_1) P(A_1) + \cdots + F(x \mid A_n) P(A_n),
\]
\[
f(x) = f(x \mid A_1) P(A_1) + \cdots + f(x \mid A_n) P(A_n).
\]
Using
\[
P(A \mid B) = \frac{P(B \mid A) P(A)}{P(B)},
\]
we obtain
\[
P(A \mid X \le x) = \frac{P(X \le x \mid A)}{P(X \le x)} \, P(A) = \frac{F(x \mid A)}{F(x)} \, P(A).
\]
For $B = \{x_1 < X(\xi) \le x_2\}$,
\[
P(A \mid x_1 < X \le x_2) = \frac{P(x_1 < X \le x_2 \mid A)}{P(x_1 < X \le x_2)} \, P(A)
= \frac{F_X(x_2 \mid A) - F_X(x_1 \mid A)}{F_X(x_2) - F_X(x_1)} \, P(A)
= \frac{\int_{x_1}^{x_2} f_X(x \mid A) \, dx}{\int_{x_1}^{x_2} f_X(x) \, dx} \, P(A).
\]
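Let $x_1 = x$, $x_2 = x + \varepsilon$, $\varepsilon > 0$, so that in the limit as $\varepsilon \to 0$,
\[
\lim_{\varepsilon \to 0} P(A \mid x < X \le x + \varepsilon) = P(A \mid X = x) = \frac{f_X(x \mid A)}{f_X(x)} \, P(A),
\]
or
\[
f_{X|A}(x \mid A) = \frac{P(A \mid X = x) \, f_X(x)}{P(A)}.
\]
Integrating both sides over x, and using $\int_{-\infty}^{\infty} f_X(x \mid A) \, dx = 1$, we also get
\[
P(A) \int_{-\infty}^{\infty} f_X(x \mid A) \, dx = \int_{-\infty}^{\infty} P(A \mid X = x) \, f_X(x) \, dx,
\]
or
\[
P(A) = \int_{-\infty}^{\infty} P(A \mid X = x) \, f_X(x) \, dx
\]
(Total Probability Theorem).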
Bayes’ Theorem (continuous version)
Using the total probability theorem in
\[
f_{X|A}(x \mid A) = \frac{P(A \mid X = x) \, f_X(x)}{P(A)},
\]
we get the desired result
\[
f_{X|A}(x \mid A) = \frac{P(A \mid X = x) \, f_X(x)}{\int_{-\infty}^{\infty} P(A \mid X = x) \, f_X(x) \, dx}.
\]
Example: Coin Tossing Problem Revisited
$p = P(H)$: probability of obtaining a head in a toss.
For a given coin, a-priori p can possess any value in (0,1).
$f_P(p)$: uniform on (0,1) in the absence of any additional information.
After tossing the coin n times, k heads are observed.
How can we update $f_P(p)$ given this new information?
[Figure: the uniform a-priori p.d.f. $f_P(p)$ on 0 < p < 1.]
Solution
Let A = “k heads in n specific tosses”.
Since these tosses result in a specific sequence,
\[
P(A \mid P = p) = p^k q^{n-k},
\]
and using the Total Probability Theorem we get
\[
P(A) = \int_0^1 P(A \mid P = p) \, f_P(p) \, dp = \int_0^1 p^k (1-p)^{n-k} \, dp = \frac{(n-k)! \, k!}{(n+1)!}.
\]
Example - continued
The a-posteriori p.d.f $f_{P|A}(p \mid A)$ represents the updated information given the event A.
Using
\[
f_{X|A}(x \mid A) = \frac{P(A \mid X = x) \, f_X(x)}{P(A)},
\]
we get
\[
f_{P|A}(p \mid A) = \frac{P(A \mid P = p) \, f_P(p)}{P(A)} = \frac{(n+1)!}{(n-k)! \, k!} \, p^k q^{n-k}, \quad 0 < p < 1 \;\sim\; \beta(n, k).
\]
[Figure: the a-posteriori p.d.f. $f_{P|A}(p \mid A)$ on 0 < p < 1.]
This is a beta distribution.
We can use this a-posteriori p.d.f to make further predictions.
For example, in the light of the above experiment, what can we say about the
probability of a head occurring in the next (n+1)th toss?
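The update from the uniform a-priori p.d.f. to the a-posteriori p.d.f. can be reproduced numerically by applying Bayes' theorem on a grid. The sketch below assumes n = 10 and k = 6 for illustration and compares a midpoint-rule estimate of P(A) and of the posterior with the closed forms above; the function names are hypothetical.

```python
import math

def likelihood(p, n, k):
    """P(A | P = p) for a specific sequence with k heads in n tosses."""
    return p**k * (1.0 - p)**(n - k)

def posterior_closed_form(p, n, k):
    """(n+1)! / ((n-k)! k!) * p^k * q^(n-k), the beta posterior from the text."""
    c = math.factorial(n + 1) / (math.factorial(n - k) * math.factorial(k))
    return c * likelihood(p, n, k)

n, k = 10, 6                              # assumed illustrative values

# Total probability theorem with the uniform prior f_P(p) = 1 on (0, 1),
# evaluated by the midpoint rule.
steps = 20000
h = 1.0 / steps
evidence = sum(likelihood((i + 0.5) * h, n, k) for i in range(steps)) * h
evidence_exact = math.factorial(n - k) * math.factorial(k) / math.factorial(n + 1)

def posterior_numeric(p):
    """Bayes' theorem: likelihood times prior (equal to 1), divided by P(A)."""
    return likelihood(p, n, k) / evidence

print(f"P(A): numeric = {evidence:.6e}, closed form = {evidence_exact:.6e}")
print(f"posterior at p = 0.6: numeric = {posterior_numeric(0.6):.5f}, "
      f"closed form = {posterior_closed_form(0.6, n, k):.5f}")
```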
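Example - continued
Let B = “head occurring in the (n+1)th toss, given that k heads have occurred in n previous tosses”.
Clearly
\[
P(B \mid P = p) = p.
\]
From the Total Probability Theorem,
\[
P(B) = \int_0^1 P(B \mid P = p) \, f_P(p \mid A) \, dp.
\]
Substituting the a-posteriori p.d.f. $f_{P|A}(p \mid A) = \frac{(n+1)!}{(n-k)! \, k!} \, p^k q^{n-k}$ into this integral, we get
\[
P(B) = \int_0^1 p \cdot \frac{(n+1)!}{(n-k)! \, k!} \, p^k q^{n-k} \, dp = \frac{k+1}{n+2}.
\]
Thus, if n = 10 and k = 6, then
\[
P(B) = \frac{7}{12} \approx 0.58,
\]
which is more realistic compared to p = 0.5.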
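The result $(k+1)/(n+2)$ can be confirmed with exact rational arithmetic. The sketch below relies on the integral $\int_0^1 p^a (1-p)^b \, dp = a! \, b! / (a+b+1)!$ for non-negative integers a and b, which is the same identity used above for P(A); the function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def beta_integral(a, b):
    """Exact value of integral_0^1 p^a (1-p)^b dp = a! b! / (a+b+1)!."""
    return Fraction(factorial(a) * factorial(b), factorial(a + b + 1))

def prob_next_head(n, k):
    """P(B): integral of p times the posterior (n+1)!/((n-k)! k!) p^k q^(n-k)."""
    normaliser = 1 / beta_integral(k, n - k)      # equals (n+1)! / ((n-k)! k!)
    return normaliser * beta_integral(k + 1, n - k)

result = prob_next_head(10, 6)
print(result, float(result))
```

With no tosses observed (n = 0, k = 0) the same function returns 1/2, matching the uniform a-priori assumption.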