Random Variables
• For a given sample space S of some experiment, a
random variable is any rule that associates a
number with each outcome of S.
• More formally: a random variable is a real-valued,
single-valued function defined on S, i.e., a
random variable X is a function such that:
∀ s ∈ S: X(s) ∈ R
or
X: S → RX ⊆ R
Example

Experiment: Toss a fair coin three times and observe the sequence of heads and tails.
Random variable: X = number of H's in three tosses.

Outcome s: HHH  HHT  HTH  THH  HTT  THT  TTH  TTT
X(s):       3    2    2    2    1    1    1    0

Types of random variables (r.v.)

A discrete r.v. is a r.v. whose possible values either constitute a finite set or else can be listed in an infinite sequence in which there is a first element, a second element and so on, i.e., the set of possible values is countable.
e.g. coin toss, # of defectives, throwing dice

A r.v. is continuous if its set of possible values consists of an entire interval on the number line (uncountably infinite).
e.g. lifetime of humans, weight or height of people
Probability distribution for discrete r.v.s

Let X be a r.v. The probability distribution of X tells how the total probability of 1 is distributed among the various possible X values.

Example:
1) Toss a fair coin (Heads=1, Tails=0): P(X=1) = P(X=0) = 0.5
2) Throw a fair die: P(X=i) = 1/6, ∀ i = 1, ..., 6

Probability mass function

• The probability distribution or probability mass function (pmf) of a discrete r.v. is defined for every number x by p(x) = P(X=x), where:
i) p(x) ≥ 0 ∀ x ∈ S
ii) ∑x∈S p(x) = 1

Example: X = number of H's after tossing a fair coin three times:
p(0) = P(X=0) = 1/8
p(1) = P(X=1) = 3/8
p(2) = P(X=2) = 3/8
p(3) = P(X=3) = 1/8
Bernoulli r.v.

Any r.v. whose only possible values are 0 and 1 is called a Bernoulli r.v.

Examples:
1) A product is defective or not:
x = 1 if defective, x = 0 if non-defective
p(0) = 0.95, p(1) = 0.05
2) A machine is broken or working:
x = 1 if working, x = 0 if broken
p(0) = 0.1, p(1) = 0.9

A parameter of a probability distribution

Consider the Bernoulli r.v.s we saw: each of them has the following p.m.f.:

p(x; α) = 1 − α   if x = 0
          α       if x = 1
          0       otherwise

• For each different α, p(x; α) yields a different pmf.
⇒ Many probability distributions can be defined through the same function p(x), depending on parameters that change with the experiment considered.
Parameter definition

Suppose p(x) depends on a quantity that can be assigned any one of a number of possible values, with each different value determining a different probability distribution. Such a quantity is called a parameter of the distribution.

The collection of all possible distributions for different values of the parameter is called a family of probability distributions.

Example: The Bernoulli Tack Factory

The Bernoulli tack factory is manufacturing brass tacks, some of which are defective. Each new tack is non-defective with probability p.

The Binomial Random Variable

What is the number of non-defective tacks produced in n trials?
Binomial experiment

An experiment is called a binomial experiment if:
1) The experiment consists of a sequence of n trials, where n is fixed in advance.
2) The trials are identical and independent with two possible outcomes, S or F, where the probability of success, p, is constant.

Examples:
1) n coin tosses: p = P(H)
2) n die rolls: p = P(1)
Binomial distribution

Definition: Given a binomial experiment consisting of n trials, the binomial r.v. X associated with this experiment is defined as the number of successes among the n trials.

We denote the p.m.f. of the binomial r.v. X by b(x; n, p). The p.m.f. of a binomial distribution is given by:

b(x; n, p) = pX(x) = C(n, x) p^x (1−p)^(n−x),   x = 0, 1, 2, ....., n

Check: ∑ b(x; n, p) = 1 ??

Example: What is the probability of two H's in three coin tosses?
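The binomial p.m.f. above can be checked numerically. A minimal Python sketch (the helper name `binom_pmf` and the parameters n = 10, p = 0.3 for the sum check are my own choices, not from the slides):

```python
from math import comb

def binom_pmf(x, n, p):
    # b(x; n, p) = C(n, x) p^x (1-p)^(n-x)
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Check: the pmf sums to 1 over x = 0, ..., n
total = sum(binom_pmf(x, 10, 0.3) for x in range(11))

# Probability of exactly two H's in three fair coin tosses:
# C(3, 2) (1/2)^2 (1/2)^1 = 3/8
p_two_heads = binom_pmf(2, 3, 0.5)
```

This agrees with the pmf tabulated earlier for three coin tosses, where p(2) = 3/8.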
Example
• In a nuclear reactor, the fission is controlled by
inserting into the radioactive core a number of
special rods (that absorb neutrons and slow down
the nuclear chain reaction). Assume that a
particular reactor has 10 of these rods, each
operating independently, and each having a
probability 0.8 of functioning properly. A meltdown
is prevented if at least half of the rods perform
satisfactorily. What is the probability that the
system will fail?
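The reactor question reduces to a binomial tail: with X ~ Binomial(10, 0.8) working rods, the system fails when X ≤ 4. A short sketch:

```python
from math import comb

n, p = 10, 0.8   # 10 rods, each works independently with probability 0.8

# Meltdown is prevented if at least 5 rods work; the system fails otherwise:
# P(fail) = P(X <= 4) = sum_{k=0}^{4} C(10, k) 0.8^k 0.2^(10-k)
p_fail = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(5))
```

The failure probability comes out to about 0.0064, so the redundant design is quite reliable.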
The Bernoulli Tack Factory

We need to deliver a non-defective item to a customer. How many tacks are produced until we obtain the first non-defective tack?

The Geometric random variable

• X is a geometric random variable if X is the total number of attempts until a "success" in sequential and independent Bernoulli trials with individual probability of success p:

pX(x) = (1 − p)^(x−1) p,   x = 1, 2, 3, ...
Example

X: number of times a fair die is rolled until a '1' occurs.
X is a geometric r.v. with p = 1/6:
p(1) = 1/6
p(2) = (5/6)×(1/6)
.
.
.
p(k) = (5/6)^(k−1)×(1/6)

The Poisson Random Variable

Definition: A r.v. X is said to have a Poisson distribution if the p.m.f. of X is:

pX(x) = e^(−λ) λ^x / x!,   x = 0, 1, 2, ....

for some λ > 0. The value of λ is frequently a rate per unit time or per unit area.

Check: ∑ pX(x) = 1 ??
This week:
• HW2: due today
• HW3: a new procedure
– Go to http://etutor.ku.edu.tr
– Follow the instructions to solve PS3
• PSs on Friday

The Poisson Distribution as a Limit

Proposition: Suppose that in the binomial p.m.f. b(x; n, p), we let n → ∞ and p → 0 in such a way that np → λ > 0. Then:

b(x; n, p) → p(x; λ)

Proof: …

Thus, when n is large and p is small, b(x; n, p) ≈ p(x; λ) with λ = np.

Rule of thumb: n ≥ 100, p ≤ 0.01 and np ≤ 20
Example

A publisher of non-technical books takes great pains to ensure that its books are free of typographical errors, so that the probability of any given page containing at least one such error is 0.005. Errors are independent from page to page.

What is the probability that one of its 400-page novels will contain exactly 1 page with errors? At most 3 pages with errors?
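Here n = 400 and p = 0.005 satisfy the rule of thumb above, so the Poisson limit with λ = np = 2 should match the exact binomial answers closely. A sketch comparing the two (helper names are my own):

```python
from math import comb, exp, factorial

n, p = 400, 0.005        # 400 pages, error probability 0.005 per page
lam = n * p              # Poisson approximation with lambda = np = 2

def binom(x):
    # exact binomial pmf
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson(x):
    # Poisson pmf p(x; lambda)
    return exp(-lam) * lam**x / factorial(x)

p_exactly_1 = binom(1)                          # exactly 1 page with errors
p_at_most_3 = sum(binom(x) for x in range(4))   # at most 3 pages with errors

approx_exactly_1 = poisson(1)                   # 2 e^-2, about 0.271
approx_at_most_3 = sum(poisson(x) for x in range(4))
```

The exact and approximate answers differ only in the fourth decimal place, illustrating the proposition.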
Functions of random variables

Example: Let X be the sum of two dice. Assume that you may gamble with Y = X mod 3 being the money you earn whenever you toss X on two dice, or with Z = (X−7)²/10. Find P(Y=0) and P(Z=0).

X:  2    3    4    5    6    7    8    9    10   11   12
Y:  2    0    1    2    0    1    2    0    1    2    0
Z:  2.5  1.6  0.9  0.4  0.1  0    0.1  0.4  0.9  1.6  2.5
Functions of random variables & their distributions

Let X be a random variable and g(X) be a function. Then Y = g(X) is another r.v. with pmf:

pY(y) = ∑{x | g(x) = y} pX(x)

• pmf of Y and Z?

A chess game

• Fischer and Spassky play a chess match in which the first player to win a game wins the match. After 10 successive draws, the match is declared drawn. Each game is won by Fischer with probability 0.4, is won by Spassky with probability 0.3, and is a draw otherwise, independently of previous games.
a) What is the probability that Fischer wins the match?
b) What is the pmf of the duration of the match?
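Both parts of the chess question follow from the geometric structure of the match: Fischer wins in game k exactly when the first k−1 games are draws and he wins game k. A sketch (variable names are my own):

```python
p_f, p_s, p_d = 0.4, 0.3, 0.3   # Fischer win / Spassky win / draw, per game

# a) Fischer wins the match iff games 1..k-1 are draws and he wins game k, k <= 10
p_fischer = sum(p_d**(k - 1) * p_f for k in range(1, 11))

# b) pmf of the duration T: T = k < 10 if game k is decisive after k-1 draws;
#    T = 10 if the first 9 games are draws (game 10 is played regardless)
pmf_T = {k: p_d**(k - 1) * (p_f + p_s) for k in range(1, 10)}
pmf_T[10] = p_d**9
```

Summing the truncated geometric series gives p_fischer = 0.4(1 − 0.3¹⁰)/0.7, about 0.571.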
Grade of service

• An ISP has c modems and n customers. At a given time each customer needs a connection with probability p.
Now: c = 50, n = 1000, p = 0.01
a) pmf of # of modems in use?
b) approx. pmf of # of modems in use?
c) probability that there are more customers than there are modems?
d) approx. probability that there are more customers than there are modems?

FB versus GS

FB and GS are set to play a series of n soccer games, where n is odd. Each match continues until the golden goal or the result of penalties, so there is always a winner! FB has a probability p of winning any one game, independently of other games.
a) Find the values of p for which n = 5 is better for FB than n = 3.
b) Generalize part (a): For any k > 0, find the values of p for which n = 2k+1 is better for FB than n = 2k−1.
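For the grade-of-service question, the number of customers needing a line is Binomial(1000, 0.01), which the Poisson limit with λ = 10 approximates well. A sketch (the iterative Poisson recursion avoids huge factorials; names are my own):

```python
from math import comb, exp

c, n, p = 50, 1000, 0.01
lam = n * p                    # lambda = np = 10

# (a) exact pmf of the number of customers needing a connection
def binom(k):
    return comb(n, k) * p**k * (1 - p)**(n - k)

# (b) Poisson(lambda) approximation, built with pmf(k) = pmf(k-1) * lam / k
pois = [exp(-lam)]
for k in range(1, n + 1):
    pois.append(pois[-1] * lam / k)

# (c) exact probability that more customers need a line than there are modems
p_exceed = sum(binom(k) for k in range(c + 1, n + 1))
# (d) the same probability under the Poisson approximation
p_exceed_approx = sum(pois[c + 1:])
```

Both tails are vanishingly small: 50 modems already sit five standard deviations above the mean demand of 10.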
How many girls?

• A family has 5 natural children and has adopted 2 girls. Each natural child has equal probability of being a boy or a girl, independently of the other children. Find the pmf of the number of girls out of 7 children.

Jury

• Consider a jury trial in which it takes 8 of 12 jurors to convict: in order for the defendant to be convicted, at least 8 of the jurors must vote him guilty. If we assume that jurors act independently and each makes the right decision with probability θ, what is the probability that the jury renders a correct decision? Note that a person on trial is guilty with probability α, and not guilty with probability 1−α.
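The girls question is a shifted binomial: the 2 adopted girls are certain, so G = 2 + X with X ~ Binomial(5, 1/2). A sketch:

```python
from math import comb

# G = number of girls among 7 children = 2 adopted girls + Binomial(5, 1/2)
# so P(G = g) = C(5, g-2) (1/2)^5 for g = 2, ..., 7
pmf_G = {g: comb(5, g - 2) * 0.5**5 for g in range(2, 8)}
```

For instance P(G = 2) = 1/32 (all five natural children are boys) and P(G = 4) = 10/32.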
Keys?

You just rented a large house, and your landlord gave you 5 keys, one for each of the 5 doors of the house. Unfortunately, all keys look alike, so to open the front door, you try them at random.
a) Find the pmf of the number of trials you will need to open the door, under the following assumptions: (1) after an unsuccessful trial, you mark the corresponding key, so that you never try it again; (2) at each trial you are equally likely to choose any key.
b) Repeat part (a) for the case when you have an extra duplicate key for each of the 5 doors.

Example

• Experiment: Tossing 3 coins
• Random variable: X = # of H's in 3 coin tosses
• The experiment is repeated 20 times with the following outcome:

value of X:       0  1  2  3
# of experiments: 3  7  8  2

• What is the average number of H's in an experiment?

(0×3 + 1×7 + 2×8 + 3×2) / 20 = 1.45   (the average of the observations xi)
Example

• Average of the observations xi:
0×(3/20) + 1×(7/20) + 2×(8/20) + 3×(2/20) = 1.45
(3/20 is the relative frequency of observing 0 in 20 experiments)

• Average of the random variable X = ?
0×(1/8) + 1×(3/8) + 2×(3/8) + 3×(1/8) = 1.5
(1/8 is the probability of 0)

Expected value of a discrete r.v.

Definition: Let X be a discrete r.v. with a set of possible values D and pmf p(x). The expected value or mean value of X, E(X) or µX, is given by:

E(X) = µX = ∑x∈D x p(x)
Example: X as the sum of two dice

• X: the sum of two dice
• Y = X mod 3
• Z = (X−7)²/10

X:      2     3     4     5     6     7     8     9     10    11    12
pX(x):  1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36

Y:      0    1    2
pY(y):  1/3  1/3  1/3

Z:      0     0.1    0.4   0.9   1.6   2.5
pZ(z):  6/36  10/36  8/36  6/36  4/36  2/36

A relevant question

• You earn Y or Z amount of money. Which game would you play?
• What are the corresponding expected gains at each game?
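The expected gains for the two games can be computed directly by enumerating the 36 equally likely outcomes. A sketch:

```python
# Enumerate the 36 equally likely outcomes of two fair dice
outcomes = [(i, j) for i in range(1, 7) for j in range(1, 7)]

# E[Y] with Y = X mod 3, and E[Z] with Z = (X-7)^2 / 10, where X = i + j
EY = sum((i + j) % 3 for i, j in outcomes) / 36
EZ = sum((i + j - 7) ** 2 / 10 for i, j in outcomes) / 36
```

E[Y] = 1 while E[Z] = 21/36 ≈ 0.58, so the Y game has the higher expected gain.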
Variance of a discrete r.v.

Definition: Let X be a discrete r.v. with a set of possible values D, pmf p(x) and E(X) = µ. Then the expected value of the function h(x) = (x−µ)² is called the variance of the r.v. X, denoted by V(X), σ²X or σ²:

V(X) = E((X − µ)²) = ∑x∈D (x − µ)² p(x)

The standard deviation of X is:

σX = √V(X) = √(σ²X)

Moments of a random variable

• Defn: Let X be a r.v. Then the r-th moment of X is E[X^r].
• µ = E(X): the mean is the 1st moment.
What does variance measure?

• Comparing two random variables X1 and X2 with the following pmf's:

pX1(x) = 1/2 if x = 0;  1/2 if x = 10;  0 otherwise
pX2(x) = 1/2 if x = 4;  1/2 if x = 6;   0 otherwise

Example: Mean and variance of a Bernoulli r.v.

E(X) = ∑ x p(x) = ?
V(X) = ∑ (x − E(X))² p(x) = ?
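The comparison of X1 and X2 makes the point concretely: both have mean 5, but very different spread. A sketch (including the Bernoulli case with α = 0.1):

```python
# Two r.v.'s with the same mean but very different spread
pmf_X1 = {0: 0.5, 10: 0.5}
pmf_X2 = {4: 0.5, 6: 0.5}
pmf_bern = {0: 0.9, 1: 0.1}          # Bernoulli with alpha = 0.1

def mean(pmf):
    return sum(x * p for x, p in pmf.items())

def var(pmf):
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())
```

var(X1) = 25 while var(X2) = 1, even though E(X1) = E(X2) = 5; for the Bernoulli case, var = α(1−α) = 0.09.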
Expected value of a function of a r.v.

If X is a discrete r.v. with sample space D and pmf p(x), then the expected value of any function h(X) is defined as follows:

E(h(X)) = µh(X) = ∑x∈D h(x) p(x)

Verification: ...

Example: X as the sum of two dice

Y = X mod 3, Z = (X−7)²/10; you earn Y or Z amount of money.

X:      2     3     4     5     6     7     8     9     10    11    12
pX(x):  1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36
Y:      2     0     1     2     0     1     2     0     1     2     0
Z:      2.5   1.6   0.9   0.4   0.1   0     0.1   0.4   0.9   1.6   2.5
Properties of expected value

Let a and b be any two constants. Then:
E(aX + b) = a E(X) + b, so that:
E(aX) = a E(X)
E(X + b) = E(X) + b
Proof. …

Properties of variance

V(h(X)) = ∑ (h(x) − E[h(X)])² p(x)

Let a and b be any two constants. Then:
(i) V(aX + b) = a² V(X)
(ii) σaX+b = |a| σX
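These properties are easy to verify numerically on a small example, say a fair die with a = 3 and b = −2 (my choice of parameters):

```python
# pmf of a fair die
pmf = {x: 1/6 for x in range(1, 7)}
a, b = 3, -2

def E(f):
    # expectation of f(X) under the pmf
    return sum(f(x) * p for x, p in pmf.items())

mu = E(lambda x: x)                           # E(X) = 3.5
var = E(lambda x: (x - mu) ** 2)              # V(X) = 35/12

mu_lin = E(lambda x: a * x + b)               # E(aX + b)
var_lin = E(lambda x: (a * x + b - mu_lin) ** 2)  # V(aX + b)
```

The shift b drops out of the variance entirely, while the scale a enters squared.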
Proposition

V(X) = σ² = ∑ [x² p(x)] − µ² = E(X²) − (E(X))²

Proof:
σ² = ∑ (x − µ)² p(x) = ...

Example: The Bernoulli Tack Factory

The Bernoulli tack factory is manufacturing brass tacks, some of which are defective. Each new tack is non-defective with probability p.
The mean & variance of a Binomial random variable

You can consider a binomial r.v. as the sum of n Bernoulli variables with parameter p.
If Y is a Bernoulli r.v., then E[Y] = p.
So we would expect: if X is a binomial r.v., then E[X] = np.

Proposition:
If X is distributed according to Binomial(n, p), then
1) E(X) = np, and
2) V(X) = np(1−p), so that σX = √(np(1−p))

A relevant question

• Let Y be the number of trials required in order to obtain a non-defective item. What is E[Y]?
• Y ~ geometric with p = P(success)
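The proposition can be checked by brute force for a particular case, here n = 10 and p = 0.3 (my choice of parameters):

```python
from math import comb

n, p = 10, 0.3
# full binomial pmf over x = 0, ..., n
pmf = {x: comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)}

mean = sum(x * q for x, q in pmf.items())               # should be np = 3
var = sum((x - mean) ** 2 * q for x, q in pmf.items())  # should be np(1-p) = 2.1
```

Computing the mean and variance directly from the pmf reproduces np and np(1−p).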
The mean & variance of a Poisson random variable

Proposition:
If X is distributed according to a Poisson distribution with parameter λ, then:
1) E(X) = λ, and
2) V(X) = λ

Intuition: b(x; n, p) → p(x; λ)

Example

Each day the price of a stock moves up by one point or stays constant with probabilities ¾ and ¼ respectively.
What is the expected gain after six days?
What is the variance of the six-day gain?
Example: A quiz problem

• Question A: P(correct) = 0.8, prize = $100
• Question B: P(correct) = 0.5, prize = $200
• If the first question is answered incorrectly, the quiz terminates: nothing to win!! If correct, the person attempts the second question.
• S/he can keep whatever s/he has earned in the first question even if the second question is answered incorrectly.
• Which question to start with to maximize the expected value of total prize money?

Jointly Distributed Random Variables

Definition: Let X and Y be two discrete r.v.'s defined on the sample space S of an experiment. The joint probability mass function p(x, y) is defined for each pair of numbers (x, y) by:

p(x, y) = P(X = x and Y = y)

Let A be any set consisting of pairs of (x, y) values. Then the probability P[(X,Y) ∈ A] is obtained by summing the joint pmf over all pairs in A:

P[(X,Y) ∈ A] = ∑(x,y)∈A p(x, y)
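The quiz problem comes down to comparing two expected values: whichever question is attempted first must be answered correctly for anything to be won. A sketch:

```python
pA, prizeA = 0.8, 100   # question A: P(correct), prize
pB, prizeB = 0.5, 200   # question B: P(correct), prize

# Start with A: win prizeA w.p. pA, then attempt B and add prizeB w.p. pB
E_start_A = pA * (prizeA + pB * prizeB)   # 0.8 * (100 + 100) = 160
# Start with B: symmetric
E_start_B = pB * (prizeB + pA * prizeA)   # 0.5 * (200 + 80) = 140
```

Starting with question A yields the larger expected prize, $160 versus $140.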
Example: A small pump

A small pump is inspected for two quality-control characteristics. Each characteristic is classified as good, minor defect (not affecting operation) or major defect (affecting operation). The result at each check point is independent of the previous ones. A pump is to be selected and its defects counted.

X: # minor defects, X = 0, 1, 2
Y: # major defects, Y = 0, ..., 2−X

• The sample space of (X, Y)?
• p(x, y) if p1 = P(minor defect in a check point) and p2 = P(major defect in a check point), with p1 + p2 < 1?
• P(no major defect)?

The marginal probability mass function

• The marginal probability mass functions of X and Y, denoted by pX(x) and pY(y), respectively, are given by:

pX(x) = ∑y p(x, y)
pY(y) = ∑x p(x, y)
Example: A small pump

X: # minor defects, X = 0, 1, 2
Y: # major defects, Y = 0, ..., 2−X
p1 = P(minor defect in a check point) = 0.1
p2 = P(major defect in a check point) = 0.01
Find pX(x) and pY(y).

Functions of multiple random variables

Let X and Y be two r.v.'s, and Z = g(X, Y) be a function of X and Y. Then:

pZ(z) = ∑{(x,y) | g(x,y) = z} pX,Y(x, y)

E(g(X, Y)) = ∑x∈SX ∑y∈SY g(x, y) pX,Y(x, y)

Example: A small pump (cont'd)

Assume that each minor defect costs 1 YTL, while each major defect costs 20 YTL.
• What are the possible values the cost of a pump can take?
• What are the corresponding probabilities for the possible costs?
• What is the expected cost that a pump causes due to its defects?

Functions of more than two random variables

Let X1, ..., Xn be n r.v.'s, and Z = ∑ ai Xi. Then:

E(∑i=1..n ai Xi) = ∑i=1..n ai E(Xi)
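Since the slides state that the two check points are independent, each one is good, minor, or major with probabilities 0.89, 0.1, 0.01, so the joint pmf of (X, Y) is multinomial over 2 trials. A sketch building the joint pmf, the marginals, and the expected cost g(X, Y) = X + 20Y:

```python
from math import factorial

p1, p2 = 0.1, 0.01        # minor / major defect per check point
p0 = 1 - p1 - p2          # check point is good

# Joint pmf of (X, Y) over 2 independent check points (multinomial)
joint = {}
for x in range(3):
    for y in range(3 - x):
        g = 2 - x - y     # number of good check points
        coef = factorial(2) // (factorial(x) * factorial(y) * factorial(g))
        joint[(x, y)] = coef * p1**x * p2**y * p0**g

# Marginal pmfs by summing out the other variable
pX = {x: sum(q for (a, b), q in joint.items() if a == x) for x in range(3)}
pY = {y: sum(q for (a, b), q in joint.items() if b == y) for y in range(3)}

# Expected cost: 1 YTL per minor defect, 20 YTL per major defect
E_cost = sum((x + 20 * y) * q for (x, y), q in joint.items())
```

By linearity, E_cost = E(X) + 20 E(Y) = 2(0.1) + 20·2(0.01) = 0.6 YTL, and P(no major defect) = pY(0) = 0.99².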
Example: Mean of the Binomial

• 300 students. p = 1/3 = P(getting an A).
• What is the mean of X = number of A students?
Hat Problem

• n people throw their hats in a box and then each picks one hat at random. What is the expected value of the number of people that get their own hat?

Key Problem

• You rent a large house; the realtor gives you 5 keys, one for each of the 5 doors. They look identical, so you try them at random to open the front door.
– E[# of trials] if (1) after each trial you mark the key, so you never try it again, and (2) at each trial you are equally likely to choose any key.
• (2) is geometric with parameter 1/5 (end of lecture).
• (1) is discrete uniform on 1..5; the mean is 3.
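The hat problem has the elegant answer E = 1 for every n (by linearity: each person matches with probability 1/n, and there are n people). This can be confirmed by exhaustive enumeration for a small n, here n = 5:

```python
from itertools import permutations
from math import factorial

n = 5
# For each of the n! equally likely hat assignments, count the fixed points
total_matches = sum(
    sum(1 for i, h in enumerate(perm) if i == h)
    for perm in permutations(range(n))
)
E_matches = total_matches / factorial(n)   # expected number of own-hat matches
```

Across all 120 permutations there are exactly 120 fixed points in total, so the average is exactly 1.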
Summary of Joint PMFs

• Joint PMF of X and Y (two variables associated with the same experiment):
p(x, y) = P(X = x and Y = y)
• The marginal PMF can be obtained from the joint PMF:
pX(x) = ∑y p(x, y)
• Functions of multiple random variables:
E(g(X, Y)) = ∑x∈SX ∑y∈SY g(x, y) pX,Y(x, y)
• If linear:
E(aX + bY + c) = a E(X) + b E(Y) + c

Old Lady in the Plane

• An old lady boarding the plane first cannot see well, so she picks a random seat. People who come later sit in their own seat if it is empty, otherwise they pick a random seat. The plane is full with n seats. What is the probability that the last person will sit in his own seat?
– Extra credit question.
Remember conditional probability

• Def: P(A|B) = P(A ∩ B)/P(B)
• Multiplication Rule: P(A ∩ B) = P(B) P(A|B)
P(A1,A2,…,An) = P(A1) P(A2|A1) P(A3|A1,A2) … P(An|A1,A2,…,An−1)
• Total Probability Theorem:
P(B) = P(A1 ∩ B) + P(A2 ∩ B) + … + P(An ∩ B)
     = P(A1)P(B|A1) + P(A2)P(B|A2) + … + P(An)P(B|An)
where {Ai | i = 1,…,n} is a partition of the sample space
• Bayes: P(A|B) = P(B|A) P(A)/P(B)

PMF conditional on an event

• pX|A(x) = P(X=x | A) = P({X=x} ∩ A)/P(A)
• The pX|A(x)'s sum to 1.
PMF conditional on an event

• Example: Roll a fair die.
– pmf of the number on the die given that the roll is an even number?

PMF conditional on another r.v.

• pX|Y(x|y) = P(X=x | Y=y) = P(X=x, Y=y)/P(Y=y) = pX,Y(x,y)/pY(y)
• The pX|Y(x|y)'s add up to 1 for each fixed y.
• Example: A student takes a test repeatedly, up to a maximum of n times, with probability p of passing each attempt. What is the pmf of the number of attempts, given that he passes?
Example (cond->joint)

• A professor answers questions incorrectly with probability p = 1/4. In each lecture 0, 1, or 2 questions are asked, each with probability 1/3.
– X = number of questions asked,
– Y = number of questions answered wrong.
• Joint pmf?
• Draw the tree and the table for the joint pmf.
• Calculate the probability of at least one wrong answer.

Joint vs. Conditional vs. Marginal

• Marginal: pX(x)
• Conditional: pY|X(y|x)
• Joint (using the multiplication rule): pX,Y(x, y) = pX(x) pY|X(y|x)
Example (cond->marginal)

• Computer network:
Y = length of a message: P{Y=100} = 5/6, P{Y=10^4} = 1/6.
X = travel time for a message: P{X=10^−4 Y} = 1/2, P{X=10^−3 Y} = 1/3, P{X=10^−2 Y} = 1/6.
– Find the conditional pmfs of X given Y=100 and given Y=10^4.
– Find the marginal PMF of X.

Conditional Expectation

E[X | A] = ∑x x pX|A(x)
E[g(X) | A] = ∑x g(x) pX|A(x)
E[X | Y=y] = ∑x x pX|Y(x|y)
E[X] = ∑y pY(y) E[X | Y=y]
E[X] = ∑i P(Ai) E[X | Ai],  where the Ai are disjoint and ∪ Ai = Ω
E[X | B] = ∑i P(Ai | B) E[X | Ai ∩ B]
Total Expectation Theorem

• The unconditional average can be obtained by averaging the conditional averages:

E[X] = ∑y pY(y) E[X | Y=y]

TET example

• P(Message from Boston to NY) = 0.5
• P(Message from Boston to Chicago) = 0.3
• P(Message from Boston to SF) = 0.2
• X = transit time, with E(X|NY) = 0.05, E(X|C) = 0.1, E(X|SF) = 0.3.
• Compute E[X].
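The TET example is a direct application of the theorem, with the destination playing the role of the conditioning variable:

```python
# Total Expectation Theorem: E[X] = sum over destinations of P(dest) * E[X | dest]
routes = {
    "NY":      (0.5, 0.05),   # (probability, conditional mean transit time)
    "Chicago": (0.3, 0.1),
    "SF":      (0.2, 0.3),
}
E_X = sum(prob * cond_mean for prob, cond_mean in routes.values())
```

E[X] = 0.5(0.05) + 0.3(0.1) + 0.2(0.3) = 0.115.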
Mean and var of geometric

Two envelopes paradox
• One envelope contains double the money of the other.
• Extra credit question.
Remember Independence

• If events A and B are independent:
– P(A|B) = P(A)
– P(B|A) = P(B)
– P(A,B) = P(A) P(B)
• Conditional independence:
– P(A,B|C) = P(A|C) P(B|C)
• Independence of several events:
A1, A2, ..., An are independent if
P(∩i∈S Ai) = ∏i∈S P(Ai) for every subset S of {1, 2, ..., n}

ENGR 200
Independence of Random Variables
Independence of r.v. from an event

• P(X=x, A) = P(X=x) P(A) = pX(x) P(A) for all x.
• P(X=x, A) = pX|A(x) P(A) always ⇒ pX|A(x) = pX(x) for all x.
• Example: Two tosses of a fair coin.
X = number of heads. A = {number of heads is even}.
– Find the unconditional pmf of X.
– Find the conditional pmf of X given A.
– Another r.v.: Y = 1 if the first toss is tails.

Independence of multiple r.v.

• pX,Y(x,y) = pX(x) pY(y)
• pX|Y(x|y) = pX(x) for all x, y
• Conditional independence:
P(X=x, Y=y | A) = P(X=x | A) P(Y=y | A)
pX,Y|A(x,y) = pX|A(x) pY|A(y)
pX|Y,A(x|y) = pX|A(x) for all x, y with p(y) > 0
• Conditional independence does not imply independence.
Independent => E[XY] = E[X] E[Y]

• Proof...
• Also E[g(X) h(Y)] = E[g(X)] E[h(Y)] if X, Y are independent.
Independent => var(X+Y) = var(X) + var(Y)

• Proof...

Summary

• X, Y any random variables:
– E[X+Y] = E[X] + E[Y]
– E[aX+b] = a E[X] + b
– E[aX+bY+c] = a E[X] + b E[Y] + c
– var[aX+b] = a² var[X]
• X, Y independent random variables:
– E[XY] = E[X] E[Y]
– var[X+Y] = var[X] + var[Y]
Ex: Variance of Binomial

Sample Mean

• Xi: Bernoulli, the i-th person's vote for presidency
• Mean = p, Variance = p(1−p)
• Ask n people; how confident are we?
• Define the sample mean: Sn = (X1 + … + Xn)/n
• E(Sn) = ?
• var(Sn) = ?
• For an arbitrary r.v. (not Bernoulli), var(Sn) = var(X)/n, still a good estimate: "Law of large numbers", Chap 7.
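Since the Xi are independent, var(Sn) = n·var(X)/n² = var(X)/n, so the spread of the sample mean shrinks as the sample grows. A sketch for the Bernoulli polling case (p = 0.4 is my own illustrative value):

```python
# Sample mean of n independent Bernoulli(p) votes: Sn = (X1 + ... + Xn) / n
p = 0.4
var_X = p * (1 - p)          # variance of a single vote

def var_sample_mean(n):
    # var(Sn) = var(X1 + ... + Xn) / n^2 = n var(X) / n^2 = var(X) / n
    return var_X / n
```

Quadrupling the sample size divides var(Sn) by 4, i.e., it halves the standard deviation of the estimate.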
Estimation by simulation

• Perform an experiment n times; observe event A k times ⇒ estimate P(A) = k/n
• Example: Measuring the bias of a coin, P(H) = ?

Summary: func of r.v.

• g(X): an arbitrary function of a r.v. X:
• E[g(X)] = ∑ g(x) pX(x)
• In general: E[g(X)] ≠ g(E[X])
• var[g(X)] = E[g(X)²] − E[g(X)]²
Summary: sum and product of r.v.

• E[X+Y] = E[X] + E[Y]   (always)
• E[aX+bY+c] = a E[X] + b E[Y] + c   (always)
• E[XY] = E[X] E[Y]   (only if X and Y are independent)
• var(X+Y) = var(X) + var(Y)   (only if X and Y are independent)
• var(XY) = ?

Summary: known PMFs

• Discrete uniform on [a, b]:
– E[X] = (a+b)/2, var(X) = (b−a)(b−a+2)/12
• Bernoulli with p:
– E[X] = p, var(X) = p(1−p)
• Binomial with p, n:
– E[X] = np, var(X) = np(1−p)
• Geometric with p:
– E[X] = 1/p, var(X) = (1−p)/p²
• Poisson with λ:
– E[X] = λ, var(X) = λ
Summary: calculating probabilities starting with the conditional

• Joint: pX,Y(x,y) = pY(y) pX|Y(x|y)
• Marginal: pX(x) = ∑y pY(y) pX|Y(x|y)
• Expectation: E[X] = ∑y pY(y) E[X|Y=y]