
IEOR 4106: Introduction to Operations Research: Stochastic Models
Spring 2004, Professor Whitt
First Midterm Exam: Thursday, February 19
Chapters 1-4 in Ross, SOLUTIONS
Justify your answers; show your work.
1. Which textbook is best? (35 points)
In its never-ending quest for educational excellence, the IEOR Department has tried a different probability-and-statistics textbook for its introductory probability-and-statistics course
during each of the last three semesters. During the first semester, 50 students used the textbook
by Professor Mean; during the second semester, 30 students used the textbook by Professor
Median; and during the third semester, 20 students used the textbook by Professor Mode.
Surveys were taken at the end of each course, asking all students their opinion. In the first
survey, 20 of the 50 students admitted being satisfied with Mean’s book; in the second survey,
15 of the 30 students admitted being satisfied with Median’s book; and in the third survey, 16
of the 20 students admitted being satisfied with Mode’s book.
(a) What is the probability that a student selected at random from all these students will
have admitted being satisfied by his textbook? (5 points)
—————————————————————————–
It is helpful to start by constructing a probability tree. Let S denote satisfied; let N
denote not satisfied.
[Probability tree: the first branch is the choice of textbook, with P(Mean) = 0.5, P(Median) = 0.3, P(Mode) = 0.2; the second branch is satisfaction, with P(S|Mean) = 0.4, P(S|Median) = 0.5, P(S|Mode) = 0.8, giving joint probabilities P(S ∩ Mean) = 0.20, P(S ∩ Median) = 0.15, P(S ∩ Mode) = 0.16.]
P(S) = P(S ∩ Mean) + P(S ∩ Median) + P(S ∩ Mode) = 0.20 + 0.15 + 0.16 = 0.51
—————————————————————————–
(b) Suppose that a student, selected at random from all these students, admitted having
been satisfied by his textbook. What is the probability that the student used the textbook by
Professor Mean? (10 points)
—————————————————————————–
Use Bayes’ Theorem, exploiting the definition of conditional probability:
P(Mean|S) = P(Mean ∩ S)/P(S) = 0.20/(0.20 + 0.15 + 0.16) = 0.20/0.51 ≈ 0.39
The last numerical calculation is not required.
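As a quick numerical check (not part of the required solution), the total-probability and Bayes calculations in parts (a) and (b) can be reproduced in a few lines of Python:

    # Quick check of parts (a) and (b): law of total probability and Bayes' rule.
    classes = {"Mean": (50, 20), "Median": (30, 15), "Mode": (20, 16)}  # (class size, number satisfied)
    total = sum(size for size, _ in classes.values())                  # 100 students in all

    # P(S) by the law of total probability: sum of the joint probabilities P(S and book)
    p_S = sum(sat / total for _, sat in classes.values())
    print(p_S)                                                          # 0.51

    # Bayes' rule: P(Mean | S) = P(S and Mean) / P(S)
    p_mean_given_S = (classes["Mean"][1] / total) / p_S
    print(p_mean_given_S)                                               # ~ 0.39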
—————————————————————————–
(c) By this experiment, which book is most likely to be best (assuming that we can judge
by student opinion)? (5 points)
—————————————————————————–
It is natural to pick the book with P (S|book) being largest. These probabilities are obtained
from the second set of branches on the tree. By that criterion, the textbook by Professor Mode
is most likely to be best: P(S|Mode) = 0.80, larger than the other two values of P(S|book).
—————————————————————————–
(d) Looking at the results, one curmudgeon on the faculty (perhaps Professor Mean) said,
“The experiment is inconclusive. An outcome like that could have occurred by chance, with
each student tossing a coin to determine whether or not to say that he was satisfied with his
textbook.” How justified is that √
criticism? In particular,
how well does the fair-coin model
√
work for Professor Mode? (Hint: 5 ≈ 2.2, while 50 ≈ 7.1.) (15 points)
—————————————————————————–
We do a statistical test; we use a normal approximation for the binomial.
The question is: How likely could the observed outcome or something even more extreme
occur by chance, assuming that the students make judgements randomly and independently,
with each student’s evaluation being positive or negative with probability 1/2? Clearly, the
overall number of satisfied students – 51 out of 100 – is not inconsistent with the criticism.
That clearly could occur by chance in exactly the way described.
But consider the results for the individual textbooks. First, since exactly half of the
students using Professor Median’s book were satisfied, that result is not inconsistent with
the criticism either. So look at the results for the textbooks by Professors Mean and Mode.
Looking at each separately, we can use a normal approximation for the binomial distribution.
Intuitively, you should see that the statistics for Professor Mode’s book are most extreme.
So it would be natural to examine that case first. In fact, we are asked to do just that.
Textbook by Professor Mode: Let Y be the number of satisfied students in a class
of 20 students, assuming that the decisions are random with probability 1/2. We observe 16.
An outcome at least as extreme as the observed value is Y ≥ 16. The question then is: what is P(Y ≥ 16)?
Note that the mean is EY = 20 × 1/2 = 10 and the variance is Var(Y) = 20 × 1/2 × 1/2 = 5, so the standard deviation is SD(Y) = √5 ≈ 2.23. (You can apply the hint.) A rough estimate
in your head would be 2.2; even somewhere between 2 and 2.5 is good enough. Reasoning
quickly, 16 is more than 2 standard deviations above the mean 10, so we conclude it is unlikely
to occur by chance. It is not prohibitively unlikely, but it is quite unlikely. Reasoning more
slowly, we get the normal approximation
P(Y ≥ 16) = P((Y − 10)/2.23 ≥ (16 − 10)/2.23) ≈ P(N(0, 1) ≥ 2.69) ≈ 0.0036
using the table on p. 81 of your textbook by Ross. A refined approximation, to account for
the fact that Y is integer-valued, is to consider

P(Y ≥ 16) = P(Y ≥ 15.5) = P((Y − 10)/2.23 ≥ (15.5 − 10)/2.23) ≈ P(N(0, 1) ≥ 2.47) ≈ 0.0068
The probability that the outcome for Professor Mode would be that good or better is thus roughly between 1/150 and 1/300, depending on the approximation used. In any case, it is less than 1/100.
That probability is not extraordinarily small, but it is sufficiently small that we would
reasonably conclude that the outcome did not occur by chance in the way contemplated.
We should perhaps consider both very small and very large values as extreme. Thus we
should really look at the probability P (|Y − EY | ≥ 6) = P (Y ≤ 4) + P (Y ≥ 16), which is twice
the probability above. Again, we conclude that the outcome is unlikely to occur by chance in
the way described. We would fairly conclude that Professor Mode’s textbook is indeed better.
However, we might want further evidence.
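For completeness, here is a short Python check (not part of the required solution) of the tail probability for Professor Mode's book: it computes the exact Binomial(20, 1/2) tail alongside the two normal approximations used above.

    import math

    n, p = 20, 0.5
    mean = n * p                          # EY = 10
    sd = math.sqrt(n * p * (1 - p))       # SD(Y) = sqrt(5) ~ 2.236

    # Exact binomial tail P(Y >= 16)
    exact = sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(16, n + 1))

    def normal_tail(z):
        # P(N(0,1) >= z)
        return 0.5 * math.erfc(z / math.sqrt(2))

    plain = normal_tail((16 - mean) / sd)        # ~ 0.0036, as above
    corrected = normal_tail((15.5 - mean) / sd)  # ~ 0.0069, with the continuity correction
    print(exact, plain, corrected)               # the exact tail is ~ 0.0059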
—————————————————————————–
2. Random Motion on the Chessboard (30 points)
Suppose that a rook (chess piece) is placed on a corner square of an empty standard 8 × 8
chessboard, and then allowed to make a succession of independent random moves, with each
of its legal moves being equally likely at each step. (Recall that a rook can move any non-zero
number of squares vertically or horizontally, but not both, and not diagonally.)
(a) What is the probability that the rook is back at its starting square after two moves?
(5 points)
———————————————————————
For this part, we observe that the rook always has 14 possible moves at each step: 7
horizontal moves and 7 vertical moves. For any first move, it can return to its initial square
on the second move in only one way. Hence the probability is clearly 1/14.
———————————————————————
(b) What is the probability that the rook is back at its starting square after three moves?
(5 points)
———————————————————————
We observe that, in order to return to the starting corner square in three moves, the
rook must either go horizontally three straight times or vertically three straight times. With
probability 1/2, it goes horizontally on the first move. On the second move it must move to
one of the 6 other squares on the same row as the starting square (but not the starting square
itself!), which occurs with probability 6/14. On the third move it must move to the original
square, which occurs with probability 1/14. The same situation happens vertically. So, the
total probability is 2 × (1/2) × (6/14) × (1/14) = (6/196) = (3/98).
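These two answers can be verified by brute force. The sketch below (not part of the required solution) builds the rook's 64 × 64 one-step transition matrix with exact fractions and reads off the two- and three-step return probabilities from a corner square.

    from fractions import Fraction

    SQUARES = [(r, c) for r in range(8) for c in range(8)]
    INDEX = {sq: i for i, sq in enumerate(SQUARES)}

    def rook_moves(r, c):
        # All 14 squares a rook can reach in one move from (r, c).
        return [(r, j) for j in range(8) if j != c] + [(i, c) for i in range(8) if i != r]

    # One-step transition matrix: each of the 14 legal moves is equally likely.
    P = [[Fraction(0)] * 64 for _ in range(64)]
    for (r, c) in SQUARES:
        for dest in rook_moves(r, c):
            P[INDEX[(r, c)]][INDEX[dest]] = Fraction(1, 14)

    def mat_mul(A, B):
        return [[sum(A[i][k] * B[k][j] for k in range(64)) for j in range(64)]
                for i in range(64)]

    corner = INDEX[(0, 0)]
    P2 = mat_mul(P, P)
    P3 = mat_mul(P2, P)
    print(P2[corner][corner])  # 1/14
    print(P3[corner][corner])  # 3/98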
———————————————————————
(c) Identify the closed and open communication classes of this Markov chain. (5 points)
———————————————————————
There are 64 states, the 64 squares on the chessboard. The Markov chain is irreducible: all states communicate, so the entire state space forms the single closed communication class, and there are no open classes.
———————————————————————
(d) Which of the following properties hold for this Markov chain: (i) irreducible, (ii) periodic, (iii) absorbing? (5 points)
———————————————————————
Only (i) irreducible. The chain is aperiodic: the rook can return to its starting square in both two moves and three moves, so the period is gcd(2, 3) = 1. The chain is NOT absorbing.
———————————————————————
(e) What is the long-run proportion of steps ending with the rook on its initial corner
square? (5 points)
———————————————————————
We exploit the reversibility discussed in Section 4.8 of the text. It suffices to count the
number of moves possible from each square, and then take this number for the corner and divide
it by the sum of the numbers. However, the number of moves from each square is precisely
14. Thus the steady-state probability of being in any one square is 14/(14 × 64) = 1/64. That
steady-state probability coincides with the long-run proportion of visits there.
———————————————————————
(f) Starting from its initial corner square, what is the expected number of moves until the
rook first returns to that starting square? (5 points)
———————————————————————
The expected time to return is the reciprocal of the steady-state probability. Thus it is 64.
See Remark (ii) in Section 4.4, on page 212.
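Both answers can also be checked by simulation. The sketch below (not part of the required solution) runs a long random rook walk and estimates the long-run fraction of steps spent on the starting corner, which should be close to 1/64, and the average return time, which should be close to 64.

    import random

    def rook_moves(r, c):
        # All 14 squares a rook can reach in one move from (r, c).
        return [(r, j) for j in range(8) if j != c] + [(i, c) for i in range(8) if i != r]

    random.seed(0)
    start = (0, 0)
    pos = start
    steps = 1_000_000
    visits = 0
    return_times = []
    last_visit = 0
    for t in range(1, steps + 1):
        pos = random.choice(rook_moves(*pos))
        if pos == start:
            visits += 1
            return_times.append(t - last_visit)
            last_visit = t

    print(visits / steps)                         # long-run fraction, ~ 1/64 = 0.015625
    print(sum(return_times) / len(return_times))  # mean return time, ~ 64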
———————————————————————
3. A Finite Markov Chain (35 points)
Consider a Markov chain on the eight states {1, 2, ..., 8} with transition matrix P given by

P =
  [ 0.2  0.1  0.2  0.0  0.2  0.1  0.2  0.0 ]
  [ 0.0  0.5  0.0  0.0  0.0  0.5  0.0  0.0 ]
  [ 0.0  0.0  1.0  0.0  0.0  0.0  0.0  0.0 ]
  [ 0.0  0.0  0.0  0.5  0.0  0.0  0.5  0.0 ]
  [ 0.0  0.1  0.4  0.1  0.1  0.0  0.2  0.1 ]
  [ 0.0  0.2  0.0  0.0  0.0  0.8  0.0  0.0 ]
  [ 0.0  0.0  0.0  0.0  0.0  0.0  0.5  0.5 ]
  [ 0.0  0.0  0.0  0.5  0.0  0.0  0.0  0.5 ]

Note that we are numbering the states 1, 2, ..., 8.
(a) Which states are accessible from state 1? (2 points)
—————————————————————————–
A state j is accessible from another state i if it is possible in some finite number of steps
to get from i to j; see Section 4.3 of Ross. The issue is whether you can ever get there. Thus,
all states are accessible from state 1. You can see that by drawing the graph to determine the
communication classes. Unfortunately, many people thought you had to be able to get to the
other state in only one step, but that is not the definition. Since the exam was open-book, you
could have just checked by looking at the first sentence of Section 4.3 on page 193.
—————————————————————————–
(b) From which states is state 1 accessible? (2 points)
—————————————————————————–
State 1 is only accessible from state 1. You can see that by looking at the first column of
P.
—————————————————————————–
(c) Put the transition matrix in canonical form. (5 points)
—————————————————————————–
We order the states in the order (3, 2, 6, 4, 7, 8, 5, 1); we order the recurrent states by the
size of the closed communication class, and then by the original state number. With that
ordering, the canonical form is

P =
  [ 1.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0 ]
  [ 0.0  0.5  0.5  0.0  0.0  0.0  0.0  0.0 ]
  [ 0.0  0.2  0.8  0.0  0.0  0.0  0.0  0.0 ]
  [ 0.0  0.0  0.0  0.5  0.5  0.0  0.0  0.0 ]
  [ 0.0  0.0  0.0  0.0  0.5  0.5  0.0  0.0 ]
  [ 0.0  0.0  0.0  0.5  0.0  0.5  0.0  0.0 ]
  [ 0.4  0.1  0.0  0.1  0.2  0.1  0.1  0.0 ]
  [ 0.2  0.1  0.1  0.0  0.2  0.0  0.2  0.2 ]
—————————————————————————–
(d) Identify the open and closed communication classes for this Markov chain. (2 points)
—————————————————————————–
The communication classes are subsets of the state space. The communication classes form a partition of the state space: each state is in one and only one class. The communication classes are {2, 6}, {3}, {4, 7, 8}, {5}, and {1}. The first three classes are closed; the last two
are open. You can get from state 1 to state 5, but you cannot get back to state 1 from state
5. Thus it is natural to put state 1 last, but that is not required.
—————————————————————————–
(e) Which states are transient? (2 points)
—————————————————————————–
The closed communication classes are {2, 6}, {3}, and {4, 7, 8}; these are the classes from which
you cannot leave. The open communication classes are {5} and {1}; they contain the transient
states. Thus the transient states are 1 and 5.
—————————————————————————–
(f) Which states are recurrent? (2 points)
—————————————————————————–
The recurrent states are all the states in the closed communication classes. A state is
recurrent if, starting in that state, you eventually return with probability 1. Thus the recurrent
states are 2, 6, 3, 4, 7 and 8.
—————————————————————————–
(g) Compute the two-step transition probability P^(2)_{1,7}. (3 points)
—————————————————————————–
It is easy to draw a probability tree. You find three ways with positive probabilities. In
particular,
P^(2)_{1,7} = P_{1,1} P_{1,7} + P_{1,5} P_{5,7} + P_{1,7} P_{7,7} = (0.2 × 0.2) + (0.2 × 0.2) + (0.2 × 0.5) = 0.18

You could compute the square of P, i.e., P^2, but that is the hard way.
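As a numerical check (not part of the required solution), the "hard way" takes only a few lines; the array below is the transition matrix P from the problem statement, with row and column i (0-based) corresponding to state i + 1.

    import numpy as np

    P = np.array([
        [0.2, 0.1, 0.2, 0.0, 0.2, 0.1, 0.2, 0.0],
        [0.0, 0.5, 0.0, 0.0, 0.0, 0.5, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.5, 0.0],
        [0.0, 0.1, 0.4, 0.1, 0.1, 0.0, 0.2, 0.1],
        [0.0, 0.2, 0.0, 0.0, 0.0, 0.8, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.5],
        [0.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.5],
    ])

    P2 = P @ P
    print(P2[0, 6])  # P^(2)_{1,7} = 0.18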
—————————————————————————–
(h) Starting in state 2, what is the expected total number of visits to state 2? (2 points)
—————————————————————————–
Infinite: state 2 belongs to the finite closed communication class {2, 6}, so it is recurrent, and starting there the chain returns to state 2 infinitely often with probability 1.
—————————————————————————–
(i) Starting in state 1, what is the expected total number of visits to state 5? (5 points)
—————————————————————————–
States 1 and 5 are the two transient states. We thus want to use the fundamental matrix
N. Indeed, we want N_{1,5}, where N = (I − Q)^{-1}, with the states labelled appropriately.
This question is about an absorbing Markov chain. The matrix can be written in the
block-matrix form
P =
  [ I  0 ]
  [ R  Q ],
where Q is a 2 × 2 matrix. For this question, that is all we need. Here the matrix Q is
Q =
  [ 0.1  0.0 ]
  [ 0.2  0.2 ],
where we have ordered the states by putting 5 first and then 1. Thus
I − Q =
  [  0.9  0.0 ]
  [ −0.2  0.8 ],
and
N =
  [ 10/9   0.0  ]
  [ 5/18  10/8 ].
Thus, starting in state 1, the expected number of steps spent in state 5 before absorption
in one of the closed communication classes is 5/18.
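The small matrix inversion is easy to check numerically (not part of the required solution):

    import numpy as np

    # Transient-to-transient block Q, with the states ordered (5, 1).
    Q = np.array([[0.1, 0.0],
                  [0.2, 0.2]])
    N = np.linalg.inv(np.eye(2) - Q)   # fundamental matrix N = (I - Q)^(-1)
    print(N)         # [[10/9, 0.0], [5/18, 10/8]]
    print(N[1, 0])   # expected visits to state 5 starting from state 1: 5/18 ~ 0.278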
—————————————————————————–
(j) Starting in state 4, what is the expected number of steps before you return again to
state 4? (5 points)
—————————————————————————–
It suffices to focus on the three-state closed communication class containing the states 4, 7
and 8. The steady-state probability for the three-state subchain containing the states 4, 7 and
8 is (1/3, 1/3, 1/3). You can determine that by solving π = πP for the 3 × 3 subchain, which
is

P =
  [ 1/2  1/2   0  ]
  [  0   1/2  1/2 ]
  [ 1/2   0   1/2 ].
Since this transition matrix is doubly stochastic (its column sums are all 1), the stationary
probability vector attaches equal probability to all states. No arithmetic is required.
Thus the expected number of steps is 3, the reciprocal of the steady-state probability 1/3.
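A one-line check (not part of the required solution) that the uniform vector is indeed stationary for this doubly stochastic subchain:

    import numpy as np

    P_sub = np.array([[0.5, 0.5, 0.0],
                      [0.0, 0.5, 0.5],
                      [0.5, 0.0, 0.5]])
    pi = np.array([1/3, 1/3, 1/3])
    print(np.allclose(pi @ P_sub, pi))  # True: pi = pi P, so the mean return time to state 4 is 1/(1/3) = 3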
—————————————————————————–
(k) What is the approximate value of P^(25)_{1,6}? (5 points)
—————————————————————————–
The stationary vector for the 2 × 2 subchain associated with states 2 and 6, i.e.,
P =
  [ 1/2  1/2 ]
  [ 1/5  4/5 ],
is π = (2/7, 5/7). We next want to find the probability of being absorbed in that particular
subchain, given that we start in state 1. To do that, we make an absorbing chain out of the
original matrix, using the canonical form in part (c). The absorbing chain is
P =
  [ I  0 ]
  [ R  Q ],
where Q is a 2 × 2 matrix, just as before, but I is 3 × 3 and R is 2 × 3. In particular, with the
closed communication classes put in the order {3}, {2, 6} and {4, 7, 8}, in order of size, and
the states within the subchains ordered in numerical order, overall as (3, 2, 6, 4, 7, 8), then
R =
  [ 0.4  0.1  0.4 ]
  [ 0.2  0.2  0.2 ].
For the absorbing chain, we compute B = N R, for N computed in part (i) and R here.
Then
B =
  [ 4/9    1/9    4/9   ]
  [ 13/36  10/36  13/36 ].
Then the probability that the chain, starting in state 1, is eventually absorbed in the class {2, 6} is 10/36. So

P^(25)_{1,6} ≈ (10/36) × (5/7) = 50/252 ≈ 0.1984
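This final calculation can also be checked numerically (not part of the required solution); the sketch below repeats the B = N R computation and multiplies by the stationary probability of state 6 within {2, 6}.

    import numpy as np

    Q = np.array([[0.1, 0.0],       # transient block, states ordered (5, 1)
                  [0.2, 0.2]])
    R = np.array([[0.4, 0.1, 0.4],  # one-step probabilities from the transient states
                  [0.2, 0.2, 0.2]]) # into the classes ({3}, {2,6}, {4,7,8})
    N = np.linalg.inv(np.eye(2) - Q)
    B = N @ R
    print(B)               # [[4/9, 1/9, 4/9], [13/36, 10/36, 13/36]]
    p_absorb = B[1, 1]     # probability, from state 1, of ending up in {2, 6}: 10/36
    pi6 = 5 / 7            # stationary probability of state 6 within the {2, 6} subchain
    print(p_absorb * pi6)  # ~ 0.1984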
—————————————————————————–