ECE 541 Project Report:
Modeling the Game of RISK Using Markov Chains
Stochastic Signals and Systems
Rutgers University, Fall 2014
Sijie Xiong, RUID: 151004243
Email: [email protected]
Contents

1 The Game of RISK
2 Modeling RISK via Markov Chain
  2.1 State Space
  2.2 A Single Round of Rolling Dice
  2.3 Transition Probability Matrix
3 The Probability that the Attacker Wins
References
1 The Game of RISK
Tan [2] presented the detailed rules of the game of RISK. Here, the rules are slightly different.
The basic idea is that the attacking and defending countries initially have a_0 and d_0 armies,
respectively, and whether they lose armies depends on the results of rolling dice. That is, at each
round, an attacker with a remaining armies rolls i = min(a - 1, 3) dice (instead of the min(a, 3)
used in Tan [2]), and a defender with d remaining armies rolls j = min(d, 2) dice. The highest and
second highest rolls of the attacker and defender are compared sequentially. Whenever the roll of
the attacker is strictly greater, the defender loses 1 army; otherwise the attacker loses 1 army. The
battle ends when either side loses all of its armies; equivalently, the attacker wins if it has at
least 1 army left while the defender has 0, and vice versa. Tan's Table 1, which gives an example of
a battle, is reproduced below with some modifications. It shows the attacker winning the battle at
the 4th round. Since each dice-rolling outcome is random, the game can be modeled as a random
process. In this report, the quantity of interest is the probability that the attacker wins.
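To make these rules concrete, here is a minimal Python sketch of a single battle (the function
names are illustrative, and the convention from Section 2.3 that an attacker reduced to 1 army
loses it with probability 1 is built in):

    import random

    def battle_round(a, d):
        # One round: the attacker rolls min(a-1, 3) dice, the defender min(d, 2);
        # the highest and second-highest rolls are compared sequentially.
        i, j = min(a - 1, 3), min(d, 2)
        att = sorted((random.randint(1, 6) for _ in range(i)), reverse=True)
        dfn = sorted((random.randint(1, 6) for _ in range(j)), reverse=True)
        da = dd = 0
        for x, y in zip(att, dfn):        # min(i, j) comparisons
            if x > y:
                dd += 1                   # strictly greater: defender loses 1 army
            else:
                da += 1                   # tie or lower: attacker loses 1 army
        return da, dd

    def battle(a, d):
        # Play until one side has no armies; return True if the attacker wins.
        while a > 0 and d > 0:
            if a == 1:                    # attacker cannot roll any dice
                return False
            da, dd = battle_round(a, d)
            a, d = a - da, d - dd
        return d == 0

Averaging battle(a_0, d_0) over many runs gives a Monte Carlo estimate of the winning probability
that is computed exactly in Section 3.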
2 Modeling RISK via Markov Chain

2.1 State Space
Let X_n = (a_n, d_n) denote the state of the system at the nth round:

    X_n = (a_n, d_n),  0 ≤ a_n ≤ a_0,  0 ≤ d_n ≤ d_0,  n ≥ 0,    (1)

where a_n and d_n are the remaining armies of the attacker and defender, respectively.
The initial state of the system is X_0 = (a_0, d_0).

Table 1: An example of a battle

    Round #    No. of armies    No. of dice rolled    Outcome          No. of losses
               att.   def.      att.   def.           att.     def.    att.   def.
    1st        4      3         3      2              5,3,1    6,2     1      1
    2nd        3      2         2      2              4,4      4,3     1      1
    3rd        2      1         1      1              5        1       0      1
    4th        2      0         -      -              -        -       -      -

The probability of the system changing from one state at the nth round to another state at the
(n + 1)th round depends only on X_n:

    P[X_{n+1} = (a_{n+1}, d_{n+1}) | X_n = (a_n, d_n), ..., X_0 = (a_0, d_0)]
      = P[X_{n+1} = (a_{n+1}, d_{n+1}) | X_n = (a_n, d_n)].    (2)
Therefore, {X_n, n = 0, 1, ...} can be characterized as a Markov chain. Obviously, (0, 0) is not a
valid state. The total number of armies that the attacker and defender lose in each round is either
1 or 2. More specifically,

    ∆a + ∆d = min(i, j) ∈ {1, 2},  0 ≤ ∆a ≤ 2,  0 ≤ ∆d ≤ 2,  0 ≤ i ≤ 3,  0 ≤ j ≤ 2,    (3)

where ∆a denotes the number of armies the attacker loses and ∆d denotes the number of armies
the defender loses.
The possible states can be separated into two groups. Intuitively, the states in which either side
has lost all of its armies indicate the end of a battle, and Tan [2] referred to these states as
absorbing states. Ordering these states gives a vector of absorbing states with (a_0 + d_0) entries,

    A = [(0,1), (0,2), ..., (0,d_0), (1,0), (2,0), ..., (a_0,0)]^T.    (4)
On the other side, the states in which both the attacker and defender have at least 1 army are
transient, since the system is certain to lose at least 1 army and move to another state according
to equation (3). We can likewise obtain a vector of transient states with a_0 d_0 entries,

    T = [(1,1), (1,2), ..., (1,d_0), (2,1), (2,2), ..., (2,d_0), ..., (a_0,1), (a_0,2), ..., (a_0,d_0)]^T.    (5)

The process of the game RISK can thus be characterized as follows: the system starts from the
initial state (a_0, d_0), which is itself a transient state, then jumps among the transient states
in T until it reaches an absorbing state in A.
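The two orderings in equations (4) and (5) translate directly into code; a small sketch, continuing
the Python used earlier (the helper name is ours):

    def state_vectors(a0, d0):
        # Absorbing states ordered as in equation (4), then transient states
        # ordered as in equation (5).
        A = [(0, d) for d in range(1, d0 + 1)] + [(a, 0) for a in range(1, a0 + 1)]
        T = [(a, d) for a in range(1, a0 + 1) for d in range(1, d0 + 1)]
        return A, T

    A, T = state_vectors(4, 3)        # the Table 1 example
    assert len(A) == 4 + 3 and len(T) == 4 * 3
    assert T[-1] == (4, 3)            # the initial state is the last transient state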
2.2 A Single Round of Rolling Dice
We are only concerned with the probability that the attacker wins, i.e., that the defender loses
all of its armies while the attacker still has at least 1 army. These states correspond to the
(1 + d_0)th through the (a_0 + d_0)th entries of the absorbing-state vector A, namely
(1,0), (2,0), ..., (a_0,0). Let P_{ij}^{∆d} denote the probability that the defender loses ∆d
armies when the attacker and defender roll i and j dice, respectively. According to Osborne [3],
there are a total of 14 distinct values of P_{ij}^{∆d}, which are computed from the marginal and
joint probability distributions of rolling 2 or 3 dice. Osborne's Table 2 is presented here as a
reference. Note that the probabilities under each pair (i, j) sum to 1.

Table 2: 14 distinct values of P_{ij}^{∆d}

    i    j    ∆d    P_{ij}^{∆d}    Value
    1    1    0     P_{11}^{0}     0.583
    1    1    1     P_{11}^{1}     0.417
    1    2    0     P_{12}^{0}     0.745
    1    2    1     P_{12}^{1}     0.255
    2    1    0     P_{21}^{0}     0.421
    2    1    1     P_{21}^{1}     0.579
    3    1    0     P_{31}^{0}     0.340
    3    1    1     P_{31}^{1}     0.660
    2    2    0     P_{22}^{0}     0.448
    2    2    1     P_{22}^{1}     0.324
    2    2    2     P_{22}^{2}     0.228
    3    2    0     P_{32}^{0}     0.293
    3    2    1     P_{32}^{1}     0.336
    3    2    2     P_{32}^{2}     0.372
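These values can be verified by brute-force enumeration of all 6^(i+j) equally likely dice
outcomes, as in the following sketch:

    from collections import Counter
    from itertools import product

    def loss_distribution(i, j):
        # Return {k: P(defender loses k armies)} when the attacker rolls i dice
        # and the defender rolls j dice, i.e. the values P_ij^k of Table 2.
        counts = Counter()
        for rolls in product(range(1, 7), repeat=i + j):
            att = sorted(rolls[:i], reverse=True)
            dfn = sorted(rolls[i:], reverse=True)
            dd = sum(x > y for x, y in zip(att, dfn))
            counts[dd] += 1
        total = 6 ** (i + j)
        return {k: c / total for k, c in sorted(counts.items())}

    for i, j in [(1, 1), (1, 2), (2, 1), (3, 1), (2, 2), (3, 2)]:
        print((i, j), loss_distribution(i, j))
    # e.g. (3, 2) -> {0: 0.2926, 1: 0.3358, 2: 0.3717}, matching the last
    # three rows of Table 2 up to rounding.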
2.3 Transition Probability Matrix
Based on the state vectors A and T from Section 2.1 and on Table 2, we are able to construct the
transition probability matrix of the system,

    P = [ Q  R
          0  I ],    (6)
where the a_0 d_0 × a_0 d_0 matrix Q contains the probabilities of the system going from one
transient state to another transient state, and the a_0 d_0 × (a_0 + d_0) matrix R contains the
probabilities of the system going from a transient state to an absorbing state. According to the
discussion in Section 2.1, the system changes among transient states during every single round
until it reaches, and then stays in, an absorbing state. Therefore, the diagonal entries of Q are
all 0. Since states (1, 1), (1, 2), ..., (1, d_0) go to (0, 1), (0, 2), ..., (0, d_0),
respectively, with probability 1, the first d_0 diagonal entries of R are all 1. The
(a_0 + d_0) × (a_0 + d_0) identity matrix I reflects that once the system enters an absorbing
state, it stays in that state with probability 1. The remaining nonzero entries of Q and R are
drawn from Table 2, and each row of the transition probability matrix P sums to 1.
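A sketch of this construction, reusing loss_distribution from Section 2.2 and assuming numpy is
available (the helper names are ours):

    import numpy as np

    def transition_matrices(a0, d0):
        # Build Q (transient -> transient) and R (transient -> absorbing),
        # indexing states exactly as in equations (4) and (5).
        A = [(0, d) for d in range(1, d0 + 1)] + [(a, 0) for a in range(1, a0 + 1)]
        T = [(a, d) for a in range(1, a0 + 1) for d in range(1, d0 + 1)]
        ai = {s: n for n, s in enumerate(A)}
        ti = {s: n for n, s in enumerate(T)}
        Q = np.zeros((len(T), len(T)))
        R = np.zeros((len(T), len(A)))
        for a, d in T:
            if a == 1:                      # (1, d) -> (0, d) with probability 1
                R[ti[(1, d)], ai[(0, d)]] = 1.0
                continue
            i, j = min(a - 1, 3), min(d, 2)
            for dd, p in loss_distribution(i, j).items():
                da = min(i, j) - dd         # equation (3): losses sum to min(i, j)
                s = (a - da, d - dd)        # attacker stays >= 1 since da <= a - 1
                if s[1] == 0:               # defender wiped out: absorbing state
                    R[ti[(a, d)], ai[s]] = p
                else:
                    Q[ti[(a, d)], ti[s]] = p
        return Q, R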
3 The Probability that the Attacker Wins

Table 3: Probability that the attacker wins under different initial states
(rows: a_0; columns: d_0)

    a_0 \ d_0    10       20       30       40       50       60
    10           0.483    0.049    0.002    0        0        0
    20           0.973    0.586    0.157    0.022    0.002    0
    30           1        0.957    0.649    0.256    0.061    0.009
    40           1        1        0.954    0.697    0.342    0.111
    50           1        1        1        0.958    0.736    0.415
    60           1        1        1        1        0.965    0.771
Yates [1] showed that the n-step transition matrix P^n completely describes the evolution of
probabilities in a Markov chain. Similarly, let the a_0 d_0 × (a_0 + d_0) matrix F_n denote the
transition probability matrix of the system's first arrival in an absorbing state at the nth round,
coming from a transient state at the (n - 1)th round. We have

    F_n = Q^{n-1} R.    (7)

This means that the system must be in transient states during the first (n - 1) rounds, and the
nth transition must take it from a transient state to an absorbing state. Following Osborne [3],
the system is guaranteed to eventually reach an absorbing state, so summing over all n gives the
transition probability matrix from an initial state to the final absorbing state:
    F = Σ_{n=1}^{∞} F_n = Σ_{n=1}^{∞} Q^{n-1} R = (I - Q)^{-1} R,    (8)

where the geometric series converges because every state in T is transient, so Q^n → 0 as n → ∞.
It follows that if the attacker wins, the system goes from the initial state (a_0, d_0) to one of
the a_0 absorbing states (1, 0), (2, 0), ..., (a_0, 0). These transitions correspond to the last
a_0 columns in the (a_0 d_0)th (last) row of F. Let A denote the event that the attacker wins the
battle; then

    P[A | X_0 = (a_0, d_0)] = Σ_{j=d_0+1}^{d_0+a_0} F(a_0 d_0, j).    (9)
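Numerically, equations (8) and (9) amount to a single linear solve; a sketch building on
transition_matrices above:

    def attacker_win_probability(a0, d0):
        # Equation (8): F = (I - Q)^{-1} R, computed as one linear solve, then
        # equation (9): in the row of state (a0, d0) (the last transient state),
        # sum the columns of the absorbing states (1, 0), ..., (a0, 0).
        Q, R = transition_matrices(a0, d0)
        F = np.linalg.solve(np.eye(Q.shape[0]) - Q, R)
        return F[-1, d0:].sum()

    # Looping a0, d0 over 10, 20, ..., 60 should reproduce Table 3 up to rounding.
    print(round(attacker_win_probability(10, 10), 3))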
Table 3 gives some numerical results of P[A | X_0 = (a_0, d_0)], and Figure 1 shows the
relationship between P(A) and the initial state X_0 = (a_0, d_0) in more detail. Three conclusions
can be drawn: a) with the initial number of armies of one side fixed (a_0 or d_0 fixed), the
probability that the opposing side wins increases as its own initial number of armies increases;
b) if both sides have an equal number of armies (a_0 = d_0, at least 10), the chance that the
attacker wins increases as a_0 increases, and exceeds 50% once a_0 = d_0 ≥ 20; c) if the defender
is outnumbered by a fixed margin (a_0 = d_0 + c, where c is a positive constant), the probability
that the attacker wins first decreases and then increases as d_0 grows.
Figure 1: The relationship between P(A) and initial states (a_0, d_0)
References
[1] Yates, Roy D., and David J. Goodman. "Probability and Stochastic Processes." (2003).
[2] Tan, Baris. "Markov Chains and the RISK Board Game." Mathematics Magazine 70 (1997): 349-357.
[3] Osborne, Jason A. "Markov Chains for the RISK Board Game Revisited." Mathematics Magazine 76 (2003): 129-135.