DP for Optimum Strategies in Games

J.-S. Roger Jang (張智星)
http://mirlab.org/jang
MIR Lab, CSIE Dept., National Taiwan University

Outline
- Game of dice sum
- Game of colored jenga

Game of Dice Sum
- Description
  - Toss a die 8 times. Right after each toss, place the value into one of the 8 digit positions of four two-digit numbers.
  - Find the total of these 4 numbers. If the total is bigger than 150, your score is 0. Otherwise your score is the total.
- Your goal
  - Find the optimum strategy to play the game such that the expected score is maximized.
- Credit: Peter Norvig at Google, CS283: AI Programming Techniques (1989 at UC Berkeley)

Three-step Formula of DP: Step 1
- Optimum-value function
  - D(p, q, s) = maximum expected score when
    - p: number of tens positions left
    - q: number of ones positions left
    - s: current sum of the game
- Credit: 賀正翔 (Dept. of Electrical Engineering)
- Example game state: (1, 2, 67)

Three-step Formula of DP: Steps 2 and 3
- Recurrence formula for the optimum-value function
  - General recurrence:

    $$D(p, q, s) = \sum_{d=1}^{6} \frac{1}{6} \max\{\, D(p-1,\ q,\ s+10d),\ D(p,\ q-1,\ s+d) \,\}$$

  - Boundary condition:

    $$D(p, q, s) = 0 \quad \text{if } s > 150,\ \forall p, q$$

    ... (more to be added)
- Answer: D(4, 4, 0)

Strategy during the Game
- Apply the recurrence formula for the optimum-value function
  - Current state: (p, q, s) = (1, 2, 100)
  - Given that the die value is 4, our strategy is:
    pos = D(0, 2, 140) > D(1, 1, 104) ? tens : ones;

Game of Colored Jenga
- Description: http://codeforces.com/problemset/problem/424/E
- Techniques
  - Dynamic programming
  - Hash table
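
Returning to the dice-sum game, the recurrence for D(p, q, s) can be sketched in Python with memoization. This is only a minimal sketch under stated assumptions, not the slides' own implementation. The slides leave the boundary conditions incomplete ("more to be added"), so the following are assumed here: D(0, 0, s) = s when s ≤ 150, and when only one kind of position is left the die value must go there. The names `LIMIT` and `best_position` are hypothetical helpers introduced for illustration.

```python
from functools import lru_cache

LIMIT = 150  # totals above this score 0 (from the game description)


@lru_cache(maxsize=None)
def D(p, q, s):
    """Maximum expected final score with p tens positions and
    q ones positions left, given the current sum s."""
    if s > LIMIT:
        # Boundary condition from the slides: the game is already lost.
        return 0.0
    if p == 0 and q == 0:
        # Assumed boundary condition: all positions filled, score is s.
        return float(s)
    total = 0.0
    for d in range(1, 7):  # each die value has probability 1/6
        options = []
        if p > 0:  # place d in a tens position
            options.append(D(p - 1, q, s + 10 * d))
        if q > 0:  # place d in a ones position
            options.append(D(p, q - 1, s + d))
        total += max(options) / 6.0
    return total


def best_position(p, q, s, d):
    """Return 'tens' or 'ones' for die value d in state (p, q, s).
    Ties go to 'ones', matching the strict '>' in the slide's rule."""
    if p == 0:
        return "ones"
    if q == 0:
        return "tens"
    return "tens" if D(p - 1, q, s + 10 * d) > D(p, q - 1, s + d) else "ones"


if __name__ == "__main__":
    print("D(4, 4, 0) =", D(4, 4, 0))
    print("State (1, 2, 100), die 4:", best_position(1, 2, 100, 4))
```

Under these assumed boundary conditions, the state (1, 2, 100) with a die value of 4 compares D(0, 2, 140) ≈ 134.39 against D(1, 1, 104) ≈ 118.22, so `best_position(1, 2, 100, 4)` selects the tens position. Memoization keeps the computation small because there are at most 5 × 5 values of (p, q) and a few hundred reachable sums.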