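OR II
GSLM 52800

Outline
- introduction to discrete-time Markov chains
- problem statement
- long-term average cost per unit time
- solving the MDP by linear programming

States of a Machine

State   Condition
0       Good as new
1       Operable – minor deterioration
2       Operable – major deterioration
3       Inoperable – output of unacceptable quality

Transition of States

State   0     1      2      3
0       0     7/8    1/16   1/16
1       0     3/4    1/8    1/8
2       0     0      1/2    1/2
3       0     0      0      1

Possible Actions

Decision   Action                         Relevant States
1          Do nothing                     0, 1, 2
2          Overhaul (return to state 1)   2
3          Replace (return to state 0)    1, 2, 3

Problem
- adopting different collections of actions leads to different long-term average costs per unit time
- problem: to find the policy that minimizes the long-term average cost per unit time

Costs of Problem
- cost of defective items: state 0: 0; state 1: 1000; state 2: 3000
- cost of replacing the machine = 4000
- cost of losing production in machine replacement = 2000
- cost of overhauling (at state 2) = 2000; the later slides charge 4000 in total for an overhaul, i.e., 2000 plus 2000 of lost production

Policy Rd: Always Replace When State ≠ 0
- half of the time at state 0, with cost 0
- half of the time at other states, all with cost 6000, because of machine replacement
- average cost per unit time = 3000
[Transition diagram: 0 → 1 (7/8), 0 → 2 (1/16), 0 → 3 (1/16); 1 → 0, 2 → 0, 3 → 0 (each with probability 1)]

Long-Term Average Cost of a Positive, Irreducible Discrete-time Markov Chain
- a positive, irreducible discrete-time Markov chain with M+1 states, 0, …, M
- only M of the balance equations are needed, plus the normalization equation
- balance equations:
\[ \pi_j = \sum_{i=0}^{M} \pi_i p_{ij}, \quad j = 0, \ldots, M \]
- normalization equation:
\[ \sum_{i=0}^{M} \pi_i = 1 \]

Policy Ra: Replace at Failure but Otherwise Do Nothing
\[
\begin{aligned}
\pi_0 &= \pi_3 \\
\pi_1 &= \tfrac{7}{8}\pi_0 + \tfrac{3}{4}\pi_1 \\
\pi_2 &= \tfrac{1}{16}\pi_0 + \tfrac{1}{8}\pi_1 + \tfrac{1}{2}\pi_2 \\
\pi_3 &= \tfrac{1}{16}\pi_0 + \tfrac{1}{8}\pi_1 + \tfrac{1}{2}\pi_2 \\
1 &= \pi_0 + \pi_1 + \pi_2 + \pi_3
\end{aligned}
\]
\[ \pi_0 = \tfrac{2}{13}; \quad \pi_1 = \tfrac{7}{13}; \quad \pi_2 = \tfrac{2}{13}; \quad \pi_3 = \tfrac{2}{13} \]
\[ (0)\pi_0 + 1000\pi_1 + 3000\pi_2 + 6000\pi_3 \approx 1923 \]
[Transition diagram: 0 → 1 (7/8), 0 → 2 (1/16), 0 → 3 (1/16); 1 → 1 (3/4), 1 → 2 (1/8), 1 → 3 (1/8); 2 → 2 (1/2), 2 → 3 (1/2); 3 → 0 (1)]

Policy Rb: Replace in State 3, and Overhaul in State 2
\[
\begin{aligned}
\pi_0 &= \pi_3 \\
\pi_1 &= \tfrac{7}{8}\pi_0 + \tfrac{3}{4}\pi_1 + \pi_2 \\
\pi_2 &= \tfrac{1}{16}\pi_0 + \tfrac{1}{8}\pi_1 \\
\pi_3 &= \tfrac{1}{16}\pi_0 + \tfrac{1}{8}\pi_1 \\
1 &= \pi_0 + \pi_1 + \pi_2 + \pi_3
\end{aligned}
\]
\[ \pi_0 = \tfrac{2}{21}; \quad \pi_1 = \tfrac{5}{7}; \quad \pi_2 = \tfrac{2}{21}; \quad \pi_3 = \tfrac{2}{21} \]
\[ (0)\pi_0 + 1000\pi_1 + 4000\pi_2 + 6000\pi_3 \approx 1667 \]
[Transition diagram: 0 → 1 (7/8), 0 → 2 (1/16), 0 → 3 (1/16); 1 → 1 (3/4), 1 → 2 (1/8), 1 → 3 (1/8); 2 → 1 (1); 3 → 0 (1)]

Policy Rc: Replace in States 2 and 3
\[
\begin{aligned}
\pi_0 &= \pi_2 + \pi_3 \\
\pi_1 &= \tfrac{7}{8}\pi_0 + \tfrac{3}{4}\pi_1 \\
\pi_2 &= \tfrac{1}{16}\pi_0 + \tfrac{1}{8}\pi_1 \\
\pi_3 &= \tfrac{1}{16}\pi_0 + \tfrac{1}{8}\pi_1 \\
1 &= \pi_0 + \pi_1 + \pi_2 + \pi_3
\end{aligned}
\]
\[ \pi_0 = \tfrac{2}{11}; \quad \pi_1 = \tfrac{7}{11}; \quad \pi_2 = \tfrac{1}{11}; \quad \pi_3 = \tfrac{1}{11} \]
\[ (0)\pi_0 + 1000\pi_1 + 6000\pi_2 + 6000\pi_3 \approx 1727 \]
[Transition diagram: 0 → 1 (7/8), 0 → 2 (1/16), 0 → 3 (1/16); 1 → 1 (3/4), 1 → 2 (1/8), 1 → 3 (1/8); 2 → 0 (1); 3 → 0 (1)]

Problem
- in this case the minimum-cost policy is Rb, i.e., replacing in State 3 and overhauling in State 2
- question: Is there any efficient way to find the minimum-cost policy if there are many states and different types of actions?
- it is impossible to check all possible cases

Linear Programming Approach for an MDP
- let
  D_ik be the probability of adopting decision k at state i
  π_i be the stationary probability of state i
  y_ik = P(state i and decision k)
  C_ik = the cost of adopting decision k at state i

Linear Programming Approach for an MDP
\[ y_{ik} = \pi_i D_{ik}, \quad i = 0, \ldots, M; \; k = 1, \ldots, K \]
Since \(\sum_{k=1}^{K} D_{ik} = 1\), it follows that \(\pi_i = \sum_{k=1}^{K} y_{ik}\), so the stationary equations can be written in terms of \(y_{ik}\):
\[ \sum_{i=0}^{M} \pi_i = 1 \;\Rightarrow\; \sum_{i=0}^{M} \sum_{k=1}^{K} y_{ik} = 1 \]
\[ \pi_j = \sum_{i=0}^{M} \pi_i p_{ij}, \; j = 0, \ldots, M \;\Rightarrow\; \sum_{k=1}^{K} y_{jk} = \sum_{i=0}^{M} \sum_{k=1}^{K} y_{ik} p_{ij}(k), \; j = 0, 1, \ldots, M \]
\[ y_{ik} \ge 0, \quad i = 0, 1, \ldots, M; \; k = 1, \ldots, K \]
\[ E(C) = \sum_{i=0}^{M} \sum_{k=1}^{K} \pi_i C_{ik} D_{ik} = \sum_{i=0}^{M} \sum_{k=1}^{K} C_{ik} y_{ik} \]

Linear Programming Approach for an MDP
\[ \min Z = \sum_{i=0}^{M} \sum_{k=1}^{K} C_{ik} y_{ik} \]
subject to
\[ \sum_{i=0}^{M} \sum_{k=1}^{K} y_{ik} = 1 \]
\[ \sum_{k=1}^{K} y_{jk} - \sum_{i=0}^{M} \sum_{k=1}^{K} y_{ik} p_{ij}(k) = 0, \quad j = 0, 1, \ldots, M \]
\[ y_{ik} \ge 0, \quad i = 0, 1, \ldots, M; \; k = 1, \ldots, K \]
- at the optimum, D_ik = 0 or 1, i.e., a deterministic policy is used

Linear Programming Approach for an MDP
- actions that can be adopted at
  state 0: do nothing (i.e., k = 1)
  state 1: do nothing or replace (i.e., k = 1 or 3)
  state 2: do nothing, overhaul, or replace (i.e., k = 1, 2, or 3)
  state 3: replace (i.e., k = 3)
- variables: y01, y11, y13, y21, y22, y23, and y33

Linear Programming Approach for an MDP
\[ \min Z = 1000 y_{11} + 6000 y_{13} + 3000 y_{21} + 4000 y_{22} + 6000 y_{23} + 6000 y_{33} \]
subject to
\[
\begin{aligned}
y_{01} + y_{11} + y_{13} + y_{21} + y_{22} + y_{23} + y_{33} &= 1 \\
y_{01} - (y_{13} + y_{23} + y_{33}) &= 0 \\
y_{11} + y_{13} - \left(\tfrac{7}{8} y_{01} + \tfrac{3}{4} y_{11} + y_{22}\right) &= 0 \\
y_{21} + y_{22} + y_{23} - \left(\tfrac{1}{16} y_{01} + \tfrac{1}{8} y_{11} + \tfrac{1}{2} y_{21}\right) &= 0 \\
y_{33} - \left(\tfrac{1}{16} y_{01} + \tfrac{1}{8} y_{11} + \tfrac{1}{2} y_{21}\right) &= 0 \\
y_{ik} &\ge 0
\end{aligned}
\]

Linear Programming Approach for an MDP
- solving: y01 = 2/21, y11 = 5/7, y13 = 0, y21 = 0, y22 = 2/21, y23 = 0, y33 = 2/21
- optimal policy
  state 0: do nothing
  state 1: do nothing
  state 2: overhaul
  state 3: replace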
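The figures quoted on the slides can be cross-checked with a short script. The sketch below does not solve the linear program. It relies on the slides' remark that an optimal policy is deterministic, enumerates the six deterministic policies allowed by the action table, solves the balance and normalization equations for each one in exact fractions, and picks the cheapest. All names (`DO_NOTHING`, `ALLOWED`, `stationary`, `average_cost`, and so on) are illustrative assumptions rather than anything defined in the slides, and the overhaul cost of 4000 is taken from the LP objective.

```python
from fractions import Fraction as F
from itertools import product

# Do-nothing transition rows p_ij(1), copied from the Transition of States table.
DO_NOTHING = [
    [F(0), F(7, 8), F(1, 16), F(1, 16)],
    [F(0), F(3, 4), F(1, 8), F(1, 8)],
    [F(0), F(0), F(1, 2), F(1, 2)],
    [F(0), F(0), F(0), F(1)],
]
OVERHAUL = [F(0), F(1), F(0), F(0)]  # decision 2: return to state 1
REPLACE = [F(1), F(0), F(0), F(0)]   # decision 3: return to state 0

# C_ik as used in the LP objective; keys are (state i, decision k).
COST = {(0, 1): 0, (1, 1): 1000, (1, 3): 6000,
        (2, 1): 3000, (2, 2): 4000, (2, 3): 6000, (3, 3): 6000}
ALLOWED = [[1], [1, 3], [1, 2, 3], [3]]  # decisions available in states 0..3


def transition_matrix(policy):
    """Chain induced by a deterministic policy (one decision per state)."""
    special = {2: OVERHAUL, 3: REPLACE}
    return [DO_NOTHING[i] if k == 1 else special[k]
            for i, k in enumerate(policy)]


def stationary(P):
    """Solve pi = pi P with sum(pi) = 1 by Gauss-Jordan elimination."""
    n = len(P)
    # Balance equations for j = 0..n-2 (one of the n is redundant) ...
    A = [[P[i][j] - (1 if i == j else 0) for i in range(n)] + [F(0)]
         for j in range(n - 1)]
    # ... plus the normalization equation.
    A.append([F(1)] * n + [F(1)])
    for c in range(n):
        pivot = next(r for r in range(c, n) if A[r][c] != 0)
        A[c], A[pivot] = A[pivot], A[c]
        A[c] = [x / A[c][c] for x in A[c]]
        for r in range(n):
            if r != c and A[r][c] != 0:
                factor = A[r][c]
                A[r] = [x - factor * y for x, y in zip(A[r], A[c])]
    return [A[r][n] for r in range(n)]


def average_cost(policy):
    """Long-term average cost per unit time: sum_i pi_i * C_{i, policy[i]}."""
    pi = stationary(transition_matrix(policy))
    return sum(p * COST[i, k] for i, (p, k) in enumerate(zip(pi, policy)))


policies = list(product(*ALLOWED))
best = min(policies, key=average_cost)
for policy in policies:
    print(policy, float(average_cost(policy)))
print("cheapest:", best, average_cost(best))
```

Each policy is written as the tuple of decisions for states 0 to 3, so Ra is `(1, 1, 1, 3)`, Rb is `(1, 1, 2, 3)`, Rc is `(1, 1, 3, 3)` and Rd is `(1, 3, 3, 3)`. Enumeration is only workable here because there are six policies; with many states and actions the count grows multiplicatively, which is the motivation for the LP formulation.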