Session 6a

Agenda
Stochastic dynamic programming
Examples:
–TV Game Show
–Personalized Marketing
–Stochastic Production Scheduling

TV Game Show
Beginning the game with no accumulated winnings, you spin the wheel once. After each spin, you are allowed to spin again if you wish. Each spin has four possible outcomes:

Outcome                    Probability
Win $1                     0.25
Win $5                     0.25
Win $10                    0.25
Lose 100% (game over)      0.25

After each spin, assuming you still have money, you must either (i) take your accumulated winnings and thereby end the game or (ii) spin the wheel again. Your objective is to maximize your expected accumulated winnings from playing the game. Solve the problem assuming there are one, two, or three (or ten) spins available.

Single Spin Analysis

Case 1. Begin with $0
Initial State    Outcome      End State    Prob.
$0.00            +$1.00       $1.00        0.25
$0.00            +$5.00       $5.00        0.25
$0.00            +$10.00      $10.00       0.25
$0.00            wipe out     $0.00        0.25
Expected end state: $4.00

Case 2. Begin with $20
Initial State    Outcome      End State    Prob.
$20.00           +$1.00       $21.00       0.25
$20.00           +$5.00       $25.00       0.25
$20.00           +$10.00      $30.00       0.25
$20.00           wipe out     $0.00        0.25
Expected end state: $19.00

Spreadsheet formulas: End State is =A4+B4; the expected end state is =SUMPRODUCT(C2:C5,D2:D5) for Case 1 and =SUMPRODUCT(C10:C13,D10:D13) for Case 2.

Case 1. Begin with $0: Best to spin ($4.00 > $0.00)
Case 2. Begin with $20: Best not to spin ($19.00 < $20.00)
How to organize a general model for all situations?

Multiple-Spins Analysis: Tree Representation
[Figure: tree of accumulated winnings reachable over successive spins, starting from $0 and branching on +$1, +$5, +$10, and wipe out]

Characteristics of Example
Characteristic 1
–The problem can be divided into stages with a decision required at each stage.
Characteristic 2
–Each stage has a number of states associated with it.
–By a state, we mean the information that is needed at any stage to make an optimal decision.
Characteristic 3
–The decision chosen at any stage describes how the state at the current stage is transformed into the state at the next stage (transition state).
–How?
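For the TV game show, the stages, states, and transitions can be made concrete with a short backward-induction sketch. This is a minimal illustration, not the spreadsheet model from the slides: the stage is the number of spins left, the state is the accumulated winnings, and the names `value` and `should_spin` are hypothetical helpers introduced here.

```python
from functools import lru_cache

# Wheel from the slides: win $1, $5, or $10, or lose everything,
# each with probability 0.25.
WIN_AMOUNTS = (1, 5, 10)
PROB = 0.25


def spin_value(winnings: int, spins_left: int) -> float:
    """Expected final winnings if we spin now and play optimally afterwards."""
    # The wipe-out branch contributes 0.25 * $0, so it drops out of the sum.
    return sum(PROB * value(winnings + w, spins_left - 1) for w in WIN_AMOUNTS)


@lru_cache(maxsize=None)
def value(winnings: int, spins_left: int) -> float:
    """Expected final winnings under the best decision in this state and stage."""
    if spins_left == 0:
        return float(winnings)  # no spins left: keep the cash
    # Decision: stop with the cash, or spin and move to a new state.
    return max(float(winnings), spin_value(winnings, spins_left))


def should_spin(winnings: int, spins_left: int) -> bool:
    """True when spinning has a strictly higher expected value than stopping."""
    return spins_left > 0 and spin_value(winnings, spins_left) > winnings


if __name__ == "__main__":
    for n in (1, 2, 3, 10):
        print(f"{n} spin(s) available, starting at $0: {value(0, n):.2f}")
    print("Spin at $20 with one spin left?", should_spin(20, 1))
```

With one spin left the sketch compares the cash in hand against a quarter of the sum of the three winning end states, which is the same comparison as the single-spin analysis above.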
Characteristics of DP Applications
Characteristic 4
− Given the current state, the optimal decision for each of the remaining stages must not depend on previously reached states or previously chosen decisions.
− This is known as the principle of optimality.
Characteristic 5
− If the states for the problem have been classified into one of T stages, there must be a recursion that relates the cost or reward earned during stages t, t+1, …, T to the cost or reward earned from stages t+1, t+2, …, T (cost/value-to-go function).

DP with Uncertainty
In deterministic dynamic programming, a specification of the current state and current decision is enough to tell us with certainty the new state (transition state) and the immediate cost/payoff during the current stage.
In many practical problems, these factors may not be known with certainty, even when the current state and decision are known.
Fishing example
• Profit/ton can be uncertain
• Reproduction rate can be uncertain
Adapt the deterministic DP methodology to incorporate uncertainty.
• Expected values?
• Multiple transition states, each assigned a probability?

The value of the best decision, given State i at Stage t, is

\[
f(S_i, t) = \max\Big\{ \underbrace{S_i}_{\text{cash}},\ \underbrace{\sum_{j=1}^{n} f(S_j, t+1)\, P(S_{t+1} = S_j)}_{\text{EMV from spinning}} \Big\}
\]

where the sum runs over the n states S_j that can be reached at stage t+1.

Value Function
Outcome        $1.00    $5.00    $10.00    Wipe Out
Probability    0.25     0.25     0.25      0.25

State (cash)    3 Spins Left    2 Spins Left    1 Spin Left    0 Spins Left
$0.00           $6.81           $6.00           $4.00          $0.00
$1.00                           $6.56           $4.75          $1.00
$2.00                           $7.13           $5.50          $2.00
$3.00                           $7.69           $6.25          $3.00
$4.00                           $8.25           $7.00          $4.00

Spreadsheet formula for a value cell: =MAX(H7,(SUM(C8,C12,C17)/4)), that is, the larger of the cash in hand and one quarter of the sum of the three next-stage values reached by winning (the wipe-out branch contributes $0).

Multiple-Spins Analysis: Tree Representation
[Figure: the tree of accumulated winnings, annotated with values. Recoverable annotations, listed as winnings: value.
3 spins left: 0: $6.81
2 spins left: 1: $6.56, 5: $8.81, 10: $11.88
1 spin left: 2: $5.50, 6: $8.50, 10: $11.50, 11: $12.25, 15: $15.25, 20: $19.00]

Personalized Marketing
Suppose you run an online wine shop that sells 12-bottle cases of wines of your choice.
You have segmented your customers based on how many cases of wine they purchased in the previous quarter. You plan to send emails offering a discount on the 12-bottle cases to two distinct segments of customers:
•active customers, who made a purchase in the previous quarter,
•inactive customers, who did not make a purchase in the previous quarter.

By analyzing historical data, you have estimated the chance of a customer making a purchase in response to different discount levels. You have found that customers in different segments respond differently to discount offers.

Customer Segment    Discount    Probability of Purchase
Inactive            None        0.030
Inactive            Minor       0.051
Inactive            Major       0.171
Active              None        0.110
Active              Minor       0.150
Active              Major       0.504

You would like to run the online shop for five more quarters. What is your optimal strategy to maximize the expected lifetime value of a customer? Assume a 4% annual discount rate.

In particular, you are considering one of three discount levels for each customer each quarter:

Discount    Description       Direct Cost
0%          No discount       $0
11.67%      Minor discount    $1
46.67%      Major discount    $1

Your supplier charges you $65 per 12-bottle case of wine. Without any discount, your customer pays $150 for a case. This implies that a customer pays only $80 if offered the major 46.67% discount (and $132.50 under the minor 11.67% discount).

Active means “made a purchase this month”. Note that a customer is always in one of two states, as represented in this network diagram.

[Figure: network diagram with five stages (4 Months to Go, 3 Months to Go, 2 Months to Go, 1 Month to Go, 0 Months to Go), each containing two states, Inactive and Active]

Also note that there are three decision alternatives for each state, but only two possible subsequent states (each of which could be reached through any of the three decision alternatives).
We will define an end stage, when there are no months left.
Assume that neither type of customer buys anything in this “terminal” stage, so both terminal values are zero:

0 Months to Go
Inactive    $0.00
Active      $0.00

EMV = Expected Profit This Period + Expected Profit in Future Periods
•Might have multiple outcomes
•Might involve discounting

Spreadsheet inputs
                          Nothing    Minor      Major
Direct cost               $0.00      $1.00      $1.00
Discount                  0.0000     0.1167     0.4667
P(purchase), Inactive     0.030      0.051      0.171
P(purchase), Active       0.110      0.150      0.504
Revenue                   $150.00    $132.50    $80.00
Supplier cost             $65.00     $65.00     $65.00
Profit                    $85.00     $67.50     $15.00
Discount factor: 0.990099 (that is, 1/1.01: the 4% annual rate applied as 1% per period)

The direct cost of an offer is paid whether or not the customer buys, so each EMV is
–direct cost + P(purchase) * profit + discount factor * (expected value in the next stage).
The spreadsheet cell formula is
=-D$2+D$7*D$12+$F$2*(D$7*$F17+(1-D$7)*$F16)

In the calculations below, 0.99 abbreviates the discount factor 0.990099.

Inactive customer with 1 month to go:
No discount:     $0.00 + (0.030 * $85.00) + 0.99 * ($0.00) = $2.55
Minor discount: –$1.00 + (0.051 * $67.50) + 0.99 * ($0.00) = $2.44
Major discount: –$1.00 + (0.171 * $15.00) + 0.99 * ($0.00) = $1.57

Active customer with 1 month to go:
No discount:     $0.00 + (0.110 * $85.00) + 0.99 * ($0.00) = $9.35
Minor discount: –$1.00 + (0.150 * $67.50) + 0.99 * ($0.00) = $9.13
Major discount: –$1.00 + (0.504 * $15.00) + 0.99 * ($0.00) = $6.56

Expected Profit from Best Decision (Optimal is =MAX(B21:D21))

0 Months to go    Nothing    Minor     Major     Optimal
Inactive          $0.000     $0.000    $0.000    $0.000
Active            $0.000     $0.000    $0.000    $0.000

1 Month to go     Nothing    Minor     Major     Optimal    Decision
Inactive          $2.55      $2.44     $1.57     $2.550     Nothing
Active            $9.35      $9.13     $6.56     $9.350     Nothing
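The EMV recursion for the marketing example can be sketched in a few lines of Python. This is an illustrative reimplementation of the spreadsheet formula quoted above, assuming the direct cost is paid whether or not the customer buys; the function name `solve_marketing` and the dictionary layout are hypothetical choices made here, while the numbers come from the slides.

```python
# Data from the slides; the names below are illustrative.
DISCOUNT_FACTOR = 0.990099  # per-period discount factor from the spreadsheet
ACTIONS = ("Nothing", "Minor", "Major")
DIRECT_COST = {"Nothing": 0.0, "Minor": 1.0, "Major": 1.0}
PROFIT = {"Nothing": 85.0, "Minor": 67.5, "Major": 15.0}  # revenue minus $65 supply cost
P_BUY = {
    "Inactive": {"Nothing": 0.030, "Minor": 0.051, "Major": 0.171},
    "Active": {"Nothing": 0.110, "Minor": 0.150, "Major": 0.504},
}


def solve_marketing(periods):
    """Backward induction; index k of each returned list means 'k periods to go'."""
    values = [{"Inactive": 0.0, "Active": 0.0}]  # terminal stage: nobody buys
    policy = [{"Inactive": None, "Active": None}]
    for _ in range(periods):
        future = values[-1]
        stage_values, stage_policy = {}, {}
        for state, probs in P_BUY.items():
            emv = {}
            for action in ACTIONS:
                p = probs[action]
                # A purchase makes the customer Active next period, otherwise Inactive.
                expected_future = p * future["Active"] + (1 - p) * future["Inactive"]
                emv[action] = (-DIRECT_COST[action] + p * PROFIT[action]
                               + DISCOUNT_FACTOR * expected_future)
            best = max(emv, key=emv.get)
            stage_values[state] = emv[best]
            stage_policy[state] = best
        values.append(stage_values)
        policy.append(stage_policy)
    return values, policy


if __name__ == "__main__":
    values, policy = solve_marketing(4)
    for k in range(1, 5):
        for state in ("Inactive", "Active"):
            print(f"{k} to go, {state}: {values[k][state]:.2f} ({policy[k][state]})")
```

Keeping the stage values in a list indexed by periods to go mirrors the spreadsheet, where each block of rows reads the optimal values from the block solved just before it.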
[Figure: network diagram with values filled in. 1 Month to Go: Inactive $2.55 (Nothing), Active $9.35 (Nothing). 0 Months to Go: Inactive $0.00, Active $0.00.]

In the calculations below, 0.99 denotes the discount factor 0.990099. No discount carries no direct cost; the minor and major offers each cost $1.00.

Inactive customer with 2 months to go:
No discount:    = $0.00 + (0.03*$85.00) + 0.99*(0.03*$9.35 + (1-0.03)*$2.55) = $5.28
Minor discount: = -$1.00 + (0.051*$67.50) + 0.99*(0.051*$9.35 + (1-0.051)*$2.55) = $5.31
Major discount: = -$1.00 + (0.171*$15.00) + 0.99*(0.171*$9.35 + (1-0.171)*$2.55) = $5.24

Active customer with 2 months to go:
No discount:    = $0.00 + (0.11*$85.00) + 0.99*(0.11*$9.35 + (1-0.11)*$2.55) = $12.62
Minor discount: = -$1.00 + (0.15*$67.50) + 0.99*(0.15*$9.35 + (1-0.15)*$2.55) = $12.66
Major discount: = -$1.00 + (0.504*$15.00) + 0.99*(0.504*$9.35 + (1-0.504)*$2.55) = $12.48

Solved Version

            4 Months to Go    3 Months to Go    2 Months to Go    1 Month to Go      0 Months to Go
Inactive    $10.81 Major      $8.07 Minor       $5.31 Minor       $2.55 Nothing      $0.00
Active      $18.25 Major      $15.49 Major      $12.66 Minor      $9.35 Nothing      $0.00

Production Planning
•At the beginning of each period, decide the production quantity (before knowing the actual demand of that period).
•Each period’s demand is equally likely to be 1 or 2 units.
•Production cost is c(x) if x units are produced. Assume c(x) = $5x.
•It is required that all demand be met on time. All demand occurs at the beginning of the period.
•After meeting the current period’s demand out of current production and inventory, the firm’s end-of-period inventory is evaluated, and a holding cost of $1 per unit is assessed.
•Inventory at the end of each period cannot exceed 3 units.
•Any inventory on hand at the end of period 3 can be sold at $2 per unit.
•At the beginning of period 1, the firm has 1 unit of inventory.
Production Planning: Characteristics
•The problem can be divided into three stages (three periods).
•At each stage, the production quantity has to be decided.
•The state for each stage is the beginning inventory; this is the information needed to make future decisions.
•Optimal decisions for the remaining stages do not depend on how we reached the beginning inventory of the current stage.
•The expected cost from stage t to stage 3 is the sum of the expected immediate cost at stage t and the expected cost from stage t+1 to stage 3.

Stochastic Production Planning Model
Define f_t(i) to be the minimum expected net cost incurred during periods t, t+1, …, 3 when the inventory at the beginning of period t is i units. Then

\[
f_3(i) = \min_x \Big\{ c(x) + \tfrac{1}{2}(i+x-1) + \tfrac{1}{2}(i+x-2) - \tfrac{1}{2}\cdot 2(i+x-1) - \tfrac{1}{2}\cdot 2(i+x-2) \Big\}
\]

where x must be a nonnegative integer satisfying (2-i) ≤ x ≤ (4-i). The terms are, in order: the production cost c(x); the inventory (holding) costs in the two demand scenarios; and the salvage values in the two demand scenarios at the end of the 3rd period. Each factor of ½ is the probability of a demand scenario.

Stochastic Production Planning Model
For t = 1, 2, we can derive the recursive relation for f_t(i) by noting that, for any period t production level x, the expected costs incurred during periods t, t+1, …, 3 are the sum of the expected costs incurred during period t and the expected costs incurred during periods t+1, t+2, …, 3.
If x units are produced during period t, the expected cost during period t will be c(x) + (½)(i+x-1) + (½)(i+x-2).
If x units are produced during period t, the expected cost during periods t+1, t+2, …, 3 is computed as follows.

Stochastic Production Planning Model
Half of the time, the demand during period t will be 1 unit, and the inventory at the beginning of period t+1 will be i + x – 1. In this situation, the expected cost incurred during periods t+1, t+2, …, 3 is f_{t+1}(i+x-1).
Similarly, there is a ½ chance that the demand will be 2 units and the inventory at the beginning of period t+1 will be i + x – 2. In this case, the expected cost incurred during periods t+1, t+2, …, 3 will be f_{t+1}(i+x-2).
In summary, the expected cost during periods t+1, t+2, …, 3 will be (½) f_{t+1}(i+x-1) + (½) f_{t+1}(i+x-2).

Stochastic Production Planning Model
For t = 1, 2

\[
f_t(i) = \min_x \Big\{ c(x) + \tfrac{1}{2}(i+x-1) + \tfrac{1}{2}(i+x-2) + \tfrac{1}{2} f_{t+1}(i+x-1) + \tfrac{1}{2} f_{t+1}(i+x-2) \Big\}
\]

where x must be an integer and x must satisfy (2-i) ≤ x ≤ (4-i).

[Figure: network of beginning-inventory states by period, labeled Period 1 through Period 4, with the expected cost at each node. Values recoverable from the figure:

Beginning inventory    Period 1    Period 2    Period 3    Period 4
0                                  $25.00      $17.00      $9.00
1                      $28.00      $20.00      $12.00      $4.00
2                                  $15.00
3                                  $11.00

Inventory    Optimal Production Quantity
0            2
1            1
2            0
3            0]

Next Class
Stochastic DP Formulation II
–Approximate lognormal distribution by binomial tree
–State depends on values of multiple parameters
Assignment 2 is due before class:
–Submit in class a hard copy of your report
–Submit online your report and the Excel files
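As a closing sketch for the stochastic production planning model, the recursion for f_t(i) can be evaluated by backward induction. This is a minimal illustration under stated assumptions: it follows the three-period formulation as written in the text (holding cost charged in every period, salvage at $2 per unit after period 3, production restricted to nonnegative integers), so its values need not match a figure drawn for a different horizon. The function name `solve_production` and the constants are hypothetical names introduced here.

```python
DEMANDS = (1, 2)   # equally likely demand levels
PROB = 0.5
UNIT_COST = 5.0    # c(x) = 5x
HOLDING = 1.0      # per unit of end-of-period inventory
SALVAGE = 2.0      # per unit left after the last period
MAX_INV = 3        # cap on end-of-period inventory
PERIODS = 3


def solve_production():
    """Return (f, best_x) where f[t][i] is the minimum expected net cost from period t on."""
    states = range(MAX_INV + 1)
    # Treating salvage as a negative terminal cost reproduces the f_3 formula in the text.
    f = {PERIODS + 1: {i: -SALVAGE * i for i in states}}
    best_x = {}
    for t in range(PERIODS, 0, -1):
        f[t], best_x[t] = {}, {}
        for i in states:
            # Meet the largest demand (i + x >= 2) and respect the cap even when
            # demand is smallest (i + x - 1 <= 3), i.e. 2 - i <= x <= 4 - i, x >= 0.
            lo = max(0, max(DEMANDS) - i)
            hi = MAX_INV + min(DEMANDS) - i
            costs = {}
            for x in range(lo, hi + 1):
                expected = UNIT_COST * x
                for d in DEMANDS:
                    end_inv = i + x - d
                    expected += PROB * (HOLDING * end_inv + f[t + 1][end_inv])
                costs[x] = expected
            best_x[t][i] = min(costs, key=costs.get)
            f[t][i] = costs[best_x[t][i]]
    return f, best_x


if __name__ == "__main__":
    f, best_x = solve_production()
    print("Expected net cost from period 1 with 1 unit on hand:", f[1][1])
    print("Produce in period 1:", best_x[1][1])
```

Folding the salvage value into a terminal stage keeps a single recursion for all periods instead of a special case for period 3; the two forms are algebraically the same.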