Statistical Consulting - Cox Associates Consulting

Math 6330: Statistical Consulting
Class 9
Tony Cox
University of Colorado at Denver
Course web site: http://cox-associates.com/6330/
Course schedule
• April 14: Draft of project/term paper due
• April 18, 25, May 2 (May 9): In-class presentations
• May 4: Final project/paper due by 8:00 PM
Agenda
• Prescriptive (decision) analytics (cont.)
– Decision trees (wrap-up)
– Simulation-optimization
– Newsvendor problem and applications
– Decision rules, optimal statistical decisions
– Quality control, SPRT
• Decision psychology
– Heuristics and biases
Advanced decision tree analysis
• Game trees
– Different decision-makers
• Monte Carlo tree search (MCTS) in games with risk and uncertainty
– https://jeffbradberry.com/posts/2015/09/intro-to-monte-carlotree-search/
– http://www.cameronius.com/research/mcts/about/index.html
– http://www.cameronius.com/cv/mcts-survey-master.pdf
– Generating trees
• Apply rules to expand and evaluate nodes
http://stackoverflow.com/questions/23803186/monte-carlo-tree-searchimplementation-for-tic-tac-toe
• Learning trees from data
• Sequential testing
Summary on decision trees
• Decision trees show sequences of choices, chance nodes, observations, and final consequences.
– They mix observations, acts, optimization, and causality
• Good for very small problems; less good for medium-sized problems; unwieldy for large problems, so use influence diagrams (IDs) instead
• Can view decision trees and other decision models as simple c(a, s) models: consequence c as a function of act a and state s
– But need good optimization solvers!
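To make the c(a, s) view concrete, here is a minimal R sketch. The acts, states, payoffs, and probabilities are hypothetical values chosen for illustration; none of them come from the course material. It assumes Pr(s) does not depend on the act.

```r
# Hypothetical consequence table c(a, s): rows = acts, columns = states.
# All numbers are illustrative assumptions.
payoff <- matrix(c(10, -5,
                    3,  3,
                    0,  8),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(act = c("a1", "a2", "a3"),
                                 state = c("s1", "s2")))
p_state <- c(s1 = 0.6, s2 = 0.4)   # assumed Pr(s), independent of the act

# Expected monetary value of each act: EMV(a) = sum over s of c(a, s) * Pr(s)
emv <- drop(payoff %*% p_state)
best_act <- names(which.max(emv))
print(emv)
print(best_act)
```

With only three acts, enumeration is enough; the need for optimization solvers arises when the set of acts is too large to list.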
Example: Influence diagrams
http://ch.lumina.com/technology/influence-diagrams/
Influence diagram algorithms
• Goal: Calculate/infer the settings of decision variables that will maximize expected utility
• Extend Bayesian Network inference algorithms
– Distributed message-passing/updating on graphs
– CPTs at chance nodes, EU-maximizing decisions at choice nodes, EU at value node
– One approach: “Marginalize out” chance nodes, “optimize out” decision nodes until none left
• Ross Shachter, Evaluating Influence Diagrams, 1986
• ID algorithms are very mature
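The "marginalize out / optimize out" idea can be sketched in R on a tiny hypothetical influence diagram: a chance node S, a test result observed before the decision, a decision A, and a value node. All probabilities and utilities below are assumptions made up for illustration, and this is only in the spirit of node removal, not a full implementation of Shachter's algorithm.

```r
# Hypothetical ID: state S -> test result; decision A made after seeing the test.
p_s <- c(good = 0.7, bad = 0.3)                        # assumed prior Pr(S)
p_t_given_s <- matrix(c(0.9, 0.1,                      # Pr(test | S = good)
                        0.2, 0.8),                     # Pr(test | S = bad)
                      nrow = 2, byrow = TRUE,
                      dimnames = list(S = c("good", "bad"),
                                      test = c("pos", "neg")))
u <- matrix(c(100, -80,                                # u(A = invest, S)
                0,   0),                               # u(A = pass, S)
            nrow = 2, byrow = TRUE,
            dimnames = list(A = c("invest", "pass"), S = c("good", "bad")))

joint <- p_s * p_t_given_s                 # Pr(S, test): row i scaled by Pr(S = i)
p_t <- colSums(joint)                      # Pr(test)
p_s_given_t <- sweep(joint, 2, p_t, "/")   # Pr(S | test); each column sums to 1

# Step 1: marginalize out the chance node S to get EU(A | test).
eu_given_t <- u %*% p_s_given_t
# Step 2: optimize out the decision node A for each test result.
policy <- rownames(eu_given_t)[apply(eu_given_t, 2, which.max)]
names(policy) <- colnames(eu_given_t)
# Step 3: marginalize out the test result to get the overall expected utility.
eu_opt <- sum(apply(eu_given_t, 2, max) * p_t)
print(policy)
print(eu_opt)
```

The result is a decision rule (one act per test result) plus its expected utility, which is what an ID solver reports at the decision and value nodes.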
Simulation-optimization
Example: Optimal R&D effort
• Each new employee a company hires has a 10% probability (independently of anyone else) of solving a certain R&D problem in the next year
• If a solution is obtained in the next year, it is worth $1M (else $0).
• Each new employee costs $0.05M.
• To maximize EMV, how many new employees should the company hire to work on this R&D problem?
– Approach: Simulation-optimization in R
Simulation-Optimization (SO)
1. Randomly or adaptively* sample act a from choice set A.
2. Sample state s from Pr(s | a) and sample consequence c from Pr(c | a, s).
– Assess Pr(s | a) and Pr(c | a, s) by simulation, analysis, or statistics
3. Evaluate u(c) or EMV(c).
4. Repeat steps 2-3 many times; average the results to estimate EU(a) and its uncertainty interval.
5. Repeat steps 1-4 many times to find the act a that gives the largest estimated EU(a) with high confidence.
* For the (many) technical details, see the following:
• https://pdfs.semanticscholar.org/e5d8/39642da3565864ee9c043a726ff538477dca.pdf (short overview)
• http://link.springer.com/article/10.1007/s10479-015-2019-x (long review)
• https://people.orie.cornell.edu/shane/pubs/WSC2015TutSlides.pdf (tutorial)
• Commercial solver in Excel: http://www.solver.com/simulation-optimization
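The five steps above can be sketched in R for the R&D hiring example (10% success probability per hire, $1M payoff, $0.05M per hire). This is a minimal sketch: the number of replications, the seed, and the choice set 1:20 are assumptions, and acts are enumerated exhaustively rather than sampled adaptively.

```r
set.seed(1)                 # assumed seed, for reproducibility only
n_sims <- 100000            # simulated years per candidate act (assumed)
acts <- 1:20                # choice set A: number of new hires

# Steps 2-4: for one act n, simulate outcomes and average the payoffs ($M).
estimate_emv <- function(n) {
  solvers <- rbinom(n_sims, size = n, prob = 0.10)   # hires who solve the problem
  payoff <- 1 * (solvers > 0) - 0.05 * n             # $1M if solved, minus hiring cost
  c(mean = mean(payoff), se = sd(payoff) / sqrt(n_sims))
}

# Steps 1 and 5: evaluate every act and keep the one with the largest estimate.
results <- t(sapply(acts, estimate_emv))
best_n <- acts[which.max(results[, "mean"])]
print(round(results[5:9, ], 4))   # estimates and standard errors near the optimum
print(best_n)
```

Neighboring acts have nearly equal EMV here, so the simulated winner can differ from the exact optimum by one hire unless the standard errors are small relative to the gaps. More replications or common random numbers across acts would tighten the comparison.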
Solution
• Each employee has probability 10% of solving problem.
• Solution is worth $1M. Each employee costs $0.05M.
• How many employees to hire to maximize EMV?
N <- 1:20                        # candidate numbers of new hires
EMV <- 1*(1 - 0.9^N) - 0.05*N    # EMV ($M): Pr(at least one solves) * $1M - hiring cost
plot(N, EMV)
EMV[6]; EMV[7]; EMV[8]
[1] 0.168559
[1] 0.1717031
[1] 0.1695328
Optimal number is 7 employees
Example application of SO:
Newsvendor problem
• How many papers to stock?
– Each costs $k
– Each sells for $p
– Number sold = min(stock, demand)
– Demand is uncertain, with CDF F
www.old-picture.com/united-states-history-1900s---1930s/Evening-newsboy-buying-from.htm
Newsvendor problem: Analysis
• How many papers to stock?
– Each costs $k
– Each sells for $p
– Number sold = min(stock, demand)
– Demand is uncertain, with CDF F
• Analytic solution:
– Profit if stock = a and demand = s is p*min(a, s) - a*k
– Optimization: Marginal benefit (revenue) = marginal cost
– p*(1 - F(a)) = k ⇒ F(a*) = 1 - k/p
• p*(1 - F(a)) - k = expected additional (marginal) profit from buying one more paper = 0 at optimum.
• F(a) = Pr(demand ≤ a) = probability that the additional paper remains unsold
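The critical-fractile rule can be checked numerically in R. The sketch below assumes Poisson demand with mean 50, unit cost k = 0.25, and selling price p = 1.00 (all illustrative values, with p used here as the name for the selling price). For integer demand, the optimal stock is the smallest a with F(a) ≥ 1 - k/p, which is what qpois returns.

```r
k <- 0.25; p <- 1.00; lambda <- 50            # assumed cost, price, and mean demand
critical_fractile <- 1 - k / p
a_star <- qpois(critical_fractile, lambda)    # smallest a with F(a) >= 1 - k/p

# Brute-force check: exact expected profit, summing over demand values s.
expected_profit <- function(a, s_max = 200) {
  s <- 0:s_max                                # truncation point is an assumption
  sum((p * pmin(a, s) - k * a) * dpois(s, lambda))
}
a_grid <- 0:100
a_brute <- a_grid[which.max(sapply(a_grid, expected_profit))]
print(c(analytic = a_star, brute_force = a_brute))
```

The two answers should agree, which confirms that setting marginal expected revenue equal to marginal cost picks out the profit-maximizing stock level.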
Solution by SO
• Template: Choose a to optimize EU(a) or EMV(a), given uncertain consequences with distribution Pr(c | a)
– Sample many values of a = order quantity
– EMV(a) = Σ_c EMV(c)*Pr(c | a) = p*Σ_s min(a, s)*Pr(s) - k*a
– Risk profile: Pr(c | a) = Σ_s Pr(c | a, s)*Pr(s | a), s = demand
– Pr(s | a) = Pr(s) = F(s) - F(s - 1)
• Optimize (choose) inventory order to maximize average profit, given uncertain demand
– Solution by simulation
– Solution by analysis
• Optimal order quantity is a*, where pdist(a*) = (p - k)/p = 1 - k/p
– pdist(a*) = CDF(a*) = Pr(demand ≤ a*)
– k = unit cost, p = selling price
http://demonstrations.wolfram.com/CapacityPlanningForShortLifeCycleProductsTheNewsvendorModel/
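The simulation route can be sketched in R and compared with the analytic answer. Poisson demand with mean 50, k = 0.25, p = 1.00 (p denoting the selling price), the sample size, and the range of candidate order quantities are all assumptions for illustration. The same simulated demands are reused for every act (common random numbers).

```r
set.seed(2)                               # assumed seed, for reproducibility only
k <- 0.25; p <- 1.00; lambda <- 50        # assumed cost, price, and mean demand
demand <- rpois(50000, lambda)            # sampled states s, shared by all acts
acts <- 30:80                             # candidate order quantities a

# Estimated EMV(a): average simulated profit p*min(a, s) - k*a over sampled s.
emv_hat <- sapply(acts, function(a) mean(p * pmin(a, demand) - k * a))
a_so <- acts[which.max(emv_hat)]          # solution by simulation
a_analytic <- qpois(1 - k / p, lambda)    # solution by analysis (critical fractile)
print(c(simulation = a_so, analysis = a_analytic))
```

Because the simulated optimum is a sample quantile of demand, it can differ from the analytic optimum by a unit or so; the gap shrinks as the number of sampled demands grows.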
DA makes a big difference
• In experiments, decision-makers (MBA students at Duke) ordered too few high-profit products and too many low-profit products
• Average profits were 5% to 61% below optimal, depending on experimental conditions (Schweitzer and Cachon, 2000)
– http://opim.wharton.upenn.edu/~cachon/pdf/cachon_schweitzer_ms.pdf
Examples of newsvendor-like problems
(act = single number)
• Water reservoirs for wind energy backup
• Cash reserves
• Number of cars or jets in business fleet
• Minutes to buy in cell phone plan
• How fast to drive
• Reservation price for selling house
• Pricing an IPO
• How early to leave for class?
http://castlelab.princeton.edu/Presentations/ORF411_2013/ORF%20411%2015%20Newsvendor%20problem.pdf
Unknown unknowns:
More realistic decision problems
• What to do if the probability distribution for demand is unknown, and must be estimated from data?
– Bayesian decision theory handles this without difficulty: Update prior based on data, then make optimal decision given posterior
• Adaptive learning algorithms
– Warren Powell’s “knowledge gradient” approach
http://castlelab.princeton.edu/Presentations/ORF411_2013/ORF%20411%2015%20Newsvendor%20problem.pdf
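The "update prior, then decide" recipe can be sketched in R for a newsvendor whose mean demand is unknown. The sketch assumes Poisson demand with a conjugate Gamma prior on its mean; the prior parameters, the demand data, the unit cost k, and the selling price p are all hypothetical. With this model the posterior predictive demand distribution is negative binomial, and the critical-fractile rule is applied to it.

```r
k <- 0.25; p <- 1.00                       # assumed unit cost and selling price
alpha0 <- 2; beta0 <- 0.05                 # assumed Gamma(shape, rate) prior; prior mean 40
observed_demand <- c(52, 47, 55, 49, 61, 44, 50, 53)   # hypothetical past demands

# Conjugate update: Gamma prior + Poisson data gives a Gamma posterior.
alpha1 <- alpha0 + sum(observed_demand)
beta1 <- beta0 + length(observed_demand)
posterior_mean <- alpha1 / beta1

# Posterior predictive demand is negative binomial (size = alpha1,
# prob = beta1 / (beta1 + 1)); stock at its critical fractile 1 - k/p.
a_bayes <- qnbinom(1 - k / p, size = alpha1, prob = beta1 / (beta1 + 1))
print(c(posterior_mean = posterior_mean, order_quantity = a_bayes))
```

As more demand data arrive, the posterior tightens and the predictive distribution approaches a plain Poisson, so the order quantity converges toward the known-distribution answer.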
Unknown demand distribution with
random shocks
• Solve via machine learning algorithms applied to simulated data
– Weighted Majority Newsvendor Shifting
– DSE = diversity of statistical experts
– META = hybrid of approaches from the literature, e.g., minimax
• Many sensitivity analysis results (O’Neill et al., 2015)
http://onlinelibrary.wiley.com/doi/10.1111/deci.12187/full
Example applications of SO: Supply
chain inventory control
Template: Choose a to maximize EU(a) = Σ_c u(c)*Pr(c | a)
• Choose reorder point s and order-up-to level S to minimize average holding cost of supply chain inventory
– Optimal decision a has (S, s) form, easily optimized
• https://www0.gsb.columbia.edu/mygsb/faculty/research/pubfiles/4030/federgruen_finding.pdf
• http://dido.wss.yale.edu/~hes/pub/ss-policies.pdf (theory)
deterministic
http://www.informit.com/articles/article.aspx?p=2167438&seqNum=8
stochastic, adaptive
http://www.slideshare.net/jetromarquez/inventory-management12248440536560389
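An (S, s) policy can be tuned by simulation-optimization with a simple grid search, sketched below in R. Everything numeric is an assumption for illustration: Poisson daily demand with mean 20, zero lead time, lost sales instead of backorders, and holding, shortage, and fixed ordering costs of 1, 10, and 50. Ordering and shortage costs are included because minimizing holding cost alone would trivially favor holding nothing.

```r
set.seed(3)                                  # assumed seed, for reproducibility only
n_days <- 2000
demand <- rpois(n_days, 20)                  # assumed daily demand, shared by all policies
hold_cost <- 1; short_cost <- 10; order_cost <- 50   # assumed cost parameters

# Average daily cost of the policy "if inventory <= s, order up to S".
avg_cost <- function(s, S) {
  inv <- S
  total <- 0
  for (d in demand) {
    if (inv <= s) {                          # decision rule: reorder at or below s
      total <- total + order_cost
      inv <- S                               # zero lead time (assumed)
    }
    unmet <- max(d - inv, 0)                 # lost sales (assumed, no backorders)
    inv <- max(inv - d, 0)
    total <- total + hold_cost * inv + short_cost * unmet
  }
  total / n_days
}

# Search a coarse grid of (s, S) pairs with s < S.
grid <- expand.grid(s = seq(0, 60, by = 5), S = seq(20, 120, by = 10))
grid <- grid[grid$s < grid$S, ]
grid$cost <- mapply(avg_cost, grid$s, grid$S)
best <- grid[which.min(grid$cost), ]
print(best)
```

A finer grid or an adaptive search around the best cell would refine the answer; the coarse grid is only meant to show the structure of the search.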
Important general concepts
• A decision rule or policy maps information (what we see) to action (what we do)
– (S, s) policy maps observed inventory level to inventory reorder decisions (when, how much)
– Netica influence diagrams
• Advanced statistical decision theory is largely about optimizing decision rules
• Numerical optimization makes some insightful analysis irrelevant
– E.g., how should lead time and demand variability affect optimal (S, s)?
Example applications of SO
• Optimize stock portfolio to maximize average return with uncertain stock prices
• Optimize selling time of asset to maximize profit, given uncertain offers and holding costs
• Staff emergency room to minimize average total cost per day (including costs of waiting times), given uncertain arrivals
• Optimize location of fire stations or ambulances to achieve undominated distribution of response times
Wrap-up on SO
• Very useful but very technical methods
• Require some understanding of the problem environment so that probable consequences of alternative decisions (or policies) can be simulated
• Can require searching complicated choice sets efficiently and adaptively