Model Uncertainty, Robust Optimization and
Learning
Andrew E. B. Lim, J. George Shanthikumar & Z. J. Max Shen
Department of Industrial Engineering & Operations Research, University of California at Berkeley,
lim, shanthikumar, [email protected]
Abstract
Classical modelling approaches in OR/MS under uncertainty assume a full probabilistic characterization. The learning needed to implement the policies derived from these
models is accomplished either through (i) classical statistical estimation procedures
or (ii) subjective Bayesian priors. When the data available for learning is limited, or
the underlying uncertainty is non-stationary, the error induced by these approaches
can be significant and the effectiveness of the policies derived will be reduced. In this
tutorial we discuss how we may incorporate these errors in the model (that is, model
model uncertainty) and use robust optimization to derive efficient policies. Different
models of model uncertainty will be discussed and different approaches to robust optimization with and without bench-marking will be presented. Two alternative learning
approaches, Objective Bayesian Learning and Operational Learning, will be discussed.
These approaches could be used to calibrate the models of model uncertainty and to
calibrate the optimal policies. Throughout this tutorial we will consider the classical
inventory control problem, the inventory control problem with censored demand data
and the portfolio selection problem as examples to illustrate these ideas.
Keywords Model uncertainty, Robust optimization, learning, operational statistics
1. Introduction
The majority of the early models in OR/MS have been deterministic. Specifically, models
for production planning, logistics and transportation have been based on the assumption
that all variables of interest are known in advance of the implementation of the solutions.
While some models, such as queueing, insurance and portfolio selection, naturally call for incorporating stochasticity, it is usually assumed that the full probabilistic characterization of these models is known in advance of the implementation of the solutions. Even when it
is assumed that the parameters of a parametric stochastic model are unknown, it is assumed
that a Bayesian prior for the parameters is known (e.g., see Azoury (1985), Berger (1985),
Ding, Puterman and Bisi (2002), Robert (2001)). Such an approach is often justified by the
axiomatic framework of Savage (e.g., see Savage (1972)) for decision making. Assuming this, one ends up with a model that has been fully characterized. In economics, with the initial work of Knight (1921) and the Ellsberg paradox (Ellsberg (1961)), questions on this basic idea of full probabilistic characterization have been raised. The seminal work of Gilboa and Schmeidler (1989) provides an axiomatic framework justifying the notion of multiple fully characterized stochastic models for a single decision problem with a max-min objective. This formed the basis for model uncertainty and robust optimization in the economics and finance
areas (e.g. see Anderson, Hansen and Sargent (1998), (2003), Cagetti, Hansen, Sargent and
Williams (2002), Cao, Wang and Zhang (2005), Dow and Werlang (1992), Epstein (2006),
Epstein and Miao (2003), Epstein and Schneider (2003), (2005a), (2005b), Epstein and Wang
(1994), Garlappi, Uppal and Wang (2005), Hansen and Sargent (2001), (2001), (2003)).
For a recent account of the application of model uncertainty and robust optimization in
economics and finance see the monograph by Hansen and Sargent (2006). Within the OR/MS
community interest in deterministic robust optimization has been strong recently (e.g., see
Atamturk (2003), Atamturk and Zhang (2004), Averbakh (2000), (2001), (2004), Ben-Tal
and Nemirovski (1998), (1999), (2000), (2002), Bertsimas, Pachamanova and Sim (2004),
Bertsimas and Sim (2004a), (2004b), (2006), El Ghaoui and Lebret (1997) and El Ghaoui,
Oustry and Lebret (1998)). See Soyster (1973) for one of the earliest contributions to this area and the book by Kouvelis and Yu (1997) for a detailed account of the developments until the mid 90's. However, stochastic models of model uncertainty have not received as much attention as the others in the OR/MS literature. In this tutorial we will describe
the different ideas in modelling model uncertainty, finding the solution to this model using
robust optimization and its implementation through learning.
Consider a static or a discrete time dynamic optimization problem defined on a sample
space (Ω, F, (Fk )k∈M ). Here M = {0, 1, 2, . . . , m}, where m is the number of decision epochs
(m = 1 for a static optimization problem, m = 2 in a stochastic programming problem
with recourse, and m ≥ 2 for a discrete dynamic optimization problem). Ω is the set of all
possible outcomes of the input variables Y0 and the future values Y = {Yk , k = 1, 2, . . . , m}
of interest for the optimization problem (such as the demand over time for different items in
an inventory control problem, the arc lengths and costs in a network optimization problem,
etc.). F is the sigma algebra of events in Ω and F0 is (the sigma algebra of) all possible
information on the input variables that may be available to the decision maker at time 0
(such as the past demand or sales data for the different items in an inventory control problem
or the arc lengths and costs in a network optimization problem). The actual information I0 available to the decision maker is an element of F0. Though it is not required, Fk is often the sigma algebra generated by the internal history of the variables {Yk, k ∈ M} (that is, Fk = σ(Yj, j = 0, 1, 2, . . . , k)). It should be noted that the information that is available to the
decision maker at the beginning of period k + 1 (k ≥ 1) may not be Fk (for example, in an
inventory control problem one may only have information on the sales and not the actual
demand values).
Let π1 be the decision made at the beginning of period 1 (which is adapted to an information
subset I0 in F0 ). This leads to an information set that may depend on π1 . Let I1 (π1 ) be the
sigma algebra generated by this information set (which satisfies I1 (π1 ) ⊂ F1 ). Now let π2 be
the decision made at the beginning of period 2 (which is adapted to I1 (π1 )). In general, the
policy π is adapted to an information filtration ((Ik (π))k∈M ) which in turn is sequentially
generated by the policy π.
Let ψ(π, Y) be the reward obtained with policy π and Γ be the collection of all admissible
policies π. We are then interested in finding a policy π ∗ ∈ Γ that maximizes ψ(π, Y) in some
sense. One may adopt several alternative approaches to do this. All of these approaches in some way need to define a probability measure (say P) on (Ω, F, (Fk)k∈M) given I0. Classical modelling approaches in OR/MS under uncertainty assume that a full probabilistic characterization can be done very accurately (that is, we have perfect forecasting capability when a non-degenerate measure is used in our model AND that we have the capability to predict the future perfectly when the assumed measure is degenerate). When we do
this we hope one or both of the following is true:
THE ASSUMPTIONS
• A1: The chosen probability measure P is the true probability measure P0 or very close
(in some sense) to it.
• A2: The solution (optimal in some sense) obtained with P leads to a performance that
is either optimal or close to optimal (in some sense) with respect to P0 .
The learning needed to implement the policies derived from these models is accomplished
either through (i) classical statistical estimation procedures or (ii) subjective Bayesian priors.
It is not hard to see that the assumptions in many cases need not be true! When the data
available for learning is limited, or the underlying uncertainty is non-stationary, the error
induced by these approaches can be significant and the effectiveness of the policy derived
will be reduced. In this tutorial we discuss how we may incorporate these errors in the model
(that is, model model uncertainty) and use robust optimization to derive efficient policies.
Different models of model uncertainty will be discussed and different approaches to robust
optimization with and without bench-marking will be presented. Two alternative learning approaches, Objective Bayesian Learning and Operational Learning, will be discussed. These
approaches could be used to calibrate the models of model uncertainty and to obtain robust
optimal policies.
Before proceeding further with this discussion we will introduce a very simple canonical
example: The Newsvendor Inventory Problem with Demand Observed. This can be thought
of as a sequence of n static problems. This model is almost always used as a RAT (in the laboratory sense) to experiment with when testing different ideas in inventory control. It will allow us to discuss the
importance of model uncertainty and the integration of optimization and estimation. Later,
in Section 7 we will work out three classes of dynamic optimization problems that will serve
as examples to illustrate our ideas on learning with integrated dynamic optimization and
estimation and robust optimization with bench-marking.
THE INVENTORY RAT: Consider a perishable-item inventory control problem. Items are
purchased at c per unit and sold for s per unit. There is no salvage value and no lost sales
penalty. Suppose Y1 , Y2 , . . . , Ym represent the demand for this item for the next m periods.
We wish to find the optimal order quantities for the next m periods. Suppose we order πk
units in period k. Then the profit is
ψ(π, Y) = Σ_{k=1}^m {s min{Yk, πk} − cπk}.
This problem allows us to illustrate the effects of separating modelling and optimization
from model calibration without having to bring in the consequences of cost-to-go (that is,
residual) effects of current decisions at each decision epoch on future time periods. In evaluating the different approaches we will assume that Y1 , Y2 , . . . , Ym are i.i.d. with an absolutely
continuous distribution function FY. Further, if needed we will assume that Yk is exponentially distributed with mean θ (that is, FY(y) = 1 − exp{−(1/θ)y}, y ≥ 0). Let {X1, X2, . . . , Xn} be the past demand for the last n periods. This information is contained in Y0. We will also assume that {X1, . . . , Xn} are i.i.d. samples from the same distribution as Yk.
In Section 2 we will discuss what is done now: how models are formulated, optimized and
implemented. Following a discussion on the possible errors in the current approaches in
Section 2, alternative approaches to model these errors through flexible modelling will be
discussed in Section 3. Flexible modelling will be accomplished through defining a collection
of models that is very likely to contain the correct model or a close approximation of it.
Hence finding a robust solution to this collection of models depends on defining a robust optimization approach. Alternative approaches to robust optimization are discussed in Section 4. Section 5 is devoted to the calibration of flexible models using classical statistics.
Integrated learning in flexible models using (i) min-max, duality and objective Bayesian
learning and (ii) operational learning is introduced in Section 6. Detailed application of the
concepts discussed in this tutorial to dynamic inventory control and portfolio selection are
given in Section 7.
2. Modelling, Optimization and Implementation
Almost always the abstract formulation of the model and optimization is done independently of I0 and of how the model will be calibrated. Here and in the remainder of the paper we will
assume that Y0 contains the past n values {Xk , k = 1, 2, . . . , n} that will be used to calibrate
Y (that is, its probability measure P ).
2.A. Deterministic Modelling, Optimization and Implementation
Though this is obvious, we wish to discuss deterministic modelling here since it forms a basis
for a large body of work currently being done in robust optimization (e.g., see the special
issue of Mathematical Programming, 107, Numbers 1-2 on this topic). Let P^d_{ω0} = I{ω = ω0}, ω0 ∈ Ω, be a collection of degenerate (Dirac) probability measures on (Ω, F, (Fk)k∈M). In deterministic modelling one assumes that for some chosen ω0 ∈ Ω, we have P = P^d_{ω0}. Then
φ(π, ω0) = E[ψ(π, Y)] = ψ(π, Y(ω0)).
Given that the feasible region of π is Γ one then has the following optimization problem:
φ^d(ω0) = max_{π∈Γ} {φ(π, ω0)}
and choose a π^d(ω0) ∈ Γ such that
φ(π^d(ω0), ω0) = φ^d(ω0).
To implement this policy, however, one would have to estimate Y(ω0). For example one may assume that {X1, . . . , Xn, Y1, . . . , Ym} are i.i.d. and estimate Y(ω0) by, say,
Ŷk(ω0) = X̄, k = 1, 2, . . . , m,
where X̄ = (1/n) Σ_{k=1}^n Xk.
For some problems, the effect of variability on the final solution may be insignificant so
that such an assumption of determinism can be justified. For most real problems, however,
such an assumption may be unacceptable. Often, such an assumption is made so that the
resulting optimization problems are linear programs or integer linear programs so that some
of the well established approaches in OR can be used to solve these optimization problems.
Sometimes, even with this assumption of determinism, the solution may be hard to get. It is
fair to say that the decision to assume determinism is mostly motivated by the desire to get a
solution rather than to capture reality. However, with all the advances that have been made
in convex optimization (e.g., see Bertsekas (2003) and Boyd and Vandenberghe (2004)) and
in Stochastic Programming (e.g., see Birge and Louveaux (1997), Ruszczynski and Shapiro
(2003), and van der Vlerk (2006)), it seems possible to relax this assumption and proceed
to formulate stochastic models. Before we proceed to discuss stochastic modelling, we will
Model Uncertainty, Robust Optimization and Learning
c 2006 INFORMS
INFORMS—Pittsburgh 2006, °
5
give the deterministic version of the inventory rat. We will later use this result in robust
optimization with bench-marking.
THE INVENTORY RAT (continued)
φ^d(ω0) = max{Σ_{k=1}^m ψ(πk, Yk(ω0)) : πk ≥ 0} = (s − c) Σ_{k=1}^m Yk(ω0)
and
π^d_k(ω0) = Yk(ω0), k = 1, 2, . . . , m.
Then the expected profit is
φ^d(θ) = (s − c)mθ,
where θ = E[Yk ].
To implement this policy we need to know the future demand. If we don’t, maybe we can
approximate the future demand by the observed average! Hence the implemented policy
would be
π̂kd = X̄, k = 1, 2, . . . , m
with profit
ψ̂(Y) = Σ_{k=1}^m {s min{Yk, X̄} − cX̄},
where X̄ = (1/n) Σ_{k=1}^n Xk. Depending on when a policy change is allowed, re-optimization will take place at some time in the future. Here and in the rest of the paper we will assume that
we are allowed to re-optimize at the end of each period. Now depending on the belief we
have on the i.i.d assumption for the demand, we may be willing to estimate the demand for
the next period based only on the last, say, l periods. For ease of exposition we will assume
that l = n. Set Xn+j = Yj , j = 1, 2, . . . , m. Then using an updated estimate of Yk (ω0 ) at the
beginning of period k we get
π̂kd = X̄k , k = 1, 2, . . . , m,
where X̄k = (1/n) Σ_{j=k}^{n+k−1} Xj is the n-period moving average for k = 1, 2, . . . , m. The associated profit is
ψ̂(Y) = Σ_{k=1}^m {s min{Yk, X̄k} − cX̄k}.
Suppose the demand is exponentially distributed with mean θ. It is easy to verify that
lim_{m→∞} (1/m) ψ̂(Y) = (s − c)θ − sθ(n/(n + 1))^n.
As n → ∞ one gets an average profit of (s − c)θ − sθ exp{−1}. It can be verified that this
profit can be very inferior to the optimal profit. For example, when s/c = 1.2, c = 1 and θ = 1,
the optimal profit is 0.121 while the above policy results in a profit of −0.241.
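To see the magnitude of this effect numerically, the following Python sketch (ours, not from the paper) simulates the moving-average policy π̂k = X̄k for i.i.d. exponential demand and compares the long-run average profit with the limit derived above; the parameters s = 1.2, c = 1, θ = 1 and n = 4 follow the example in the text, while the seed and horizon are arbitrary choices.

```python
import numpy as np

# Monte Carlo sketch: long-run average profit of the moving-average
# policy pi_k = Xbar_k for i.i.d. exponential demand, compared with the
# analytical limit (s - c)*theta - s*theta*(n/(n+1))^n from the text.
rng = np.random.default_rng(0)
s, c, theta, n, m = 1.2, 1.0, 1.0, 4, 500_000

demand = rng.exponential(theta, size=n + m)
profit = 0.0
for k in range(m):
    xbar_k = demand[k:k + n].mean()   # n-period moving average of past demand
    Y = demand[n + k]                 # demand in period k
    profit += s * min(Y, xbar_k) - c * xbar_k

print("simulated average profit:", profit / m)
print("analytical limit        :", (s - c) * theta - s * theta * (n / (n + 1)) ** n)
```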
2.B. Stochastic Modelling and Optimization
For stochastic modelling, we assume a non-degenerate probability measure. That is, given I0, we define a non-degenerate probability measure P on (Ω, F, (Fk)k∈M). Wanting to
specify a probability measure without any statistical assumption is indeed an idealized goal.
Even if we are able to solve the resulting optimization problem, the calibration of P given
I0 will almost always require us to make some statistical assumptions regarding Y and Y0. Typical assumptions are that the process is i.i.d., Markovian, autoregressive of some order, etc. If the state space of Y is finite, then we may try to solve the problem with respect to the probabilities assigned to the different states (treating them as parameters). Even then it may be difficult to solve the optimization problem. In such cases, and in cases where further information on the distributional characteristics is known, we make additional assumptions that allow one to fully characterize P up to some finite dimensional parameter.
2.B.1. Parametric Modelling, Optimization and Implementation
Suppose we have fully characterized P up to some finite dimensional parameter, say, θ.
For example this may be achieved by postulating that Yk has an exponential or normal
distribution or that the transition kernel of the Markov process Y is parameterized by
a finite set or the state space is finite. Let P^p_θ be the corresponding probability measure parameterized by θ. Define
φ^p(π, θ) = E[ψ(π, Y)].
Finding the solution to this formulation depends on which of two approaches one chooses for implementation: the frequentist or the Bayesian approach.
Frequentist Approach Suppose we assume that the information I0 we have will allow us
to estimate the parameter θ exactly. Then one solves
φ^p(θ) = max_{π∈Γ} {φ(π, θ)}
and choose a π^p(θ) ∈ Γ such that
φ(π^p(θ), θ) = φ^p(θ).
To implement this policy, however, one would have to estimate θ. Suppose we use some
statistical estimator Θ̂(X) of θ using the data X. Then we would implement the policy
π̂ p = π p (Θ̂(X)).
THE INVENTORY RAT (continued): When the demand is exponentially distributed one
has (e.g., see Liyanage and Shanthikumar (2005), Porteus (2002) and Zipkin (2000)),
φ^p(π, θ) = E[ψ(π, Y)] = sθ(1 − exp{−π/θ}) − cπ,
π^p(θ) = θ ln(s/c),
and
φ^p(θ) = (s − c)θ − cθ ln(s/c).
For an exponential distribution, the sample mean is the uniformly minimum variance unbiased (UMVU) estimator. Hence we will use the sample mean of the observed data to estimate
θ. Then the implemented policy would be
π̂^p_k = X̄ log(s/c), k = 1, 2, . . . , m,
with profit
ψ̂(Y) = Σ_{k=1}^m {s min{Yk, X̄ log(s/c)} − cX̄ log(s/c)},
where X̄ = (1/n) Σ_{k=1}^n Xk. If we use the updated estimate of θ at the beginning of period k we get
π̂^p_k = X̄k log(s/c), k = 1, 2, . . . , m.
With this implementation,
ψ̂(Y) = Σ_{k=1}^m {s min{Yk, X̄k log(s/c)} − cX̄k log(s/c)},
and it can be easily verified that (see Liyanage and Shanthikumar 2005)
lim_{m→∞} (1/m) ψ̂(Y) = sθ(1 − (n/(n + log(s/c)))^n) − cθ log(s/c).
Observe that the average profit achieved is smaller than the expected profit (s − c)θ − cθ ln(s/c). For small values of n this loss can be substantial. For example, when n = 4 and s/c = 1.2, the percent loss over the optimal value with known θ is 22.86 (see Liyanage and Shanthikumar 2005, page 343). When the demand is non-stationary, we will be forced to
use a moving average or exponential smoothing to forecast the future demand. In such a
case, we will need to use a small value for n.
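The following small Python sketch (ours) evaluates the two closed-form expressions above for the cited example (n = 4, s/c = 1.2, θ = 1) and reproduces the reported loss.

```python
import numpy as np

# Expected per-period profit: optimal policy with known theta versus the
# limiting profit of the plug-in policy pi_k = Xbar_k * log(s/c).
theta, s, c, n = 1.0, 1.2, 1.0, 4
t = np.log(s / c)

optimal = (s - c) * theta - c * theta * t
plug_in = s * theta * (1.0 - (n / (n + t)) ** n) - c * theta * t
loss = 100.0 * (optimal - plug_in) / optimal
print(f"optimal {optimal:.5f}, plug-in {plug_in:.5f}, loss {loss:.2f}%")
# prints a loss of about 22.8%, in line with the 22.86 figure cited above
```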
Subjective Bayesian Approach Under the subjective Bayesian approach, given I0 one
assumes that the parameter characterizing the measure is random and postulates a distribution for that parameter (Θ). Suppose we assume that the density function of Θ is fΘ(θ), θ ∈ Θ, and the conditional density of {Θ|X} is fΘ|X(θ|X), θ ∈ Θ. The objective function in this case is
EΘ[φ(π, Θ)|X] = ∫_{θ∈Θ} φ(π, θ) fΘ|X(θ|X) dθ.
Let
π^B_{fΘ}(X) = arg max{EΘ[φ(π, Θ)|X] : π ∈ Γ}
and
φ^B_{fΘ}(θ) = E_X[φ(π^B_{fΘ}(X), θ)].
THE INVENTORY RAT (continued): Often the subjective prior is chosen to be the conjugate of the demand distribution (e.g., see Azoury 1985). When the demand is exponentially
distributed we should choose the Gamma prior for the unknown rate, say λ = 1/θ, of the exponential distribution (e.g., see Robert (2001), page 121). So let (for α, β > 0)
fΘ(θ) = ((β/θ)^{α+1}/(βΓ(α))) exp{−β/θ}, θ ≥ 0.
Note that E[Λ] = E[1/Θ] = α/β. We still need to choose the parameters α and β for this prior distribution. Straightforward algebra will reveal that
π^B_{fΘ}(X) = (β + nX̄)((s/c)^{1/(α+n)} − 1).
Even if the demand distribution is exponential, if the demand mean is non-stationary the
Bayesian estimate will converge to an incorrect parameter value. Hence we need to re-initiate
the prior distribution every now and then. Suppose we do that every n periods. Then
π^B_{k:fΘ}(X) = (β + nX̄k)((s/c)^{1/(α+n)} − 1), k = 1, 2, . . . , m,
with profit
ψ̂(Y) = Σ_{k=1}^m {s min{Yk, (β + nX̄k)((s/c)^{1/(α+n)} − 1)} − c(β + nX̄k)((s/c)^{1/(α+n)} − 1)}.
With this implementation, it can be verified that
lim_{m→∞} (1/m) ψ̂(Y) = sθ(1 − (θ/((s/c)^{1/(α+n)} + θ − 1))^n exp{−(β/θ)((s/c)^{1/(α+n)} − 1)}) − c(β + nθ)((s/c)^{1/(α+n)} − 1).
For bad choices of α and β the performance can be poor. The success of this policy will
depend on a lucky guess for α and β.
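As an illustration (ours, with arbitrary data and priors), the sketch below computes the Bayesian order quantity π^B_{fΘ}(X) = (β + nX̄)((s/c)^{1/(α+n)} − 1) for a few (α, β) choices, showing how strongly the guess can move the order.

```python
import numpy as np

# Bayesian order quantity under a Gamma prior on the exponential rate:
# pi^B = (beta + n*Xbar) * ((s/c)^(1/(alpha+n)) - 1).
rng = np.random.default_rng(1)
s, c, theta, n = 1.2, 1.0, 1.0, 4
X = rng.exponential(theta, size=n)        # hypothetical past demands

def bayes_order(X, alpha, beta):
    n = len(X)
    return (beta + n * X.mean()) * ((s / c) ** (1.0 / (alpha + n)) - 1.0)

for alpha, beta in [(0.5, 0.5), (2.0, 2.0), (10.0, 1.0)]:
    print(f"alpha={alpha}, beta={beta}: order {bayes_order(X, alpha, beta):.4f}")
```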
2.B.2. Non-Parametric Modelling
Suppose we have characterized P without making any assumptions regarding the parametric
form of Y. Now define
φ^g(π, P) = E[ψ(π, Y)]
and solve
φ^g(P) = max_{π∈Γ} {φ(π, P)}
and choose a π^g(P) ∈ Γ such that
φ(π^g(P), P) = φ^g(P).
THE INVENTORY RAT (continued): Observe that the optimal order quantity π g (FY ) for
demand distribution FY is given by
π^g(FY) = F̄^inv_Y(c/s),
where F̄^inv_Y is the inverse of the survival function (F̄Y = 1 − FY) of the demand. We may therefore use the empirical demand distribution (F̄̂Y) to obtain an estimate of the order quantity. Let X[0] = 0 and X[r] be the r-th order statistic of {X1, . . . , Xn}, r = 1, 2, . . . , n. Since the demand is assumed to be continuous, we set
F̄̂Y(x) = 1 − (1/n){r − 1 + (x − X[r−1])/(X[r] − X[r−1])}, X[r−1] < x ≤ X[r], r = 1, 2, . . . , n.
Then the implemented order quantity π̂^g based on the empirical distribution is
π̂^g = F̄̂^inv_X(c/s) = X[r̂−1] + â(X[r̂] − X[r̂−1]),
where r̂ ∈ {1, 2, . . . , n} satisfies
n(1 − c/s) < r̂ ≤ n(1 − c/s) + 1
and
â = n(1 − c/s) + 1 − r̂.
It can be shown that (see Liyanage and Shanthikumar (2005), page 345),
lim_{m→∞} (1/m) ψ̂(Y) = cθ{(s/c)(1 − ((n − r̂ + 2)/(n + 1))((n − r̂ + 1)/(n − r̂ + 1 + â))) − Σ_{k=1}^{r̂−1} 1/(n − k + 1) − â/(n − r̂ + 1)}.
The loss in expected profit in this case can be substantial. For example, when n = 4 and s/c = 1.2, the percent loss over the optimal value with known θ is 73.06. (This is much worse than the 22.86 percent loss with the use of the sample mean for this example.)
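The empirical-quantile order quantity above is easy to code; the sketch below (ours) implements r̂, â and the interpolated quantile for a small sample, with X[0] = 0 as in the text.

```python
import numpy as np

def empirical_order(X, s, c):
    # pi-hat^g: interpolated empirical quantile at level 1 - c/s,
    # using r-hat and a-hat as defined in the text (with X[0] = 0)
    X = np.sort(np.asarray(X, dtype=float))
    n = len(X)
    r = min(max(int(np.floor(n * (1 - c / s) + 1)), 1), n)   # r-hat
    a = n * (1 - c / s) + 1 - r                              # a-hat
    lower = X[r - 2] if r >= 2 else 0.0                      # X[r-1], with X[0] = 0
    return lower + a * (X[r - 1] - lower)

rng = np.random.default_rng(2)
print(empirical_order(rng.exponential(1.0, size=4), s=1.2, c=1.0))
```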
It is clear that with limited data and/or non-stationarity in the underlying stochastic process we may have significant errors in our models, due both to errors in the statistical assumptions we used for the parametric or non-parametric models and to estimation errors. Therefore we should find solutions that are robust to these errors. We could do this by attending to two issues: (i) find ways to incorporate these errors in the model itself and (ii) find a way to obtain a robust solution.
3. Model Uncertainty and Flexible Modelling
From the preceding discussion it is clear that we have to account for the errors we will
have in calibrating the stochastic model. Therefore, we will not know the exact probability
measure for our model. Given this it is reasonable to argue that one should not make a
decision based only on a single model (that is, using a single probability measure). Under
flexible modelling we would consider a collection of models and modify our ASSUMPTION.
MODIFIED ASSUMPTION
• A1: The chosen collection of probability measures P contains the true probability measure
P0 or one that is very close (in some sense) to it.
It is up to us now to define this collection of measures. Following tradition, we will describe three different approaches one could take to develop models of model uncertainty.
3.1. Flexible modelling with a variable uncertainty set
If the goal is to keep the resulting optimization problem within a class that has efficient
solution algorithms or strong approximations, one may consider a collection of degenerate
probability measures. That is, one considers
P = {P^d_ω, ω ∈ Ω}.
This is essentially to identify the possible values that Y can take. Let Y be this state space.
Then one considers a collection of problems
ψ(π, Y ), Y ∈ Y.
It is easy to see that in almost all real problems, the probability measure P0 will not be in P. Yet, a vast majority of the robust optimization reported in the OR/MS literature follows this modelling approach (e.g., see Atamturk (2003), Atamturk and Zhang (2004), Averbakh
(2000), (2001), (2004), Ben-Tal and Nemirovski (1998), (1999), (2000), (2002), Bertsimas,
Pachamanova and Sim (2004), Bertsimas and Sim (2004a), (2004b), (2006), Bertsimas and
Thiele (2003), Kouvelis and Yu (1997), Soyster (1973)).
3.2. Flexible modelling with a parametric uncertainty set
Suppose our statistical assumptions are valid and the only unknowns are the true parameter values. Then the collection of measures we consider could be
P = {P^p_θ, θ ∈ Θ},
for some set Θ of parameter values. Then one considers a collection of problems
φ^p(π, θ), θ ∈ Θ.
This appears to be a very promising way to formulate and solve real problems. Application
of this approach to portfolio optimization is discussed in Lim, Shanthikumar and Watewai
(2005), (2006b).
3.3. Flexible modelling with a non-parametric uncertainty set
For flexible modelling with a non-parametric uncertainty set, we first identify a nominal
model (or probability measure, say P̂ ). Then the collection of models are chosen to be a
closed ball around this nominal model. Let d(P, P̂ ) be some distance measure between P
and P̂ . If the measures are fully characterized by a density (or distribution) function, the
distance will be defined with respect to the density (or distribution) functions. The collection
of models thus considered will be
P = {P : d(P, P̂ ) ≤ α},
where α is the minimum deviation that we believe is needed to assure that the true probability measure P0 is in P. Some of the distance measures commonly used are listed below.
Distance Measures for Density Functions
We will specify the different types of distances for the density functions of continuous random
variables. Analogous distances can be defined for discrete random variables as well.
Kullback-Leibler Divergence (Relative Entropy)
dKL(f, f̂) = ∫_x f(x) log(f(x)/f̂(x)) dx.
It is easy to verify that dKL takes values in [0, ∞] and is convex in f . However it is not a
metric (it is not symmetric in (f, fˆ) and does not satisfy the triangle inequality). One very
useful property of dKL is that it is sum separable for product measures. This comes in very
handy in dynamic optimization with model uncertainty.
Hellinger Distance
dH(f, f̂) = [(1/2) ∫_x (√f(x) − √f̂(x))² dx]^{1/2}.
Hellinger distance as defined above is a metric that takes values in [0, 1]. One useful property of this metric in dynamic optimization is that the Hellinger affinity (1 − d²_H) is product separable for product measures.
Chi-Squared Distance
dCS(f, f̂) = ∫_x (f(x) − f̂(x))²/f̂(x) dx.
Discrepancy Measure
dD(f, f̂) = sup{|∫_a^b (f(x) − f̂(x)) dx| : a < b}.
Total Variation Distance
dTV(f, f̂) = (1/2) sup{∫_x h(x)(f(x) − f̂(x)) dx : |h(x)| ≤ 1}.
Wasserstein (Kantorovich) Metric
dW(f, f̂) = sup{∫_x h(x)(f(x) − f̂(x)) dx : |h(x) − h(y)| ≤ |x − y|}.
Distance Measures for Cumulative Distribution Functions
Kolmogorov (Uniform) Metric
dK (F, F̂ ) = sup{|F (x) − F̂ (x)| : x ∈ R}.
Levy (Prokhorov) Metric
dL (F, F̂ ) = inf{h : F (x − h) − h ≤ F̂ (x) ≤ F (x + h) + h; h > 0; x ∈ R}.
Wasserstein (Kantorovich) Metric
dW(F, F̂) = ∫_x |F(x) − F̂(x)| dx.
Distance Measures for Measures
Kullback-Leibler Divergence (Relative Entropy)
dKL(P, P̂) = ∫_Ω log(dP/dP̂) dP.
Prokhorov Metric
Suppose Ω is a metric space with metric d. Let B be the set of all Borel sets of Ω and for
any h > 0 define B^h = {x : inf_{y∈B} d(x, y) ≤ h} for any B ∈ B. Then
dP(P, P̂) = inf{h : P(B) ≤ P̂(B^h) + h; h > 0; B ∈ B}.
Discrepancy Measure
Suppose Ω is a metric space with metric d. Let B c be the collection of all closed balls in Ω.
dD(P, P̂) = sup{|P(B) − P̂(B)| : B ∈ B^c}.
Total Variation Distance
dT V (P, P̂ ) = sup{|P (A) − P̂ (A)| : A ⊂ Ω}.
Wasserstein (Kantorovich) Metric
Suppose Ω is a metric space with metric d.
dW(P, P̂) = sup{∫_Ω h(ω)(P(dω) − P̂(dω)) : |h(x) − h(y)| ≤ d(x, y), x, y ∈ Ω}.
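To give a concrete feel for how these distances compare, here is a small numerical sketch (ours) that evaluates several of them on a discretization grid for two exponential densities, a "true" f with mean 1 and a nominal f̂ with mean 1.5; the grid and the two means are arbitrary illustrative choices.

```python
import numpy as np

# Numerical comparison of several distances between two exponential
# densities f (mean 1.0) and f-hat (mean 1.5) on a discretization grid.
x = np.linspace(1e-6, 30.0, 300_000)
dx = x[1] - x[0]
f = np.exp(-x)                         # density of the "true" model
fhat = np.exp(-x / 1.5) / 1.5          # density of the nominal model

d_kl = np.sum(f * np.log(f / fhat)) * dx                             # Kullback-Leibler
d_h = np.sqrt(0.5 * np.sum((np.sqrt(f) - np.sqrt(fhat)) ** 2) * dx)  # Hellinger
d_tv = 0.5 * np.sum(np.abs(f - fhat)) * dx                           # total variation
F, Fhat = np.cumsum(f) * dx, np.cumsum(fhat) * dx
d_k = np.max(np.abs(F - Fhat))                                       # Kolmogorov
d_w = np.sum(np.abs(F - Fhat)) * dx                                  # Wasserstein
print(d_kl, d_h, d_tv, d_k, d_w)
```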
The majority of the flexible modelling in finance is done using uncertainty sets for measures (e.g., see Hansen and Sargent (2006) and the references therein). Application of this
approach to dynamic programming is given in Iyengar (2005) and in revenue management
in Lim and Shanthikumar (2004) and Lim, Shanthikumar and Watewai (2006a).
4. Robust Optimization
Now that we have a collection of models, we need to decide how to find a solution for these models such that the solution is indeed a very good solution for the true model. For this we assume that our robust optimization procedure will deliver such a solution.
MODIFIED ASSUMPTION
• A2: The robust solution (optimal in some sense) obtained with the collection of measures
P leads to a performance that is either optimal or close to optimal (in some sense) with
respect to P0 .
4.1. Max-min objective
The most commonly used approach to finding a (so-called) robust solution for the given set
of models is to find the best solution to the worst model among the collection of models.
The optimization problem is
φ^r = max_{π∈Γ} min_{P∈P} {φ(π, P)},
and the solution sought is:
π^r = arg max_{π∈Γ} min_{P∈P} {φ(π, P)}.
If the true model is the worst one, then this solution will be nice and dandy. However, if the
true model is the best one or something close to it, this solution could be very bad (that
is, the solution need not be robust to model error at all!). As we will soon see, this can be
the case. However, this form of (so-called) robust optimization is still very popular, since the resulting optimization problem tends to have algorithmic complexity very close to that of the original single-model case. However, if we really want a robust solution, its performance needs to be compared to what could have been the best for every model in the collection. This idea of bench-marking will be discussed later. Let us now look at the inventory example:
THE INVENTORY RAT (continued): We will now apply max-min robust optimization to
the inventory rat with the three different flexible modelling ideas.
Uncertainty Set for Demand: Suppose the demand can take a value in [a, b]. That is a ≤
Yk ≤ b, k = 1, 2, . . . , m. Then we have the robust optimization problem
φ^r = max_{πk≥0} min_{a≤Yk≤b} Σ_{k=1}^m {s min{Yk, πk} − cπk}.
Since the inner minimization is monotone in Yk it is immediate that
φ^r = max_{πk≥0} Σ_{k=1}^m {s min{a, πk} − cπk} = (s − c)ma
and
π^r_k = a, k = 1, 2, . . . , m.
Clearly this is a very pessimistic solution (for example if a = 0). Specifically, if the true demand happens to be b, the performance of this solution will be the worst! Furthermore, observe that the solution is independent of s and c.
Uncertainty Set for the Mean of Exponentially Distributed Demand: Suppose the mean
demand can take a value in [a, b]. That is a ≤ E[Yk ] = θ ≤ b, k = 1, 2, . . . , m. Then we have
the robust optimization problem
φ^r = max_{πk≥0} min_{a≤θ≤b} Σ_{k=1}^m {sθ(1 − exp{−πk/θ}) − cπk}.
As before the inner minimization is monotone in θ and it is immediate that
φ^r = max_{πk≥0} Σ_{k=1}^m {sa(1 − exp{−πk/a}) − cπk} = ((s − c)a − ca log(s/c))m
and
π^r_k = a log(s/c), k = 1, 2, . . . , m.
Clearly this too is a very pessimistic solution (for example if a = 0). If the true mean demand
happens to be b, the performance of this solution will be the worst!
Uncertainty Set for Density Function of Demand: Suppose we choose the Kullback-Leibler
Divergence (Relative Entropy) to define the collection of possible demand density functions.
Suppose the nominal model chosen is an exponential distribution with mean θ̂. That is
f̂(x) = (1/θ̂) exp{−(1/θ̂)x}, x ≥ 0.
Then the collection of density functions for the demand is
P = {f : ∫_{x=0}^∞ f(x) log(f(x)/f̂(x)) dx ≤ α; ∫_{x=0}^∞ f(x) dx = 1; f ≥ 0}.
The min-max robust optimization is then
max_{π≥0} min_{f∈P} {s ∫_{x=0}^π {∫_{z=x}^∞ f(z) dz} dx − cπ}.
Defining κ(x) = f(x)/f̂(x) and considering the Lagrangian relaxation of the above problem one obtains (with β ≥ 0),
max_{π≥0} min_{κ≥0} {s ∫_{x=0}^π {∫_{z=x}^∞ κ(z)f̂(z) dz} dx − cπ + β ∫_{x=0}^∞ κ(x) log(κ(x)) f̂(x) dx : ∫_{x=0}^∞ κ(x)f̂(x) dx = 1}.
It can be verified that the solution to the above relaxation is
κ(x) = (((s − c)θ̂ + β)/β) exp{−sx/β}, 0 ≤ x ≤ π^r,
and
κ(x) = (((s − c)θ̂ + β)/β) exp{−sπ^r/β}, π^r ≤ x,
and
π^r = θ̂{log(s/c) + log(((s − c)θ̂ + β)/β)}{β/(β + sθ̂)}.
Furthermore it can be shown that the solution to the original problem is obtained by choosing
β such that
∫_{x=0}^∞ κ(x) log(κ(x)) f̂(x) dx = α.
It can be shown that β monotonically decreases as a function of α with β → 0 as α → ∞
and β → ∞ as α → 0. Notice that the robust order quantity goes to zero as β → 0 (that is
when α → ∞) and the order quantity becomes the nominal order quantity θ̂ log(s/c) when
β → ∞ (that is when α → 0). Clearly, in the former case we allow a demand that is zero with
probability one and in the latter case we restrict the collection of models to the nominal
one.
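A numerical sketch (ours) of this calibration: for a given KL radius α, solve for the multiplier β by a root finder and evaluate the robust order quantity; all parameter values below are illustrative, and the KL integral is computed by simple quadrature.

```python
import numpy as np
from scipy.optimize import brentq

# KL-robust newsvendor with an exponential nominal model (mean theta):
# given beta, compute pi^r and the implied KL radius alpha(beta), then
# invert alpha -> beta numerically.
s, c, theta = 1.2, 1.0, 1.0
x = np.linspace(1e-9, 50.0 * theta, 400_000)
dx = x[1] - x[0]
fhat = np.exp(-x / theta) / theta

def order_qty(beta):
    C = ((s - c) * theta + beta) / beta
    return theta * (np.log(s / c) + np.log(C)) * beta / (beta + s * theta)

def kl_radius(beta):
    # alpha(beta) = integral of kappa*log(kappa)*fhat over [0, infinity)
    C = ((s - c) * theta + beta) / beta
    kappa = C * np.exp(-s * np.minimum(x, order_qty(beta)) / beta)
    return np.sum(kappa * np.log(kappa) * fhat) * dx

alpha = 0.05
beta = brentq(lambda b: kl_radius(b) - alpha, 1e-3, 1e3)
print(f"beta = {beta:.4f}, robust order = {order_qty(beta):.4f}, "
      f"nominal order = {theta * np.log(s / c):.4f}")
```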
All the above three formulations suffer from the fact that the inner minimization is monotone and the worst model is the one chosen to optimize against. In what follows we will see that the idea of using benchmarks will overcome this shortcoming.
4.2. Min-max regret objectives, utility and alternative coupling with benchmark
Recall that φg (P ) is the optimal objective function value we can achieve if we knew the
probability measure P . Hence we may wish to find a solution that gives an objective function
value that comes close to this for all measures in P. Hence we consider the optimization
problem:
φ^r = min_{π∈Γ} max_{P∈P} {φ^g(P) − φ(π, P)},
and the solution sought is:
π^r = arg min_{π∈Γ} max_{P∈P} {φ^g(P) − φ(π, P)}.
One may also wish to see how the robust policy works with respect to the optimal policy with the actual profit and not its expectation. Given that one has a utility function U^r for this deviation, the coupled objective function is
φ^r = min_{π∈Γ} max_{P∈P} {E_P[U^r(ψ(π^g(P), Y) − ψ(π, Y))]},
and the solution sought is:
π^r = arg min_{π∈Γ} max_{P∈P} {E_P[U^r(ψ(π^g(P), Y) − ψ(π, Y))]}.
THE INVENTORY RAT (continued): Observe that clairvoyant ordering will result in a profit of (s − c)Y. Hence if we order π units, the regret is (s − c)Y − {s min{π, Y} − cπ} = s max{Y − π, 0} − c(Y − π). Hence we wish to solve
min_π max_{a≤Y≤b} {s max{Y − π, 0} − c(Y − π)}.
The optimal solution is
π^r = a + (b − a)((s − c)/s).
Unlike in the min-max robust optimization, here the order quantity depends on s and c.
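A quick numerical check of this closed form (ours; the interval and prices are arbitrary):

```python
import numpy as np

# Verify pi^r = a + (b - a)(s - c)/s by brute force over a grid.
s, c, a, b = 1.2, 1.0, 0.0, 10.0
grid = np.linspace(a, b, 2001)

def worst_regret(pi):
    Y = np.linspace(a, b, 2001)
    return np.max(s * np.maximum(Y - pi, 0.0) - c * (Y - pi))

best = grid[np.argmin([worst_regret(p) for p in grid])]
print(best, a + (b - a) * (s - c) / s)   # both approximately 1.667
```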
4.3. Max-min competitive ratio objective with alternative coupling with benchmark
Suppose φ^g(P) ≥ 0 for all P ∈ P. Then instead of looking at the difference in the objective function values, we may wish to look at the ratios (and find a solution that achieves a ratio close to one for all P). Hence we consider the optimization problem:
φ^r = max_{π∈Γ} min_{P∈P} {φ(π, P)/φ^g(P)},
and the solution sought is:
π^r = arg max_{π∈Γ} min_{P∈P} {φ(π, P)/φ^g(P)}.
One may also wish to see how the robust policy works with respect to the optimal policy with the actual profit and not its expectation. Suppose ψ(π^g(P), Y) ≥ 0. Given that one has a utility function U^r for this ratio, the coupled objective function is
φ^r = max_{π∈Γ} min_{P∈P} {E_P[U^r(ψ(π, Y)/ψ(π^g(P), Y))]},
and the solution sought is:
π^r = arg max_{π∈Γ} min_{P∈P} {E_P[U^r(ψ(π, Y)/ψ(π^g(P), Y))]}.
5. Classical Statistics and Flexible Modelling
We will now discuss how classical statistics can be used to characterize model uncertainty
of different types. To do this, first we have to postulate a statistical model for X, Y. Suppose the extended measure for this is P^e (note that then P = {P^e|I0}).
5.1. Predictive Regions and Variable Uncertainty Set
Let SY be the state space of Y. Now choose a predictive region Y(X) ⊂ SY for Y such that
P e {Y ∈ Y(X)} = 1 − α,
for some appropriately chosen value of α (0 < α < 1). Then we could choose
Y = {Y(X)|I0 }.
THE INVENTORY RAT (continued): Suppose {X1 , X2 , . . . , Xn , Y } are i.i.d. exponential
random variables with mean θ. Let χ2k be a Chi-squared random variable with k degrees of
freedom and Fr,s be an F -random variable with (r, s) degrees of freedom. Then,
(2n/θ)X̄ =_d χ²_{2n} and (2/θ)Y =_d χ²_2. Therefore
Y/X̄ =_d F_{2,2n},
and
P{f_{2,2n,1−α/2}X̄ ≤ Y ≤ f_{2,2n,α/2}X̄} = 1 − α,
where
P{f_{2,2n,β} ≤ F_{2,2n}} = β, β ≥ 0.
A (1 − α)100% predictive interval for Y is (f_{2,2n,1−α/2}X̄, f_{2,2n,α/2}X̄). Hence with a min-max objective, the robust solution is (see Section 4.1)
π^r = f_{2,2n,1−α/2}X̄.
Observe that this implementation is independent of s and c. Alternatively, one may use a one-sided predictive interval (f_{2,2n,1−α}X̄, ∞). Then
π^r = f_{2,2n,1−α}X̄.
This too is independent of s and c. Therefore there is no guarantee that this solution will
be robust to model uncertainty! Suppose we choose an α such that
1 − α = P{((s/c)^{1/(1+n)} − 1)n ≤ F_{2,2n}}.
Then
π^r = ((s/c)^{1/(1+n)} − 1)nX̄.
Later in operational learning we will find that this is indeed the optimal order quantity
when θ is unknown. It is thus conceivable that a good policy could be obtained using a
deterministic robust optimization provided we have stable demand and sufficient data to
test various α. If that is the case, then retrospective optimization using the past data would
have yielded a very good solution anyway! The issue in this method of using min-max robust
optimization is that the solution can be sensitive to the choice of α and that a good value for
it cannot be chosen a priori. Hence we need a robust optimization technique that is robust
with respect to the choice of α.
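The following sketch (ours) computes these predictive-interval order quantities with scipy, translating the upper-tail notation f_{2,2n,β} used above into the usual quantile function; the data and the value of α are illustrative.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n, s, c = 4, 1.2, 1.0
X = rng.exponential(1.0, size=n)          # hypothetical past demands
xbar = X.mean()

# f_{2,2n,beta} has upper-tail probability beta: stats.f.ppf(1 - beta, 2, 2n)
f_upper = lambda beta: stats.f.ppf(1.0 - beta, 2, 2 * n)

alpha = 0.5
pi_onesided = f_upper(1.0 - alpha) * xbar     # pi^r = f_{2,2n,1-alpha} * Xbar

# the alpha that reproduces ((s/c)^(1/(1+n)) - 1) * n * Xbar:
t = ((s / c) ** (1.0 / (1 + n)) - 1.0) * n
alpha_star = 1.0 - stats.f.sf(t, 2, 2 * n)    # from 1 - alpha = P{t <= F_{2,2n}}
print(pi_onesided, f_upper(1.0 - alpha_star) * xbar, t * xbar)  # last two agree
```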
5.2. Confidence Regions and Parameter Uncertainty Set
Let t(X) be an estimator of θ. Now choose a region T (θ) such that
P e {t(X) ∈ T (θ)} = 1 − α,
for some appropriately chosen value of α (0 < α < 1). Now define
Θ(X) = {θ : t(X) ∈ T (θ)}.
Then we could choose
Θ = {Θ(X)|I0 }.
THE INVENTORY RAT (continued): Suppose {X1 , X2 , . . . , Xn , Y } are i.i.d. exponential
random variables with mean θ. Observing that
(2n/θ)X̄ =_d χ²_{2n},
it is immediate that
P{2nX̄/χ²_{2n,α/2} ≤ θ ≤ 2nX̄/χ²_{2n,1−α/2}} = 1 − α,
where
P{χ²_{2n,β} ≤ χ²_{2n}} = β, β ≥ 0.
A (1 − α)100% confidence interval for θ is (2nX̄/χ²_{2n,α/2}, 2nX̄/χ²_{2n,1−α/2}). Hence with a min-max objective, the robust solution is (see Section 4.1)
π^r = 2nX̄/χ²_{2n,α/2}.
Observe that this implementation is independent of s and c. Alternatively, one may use a one-sided confidence interval (2nX̄/χ²_{2n,α}, ∞). Then
π^r = 2nX̄/χ²_{2n,α}.
This too is independent of s and c.
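Analogously, a small sketch (ours) of the chi-square confidence bounds and the resulting order quantities; data and α are again illustrative.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n, alpha = 4, 0.10
X = rng.exponential(1.0, size=n)          # hypothetical past demands
xbar = X.mean()

# chi^2_{2n,beta} has upper-tail probability beta: chi2.ppf(1 - beta, 2n)
chi2_upper = lambda beta: stats.chi2.ppf(1.0 - beta, 2 * n)

theta_lo = 2 * n * xbar / chi2_upper(alpha / 2)       # lower confidence limit
theta_hi = 2 * n * xbar / chi2_upper(1 - alpha / 2)   # upper confidence limit
pi_r = 2 * n * xbar / chi2_upper(alpha)               # one-sided min-max order
print(theta_lo, theta_hi, pi_r)
```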
6. Learning
Outside of Bayesian learning, the two popular techniques used for learning in decision making
are: (i) Reinforcement Learning (e.g., see Sutton and Barto (1998)) and (ii) Statistical
Learning (e.g., see Vapnik (2000)). Applying either one of these approaches to the inventory rat problem results in a solution that is the same as in the non-parametric model discussed in Section 2.B.2 (see Jain, Lim and Shanthikumar (2006)), which we already know can perform poorly. We will not discuss these two approaches here.
6.1. Max-min, Duality and Objective Bayesian Learning
In this section we will pursue the max-min bench-marking approach discussed earlier as a
learning tool. Specifically, we will consider the dual problem, which can then be seen as a
form of the objective Bayesian approach (see Berger (1985), Robert (2001)).
In a dynamic optimization scenario, it is the recognition that the implemented policy π̂k at
time k is a function of the past data X that motivates the need to incorporate learning in
the optimization itself. Hence in integrated learning and optimization, the focus is
max_π E^e_θ[φ(π(X), θ)],
where the expectation over X is taken with respect to the probability measure Pθe .
This is indeed the focus of Decision Theory (Wald (1950)), where minimization of a loss
function is the objective. Naturally one could define −φ as the risk function and apply
the existing decision theory approaches to solve the above problem. It has already been
recognized in decision theory that without further characterization of π one may not be able
to solve the above problem (e.g., see Berger (1985), Robert (1994)). Otherwise one could
conclude that π p (θ) is the optimal solution. Hence one abides by the notion of an efficient
policy π defined below:
Definition: A policy π0 is efficient if there does not exist a policy π such that
Eθe [φ(π(X), θ)] ≥ Eθe [φ(π0 (X), θ)], ∀ θ,
with strict inequality holding for some values of θ.
Observe that π0 = π p (θ0 ) for almost any θ0 will be an efficient solution. Indeed it is well
known that any Bayesian solution π B (fΘ ), if unique, is an efficient solution. Thus one may
have an unlimited number of efficient policies and the idea of an efficient solution does not
provide an approach to identifying a suitable policy. While it is necessary for a solution to
be efficient, it is not sufficient (unless it is optimal).
Definition: A policy π0 is optimal, if
Eθe [φ(π0 (X), θ)] ≥ Eθe [φ(π(X), θ)], ∀ θ,
for all π.
It is very unlikely that such a solution can be obtained without further restriction on π for
real stochastic optimization problems. Consequently, in decision theory, one follows one of
the two approaches. One that is commonly used in the OR/MS literature is to assume a prior
distribution for the unknown parameter(s) (see Section 2.B.1). This eliminates any model
uncertainty. However, this leaves one having to find this prior distribution during implementation. This task may not be well defined in practice (e.g., see Kass and Wasserman (1996)).
To overcome this there has been considerable work done on developing non-informative priors (e.g., see Kass and Wasserman (1996)). The relationship of this approach to what we
will do in the next two sections will be discussed later. The second approach in decision
theory is min-maxity. In our setting, it is
max_π min_θ {E^e_θ[φ(π(X), θ)]}.
Unfortunately though, in almost all applications in OR/MS, E^e_X[φ(π(X), θ)] will be monotone in θ. For example, in the inventory problem, the minimum will be attained at θ = 0. In
general, suppose the minimum occurs at θ = θ0 . In such a case, the optimal solution for the
above formulation is π p (θ0 ). Hence it is unlikely that a direct application of the min-max
approach of decision theory to the objective function of interest in OR/MS will be appropriate. Therefore we will apply this approach using objectives with benchmark (see Sections
4.2 and 4.3 and also Lim, Shanthikumar and Shen (2006b)). In this section, we will consider
the relative performance
η(π, θ) = φ(π(X), θ)/φ^p(θ).
The optimization problem now is
η^r = max_π min_θ {E^e_θ[η(π(X), θ)]}.
The dual of this problem (modulo some technical conditions; see Lim, Shanthikumar and
Shen (2006b)) is
min_{fΘ} max_π {E^e_Θ[η(π(X), Θ)]},
where fΘ is a prior on the random parameter Θ of X. For each given prior distribution fΘ ,
the policy π that maximizes the objective η is the Bayesian solution. Let π^B_{fΘ} be the solution and η^B(fΘ) be the objective function value. Two useful results that relate the primal and the dual problems are (e.g., see Berger (1985)):
Lemma: If
η^B(fΘ) = min_θ {E^e_θ[φ(π^B_{fΘ}(X), θ)]/φ^p(θ)},
then π^B_{fΘ} is the max-min solution to the primal and dual problems.
Lemma: If f^{(l)}_Θ, l = 1, 2, . . . , is a sequence of priors and π^B_{fΘ} is such that
lim_{l→∞} η^B(f^{(l)}_Θ) = min_θ {E^e_θ[φ(π^B_{fΘ}(X), θ)]/φ^p(θ)},
then π^B_{fΘ} is the max-min solution to the primal problem.
Now we add a bound that, apart from characterizing the goodness of a chosen prior fΘ or the corresponding policy π^B_{fΘ}, will aid an algorithm in finding the max-min solution.
Lemma: For any prior fΘ,
min_θ {E^e_θ[φ(π^B_{fΘ}(X), θ)]/φ^p(θ)} ≤ η^r ≤ ∫_θ E^e_θ[φ(π^B_{fΘ}(X), θ)]fΘ(θ)dθ / ∫_θ φ^p(θ)fΘ(θ)dθ.
6.2. Operational Learning
This section is devoted to describing how learning could be achieved through operational
statistics. Operational statistics is introduced in Liyanage and Shanthikumar (2005) and
further explored in Chu, Shanthikumar and Shen (2005, 2006a). The formal definition of
operational statistics is given in Chu, Shanthikumar and Shen (2006b).
In operational learning, we seek to improve the current practice in the implementation of
the policies derived assuming the knowledge of the parameters. In this regard, let π p (θ)
be the policy derived assuming that the parameter(s) are known. To implement, in the
traditional approach, we estimate θ by, say Θ̂(X) and implement the policy π̂ p = π p (Θ̂(X)).
The corresponding expected profit is
φ̂p (θ) = Eθe [φ(π p (Θ̂(X)), θ)],
where the expectation over X is taken with respect to Pθe . In operational learning, first we
identify a class of functions Y and a corresponding class of functions H such that
Θ̂ ∈ Y
and
π p ◦ Θ̂ ∈ H.
The second step is to choose a representative parameter value, say θ0, and solve
max_{π∈H} E^e_{θ0}[φ(π(X), θ0)]
subject to
E^e_θ[φ(π(X), θ)] ≥ φ̂^p(θ), ∀θ.
First note that since π p ◦ Θ̂ ∈ H, we are guaranteed that a solution exists for the above
optimization problem. Second note that the selection of θ0 is not critical. For it may happen
that the selection of H is such that the solution obtained is independent of θ0 (as we will see
in the inventory examples). Alternatively, we may indeed use a prior fΘ on θ and reformulate
the problem as,
max_{π∈H} ∫_θ E^e_θ[φ(π(X), θ)]fΘ(θ)dθ
subject to
E^e_θ[φ(π(X), θ)] ≥ φ̂^p(θ), ∀θ.
It is also conceivable that alternative forms of robust optimization may be defined.
THE INVENTORY RAT (continued): Recall that π^p(θ) = θ log(s/c) and Θ̂(X) = X̄. So we could choose H to be the class of order-one homogeneous functions. Note that
H1 = {π : Rn+ → R+ ; π(αx) = απ(x); α ≥ 0; x ∈ Rn+ }.
is the class of non-negative order-one-homogeneous functions. Furthermore, observe that ψ is
a homogeneous-order-one function (that is, ψ(αx, αY ) = αψ(x, Y )). Let Z be an exponential
r.v. with mean 1. Then Y =d θZ, and one finds that φ too is a homogeneous order one
function (that is, φ(αx, αθ) = αφ(x, θ)).
Now suppose we restrict the class of operational statistics π to homogeneous-order-one
functions. That is, for some chosen θ0 , we consider the optimization problem:
max_{π∈H1} {E^e_{θ0}[φ(π(X), θ0)]}
subject to
E^e_θ[φ(π(X), θ)] ≥ φ̂^p(θ), ∀θ.
Let Z1, Z2, . . . , Zn be i.i.d. exponential r.v.s with mean 1 and Z = (Z1, Z2, . . . , Zn). Then
X =_d θZ.
Utilizing the property that φ, π and φ̂p are all homogeneous order-one functions, we get
Eθe [φ(π(X), θ)] = θEZe [φ(π(Z), 1)]
and φ̂p (θ) = θφ̂p (1). Hence we can drop the constraints and consider
max_{π∈H1} {E^e_Z[φ(π(Z), 1)]}.
Let V (with |V| = Σ_{k=1}^n Vk = 1) and the dependent random variable R be defined such that
f_{R|V}(r|v) = (1/(r^{n+1}(n − 1)!)) exp{−1/r}, r ≥ 0,
and
fV(v) = (n − 1)!, |v| = 1; v ∈ R^n_+.
Then
Z =_d (1/R)V.
Therefore
E_Z[φ(π(Z), 1)] = E_V[E_R[φ(π(V/R), 1)|V]].
Since we assumed π to be a homogeneous-order-one function, we get
E_V[E_R[φ(π(V/R), 1)|V]] = E_V[E_R[(1/R)φ(π(V), R)|V]].
Hence all we need to find the optimal operational statistic is to find
π^os(v) = arg max{E_R[(1/R)φ(π, R)|V = v] : π ≥ 0}, v ∈ R^n_+; |v| = 1.
Then the optimal homogeneous order-one operational statistic is (with |x| = Σ_{k=1}^n xk)
π^os(x) = |x|π^os(x/|x|), x ∈ R^n_+.
After some algebra one finds that (see Liyanage and Shanthikumar (2005) and Chu, Shanthikumar and Shen (2005)):
π^os(x) = ((s/c)^{1/(1+n)} − 1) Σ_{k=1}^n xk
and
φ̂^os(θ) = E_θ[φ(π^os(X), θ)] = θ[c{s/c − 1 − (n + 1)((s/c)^{1/(1+n)} − 1)}].
This policy, compared to the classical approach, improves the expected profit by 4.96% for n = 4 and s/c = 1.2 (see page 344 of Liyanage and Shanthikumar (2005)).
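A Monte Carlo sketch (ours) comparing the plug-in policy with the operational statistic for the n = 4, s/c = 1.2 example; the estimated improvement should be within simulation error of the 4.96% reported.

```python
import numpy as np

# Compare expected profit of the plug-in policy Xbar*log(s/c) with the
# operational statistic ((s/c)^(1/(1+n)) - 1) * sum(X), exponential demand.
rng = np.random.default_rng(6)
s, c, theta, n, reps = 1.2, 1.0, 1.0, 4, 1_000_000

X = rng.exponential(theta, size=(reps, n))      # past demand samples
Y = rng.exponential(theta, size=reps)           # next-period demand

def avg_profit(order):
    return np.mean(s * np.minimum(Y, order) - c * order)

plug_in = avg_profit(X.mean(axis=1) * np.log(s / c))
op_stat = avg_profit(((s / c) ** (1.0 / (1 + n)) - 1.0) * X.sum(axis=1))
print(plug_in, op_stat, f"improvement {100 * (op_stat - plug_in) / abs(plug_in):.1f}%")
```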
7. Examples
7.1. Inventory Control with Observable Demand
Consider an inventory control problem with instantaneous replenishment, backlogging and
finite planning horizon. Define the following input variables:
• m - number of periods in the planning horizon
• c - purchase price per unit
• s - selling price per unit
• {Y1, Y2, . . . , Ym} - demand for the next m periods
• b - backlogging cost per unit per period
• h - inventory carrying cost per unit per period
At the end of period m all remaining inventory (if any) is salvaged (at a salvage value of
c per unit). If at the end of period m orders are backlogged then all orders are met at the
beginning of period m + 1. Let πk (πk ≥ 0) be the order quantity at the beginning of period
k (k = 1, 2, . . . , m). Then the total profit for the m periods is
ψ(π, Y) = Σ_{k=1}^m {−cπk + s{max{−W_{k−1}, 0} + Yk − max{−Wk, 0}}} + c max{Wm, 0} + (s − c) max{−Wm, 0} − Σ_{k=1}^m {h max{Wk, 0} + b max{−Wk, 0}},
where W0 = 0 and
Wk = W_{k−1} + πk − Yk, k = 1, 2, . . . , m.
Simple algebra reveals that,
ψ(π, Y) = Σ_{k=1}^m ψk(πk, Yk),
where
ψk (πk , Yk ) = (s − c − b)Yk + (b + h) min{Wk−1 + πk , Yk } − h(Wk−1 + πk ), k = 1, 2, . . . , m.
Given Ik = Fk , we wish to find the optimal order quantity πk∗ for period k (k = 1, . . . , m).
First let us see what we can do if we are clairvoyant. Here we will assume that all the future
demand is known. It is not hard to see that
πkd (ω0 ) = Yk (ω0 ), k = 1, 2, . . . , m,
and
φ^d(ω0) = (s − c) Σ_{k=1}^m Yk(ω0).
If we can implement this, then the profit experienced is ψ̂(Y) = (s − c) Σ_{k=1}^m Yk and the expected profit is E[ψ̂(Y)] = (s − c)mθ.
Suppose we assume that the future demand {Y1 , Y2 , . . . , Ym } for the next m periods given I0
are i.i.d. with an exponential density function with mean θ (that is, fY(y) = (1/θ) exp{−(1/θ)y}, y ≥ 0). Let
φk(q, θ) = E[(b + h) min{q, Yk} − hq] = (b + h)θ(1 − exp{−q/θ}) − hq, k = 1, 2, . . . , m.
Then
q*(θ) = arg max{φk(q, θ)} = θ log((b + h)/h).
It is then clear that
πk (θ) = q ∗ (θ) − Wk−1 , k = 1, 2, . . . , m,
and
φ(θ) = (s − c)mθ − hmθ log((b + h)/h).
If we use X̄ as an estimate for θ when implementing this policy, we get
ψ̂(Y) = (s − c − b) Σ_{k=1}^m Yk + (b + h) Σ_{k=1}^m min{X̄ log((b + h)/h), Yk} − h Σ_{k=1}^m X̄ log((b + h)/h),
and an a priori expected profit of
E^e[(1/m)ψ̂(Y)] = (s − c)θ − bθ(n/(n + log((b + h)/h)))^n − hθ((n/(n + log((b + h)/h)))^n + log((b + h)/h) − 1).
However, if we continue to update the estimate we have
π̂k = max{X̄k log((b + h)/h) − W_{k−1}, 0}, k = 1, 2, . . . , m,
and
lim_{m→∞} (1/m)ψ̂(Y) = E^e[(1/m)ψ̂(Y)].
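Before turning to operational learning, here is a simulation sketch (ours) of this plug-in base-stock policy with the n-period moving-average estimate; the cost parameters, seed and horizon are illustrative choices.

```python
import numpy as np

# Simulate the backlogging model of this section under the plug-in
# base-stock policy: target q_k = Xbar_k * log((b+h)/h), order-up-to with
# nonnegative orders, and per-period reward psi_k as defined above.
rng = np.random.default_rng(5)
s, c, b, h, theta, n, m = 1.2, 1.0, 0.5, 0.2, 1.0, 4, 200_000

demand = rng.exponential(theta, size=n + m)
W, total = 0.0, 0.0
for k in range(m):
    xbar_k = demand[k:k + n].mean()             # moving-average estimate
    q = xbar_k * np.log((b + h) / h)            # estimated target level
    order = max(q - W, 0.0)                     # pi-hat_k
    Y = demand[n + k]
    total += (s - c - b) * Y + (b + h) * min(W + order, Y) - h * (W + order)
    W = W + order - Y                           # inventory position update
print("average per-period profit:", total / m)
```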
We will now apply operational learning to this problem (for details of this analysis see
Lim, Shanthikumar and Shen (2006a)). Specifically let H1 be the collection of order-one-homogeneous functions. Then, in operational learning we are interested in
max_{πk∈H1} Σ_{k=1}^m E^e_θ[φk(πk, θ)],
where
φk (πk , θ) = (b + h)E[min{Wk−1 + πk , Yk }] − hE[(Wk−1 + πk )],
W0 = 0 and
Wk = Wk−1 + πk − Yk , k = 1, 2, . . . , m.
First we will consider the last period. Let Y1 be an empty vector and
Yk = (Y1 , . . . , Yk−1 ), k = 2, . . . , m.
Define the random vector Vm (|Vm | = 1) and the dependent random variable Rm such that
(see Section 6.2)
Vm/Rm =_d (X, Ym).
Now let
π̃m(z) = arg max{E_{Rm}[φm(q, Rm)/Rm | Vm = z] : q ≥ 0}, z ∈ R^{n+m−1}_+, |z| = 1,
and
π̃m(x) = |x|π̃m(x/|x|), x ∈ R^{n+m−1}_+.
Define
πm(X, Ym, w) = max{π̃m(X, Ym), w − Y_{m−1}}
and
φ*_{m−1}(x, q, θ) = φ_{m−1}(q, θ) + E_{Y_{m−1}}[φm(πm(x, Y_{m−1}, q), θ)], x ∈ R^{n+m−2}_+.
Having defined this for the last period, we can now set up the recursion for any period as
follows: Define the random vector Vk (|Vk | = 1) and the dependent random variable Rk
such that
Vk/Rk =_d (X, Yk), k = 1, 2, . . . , m − 1.
Now let
π̃k(z) = arg max{E_{Rk}[φ*_k(z, q, Rk)/Rk | Vk = z] : q ≥ 0}, z ∈ R^{n+k−1}_+, |z| = 1,
and
π̃k(x) = |x|π̃k(x/|x|), x ∈ R^{n+k−1}_+.
Define
πk(X, Yk, w) = max{π̃k(X, Yk), w − Y_{k−1}}
and
φ*_{k−1}(x, q, θ) = φ_{k−1}(q, θ) + E_{Y_{k−1}}[φ*_k((x, Y_{k−1}), πk(x, Y_{k−1}, q), θ)], x ∈ R^{n+k−2}_+.
Now the target inventory levels π̃k and the cost-to-go functions φ*_{k−1} can be recursively computed starting with k = m. Computation of these operational statistics using numerical algorithms and/or simulation is discussed in Lim, Shanthikumar and Shen (2006a).
7.2. Inventory Control with Sales Data
Let m, c, s, and {Y1 , Y2 , . . . , Ym } be as defined earlier. At the end of each period all remaining
inventory (if any) is discarded (and there is no salvage value). Furthermore any excess
demand is lost and lost demand cannot be observed. Let πk (πk ≥ 0) be the order quantity
at the beginning of period k (k = 1, 2, . . . , m). Then the total profit for the m periods is
ψ(π, Y) = Σ_{k=1}^m ψk(πk, Yk),
where
ψk(πk, Yk) = sSk − cπk,
and Sk = min{πk, Yk} is the sales in period k, k = 1, 2, . . . , m. Here Ik(π) = σ({(Sj, πj), j = 1, 2, . . . , k} ∪ I0). We wish to find the optimal order quantity π*_k for period k (k = 1, . . . , m).
Suppose we assume that the future demand {Y1 , Y2 , . . . , Ym } for the next m periods given I0
are i.i.d. with an exponential density function with mean θ (that is, fY(y) = (1/θ) exp{−(1/θ)y}, y ≥ 0). If we know θ this would then be exactly the same as the inventory rat problem. However, if θ is unknown (which will be the case in practice), we need to estimate it using possibly censored data. Suppose we have past demands, say {X1, . . . , Xn}, and past sales {R1, . . . , Rn}. Let Ik = I{Xk = Rk} be the indicator that the sales is the same as the demand in period k (which will be the case if we had more on-hand inventory than the demand). Given (R, I) the maximum likelihood estimator Θ_MLE of θ is (assuming that Σ_{k=1}^n Ik ≥ 1, that is, at least once we got to observe the true demand)
Θ_MLE = (1/Σ_{k=1}^n Ik) Σ_{k=1}^n Rk.
The implemented order quantities are then (assuming no further updates of the estimator)
π̂k = Θ_MLE log(s/c), k = 1, 2, . . . , m,
and the profit is
ψ̂(Y) = Σ_{k=1}^m {s min{Θ_MLE log(s/c), Yk} − cΘ_MLE log(s/c)}.
We will now show how operational learning can be implemented for a one period problem
(m = 1). Integrated learning for the multi-period case can be done similarly to the first example
(see Lim, Shanthikumar and Shen (2006a)). Suppose we are interested in
max_{π∈Ht} E^e_X{sE^e_{Y1}[min{π, Y1}] − cπ},
for some suitably chosen class Ht of operational functions that includes the MLE estimator.
This class should also allow us to find the solution without the knowledge of θ (what
to do in operational learning if this is not possible is discussed in Chu, Shanthikumar and
Shen (2006b)). Since Rk ≤ Xk and Rk = Xk when Ik = 1, and choosing a value of Xk > Rk
for Ik = 0, we could rewrite the MLE estimator as
$$\Theta_{MLE} = \frac{1}{\sum_{k=1}^{n} I\{X_k \le R_k\}} \sum_{k=1}^{n} \min\{X_k , R_k\}.$$
Suppose Ht satisfies the following:
$$H_t = \big\{ \eta : \mathbb{R}_+^n \times \mathbb{R}_+^n \to \mathbb{R}_+ ;\ \eta(\alpha x, \alpha r) = \alpha\, \eta(x, r),\ \alpha \ge 0;\ \eta(y, r) = \eta(x, r),\ y = x + (\alpha_1 I\{x_1 \ge r_1\}, \ldots, \alpha_n I\{x_n \ge r_n\}),\ \alpha_k \ge 0 \big\}.$$
It is now easy to see that the function
$$h(x, r) = \frac{1}{\sum_{k=1}^{n} I\{x_k \le r_k\}} \sum_{k=1}^{n} \min\{x_k , r_k\}$$
is an element of Ht . Within this class of functions, the optimal operational statistic is
$$\pi(x, r) = \bigg( \Big(\frac{s}{c}\Big)^{\frac{1}{1 + \sum_{k=1}^{n} I\{x_k \le r_k\}}} - 1 \bigg) \sum_{k=1}^{n} \min\{x_k , r_k\}.$$
Hence the operational order quantity is
$$\hat{\pi} = \bigg( \Big(\frac{s}{c}\Big)^{\frac{1}{1 + \sum_{k=1}^{n} I_k}} - 1 \bigg) \sum_{k=1}^{n} R_k .$$
Observe that if Ik = 1, k = 1, 2, . . . , n (that is, if there is no censoring), the above policy is
identical to the policy for the newsvendor problem (see Section 6.2).
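The sketch below (again our own construction with made-up numbers) compares the operational order quantity with the MLE plug-in policy on simulated censored histories under the exponential demand model.

```python
import numpy as np

# A sketch (ours, made-up numbers) comparing the operational order quantity with
# the MLE plug-in policy on simulated censored histories, exponential demand.

def operational_order(R, I, s, c):
    # pi_hat = ((s/c)^(1/(1 + sum I_k)) - 1) * sum R_k
    return ((s / c) ** (1.0 / (1.0 + I.sum())) - 1.0) * R.sum()

def mle_order(R, I, s, c):
    return (R.sum() / I.sum()) * np.log(s / c)

def profit(q, y, s, c):
    return s * min(q, y) - c * q

rng = np.random.default_rng(2)
theta, s, c, n, trials = 10.0, 2.0, 1.0, 8, 20_000
totals, valid = {"operational": 0.0, "mle": 0.0}, 0
for _ in range(trials):
    X = rng.exponential(theta, size=n)
    q_past = np.full(n, 9.0)
    R, I = np.minimum(X, q_past), (X <= q_past).astype(int)
    if I.sum() < 1:
        continue                      # MLE undefined with no uncensored period
    y = rng.exponential(theta)        # next-period demand
    totals["operational"] += profit(operational_order(R, I, s, c), y, s, c)
    totals["mle"] += profit(mle_order(R, I, s, c), y, s, c)
    valid += 1
print({k: v / valid for k, v in totals.items()})
```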
7.3. Portfolio Selection with Discrete Decision Epochs
We wish to invest in one or more of l stocks with random returns and a bank account with
a known interest rate. Suppose at the beginning of period k we have a total wealth of Vk−1 .
If we invest πk (i)Vk−1 in stock i (i = 1, 2, . . . , l) and leave (1 − πk′ e)Vk−1 in the bank during
period k, we will have a total wealth of
Vk (πk ) = Yk (πk )Vk−1
at the end of period k, k = 1, 2, . . . , m. Here πk = (πk (1), πk (2), . . . , πk (l))′ and e =
(1, 1, . . . , 1)′ is an l-vector of ones and Yk (πk ) − 1 is the rate of return for period k with a
portfolio allocation πk . The utility of the final wealth Vm for a portfolio selection π and
utility function U is then
$$\psi(\pi, Y) = U\Big( v_0 \prod_{k=1}^{m} Y_k(\pi_k) \Big),$$
where v0 is the initial wealth at time 0.
We will now discuss how we traditionally complete these models, find the optimal policies
and implement them. Naturally, to complete the modelling, we need to define a probability measure P on (Ω, F, (Fk )k∈M ) given I0 and decide the sense (usually in the sense of
expectation under P ) in which the reward function is maximized. In these examples, we almost
always simplify the analysis further by assuming a parametric family for FY .
We will first describe the classical continuous time model, which we will use to create our
discrete time parametric model Yk (πk ), k = 1, 2, . . . , m. Suppose the price process of stock i
is {St (i), 0 ≤ t ≤ m}, given by
$$dS_t(i) = S_t(i)\big( \mu_t(i)\,dt + \sigma_t'(i)\,dW_t \big), \quad 0 \le t \le m; \ i = 1, 2, \ldots, l,$$
where {Wt , 0 ≤ t ≤ m} is a vector-valued standard Brownian motion, µt (i) is the drift and σt (i) are
the volatility parameters of stock i, i = 1, 2, . . . , l. Let rt , 0 ≤ t ≤ m, be the known interest
rate. Suppose the value of the portfolio is Vt (π) at time t under a portfolio allocation policy
π. Under π the value of investments in stock i at time t is πt (i)Vt (π), and the money in the
bank at time t is (1 − πt′ e)Vt (π). Then the wealth process Vt (π) evolves according to
$$dV_t(\pi) = V_t(\pi)\big\{ (r_t + \pi_t' b_t)\,dt + \pi_t' \sigma_t'\,dW_t \big\}, \quad 0 \le t \le m,$$
where bt (i) = µt (i) − rt , i = 1, 2, . . . , l, and V0 (π) = v0 .
Now suppose we are only allowed to decide on the ratio of portfolio allocation at time k − 1,
and the same ratio of allocation is maintained during [k − 1, k), k = 1, 2, . . . , m. In the
classical continuous time model, now assume that µt = µk , σt = σk and πt = πk for k − 1 ≤ t <
k, k = 1, 2, . . . , m. Then the utility at T = m is
$$\psi(\pi, Z) = U\Big( v_0 \prod_{k=1}^{m} \exp\Big\{ r_k + \pi_k' b_k - \frac{1}{2}\pi_k' Q_k \pi_k + \pi_k' \sigma_k' Z_k \Big\} \Big),$$
where Qk = σk σk′ and {Zk , k = 1, 2, . . . , m} are i.i.d. unit normal random vectors. Observe
that the probability measure for this model is completely characterized by the parameters
(bk , σk ), k = 1, 2, . . . , m. We will assume that these parameters are independent of {Zk , k =
1, 2, . . . , m} (though this assumption is not needed, we use it to simplify our illustration).
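As a sanity check on this discrete-time model, the following sketch (made-up parameter values, names ours) simulates one path of the wealth Vm for a fixed allocation held over all m periods.

```python
import numpy as np

# A sketch of the discrete-time wealth model
#   V_m = v0 * prod_k exp{ r_k + pi'b_k - 0.5 pi'Q_k pi + pi' sigma_k' Z_k },
# with constant (b, sigma) across periods for illustration.  All numbers are
# made-up assumptions, not values from the paper.

rng = np.random.default_rng(3)
l, m, v0, r = 2, 12, 100.0, 0.001
b = np.array([0.004, 0.006])                    # excess-return vector b_k
sigma = np.array([[0.04, 0.00], [0.01, 0.05]])  # volatility matrix sigma_k
Q = sigma @ sigma.T                             # Q_k = sigma_k sigma_k'
pi = np.array([0.5, 0.3])                       # fixed allocation for the sketch

log_growth = 0.0
for _ in range(m):
    Z = rng.standard_normal(l)                  # unit normal shock Z_k
    log_growth += r + pi @ b - 0.5 * pi @ Q @ pi + pi @ sigma.T @ Z
print(v0 * np.exp(log_growth))                  # terminal wealth V_m
```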
Suppose the values of the parameters (bk , σk ), k = 1, 2, . . . , m, are unknown but we know a
parameter uncertainty set for them, that is, (bk , σk ) ∈ Hk , k = 1, 2, . . . , m. We wish to find
a robust portfolio. We will use the robust optimization approach with a competitive ratio
objective with bench-marking. Specifically, we will carry out the bench-marking with a
log utility function. In this case, the bench-mark portfolio is the solution of
$$\max_{\pi}\, E \log\Big( v_0 \prod_{k=1}^{m} \exp\Big\{ r_k + \pi_k' b_k - \frac{1}{2}\pi_k' Q_k \pi_k + \pi_k' \sigma_k' Z_k \Big\} \Big) \equiv \max_{\pi} \sum_{k=1}^{m} \Big\{ r_k + \pi_k' b_k - \frac{1}{2}\pi_k' Q_k \pi_k \Big\}.$$
It is not hard to see that
$$\pi_k^p = Q_k^{-1} b_k , \quad k = 1, 2, \ldots, m,$$
and
$$V_m^p = v_0 \prod_{k=1}^{m} \exp\Big\{ r_k + \frac{1}{2} b_k' Q_k^{-1} b_k + b_k' Q_k^{-1} \sigma_k' Z_k \Big\}.$$
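For instance, with the made-up (b, σ) of the previous sketch, the bench-mark weights can be computed by solving the linear system Qπ = b:

```python
import numpy as np

# The log-utility bench-mark solves max_pi { pi'b - 0.5 pi'Q pi }; the first-order
# condition gives pi_p = Q^{-1} b.  Same made-up (b, sigma) as in the earlier sketch.

b = np.array([0.004, 0.006])
sigma = np.array([[0.04, 0.00], [0.01, 0.05]])
Q = sigma @ sigma.T
pi_p = np.linalg.solve(Q, b)       # bench-mark allocation Q^{-1} b
print(pi_p, 0.5 * b @ pi_p)        # weights and the per-period (1/2) b'Q^{-1}b term
```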
Taking the ratio of Vm under a policy π to the bench-mark value Vmp , we find that the
bench-marked objective is
$$\max_{\pi} \min_{(b, \sigma) \in H} E\Bigg[ U\Bigg( \prod_{k=1}^{m} \frac{\exp\big\{ r_k + \pi_k' b_k - \frac{1}{2}\pi_k' Q_k \pi_k + \pi_k' \sigma_k' Z_k \big\}}{\exp\big\{ r_k + \frac{1}{2} b_k' Q_k^{-1} b_k + b_k' Q_k^{-1} \sigma_k' Z_k \big\}} \Bigg) \Bigg].$$
This simplifies to
$$\max_{\pi} \min_{(b, \sigma) \in H} E\Big[ U\Big( \prod_{k=1}^{m} \exp\Big\{ -\frac{1}{2} (\pi_k - Q_k^{-1} b_k)' Q_k (\pi_k - Q_k^{-1} b_k) + (\pi_k - Q_k^{-1} b_k)' \sigma_k' Z_k \Big\} \Big) \Big].$$
Observe that
$$E\Big[ \prod_{k=1}^{m} \exp\Big\{ -\frac{1}{2} (\pi_k - Q_k^{-1} b_k)' Q_k (\pi_k - Q_k^{-1} b_k) + (\pi_k - Q_k^{-1} b_k)' \sigma_k' Z_k \Big\} \Big] = 1.$$
Furthermore, $\prod_{k=1}^{m} \exp\{ -\frac{1}{2} (\pi_k - Q_k^{-1} b_k)' Q_k (\pi_k - Q_k^{-1} b_k) + (\pi_k - Q_k^{-1} b_k)' \sigma_k' Z_k \}$ is a log-concave stochastic function. Hence for any concave utility function U the above objective can
be rewritten as
$$\min_{\pi} \max_{(b, \sigma) \in H} \sum_{k=1}^{m} (\pi_k - Q_k^{-1} b_k)' Q_k (\pi_k - Q_k^{-1} b_k).$$
It now breaks into a sequence of single period problems:
$$\sum_{k=1}^{m} \Big\{ \min_{\pi_k} \max_{(b_k, \sigma_k) \in H_k} (\pi_k - Q_k^{-1} b_k)' Q_k (\pi_k - Q_k^{-1} b_k) \Big\}.$$
Given the uncertainty sets Hk , k = 1, 2, . . . , m, the above robust optimization problem can be
solved using duality (see Lim, Shanthikumar and Watewai (2006a)).
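To give a feel for one single-period subproblem, here is a small sketch under our own simplifications: Qk is known and fixed, and Hk is a finite set of scenarios for bk, rather than the general uncertainty sets treated by duality in Lim, Shanthikumar and Watewai (2006a). All names and numbers are ours.

```python
import numpy as np
from scipy.optimize import minimize

# A sketch of one single-period robust subproblem:
#   min_pi max_{b in H} (pi - Q^{-1}b)' Q (pi - Q^{-1}b),
# with Q fixed and H a finite scenario set for b (our simplification).

sigma = np.array([[0.04, 0.00], [0.01, 0.05]])
Q = sigma @ sigma.T
H = [np.array([0.002, 0.004]), np.array([0.006, 0.002]), np.array([0.004, 0.008])]

def worst_case(pi):
    # Quadratic distance from pi to each scenario's ideal point Q^{-1}b; keep the worst.
    return max((pi - np.linalg.solve(Q, b)) @ Q @ (pi - np.linalg.solve(Q, b))
               for b in H)

# Nelder-Mead handles the nonsmooth max of convex quadratics in this tiny example.
res = minimize(worst_case, x0=np.zeros(2), method="Nelder-Mead")
print(res.x, res.fun)   # robust allocation and its worst-case value
```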
8. Summary and Conclusion
The interest in model uncertainty, robust optimization and learning in the OR/MS areas
is growing rapidly. The types of model uncertainty considered in the literature can be
broadly categorized into three classes: models with uncertainty sets for (1) variables, (2)
parameters and (3) measures. The robust optimization approaches used to find (more or
less robust) solutions fall into (a) min-max and (b) min-max with bench-marking. Two
common approaches to bench-marking are through (1) regret and (2) competitive ratio. The main
focus in OR/MS has been on the development of models with uncertainty sets for variables
(deterministic models of model uncertainty) and on deterministic min-max and min-max-regret
robust optimization. Within this framework, the focus has been on developing efficient
solution procedures for robust optimization. Only a very limited amount of work has been
done on stochastic models of model uncertainty and on robust optimization with
bench-marking, and very little on learning. We believe that a substantial amount of work
needs to be done on these latter three topics.
Acknowledgement
This work was supported in part by NSF grant DMI-0500503 (for Lim and Shanthikumar) and NSF CAREER awards DMI-0348209 (for Shen) and DMI-0348746 (for Lim).
References
Agrawal, V. and S. Seshadri (2000) Impact of Uncertainty and Risk Aversion on Price
and Order Quantity in the Newsvendor Problem, Manufacturing and Service Operations
Management, 2, 410-423.
Ahmed, S., U. Cakmak and A. Shapiro (2005) Coherent Risk Measures in Inventory Problems, Technical Report, School of Industrial and Systems Engineering, Georgia Institute of
Technology, Atlanta, GA.
Anderson, E. W., L. P. Hansen, and T. J. Sargent (1998) Risk and Robustness in Equilibrium, Technical Report, University of Chicago.
Anderson, E. W., L. P. Hansen, and T. J. Sargent (2003) A Quartet of Semigroups for Model
Specification, Robustness, Price of Risk, and Model Detection, Journal of the European
Economic Association, 1, 68-123.
Atamturk, A. (2003) Strong Formulations of Robust Mixed 0-1 Programming, to appear in
Mathematical Programming.
Atamturk, A. and M. Zhang (2004) Two-Stage Robust Network Flow and Design under
Demand Uncertainty, to appear in Operations Research.
Averbakh, I. (2000) Minmax regret solutions for minmax optimization problems with uncertainty, Operations Research Letters, 27, 57-65.
Averbakh, I. (2001) On the complexity of a class of combinatorial optimization problems
with uncertainty, Mathematical Programming, 90, 263-272.
Averbakh, I. (2004) Minmax regret linear resource allocation problems, Operations Research
Letters, 32, 174-180.
Azoury, K. S. (1985) Bayes Solution to Dynamic Inventory Models under Unknown Demand
Distribution, Management Science, 31, 1150-1160.
Ben-Tal, A. and A. Nemirovski (1998) Robust Convex Optimization, Mathematics of Operations Research, 23, 769-805.
Ben-Tal, A. and A. Nemirovski (1999) Robust solutions of uncertain linear programs, Operations Research Letters, 25, 1-13.
Ben-Tal, A. and A. Nemirovski (2000) Robust Solutions of Linear Programming Problems
Contaminated with Uncertain Data, Mathematical Programming, A88, 411-424.
Ben-Tal, A. and A. Nemirovski (2002) Robust optimization - methodology and applications,
Mathematical Programming, B92, 453-480.
Berger, J. O. (1985) Statistical Decision Theory and Bayesian Analysis, Second Edition,
Springer, New York, NY.
Bernhard, P. (2003) A robust control approach to option pricing, Applications of Robust
Decision Theory and Ambiguity in Finance, (M. Salmon, ed.), City University Press, London.
Bernhard, P. (2003) Robust control approach to option pricing, including transaction costs,
Advances in Dynamic Games, Annals of the International Society of Dynamic Games, 7,
(A.S. Nowak, K. Szajowski, eds.), Birkhauser.
Bertsekas, D. (2003) Convex Analysis and Optimization, Athena Scientific.
Bertsimas, D., D. Pachamanova and M. Sim (2004) Robust Linear Optimization under
General Norms, Operations Research Letters, 32, 510-516 .
Bertsimas, D. and M. Sim (2003) Robust Discrete Optimization and Network Flows, Mathematical Programming Series B, 98, 49-71.
Bertsimas, D. and M. Sim (2004) The Price of Robustness, Operations Research, 52, 35-53.
Bertsimas, D. and M. Sim (2004) Robust Discrete Optimization under Ellipsoidal Uncertainty Sets, working paper, MIT.
Bertsimas, D. and M. Sim (2006) Tractable Approximation to Robust Conic Optimization
Problems, Mathematical Programming, 107, 5-36.
Bertsimas, D. and A. Thiele (2003) A Robust Optimization Approach to Inventory Theory,
Operations Research, 54, 150-168.
Bienstock, D. and N. Ozbay (2005) Computing Robust Basestock Levels, CORC Report
TR-2005-09, Columbia University, NY.
Birge, J. R. and F. Louveaux (1997) Introduction to Stochastic Programming, Springer, New
York.
Boyd, S. and L. Vandenberghe (2004) Convex Optimization, Cambridge University Press,
Cambridge, UK.
Cagetti, M., L. P. Hansen, T. Sargent and N. Williams (2002) Robust Pricing with Uncertain
Growth, Review of Financial Studies, 15(2), 363-404.
Cao, H. H., T. Wang and H. H. Zhang (2005) Model Uncertainty, Limited Market Participation, and Asset Prices, Review of Financial Studies, 18, 1219-1251.
Chen, X., M. Sim, D. Simchi-Levi and P. Sun (2004) Risk Aversion in Inventory Management. Working paper, MIT, Cambridge, MA.
Chen, X., M. Sim and P. Sun (2004) A Robust Optimization Perspective of Stochastic
Programming, Technical Report, National University of Singapore, Singapore.
Chen, X., M. Sim, P. Sun and J. Zhang (2006) A Tractable Approximation of Stochastic
Programming via Robust Optimization, Technical Report, National University of Singapore,
Singapore.
Chen, Z. and L. G. Epstein (2002) Ambiguity, Risk and Asset Returns in Continuous Time,
Econometrica, 70, 1403-1443.
Chou, M., M. Sim and K. So (2006) A Robust Framework for Analyzing Distribution Systems
with Transshipment, Technical Report, National University of Singapore, Singapore.
Chu, L. Y., J. G. Shanthikumar and Z. J. M. Shen (2005) Solving Operational Statistics via
a Bayesian Analysis. Working paper, University of California at Berkeley.
Chu, L. Y., J. G. Shanthikumar and Z-J. M. Shen, (2006a) Pricing and Revenue Management
with Operational Statistics. Working paper, University of California at Berkeley.
Chu, L. Y., J. G. Shanthikumar and Z-J. M. Shen, (2006b) Stochastic Optimization with
Operational Statistics: A General Framework, Working paper, University of California at
Berkeley.
D’Amico, S. (2005) Density Selection and Combination under Model Ambiguity: An Application to Stock Returns, Technical Report 2005-09, Division of Research and Statistics and
Monetary Affairs, Federal Reserve Board, Washington, D. C.
Ding, X., M. L. Puterman and A. Bisi (2002) The Censored Newsvendor and the Optimal
Acquisition of Information, Operations Research, 50, 517-527.
Dow, J., and S. Werlang (1992) Ambiguity Aversion, Risk Aversion, and the Optimal Choice
of Portfolio, Econometrica, 60, 197-204.
El Ghaoui, L. and H. Lebret (1997) Robust Solutions to Least Square Problems to Uncertain
Data Matrices, SIAM Journal on Matrix Analysis and Applications, 18, 1035-1064.
El Ghaoui, L., F. Oustry and H. Lebret (1998) Robust Solutions to Uncertain Semidefinite
Programs, SIAM Journal on Optimization, 9, 33-52.
Ellsberg, D. (1961) Risk, Ambiguity and the Savage Axioms, Quarterly Journal of Economics, 75, 643-669.
Epstein, L. G. (2006) An axiomatic model of non-Bayesian updating, Review of Economic
Studies, forthcoming.
Epstein, L. G. and J. Miao (2003) A Two-Person Dynamic Equilibrium under Ambiguity,
Journal of Economic Dynamics and Control, 27, 1253-1288.
Epstein, L. G. and M. Schneider (2003) Recursive Multiple Priors, Journal of Economic
Theory, 113, 1-31.
Epstein, L. G. and M. Schneider (2003) IID: independently and indistinguishably distributed, Journal of Economic Theory, 113, 32-50.
Epstein, L. G. and M. Schneider (2005) Learning under ambiguity, University of Rochester.
Epstein, L. G. and M. Schneider (2005) Ambiguity, information quality and asset pricing,
University of Rochester.
Epstein, L. G., J. Noor, and A. Sandroni (2005) Non-Bayesian updating: a theoretical framework, University of Rochester.
Epstein, L. G. and T. Wang (1994) Intertemporal Asset Pricing Under Knightian Uncertainty, Econometrica, 62, 283-322.
Erdogan, E. and G. Iyengar (2006) Ambiguous Chance Constrained Problems and Robust
Optimization, Mathematical Programming, 107, 37-61.
Follmer, H. and A. Schied (2002) Robust representation of convex measures of risk. Advances
in Finance and Stochastics, Essays in Honour of Dieter Sondermann, Springer-Verlag, 39-56.
Follmer, H. and A. Schied (2002) Stochastic Finance: An Introduction in Discrete Time, de
Gruyter Studies in Mathematics 27, Second edition (2004), Berlin.
Garlappi, L., R. Uppal, and T. Wang (2005) Portfolio Selection with Parameter and Model
Uncertainty: A Multi-Prior Approach, C.E.P.R. Discussion Papers 5041.
Gallego, G., J. Ryan and D. Simchi-Levi (2001) Minimax Analysis for Finite Horizon Inventory Models, IIE Transactions, 33, 861-874.
Gilboa, I. and D. Schmeidler (1989) Maxmin Expected Utility with Non-unique Prior, Journal of Mathematical Economics, 18, 141-153.
Goldfarb, D. and G. Iyengar (2003) Robust Portfolio Selection Problem, Mathematics of
Operations Research, 28, 1-28.
Hansen, L. P. and T. J. Sargent (2001) Acknowledging Misspecification in Macroeconomic
Theory, Review of Economic Dynamics, 4, 519-535.
Hansen, L. P. and T. J. Sargent (2001) Robust Control and Model Uncertainty, American
Economic Review, 91, 60-66.
Hansen, L. P. and T. J. Sargent (2003) Robust Control of Forward Looking Models, Journal
of Monetary Economics.
Hansen, L. P. and T. J. Sargent (2006) Robust Control and Economic Model Uncertainty,
Princeton University Press, Princeton, NJ. (forthcoming)
Hansen, L. P., T. J. Sargent, and T. D. Tallarini, Jr. (1999) Robust Permanent Income and
Pricing, Review of Economic Studies, 66, 873-907.
Hansen, L. P., T. J. Sargent, G. A. Turmuhambetova, and N. Williams (2002) Robustness
and Uncertainty Aversion, University of Chicago.
Hansen, L. P., T. J. Sargent and N. E. Wang (2002) Robust Permanent Income and Pricing
with Filtering, Macroeconomic Dynamics, 6, 40-84.
Iyengar, G. (2005) Robust Dynamic Programming, Mathematics of Operations Research,
30, 257-280.
Jain, A., A. E. B. Lim and J. G. Shanthikumar, Incorporating Model Uncertainty and
Learning in Operations Management. Working paper, University of California at Berkeley.
Karlin, S. (1960) Dynamic Inventory Policy with Varying Stochastic Demands, Management
Science, 6, 231-258.
Kass, R. E. and L. Wasserman (1996) The Selection of Prior Distributions by Formal Rules,
Journal of the American Statistical Association, 91, 1343-1370.
Knight, F. H. (1921) Risk, Uncertainty and Profit, Houghton Mifflin, Boston, MA.
Kouvelis, P. and Yu, G. (1997) Robust Discrete Optimization and Its Applications, Kluwer
Academic Publishers, Boston, MA.
Lariviere, M. A. and E. L. Porteus (1999) Stalking Information: Bayesian Inventory Management with Unobserved Lost Sales, Management Science, 45, 346-363.
Lim, A. E. B. and J. G. Shanthikumar (2004) Relative Entropy, Exponential Utility and
Robust Dynamic Pricing. Working paper, University of California at Berkeley (to appear in
Operations Research).
Lim, A. E. B., J. G. Shanthikumar and Z-J. M. Shen (2006a), Dynamic Learning and Optimization with Operational Statistics. Working paper, University of California at Berkeley.
Lim, A. E. B., J. G. Shanthikumar and Z-J. M. Shen (2006b), Duality for relative performance objectives. Working paper, University of California at Berkeley.
Lim, A. E. B., J. G. Shanthikumar and T. Watewai (2005) Relative Performance Measures
of Portfolio Robustness. Working paper, University of California at Berkeley.
Lim, A. E. B., J. G. Shanthikumar and T. Watewai (2006a) Robust Multi-Product Dynamic
Pricing. Working paper, University of California at Berkeley.
Lim, A. E. B., J. G. Shanthikumar and T. Watewai (2006b) A Balance Between Optimism
and Pessimism in Robust Portfolio Choice Problems through Certainty Equivalent Ratio.
Working paper, University of California at Berkeley.
Liu, J., J. Pan and T. Wang (2006) An Equilibrium Model of Rare-Event Premia, Review
of Financial Studies, to appear.
Liyanage, L. and J. G. Shanthikumar (2005) A Practical Inventory Policy Using Operational
Statistics, Operations Research Letters, 33, 341-348.
Porteus, E. L. (2002) Foundations of Stochastic Inventory Theory, Stanford University Press,
Stanford, CA.
Robert, C. P. (2001) The Bayesian Choice, Second Edition, Springer, New York, NY.
Ruszczynski, A. and A. Shapiro (Editors) (2003) Stochastic Programming, Handbooks in
Operations Research and Management Series, Volume 10, Elsevier, New York.
Savage, L. J. (1972) The Foundations of Statistics, Second Edition, Dover, New York.
Scarf, H. (1959) Bayes Solutions of the Statistical Inventory Problem, Annals of Mathematical
Statistics, 30, 490-508.
Soyster, A. L. (1973) Convex Programming with Set-Inclusive Constraints and Applications
to Inexact Linear Programming, Operations Research, 21, 1154-1157.
Sutton, R. S. and A. G. Barto (1998) Reinforcement Learning: An Introduction, The MIT
Press, Cambridge, MA.
Uppal, R. and T. Wang (2003) Model Misspecification and Under Diversification, Journal
of Finance, 58, 2465-2486.
van der Vlerk, M. H. (2006) Stochastic Programming Bibliography, World Wide Web,
http://mally.eco.rug.nl/spbib.html, 1996-2003
Vapnik, V. N. (2000) The Nature of Statistical Learning Theory, Second Edition, Springer,
New York, NY.
Wald, A. (1950) Statistical Decision Functions, John Wiley and Sons, New York.
Zipkin, P. H. (2000) Foundations of Inventory Management, McGraw Hill, New York.