
Formulating and Modelling Robust
Decision-Making Problems Under Severe
Uncertainty
Daphne Do
Conducted under the supervision of Moshe Sniedovich and Peter Taylor.
November 2008
Department of Mathematics and Statistics
The University of Melbourne
Abstract
Real-world application problems can be subject to data uncertainty. In light of this, it is important to ensure that mathematical formulations are capable of encapsulating such uncertainties, and that the translation from formulation to mathematical model involves only necessary assumptions, in order to reflect reality as accurately as possible. This thesis focusses on the formulation and modelling aspect of robust decision-making under severe uncertainty, and the underlying assumptions and concepts required in order to be able to recommend one decision over another in circumstances which may be highly unique and not reflective of past occurrences. Definitions of different levels of robustness are presented, as well as methods of conceptualising uncertainty. An overview of current methodologies available for robust decision-making is presented, and some examples of their implementation are also included. We present
proofs that Starr’s domain criterion, Ben-Tal & Nemirovski’s robust counterpart approach, and
Soyster’s inexact linear programming approach are all instances of Wald’s Maximin criterion,
demonstrating a fundamental concept of robust decision-making. We examine the application of the theory to two case studies: the first, an allocation problem regarding carbon offsetting schemes; the second, a container inspection problem relating to port security.
Acknowledgements
Firstly, to Moshe Sniedovich, thank you so much for your support and direction, and your
amazing sense of humour. This wouldn’t have been nearly as much fun without you!
To Peter Taylor, my secondary supervisor, thank you for your perspective and encouragement.
I would also like to acknowledge the generous financial support of the Applied Environmental Decision Analysis (AEDA), the Commonwealth Environment Research Facility (CERF),
and the Australian Society of Operations Research (ASOR).
Thanks also to my brother for all the last minute help, my dad for the support, and my mum
for cooking almost every meal I ate this year, even though I moved out long ago.
Also, thank you, Nikki, my mathematical partner in crime, for all the citrus, and to my fellow
Honours classmates - it has been a pleasure getting to know you.
And last, but certainly not least, thank you to Gus Goswell, for maintaining the sanity.
Contents

1 Introduction
2 Severe Uncertainty
   2.1 Overview
   2.2 Certainty
   2.3 Risk
   2.4 Uncertainty
   2.5 Dealing with Severe Uncertainty
   2.6 Formulations Involving Severe Uncertainty
   2.7 Variability and Uncertainty
3 Satisficing and Optimising
   3.1 Overview
   3.2 Satisficing Versus Optimising: A Matter of Choice
4 Robustness
   4.1 Overview
   4.2 Complete Robustness
   4.3 Partial Robustness
   4.4 Local Robustness
   4.5 Partial Versus Local Robustness
   4.6 Robust Optimisation
   4.7 Maximin Models in Disguise
5 Model Choice: What is Attractive?
   5.1 Problem Formulation and Modelling
   5.2 Uncertainty
   5.3 Satisficing Versus Optimising
   5.4 Robustness
   5.5 Tractability
6 Illustrative Examples
   6.1 Budget Allocation to Carbon Offsets Schemes
      6.1.1 Modelling for Robust Optimisation
      6.1.2 Modelling for Robust Satisficing
   6.2 Container Inspection Problem
7 Conclusion
List of Figures

2.1 Two infinite uncertainty regions.
2.2 Comparison between the quality of two estimates.
2.3 A visualisation of one example of uncertainty dispersion over a state space.
2.4 A visualisation of another example of uncertainty dispersion over a state space.
4.1 An example of a problem solvable under complete robustness.
4.2 An example of a problem with no feasible solution under complete robustness.
4.3 An example of a discretised state space.
4.4 The chosen subset of states for the decision represented in Figure 4.3.
4.5 A visual comparison of two decisions.
4.6 Two decisions that are not comparable.
4.7 An example of a local analysis.
4.8 An example of a local analysis under mild uncertainty.
4.9 An example of a local analysis under severe uncertainty.
4.10 Example of identical local analysis results with different partial results.
4.11 Example of the dependence on the estimate location.
4.12 Monotonic decreasing function, h(d, s).
4.13 Partial robustness region for h(d, s) ≥ 0.
4.14 Division of uncertainty region.
4.15 Division of uncertainty region.
4.16 Division of uncertainty region.
4.17 Example where info-gap chooses the incorrect decision.
4.18 Example where info-gap chooses the correct decision.
4.19 Visualisations of robustness for strategies A through G.
4.20 An example of worst case states in Info-Gap.
5.1 Lifecycle of a problem.
6.1 Dominance of schemes in the carbon offsets problem.
6.2 Performance grids for decisions (y1, y2) = (0.4, 0.6) and (x1, x2) = (0.6, 0.4).
6.3 Graph of the region 40b1 + 60b2 ≥ 14.
6.4 Graph of the region 60b1 + 40b2 ≥ 14.
6.5 Changes to rc.
6.6 A monotonic increasing function in p with threshold πc.
6.7 A function in p that is not monotonically increasing, with threshold πc.
List of Tables

2.1 Decision Matrix.
2.2 Decision-Making Under Risk.
2.3 Decision-Making Using Maximin.
2.4 Decision Table.
2.5 Decision-Making Using Laplace's Principle of Insufficient Reason.
4.1 Decision Matrix.
4.2 Starr's domain criterion reformulation.
6.1 Carbon offsets schemes data.
6.2 Upper and lower bounds on return for the carbon offsets problem.
6.3 Solutions for each decision given the return parameter value pairing (b1, b2).
Chapter 1
Introduction
Decision makers constantly face uncertainty: they lack the information and the ability to predict future events accurately, and cannot meaningfully ascribe probabilities to them. The field of robust optimisation has been developed in an attempt to assist decision makers under these conditions of severe uncertainty.
A plethora of relevant applications, alongside advances in computing ability, have stimulated
a resurgence of research into the field of robust decision-making. Ideas and methodologies that
were toyed with before these advances, including methods which restricted problems to two
dimensions for the sake of tractability, are now being implemented in business, policy planning,
environmental management, and other fields.
This thesis will examine the state of the art of robust decision-making techniques, with emphasis on the formulation and modelling aspects of robust decision-making under conditions of uncertainty, rather than on the implementation of solution methods. While the definition
of robustness differs from source to source within the decision-making literature, the essence of a
robust decision is that it should perform well across a large number of unknown, yet plausible,
future states. Hence, severe uncertainty as to which state will eventuate should have minimal
consequence to the performance of a decision if it is robust.
Severe uncertainty for decision-making, also known as plain ‘uncertainty’ within the field of
classical decision theory [French, 1986], ‘Knightian uncertainty’ [Ben-Haim, 2001, 2006] within
the Info-Gap literature, or 'deep uncertainty' within RAND literature [Lempert et al., 2003],
refers to uncertainty where probability distribution functions cannot be designated to states of
nature. Under severe uncertainty, uncertainty is immeasurable and unquantifiable, and hence
differs from decision-making under risk, where uncertainty can be quantified using probability
distributions or other related statistics. This leads to grave difficulties with modelling uncertainty, which in turn results in difficulties obtaining decisions which produce acceptable outcomes.
Many methodologies have been proposed to tackle decision-making under severe uncertainty.
Classical decision theory, which encompasses Wald’s Maximin principle and its derivatives,
came to light in the mid-1950s, while Laplace's Principle of Insufficient Reason dates back as early as the 18th century.
Following this period, interest spawned in the field of decision-making under uncertainty, with
a large proportion of research focusing on probabilistic approaches to decision-making under
uncertainty. This included the development, in the 1950s, of stochastic programming [Dantzig, 1955], Markowitz's Mean-Variance model [Markowitz, 1952], and optimisation under probabilistic constraints [Charnes and Cooper, 1959]. These methods all assume, mathematically, that the probability distributions of the random variables are known.
One drawback of these probabilistic methods is that some applications of decision-making under severe uncertainty involve a lack of information or an inability to explicitly ascribe probabilities to a set of possible futures. Such applications may occur due to instabilities within an
environment in which a decision is to take place, where prediction of future outcomes cannot
be based on prior experience and performance. To deal with uncertainty in decision-making
in such situations, what are referred to in the literature as ‘non-probabilistic’ techniques are
required. These are the foci of this thesis.
Probabilists may argue that requiring non-probabilistic techniques for conditions of uncertainty is an oxymoron, as probabilities can always be ascribed to states of nature. However, under severe uncertainty, we may be forced to ascribe probabilities to states of nature which are not representative of reality in order to apply mathematical analysis. Where non-probabilistic techniques are referred to in this thesis, we mean techniques used in cases where the problem is received without probability distributions attached to its uncertain parameters. This does not mean that the technique itself makes no assumptions regarding probability distributions during implementation. This thesis focusses on non-probabilistic techniques, with some exceptions. Hence, while some of the concepts that we discuss can be examined and described from a probabilistic perspective, we omit that discussion from this thesis to avoid confusion.
Research into, and application of, robust decision-making with non-probabilistic approaches is
currently thriving. Recent methodologies that have gained prominence in much of the literature include methods which examine uncertainty sets, developed by Ben-Tal and Nemirovski
[1998, 2002], and Bertsimas and Sim [2004, 2006, 2007]. These methods have been derived from
Soyster's seminal paper on inexact linear programming [Soyster, 1973]. There are also a great number of papers which examine the concepts behind these approaches, or implement them [Sniedovich, 2007, 2008c, Lempert et al., 2003, 2006, Beyer and Sendhoff, 2007]. Such is the
growth of the field that the journal, ‘Mathematical Programming’, even released a special issue
on robust optimisation [Ben-Tal et al., 2006].
Applications range over many fields including, but not limited to, terrorism [Moffitt et al., 2005,
Carr et al., 2006], environmental conservation [Regan et al., 2005, Moilanen et al., 2006], financial
applications [Beresford-Smith and Thompson, 2007, Kachani and Langella, 2005], and abrupt
climate and environmental change issues [Eiselt et al., 1998, Lempert and Collins, 2007].
In this thesis we focus on applications of the allocation of funds to carbon offsetting schemes,
and the determination of the number of containers to inspect at a port given the unknown
probability that one container is concealing a weapon.
The purpose of this thesis is to acknowledge the research performed within the field of robust
decision-making under severe uncertainty, as well as to highlight deficits within the literature.
The thesis will discuss the use and misuse of terminology, and will try to present clear definitions and interpretations of buzzwords within the literature. The differences between optimising and satisficing are discussed, as are the characteristics distinguishing different intensities of uncertainty. Classifications are also defined for different types of robustness, followed by
an analysis of several decision-making methodologies with respect to how these concepts influence and affect their implementation and success with regards to recommendation of robust
decisions.
The class of decision problems under consideration comprises models where we must
choose a decision such that the outcome performs sufficiently well over all states possible for
that decision. Uncertainty is present, and relates to which state will eventuate. In terms of
an uncertain parameter, this uncertainty relates to the unknown, true value of the parameter.
The following notation has been borrowed from the field of classical decision theory, and is
presented here for ease of reading:
A decision space, D: a set containing all of the decisions available to the decision maker.
State spaces, S(d) ⊆ S, d ∈ D: S(d) denotes the set of possible future states that may
eventuate with the choice of decision d, and S denotes the entire state space.
A real-valued function f on D × S: f is the objective function, where f (d, s) represents the
value of the outcome generated if decision d is made, and state s is realised.
We may have a constraint h(d, s) ∈ C, where h is a function on D × S taking values in a set C′, with C ⊆ C′.
In this thesis, it shall also be assumed that we aim to maximise our utility. This corresponds with
an adaptation of Wald’s initial Minimax formulation [Wald, 1939], which we shall refer to as
Wald’s Maximin formulation, whereby the decision maker wants to maximise return, while the
antagonistic Mother Nature wishes to minimise it. Generality is not lost with this assumption,
as any minimisation or satisficing problem can be formulated as an equivalent maximisation
problem. With this notation, we introduce the standard robust decision-making problem under
conditions of severe uncertainty:
Choose a decision, d ∈ D, such that the decision performs sufficiently well over all s ∈ S(d).
We break this down into three different versions, which we will delve into further in Chapter 4:
Robust Satisficing: The standard robust satisficing decision-making problem under conditions
of severe uncertainty is as follows:
Choose a decision, d ∈ D, such that h(d, s) ∈ C holds for all s ∈ S(d).
Robust Optimising: The standard robust optimising decision-making problem under conditions of severe uncertainty is as follows:
Choose a decision, d ∈ D, such that f (d, s) is optimal, or within some threshold of optimality,
for all s ∈ S(d).
Robust Satisficing and Optimising: The standard robust satisficing and optimising decision-making problem under conditions of severe uncertainty is as follows:
Choose a decision, d ∈ D, such that h(d, s) ∈ C, and f (d, s) is optimal, or within some
threshold of optimality, for all s ∈ S(d).
In Chapter 4, we shall present relaxations of these problems whereby we do not require the
robustness over all states in the state space.
We also discuss the usefulness of Wald’s Maximin criterion as a modelling tool. We present a
classical formulation of Wald’s Maximin criterion, which will be referred to throughout the rest
of the thesis:
z∗ = max_{d ∈ D} min_{s ∈ S(d)} f(d, s).
An equivalent mathematical programming formulation of Wald’s Maximin is given below:
z∗ = max_{d ∈ D, v ∈ R} v
s.t. v ≤ f(d, s), ∀ s ∈ S(d).
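To see the equivalence of the two formulations on a small finite instance, the following is a minimal Python sketch (the 2 × 2 decision matrix of the examples in Chapter 2 is assumed purely for illustration); for each fixed d, the largest feasible v in the mathematical programming form is exactly min_{s ∈ S(d)} f(d, s):

```python
# Brute-force check that the two Maximin formulations coincide on a
# small finite instance (toy decision matrix assumed for illustration).
f = {("d1", "s1"): 3, ("d1", "s2"): 7,
     ("d2", "s1"): 8, ("d2", "s2"): 2}
D = ["d1", "d2"]
S = {"d1": ["s1", "s2"], "d2": ["s1", "s2"]}  # S(d) for each d

# Classical formulation: z* = max over d of min over s in S(d) of f(d, s).
z_classical = max(min(f[d, s] for s in S[d]) for d in D)

# Mathematical programming formulation: for fixed d, the largest v with
# v <= f(d, s) for all s in S(d) is min over s of f(d, s); then maximise over d.
def max_feasible_v(d):
    return min(f[d, s] for s in S[d])

z_mp = max(max_feasible_v(d) for d in D)
assert z_classical == z_mp == 3
```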
We shall present further Maximin formulations under different conditions of robustness in
Chapter 4.
In Chapter 4.7, we shall present proofs that Soyster's inexact linear programming [1973], Ben-Tal and Nemirovski's robust counterpart approach [1999], and Starr's domain criterion [1966]
are all instances of Wald’s Maximin criterion.
The examples demonstrated in this thesis are outlined as follows:

Carbon Offsets Allocation Problem: Allocate a budget of funds to different carbon offsetting schemes, with the objective of achieving robustness of the amount of carbon offset.

Port Security Allocation Problem: Determine the number of containers to inspect at a port in order to ensure robustness against failing to meet a threshold value of utility, which depends on two things: ensuring that no terrorist attack takes place, and keeping within budget.
Chapter 2
Severe Uncertainty
2.1 Overview
French [1986] differentiates between the classifications of decisions made under certainty, decisions made with risk, and decisions made under strict, or what we shall refer to as severe,
uncertainty.
Decisions made under certainty assume that the state of nature which will eventuate after making a decision is known with certainty prior to decision-making, hence the consequences of each
decision can be anticipated. Luce and Raiffa [1957] state that decision-making under certainty
typically involves choosing a decision which maximises (or minimises) some given index, for
example profit (or cost), out of a set of possible decisions.
Decision-making with risk assumes that probability distributions can be assigned to each state
of nature, and hence outcome, with an optimal decision reflecting the maximum expected utility. Decision-making under risk leads to choosing a decision with a set of possible outcomes,
each with its own probability distribution and return [Luce and Raiffa, 1957].
Conditions of severe uncertainty refer to situations where it is impossible for the decision maker
to quantify uncertainty of the resultant state prior to the problem formulation stage, through
means of probability distributions or otherwise. French [1986, pg.36] states that under severe
uncertainty, the decision maker ‘cannot quantify his uncertainty in any way’. This is not to be
confused with the probabilities assigned to states of nature when modelling for mathematical
analysis. Here, the problem is received without probabilistic structures applied to the uncertainty.
These definitions are similar to those previously defined by Luce and Raiffa [1957]. Luce and
Raiffa also divide the class of uncertainty into two classifications: uncertainty, and a combination of uncertainty and ‘risk in the light of experimental evidence’, referring to this latter
combination as statistical inference.
Knight differentiated between risk and uncertainty by stating that in risk, ‘the distribution of the
outcome in a group of instances is known, ..., while in the case of uncertainty this is not true, the
reason being in general that it is impossible to form a group of instances, because the situation
dealt with is in a high degree unique’ [Knight, 1921, pg.233]. Along similar lines is Keynes’
definition of uncertainty, ‘About these matters there is no scientific basis on which to form any
calculable probability whatever. We simply do not know’ [Keynes, 1937, pg.213–215]. Lastly,
we introduce Shackle’s definition of uncertainty, which he considers to be ‘the irreducible core
of ignorance concerning the outcome of a virtually isolated act’ [Shackle, 1952, pg.118].
The uncertainty that we discuss in this thesis relates to the uncertainty as to which state will
eventuate. While we acknowledge other types of uncertainty, such as the uncertainty as to how
well the system will perform for a given state, the non-probabilistic methodologies dealt with
in this thesis tackle only the former kind, hence we have restricted our focus to this.
To demonstrate the above classifications, we present the following example:
Example 2.1. Consider the following decision-making problem instance with two decisions, d1 ∈ D
and d2 ∈ D, and two possible states of nature, s1 ∈ S and s2 ∈ S. Suppose that d1 has returns of 3 and
7 in states s1 and s2 respectively, while d2 has returns of 8 and 2 in states s1 and s2 respectively. The
corresponding decision matrix is shown below in Table 2.1:
        s1    s2
d1      3     7
d2      8     2

Table 2.1: Decision Matrix.
2.2 Certainty
The idea behind decision-making under certainty is that the eventuating state of nature, and
hence a decision’s resulting return, is known prior to making a decision. Thus it is possible to
choose a decision which provides optimal return. Programming problems with known parameters, for example linear, dynamic, and integer programming problems, are classic examples of
decision-making under certainty.
For decision-making under certainty, using the decision table given in Table 2.1, the state of
nature is known prior to making the decision; hence if the state of nature is s1, we should choose decision d2 to maximise our return. If the state of nature is s2, we will choose decision d1.
2.3 Risk
If it is possible to quantify uncertainty in parameters using probability distributions or other related statistics, then we have decision-making under risk. Under conditions of risk, a decisionmaker attempts to maximise their expected utility, provided there are no additional constraints
which prevent this from happening. Although not discussed in this paper, the stochastic optimisation literature is very relevant here [Kall and Mayer, 2005], as is Bayesian analysis and
fuzzy sets [Bandemer, 1992].
Example 2.2. For decision-making under risk, using the decision table given in Table 2.1, we assume
that we know, or are able to postulate, the probabilities of each state. If there is a 43% chance that state
one will occur and a 57% chance that state two will eventuate, we should choose decision 1 to maximise
our expected return, with calculations as shown in Table 2.2:
        s1    s2    Expected Return
d1      3     7     0.43 × 3 + 0.57 × 7 = 5.28
d2      8     2     0.43 × 8 + 0.57 × 2 = 4.58

Table 2.2: Decision-Making Under Risk.
It can be seen that if one state has a 100% chance of being the end-state, this gives us decision-making under certainty; hence decision-making under certainty is a degenerate case of decision-making under risk.
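As an illustration, the calculation in Table 2.2 can be written as a minimal Python sketch (the returns and probabilities are the assumed data of Example 2.2):

```python
# Expected-return calculation for Example 2.2: probabilities are assumed
# known under risk, P(s1) = 0.43 and P(s2) = 0.57.
returns = {"d1": (3, 7), "d2": (8, 2)}   # returns in states (s1, s2)
p = (0.43, 0.57)

expected = {d: sum(pi * ri for pi, ri in zip(p, r))
            for d, r in returns.items()}     # ≈ {'d1': 5.28, 'd2': 4.58}
best = max(expected, key=expected.get)       # 'd1'
```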
2.4 Uncertainty
The definition of conditions of uncertainty within the decision-making literature has the underlying foundation that probabilities of future states are unknown or meaningless [Luce and
Raiffa, 1957], and that uncertainty is unquantifiable.
Example 2.3. For classical decision-making under severe uncertainty using a two-player zero-sum game
analysis for the decision table given in Table 2.1, we can apply Wald’s Maximin principle. This principle
is one from the classical decision theory era, and is further discussed in Chapter 4. Maximin chooses the
best return out of the worst-case returns. This principle recommends that we choose decision 1, which
guarantees a return of at least 3. This is demonstrated below in Table 2.3:
        s1    s2    Worst-Case
d1      3     7     3
d2      8     2     2

Table 2.3: Decision-Making Using Maximin.
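The Maximin selection of Example 2.3 is equally mechanical; a minimal Python sketch over the same decision matrix:

```python
# Wald's Maximin on the matrix of Table 2.3: choose the decision whose
# worst-case return is largest.
returns = {"d1": (3, 7), "d2": (8, 2)}

worst_case = {d: min(r) for d, r in returns.items()}  # {'d1': 3, 'd2': 2}
choice = max(worst_case, key=worst_case.get)          # 'd1', guaranteeing 3
```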
Severe Uncertainty
The concept of severe uncertainty, also known as strict [French, 1986], Knightian [Ben-Haim, 2001, 2006], or deep [Lempert et al., 2003] uncertainty, appears to dominate decision-making
literature in both theory and application. It must be emphasised that under severe uncertainty,
within the field of decision-making, no probabilities can be explicitly ascribed to parameter
estimates or states of nature. Severity of uncertainty is immeasurable quantitatively, which has
resulted in misguided or ambiguous definitions of severe uncertainty within the literature.
This being said, there is a need to force quantification of uncertainty in order to be able to analyse the problem mathematically, and we will examine and address some topics regarding this
in Chapter 2.5. A quote from Keynes [1937, pg.213–215] follows suitably: ‘Nevertheless, the
necessity for action and for decision compels us as practical men to do our best to overlook
this awkward fact and to behave exactly as we should if we had behind us a good Benthamite
calculation of a series of prospective advantages and disadvantages, each multiplied by its appropriate probability, waiting to be summed’.
Arrow and Hurwicz [1937] distinguish between complete and partial ignorance under conditions of severe uncertainty. The distinguishing characteristic is that with partial ignorance the
decision maker is able to ‘assign superior positions to some states of the world with respect to
the remaining states’ [Ballestero, 2002, pg.87], whereas with complete ignorance this is impossible. This weighting of states gives a non-uniform distribution whereby it is known what states
are more probable than others.
Kmietowicz [1981] discusses another approach which he calls decision-making under conditions of
incomplete knowledge. Under conditions of incomplete knowledge, a decision maker may have
information about probabilities of states of nature, but this information is insufficient to permit a precise specification of the probabilities.
While we don’t use the terms complete and partial ignorance, the concepts behind them are important, and hence warrant discussion. Partial ignorance and incomplete knowledge imply that
some weighting can be given to certain states of the world with respect to other possible states.
This may result from knowing relative uncertainties between certain states, and can be incorporated easily into a mathematical model. These weightings differ from dispersion, discussed in Chapter 2.5, whereby the distribution over the uncertainty region might be estimated, but the placement of peaks and valleys within the region is subject to severe uncertainty.
2.5 Dealing with Severe Uncertainty
Whilst there exists an abundance of literature focusing on decision-making under severe uncertainty, very few authors have offered a concise definition of severe uncertainty due to the
inability to quantify the construct. Rather, severe uncertainty and its counterparts seem to have
become buzzwords within the literature, defined in the traditional Knightian or Keynesian way,
but with little discussion, and perhaps much confusion, as to how to treat it successfully.
As severe uncertainty cannot be effectively quantified, its treatment must be handled delicately. One benefit of the conservatism behind Wald's Maximin criterion is that, under conditions of severe uncertainty, such caution might be the only logical stance. At the same time, extreme pessimism may not be justified in some cases. It is common for people to take risks and to be optimistic about future outcomes, but in situations where a decision will have important consequences, it is very rare under uncertainty that one would be extremely optimistic. If the stakes are very high, and uncertainty about the future is severe, caution usually outweighs risk-taking.
Size of Uncertainty Region
There are two main, but subtle, issues in dealing with uncertainty which have not been prominent in the decision-making literature, and which we shall attempt to address in this thesis. The
first is the size of the total region of uncertainty. As the size of the region of uncertainty changes,
one would expect the robustness of a decision to change accordingly. We cannot generalise that
an increase in the size of the uncertainty region decreases the robustness of a decision, or vice
versa, due to varying definitions and assumptions for different problems.
If we are presented with a problem which has a large uncertainty region and we have no probability distributions or means of quantifying our uncertainty, then it could be assumed that the
uncertainty is severe. However, if our uncertain state space is relatively ‘small’ and the uncertainty cannot be quantified, then we may have mild uncertainty.
There may be a problem in comparing the size of uncertainty regions if they are infinite. If this
is the case, we can divide the size of one region by the other to give a finite ratio of the size of
the uncertainty regions.¹ We can then compare relative sizes rather than absolute sizes. This is
discussed further in Kouvelis and Yu [1997]. An example of this is given in Figure 2.1.
Figure 2.1: Two infinite uncertainty regions.
In Figure 2.1, there is a depiction of an infinite state space, in grey, ‘contained’ within another
infinite state space, in white. While intuitively we know that the outer uncertainty region is
larger than that of the internal region (as the internal region is ‘contained’ within the outer
region), it is not a matter of counting the number of states in order to determine the size. Here,
both uncertainty regions are infinite. However, we can take a ratio of their sizes to determine which is relatively larger.
¹ In other words, we can take the limit of the ratio as the sizes approach infinity.
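One way to make this limit precise, offered here only as an assumed illustrative construction, is to truncate both regions by a disc B_t of radius t centred at the origin and compare areas:

ratio(S1, S2) = lim_{t → ∞} area(S1 ∩ B_t) / area(S2 ∩ B_t).

For instance, if S1 is the first quadrant of the plane and S2 is the upper half-plane, then area(S1 ∩ B_t) = πt²/4 and area(S2 ∩ B_t) = πt²/2, so the ratio is 1/2: S1 is half the size of S2 in this relative sense, even though both regions are infinite.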
To demonstrate the importance of the size of the uncertainty region on the robustness of a
decision, we take the following example:
Suppose that we have been told that there is a dead pixel on a computer screen which we cannot
visualise, and we need to try to determine its location. Without having any idea where it is, the
region of uncertainty is larger for a 17” screen than for a 10” screen. With a uniform distribution
over all states, it could be assumed that, starting from a random point somewhere on the screen,
we would find the dead pixel more quickly when examining the 10” monitor compared to the
17” monitor.
When using a robust decision-making method which examines uncertainty sets, the size of the
set indicates the extent of the uncertainty. The larger the set size, the greater the level of uncertainty. This means that if the set consists of a single point, we have conditions of certainty.
Examples of approaches which tackle severe uncertainty in this manner include Ben-Tal and
Nemirovski’s robust counterpart approach [1998, 1999, 2000], and Bertsimas and Sim’s modification of this approach [2004].
If, rather than the uncertainty set of a problem being representative of the uncertainty, we are
dealing with an estimate, then the equivalent ‘measure’ of the level of uncertainty is given by
the quality of the estimate. Conceptually, one could think of this as the ‘distance’ of the estimate
from the unknown true value.
This concept is demonstrated in Figure 2.2, where the circle is our unknown true value, and the
square is our estimate. Assuming a uniform distribution over the state space, we can think of
the intensity of uncertainty with regard to an estimate in terms of our relative interpretation of
how far away our estimate is from the true value.
If the estimate is believed to be a good estimate, that is, one near the true value of the parameter, then we have conditions of mild uncertainty. This could be treated using a sensitivity or
parametric analysis, or other methods of local analysis.
However, if the estimate is a poor guess and likely to be substantially wrong, then we have conditions of severe uncertainty.
An example of a methodology which uses a local analysis under severe uncertainty is that
of Ben-Haim's information-gap decision theory (which we shall refer to as Info-Gap) [2001, 2006].
However, as will be further discussed in Chapter 4, the use of a local analysis around an estimate
under severe uncertainty is not a suitable treatment of the uncertainty, since if it is severe then
we should intrinsically assume that the estimate is a wild guess which is substantially wrong.
If the uncertainty is mild, then it can be assumed that the estimate is close to the true value of
the parameter, in which case a local analysis is acceptable around this estimate.
Figure 2.2: Comparison between the quality of two estimates. (a) Severe uncertainty. (b) Mild uncertainty.
Dispersion of Uncertainty
The second issue is what we shall term the dispersion of uncertainty over this region. Here,
we differentiate between levels of uncertainty qualitatively by presenting how each might be
conceptualised in terms of uncertainty dispersion. We do not want to use the term ‘distribution’
here as we are not applying an uncertainty distribution to our uncertainty region, but rather we
are trying to describe qualitatively how probability might be dispersed over the region.
Figures 2.3 and 2.4 are visual examples of possible uncertainty dispersions over the state space.

Figure 2.3: A visualisation of one example of uncertainty dispersion over a state space.

Figure 2.4: A visualisation of another example of uncertainty dispersion over a state space.

It is important to note that these ideas are purely for conceptual and visual purposes. We do not mean that we are prescribing such a distribution to the formulation of the model, hence transforming the problem into one of risk. Rather, we use the idea of dispersion to allow us to describe what we think we know qualitatively, as we have no means of doing so quantitatively.
If we imagine a normal distribution over our state space, it has a given shape. If the shape
of our normal distribution is wide, then we know that the variance is large. If our normal
distribution is thin, then the variance for this distribution is small. This is a quantitative way of
describing the uncertainty over our state space. However, under severe uncertainty we cannot
apply the same quantitative constructs. Hence, if we have additional information, such as an
unspecified or unquantified likelihood that the true state lies in one region of the state space rather than another, then, since this may assist in solving the problem, we should represent it qualitatively using dispersion rather than neglect it.
Using the example above, we are still trying to find the dead pixel on our computer screen. If
someone describes to us that it is ‘more likely’ to be around the borders of the screen, and ‘less
likely’ to be around the centre of the screen, then we should initially base our search around the
borders of the screen. This description of the uncertainty dispersion, although not quantified,
may assist us in solving the problem.
The fact that some scenario might be ‘more likely’ than another under severe uncertainty does
not provide us with any quantification of this uncertainty, but instead might guide us towards
the search of specific areas of the uncertainty space over others.
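To illustrate how such unquantified guidance might be exploited, the following toy Python sketch (the grid size and the 'border first' ordering rule are assumptions for illustration, not part of any formal method) uses the qualitative dispersion only to order the search:

```python
# A toy sketch of dispersion-guided search for the dead-pixel example:
# the qualitative hint "more likely near the border" is used only to
# order the search; no probabilities are assigned.
WIDTH, HEIGHT = 20, 20  # assumed grid of pixels

def border_first_order(w, h):
    # Rank cells by their distance from the nearest border (border first).
    cells = [(x, y) for x in range(w) for y in range(h)]
    return sorted(cells,
                  key=lambda c: min(c[0], c[1], w - 1 - c[0], h - 1 - c[1]))

def find_pixel(is_dead, order):
    # Scan cells in the given order and return the first dead one.
    return next((c for c in order if is_dead(c)), None)

# A hypothetical dead pixel near the border is found early in the scan.
dead = (0, 13)
found = find_pixel(lambda c: c == dead, border_first_order(WIDTH, HEIGHT))
assert found == dead
```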
2.6 Formulations Involving Severe Uncertainty
It is important to reiterate that under severe uncertainty, the uncertainty cannot be quantified in
any way in the problem formulation. The concept of having severe uncertainty with regards to
the unknown eventuating state is a characteristic of the problem, and hence must be acknowledged in the formulation. The ascription of explicit probabilities is not a part of the formulation
under such uncertainty, but it can be part of the analysis and solution.
Following the formulation of a problem, in order to solve it we might make assumptions about
the uncertainty and how to quantify it, and this may entail attribution of probabilities to uncertain parameters. It is in the solving of the problem that risk and uncertainty become parallel
[Bustamante-Cedeno and Arora, 2008]. Solutions obtained following this process are directly
related to whatever probabilistic structure is imposed on the problem in order to solve it.
Presented below is a decision table, Table 2.4, identical to those shown in Sections 2.2, 2.3, and 2.4. Suppose that we simply do not know which state will eventuate. This is a problem under severe uncertainty.

        s1    s2
d1      3     7
d2      8     2

Table 2.4: Decision Table.

We could solve it using the classical Maximin criterion, as demonstrated in Example 2.3, but perhaps we are not such pessimistic players. In order to solve the
problem, we need to ascribe some probability distribution over the state space. If we wish to
be conservative, we may apply Laplace’s principle of insufficient reason, as shown below in
Table 2.5. Laplace’s principle of insufficient reason, discussed further in Chapter 4.6, places a
uniform distribution over the entire state space (it is noted that we cannot have a proper uniform
distribution over some state spaces, for example, over R). We describe this as ‘conservative’, as
we are not putting any more proverbial eggs in one basket compared to another. Even though
we have applied a probability distribution over the state space, this was done after problem formulation; the problem began as one under uncertainty, and has been transformed into one under risk in order to be solved.
        s1    s2    Expected Value
d1      3     7     5
d2      8     2     5

Table 2.5: Decision-Making Using Laplace's Principle of Insufficient Reason.
For this example, under Laplace’s principle of insufficient reason, either decision can be chosen
as they both have identical expected values.
2.7 Variability and Uncertainty
It is important to distinguish between variability and uncertainty. Hayes et al. [2007] describe
the main point of difference between these types of uncertainty in terms of the effects of additional data collection; variability cannot be directly reduced with extra accumulation of data,
whereas uncertainty can be reduced. Along similar lines, Anderson [1999] describes variability
as an objective property of a population, whilst describing uncertainty as an ignorance, or lack
of complete knowledge. Anderson also differentiates between the two concepts analogously to
the description given by Hayes et al. [2007] with regard to the effects of obtaining additional
data. Cullen and Frey distinguish between them by noting that variability is an attribute of the
problem, whilst uncertainty is a property of the decision maker [Cullen and Frey, 1999]. Aside
from these descriptions of the main difference between variability and uncertainty, Webster and
Mackay [2003] define uncertainty as a measure of knowledge of the magnitude of a parameter,
whilst variability is a measure of the diversity in content of a parameter, which can be quantified
with a distribution.
There exist examples whereby no distinction can be drawn between variability and severe uncertainty within a mathematical model. However, caution must be exercised when interpreting
results of the model, as the interpretation may be very different depending on whether the
problem exhibits one or the other.
The following is an example of a situation in which we cannot distinguish between variability
and uncertainty within a model:
Suppose that we have a piece of mechanical equipment which we are installing into a laboratory.
We know that it only works in certain temperature conditions, and we need to ensure that it will
work in our laboratory. We create two possible scenarios:
Under variability, the temperature changes over the time period, ranging from a minimum of
five degrees Celsius to a maximum of thirty-five degrees Celsius. This change can be measured
and recorded, but it varies. Under uncertainty, the temperature is constant but unknown, and
the possibilities range, again, from five degrees Celsius to thirty-five degrees Celsius.
Chapter 3
Satisficing and Optimising
3.1 Overview
The debate between benefits of satisficing versus optimising has been prominent within the
realms of decision-making analysis since the mid-twentieth century.
Traditionally, optimality conditions were built upon the notion of Homo economicus, or the economic man; in other words, one seeks to maximise profits, or minimise costs. Optimisation is
usually associated with choosing the best alternative, that which gives the greatest return when
maximising, out of a selection of alternatives. An optimal decision is determined by its impact
on the objective function.
Example 3.1. Suppose we have the following linear programming problem:

z∗ = max_{x ≥ 0} 4x1 + x2
s.t. x1 + x2 ≤ 2
     3x1 + 2x2 ≤ 5.

In order to optimise, we want to choose (x1, x2) so as to maximise the objective function, z. Optimising gives the solution (x1∗, x2∗) = (5/3, 0), with z∗ = 20/3.
The concept of satisficing, introduced by Simon in the late 1950s, challenged the idea of optimising. Simon considered the aim of decision-making to be the discovery of ’courses of action that
satisfy a whole set of constraints’ [Simon, 1964, pg.277], and introduced the concept of ‘the administrative man’, coined for those who satisfice ‘as they have not the wits to maximise’ [1964,
pg. xxviii]. The method of satisficing selects an alternative which may or may not give the best
return, but satisfies all constraints. This is also known as ‘bounded rationality’. Emphasis here
is on the fulfillment of constraints – to quote Simon [1964, pg.6], ‘If you allow me to determine
the constraints, I don’t care who selects the optimization criterion’.
Example 3.2. Suppose we have the identical linear programming problem as seen in Example 3.1:

z∗ = max_{x ≥ 0} 4x1 + x2
s.t. x1 + x2 ≤ 2
     3x1 + 2x2 ≤ 5.

An example of a solution which satisfices is taking (x1, x2) = (0, 1), with z = 1, as all the constraints are obeyed.
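For completeness, the programme can be solved numerically; the following is a minimal sketch assuming SciPy is available (scipy.optimize.linprog minimises, so the objective is negated):

```python
# Solving the linear programme of Examples 3.1 and 3.2 with SciPy.
from scipy.optimize import linprog

res = linprog(c=[-4, -1],                     # maximise 4*x1 + x2
              A_ub=[[1, 1], [3, 2]],          # x1 + x2 <= 2, 3*x1 + 2*x2 <= 5
              b_ub=[2, 5],
              bounds=[(0, None), (0, None)])  # x >= 0

x_opt, z_opt = res.x, -res.fun                # optimising: x = (5/3, 0), z* = 20/3

# Satisficing, by contrast, only requires feasibility: (0, 1) obeys both
# constraints, with z = 1.
```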
3.2 Satisficing Versus Optimising: A Matter of Choice
The issue of whether to optimise or satisfice has been debated since the late 1950s, when Simon
argued that humans satisfice rather than optimise, as they have not the means to do otherwise.
The debates that followed the publication of Simon’s ‘Administrative Behaviour’ [1964] had
people arguing over whether satisficing was ‘better’ than optimising.
Proponents of satisficing argue that humans’ abilities (or lack thereof) to process all the information required to optimise, combined with the costs of optimisation, are reasons for us to
satisfice.
Rubinstein [1998] responded to Simon’s ideas of satisficing by asserting that ‘it is difficult to
pinpoint any economic work not based on fully rational microeconomic behaviour that yields
results as rich, deep, and interesting as those achieved by standard models assuming full rationality’, thus justifying the traditional restriction of the concept of optimality to maximisation
(or minimisation) problems within the operations research literature.
Such arguments supporting one technique over the other are missing the point. Rather than
optimising being ‘better’ than satisficing, or vice versa, we must emphasise the importance of
choice between these models in terms of their suitability for a given problem. To demonstrate,
we present the following theorem [Sniedovich, 2008b]:
Theorem 3.3. Any satisficing problem can be transformed into an optimisation problem such that any
feasible solution to the satisficing problem is optimal with respect to the optimisation problem, and vice
versa.
Proof. Consider the following generic satisficing problem: Find d such that d satisfies all given constraints for a problem. We can reformulate this as the following optimisation problem:

z∗ = max_{d ∈ D} f(d),

where

f(d) = 1, if all constraints hold for d,
f(d) = 0, otherwise.

If f(d∗) is maximal, then all constraints hold for d∗, hence d∗ satisfices. If all constraints hold for d∗, then f(d∗) = 1, therefore f(d∗) is maximal, hence d∗ optimises.¹
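A minimal Python sketch of this reformulation (the decision space and constraints below are toy assumptions, reusing the constraints of Example 3.1 over a small integer grid):

```python
# Theorem 3.3 as code: a satisficing problem becomes an optimisation
# problem whose maximisers are exactly the feasible decisions.
D = [(x1, x2) for x1 in range(3) for x2 in range(3)]  # toy decision space

def constraints_hold(d):
    x1, x2 = d
    return x1 + x2 <= 2 and 3 * x1 + 2 * x2 <= 5

def f(d):
    # Indicator objective: 1 on feasible decisions, 0 elsewhere.
    return 1 if constraints_hold(d) else 0

d_star = max(D, key=f)        # any maximiser of f satisfices
assert constraints_hold(d_star)
```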
From Theorem 3.3, it can be seen that the argument that satisficing is better than optimising, or
vice versa, becomes redundant, hence general statements similar to ‘good is preferable to best’
[Ben-Haim, 2007b] are simplistic.
If one does not have the means to optimise with respect to utility, then attempts to do as best as
one can within the constraints leads to satisficing. Bounded rationality should not be used as
justification of why one should satisfice rather than optimise.
“This paper has little to say to those who just want ‘good’ solutions in a context
where optimal is worse than ‘good’; after all, the starting point is an optimisation
model with an objective function chosen by the user. In other words, we assume
that even though ‘good’ may indeed be good enough, optimal would be better.”
[Wallace, 2000, pg.22]
If we can optimise with ease, then surely we should, but if the process is time-consuming, or
optimal results are not robust to changes in parameter values, then we should do as best we
can. It needs to be clarified that the importance lies in what it is we wish to satisfice or optimise
for a given problem.
“Unfortunately, the difference between ‘optimising’ and ‘satisficing’ is often referred
to as a difference in the quality of a certain choice . . . The best thing would therefore
be to avoid a general use of these two words.”
[Odhnoff, 1965, pg.39]
¹ It should be noted that finding a decision that satisfies all constraints is not the same as maximising some general objective function f(d). In our case, f(d) is maximal when all constraints are satisfied, due to our explicit definition of f(d). A function f(d) structured as above is called an indicator function, as it indicates whether or not a decision obeys all constraints.
Using the same example as above, we have the following linear programming problem:

z∗ = max_{x ≥ 0} 4x1 + x2
s.t. x1 + x2 ≤ 2
     3x1 + 2x2 ≤ 5.

Due to the small scale and simple nature of this problem, one would suggest optimising over satisficing, as the optimal solution, z∗ = 20/3, is considerably larger than the lowest possible satisficing solution value, z = 0, and an optimal solution can be found with ease and speed. Here, we aim to optimise the value which our performance measure, in this case the objective function, achieves.
However, if the problem had numerous constraints and variables, iteration through all solution
possibilities to find the optimal could be tedious and time-consuming. In such a case, we may
wish to choose a point which just fulfills all constraints.
To demonstrate with an example, we present the following scenario:
Suppose that we are at a supermarket, looking to buy a single carton of milk out of 200 cartons. We want to find the carton with the latest use-by date, and at worst, we want a carton
with at least three days until use-by. Now, suppose that the cartons are accurately lined up in
date order, from earliest use-by date to latest use-by date. The problem is then very simple. We
choose the carton at the very back of the line. This process is one of optimising with respect to
the objective function. If the carton at the back does not have at least three days until its use-by date,
then the problem has no feasible solution.
Now, suppose the cartons are not ordered, but rather, placed in a random mess. Trying to find
the carton which will last the longest may be a difficult feat, as not only do we need to find it,
we need to make sure it is the carton that will keep the longest. As it may take time in sorting
through all 200 cartons, we might decide that instead of finding the carton with the furthest
use-by date, we will settle for one which has at least three days remaining. Hence, we can pick
the first carton we see which complies with this standard. This is an example of satisficing.
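The contrast can be sketched in a few lines of Python (a toy sketch; the 200 cartons and their use-by dates are simulated assumptions):

```python
# The milk-carton example as code: optimising scans every carton for the
# latest use-by date, while satisficing stops at the first carton with at
# least three days remaining.
import random

cartons = [random.randint(0, 14) for _ in range(200)]  # days until use-by

best = max(cartons)  # optimising: must examine all 200 cartons

acceptable = next((c for c in cartons if c >= 3), None)  # satisficing:
# stop at the first carton meeting the threshold; None if none is feasible
```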
For the remainder of this thesis, we will use the terms 'optimising' and 'satisficing' to distinguish
between:
robust satisficing, where robustness is sought due to uncertainty within constraints of either
a satisficing or optimisation problem;
robust optimisation, where robustness is sought with regards to uncertainty in the objective
function of an optimisation problem; and
robust optimising and satisficing, where robustness is sought with respect to both the objective function and the constraints of an optimisation problem. This is identical to the robust
optimisation of a constrained problem; however, for clarity, we shall continue referring to it as robust optimising and satisficing.
We will elaborate on these terms in Chapter 4.
Robust satisficing is parallel to Mulvey et al.'s concept of model robust, while robust optimising is analogous to solution robust [Mulvey et al., 1995]. Despite the debate surrounding the use of the terms 'satisficing' and 'optimising', we continue to use these terms for the remainder of the thesis over those coined by Mulvey, as the latter are fairly unintuitive. A model of a problem includes both the objective function and constraints; however, the concept of model robustness
is robustness only involving constraints. Similarly, the solution to a problem is given with
respect to the decision variables of the problem, rather than the objective function itself, and
hence is not suitable in describing robustness with respect to the objective function only.
Chapter 4
Robustness
4.1 Overview
Robustness is defined in numerous ways within decision-making literature. Vlajic et al. [2008]
attempt to group multiple definitions of robustness into epitomising umbrella categories. In
their paper, they split the general concept of robustness into three branches: robustness as a
measure, robustness as a strategy, and robustness as a characteristic of the system.
Robustness as a measure refers to problems in which the robustness is defined using proportions comparing the number of well-performing states a decision gives over the total number
of possible states. For a decision di ∈ D, its robustness over a discrete state space is given by:
ri = n(Ŝ(di)) / n(Ŝ),

where n(Ŝ(di)) is the number of well-performing states that remain possible after decision di is made, and n(Ŝ) is the number of states in Ŝ ⊆ S that perform well. This is well-defined if the sets are finite; however, if there are infinitely many states, we can approximate the relative ratios to determine the robustness.
Using this definition, robustness measures the flexibility of a decision, and correlates with the
concept of robustness as introduced by Gupta and Rosenhead [1972], which, when maximised,
chooses the initial decision which will perform well not only in the final stage of a sequential
decision-making process, but also at each stage of progression.
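A minimal Python sketch of this measure (the sets of states below are toy assumptions):

```python
# The robustness measure r_i = n(S_hat(d_i)) / n(S_hat) for a finite
# state space, with toy data assumed for illustration.
S_hat = {"s1", "s2", "s3", "s4"}   # all states that perform well
S_hat_d = {"s1", "s3"}             # those still attainable under d_i

r_i = len(S_hat_d) / len(S_hat)    # 0.5
```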
We may have a lower bound on the acceptable level of robustness, ri , which determines whether
or not a decision’s performance is acceptable. Regardless, the decision chosen is usually that
with the highest robustness, given all other measures of performance are satisfied.
Robustness as a strategy was introduced in the 1980s, and termed Robust Design. Used heavily
in the field of engineering, it was introduced as a means to assist flexible design in engineering, based on the notion of minimising variance of control factor values and settings from ideal
values. In robust design, this robustness ensures that hard constraints, also known as ‘specifications’ in the literature, will not be violated.
As a strategy, robustness can also refer to the making of decisions or forming of policies which
perform well across a large number of possible, unknown future states. This is the branch of
robustness on which this thesis will focus, and is along the same lines as the robustness defined
by Lempert et al.’s Robust Decision Making (RDM) [2003], Ben-Tal and Nemirovski’s robust
counterpart approach [1998, 1999, 2000] and Bertsimas and Sim’s modification of this [2004],
alongside others. Yaman et al. [2007] note that while there are different definitions of robustness,
robust optimisation approaches generally reduce to a Maximin concept.
The last category that Vlajic et al. [2008] describe is that of robustness as a characteristic. Here
it is used to describe systems which are relatively unaffected by small perturbations. A system
which is adaptable to changes within its environment is robust, and highly desirable.
Generally speaking, robustness is a desirable property of a system when the system remains
stable, or at a satisfactory level of performance, despite perturbations or changes in its environment.
A decision can be classified as robust if its outcomes are relatively unwavering with regard
to feasibility or optimality in spite of variations in data or parameters due to uncertainty. A
robust decision under conditions of uncertainty will perform relatively well across a wide range
of plausible circumstances in comparison to other decisions. However, a tradeoff often exists
between robustness and other measures of performance, which forces one to think about how
much one is willing to give up in order to ensure a certain level of robustness [Greenberg and
Morrison, 2007].
Robust decision-making attempts to generate a decision that will avoid unsatisfactory outcomes, regardless of the state of nature that eventuates. Below are classifications regarding
degrees of robustness. The classical and mathematical programming Maximin models have
been adapted from Sniedovich [2008a].
The mathematical programming model is important in conveying that while some formulations
may not explicitly have both ‘max’ and ‘min’ present within the objective function, their form
takes an inherent Maximin approach.
We present these classifications using Maximin formulations to emphasise an intrinsic and fundamental property of robust decision-making – the guarantee that a decision performs well
over an entire subset of states hinges upon the decision performing well in the worst performing state within that subset. In the classical Maximin formulation, this is given by the ‘min’
associated with the ‘choosing’ of the state, as follows:
z∗ = max_{d∈D} min_{s∈S(d)} f(d, s).
In the equivalent mathematical programming formulation, this worst-case analysis is represented by the ‘∀’ condition present in the constraints.
z∗ = max_{d∈D, v∈R} v
s.t.  v ≤ f(d, s),   ∀s ∈ S(d).
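To make the equivalence concrete, the following minimal sketch (our own illustration, with hypothetical payoff values) evaluates both forms over a small finite decision table; in the finite case, the largest feasible v for a given decision is exactly its worst-case return, so the two formulations coincide.

```python
# A minimal sketch (hypothetical payoffs) showing that the classical
# Maximin value max_d min_s f(d, s) coincides with the mathematical
# programming form: the largest v satisfying v <= f(d, s) for all s.
f = {  # f[d][s]: return of decision d in state s
    "d1": {"s1": 4.0, "s2": 7.0, "s3": 2.0},
    "d2": {"s1": 3.0, "s2": 3.5, "s3": 3.0},
}

# Classical form: the security level of each decision, then the best of them.
security = {d: min(returns.values()) for d, returns in f.items()}
z_classical = max(security.values())

# Mathematical programming form: for each d, the largest feasible v is
# exactly min_s f(d, s), since v <= f(d, s) must hold in every state.
z_mp = max(min(returns.values()) for returns in f.values())

assert z_classical == z_mp
print(security, z_classical)  # {'d1': 2.0, 'd2': 3.0} 3.0
```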
It will be shown that, despite traditionally being seen as an overly conservative criterion, Maximin is a useful modelling tool which is capable of reducing its conservatism in order to allow
for a more realistic analysis.
When discussing the Maximin criterion, it is important to note the difference between the worstcase scenario for satisficing, and the worst-case scenario for optimising. The worst case when
we are attempting to satisfice deals with a state, which when paired with our decision, does
not comply with our constraints. The worst case when we are looking to robust optimise is a
state where the value of our objective function is not within the given tolerance of our optimal
objective value for a locally or partially robust decision, and for a completely robust decision,
the worst case is that which gives us the highest guaranteed return. This is manifested in the
objective function, or indicator function, of the Maximin models. In Chapter 4.7, the concept of
‘worst case’ will be further analysed.
4.2 Complete Robustness
A decision is completely robust if it performs well across all possible states of nature, in accordance with the stated performance indicators stipulated by the problem. Methodologies guaranteeing complete robustness of solutions involve examination of worst-case scenarios in order
to eliminate the effects of uncertainty over the entire state space [Bertsimas and Sim, 2004].
An example where the occurrence of complete robustness is possible, and then one where it is
not possible, is presented below. Suppose we have the following uncertain linear programming
problem:
z∗ = max_{x∈R+} x1 + 2x2
s.t.  αx1 + x2 ≤ 4
      x1 + x2 ≤ 3
      x1 ≥ 1,
where α ∈ [1, 2]. Using a worst-case approach, Mother Nature, our adversary, will choose α to
be as large as possible to try to violate the constraint (this approach, known as Soyster’s inexact
linear programming approach [1973], will be discussed further in Sections 4.6, 4.7, 5.5). The
graph of the feasible region with α = 2 is shown in Figure 4.1:
Figure 4.1: An example of a problem solvable under complete robustness.
It can be seen from Figure 4.1 that, when α = 2, the uncertainty region (striped) is bounded
from below by the constraint αx1 + x2 ≤ 4, and bounded above by the constraint x1 + x2 ≤ 3.
The feasible region for this problem is given in grey. By decreasing the value of α from two
to one, the feasible region grows ‘into’ the uncertainty region, until the entirety of the feasible
region is bounded above by the constraint x1 + x2 ≤ 3.
As the uncertainty region intersects the feasible region for all values of α, it is possible for our
decision to be completely robust. The robust decision is to take (x1∗, x2∗) = (1, 2) with z∗ = 5.
This is the optimal decision for our worst-case scenario with α = 2. This means that our return
is guaranteed to be at least five. Coincidentally, (x1∗, x2∗) = (1, 2) with z∗ = 5 is also the optimal decision for all values in the range α ∈ [1, 2]; however, this will not always be the case.
If we change the range of α so that we have α ∈ [2, 16/3],

z∗ = max_{x∈R+} x1 + 2x2
s.t.  αx1 + x2 ≤ 4
      x1 + x2 ≤ 3
      x1 ≥ 1,

then there is no feasible solution in the worst-case scenario, α = 16/3.
This situation is depicted in Figure 4.2, where it can be easily seen that the feasible region is
empty. The area bound by the uncertain constraint at its worst case is shown in grey. As there is
no feasible solution in the worst case, we cannot choose a decision that guarantees a minimum
specific return over the uncertainty region α ∈ [2, 16/3].
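Both cases can be checked numerically. The sketch below solves the worst-case problem for each range of α using scipy.optimize.linprog (a tooling choice of ours, not part of the original analysis); it recovers the completely robust decision (1, 2) with z∗ = 5 for α ∈ [1, 2], and reports infeasibility for α ∈ [2, 16/3].

```python
# A sketch of the worst-case analysis for the two ranges of alpha,
# using scipy.optimize.linprog (assumed tooling).
from scipy.optimize import linprog

def solve_worst_case(alpha):
    # max x1 + 2*x2  ==  min -x1 - 2*x2, subject to:
    # alpha*x1 + x2 <= 4, x1 + x2 <= 3, x1 >= 1 (written -x1 <= -1), x >= 0.
    return linprog(c=[-1.0, -2.0],
                   A_ub=[[alpha, 1.0], [1.0, 1.0], [-1.0, 0.0]],
                   b_ub=[4.0, 3.0, -1.0],
                   bounds=[(0, None), (0, None)])

res = solve_worst_case(2.0)          # worst case of alpha in [1, 2]
print(res.x, -res.fun)               # approx [1. 2.] 5.0 -- completely robust

res = solve_worst_case(16.0 / 3.0)   # worst case of alpha in [2, 16/3]
print(res.status)                    # 2: the problem is infeasible
```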
Models of decision-making which aim to ensure complete robustness of a decision when possible, including Wald’s Maximin model [French, 1986], and Soyster’s set containment model for
inexact linear programming, are often thought to be too conservative [Soyster, 1973].
Wald’s Maximin, and Savage’s Minimax Regret, criteria, are examples from the field of classical
decision theory which can be implemented when attempting to find a completely robust solution. By choosing the largest return (or smallest regret in Savage’s case) from the worst-case in
each state, it is guaranteed that the true return will be at least as great as the chosen return (the
true regret will be at most as much as the chosen regret). A formulation of Savage’s Minimax
Regret will be presented in Chapter 4.6. Another methodology whose outcome is completely
robust is that of Soyster’s inexact linear programming, as it ascribes the worst value possible to
each unknown parameter [Kachani and Langella, 2005]. Soyster’s approach was implemented
on the above examples, where we set α = 2, and following that, α = 16/3, as these were the worst-case values in the range given for our uncertain constraint coefficient.
Figure 4.2: An example of a problem with no feasible solution under complete robustness.
In classical decision theory, severe uncertainty is wholly removed. Wald’s Maximin and Savage’s Minimax Regret assume worst case scenarios. Laplace’s Principle of Insufficient Reason
assumes equal probabilities of occurrences of states. The weighted average between worst-case and best-case outcomes with respect to an optimism-pessimism index can also be taken,
as developed by Hurwicz. As the uncertainty is removed, it should be noted that the level of
uncertainty does not have any effect on how one should approach complete robustness. This
transformation from a problem under uncertainty to one under certainty may force a strict conservatism which might only be necessary if the uncertainty, and the stakes, are high.
Using the notation presented in Chapter 1, a completely robust decision d should perform well
over S(d) ⊆ S, or in other words over all the states that might eventuate with decision d. For
complete robustness, the ’max’ represents the aim to maximise the value of the objective, z∗ ,
across all states. This may be with regards to the objective value of the given objective function, or we may be trying to maximise a {0, 1}-function which takes the value of one when the
constraints are satisfied, and zero when they are violated. The maximisation component under
complete robustness ensures that either all constraints are satisfied (complete robust satisficing),
the objective value is the highest possible guaranteed return (complete robust optimising), or both,
under all states.
Complete Robust Satisficing
Complete robust satisficing requires a solution to perform well over all possible future states.
Using the notation presented in Chapter 1, the formal problem statement under complete robustness is:
Choose a decision d ∈ D such that h(d, s) ∈ C, ∀s ∈ S(d) ⊆ S.
A generic formulation of a complete robust satisficing problem is:
z∗ = max_{d∈D} g(d)

where

g(d) = { 1, if h(d, s) ∈ C, ∀s ∈ S(d); 0, otherwise }.
If, for any state s ∈ S(d), h(d, s) ∉ C, then we have g(d) = 0. As we are satisficing, we want all
constraints to hold, whereas Mother Nature would like to violate the constraint h(d, s) ∈ C for
some s ∈ S(d) in order to minimise our objective function value. If our constraint holds over
the entire state space, then our objective function will take maximum value, g(d) = 1.
The classical Maximin formulation for complete robust satisficing can be written as shown below:
z∗ = max_{d∈D} min_{s∈S(d)} k(d, s)

where

k(d, s) = { 1, if h(d, s) ∈ C; 0, otherwise }.
Functions with similar {0, 1}-structure to k (d, s) will be referred to as indicator functions.
This can be reformulated as a Maximin mathematical programming formulation as such:
z∗ = max_{d∈D, v∈R} v
s.t.  v ≤ k(d, s),   ∀s ∈ S(d).
Here, we wish to maximise v ∈ R. This is bounded above by the worst case value for k(d, s).
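As a small illustration, the sketch below (hypothetical decisions, states and constraint set) evaluates this indicator-function Maximin directly: the objective equals one exactly when some decision satisfies the constraints in every one of its states.

```python
# A minimal sketch of complete robust satisficing over finite sets,
# with hypothetical decisions, states and constraint set C.
D = ["d1", "d2"]
S = {"d1": ["s1", "s2"], "d2": ["s1", "s2"]}
C = range(0, 10)                       # acceptable outcomes: 0..9
outcomes = {("d1", "s1"): 5, ("d1", "s2"): 12,   # hypothetical h(d, s)
            ("d2", "s1"): 7, ("d2", "s2"): 9}

def k(d, s):                           # the {0, 1} indicator function
    return 1 if outcomes[(d, s)] in C else 0

# z* = max_d min_s k(d, s): 1 iff some decision satisfies C in every state.
z = max(min(k(d, s) for s in S[d]) for d in D)
print(z)  # 1 -- d2 satisfies the constraints over its whole state space
```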
Complete Robust Optimising
Complete robust optimising requires the solution to be the maximal guaranteed element across
all possible states. The problem statement under complete robustness is as follows:
Choose a decision, d ∈ D, such that the value of our objective function, f (d, s), gives the
largest guaranteed return over all states.
and the classical Maximin formulation for complete robustness with objective function f is:
z∗ = max_{d∈D} min_{s∈S(d)} f(d, s).
However, it may be more convenient to reformulate this as a conventional optimisation model
by removing the internal optimisation to give the mathematical programming formulation:
z∗ = max_{d∈D, v∈R} v
s.t.  v ≤ f(d, s),   ∀s ∈ S(d).
Removal of the inner optimisation is valid as it is contained within the constraint. As the constraint holds for all states in the state space, the objective is bounded by the smallest return
possible for any given state.
Complete Robust Optimising and Satisficing
Complete robust optimising and satisficing requires the solution to be the maximal element
across all states, whilst obeying all constraints in all possible states. This combines the cases
above, and can be thought of as a robust optimisation problem with constraints. A mathematical
programming formulation combines the above elements to give:
z∗ = max_{d∈D, v∈R} v
s.t.  v ≤ f(d, s),
      h(d, s) ∈ C,   ∀s ∈ S(d).
The value of z∗ in this formulation will be f (d, s) if the constraint h(d, s) ∈ C holds over all
s ∈ S(d), and will be −∞ if the constraint is violated for at least one of the possible states in the
state space, as maximising over an empty set gives −∞.
A classical Maximin formulation is as follows:
z∗ = max_{d∈D} min_{s∈S(d)} v(d, s)

where

v(d, s) = { f(d, s), if h(d, s) ∈ C; −∞, otherwise },   d ∈ D, s ∈ S(d).
In the classical Maximin formulation, the value of z∗ again will be f (d, s) if the constraint
h(d, s) ∈ C holds over all s ∈ S(d), and will be −∞ if the constraint is violated for at least
one of the possible states in the state space, as given by v(d, s).
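A minimal sketch of the combined formulation, on hypothetical data, is given below: any decision that violates a constraint in some state receives the value −∞ and is thereby eliminated from the outer maximisation.

```python
# A sketch of complete robust optimising and satisficing on a finite,
# hypothetical example: v(d, s) = f(d, s) if h(d, s) is acceptable,
# and -infinity otherwise; then z* = max_d min_s v(d, s).
import math

f = {("d1", "s1"): 10.0, ("d1", "s2"): 6.0,
     ("d2", "s1"): 8.0,  ("d2", "s2"): 7.0}
feasible = {("d1", "s1"): True, ("d1", "s2"): False,   # is h(d, s) in C?
            ("d2", "s1"): True, ("d2", "s2"): True}

def v(d, s):
    return f[(d, s)] if feasible[(d, s)] else -math.inf

states = ["s1", "s2"]
z = max(min(v(d, s) for s in states) for d in ["d1", "d2"])
print(z)  # 7.0 -- d1 is ruled out (-inf in s2); d2 guarantees 7.0
```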
4.3 Partial Robustness
A relaxation of complete robustness, partial robustness requires the outcome of the decision to
perform well over a subset of the state space, Ŝ(d) ⊆ S(d) ⊆ S. For simplicity, we assume
that the state spaces are identical for all possible decisions, and hence the
size of the state space is not dependent on our decision. Under this assumption, the robustness
criterion which we deal with is absolute robustness [Kouvelis and Yu, 1997], where the worst case
performance for a decision, optimising or satisficing, indicates the robustness of that decision.
If this assumption does not hold, we can record the percentage deviation from the optimal
performance for a state, and take the worst percentage deviation to be the worst case for a
decision. We refer the reader to Kouvelis and Yu [1997] for further discussion on absolute and
relative robustness.
With this assumption in place, the larger this subset is, the greater the robustness, hence robustness may be defined in terms of the relative size of the largest subset over which a decision
performs well. The size of this subset may be determined by ‘counting’ the number of elements
within the set Ŝ(d), or by using some other function which weights elements of Ŝ(d) according
to their level of plausibility or some other factor.
If the space is continuous and not discrete, the volume of the domain may be calculated in the
Euclidean sense. However, for large scale problems, this may not be tractable and discretisation
of the state space may be necessary.
The formal problem statements for the corresponding robustness approaches are identical to
that for complete robustness. The only exception is that instead of requiring decision d to perform well over the entire state space, we now require d to perform well only over a subset
Ŝ(d) ⊂ S(d) ⊆ S such that Ŝ(d) is largest. If we attempt to find partial robustness for a decision
over a subset of the state space, Ŝ(d), and the decision holds for all s ∈ S(d), then the decision
is completely robust.
Taken from Sniedovich [2008a], consider the following example of a robust satisficing problem:
Find a decision, d ∈ D such that h(d, s) ∈ C, ∀s ∈ Ŝ, Ŝ ⊆ S(d).
As we do not require the constraints to hold over the entire state space, but rather a subset of
the state space, we have partial robustness. Under conditions of severe uncertainty, we may
want to maximise the size of this subset to ensure that our decision is robust over the largest set
of possible outcomes.
Let V (Ŝ(d)) denote the size of subset Ŝ(d) ⊂ S(d) ⊆ S. V is a measure on the state space such
that
V(Ŝ(d)) ≥ 0,   ∀Ŝ(d) ⊂ S(d) ⊆ S,

and

Ŝ(d) ⊂ Ŝ′(d)  →  V(Ŝ(d)) < V(Ŝ′(d)).
There may be constraints on which elements we can and cannot add to the set Ŝ(d). One way
to measure the robustness of a decision could be to discretise the state space. If the decision
performs sufficiently well in a state, we can ‘mark’ it. Additionally, we may want to constrain
elements which we include within the subset; for example, we may only wish to allow immediate neighbours of a point in the state space to be included in Ŝ(d). A visual example is presented
below in Figure 4.3:
Figure 4.3: An example of a discretised state space.
The black circles on the vertices of the grid represent the states in which the current decision
performs well, whereas the ‘empty’ vertices represent the states in which the current decision is
unsatisfactory. Note that the grid represents the state space and its outcomes with respect to one
decision. We may wish to only ‘count’ the states (circles) that are immediate neighbours for the
one decision, and choose the largest weighted subset of these to be the subset corresponding
to that decision. Hence, if we applied a uniform distribution over the discretised state space
shown in Figure 4.3, then the cluster of states in the lower left corner would be those which
determine the domain of that decision. The set of states chosen to represent our decision is
shown in Figure 4.4.
Figure 4.4: The chosen subset of states for the decision represented in Figure 4.3.
It is important to note that, due to the condition Ŝ(d) ⊂ Ŝ′(d) → V(Ŝ(d)) < V(Ŝ′(d)), we can compare decisions directly if one set of states is a subset of another, as seen in Figure 4.5 below. Obviously, weighting would not affect this, as any weighting applied on the state space would affect both the smaller subset Ŝ(d) and the larger subset Ŝ′(d).
In Figure 4.5, we present two possible performance grids for two decisions. Figure 4.5(b) depicts a subset of Figure 4.5(a), hence V(Ŝ(d)) < V(Ŝ′(d)). Thus we can state that decision d′ is more robust.
However, if a subset of the state space for one decision is not contained within the state space of
another decision, then these are not comparable, as weightings or other factors may affect their
standings. An example of a case where performance grids for two different decisions are not
comparable is shown in Figure 4.6.
In Figure 4.6, if we have no preference as to which states eventuate, then we will choose decision d′ purely on the issue of size. If the states in which the decision performs well in Figure 4.6(b) are weighted more heavily than those shown in Figure 4.6(a), then decision d will be more robust. Thus we cannot compare these decisions solely on the number of states which each set contains.
Figure 4.5: A visual comparison of two decisions. (a) Ŝ′(d′) for decision d′; (b) Ŝ(d) for decision d.

Figure 4.6: Two decisions that are not comparable. (a) Ŝ′(d′) for decision d′; (b) Ŝ(d) for decision d.
Partial robustness also differs from complete robustness with regard to what is represented by
the ’max’ within the Maximin formulations. Unlike complete robust optimising, where this
maximisation relates directly to the value of the objective function, for partial robustness we are
trying to maximise the size of the subset of states over which either all constraints are satisfied
(partial robust satisficing), the objective value is maximal or close to maximal (partial robust
optimising), or both.
Methods which use partial robustness include Starr’s domain criterion [1966]. This method
generates ‘domains’ (polytopes) for each decision, with each domain traditionally containing
points for which that decision is optimal. The size of each domain is proportional to its corresponding decision's likelihood, hence by maximising the size of the domain, it is ensured that
the chosen decision is optimal for the greatest number of (or most highly weighted) plausible
future states. While Starr’s initial formulation of the domain criterion focussed on optimisation problems, we could also use Starr’s domain criterion in a satisficing context, where each
domain contains points for which that decision is feasible.
It should be noted that if the largest subset of the state space chosen using partial robustness
techniques represents only a small fraction of the state space, then we cannot say that the decision under consideration is robust under severe uncertainty unless this fraction is heavily
weighted. The size of the subset, or its weight, is proportional to the intensity of the uncertainty
for which robustness will hold.
Below are examples of possible models which demonstrate the concepts of partial robust satisficing, partial robust optimising, and partial robust optimising & satisficing.
Partial Robust Satisficing
A partially robust satisficing decision will choose an element which satisfies all constraints over
the subset, Ŝ, of largest size. A classical Maximin formulation of this is given by:
max_{d∈D, Ŝ⊆S(d)} min_{s∈Ŝ} V(Ŝ) · (h(d, s) ⊙ C)

where ⊙ is a binary operator defined as:

(a ⊙ C) = { 1, if a ∈ C; 0, otherwise },

where a ∈ C′ and C ⊆ C′.
The ‘min’ here attempts to choose a state such that the constraint h(d, s) ∈ C is violated. If the
constraint is violated, then that state will not be included in the subset Ŝ.
We can reformulate this as a mathematical programming problem by removing the inner
optimisation:
max_{d∈D, Ŝ⊆S(d)} V(Ŝ)
s.t.  h(d, s) ∈ C,   ∀s ∈ Ŝ.
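A toy sketch of this model on a discretised state space (hypothetical data) takes V to be a simple counting measure and selects the decision whose set of satisfied states is largest.

```python
# A sketch of partial robust satisficing over a discretised state space:
# V counts the states where the constraint holds, and we pick the
# decision whose satisfied subset is largest. All data are hypothetical.
satisfied = {  # satisfied[d]: states s with h(d, s) in C
    "d1": {(0, 0), (0, 1), (1, 0), (1, 1)},
    "d2": {(3, 3), (3, 4)},
}

def V(s_hat):  # a counting measure; plausibility weights could be used instead
    return len(s_hat)

best = max(satisfied, key=lambda d: V(satisfied[d]))
print(best, V(satisfied[best]))  # d1 4
```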
Partial Robust Optimising
Defining the ‘size’ of the set as for partial robust satisficing, attempts to robust optimise over a
subset of the state space will recommend a decision whose return is the maximal element in all
states s ∈ Ŝ, where Ŝ is chosen to be of largest feasible size.
A generic partial robust optimising model can be presented as follows:
max_{d∈D, Ŝ⊆S(d)} V(Ŝ)
s.t.  f(d, s) ≥ f∗(s) − e,   ∀s ∈ Ŝ,

where e ≥ 0 and

f∗(s) = max_{d∈D} f(d, s),   s ∈ S(d).
Note here that, as mentioned earlier, we aim to maximise the size of the subset of the state
space in which our decision d ∈ D gives maximal return. Here, our objective function is in
the constraints, and we consider a decision to produce an optimal return if it is within some
tolerance, e, of the true maximal return, f ∗ (s), where f ∗ (s) is determined under certainty for a
given state.
Under partial robust optimising, the idea of robustness with respect to the objective function
differs from that of complete robustness. Under complete robustness, our objective function
value is the highest guaranteed return across all states. When we attempt partial robust optimisation, we hope our decision will achieve a value close to (within e of) the maximal return for
each and every state, s ∈ Ŝ (this discussion is also applicable to local robustness, described in Section 4.4).
This generic model is a mathematical programming Maximin formulation, as Mother Nature
will attempt to make f ∗ (s) as large as possible in order to reduce the size of V (Ŝ). The e represents a relaxation whereby the value of the objective function no longer must be maximal, but
can be within some e ≥ 0 threshold of the maximal value. Below, we present a small numerical
example demonstrating this.
Example 4.1. Suppose we have the following decision table, Table 4.1, and we want to find a decision that is partially robust with e = 0.

        s1   s2   s3
d1      10    1   10
d2       2    3    2

Table 4.1: Decision Matrix.

Here, we have f∗(s1) = 10, f∗(s2) = 3, and f∗(s3) = 10. Using a complete robust optimisation approach, we would choose decision two, as it provides the greatest guaranteed return of two. However, we may miss out on a large return, as decision two is quite conservative. Using a partial robustness approach, we aim to maximise the number of states over which our decision is maximal. This would choose decision one, as it is maximal over two of the three states, namely s1 and s3. However, if we changed e to 2, then while decision one would still be optimal, it would hold over all three of the states.

It can be seen from Example 4.1 that a decision found using complete robust optimisation may differ from a decision that is completely robust (robust over the entire state space) which was found using a partial robust optimisation approach.
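The sketch below works Example 4.1 directly from Table 4.1, comparing the complete robust (Maximin) choice with the partial robust choices for tolerances e = 0 and e = 2.

```python
# Example 4.1 worked in code: complete robust optimising (Maximin)
# versus partial robust optimising with tolerance e, on Table 4.1.
returns = {"d1": {"s1": 10, "s2": 1, "s3": 10},
           "d2": {"s1": 2,  "s2": 3, "s3": 2}}
states = ["s1", "s2", "s3"]
f_star = {s: max(returns[d][s] for d in returns) for s in states}

# Complete robust optimising: the best guaranteed return.
maximin = max(returns, key=lambda d: min(returns[d].values()))
print(maximin)  # d2 -- guarantees a return of 2

def partial(e):
    # Count the states where each decision is within e of optimal.
    count = {d: sum(returns[d][s] >= f_star[s] - e for s in states)
             for d in returns}
    return max(count, key=count.get), count

print(partial(0))  # ('d1', {'d1': 2, 'd2': 1}) -- d1 optimal in s1 and s3
print(partial(2))  # ('d1', {'d1': 3, 'd2': 1}) -- d1 near-optimal everywhere
```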
A classical Maximin formulation is:
max_{d∈D, Ŝ⊆S(d)} min_{s∈Ŝ} v(d, Ŝ, s)

where

v(d, Ŝ, s) = { V(Ŝ), if f(d, s) ≥ f∗(s) − e; −∞, otherwise },   d ∈ D, Ŝ ⊂ S, s ∈ Ŝ.
The mathematical programming problem, with the inner optimisation removed, is identical to
the generic partial robust optimising model presented above.
Partial Robust Optimising and Satisficing
Combining the above cases for robust optimising and robust satisficing, we can present a generic
robust optimising and satisficing model:
max_{d∈D, Ŝ⊆S(d)} V(Ŝ)
s.t.  f(d, s) ≥ f∗(s) − e,
      h(d, s) ∈ C,   ∀s ∈ Ŝ.
Again, this generic optimising and satisficing model is a mathematical programming Maximin
formulation, as Mother Nature will attempt to make f ∗ (s) as large as possible in order to reduce
the size of V (Ŝ). Mother Nature will also try to choose s such that the satisficing constraints are
violated. A classical Maximin formulation is as follows:
max_{d∈D, Ŝ⊆S(d)} min_{s∈Ŝ} v(d, Ŝ, s)

where

v(d, Ŝ, s) = { V(Ŝ), if f(d, s) ≥ f∗(s) − e and h(d, s) ∈ C; −∞, otherwise },   d ∈ D, Ŝ ⊂ S, s ∈ Ŝ.

4.4 Local Robustness
Local robustness analysis is conducted in the immediate neighbourhood of a point estimate of
the parameter of interest. The idea of local robustness is similar to that of sensitivity or parametric analysis, which is conducted to see how much a solution can change before it becomes
suboptimal or infeasible. Local robustness can also be seen as a localised partial robustness
analysis around an estimate.
The formal problem statements for the corresponding robustness approaches are similar to that
for complete and partial robustness. However, under local robustness, we require decision d to
perform well over a subset S(d, α) ⊂ S(d) ⊆ S such that S(d, α) is largest, where S(d, α) is a
neighbourhood of ‘size’ α of the given estimate, s̃, as seen below in Figure 4.7.
The robustness of a decision d is determined by finding the largest value of α such that d performs well over S(d, α). The question now becomes: how large can we make α such that the
output of d still performs satisfactorily? The larger this value, the greater the robustness of d.
For simplicity and visual clarity, we assume that α represents the radius of a circle centred at
our estimate. This does not have to be the case, however; α can represent some measure of the
subset of the state space, which may take any shape we choose. It is assumed that this subset,
S(d, α), is non-decreasing with α; in other words, for each d ∈ D and α ≥ 0,
S(d, α) ⊆ S(d, α + e),   ∀e ≥ 0.
This nesting property characterises local robustness. Robustness analysis is conducted around
the singleton S(d, 0) = {s̃}, where s̃ represents the parameter estimate.
However, under severe uncertainty it is unlikely that the estimate is accurate, hence a local robustness analysis around this estimate is inappropriate. Wallace [2000] critiques the concept
of sensitivity or parametric analysis, which is closely related to local robustness analysis, by
demonstrating that under uncertainty the assumption ‘that the optimal solution inherits properties from candidate solutions produced by parametric linear programming’, or geometrically, ‘that the optimal solution is contained in the space spanned by the deterministic solutions’, is false. Keeping this in mind, there is no reason that such an analysis should be performed on a poor estimate of the true value.

Figure 4.7: An example of a local analysis.
If our estimate, s̃, is believed to be a good estimate, then, as discussed in Chapter 2.5, we have
conditions of mild uncertainty. If the uncertainty is mild, then a local analysis can be performed
in order to gauge how far away we can move from our estimate before our decision becomes
suboptimal or infeasible. An example is shown below in Figure 4.8, where our estimate, s̃, is
given by the black circle, and our unknown true value is given by the black square.
Under severe uncertainty, it can be assumed that the quality of the estimate is poor, hence the
true value of the parameter may be quite far from that of the estimate, as seen in Figure 4.9,
where, again, our estimate is given by the black circle, and our unknown true value is given by
the black square.
It can be seen from Figure 4.9 that the true unknown value of our parameter may be far away
from our estimate, hence local analysis conducted solely around an estimate does not convey
the robustness of a decision under such conditions. In Chapter 2.5 we discussed how we may
characterise uncertainty using the quality of the estimate. If it is of poor quality, or a wild guess,
then uncertainty is severe.
Figure 4.8: An example of a local analysis under mild uncertainty.
Figure 4.9: An example of a local analysis under severe uncertainty.
If we have reason to believe that conducting sensitivity or parametric analyses around an optimal deterministic solution under uncertainty is of little use [Wallace, 2000], then conducting a
sensitivity or parametric analysis around a solution using poor estimates of parameters seems
equally inappropriate. However, we should note that within much of the robust optimisation
literature, estimates are not used under the conditions of severe uncertainty, due to their poor
quality.
Local robustness is almost identical to partial robustness with regard to what the ’max’ within
the Maximin formulations represents. We attempt to maximise the size of the subset of states
over which either all constraints are satisfied (local robust satisficing), the objective value is
maximal or close to maximal (local robust optimising), or both. The difference is that we
maximise the size of this subset around an estimate.
Local Robust Satisficing
Local robust satisficing will choose the decision which gives the largest value of α, or in other
words, chooses the decision which gives the largest sized subset of the state space, such that
all the constraints are obeyed on this subset. The mathematical programming formulation for a
local robust satisficing problem is as follows:
max_{d∈D, α≥0} α
s.t.  h(d, s) ∈ C,   ∀s ∈ Ŝ(d, α),
where Ŝ(d, α) is a subset of ‘size’ α of the state space associated with decision d.
A classical Maximin formulation is:
max_{d∈D, α≥0} min_{s∈Ŝ(d,α)} g(d, s)

where

g(d, s) = { α, if h(d, s) ∈ C; −∞, otherwise },   d ∈ D, s ∈ Ŝ(d, α).
In Ben-Haim’s Info-Gap [2006] approach, discussed further in Chapter 4.6, worst-case analysis
is performed around an estimate of the uncertain parameters, hence the analysis is locally robust. Info-Gap attempts to maximise robustness with respect to a function meeting its critical
threshold value. This is an example of local robust satisficing. However, as discussed above,
local robustness is unsuitable under conditions of severe uncertainty, as the severity of the uncertainty is defined by the quality of the estimate of the uncertain parameters. This means that
if uncertainty is mild, it can be assumed that the estimate is of high accuracy, or close to the true
value of the parameter. Under such conditions, a local worst-case analysis may be of use.
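As an illustration of the mechanics, the sketch below (with a hypothetical constraint and estimate) grows the radius α of a neighbourhood around the estimate s̃ over a discretised boundary until some state violates the constraint.

```python
# A sketch of local robust satisficing on a 2-D state space: grow the
# radius alpha around the estimate s_tilde until some sampled state on
# the boundary violates the constraint. All data are hypothetical.
import math

s_tilde = (0.0, 0.0)                     # the (possibly poor) estimate

def constraint_ok(s):                    # hypothetical test for h(d, s) in C
    return s[0] + s[1] <= 3.0

def alpha_hat(step=0.05, alpha_max=10.0, n_dirs=64):
    alpha = 0.0
    while alpha < alpha_max:
        a = alpha + step
        # Sample the boundary of the candidate neighbourhood of radius a.
        for k in range(n_dirs):
            theta = 2 * math.pi * k / n_dirs
            s = (s_tilde[0] + a * math.cos(theta),
                 s_tilde[1] + a * math.sin(theta))
            if not constraint_ok(s):
                return alpha             # largest radius still satisfied
        alpha = a
    return alpha

print(round(alpha_hat(), 2))  # ~2.1: distance from (0, 0) to x + y = 3 is 3/sqrt(2)
```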
Local Robust Optimising
Local robust optimising will choose the decision which gives the largest value of α, or, in other
words, will choose the decision which gives the largest sized subset of the state space in the
neighbourhood of the estimate, such that the outcome of the decision is maximal across all
s ∈ Ŝ(d, α). The mathematical programming formulation for a local robust maximising problem
is as follows:
max_{d∈D, α≥0} α
s.t.  f(d, s) ≥ f∗(s) − e,   ∀s ∈ Ŝ(d, α),
where Ŝ(d, α) is a subset of ‘size’ α of the state space associated with decision d.
A classical Maximin formulation is as follows:
max_{d∈D, α≥0} min_{s∈Ŝ(d,α)} g(d, s)

where

g(d, s) = { α, if f(d, s) ≥ f∗(s) − e; −∞, otherwise },   d ∈ D, s ∈ Ŝ(d, α).
Local Robust Optimising and Satisficing
Combining the above, we get the following mathematical programming formulation for local
robust optimising and satisficing:
max_{d∈D, α≥0} α
s.t.  f(d, s) ≥ f∗(s) − e,
      h(d, s) ∈ C,   ∀s ∈ Ŝ(d, α).

A classical Maximin formulation is as follows:

max_{d∈D, α≥0} min_{s∈Ŝ(d,α)} g(d, s)
where

g(d, s) = { α, if f(d, s) ≥ f∗(s) − e and h(d, s) ∈ C; −∞, otherwise },   d ∈ D, s ∈ Ŝ(d, α).

4.5 Partial Versus Local Robustness
We have previously mentioned that local robustness is similar to partial robustness, but centred
around an estimate. In Figure 4.10, we depict the image of two performance grids, each with the
same estimate (the black circle), and its horizon of uncertainty (represented by black rings). The
uncertainty regions are identical. The estimates and horizons of uncertainty for each decision
are identical, hence local analysis would not recommend one decision over another solely based
on robustness.
However, it can be seen that under partial robustness, shown by the grey areas, the decision
shown in Figure 4.10(b) performs well over a much larger area than the decision shown in
Figure 4.10(a).
Figure 4.10: Example of identical local analysis results with different partial results. (a) Poorer partial performance with identical local uncertainty regions; (b) greater partial performance with identical local uncertainty regions.
Due to the fact that local robustness analysis is based around an estimate, this dependence
may lead to mixed results when the estimate is relocated. Given an estimate of good quality,
we may only be interested in the immediate neighbourhood around that estimate. However,
under severe uncertainty, we should assume that the estimate is poor, and hence local analysis
is unreliable.
If our estimates are relocated towards the centre of the uncertainty space, it can be seen that there is no acceptable solution for the decision given in Figure 4.11(a); however, the region of the uncertainty space over which the decision shown in Figure 4.11(b) performs well increases dramatically.
Figure 4.11: Example of the dependence on the estimate location. (a) Poorer partial and local performance; (b) greater partial and local performance.
This demonstrates the high level of dependence that local analysis places on the value of the
estimate. However, as the estimate is poor under conditions of severe uncertainty, a solution
contingent on the location of this estimate may not reflect a decision that is truly robust.
There do exist some examples whereby local analysis and partial analysis produce different
sized uncertainty sets, but the critical values are identical.
Here, we show that local analysis produces the correct solution in cases where we have a monotonic function, either increasing or decreasing, and we need to maximise or minimise this respectively, subject to some critical value, where the uncertainty is one-dimensional. In this case,
our local analysis becomes a partial analysis, as there is no solution dependence on the estimate.
Suppose we have a monotonic decreasing function, h(d, s), which gives a scalar for each combination (d, s) (an example with a monotonic increasing function is given in Chapter 6.2). Given the satisficing problem where we have a constraint h(d, s) ≥ C, with h(d, s), C ∈ R, we are trying to maximise the region of s for which our decision will hold. This is depicted in Figure 4.12.
We shall begin by examining a partial robustness approach. A partial robust satisficing approach chooses the largest subset of the state space in which our constraint, h(d, s) ≥ C, is held.
Figure 4.13 demonstrates the values of s in which our decision is robust, shown in the diagram
using arrows. As the function is monotonically decreasing, there will be no values s ∈ (−∞, sc ]
where our constraint is violated, and our maximum value for s occurs when our constraint inequality becomes an equality. It can be seen that our critical s value, sc , is constant with given
h(d, s) and C.
This divides our uncertainty region, which is one dimensional in s, namely s ∈ R, into two
regions – one ‘good’, and one ‘bad’, as shown in Figure 4.14, hence, using a partial robustness
approach, the uncertainty region in which our decision is robust is s ∈ (−∞, sc ], where sc is
fixed. Hence, to give robustness over a maximal number of states, our critical value is sc, where h(d, sc) = C.

Figure 4.12: Monotonic decreasing function, h(d, s).

Figure 4.13: Partial robustness region for h(d, s) ≥ C.
Now we examine what happens when applying a local robustness approach. Using a local
robustness approach, we take an estimate, s̃, and gauge how far we can move away from it
before we have h(d, s) < C. This distance between s̃ and the first state we reach such that
h(d, s) < C is called the horizon of uncertainty, α, and is given by the following:

max_{α≥0} α
s.t.  min_{s∈Ŝ(α,s̃)} h(d, s) ≥ C.

Figure 4.14: Division of uncertainty region.
As we have the monotonic decreasing graph shown in Figure 4.12, we know that our fixed
value sc occurs when h(d, sc ) = C. This again splits our uncertainty region, s ∈ R, into two.
Figure 4.15 shows our divided uncertainty region, complete with an estimate, s̃.
Figure 4.15: Division of uncertainty region.
It can be seen in Figure 4.15 that as our estimate s̃ changes, our horizon of uncertainty α changes accordingly to reflect the distance between s̃ and sc; however, the critical value sc does not change. Consequently, although our horizon of uncertainty changes, the critical value found by our local analysis is independent of our estimate s̃. As the function h(d, s) decreases monotonically, any estimate s̃ < sc will approach sc from the left, where h(d, sc) = C.
Hence, it can be seen that both partial robust and local robust satisficing approaches, under
these conditions, will always give identical critical values, independent of our estimate.
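This independence is easy to check numerically. In the sketch below (hypothetical h and C), the critical value sc is found by bisection; the horizon of uncertainty α = sc − s̃ varies with the estimate, but sc itself does not.

```python
# A sketch for the one-dimensional monotone decreasing case: the horizon
# of uncertainty is the distance from the estimate to the fixed critical
# value s_c solving h(d, s_c) = C. h and C are hypothetical placeholders.
def h(s):                                # monotonically decreasing in s
    return 10.0 - 2.0 * s

C = 4.0

def critical_value(lo=-100.0, hi=100.0, tol=1e-9):
    # Bisect for s_c with h(s_c) = C (h decreasing: h(lo) > C > h(hi)).
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if h(mid) >= C:
            lo = mid
        else:
            hi = mid
    return lo

s_c = critical_value()
for s_tilde in (-1.0, 0.0, 2.0):         # any estimate to the left of s_c
    print(round(s_c - s_tilde, 6))       # the horizon alpha changes ...
print(round(s_c, 6))                     # ... but s_c = 3.0 does not
```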
We will investigate a case study [Moffitt et al., 2005] whereby this property holds for a monotonic increasing function in Chapter 6.2.
We briefly introduce a counterexample where the uncertainty lies in two dimensions. Suppose
we have a satisficing problem where we have the constraint h(d, s) ≥ C, with h(d, s), C ∈ R,
and we are trying to maximise the region of s for which our decision will hold. Our constraint
splits the uncertainty region into two: a ‘good’ region and a ‘bad’ region.
In this case, a local robustness approach is dependent on the estimate. Figure 4.16 demonstrates
this dependence. Figure 4.16 depicts the uncertainty region, with two different estimates and
their horizons of uncertainty. These horizons of uncertainty are identical for both the grey estimate and the black estimate.
Figure 4.16: Division of uncertainty region.
It can be seen that, although the horizons of robustness are the same for each estimate, due to
the different locations of the estimate, we have achieved different critical values, namely the
grey estimate finds critical value sc1 , while the black estimate finds the critical value sc2 .
4.6 Robust Optimisation
Overview
The area of research concerning robust decision-making under severe uncertainty is a burgeoning field, with many methodologies being put into practice to deal with problems involving incomplete information.
Whilst modern-day technology allows for more progressive, computation-heavy methodologies, decision-making under severe uncertainty has its roots in simple, yet important, concepts
drawn from the field of classical decision theory.
Classical decision theory views decision-making under severe uncertainty as a two-person zerosum game, with the decision maker playing against the adversary, Nature [French, 1986]. It
must be noted that, unlike in game theory, classical decision theory forces the decision maker
to play first. Lempert et al. [2003] term classical decision criteria ‘predict-then-act’ paradigms
due to the nature in which they predict a worst-case state, then act accordingly to optimise
return within that state. In classical decision theory, uncertainty lies in predicting which state
will occur.
Discussion of decision-making under severe uncertainty was motivated by Abraham Wald in
the 1940s, with his development of the Maximin principle, as introduced in Chapter 1. Examining the worst-case scenario for each decision [Chiulli, 1999], Maximin chooses the decision
which gives the best possible return should the worst case scenario eventuate, hence the ‘true’
return is guaranteed to be at least as good as the return given by Maximin.
This concept of robustness was borrowed from the field of game theory, disguised by the
nomenclature ‘security level’. The security level for each ‘move’ is the least desired outcome,
hence any other outcome is at least as good as the security level [Zagare, 1984]. Thus, when
a player chooses the move with the maximum security level, this is equivalent to utilising the
Maximin criterion. The mathematical definition of security level is given by:
z∗(d) = min_{s∈S(d)} f(d, s).
By assuming occurrence of the worst case, Maximin effectively eliminates all uncertainty, transforming the initial problem of decision-making under severe uncertainty to one of decision-making under certainty [Sniedovich, 2007]. This is also the case for the other classical decision
criteria discussed, whereby uncertainty is wholly removed due to assumptions made on the
conditions of the resulting state. Although the Maximin principle has been deemed too conservative and pessimistic by many [French, 1986, Savage, 1951], it has proven to be a useful
modelling tool which demonstrates a key concept underlying robust decision-making – to ensure that a decision is robust over a number of states, the decision must perform sufficiently
well should the worst-case state eventuate.
Savage states that Maximin’s heavily pessimistic view, and consequential overly-conservative
results, are a product of the examination of negative return rather than loss [Savage, 1951].
Savage also notes that several statements in Wald’s book [1950], cited in [Savage, 1951], are
inconsistent with the use of negative return, and perhaps Wald had originally meant to work
in terms of loss. In an attempt to remedy Maximin’s conservatism, Savage adapts elements of
Maximin to develop what is now known as Savage’s Minimax Regret criterion. Using Minimax
Regret, one attempts to minimise the maximum regret, with regret defined as the difference
between the most desirable return in each state and the return achieved with each action given
that state [French, 1986], giving the following formulation:
z∗ = min_{d∈D} max_{s∈S(d)} ( max_{d∗∈D} f(d∗, s) − f(d, s) ).
However, Chernoff [1954] lists several criticisms of this criterion, including similar pessimistic
characteristics shared with Wald’s Maximin, and a critique of its ill-defined ‘measurement’ of
regret, suggesting the regret in moving between two states of utility may not be equivalent to
the regret experienced with the same differences in return at higher or lower values of return.
Hurwicz [French, 1986] argues that a decision maker could decide how optimistic or pessimistic
they are using his optimism-pessimism index, an individual-specific constant, which, by convention, gives ultimate pessimism when taken to be zero, or in other words gives the instance of
Wald’s Maximin. This index balances Maximin and Maximax criteria accordingly.
Another classical decision theory criterion is Laplace’s Principle of Insufficient Reason, which suggests that if nothing is known about the true state of nature, then one may conjecture that all
states have equal probability [French, 1986]. The advantage of this is that it discards the possibility of attitudinal bias, and transforms the problem from one of severe uncertainty to one
of risk [Schneller and Sphicas, 1983]. It may be argued that if nothing is known about the true
state of nature, then to conjecture that all states have equal probability isn’t too far-fetched as it
gives the most conservative guess as to which state will eventuate.
Starr [1966] manipulates the Principle of Insufficient Reason such that probabilities of states of
nature are no longer equal, but rather they can be any set of probabilities which sum to unity. It
is then assumed that all combinations of these probabilities are equally likely, or in other words,
we have a uniform mixture of all distributions. The idea behind Starr’s Domain Criterion, which
was further developed by Schneller and Sphicas [1983], is to generate all possible combinations
of discretised probabilities of the random variables, and use these to ‘calculate a set of expected
values at least one member of which is largest’ [Starr, 1966]. Each decision has a corresponding
polytope, its domain, which is comprised of points for which that decision performs best, or has
best expected value.
The volume of a domain, D (k ), k = 1, . . . , m, is defined as follows [Schneller and Sphicas, 1983]:
D(k) = { p ∈ P : ∑_{j=1}^{n} ckj pj ≥ ∑_{j=1}^{n} cij pj , ∀i ≠ k }
where P is the set of all possible probability distributions relative to the assumed discrete state
variables, and cij represents the return received when decision di is taken and state s j eventuates.
Letting Ek,p represent the expected value of utility generated by decision k assuming p is the
underlying distribution, we can give an alternate definition of the volume of the domain:
D(k) = { p ∈ P : Ek,p ≥ Ei,p , ∀i ≠ k }.
The domain with largest ‘volume’ is analogous to that which has the greatest probability of
being optimal [Eiselt et al., 1998]. This volume can also be determined by some measure which
gives each probability distribution a weight corresponding to its likelihood.
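A Monte Carlo sketch can approximate these volumes without the graphical constructions of the early literature: sample probability vectors uniformly from the simplex and record how often each decision has the best expected return. The return matrix below reuses the hypothetical values of Table 4.1.

```python
# A Monte Carlo sketch of Starr's domain criterion: Dirichlet(1, ..., 1)
# samples are uniform on the simplex; each decision's domain volume is
# estimated as the fraction of sampled p for which it has the best
# expected return. Returns reuse the hypothetical Table 4.1.
import random

c = [[10, 1, 10],   # c[i][j]: return of decision i in state j
     [2,  3, 2]]
n_states, n_samples = 3, 100_000
wins = [0] * len(c)

for _ in range(n_samples):
    # Normalised exponentials give a uniform point on the simplex.
    g = [random.expovariate(1.0) for _ in range(n_states)]
    total = sum(g)
    p = [x / total for x in g]
    expected = [sum(cij * pj for cij, pj in zip(row, p)) for row in c]
    wins[expected.index(max(expected))] += 1

print([w / n_samples for w in wins])  # roughly [0.96, 0.04]: d1's domain
                                      # is the half-space p2 <= 0.8
```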
In this thesis, we demonstrate that Starr’s domain criterion is an instance of Wald’s Maximin in
Theorem 4.3.
One drawback during development of this criterion was that for any problem with more than
three states of nature, calculations were extremely taxing (Starr’s domain criterion is subject to
the ‘curse of dimensionality’). In most early papers which make use of Starr’s domain criterion,
problems are restricted to three states of nature or less, and are solved graphically. However,
with modern-day ability to address computational complexity, there has been a resurgence in
popularity of modified versions of this criterion [Eiselt et al., 1998, Lempert and Collins, 2007].
As an aside, Starr [1966] discusses the criticisms and concerns voiced over possible scenarios
whereby different criteria choose different decisions for the same problem. French [1986] discusses eight axioms which are reasonable properties of decision-making criteria. It has been
proven that no single decision criterion can obey all eight of these axioms, as satisfying combinations of some axioms imply the negation of others [French, 1986]. Starr acknowledges that
the domain criterion violates several of the axioms presented for classical decision theory. However, he argues for a consensus on a single decision-making criterion rather than on a choice of
criteria which may produce conflicting results.
It is our opinion that this argument may be troublesome. Different decision problems are based
on different assumptions, and it is these assumptions which may validate or invalidate one or
more of the decision-making axioms. To assert the necessity of a unique criterion only because
the current choice of criteria may give differing results ignores the differing situations between
problems which make one criterion more suitable than another for a given scenario.
Due to this, it seems perfectly reasonable, if not more sensible, to have a choice of criteria which
may or may not produce conflicting results, rather than one sole criterion which may be inappropriate for a given problem. It is this choice between conflicting models which forces us to
think about what assumptions, and hence what models, are appropriate for any problem.
Approaching from a different Operations Research angle, following the development of the
Simplex method by George Dantzig in 1947, linear programming was a flourishing field, until hitting a stagnant period in the 1960s after the realisation that the results of linear programming did
not meet expectations of users [Fiedler et al., 2006]. The fact that results were not in line with
expectations of linear programming modelers was mainly due to uncertainty within parameter
values, thus rendering solutions useless. Another reason for disillusionment was the inability
to solve large programming problems suffering from computational complexity.
Until the 1970s, uncertainty within linear programming problems had always been dealt with
by using post-optimisation techniques such as sensitivity or parametric analysis, or probabilistic
approaches such as stochastic programming. Sensitivity analysis is performed by first solving
the program deterministically, with some nominal parameter values, and then tweaking these
values in order to see how much they can change before the solution becomes unstable. This
differs from robust optimisation techniques as here, we attempt to compute solutions such that
they will perform sufficiently well, given that the eventuating state may vary after the decision
has been made. Fabozzi et al. [2007] criticise sensitivity analysis, stating that it may
only allow deviation from one parameter at a time.
The first approach that dealt with uncertain coefficients in linear programming problems was
stochastic linear programming [Fiedler et al., 2006]. Stochastic programming models uncertainty over time using scenario generation, and allows recourse decisions which respond to
new information to be made as information becomes available, period by period. A disadvantage of stochastic linear programming is its complexity, as it transforms the programming
problem from a linear to a nonlinear one. Another issue is that it assumes that the probability
distributions of uncertain parameters are known.
Soyster [1973] was the first to examine non-probabilistic approaches to parameter uncertainty in
linear programming problems. Soyster developed a method of inexact linear programming, which
can be used when the values within the constraint coefficient matrix are not known with
certainty. Here, we can deal solely with the constraint matrix as uncertainty in the coefficients
of an objective function can always be translated such that the objective function is transformed
into a constraint. A generic, maximising linear programming problem is as follows:
max_{x≥0} cT x
s.t.  Ax ≤ b.
If we have uncertainty only in the coefficients of the objective function, we can reformulate this
in the following manner:
max_{x≥0, v∈R} v
s.t.  −cT x ≤ −v
      Ax ≤ b.
Inexact linear programming defines the feasible region using set containment - ‘the sum of a
finite number of convex sets is contained in another convex set’ - rather than using convex
inequalities. This ‘summing’ of sets involves summing all elements in each set with all elements
in every other set. Defining {K j }, j = 1, . . . , n as n nonempty convex sets, with b ∈ Rm and
K (b) = {y ∈ Rm |y ≤ b}, we can reformulate the above model into the following convex
programming problem:
max_{x≥0} cT x
s.t.  x1 K1 + x2 K2 + · · · + xn Kn ⊆ K(b).
While this approach generally produces highly robust solutions, the tradeoff with optimality
may be high. This tradeoff has been coined ‘the price of robustness’ by Bertsimas and Sim [2004].
Examples of where a decision can and cannot be completely robust have been given in Figures 4.1 and 4.2 in Chapter 4.2. Example 4.2 demonstrates the price of robustness in a simple
linear programming problem.
Example 4.2. Suppose we have the following linear programming problem:
z∗ = max_{x≥0} 100x1 + x2
s.t.  αx1 + βx2 ≤ 10,

with α ∈ [1, 11] and β ∈ [1, 10], and with x restricted to integer values. Soyster's approach sets α = 11 and β = 10, with the worst-case solution of (x1, x2) = (0, 1) and the optimal worst-case value of z∗ = 1. However, if α is decreased by one, to α = 10, then the solution becomes (x1, x2) = (1, 0) with a value of z = 100.
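The example can be verified by brute force; the integrality restriction makes the quoted solutions exact.

```python
# Example 4.2 worked by brute force over integer solutions, illustrating
# the price of robustness under Soyster's worst-case coefficients.
def best(alpha, beta, bound=20):
    feasible = [(x1, x2) for x1 in range(bound) for x2 in range(bound)
                if alpha * x1 + beta * x2 <= 10]
    return max(feasible, key=lambda x: 100 * x[0] + x[1])

x = best(11, 10)
print(x, 100 * x[0] + x[1])   # (0, 1) 1   -- Soyster's worst case
x = best(10, 10)
print(x, 100 * x[0] + x[1])   # (1, 0) 100 -- alpha reduced by one
```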
Soyster’s [1973] method is extremely conservative, not only due to its robustness over the entire state space, but also attributable to the type of uncertainty discussed in his paper, namely
‘column-wise’ uncertainty [Ben-Tal and Nemirovski, 1999]. This type of uncertainty results in
greater conservatism than ‘row-wise’ uncertainty, as it sets every uncertain coefficient in the
constraint matrix to its worst possible value, as has been mentioned earlier in Chapter 4. We
present an application of Soyster’s approach in our carbon offsets portfolio problem in Chapter 6.1.
Ben-Tal and Nemirovski [1998] developed the robust counterpart approach to uncertainty, which
improved on the conservatism encountered in Soyster’s inexact linear programming, and examined ellipsoidal uncertainty sets to ensure tractability of the robust counterpart of a linear
program.
Ben-Tal and Nemirovski’s [1998, 1999, 2000] definition of feasibility and their ideas of ‘hard
constraints’ (those which must be satisfied under all realisations) are identical to those in Robust
Control Theory, which will be touched upon below. The set of states under which these hard
constraints must be obeyed is known as the ‘uncertainty set’, Ŝ ⊆ S, and, given the
following general programming problem,
max_{d∈D} f(d, s)
s.t.  h(d, s) ∈ C,

we get the following robust satisficing and optimising counterpart [Ben-Tal and Nemirovski, 1998]:

max_{d∈D} inf_{s∈Ŝ} f(d, s)
s.t.  h(d, s) ∈ C,   ∀s ∈ Ŝ.
If we take an example of a generic linear programming problem,
max_{d∈D} cT d
s.t.  Ad ≤ b,

its robust satisficing counterpart is [Ben-Tal and Nemirovski, 1999]:

max_{d∈D} cT d
s.t.  Ad ≤ b,   ∀(A, b) ∈ Ŝ,
where c ∈ Rm is the return vector, A is an n × m coefficient matrix and b ∈ Rn .
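When the uncertainty set Ŝ is a finite collection of (A, b) scenarios, the robust counterpart is itself a linear program, obtained by stacking the constraints of every scenario. A minimal sketch, with hypothetical data and scipy as an assumed tool:

```python
# A sketch of the robust satisficing counterpart for a finite uncertainty
# set of (A, b) scenarios: stack the constraints of every scenario and
# solve a single linear program. Data are hypothetical.
from scipy.optimize import linprog

c = [3.0, 2.0]                                  # hypothetical return vector
scenarios = [                                   # S_hat: a few (A, b) pairs
    ([[1.0, 2.0], [3.0, 1.0]], [10.0, 15.0]),
    ([[1.5, 1.8], [2.5, 1.4]], [9.0, 14.0]),
]

A_ub, b_ub = [], []
for A, b in scenarios:                          # require Ad <= b for all (A, b)
    A_ub.extend(A)
    b_ub.extend(b)

res = linprog(c=[-ci for ci in c], A_ub=A_ub, b_ub=b_ub,
              bounds=[(0, None)] * 2)
print(res.x, -res.fun)  # a decision feasible under every scenario
```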
Ben-Tal and Nemirovski’s [1998, 1999, 2000] approach discusses linear and non-linear programs
where uncertainty is row-wise. While the approach produces tractable programming problems,
the complexity of the nominal problem is increased. Ben-Tal and Nemirovski’s [1998] work was
completed independently of and analogously to that of El-Ghaoui and Lebret [1997].
In Chapter 4.7, we will present proofs that Soyster’s inexact linear programming model [1973],
and Ben-Tal and Nemirovski’s models [1998], [2002], are Maximin models.
A similar approach has been proposed by Bertsimas and Sim [2004], which is based upon polyhedral uncertainty sets. An advantage over Ben-Tal and Nemirovski’s approach is that complexity is maintained as the robust counterparts in this method are linear programs. Bertsimas
and Sim’s [2004] method also allows for the decision maker to have complete control over the
degree of conservatism by enabling the specification of the number of coefficients which are allowed to change per constraint.
Bertsimas and Sim also extend their approach to discrete optimisation problems [Bertsimas
and Sim, 2003], with their approach more computationally tractable than the robust discrete
optimisation approach of Kouvelis and Yu [1997].
Mulvey et al. [1995] coined the term robust optimisation for their decision-making approach,
which involves a simple linear programming structure involving structural components which
are free from uncertainty, and control components which are subject to uncertainty. Robustness
under conditions of uncertainty is dealt with using a feasibility penalty function. Mulvey et al.
differentiate between model robustness and solution robustness, which has been presented in
this paper as a distinction between satisficing and optimising respectively.
Scenario generation is one of the basic foundations required for Mulvey’s robust optimisation
[1998], as well as, but not limited to, Kouvelis and Yu’s robust discrete optimisation, as discussed next. Use of scenario generation allows for a more thorough exploration of possible
future states, and enables the ability to ensure that the robust decision chosen will perform over
a large diversity of these states. Given the possibility of assessing how a decision will perform
in a number of states, scenario generation proves more beneficial than traditional methods, such
as using point estimates, when it comes to decision-making under severe uncertainty. Lempert
et al. [2003] cite the importance of scenario generation when trying to make robust decisions,
listing the consideration of a number of scenarios to be one of their four key elements to successful long-term policy analysis. Lempert et al.’s robust decision-making method [2006], discussed
below, also uses scenario generation.
Influenced by Mulvey et al.’s seminal paper, Kouvelis and Yu [1997] developed robust discrete
optimisation, a mathematical programming framework. It applies either Wald’s Maximin, giving
absolute robustness, or Savage’s Minimax regret criteria, giving a robust deviation decision using
the traditional definition of regret (or relative robustness if defining regret as a ratio of benefits)
over a set of generated plausible scenarios to examine a decision’s robustness. As with classical
decision theory, application of Wald’s Maximin will generate a more conservative decision than
decisions generated using the concept of regret. A robust strategy here is defined as a strategy
with small regret over a large number of possible future states [Greenberg and Morrison, 2007].
Another approach we examine is Information-gap (Info-gap) decision theory, which has been
proposed by Ben-Haim [2006] as a new, non-probabilistic methodology for decision-making
under severe uncertainty. Info-gap takes a nominal estimate, and attempts to maximise robustness to uncertainty in parameter values around this estimate such that the return is always at
least as great as some predetermined threshold value. An info-gap model of uncertainty is
usually an unbounded family of nested sets centred around the point estimate. The generic
Info-Gap robustness model is as follows:
V̂(d) = max_{Ŝ⊆S(d)} { V(Ŝ) : f(d, s) ≥ f∗, ∀s ∈ Ŝ },   d ∈ D,
where f ∗ is a given constant, perhaps a critical threshold value, Ŝ is a subset of the state space,
and V (Ŝ) denotes the size of set Ŝ ⊆ S(d) such that:
V(Ŝ) ≥ 0,   ∀Ŝ ⊆ S(d),
Ŝ ⊂ Ŝ′  →  V(Ŝ) < V(Ŝ′).
Letting α represent a parameter that determines size V (Ŝ) and S(α) = Ŝ, we can rewrite this in
a form recognisable in the info-gap literature:
α̂(d) = max_{α≥0} { α : f(d, s) ≥ f∗, ∀s ∈ S(α) },   d ∈ D,
where S(α) is the region of uncertainty of size α ≥ 0 centred around state s̃. It is assumed that
the regions of uncertainty are nested and non-decreasing with α, namely
S(0) = {s̃},    S(α) ⊆ S(α + e),   ∀α, e ≥ 0.
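The estimate dependence criticised below can be reproduced in one dimension: the sketch that follows (hypothetical reward functions and threshold) computes α̂(d) for two decisions, one of which performs well only near the estimate, and one of which performs well over a far larger, but off-centre, region.

```python
# A 1-D sketch of Info-Gap robustness alpha_hat(d): expand the interval
# S(alpha) = [s_tilde - alpha, s_tilde + alpha] around the estimate until
# the threshold constraint f(d, s) >= f* first fails. All data hypothetical.
def f_d1(s):
    return 8.0 - 2.1 * abs(s)        # good only in a small region around 0

def f_d2(s):
    # good over a much larger, but off-centre, part of the state space
    return 5.5 if 2.0 <= abs(s) <= 40.0 else 0.0

def alpha_hat(f, s_tilde, f_star, step=0.01, n_max=10_000):
    alpha = 0.0
    for k in range(1, n_max + 1):
        a = k * step
        if min(f(s_tilde - a), f(s_tilde + a)) < f_star:
            return alpha             # largest alpha before the first failure
        alpha = a
    return alpha

f_star = 5.0
print(alpha_hat(f_d1, 0.0, f_star))  # ~1.42 -- Info-Gap prefers d1
print(alpha_hat(f_d2, 0.0, f_star))  # 0.0  -- d2 rejected despite its far
                                     # larger satisfactory region
```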
Criticisms of Info-Gap include its local treatment of robustness, and the fact that the solution
produced by Info-Gap is invariant to changes in the size of the uncertainty space. The latter criticism has been demonstrated by Sniedovich’s Invariance theorem [Sniedovich, 2007], in which
Sniedovich also proves, contrary to the assertions of Info-Gap proponents that it is a new, ‘radically different’ [Ben-Haim, 2001] theory for decision-making under uncertainty, that Info-Gap
is a simple instance of Wald’s Maximin criterion [Sniedovich, 2007].
As the estimate’s quality reflects the level of uncertainty of the problem, we should assume
that the quality of the estimate is poor. The limitations of using a local analysis around a poor
estimate have been discussed in Chapter 4.4.
Info-gap uses the size of the subset of uncertainty to determine the robustness of a decision.
The decision chosen is the one with the greatest size α. However, α is measured such that it includes
states in subset S(α) if the decision satisfies the threshold constraint. This means that as we
move ‘away’ from the estimate, once we ‘hit’ a state for which the decision does not satisfy the
constraint, then we cannot continue moving in that direction.
Figure 4.17 represents corresponding state spaces for two decisions, d and d′, where the black dots reflect a state in which the decision performs sufficiently. With decision d, a large proportion of the state space performs well, as seen in the lower left corner of Figure 4.17(a), except for the immediate neighbourhood surrounding our estimate, represented as a larger, grey circle.
Under decision d′, shown in Figure 4.17(b), the majority of the state space performs poorly, except for the immediate neighbourhood around our estimate. It can be seen that the uncertainty region, given as grey rings around our estimates, for decision d′ is greater than that of d, hence Info-Gap would choose decision d′. However, decision d clearly performs well over a greater number of states than decision d′, and hence, under, for example, a uniform distribution over the state space, is more robust – just not with respect to our estimate.
[Figure 4.17: Example where info-gap chooses the incorrect decision. (a) The performance grid of decision d. (b) The performance grid of decision d′.]
This demonstrates a dependence on our estimate. In one case, our decision may perform poorly
only in the immediate neighbourhood of our estimate, but may perform well over a large subset
of the state space elsewhere. Conversely, our decision may perform well in the immediate
neighbourhood, but not satisfy conditions over the remainder of the state space.
[Figure 4.18: Example where info-gap chooses the correct decision. (a) The performance grid of decision d. (b) The performance grid of decision d′.]
It can be seen, in Figure 4.18, that for the same decisions d and d′, if we change the location of our estimate, the decision which performs best may change. Here, even though the state space performance grid is identical to those above in Figure 4.17, we choose decision d, as the estimate used falls in the subset of the state space which is larger.
This thesis will mainly focus on the aforementioned approaches, excluding that of Kouvelis
and Yu’s [1997] robust discrete optimisation, and Mulvey et al’s robust optimisation [1995],
which were introduced due to important concepts present in their works. However, there are
many other robust decision-making techniques for other areas of operations research, including
sequential decision-making [Rosenhead et al., 1972] for an initial decision, robust discrete-time
dynamic programming [Nilim and El Ghaoui, 2005, Iyengar, 2005] and discrete optimisation
[Kouvelis and Yu, 1997], to name a few. Robust control is also a major area of robust decision-making [Zhao and Glover, 1996], where the uncertainty is bounded such that the performance
of a system is guaranteed to fall within specifications.
Review of Applications
The RAND Frederick S. Pardee Center for Longer Range Global Policy and the Future Human Condition
(RAND) publishes many papers and books relating to the field of robust decision-making under
severe uncertainty, which relates to their work in long-term policy analysis (LTPA). Lempert
et al. [2003] discuss robust decision-making methods in terms of LTPA, and propose four key
elements that they describe as crucial to successful LTPA. These key elements are to consider
large ensembles of scenarios; seek robust, rather than optimal, solutions; achieve robustness with
adaptivity; and to design analysis for interactive exploration of the plethora of possible futures.
Lempert et al. use scenario landscapes as a means for visualisation of robustness of different
strategies.
Lempert et al. [2006] developed robust decision-making (RDM), for decision-making under severe uncertainty, which is heavily based on scenario generation. By the repeated implementation of models with unknown parameters using numerous plausible probability distributions,
RDM generates a large number of possible solutions, and recommends the decision which is
relatively insensitive to the changes in parameter values. RDM then attempts to identify weaknesses in the robust strategies and suggests positive modifications to these strategies, while
providing descriptions of the tradeoffs involved regarding these modified strategies. Due to its
reliance on scenario generation, RDM is computationally expensive.
Lempert and Collins [2007] compare three techniques for robust decision-making under an application of sudden climate change, whereby a small lakeside town must examine the tradeoff
between the economic benefits of increasing development, and the environmental pollution
which may result from this. These robust decision-making techniques are also compared with
traditional methods of optimising expected utility, and worst-case analysis.
The first technique examined gives up some optimality for more robustness to uncertainty in
assumptions, and is similar to Ben-Haim’s Info-Gap approach. It also involves concepts from
classical decision theory, implementing Savage's Minimax regret criterion and a version of Hurwicz's optimism-pessimism criterion.
Using the notation where we have strategies s ∈ S̄, states x ∈ F̄, and probability distribution
functions ρi ( x ) ∈ D̄, they begin by computing the expected regret of strategy s with distribution
i in the following manner:
R̂_{s,i} = ∫_x R_s(x) ρ_i(x) dx,

where R_s(x) = max_{s′}[pvU_{s′}(x)] − pvU_s(x) is the regret of decision s in state x, and pvU_s(x) represents the utility of decision s under state x.
By computing the values for R̂s,best and R̂s,worst , Lempert and Collins then apply Hurwicz’s
optimism-pessimism criterion to take a weighted average of the best and worst case expected
regrets (similar to balancing Maximin with Maximax), where the decision-maker-specific constant, z, refers to the level of uncertainty regarding ρ_best(x), rather than indicating how optimistic or pessimistic one is. By defining z as such, Lempert and Collins are able to plot the expected
regret of each strategy as a function of the odds z/(1−z) that the distribution ρ_best(X_crit) is correct.
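A sketch of this computation on a discretised state space follows; the function names and data layout are assumptions for illustration, not Lempert and Collins' implementation:

```python
import numpy as np

def expected_regret(pvU, xs, densities):
    """Expected regret on a uniform state grid xs.  pvU is a list of
    functions, one per strategy, mapping state x to utility; densities
    is a list of density functions rho_i.  All names are illustrative
    assumptions about the data layout."""
    dx = xs[1] - xs[0]                                  # uniform grid spacing
    U = np.array([[u(x) for x in xs] for u in pvU])     # utilities, (S, X)
    regret = U.max(axis=0) - U                          # R_s(x), pointwise
    rho = np.array([[r(x) for x in xs] for r in densities])   # (I, X)
    return regret @ rho.T * dx                          # R_hat[s, i]

def hurwicz_blend(R_hat, z):
    """Weighted average of the best- and worst-case expected regrets per
    strategy, with weight z on the best case, mirroring the optimism-
    pessimism style blend described above."""
    return z * R_hat.min(axis=1) + (1.0 - z) * R_hat.max(axis=1)
```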
Lempert and Collins identify the worst case performance for their Strategy A, which is optimal
for a nominal probability distribution, and then generate alternatives to A by evaluating the
optimal strategies over a set of multiple probability distributions evaluated at the critical state,
X crit . The plot of this will be presented below.
The second approach chooses a strategy which satisfices over a large number of possible futures,
like Starr’s domain criterion. Starr’s domain criterion, as mentioned earlier, aims to maximise
the volume over which the decision performs satisfactorily. Regret is mapped on a ‘scenario
landscape’ to provide a visual basis for analysis.
The third approach that Lempert and Collins discuss defines robustness as leaving options
open, as Rosenhead does for sequential decision planning [Rosenhead and Gupta, 1968]. Lempert and Collins describe this as being similar to the second approach, and use the same scenario landscape to determine which decisions hold for the most number of states without the
possibility of not meeting a threshold utility value.
Lempert and Collins emphasise the importance of choice of approach to cater for problems, as
each approach expresses uncertainty and robustness in a different way from the others. Catering for
audiences is also important when choosing an approach, as each approach has different visual
representations to display robustness.
Figure 4.19 below shows how both approach one, and approaches two & three, can be visualised. These have been adapted from Lempert and Collins, Figures 7 and 8 [2005].
To determine which strategy is more robust, we examine the strategy lines for cases A to G. Robustness depends on the odds of our initial distribution being correct. For example, strategies
G and F have low expected regret, but only provided that we believe our initial probability distribution has a high chance of being wrong. On the other hand, strategy A has lowest expected
regret when the odds that our initial probability distribution is correct are high.
The benefit of using the scenario landscapes (Figure 4.19(b)) is that it is easier for the viewer to
understand where the results come from, namely that a robust strategy has low regret across as
many values of X crit as possible, and as few instances as possible of high regret. We can also
easily add weightings to a scenario landscape if we believe that there is greater likelihood that
the state space is in one area than another. For example, if we believe that the critical threshold
is likely to be on the lower end of the scale, we shall choose strategies F or G. However, if we
thought the threshold might be at the upper end of the scale, we might choose strategy A.
Lempert and Collins assert that the first approach demonstrated above is more appropriate
in situations where the decision maker has some probability distribution over the state space,
whereas the other two methods are better suited to those with uniform probability distributions.
[Figure 4.19: Visualisations of robustness for strategies A through G. (a) Expected regret vs. odds z/(1−z) that ρ_best(X_crit) is correct. (b) Scenario landscape for approaches two and three.]
However, non-uniform probability distributions can be applied to the state space by weighting different possibilities, both prior to and after the analysis. The use of scenario landscapes
enables a visual representation of the state space which allows easier application of probability
distributions post-analysis, but this does not mean that the method is incapable of handling
non-uniform probability distributions before the solution process has begun.
Another topical application area in which robust decision theory has great use is counter-terrorism. Carr et al. [2006] look at a water contamination problem under severe uncertainty following a terrorist attack. Wanting to minimise the expected number of people exposed simultaneously with
the expected number of pipe junctions that become contaminated, Carr et al. formulate three
different absolute robustness models with different forms of uncertainty within the objective
function. Absolute robustness refers to that described in Kouvelis and Yu [1997], where the performance in the worst state for a decision indicates the robustness of the decision. It can be seen
clearly in their representation of all three robustness models that these are Maximin (technically
Minimax) models in their classical form.
They assume that the uncertain coefficients, α and δ, are bounded by intervals which sum to
constants, namely they incorporate constraints specifying that the upper and lower bounds of
a coefficient sum to a constant. Carr et al. argue that by using such constraints within their
model, their model is more realistic than a simple worst-case scenario model. They also claim
their robustness is more intrinsic than that of Bertsimas and Sim [2004], as they do not require
user input designating how many constraint coefficients are allowed to change.
Notation involves αi , the probability of an attack at pipe junction i; δj , the number of people who
contact the water at junction j; and xij , a binary variable where xij = 1 if water contaminated at
junction i will not be discovered at junction j. R is the set of junction pairs (i, j) such that j is
downstream from i, and X is a set of feasible 0-1 x values.
The first case they analyse is that of minimising the expected population exposed to the contaminated water, with linearly weighted uncertainty where the uncertainty lies in the coefficients αi
in their objective function as follows:
min_{x∈X} max_{α∈B(α̂,α,ᾱ)} ∑_{(i,j)∈R} α_i δ_j x_ij.
δ is known, and the values of α are restricted by a constant-sum constraint. Carr et al. present
two approaches for solving this problem. The first involves formulating the dual of the primal
problem, which results in replacing the ‘max’ with ‘min’ to give a mixed integer linear program.
A method which may be more computationally efficient is to decompose the problem, solving
the inner maximisation problem for each x ∈ X in the outer minimisation problem, and then
choosing the smallest of these to get our robust cost.
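A minimal sketch of this decomposition follows, assuming the feasible set can be enumerated and using an off-the-shelf LP solver for the inner maximisation; all argument names are illustrative:

```python
import numpy as np
from scipy.optimize import linprog

def robust_cost(X_feasible, delta, R, a_lo, a_hi, a_sum):
    """Decomposition sketch for the first Carr et al. model.  X_feasible
    is an (assumed) enumeration of feasible 0-1 matrices x, delta the
    known exposure values, R the set of junction pairs (i, j), and alpha
    lies in [a_lo, a_hi] with components summing to a_sum."""
    n = len(a_lo)
    best_x, best_val = None, np.inf
    for x in X_feasible:
        # coefficient of alpha_i in the inner objective: sum_j delta_j x_ij
        coef = np.zeros(n)
        for (i, j) in R:
            coef[i] += delta[j] * x[i, j]
        # inner maximisation over alpha is a small LP (linprog minimises)
        inner = linprog(-coef, A_eq=np.ones((1, n)), b_eq=[a_sum],
                        bounds=list(zip(a_lo, a_hi)))
        worst = -inner.fun
        if worst < best_val:        # keep the smallest worst-case cost
            best_x, best_val = x, worst
    return best_x, best_val
```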
The second case involves unweighted uncertainty in α, where the upper and lower bounds of
αi are proportional to the nominal values. Here, the objective function, which aims to minimise
the number of contaminated pipe junctions, is given as follows:
min_{x∈X} max_{α∈B(α̂,α,ᾱ)} ∑_{(i,j)∈R} α_i x_ij.
In this case, the constant-sum constraints give a fixed solution which is equivalent to that which
would result from a maximum expected utility analysis, namely, we take the central value from
the interval bounds and use it as if we were undertaking a deterministic analysis. While Carr
et al. [2006] discuss the counter-intuition of having a permanent solution, they ‘have proven
that robust solutions can be obtained from just the central value (which may be the most likely
realisation)’. If the central values were fairly accurate estimates of the true parameter values,
then their approach could be used to find a robust solution without an increase in computational
complexity. However, under severe uncertainty, which one would assume would be the case
in a situation involving terrorist attacks, this method is unsuitable as it is unlikely that the
central values used are even close to the true values of the parameters, nor would they represent
accurate estimates of the mean.
Carr et al. formulate a final model with uncertainties in both α and δ as follows:
min_{x∈X} max_{α∈B(α̂,α,ᾱ), δ∈B(δ̂,δ,δ̄)} ∑_{(i,j)∈R} α_i δ_j x_ij.
They describe this uncertainty as a bilinearly weighted uncertainty due to the bilinear inner
maximisation. While this problem is NP-hard, Carr et al. demonstrate a heuristic which solves
for a value of α0 ∈ B(α̂, α, ᾱ), then uses this value as a deterministic one to solve for a value of
δk , where k gives the number of iterations. The algorithm then alternates between solving for
values of δ_k and α_k, and does not terminate until a maximum is found. One downfall of this algorithm is that there is no guarantee that the maximum value found is the global maximum. A linear
programming relaxation to the problem is also presented.
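The alternating scheme can be sketched as follows, assuming hypothetical callables that solve each one-block maximisation (for instance, LPs of the kind above):

```python
def alternating_heuristic(maximise_delta, maximise_alpha, alpha0, max_iter=100):
    """Sketch of the alternating scheme described above.  maximise_delta
    (resp. maximise_alpha) solves the inner problem in delta (resp. alpha)
    with the other block of variables held fixed, returning the maximiser
    and its objective value; both are assumed, hypothetical callables.
    The value returned is a local maximum only; there is no global
    guarantee."""
    alpha, delta, value = alpha0, None, float("-inf")
    for _ in range(max_iter):
        delta, _ = maximise_delta(alpha)   # alpha fixed, solve for delta
        alpha, v = maximise_alpha(delta)   # delta fixed, solve for alpha
        if v <= value:                     # no improvement: stop
            break
        value = v
    return alpha, delta, value
```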
The three cases increase in computational complexity in order of presentation; however, the paper focusses more strongly on the modelling side than on the solution process. It recommends
further analysis of robust decision-making techniques in order to determine suitable computational approaches.
Eiselt et al. [1998] discuss an adaptation of Starr’s domain criterion regarding how to choose the
location of a sewage treatment plant from sixteen potential sites.
As mentioned previously, Starr's domain criterion traditionally chooses the decision with greatest expected value (or, in Eiselt et al.'s example, minimum expected cost) over the largest number of states. The domain is then a polytope, and we choose that which has the largest volume. This involves the curse of dimensionality, as we need to compute the volume of each polytope in (n − 1)-dimensional space.
Eiselt et al. suggest an approach whereby instead of examining which decision has the lowest expected cost, we look at finding the decision with the lowest cost given an interest rate, i, and inflation rate, f, with each (i, f) having an associated cost, C^k_{i,f}. Rather than the domain consisting of the set of all probability distributions for which decision k has minimum expected cost, the domain of decision k becomes the set of all combinations (i, f) for which k costs less
than all other decisions, as shown below:
D(k) = { (i, f) : C^k_{i,f} ≤ C^l_{i,f}, ∀l ≠ k }.
This approach does not suffer the curse of dimensionality, as we do not move above the two-dimensional space.
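Since the state space is then discretised (as described next), the domain sizes can be computed by simple counting; a minimal sketch with assumed inputs:

```python
import numpy as np

def domain_sizes(cost_fns, i_grid, f_grid):
    """Count, for each decision k, the number of grid points (i, f) at
    which k is the cheapest decision: a discretised stand-in for the
    area of the domain D(k) defined above.  cost_fns[k] maps (i, f) to
    the cost C^k_{i,f}; all names are illustrative assumptions."""
    counts = np.zeros(len(cost_fns), dtype=int)
    for i in i_grid:
        for f in f_grid:
            costs = [c(i, f) for c in cost_fns]
            counts[int(np.argmin(costs))] += 1   # ties go to the first k
    return counts   # choose the decision with the largest count
```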
The state space is then discretised, and Eiselt et al. only examine values of i and f which they
believe are appropriate for the problem. They mention that this is done in order to overcome
the limitations of applying a uniform distribution over the state space, criticising Starr by stating that the assumption that all probability distributions are equally likely is unrealistic. If we
have reason to believe that some elements in the state space are irrelevant, then by all means,
we should discard them. However, we can do this before the solution process is applied, specifically, we can ignore them in the formulation and modelling stages. Following this, we can then
apply either a uniform or non-uniform distribution over the space. Solely discretising the state
space and disregarding irrelevant states does not imply that the remaining states are not equally
likely, and hence does not ‘resolve this issue’.
We also note that their criticism of the application of Laplace’s principle of insufficient reason
may be unjustified, as the function which determines the domain’s volume can be modified
such that if some probability distributions are more likely than others, this can be accounted for.
In their paper, Lempert et al. [2007] discuss scenario landscapes to give a visual representation
and comparison of how well different strategies perform under plausible states. Such a scenario
landscape, or an alternate visual approach, could be combined with probabilistic mapping in
order to see which strategies perform better in states that are more likely to eventuate. In fact,
Schneller and Sphicas [1983] discuss the benefit of using Starr’s domain criterion due to the
control we can exert over which probability distribution to use:
“One of the strong points of the Domain criterion, in our opinion, is the ease with
which meta probability distributions other than the uniform distribution can be
placed on the f.p.s. [fundamental probability simplex]. Such alternate distributions
might reflect different interpretations of absence of information, or partial information of certain types, or simply the decision maker’s subjective assessment of the
probabilities.”
[Schneller and Sphicas, 1983, pg.334]
Eiselt et al. note additional reasons as to why Starr’s domain criterion has been neglected in
practice. One reason given is that Starr’s domain criterion focuses solely on the best strategy,
with no thought given to other near-optimal strategies. We believe that this reason is trivial
though, as this can be overcome very easily with a slight modification of the definition of the
domain criterion. By relaxing the optimality constraint, as demonstrated in Chapter 4.3 under partial robustness, we can easily formulate the model such that Starr’s domain criterion
attempts to determine the decision whose domain is largest for all outcomes within some tolerance level of return below the maximum, rather than the maximum itself.
Despite the criticisms noted here, Eiselt et al. discuss several issues of great importance with
regard to problem formulation and modelling. When excluding values of i and f from their
analysis due to their irrelevance, they mention that the state space should not be reduced by too
great a proportion. Cost is simple to compute and including a wider range of states will allow
the decision maker to examine characteristics of solutions over normal and extreme situations,
and hence offer perspective that can be lost if the state space is too far reduced. This line of reasoning is similar to that presented in Chapter 4.3, where we discuss the benefits of maximising
the ‘size’ of a subset of the state space in order to ensure robustness.
Eiselt et al. also mention the use of dominance in order to eliminate schemes prior to analysis.
Dominance is an extremely useful tool in decision-making which can reduce the size of a problem by removing decision possibilities that will not contribute to the optimal solution. In Eiselt
et al.’s example, they discuss plotting a diagram of cost versus probabilities that a decision costs
no more than some threshold amount. Each decision is represented by a monotonically non-increasing curve, and those decisions whose curves are never higher than others are dominated
and can hence be removed from analysis.
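A sketch of this screening, assuming each decision's curve has been evaluated on a common grid of cost thresholds:

```python
import numpy as np

def remove_dominated(curves):
    """curves[k][t] is the probability that decision k costs no more than
    threshold t (an assumed data layout).  Decision k is dominated when
    some other curve is at least as high at every threshold and strictly
    higher at some threshold; dominated decisions are dropped."""
    keep = []
    for k, ck in enumerate(curves):
        dominated = any(np.all(cl >= ck) and np.any(cl > ck)
                        for l, cl in enumerate(curves) if l != k)
        if not dominated:
            keep.append(k)
    return keep   # indices of the decisions worth retaining
```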
The problem of predicting the occurrence of a terrorist attack, and its aftermath, is subject to
severe uncertainty. Stranlund and Field [2006] apply Ben-Haim’s Info-Gap decision theory to a
port security problem, whereby the decision maker seeks the number of containers that should
be inspected given a certain probability of there being a threat in a large number of containers.
This problem demonstrates a trade-off between acceptable loss and immunity, due to resource
restrictions which constrain the number of containers that can be inspected. This paper will be
examined in further detail in Chapter 6.
Other applications for robust decision-making approaches include conservation biology and
applied ecology [Regan et al., 2005, Moilanen et al., 2006], engineering [Zhao and Glover, 1996],
and finance [Kachani and Langella, 2005, Gregory et al., 2008].
The literature revolving around robust decision theory is littered with examples of applications
of decision-making theory within a wide range of fields beyond that of mathematics. However,
there is a tendency to focus on solution methods and to ignore the importance of
good problem formulation and modelling.
While tractability of a solution process is vital in the applicative world, it must be noted that:
“...there is little value to either a poor solution to a correctly formulated problem or
a good solution to one that has been incorrectly formulated.”
[Jensen, 2004]
In Chapter 5, we reference some of the concepts described in this thesis in the context of building
a model.
Why ‘Optimisation’, Not Satisficing
Although we discuss in Chapter 3 the possible pitfalls of using the terms ‘optimising’ and ‘satisficing’, the title of this chapter is Robust Optimisation. The optimisation that we deal with when
we discuss the field of robust optimisation refers to that of optimisation of robustness, rather
than optimisation with respect to an objective function.
The field of robust optimisation is currently thriving, and many approaches are being developed
in order to maximise robustness whilst lowering the conservatism that complete robustness approaches potentially have. As seen in Chapter 4.2, we aim to maximise the value of the objective
function to the problem at hand. As we maintain robustness over the entire uncertainty region,
the ‘level’ of robustness is fixed. All decisions that are feasible over the state space display
identical robustness.
Modern-day techniques attempt to lower the tradeoff between performance and robustness,
and are generally constructed to maximise robustness. It can be seen in Chapters 4.3 and 4.4
that, unlike complete robustness, partial and local robustness attempt to maximise robustness
for a decision, rather than maximising the objective value. In these approaches, the level of
robustness may vary for differing feasible solutions, and hence, under severe uncertainty, our
aim is to choose the decision which performs well over the greatest number of, or the most heavily weighted, states.
In general, the decision maker can decide whether they desire this robustness to be with respect
to an objective function, constraints, or both. Under severe uncertainty, the aim of maximising
robustness remains the same.
Hence, when discussing the field of ‘Robust Optimisation’, this covers our concepts of robust
optimising, robust satisficing, and robust optimising & satisficing, as presented in Chapter 3.
State of the Art
Many real-world problems are subject to severe data uncertainty. Robust optimisation is a field
of operations research which attempts to recommend decisions which will perform well according to some standard, regardless of the unpredictability of the true state of nature. Unlike
probabilistic methodologies, such as stochastic programming, much of robust optimisation revolves around the use of sets in order to provide a description of the uncertainty at hand.
The robust counterpart of a problem examines all, or as many as possible, instances within the
uncertainty set as if they were occurring with certainty, and aims to maximise the number of
states in which a decision will perform well.
While the tractability of the original problem is not maintained by its robust counterpart, approximations can be formulated, and, due to the preservation of convexity, methods of convex analysis can be applied in order to assist in the solution process of a problem.
There exist many approaches to robust decision-making, however one major gap in the field is
that of solution techniques. In many cases of robust decision-making, real-world, large-scale
problems can be very difficult to solve. Unlike many other areas of operations research, there
are no set algorithms or software packages in which we can ‘input’ our problem and receive an
output solution. Due to this, we require a proper understanding of the problem at hand and
good modelling skills in order to capture the essence of the problem.
Vital to our understanding and ability to model problems is the recognition that, as the literature on robust decision-making is large, unification of the field is necessary. Definitions and concepts of basic elements of decision-making under severe uncertainty are blurred in the literature between subfields of robust decision-making approaches.
One aim of this thesis is to promote a unified perspective on some of the fundamental concepts
of robust decision-making under severe uncertainty.
Following this, we can attempt to deal with issues of tractability and computational implementation of our models in the hope of representing our problem with more precision and accuracy. As discussed by Beyer and Sendhoff [2007], much of the literature regarding robust
optimisation techniques using mathematical programming methods over uncertainty sets has
a strong theoretical background. However, a great weakness in this field lies in the applicative
literature due to intractability spawning from problem size.
4.7 Maximin Models in Disguise
“. . . the worst scenario . . . can be considered a building brick of various approaches to
uncertain data, or they themselves can render useful results in the case of minimum
information on input data. In actual fact, the worst scenario approach is unavoidable
at least on some level of modelling”.
[Hlavacek et al., 2004, pg.49]
“. . . all and any of the above approaches [parametric linear programming, sensitivity
analysis, and sampling], except worst-case analysis, represent a cumbersome way of
consciously looking for the wrong solution”.
[Wallace, 2000, pg.22]
Borrowing the title ‘Maximin Models in Disguise’ from Sniedovich [2008a], we present proofs
that Starr’s domain criterion [1966], Soyster’s inexact linear programming [1973], and Ben-Tal
and Nemirovski’s robust counterpart approach Ben-Tal and Nemirovski [1999] are all instances
of Wald’s Maximin criterion.
Sniedovich [2008a] demonstrates that Info-Gap’s robustness model is an instance of Wald’s
Maximin criterion, and uses the phrase ’Maximin models in disguise’ to refer to models which
don’t ‘look’ like Maximin models, but which in fact are.
This disguise comes in the form of a mathematical programming formulation of Wald’s Maximin criterion, where the worst-case analysis is dealt with using the ‘∀ states’ condition as seen
in the constraints, rather than having an explicit ‘min’ as with its classical formulation. This
‘∀’ bounds the problem in the sense that the worst possible state for a decision must satisfy
all constraints, otherwise we cannot achieve the maximum return, as described in Chapter 4.
The greatest level of return that one decision can guarantee is the lowest return that one can
physically achieve with that decision.
The significance of Sniedovich’s [2008a, 2007] proof, and the following proofs, Proof 4.3, Proof 4.4,
and Proof 4.5, is that they demonstrate that successful robust decision-making tools share a
rudimentary concept, that is, a decision can only perform well over an entire subset of the state
space if it performs well in the worst case. The decision is hence bound by the least desirable
state in that subset.
As for recognising that the below models are, in fact, Maximin models – this is important as the
Maximin criterion carries with it fundamental concepts and a history which allows us insight
into how to approach robust decision-making. As shown throughout Chapter 4, the Maximin
criterion provides a core template for robust decision-making models under severe uncertainty.
If we are unable to identify our model as a Maximin in disguise, it suggests a poor understanding of the model that we have, the problem which it represents, and the way we should
approach solving the problem. Such an inability is tantamount to that of not recognising that a
given model is, for example, a linear programming model, an integer programming model, or a
Markovian decision model, and this may result in reinvention of the wheel, or worse, reinvention of the ‘square wheel’.
We begin by presenting a proof that Starr’s domain criterion is an instance of Wald’s Maximin.
This proof was alluded to in Sniedovich [2008a].
Theorem 4.3. Starr’s domain criterion is an instance of Wald’s Maximin criterion.
Proof. Starr’s domain criterion aims to maximise the volume of a domain πk , k = 1, . . . , m,
which is defined by the following inequalities [Schneller and Sphicas, 1983]:
π_k = { p ∈ P : ∑_{j=1}^n c_kj p_j ≥ ∑_{j=1}^n c_ij p_j, ∀i ≠ k },
where P is the set of all possible probability distributions relative to the assumed discrete state
variables. As described in Chapter 4.6, the aim of Starr’s domain criterion is to maximise the
volume of the domain of a decision k, where the domain is made up of points corresponding to
all possible probability distributions where decision k is best. Defining V (πk ) as the volume, or
size, of the domain of decision k, we get [Schneller and Sphicas, 1983]:
v*(k) = max_{π⊆P} V(π)    (4.1)
s.t.   ∑_{j=1}^n c_kj p_j ≥ ∑_{j=1}^n c_ij p_j, ∀i ≠ k, ∀p ∈ π    (4.2)
       ∑_{j=1}^n p_j = 1    (4.3)
       0 ≤ p_j ≤ 1, j = 1, . . . , n.    (4.4)
Here, the 'for all' constraint tells us that the value of the left hand side of the inequality is bounded by the value of the maximum decision d_k, giving us

(∑_{j=1}^n c_ij p_j, ∀i ≠ k) ≡ max_i ∑_{j=1}^n c_ij p_j.

Let r(p) = max_i ∑_{j=1}^n c_ij p_j. As P defines the set of all possible probability distributions, the constraints (4.3) and (4.4) are redundant, and we can rewrite the above as follows:
v*(k) = max_{π⊆P} V(π)
s.t.   ∑_{j=1}^n c_kj p_j − r(p) ≥ 0, ∀p ∈ π.
Letting h(k, p) = ∑_{j=1}^n c_kj p_j − r(p) gives:

v*(k) = max_{π⊆P} { V(π) : h(k, p) ≥ 0, ∀p ∈ π }.
Thus, an optimal decision is a decision that is optimal with respect to the following problem,
restructured for visual clarity in Table 4.2 below:
v* = max_k max_{π⊆P} { V(π) : h(k, p) ≥ 0, ∀p ∈ π }.
This is analogous to the mathematical programming Maximin formulation for partial robust
satisficing as given above. Corresponding notation is summarised as follows:
Table 4.2: Starr's domain criterion reformulation

Wald's Maximin MP formulation:  max_{d∈D} max_{Ŝ⊆S(d)} { V(Ŝ) : h(d, s) ∈ C, ∀s ∈ Ŝ }
Starr's domain criterion:       max_k max_{π⊆P} { V(π) : h(k, p) ≥ 0, ∀p ∈ π }

Correspondence:  d ↔ k,  D ↔ D,  s ↔ p,  Ŝ ↔ π,  S ↔ P,  V(Ŝ) ↔ V(π),  h(d, s) ↔ h(k, p),  C ↔ R₊.

Hence it has been shown that Starr's domain criterion is an instance of Wald's Maximin criterion.
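As an illustration of the criterion itself (not part of the proof), V(π_k) can be estimated by Monte Carlo sampling of the probability simplex; the payoff matrix C is an assumed input:

```python
import numpy as np

def starr_domain_volumes(C, samples=100_000, seed=0):
    """Estimate V(pi_k) for each decision k by sampling probability
    vectors p uniformly from the simplex (a Dirichlet(1,...,1) draw)
    and counting how often sum_j c_kj p_j is the largest expected value.
    C[k, j] is the payoff of decision k in state j (assumed data)."""
    rng = np.random.default_rng(seed)
    C = np.asarray(C, dtype=float)
    p = rng.dirichlet(np.ones(C.shape[1]), size=samples)   # points of P
    winners = np.argmax(p @ C.T, axis=1)                   # best k per p
    return np.bincount(winners, minlength=C.shape[0]) / samples
```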
Next, it will be shown below that Soyster’s inexact linear programming model is also an instance of Wald’s Maximin criterion.
Theorem 4.4. Soyster’s inexact linear programming model is an instance of Wald’s Maximin criterion.
Proof. Soyster [1973] formulates the following convex mathematical programming problem for
inexact linear programming:
max_{x∈R^n, x≥0} c^T x
s.t.  x_1 K_1 + x_2 K_2 + · · · + x_n K_n ⊆ K(b) := {y ∈ R^n : y ≤ b},
      x_j ≥ 0,
where the K_j are non-empty convex sets, and the operation x_j K_j involves the multiplication of the scalar x_j with all elements within the set K_j. This can be rewritten as
max_{x∈R^n, x≥0} { c^T x : x_1 a_1 + x_2 a_2 + · · · + x_n a_n ≤ b, ∀a_j ∈ K_j, j = 1, . . . , n }
as we can choose any vector a j ∈ K j for each j to maximise c T x. Each a j , j = 1, . . . , n is a vector,
as is b. Note that the ‘∀’ with respect to the constraints gives away the Maximin nature of the
problem. A Maximin formulation of this model, which is robust over the entire state space, is
then as follows:
max_{x∈R^n, x≥0} min_{a_1,...,a_n: a_j∈K_j} (c^T x) · (x_1 a_1 + x_2 a_2 + · · · + x_n a_n ⊴ b),

where

(r ⊴ b) = 1 if r ≤ b component-wise, and 0 otherwise.
This formulation is equivalent to the complete robust satisficing Maximin formulation given in
Chapter 4.2.
A continuation of this proof which demonstrates Soyster’s [1973] result that this formulation
can be reformulated into a simple linear programming problem is given below in Chapter 5.5.
We next present a proof that Ben-Tal and Nemirovski’s robust counterpart of an uncertain linear
programming problem is also a Maximin formulation [Ben-Tal and Nemirovski, 1999].
Theorem 4.5. Ben-Tal and Nemirovski’s robust counterpart of an uncertain linear programming problem is an instance of Wald’s Maximin criterion.
Proof. Ben-Tal and Nemirovski [1999] formulate the following robust counterpart to a generic
linear programming problem:
min_{x∈G_U} c^T x,

where G_U = {x : Ax ≥ 0, ∀A ∈ U; f^T x = 1}. We will rewrite this as a maximisation problem in order to fit in with prior notation. This leaves us with the following:

max_{x∈G_U} −c^T x,

where G_U = {x : Ax ≥ 0, ∀A ∈ U; f^T x = 1}.
We can reformulate this as follows:
max_{x≥0} min_{A∈U} (−c^T x) · (Ax ⊵ 0)
s.t.  f^T x = 1,

where

(r ⊵ b) = 1 if r ≥ b component-wise, and 0 otherwise.
This formulation is equivalent to the partial robust satisficing Maximin formulation given in
Section 4.3.
Hence we have provided proofs that Soyster’s inexact linear programming approach, Ben-Tal &
Nemirovski’s robust counterpart approach, and Starr’s domain criterion are instances of Wald’s
Maximin criterion.
The importance of these proofs, and Sniedovich's proof [2008a, 2007], lies in their illustration of the foundational and inherent concept that should underlie all robust decision-making techniques, namely, a decision is bounded by the worst performing state within a subset of the state space.
We now present an argument against claims in the info-gap literature that, for info-gap robustness, there exists no worst case, and that info-gap's robustness approach is hence not a Maximin analysis.
“Since the horizon of uncertainty is unbounded, there is no worst case and the infogap analysis cannot and does not purport to ameliorate a worst case.”
[Ben-Haim, 2005, pg.4]
As discussed above in Chapter 4.4, Info-Gap is an example of local robust satisficing. The results of Sniedovich’s [2007] proof that Info-Gap is an instance of Wald’s Maximin criterion are
presented below.
Info-Gap’s robustness framework is as follows:
max_{q∈Q} max_α { α ≥ 0 : rc ≤ min_{u∈U(α,ũ)} R(q, u) }.
Sniedovich [2008a] presents a proof that:
max_{q∈Q} max_α { α ≥ 0 : rc ≤ min_{u∈U(α,ũ)} R(q, u) } = max_{q∈Q, α≥0} min_{u∈U(α,ũ)} α · (rc ⊴ R(q, u)),

where

(a ⊴ b) = 1 if a ≤ b, and 0 otherwise, for a, b ∈ R, with q ∈ Q, u ∈ U(α, ũ), α ≥ 0.
This '⊴' relation is identical to that demonstrated in Chapter 4.3. It can be seen that the model
shown on the right hand side of the equality is a local robust satisficing model, as seen in Chapter 4.4. In trying to minimise our return, the antagonist will attempt to choose a state u such
that the inequality rc ≤ R(q, u) does not hold. This worst case gives us a return of zero.
Any state u which results in the objective function equalling zero is technically a worst case
state. However, we are also interested in maximising our horizon of uncertainty, say, a radius, α, around our estimate. Hence, our adversary will not only choose a state in which our constraint
rc ≤ R(q, u) is not satisfied, but she will choose the state which reduces the value of α to its
minimum. Then, as discussed in Chapter 4, the worst-case scenario is that which violates the
constraints ‘closest’ to the value of our estimate, as α cannot ‘grow’ past this point.
This conflicts with remarks made in the Info-Gap theory regarding there existing no worst case
scenario. When there is uncertainty regarding which true state of nature will eventuate, worstcase analysis finds a solution which will be robust over some subset of the state space. Generally
speaking, the larger this state space is, the more robust the decision is.
Worst-Case Analysis
For modelling purposes, we can design a decision-making model such that the worst-case is
the state which reduces the return of an indicator function. This is shown in Chapters 4.2, 4.3,
and 4.4.
Recall in Chapters 4.2, 4.3, and 4.4, that the maximisation action that a decision-maker takes
differs between complete robustness, and partial & local robustness. In complete robust optimisation, the maximisation relates to the value of the objective function, while in partial & local
robust optimising or satisficing, we aim to maximise the size of the subset of states over which our decision performs well.
Similarly, the idea of what the worst case is differs between complete robustness, and partial &
local robustness. We restrict the discussion here to robust satisficing, as Info-Gap is a local robust
satisficing methodology.
When we have a decision which is completely robust, under satisficing we can conclude that
our decision satisfies all constraints in all states of the state space. If our decision is partially
or locally robust, we can deduce that there exists at least one state in the state space where our
decision violates the constraints³.
Following this, if our decision is completely robust for our satisficing problem, then there is
no worst case within the state space, as there is no state for which our decision does not hold.
Under complete robust optimisation, the worst case provides the lowest guaranteed return.
This is found by pre-empting the worst state for each decision, and then choosing the decision
with the best worst-case return. In this case, our worst case state is within the set of states over which our decision is robust.
If our decision is partially or locally robust, then there exists at least one state in the state space
for which the decision does not satisfy the constraints. In this case, these ‘bad’ states are not in
the subset of states over which our decision is robust. Including such a state in our subset,
Ŝ(d) ⊆ S(d), will set our indicator function to zero.
However, we have another ‘measure’ which we wish to maximise, namely the size of our subset of the state space, V (Ŝ), for partial robustness, or our horizon of uncertainty, α, for local
robustness.
Under Info-Gap, α is the ‘distance’, most commonly the radius of a circle, which we can move
from our estimate such that our decision will remain robust. Our adversary wishes to minimise
the value of α, as this affects our return. In this case, Mother Nature shall choose the ‘bad’ state
closest to our estimate as our ‘worst’ state. This will minimise both the value of the indicator
function, and decrease the value of α.
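On a discretised state space, this argument reduces to a short computation; a sketch assuming the violating states have already been identified:

```python
import numpy as np

def horizon_of_uncertainty(bad_states, estimate):
    """The largest radius alpha to which the region of uncertainty can
    grow before hitting a state that violates the constraints: simply
    the distance from the estimate to the nearest 'bad' state.
    bad_states is an (m, dim) array of such states (assumed layout)."""
    bad = np.atleast_2d(np.asarray(bad_states, dtype=float))
    return float(np.min(np.linalg.norm(bad - np.asarray(estimate), axis=1)))
```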
[Figure 4.20: An example of worst case states in Info-Gap.]
Figure 4.20 demonstrates a performance grid for some decision. As in previous figures, the
black circles represent ‘good’ states – in this case, states for which our decision satisfies all
constraints. Grid vertices that are not represented with a circle are 'bad' states. The pairing of these states with our decision gives solutions for which at least one of the constraints is violated. The three larger, red circles represent our worst case states. Not only does our decision violate the constraints in these states, but they are also closest to our estimate, represented as the centre grey dot. This means that our horizon of uncertainty, α, cannot grow any further.

³ If we attempt partial or local robustness, and the decision found holds over the entire state space, then we can call this decision completely robust, even though the initial approach was a partial or local approach. However, it must be emphasised that a decision found using a partial or local robust approach which holds over the entire state space may not be identical to that found using a complete robust approach.
This difference between the worst case for complete robustness and that for partial & local robustness appears to confuse Ben-Haim and Davidovitch [2008] when they argue that Info-Gap is not a Maximin (Minimax) model:
“When minimaxing, we assume that we know the horizon of uncertainty and that
its value is α_m. Then the minimax decision maker looks for the decision q∗ that
guarantees the minimal loss for any state of the world at uncertainty αm . . . On the
other hand, when robust-satisficing . . . we have no idea what is the worst case, since
we do not know the true horizon of uncertainty.”
[Ben-Haim and Davidovitch, 2008, pg.11–12]
“Since the horizon of uncertainty is unknown, there is no worst case”.
[Ben-Haim, 2007a, pg.3]
Firstly, they describe a Minimax for a complete robust optimising decision, rather than a local
satisficing decision. This assumes that the worst case state exists within the entire set of states,
over which our decision is robust. If we do not know what states are included in this subset,
then we cannot find such a worst case state. Hence, under complete robust optimising, the
boundaries of the state space should be fixed⁴.
For partial & local robust satisficing, and optimising, for that matter, we do not need to know
what the horizon of uncertainty is prior to determining the worst case. In fact, this is the opposite of what we require. The horizon is bounded by the state closest to the estimate which does
not satisfy all the constraints, hence the horizon of uncertainty can only be determined once we
have found the worst case state. In other words, the horizon of uncertainty is defined by the
worst case.
This demonstrates that Info-Gap does, indeed, have a worst case, and as shown by Sniedovich
[2008a], is an instance of Wald’s Maximin criterion.
“The worst scenario method represents a substantial part of the information-gap
theory.”
[Hlavacek et al., 2004, pg.xix]
⁴ This may be considered trivial as the boundaries are those which border the entire state space.
Chapter 5
Model Choice: What is Attractive?
Here we discuss what main factors should be taken into account when modelling problems that
are subject to severe uncertainty.
5.1 Problem Formulation and Modelling
It is important to note the difference between a problem formulation and its model. A model
is a mathematical representation of the problem at hand. The problem formulation can be considered as the precursor (a combination of the forethought and assumptions required) to the
construction of a model, as seen in Figure 5.1.
[Figure 5.1: Lifecycle of a problem: Situation & Data → Formulate Problem → Problem Formulation → Construct a Model → Model → Implement Solving Technique → Solution → Implementation & Modelling.]
Problem formulation is the first analytical step we take in solving the problem. It involves
articulating the given problem more precisely using objectives, constraints, assumptions, identification of possible actions, and data requirements. This step is extremely important to the
decision-making process, and much of our previous discussion in this thesis can be directly applied to this stage. Note that at this stage of the cycle, we have yet to quantify our uncertainty.
The following step combines these elements into a mathematical model which can (hopefully)
be used as input into some problem-solving method, whether it be a large computer-based
model or a simple toy model. A model may involve some simplification of its problem formulation via assumptions, in order to allow mathematical treatment, hence its reflection of reality
may not be as accurate as the representation given in the formulation stage. Such assumptions
may not be genuine representations of what occurs in actuality, but may be necessary in order
for further problem development. It is at this stage that we must quantify our uncertainty in
order for it to be amenable to mathematical analysis.
The importance of these discussed stages must be emphasised. A solution to a poorly formulated model should be taken as a poor solution. The attitude that solving a problem is the be-all
and end-all factor to decision-making under severe uncertainty neglects the importance of good
problem formulation, and underestimates the effect of any assumptions made (in order to solve
a problem) on a solution.
Suppose we are approached with a decision-making problem which we need to formulate and
solve. Here we give two examples of the difficulties in moving from a problem formulation to
its representative model:
The person who approaches us with the problem gives us what they call a poor estimate of an
unknown parameter, a wild guess to the true value. In our formulation, we have no probabilities or mathematical structures ascribed to this estimate. However, in the model, we must ask,
how will we translate this poor estimate mathematically?
The person who approaches us also tells us that they want this solution to be robust. In our
formulation, we need to determine in what sense they want robustness; namely robust optimising, robust satisficing, or both. When we incorporate this into the model, we need to consider
what kind of robustness the model can cover, and where we want the robustness to lie; in the
constraints, the objective function, or both.
5.2 Uncertainty
“In specifying a subjective probability distribution that represents the present state
of knowledge for an uncertain model input, care should be taken first to produce a
clearly written rationale that summarizes all of the evidence before committing to
an actual distribution. The shape of the distribution need not be any known mathematical form; it should be whatever shape properly reflects the state of knowledge
dictated by the totality of evidence.”
[Hoffman and Kaplan, 1999, pg.1]
“We can hardly strive for an improvement in the reliability of mathematical modelling and for better insight into modelled phenomena unless we direct our attention
to the very foundations: the input data. Since inputs are inseparably burdened with
uncertainty, we have to learn how uncertainty can be rigorously included into mathematical models.”
[Hlavacek et al., 2004, pg.49]
A prominent difficulty faced when applying mathematical analysis to problems involving severe uncertainty is attributable largely to the fact that the uncertainty present is not quantified.
It is this lack of key information that distinguishes uncertainty from risk, and leads to much
confusion in the application of mathematical tools in order to solve the problem.
If a problem occurs under conditions of severe uncertainty, it is after problem formulation has
occurred that we should think about ascription of probabilities or other mathematical structures
that will allow us to solve it, as discussed above in Chapter 5.1. Once we have solved the
problem, we should then think about how these probabilities have impacted our solutions.
As mentioned in Chapter 2.5, the level of uncertainty in a problem is given by the size of the uncertainty region, or, if using an estimate, the quality of this estimate. When deciding on a model
to represent our problem, we must do our best to ensure that the severity of the uncertainty is
reflected.
If we are given an estimate which is good, then we can perhaps represent this by weighting
values around our estimate more highly than those that are further away, or perhaps we can
ignore part of our uncertainty region far from the estimate. We might use the estimate deterministically, and then perform a sensitivity, parametric or local analysis around it.
Bustamante-Cedeno and Arora [2008] state that ‘robust risk analysis is erroneously confused to
be tied up with equally likely scenarios'. This confusion results from the blurring of lines between problem formulation and implementation of problem-solving techniques, and generally
occurs in the attribution of probabilities which enable mathematical analysis.
It may seem that many non-probabilistic techniques for robust risk analysis incorporate certain
distributions, however in some cases this is demonstrative and should not be taken as gospel.
In other words, assigning an unknown parameter a certain distribution purely to suit the model
which we wish to use to solve a problem is a poor way of formulating it.
5.3 Satisficing Versus Optimising
As discussed in Chapter 3, it is important to make clear what it is we wish to satisfice, and what
it is we wish to optimise.
If we have conditions of severe uncertainty and we want to guard against undesirable outcomes, then optimisation of robustness is critical to ensure that as many bases as possible are
covered. In terms of performance, we must then decide whether it is suitable to optimise some
performance measure, whether it is enough to choose an element that ensures that all constraints will be satisficed over all possible outcomes, or whether we wish to do both.
One benefit that optimising has over satisficing is that traditional optimisation software packages, being built for optimisation, may not be equipped to solve pure satisficing problems. However, on smaller scale problems it may be very simple to determine a satisficing element.
From complexity theory, we have the result that for most cases, an optimisation problem is only
as difficult as its satisficing counterpart [Schrijver, 2003]. Given a real-valued function f over X,
where X represents all elements consistent with the given constraints, if we need to maximise
f ( x ) for x ∈ X, we can transform this to the following recognition problem:
Given a rational constant, c, can we find an x ∈ X such that f ( x ) ≥ c?
Provided that we can bracket the maximum value between a lower and an upper bound, by using a binary search we can pose this recognition problem for different values of c to find the maximal element. Hence if we
have an efficient algorithm which solves the recognition problem (the satisficing problem), we
can use this to derive an efficient algorithm for its equivalent optimisation problem.
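A sketch of this reduction, assuming a hypothetical recognition oracle and bracketing bounds on the maximum:

```python
def maximise_via_recognition(recognise, lo, hi, tol=1e-6):
    """Binary search for the maximum of f over X using only a recognition
    oracle: recognise(c) returns some x in X with f(x) >= c, or None if no
    such x exists.  lo must be achievable and hi unachievable (assumed
    bracketing bounds); the oracle itself is a hypothetical callable."""
    best = recognise(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        x = recognise(mid)
        if x is not None:      # value mid is achievable: raise the floor
            lo, best = mid, x
        else:                  # mid exceeds the maximum: lower the ceiling
            hi = mid
    return lo, best
```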
5.4 Robustness
In this section, we deal with how one should choose between complete or partial robustness.
We discussed in Chapter 4 how a local analysis does not capture robustness over the uncertainty
region successfully, hence we shall omit its discussion here.
When deciding whether complete or partial robustness is required for a problem, it is important
to consider the intensity of the uncertainty, and the possible risks involved.
If the uncertainty is severe and the potential losses are great, then any resulting conservatism
which comes from insisting on complete robustness over the state space may be warranted.
Complete robustness is also beneficial in the sense that it gives a lower bound on the guaranteed
return, or for satisficing ensures that all constraints will be met no matter what state eventuates,
and can automatically gauge whether a feasible solution exists or not by testing the worst case.
If uncertainty is severe yet we are willing to risk a loss, or if the uncertainty is severe and risk
is high but complete robustness cannot produce a plausible solution and a decision needs to be
made, then we should examine partial robustness over the state space.
Partial robustness aims to maximise the size of a subset of the state space for which our decision
performs well.
5.5 Tractability
Whilst this thesis has focussed heavily on the formulation aspect of robust decision-making
problems, we must discuss the importance of a tractable solution to a model. The robust counterpart of a tractable problem may not be tractable [Bertsimas et al., 2007]. Here we discuss the
benefits that some approaches have over others due to issues involving tractability, complexity
and implementation.
Kouvelis and Yu’s robust discrete optimisation approach
Kouvelis and Yu [1997] show that for many combinatorial problems¹ which are solvable in polynomial time², their robust counterparts are NP-hard. Mentioned below, Bertsimas and Sim
remedy this for problems solvable in polynomial time using their robust counterpart approach,
where bounds are placed on the uncertain coefficients and we have a parameter which controls
the number of coefficients which are allowed to vary [Bertsimas et al., 2007].
The difficulty in finding tractable solutions spawns mainly from the Maximin nature of the
problem, as well as the large dependence on scenario generation. They also mention the relative simplicity of having continuous variables and scenario sets rather than discrete variables and scenario sets. Scenario generation is discussed further on in this chapter.
¹ These include the robust shortest path problem, the minimum spanning tree problem, and the knapsack problem.
² The time taken to implement an algorithm is no longer than some polynomial function of the problem size.
Approximation techniques based on piecewise linearisation have been proposed for robust discrete optimisation.
Kouvelis and Yu [1997] also present models for tractable discrete optimisation problems, including that of the robust economic order quantity (EOQ) problem. We refer the reader to Chapter
4 in their book [Kouvelis and Yu, 1997] for more details.
Starr’s domain criterion
One limitation of Starr’s domain criterion is that for a large state space, the computation of
volumes of polyhedra is highly intractable [Schneller and Sphicas, 1983, Eiselt et al., 1998]. For a
large scale problem, the only way to avoid this is to narrow down the size of the state space. This
may be done by discretising a continuous state space, and if possible, discarding low probability
states.
While this can be a fairly simple process to execute, a lot of thought and reason may need to go
into deciding which states are discarded, and in what way to discretise the space (i.e., what size
intervals are required, should they be equal) in order to best represent the total state space.
Also, following acquisition of results, an analysis of how the structure of the end state space affects the solution should be conducted, in order to decide whether the decision is implementable
without dire consequences.
Soyster’s inexact linear programming approach
We first examine Soyster’s approach [1973]. Following the proof of Theorem 4.4, we were left
with the following model:
max_{x∈R^n, x≥0} min_{a_1,...,a_n: a_j∈K_j} (c^T x) · (x_1 a_1 + x_2 a_2 + · · · + x_n a_n ⊴ b),

where

(r ⊴ b) = 1 if r ≤ b, and 0 otherwise.
We can rewrite the above constraint in the following manner:
x_1 (a_11, a_21, a_31, . . . , a_m1)^T + x_2 (a_12, a_22, a_32, . . . , a_m2)^T + · · · + x_n (a_1n, a_2n, a_3n, . . . , a_mn)^T ≤ (b_1, b_2, b_3, . . . , b_m)^T.
It can be seen that the a_ij's are independent of each other, of the x_j's, and of the b_i's. As these a_ij's are independent of the x_j's, we can reformulate the Maximin model as such:
    max_{x ∈ R^n}  (c^T x) · min_{aj ∈ Kj} 𝟙[x1 a1 + x2 a2 + · · · + xn an ≤ b]
    s.t.  xj ≥ 0, j = 1, . . . , n.
We can also remove the ‘inner’ optimisation by letting ãij, for each i = 1, . . . , m and j = 1, . . . , n, be the largest element attainable in the vector aj over aj ∈ Kj. This can be done as Mother Nature, being the antagonist, wants to minimise c^T x with her choice of aj for each j = 1, . . . , n. This means that she will choose the largest value (as xj ≥ 0) within the vector aj, say aij, for each j = 1, . . . , n, in the hope that the constraint cannot be obeyed, in which case the return is set to zero. Hence, regardless of i, for each j = 1, . . . , n we want to choose the largest element aij. This gives:
    max_{x ∈ R^n}  (c^T x) · 𝟙[x1 ã1 + x2 ã2 + · · · + xn ãn ≤ b]
    s.t.  xj ≥ 0, j = 1, . . . , n.
This formulation whereby the ‘inner’ optimisation is removed is possible due to what Ben-Tal
and Nemirovski term column-wise uncertainty [Ben-Tal and Nemirovski, 1999].
This Maximin model can be transformed into an equivalent standard linear programming model, as demonstrated by Soyster [1973]:

    max_{x ∈ R^n, x ≥ 0}  c^T x
    s.t.  Ãx ≤ b,

where Ã is a matrix consisting of entries ãij as defined above. The reason that this model displays such strong conservatism is that it sets all unknown parameters to their worst values in order to be robust over all possible scenarios.
An example demonstrating the formation of the new matrix Ã is as follows. Suppose we have the following constraints with column-wise uncertainty:

    x1 (7, 3, 10)^T + x2 (2, 6, 3)^T + x3 (1, 1, 0)^T + x4 (6, 8, 4)^T ≤ (35, 64, 28)^T.
As Soyster’s method chooses the largest value in each column in attempt to violate the less than
or equal to constraint, we will get the following linear programming problem:


10
6
1


 10

6
1
10
6
1
x1




35


  x2  
 

 ≤  64
8 
 x  
 3 

28
8 
x4
8






Despite the heavy conservatism, Soyster's approach is quick to formulate, as it reduces to a standard linear programming model which can be solved easily with an optimisation software package. Due to its simple implementation, Soyster's model could be recommended for a problem under severe uncertainty where risks and possible losses are extremely high, so that conservatism is warranted.
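To make this concrete, the following Python sketch (a minimal illustration using scipy.optimize.linprog; the objective vector c is assumed for demonstration, as the example above specifies only the constraints) forms Ã from the example system and solves the resulting linear program.

```python
import numpy as np
from scipy.optimize import linprog

# Columns of the example system (a_1, ..., a_4) and right-hand side b.
A_cols = np.array([[7, 3, 10],    # a_1
                   [2, 6, 3],     # a_2
                   [1, 1, 0],     # a_3
                   [6, 8, 4]]).T  # a_4 -> A has shape (3, 4)
b = np.array([35, 64, 28])

# Soyster's worst case, as described above: every entry of column j
# is replaced by that column's largest element.
A_tilde = np.tile(A_cols.max(axis=0), (A_cols.shape[0], 1))

# An assumed illustrative objective c (not from the thesis); linprog
# minimises, so we negate to maximise c^T x.
c = np.ones(4)
res = linprog(-c, A_ub=A_tilde, b_ub=b, bounds=[(0, None)] * 4)
print("robust allocation:", res.x, "guaranteed value:", -res.fun)
```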
Ben-Tal and Nemirovski’s robust counterpart approach
Ben-Tal and Nemirovski’s [1998, 1999, 2000, 2002] robust counterpart approach was developed
to counter the conservatism inherent in Soyster’s approach. The robust counterpart of a mathematical programming problem is generally a semi-infinite optimisation problem [Ben-Tal and
Nemirovski, 1998, 2002]. The difficulty with this approach is that semi-infinite optimisation
problems are often computationally intractable, especially when the problems involve large
numbers of variables, which is often the case in real-world problems.
Recent research identifies that in order to be solved efficiently, the robust counterpart of a problem should not go beyond a second-order cone problem [Beyer and Sendhoff, 2007]. Ben-Tal
and Nemirovski suggest that uncertainty sets should be ellipsoidal, as this results in a conic
quadratic program as the robust counterpart [Ben-Tal and Nemirovski, 1999]. Interior point
methods can be used to solve large conic quadratic problems. Results are also presented for
quadratically constrained quadratic convex programming problems, conic quadratic programming problems, and uncertain semidefinite programs, to name a few [Ben-Tal and Nemirovski,
1998]. If we cannot model such uncertainty sets, approximations of robust counterparts may
need to be implemented instead.
Ben-Tal and Nemirovski [2002] offer suggestions of approximate robust counterparts, where the uncertainty set U is replaced by an approximation, Uρ, of itself:

    Uρ = ζn + ρV,

where ζn is the nominal data, V is a convex, compact perturbation set containing the origin, and ρ ≥ 0 is the level of uncertainty, with ρ = 1 recovering the original uncertainty set U. The ‘+’ translates our reference point from the origin to that of our nominal data. This transforms our original mathematical programming problem into a parametric family of uncertain problems. The approximate robust counterpart of the original problem can be found in Ben-Tal and Nemirovski [2002].
Bertsimas and Sim's robust approach
Following the work of Ben-Tal and Nemirovski [1998, 1999, 2000, 2002], Bertsimas and Sim
[Bertsimas and Sim, 2004] developed a new robust approach. In this approach, they define a
parameter, Γi , which restricts the number of uncertain coefficients that are allowed to change
from some nominal problem constraint i.
There are several advantages that Bertsimas and Sim’s method has over previous programming
approaches. The first is that it allows the user to control the level of conservatism by specifying the value of the parameter Γi. The second is that it is possible to reformulate
their nonlinear robust model as an equivalent linear model, and hence this model is tractable
[Kachani and Langella, 2005].
Bertsimas and Sim also extend this approach to discrete optimisation problems [Bertsimas and
Sim, 2003], which cannot be done using Ben-Tal and Nemirovski’s approach due to the structure
of the robust counterparts. We have previously mentioned, in Chapter 4.6, a robust discrete
optimisation method by Kouvelis and Yu [1997]. One benefit that Bertsimas and Sim’s extension
has over Kouvelis and Yu’s is that of tractability [Kachani and Langella, 2005, Bertsimas and
Sim, 2003]. Under Kouvelis and Yu’s approach, the robust counterpart can become NP-hard,
whereas Bertsimas and Sim’s robust counterpart is polynomially solvable if the original {0, 1}discrete optimisation problem is polynomially solvable.
Scenario generation
Examination of large scenario ensembles is considered by Lempert et al. [2003] to be one of the
four key elements for a robust decision approach. The basis behind scenario ensembles is that
we can approximate the future state by looking at a sufficiently diverse set of scenarios, and
then perform further analysis on decisions which perform well across a wide range of possible
scenarios. A robust decision will perform well over a diverse range of plausible futures.
In order to acquire large scenario ensembles, we need to generate plausible scenarios. This
can be computationally heavy if we wish to ensure diversity and an ensemble of sufficient
size. Enumerating all possible scenarios is beyond us at present, but with recent developments
in technology, by combining scenario generation with exploratory-modelling software, we are
able to produce an approximate representation of the ensemble [Lempert et al., 2003].
Methods which utilise scenario generation include Mulvey’s robust optimisation approach [1998],
Kouvelis and Yu’s robust discrete optimisation approach [1997], and Lempert et al.’s robust
decision-making [2006].
In terms of software packages which deal with scenario generation, RAND [Lempert et al., 2003] recommends the use of the Evolving Logic Computer Assisted Reasoning® system.
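As a minimal sketch of scenario-ensemble generation (an assumed box-shaped uncertainty region and uniform sampling, chosen for illustration rather than taken from any particular package), we can draw a large ensemble and score candidate decisions across it:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed box uncertainty region for two parameters (lower, upper bounds).
lo = np.array([0.12, 0.12])
hi = np.array([0.14, 0.30])

# Generate a large scenario ensemble by uniform sampling over the box.
scenarios = rng.uniform(lo, hi, size=(10_000, 2))

# Candidate decisions (allocations of a $100 budget to two schemes).
decisions = {"(40, 60)": np.array([40.0, 60.0]),
             "(60, 40)": np.array([60.0, 40.0])}

for name, x in decisions.items():
    returns = scenarios @ x          # performance in every scenario
    print(f"{name}: worst = {returns.min():.2f}, "
          f"median = {np.median(returns):.2f}")
```

A robust decision is one whose performance profile remains acceptable across the whole ensemble, not merely at the nominal scenario.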
Chapter 6
Illustrative Examples

6.1 Budget Allocation to Carbon Offsets Schemes
Climate change is a hot topic of our times, with many concerned as to how our daily behaviour
may affect our environment. Travelling, commuting, electricity use, and many other factors increase a person’s individual greenhouse gas emissions. Individuals are now able to voluntarily
purchase carbon offsets in an attempt to become carbon neutral. Offsets may be generated from
renewable energy projects and energy efficiency projects, among others.
Our problem is one of allocating funds to different carbon offsetting schemes given a certain
budget. Due to the lack of high quality estimates regarding emissions reduction parameters for
different carbon offsetting schemes, this problem is one under severe uncertainty. We assume that the uncertainties with regard to the returns are uncorrelated.
Problem Formulation
A single individual wishes to allocate funds between 7 different carbon offsetting schemes given
a pre-specified budget.
Data borrowed from Laurikka and Springer [2003], collected from a public database, has common units of tonnes per dollar (tCO2 /$) across all schemes, which allows us to calculate returns
in terms of emissions reduction per dollar invested. The different offsetting schemes and their
corresponding mean returns and standard deviations as given in Laurikka and Springer [2003]
are shown below in Table 6.1.
It is these means and standard deviations which are subject to severe uncertainty in our problem.
Scheme Number   Scheme                        Mean Return (tCO2/$)   Standard Deviation
1               Sequestration                 0.0676                 0.025
2               Hydropower                    0.2117                 0.030
3               Energy efficiency scheme 1    0.00818                0.004
4               Energy efficiency scheme 2    0.0056                 0.004
5               Energy efficiency scheme 3    0.0312                 0.006
6               Wind power option 1           0.0290                 0.008
7               Wind power option 2           0.0080                 0.003

Table 6.1: Carbon offsets schemes data.
We note that Laurikka and Springer [2003] give the coefficient of variation for each of
the schemes. However, we omit these as under conditions of severe uncertainty, we assume
that quantifications of uncertainty are useless in the problem formulation stages. The mean and
standard deviations are used only to provide bounds for our uncertainty. If we wished to embed
this data into our formulation, this would move us into the realm of probabilistic formulation,
which is not a focus for this thesis.
That said, there is plenty of scope to incorporate these probabilistic structures into the traditionally non-probabilistic methodologies discussed in this thesis. Ben-Tal and Nemirovski [1999]
examine a portfolio problem assuming that the uncertain return coefficients are symmetrically
distributed with respect to the nominal values, in our case, the mean returns. Under these
assumptions, they model a robust counterpart which closely resembles Markowitz's Mean-Variance approach. An adaptation of this model, under the same assumptions, is presented
by Bertsimas and Sim [2004] using their robust counterpart approach, which allows the user to
determine the maximum number of coefficients in each constraint that are allowed to change.
We also refer the reader to the literature in the field of stochastic programming [Kall and Mayer, 2005, Pachamanova, 2006].
Due to the small scale nature of this problem, namely an individual deciding how to allocate
funds to offsetting schemes rather than, say, a governmental body deciding how much to spend
on research funding, we can assume linearity in the problem. It is our goal to find a robust
solution to this portfolio problem, first exploring a robust optimising approach, followed by a
robust satisficing approach.
We set our budget to be $100.
We shall take 3 standard deviations either side of the mean (such that we do not have negative
return) to give us upper and lower bounds for each return, giving us the following Table 6.2:
Scheme Number i   li       ui
1                 0        0.1426
2                 0.1217   0.3017
3                 0        0.0201
4                 0        0.0176
5                 0.0132   0.0492
6                 0.005    0.053
7                 0        0.017

Table 6.2: Upper and lower bounds on return for the carbon offsets problem.
6.1.1 Modelling for Robust Optimisation
The individual wishes to invest in such a way that emission reduction is maximised. The basic
model consists of the following constructs:
Decision Variables
Let xi be the total number of dollars allocated to scheme i, i = 1, . . . , n. In this case study, we
have n = 7.
Objective Function
The objective function will be an additive function, constructed by adding the emission reduction function for each scheme together. This can be done as we can make the assumption that
the schemes are independent of each other. We want to maximise our return subject to the
uncertainty within the coefficients of the objective function, as follows:
    z* = max_{x ≥ 0}  ∑_{i=1}^{n} bi xi .
Constraints
We cannot exceed our given budget, B. As we are optimising and we have positive, linear returns, the more money we spend the greater the return, so we shall seek to use our entire budget. This constraint therefore requires the values of all decision variables to sum to our budget, as follows:

    ∑_{i=1}^{n} xi = B.
Model
This gives us a standard portfolio problem with the following formulation:
    z* = max_{x ≥ 0}  ∑_{i=1}^{n} bi xi
    s.t.  ∑_{i=1}^{n} xi = B,
where bi is the unknown rate of return achieved per dollar invested in scheme i.
It can be noted that we have uncertainty in the objective function only. We shall reformulate this such that the uncertainty lies only in the constraints in order to correspond with some of the literature (it should be noted that this model can be extended to cases for n > 2). This gives the following model:

    z* = max_{x ≥ 0}  z
    s.t.  z − ∑_{i=1}^{n} bi xi ≤ 0
          ∑_{i=1}^{n} xi = B.
If there is no uncertainty, then the problem is trivial. The fundamental theorem of linear programming states that if there is an optimal feasible solution, then there is an optimal basic
feasible solution. Hence, in this situation, one optimal feasible solution would be to invest the
total budget on the best performing offset scheme, namely the scheme with the largest return,
bi . Using our estimates in a deterministic sense, we would allocate all our money to scheme
two.
Complete Robustness
Soyster’s Inexact Linear Programming
We now implement a worst-case analysis over the entire state space using Soyster’s inexact linear
programming approach. Let b̃i be the estimates of the true rate of return (the mean rate) for scheme
i as given in Table 6.1. Assume that we know with certainty that the true value bi lies in a
hypersphere with centre at b̃i and radius σi , where σi is three standard deviations from the
mean.
Define Ki = {b ∈ R : ||b − b̃i || ≤ σi }. This results in Ki = [li , ui ] as shown in Table 6.2.
Soyster’s formulation for complete robustness is given by the following:
    z* = sup_{x ≥ 0}  z
    s.t.  z − ∑_{i=1}^{n} xi Ki ⊆ K(0)
          ∑_{i=1}^{n} xi = B,

where K(0) = {y ∈ R : y ≤ 0}.
If we let B̂ = (b̂1 , b̂2 , . . . , b̂7 ), where b̂i = inf_{bi ∈ Ki} bi , we can rewrite this in the following manner:

    z* = max_{x ≥ 0}  z
    s.t.  z − B̂ · x ≤ 0
          ∑_{i=1}^{n} xi = B.
When the sets {Ki} are hyperspheres (here, intervals), as in our assumption, the values of b̂i are given by:

    b̂i = b̃i − σi ,   i = 1, . . . , 7.

This effectively sets all coefficients to their smallest possible value. From Soyster [1973], we can rewrite this as a linear programming problem, namely:
    max_{x ≥ 0}  z
    s.t.  z − [x1 (b̃1 − σ1) + x2 (b̃2 − σ2) + · · · + x7 (b̃7 − σ7)] ≤ 0
          ∑_{i=1}^{n} xi = B.
As all coefficients reflect their lowest possible value, we have b̃i − σi = li . Substituting the
values from Table 6.2, we get:
    max_{x ≥ 0}  z
    s.t.  z − [0.1217 x2 + 0.0132 x5 + 0.005 x6] ≤ 0
          ∑_{i=1}^{n} xi = B.
This is a simple portfolio model whose solution recommends that we allocate all funds to the second scheme, hydropower.
However, examination of Table 6.2 will show that via the use of dominance prior to our formulation, we can exclude all schemes barring scheme one and scheme two. Below, we have a
dominance graph, Figure 6.1, which gives the lower and upper bounds of each unknown return
value, bi ∈ [li , ui ].
[Figure 6.1: Dominance of schemes in the carbon offsets problem — the return interval [li , ui ] of each scheme i = 1, . . . , 7 plotted against the return axis (0 to 0.3).]
It can be seen from Figure 6.1 that the lower bound of scheme two is greater than the upper
bound for all schemes from three up to seven. This means that scheme two dominates all schemes
from three up to seven, as it is preferable to put money into a scheme which, at worst, is better than other schemes at their best.
Following this, we could have concluded pre-formulation that Soyster’s inexact linear programming model would allocate all funds to scheme two, as the lower bound of scheme one is zero. It
can be noted that this worst-case analysis gives an identical solution to that of the lower bound
using interval analysis, hence the true value of the return is guaranteed to be at least as great
as the objective value provided by the method. Soyster’s method is therefore one of complete
robustness, as discussed in Chapter 4.
As Soyster’s method takes a linear programming problem suffering from uncertainty, and transforms it to a linear programming problem under certainty, solution methods are simple. The
linear programming problem under certainty, if large, can be solved by numerous optimisation
software packages due to its simple structure.
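The dominance screening used above is also easy to automate; a small Python sketch over the Table 6.2 bounds (scheme j dominates scheme i whenever lj ≥ ui):

```python
import numpy as np

# Lower and upper bounds on return per dollar from Table 6.2 (schemes 1-7).
l = np.array([0.0, 0.1217, 0.0, 0.0, 0.0132, 0.005, 0.0])
u = np.array([0.1426, 0.3017, 0.0201, 0.0176, 0.0492, 0.053, 0.017])

# Scheme j dominates scheme i if j's worst return beats i's best return.
dominated = [i for i in range(7)
             if any(l[j] >= u[i] for j in range(7) if j != i)]
print("dominated schemes:", [i + 1 for i in dominated])  # -> [3, 4, 5, 6, 7]
```

Only schemes one and two survive the screen, which is exactly the reduction used in the analysis that follows.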
Partial Robustness
Starr’s Domain Criterion
We now present a small numerical example of how Starr’s domain criterion could be implemented. For illustrative purposes, we have narrowed the number of schemes down to two,
namely scheme one and scheme two which were not removed via dominance.
To create more interesting results, we place an extra constraint on the problem, but first, for
simplicity, we discretise the state and decision space. Suppose we have a ‘feeling’ that the true
value of b1 is closer to its upper bound. Note that these are still conditions of uncertainty, as we
cannot quantify this instinct. Suppose we also have a feeling that the true value of b2 is closer
to its lower bound. Let the possible values of b1 be contained in the set {0.13, 0.137, 0.1425}, and
let the possible values of b2 be contained in the set {0.12, 0.135, 0.15}.
Here, we add another constraint. Suppose that we can allocate funds in only three ways
whereby the funds are allocated fairly evenly. We are allowed to give each scheme equal
amounts, or we can give scheme one 40% and scheme two 60%, or we can give scheme one
60% and scheme two 40%. The following Table 6.3 gives the value of the return.
The stars (★) given in Table 6.3 identify which decision performs best for a specified parameter value pairing, (b1 , b2 ). Suppose we have a uniform distribution over the state space, or in other words, we apply Laplace's principle of insufficient reason to each outcome (b1 , b2 ).
Pair (b1 , b2 )     x1 = 40, x2 = 60   x1 = 50, x2 = 50   x1 = 60, x2 = 40
(0.13, 0.12)        12.4               12.5               12.6 ★
(0.13, 0.135)       13.3 ★             13.25              13.2
(0.13, 0.15)        14.2 ★             14                 13.8
(0.137, 0.12)       12.68              12.85              13.02 ★
(0.137, 0.135)      13.58              13.6               13.62 ★
(0.137, 0.15)       14.48 ★            14.35              14.22
(0.1425, 0.12)      12.9               13.1               13.35 ★
(0.1425, 0.135)     13.8               13.9               13.95 ★
(0.1425, 0.15)      14.7 ★             14.6               14.55

Table 6.3: Solutions for each decision given the return parameter value pairing (b1 , b2 ).
Starr's domain criterion chooses the decision which gives the optimal result in the largest number of states. In our example, it would choose to allocate $60 to scheme one,
and $40 of the funds to scheme two, given the assumptions made on b1 and b2 . Note that
instead of applying Laplace’s principle of insufficient reason, we could have applied any other
probability distribution which we felt was suitable, but this was chosen for ease of interpretation
and calculation.
Figure 6.2 provides a visual representation of the state spaces for decisions ( x1 , x2 ) = (40, 60)
and ( x1 , x2 ) = (60, 40). It can be seen from Table 6.3 that our decision to split our funds equally
between the two schemes does not yield beneficial returns, hence we ignore this decision below. Using a visual representation, it may be easier to identify which decision performs best
should we decide that application of a uniform distribution to the state space is not suited to
the problem.
Examining Figures 6.2(a) and 6.2(b), it can be seen that if, for example, a uniform distribution
was not applied over the state space, but rather, we had some intuition or gut feeling that our
value for b2 was likely to be higher rather than lower, then we might recommend that funds
be split such that $40 is given to scheme one, and $60 is given to scheme two. If we have the
reverse, where b2 is likely be lower rather than higher, then we might split the funds such that
$60 is given to scheme one, and $40 is given to scheme two.
Figures 6.2(a) and 6.2(b) may also suggest that the value of b1 does not play too large a role in
determining the robustness of the decision when restricted to the range [li , ui ] = [0.13, 0.1425].
It can be seen that with an increase in the possible set of values of b1 or b2, the problem size would also increase dramatically. This may imply that even for such a small scale problem, for instance, a decision made on behalf of one individual, Starr's domain criterion might take too long to implement without first reducing the problem via assumptions. For a large scale problem, tractability may become a serious issue.
[Figure 6.2: Performance grids over the discretised state space b1 ∈ {0.13, 0.137, 0.1425}, b2 ∈ {0.12, 0.135, 0.15} for (a) decision (x1 , x2 ) = (40, 60) and (b) decision (x1 , x2 ) = (60, 40).]
As can be seen in Table 6.3, our recommended decision is not guaranteed over the entire state
space, demonstrated by the fact that the rightmost column has entries void of stars. In fact,
entries which lack a star are actually the worst entries for the corresponding states. However,
what we have done here was to maximise the number of states in which our decision performs
best. Hence, Starr’s domain criterion falls under partial robustness.
If we imagine Table 6.3 to be a decision table, with rows and columns swapped by convention, then applying a traditional Maximin analysis would also result in a recommendation of decision three, which allocates $60 to scheme one and $40 to scheme two, as decision three has the best worst-case return.
In a general case, Starr’s domain criterion simplifies when the state and decision space are discretised for large scale problems, however, the combinations of states required to build our
uncertainty region may grow very rapidly. This state space can be built using scenario generation.
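The arithmetic behind Table 6.3 and the star counts can be reproduced in a few lines of Python (a re-derivation of the worked example above, not code from the thesis):

```python
import numpy as np

b1_vals = [0.13, 0.137, 0.1425]             # discretised states for b1
b2_vals = [0.12, 0.135, 0.15]               # discretised states for b2
decisions = [(40, 60), (50, 50), (60, 40)]  # allowed allocations ($)

states = [(b1, b2) for b1 in b1_vals for b2 in b2_vals]
wins = np.zeros(len(decisions), dtype=int)

for b1, b2 in states:
    returns = [x1 * b1 + x2 * b2 for x1, x2 in decisions]
    wins[int(np.argmax(returns))] += 1      # star the best decision here

best = int(np.argmax(wins))
print("wins per decision:", wins)           # -> [4 0 5]
print("Starr's choice:", decisions[best])   # -> (60, 40)
```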
6.1.2 Modelling for Robust Satisficing
The individual wishes to invest in such a way that emission reduction meets a minimum threshold value and is robust to uncertainty within the emission rates. The basic model consists of the
following constructs:
Decision Variables
Let xi be the total number of dollars allocated to scheme i, i = 1, . . . , n. In our example, n = 7.
Constraints
We cannot exceed our given budget, yet we must also meet a minimum critical return value, rc .
These constraints are given by:
    ∑_{i=1}^{n} xi ≤ B
    ∑_{i=1}^{n} bi xi ≥ rc .
Based on the range of values seen in Table 6.3, we set our rc = 14 for numerical analysis purposes.
Model
Combining the above, we have the following formulation for our generic robust satisficing
problem:
    z* = max_{x ≥ 0}  g(x),

where

    g(x) = ∑_{i=1}^{n} bi xi   if ∑_{i=1}^{n} xi ≤ B and ∑_{i=1}^{n} bi xi ≥ rc ,
    g(x) = 0                   otherwise,
where bi is the unknown rate of return achieved per dollar invested in scheme i. This means
that the output of the model will be the maximum return if and only if both constraints hold.
Again, if there is no uncertainty, then the problem is trivial. If we can ensure that, with all money allocated to it, the scheme with the largest return gives a return greater than our critical return, then we have found our optimal solution. In our numerical example, if B b2 ≥ rc , then the optimal solution is x2 = B, with z* = B b2 . However, if B b2 < rc , then there is no feasible solution.
Interval Analysis
For the case of Knightian uncertainty, we assume that we know nothing. Using interval analysis,
we can get upper and lower bounds for our problem [Giove et al., 2006]. Interval analysis assumes that parameter values vary within given (known) intervals. As we have not been given
definite parameter intervals, a fixed number of standard deviations away from the mean will
be used to create upper (ui ) and lower (li ) bounds for the parameter values.
This method is similar to that used in the deterministic case. The difference here is that we
solve our ‘deterministic’ model twice, replacing the mean rates of return with the lower bounds for one implementation, and then the upper bounds for the second implementation.
This is an implementation of a worst case analysis, followed by a best case analysis.
Using interval analysis, we remove the uncertainty by solving to give a definite interval in
which the value of the objective function will lie. Again, we shall take 3 standard deviations
either side of the mean (such that we do not have negative return) to give our interval, shown
previously in Table 6.2.
    max_{x ≥ 0}  ∑_{i=1}^{n} li xi
    s.t.  ∑_{i=1}^{n} xi = B,

and

    max_{x ≥ 0}  ∑_{i=1}^{n} ui xi
    s.t.  ∑_{i=1}^{n} xi = B.
In our example, the lower bound will involve allocating all funds to scheme two, which has the highest value li out of all schemes. The upper bound will also give all funds to scheme two, as it has the highest value ui out of all schemes.
This method will not give bounds for the solution itself, but will determine upper and lower bounds on the objective function value. Interval analysis is a method which provides complete robustness.
The true value of the return will be at least as great as the lower bound provided. It should be
emphasised that the solution could be completely different if we have a mixture of variables at
their upper and lower bounds.
We present interval analysis not as a solution approach, but as a method of quickly determining
the upper and lower bounds on the objective function. This may be useful to determine outright
whether or not a feasible solution exists. If the upper bound achieved using interval analysis is lower than our critical return, rc , then we immediately know that there exists no solution to our problem which is completely robust.
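Because the only constraint is ∑ xi = B with xi ≥ 0, each bounding problem is solved by placing the whole budget on the scheme with the largest coefficient, so the interval can be re-derived in a few lines of Python (a sketch using the Table 6.2 bounds):

```python
import numpy as np

B = 100.0
l = np.array([0.0, 0.1217, 0.0, 0.0, 0.0132, 0.005, 0.0])    # Table 6.2
u = np.array([0.1426, 0.3017, 0.0201, 0.0176, 0.0492, 0.053, 0.017])

# With a single budget constraint, the optimum concentrates the budget
# on the largest coefficient, so the bounds are B * max(l) and B * max(u).
z_low, z_high = B * l.max(), B * u.max()
print(f"objective value lies in [{z_low:.2f}, {z_high:.2f}]")  # [12.17, 30.17]

# Quick feasibility screen for the satisficing problem: if even the upper
# bound misses the critical return rc, no completely robust solution exists.
rc = 14.0
print("completely robust solution possible:", z_high >= rc)
```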
Partial Robustness
Ben-Tal and Nemirovski’s Robust Counterpart Approach
We now examine Ben-Tal and Nemirovski’s [1999] approach for linear programming with inexact coefficients. Uncertainty lies only in the values of bi . We can define our uncertainty set as a
set of vectors b = (b1 , b2 , b3 , b4 , b5 , b6 , b7 ), where bi ∈ [li , ui ]. This gives us the robust counterpart:
    z* = max_{x ≥ 0}  c^T x
    s.t.  b^T x ≥ rc   ∀ b ∈ U
          ∑_{i=1}^{n} xi ≤ B,
where U is some subset of the entire state space. Here, we have used an arbitrary function, c^T x, as the objective function.
If we only wish to satisfice, or in other words, if we do not have a preference as to what our
return is, provided it is greater than the critical return, then we may reformulate this problem
in the following manner:
    max_{x ≥ 0}  min_{b ∈ U}  𝟙[b^T x ≥ rc]
    s.t.  ∑_{i=1}^{n} xi ≤ B,

where 𝟙[b^T x ≥ rc] equals 1 if b^T x ≥ rc , and 0 otherwise.
However, as we are satisficing, there is a possibility that the critical return can be achieved
without having to spend all of our budget. Hence, we may wish to determine the minimum
amount of money we can spend in order to achieve this threshold return. This is given by the
following formulation:
    min_{x ≥ 0}  max_{b ∈ U}  ( ∑_{i=1}^{n} xi ) · φ[b^T x ≥ rc],

where φ[b^T x ≥ rc] equals 1 if b^T x ≥ rc , and ∞ otherwise.
If a feasible solution exists, the value of the objective function will represent how much money in total is spent achieving the critical return. Here, we can remove the budget constraint ∑_{i=1}^{n} xi ≤ B: as we are minimising total spend in the objective function, and the indicator is set to one when the threshold constraint is met, the value of the objective function will be exactly the total amount spent. Hence, if this minimum spend is no greater than the budget, B, we have found our optimal solution. However, if it is greater than B, there is no feasible solution, and we have no way of meeting the critical return given our budget.
This can be simplified to the following mathematical programming formulation:
    min_{x ≥ 0}  ∑_{i=1}^{n} xi
    s.t.  b^T x ≥ rc ,   ∀ b ∈ U.
This problem can be solved using the method of Ben-Tal and Nemirovski, where the robust
counterpart is found using ellipsoidal uncertainty sets [Ben-Tal and Nemirovski, 1999, 2002].
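For the box-shaped uncertainty set used in this case study (each bi varying independently in [li , ui ]), the worst case of b^T x over U is l^T x whenever x ≥ 0, so the semi-infinite constraint collapses to a single linear one. The following Python sketch illustrates this for the interval set (it is not Ben-Tal and Nemirovski's ellipsoidal machinery):

```python
import numpy as np
from scipy.optimize import linprog

rc = 14.0
l = np.array([0.0, 0.1217, 0.0, 0.0, 0.0132, 0.005, 0.0])  # Table 6.2 lows

# min sum(x) s.t. b^T x >= rc for all b in the box; with x >= 0 the
# adversary picks the lower bound of every interval, giving l^T x >= rc.
n = len(l)
res = linprog(c=np.ones(n),            # minimise total spend
              A_ub=-l.reshape(1, -1),  # -l^T x <= -rc  <=>  l^T x >= rc
              b_ub=np.array([-rc]),
              bounds=[(0, None)] * n)
print(f"minimum robust spend: ${res.fun:.2f}")   # 14 / 0.1217 ≈ $115.04
```

Here the minimum robust spend exceeds the $100 budget, matching the earlier observation that even the completely robust return (12.17) falls short of rc = 14.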
Let us examine several properties of the model, beginning with that of the uncertainty set, U. U contains possible realisations of the uncertain parameter values. If the problem were deterministic, then U would consist of only one vector, namely that of the true returns, and the problem would be trivial. However, this model has been formulated under conditions of severe uncertainty.
We move back to the discussion presented in Chapter 4.3. For problems under severe uncertainty, we should assume that if our uncertainty is represented by its uncertainty set, then
this set should be large. We have also argued, with the assumption that the state spaces are
equal for all decisions, which is the case in this example, that the larger this set, the greater the
robustness. Hence, if our set U contains only a few elements, then under severe uncertainty,
we should warn that the chosen decision resulting from this analysis may not be very robust.
If, however, our set U is large in comparison with the total region of uncertainty, then we may
assume that our decision is fairly robust under uncertainty. For this to occur, we can transform
our model into a partial robust satisficing model.
Ben-Tal and Nemirovski’s model uses a fixed set, U . We may wish to reformulate this to reflect
the face that the larger our set U , the greater the robustness of a decision. Hence in choosing a
decision that performs well under uncertainty, it may be beneficial to also choose an appropriately large set U . In order to decide U , we instead may wish to denote the size of the set V (U ),
and shift our aim to maximising the size of this set.
Again, as shown in Chapter 4.3, we have the following conditions:
    V(Û) ≥ 0   ∀ Û,
    Û ⊂ Û′  ⟹  V(Û) < V(Û′).
Moving back to our original formulation with no objective function, let U represent the entire
state space, and let Û ⊆ U . Then a partial robust satisficing formulation is as follows:
    max_{x ≥ 0, Û ⊆ U}  min_{b ∈ Û}  V(Û) · 𝟙[b^T x ≥ rc]
    s.t.  ∑_{i=1}^{n} xi ≤ B.
As shown in Chapter 4.3, this can be reformulated in the following conventional mathematical
programming manner by removing the inner optimisation:
    max_{x ≥ 0, Û ⊆ U}  V(Û)
    s.t.  b^T x ≥ rc ,   ∀ b ∈ Û
          ∑_{i=1}^{n} xi ≤ B.
This problem can become intractable very quickly with a large, and high-dimensional, region
of uncertainty. Hence, in order to solve the problem, we may need to discard some states within
the state space in order to narrow down the possibilities for Û. In our example, U comprises all the possible combinations of b = (b1 , b2 , b3 , b4 , b5 , b6 , b7 ), where bi ∈ [li , ui ]. We can discretise
our values for bi , and generate all, or as many as relevant, combinations for b. It can be seen
that as n increases, the uncertainty set becomes very large.
This intractability demonstrates the need for careful modelling. As shown in Figure 6.1, the problem reduces to one with two schemes due to dominance. This simplifies the solution process greatly.
Another difficulty in this model is the measure of size, V, which is generally not a linear function.
If we do not wish to discretise our uncertainty set, modelling ellipsoidal uncertainty sets results
in conic quadratic programs as the robust counterpart, which can be solved tractably using interior point methods. The robust counterpart should not go beyond second-order cone problems.
We refer the reader to [Ben-Tal and Nemirovski, 1999] for more detail on general solution approaches.
Otherwise, Ben-Tal and Nemirovski [2002] offer formulations for approximations of the robust
counterpart problem.
Proportional Satisficing
While we have not discussed the method of proportional satisficing in this thesis, we introduce it here. Proportional satisficing is a term coined by Renn-Jones [2007], and can be described, using concepts presented in this thesis, as the use of Rosenhead et al.'s [1972] definition of robustness in order to count the elements of the subset of the state space over which our decision is partially robust.
The definition of robustness given by Rosenhead et al. [1972] is presented here. The robustness
of a decision d ∈ D is given by:
    r(d) = n(U, d) / |U| ,
where n(U , d) is the total number of states in U in which decision d performs well, and |U | is
the total number of states in U .
In our carbon offsets study, we assume that the state space is identical for all decisions, hence the
robustness of decision d can be measured by the number of well-performing states for decision
d. We reformulate the satisficing case using the concept of proportional satisficing, modifying
the methodology of Renn-Jones [2007]. For simplicity, we omit all dominated schemes.
Problem Setup
We let U be the set of discrete combinations (b1 , b2 ), with bi ∈ [li , ui ]. We can interpret this set,
U , to contain the grid points obtained by discretising our uncertainty region.
In order to iterate through all the grid points, we require the definition of a pair of numbers,
β p,q = ( β p,q (1), β p,q (2)), where β p,q (i ) denotes the value of bi at the coordinates ( p, q) of the
grid. For example, if we have β 3,4 = (2.3, 7), then at point (3, 4), we have b1 = 2.3 and b2 = 7.
Assume the grid is of size (k1 , k2 ), and let K := k1 × k2 .
Decision Variables
For our problem, we let xi be the total number of dollars allocated to scheme i, i = 1, . . . , n, n = 2
as in previous formulations (after eliminating dominated options).
We also define an additional decision variable, y p,q , where

    y p,q := 1 if our constraints are not violated with β p,q , and 0 otherwise.
Objective Function
Our objective is to identify the most robust decision. This is the decision that satisfies our
performance constraint at the largest number of states. Formally, the objective function is then
as follows:
    r(y) = ∑_{p=1}^{k1} ∑_{q=1}^{k2} y p,q ,
hence our goal is:

    r* = max_{x ≥ 0, y}  ∑_{p=1}^{k1} ∑_{q=1}^{k2} y p,q .
Constraints
If y p,q = 1, then by definition, the constraint b T x ≥ rc is fulfilled. If y p,q = 0, then by definition,
the constraint b T x ≥ rc is not fulfilled. Reformulating this to allow for our iteration across the
uncertainty grid gives the constraint:
β p,q (1) x1 + β p,q (2) x2 ≥ rc y p,q ,
∀ p = 1, . . . , k1 ; q = 1, . . . , k2 .
Here, if y p,q = 1, the corresponding constraint is:
β p,q (1) x1 + β p,q (2) x2 ≥ rc ,
and hence, our original satisficing constraint must hold. If y p,q = 0, then our constraint is
effectively:
β p,q (1) x1 + β p,q (2) x2 ≥ 0,
as we have no negative return values, and we cannot allocate negative dollars to a scheme. This
‘removes’ the constraint from the problem.
We also have our budget constraint:
x1 + x2 ≤ B
Model
Combining these gives the following model:
    r* := max_{x ≥ 0, y}  ∑_{p=1}^{k1} ∑_{q=1}^{k2} y p,q
    s.t.  β p,q (1) x1 + β p,q (2) x2 ≥ rc y p,q ,   ∀ p = 1, . . . , k1 ; q = 1, . . . , k2
          x1 + x2 ≤ B
          y p,q ∈ {0, 1},   p = 1, . . . , k1 ; q = 1, . . . , k2 .
Using preprocessing, we can eliminate all constraints where y p,q = 0. These are redundant, as
we have all β p,q (1), β p,q (2), x1 , x2 ≥ 0.
Due to the fact that the components of our objective function and constraints are linear, this
model is a mixed integer linear programming problem. In terms of tractability, we almost cannot ask for a simpler model, as this model can be implemented in any linear integer programming software package. The mixed integer programming model has two ‘continuous’ variables,
K binary variables, and K + 1 functional constraints. This model is relatively small, and could
be handled by commercial mixed integer programming software packages such as Xpress-MP.
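As a hedged sketch (using an assumed 4 × 4 discretisation of the non-dominated (b1, b2) box and scipy.optimize.milp, available in SciPy ≥ 1.9 — not the thesis's Xpress-MP implementation), the model can be set up as follows:

```python
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

B, rc = 100.0, 14.0
# Assumed 4x4 discretisation of the non-dominated (b1, b2) box.
b1_grid = np.linspace(0.1217, 0.1426, 4)
b2_grid = np.linspace(0.1217, 0.3017, 4)
grid = [(b1, b2) for b1 in b1_grid for b2 in b2_grid]
K = len(grid)

# Variables v = [x1, x2, y_1, ..., y_K]; maximise sum(y) = minimise -sum(y).
c = np.concatenate([[0.0, 0.0], -np.ones(K)])

# Row k: rc*y_k - beta_k(1)*x1 - beta_k(2)*x2 <= 0; last row: x1 + x2 <= B.
A = np.zeros((K + 1, 2 + K))
for k, (b1, b2) in enumerate(grid):
    A[k, 0], A[k, 1], A[k, 2 + k] = -b1, -b2, rc
A[K, 0] = A[K, 1] = 1.0
ub = np.append(np.zeros(K), B)

res = milp(c,
           constraints=LinearConstraint(A, -np.inf, ub),
           integrality=np.concatenate([np.zeros(2), np.ones(K)]),
           bounds=Bounds(0.0, np.concatenate([[B, B], np.ones(K)])))
print("states satisfied:", int(-res.fun), "of", K,
      "allocation:", res.x[:2])
```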
It is also possible to incorporate weights if we have reason to believe that some states are more
likely than others. The weights can be incorporated as the coefficient of y p,q in the objective function.
Proportional satisficing is a partial robust satisficing approach, where we aim to choose a decision which maximises the number of points for which our constraints are not violated. We can
visualise this in a similar manner to the discretised state space shown in Starr’s domain application for the partial optimising case above, where the binary variable y p,q = 1 places a marker on
our performance grid. While this method counts all satisfactory states indiscriminately, we can
modify this by plotting the successful points on a performance grid and only choosing those
which, for example, have at least one well-performing nearest neighbour.
As an aside, we briefly mention the graphical method. While this method has major restrictions,
it can be used easily when the uncertainty in our problem is one- or two-dimensional, which is
what we have when we consider non-dominated schemes in our case study.
If we discretise and fix our decision set (as done in our Starr’s domain criterion analysis above
for a partial robust optimising model), then we only have two variables, b1 and b2 in our constraint (as rc is a constant). This means that we can plot the satisficing area of our uncertainty
region for a given decision without the need to discretise the uncertainty region.
Given ( x1 , x2 ) = (40, 60), we give our uncertainty region below in Figure 6.3, with rc = 14.
[Figure 6.3: Graph of the region 40b1 + 60b2 ≥ 14 over the uncertainty box b1 ∈ [0, 0.1426], b2 ∈ [0.1217, 0.3017]; the ‘good’ (satisficing) region lies above the boundary line, the ‘bad’ region below it.]
The uncertainty region of decision ( x1 , x2 ) = (60, 40) is shown in Figure 6.4.
[Figure 6.4: Graph of the region 60b1 + 40b2 ≥ 14; again, the ‘good’ region lies above the boundary line.]

Here, it is easy to see that the decision ( x1 , x2 ) = (40, 60) has a greater robustness region than that of decision ( x1 , x2 ) = (60, 40) for a satisficing problem.

We can experiment with different critical return values and decision values in order to gather more information regarding which states perform well corresponding to specific values in the decision set.
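A quick numerical check of the two regions (a sketch using a uniform grid over the non-dominated box, with the bounds taken from Table 6.2):

```python
import numpy as np

# Uniform grid over the non-dominated uncertainty box (Table 6.2 bounds).
b1, b2 = np.meshgrid(np.linspace(0.0, 0.1426, 200),
                     np.linspace(0.1217, 0.3017, 200))
rc = 14.0

for x1, x2 in [(40, 60), (60, 40)]:
    good = (x1 * b1 + x2 * b2 >= rc).mean()   # fraction of 'good' states
    print(f"(x1, x2) = ({x1}, {x2}): "
          f"{100 * good:.1f}% of the region satisfices")
```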
Intuitively, we would believe that if the critical value, rc , was raised for a fixed uncertainty
region, then robustness of our decision would decrease. Similarly, if the critical value is lowered,
then robustness of our decision should increase. This can be seen in Figure 6.5, hence our
analysis resonates with our intuition.
[Figure 6.5: Changes to rc . (a) Graph of the region 60b1 + 40b2 ≥ 18; (b) graph of the region 60b1 + 40b2 ≥ 10. Raising rc shrinks the ‘good’ region; lowering it enlarges it.]
We have presented four models which can be applied to our carbon offsetting schemes portfolio problem, two of which are robust optimising, and two of which are robust satisficing.
programming approach. This approach sets all uncertain coefficients to their worst values in
order to obtain a conservative lower bound on our solution. As this method produces a completely robust decision with respect to our objective function, the return obtained is the greatest
guaranteed return.
The second approach we examined was Starr’s domain criterion. In order to illustrate several
concepts which were discussed throughout the thesis, we made assumptions on the bounds
of our uncertain parameters, and introduced constraints on our decisions, in order to reduce
the problem to a size which easily demonstrated the methodology. Starr’s domain criterion
produces a partially robust decision where the decision chosen performs best over the largest number of states. Using visual aids, it was easy to conceptualise how our decision would be
affected by additional dispersion over the uncertainty region.
We then presented a model using Ben-Tal and Nemirovski’s robust counterpart approach, and
reformulated this, turning our model into a partial robust satisficing model.
A proportional satisficing approach was also formulated, giving a second partial robust satisficing model for the problem. Proportional satisficing involves discretisation of the state space,
yet by construction, is highly tractable, even for large-scale problems. Before implementing a
robust satisficing approach, it may be useful to perform an interval analysis in order to gauge
whether or not there exists a feasible solution.
We also illustrated how the graphical method could be used for a small problem with fixed
decisions. For one- or two-dimensional problems, it might not be necessary to discretise the
uncertainty region if our decision values and critical values are constant.
6.2 Container Inspection Problem
Focus on terrorism has shifted to the forefront of mainstream politics and media over the last
few years, with many worried about the possibility of potentially devastating attacks. Uncertainty regarding terrorist attacks is high, due to the unique and unpredictable nature of such
events. Efforts to intercept potentially hazardous materials before they reach our shores
have been increased, with border and port security called into action with regard to such tasks
as inspection and detection of weapons within shipping containers.
Problem Formulation
The problem at hand is to determine the number of containers to inspect at a port in order
to detect a weapon, given that the probability that a single weapon is present in one of the
containers is subject to severe uncertainty. Here, we analyse a paper by Moffitt et al. [2005].
We must decide on how many containers to inspect out of a total shipment of containers, given
that we don’t know the probability that a container contains a weapon. We assume that there is
at most one weapon hidden in the containers. If we inspect a container, we know with certainty
whether a weapon is or is not present. If we do not detect a weapon and the container passes
through the port, then we are subject to some loss. Notation from Moffitt et al. [2005] is as
follows:
The total number of containers arriving at the port is N.
The number of items which we will inspect is n ≤ N.
We denote the state of nature by v, which represents our net benefit.
We aim to choose a decision that ensures that we meet some critical threshold expected utility
value. We wish to maximise the robustness of our decision using an ordering relation which
indicates the element that is most robust.
Moffitt et al. [2005] formulate the following problem. Let:
B denote the benefit at the port without a security threat.
p denote the probability that a weapon is present in one of the N containers that will pass
through the port. p is subject to severe uncertainty.
L denote the cost incurred if a weapon passes through the port undetected.
C (n) denote the cost of inspecting n containers, where C (n) is monotonically increasing.
The conditional probability density function, f (v|n, p), for net benefit v given n inspections and probability p that a weapon is present, is:

    f (v|n, p) = 1 − p(N − n)/N,   if v = B − C(n),
    f (v|n, p) = p(N − n)/N,       if v = B − C(n) − L.

If v = B − C(n), then f (v|n, p) is the probability that a weapon is found in one of the inspected containers. Otherwise, f (v|n, p) is the probability that a weapon has passed through the port undetected. In this case, p(N − n)/N is the failure probability, and we let π = p(N − n)/N. Given n and p, the expected utility is

    Ū f = U(B − C(n)) [1 − p(N − n)/N] + U(B − L − C(n)) [p(N − n)/N].
This utility must be greater than some critical threshold expected utility, Ūg .
Under our assumption that there is at most one weapon present in the N containers, as the port
inspector, we would obviously stop inspecting if a weapon was found. However, this formulation of the problem provides a general policy on how a port inspector might wish to behave
on a regular basis, and hence determines the utility based on the inspection of the proposed
number of containers which will meet the critical threshold. If the weapon is found before all nc
containers are inspected, then this will add to our utility, rather than subtract from it, and will
hence leave us with a utility greater than that required to obey our constraint.
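As a small sketch of the formulation (with an assumed linear utility U and illustrative numbers for N, B, L, and C(n); none of these values come from Moffitt et al.), the expected utility Ū f can be computed and screened against a threshold:

```python
def expected_utility(n, p, N=100, B=1000.0, L=5000.0,
                     inspection_cost=1.0, U=lambda v: v):
    """U_f = U(B - C(n)) * (1 - pi) + U(B - L - C(n)) * pi,
    with failure probability pi = p * (N - n) / N and C(n) = cost * n."""
    C = inspection_cost * n
    pi = p * (N - n) / N
    return U(B - C) * (1 - pi) + U(B - L - C) * pi

# Screen: which inspection levels n meet a critical threshold U_g
# when the weapon probability is p = 0.05?
U_g, p = 800.0, 0.05
feasible = [n for n in range(101) if expected_utility(n, p) >= U_g]
print("smallest satisficing n:", feasible[0] if feasible else None)
```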
Modelling
We reformulate the problem step-by-step using the constructs defined throughout this thesis.
To Optimise or Satisfice?
As noted above, the aim is to determine the number of containers to inspect to ensure that some
critical threshold expected utility is met. This problem is a satisficing problem, as we can choose
any number of containers which meets this criterion.
We could easily formulate this problem to be an optimising problem by introducing a constraint
which only allows solutions within a pre-specified tolerance of some optimal expected utility
value.
If we wished to optimise and satisfice, we could combine the constraints above to find an optimal solution provided that the maximal expected objective value is greater than the critical
threshold expected utility.
Robustness
Terrorist attacks rely on the element of surprise and uniqueness, and hence are subject to severe uncertainty. As discussed in Chapter 4.4, local robustness analysis may not be suited to
conditions of such uncertainty. It suffices to say that, using arguments from Chapter 4, local robustness analysis is similar, but can be inferior, to partial robustness, as it bases analysis around the immediate neighbourhood of a poor estimate. It will not be discussed in this section.
The benefit of complete robustness is that it provides a minimum guaranteed expected utility.
However, the tradeoff is that this expected utility value may be very conservative. If we wish
to have complete robustness over the state space, then our worst case expected utility must be
greater than that of our critical expected utility value.
To determine our worst case, we must assume that the probability of there being a weapon in
one of the containers is one. If this is the case, and Mother Nature plays the adversary, then we
must examine all N containers, unless our utility is greater when loss is incurred with an attack
over the utility achieved when having to pay the cost of inspecting all N containers.
The former case gives us a utility of U ( B − C ( N )), as we know with certainty when we examine
a container whether it contains a weapon or not.
The latter case gives us a utility of U ( B − L − C (0)). As Mother Nature is playing against us, if
we don’t inspect all containers, worst case analysis assumes the weapon will be present in one
of the N − n containers that we do not check, with n < N. Also, C (n) is a monotonic increasing
function, hence our cost will be greater, and hence our utility lower, the more containers we
inspect. Under worst case analysis with n < N, the weapon will not be found, hence loss L will
be incurred regardless. Hence, to maximise our utility in the worst case, we should choose that
with lowest cost, namely n = 0.
If we have Ūg > U ( B − C ( N )) and Ūg > U ( B − L − C (0)), there is no feasible solution for complete robustness.
Unless the losses are enormous, we may wish to relax the conservatism. We either check all
N containers or no containers in order to achieve complete robustness, if possible. However,
this may be too conservative, as it is based entirely on the worst case. Rarely will a terrorist weapon be on a shipment with certainty, hence the probability of a weapon being concealed in one of the containers can be decreased from one.
This moves us into the realm of partial robustness. Partial robustness aims to maximise the
subset of the state space in which the decision performs well. In our case, as we are satisficing,
partial robustness looks for the decision with the largest subset of the state space where our
critical expected utility value is met.
Our uncertainty lies in the attack probability, p. Moffitt et al. [2005] denote our expectation of
the probability of a weapon being concealed in a container as pc , given that a robust decision
has probability of no more than πc of failing to detect a weapon. The size of pc represents the
size of our subset of the state space. For partial robustness, it is this value, pc , that we wish
to maximise in order to ensure that we choose the decision with the greatest robustness. The
derivation of this value will be discussed below.
Decision Variables
Let n be the number of containers we choose to inspect, n ∈ {0, 1, 2, . . . , N }.
Objective Function
As this problem has been presented as a satisficing problem, there is no performance objective
function.
Constraints
Our expected utility value, Ū f , must at least meet our critical expected utility value, Ūg . This is
given by:
Ū f ≥ Ūg .
There are other constraints given by Moffitt et al. [2005], but these are constraints related to
characteristics of probability density functions, and can be neglected with the assumption that
these properties influence our decision in such a manner that they are obeyed. In other words,
with a decision x ∈ X, we can assume that conditions on probability density functions are
represented in X.
Derivation of Constraints
Here, we discuss the derivation of the expected utilities in the constraint Ū f ≥ Ūg . Ū f is defined
in Chapter 6.2. We are now required to specify the critical expected utility level, Ūg .
Moffitt et al. [2005] tell us that Ūg can be any value of the decision maker’s choosing. Moffitt
et al. choose to derive this value using a robustness function that utilises the critical failure
probability value and our expected weapon probability value as mentioned earlier.
The probability of a weapon passing through the port without being detected is p(N − n)/N, where p is subject to severe uncertainty. All we know about p is that p ∈ [0, pc ]. As mentioned above, to ensure robustness, we want to maximise the value of pc .
The Info-Gap robustness model is as follows:

    p(n, πc , pc ) = max { α : max_{p ∈ [0,α]} p(N − n)/N ≤ πc },   n ∈ {0, 1, 2, . . . , N },

where α is the horizon of uncertainty. Info-Gap finds a maximum horizon of uncertainty for each decision. The decisions can then be compared in terms of this horizon of uncertainty.
As discussed earlier, we wish to maximise the value of pc as this gives the greatest value of
p that ensures against πc . However, n, N and πc are fixed. This gives us a function which increases monotonically in p, giving the scenario shown in Figure 6.6.
Figure 6.6: A monotonic increasing function in p with threshold πc .
If we have a monotonic increasing function, and we want to maximise the value of a parameter
such that it is less than or equal to the value of a constant, in this case πc , then this maximum
value occurs when the inequality transforms to an equality. In other words, if we want to
maximise p(n, πc , pc ), then we set:
    p(N − n)/N = πc
    p(n, πc , pc ) = N πc /(N − n),   p ∈ [0, pc ], n ∈ {0, 1, 2, . . . , N }.
This value remains constant no matter where our estimate lies, hence a local Info-Gap analysis works in this example. We demonstrate this in Figure ??, where our uncertainty regions,
U (α, p̃), are intervals of [0, 1] (as we are dealing with probabilities).
Our uncertainty region is split into two halves, one ‘good’, and one ‘bad’. The bad region is the
region such that any p > pc may result in a failure probability of higher than πc . Any estimate
p̃ < pc will move toward the critical, constant value of pc . There is no region between our
estimate and the critical value for which our satisficing constraint does not hold. In order to
maximise this value of p(n, πc , pc ), we choose p(n, πc , pc ) = pc and get nc = N(pc − πc )/pc .
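A short numeric sketch of these two relations (illustrative values for N, πc and pc ; these are not figures from Moffitt et al.):

```python
# Robustness relation p_hat(n) = N * pi_c / (N - n) and the induced
# inspection level n_c = N * (p_c - pi_c) / p_c, with assumed numbers.
N, pi_c, p_c = 100, 0.02, 0.10

n_c = N * (p_c - pi_c) / p_c
print(f"containers to inspect: n_c = {n_c:.0f}")       # 80

# Check: with n = n_c, the worst p we can tolerate is exactly p_c.
p_hat = N * pi_c / (N - n_c)
print(f"tolerable weapon probability: {p_hat:.2f}")    # 0.10
```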
It can be seen that no matter what the value of the estimate, p̃ < pc , our info-gap analysis will
always give us the correct solution. However, this is due to the trivial nature of the problem,
namely, the fact that our function is monotonic increasing. Observe the following Figure 6.7,
where this is not the case.
Figure 6.7: A function in p that is not monotonically increasing, with threshold πc .
It can be seen, here, that our uncertainty region has been broken up into several ‘good’ and ‘bad’
regions. This means that there is a dependence on the location of our estimate as to whether or
not we find the best value for pc .
pc5 gives the greatest value of pc such that the failure probability, πc , is not exceeded. However,
Info-Gap will only choose this value of pc if the estimate falls within the range [ pc4 , pc5 ].
It should be noted here that in Moffitt et al. [2005], there is no explicit value for an estimate
given. However, given the structure of an Info-Gap robustness model, namely | p − p̃| ≤ α, this
is identical to having an estimate at p̃ = 0. In our example of a non-monotonic function, it can
be seen in Figure 6.7 that the estimate used in Moffitt et al. would result in a poor value, namely
it would choose pc = pc1 .
Returning to our case where we have a monotonic increasing function, shown in Figure 6.6, it
is clear that our solution is independent of the estimate, which is why Moffitt et al. can neglect it
in this case study. As the solution does not depend on the estimate, we have partial robustness
rather than local robustness, as discussed in Chapter 4.5.
Returning to the derivation of Ūg , the above values for nc and pc give us the following probability density function:

    g(v|nc , pc ) = 1 − πc ,   if v = B − C(nc ),
    g(v|nc , pc ) = πc ,       if v = B − C(nc ) − L.

The expected utility of g(v|nc , pc ) is then given by:

    Ūg = U(B − C(nc ))(1 − πc ) + U(B − L − C(nc ))πc .
Moffitt et al. use second-degree stochastic dominance conditions to simplify the constraint Ū f ≥
Ūg to:
    [p(N − n)/N] (L + C(n) − C(nc )) − πc L ≤ 0.
This gives us the following info-gap robustness function for the decision maker:

    p(n, L, C, πc , pc ) = max { α : max_{p ∈ [0,α]} [p(N − n)/N] (L + C(n) − C(nc )) − πc L ≤ 0 },
    n ∈ { 0, 1, . . . , N(pc − πc )/pc }.
Similarly to above, with n, N, L, and πc constant, to maximise pc to give the greatest assurance against πc , we transform the inequality into an equality as we have monotonicity in our
increasing function. This gives:
    p(n, L, C, πc , pc ) = N πc L / [(N − n)(L + C(n) − C(nc ))]
                         = N πc L / [(N − n)(L + C(n) − C(N(pc − πc )/pc ))] ∈ [0, pc ],
    n ∈ { 0, 1, . . . , N(pc − πc )/pc }.
This strategy does not take into account the cost of choosing nc containers to examine, and
Moffitt et al. present a numerical analysis of the problem where n is not fixed. By allowing n
to become a variable, the problem is no longer trivial, as we lose the condition of monotonicity
on our robustness function. This is due to the economic effects on utility, due to the trade off
between ensuring that an attack does not take place, and the effect that this has on cost.
By examining properties of the robustness function, Moffitt et al. find that robustness decreases
when the loss incurred from an attack increases, when pc is reduced, and when the minimum
threshold expected utility is increased. We refer the reader to Moffitt et al. [2005] for more detail
on the numerical analysis.
We have presented a step-by-step derivation of the model presented in Moffitt et al.’s paper on
port security [2005], which demonstrates a triviality in the problem, namely, the critical probability, pc , is fixed. Consequently, Info-Gap analysis finds the correct solution, pc , regardless of
the location of the estimate, p̃. In fact, despite the estimate being one of the main points of difference between Info-Gap and other robust decision-making approaches, Moffitt et al. neglect to mention
or introduce such an estimate in this paper. As a technicality, due to the bounds of our uncertain
parameter, we can take p̃ = 0, however, the aforementioned triviality in the problem results in
the solution being independent of any estimate.
We note that at the beginning of the paper, Moffitt et al. give the following model for robustness:
\begin{align*}
\max_{x \in X}\quad & \alpha(x) \\
\text{s.t.}\quad & \bar{U}_f \ge \bar{U}_g \\
& \int f(v \mid x)\, dv = 1 \\
& f(v \mid x) \ge 0,
\end{align*}
where X reflects constraints on x other than those given in the formulation, Ūf is the expected utility under the conditional probability density function f(v|x), Ūg is a threshold utility value, and α(x) describes the largest subset of our uncertainty set for which our performance requirement is satisfied, given that we choose decision x. This is a partial robust satisficing model: we maximise the ‘size’ of the subset of the state space over which x meets a critical threshold utility value.
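A generic toy illustration of this model follows. The decision set, uncertainty region, performance function, and threshold below are all hypothetical stand-ins, and the ‘size’ of a subset is crudely measured as the fraction of a gridded uncertainty region on which the performance requirement holds; the definitions in Chapter 4 are more general:

\begin{verbatim}
import numpy as np

u_grid = np.linspace(0.0, 1.0, 1001)  # gridded uncertainty region
x_grid = np.linspace(0.0, 1.0, 101)   # gridded decision set X
threshold = 0.5                       # critical performance level

def performance(x, u):
    """Hypothetical performance of decision x in state u."""
    return 1.0 - (x - u) ** 2 - 0.5 * u

def alpha(x):
    """Fraction of the uncertainty region on which decision x
    meets the critical threshold (a crude proxy for 'size')."""
    return np.mean(performance(x, u_grid) >= threshold)

x_star = max(x_grid, key=alpha)
print(f"x* = {x_star:.2f}, alpha(x*) = {alpha(x_star):.3f}")
\end{verbatim}

Here the maximiser trades peak performance for coverage of as much of the uncertainty region as possible, which is precisely the partial robust satisficing idea.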
However, during the derivation of the constraint, this formulation is ‘reduced’ to an Info-Gap model. The Info-Gap model succeeds in finding p(n, πc, pc) solely because robustness is a monotonically increasing function of the uncertain parameter, which allows the local robustness analysis to mimic a partial analysis. We have shown that while Moffitt et al.’s analysis finds the correct value of pc, this is due to the trivial nature of the problem, as demonstrated in Chapter 4.5.
Chapter 7
Conclusion
We have presented a review of the current non-probabilistic robust decision-making methodologies, with a focus on the modelling aspect rather than the solution method. Under severe uncertainty, a major difficulty in the formulation of robust decision-making problems arises from the inability to quantify the uncertainty present. Chapter 2 discusses how one might interpret uncertain data qualitatively, using the concepts of the dispersion over, and the size of, the uncertainty region.
Recognising where the uncertainty lies in a problem, and what we wish to make robust, is also important in the formulation stage. Chapter 3 deals with misconceptions in the literature regarding claims that satisficing ‘is better’ than optimising. We define a robust satisficing approach as one which seeks a solution such that the constraints are obeyed, and a robust optimising approach as one which seeks robustness in the value of the objective function. From a modelling perspective, rather than one approach being superior to the other in general, we must decide which approach is suitable for our problem.
The main motivation behind this thesis is the unreliable and uncertain nature of the data in decision-making problems. Under uncertainty, we aim to promote the robustness of a decision. In Chapter 4, we begin by presenting different definitions of robustness found in the literature. Specific levels of robustness are then introduced and defined using Maximin formulations, designed to highlight the fundamental property that robustness can be guaranteed over a subset of the uncertainty region only if the decision performs well in the worst case of that subset.
To emphasise the relevance of this inherent characteristic, we prove that Soyster’s inexact linear
programming approach, Ben-Tal and Nemirovski’s robust counterpart approach, and Starr’s
domain criterion are all instances of the Maximin criterion. We use our definitions of different
levels of robustness in order to demonstrate why Info-Gap’s robustness model is an instance of
the Maximin criterion, contrary to the claims of Info-Gap’s proponents. This involves identifying how the definition of the ‘worst case’ differs across robustness classifications.
Chapter 5 brings together all the concepts discussed within the thesis to encourage the use of
these ideas when attempting to model a robust decision-making problem under severe uncertainty. We also briefly introduce the concept of tractability, which is important when attempting
to implement decision-making models.
Using the ideas discussed throughout the thesis, we formulate two case studies. The first is a portfolio problem, in which we wish to allocate funds to different carbon offsetting schemes. This is first presented as a robust optimising problem, and then as a robust satisficing problem.
We then examine a paper which uses an Info-Gap model to determine the number of containers
one should inspect in order to prevent a terrorist attack at a port. Here, we demonstrate that
under certain conditions, a local analysis can give rise to a solution which is identical to that
found using a partial robustness analysis.
While this thesis has examined concepts rooted in the problem formulation stage of robust decision-making, an important step in the future development of robust optimisation is research into solution methods, so that the approaches discussed in this thesis can be applied to real world problems.
Bibliography
E.L. Anderson and D. Hattis. Foundations: A. uncertainty and variability. Risk Analysis, 19(1):
47–68, 1999.
E. Ballestero. Strict uncertainty: A criterion for moderately pessimistic decision makers. Decision
Sciences, 33(1):87–107, 2002.
H. Bandemer. Modelling uncertain data. Akademie Verlag, Germany, 1992.
Y. Ben-Haim. Value-at-risk with info-gap uncertainty. The Journal of Risk Finance, 6(5):388–403,
2005.
Y. Ben-Haim. Info-gap robust-satisficing and the probability of survival. DNB Working Paper,
(138), 2007a.
Y. Ben-Haim. Information-gap decision theory: Decisions under severe uncertainty. Academic Press,
London, 2001.
Y. Ben-Haim. Info-gap decision theory: Decisions under severe uncertainty. Elsevier, Oxford, U.K.,
2006.
Y. Ben-Haim. Info-gap decision theory for engineering design: Or why ‘good’ is preferable to
‘best’. In Engineering design reliability handbook. CRC Press, 2007b.
Y. Ben-Haim and L. Davidovitch. Profiling for crime reduction under severely uncertain elasticities. Working Paper, 2008.
A. Ben-Tal and A. Nemirovski. Robust convex optimization. Mathematics of Operations Research, 23(4):769–805, 1998.
A. Ben-Tal and A. Nemirovski. Robust solutions of uncertain linear programs. Operations Research Letters, 25(1):1–13, 1999.
A. Ben-Tal and A. Nemirovski. Robust solutions of linear programming problems contaminated with uncertain data. Mathematical Programming, 88(3):411–424, 2000.
A. Ben-Tal and A. Nemirovski. Robust optimization - methodology and applications. Mathematical Programming, 92(3):453–480, 2002.
A. Ben-Tal, L. El Ghaoui, and A. Nemirovski (editors). A special issue on robust optimization.
Mathematical Programming, 107(1–2), 2006.
B. Beresford-Smith and C.J. Thompson. Managing credit risk with info-gap uncertainty. The
Journal of Risk Finance Incorporating Balance Sheet, 8(1):24–34, 2007.
D. Bertsimas and M. Sim. Robust discrete optimization and network flows. Mathematical Programming, 98:49–71, 2003.
D. Bertsimas and M. Sim. The price of robustness. Operations Research, 52(1):35–53, 2004.
D. Bertsimas and A. Thiele. Robust and data-driven optimization: Modern decision-making under uncertainty. In Tutorials in Operations Research, pages 95–122. INFORMS, 2006.
D. Bertsimas, D.B. Brown, and C. Caramanis. Theory and applications of robust optimization. Working paper, 2007.
H-G. Beyer and B. Sendhoff. Robust optimization - a comprehensive survey. Computer Methods
in Applied Mechanics and Engineering, 196:3190–3218, 2007.
E. Bustamante-Cedeno and S. Arora. Stochastic and minimum regret formulations for transmission network expansion planning under uncertainties. Journal of the Operational Research
Society, 59:1547–1556, 2008.
R.D. Carr, H.J. Greenberg, W.E. Hart, G. Konjevod, E. Lauer, H. Lin, T. Morrison, and C.A.
Phillips. Robust optimization of contaminant sensor placement for community water systems. Mathematical Programming, 107(1):337–356, 2006.
A. Charnes and W.W. Cooper. Chance-constrained programming. Management Science, 6(1):73–79, 1959.
H. Chernoff. Rational selection of decision functions. Econometrica, 22:423–443, 1954.
R. Chiulli. Quantitative analysis: An introduction. CRC Press, 1999.
A.C. Cullen and H.C. Frey. Probabilistic techniques in exposure assessment. Plenum Press, New
York and London, 1999.
G.B. Dantzig. Linear programming under uncertainty. Management Science, 1(3/4):197–206, 1955.
H.A. Eiselt, C.-L. Sandblom, and N. Jain. A spatial criterion as decision aid for capital projects: Locating a sewage treatment plant in Halifax, Nova Scotia. The Journal of the Operational Research Society, 49(1):23–27, 1998.
L. El Ghaoui and H. Lebret. Robust solutions to least-squares problems with uncertain data. SIAM Journal on Matrix Analysis and Applications, 18(4):1035–1064, 1997.
F.J. Fabozzi, P.N. Kolm, D.A. Pachamanova, and S.M. Focardi. Robust portfolio optimization and
management. Wiley Finance, 2007.
M. Fiedler, J. Rohn, J. Nedoma, J. Ramik, and K. Zimmermann. Linear optimization problems with
inexact data. Springer US, 2006.
S. French. Decision theory: an introduction to the mathematics of rationality. Halsted Press, New
York, NY, USA, 1986.
S. Giove, S. Funari, and C. Nardelli. An interval portfolio selection problem based on regret function. European Journal of Operational Research, 170:253–264, 2006.
H.J. Greenberg and T. Morrison. Robust optimisation. In Operations research and management
science handbook. CRC Press, 2007.
C. Gregory, K. Darby-Dowman, and G. Mitra. Robust optimization and portfolio selection: The
cost of robustness, 2008. http://ssrn.com/abstract=1225678.
K. Hayes, H. Regan, and M. Burgman. Environmental risk assessments of genetically modified organisms vol. 3: Transgenic fish in developing countries. CABI Publishing, 2007.
I. Hlavacek, J. Chleboun, and I. Babuska. Uncertain input data problems and the worst scenario
method. Elsevier, Amsterdam, The Netherlands, 2004.
F.O. Hoffman and S. Kaplan. Beyond the domain of direct observation: How to specify the probability distribution that represents the ‘state of knowledge’ about uncertain inputs. Risk Analysis, 19(1):131–134, 1999.
G.N. Iyengar. Robust dynamic programming. Mathematics of Operations Research, 30(2):257–280, 2005.
P.A. Jensen. Operations research models and methods, 2004. http://www.me.utexas.edu/
~jensen/ORMM/models/unit/or_method/process.html.
S. Kachani and J. Langella. A robust optimization approach to capital rationing and capital
budgeting. Engineering Economist, 50(3):195–229, 2005.
P. Kall and J. Mayer. Stochastic linear programming: Models, theory, and computation. Kluwer, New
York, NY, USA, 2005.
J. M. Keynes. The general theory of employment. The Quarterly Journal of Economics, 51(2):
209–223, 1937.
Z.W. Kmietowicz and A.D. Pearman. Decision theory and incomplete knowledge. Gower Publishing, 1981.
F.H. Knight. Risk, uncertainty and profit. Hart, Schaffner & Marx, Boston, MA, 1921.
P. Kouvelis and G. Yu. Robust discrete optimization and its applications. Kluwer, Dordrecht, 1997.
H. Laurikka and U. Springer. Risk and return of project-based climate change mitigation: A portfolio approach. Global Environmental Change, 13:207–217, 2003.
R. Lempert and M. Collins. Managing the risk of uncertain threshold responses: Comparison
of robust, optimum, and precautionary approaches. Risk Analysis, 27(4):1009–1026, 2007.
R. Lempert, S. Popper, and S. Bankes. Shaping the next one hundred years: New methods for quantitative, long-term policy analysis. The Rand Pardee Center, Santa Monica, CA, USA, 2003.
R. Lempert, D. Groves, S. Popper, and S. Bankes. A general, analytic method for generating
robust strategies and narrative scenarios. Management Science, 52(4):514–528, 2006.
R.D. Luce and H. Raiffa. Games and decisions: Introduction and critical survey. John Wiley & Sons,
New York, 1957.
H. Markowitz. Portfolio selection. Journal of Finance, 7(1):77–91, 1952.
L.J. Moffitt, J.K. Stranlund, and B.C. Field. Inspections to avert terrorism: Robustness under
severe uncertainty. Journal of Homeland Security and Emergency Management, 2(3):1–17, 2005.
A. Moilanen, M.C. Runge, J. Elith, A. Tyre, Y. Carmel, E. Fegraus, B.A. Wintle, M. Burgman,
and Y. Ben-Haim. Planning for robust reserve networks using uncertainty analysis. Ecological
Modelling, 199(1):115–124, 2006.
J.M. Mulvey, R.J. Vanderbei, and S.A. Zenios. Robust optimization of large-scale systems. Operations Research, 43(2):264–281, 1995.
J.M. Mulvey, R. Rush, and J. Sweeney. Generating scenarios for global financial planning systems. International Journal of Forecasting, 14(2):291–298, 1998.
A. Nilim and L. El Ghaoui. Robust control of Markov decision processes with uncertain transition matrices. Operations Research, 53(5):780–798, 2005.
J. Odhnoff. On the techniques of optimizing and satisficing. The Swedish Journal of Economics, 67
(1):24–39, March 1965.
D. Pachamanova. Handling parameter uncertainty in portfolio risk minimization. Journal of Portfolio Management, 32(4):70–78, 2006.
H.M. Regan, Y. Ben-Haim, B. Langford, W.G. Wilson, P. Lundberg, S.J. Andelman, and M.A.
Burgman. Robust decision-making under severe uncertainty for conservation management.
Ecological Applications, 15(4):1471–1477, 2005.
J. Renn-Jones. Robust decision-making under uncertainty, December 2007. Honours thesis,
Department of Mathematics and Statistics, The University of Melbourne.
J. Rosenhead and S.K. Gupta. Robustness in sequential investment decisions. Management Science, 15(2):B-18–B-29, 1968.
J. Rosenhead, M. Elton, and S.K. Gupta. Robustness and optimality as criteria for strategic decisions. Operational Research Quarterly (1970–1977), 23(4):413–431, 1972.
A. Rubinstein. Modeling bounded rationality. The MIT Press, Massachusetts, USA, 1998.
L.J. Savage. The theory of statistical decision. Journal of the American Statistical Association, 46(253):55–67, 1951.
G.O. Schneller and G.P. Sphicas. Decision making under uncertainty: Starr’s domain criterion.
Theory and Decision, 15(4):321–336, December 1983.
A. Schrijver. Combinatorial optimization: Polyhedra and efficiency. Springer, Berlin, Germany, 2003.
G.L.S. Shackle. Expectations in economics. Cambridge University Press, 1952.
H.A. Simon. On the concept of organizational goal. Administrative Science Quarterly, 9(1):1–22,
1964.
M. Sniedovich. The mighty maximin. Working Paper No. MS-02-08, 2008a.
M. Sniedovich. The art and science of modeling decision-making under severe uncertainty.
Decision Making in Manufacturing and Services, 1(1-2):109–134, 2007.
M. Sniedovich. Satisficing vs. optimizing, 2008b. http://www.moshe-online.com/satisficing/.
M. Sniedovich. Wald’s maximin model: A treasure in disguise! The Journal of Risk Finance, 9(3):
287–291, 2008c.
A. L. Soyster. Convex programming with set-inclusive constraints and applications to inexact
linear programming. Operations Research, 21(5):1154–1157, 1973.
M.K. Starr. A discussion of some normative criteria for decision-making under uncertainty.
Industrial Management Review, 8(1):71–78, 1966.
J. Stranlund and B.C. Field. On the production of homeland security under true uncertainty. Working Paper 2006-5, University of Massachusetts Amherst, Department of Resource Economics, September 2006.
J.V. Vlajic, J. van der Vorst, and E. Hendrix. Food supply chain network robustness: A literature review and research agenda. Working Paper 42, Mansholt Graduate School of Social Sciences, 2008.
A. Wald. Contributions to the theory of statistical estimation and testing hypotheses. Annals of Mathematical Statistics, 10(4):299–326, 1939.
A. Wald. Statistical decision functions. Wiley, New York, NY, USA, 1950.
S.W. Wallace. Decision making under uncertainty: Is sensitivity analysis of any use? Operations Research, 48(1):20–25, 2000.
E. Webster and D. Mackay. Defining uncertainty and variability in environmental fate models.
CEMC Report No. 200301, 2003.
H. Yaman, O.E. Karasan, and M.C. Pinar. Restricted robust uniform matroid maximisation
under interval uncertainty. Mathematical Programming, 110(2):431–441, 2007.
F.C. Zagare. Game theory: concepts and applications. Sage Publications, Beverly Hills, 1984.
K. Zhou, J. Doyle, and K. Glover. Robust and optimal control. Prentice Hall, New Jersey, 1996.