
Expected utility theory and some extensions∗
Paul Schweinzer
Birkbeck College, University of London
School of Economics, Mathematics, and Statistics
Malet Street, Bloomsbury, London WC1E 7HX
[email protected]
Abstract
We provide a nontechnical survey of expected utility theory and some of its extensions. The general focus is on the interpretation of the theory rather than on the formal discussion of the theory’s properties, for which we refer to the original work.
1 Introduction
Choice under risk1 is the branch of economic theory that models decisions under imperfect
information over the course of future events. Expected utility theory restricts the set of all
possible preferences over risky alternatives to a subset which defines the rationality of the individuals holding such preferences. Modern expected utility theory was created by von Neumann
& Morgenstern as a theory based on objective probabilities.2 Its subjective Bayesian foundations were developed by Savage and others into the main theoretical vehicle used to model
economic situations under uncertainty. It was widely successful in key applications like insurance theory and portfolio selection and has become the workhorse of modern utility theory. It
is a normative, axiomatised theory that is faced with a handful of seemingly paradoxical but
systematic empirical violations—an example is simultaneous gambling and insurance—which
confirm von Neumann & Morgenstern’s view that expected utility theory is not a description of
people’s actual behaviour (cf. [Mor79, 176f]). Strictly speaking, one is not forced to accept any
of these empirical objections as falsifications.3 It is always possible to say (with Savage) that
expected utility theory is not merely a prescriptive theory and, at the end of the day, thorough
deliberation will lead individuals to accept expected utility theory as the only theory of rational
choice. In that view, the principle of rationality is regarded rather as an animating principle
than a testable hypothesis and, consequently, all deviations from expected utility theory are
considered theories of irrational choice. This strategy, though, is not entirely convincing. The
theory’s inability to solve several ‘paradoxes’ satisfactorily has given rise to the development
of alternative and descriptively more successful non-expected utility theories. In general, these
∗This work is part of my M.Sc. dissertation at the London School of Economics and Political Science, Department of Philosophy, Logic and Scientific Method. I am grateful for helpful discussions, comments, and corrections of numerous mistakes to Richard Bradley, Till Grüne, Georg Kirchsteiger, Christoph Schmidt-Petri, and Luca Zamparini.
theories are mathematically less elegant and analytically less powerful than their antecedent but are more general in the sense that they give up certain aspects of the theory in order to bring about more consistency with empirical results. Both their more cumbersome application to general economic questions and their (allegedly) inferior normative appeal are the reasons why there is no established alternative to expected utility theory which is widely used for modelling.
The following section is concerned with presenting von Neumann-Morgenstern expected utility theory. The normative axioms that, if obeyed, lead individuals to behave as if they were maximising expected utility are discussed. Several subsections are devoted to the description of the empirical and theoretical problems troubling the theory. These constitute the principal reason for being concerned with generalisations of expected utility theory. Throughout the analysis we shall focus on the objective probability interpretation of von Neumann & Morgenstern—Savage’s subjective interpretation will only be touched on when indispensable. In that sense, we shall not discuss theories of choice under uncertainty. This is a serious restriction because objective probabilities may well not exist for some real choice situations.4
In the third section, we shall elaborate on a ‘minimalist’ extension of the von Neumann-Morgenstern framework. It has interesting interpretations, is easily tractable and should thus be useful for the development of the arguments that form the rationale for the more complicated extensions discussed in the fourth section.
The fourth section presents some of the most widely discussed theories based on non-expected utility functions. We are concerned with only one strand of the theoretical literature—axiomatic non-expected utility theory—and structure this literature according to some plausible deviations from the independence axiom. These are the betweenness property and rank-dependent probability representations.5
In section five we shall discuss some interpretations of the different axiomatic frameworks.
Most of the non-technical examples are given in this section.6
2 von Neumann-Morgenstern expected utility theory

2.1 The model
Von Neumann-Morgenstern expected utility theory asks us to express our preferences over lotteries much in the same way as we usually express our preferences over goods in the general economic theory of choice under certainty. Lotteries are taken to be representations of risky or uncertain, mutually exclusive alternatives; they are denoted by {x1 : p1, x2 : p2, . . .}, where xi is the payoff in case of outcome i and pi the probability of outcome i. The outcomes (x1, . . . , xn) of simple lotteries are sure, while in compound lotteries we replace the sure outcomes by (simple) lotteries. We shall assume throughout the discussion that a compound lottery can be reduced to a probabilistically equivalent simple lottery (A0—reduction axiom).7
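In computational terms, a simple lottery can be treated as a map from outcomes to probabilities, and the reduction axiom A0 amounts to flattening a nested lottery by multiplying probabilities through. A minimal sketch in Python (the representation and function name are our own, purely for illustration):

    # A simple lottery maps outcomes to probabilities, e.g. {3: 0.5, 0: 0.5}.
    # A compound lottery maps sub-lotteries (stored as hashable tuples of
    # items) to probabilities; reduce_lottery() flattens it as A0 requires.

    def reduce_lottery(compound):
        """Flatten a compound lottery {sub_lottery: prob} into a simple one."""
        simple = {}
        for sub, p in compound.items():
            for outcome, q in dict(sub).items():
                simple[outcome] = simple.get(outcome, 0.0) + p * q
        return simple

    # With probability 1/2 play {3: .5, 0: .5}, otherwise receive 1 for sure:
    compound = {tuple({3: 0.5, 0: 0.5}.items()): 0.5,
                tuple({1: 1.0}.items()): 0.5}
    print(reduce_lottery(compound))  # {3: 0.25, 0: 0.25, 1: 0.5}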
Von Neumann & Morgenstern axiomatise the theory with the undefined primitive binary relation ‘≻’ on the (abstract) convex and nonempty set L (cf. [Neu44, 24ff]). We interpret members of L as probability measures on a Boolean algebra of subsets of a set X of outcomes (that is, as lotteries),8 ‘u ≻ v’ as ‘u is preferred to v’ and ‘u ∼ v’ as ‘u is indifferent to v’ if neither ‘u ≻ v’ nor ‘v ≻ u’. Let u, v, w ∈ L and α, β ∈ [0, 1]:

A1—order: the binary relation ≻ is asymmetric and transitive; ∼ is transitive.9 (2.1)

A2—separation (‘independence’): u ≻ v ⇒ αu + (1 − α)w ≻ αv + (1 − α)w.10 (2.2)

A3—continuity: u ≻ v ≻ w ⇒ ∃α, β ∈ [0, 1] : αu + (1 − α)w ≻ v ∧ v ≻ βu + (1 − β)w.11 (2.3)
Based on these restrictions on ≻, von Neumann-Morgenstern’s expected utility theorem establishes the existence of an ordinally unique and in probabilities linear real function U(·) that represents preferences over lotteries meeting A0–A3 such that:

v ≻ w ⇔ U(v) > U(w). (2.4)

That is, individuals whose preferences satisfy the above axioms choose as though maximising expected utility. Von Neumann-Morgenstern expected utility functions have the following characteristic form for p = {x1 : p1, . . . , xn : pn} ∈ L:12

U(p) = Σi pi u(xi). (2.5)
That is, the utility valuation function u(x) used to rank the sure outcomes (to be interpreted as
mutually exclusive, final wealth levels) xi is the same for each outcome. This implies that each
outcome’s utility is totally independent of the valuation of the other outcomes in the lottery;
therefore, they can be summed up independently, that is, outcomes are additively separable.13
Utilities are combined in a multiplicative manner with the probabilities and are subsequently
added—just as mathematical expectations are formed. But since we form expectations not over
sure outcomes xi but over their utility u(xi ), the result is considerably more general than the
mere mathematical expectation of separable events. We call the expected utility function U(p)
the von Neumann-Morgenstern expected utility function and the function u(x) the Bernoulli
utility function (cf. [Mas96, 184]).14
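As a concrete illustration of (2.5), the following sketch evaluates lotteries under an assumed concave Bernoulli function; the utility function and the numbers are hypothetical:

    import math

    def expected_utility(lottery, u):
        """von Neumann-Morgenstern form (2.5): the sum of p_i * u(x_i)."""
        return sum(p * u(x) for x, p in lottery.items())

    u = lambda x: math.log(1 + x)   # a concave Bernoulli utility (assumed)
    p = {100: 0.5, 0: 0.5}          # a fair gamble over £100 or nothing
    q = {50: 1.0}                   # the gamble's expectation for sure
    print(expected_utility(p, u) < expected_utility(q, u))  # True: risk aversion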
All of the above axioms can be (and have been) challenged. In the present discussion,
however, we shall accept the order axiom A1 (implying transitivity and completeness) and the
continuity axiom A3 as normatively compelling. This will be justified in section five and is not
universally supported (cf. [Fis88a, 27f]).
2.2 The core element: the independence axiom
In general, we would assume that any choice-act under risk depends on the utility of the outcome
and its probability of occurring. Such a general function could look like:
U(p) = u(p, x). (2.6)
The independence axiom A2, however, embodies a much deeper insight that gives the theory its vast empirical content and power in identifying irrational behaviour (cf. [Mac82]). Its essential presumption is that utilities are in a sense the unalterable data of a decision problem while probabilities represent some relative frequency of occurrence that may vary with the evidence available. Hence, some separation between the two seems warranted. The structure of our discussion is given by several different attempts to axiomatise the separation of these factors in a meaningful way that allows for empirical phenomena such as Allais’ paradox to be accommodated.

The independence axiom is a normative argument and can be deduced from von Neumann & Morgenstern, although they did not explicitly state it in their axiomatisation. A statement of the independence axiom alternative to A2 is:

If the lottery u is preferred (resp. indifferent) to the lottery v, then the mixture αu + (1 − α)w will be preferred (resp. indifferent) to the mixture αv + (1 − α)w for all α in the unit interval and all w.
If we mix two lotteries u and v over which there is an established preference relationship with a third lottery w with equal probability 1 − α, the original and mixed relationships between u and v coincide. This is the key ingredient of the definition of von Neumann-Morgenstern rational choice behaviour. It can easily be shown that the level set15 of functional representations of preferences over lotteries u, v and w that satisfy A1–A3, such as (2.5), exhibits straight, parallel lines (indifference curves) in a probability diagram (cf. [Mas96, 175f]).
The level sets of expected utility functions for n outcomes can be represented in an (n − 1)-dimensional simplex. In the three-dimensional case, the simplex ∆ = {p ∈ ℝ₊ᴺ : Σi pi = 1} can be graphically represented by an equilateral triangle with altitude 1. Each perpendicular can then be interpreted as the probability of the outcome at the opposing vertex. Hence, every point in the triangle represents a lottery. This is shown in figure 1.a for the three sure outcomes £1, £2, £3 placed at the vertices. Notice that these vertices represent degenerate lotteries (i.e. pi = 1; pj = 0 for j ≠ i). The lottery Q = {£1 : p1, £2 : p2, £3 : p3}, with all probabilities equal to one third, is drawn in figure 1.a.
Figure 1: Level set of an expected utility function in a three-dimensional simplex.
Assuming u(·) is twice differentiable and increasing, we define the Arrow-Pratt coefficient of absolute risk aversion of Bernoulli utility functions defined over outcomes xi as16

r(x) = −u″(x)/u′(x). (2.7)
Indifference curves of risk-averse decision-makers have a steeper slope than the iso-expectation lines that are shown as dashed lines in figure 1.b.17 Trivially, the level sets of the mathematical expectation consist of straight, parallel lines as well. Hence, the level set in figure 1.b represents risk-prone behaviour and the set in 1.a risk-averse choices. The arrow points in the preferred direction.
Bernoulli functions of risk-averse agents in [x, u(x)] space are concave—the more concave, the more risk-averse—as depicted in figure 2.a, while those of risk-prone agents are strictly convex. Notice that this follows directly from Jensen’s inequality.18
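Both (2.7) and Jensen’s inequality are easy to check numerically. The sketch below uses central finite differences and Bernoulli’s logarithmic utility; the step size and example values are our own choices:

    import math

    def arrow_pratt(u, x, h=1e-5):
        """Arrow-Pratt coefficient (2.7): -u''(x)/u'(x), via finite differences."""
        u1 = (u(x + h) - u(x - h)) / (2 * h)
        u2 = (u(x + h) - 2 * u(x) + u(x - h)) / h ** 2
        return -u2 / u1

    u = math.log
    print(arrow_pratt(u, 10.0))     # ~0.1 = 1/x for logarithmic utility

    # Jensen's inequality: E[u(X)] <= u(E[X]) for concave u.
    xs, ps = [1.0, 3.0], [0.5, 0.5]
    Eu = sum(p * u(x) for x, p in zip(xs, ps))
    uE = u(sum(p * x for x, p in zip(xs, ps)))
    print(Eu <= uE)                 # True: the agent is risk averse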
Another technicality we need to touch on, which is concerned with the shape of u(·), is stochastic dominance. We have already seen that we can move from simple lotteries with a discrete number of outcomes to any finite number of outcomes by using compound lotteries. By switching to continuous distribution functions, we can cover an infinite number of outcomes. The probability distribution F first-order stochastically dominates G iff (for nondecreasing u):19

∫ u(x) dF(x) ≥ ∫ u(x) dG(x). (2.8)
Figure 2: Bernoulli utility function of a risk averse (a) and a risk prone (b) decision-maker.
First-order stochastic dominance (fosd) is a straightforward extension of the relation ≻ to the stochastic case: more is better. Here ≻ describes a weak order over the continuous probability distributions F and G rather than over the simple measures u and v. So essentially, the discrete probabilities in a lottery are replaced by a continuous distribution function. Fosd implies that the mean of F is larger than that of G; it does not imply, however, that each probability value F(xi) is larger than G(xi). Actually, the discrete version of the definition of fosd is F(xi) ≤ G(xi) for every xi. Stochastic dominance of distribution F over G means that F gives unambiguously higher returns than G. Since the dominating distribution is always preferred, fosd gives some structure to the preferences of a decision-maker endowed with preferences meeting this criterion—these are called stochastic dominance preferences. Figure 3.a shows fosd for the continuous case and 3.b is a discrete version for a lottery with three outcomes.
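The discrete criterion F(xi) ≤ G(xi) can be checked directly by accumulating probabilities over a common sorted support, as in the following sketch (the two example lotteries are hypothetical):

    def fosd(F, G):
        """True if lottery F first-order stochastically dominates lottery G,
        i.e. F's cumulative distribution lies at or below G's everywhere."""
        support = sorted(set(F) | set(G))
        cF = cG = 0.0
        for x in support:
            cF += F.get(x, 0.0)
            cG += G.get(x, 0.0)
            if cF > cG + 1e-12:     # F's CDF lies above G's: no dominance
                return False
        return True

    F = {1: 0.2, 2: 0.3, 3: 0.5}
    G = {1: 0.4, 2: 0.3, 3: 0.3}
    print(fosd(F, G), fosd(G, F))   # True False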
Figure 3: Two lotteries represented as cumulative distribution functions with supports in the
unit interval. Distribution F first-order stochastically dominates G if distribution F is always
at or below the distribution G.
2.3 Some paradoxical results
Its founders took expected utility theory as a normative argument for idealised behaviour
under risk. Its successes, however, made it imperative to test the theory’s predictions as
well. The classical falsifications of the descriptive side of expected utility theory are Allais’
argument against the independence axiom, Ellsberg’s paradox against the existence of unique
subjective probabilities, and, perhaps most fundamentally, Lichtenstein & Slovic’s argument
against transitivity of preferences over lotteries (cf. [All53], [Ell61], [Lic71]).
The fundamental objections to expected utility theory fall into five broad categories: (1)
common consequence, (2) common ratio, (3) preference reversal (or, more generally, all intransitivity) phenomena, (4) non-existence of subjective probabilities,20 and (5) framing effects.
Each of these is taken up in turn.
2.3.1 Common consequence effect
The intuition behind paradoxes based on the common consequence effect is summarised by Bell as “winning the top prize of $10,000 in a lottery may leave one much happier than receiving $10,000 as the lowest prize in a lottery” [Bel85]. Essentially, it says that outcomes are not independent and agents show a higher degree of risk aversion for losses than for gains. Therefore, the independence axiom does not hold and we would expect the expected utility form to be violated. The Allais paradox depicted in table 2 is the leading example of this class of anomalies and shows that this is indeed what happens in experiments (cf. [All53, 89]). It illustrates the normative tension between the reduction and the independence axioms: if one accepts the former, one must discard the latter. The general form of the common consequence argument is:
b1: αδx + (1 − α)F    b2: αP + (1 − α)F
b3: αδx + (1 − α)G    b4: αP + (1 − α)G

Table 1: General form of (row-wise) common consequence effects.
The paradoxical result is obtained by setting δx to yield x with certainty while the lottery
P contains both larger and smaller outcomes than x. In addition, the probability distribution
of F fosds that of G. A concrete example is Allais’ paradox.
a1: {£1 Mio: 1, £0 Mio: 0}    a2: {£5 Mio: .1, £1 Mio: .89, £0 Mio: .01}
a3: {£5 Mio: .1, £0 Mio: .9}    a4: {£1 Mio: .11, £0 Mio: .89}

Table 2: Allais’ paradox; usually a1 ≻ a2 and a3 ≻ a4.
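It is easy to verify that no Bernoulli function can rationalise the modal choices: a1 ≻ a2 requires .11u(1) > .1u(5) + .01u(0), while a3 ≻ a4 requires exactly the reverse inequality. A brute-force search over increasing utility assignments illustrates the point (outcomes in £ millions; the search procedure is purely illustrative):

    import random

    def eu(lottery, u):
        return sum(p * u[x] for x, p in lottery.items())

    a1 = {1: 1.0}
    a2 = {5: 0.1, 1: 0.89, 0: 0.01}
    a3 = {5: 0.1, 0: 0.9}
    a4 = {1: 0.11, 0: 0.89}

    # Search for any increasing u with a1 > a2 AND a3 > a4: none can exist.
    found = False
    for _ in range(100_000):
        v = sorted(random.random() for _ in range(3))
        u = {0: v[0], 1: v[1], 5: v[2]}
        if eu(a1, u) > eu(a2, u) and eu(a3, u) > eu(a4, u):
            found = True
            break
    print(found)  # False: the Allais pattern is inconsistent with (2.5)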
In figure 4.a we show a set of level curves that represents risk-prone behaviour and in 4.b indifference curves of risk-averse attitudes are graphed. Notice that the dotted lines connecting the choice lotteries in figure 4 form a parallelogram with the base of the simplex. Therefore, no parallel, straight level set (i.e. indifference curves consistent with the independence axiom) can represent both a1 ≻ a2 and a3 ≻ a4 in either of the two representations. Hence, whatever the attitude towards risk, the choice behaviour described by Allais’ paradox cannot be represented by expected utility theory. Apparently, for this purpose we need a level set where indifference curves are not parallel.21

Indifference curves in the level set of 4.c are not parallel but ‘fan out’ from a point of intersection south-east of the £1 Mio vertex in order to accommodate preferences that are forbidden by the independence axiom.
Figure 4: (a) Allais’ paradox with an expected utility level set that represents risk-prone behaviour, (b) a level set representing risk-aversion. (c) Indifference curves showing the ‘fanning
out’ property.
2.3.2 Common ratio effect
The most well known examples of this second class of paradoxes are Kahneman & Tversky’s
certainty effect and Hagen’s Bergen paradox (cf. [Kah79], [Hag71, 289]). Analytically, the
common ratio effect is very similar to the common consequence effect.
d1: {X : p, 0 : 1 − p}    d2: {Y : q, 0 : 1 − q}
d3: {X : rp, 0 : 1 − rp}    d4: {Y : rq, 0 : 1 − rq}

Table 3: General form of common ratio effects (the ratio p:q equals rp:rq).
Here, the paradoxical result is obtained by setting p > q, r ∈ (0, 1), and 0 < X < Y. The special case of the certainty effect is illustrated in table 4.
c1: {3.000 : 1, 0 : 0}    c2: {4.000 : .8, 0 : .2}
c3: {3.000 : .25, 0 : .75}    c4: {4.000 : .2, 0 : .8}

Table 4: Kahneman & Tversky’s certainty effect as a special form of common ratio effect.
Kahneman & Tversky report that in their experiments 80% of the subjects chose c1 over c2
and 65% chose c4 over c3 , which again implies that level sets will show fanning out of indifference
curves contradicting the independence axiom.
2.3.3 Order problems
In the preference reversal setting of Lichtenstein & Slovic, the agent is asked to choose between
a number of betting situations of the form shown in table 5 (cf. [Lic71]). Agents are first asked
to choose between the two bets and then to give their (true) certainty equivalents22 for the bets
in the form of a selling and a buying price. Clearly, the bet that is chosen should also be the one
given the higher certainty equivalent—this, however, is not in general the case. Lichtenstein &
Slovic report that in one particular setting 127 out of 173 subjects chose e1 over e2 but assigned
a higher value to e2 than to e1 (therefore the naming p-bet and £-bet).
Figure 5: (a) Preferences allowing for the certainty effect; (b) Preferences forbidding the effect.
e1 (p-bet): {X : p, x : 1 − p}    e2 (£-bet): {Y : q, y : 1 − q}

Table 5: Preference reversal when stated as certainty equivalent and payoff.
If X > x, Y > y, Y > X and p > q, a violation of the order axiom (A1 ) is usually observed
in the agent’s choices. This can be used to the effect that a lottery can be bought from the agent
and sold back to her at a premium while leaving her other options unchanged. That is, she can
be used as a money pump. We shall argue below that, upon reflection, such behaviour is not
plausible. This is the basis of our acceptance of the order axiom on normative grounds.23 Some
of the theories discussed in section four, however, can accommodate violations of transitivity.
Other systematic intransitivities have been reported by MacCrimmon & Larsson and by May (cf. [Mac79], [May54]).
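The money pump mentioned above can be made concrete. Suppose the agent holds e2, chooses e1 over e2 in a direct choice, but announces a higher reservation price for e2; an outsider can then trade around the cycle. A schematic sketch with hypothetical prices:

    # Preference reversal: direct choice ranks e1 over e2, yet the agent's
    # announced reservation prices rank e2 over e1.
    price = {"e1": 3.00, "e2": 4.00}

    cash = 0.0                      # the outsider's running profit
    # 1. The agent holds e2; the outsider swaps it for e1 at no charge
    #    (the agent accepts, since she prefers e1 in a direct choice).
    # 2. The outsider sells e2 back to her at her own stated price.
    cash += price["e2"]
    # 3. The outsider buys e1 from her at her own stated price.
    cash -= price["e1"]
    # The agent again holds e2, exactly as at the start, but is poorer.
    print(f"outsider's profit per cycle: {cash:.2f}")   # 1.00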
2.3.4 Framing effects
Framing effects are disturbing because evidently, the way a decision problem is posed has
some influence on how individuals choose. One of the most persuasive examples reported is
due to Kahneman & Tversky, who ask doctors to choose between two alternative vaccination
programs, each carried out on 600 persons:
“Assume that the exact scientific estimate of the consequences of the programs is as follows:
If program A is adopted, 200 people will be saved. If program B is adopted, there is a 1/3
probability that 600 people will be saved, and a 2/3 probability that no people will be saved.”
“If program C is adopted, 400 people will die. If program D is adopted, there is a 1/3
probability that nobody will die, and a 2/3 probability that 600 people will die.”[Kah88]
Usually, agents choose program A over B but D over C, although probabilistically A equals
C and B equals D. This behaviour is in violation of a consequence of A1 —asymmetry. This class
of experiments targets the so-called reference point from which the agent evaluates variations
in individual wealth. The focus on a reference point as incorporated e.g. in prospect theory
is a modification of von Neumann & Morgenstern’s original concept of regarding total wealth
as the argument of their Bernoulli utility functions. While normatively compelling, this was
dropped by Kahneman & Tversky after they showed that agents choose inconsistently over
lotteries with identical distributions over total wealth (cf. [Kah88]). The particular reference point taken by an individual to judge the desirability of a gamble, however, can be influenced by the description of the choice situation—this is not true to the same extent for final wealth levels.
Although the above example only targets asymmetry, there are framings against virtually every axiom that has been proposed. Hence, we have to select the axioms we wish to keep on a normative basis. The arguments for doing so will be discussed in section five.
2.4 Static Dutch book arguments
A Dutch book is a bet where no matter how uncertainty or risk is resolved, the decision-maker will always lose. A decision situation is said to be dynamic if it involves some decisions to be made after (some of) the initial uncertainty is resolved, that is, if at least one information set24 is more finely partitioned at some point of the decision making than was the case ex ante. When all decisions are made on the initial information partitioning, the choice situation is static. Part of the literature claims that only expected utility theory can ensure that preferences are of a form such that static Dutch books can never be made (cf. [Web87]). This is wrong for appropriately designed non-expected utility functionals because of an implicit continuity assumption. This is well documented in the literature and we shall only pick one example to illustrate it (cf. [Ana93, 79]). Let a risk-averse decision-maker hold preferences over w = {£3 : 2p, 0 : 1 − 2p} and v = {£3 : p, 0 : 1 − p} such that:

u(w) < u(v) + u(v). (2.9)
Notice that w′ = {£3 : 2p + ε, 0 : 1 − 2p − ε} and v′ = {£3 : p + ε, 0 : 1 − p − ε} fosd w and v. By continuity, however, we can restate (2.9) as:

u(w′) < u(v) + u(v) (2.10)
which is a direct violation of fosd, although this behaviour seems to be entirely sensible. Hence,
we should be suspicious about proofs involving the continuity assumption in this context—not
all failures of fosd are troubling.
Other breakdowns of fosd, however, allow Dutch books to be made. If we use a functional form which does not employ the Bernoullian riskless intensity notion but uses probabilities in some more general form τ(pi), we may have Σi τ(pi) ≠ τ(1).25 If the sum is strictly greater, we can always extract a positive amount from an individual who holds such preferences in return for a smaller sure payoff. Kahneman & Tversky’s ‘pure’ prospect theory (i.e. the mathematical form without the ‘editing out’ of dominated outcomes) violates fosd in this way, allowing for Dutch books. More sophisticated functionals (exhibiting stochastic dominance), however, avoid this shortcoming (cf. [Kel93, 17]).26
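To see the mechanism, suppose a weighting function with τ(p) > p for small p, so that the weights of a many-outcome lottery sum to more than one; the agent then values a prize presented as many separate states above its actual worth. A sketch with a hypothetical weighting function and linear utility:

    def tau(p):
        """A hypothetical weighting function that overweights small p."""
        return p ** 0.5 / (p ** 0.5 + (1 - p) ** 0.5)

    # Ten equally likely states, each paying the same prize of 10. Without
    # an editing stage, the agent evaluates this ten-state presentation.
    probs, prize = [0.1] * 10, 10
    weights = [tau(p) for p in probs]
    valuation = sum(w * prize for w in weights)   # u(x) = x for simplicity

    print(round(sum(weights), 2))   # 2.5 > 1 = tau(1): tau is not a measure
    print(round(valuation, 1))      # 25.0, although the lottery surely pays 10
    # Dutch book: sell the lottery at just below the agent's valuation and
    # settle it for 10 whatever happens: a sure profit for the bookmaker.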
All forms of intransitive preferences will lead to the possibility of Dutch books as well.27
The same is true if someone’s preferences over lotteries are incoherent (i.e. not satisfying
Kolmogorov’s axioms of probability)—a fact used by Ramsey in his statement of subjective
probability theory.
2.5 Dynamic Dutch book arguments
In the dynamic setting where decisions take the form of contingent plans involving sequential
moves, a distinction can be made between planned and actual choices. If there are no unexpected events, the planned and actual choices will be consistent. As in the preceding subsection,
an important defence of expected utility functions would be to show that only the independence axiom implies dynamically consistent behaviour, and a Dutch book can always be made
against an agent holding non-expected preferences. In general, again, this is not the case—it
is true, however, for theories that (implicitly) assume a version of consequentialism (discussed
in section four).28
Our plan is to show that there are reasonable consistency requirements which imply some version of utility maximisation. In particular, we want to explore which axioms are compatible with the additive separability of states implied by the independence axiom, while still entailing that agents choose dynamically consistently (if beliefs are updated using Bayes’ rule). There are several possible ways of establishing this.
Employing the reduction axiom, we can rewrite table 2 in a way that lends itself to the
direct application of the independence axiom.
a1: {£1 Mio: .89, £1 Mio: .11}    a2: {£5 Mio: .1, £1 Mio: .89, £0 Mio: .01}
a3: {£5 Mio: .1, £0 Mio: .89, £0 Mio: .01}    a4: {£1 Mio: .11, £0 Mio: .89}

Table 6: An equivalent form of Allais’ paradox; again usually a1 ≻ a2 and a3 ≻ a4.
This decision situation can be represented in the following extensive form game tree (cf. [Mac89, 57]). Nature’s move at A (A′) precedes the move by the decision-maker, and therefore this move is included in whatever is chosen at B (B′).
Figure 6: The two choices a1 : a2 and a3 : a4 of Allais’ paradox in the extensive form.
After Nature’s first move at A, the player’s choice at B precisely corresponds to the first row of Allais’ paradox in the above table. The choice at B′ corresponds to the table’s second row. The highlighted moves are the ones that are usually taken: a1 and a3. Notice that the subtrees rooted at B and B′ are identical—we therefore denote the choice node by B∗.

We first record the decision-maker’s choice at B∗, ignoring the preceding part of the tree (this is a consequentialist procedure), as the decision-maker’s planned move for the case that she is called upon to choose at B∗ (i.e. if Nature chooses ‘up’ at A or A′). If she were now put in the situation of actually deciding after Nature’s move (which she has no influence on whatsoever), it would be paradoxical if she were to change her mind. This is, however, precisely what subjects in experiments usually do by preferring a1 to a2 and a3 to a4. This can certainly not be brought into line with the kind of (temporal) separability the independence axiom allows. We shall use this as our intuitive notion of dynamic inconsistency between planned and actual moves.
A different version of dynamic inconsistency again employs the notion of money pumps (cf. [Gre87, 787]). An outside agent (who knows the preferences of the decision-maker) formulates compound lotteries and offers the decision-maker at intermediate stages a reduction in risk for an outcome beneficial to the outside agent. Thereby the outside agent leads the decision-maker from one probability distribution over wealth to another probability distribution, with the difference being absorbed by the outsider.29 If this difference is positive, outsiders flourish, and decision-makers with such preferences will always lose money.
The route we shall follow for a more formal definition of dynamic consistency was charted by Karni & Schmeidler (cf. [Kar91, 1787ff]). As mentioned above, fosd does not imply separability across sublotteries. However, this is precisely what we need in order to fulfil our dynamic consistency requirements. We denote the space of all compound lotteries on X as P(X). Let z, y be two elements of the set of compound lotteries P(X) and denote the sublottery z of y as z ∈ y or (z|y). Finally, we define Ψ(X) = {(z|y) | y ∈ P(X), z ∈ y} as the space of lotteries and sublotteries on which the preference relation ≽ is defined. The preference relation ≽ satisfies dynamic consistency on Ψ(X) if for all quadruples of compound lotteries (y, y′, z, z′) we have:

C1—dynamic consistency: (y|y) ≽ (y′|y′) ⇔ (z|y) ≽ (z′|y′) (2.11)

where z ∈ y and y′ is obtained from y by replacing z with z′. This means that if the decision-maker prefers the compound lottery y to y′, then, if the sublottery (z|y) is played, she has no incentive to switch to z′ from z.
Similarly, we define Hammond’s consequentialism (subsection 4.1.2), where actions are solely judged by their consequences, as follows: ≽ satisfies consequentialism on Ψ(X) if for all quadruples of compound lotteries (y, y′, z, z′):

C2—consequentialism: (z|y) ≽ (z′|y′) ⇔ (z|ȳ) ≽ (z′|ȳ′) (2.12)

where z ∈ y, z ∈ ȳ; y′ is obtained from y by replacing z with z′ and similarly ȳ′ is obtained from ȳ by replacing z with z′.
Compound lotteries satisfy reduction if for all (y, y′, z, z′):

C0—reduction: (z|y) ≽ (z′|y′) ⇔ (z̄|ȳ) ≽ (z̄′|ȳ′) (2.13)

where z ∈ y, z′ ∈ y′, but z̄ is a simple probability measure.30 z̄ is obtained by replacing the compound lottery z in y by its reduced form, which is obtained by applying the calculus of probabilities. z̄′ is defined analogously.
A representation theorem then establishes that if the preference relation % on Ψ(X) satisfies
C0 and C2 , then it also satisfies C1 —dynamic consistency—iff it satisfies the independence
axiom A2 (cf. [Kar91, 1789]). This is our desired result.
The consequence of this discussion is that we cannot have it all: we cannot weaken the
independence axiom while sticking to the reduction axiom to accommodate ‘paradoxical’ behaviour and still satisfy dynamic consistency in general. Our formulations of non-expected
utility functions in section four will be measured by their ability to comply with the requirements imposed by static and dynamic consistency and the dominance criteria developed in this
section. Moreover, we can structure the different approaches by checking which of the above
axioms—reduction, consequentialism or dynamic consistency—are given up in order to arrive
at the particular non-expected theory.
3 A minimalist extension of expected utility

3.1 The inductive model
To study the argumentation that leads to functional forms of utility functions which are more general than the above-described additive expected utility functions, we first develop a minimalist extension of expected utility. The idea is to provide the theory with the ability to discount not only wealth levels (in the form of utility functions) but also probabilities in very small or very large regions. In this extension, probabilities have the usual linear relationship throughout the interval [p̲, p̄], but p̲ > 0 and p̄ < 1 are two thresholds where the function is discontinuous. More formally:

U(p) = Σi v(pi) u(xi), (3.1)

v(p) ≡ 0 if p < p̲;  p if p̲ ≤ p ≤ p̄;  1 if p > p̄. (3.2)
If individuals perceive probabilities31 as too low to care about or, symmetrically, too high to doubt the outcome, they consider the respective states as impossible or certain. This seems to be intuitively sensible as long as different thresholds are allowed for different outcomes. For instance, the lower tolerance limit for nuclear accidents may well be considerably lower than the lower limit for what is perceived as crossing a road safely. Two sensible (empirical) assumptions are:

dp̲(x)/dx < 0, and dp̄(x)/dx < 0. (3.3)

This expresses the above idea that both the lower and the upper thresholds decrease with growing values of the gamble. For very high stakes, one is unwilling to accept the same risk as for relatively unimportant low stakes. Equivalently, the upper limit is constructed by formulating the bet symmetrically, i.e. by asking at which probability the nuclear plant is perceived as ‘safe’.
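The threshold weighting (3.2) and the utility (3.1) are straightforward to implement; the following sketch uses hypothetical threshold values:

    def v(p, lo=0.01, hi=0.99):
        """Threshold weighting (3.2): probabilities below the lower threshold
        are snapped to 0, those above the upper threshold to 1."""
        if p < lo:
            return 0.0
        if p > hi:
            return 1.0
        return p

    def U(lottery, u, lo=0.01, hi=0.99):
        """Inductive-model utility (3.1): the sum of v(p_i) * u(x_i)."""
        return sum(v(p, lo, hi) * u(x) for x, p in lottery.items())

    u = lambda x: x
    # A 0.5% chance of losing 1000 falls below the threshold and is ignored:
    print(U({-1000: 0.005, 0: 0.995}, u))   # 0.0, not the expected value -5.0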
One interpretation of such behaviour is simple induction. A Bayesian can never actually reach probability 1 if there is any counter-evidence—however slight it may be. In our inductive model, however, we can (discontinuously) jump from a threshold value that is ‘quite sure’ to absolute certainty.

The lower level p̲ can be interpreted as the attention span an individual requires to devote any attention at all to a phenomenon (e.g. the positive probability of breaking through the ceiling of a building is usually neglected). p̄, accordingly, can be seen as the probability above which an individual is certain that a phenomenon will occur (e.g. the sun rising tomorrow). This is in contrast to the usual Bayesian approach, which requires the distribution function between the two limits to be smooth and (indeed asymptotically) S-shaped. Since we use the interpretation of induction now, the abscissa is labeled H—the set of hypotheses—in figure 7.
The discontinuous nature of the inductive distribution function means that, in general, we shall lose many of the established economic tools we can use to show the existence of equilibria. We can, however, project our inductive model on a lattice and study probability not, as in section two, as a probability measure on some σ-algebra of attributes but as itself being a lattice. An argument in favour of such a strategy is that our inductive model will only be able to cope with Allais’ paradox if the probabilities in question are either very small or very large. This, however, is not necessarily the case. We can extend our model to what is suggested in figure 7.b: to give up the linear segment as well and only retain a discrete set of probability values. In this view, probabilities themselves constitute a set M that can be partially ordered by a relation ≽. There are no numerical probability measures but only a weak order on the elements of M. Addition and subtraction are defined in the usual way by their set-theoretic equivalents of union and intersection. If the probability ‘sum’ or ‘difference’ hits or exceeds the thresholds, the supremum and infimum (which themselves need not be elements of the lattice) of 1 and 0 result.
Figure 7: A smooth and a discontinuous cumulative distribution. The S-shaped function (a)
represents the Bayesian approach, while the linear function (b) illustrates the inductive approach.
Notice that the minimum difference between two outcome probabilities towards which agents are not indifferent (i.e. the distance of the lattice elements) can be arbitrarily small—the same is true for the distance of the thresholds from 1 and 0. We shall refer to this as the agent’s attention span. There exists an isomorphism between each element of X (outcomes) and the elements of M (probabilities).32
3.2 The inductive model’s predictions
The level set of this inductive model would look like figure 8.a in the probability simplex. The point is not that the level set fans out (which would signal a systematic one-sided distortion of how probabilities are mapped onto the lattice). Rather, we want to point out that there are only slight departures from parallelity. Moreover, indifference is not defined in a dense fashion but only for the values the probabilities can take (i.e. for the elements of M)—hence, the dotted indifference ‘curves’.
Figure 8: A level set corresponding to a transitive inductive model and Allais’ paradox is shown
in (a). In (b) we give an example of a non-additive union operation yielding a different result
from expected utility theory.
Why should this be worth our while? The easiest way to demonstrate the general argument is again graphical. A drawback is that we have to restrict ourselves to two dimensions.33 It is easy to see from figure 8.b that we can define unions which do not exhibit the additive separability of expected utility theory. Hence, we can easily find non-expected formulations. In expected utility theory, multiplying the probability with the outcome forms each lattice element; then all elements along the 45° diagonal of the lattice are joined (i.e. added). Separability means that only one element is evaluated at a time—the off-diagonal elements are ignored. Our union operation is considerably more general and can exclude certain diagonal elements while including some off-diagonal elements.

An example of a non-standard union operation is the set depicted in figure 8.b. In addition to the diagonal elements, it includes the element (p1, u(£1))—for some reason, this element is important for the decision. In the above example of the nuclear power plant, we could think of the small probability that the plant might explode as affecting all other states. Hence, state evaluation is not separable.
3.3 A disclaimer
The most important step in developing a theory as outlined above is to show that some version
of Kolmogorov’s axioms of probability still holds. This, however desirable, is not a simple task
and will not be attempted here. Moreover, for the purposes of this essay this is not required.
We just want to develop an alternative reasoning for the sake of illustrating which arguments
are important—we do not need to show that it actually works. This is done for the more
elaborate theories in the following section.
4 Non-expected utility theory
There exist a variety of non-expected utility theories. All are designed to accommodate empirically established violations of one or more of the axioms A0–A3. The focus of the present discussion lies in alternatives to the additive separability implied by the independence axiom. The alternative roads that are taken are to discuss early alternatives, the betweenness property, and some form of rank-dependent probability representation. These different (empirically increasingly weak while mathematically progressively more general) fundamental intuitions give rise to a number of different axiomatisations, which will be discussed in turn. Space limitations, however, force us to discuss only two examples of each category. For each kind of separation, an example is presented in section five which illustrates the fundamental intuition behind this class of theories.34
4.1 Variations on the additivity property
If preferences satisfy the independence axiom, outcomes in the respective lottery are independent of the specific context they are placed in. The theories described in this subsection are
extensions or reformulations of expected utility theory that leave von Neumann-Morgenstern’s
basic tenet intact. Their level sets are therefore represented properly by figure 1 and are not
duplicated here.
4.1.1 Bernoulli’s riskless intensity notion
Using the notation of section two, Bernoulli’s notion of expected utility is

U(p) = Σi p(xi) log(xi). (4.1)
Quite remarkably and with considerable effects on economics, this is the first statement
of the principle of (logarithmically) diminishing marginal ‘moral worth’ or utility of wealth.
Obviously, the additive separability implied by the independence axiom is obeyed.
Since Bernoulli uses a specific (risk averse) function u(·), his version is less general than von
Neumann-Morgenstern expected utility theory. Bernoulli does not place weights on probabilities—
hence, the name riskless intensity (cf. [Kar91, 1778], [Fis88a, 50]).
4.1.2 Hammond’s consequentialism
The separability of outcomes given by the independence axiom is backed up by Hammond’s consequentialist analysis. If an agent ignores all information contained in the decision tree preceding her choice node, she acts in a purely consequentialist manner: only the consequences of the agent’s action count. Hence, Hammond’s version of the independence axiom is that in any finite decision tree, everything that influences a decision must be modelled as a consequence. It is consequences and nothing else that determine behaviour. This axiom—we stated a version as C2 in (2.12)—justifies studying only the normal form (i.e. the decision matrix) of a decision situation and neglecting its extensive form (i.e. the decision tree) because there is no additional information contained in the tree.
Hammond’s axiom is a normative hypothesis that is in principle testable and, as shown in subsection 2.5, it implies the independence axiom. As Hammond proves, a consequentialist norm prescribing consistent behaviour over a decision tree can be defined which depends only on the consequences of behaviour. He proceeds to show that there exists a (complete and transitive) revealed preference ordering maximising this norm at any decision node that satisfies both the independence axiom and a version of the sure-thing principle allowing independent probabilities. Finally, he proves that these conditions (consistency, order, independence axiom / sure-thing principle) are a complete characterisation of consequentialist behaviour. Together with an additional continuity axiom, this implies the existence of a von Neumann-Morgenstern expected utility function. That is, consequentialism implies (in its objective probability version) expected utility maximisation.
To obtain the subjective version of the theory, additional assumptions have to be made. As
Hammond points out and contrary to the objective case, it is not the case that all probabilities
must be independent—consequently, we arrive at a state dependent utility theory which is based
on multiplicative separability rather than additive separability (cf. [Ham88, 74]). Hence, there
is also a foundation for non-expected utility theory in Hammond. Further support for this
rejection of additive separability comes from Drèze. He shows that the standard hypothesis
of possible consequences being independent of the state is completely unacceptable ‘if states
include calamities such as accidental death or injuries’. Under these circumstances, the lottery
framework of expected utility theory is not useful and is usually replaced by state-dependent
utility formulations (cf. [Drè87], [Mas96, 199ff]).
The same conclusion, the unacceptability of the additive separability assumption if utility depends on states, is also implied by another result of the consequentialist approach. One can show that, to avoid inconsistencies of behaviour, there cannot be zero probabilities of consequences at any chance node of a decision tree. This is in general not defensible and, hence, the assumption of additive separability cannot be maintained in general. We need non-expected formulations.
4.2 Early alternatives
The basic idea of all theories in this category is to assign probability weights to single outcomes in order to accommodate empirically troubling effects, such as Allais’ paradox. All
these attempts are vulnerable to Dutch books—only the more sophisticated RDU-approach
can circumvent this problem by looking at the complete distribution.
4.2.1 Intensity theory
Intensity theory goes back to Allais, who uses the same form as Bernoulli and von Neumann-Morgenstern and extends it with a functional θ(p∗). Indeed, Allais is led to a Bernoulli utility function with a shape very similar to Bernoulli’s own logarithmic function (cf. [All53, 34], [Sug86, 12], [Fis88, 274]).

V(p) = θ(p∗) + Σi u(xi) p(xi) (4.2)

where θ(p∗) is defined as the measure induced by p on the difference of utilities from expected utility. Allais argues that θ depends at least on the second moment (M2) and Hagen takes this to the third moment of p∗ (M3).35 Hence, the basic idea of this approach is to enrich von Neumann-Morgenstern’s conception by factors determined by the shape of the probability distribution over outcomes. If θ vanishes, that is, if we neglect all distributional aspects other than the expectation, we are back with expected utility. A particularly simple form of θ would be:

θ = αM2 + βM3. (4.3)
(4.3)
Allais assumes the sign on the second moment’s influence to be negative, while he assumes
β to be positive. This has the reasonable interpretation of people disliking increased risk and
showing higher risk aversion with respect to good outcomes than to bad ones. The simple
above form, however, is vulnerable to Dutch book attacks because it does not ensure that
stochastically dominating lotteries are chosen. Loomes & Sugden’s disappointment theory
incorporates this aspect with a straightforward extension that amounts to a locally linear Sshaped utility function (concave for gains but convex for losses) as depicted in figure 17.b (cf.
[Sug86, 13]). This type of utility function is the basis of Machina’s local linear approximation of
utility functions (cf. [Mac82]). Since this is not an approach that fits well within our axiomatic
framework, we shall not discuss it here.36 The below level set of figure 9.a is, however, applicable
to both theories.
As can be seen from figure 9.b, in Allais’ theory the decumulative probability density is
distorted both horizontally as the effect of the Bernoulli utility function and vertically as the
effect of the functional θ(p∗ ). For a decumulative density function that shows only the effect of
expected utility functions see figure 15.a.
Allais does not assume that decision-makers maximise the expected value of their riskless utility since he considers only one-off choice situations where ‘it would be wrong to consider that a strategy of maximising the mathematical expectation would be a good first approximation’ (cf. [All53, 73, 92ff]). In that, he departs from Bernoulli and von Neumann-Morgenstern. Allais’ basic principles are the reduction axiom A0 and a version of the order axiom A1 that he calls the ‘axiom of absolute preference’. This axiom contains the assumption that decision-makers’ preferences satisfy fosd because otherwise consistency cannot be ensured.
Figure 9: (a) A level set corresponding to Allais’ non-expected utility theory and (b) the
decumulative probability representation of the same theory.
4.2.2 Prospect theory
This very influential descriptive approach is due to Kahneman & Tversky (cf. [Kah79]). They developed their theory as a direct response to their experimental evidence against expected utility theory. In contrast to the other theories presented here, it is a descriptive theory and has no normative standing. Indeed, Kahneman & Tversky believe that no theory can exist that is descriptively accurate and normatively appealing (cf. [Fis88a, 26]). The normative touch Kahneman & Tversky give their theory is the invariance property of preferences to different frames (cf. [Fis88a, 27]). As mentioned above, lotteries in this theory are not about final wealth levels but about deviations from a certain reference point.
Prospect theory distinguishes two successive stages: the editing phase and the evaluation phase. At the editing stage, decision-makers contemplate the choice situation and, if possible, simplify the problem. This includes the operations of coding, combining, segregating, rounding, and cancelling that, in essence, amount to the usual manipulation rules we applied above for simplifying Allais’ paradox from the representation of table 2 to that of table 6. The most important additional operation included in the editing stage is the detection of dominated prospects, which are ruled out and discarded. Therefore, prospect theory is not solely based on the probability distribution over the ultimate payoffs but on additional factors which are seen as important ingredients of the individual’s actual choice—in this, the theory is similar to regret theory. The activities of the editing stage are below referred to as the methodological twist—without them, prospect theory would be prone to Dutch book attacks and, being unable to model intransitivity, descriptively false.
In the subsequent evaluation phase, the prospects are ranked, and the most highly valued risky outcome is chosen. Prospect theory employs two functionals: a probability weighting functional τ(p) as shown in figure 10.b and a Bernoulli utility function measuring gains and losses u(x), which is S-shaped to reflect Kahneman & Tversky’s findings that individuals are risk-prone towards losses and risk-averse towards gains (cf. figure 17.b). This amounts to:

V(p) = Σi τ(pi) u(xi). (4.4)

In general, τ(p) + τ(1 − p) ≠ 1; hence, τ(·) cannot be a simple measure. Prospect theory reduces to the expected utility form if τ(·) is the identity mapping. That it is not the identity in general, while stochastic dominance preferences are nevertheless ensured, is due to the methodological twists of the editing stage.
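A minimal sketch of the two stages, assuming the editing phase consists only of discarding fosd-dominated prospects; the shapes of τ(·) and of the value function are purely illustrative:

    def tau(p):
        """A hypothetical inverse-S probability weighting function."""
        return p ** 0.65 / (p ** 0.65 + (1 - p) ** 0.65) ** (1 / 0.65)

    def value(x):
        """An S-shaped value function over gains and losses (assumed)."""
        return x ** 0.88 if x >= 0 else -2.25 * (-x) ** 0.88

    def dominates(F, G):
        """Editing-phase check: True if prospect F fosd-dominates G."""
        xs, cF, cG, strict = sorted(set(F) | set(G)), 0.0, 0.0, False
        for x in xs:
            cF += F.get(x, 0.0)
            cG += G.get(x, 0.0)
            if cF > cG + 1e-12:
                return False
            if cF < cG - 1e-12:
                strict = True
        return strict

    def evaluate(prospect):
        """Evaluation phase, equation (4.4): the sum of tau(p_i) * u(x_i)."""
        return sum(tau(p) * value(x) for x, p in prospect.items())

    def choose(prospects):
        """Discard dominated prospects (editing), then maximise (4.4)."""
        kept = [P for P in prospects
                if not any(dominates(Q, P) for Q in prospects if Q is not P)]
        return max(kept, key=evaluate)

    c1, c2 = {3000: 1.0}, {4000: 0.8, 0: 0.2}
    print(evaluate(c1) > evaluate(c2))   # True: the certainty effect of table 4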
Figure 10: (a) A level set corresponding to prospect theory and a weighting function τ (·) in
(b).
Interpreted in a purely descriptive fashion, prospect theory allows Dutch books—only if we allow for the methodological twists is the theory an alternative to expected utility theory.
4.3 The betweenness property
A weakened form of the independence axiom A2, called the betweenness axiom, says that the preference ranking of a probability mixture of any two lotteries is always intermediate between the individual lotteries.37 We leave the order axiom A1 unchanged and therefore retain the weak order ≻ and the equivalence order ∼. Again let p, q ∈ P(X), with α ∈ (0, 1):38

B2—separation (‘betweenness’): p ≻ q ⇒ p ≻ αp + (1 − α)q ≻ q. (4.5)

This implies for every p, q ∈ P(X) with p ∼ q and α ∈ (0, 1) that

p ∼ αp + (1 − α)q ∼ q, (4.6)

that is, if the decision-maker is indifferent between two lotteries, she is indifferent to any mixture of the lotteries as well. Figure 11.b illustrates this point. Betweenness is clearly implied by independence; this is easy to see since (4.6) implies both quasi-concavity and quasi-convexity of ≽ on P(X).39 Therefore, level curves are linear, but they are not parallel and they need not all emanate from a single point (cf. [Che98, 211]). Hence, mixtures between lotteries are not fully separable. A picture may clarify the difference between independence and betweenness: for all functionals V(p) that satisfy betweenness, we have for all α ∈ (0, 1):

V(p) > V(q) ⇒ V(p) > V(αp + [1 − α]q) > V(q). (4.7)

In general, a more appropriate formulation of continuity B3 has to be supplied (cf. [Kar89, 431], [Kar91, 1772]). Since its interpretation is equivalent to A3 we do not duplicate it here. Axiomatisations in this subsection are based on axioms A0, A1, B2, and B3.
4.3.1 Weighted (linear) utility theory
Weighted utility theory is based on an axiomatisation of the betweenness property by Chew and MacCrimmon called the weak substitution axiom (cf. [Che83], [Mac79]).
Figure 11: Three lotteries with i = αp + (1 − α)q, α ∈ (0, 1); (a) illustrates (4.6) and (b) shows
that (4.7) forbids non-linear level curves.
The independence axiom requires that mixtures of two distributions with the same expectation with another distribution in the same proportions share the same mean, regardless of the third distribution they are mixed with. The weak substitution axiom allows the above mixture proportions, which give rise to the same mean value, to be different (cf. [Che83, 1068]).
One interpretation of weighted utility theory is in terms of a transformation of the ratio of transformed probability to probability as a function of the outcome x. If the positive weighting functional τ(·) is low for highly ranked outcomes and high for lowly ranked outcomes, the resulting distortion of probabilities implies overestimation of the lowly ranked outcome’s probability and underestimation of the highly ranked outcome’s probability. An appropriate choice of τ may well represent a pessimistic (resp. optimistic) attitude towards risk (cf. [Kar91, 1775]). We use:

V(p) = Σi pi u(xi) / Σi pi τ(xi) (4.8)

to separate pi and pj: the final utility attached to an outcome is given by the ratio of the two ‘expected utility’ functions u/τ. If τ(·) is identically 1, the above formulation reduces to the expected utility form. In general, however, the weighted linear utility function is given by the ratio:
pi ≻ pj ⇒ u(pi)/τ(pi) > u(pj)/τ(pj). (4.9)
In the simplex representation, indifference curves for this form are straight, though not
in general parallel, and intersect at a point outside the simplex. Here the intersection of
indifference curves does not imply intransitivity because indifference curves are not defined
outside the simplex: preference structures over outcomes outside the choice set do not matter
(cf. [Sug86, 11]).
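A sketch of (4.8) with illustrative choices of u and τ (both our own assumptions); the final check illustrates on an example that the betweenness property (4.5)–(4.7) is satisfied:

    def weighted_utility(lottery, u, tau):
        """Weighted linear utility (4.8): a ratio of two linear-in-p forms."""
        num = sum(p * u(x) for x, p in lottery.items())
        den = sum(p * tau(x) for x, p in lottery.items())
        return num / den

    u = lambda x: x ** 0.5            # an assumed Bernoulli utility
    tau = lambda x: 1.0 / (1.0 + x)   # high weight on low outcomes: pessimism

    p = {1: 0.5, 9: 0.5}
    q = {4: 1.0}
    m = {1: 0.25, 9: 0.25, 4: 0.5}    # the 50:50 mixture of p and q
    vp, vq, vm = (weighted_utility(L, u, tau) for L in (p, q, m))
    print(min(vp, vq) <= vm <= max(vp, vq))   # True: betweenness holds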
4.3.2 Regret and SSB theory
Regret theory as defined by Loomes & Sugden is a normative approach that uses a modified utility functional r and allows for intransitivity by dropping axiom A1 (cf. [Loo82]). Fishburn’s independent axiomatisation rests on the betweenness and continuity axioms (cf. [Fis82], [Fis88a, 68f]).
Figure 12: A level set corresponding to weighted linear utility theory.
Since the feeling of regret is a fundamentally individual capacity, the theory is founded on subjective probabilities; in that it differs from all other theories discussed here. Unlike the subsequently discussed rank-dependent theories, regret theory is based on aspects other than the probability distribution over the ultimate payoffs (viz. on regret). Therefore, important ingredients of individual choice, such as variations in the temporal resolution of the uncertainty or of the payoffs themselves, can be included (cf. [Gre88, 377]). A simple formalisation of regret theory defines the modified utility function r as:

r(xi, xj) = u(xi) + R[u(xi) − u(xj)]. (4.10)
The functional r is designed to accommodate a regret factor along with the usual Bernoullian riskless intensity notion u. The idea is that if action p is chosen from {p, q} and consequence xi obtains as a result of the choice, one may rejoice if u(xi) > u(xj), but experience regret if u(xj) > u(xi), where xj is the consequence that would have resulted if action q had been chosen. Depending on the regret/rejoice functional R, regret theory can amplify or dampen these feelings of regret and rejoicing and can therefore accommodate different intensities.
Decision-makers then maximise the non-expected utility of choosing p rather than q:

V(p) = Σi Σj r(xi, xj) p∗i pj. (4.11)
The full power of the theory is only reached when a skew-symmetric bilinear (SSB) functional is used that represents preferences by an SSB functional φ on L × L. φ is skew-symmetric if φ(xi, xj) = −φ(xj, xi) for all xi, xj ∈ L, and it is bilinear if it is linear separately in each argument (cf. [Fis88, 275]). φ is defined as:

φ(xi, xj) = r(xi, xj) − r(xj, xi). (4.12)
SSB theory requires in its axiomatised form only the very weak substitution axiom ([Kar91, 1776]). The major achievement of this extended formulation is that it can model statistically dependent prospects, which (4.11) cannot. Decision-makers maximise:

V(p) = Σi Σj φ(xi, xj) p∗i pj. (4.13)
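A sketch of (4.10)–(4.13) for statistically independent prospects, with a hypothetical convex regret/rejoice function R:

    def r(xi, xj, u, R):
        """Modified utility (4.10): basic utility plus regret or rejoicing."""
        return u(xi) + R(u(xi) - u(xj))

    def ssb_value(p, q, u, R):
        """The SSB comparison (4.13) of two independent lotteries p and q:
        a positive value indicates that p is preferred to q."""
        phi = lambda xi, xj: r(xi, xj, u, R) - r(xj, xi, u, R)   # (4.12)
        return sum(pi * qj * phi(xi, xj)
                   for xi, pi in p.items() for xj, qj in q.items())

    u = lambda x: x
    R = lambda d: d + 0.1 * d ** 3    # convex: large regrets loom large
    p = {10: 0.5, 0: 0.5}
    q = {4: 1.0}
    print(ssb_value(p, q, u, R) > 0)  # True: here p is preferred to q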
As mentioned above, this freedom in assigning regret or rejoicing to the same choice using an SSB functional allows for the modelling of cyclic (i.e. intransitive) behaviour as observed e.g. in voting.40 In the simplex representation, indifference curves can intersect at one interior point. With such a weakened structure we can accommodate all of the paradoxes discussed in section two since our level set can exhibit both the fanning-out and intransitivity properties.
Figure 13: Level sets corresponding to SSB theory. The intersection of indifference curves can
be both inside the simplex as in (a) or outside as in (b). Since SSB theory generally can allow
for intransitivity, cycles are not excluded.
4.4 Rank-dependent probabilities
The non-expected utility theories based on Quiggin’s rank-dependent probabilities (RDU) extend Allais’ and Hagen’s idea that utility evaluation should be based on more than just the
first moment (i.e. the expectation) of (cardinal) utility (cf. [Qui93, 55]).41 The RDU approach
is characterised by ordering outcomes prior to the application of a utility-representation. The
ranking is used to assign a weight to an outcome depending on the relative rank of this outcome
in the whole probability distribution of outcomes (i.e. lotteries). The weighting functional is
defined on the cumulative probability distribution p+ as τ : [0,1] → [0,1]. If it is the identity function, we are back to expected utility theory, but for convex τ(·) with increasing elasticity
we can accommodate fanning out. This approach escapes the fosd vs. Dutch book problem by
relying on a transformation that considers the whole structure of a risky prospect. Similarly,
sophisticated formulations such as the ordinal independence approach can avoid higher-order
dominance problems as well.42 RDU theories keep the reduction, order, and continuity axioms
(A0 , A1 , A3 ) but reformulate the separation axiom A2 .
RDU-approaches that cannot be discussed here are Quiggin’s anticipated utility theory and
non-Lebesgue-measure approaches as pioneered by Segal.
4.4.1 Dual theory
An approach dual to von Neumann & Morgenstern's was axiomatised by Yaari for u(xi) = xi
(cf. [Yaa87]). Here, the wealth levels are undiscounted, but the attitude towards probabilities
can be specified. Hence, in this case the utility function is linear in wealth rather than in
probability:
V(p) = Σi τ(p+(xi)) u(xi).   (4.14)
In Yaari’s model, the decision-maker’s attitude towards risk is not reflected in the curvature
of the Bernoulli utility function but in the way the decumulative probability distribution function is distorted when a lottery is evaluated by the decision-maker. The difference to expected
21
(p )
1
£3
.75
f
+
(p )
.5
.25
£1
0
£2
p
.25
.5
(a)
.75
1
(b)
Figure 14: A level set corresponding to dual theory. Notice that Yaari’s approach is linear in
outcomes rather than in probabilities. The weighting functional τ (p+ ) in (b) compares with
the unrestricted form of 10.b.
utility theory is shown in figure 15: Yaari’s decumulative distribution function is vertically
distorted, while the expected utility function is horizontally distorted (cf. [Mun87, 11, 23]).
Figure 15: The horizontal corrections made by expected utility theory with respect to a lottery with a linear distribution function are shown in (a) while (b) shows the vertical corrections made by dual theory.
As seen in figure 15, expected utility theory replaces for each decumulative density level
p the outcome x by its valuation u(x)—hence, the horizontal distortion. Dual theory goes
the opposite way: for each outcome level x, the probability density is distorted vertically by replacing p+ with τ(p+). Therefore, in Yaari's model it is the curvature of τ(p+) that
represents attitudes towards risks. Again, concavity represents risk-aversion (cf. figure 9.b).
An axiomatisation of the dual theory is based on the dual independence axiom (cf. [Qui93,
148]).
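A minimal discrete sketch of the dual evaluation follows; the particular distortion τ of the decumulative probabilities is our own assumption, chosen only to illustrate the mechanics:

def tau(g):
    return g ** 2                          # an assumed distortion of decumulative probabilities

def dual_V(lottery):
    # lottery: list of (outcome, probability) pairs; wealth enters linearly,
    # only the decumulative distribution is distorted, as in Yaari's model
    xs = sorted(lottery)                   # outcomes in ascending order
    V, decum = 0.0, 1.0                    # decum = P(X >= current outcome)
    for x, p in xs:
        next_decum = decum - p
        V += x * (tau(decum) - tau(next_decum))
        decum = next_decum
    return V

fair = [(0, 0.5), (100, 0.5)]
print(dual_V(fair))                        # 25.0, below the expectation of 50

With this τ the fifty-fifty lottery is valued below its expectation; a linear τ would reproduce the expectation itself.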
4.4.2 Ordinal independence
Green & Jullien replace the independence axiom by their ordinal independence axiom to weaken
the implied linearity to quasi-convexity of preferences (cf. [Gre88, 255], [Qui93, 149]). In repeated betting situations ordinal independence amounts to two conditions:
• The decision-maker never agrees to (intermediate) outcomes that are fosd’d by the previous distribution.
• An outsider proposing a bet to the decision-maker should not be able to extract a profit
with a positive mean.
Green & Jullien conclude that for any quasi-convex preference relation and any initial wealth
level, no manipulation exists that leads the agent to a stochastically dominated alternative.
Notice that a quasi-convex but non-linear preference relation can be manipulated from some
initial random wealth but not from any non-random wealth. Whenever we can eliminate
distributions that would indeed lead to the possibility of manipulations (e.g. by looking at the
total ranking of outcomes), we have a strong argument for quasi-convexity and, hence, only
a weak argument for additivity. This result supports similar results by Kreps & Porteus and
Machina (cf. [Kre78], [Mac84], [Gre87, 788]).
In its spirit, the theory is an extension of Quiggin’s anticipated utility theory. The idea
behind ordinal independence is that if two distributions over payoffs share a common tail,
then this common tail can be modified for both distributions without altering the individual’s
preference between these distributions. The shared segments do not affect the ranking of the
distributions—preference is determined only by the interval on which the two distributions
differ. This is considerably weaker than the independence axiom and is illustrated in figure 16.
Figure 16: Distributions F and G share the same tails—they only differ on S = [a, b]. (G is not necessarily a mean-preserving spread of F.)
The interpretation of the ordinal independence axiom bears some resemblance to the psychological concept of editing in e.g. prospect theory: the part where the distributions coincide is edited out. While the independence axiom implies additive separability along a
single dimension, the utility functionals implied by the ordinal independence axiom exhibit additive separability along multiple dimensions (i.e. the Gorman form, cf. [Mas96, 119f]).
Let the preference relation ≿ be complete, transitive, and continuous. Then the ordinal independence axiom for the distributions F and G, S ⊂ X, is defined as (cf. [Gre88, 357], [Qui93, 149]):

D2—ordinal independence: If F ≿ G and
i) F̃(x) = G̃(x) ∧ F(x) = G(x) for x ∈ S, and ii) F̃(x) = F(x) ∧ G̃(x) = G(x) for x ∉ S,
then F̃ ≿ G̃.   (4.15)
Together with appropriate forms of the order and continuity axioms and suitable monotonicity (i.e. stochastic dominance) assumptions, this axiom defines a class of non-expected utility
functionals that we can compare in its discrete form to alternative functionals (cf. [Qui93, 57]):

V(p) = Σ_{i=1}^{n} u(xi) [τ(Σ_{j=1}^{i} p(xj)) − τ(Σ_{j=1}^{i−1} p(xj))].   (4.16)
This formulation allows for the above-mentioned S-shaped (concave-convex) utility functions
that are able to accommodate the Friedman-Savage paradox of simultaneous gambling and
insurance (cf. [Che83, 1082], [Gre88, 372]).
Figure 17: (a) A level set corresponding to ordinal independence theory and an S-shaped Bernoulli utility function over monetary gains and losses in (b).
Figure 17.a shows a level set for both an S-shaped Bernoulli function—as in 17.b—and an
S-shaped weighting functional as in 14.b. This gives the theory enormous flexibility but leaves it correspondingly weak in identifying irrational behaviour.
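The discrete functional (4.16) translates directly into code. In the sketch below, the Bernoulli function u and the weighting functional τ are supplied by the caller; the concrete forms in the usage line are hypothetical:

def V_rdu(lottery, u, tau):
    # lottery: list of (outcome, probability) pairs; outcomes are ranked
    # first, then each receives the weight tau(cum_i) - tau(cum_{i-1})
    V, cum_prev = 0.0, 0.0
    for x, p in sorted(lottery):
        cum = cum_prev + p
        V += u(x) * (tau(cum) - tau(cum_prev))   # equation (4.16)
        cum_prev = cum
    return V

print(V_rdu([(0, 0.5), (100, 0.5)],
            u=lambda x: x ** 0.5,
            tau=lambda c: c ** 2))               # 7.5; with tau the identity it would be 5.0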
5 Implied notions of rationality
The axioms of expected utility theory prescribe a particular form of rationality for homo economicus. It is obvious from the previous discussion that there are many different axioms that compete for this role. The most prominent are alterations of the independence axiom, but
the other axioms are by no means sacrosanct either. Therefore, the need arises to reflect upon the axioms' intuitive appeal in order to determine which set of axioms we are most inclined to adopt.
The main reason why expected utility is still the most widely used theory for modelling
decisions under risk and uncertainty is its normative appeal to many researchers (cf. [Har92,
320]). Normative approaches are concerned with consistency and coherence requirements of rational preferences that are mostly formulated as axioms which are, in themselves, rather convincing.
They do not necessarily have descriptive accuracy but, upon reflection, the axioms should
convince people that their choices are wrong if the axioms are violated: People ought to behave
as prescribed by the theory, although they sometimes make errors and do not. The normative
interpretations of the two competing approaches discussed here are far less developed. Their
claims to be reasonable foundations of rationality are nevertheless strong.
For convenience, we shall first state a somewhat more verbose version of the axioms A0 -A3
in a concrete example. However intuitive, it is not a full statement and is therefore analytically inferior to the prior one (cf. [Bin94, 272f]).
Let X be a set of sure outcomes of a decision under risk and let w, b ∈ X be its worst
and best elements. S = {b : p, w : 1 − p} is a simple lottery with only two outcomes. Let
C = {b : p, x : q, w : r} be a 2-stage compound lottery. We assume the existence of stochastically
independent objective probabilities. For every outcome x ∈ X there exists a probability p∗ such
that
E0—reduction: x ∼ {b : p∗, w : 1 − p∗}, b ∼ {b : 1, w : 0} and w ∼ {b : 0, w : 1}.   (5.1)

E1—order: Higher probabilities p∗ in {b : p∗, w : 1 − p∗} are preferred.   (5.2)

Since (5.1) gives indifference of x, b and w with {b : p∗, w : 1 − p∗}, {b : 1, w : 0} and {b : 0, w : 1}, we can substitute the latter for the prior in the compound lottery C:

E2—independence: {{b : 1, w : 0} : p1, {b : p∗, w : 1 − p∗} : p2, {b : 0, w : 1} : p3}.   (5.3)
E0 –E2 amount to the agent maximising the expectation of the lottery
{{b : 1, w : 0} : p1 , {b : p∗ , w : 1 − p∗ } : p2 , {b : 0, w : 1} : p3 } ,
which equals p1 b + p2 p∗ b + p3 w. If we define u(b) = 1, u(w) = 0, and u(x) = p∗, we get the expected utility form U(C) = p1 u(b) + p2 u(x) + p3 u(w) = Σ pu(·).43 Hence, agents behave as if maximising expected utility.
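A numeric check of this reduction argument may help; all numbers below are hypothetical:

p1, p2, p3 = 0.2, 0.5, 0.3           # stage-one probabilities of b, x and w
p_star = 0.6                          # x ~ {b : p_star, w : 1 - p_star} by E0

# substituting the equivalent simple lottery for x (E2) and reducing (E0)
# leaves a simple lottery over b and w alone:
prob_b = p1 + p2 * p_star
U_C = prob_b * 1 + (1 - prob_b) * 0   # u(b) = 1, u(w) = 0
print(U_C, p1 * 1 + p2 * p_star + p3 * 0)   # both 0.5: the two routes agree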
5.1 A0: Reduction
A simple justification of the reduction axiom goes like this. If p, q are lotteries, e.g. p={1:.4,
0:.6}, q={2:.5, 0:.5}, then a mixture of the two is not trivial and can be interpreted as choice
from a 2-stage compound lottery—which implies the reduction axiom (cf. [Kar91, 1769]). The
justification for the existence of objective probabilities lies in the fact that the probabilities
come from mixed strategies—strategy profiles that are obtained by the employment of a randomisation device such as a coin toss. Since each mixed strategy is based on its own device,
it is quite reasonable to assume complete stochastic independence of the underlying probabilities.44 This property commands support for the reduction principle if the generated utility
functions are used for game theoretic analysis. If this is not the case, A0 may be indefensible
(cf. [Drè87]).
5.2 A1: Order
Axiom A1 —order gives completeness, reflexivity, and transitivity of our primitive preference
relation. The completeness property can easily be attacked. It stems from the analytical
requirement of ex ante and constant preferences over outcomes and says that the decision-maker
must be able to compare all pairs of possible risky outcomes, which is obviously unrealistic.
It is, however, much easier to defend the complete preordering requirement in the specialised
expected utility context than for general goods-bundles, since it is quite plausible that people
have a universal (i.e. complete) preference order over final wealth levels. Hence, we rule out
incommensurability of prospects.
Reflexivity requires p ∼ p which is a mere technical requirement that always holds.
Transitivity is more serious. The idea behind transitivity is that people will always want to
correct intransitive behaviour if they discover it in their preordering simply because if they do not, Dutch books can be made against them (cf. [Fis88a, 10]). The problem is more subtle, however, as the following example may illustrate.
In apparent violation of transitivity, an individual chooses Salmon while Hot Dog is also available in restaurant A and therefore reveals her preferences as Salmon ≻ Hot Dog. In a different restaurant B, both are again available, but she chooses Hot Dog over Salmon. We cannot represent these preferences by a rational preference ordering. The story, however, goes on: the individual knows that it is outright dangerous to eat Salmon in a bad restaurant and she would rather have a Hot Dog in a place like that, where nothing much can go wrong. Her effective preferences are therefore {Salmon, good restaurant} ≻ {Hot Dog, good restaurant} ≻ {Hot Dog, bad restaurant} ≻ {Salmon, bad restaurant}. In the second choice situation, the decision-maker presumed from the general cleanliness of the place, the area of the restaurant, the competence of the waiter etc. that restaurant A is 'good' and B is 'bad'. Stated like this, there
is no problem in expressing the individual’s preferences by a rational preordering.
This restatement reinforces the logical transitivity property by putting more pressure on
the completeness assumption—we conjecture that we can resolve all intransitivity with this
strategy (cf. [Ana93, 103]). Hence, we view the transitivity hypothesis as justified as long as
completeness is a reasonable assumption and conclude that intransitivity is a sign of irrational
behaviour.
5.3 A2: Separation

5.3.1 Independence
Samuelson’s reasoning in favour of the independence axiom can be summarised as follows (cf.
[Sam52, 672]). We flip a coin that has a probability of showing tails of (1-α); in this case we
win the lottery w. If we are independently asked to decide whether we prefer lottery u or v in
the case the coin shows heads, we have duplicated the setting of the independence axiom. If
the coin lands tails, our preference over u and v does not matter—we get w in any case, but if
the coin shows heads, we are back to precisely the decision between u and v, independently of
w.
In general, the utility valuation of the outcome in one state depends on the valuation in
different states of nature. Apparently, in the coin example, the outcomes heads and tails
are mutually exclusive and therefore it is immaterial what would have happened in the other
state: the independence axiom is applicable—and quite rightly so.45 If events are exclusive
and lotteries are over total wealth levels, it is indeed rational to act like this. If heads occurs,
tails cannot have occurred—hence, the state realisations are independent. The point of non-expected utility theory is that apparently there exist choice situations where this independence cannot be granted.
We conclude that the independence axiom is applicable iff the decision-maker’s preferences
are separable among outcomes in the sense that the utility-ranking of an outcome is identical regardless of the state in which it occurs. There are choice situations, however, where it is rational
not to treat events as independent. Hence, we cannot retain the independence axiom for a
general theory of choice under risk.
5.3.2 Betweenness
As we have seen in subsection 4.3, betweenness is the property that for every p, q ∈ P (X), and
α ∈ (0, 1):
p ∼ q ⇒ αp + (1 − α)q ∼ p.   (5.4)
If a decision-maker is indifferent between two lotteries, she is also indifferent between any
convex combination of the two lotteries. Under the above assumptions, this implies both quasi-concavity and quasi-convexity of preferences:

p ∼ q ⇒ αp + (1 − α)q ≿ p and p ∼ q ⇒ p ≿ αp + (1 − α)q,   (5.5)

but it does not require the other restrictions that the independence axiom places on preferences, which imply additive separability. We use (5.4) to give an intuitive example of what betweenness
means (cf. [Gra98, 15]).
I recently bought a CD-Writer, a so called ‘toaster’. This little machine allows me to ‘burn’
CDs. The technology, however, is not very mature: approximately one in five copies does
not work and the CD can only be thrown away. This is annoying because burning a CD on my double-speed toaster takes a lot of time—approximately 40 minutes for an 80″ CD (where ″ denotes minutes of playing time) and proportionally less time for less data. Moreover, whether the CD is broken only shows at the very end of the process. There are very cheap 18″ CDs, cheap 75″ CDs, and not so cheap 80″ CDs.46
Now suppose I want to take a weekend off and plan to burn a CD to take some data with me. On the one hand, if the CD were useable, I would prefer to have all the documents I have written recently with me on an 80″ CD because I could re-use many bits of these. On the other
hand, if the CD turned out to be broken, I would prefer to have burned only my dissertation
draft on a cheap 18” CD since then both the wasted money and time would be minimised.
Alternatively, I could also burn an audio-CD and just enjoy a nice weekend and listen to some
music on the train. There is an analogous choice situation between my beloved Pet Shop Boys album on an 80″ CD and just a single song from it on an 18″ CD. I am quite indifferent between
actually working on the full data set and listening to the complete album, but I would prefer
to have only my dissertation to listening to the same song over and over again.
The example can be formalised as follows: p is the probability that the burned CD works
and (1-p) that it does not; Nature decides which state occurs. Let dis be the Bernoulli-valuation
of my dissertation, and all that of the complete data set, si the utility of the short song, and
lp that of the long album. bd18 is the outcome of the useless broken 18″ data-CD, which I prefer to the outcome bd80 of the broken 80″ data-CD. ba18 denotes the broken 18″ audio-CD, preferred to ba80. According to the above preferences we have:

(lp ∼ all) ≻ (dis ≻ si), (bx18 ≻ bx80).   (5.6)
The corresponding lotteries look like this:47

t1 = {all : p, bd80 : 1 − p}    t2 = {dis : p, bd18 : 1 − p}
t3 = {lp : p, ba80 : 1 − p}     t4 = {si : p, ba18 : 1 − p}

Table 7: The toaster example in lotteries.
The extensive form of the choice situation is depicted in figure 18.
In the setting of the independence axiom, we can calculate independently of each state
the expectations of the lotteries ti and act according to expected utility maximisation. In
particular, we apply the independence axiom to eliminate the equivalent choices all and lp and
we are done.
To illustrate betweenness, however, we argue as follows. Ex ante, we choose under risk and
form plans of how to proceed. There are two equally valued, ‘optimal’ plans (all |CD works)
and (lp|CD works). Condition (5.4) together with (5.6) requires that the decision-maker is indifferent between any mixture of the two optimal plans. Hence, any mixture of the optimal
Figure 18: The two choices of the toaster example in decision trees.
plans α(all|CD works) + (1 − α)(lp|CD works) is another optimal plan. This is clearly not an independence of the two optimal plans but a kind of 'shuffling' condition. It also illustrates that there is nothing like consequentialism in this choice—on the contrary: the decision is based on past information.
In the context of the decision trees of figure 18, betweenness does not imply that the
decision-maker’s choices at terminal nodes are independent of what her choices would be at other
(unrealised) terminal nodes. (The independence axiom would require just that.) Betweenness
does imply, however, that if there are several optimal plans, the decision-maker’s choices do
not depend on which optimal plan she would have followed at prior nodes (cf. [Gra98, 15]).
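A small numeric check complements the toaster story. For a weighted-utility functional, one standard betweenness-satisfying form, mixtures of indifferent lotteries remain indifferent; the utilities and weights below are hypothetical:

def V(p, u, w):
    # weighted utility: V(p) = sum w(x)u(x)p(x) / sum w(x)p(x)
    num = sum(pi * w[x] * u[x] for x, pi in p.items())
    den = sum(pi * w[x] for x, pi in p.items())
    return num / den

u = {'a': 0.0, 'b': 1.0, 'c': 2.0}
w = {'a': 1.0, 'b': 2.0, 'c': 1.0}
p = {'a': 0.5, 'b': 0.0, 'c': 0.5}
q = {'a': 0.0, 'b': 1.0, 'c': 0.0}
alpha = 0.3
mix = {x: alpha * p[x] + (1 - alpha) * q[x] for x in u}
print(V(p, u, w), V(q, u, w), V(mix, u, w))   # all 1.0: the mixture stays indifferent, as (5.4) requires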
5.3.3 Regret
As Sugden points out, “you feel disappointed when what is compares unfavourably with what
might have been” [Sug86, 16]. This is the rationalisation behind regret theory, which is an
apparently weaker form of separation than complete independence of events. Even if events are mutually exclusive, a decision-maker might feel regret that the other outcome did not obtain. Hence, the valuations of exclusive states are not independent. Since the experience of regret
should be intuitively clear, we shall not expand further.
5.3.4 Rank-dependent axioms: ordinal independence
The basic idea of rank-dependent utility theory is to provide a probability weighting while
avoiding dominance problems. Formulations are representable by the mathematical expectations of a utility function (over outcomes) with respect to transformations of the whole
probability distribution of outcomes. Since the prior ranking of each outcome is decisive for
the probability-weight it is assigned, all outcomes are taken into account for calculating the
probability-weight of each outcome. Therefore, it is possible for two outcomes with the same
objective probability to have different weights—which contrasts with the separability dictated
by the independence axiom.48 An example may illustrate this (cf. [Qui93, 63f]):
Imagine a decision-maker whose expected annual income is drawn from the interval £10,000–£11,000. There is also a 1/100,000 chance for her to win in a lottery and receive an income of £1 million. All realisation probabilities in [10,000, 11,000] are 1/100,000 as well. It is reasonable,
however, to assume that the extreme outcome will receive a different final weight than the
ones in the wage interval. Furthermore, it is quite realistic that this weight will depend on the
ranking of all possible outcomes—something quite impossible if outcomes were independent.
5.4 A3: Smoothness
The third axiom (continuity) cannot be defended globally but is a consequence of our desire
to obtain real-valued utility functions. As is well known, lexicographical preferences cannot
be represented by a continuous utility functional because the notion of one alternative being
infinitely preferred to the other cannot be modelled (cf. [Mar98]). Thus, a continuity axiom
excludes a whole class of intuitively rather plausible preferences. To Juliet the world is infinitely worse without Romeo—no bundle is desirable for her if it does not include Romeo.
It is worth noting, though, that a rejection of the continuity axiom leaves von Neumann & Morgenstern's theory itself unscathed—their set theoretical axiomatisation does not require continuous preferences. This fact is exploited by Hammond in his formulation of a non-Archimedean axiom that still allows the construction of an (extended) expected utility representation (cf. [Ham97, 26]). Hence, while the particular form A3 may not be convincing,
formulations exist that are. We therefore accept smoothness as one criterion of rationality.
6 Conclusion
This essay focuses on the narrow interpretation of rationality in the context of choice situations
where the course of the unfolding future is not known to the decision-maker. A series of alternative formulations that are intended to model such situations have been put forward. For the
critical appraisal of these competing theories, we use reasonable criteria such as static and dynamic consistency of choice, and persuasive rules of behaviour such as lower payoff-expectations not being preferred to higher ones (fosd), or higher risks not being preferred to lower risks when expectations are equal (sosd). We suggest that we should accept the reduction, order, and continuity axioms (A0, A1, A3) and focus our criticism on the independence axiom in cases where
outcomes cannot be viewed as mutually exclusive. We identify a small group of alternatives to
independence centred on the ideas of betweenness and rank-dependence and discuss both their
normative and descriptive appeals. These are irresistible iff the strict independence of events
assumed by the independence axiom cannot be ensured.
Very influential people like Raiffa, Savage, or Harsanyi hold that expected utility theory is
the only adequate normative theory for choice under risk. In the face of the systematic empirical
violations that were established in numerous experiments, this view is difficult to uphold.
The basic result of this paper is that several normative non-expected utility theories have
a stronger claim to convincingly model rationality than von Neumann-Morgenstern’s theory
has. Nevertheless, due to their technical complexity and weakly explored implications, none of
the alternative theories we have at our disposal today commands comparable support in the
literature. They are but valuable stepping stones in the direction of a universally accepted
theory of choice—the remaining problems are formidable.
Notes
1 Both the technical terms risk and uncertainty refer to situations where an action leads to
a number of known outcomes among which one is selected randomly. A choice is referred to as
made under risk when the underlying probability distributions are known objectively (imperfect
but complete information). In the case of uncertainty, these probability distributions are not
objectively known—(differing) subjective beliefs have to be formed (incomplete information).
2 Von Neumann & Morgenstern interpret probabilities in terms of relative frequency as opposed to Ramsey's interpretation of subjective degrees of belief (cf. [Neu44]). Hence, von
Neumann-Morgenstern’s theory is about risk and Ramsey’s and Savage’s theories are about
uncertainty. Today both cases are mostly subsumed under the latter. If we were in the present
approach to think of the set of outcomes as a primitive and the lotteries as acts, the theory of
decision-making under risk would be analogous to decision-making under uncertainty.
Von Neumann-Morgenstern’s theory has been developed to represent mixed strategies in the
general solution of zero sum games. Therefore, the probabilities in von Neumann-Morgenstern’s
lotteries are an implication of using mixed strategies. Since all mixed strategies are taken to be
based on independent randomisation devices, the assumption of the existence of independent
objective probabilities is justified in the context of games.
3 Recent contributions that compare expected utility theory's predictions with those of non-expected refinements are Hey & Orme and Carbone & Hey. The basic conclusion to be drawn
from this work seems to be that von Neumann-Morgenstern's expected utility theory fares as well as its non-expected contenders in about 40-55% of the tested cases. In the remaining
situations, one or another of the non-expected models shows better results (cf. [Hey94, 1321],
[Car95, 131]).
4 In essentially one-off situations (as e.g. horse races), we cannot conduct the necessary
repeated experiments to establish the relative frequency of the event in question. In these
situations, objective probabilities do not exist, and we have to recur to subjective degrees of
belief. To illustrate the difference, Anscombe & Aumann call uncertain lotteries horse lotteries
and risky lotteries roulette lotteries (cf. [Ans63]).
As Karni and Schmeidler point out, however, the theoretical arguments to apply non-expected utility theories that obey only the betweenness or rank-dependency properties to
uncertainty are still to be found (cf. [Kar91, 1810]). Hence, we are forced to remain in the
framework of analysing risk.
5 Insightful alternative routes such as state-dependent utility theory, which gives up the requirement that all decisions are reducible to lotteries (cf. [Drè87]), are not elucidated. The same
applies to evolutionary (learning) approaches such as Binmore’s (cf. [Bin99]) and the field of
local expected utility analysis where the preference functional (i.e. a real-valued function) is modified to accommodate the empirical violations while preserving, in a surprisingly robust manner,
most of the results of expected utility theory locally. Space limitations make it impossible to
discuss interesting and important empirical work (cf. [Hey94], [Har94]).
6 A word on the overall approach is in order here. Section three consists of original work
while the rest of the paper takes the form of a specialised survey. Though attempting not
to be excessively formal, a precise statement of the theories under investigation requires some
formal apparatus that in some cases may exceed an introductory level. Where necessary and
possible, however, this material is supplemented by examples which should convey the gist of
the argument.
7 Notice that the reduction axiom ensures that the players in the game are not playing
for their enjoyment. As Binmore points out, a gambler is unlikely to consider two lotteries
as equivalent when they generate the same prizes with the same probabilities. “He would
prefer whichever lottery squeezed the most drama and suspense from the randomising process.”
[Bin94, 261]
30
8 This approach has been pioneered by Kolmogorov (cf. [Kol33], [Bir40, 283]). For a definition of algebras see Gallant (cf. [Gal97, 20ff]).
9 Since the binary relation ≻ is asymmetric (p ≻ q ⇒ ¬(q ≻ p)) and negatively transitive ({p ≻ q, q ≻ r} ⇒ p ≻ r), it is a weak order. If ∼ is not assumed to be transitive, ≻ is a partial order. We could alternatively use the more involved (and subjective) axiomatisation of Anscombe and Aumann's that does not require transitivity of ∼. Their version is less vulnerable to criticism since there exist numerous examples of intransitivity of ∼ (cf. [Ans63], [Kah86], [Fis88a, 40]).
10 It is worthwhile to reflect a moment on what a statement like αu + (1 − α)w means. We draw a real number α from the unit interval and multiply it with a lottery u on the one hand and its complement 1 − α with a different lottery w on the other. This convex combination can be interpreted as a mixture of the lotteries u and w, or as a new compound lottery in which lottery u is played with probability α or lottery w with probability 1 − α. Hence, a preordering that fulfils the above condition (2.2) is also called mixture-preserving. It means that the (probability-weighted) valuation of one risky prospect u is totally independent of the other prospect w.
11 This axiom is frequently referred to as the Archimedean axiom.
12 Notice that it is the von Neumann-Morgenstern utility function U(p) that is linear while
the shape of the Bernoulli utility function u(x ) is not restricted. Moreover, the mapping U (p) of
lotteries is ordinal while the ranking of sure outcomes u(x ) is cardinal. A scale is cardinal if both
its origin and unit are arbitrary. Only linear (affine) transformations are permissible on such
scales without altering a given ranking. In the theory of choice under certainty utility functions
are ordinal, that is, every monotonic transformation leaving a given ranking unchanged is
tolerable. Ordinal scales arise from a mere ranking of preferences. To conclude from U (A)=10,
U (B )=20 that B is twice as good as A is nonsense with ordinal utilities (because we can only
say that B is preferred to A) but perfectly sensible with cardinal utilities.
13 Separability requires that an ordering on a subspace is invariant with respect to changes
in the values of variables outside the subspace. Additive refers to the aggregation rule for the
separable elements of the subspace. The important result that preferences over risky outcomes
exhibit additive separability iff they satisfy the independence axiom has been proved by Debreu
(cf. [Deb59]).
14 This naming reflects the fact that Daniel Bernoulli was the first to use a (logarithmic)
discounting of sure outcomes to solve his uncle’s St. Petersburg Paradox.
15 A level (or contour) set is a way of depicting an n-dimensional function in an (n−1)-dimensional space. It is used e.g. by cartographers who describe the topographical features of mountains
space. It is used e.g. by cartographers who describe the topographical features of mountains
on a plane map. Level curves connect points of equal height on the map—level curves that are
packed closely together show steep slopes of the mountain. The same principle is used here for
utility: the level (or indifference) curves connect points of equal utility. The arrow points in
the direction where the ‘utility mountain’ increases most steeply from the current position.
16 Notice that while this is a natural way of looking at risk in the expected utility framework,
we shall later on (in section four) gradually get to use concepts like second order stochastic dominance (defined below) to analyse attitudes towards risk.
17 Steeper indifference curves cross the (dashed) iso-expectation lines in the direction of increasing expectation. Hence, the agent has to be compensated for accepting higher risk by being given higher expected payoffs—she is risk averse (figure 1.a). The reverse holds if the agent
accepts higher risks with lower expectations—she is risk prone (figure 1.b).
18 Jensen's inequality states that for every non-degenerate random variable x and any strictly
concave function f (x) we have: E[f (x)] < f (E[x]). That is, the expectation of the concave transformation of x is smaller than the concave transformation of the expectation of x.
We arrive at the above result (8) by replacing the expectation by its integral formulation: ∫ f(x) dF(x) < f(∫ x dF(x)).
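A quick numeric check, with an arbitrary concave f and an arbitrary lottery:

import math

xs, ps = [1.0, 4.0, 9.0], [0.2, 0.5, 0.3]
f = math.sqrt                                   # strictly concave
E_fx = sum(p * f(x) for x, p in zip(xs, ps))    # E[f(x)]
f_Ex = f(sum(p * x for x, p in zip(xs, ps)))    # f(E[x])
print(E_fx < f_Ex)                              # True: 2.1 < 2.2136...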
19 The cumulative distribution functions F and G are drawn from the space P(X) that we
refer to in the previous note. We will not go into detail here and just think of P (X) as the
stochastic extension to the space L defined in our statement of the independence axiom.
20 Ellsberg's paradox illustrates the failure of Savage's substitution principle (our A0) for preferences under uncertainty due to comparative characteristics of events. It reflects the inability
of agents to form unique subjective probabilities as required by the substitution principle—
therefore this paradox is outside the scope of decision making under risk where the existence
of objective probabilities is more easily defended in a game theoretic context. Individuals are
asked to draw at random from an urn containing 90 balls, 30 of which are red (R) and 60
of which are either black (B) or yellow (Y ) in unknown proportion. Two pairs of acts are
considered:
f1: win £1,000 if R is drawn        f2: win £1,000 if B is drawn
f3: win £1,000 if R or Y is drawn    f4: win £1,000 if B or Y is drawn
Most individuals prefer f1 to f2 but in direct violation of the substitution principle f4 to f3 .
The reason is apparently that the number of balls being either R or Y is unknown (between
30 and 90) while the number of balls that are B or Y is known to be 60. This choice does
not allow additive subjective probabilities to be formed since f1 ≻ f2 gives π(R) > π(B) and f4 ≻ f3 gives π(B ∪ Y) > π(R ∪ Y) ⇒ π(B) > π(R), which is a contradiction.
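A brute-force check confirms that no additive probability rationalises the two modal choices; π(R) is fixed at 1/3 and π(B) is swept over its admissible range:

found = False
steps = 1000
for k in range(steps + 1):
    pR = 1 / 3
    pB = (k / steps) * (2 / 3)               # pB + pY = 2/3
    pY = 2 / 3 - pB
    if pR > pB and pB + pY > pR + pY:        # f1 over f2 and f4 over f3
        found = True
print(found)                                 # False: no additive probability works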
21 Generally, probability proportions are not drawn to scale in order to emphasise the effect.
22 The certainty equivalent of a lottery is the amount of pounds sterling paid with certainty for which the agent is prepared to sell her right to participate in the lottery.
23 In this essay, it is our general strategy to reinforce the (logical) transitivity property by
putting more pressure on the assumption of completeness. An example may illustrate this.
Imagine an individual preferring a red car to a yellow car in the morning but a yellow car to
a red car in the evening. A car dealer can use the individual as a money pump by selling
her the (same) red car every morning at a premium over what he paid her for the car last
evening. He can extract the same premium in the evening for reselling her the yellow car. This
may be interpreted as intransitive behaviour or as a change of preferences. We follow a different strategy, though, and view the consumer as endowed with stable preferences over goods-bundles of the following kind: {red car, morning} ≻ {yellow car, morning} and {yellow car, evening} ≻ {red car, evening}. Interpreted like this, there is no intransitivity. (A further example is given in subsection 5.2.)
24 Formally, the elements of an information set are a subset of a player's decision nodes. The
interpretation is that when the decision-maker reaches a particular node in an information set,
she does not know at which node in the set she is—all nodes in the particular information
set are equally possible (cf. [Mas96, 222]). Perfect information means that each information
set in the decision tree contains only one node (i.e. is a singleton)—in the case of imperfect
information, information sets may contain multiple nodes (e.g. in a case of imperfect recall).
25 Notice that in this case, when probabilities do not sum to one, the expectation of a constant
would be different from that constant.
26 For an intuitive account of how weighting functionals that exhibit fosd compare to those
that do not, compare figure 14.b to 10.b.
27 If an individual's preferences are intransitive, then lotteries u, v, w can be found for which u ≺ v, v ≺ w, and w ≺ u. Thus, an individual owning the right to participate in u can be offered participation in v in exchange for u, which she will prefer. Likewise, she will prefer w to v. We reduce all payoffs xi (we assume there are n outcomes) in u by some arbitrarily small amount ε, leaving the original preference relation unchanged. Given her preferences, the individual will prefer u − ε to w, ending in her initial situation but being worse off by the amount εn.
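The trade sequence just described can be traced in a few lines; the cyclic ranking below is a hypothetical stand-in for the intransitive preferences u ≺ v, v ≺ w, w ≺ u:

def prefers(a, b):
    # strict preferences forming a cycle: v over u, w over v, u over w
    return (a, b) in {('v', 'u'), ('w', 'v'), ('u', 'w')}

eps, losses, holding = 0.01, 0.0, 'u'
for offer in ['v', 'w', 'u']:
    if prefers(offer, holding):      # each trade is accepted ...
        if offer == 'u':
            losses += eps            # ... but u comes back reduced by eps
        holding = offer
print(holding, losses)               # back at 'u', strictly poorer by eps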
The reason why it is not necessarily the case that forms other than the expected utility
form will lend themselves to be exploited as a money pump is that most of the functionals
used as non-expected utility functionals will not satisfy V (u) > V (v) > V (w) > V (u). This
point is not very deep: Let V (u) = 5, V (v) = 4, and V (w) = 3. Then it will never be true
that 5 > 4 > 3 > 5—continuous real valued functions cannot form such a cycle (cf. [Mac89]).
Hence, expected utility functionals are not the only ones that forbid intransitivity.
28 Indeed, as Machina points out, it is helpful to think of consequentialism as some form
of dynamic separability. Then it is intuitively clear that every form of intertemporal utility
maximisation assuming consequentialism must entail expected utility maximisation. We are
not forced, though, to assume consequentialism (cf. [Mac89]).
29 The difference is that while the above money pump was driven by fosd, this one is animated by sosd. The former is about different expectations, the latter about differing risks.
30 For simple probability measures, for some finite subset E of X we have p(E) = 1. We
denote the set of all simple probability measures on X as ∆(X).
31 The wording suggests the subjective probability interpretation. This is for the clarity of
the example only—we could also use an objective terminology.
32 A different and very interesting departure from Kolmogorov's definition involving non-standard analysis is taken by Hammond (cf. [Ham97]).
33 The restriction to the Cartesian plane does not help to show that on each axis we now have
lattices and not measures. A little fantasy is therefore required to imagine that we have really
the two-dimensional structure of figure 8.b on each axis and quadruples instead of tuples in our
basic lattice.
34 Several excellent surveys on this matter exist (cf. [Fis88a], [Mac89], [Kar91], [Qui93], [Sug86]); their scope, focus, depth, and technical level are, however, substantially different from this investigation's, and there exists no comparable single discussion of the material presented here.
35 The first moment is given by the probability-weighted mean of the Bernoulli utilities of the sure outcomes in a lottery. This is identical to the expected utility. The second moment is given by the variance of these Bernoulli utilities (larger spreads of the outcomes represent increased risk) and the third moment by the third power of the respective values (interpreted as a measure of the degree of risk aversion).
36 A result that is locally similar to expected utility theory is obtained by using alternative notions of 'smoothness' in the sense of differentiability. The preference functional is taken to
notions of ‘smoothness’ in the sense of differentiability. The preference functional is taken to
be Fréchet-, or more weakly, Gâteaux-differentiable; if the functional has a local linear approximation then it behaves locally like an expected utility functional—a point made masterfully
by Mark Machina (cf. [Mac82]).
37 Chew & Dekel's betweenness is equivalent to Fishburn's mixture-monotonicity, Chew &
Epstein’s notion of implicit linear utility and Gul & Lantto’s dynamic programming solvability
(cf. [Che83, 1067], [Dek86, 304], [Fis88a, 7], [Che89, 208], [Gul90, 170]).
38 Notice that we replace the general convex set L in the original discrete formulation of A2
with P(X) that covers the stochastic case. This space is a general space of probability measures
on the metric set of outcomes X and a σ-algebra on X. These technicalities are required as we
leave the finite realm of lotteries but are not to be given more than a passing remark here. A
rigorous development can be found in Karni and Schmeidler [Kar91].
39 A function f : A → R defined on the convex set A ⊂ RN is quasi-concave if its upper contour sets {x ∈ A : f(x) ≥ t} are convex sets for any t ∈ R. Analogously, f(x) is quasi-convex if its lower contour sets are convex. Dekel lays out why these properties are desirable in a game-theoretic context: quasi-concavity of preferences is necessary for the existence of mixed-strategy Nash equilibria and quasi-convexity is necessary and sufficient for dynamic consistency of choices under risk and uncertainty (cf. [Dek86, 305]).
40 A different and very interesting interpretation of transitivity comes from the voting literature that uses spatial preference representations. In these contexts, single-peaked preferences are those that cannot form a cycle (i.e. they are transitive) while multi-peaked preferences can ([Mue89, 65]). The basic result from this literature is that only one-dimensional issues will exhibit transitivity—we should not include several issues in one vote.
41 Chew & Epstein unify the betweenness and RDU-approaches in a more general axiomatisation of their implicit rank-linear utility theory. They agree, however, that there is no intersection
between betweenness and RDU-theory other than expected utility theory (cf. [Che89, 208]).
Hence, our normative separation seems warranted.
42 Second order stochastic dominance (sosd) is about distributions with the same mean (i.e.
expected value) but different risks such as a mean preserving spread of a given distribution.
Hence, distributions exhibiting sosd have the same expectations, but the dominated distribution
poses a higher risk. An example of sosd is given in figure 16.
43 Notice that the possible values of p∗ fully determine the shape of the Bernoulli utility function u(·). Therefore, U(C) orders the agent's preferences over lotteries and u(·) measures the agent's preferences cardinally.
44 Binmore gives an example of a case of stochastic dependencies between lotteries. “Assume
that you are indifferent between $10 for sure and getting $20 if a fair coin falls heads. Now
suppose that you are to receive $10 if a fair coin falls tails. Are you ready to swap the $10 prize
in this lottery for the opportunity to win $20 if a fair coin falls heads? Think twice, because
nothing has been said to exclude the possibility that it is the same coin toss that is used in
each lottery!” [Bin94, 272]
45 Along the same lines, Broome asks: “How can something that never happens possibly affect the value of something that does happen?” (cf. [Bro91, 96]).
46 Let us assume that the failure rate of the cheap and not so cheap CDs is identical. An
interpretation of a higher degree of state dependency can be given if we assume that the failure
rate of cheap CDs is higher than that of expensive ones. The decision-maker’s choices then
depend on states.
47 The dashed line signals that the encircled nodes form an information set—the decision
maker does not know at which node in the set she is.
48 A criticism of RDUs that immediately comes to mind is based on the fact that we have
to assume a prior ranking of outcomes to construct our weighting functional. This is peculiar
since we would much rather want to obtain a ranking as a result of the model than as an
assumption. This prior ranking, however, is not necessarily a property of RDUs as shown by
Safra & Segal with their axiomatisation of dual theory called dual betweenness (cf. [Saf98]).
Hence, this criticism is not tenable.
7 References
[All53] Allais, M., The foundations of a positive theory of choice involving risk and a criticism
of the postulates and axioms of the American school, in: Allais, M., Hagen, O. (eds.), Expected
utility hypotheses and the Allais paradox, 27-145, Dordrecht: Reidel, 1979
[Ana93] Anand, P., Foundations of rational choice under risk, Oxford: Clarendon Press, 1993
[Ans63] Anscombe, F., Aumann, R., A definition of subjective probability, Annals of Mathematical Statistics, 34:199-205, 1963
[Bac91] Bacharach, M., Hurley, S., Foundations of decision theory, Oxford: Blackwell, 1994
[Bar98] Barberà, S., Hammond, P., Seidl, C. (eds.), Handbook of utility theory, 1, Dordrecht:
Kluwer Academic, 1998
[Bel85] Bell, D., Disappointment in decision making under uncertainty, Operations Research,
33:1-27, 1985
[Bin94] Binmore, K., Playing fair: game theory and the social contract, Cambridge: MIT Press,
1994
[Bin99] Binmore, K., Why experiment in economics?, Economic Journal, 109:F16-24, 1999
[Bir40] Birkhoff, G., Lattice theory, 3rd ed., American mathematical society colloquium publications 25, 1967
[Bro91] Broome, J., Weighing goods, Oxford: Blackwell, 1995
[Car95] Carbone, E., Hey, J., A comparison of the estimates of EU and non-EU preference
functionals using data from pairwise choice and complete ranking experiments, in: Gollier, C.,
Machina, M. (eds.), Non-expected utility and risk management: A special issue of the Geneva Papers on Risk and Insurance Theory, 20(1), 111-33, Dordrecht: Kluwer Academic, 1995
[Che83] Chew, S., A generalisation of the quasilinear mean with application to the measurement of income inequality and decision theory resolving the Allais paradox, Econometrica, 51:1065-92, 1983
[Che89] Chew, S., Epstein, L., A unifying approach to axiomatic non-expected utility theories,
Journal of Economic Theory, 49:207-40, 1989
[Deb59] Debreu, G., Topological methods in cardinal utility theory, in: Arrow, K., Karlin, S.,
Suppes, P. (eds.), Mathematical methods in the social sciences, 16-26, Stanford: SUP, 1960
[Dek86] Dekel, E., An axiomatic characterisation of preferences under uncertainty: weakening
the independence axiom, Journal of Economic Theory, 40:304-18, 1986
[Drè87] Drèze, J., Essays on economic decisions under uncertainty, Cambridge: CUP, 1987
[Fis82] Fishburn, P., Nontransitive measurable utility, Journal of Mathematical Psychology, 26:31-67, 1982
[Fis88] Fishburn, P., Expected utility: An anniversary and a new era, Journal of Risk and
Uncertainty, 1:267-83, 1988
[Fis88a] Fishburn, P., Non-linear preference and utility theory, Baltimore: Johns Hopkins University Press, 1988
[Gal97] Gallant, R., An introduction to econometric theory, Princeton: PUP, 1997
[Gär88] Gärdenfors, P., Sahlin, N. (eds.), Decision, probability, and utility, Cambridge: CUP,
1988
[Gra98] Grant, S., Kajii, A., Polak, B., Decomposable choice under uncertainty, Working paper,
Yale University, December 1998
[Gre87] Green, J., ‘Making book against oneself’, the independence axiom, and non-linear utility theory, Quarterly Journal of Economics, 102:785-96, 1987
[Gre88] Green, J., Jullien, B., Ordinal independence in non-linear utility theory, Journal of Risk
and Uncertainty, 1:355-87, 1988
[Gul90] Gul, F., Lantto, O., Betweenness satisfying preferences and dynamic choice, Journal of
Economic Theory, 52:162-77, 1990
[Hag71] Hagen, O., Towards a positive theory of preferences under risk, in: Allais, M., Hagen,
O. (eds.), Expected utility hypotheses and the Allais paradox, 271-302, Dordrecht: Reidel, 1979
[Ham88] Hammond, P., Consequentialist foundations for expected utility, Theory and Decision,
25:25-78, 1988
[Ham97] Hammond, P., Non-Archimedean subjective probabilities in decision theory and games,
Working paper, Stanford University, December 1997
[Har92] Harsanyi, J., Normative validity and meaning of von Neumann and Morgenstern utilities, in: Binmore, K., Kirman, A., Tani, P. (eds.), Frontiers of game theory, Cambridge: MIT
Press, 1993
[Har94] Harless, D., Camerer, C., The predictive utility of generalised expected utility theories, Econometrica, 62:1251-89, 1994
[Hey94] Hey, J., Orme, C., Investigations of expected utility theory using experimental data, Econometrica, 62:1291-326, 1994
[Hey97] Hey, J. (ed.), The economics of uncertainty, 2 vols., International library of critical
writings in economics, Cheltenham: Edward Elgar, 1997
[Kah79] Kahneman, D., Tversky, A., Prospect theory: An analysis of decision under risk,
Econometrica, 47:263-91, 1979
[Kah88] Kahneman, D., Tversky, A., Rational choice and the framing of decisions, in: Decision
Making, Cambridge: CUP, 1988
[Kar89] Karni, E., Safra, Z., Dynamic consistency, revelations in auctions and the structure of
preferences, Review of Economic Studies, 56:421-34, 1989
[Kar91] Karni, E., Schmeidler, D., Utility theory with uncertainty, in: Hildenbrand W., Sonnenschein, H. (eds.), Handbook of mathematical economics, 4:1763-1831, Amsterdam: Elsevier
Science Publishers, 1991
[Kel93] Kelsey, D., Dutch book arguments and learning in a non-expected utility framework,
Discussion paper, University of Nottingham, 1993
[Kol33] Kolmogoroff, A., Grundbegriffe der Wahrscheinlichkeitsrechnung, Berlin, 1933
[Kre78] Kreps, D., Porteus, E., Temporal resolution of uncertainty and dynamic choice theory,
Econometrica, 46:185-200, 1978
[Lic71] Lichtenstein, S., Slovic, P., Reversals of preferences between bids and choices in gambling
decisions, Journal of Experimental Psychology, 101:16-26, 1971
[Loo82] Loomes, G., Sugden, R., Regret theory: an alternative theory of rational choice under
uncertainty, Economic Journal, 92:805-24, 1982
[Loo83] Loomes, G., Sugden, R., A rationale for preference reversal, American Economic Review, 73:428-32, 1983
[Mac79] MacCrimmon, K., Larsson, S., Utility theory: Axioms versus ‘paradoxes’, in: Allais,
M., Hagen, O. (eds.), Expected utility hypotheses and the Allais paradox, 333-409, Dordrecht:
Reidel, 1979
[Mac82] Machina, M., Expected utility analysis without the independence axiom, Econometrica,
50:277-323, 1982
[Mac84] Machina, M., Temporal risk and the nature of induced preferences, Journal of Economic Theory, 33:199-231, 1984
[Mac87] Machina, M., Choice under uncertainty, Economic Perspectives, 1:121-54, 1987
[Mac89] Machina, M., Dynamic consistency and choice under uncertainty, in: [Bac91, 39-91]
[Mar86] Marschak, T., Independence versus dominance in personal probability axioms, in:
Heller, W., Starr, R., Starrett, D. (eds.), Social choice and public decision making: Essays in
honour of Kenneth J.Arrow, I, Cambridge: CUP, 1986
[Mar98] Martı́nez-Legaz, J., Lexicographic utility and orderings, in: [Bar98, 345-70]
[Mas96] Mas-Colell, A., Whinston, M., Green, J., Microeconomic theory, Oxford University Press,
1996
[May54] May, K., Intransitivity, utility, and the aggregation of preference patterns, Econometrica, 22:1-13, 1954
[Mor79] Morgenstern, O., Some reflections on utility, in: Allais, M., Hagen, O. (eds.), Expected
utility hypotheses and the Allais paradox, 175-83, Dordrecht: Reidel, 1979
[Mue89] Mueller, Dennis C., Public Choice II, Cambridge: CUP, 1989
[Mun87] Munier, B. (ed.), Risk, decision and rationality: Proceedings of the 3rd international conference on the foundations and applications of utility, risk and decision theories, Dordrecht: Reidel, 1988
[Neu28] Neumann, v. J., Zur Theorie der Gesellschaftsspiele, Mathematische Annalen, 100:295-320, 1928
[Neu44] Neumann, v. J., Morgenstern, O., The theory of games and economic behaviour,
Princeton: Princeton University Press, 1944
[Qui93] Quiggin, J., Generalised expected utility theory: The rank-dependent model, Dordrecht:
Kluwer, 1993
[Saf98] Safra, Z., Segal, U., Dual betweenness, Working paper, Tel Aviv University, November
1998
[Sam52] Samuelson, P., Probability, utility, and the independence axiom, Econometrica, 20:670-8, 1952
[Sav51] Savage, L., The Foundations of Statistics, New York: Wiley, 1951
[Sug86] Sugden, R., New developments in the theory of choice under uncertainty, Bulletin of
Economic Research, 38:1-24, 1986
[Web87] Weber, M., Camerer, C., Recent developments in modelling preference under risk, OR
Spektrum, 9:129-51, 1987
[Yaa87] Yaari, M., The dual theory of choice under risk, Econometrica, 55:95-115, 1987