A Conceptual Foundation for the Theory of Risk Aversion

A CONCEPTUAL FOUNDATION FOR THE THEORY OF RISK AVERSION
(WORKING PAPER)
YONATAN AUMANN
Bar Ilan University
Ramat Gan, Israel
Abstract. Classically, risk aversion is equated with concavity of the utility function. In this
paper we explore the conceptual foundations of this definition. In accordance with neo-classical
economics, we seek a scale-free definition of risk aversion, based on the decisions maker’s preference
order alone, independent of numerical values. We explore two such definitions. We then show that
when cast in quantitative form these ordinal definitions coincide with the classical Arrow-Pratt
definition once the latter is defined with respect to the appropriate scale (which, in general is not
money), thus providing a conceptual foundation for the classical definition. The implications of
the theory are discussed, including, in particular, to defining risk aversion for non-monetary goods,
to disentangling risk aversion from diminishing marginal utility, to multi-commodity/multi-period
risk aversion, and to inter-temporal discounting. The entire study is within the expected utility
framework.
Keywords: Risk aversion, Utility theory, Ordinal preferences, Multiple objectives decision making.
1. Introduction
1.1. Risk Aversion - The Classic Approach. The concept of risk aversion is fundamental in
economic theory. Classically, it is defined as an attitude under which the certainty equivalent of
a gamble is less than the gamble’s expected value; e.g., if a decision maker prefers one unit with
certainty over a fair gamble between three units and none, then she is deemed risk averse.
Examining this core definition, two fundamental questions arise, the combination thereof drives
this work.
First, there is the matter of scale. Consider a decision maker having to choose between lotteries
on the temperature-level in her office room. If she prefers 40◦ F with certainty over a fair gamble
between 30◦ and 60◦ – should this be considered risk aversion? The Fahrenheit scale seems rather
arbitrary in this case, but it is not clear what other scale should or can be used. In the seminal
works of Arrow [2] and Pratt [25], risk aversion was defined with respect to money and the market
E-mail address: [email protected].
Date: April 14, 2017.
1
value of the goods. This, however, limits the notion to monetary (or liquid) goods. A core question
is thus if and how risk aversion can be defined for non-monetary goods - temperature, health, love,
pain, and the like.
The second question is a conceptual one. The classic definition of risk aversion seems to be
based on the presumption that the natural certainty equivalent of a gamble is its expectation, and
risk aversion is defined with respect to this natural certainty equivalent. The question, however, is
why this presumption? Clearly, it cannot rest on empirical evidence, as most people are assumed
to be risk averse. The justification must be conceptual. But on a conceptual level, it is not clear
what reasoning dictates that a fair gamble between $100 and $200 “should” be worth $150; From
a decision theoretic perspective, where preference orders and independence curves are the core
elements of interest, what is the significance of the arithmetic mean? Providing a conceptual,
decision theoretic justification for basing the definition of risk aversion on the arithmetic mean is
the second core goal of this paper.
1.2. A Scale-Free Foundation. In order to address the above questions, we start by seeking a
“scale-free” definition of risk aversion, independent of any units, and making no use of arithmetic
notions such as mean or expectation. We consider two such definitions, as outlined shortly. Both
definitions are based solely on the internal structure of the decision maker’s preferences. Having
defined risk aversion in purely ordinal terms, we then derive a quantitative/numeric form of these
definitions. This quantitative form, we show, coincides with the classic Arrow-Pratt definition, once
the latter is defined with respect to an appropriate, natural scale. This scale, which in general is not
money, applies to any goods - monetary or non-monetary. This provides the missing conceptual
justification for the use of the expectation as the baseline for defining risk aversion, and, and the
same time determines the “appropriate” scale.
Lottery Sequences. Consider a lottery L with certainty equivalent c. Arguably, the most extreme
form of risk aversion would be exhibited if, with probability 1, the certainty equivalent is inferior to
the outcome of the lottery. If that is the case then the decision maker is willing to pay a premium,
with certainty, merely to avoid being in an uncertain situation. Such a preference, however, is
ruled out by the von Neumann-Morgenstern (NM) axioms; the utility of a lottery must lie between
the utilities of its possible outcomes. Interestingly, while such a preference is indeed not possible
for any single lottery, it is possible once we consider sequences of lotteries, and risk aversion as a
policy - consistently adhered to over multiple gambles. We show that for some preference orders
(agreeing with the NM axioms), repeatedly choosing the certainty equivalent of a lottery over the
lottery itself can result in an outcome that is inferior to what would have been the outcome of the
lotteries, with probability 1. This is thus our first scale-free definition of risk aversion: a preference
order is deemed risk-averse if adhering to this preference order over repeated lotteries ultimately
results in an inferior outcome, with probability 1. Importantly, here “inferior” is according to the
2
decision maker’s own preference order, over sequences, not any external market-based criterion.
The details of the definition are provided in Section 3.
Finite Sequences. The above definition is set in the context of an infinite sequence of time periods.
The second definition applies to any number (two or more) of time periods (or alternatively, any
other partition of the space into two or more independent factors1). When the number of lotteries
is finite, there can be no (non-trivial) behavior “with probability 1”. So, we cannot require that
the realization of the lottery sequence be preferred to the certainty equivalent with probability 1.
Rather, we define risk aversion as a preference wherein, for fair lottery sequences, the probability
that the realization is (weakly) preferred to the certainty equivalent is greater than the probability
of the reverse (that is, the probability that the certainty equivalent is (weakly) preferred to the
realization). The exact definition is provided in Section 5.
A Quantitative Form. Having established scale-free definitions of risk-aversion, we show that these
notions can also be cast in quantitative form, using an appropriate scale. Such a scale, we show,
is provided by the multi-attribute (additive) value function, pioneered by Debreu [7, 8], and commonly used in the theory of multi-attribute decision theory (see [20]). Debreu proves that (under
appropriate conditions) the preferences on commodity bundles can be represented by the sum of
appropriately defined functions of the individual commodities. Importantly, these Debreu functions
are defined solely on the basis of the internal preferences amongst the commodity bundles. Thus,
unlike monetary value - which is determined by external market forces - the Debreu functions represent the decision maker’s own preferences. Also, the functions are defined using the preferences on
sure outcomes alone, with no reference to gambles. Thus, they provide a natural, intrinsic yardstick
with which risk-aversion can be measured.
We show that the two above mentioned scale-free definitions of risk-aversion coincide with the
Arrow-Pratt numerical definition, once the latter is defined with respect to the Debreu value function. Essentially, we show that the NM utility function is concave with respect to the associated
Debreu function if and only if the given preference order is risk averse, under either of our definitions.
1.3. Implications. The approach offered in this paper has several implications for the understanding of risk aversion, both conceptual and technical, including:
Non-monetary Goods. First, the approach offers a way to define risk aversion for non-monetary
goods and goods with no natural scale, such as temperature, pain, and pleasure. Indeed, in the
definitions of this paper, externally defined scales (such as market value) do not play any role.
Rather, the only scale of interest is the intrinsic Debreu value, which reflects the decision maker’s
own certainty preferences.
1formally defined in Section 2
3
Disentangling Risk Aversion from Diminishing Marginal Utility. On a conceptual level, the approach offered in this paper provides a natural way to disentangle risk aversion from diminishing
marginal utility. In this scheme, the curvature of the NM utility function with respect to money
is decomposed into two components: the curvature of the Debreu value function with respect to
money, and the curvature of the NM utility function with respect to the Debrue value function.
With this decomposition, the former may naturally be associated with diminishing marginal utility,
while the latter - we argue - represents the risk aversion component. This direction is explored in
more detail in Section 9.
Multi-commodity Risk Aversion. Ever since the publications of Arrow [2] and Pratt [25], researchers
have considered how to extend the definition and associated measures to the multi-commodity
setting, and various approaches have been suggested (see [21, 29, 24, 9, 18, 26, 19, 23] for some
references in the expected utility model). Part of the problem is that in the multi-commodity setting
each commodity has its own scale, so it is not clear what scale should be used when measuring the
concavity of the utility function. Our approach here circumvents this problem. In our approach the
“native” scales of the different commodities are immaterial. Rather, risk aversion is defined with
respect to the preference structure, and the associated Debreu value function. This value function
is shared across all commodities, so there is only one relevant scale. In Section 6 we show how this
approach naturally extends the Arrow-Pratt framework to the multi-commodity setting.
1.4. Assumptions.
Independence. Independence is a key notion and assumption throughout this work. Simply put,
a commodity, or set of commodities, is independent if the preference order over bundles of this
set of commodities is independent of the state in other commodities.2 Arguably, independence is
a strong assumption; having eaten Chinese food today may affect one’s gastronomical preferences
tomorrow. Nonetheless, independence is a common assumption in economic literature, and in
particular with respect to time preferences; indeed, the standard exponential discounted-utility
model implies independence of any time period (indeed, any subset of the time periods). We use
the independence assumption not because we believe it is a 100% accurate representation of reality,
but rather because we believe it is a good enough approximation, which allows us to concentrate
on and formalize other key notions.
Expected Utility. This work is presented entirely within the expected-utility (EU) framework. The
key reason is that the classical Arrow-Pratt definitions were provided within this framework, and we
seek to explore the conceptual foundations of these definitions. Additionally, while EU is perhaps
not the only possible model, it nonetheless is a possible model; and one that is frequently used
in real-world economic and financial applications. So, understanding the notion of risk aversion
2A formal definition is provided in the next section.
4
within this framework is of interest. Extending these ideas to non-EU models is an interesting
future research direction.
1.5. Plan of the Paper. The remainder of the paper is structured as follows. Immediately
following, in Section 2, we present the model, terminology and notation used throughout. The
definition based on infinite lottery sequences is presented in Section 3, and its quantitative form in
Section 4. Section 5 presents the second definition, together with its equivalent quantitative form.
Implications to multi-commodity risk aversion are discussed in Section 6, and to inter-temporal
discounting in Section 7. Some related work is discussed in Section 8. We conclude the main body
of the paper with a discussion in Section 9. All proofs are deferred to an appendix.
2. Model, Terminology and Notation
The Spaces. Preferences are defined over consumption spaces of the form S = C1 × · · · × Cm , where
each Ci is a separable and arc-connected topological space, representing the consumption space of
a specific commodity (at a specific time period). We call each Ci a commodity space.
Lotteries. All lotteries considered are finite support lotteries. The set of all such lotteries over a
space S is denote by ∆(S ). The notation hs1 , . . . , sn : p1 , . . . , pn i, denotes the lottery wherein
outcome sk occurs with probability pk .
Preference Orders. For a space S , two preferences orders are considered:
• the certainty preferences: a preference order - on S ,3
∆
• the lottery preferences: a continuous preference order - on ∆(S ), which agrees with - on
the sure outcomes.
As customary, ≺ denotes the strict preference order induced by -, and ∼ the induced indifference
∆
∆
∆
relation; similarly ≺ and ∼ denote the relations induced by -. Continuity of - means that for
∆
∆
∆
∆
any lottery L, the sets {s : s≺L} and {s : sL} are open (in S ). Since - and - agree on S , this
implies that - is also continuous (that is, the sets {s : s≺s0 } and {s : ss0 } are open for all s0 ∈ S )
∆
∆
All commodity spaces Ci are assumed to be strictly essential [16]; that is, for each i and s−i ∈ C−i
(the remaining commodities), there exist si , s0i ∈ Ci with (si , s−i ) 6∼ (s0i , s−i ).
We assume throughout that the von Neumann-Morgenstern (NM) axioms hold for all preference
orders on lotteries.
Factors and Partitions. The term factor refers to a single Ci or a product of several Ci ’s; i.e., a
factor is the product of one or more commodity spaces. Most commonly in this paper, factors
represent time periods; that is, the factor Ti is the product of all commodity spaces associated
3A preference order is a complete, transitive and reflexive binary relation.
5
with consumption at time i. A partition of S is a representation of S as a product of factors
S = T1 × · · · × Tn .4 An element of S (or of any factor) is called a bundle.
Q
Throughout, ai , bi , ci represent elements of Ti . For i, j, we denote S−{i,j} = t6=i,j Tt . For
c ∈ S−{i,j} , by a slight abuse of notation we denote
(1)
(ai , aj , c) = (c1 , . . . , ci−1 , ai , ci+1 , . . . , cj−1 , aj , cj+1 , . . . , cn ).
Bundle Intervals. For s - s, we denote
[s, s] = {s : s - s - s}
That is, [s, s] is the closed interval of bundles between s and s. Hence, we call such an [s, s] a
bundle interval, or simply interval.
Utility Representations. A function f : S → R represents - if for any s, s0 ∈ S ,
s - s0 ⇐⇒ f (s) ≤ f (s0 ).
∆
The function f : S → R is an NM utility of - if for any L1 , L2 ∈ ∆(S ),
∆
L1 -L2 ⇐⇒ EL1 [f (s)] ≤ EL2 [f (s)],
where ELj [f (s)] is the expectation of f (s) when s is distributed according to Lj . In that case we
∆
also say that f represents -.
Independence. Independence is a key notion in our analysis. Simply put, a factor is independent if
the preferences on the factor are well defined; i.e., the preferences within the factor are independent
of the state in other factors. Formally, for a partition S = T1 × · · · × Tn , we say that factor Ti
is independent if there exists a preference order -Ti on Ti such that for any ai , bi ∈ Ti and any
c ∈ S−i (the remaining factors),
ai -Ti bi ⇐⇒ (ai , c) - (bi , c).
It is important to stress that independence only refers to the certainty preferences; it does not state
or imply that the preferences on lotteries in one factor are independent of the state in other factors.
That would be a much stronger assumption, which we do not make.
When no confusion can result, we may write - instead of -T ; thus, when a, a0 ∈ T , we may
write a - a0 instead of a -T a0 . It is worth noting that the product of independent factors need
not be independent.5
A partition S = T1 × · · · × Tn is an independent partition if the product of any subset of factors
is independent. By Gorman [16], for n ≥ 3, it suffices to assume that Ti × Ti+1 is independent for
all i, and the independence of all other products then follows.
4For our purposes, we only care about the commodities of the space, not their order. So, it may be that the order
of the commodities in the partition S = T1 × · · · × Tn does not correspond to the original order.
5A simple example is the preference on X × Y × Z = (R+ )3 represented by the function v(x, y, z) = xy + z. Here,
each commodity space is independent, but Y × Z is not independent.
6
Relative Convexity/Concavity. Let f, g : S → R, for some space S, with g(x) = g(y) ⇒ f (x) = f (y),
for all x, y ∈ S. We say that f is concave with respect to g if there is a concave function h with
f = h ◦ g. Similarly for convexity, strict concavity, and strict convexity.
3. Lottery Sequences
Our definition of risk aversion is set in the context of lottery sequences. Conceptually, this definition says that risk aversion is a preference that when adhered to repeatedly, ultimately leads to an
inferior outcome. More specifically, with a risk averse preference, repeatedly choosing the certainty
equivalent of a lottery over the lottery itself ultimately leads to an inferior outcome, with probability 1. To make this definition concrete, we must first define the associated notions, including:
lottery sequences, certainty equivalent of a lottery sequence, and ultimately inferior outcome.
The Space. We consider an infinite sequence of factors T1 , T2 , . . ., where Ti represents the consumption space at time i.6 We denote Hn = T1 × · · · × Tn - the finite history space up to time n. In the
following, ai , bi , ci , will always be taken to be in Ti , and lottery Li will be over Ti .
Preference Orders. While the number of factors is infinite, we only need to consider the preferences
∆
n
on the finite history spaces Hn . We denote by -n the preference order on Hn , and by - the
preference order on ∆(Hn ). The superscript n is frequently omitted when clear from the context.
Each Ti is assumed to be independent (in the certainty preference orders -n ), but not necessarily
∆
n
utility independent (in preference orders - ).
∆
∆
1
∆
2
We call the sequence of preference orders - = (- , - , . . .) the preference policy.
Lottery Sequences. Let L1 , L2 , . . . , be a sequence of lotteries (with Li over Ti ). We denote by
(L1 , . . . , Ln ) the lottery over Hn obtained by the independent application of each Li on its associated
factor.
Certainty Equivalents. Suppose that at time t = 1 the decision maker is offered the choice between
lottery L1 and its certainty equivalent c1 . Then, consistent with her preference policy, she may
choose c1 , which suppose she indeed does. Now, at time t2 , she is offered the choice between
lottery L2 and its certainty equivalent c2 . Again, consistent with her preference policy, she chooses
c2 . Suppose that she is thus offered, in each time period, the choice between a lottery Li and
its certainty equivalent ci . Then the decision maker can consistently choose ci , ending up with
(c1 , c2 , . . .).
Accordingly, we say that c = (c1 , c2 , . . .) is the repeated certainty equivalent of L = (L1 , L2 , . . .)
if (c1 , . . . , cn−1 , cn )∼n (c1 , . . . , cn−1 , Ln ) for all n.
∆
6We do not assume that T = T , i.e. the state spaces need not be the same at different time periods. In particular,
i
j
we do not assume any form of stationarity (though it is possible). Similarly, discounting may or may not be applied
between consecutive factors. Our discussion here is independent of any such nominal matters.
7
Figure 1. Illustration of [aj , bj ] v [a1 , b1 ] (the factors of S−{1,j} are not depicted).
Ultimate Inferiority. Consider a sequence c = (c1 , c2 , . . .) of sure states, and a sequence L =
(L1 , L2 , . . .) of lotteries. Let `i be the realization of Li . We say that c is ultimately inferior to L if
Pr[(c1 , . . . , cn ) ≺n (`1 , . . . , `n ) from some n on] = 1.7
Notably, here ≺n denotes the preference over the sure states. Thus, if c is ultimately inferior to L,
then consistently choosing the sure state ci over the lottery Li , will, with probability 1, eventually
result in an inferior outcome, and continue being so indefinitely.
Similarly, c is ultimately superior to L if
Pr[(c1 , . . . , cn ) n (`1 , . . . , `n ) from some n on] = 1.
Bounded and Non-Vanishing Lottery Sequences. We now want to define risk aversion as a policy
for which the repeated certainty equivalent of a lottery sequence is always ultimately inferior to the
lottery sequence itself. However, as such, this definition cannot be a good one since in the case that
the “magnitude” of the lotteries rapidly diminishes the overall outcome will be dominated by that
of the first lotteries, and we could never obtain an inferior outcome with probability 1. Similarly, if
the “magnitude” of the lotteries can grow indefinitely, then for almost any preference policy one can
construct a lottery sequence that is ultimately inferior to its repeated certainty equivalent.8 Hence,
we now define the notions of a bounded lottery sequence and a non-vanishing lottery sequence.
For bundle intervals [ai , bi ] and [aj , bj ], we denote [aj , bj ] v [ai , bi ] if (ai , bj , c) - (b1 , aj , c) for all
c ∈ S−{i,j} (see Figure 1).
A sequence of intervals [a1 , b1 ], [a2 , b2 ], . . ., is bounded if [ai , bi ] v [a1 , b1 ], for all i. The sequence
is vanishing if for any [ã1 , b̃1 ], there exists a j0 such that [aj , bj ] v [ã1 , b̃1 ] for all j > j0 . That is,
the intervals in the tail of the sequence become infinitely small.
7differently put: Pr[∃N, ∀n ≥ N : (c , . . . , c ) ≺n (` , . . . , ` )] = 1.
1
n
1
n
8See Appendix B.
8
A lottery sequence L = (L1 , L2 , . . .) is bounded if its support is entirely within some bounded
interval sequence (that is, there exists a bounded sequence of intervals [a1 , b1 ], [a2 , b2 ], . . . , with
Li ∈ ∆([ai , bi ]) for all i). The sequence is non-vanishing if it includes an infinite sub-sequence of
fair lotteries, the support thereof is not entirely within any vanishing interval sequence.
Risk Averse Policies. Equipped with these definitions, we can now define risk aversion:
∆
Definition 1. We say that preference policy - is:
• Risk averse if for any bounded non-vanishing lottery sequence, the repeated certainty equivalent of the sequence is ultimately inferior to the lottery sequence itself.
• Weakly risk averse if the repeated certainty equivalent of any bounded lottery sequence is
not ultimately superior to the lottery sequence itself.
Thus, the bias of the risk averse for certainty can never result in an ultimately superior outcome,
and on non-vanishing lotteries necessarily leads to an inferior outcome.
Note that the above definition is fully ordinal; it makes no reference to any numerical scale, and
indeed, no such scale need exist.
3.1. Risk Loving and Risk Neutrality. For readability, we deferred the definitions of risk loving
and risk neutrality. We now complete the picture by providing these definitions.
∆
Definition 2. We say that preference policy - is:
• Risk loving if for any bounded non-vanishing lottery sequence, the repeated certainty equivalent of the sequence is ultimately superior to the lottery sequence itself.
• Weakly risk loving if the repeated certainty equivalent of any bounded lottery sequence is
not ultimately inferior to the lottery sequence itself.
• Risk neutral if it is both weakly risk loving and weakly risk averse.
Thus, the risk loving require an ultimately superior certainty equivalent to forgo their love of
risk.
4. Lottery Sequences: The Quantitative Perspective
The previous section provided a fully ordinal definition of risk aversion. We now show how this
ordinal definition can be cast in quantitative form. Specifically, we show that (under some assumptions) this ordinal definition of risk-aversion coincides with the Arrow-Pratt cardinal definition,
once the latter is defined with respect to the appropriate scale. This scale, we show, is provided by
the Debreu value function, which we review next.
4.1. Debreu Value Functions. The theory of multi-attribute decision making considers certainty
preferences over a multi-factor space, and establishes that under certain independence assumptions
such preferences can be represented by quantitative functions, as follows. Consider the space
Hn = T1 × · · · × Tn (n ≥ 2), with preference order -n . Debreu [7] proves that, if the partition
9
is independent9 then -n is additively separable;10 that is, there exist functions v Ti : Ti → R, such
that for any (a1 , . . . , an ), (a01 , . . . , a0n )
(a1 , . . . , an ) -n (a01 , . . . , a0n ) ⇐⇒
n
X
v Ti (ai ) ≤
i=1
n
X
v Ti (a0i ).
i=1
It is important to note that the functions are defined solely on the basis of the certainty preferences.
Debreu’s theorem also establishes that the functions are unique up to similar positive affine
transformations (that is, multiplication by identical positive constants and addition of possibly
different constants).
We call the function v Ti a (Debreu) value function for Ti , and the aggregate function vn =
Pn
i=1 v
Ti
a (Debreu) value function for Hn .11 We note that Debreu [7] called these functions utility
functions; but following Keeney and Raiffa [20], we use the term value functions, to distinguish
them from the NM utility function.
4.2. Risk Aversion and Concavity. We now show that our ordinal definition of risk aversion,
Definition 1, corresponds to concavity of the NM utility functions with respect to the associated
Debreu value functions, provided these value functions exist, and that some consistency properties
hold among the preference orders on the Hn ’s. The exact conditions are now specified.
Certainty Preference. For the certainty preferences, assume:
A1 - Independence: Hn = T1 × · · · × Tn is an independent partition12 (with respect to -n ), for
all n.
0
A2 - Consistency: for any n, n0 , n0 > n, the preference order induced by -n on Hn is identical
to -n
Note that this is only assumed with respect to the certainty preferences, not the lottery preferences.
These assumptions yield the existence of value functions, as follows:
Proposition 4.1. Assuming A1-A2, there exist Debreu value functions v Ti : Ti → R, i = 1, 2, . . .,
P
such that for all n, vn = ni=1 v Ti represents -n .
Lottery Preferences. Whereas the factors are assumed independent, the lottery preferences there∆
n+1
upon need not be independent. That is, the preference order on ∆(Hn ) induced by -
may
depend on the state an+1 in Tn+1 . We do assume, however, the following weak consistency.
9see page 6.
10In the case of two factors (n = 2), the following Thomsen condition is also required: for all a , B , c ∈ T , and
1
1 1
1
a2 , b2 , c3 ∈ T2 , if (a1 , b2 ) ∼ (b1 , a2 ) and (b1 , c3 ) ∼ (c1 , b2 ) then (a1 , c2 ) ∼ (c1 , a2 ). For n > 2 the Thomsen condition
is implied by the independence of the pairs.
11This is a slight abuse of notation. More precisely, v is the function on Hn given by v(a , . . . , a ) = Pn v Ti (a ).
1
n
i
i=1
12That is, each consecutive pair of factors T × T
is
independent.
i
i+1
10
A3 - Weak consistency: for any n, there exists a φn+1 ∈ Tn+1 with
n
n+1
L- L0 ⇐⇒ (L, φn+1 )∆
∆
(L0 , φn+1 );
that is, the preferences on ∆(Hn ) are consistent with some possible future. We call the
sequence (φ2 , φ3 , . . .) the presumed future, and assume that it is bounded away from the
boundaries of the Ti ’s; that is, there exists an s > 0 with v Ti (φi ) ± s ∈ v Ti (Ti ) for all i.
4.2.1. Weak Risk Aversion and (Weak) Concavity. For each n, let un be the NM utility function
∆
n
representing - . The next theorem establishes the connection between weak risk aversion and
concavity of the un ’s.
∆
Theorem 1. Assuming A1-A3, - is weakly risk averse if and only if un is concave with respect to
vn , for all n.
Thus, Theorem 1 provides the missing conceptual justification for defining risk aversion by concavity of the utility function. It also establishes the appropriate scale - the Debreu value function.
Interestingly, the theorem provides that all NM utility functions must be concave, not only from
some n on.
4.2.2. (Strict) Risk Aversion and Strict Concavity. We would have now wanted to claim that (strict)
risk aversion corresponds to strict concavity of the NM utility functions (with respect to the value
function). However, strict concavity alone is not enough, as we are considering repeated lotteries,
and we cannot expect ultimate inferiority if the “level of concavity” rapidly diminishes. So, we need
a condition that ensures that the functions are also “uniformly” strictly concave in some sense. As
it turns out, the condition of interest is that the coefficient of absolute risk aversion of the NM
utility functions is bounded away from zero (when measured with respect to the value function).
The exact definitions follow.
For each n, let ûn be the function such that ûn (vn (a1 , . . . , an )) = un (a1 , . . . , an ). This is well
∆
n
defined, as - and -n agree on the certainty preferences. Conceptually, ûn is the function un once
the underlying scale is converted to the value function vn . Denote û = (û1 , û2 , . . .).
For a twice differentiable function f the coefficient of absolute risk aversion of f at x is:
Af (x) = −
f 00 (x)
.
f 0 (x)
Theorem 2. Assuming A1-A3, if Aûn (x) is bounded away from 0, uniformly for all n and x,13
∆
then - is risk averse (assuming ûn is twice differentiable for all n).
13that is, there exists an constant α > 0 such that A (x) ≥ α for all n and x.
ûn
11
Theorem 2 establishes a sufficient condition for risk aversion. We now proceed to establish a
necessary condition, which is “close” to being tight. To do so we need to consider the behavior of
the functions ûi , and the definition of Aûi (·), in a little more detail.
Let risk-premûn (x, ±) be the risk premium according to ûn of the of the lottery hx + , x − i;
that is
risk-premûn (x, ±) = x − (ûn )−1
ûn (x + ) + ûn (x − )
2
.
Now for any (sufficiently small) define
RPû () = inf {risk-premûn (x, ±)}.
n,x
So, RPû (·) is a function. We will be interested in the rate at which RPû () declines as → 0. The
condition of interest, we show, is that RPû () declines no faster than 2 .
Theorem 3. Assuming A1-A3,
∆
(a) If RPû () = Ω(2 ) as → 0 then - is risk averse.14
∆
(b) If RPû () = O(2+β ) as → 0, for some β > 0, then - is not risk averse.
The sufficient condition of (a) and the necessary one of (b) are not identical, but close.
Finally, we establish that the sufficient condition of Theorem 3-(a) and that of Theorem 2 are
the same.
Proposition 4.2. RPû () = Ω(2 ) as → 0, if and only if Aûn (x) is bounded away from 0,
uniformly for all n and x (assuming ûn is twice differentiable for all n).
4.3. Risk Loving and Risk Neutrality. In analogy to Theorems 1 and 3 we have:
Theorem 4. Assuming A1-A3, for vn , un , and û as in Theorems 1 and 3
∆
(a) Weak risk loving: - is weakly risk loving if and only if un is convex with respect to vn for
all n.
(b) Risk loving
∆
• If (−RPû ()) = Ω(2 ) as → 0 then - is risk loving.
∆
• If (−RPû ()) = O(2+β ) as → 0 (for some β > 0) then - is not risk loving.
∆
(c) Risk Neutral: - is risk neutral if and only if un is a linear transformation of vn for all n.
5. Finite Lottery Sequences
The definitions of Section 3 require an infinite sequence of time periods. In this section, we
provide a definition for the setting with a finite number of periods.
14recall that g(y) = Ω(h(y)) as y → 0 if there exists a constant M and y such that g(y) > M · h(y) for all y < y .
0
0
12
The Space. We consider a space S and an independent partition S = T1 × · · · × Tn .15 In this
case, the factors Ti may either be different time periods, as in the previous sections, or any other
partitions (e.g. work and pleasure).
5.1. Risk Aversion. When the number of spaces is finite we cannot possibly expect a behavior
“with probability one”, as in the definitions of Section 3. Rather, for the finite case, risk aversion
is defined as a preference wherein the probability that the realization of a lottery is preferred to its
certainty equivalent is greater than the probability of the opposite, as follows.
Weak Risk Aversion. Denote the certainty equivalent of a lottery L by c(L).16 A lottery sequence
is fair if each of the lotteries therein is a fair lottery (50-50) over its respective time period.
∆
Definition 3. - is weakly-risk-averse if for any fair lottery sequence L,
Pr[c(L) - `] ≥ Pr[c(L) % `].
(2)
Strong Weak Aversion. We would have wanted to define strong weak aversion as a preferences
wherein (2) holds with a strong inequality. However, this cannot be a good definition, as in the
case where all but one of the lotteries in the sequence are degenerate there are only two possible
outcomes to the lottery, and the certainty equivalent would necessarily lay in between these two
possible outcomes. So, the probability of both sides of (2) would equal 1/2. Similarly, if one of the
lotteries in the sequence is “large” and all the other lotteries relatively “small”, one would again
get that the probabilities for both sites of (2) would be 1/2. So, what we need is two lotteries that
are “of the same magnitude”. Hence, the following definition:
Definition 4. Fair lotteries Li = hai , bi : 0.5, 0.5i and Li = hai , bi : 0.5, 0.5i (in ∆(Ti ) and ∆(Tj )
respectively) are of the same magnitude if (ai , bj ) ∼ (bi , aj ),17. Lottery sequence L = (Li , Lj , d) is
a repeated lottery sequence if Li , Lj are of the same magnitude and d ∈ S−{i,j} .
With this definition, strong risk aversion is defined as preferences wherein strict inequality holds
for repeated lottery sequences.
∆
Definition 5. - is (strongly)-risk-averse if for any non-degenerate repeated lottery sequence L
Pr[c(L) - `] > Pr[c(L) % `].
15Here we use the notation S rather than Hn since we will be considering one fixed space S .
16If there are several certainty equivalents, then pick one arbitrarily.
17this is well defined, as the partition is independent, so the preferences on each pair of factors is independent of
the others.
13
5.2. Properties. Definitions 3 and 5 implicitly assume some underlying partition (with respect to
which the lottery sequences are defined). The following proposition establishes that if the definition
holds for some partition then it holds for any partition.
∆
Proposition 5.1. If - is weakly-risk-averse by Definition 3 with respect to some independent
partition S = T1 × · · · × Tn , then it is also weakly-risk-averse with respect to any independent
partition. Similarly for (strong)-risk-aversion of Definition 5.
5.3. The Quantitative Analogue. As in Section 4, we now show that risk-aversion, as in definitions 3 and 5, corresponds to concavity of the NM utility with respect to the Debreu value function
(if it exists).
Let v Ti : Ti → R, be Debreu value functions for -, with respect to the given partition. That is,
v(a1 , . . . , an ) =
n
X
v Ti (ai )
i=1
represents -.
∆
Theorem 5. For -, -, with NM utility u and Debreu value function v, then u is concave with
∆
respect to v if and only if - is weakly-risk-averse by Definition 3, and strictly-concave with respect
∆
to v if and only if - is strongly-risk-averse by Definition 5.
So again, risk aversion as in Definitions 3 and 5 coincides with the Arrow-Pratt notion, once
concavity is defined with respect to the Debreu value function.
5.4. Risk Loving and Risk Neutrality. The definitions and theorems for risk-loving and risk
neutrality are analogous.
∆
Definition 6. - is weakly-risk-loving if for any fair lottery sequence L,
Pr[c(L) - `] ≤ Pr[c(L) % `]
and (strongly)-risk-loving if for any non-degenerate repeated lottery sequence L
Pr[c(L) - `] < Pr[c(L) % `].
∆
- is risk-neutral if for any fair lottery sequence L
Pr[c(L) - `] = Pr[c(L) % `]
∆
Theorem 6. For -, -, with NM utility u and Debreu value function v, then u is convex with
∆
respect to v if and only if - is weakly-risk-loving by Definition 6, and strictly-convex with respect
∆
to v if and only if - is strongly-risk-averse by Definition 6.
∆
u is a linear transformation of v if and only if - is risk-neutral by Definition 6.
14
6. Multi-Commodity Risk Aversion
The seminal works of Arrow [2] and Pratt [25] defined risk aversion with respect to a single
commodity – money. Ever since, researchers have considered how to extend the definition, and
associated measures, to the multi-commodity setting (see [21, 29, 24, 9, 18, 26, 19, 23] for some
references in the expected utility model). It is out of the scope of this paper to review this extensive
body of research, but a key problem in the multi-commodity setting is that each commodity has
its own scale so the question arises as to which scale should or can be used when measuring the
concavity of the (multi-variable) utility function. Indeed, some papers (e.g. [9]) keep the multiple
scales - in which case the measures of risk aversion become vectors (for the risk premium) and
matrices (for the coefficient of ARA).
Our approach here takes a different direction, which, in a way is the reverse. Rather than starting
from the single commodity definition and extending it to multi-commodities, we start from the
multi-commodity setting, and then derive the uni-scale case as a quantitative representation of
the former. So, the “native” scales of the different commodities are immaterial in our approach.
Rather, the only scale of interest is the intrinsically defined Debreu value function, which is shared
across all commodities.
We now show how, with this approach, the Arrow-Pratt framework carries over to the multicommodity setting.
6.1. DARA and CARA Preferences. Arrow and Pratt introduced the coefficient absolute risk
aversion, and the related notions of CARA - constant absolute risk aversion - and DARA - decreasing absolute risk aversion. They show that if (and only if) the utility function exhibits DARA
then for a given lottery L, the associated risk premium decreases as the decision maker’s wealth
level increases. Similarly, the risk premium is independent of the wealth level if and only if the
coefficient absolute risk aversion is constant. In particular, a decision maker accepting a lottery at
some wealth level is guaranteed to accept the same lottery at a higher wealth level if and only if
her utility function exhibits DARA or CARA.
The Arrow-Pratt coefficient of absolute risk aversion, and hence also the associated notions of
CARA and DARA, are defined in a uni-commodity setting. As such, they do not carry over to the
multi-commodity, or multi-period, setting, as exemplified next.
Consider two commodities, say the decision maker’s property and her art collection, with $x
being the value of the property and $y the value of the art work (and suppose that x, y ≥ 2).
Suppose that the decision maker’s utility function is u(x, y) = (log2 x + log2 y)2 . Then, it is easy
to see that for any fixed y, the coefficient of absolute risk aversion of u with respect to x is DARA,
and vice versa. However, with y = 2, the certainty equivalent of a fair lottery between x = 2 and
x = 4 is ≈ 2.93, while with y = 1000 the certainty equivalent of the same lottery is ≈ 2.85, and
with y = 1, 000, 000 it is ≈ 2.83. So, the certainty equivalent decreases, and the risk premium
increases, as the wealth y grows. In particular, a decision maker with x = 2.9 will accept the fair
15
lottery between x = 2 and x = 4 at a wealth level of y = 2, but would reject the same lottery at
the higher wealth level y = 1000.
Thus, the Arrow-Pratt notions of DARA and CARA are limited to single commodities. We now
show that, using our notion of risk aversion, these notions can be extended to the multi-commodity
setting.
Given an NM utility function u and Debreu value function v, we naturally define the coefficient
of absolute risk aversion of u with respect to v to be
d2 u du
/ ,
dv 2 dv
which, setting û = u ◦ v −1 , is equal to
û00
.
û0
Note that this is the exactly the Arrow-Pratt coefficient of absolute risk aversion when using the v
scale. DARA and CARA are naturally defined with respect to this coefficient. With this definition,
once again, the risk premium of a lottery decreases with the wealth-level for DARA preferences.
Specifically,
Proposition 6.1. In the multi-commodity setting (with S = T1 ×· · ·× Tn an independent partition),
the NM utility function u is DARA with respect to v if and only if for any commodity i, lottery Li
over Ti , state ci ∈ Ti , and d−i , d0−i ∈ Ω−{i} , if d−i ≺ d0−i then
(ci , d−i )-(Li , d−i ) ⇒ (ci , d0−i )≺(Li , d0−i ).
∆
∆
Similarly, u is CARA with respect to v if and only if for any commodity i, lottery Li over Ti , state
ci ∈ Ti , and d−i , d0−i ∈ Ω−{i}
(ci , d−i )-(Li , d−i ) ⇐⇒ (ci , d0−i )-(Li , d0−i )
∆
∆
So, once risk aversion is defined with respect to the value function, the DARA and CARA
behavior carry over to the multi-commodity/multi-period setting.
6.2. Correlation Aversion. Richard [26] considered risk aversion in the multi-commodity setting,
and introduced the following notion,
Definition 7. Given an independent partition, S = T1 × · · · × Tn ,18 - is correlation-averse with
∆
respect to factors Ti , Tj , i 6= j, if for any ai ≺ bi , aj ≺ bj , and c ∈ S−{i,j}
(3)
∆
h(ai , aj , c), (bi , bj , c)i ≺ h(ai , bj , c), (bi , aj , c)i .
The preference is weakly correlation-averse if (3) holds with a weak preference.
18Technically, in [26] it is assumed that each T is a real interval. However, transformation into such an interval
i
can always be obtained by means of [6].
16
This notion, which Richard [26] initially termed multi-variate risk aversion,19 was introduced as
“a new type of risk aversion unique to multivariate utility functions” [26]. A later paper states:
“[Richard’s definition] has nothing to do with what is generally known as risk aversion” [28].
We show that the two notions are deeply tied. We first show that a decision maker is (weakly)
correlation averse if and only if she is (weakly) risk averse in the sense of Definitions 3 and 5. This,
in turn, establishes (by Theorem 5) that the decision maker is weakly correlation averse if and
only if her NM utility function is concave with respect to the Debreu value function, and (strictly)
correlation averse if and only if the utility function is strictly concave with respect to the value
function. So, correlation aversion and Arrow-Pratt risk aversion are one and the same when using
the value function scale.
∆
Theorem 7. If preference order - is (weakly) risk averse by Definition 3 then it is (weakly)
∆
correlation averse with respect to all pairs of factors Ti , Tj . Conversely, if - is correlation averse
with respect to some pair of factors Ti , Tj , then it is (weakly) risk averse by Definition 3.
Which, in turn establishes:
∆
Corollary 6.1. Let - be a preference order over an independent partition S = T1 × · · · × Tn , with
∆
NM utility u and Debreu value function v. Then, u is concave with respect to v if and only if is weakly correlation averse with respect to any and all factor pairs Ti , Tj . Similarly, u is strictly
∆
concave with respect to v if and only if - is correlation averse with respect to any and all factor
pairs Ti , Tj .
6.3. Comparative Multi-Commodity Risk Aversion. Kihlstrom and Mirman [21], observed
that in the multi-commodity setting it is natural to limit risk aversion comparisons to decision
makers agreeing on the certainty preferences.20 This also holds in our framework, as our definition
of risk aversion is always with respect to the certainty preferences. For individuals agreeing on the
certainty preferences, using our approach the entire Arrow-Pratt framework carries over as is, once
the underlying scale is converted to the associated (joint) Debreu value function.
In particular, we have the following. Let v be a joint Debreu value function (representing the
joint certainty preferences), and û1 and û2 be the NM utility functions of players 1 and 2, when
measured with respect to v. For a lottery L let cj (L) be the certainty equivalent of L by ûj
(j = 1, 2). Then
c1 (L) - c2 (L)
for all lotteries L if and only if
Aû1 (x) ≥ Aû2 (x)
19The now more familiar term - correlation-aversion, which we adopt here, was coined by Epstein and Tanny [12].
20Indeed, the same is also true for the classical framework, but it is implicitly assumed that all individuals agree
on the certainty preferences: more money is better than less. Attempting to compare individuals who do not share
this preference, e.g. comparing the risk attitude of Imelda Marcos with that of Saint Francis of Assisi, is meaningless
also under the classic Arrow-Pratt framework.
17
for all x (where Aûi (x) is the coefficient of absolute risk aversion of ûi at x). This follows directly
from Arrow-Pratt as their theorems do not specify the scale, and thus also apply when using the
value function scale.
7. Inter-temporal Discounting
Discounting of the future is a ubiquitous assumption in economic literature, and indeed, widely
observed in practice. Scholars have identified several factors that may contribute to such discounting
(see [14] for an excellent review). One of these factors is the uncertainty associated with the future
- the decision maker may not live to enjoy the future. Other factors are more psychological in
nature - e.g. a preference for immediate pleasure. The common discounted utility model bundles
all factors in a single discount rate. We now show that the framework developed in this paper allows
disentangling the former - discounting due to uncertainty associated with the future - from the other
latter factors, establishing a measure for the rate of discounting arising from the uncertainty of the
future. We also examine how risk aversion (or risk loving) influences such discounting.
Consider a sequence of time periods t = 1, . . . , n,21 and an associated consumption space S =
T1 × · · · × Tn . Suppose that while there are n possible time periods, actual consumption may be
for a shorter duration, e.g. due to death. That is, for each i = 1, . . . , n, there is a probability,
si , that consumption is for the i first time periods alone; so, si is the probability of “death” right
after time i (here the term “death” should be interpreted in the wide sense, as any factor that
may halt consumption - e.g. liquidation in case the entity of interest is a company). The vector
s = (s1 , . . . , sn ) is termed the survival vector. Clearly, the survival vector may influence the decision
maker’s preferences over the n-long sequences. For example, if the decision maker is sure to die
after the first period, then only the preferences of the first time period matter; e.g. if 3 is preferred
to 2 in the first period, then (3, 2, . . .) is preferred to say (2, 100, . . .). If, on the other hand, the
decision maker is sure to live for both of the first time periods, then the preferences between these
two time periods are determined by the risk-free inter-temporal rate of substitution between the two
periods. At intermediate situations, when death between the first and the second time periods is
possible but not certain, a combination of these two factors shall determine inter-temporal decision
making, and, in particular, inter-temporal discounting. We now investigate the structure of the
first component; that is, inter-temporal discounting due to possible “death”, and, in particular, the
role of risk aversion in this discounting.
Given a = (a1 , . . . , an ) ∈ S , and survival vector s = (s1 , . . . , sn ), actual consumption is a lottery:
ha1 , . . . , an : s1 , . . . , sn i. So, for two such sequences, a and b, the preference between the two is
actually the preference between the associated lotteries ha1 , . . . , an : s1 , . . . , sn i and hb1 , . . . , bn :
s1 , . . . , sn i. So, any survival vector s induces a preference order -s over S , which reflects the
preferences amongst the associated lotteries. Inter-temporal discounting in -s is determined by
21For brevity we consider the case with a finite number of time periods.
18
both risk-free inter-temporal preferences (over certainties), and by the probabilities in s, and, as
we will see, the decision maker’s risk attitude.
For k = 1, . . . , n, denote Hk = T1 × · · · Tk - the set of finite histories to time k. The set of all
∗
possible outcomes is H∗ = ∪nk=0 Hk . Let -∗ and - be agreeing preferences orders over H∗ and
∆
k
∆(H∗ ), respectively. For each k, let -k and - be the induced certainty and lottery preferences on
∆
∆
∗
∆
k
Hk and ∆(Hk ), respectively. We say that - is risk averse if each - is risk averse (in the sense of
Definitions 3 and 5).
For the certainty preferences (but not necessarily the lottery preferences), suppose:
A1-Independence: Hk = T1 × · · · × Tk is an independent partition, with respect to -k , for all k.
A2-Consistency: for any k, the preference order induced by -k+1 on Hk is identical to -k .
Then, as in Proposition 4.1, we have
Proposition 7.1. Assuming A1-A2, there exist Debreu value functions v Ti : Ti → R, i = 1, . . . , n,
P
such that for all k, vk = ki=1 v Ti represents -k .
∗
Let u∗ be the NM utility representing - , and for each k, let uk be the induced utility function
∆
∆
k
on Hk , representing - . Given survival vector s = (s1 , . . . , sn ), the function us defined as
(4)
us (a1 , . . . , an ) =
n
X
sk · uk (a1 , . . . , ak ).
k=1
represents -s .
Assume that each Ti is some real interval Ii . From (4), the marginal rate of substitution between
the t-th and t + 1 time periods, at point (a1 , . . . , an ) is
Pn
∂us
∂uk
k=1 sk ∂at
∂at
(5)
= Pn
∂us
∂uk
∂at+1
k=1 sk ∂at+1
∆
k
Since -k and - agree, uk is a monotone transformation of vk =
Pk
i=1 v
Ti .
Set ûk = uk ◦ (vk )−1 ;
that is, ûk is the real valued function such that uk (a1 , . . . , ak ) = ûk (vk (a1 , . . . , ak )). So, from (5),
the aforementioned marginal rate of substitution at a given (a1 , . . . , an ), is
Pn
dûk ∂vk ·
k=1 sk · dx x=vk (a1 ,...,ak ) ∂at (a1 ,...,ak )
(6)
Pn
dûk ∂vk · ∂at+1 k=1 sk · dx x=vk (a1 ,...,ak )
By definition, vk (a1 , . . . , ak ) =
Pk
i=1 v
Ti (a
i ).
So for k < t,
(a1 ,...,ak )
∂vk
∂at
= 0, and for k ≥ t,
similarly for t + 1. So, (6) is
k=t sk ·
Pn
k=t+1 sk ·
Pn
dûk dx x=v (a ,...,a )
k 1
k
dûk dx x=v (a ,...,a )
k 1
k
19
·
dv Tt
dat
·
dv Tt+1
dat+1
,
∂vk
∂at
=
dv Tt
dat ,
and
which is,

(7)

dv Tt
dat
dv Tt+1
dat+1

  Pn
dûk s
·
k
k=t
dx x=v (a ,...,a )

k
·
k 1
 Pn

dûk s
·
k
k=t+1
dx
x=vk (a1 ,...,ak )
So, the marginal rate of substitution is the product of two components. The first,
dv Tt dv Tt+1
dat / dat+1 ,
is the standard, risk free marginal rate of substitution between the two time periods. This is the
marginal rate of substitution that would be exhibited if there would be no risk of death. The second
component, which is our subject of interest, is the component of the marginal rate of substitution
associated with the risks of death. We call this second component, the death discount rate, denoted
DDRt,t+1
.
s
Rewriting (7), we have,
DDRt,t+1
(a1 , . . . , an ) = 1 + P
s
n
(8)
st ·
dût dx x=vk (a1 ,...,at )
k=t+1 sk
·
dûk dx x=v (a ,...,a )
k 1
k
.
Each of the functions ûk is monotone increasing, so the second addend is non-negative. So,
≥ 1, and it is indeed a discount rate. The magnitude of the discount is determined
DDRt,t+1
s
by several factors, which we now explore.
First note that DDRt,t+1
approaches 1 as st goes to zero. This stands to reason, as st = 0
s
means that there is no probability of death between time periods t and t + 1, so there is no
death associated discounting between the two. On the other hand, DDRt,t+1
approaches infinity
s
Pn
Pn
as k=t+1 sk approaches zero. This again makes sense, as k=t+1 sk = 0 means that there is no
chance to live beyond the t-th time period, so any consumption is better to have at time t.
For a fixed s, the magnitude of DDRt,t+1
(a1 , . . . , an ) is determined by both the risk attitude,
s
as captured by the curvature of the ûk ’s (that is, the curvature of the uk with respect to vk ), and
by the sequence (a1 , . . . , an ). Suppose that u∗ is risk averse, in the sense defined in this paper;
that is, each ûk is concave. So, for a fixed a1 , . . . , at , the denominator of the second addend in (8)
decreases with x. So, DDRt,t+1
(a1 , . . . , an ) increases as at+1 , . . . , an , become more favorable. The
s
explanation for this is that risk averters dislike big risks. As the “future” - at+1 , . . . , an - becomes
brighter, the risk of dying at time t becomes larger. So, a risk averter is more inclined to move some
of the better consumption to the current time period, to reduce risk. The opposite holds for risk
lovers (again, as defined in this paper). For risk lovers the ûk ’s are convex. So, DDRt,t+1
(a1 , . . . , an )
s
decreases as at+1 , . . . , an becomes more favorable. We thus obtain:
Proposition 7.2. Assuming A1, A2, for any t, s (with st 6= 0 and
Pn
k=t+1 sk
> 0), and sequences
(a1 , . . . , at , bt+1 , . . . , bn ) - (a1 , . . . , at , ct+1 , . . . , cn )
∆
∗
• if - is weakly risk averse then
DDRt,t+1
(a1 , . . . , at , bt+1 , . . . , bn ) ≤ DDRt,t+1
(a1 , . . . , at , ct+1 , . . . , cn ),
s
s
20
∆
∗
and the inequality is strict if - is (strictly) risk averse.
∆
∗
• if - is weakly risk loving then
DDRt,t+1
(a1 , . . . , at , bt+1 , . . . , bn ) ≥ DDRt,t+1
(a1 , . . . , at , ct+1 , . . . , bn ),
s
s
∆
∗
and the inequality is strict if - is (strictly) risk loving.
So, risk averters discount more as the future becomes brighter, whereas risk lovers discount less.
8. Related Work
Strength of Preference and Relative Risk Aversion. Dyer and Sarin [11] and Bell and
Raiffa [5] have suggested measuring risk aversion with respect to the strength of preference function,
rather than money. It is out of the scope of this paper to review the strength-of-preference theory,
but generally speaking this theory assumes that not only do decision makers have a well defined
preference order over sure states and lotteries, but also that they have a preference order over
differences between states; that is, the decision maker can state that she prefers the transition
x1 7−→ x2 over the the transition y1 7−→ y2 (where x1 , x2 , y1 , y2 are states). Assuming such
preferences exist (and some additional technical conditions), the theory establishes that there exists
a function f (termed measurable value function [10]) that represents these preferences, in the sense
that f (x2 ) − f (x1 ) > f (y2 ) − f (y1 ) if and only if the transition x1 7−→ x2 is preferred over the
transition y1 7−→ y2 . Given such a function, Dyer and Sarin [11] define the notion of relative risk
aversion 22 as the concavity of the NM utility function u with respect to the measurable value
function f . Bell and Raiffa [5] similarly define the notion of intrinsic risk aversion.
Bell and Raiffa [5] also show how the strength-of-preference function (assuming it exists) can be
deduced and identified with a multi-attribute (Debreu) value function (see also [11, Theorem 1]).
Thus, Theorem 5 establishes that, technically, risk aversion as of Definitions 3 and 5 coincides with
the Dyer and Sarin notion of relative risk aversion, provided that the Debreu value function exists
and relative risk aversion is computed with respect to this function. Conceptually, however, our
approach is totally different from that of [11] and [5]. First, we do not suppose, technically or conceptually, any form of preferences over differences. Rather, we only use the standard preferences on
bundles and lotteries thereof. Second, conceptually [11] and [5] follow the Arrow-Pratt framework,
taking it as given that the “natural value” of a gamble “should be” its expectation. They differ
from Arrow-Pratt only in using a different scale. Our approach is the opposite. Our starting point,
and all core definitions, are scale-free. The numerical representation is then mathematically derived
from these ordinal notions.
Intertemporal Risk Aversion of [30]. An independent work of Traeger [30] is motivated by
some of the same questions that motivate our work; namely - providing a scale-free, axiomatic
22not to be confused with the Arrow-Pratt coefficient of relative risk aversion
21
framework for the theory of risk aversion.23 Traeger’s work is set in the Kreps and Porteus [22]
temporal lottery framework, wherein the timing of risk resolution is important. The core definition
provided in the paper is that of Intertemporal Risk Aversion, which is a variant of correlation
aversion wherein the non-correlated outcome is risk-free (using the notation of Definition 7, the
definition considers cases where (ai , bj ) ∼ (bi , aj )).24 The paper also extends the definition to more
than two factors. For the classic framework considered in our paper (wherein there is indifference
to timing of risk resolution) the relationship between correlation aversion and our notion of risk
aversion (based on lottery sequences) was discussed in Section 6.2. In this regard, our Corollary
6.1, which relates correlation aversion to the concavity of the NM utility with respect to the Debreu
function, is analogous - within the classic framework - of Traeger’s Theorem 2 - for the temporal
lottery framework.
9. Discussion
We presented two scale-free definitions of risk aversion, based entirely on the internal structure
of preferences of the decision maker; independent of money or any other units. The first definition
equates risk aversion with a policy that, in the long run, necessarily leads to an inferior outcome.
In the second definition, the probability of an inferior outcome is greater than the probability of a
superior one. We show that when cast in numerical terms, both these ordinal definitions coincide
with the Arrow-Pratt definition, once the latter is defined with respect to the Debreu value function
associated with the decision maker’s preferences over the sure outcomes. This, we suggest, provides
the missing conceptual justification for the use of the arithmetic mean as the basis for defining risk
aversion, and, at the same time, establishes the appropriate scale to use.
Inter-Commodity and Intra-Commodity Risk Aversion. We should stress that risk-aversion,
as considered in this paper, does not relate only to lotteries involving multiple commodities or time
periods, but also to lotteries within a single commodity/time. It may be seen that, given the multicommodity certainty preferences, inter-commodity lottery preferences determine intra-commodity
lottery preferences, and vice versa. Thus, inter-commodity and intra-commodity risk attitudes
are one and the same. We use the inter-commodity setting as it provides an Archimedean vantage
point from which the risk-attitude can be disentangled from the risk-free preferences. Once defined,
however, it applies to all manifestations of risk. This is highlighted by the quantitative form using
the Debreu function. The multi-commodity setting merely provides us with the appropriate scale
with which to measure risk aversion, both inter and intra-commodity.
Disentangling Risk Aversion from Diminishing Marginal Utility. It is well known that
under the classical definition, risk aversion and diminishing marginal utility are entangled. On a
conceptual level, however, the two notions are distinct. Indeed, disentangling diminishing marginal
23We are grateful to Jean-Christophe Vergnaud for calling our attention to this important manuscript.
24See Lemma A.10 of the appendix, which relates the two definitions.
22
utility from risk aversion is one of the earliest motivations for the non-expected utility literature,
as Yaari [31] writes: “Two reasons have prompted me to look for an alternative to expected
utility theory. The first reason is methodological: In expected utility theory, the agent’s attitude
towards risk and the agent’s attitude towards wealth are forever bonded together. At the level
of fundamental principles, risk aversion and diminishing marginal utility of wealth, which are
synonyms under expected utility theory, are horses of different colors.” Using the concepts of
this work, it is possible disentangle the two within the expected utility framework. In our scheme,
the curvature of the NM utility function with respect to money is decomposed into two components:
the curvature of the Debreu value function with respect to money, and the curvature of the NM
utility function with respect to the Debrue value function. With this decomposition, the former may
naturally be associated with diminishing marginal utility, while the latter - we argue - represents
the risk aversion component. For example, suppose that for a two day setting, the utility of the
decision maker is u(x, y) = ln(x) + ln(y), where x and y are the total consumption levels of the
decision maker, in kilograms, in day one and two, respectively. Then, the Debreu value in each
period is logarithmic in the number of kilograms consumed, and the NM utility function is linear
in the Debreu value. Under our interpretation, the logarithmic Debreu value functions correspond
to diminishing marginal utility, with respect to kilograms of consumption, while the linearity of the
NM utility with respect to the value function corresponds to risk neutrality.
This disentangling may have implications for the language we use to describe (and hence comprehend) key economic behavior. Consider, for example, an aging, retired individual, comfortably
living off her savings, who is offered a 50-50 gamble between tripling her savings and losing them
all. Common sense has it that rejecting the gamble is a perfectly rational choice for all but the
most risk loving individuals. Classical economic language, however, would have to deem such a
rejection “risk aversion”. The framework of this paper provides us with a more refined language,
that allows us to give a more convincing interpretation of the behavior. When measured in terms
of the Debreu value function, which reflects the relative benefits provided by each of the possible
outcomes, the 50-50 gamble may well be actuarially inferior (in Debreu value units) to the existing
state. So, by our definition, the gamble would be rejected by risk neutral (or even some risk loving)
individuals.
Repeated Games. The theory of (infinitely) repeated games assumes that the utility in the
repeated game is additive, in one way or another, in the utilities of the individual stage games [3,
27, 15]. By our definition, this corresponds to an assumption of risk neutrality. Accordingly, in
a sequel work [4], we consider a theory of repeated games without this additivity assumption.
We show that when players are risk averse - according to our ordinal definitions - new equilibria
emerge, unaccounted for by the classic theory. In particular, in two person matching pennies games,
if one player is risk averse and the other risk loving, then the resulting pure strategy equilibria are
23
necessarily biased in favor of the risk loving player. Such biased equilibria are not possible in the
classic theory.
References
[1] N. Alon and J.H. Spencer. The Probabilistic Method. Wiley Series in Discrete Mathematics and Optimization.
Wiley, 2004.
[2] Kenneth J. Arrow. Essays in the theory of risk-bearing. Markham Publishing Company, 1971.
[3] Robert J. Aumann. Survey of Repeated Games. Essays in Game Theory and Mathematical Economics in Honor
of Oskar Morgenstern, 1981.
[4] Yonatan
Aumann
and
Erel
Segal-Halevi.
Repeated
games
with
risk
averse
players.
see
https://sites.google.com/site/aumannbiu/home/publication-by-topic-1, 2016.
[5] David E Bell and Howard Raiffa. Marginal value and intrinsic risk aversion. In David E. Bell, Howard Raiffa, and
Amos Tversky, editors, Decision Making Descriptive, Normative and Prescriptive Interactions, pages 384–397.
Cambridge University Press, 1988.
[6] G Debreu. Representation of a preference ordering by a numerical function. In R.M. Thrall, R. L. Davis, and
C.H. Coombs, editors, Decision Process, pages 159–165. John Wiley, New York, 1954.
[7] Gerard Debreu. Topological Methods in Cardinal Utility Theory. Technical Report 76, Cowles Foundation for
Research in Economics, Yale University, 1959.
[8] Gerard Debreu. Topological methods in cardinal utility theory. In Mathematical Economics: Twenty Papers of
Gerard Debreu, chapter 9. Cambridge University Press, 1986.
[9] George T. Duncan. A matrix measure of multivariate local risk aversion. Econometrica, 45(4):895–903, 1977.
[10] James S. Dyer and Rakesh K. Sarin. Measurable multiattribute value functions. Operations Research, 27(4):810–
822, 1979.
[11] James S. Dyer and Rakesh K. Sarin. Relative risk aversion. Management Science, 28(8):875–886, 1982.
[12] Larry G. Epstein and Stephen M. Tanny. Increasing Generalized Correlation: A Definition and Some Economic
Consequences. Canadian Journal of Economics, 13(1):16–34, February 1980.
[13] Peter C. Fishburn. Utility Theory for Decision Making. Wiley, New York, 1970.
[14] Shane Frederick, George Loewenstein, and Ted O’Donoghue. Time discounting and time preference: A critical
review. Journal of Economic Literature, 40(2):351–401, 2002.
[15] Drew Fudenberg and Eric Maskin. The folk theorem in repeated games with discounting or with incomplete
information. Econometrica, 54(3):533–554, 1986.
[16] William M. Gorman. The structure of utility functions. The Review of Economic Studies, 35(4):367 – 390,
October 1968.
[17] Charles M. Grinstead and Laurie J. Snell. Introduction to Probability. American Mathematical Society, 2006.
[18] Edi Karni. On multivariate risk aversion. Econometrica, 47(6):1391–1401, 1979.
[19] Edi Karni. On the correspondence between multivariate risk aversion and risk aversion with state-dependent
preferences. Journal of Economic Theory, 30(2):230 – 242, 1983.
[20] R.L. Keeney and H. Raiffa. Decisions with Multiple Objectives: Preferences and Value Trade-Offs. Wiley series
in probability and mathematical statistics. Applied probability and statistics. Cambridge University Press, 1993.
[21] Richard E. Kihlstrom and Leonard J. Mirman. Risk aversion with many commodities. Journal of Economic
Theory, 8(3):361 – 388, 1974.
[22] David M. Kreps and Evan L. Porteus. Temporal resolution of uncertainty and dynamic choice theory. Econometrica, 46(1):185–200, 1978.
[23] Dilip B. Madan. Measures of risk aversion with many commodities. Economics Letters, 11(12):93 – 100, 1983.
24
[24] Jacob Paroush. Risk premium with many commodities. Journal of Economic Theory, 11(2):283 – 286, 1975.
[25] John W. Pratt. Risk aversion in the small and in the large. Econometrica, 32(1/2):122 – 136, 1964.
[26] Scott F. Richard. Multivariate risk aversion, utility independence and separable utility functions. Management
Science, 22(1):12–21, 1975.
[27] Ariel Rubinstein. Equilibrium in supergames with the overtaking criterion. Journal of Economic Theory, 21(1):1
– 9, 1979.
[28] Marco Scarsini. Dominance conditions for multivariate utility functions. Management Science, 34(4):454–460,
1988.
[29] Joseph E Stiglitz. Behavior towards risk with many commodities. Econometrica, pages 660–667, 1969.
[30] Christian P. Traeger. Capturing intrinsic risk attitude. see https://are.berkeley.edu/ traeger/research.html, 2014.
[31] Menahem E Yaari. The dual theory of choice under risk. Econometrica, 55(1):95–115, 1987.
25
Appendix A. Proofs
For readability, all theorems and propositions are restated in this appendix.
Proofs for Section 4. The proofs in this section follow certain conventions that simplify the
presentation:
• x, y, are real number, α, β, δ - with or without indices or primes - are positive reals.
• ai , bi , and ci are points in Ti .
• Li is a lottery over Ti and `i is the realization of Li .
• Variables not explicitly quantified are taken to be universally quantified, it being understood
that the expressions in which they appear are defined.
Proposition 4.1. Assuming A1-A2, there exist Debreu value functions v Ti : Ti → R, i = 1, 2, . . .,
P
such that for all n, vn = ni=1 v Ti represents -n .
Proof. Consider Hn for n ≥ 3. By assumption, any product of the Ti ’s is independent. Hence,
P
there exist value functions vnT1 , . . . , vnTn , with ni=1 vnTi representing -n . We now show that there is
actually a single function v Ti , for each i, that works for all the Hn ’s.
For i = 1, 2, 3, set v Ti := v3Ti . Suppose v Ti has been defined for all i < n; we inductively define
P
Ti represents -n−1 . By independence of Hn−1 in -n ,
v Tn . By the induction hypothesis, n−1
i=1 v
Pn−1 Ti
the function i=1 vn also represents -n−1 . So, by uniqueness of the value functions, there exist
constants β > 0, ξi , such that v Ti = βvnTi + ξi , for i = 1, . . . , n − 1. So, setting v Tn = βvnTn , we have
that
n
X
i=1
which represents
-n ,
v Ti =
n−1
X
n
X
i=1
i=1
(βvnTi + ξi ) + βvnTn = β
as required.
vni + constant,
From now on we assume w.l.o.g. that the factors are already represented in units of the respective
value functions; that is, v Ti (ai ) = ai for all i and ai ∈ Ti . Then un , the NM utility function
∆
n
representing - , is actually only a function of the sum of its arguments; i.e. un (a1 , . . . , an ) =
un (b1 , . . . , bn ) whenever a1 + · · · + an = b1 + · · · + bn . Recall that ûn is the function such that
un (a1 , . . . , an ) = ûn (a1 + · · · + an ). Note that ûn = un ◦ (vn )−1 . Thus, un is concave with respect
to vn if and only if ûn is concave.
Let (φ2 , φ3 , . . .) be the presumed future. By assumption there exists s > 0 with φi ± s ∈ Ti , for
all i.
26
Proofs for Section 4.2.1.
Lemma A.1. Let X1 , X2 , . . . be an infinite sequence of independent uniformly bounded random
P
variables,25 with E(Xi ) = 0 for all i. Set Sn = ni=1 Xi . Then
Pr[Sn ≥ 0 infinitely often] > 0.
(9)
Proof. Denote vi = Var(Xi ), and Vn =
Pn
i=1 vi .
The Xi ’s are independent, so Vn = Var(Sn ). Now,
either Vn → ∞ or not. We consider each case separately.
If Vn → ∞, applying the central limit theorem for uniformly bounded random variables (e.g.
[17], Theorem 9.5) we obtain that
Sn
1
lim Pr[ √ ≥ 0] = √
n→∞
Vn
2π
Z
0
∞
e−x
2 /2
1
dx = .
2
In particular, Pr[Sn ≥ 0 infinitely often] > 0.
Next, suppose that Vn does not go to infinity. Each vi is non-negative. Hence, the Vi ’s form
a monotonically non-decreasing and bounded sequence, and hence converge. Thus, for any δ > 0
P
there exists an Nδ with ∞
i=Nδ vi < δ. If all the Xi are identically 0 there is nothing to prove.
Otherwise, w.l.o.g. X1 is not identically 0. Thus there exists an x > 0 with Pr(X1 ≥ x) = qx > 0.
Choose δ < x2 . Then by the Chebyshev inequality, for all n > Nδ ,
P
n
X
Var( ni=Nδ Xi )
δ
Pr[
Xi < −x] <
≤ 2 < 1.
2
x
x
i=Nδ
Clearly, there is some probability p+ for which Pr[maxn=2,...,Nδ {Sn − X1 } ≥ 0] ≥ p+ . So for all n,
Pr[Sn ≥ 0] ≥ Pr[X1 ≥ x] · Pr[ max (Sn − X1 ) ≥ 0] · Pr[
n=2,...,Nδ
n
X
Xi ≥ −x] ≥
i=Nδ
qx · p+ · (1 −
δ
) > 0.
x2
So, again, in particular, Pr[Sn ≥ 0 infinitely often] > 0.
∆
Theorem 1. Assuming A1-A3, - is weakly risk averse if and only if un is concave with respect to
vn , for all n.
∆
Proof. - is weakly risk averse ⇒ all ûn are concave: Contrariwise, suppose that ûk is not concave,
for some k. So, ûk is not concave on some interval of size ≤ s. So, there exist x, ≤ s and 0 < β < with
ûk (x + β) =
1
(ûk (x − ) + ûk (x + )) .
2
25that is, the support of all the random variables is included in a real interval [b, b], with b, b finite.
27
So, by definition of the presumed future also for any m > k,
ûm (x + φk+1 + · · · + φm + β) =
(10)
1
(ûm (x + φk+1 + · · · + φm − ) + ûm (x + φk+1 + · · · + φm + )) .
2
We construct a recurring lottery sequence L that is ultimately inferior to its repeated certainty
=
equivalent. By definition, x = b1 + · · · + bk , for some (b1 , . . . , bk ) ∈ Hk . The sequence L =
(L1 , L2 , . . .) is defined as follows:
• for i = 1, . . . , k: Li = bi ;
• for j odd: Lk+j = h(φk+j − ), (φk+j + )i;
• for j even: Lk+j = φk+j − β.
We now inductively determine the repeated certainty equivalent of L = (L1 , L2 , . . .), which we
denote (c1 , c2 , . . .). For i = 1, . . . , k, ci = bi . Consider the lottery at time k + 1. The (degenerate)
lotteries in the previous times have brought us to the point x = b1 + · · · + bk , and the lottery at time
k + 1 is Lk+1 = h(φk+1 − ), (φk+1 + )i. So, by (10), its certainty equivalent is β above the average;
that is, ck+1 = φk+1 + β. The next lottery, at time k + 2, is the degenerate lottery Lk+2 = φk+2 − β,
with certainty equivalent ck+2 = φk+2 − β. Hence, having chosen the certainty equivalent at all
times, after time k + 2 we are at point x + ck+1 + ck+2 = x + φk+1 + φk+2 . So again (10) applies
to the lottery at time k + 3, which is Lk+3 = h(φk+1 − ), (φk+1 + )i. So ck+3 = φk+3 + β. This
process repeats again and again. So, ck+j = φk+j + β for j odd and ck+j = φk+j − β for j even.
Now, assume w.l.o.g. that E(Li ) = 0 for all i. Then, for j odd, Lk+j is a ± lottery and ck+j = β.
For all other i’s, `i is a degenerate lottery and ci = 0. Let `i be the realization of Li . Then,
n
X
n−k
β>
`i from some n on] = 1,
Pr[(c1 , . . . , cn ) (`1 , . . . , `n ) from some n on] = Pr[
2
i=1
where the last equality is by the law of large numbers. So, (c1 , c2 , . . .) is ultimately superior to
(L1 , L2 , . . .) .
∆
All ûn are concave ⇒ - is weakly risk averse: Consider a lottery sequence L = (L1 , L2 , . . .). W.l.o.g.
E(Li ) = 0 for all i. Denote by c = (c1 , c2 , . . .) the repeated certainty equivalent of L. Since all ûn ’s
are concave, also all the functions un are concave in each of their arguments. So, ci ≤ 0 for all i.
So, for any n,
n
X
Pr[(`1 , . . . , `n ) ≺n (c1 , . . . , cn )] ≤ Pr[
`i < 0].
i=1
So,
n
X
Pr[(`1 , . . . , `n ) ≺n (c1 , . . . , cn ) from some n on] ≤ (1 − Pr[
`i ≥ 0 infinitely often]) < 1.
i=1
where the last inequality is by Lemma A.1. So, (c1 , c2 , . . .) is not ultimately superior to (L1 , L2 , . . .).
28
Proofs for Section 4.2.2. Theorem 2 follows directly from Theorem 3 (a) and Proposition 4.2. So,
proceed to prove this theorem and proposition.
For α > 0 let caraα be the function caraα (x) = −e−αx . It is well known that Acaraα (x) = α for
all x. For a real-valued lottery L and NM utility function f let risk-premf (x, L) be the risk-premium
according to f of the lottery L applied at wealth x.
Lemma A.2. RPû () = Ω(2 ) as → 0 if and only if there exists an α such that
risk-premûn (x, L) ≥ risk-premcaraα (x, L)
(11)
for all n, x and L.
Proof. Suppose that RPû () = Ω(2 ). Then there exists 0 and α > 0 with
risk-premûn (x, ±) ≥ α2
(12)
for all n, x and ≤ 0 .
For the function caraα , using the Taylor expansion of e around 0,
−e−α· − eα·
caraα () + caraα (−)
=
2
2
1
α2 2
α2 2
= − (1 − α +
+ 1 + α +
+ O(3 ))
2
2
2
α2 2
= −(1 +
+ O(3 ))
2
(13)
So, for sufficiently small
2α2 2
caraα () + caraα (−)
2
> −(1 +
) > −e−α(−2α /3) = caraα (−2α2 /3).
2
3
So,
2
risk-premcaraα (0, ±) < α2 .
3
For the function caraα the risk premium is independent of x, and hence,
2
risk-premcaraα (x, ±) < α2 ,
3
(14)
for all x.
So, combining (12) and (14)
(15)
risk-premûn (x, ±) > risk-premcaraα (x, ±),
for sufficiently small. But then, by Pratt [25], (15) holds for any lottery L.
Conversely, if risk-premûn (x, ±) ≥ risk-premcaraα (x, ±) then by (13)
risk-premûn (x, ±) ≥
so RPû () = Ω(2 ).
α2
+ O(3 ),
2
29
The following simple lemma establishes that any risk premium exhibited by ûk , for some k, is
(re)exhibited by all subsequent ûm , for m > k.
Lemma A.3. For any m > k,
risk-premûm (x + φk+1 + . . . , φm , ±) = risk-premûk (x, ±).
Proof. Set β = risk-premûk (x, ±). By definition
1
ûk (x − β) = (ûk (x − ) + ûk (x + )).
2
Let a+ , a− , a−β ∈ Hk be such that vk (a+ ) = x + , vk (a− ) = x − , and vk (a−β ) = x − β. So,
(a−β )∼k ha− , a+ i .
∆
∆
k
∆
m
By assumption, - and -
agree on the preferences over ∆(Hk ) when fixing the state in Tk+1 ×
· · · × Tm to the presumed future (φk+1 , . . . , φm ). So,
(a−β , φk+1 , . . . , φm )∼m h(a− , φk+1 , . . . , φm ), (a+ , φk+1 , . . . , φm )i .
∆
Hence,
ûm (x − β + φk+1 + · · · + φm ) =
1
(ûm (x − + φk+1 + · · · + φm ) + ûm (x + + φk+1 + · · · + φm )).
2
The following lemma establishes that if ûk exhibits some risk premium, at some point x, then
not only is this risk premium re-exhibited by all subsequent utility functions ûm , but also that it
is “reachable” from any state y, of any period K.
Lemma A.4. For any k, K, x, y, with x in the domain of ûk and y in the domain of ûK , there
exist m ≥ max{k, K} and bK+1 , . . . , bm , bi ∈ Ti , with
risk-premûm (y + bK+1 + · · · + bm , ±) = risk-premûk (x, ±).
Proof. Set K 0 = max{k, K}. If K < k then for i = K + 1, . . . , k, let bi be any point in Ti and set
y 0 = y + bK+1 + · · · + bk . Otherwise (K ≥ k) set y 0 = y.
Let δ = y 0 − x, j = dδ/se, and m = K 0 + j. For i = K 0 + 1, . . . , m, set bi = φi + δ/j. Then,
m > max{k, K}, and x + φk+1 + · · · + φm = y + bK+1 + · · · + bm . The result then follows from
Lemma A.3 .
30
The following Theorem is from Alon and Spencer [1].
Theorem A.5 ([1], Theorem A.1.19). For every C > 0 and γ > 0 there exists a δ > 0 so that the
following holds: Let Xi , 1 ≤ i ≤ n, n arbitrary, be independent random variables with E[Xi ] = 0,
P
P
|Xi | ≤ C, and Var(Xi ) = σi2 . Set Sn = ni=1 Xi and Σ2n = ni=1 σi2 , so that Var(Sn ) = Σ2n . Then,
for 0 < a ≤ δ · Σn
Pr[Sn > aΣn ] < e−
(16)
a2
(1−γ)
2
.
Lemma A.6. Let X1 , X2 , . . ., be independent random variables with E[Xi ] = 0, |Xi | ≤ C, and
Var(Xi ) = σi2 . Set Sn , σi2 and Σ2n as above. If Σn → ∞, then for any α > 0
Pr[Sn > αΣ2n infinitely often] = 0.
Proof. Denote by n(i) the least n such that Σ2n ≥ i. Since Σn → ∞, for any i there exists an n(i).
Since |Xi | ≤ C, i ≤ Σ2n(i) ≤ i + C 2 .
Denote by Ak the event that there exists i, n(k) < i ≤ n(k + 1), for which Si > αΣ2i . We bound
Pr[Ak ].
Set γ = 0.5, and let δ be that provided by Theorem A.5. Set β = min{δ, α/2}. Then, considering
n(k), by Theorem A.5, setting a = βΣn(k)
−
Pr[Sn(k) > βΣn(k) · Σn(k) ] < e
(17)
β 2 Σ2
n(k)
(1−γ)
2
≤ e−
β2 k
4
Now consider the random variables Xi for i = n(k) + 1, . . . , n(k + 1). Set Dj =
Pj
i=n(k)+1 Xi .
Then,
Var((Dn(k+1) ) = Σ2n(k+1) − Σ2n(k) ≤ (k + 1 + C 2 ) − k = 1 + C 2 .
So, by the Kolmogorov inequality
(18)
Pr[
{Dj } ≥ βΣ2n(k) ] ≤
max
n(k)<j≤n(k+1)
Var(Dn(k+1) )
1 + C2
.
≤
β 2 k2
(βΣ2n(k) )2
Combining (17)-(18), for any k
Pr[Ak ] = Pr[∃i, n(k) < i ≤ n(k + 1), Si > αΣ2i ]
≤ Pr[Sn(k) ≥ βΣ2n(k) ] + P r[
max
{Dj } ≥ βΣ2n(k) ]
n(k)<j≤n(k+1)
≤ e−
So,
P∞
k=1 Pr[Ak ]
β2 k
4
+
1 + C2
.
β 2 k2
< ∞. So, by the Borel Cantelli lemma
Pr[Ak occurs infinitely often] = 0.
For any k there is only a finite number of i’s with n(k) < i ≤ n(k + 1). So, Si > αΣ2i infinitely
often only if Ak occurs infinitely often, and the result follows.
31
Theorem 3. Assuming A1-A3,
∆
(a) If RPû () = Ω(2 ) as → 0 then - is risk averse.26
∆
(b) If RPû () = O(2+β ) as → 0, for some β > 0, then - is not risk averse.
Proof. (a): Suppose that RPû () = Ω(2 ) as → 0.
Let L = (L1 , L2 , . . .) be a bounded, non-vanishing lottery sequence. W.l.o.g. E(Li ) = 0 for all i.
P
P
Set σi2 = Var(Li ), Sn = ni=1 Li and Σ2n = Var(Sn ) = ni=1 σi2 . Since L is non-vanishing Σn → ∞.
Since L is bounded, there exists a C such that |Li | ≤ C for all i.
By the Taylor expansion,
caraα () = −e−α = −1 + α −
(19)
α2 2
+ O(α3 3 ).
2
Let α1 be such that the O(α3 3 ) term in (19) is small for || ≤ C; that is,
caraα1 () ≈ −1 + α1 −
(20)
α12 2
,
2
for || ≤ C.
Let (c1 , c2 , . . .) be the repeated certainty equivalent of L. Let α0 be that provided by Lemma
A.2. Then, for any α < α0
ci < −risk-premcaraα (0, Li ).
Set α = min{α0 , α1 }. Suppose that Li gets values xi1 , . . . , xim with probabilities p1 , . . . , pm , respectively. Then,


ci < −risk-premcaraα (0, Li ) =cara−1
α
m
X

caraα (xij )pj 
j=1

m
2
i
2
X
α (xj )
 (−1 + αxij −
)pj 
≈cara−1
α
2
j=1


m
m
m
2
i
2
X
X
X
α (xj )
 (−1)pj + α
=cara−1
pj 
xij pj −
α
2
j=1
j=1
j=1
α2 σi2
=cara−1
−1 + 0 −
α
2
2
≈cara−1
−e−α(−ασi /2) < −ασi2 .
α

So,
(21)
2
−α · (Σn ) < Sn ⇒
" n
X
#
ci < Sn ⇒ [(c1 , . . . , cn ) ≺ (`1 , . . . , `n )] .
i=1
26recall that g(y) = Ω(h(y)) as y → 0 if there exists a constant M and y such that g(y) > M · h(y) for all y < y .
0
0
32
So, it is sufficient to prove that
Pr[Sn > −α(Σn )2 from some n on] = 1.
which is equivalent to saying that
Pr[Sn < −α(Σn )2 infinitely often] = 0,
(22)
which is provided by Lemma A.6 (by symmetry).
(b): Suppose that RPû () = O(2+β ) as → 0, with β > 0. So, there exists α and 0 such that for
any < 0 , there exists an i and x with
risk-premûi (x, ±) ≤ α · 2+β .
(23)
Set 1 = min{20 , s2 }. For j = 1, 2, . . ., set aj as follows:
( √
2
1
if j = 3k for some integral k
aj =
√ 1
1 √j otherwise
So, by (23), for any j there exists ij and xj with
risk-premûi (xj , ±aj ) ≤ α · a2+β
.
j
(24)
j
We construct a bounded, non-vanishing lottery sequence L = (L1 , L2 , . . .) that is not ultimately
superior to its repeated certainty equivalent, which we denote by (c1 , c2 , . . .). The construction of
L is inductive, wherein the lotteries are defined in chunks. For each j, the j-th chunk consists of
a sequence of degenerate lotteries, followed by a single ±aj lottery, with which the chunk ends.
We denote by n(j) the index of the last lottery in the j-th chunk. The chunks are constructed as
follows. Set n(0) = 0. Suppose L1 , . . . , Ln(j−1) have been defined, and that their repeated certainty
equivalent is c1 , . . . , cn(j−1) . Let ij , xj be as in (24). Set yn(j−1) = c1 + · · · + cn(j−1) . By Lemma
A.4 and (24), there exists m > max{n(j − 1), ij } and bn(j−1)+1 , . . . , bm , with
risk-premûm (yn(j−1) + bn(j−1)+1 + · · · + bm , ±aj ) ≤ αa2+β
.
j
Hence also (moving to m + 1)27,
risk-premûm+1 (yn(j−1) + bn(j−1)+1 + · · · + bm + φm+1 , ±aj ) ≤ αa2+β
,
j
which means that
ûm+1 (yn(j−1) + bn(j−1)+1 + · · · + bm + φm+1 − (αa2+β
)) ≤
j
1
≤ (ûm+1 (yn(j−1) + bn(j−1)+1 + · · · + bm + φm+1 − aj ) + ûm+1 (yn(j−1) + bn(j−1)+1 + · · · + bm + φm+1 + aj )).
2
27We move to m + 1 with φ
m+1 to guarantee sufficient distance from the boundaries to allow a ±aj lottery.
33
Accordingly, set Li = bi , for i = n(j − 1) + 1, . . . , m and Lm+1 = h(φm+1 − aj ), (φm+1 + aj )i. By
construction, ci = bi for i = n(j − 1) + 1, . . . , m, and
cm+1 ≥ φm+1 − αa2+β
.
j
(25)
Denote n(j) = m + 1; that is, n(j) is the index of the ±aj lottery.
We now show that (c1 , c2 , . . .), is not ultimately inferior to (L1 , L2 , . . .). W.l.o.g. E(Li ) = 0 for
all i. So, we have that Li = h(−σi ), (σi )i with
 √


 1
√ 1
σi =
1 √j


 0
2
if i = n(j) with j = 3k for some integral k
if i = n(j) for other j’s
otherwise
and

1+β/2


 −α(1 )
ci ≥
−α(1 )1+β/2 ·


 0
Let Sn =
Pn
i=1 Li .
So, Var(Sn ) =
2
if i = n(j) with j = 3k for some integral k
1
j 1+β/2
if i = n(j) for other j’s
otherwise
Pn
2
i=1 σi .
2
So, for n = n(3k ),
k2
Var(Sn(3k2 ) ) ≥
3
X
1
j=1
j
k2
>
e
X
1
j=1
j
> 1 · k 2 .
On the other hand,
2

X
ci ≥ −α(1 )1+β/2 
n(3k )
i=1
for D =
P∞
1
j=1 j 1+β/2
k2
3
X
j=1

1
j 1+β/2
< ∞.
34
+ k  > −α(1 )1+β/2 (D + k) ,
Set γ = α(1 )1+β/2 . Then, for k sufficiently large

2
n(3k )

Pr[(`1 , . . . , `n(3k2 ) ) - (c1 , . . . , cn(3k2 ) )] = Pr Sn(3k2 ) ≤
X


ci  ≥
i=1
h
i
≥ Pr Sn(3k2 ) ≤ −γ (D + k) =
"
#
Sn(3k2 )
(D + k)
≤ −γ
≥
= Pr
Var(Sn(3k2 ) )1/2
Var(Sn(3k2 ) )1/2
"
#
Sn(3k2 )
(D + k)
≥ Pr
≤ −γ √
≥
1 · k
Var(Sn(3k2 ) )1/2
"
#
Sn(3k2 )
2
≥ Pr
≤ −γ · √
≈
1
Var(Sn(ek2 ) )1/2
1
≈√
2π
Z
−1/2
−2γ1
e−x
2 /2
dx = p > 0,
−∞
for some constant p. In particular, (`1 , . . . , `n(3k2 ) ) - (c1 , . . . , cn(3k2 ) ) for infinitely many k’s, with
probability 1.
Proposition 4.2. RPû () = Ω(2 ) as → 0, if and only if Aûn (x) is bounded away from 0,
uniformly for all n and x (assuming ûn is twice differentiable for all n).
Proof. Follows directly from Lemma A.2 and the fact that Acaraα (x) = α for all x.
Theorem 4. Assuming A1-A3, for vn , un , and û as in Theorems 1 and 3
∆
(a) Weak risk loving: - is weakly risk loving if and only if un is convex with respect to vn for
all n.
(b) Risk loving
∆
• If (−RPû ()) = Ω(2 ) as → 0 then - is risk loving.
∆
• If (−RPû ()) = O(2+β ) as → 0 (for some β > 0) then - is not risk loving.
∆
(c) Risk Neutral: - is risk neutral if and only if un is a linear transformation of vn for all n.
Proof. The proofs of (a) and (b) are analogous to those of Theorems 3 and 1. (c) follows from
combining Theorems 3 and 4.
Proofs for Section 5. The proof of Proposition 5.1 is easier with the aid of Theorem 5, so we
start with proving the theorem and then come back to proving the proposition.
Throughout, the following notation is used:
• v denotes a Debreu value function on S , and v Ti a Debreu value function on the factor Ti .
• u denotes an NM utility function on S . An NM utility for S necessarily exists since the
NM axioms are assumed to hold, and we consider only lotteries with finite support (see
∆
Fishburn [13, Theorem 8.2]). Furthermore, since - is continuous, so is u.
35
∆
Theorem 5. For -, -, with NM utility u and Debreu value function v, then u is concave with
∆
respect to v if and only if - is weakly-risk-averse by Definition 3, and strictly-concave with respect
∆
to v if and only if - is strongly-risk-averse by Definition 5.
Proof. Set û = u ◦ v −1 .
Concavity ⇒ Weak Risk Aversion. Suppose û is concave. Consider a fair lottery sequence L =
(L1 , . . . , Ln ). Then,
û(v(c(L))) = E`∼L [û(v(`))] ≤ û(E`∼L [v(`)]),
where the inequality is by concavity of û. So, since û is monotone
v(c(L)) ≤ E`∼L [v(`)].
Since all the Li ’s are fair, the distribution of v(`) is symmetric around E(v(`)). So,
Pr[E[v(`)] ≤ v(`)] = Pr[E[v(`)] ≥ v(`)]
So,
Pr[c(L) - `] = Pr[v(c(L)) ≤ v(`)]
≥ Pr[E[v(`)] ≤ v(`)]
= Pr[E[v(`)] ≥ v(`)]
≥ Pr[v(c(L)) ≥ v(`)] ≥ Pr[c(L) % `].
Strict Concavity ⇒ Strong Risk Aversion. Consider a non-degenerate repeated lottery sequence
+
L = (L1 , L2 , d) (w.l.o.g. we may assume i = 1, j = 2). Suppose that L1 is the fair lottery a−
1 , a1
− +
+
(the fair lottery between a−
1 and a1 ), and similarly L2 = a2 , a2 .
Lotteries L1 and L2 are of the same magnitude. That is,
+
+ −
(a−
1 , a2 ) ∼ (a1 , a2 ).
So
(26)
T2 +
T1 +
T2 −
v T1 (a−
1 ) + v (a2 ) = v (a1 ) + v (a2 ).
The four possible, equi-probability realizations of L are:
−
(1) ` = (a−
1 , a2 , d)
+
(2) ` = (a−
1 , a2 , d)
−
(3) ` = (a+
1 , a2 , d)
+
(4) ` = (a+
1 , a2 , d)
36
So,
n
T1 +
T2 −
T2 +
X
v T1 (a−
+
1 ) + v (a1 ) + v (a2 ) + v (a2 )
E`∼L [(v(`)] =
+
v Tdi (di ) = v(a−
1 , a2 , d)
2
i=3
where the second equality is by (26).
Suppose û is strictly concave. Then
+
û(v(c(L))) = E`∼L [û(v(`))] <û(E`∼L [(v(`)]) = û(v(a−
1 , a2 , d)).
So,
+
c(L) ≺ (a−
1 , a2 , d).
−
So, of the four possible realization of L only (a−
1 , a2 , d) is - c(L). So,
Pr[c(L) - `] =0.75
Pr[c(L) % `] =0.25,
as necessary.
Weak Risk Aversion ⇒ Concavity: Conversely, suppose that û is not concave. Then, since it
is continuous, it is strictly convex on some interval (x, x). Set z =
x+x
2
and 1 = z − x. By
definition, there exists some (a1 , a2 , d) with z = v(a1 , a2 , d). Since z is internal in (x, x), we may
assume, w.l.o.g. that a1 , a2 are not extreme in T1 , T2 , respectively (that is, there exists a0 , a00 , with
−
a0 ≺ a1 ≺ a00 , and similarly for a2 ). Choose < 1 sufficiently small so that there exists a−
1 , a2 ,
Tai − , and a+ , a+ , with v Ti (a+ ) = v Tai + , i = 1, 2. Set L = a− , a+ , for = 1, 2.
with v Ti (a−
i
2
1
i
i
i
i )=v
Set L = (L1 , L2 , d). Then, L is a repeated lottery, and û is strictly convex on the domain of its
realizations. So, by the reasoning above (for “strict-concavity ⇒ strict-risk-aversion”),
Pr[c(L) - `] < Pr[c(L) % `].
So, u is not weakly risk averse, in contradiction.
Strong Risk Aversion ⇒ Strict Concavity: Suppose that û is not strictly concave. Then, it is
convex on some interval. So, as above, we can construct a repeated lottery on this interval. For
this lottery, by the above reasoning (for “concavity ⇒ weak-risk-aversion”),
Pr[c(L) - `] ≤ Pr[c(L) % `].
∆
So, - is not strongly risk averse.
We now return to proving Proposition 5.1.
Lemma A.7. If there exist two non-identical independent partitions S = A × B and S = C × D ,
then there exist value functions v A , v B , v C , and v D (for A, B , C , D ), such that
(1) v A + v B and v C + v D both represent -,
(2) v A + v B = v C + v D ,
37
(3) if v̂ A , v̂ B are value functions for A, B , and v̂ C , v̂ D , are value functions for C , D , then v̂ A + v̂ B
is a positive affine transformation of v̂ C + v̂ D .
Q
Proof. Each factor T = Ti is a product of some set of commodity spaces, that is T = j∈T Ci , for
Q
Q
some index set T . For factors T = j∈T Cj and R = j∈R Cj , by a slight abuse of notation, we
Q
Q
write T ∩ R for j∈T ∩R Cj , T − R for j∈T −R Cj , and T ⊆ R if T ⊆ R. We say that T and R
overlap if T ∩ R 6= ∅ and neither is contained in the other; the factor T is non-degenerate if T 6= ∅.
Gorman [16, Theorem 1] proves that if two independent factors E and F overlap then E ∪ F , E ∩
F , E − F , F − E , and E 4F = (E − F ) ∪ (F − E ) are all independent.
Set W = A ∩ C , X = A ∩ D , Y = B ∩ C , and Z = B ∩ D . Then, by Gorman’s theorem,
W , X , Y , Z are independent, as is any product thereof. Since the partitions are not identical, at
least three out of W , X , Y , Z are non-degenerate. So, S = W × X × Y × Z is an independent
partition with at least 3 factors. So, by Debreu [7], there are value functions v W , v X , v Y , and v Z ,
with v W + v X + v Y + v Z representing -. So, the pair of functions v A = v W + v X and v B = v Y + v Z
are value functions for the independent partition S = A × B . Similarly, the functions v C = v W +v Y ,
and v D = v X + v Z are value functions for the independent partition S = C × D , proving (1) and
(2). Finally, (3) follows from (2) by the uniqueness of value functions.
∆
Proposition 5.1. If - is weakly-risk-averse by Definition 3 with respect to some independent
partition S = T1 × · · · × Tn , then it is also weakly-risk-averse with respect to any independent
partition. Similarly for (strong)-risk-aversion of Definition 5.
Proof. If there is only one independent partition then there is nothing to prove. Otherwise, by
Lemma A.7 there exists a Debreu value function representing -. Note that aggregate value function
is identical for the two partitions.
∆
Suppose that - is risk averse with respect to one partition. Then, by Theorem 5, u is concave
∆
with respect to v. So, again, by the same theorem, - is risk averse with respect to any other
partition. Similarly for weak risk aversion, risk loving and weak risk loving, and risk neutrality. ∆
Theorem 6. For -, -, with NM utility u and Debreu value function v, then u is convex with
∆
respect to v if and only if - is weakly-risk-loving by Definition 6, and strictly-convex with respect
∆
to v if and only if - is strongly-risk-averse by Definition 6.
∆
u is a linear transformation of v if and only if - is risk-neutral by Definition 6.
Proof. The proof for risk-loving is analogues to to that of risk aversion. Risk neutrality then follows
from the weak versions of risk-aversion and risk-loving.
Proofs for Section 6.
Proposition 6.1. In the multi-commodity setting (with S = T1 ×· · ·× Tn an independent partition),
the NM utility function u is DARA with respect to v if and only if for any commodity i, lottery Li
38
over Ti , state ci ∈ Ti , and d−i , d0−i ∈ Ω−{i} , if d−i ≺ d0−i then
(ci , d−i )-(Li , d−i ) ⇒ (ci , d0−i )≺(Li , d0−i ).
∆
∆
Similarly, u is CARA with respect to v if and only if for any commodity i, lottery Li over Ti , state
ci ∈ Ti , and d−i , d0−i ∈ Ω−{i}
(ci , d−i )-(Li , d−i ) ⇐⇒ (ci , d0−i )-(Li , d0−i )
∆
∆
Proof. Note that, technically, all claims and proofs in Pratt work [25], hold with respect to any
scale one chooses to use. Now, using the v scale, the proposition follows directly from Sections 6-7
and Theorem 2 of [25].
∆
Theorem 7. If preference order - is (weakly) risk averse by Definition 3 then it is (weakly)
∆
correlation averse with respect to all pairs of factors Ti , Tj . Conversely, if - is correlation averse
with respect to some pair of factors Ti , Tj , then it is (weakly) risk averse by Definition 3.
Proof. We prove the case for weak risk aversion and weak correlation aversion. The proof for strict
risk aversion and strict correlation aversion is similar.
We first prove the theorem for the case where there exists a Debreu value function, v, representing
- on the partition. In this case, the theorem is essentially a direct corollary of Theorem 4(a) of
Epstein and Tanny[12], which states (using the notations of our paper):
∆
X
Let X = X1 × X2 , be an independent partition, and - a preference order on ∆(X )
with NM utility u(x1 , x2 ). If uX (x1 , x2 ) = φ(α1 x1 + α2 x2 ), for some function φ, and
∆
constants α1 , α2 > 0, then -
X
is weakly correlation averse by 3 if and only if φ is
concave.
P
Set φ = u ◦ v −1 . So, u(a1 , . . . , an ) = φ( ni=1 v Ti (ai ))).
∆
Now, if - is weakly risk averse by Definition 3 then u is concave with respect to v (Theorem 5).
So, φ is concave. So, setting X1 = T1 , and X2 = T2 × · · · × Tn , by the above Epstein and Tanny
∆
theorem, - is weakly correlation averse with respect to X1 , X2 . In particular, for a1 , b1 ∈ T1 ,
a1 ≺ b1 , and (a2 , c), (b2 , c) ∈ X2 = T2 × · · · × Tn , with (a2 , c) ≺ (b2 , c),
∆
h(ai , aj , c), (bi , bj , c)i - h(ai , bj , c), (bi , aj , c)i .
∆
Since T2 is independent, (a2 , c) ≺ (b2 , c) if and only if a2 ≺ b2 . So, - is weakly correlation averse
∆
with respect to T1 , T2 . The same argument works for proving that - is weakly correlation averse
with respect to any Ti , Tj .
∆
Conversely, suppose that - is weakly correlation averse with respect to some Ti , Tj . For any
∆
fixed c ∈ S−1,2 , set S c = Ti × T2 × {c}. Set X1c = Ti and X1c = Tj × {c}. Then, - is weakly
correlation averse with respect to X1c , X2c . So, φ is concave on v(S c ). This holds for all c. By
continuity, each v(S c ) is an interval. Furthermore, the union of these intervals is v(S ), which is
the entire domain of φ, and any point in the interior of v(S ) is in the interior of some v(S c ). So,
∆
φ is concave on its entire domain. So, - is weakly risk averse (Theorem 5).
39
It remains to prove the theorem for the case that the space is not additively separable. This can
only happen when n = 2, as for n ≥ 3 independence of the partition implies additive separability.
∆
We first define a restricted notion of correlation aversion. For S = T1 × T2 , and - on ∆(S ), we
∆
∆
∆
say that - is perfect weak correlation averse if for any a1 , b1 ∈ T1 , a2 , b2 ∈ T2 , with a1 ≺b1 , a2 ≺b2 ,
and (a1 , b2 ) ∼ (a2 , b1 )
∆
h(a1 , a2 ), (b1 , b2 )i - h(a1 , b2 ), (b1 , a2 )i .
(27)
The difference is that with perfect correlation aversion, (27) is only required when (a1 , b2 ) ∼ (a2 , b1 ).
∆
Lemma A.10, which we prove later on, establishes that - is (weak) correlation averse if and only
if it is perfect (weak) correlation averse. So it suffices to prove the theorem for perfect (weak)
correlation aversion.
∆
∆
Suppose that - is risk averse by Definition 3. Consider the a1 , b1 ∈ T1 , a2 , b2 ∈ T2 , with a1 ≺b1 ,
∆
a2 ≺b2 , and (a1 , b2 ) ∼ (a2 , b1 ). Set L1 = ha1 , b1 i , L2 = ha2 , b2 i, and consider L = (L1 , L2 ). So,
Pr[c(L) - `] ≥ Pr[c(L) % `] ⇐⇒
Pr[u(L) ≤ u(`)] ≥ Pr[u(L) ≥ u(`)] ⇐⇒
Pr[u(L) ≤ u(`)] ≥1/2
(28)
The four possible realizations of L are
(a1 , a2 ) ≺ (a1 , b2 ) ∼ (b1 , a2 ) ≺ (b1 , b2 ).
So, continuing (28),
Pr[u(L) ≤ u(`)] ≥1/2 ⇐⇒
u(L) ≤u(a1 , b1 ) ⇐⇒
u(a1 , a2 ) + u(a1 , b2 ) + u(b1 , a2 ) + u(b1 , b2 )
≤u(a1 , b1 ) ⇐⇒
(since (a1 , b1 ) ∼ (b1 , a2 )),
4
u(a1 , a2 ) + u(b1 , b2 ) u(a1 , b1 ) + u(b1 , a1 )
≤
2
2
establishing (27).
We now turn to proving that correlation aversion and perfect correlation aversion are equivalent,
focusing on the case of a partition into two factors. Again, if the space is additively separable, the
proof is easy. The more involved case is when it is not additively separable.
When considering a partition into two factors, we adopt the following notation, which is somewhat different from that used in the rest of the paper. The independent partition is denoted
S = A × B . We use a, A, and and b, B, with or without subscripts or superscripts, for points in A
and B , respectively. By convention, a ≺ A and b ≺ B.
Let w A : A → R be a continuous real valued function representing -A , and similarly w B a
continuous real function representing -B (such function are exist by Debreu [6] since -A and -B
40
are continuous). Define w : A × B → R2 as w(a, b) = (w A (a), w B (b)). Let IA × IB ⊆ R2 be the
image of A × B under w.
Lemma A.8. u ◦ w−1 : IA × IB → R is well defined, increasing in each coordinate, and continuous.
Proof. If w(a, b) = w(a0 , b0 ) then (a, b) ∼ (a0 , b0 ), and hence u(a, b) = u(a0 , b0 ). Thus, u ◦ w−1 is
well defined. It is increasing is each coordinate as u and w A , w B agree on the certainty preference.
Denote û = u ◦ w−1 , and for x ∈ IA define ûxB : IB :→ R, by ûxB (y) = û(x, y). Then, the ûxB is
monotone. Also, ûxB (IB ) = u((w A (x)−1 , B ) is an interval (since B is a finite product of connected
spaces and u continuous). So, ûxB is continuous for any x. Similarly, the function ûyA : IA :→ R,
defined by ûyB (x) = û(x, y) is continuous for any y.
To prove continuity of û, we prove that the pre-images of the open rays (−∞, r) and (r, ∞) are
open, for all r. Consider (−∞, r) (the other case is analogous). Set Er = {(x, y) : û(x, y) < r}.
If Er = ∅ or Er = IA × IB then there is nothing to prove. Otherwise, consider (x∗ , y ∗ ) with
û(x∗ , y ∗ ) < r − , for some > 0. We show that there is a neighborhood of (x∗ , y ∗ ) fully contained
in Er . Suppose that x∗ is not maximal in IA and y ∗ not maximal in IB (the proof for the case that
one of them is maximal is similar). The function ûxB∗ is continuous. So, there exists some y 0 with
(29)
1
0 < ûxB∗ (y 0 ) − ûxB∗ (y ∗ ) < .
2
Similarly, the function ûyA0 is continuous. Thus, there exists x0 with
1
0 < ûyA0 (x0 ) − ûyA0 (x∗ ) < .
2
Combining (29) and (30), we obtain
(30)
û(x∗ , y ∗ ) < û(x0 , y 0 ) + < r.
Set δ = min{x0 − x∗ , y 0 − y ∗ }. Then, for any (x, y) if k(x, y) − (x∗ , y ∗ )k < δ then x < x0 and y < y 0 .
So, by monotonicity of û, û(x, y) < û(x0 , y 0 ) < r. So, the entire ball of size δ around (x∗ , y ∗ ) is
contained in Er , as required.
Lemma A.9. Let A × B be an independent partition and a ≺ A, b ≺ B. Set a0 = a, and while
(ai , B) - (A, b) let ai+1 be such that (ai+1 , b) ∼ (ai , B) (such an ai+1 exists by continuity). Then,
there exists an ī such that (aī , B) % (A, b) (that is, the sequence a0 , a1 , . . . is finite).
Proof. Contrariwise, suppose there is no such ī. Then, for i = 1, 2, . . ., (ai , B) ≺ (A, b), and hence
ai ≺ A. Clearly, ai - ai+1 . Thus, the sequence a1 , a2 , . . . , is an infinite monotone and bounded
sequence, and hence converges to a limit â. By definition, for each i
(ai , B) ∼ (ai+1 , b).
Thus, by continuity,
(â, B) ∼ (â, b),
which is impossible since b ≺B B and - is strictly monotone in each factor.
41
∆
Lemma A.10. - is (weakly) correlation averse if and only if it is perfect (weakly) correlation
averse.
Proof. (weak) correlation aversion ⇒ perfect (weak) correlation aversion: The requirement of perfect risk aversion - (27) - is identical to that of correlation aversion - (3) - only limited to the case
that (a1 , b2 ) ∼ (b2 , a1 ).
Perfect (weak) correlation aversion ⇒ (weak) correlation aversion: We prove for the strong case.
The weak case is similar.
∆
Suppose that - is perfect correlation averse on S = A × B . Let a, A ∈ A, b, B ∈ B , with a ≺ A
and b ≺ B. We need to show that
∆
h(a, b), (A, B)i ≺ h(A, b), (a, B)i .
(31)
If (a, B) ∼ (A, b) then (31) holds by the definition of perfect correlation aversions.
∆
Otherwise, let u be an NM utility for -. set
diff = u(a, b) + u(A, B) − u(a, B) − u(A, b).
We show that diff < 0, which establishes (31).
Let w A be a continuous function representing -A and w B a continuous function representing
-B (the certainty preferences). In order to prove that diff < 0, we start out by proving that there
exists a 1 , A 1 , b 1 , B 1 , with
2
2
2
2
a - a 1 ≺ A 1 - A,
2
2
and b - b 1 ≺ B 1 - B,
2
2
such that
(32)
1
w A (A 1 ) − w A (a 1 ) ≤ (w A (A) − w A (a))
2
2
2
1
w B (B 1 ) − w B (b 1 ) ≤ (w B (B) − w B (b))
2
2
2
or
and
(33)
diff < u(a 1 , b 1 ) + u(A 1 , B 1 ) − u(a 1 , B 1 ) − u(A 1 , b 1 ).
2
2
2
2
2
2
2
2
W.l.o.g. we may assume that (a, B) ≺ (A, b); so (a, b) ≺ (a, B) ≺ (A, b). Thus, since -A is
continuous and A connected, there exists a ≺ a1 ≺ A with
(34)
(a1 , b) ∼ (a, B).
Figure 2 illustrates the following argument. Set a0 = a. Given ai , let ai+1 be such that (ai+1 , b) ∼
(ai , B). Let ī be the first index with (aī , B) % (A, b); such an ī exists by Lemma A.9. Then,
(a, B) ≺ (A, b) - (aī , B). Thus, there exists A1 , a ≺ A1 - aī , such that (A1 , B) ∼ (A, b). Clearly,
aī - A. Thus, either
(35)
1
w A (A1 ) ≤ (w A (a) + w A (A)),
2
42
Figure 2. Illustration of the proof of Lemma A.10. The values ai are calculated
left-to-right, starting at a = a0 . Here ī = 2 and the point a2 is such that w A (a2 ) ≥
1
A
2 (w (A)
+ w A (a)) (assuming the picture is scaled according to w A ).
or
1
w A (aī ) ≥ (w A (a) + w A (A)).
2
(36)
We consider each of these cases separately.
First, suppose that (35) holds. Then, by construction (A1 , B) ∼ (A, b), and they are perfectly
hedged. Hence, by assumption,
(A1 , b), (A, B) ≺ (A1 , B), (A, b) .
∆
So,
u(A1 , b) + u(A, B) − u(A1 , B) − u(A, b) < 0.
Hence,
u(a, b) + u(A, B) − u(A, b) − u(a, B) =
u(a, b) + u(A1 , B) − u(A1 , b) − u(a, B) + u(A1 , b) + u(A, B) − u(A1 , B) − u(A, b) <
(37)
u(a, b) + u(A1 , B) − u(A1 , b) − u(a, B).
Setting a 1 = a, A 1 = A1 , b 1 = b and B 1 = B, by (35) and (37) we get (32) and (33).
2
2
2
2
Next, suppose that (36) holds. Then, by construction, for i = 1, . . . , ī, (ai−1 , B) ∼ (ai , b), and
∆
each such pair is perfectly hedged. Since - is ordinally risk averse,
(ai−1 , b), (ai , B) ≺ (ai−1 , B), (ai , b) ,
∆
43
for all i. So,
ī
ī
1 X
1 X
i−1
i
u(a , b) + u(a , B) <
u(ai−1 , B) + u(ai , b) ;
2ī
2ī
(38)
i=1
i=1
and
u(a0 , b) + u(aī , B) < u(aī , b) + u(a0 , B);
so (as a0 = a)
u(a, b) + u(aī , B) − u(aī , b) − u(a, B) < 0 .
Hence,
u(a, b) + u(A, B) − u(A, b) − u(a, B) =
u(a, b) + u(aī , B) − u(aī , b) − u(a, B) + u(aī , b) + u(A, B) − u(aī , B) − u(A, b) <
(39)
u(aī , b) + u(A, B) − u(aī , B) − u(A, b).
Setting a 1 = aī , A 1 = A, b 1 = b and B 1 = B, by (36) and (39) we get (32) and (33).
2
2
2
2
Thus, we have established (32) and (33), and we now return to complete the proof that diff < 0.
Set
diff 1 = u(a 1 , b 1 ) + u(A 1 , B 1 ) − u(a 1 , B 1 ) − u(A 1 , b 1 ).
2
2
2
2
2
2
2
2
2
Then,
diff < diff 1 .
2
Applying the above halving procedure repeatedly, we obtain that for any δ > 0 there exists
(aδ , bδ ), (Aδ , Bδ ), such that
(40)
w A (Aδ ) − w A (aδ ) ≤δ
(41)
w B (Bδ ) − w B (bδ ) ≤δ
or
and
diff 1 <u(aδ , bδ ) + u(Aδ , Bδ ) − u(aδ , Bδ ) − u(Aδ , bδ ) =
2
(42)
(u(Aδ , Bδ ) − u(aδ , Bδ )) + (u(aδ , bδ ) − u(Aδ , bδ )) =
(43)
(u(Aδ , Bδ ) − u(Aδ , bδ )) + (u(aδ , bδ ) − u(aδ , Bδ )) .
By Lemma A.8 the function u ◦ (w A , w B )−1 is continuous. So it is uniformly continuous on the
rectangle [w A (a), w A (A)] × [w B (b), w B (B)]. That is, for any > 0, there exists a δ such that if
k(w A (a0 ), w B (b0 )) − (w A (a00 ), w B (b00 ))k < δ
then
|u(a0 , b0 ) − u(a00 , b00 )| < .
In particular, if (40) holds then (42) is ≤ 2, and if (41) holds then (43) is ≤ 2. Thus, diff 1 ≤ 0,
2
so diff < 0.
44
Appendix B. Unbounded Lottery Sequences
Here we show why in Definition 1 one needs to require that the lottery sequence be bounded.
Suppose that the conditions of Section 4 hold. We show that if we allow for unbounded lottery
∆
∆
1
∆
2
sequences, then for any preference policy - = (- , - , . . .), there exists a lottery sequence that is
ultimately inferior to its repeated certainty equivalent.
Let v Ti be the value function of Ti . W.l.o.g. suppose that Ti is already represented in terms of
v Ti , that is v Ti (ai ) = ai for all ai ∈ Ti . Then, the certainty preferences -n are simply determined
by the sum of the coordinates.
∆
n
Let un be a NM utility representing - . For each n, let bn be such that
2−n · un (0, . . . , 0, bn ) + 1 − 2−n un (0, . . . , 0, −1) = un (0, . . . , 0).
Let Ln be the lottery obtaining the value bn with probability 2−n and the value −1 with probability
1 − 2−n . Then, c1 , c2 , . . . , the repeated certainty equivalent of the lottery sequence L1 , L2 , . . ., has
cn = 0 for all n. However,
∞
X
Pr[`n > −1] =
n=1
∞
X
2−n < ∞.
n=1
So, by the Borel Cantelli lemma
Pr[`n > −1 infinitely often] = 0.
So,
n
X
Pr[
`i < 0 from some n on] = 1,
i=1
and hence
n
n
X
X
Pr[
`i < 0 =
ci from some n on] = 1.
i=1
i=1
So, L1 , L2 , . . . is ultimately inferior to c1 , c2 , . . ..
45