Perception, Utility and Evolution

José F. Tudón M.∗
October 14, 2016
Abstract
This paper presents a model of the evolution of the hedonic utility
function in which perception is imperfect. Netzer (2009) considers a
model with perfect perception and finds that the optimal utility function allocates marginal utility where decisions are made frequently.
This paper shows that it is also beneficial to allocate marginal utility
away from more perceptible events. The introduction of perceptual
errors can lead to qualitatively different utility functions, such as
discontinuous functions with flat regions rather than continuous and
strictly increasing functions.
1 Introduction
Based on the hypothesis that individual decisions are guided by hedonic
utility, several recent studies have found an evolutionary foundation for
S-shaped utility functions. Though they use different mechanisms, Netzer
(2009) and Rayo and Becker (2007) are two notable examples.1 Netzer
(2009) derives a utility function which is steep in regions where decisions
have to be made frequently, and where wrong decisions would lead to large losses. Rayo and Becker (2007) consider the case of an individual with local information and derive S-shaped utility functions that spread incentives over the output domain.

∗ Department of Economics, The University of Chicago, 1126 E. 59th St., Chicago, IL 60637 (e-mail: [email protected]). Thanks to Benjamin Brooks, Alex Frankel, Emir Kamenica, Philip Reny, Luis Silvestre, Balázs Szentes, Richard Van Weelden and workshop participants at the University of Chicago.

1 See also Robson (2001). Robson and Samuelson (2010) survey the literature.
Besides advancing our understanding of happiness as a tool for decision making, these recent contributions are also linked to risk preferences.
Because of this link, and the possibility of an evolutionary foundation of
the S-shape, the debate is followed by both proponents and detractors of
prospect theory.2
This paper builds on the framework developed in Netzer (2009) by allowing limited perception to have a prominent role. In this general setting,
Netzer’s intuition can be overturned because nature faces a trade-off between perception and robustness. Indeed, nature may not allocate marginal
utility in regions where decisions have to be made frequently. Thus, this
paper shows what is lost by approximating limited perception with perfect
perception.
2 The environment
The environment is based on Netzer (2009). Suppose that an agent makes
repeated choices between two options, x and y. These options are independently drawn from a normalized set, [0, 1], according to a distribution
function F with full support. The options are identified with fitness levels:
option x yields fitness x, where fitness could simply be thought of as the
number of offspring.
The agent has a hedonic utility function U (x), which is shaped by natural selection. Naturally, the agent will choose the option that makes her
happier. Because evolution, through natural selection, optimizes the utility
function, I will use the common metaphor of a gene that maximizes offspring. Thus, the selfish gene will design the utility function that promises
the largest expected fitness.3
2 See Kahneman and Tversky (1979). For a recent review on prospect theory, see Barberis (2013).

3 See Dawkins (1976). I assume that the utility function values offspring. See Rayo and Robson (2014) for a theory on the evolution of utility functions over intermediate goods such as food, shelter, etc.

However, not all utility functions are admissible. Besides focusing on measurable utility functions, there will be two constraints. First, the human brain has a limited size. Thus, happiness must be bounded. Second, the human senses are imperfect. Thus, happiness cannot be perceived without errors.4 More formally, I will make the following assumptions.

Assumption 1 (Bounded happiness). Without loss of generality, the utility function must be bounded between 0 and 1.

Assumption 2 (Limited perception). The agent will perceive the happiness of x as U(x) + ε, where the error ε is a random variable.5

2.1 The difference with Netzer (2009)

Assumption 2 deserves further discussion because it is the main variation from Netzer (2009). In his paper, Netzer models limited perception “by assuming that utility can take only discrete, albeit extremely numerous, values. [...] The set of admissible utility functions is thus restricted to the set of increasing step functions with N ∈ N jumps, each corresponding to a utility increment of size 1/N. As a result, the agent cannot distinguish two alternatives located on the same step of the utility function. Any choice between such alternatives will have to be random, and a mistake can occur.”

To derive his main result, Netzer takes the limit as N → ∞, so perception becomes perfect. Netzer takes this elegant approach because, with finite N, the optimization problem gets complicated, to say the least. In this way, Netzer finds a solution with a clear interpretation: the perfect-perception solution is the limit of a sequence of solutions with limited perception.

However, by maintaining limited perception we can focus on the issues that we miss by taking the limit. It turns out that, in theory, the omissions may be considerable. On the other hand, what we lose in the approximation we gain in tractability. As we shall see, with limited perception the existence of an analytic solution is not guaranteed.

4 See Rayo and Becker (2007) or Netzer (2009) for further details.

5 Note that limited perception of happiness is different from limited perception of the alternatives. If x is perceived with noise, then evolution will operate on the senses needed to distinguish one alternative from another. In this model, however, we are interested in the consequences of perceiving the utility U(x) with noise.
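Netzer’s discretization is easy to illustrate with a short sketch. Everything below is an illustrative assumption of mine: in particular, the jumps are placed on a uniform grid, whereas in Netzer (2009) the jump locations are themselves optimized.

```python
import numpy as np

def step_utility(x, N):
    """A step utility with N jumps of size 1/N, placed on a uniform grid.

    Uniform jump placement is an illustrative assumption; Netzer (2009)
    optimizes where the jumps go.
    """
    return np.floor(np.asarray(x) * N) / N

N = 100
# Two alternatives on the same step are indistinguishable (equal utility),
# so the choice between them must be random and a mistake can occur.
assert step_utility(0.5012, N) == step_utility(0.5049, N)
# Alternatives on different steps are still ranked correctly.
assert step_utility(0.25, N) < step_utility(0.75, N)
```

The sketch makes concrete why mistakes under this formulation are purely local: they can only occur between alternatives on the same step.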
3 The gene’s problem
With limited perception, the agent gets confused. She will choose x over
y if and only if U (x) + εx ≥ U (y) + εy , where the errors are iid and share
the spirit of the drift-diffusion model of decision making (Fehr and Rangel,
2011). With a slight abuse of notation, let ε ≡ εy − εx and call G the
distribution of ε. Note that ε is symmetric by construction.6 Then, the
agent chooses x over y if and only if U (x) − U (y) ≥ ε with x, y ∼ F and
ε ∼ G, all independent. Assume that the distributions can be represented
by bounded densities f and g with finitely many discontinuities. In Section
7, I explore the possibility of multiplicative errors and their connection to Weber’s Law.
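The noisy choice rule can be simulated directly. In the sketch below, Gaussian perception errors are an illustrative assumption; the simulated choice frequency is checked against G(U(x) − U(y)), where G is the distribution of ε ≡ εy − εx.

```python
import numpy as np
from math import erf

rng = np.random.default_rng(0)

def choice_freq(Ux, Uy, sigma, n=200_000):
    """Fraction of trials in which x is chosen: U(x) + eps_x >= U(y) + eps_y,
    with iid N(0, sigma^2) perception errors (an illustrative assumption)."""
    eps_x = rng.normal(0.0, sigma, n)
    eps_y = rng.normal(0.0, sigma, n)
    return np.mean(Ux + eps_x >= Uy + eps_y)

def G(d, sigma):
    """cdf of eps = eps_y - eps_x ~ N(0, 2 sigma^2), evaluated at d."""
    return 0.5 * (1.0 + erf(d / (2.0 * sigma)))

p_sim = choice_freq(0.7, 0.5, sigma=0.1)
assert abs(p_sim - G(0.2, 0.1)) < 0.01
```

As in the text, only the difference ε = εy − εx matters, and it is symmetric by construction.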
The agent chooses x with probability P [U (x) > U (y) + ε] = P [ε <
U (x) − U (y)] = G[U (x) − U (y)]. Let V represent the expected fitness given
a utility function U . Thus,
\[
V[U] = \int_0^1 \int_0^1 x f(x) f(y)\, G\big(U(x) - U(y)\big)\, dx\, dy
     + \int_0^1 \int_0^1 y f(x) f(y)\, \Big(1 - G\big(U(x) - U(y)\big)\Big)\, dx\, dy.
\]
\[
\therefore \quad V[U] = E[X] + \int_0^1 \int_0^1 (x - y) f(x) f(y)\, G\big(U(x) - U(y)\big)\, dx\, dy.
\]
Therefore, the gene’s problem becomes
\[
\max_{U \in \mathcal{U}} V[U] \quad \text{subject to} \quad 0 \le U(x) \le 1, \tag{1}
\]
where U is the set of admissible functions, which I assume is the set of
measurable functions bounded between 0 and 1.
6 If Z and Z′ are iid, then Z^s ≡ Z − Z′ ∼ Z′ − Z = −(Z − Z′) = −Z^s, which implies that Z^s is a symmetric random variable.
Remark. The function U appears twice inside the integral with different running variables. Intuitively, any change of U at any point must be weighed against the whole function. Thus, this problem is not local.
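The objective V[U] can be evaluated numerically by discretizing the double integral. The sketch below is only illustrative (midpoint rule, uniform outcomes, uniform U[−1, 1] errors, and the candidate U(x) = x are all my assumptions); the check uses the fact, derived in the next section, that with uniform errors V[U] = E[X] + ∫(x − E[X])f(x)U(x)dx, which for this candidate equals 1/2 + 1/12 = 7/12.

```python
import numpy as np

def V(U, f_pdf, G_cdf, n=400):
    """Midpoint-rule discretization of
    V[U] = E[X] + iint (x - y) f(x) f(y) G(U(x) - U(y)) dx dy."""
    x = (np.arange(n) + 0.5) / n
    w = f_pdf(x) / n                           # f(x) dx weights
    Ux = U(x)
    EX = np.sum(x * w)
    D = G_cdf(Ux[:, None] - Ux[None, :])       # G(U(x_i) - U(x_j))
    return EX + np.sum((x[:, None] - x[None, :]) * np.outer(w, w) * D)

# Uniform outcomes, uniform U[-1, 1] errors (G(t) = (1 + t)/2), U(x) = x:
v = V(lambda x: x, lambda x: np.ones_like(x), lambda t: (1 + t) / 2)
assert abs(v - 7 / 12) < 1e-4
```

The full n-by-n matrix of pairwise comparisons is the numerical counterpart of the remark above: the objective couples every point of U with every other point.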
4 A simple example
Before proceeding with the formal analysis, this section introduces a simple
example that overturns the intuition of allocating more marginal utility
where decisions have to be made more frequently.
Suppose that the errors are uniform on [−1, 1]; that is, G(ε) = (1 + ε)/2. Outcomes X have a pdf denoted by f. In this case,
\[
V[U] = E[X] + \frac{1}{2} \int_0^1 \int_0^1 (x - y)\, f(x) f(y)\, \big(1 + U(x) - U(y)\big)\, dx\, dy.
\]
After simplifications,
\[
V[U] = E[X] + \int_0^1 \big(x - E[X]\big) f(x) U(x)\, dx.
\]
This implies that the optimal utility function U ∗ should be an indicator
function for x ≥ E[X]:
\[
U^*(x) = \begin{cases} 1 & x \ge E[X] \\ 0 & x < E[X]. \end{cases}
\]
Note that this is optimal regardless of f. In fact, the optimal utility allocates all the marginal utility at E[X] even if f(E[X]) is low. Figure 1 represents this case: two different distribution functions have the same optimal utility function, which is a step function. However, one pdf allocates more mass at the extremes while the other allocates more at the center.
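This claim can be verified numerically. Under uniform errors the closed form above makes V[U] linear in U, so the step at E[X] is pointwise optimal. The sketch below (the Beta(0.5, 0.5) pdf is written out by hand; all discretization choices are mine) checks that the step beats both a linear utility and a Netzer-style c∫f^{2/3} utility.

```python
import numpy as np

n = 10_000
x = (np.arange(n) + 0.5) / n
f = 1.0 / (np.pi * np.sqrt(x * (1.0 - x)))   # Beta(0.5, 0.5) pdf
w = f / f.sum()                              # normalized integration weights
EX = np.sum(x * w)                           # ~ 1/2

def gain(U):
    """V[U] - E[X] = integral of (x - E[X]) f(x) U(x) dx (uniform errors)."""
    return np.sum((x - EX) * w * U)

step = (x >= EX).astype(float)               # U*(x) = 1{x >= E[X]}
linear = x.copy()                            # U(x) = x
netzer = np.cumsum(f ** (2.0 / 3.0))         # proportional to int_0^x f^{2/3}
netzer /= netzer[-1]

assert gain(step) > gain(linear) > 0
assert gain(step) > gain(netzer)
```

Because gain(U) weights U(x) by the sign of x − E[X], putting all utility mass above the mean is optimal no matter where f concentrates.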
Figure 2 compares the optimal utility function and Netzer (2009)’s result when the pdf is a Beta(0.5, 0.5), which allocates more mass at the
extremes. The optimal utility function is a step function, essentially flat
almost everywhere. However, even though f (1/2) is the minimum of f ,
U ∗ has infinite marginal utility at x = 1/2. The intuition of Netzer is
[Figure: “Two Distributions, same Utility function”, plotting the pdfs f(x) of Beta(10, 10) and Beta(0.5, 0.5) over outcomes x ∈ [0, 1].]

Note: The two distributions are Beta(10, 10) and Beta(0.5, 0.5), which implies that E[X] = 1/2 for both. For both distributions, the optimal utility function is U∗(x) = 1{x ≥ 1/2}.

Figure 1: Two different distributions with the same optimal utility function
overturned because, once we allow for errors, the optimization problem becomes non-local. One has to weigh the local marginal utility against the marginal utility evaluated at every other point. In other words, mistakes are made not only between alternatives with similar fitness, but also when comparing alternatives with very different fitness; therefore, the optimal solution is determined not only by the local density but by the global distribution.
Granted, this example is extreme because the errors are big. However,
as we will see, this intuition holds true for any size of the errors. Indeed, as
long as there is limited perception, there will be flat regions in the utility
function regardless of f .
[Figure: “Netzer’s result and optimal Utility function”, plotting the optimal U, the utility function in Netzer (2009), and the CDF of Beta(0.5, 0.5) over outcomes x ∈ [0, 1].]

Note: The outcomes X ∼ Beta(0.5, 0.5), which implies that E[X] = 1/2. The optimal utility function is U∗(x) = 1{x ≥ 1/2}. The utility function in Netzer (2009) equals U(x) = c ∫₀ˣ f(z)^{2/3} dz, where f is the pdf of X and c is a normalizing constant.

Figure 2: Comparison of the optimal utility function and Netzer’s result
5 Existence and uniqueness
The first step in tackling this problem is to ensure existence and uniqueness of a solution. As is, we cannot apply an existence theorem for the set U because it is not compact. But we can prove the theorem for a subset of U: the subset of monotone functions. The following lemma will help us show that if a solution exists, it must be monotone, non-decreasing.

Lemma 1. For any U ∈ U there exists a monotone, non-decreasing function U∗ ∈ U such that V[U] ≤ V[U∗].
Proof. This proof is not as trivial as it seems, since changing U at any
point will change the whole integral. The strategy will be to consider
simple functions first, and then apply the dominated convergence theorem
to prove the general case.
The objective function can be written as
\[
V[U] = K + 2 \int_0^1 \int_y^1 (x - y) f(x) f(y)\, G\big(U(x) - U(y)\big)\, dx\, dy,
\qquad \text{where } K = E[X] + \int_0^1 \int_y^1 (y - x) f(x) f(y)\, dx\, dy.
\]
Suppose that U_n is a simple function that is constant on the intervals [(s−1)/n, s/n), s = 1, . . . , n. Each simple function of this form can be written as the step function
\[
U_n(x) = \sum_{s=1}^{n} u_s \, \mathbf{1}_{\left[\frac{s-1}{n}, \frac{s}{n}\right)}(x).
\]
Now we can define the n-vector u as the vector with values u_s, where u_s is the value that U_n(x) takes on the sth interval. From this definition we see that each n-vector corresponds to a step function.
We will now define the sorting operator S acting on the vector u as follows. Let l be an integer in 1, . . . , n such that u_l > u_m for some m > l. If such an l exists, set S(u) to be the n-vector with u_m in the lth position and u_l in the mth position, and all other elements equal to the corresponding elements of u. If such an l does not exist, set S(u) = u. Then, S sorts two elements of u in increasing order.7
Similarly, let x_n and f_n be two simple functions; the idea is that x_n → x and f_n → f. With a slight abuse of notation, also let x_n and f_n stand for the n-vectors with x_s and f_s in the sth position. Therefore, we can approximate V[U_n] with
\[
V_n[U_n] \equiv K + 2 \int_0^1 \int_y^1 \big(x_n(x) - x_n(y)\big) f_n(x) f_n(y)\, G\big(U_n(x) - U_n(y)\big)\, dx\, dy
= K + \frac{2}{n^2} \sum_{j=1}^{n} \sum_{i=j}^{n} (x_i - x_j) f_i f_j\, G(u_i - u_j).
\]
In Appendix A, I prove the following lemma.
Lemma 2. Vn [Un ] ≤ Vn [S(Un )].
Therefore, if we apply the sorting operator S sufficiently many times to U_n (finitely many applications suffice), we find the rearranged vector U_n∗: a vector completely sorted in ascending order. Every application of the sorting operator S weakly improves V_n, and since we apply it a finite number of times, V_n[U_n∗] ≡ V_n[S ◦ S ◦ · · · ◦ S U_n] ≥ V_n[U_n].

7 For example: if u = (5 2 3), then S(u) = (2 5 3).
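The effect of the sorting operator can be checked numerically. In the sketch below, the uniform f and the logistic G are illustrative assumptions of mine, and fully sorting the vector stands in for repeated applications of S.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 60
x = (np.arange(n) + 0.5) / n
f = np.ones(n)                                  # uniform f_i (illustrative)
G = lambda t: 1.0 / (1.0 + np.exp(-t / 0.2))    # a symmetric cdf (illustrative)

def Vn_sum(u):
    """The discretized objective, up to the constant K:
    (2/n^2) * sum_{j=1}^{n} sum_{i=j}^{n} (x_i - x_j) f_i f_j G(u_i - u_j)."""
    total = 0.0
    for j in range(n):
        for i in range(j, n):
            total += (x[i] - x[j]) * f[i] * f[j] * G(u[i] - u[j])
    return 2.0 * total / n**2

u = rng.uniform(0.0, 1.0, n)
# Sorting u in ascending order amounts to applying S repeatedly; it never hurts.
assert Vn_sum(np.sort(u)) >= Vn_sum(u)
```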
Now that this holds for simple functions, we go for the general case. Since f, U and the identity are measurable functions, there are sequences of bounded simple functions {f_n}, {U_n} and {x_n} converging almost everywhere to f, U and x. Define the increasing rearrangement of U as
\[
U^*(x) \equiv \inf\left\{ q \in \mathbb{R} : x \le \int_0^1 \mathbf{1}\{U(z) \le q\}\, dz \right\}.
\]
This rearrangement always exists; it is like sorting the values of U in increasing order. It is a quantile function.
From this definition, Un → U a.e. implies that the rearrangements also
converge a.e. to the rearrangement: Un∗ → U ∗ a.e. (see Chernozhukov,
Fernández-Val and Galichon (2009) and the references therein). Moreover,
Vn [Un ] → V [U ] and, since the inequality works for every n, by Lebesgue’s
dominated convergence theorem, V [U ] ≤ V [U ∗ ].
Therefore, Lemma 1 allows us to look for the optimum in the set of non-decreasing monotone functions. Let M ⊂ U denote the set of admissible monotone functions:
\[
\mathcal{M} \equiv \left\{ U : [0, 1] \to [0, 1] \;\middle|\; U \text{ is monotone, non-decreasing} \right\},
\]
and note that M is closed, bounded and convex. From here, we can establish existence and uniqueness.
Since the optimal utility function is not defined if f (x) = 0, I will
formally assume full support, except perhaps at the extremes.
Assumption 3. Suppose that f (x) > 0 for all x ∈ (0, 1). Moreover,
without loss of generality, if f (0) = 0 then set U (0) = 0 and if f (1) = 0
then set U (1) = 1.
Proposition 3. A solution to maxU ∈M V [U ] exists. Moreover, if G is
strictly concave in [0, ∞), and hence strictly convex in (−∞, 0], then the
solution is unique.
Proof. I will prove that the maximum is attained. Let {Un } be a sequence
such that V [Un ] → sup{V [U ] : U ∈ M}. Then, I will use the following
Lemma, which is called Helly’s first theorem (Montesinos, Zizler and Zizler,
2015). Its proof is in Appendix A.
Lemma 4. If {Un } is a sequence in M, then there is a subsequence {Unk }
such that {Unk } pointwise converges in [0, 1] to a function U ∈ M.
By Lemma 4, G(U_{n_k}(x) − U_{n_k}(y)) → G(U(x) − U(y)) almost everywhere; the “almost everywhere” comes from the fact that a sequence of right-continuous functions can converge to a left-continuous one. Since G ≤ 1, V is bounded. Therefore, we can apply Lebesgue’s Dominated Convergence Theorem to conclude that V[U_{n_k}] → V[U]. Therefore, V attains its supremum at U.
To prove uniqueness, note that V[U] can be written as
\[
V[U] = E[X] + \int_0^1 \int_y^1 (y - x) f(x) f(y)\, G\big(U(y) - U(x)\big)\, dx\, dy
+ \int_0^1 \int_0^y (y - x) f(x) f(y)\, G\big(U(y) - U(x)\big)\, dx\, dy.
\]
Here, we can see that (y − x)G(U(y) − U(x)) is concave in U when G is concave in [0, ∞) and convex in (−∞, 0]. Then, V is strictly concave under the assumption on G, and the maximum is unique.
6 The optimal utility function
Since the optimization problem is non local, the properties of the solution
are hard to characterize. Indeed, if we take the approach of calculus of
variations we end up with an integral equation, which in general does not
have analytic solutions. However, we can prove one thing: there will be
flat regions in the utility function.
Proposition 5. Suppose that G is concave in [0, ∞). Then, there exist
0 < a ≤ b < 1 such that the utility function U ∗ that solves the problem
defined in (1) is equal to 0 for x ∈ [0, a] and equal to 1 for x ∈ [b, 1].
Proof. Let us begin by writing the Lagrangian:
\[
L[U, \lambda_0, \lambda_1] = V + \int_0^1 \Big[ \lambda_0(x) U(x) - \lambda_1(x) \big( U(x) - 1 \big) \Big]\, dx. \tag{2}
\]
The Karush–Kuhn–Tucker conditions are:
\[
-U(x) \le 0, \quad \lambda_0(x) \ge 0, \quad \lambda_0(x) U(x) = 0; \tag{3}
\]
\[
U(x) - 1 \le 0, \quad \lambda_1(x) \ge 0, \quad \lambda_1(x) \big( U(x) - 1 \big) = 0.
\]
In the jargon of the calculus of variations,8 for U∗ to solve (1) we need the variation of L to equal zero along U∗; in other words, the first-order condition is δL = 0. The variation δL is defined as
\[
\delta L \equiv \frac{\partial}{\partial \alpha} L\big[U(x) + \alpha\, \delta U(x)\big] \Big|_{\alpha = 0},
\]
where δU ∈ U is some admissible perturbation, close enough to U such that U + αδU ∈ U for all α sufficiently close to 0.
I will start with δV and then continue with the rest of δL:
\[
\delta V \equiv \frac{\partial}{\partial \alpha} V\big[U(x) + \alpha\, \delta U(x)\big] \Big|_{\alpha = 0}
= \int_0^1 \int_0^1 (y - x) f(x) f(y)\, g\big[U(y) - U(x)\big] \big[\delta U(y) - \delta U(x)\big]\, dx\, dy
\]
\[
= \int_0^1 \int_0^1 (y - x) f(x) f(y)\, g\big[U(y) - U(x)\big]\, \delta U(y)\, dx\, dy
- \underbrace{\int_0^1 \int_0^1 (y - x) f(x) f(y)\, g\big[U(y) - U(x)\big]\, \delta U(x)\, dx\, dy}_{\text{switch the variables here}}.
\]
Because x and y are dummy variables, rename the variables in the underbraced part of this equation: where it says x, write y, and where it says y, write x. Then, because U is measurable, apply Fubini’s Theorem to interchange the order of integration. Finally, since g is symmetric, we get the following expression:
\[
\delta V = \int_0^1 \delta U(y) \underbrace{\int_0^1 2 (y - x) f(x) f(y)\, g\big(U(y) - U(x)\big)\, dx}_{\equiv I(y)}\, dy.
\]
Define the interior integral as I(y), which is a function of y. Then,
\[
\delta V = \int_0^1 \delta U(y)\, I(y)\, dy.
\]

8 See Elsgolc (2007).
Now, we focus on the second part of δL. Using δL = (∂/∂α) L[U + αδU]|_{α=0}, we get that
\[
\delta L = \delta V + \int_0^1 \big[ \lambda_0(x)\, \delta U(x) - \lambda_1(x)\, \delta U(x) \big]\, dx
= \int_0^1 \delta U(x) \big[ I(x) + \lambda_0(x) - \lambda_1(x) \big]\, dx.
\]
Since δL = 0 should hold for any arbitrary (but close) deviation δU, subject to some regularity conditions, we can apply the Fundamental Theorem of the Calculus of Variations to conclude that the first-order condition9 becomes
\[
I(x) + \lambda_0(x) - \lambda_1(x) = 0 \qquad \forall x \in [0, 1].
\]
Therefore,
\[
I(x) \begin{cases} < 0 & \text{if } U(x) = 0 \\ = 0 & \text{if } U(x) \in (0, 1) \\ > 0 & \text{if } U(x) = 1. \end{cases}
\]
However, note that
\[
I(0) = -2 f(0) \int_0^1 y f(y)\, g\big(U(0) - U(y)\big)\, dy.
\]
It follows that for any U ∈ M, if f(0) > 0, then I(0) < 0, and if f(0) = 0, then I(0) = 0 and, by assumption, U(0) = 0. But in both cases, for a small enough a > 0, we conclude that I(a) < 0. Therefore, U(a) = 0. By monotonicity, U(x) = 0 for all x ∈ [0, a].

The case of I(1) = 2f(1) ∫₀¹ (1 − y) f(y) g(U(1) − U(y)) dy > 0 is analogous, so that I(b) > 0 for b close enough to 1. Again, by monotonicity, U(x) = 1 for all x ∈ [b, 1].

9 We already discussed existence and uniqueness, but see Appendix B for the second-order conditions.
The proof offers little intuition, but take a look at I(0), which represents what happens if, at the optimum, U(0) changes. I(0) involves an integral because U(0) must be compared against the whole function U(y). Thus, the optimal U∗ saves marginal utility at the edges, where it makes little difference, in order to have a larger slope in the middle.
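The sign pattern of I(x) that drives the flat regions can be checked numerically. In the sketch below, the Beta(2, 2) outcomes, the Gaussian errors with σ = 0.3, and the candidate U(x) = x are all illustrative assumptions; the check shows I < 0 near the lower edge, I > 0 near the upper edge, and I ≈ 0 at the symmetric midpoint.

```python
import numpy as np

n = 2_000
xs = (np.arange(n) + 0.5) / n
f = lambda t: 6.0 * t * (1.0 - t)               # Beta(2, 2) pdf (illustrative)
sigma = 0.3
g = lambda t: np.exp(-t**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
U = lambda t: t                                 # a monotone candidate (illustrative)

def I(y):
    """I(y) = 2 f(y) * int_0^1 (y - x) f(x) g(U(y) - U(x)) dx (midpoint rule)."""
    return 2.0 * f(y) * np.mean((y - xs) * f(xs) * g(U(y) - U(xs)))

assert I(0.05) < 0 < I(0.95)     # pushes U down to 0 / up to 1 at the edges
assert abs(I(0.5)) < 1e-6        # symmetric problem: no net force at the center
```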
So what is special about the edges? Intuitively, since the utility at the
extremes is extreme, the event of a high x versus a low y is easy to decide.
However, the case of high x versus medium y is more difficult. To solve
this, the gene builds perception in the case of high x versus medium y by
compromising perception in the case of high x versus high y. This results
in a robust strategy: when errors are big, optimize behavior by having
extreme utility in regions where losses and rewards are high.
The following corollary gives us a sense in which Netzer’s result is the
limit case of perfect perception, in which case there are no flat regions in
the utility function.
Corollary 6. Suppose that the errors ε ∼ N (0, σ 2 ). Then, when σ 2 → 0,
there are no flat regions in the optimal utility function.
Proof. Given σ², choose a small enough a > 0 such that I(a) < 0. In this case,
\[
I(a) = 2 f(a) \int_0^1 (a - y) f(y)\, \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{[U(a) - U(y)]^2}{2\sigma^2}}\, dy \to 0
\]
as σ² → 0. The case for I(b) > 0 is analogous.
As mentioned before, the non-locality of the integral equation makes any further analysis complicated. In particular, the integral equation may not be solved by Picard iterations (cf. Montesinos, Zizler and Zizler (2015)). A sensible conjecture is to posit the existence of a continuous, differentiable function u : (a, b) → (0, 1) such that
\[
U(x) = \begin{cases} 0 & x \le a \\ u(x) & x \in (a, b) \\ 1 & x \ge b. \end{cases}
\]
This is not easy to show. However, the numerical exploration of the problem has yet to provide a counterexample. Figure 3 shows how the optimal utility function behaves when the size of the error changes. As σ² decreases, the flat regions shrink. This property holds generally across numerical trials.10
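The discrete problem behind these figures can be sketched with a crude projected-gradient loop. The paper reports using Knitro via Matlab; everything below, from the uniform outcomes to the step size, is an illustrative assumption of mine. Sorting the iterate is justified by Lemma 1, and the fixed point exhibits the flat regions of Proposition 5.

```python
import numpy as np

n = 80
x = (np.arange(n) + 0.5) / n
f = np.ones(n) / n                               # uniform outcomes: f(x_i) dx
sigma = 0.3
g = lambda t: np.exp(-t**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

def grad(u):
    """dV/du_i = 2 f_i * sum_j (x_i - x_j) f_j g(u_i - u_j), cf. I(x)."""
    D = u[:, None] - u[None, :]
    return 2.0 * f * np.sum((x[:, None] - x[None, :]) * g(D) * f[None, :], axis=1)

u = x.copy()                                     # start from U(x) = x
for _ in range(2000):
    u = u + 5.0 * grad(u)
    u = np.clip(np.sort(u), 0.0, 1.0)            # Lemma 1 plus the box constraint

assert np.all(np.diff(u) >= 0)                   # monotone
assert u[0] == 0.0 and u[-1] == 1.0              # flat regions at the edges
```

At the edges the gradient keeps its sign, so the iterate saturates at the bounds, which is exactly the flat-region prediction.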
Finally, Figure 4 repeats the exercise but with a skewed distribution for
the outcomes. The resulting utility functions are equally skewed, showing
loss aversion and diminishing sensitivity around the mean, E[X].
7 Concluding remarks
The numerical and analytical results suggest that, when errors are large,
the marginal utility gets concentrated around the mean, as in Figures 3
and 4. However, I have not been able to prove this. On the other hand, the
intuition for robustness holds: with limited perception, the robust approach
pushes to distinguish relatively bad outcomes from good ones.
Regarding multiplicative errors, note the connection with the Psychophysical Law of Weber and Fechner, which states that just-noticeable differences between two stimuli are proportional to the magnitude of the stimuli.11
In fact, “[Fechner] argued that the sensation function is logarithmic since
equal relative changes in the stimuli brought about equal absolute changes
in sensation” (Sinn, 2003).
Back to this paper: instead of additive errors, suppose that the utility function Û is shocked with a proportional (multiplicative) error, so that x is selected over y iff Û(x)ε_x ≥ Û(y)ε_y. Then, taking logs and renaming U ≡ ln Û, we get that x is chosen iff U(x) − U(y) ≥ ε ≡ ln(ε_y/ε_x). Assuming that these expressions make (technical) sense, it can be shown that the results hold with minor rescaling. Because this is a model of cardinal utility functions, Weber’s Law is mathematically equivalent to the present model.

10 I used Knitro, via Matlab, to solve a discrete version of the optimization problem. The codes are available upon request or at home.uchicago.edu/~jtudon.

11 In other words, between two weights, the agent notices only differences of, say, 10% or more. See Sinn (2003).

[Figure: “Changing the size of the errors”, plotting the optimal utility U(x) for σ² = 1, 0.5, 0.1 and Netzer (2009) over outcomes x ∈ [0, 1].]

Note: The errors ε ∼ N(0, σ²) are compared for σ² = 1, 0.5, 0.1. The outcomes are X ∼ Beta(4, 4), which has a nice bell curve. Netzer (2009) corresponds to the case where σ² → 0. To put it in perspective, σ² = 0.1 means that 5% of mistakes are going to be bigger than 0.2 in absolute value.

Figure 3: Optimal U∗ for different sizes of errors
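The log-transform equivalence between multiplicative and additive errors can be checked mechanically; the particular numbers below are arbitrary illustrations, and positivity of Û and of the shocks is needed for the logs to exist.

```python
import math
import random

random.seed(0)
for _ in range(1000):
    # positive hedonic utilities and positive multiplicative shocks
    Ux, Uy = random.uniform(0.1, 2.0), random.uniform(0.1, 2.0)
    ex, ey = random.uniform(0.1, 2.0), random.uniform(0.1, 2.0)
    multiplicative = Ux * ex >= Uy * ey
    additive = math.log(Ux) - math.log(Uy) >= math.log(ey / ex)
    assert multiplicative == additive   # the two comparisons always agree
```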
As a final thought, let us think about extrapolating the resulting utility
function to an environment with risk where the agent faces lotteries: for
example, p[x] + q[y] would be the lottery of getting x with probability p
and getting y with probability q. There is a problem.
Netzer (2009) and I solved for the utility function over outcomes x but
not over lotteries p[x] + q[y]. In the risky environment, the gene would
solve for an optimal value function that takes into account outcomes and
probabilities—as in prospect theory. But even if the optimal value function
can be separated into a utility function and a probability weighting function, such a utility function will likely be different from the optimal utility function of the environment without risk. Herold and Netzer (2015) deal precisely with this problem as they explore the possibility that the gene chooses a probability weighting function given that the utility function is S-shaped. In this sense, prospect theory functions evolved as a series of kludges (Ely, 2011). All in all, we have to be cautious about extrapolating our result to a risky environment.

[Figure: “Changing the size of the errors when f is skewed”, plotting the optimal utility U(x) for σ² = 1, 0.5, 0.1 and Netzer (2009) over outcomes x ∈ [0, 1].]

Note: The errors ε ∼ N(0, σ²) are compared for σ² = 1, 0.5, 0.1. The outcomes are X ∼ Beta(3, 5), which has a curve skewed to the left. Netzer (2009) corresponds to the case where σ² → 0. The utility functions exhibit loss aversion and diminishing sensitivity around x = 0.375 = E[X].

Figure 4: Optimal U∗ for different sizes of errors and skewed f
Having said that, this paper makes us aware of the costs and benefits
of assuming limited perception.
A Proofs
Proof of Lemma 2. Consider the following expression:
\[
\frac{n^2}{2} \big( V_n[U_n] - V_n[S(U_n)] \big) = \sum_{j=1}^{n} \sum_{i=j}^{n} (x_i - x_j) f_i f_j \big[ G(u_i - u_j) - G(Su_i - Su_j) \big].
\]
First, observe that G(u_i − u_j) − G(Su_i − Su_j) = 0 for all i, j ≠ l, m. Next, for i = l, there are j = 1, . . . , l summands, and for i = m, there are j = 1, . . . , m summands. Similarly, for j = l, there are i = l, . . . , n summands, and for j = m, there are i = m, . . . , n summands. However, we are double-counting 3 summands, of which 2 are zero: the non-zero summand corresponds to j = l and i = m. Finally, note that G(u_m − u_j) − G(Su_m − Su_j) = −[G(u_j − u_m) − G(Su_j − Su_m)]. Thus,
\[
\frac{n^2}{2} \big( V_n[U_n] - V_n[S(U_n)] \big)
= \sum_{i=l}^{n} (x_i - x_l) f_i f_l \big[ G(u_i - u_l) - G(Su_i - Su_l) \big]
+ \sum_{i=m}^{n} (x_i - x_m) f_i f_m \big[ G(u_i - u_m) - G(Su_i - Su_m) \big]
\]
\[
+ \sum_{j=1}^{l} (x_l - x_j) f_l f_j \big[ G(u_l - u_j) - G(Su_l - Su_j) \big]
+ \sum_{j=1}^{m} (x_m - x_j) f_m f_j \big[ G(u_m - u_j) - G(Su_m - Su_j) \big]
\]
\[
= \sum_{i=1}^{n} (x_i - x_l) f_i f_l \big[ G(u_i - u_l) - G(Su_i - Su_l) \big]
+ \sum_{i=1}^{n} (x_i - x_m) f_i f_m \big[ G(u_i - u_m) - G(Su_i - Su_m) \big]
- (x_m - x_l) f_m f_l \big[ G(u_m - u_l) - G(u_l - u_m) \big].
\]
At this point, use the definition of S and conclude that the preceding sum equals
\[
\sum_{i=1,\, i \ne l, m}^{n} (x_i - x_l) f_i f_l \big[ G(u_i - u_l) - G(u_i - u_m) \big]
+ \sum_{i=1,\, i \ne l, m}^{n} (x_i - x_m) f_i f_m \big[ G(u_i - u_m) - G(u_i - u_l) \big]
+ (x_m - x_l) f_m f_l \big[ G(u_m - u_l) - G(u_l - u_m) \big]
\]
\[
= \sum_{i=1}^{n} \big[ (x_i - x_l) f_l - (x_i - x_m) f_m \big] f_i \big[ G(u_i - u_l) - G(u_i - u_m) \big].
\]
Now we have to show that the preceding expression is non-positive.

By assumption u_l > u_m. Moreover, G is a non-decreasing cdf and x_l < x_m. Therefore, G(u_i − u_l) − G(u_i − u_m) ≤ 0.

On the other hand, (x_i − x_l) f_l − (x_i − x_m) f_m ≥ 0: suppose that f_l > f_m; then (x_i − x_l) f_l > (x_i − x_m) f_l > (x_i − x_m) f_m. Now suppose that f_l ≤ f_m; then f_m (x_m − x_i) > f_m (x_l − x_i) ≥ f_l (x_l − x_i), which implies that (x_i − x_l) f_l − (x_i − x_m) f_m ≥ 0.

Therefore, we conclude that V_n[U_n] − V_n[S(U_n)] ≤ 0.
Proof of Lemma 4. Let {U_n} be a sequence in M and let {r_k} be a sequence that includes all the rational numbers in [0, 1]. Let {U_n^1} be a subsequence of {U_n} such that {U_n^1(r_1)} converges. From {U_n^1}, choose a further subsequence {U_n^2} such that {U_n^2(r_2)} converges. Keep going and define u_n ≡ U_n^n, so that the sequence of functions {u_n} is a subsequence of the original {U_n} and of each {U_n^k} for each k ∈ N. Thus, {u_n(r_k)}_{n=1}^∞ converges for each k ∈ N.

Now, define the function Û on the set {r_k : k ∈ N} as Û(r_k) ≡ lim_{n→∞} u_n(r_k) for k ∈ N. Extend Û to [0, 1] by defining Û(x) ≡ sup_{r_k < x} Û(r_k) for all x ∈ [0, 1]. Thus, Û ∈ M. Then, Û has only a countable set Q of discontinuities.

Let A ≡ Q ∩ {r_k : k ∈ N} and let {a_k} be a sequence of elements of A. By another diagonal argument over {a_k}, we can find a subsequence {w_n} of {u_n} such that {w_n} converges for all a_k ∈ A.

By monotonicity, we can show that {w_n} converges at all points of continuity of Û: let x_0 be a point of continuity of Û. Given ε > 0, pick r_k and r_i such that r_k < x_0 < r_i and Û(r_i) − Û(r_k) < ε/2. Fixing r_k and r_i, find n_0 ∈ N such that |u_n(r_k) − Û(r_k)| < ε/2 and |u_n(r_i) − Û(r_i)| < ε/2 for n > n_0. Then, Û(x_0) − ε < u_n(r_k) ≤ u_n(x_0) ≤ u_n(r_i) < Û(x_0) + ε, which implies that u_n → Û at all points of continuity of Û. Then, w_n → Û at all points of continuity of Û, which implies that w_n converges at all points of [0, 1] to some function U, which is monotone, non-decreasing.
B Second order conditions

For the sake of completeness, note that
\[
\frac{\partial^2}{\partial \alpha^2} V[U + \alpha\, \delta U] \Big|_{\alpha = 0}
= \int_0^1 \int_0^1 (x - y) f(x) f(y)\, g'\big(U(x) - U(y)\big) \big[\delta U(x) - \delta U(y)\big]^2\, dx\, dy \le 0,
\]
since U ∈ M.
References

Barberis, Nicholas C. 2013. “Thirty Years of Prospect Theory in Economics: A Review and Assessment.” Journal of Economic Perspectives, 27(1): 173–196.

Chernozhukov, V., I. Fernández-Val, and A. Galichon. 2009. “Improving point and interval estimators of monotone functions by rearrangement.” Biometrika, 96(3): 559–575.

Dawkins, Richard. 1976. The Selfish Gene. 2006 ed., New York: Oxford University Press.

Elsgolc, Lev D. 2007. Calculus of Variations. Dover ed., Mineola, New York: Dover Publications.

Ely, Jeffrey C. 2011. “Kludged.” American Economic Journal: Microeconomics, 3(August 2011): 210–231.

Fehr, Ernst, and Antonio Rangel. 2011. “Neuroeconomic Foundations of Economic Choice—Recent Advances.” Journal of Economic Perspectives, 25(4): 3–30.

Herold, Florian, and Nick Netzer. 2015. “Second-best Probability Weighting.”

Kahneman, Daniel, and Amos Tversky. 1979. “Prospect Theory: An Analysis of Decision under Risk.” Econometrica, 47(2): 263–292.

Montesinos, Vicente, Peter Zizler, and Václav Zizler. 2015. An Introduction to Modern Analysis. 1st ed., Springer International Publishing.

Netzer, Nick. 2009. “Evolution of Time Preferences and Attitudes toward Risk.” The American Economic Review, 99(3): 937–955.

Rayo, Luis, and Arthur J. Robson. 2014. “Biology and the Arguments of Utility.”

Rayo, Luis, and Gary S. Becker. 2007. “Evolutionary Efficiency and Happiness.” Journal of Political Economy, 115(2): 302–337.

Robson, Arthur J. 2001. “The Biological Basis of Economic Behavior.” Journal of Economic Literature, 39(1): 11–33.

Robson, Arthur J., and Larry Samuelson. 2010. “The Evolutionary Foundations of Preferences.” In Handbook of Social Economics.

Sinn, Hans-Werner. 2003. “Weber’s Law and the Biological Evolution of Risk Preferences: The Selective Dominance of the Logarithmic Utility Function, 2002 Geneva Risk Lecture.” The Geneva Papers on Risk and Insurance Theory, 28(2): 87–100.