Perception, Utility and Evolution

José F. Tudón M.∗

October 14, 2016

Abstract

This paper presents a model of the evolution of the hedonic utility function in which perception is imperfect. Netzer (2009) considers a model with perfect perception and finds that the optimal utility function allocates marginal utility where decisions are made frequently. This paper shows that it is also beneficial to allocate marginal utility away from more perceptible events. The introduction of perceptual errors can lead to qualitatively different utility functions, such as discontinuous functions with flat regions rather than continuous and strictly increasing functions.

1 Introduction

Based on the hypothesis that individual decisions are guided by hedonic utility, several recent studies have found an evolutionary foundation for S-shaped utility functions. Though they use different mechanisms, Netzer (2009) and Rayo and Becker (2007) are two notable examples.¹ Netzer (2009) derives a utility function which is steep in regions where decisions have to be made frequently, and where wrong decisions would lead to large losses. Rayo and Becker (2007) consider the case of an individual with local information and derive S-shaped utility functions that spread incentives over the output domain. Besides advancing our understanding of happiness as a tool for decision making, these recent contributions are also linked to risk preferences.

∗ Department of Economics, The University of Chicago, 1126 E. 59th St., Chicago, IL 60637 (e-mail: [email protected]). Thanks to Benjamin Brooks, Alex Frankel, Emir Kamenica, Philip Reny, Luis Silvestre, Balázs Szentes, Richard Van Weelden and workshop participants at the University of Chicago.
¹ See also Robson (2001). Robson and Samuelson (2010) survey the literature.
Because of this link, and the possibility of an evolutionary foundation of the S-shape, the debate is followed by both proponents and detractors of prospect theory.²

This paper builds on the framework developed in Netzer (2009) by allowing limited perception to play a prominent role. In this more general setting, Netzer's intuition can be overturned because nature faces a trade-off between perception and robustness. Indeed, nature may not allocate marginal utility in regions where decisions have to be made frequently. Thus, this paper shows what is lost by approximating limited perception with perfect perception.

2 The environment

The environment is based on Netzer (2009). Suppose that an agent makes repeated choices between two options, x and y. These options are independently drawn from a normalized set, [0, 1], according to a distribution function F with full support. The options are identified with fitness levels: option x yields fitness x, where fitness can simply be thought of as the number of offspring. The agent has a hedonic utility function U(x), which is shaped by natural selection. Naturally, the agent chooses the option that makes her happier. Because evolution, through natural selection, optimizes the utility function, I will use the common metaphor of a gene that maximizes offspring. Thus, the selfish gene will design the utility function that promises the largest expected fitness.³

However, not all utility functions are admissible. Besides focusing on measurable utility functions, there will be two constraints. First, the human brain has a limited size; thus, happiness must be bounded. Second, the human senses are imperfect; thus, happiness cannot be perceived without errors.⁴ More formally, I will make the following assumptions.

Assumption 1 (Bounded happiness). Without loss of generality, the utility function must be bounded between 0 and 1.

Assumption 2 (Limited perception). The agent perceives the happiness of x as U(x) + ε, where the error ε is a random variable.⁵

² See Kahneman and Tversky (1979). For a recent review of prospect theory, see Barberis (2013).
³ See Dawkins (1976). I assume that the utility function values offspring. See Rayo and Robson (2014) for a theory of the evolution of utility functions over intermediate goods such as food, shelter, etc.

2.1 The difference with Netzer (2009)

Assumption 2 deserves further discussion because it is the main departure from Netzer (2009). In his paper, Netzer models limited perception "by assuming that utility can take only discrete, albeit extremely numerous, values. [...] The set of admissible utility functions is thus restricted to the set of increasing step functions with N ∈ ℕ jumps, each corresponding to a utility increment of size 1/N. As a result, the agent cannot distinguish two alternatives located on the same step of the utility function. Any choice between such alternatives will have to be random, and a mistake can occur." To derive his main result, Netzer takes the limit as N → ∞, so perception becomes perfect.

Netzer takes this elegant approach because, with finite N, the optimization problem gets complicated, to say the least. In this way, Netzer finds a solution with a clear interpretation: the perfect-perception solution is the limit of a sequence of solutions with limited perception. However, by maintaining limited perception we can focus on the issues that we miss by taking the limit. It turns out that, in theory, the omissions may be considerable.

⁴ See Rayo and Becker (2007) or Netzer (2009) for further details.
⁵ Note that limited perception of happiness is different from limited perception of the alternatives. If x is perceived with noise, then evolution will operate on the senses needed to distinguish one alternative from another. In this model, however, we are interested in the consequences of perceiving the utility U(x) with noise.
On the other hand, what we lose in the approximation we gain in tractability. As we shall see, with limited perception the existence of an analytic solution is not guaranteed.

3 The gene's problem

With limited perception, the agent gets confused. She will choose x over y if and only if U(x) + ε_x ≥ U(y) + ε_y, where the errors are iid and share the spirit of the drift-diffusion model of decision making (Fehr and Rangel, 2011). With a slight abuse of notation, let ε ≡ ε_y − ε_x and let G denote the distribution of ε. Note that ε is symmetric by construction.⁶ Then, the agent chooses x over y if and only if U(x) − U(y) ≥ ε, with x, y ∼ F and ε ∼ G, all independent. Assume that the distributions can be represented by bounded densities f and g with finitely many discontinuities. In Section 7, I explore the possibility of multiplicative errors and their connection to Weber's Law.

The agent chooses x with probability P[U(x) > U(y) + ε] = P[ε < U(x) − U(y)] = G(U(x) − U(y)). Let V denote the expected fitness given a utility function U. Thus,

V[U] = \int_0^1 \int_0^1 x f(x) f(y) G(U(x) - U(y)) \, dx \, dy + \int_0^1 \int_0^1 y f(x) f(y) [1 - G(U(x) - U(y))] \, dx \, dy,

and therefore

V[U] = E[X] + \int_0^1 \int_0^1 (x - y) f(x) f(y) G(U(x) - U(y)) \, dx \, dy.

The gene's problem becomes

max_{U ∈ 𝒰} V[U] subject to 0 ≤ U(x) ≤ 1,   (1)

where 𝒰 is the set of admissible functions, which I take to be the set of measurable functions bounded between 0 and 1.

⁶ If Z and Z′ are iid, then Z^s ≡ Z − Z′ ∼ Z′ − Z = −(Z − Z′) = −Z^s, which implies that Z^s is a symmetric random variable.

Remark. The function U appears twice inside the integral with different running variables. Intuitively, any change of U at any point must be weighed against the whole function. Thus, this problem is not local.

4 A simple example

Before proceeding with the formal analysis, this section introduces a simple example that overturns the intuition of allocating more marginal utility where decisions have to be made more frequently.
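The objective above can be evaluated numerically on a grid. The following sketch is ours, not the paper's code; the helper name `expected_fitness` is an assumption for illustration. For uniform outcomes, uniform errors on [−1, 1], and linear utility U(x) = x, the objective reduces analytically to E[X] + Var(X) = 1/2 + 1/12, which the discretization reproduces.

```python
import numpy as np

# Hypothetical helper (not from the paper): evaluate
#   V[U] = E[X] + integral (x - y) f(x) f(y) G(U(x) - U(y)) dx dy
# on a midpoint grid, for user-supplied density f, error cdf G, and utility U.
def expected_fitness(U, f, G, n=400):
    x = (np.arange(n) + 0.5) / n              # midpoint grid on [0, 1]
    w = f(x)
    w = w / w.sum()                           # discrete probability weights
    EX = np.sum(x * w)
    Ux = U(x)
    V = EX + np.sum((x[:, None] - x[None, :])     # (x - y)
                    * (w[:, None] * w[None, :])   # f(x) f(y)
                    * G(Ux[:, None] - Ux[None, :]))
    return V

# Uniform outcomes, uniform errors on [-1, 1], linear utility U(x) = x.
f_unif = lambda x: np.ones_like(x)
G_unif = lambda e: np.clip((1.0 + e) / 2.0, 0.0, 1.0)
V_linear = expected_fitness(lambda x: x, f_unif, G_unif)   # analytically 1/2 + 1/12
```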
Suppose that the errors are uniform on [−1, 1], that is, G(ε) = (1 + ε)/2, and that the outcomes X have pdf f. In this case,

V[U] = E[X] + \frac{1}{2} \int_0^1 \int_0^1 (x - y) f(x) f(y) [1 + U(x) - U(y)] \, dx \, dy.

After simplification,

V[U] = E[X] + \int_0^1 (x - E[X]) f(x) U(x) \, dx.

This implies that the optimal utility function U∗ is an indicator function for x ≥ E[X]:

U∗(x) = 1 if x ≥ E[X], and U∗(x) = 0 if x < E[X].

Note that this is optimal regardless of f. In fact, the optimal utility function allocates all the marginal utility at E[X] even if f(E[X]) is low. Figure 1 illustrates this case: two different distributions have the same optimal utility function, a step function, even though one pdf allocates more mass at the extremes and the other at the center.

Figure 1: Two different distributions with the same optimal utility function (plot of f(x) against outcomes x). Note: The two distributions are Beta(10, 10) and Beta(0.5, 0.5), which implies that E[X] = 1/2 for both. For both distributions, the optimal utility function is U∗(x) = 1{x ≥ 1/2}.

Figure 2 compares the optimal utility function with the result of Netzer (2009) when the pdf is a Beta(0.5, 0.5), which allocates more mass at the extremes. The optimal utility function is a step function, flat almost everywhere; yet U∗ has infinite marginal utility at x = 1/2 even though f(1/2) is the minimum of f. The intuition of Netzer is overturned because, once we allow for errors, the optimization problem becomes non-local: the local marginal utility must be weighed against the marginal utility evaluated at every other point. In other words, mistakes occur not only between alternatives with similar fitness but also between alternatives with very different fitness, so the optimal solution is determined not only by the local density but by the global probability distribution.
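The simplified objective under uniform errors can be checked numerically. The sketch below is ours (not the paper's code): on a grid with uniform outcomes, so E[X] = 1/2, the indicator U∗(x) = 1{x ≥ E[X]} yields V = 0.625, beating for instance the linear utility U(x) = x, which yields roughly 0.583.

```python
import numpy as np

# Numerical check of  V[U] = E[X] + integral (x - E[X]) f(x) U(x) dx
# under uniform errors: the indicator at E[X] beats a strictly increasing U.
x = (np.arange(2000) + 0.5) / 2000        # midpoint grid on [0, 1]
w = np.ones_like(x) / x.size              # uniform outcome weights, E[X] = 1/2
EX = np.sum(x * w)

def V(U):
    return EX + np.sum((x - EX) * w * U)

V_step = V((x >= EX).astype(float))       # the optimal indicator utility
V_linear = V(x)                           # a strictly increasing alternative
```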
Granted, this example is extreme because the errors are big. However, as we will see, the intuition holds for errors of any size: as long as there is limited perception, there will be flat regions in the utility function regardless of f.

Figure 2: Comparison of the optimal utility function and Netzer's result. Note: The outcomes X ∼ Beta(0.5, 0.5), which implies that E[X] = 1/2. The optimal utility function is U∗(x) = 1{x ≥ 1/2}. The utility function in Netzer (2009) equals U(x) = c \int_0^x f(z)^{2/3} \, dz, where f is the pdf of X and c is a normalizing constant.

5 Existence and uniqueness

The first step in tackling this problem is to ensure existence and uniqueness of a solution. As is, we cannot apply an existence theorem on the set 𝒰 because it is not compact. But we can prove the theorem for a subset of 𝒰: the subset of monotone functions. The following lemma will help us show that if a solution exists, it must be monotone, nondecreasing.

Lemma 1. For every U ∈ 𝒰 there exists a monotone, nondecreasing function U∗ ∈ 𝒰 such that V[U] ≤ V[U∗].

Proof. This proof is not as trivial as it seems, since changing U at any point changes the whole integral. The strategy is to consider simple functions first, and then apply the dominated convergence theorem for the general case.

The objective function can be written as

V[U] = K + 2 \int_0^1 \int_y^1 (x - y) f(x) f(y) G(U(x) - U(y)) \, dx \, dy,

where K = E[X] + \int_0^1 \int_y^1 (y - x) f(x) f(y) \, dx \, dy.

Suppose that U_n is a simple function that is constant on the intervals [(s−1)/n, s/n), s = 1, …, n. Each simple function of this form can be written as the step function U_n(x) = \sum_{s=1}^n u_s 1_{[(s-1)/n, s/n)}(x). Now we can define the n-vector u as the vector with entries u_s, where u_s is the value that U_n(x) takes on the sth interval.
From this definition we see that each n-vector corresponds to a step function. We now define a sorting operator S acting on the vector u as follows. Let l be an integer in 1, …, n such that u_l > u_m for some m > l. If such an l exists, set S(u) to be the n-vector with u_m in the lth position and u_l in the mth position, and all other elements equal to the corresponding elements of u. If no such l exists, set S(u) = u. Thus, S sorts two elements of u into increasing order.⁷

Similarly, let x_n and f_n be two simple functions; the idea is that x_n → x and f_n → f. With a slight abuse of notation, also let x_n and f_n stand for the n-vectors with x_s and f_s in the sth position. Therefore, we can approximate V[U_n] with

V_n[U_n] ≡ K + 2 \int_0^1 \int_y^1 (x_n(x) - x_n(y)) f_n(x) f_n(y) G(U_n(x) - U_n(y)) \, dx \, dy = K + \frac{2}{n^2} \sum_{j=1}^{n} \sum_{i=j}^{n} (x_i - x_j) f_i f_j G(u_i - u_j).

In Appendix A, I prove the following lemma.

Lemma 2. V_n[U_n] ≤ V_n[S(U_n)].

Therefore, if we apply the sorting operator S sufficiently many times to U_n, we obtain the rearranged vector U_n∗: a vector completely sorted in ascending order. Every application of S weakly improves V_n, and since we apply it finitely many times,

V_n[U_n∗] ≡ V_n[S ∘ S ∘ ⋯ ∘ S U_n] ≥ V_n[U_n].

⁷ For example, if u = (5, 2, 3), then S(u) = (2, 5, 3).

Now that the result holds for simple functions, we turn to the general case. Since f, U and the identity are measurable functions, there are sequences of bounded simple functions {f_n}, {U_n} and {x_n} converging almost everywhere to f, U and x. Define the increasing rearrangement of U as

U∗(x) ≡ inf{ q ∈ ℝ : x ≤ \int_0^1 1\{U(z) ≤ q\} \, dz }.

This rearrangement always exists; it amounts to sorting the values of U in increasing order, and it is a quantile function. From this definition, U_n → U a.e. implies that the rearrangements also converge a.e. to the rearrangement: U_n∗ → U∗ a.e. (see Chernozhukov, Fernández-Val and Galichon (2009) and the references therein).
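The effect of the sorting operator can be illustrated numerically in the simplest case of constant grid weights f_i (a uniform outcome density), where each swap performed by S, and hence full sorting, provably weakly increases the discrete objective. This sketch is ours, not the paper's code, and deliberately restricts to constant f_i:

```python
import numpy as np

# Illustration of the sorting step behind Lemma 2 with constant weights f_i:
# sorting u ascending weakly increases
#   Vn(u) = sum_{i,j} (x_i - x_j) f_i f_j G(u_i - u_j).
def Vn(u, x, f, G):
    return float(np.sum((x[:, None] - x[None, :])
                        * (f[:, None] * f[None, :])
                        * G(u[:, None] - u[None, :])))

G = lambda e: np.clip((1.0 + e) / 2.0, 0.0, 1.0)  # uniform-error cdf
n = 40
x = np.linspace(0.0, 1.0, n)                      # outcomes, ascending
f = np.full(n, 1.0 / n)                           # constant weights
u = np.random.default_rng(0).random(n)            # an arbitrary utility vector
assert Vn(np.sort(u), x, f, G) >= Vn(u, x, f, G)
```

With constant f_i, the per-swap change reduces to a sum of terms f² (x_m − x_l) [G(u_i − u_l) − G(u_i − u_m)], each nonpositive when u_l > u_m, which is why the assertion holds for any starting vector.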
Moreover, V_n[U_n] → V[U] and, since the inequality holds for every n, Lebesgue's dominated convergence theorem yields V[U] ≤ V[U∗].

Therefore, Lemma 1 allows us to look for the optimum in the set of nondecreasing monotone functions. Let 𝓜 ⊂ 𝒰 denote the set of admissible monotone functions:

𝓜 ≡ { U : [0, 1] → [0, 1] : U is monotone, nondecreasing },

and note that 𝓜 is closed, bounded and convex. From here, we can establish existence and uniqueness. Since the optimal utility function is not pinned down where f(x) = 0, I formally assume full support, except perhaps at the extremes.

Assumption 3. Suppose that f(x) > 0 for all x ∈ (0, 1). Moreover, without loss of generality, if f(0) = 0 then set U(0) = 0, and if f(1) = 0 then set U(1) = 1.

Proposition 3. A solution to max_{U ∈ 𝓜} V[U] exists. Moreover, if G is strictly concave on [0, ∞), and hence strictly convex on (−∞, 0], then the solution is unique.

Proof. I will prove that the maximum is attained. Let {U_n} be a sequence such that V[U_n] → sup{V[U] : U ∈ 𝓜}. I will use the following lemma, known as Helly's first theorem (Montesinos, Zizler and Zizler, 2015); its proof is in Appendix A.

Lemma 4. If {U_n} is a sequence in 𝓜, then there is a subsequence {U_{n_k}} that converges pointwise on [0, 1] to a function U ∈ 𝓜.

By Lemma 4, G(U_{n_k}(x) − U_{n_k}(y)) → G(U(x) − U(y)) almost everywhere—the "almost everywhere" arises because a sequence of right-continuous functions can converge to a left-continuous one. Since G ≤ 1, V is bounded. Therefore, we can apply Lebesgue's dominated convergence theorem to conclude that V[U_{n_k}] → V[U]. Therefore, V attains its supremum at U.

To prove uniqueness, note that V[U] can be written as

V[U] = E[X] + \int_0^1 \int_0^y (y - x) f(x) f(y) G(U(y) - U(x)) \, dx \, dy + \int_0^1 \int_y^1 (y - x) f(x) f(y) G(U(y) - U(x)) \, dx \, dy.

Here, we can see that (y − x) G(U(y) − U(x)) is concave in U when G is concave on [0, ∞) and convex on (−∞, 0].
Then, V is concave and the maximum is unique.

6 The optimal utility function

Since the optimization problem is non-local, the properties of the solution are hard to characterize. Indeed, the calculus-of-variations approach leads to an integral equation, which in general has no analytic solution. However, we can prove one thing: there will be flat regions in the utility function.

Proposition 5. Suppose that G is concave on [0, ∞). Then there exist 0 < a ≤ b < 1 such that the utility function U∗ that solves the problem defined in (1) equals 0 for x ∈ [0, a] and equals 1 for x ∈ [b, 1].

Proof. Let us begin by writing the Lagrangian:

L[U, λ₀, λ₁] = V + \int_0^1 [ λ₀(x) U(x) − λ₁(x) (U(x) − 1) ] \, dx.   (2)

The Karush–Kuhn–Tucker conditions are:

−U(x) ≤ 0, λ₀(x) ≥ 0, λ₀(x) U(x) = 0;
U(x) − 1 ≤ 0, λ₁(x) ≥ 0, λ₁(x) (U(x) − 1) = 0.   (3)

In the jargon of the calculus of variations,⁸ for U∗ to solve (1) we need the variation of L to equal zero along U∗; in other words, the first-order condition is δL = 0. The variation δL is defined as

δL ≡ \frac{∂}{∂α} L[U(x) + α δU(x)] \big|_{α=0},

where δU ∈ 𝒰 is an admissible perturbation, close enough to U that U + α δU ∈ 𝒰 for all α sufficiently close to 0. I start with δV and then continue with the rest of δL:

δV ≡ \frac{∂}{∂α} V[U(x) + α δU(x)] \big|_{α=0}
= \int_0^1 \int_0^1 (y − x) f(x) f(y) g(U(y) − U(x)) [δU(y) − δU(x)] \, dx \, dy
= \int_0^1 \int_0^1 (y − x) f(x) f(y) g(U(y) − U(x)) δU(y) \, dx \, dy − \int_0^1 \int_0^1 (y − x) f(x) f(y) g(U(y) − U(x)) δU(x) \, dx \, dy.

Because x and y are dummy variables, rename them in the second integral: where it says x, write y, and where it says y, write x. Then, because U is measurable, apply Fubini's theorem to interchange the order of integration. Finally, since g is symmetric, we get

δV = \int_0^1 δU(y) \left[ \int_0^1 2 (y − x) f(x) f(y) g(U(y) − U(x)) \, dx \right] dy.

⁸ See Elsgolc (2007).

Define the inner integral as I(y), a function of y. Then,

δV = \int_0^1 δU(y) I(y) \, dy.

Now we turn to the second part of δL. Using δL = \frac{∂}{∂α} L[U + α δU] |_{α=0}, we get

δL = δV + \int_0^1 [ λ₀(x) − λ₁(x) ] δU(x) \, dx = \int_0^1 δU(x) [ I(x) + λ₀(x) − λ₁(x) ] \, dx.

Since δL = 0 must hold for any arbitrary (but close) deviation δU, subject to some regularity conditions we can apply the fundamental theorem of the calculus of variations to conclude that the first-order condition⁹ becomes

I(x) + λ₀(x) − λ₁(x) = 0 for all x ∈ [0, 1].

Therefore,

I(x) < 0 if U(x) = 0; I(x) = 0 if U(x) ∈ (0, 1); I(x) > 0 if U(x) = 1.

However, note that

I(0) = −2 f(0) \int_0^1 y f(y) g(U(0) − U(y)) \, dy.

It follows that, for any U ∈ 𝓜, if f(0) > 0 then I(0) < 0, and if f(0) = 0 then I(0) = 0 and, by assumption, U(0) = 0. In both cases, for a small enough a > 0 we conclude that I(a) < 0. Therefore, U(a) = 0 and, by monotonicity, U(x) = 0 for all x ∈ [0, a]. The case of I(1) = 2 f(1) \int_0^1 (1 − y) f(y) g(U(1) − U(y)) \, dy > 0 is analogous, so that I(b) > 0 for b close enough to 1. Again by monotonicity, U(x) = 1 for all x ∈ [b, 1].

⁹ We already discussed existence and uniqueness, but see Appendix B for the second-order conditions.

The proof offers little intuition, but consider I(0), which represents what happens if U(0) changes at the optimum. I(0) contains an integral because U(0) must be compared against the whole function U(y). Thus, the optimal U∗ saves marginal utility at the edges, where it makes little difference, in order to have a larger slope in the middle. So what is special about the edges? Intuitively, since the utility at the extremes is extreme, the event of a high x versus a low y is easy to decide. The case of a high x versus a medium y, however, is more difficult. To solve this, the gene builds perception for the case of high x versus medium y by compromising perception for the case of high x versus high y.
This results in a robust strategy: when errors are big, optimize behavior by placing extreme utility in the regions where losses and rewards are high.

The following corollary gives a sense in which Netzer's result is the limit case of perfect perception, in which case there are no flat regions in the utility function.

Corollary 6. Suppose that the errors ε ∼ N(0, σ²). Then, as σ² → 0, there are no flat regions in the optimal utility function.

Proof. Given σ², choose a small enough a > 0 such that I(a) < 0. In this case,

I(a) = 2 f(a) \int_0^1 (a − y) f(y) \frac{1}{σ\sqrt{2π}} e^{−[U(a)−U(y)]²/(2σ²)} \, dy → 0

as σ² → 0. The case for I(b) > 0 is analogous.

As mentioned before, the non-locality of the integral equation complicates any further analysis. In particular, the integral equation may not be solvable by Picard iterations (cf. Montesinos, Zizler and Zizler (2015)). A sensible conjecture is to posit the existence of a continuous, differentiable function u : (a, b) → (0, 1) such that

U(x) = 0 for x ≤ a; U(x) = u(x) for x ∈ (a, b); U(x) = 1 for x ≥ b.

This is not easy to show. However, numerical exploration of the problem has yet to provide a counterexample. Figure 3 shows how the optimal utility function behaves as the size of the error changes: as σ² decreases, the flat regions shrink. This property holds generally across numerical trials.¹⁰ Finally, Figure 4 repeats the exercise with a skewed distribution for the outcomes. The resulting utility functions are equally skewed, showing loss aversion and diminishing sensitivity around the mean, E[X].

7 Concluding remarks

The numerical and analytical results suggest that, when errors are large, marginal utility concentrates around the mean, as in Figures 3 and 4. However, I have not been able to prove this. On the other hand, the intuition for robustness holds: with limited perception, the robust approach pushes toward distinguishing relatively bad outcomes from good ones.
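The discretization behind the numerical trials can be sketched as follows. This is a hedged stand-in, not the author's code (the paper used Knitro via Matlab; footnote 10): we maximize the discrete objective over a utility vector in [0, 1]^n with the composite error ε ∼ N(0, σ²), using scipy's bounded L-BFGS-B.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

# Illustrative stand-in for the discrete optimization behind Figures 3-4:
# maximize  sum_{i,j} (x_i - x_j) f_i f_j G(u_i - u_j)  over u in [0,1]^n,
# with G the cdf of the composite error epsilon ~ N(0, sigma2).
def solve_discrete(n=30, sigma2=0.5, alpha=4.0, beta=4.0):
    x = np.linspace(0.0, 1.0, n)
    f = x**(alpha - 1.0) * (1.0 - x)**(beta - 1.0)   # Beta(alpha, beta) kernel
    f = f / f.sum()
    sd = np.sqrt(sigma2)

    def negV(u):
        G = norm.cdf((u[:, None] - u[None, :]) / sd)
        return -np.sum((x[:, None] - x[None, :])
                       * (f[:, None] * f[None, :]) * G)

    u0 = x.copy()                                     # start from linear utility
    res = minimize(negV, u0, bounds=[(0.0, 1.0)] * n)
    return x, res.x, -res.fun, -negV(u0)

x_grid, u_star, V_star, V_init = solve_discrete()
```

Plotting `u_star` against `x_grid` for decreasing `sigma2` reproduces, qualitatively, the shrinking flat regions reported in the text.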
On multiplicative errors, note the connection with the psychophysical law of Weber and Fechner, which states that noticeable differences between two stimuli are proportional to the magnitude of the stimuli.¹¹ In fact, "[Fechner] argued that the sensation function is logarithmic since equal relative changes in the stimuli brought about equal absolute changes in sensation" (Sinn, 2003). Back to this paper: instead of additive errors, suppose that the utility function Û is shocked with a proportional (multiplicative) error, so that x is selected over y iff Û(x) ε_x ≥ Û(y) ε_y. Then, taking logs and renaming U ≡ ln Û, we get that x is chosen iff U(x) − U(y) ≥ ε ≡ ln(ε_y / ε_x). Assuming that these expressions make (technical) sense, it can be shown that the results hold with minor rescaling. Because this is a model of cardinal utility functions, Weber's Law is mathematically equivalent to the present model.

¹⁰ I used Knitro, via Matlab, to solve a discrete version of the optimization problem. The codes are available upon request or at home.uchicago.edu/~jtudon.
¹¹ In other words, between two weights, the agent notices only differences of, say, 10% or more. See Sinn (2003).

Figure 3: Optimal U∗ for different sizes of errors (plot of U(x) against outcomes x for σ² = 1, 0.5, 0.1 and Netzer (2009)). Note: The errors ε ∼ N(0, σ²) are compared for σ² = 1, 0.5, 0.1. The outcomes are X ∼ Beta(4, 4), which has a bell-shaped density. Netzer (2009) corresponds to the case σ² → 0. To put it in perspective, σ² = 0.1 means that 5% of mistakes are going to be bigger than 0.2 in absolute value.

As a final thought, let us consider extrapolating the resulting utility function to an environment with risk, where the agent faces lotteries: for example, p[x] + q[y] would be the lottery of getting x with probability p and y with probability q. There is a problem.
Netzer (2009) and I solved for the utility function over outcomes x, not over lotteries p[x] + q[y]. In the risky environment, the gene would solve for an optimal value function that takes into account outcomes and probabilities—as in prospect theory. But even if the optimal value function can be separated into a utility function and a probability weighting function, that utility function will likely differ from the optimal utility function of the environment without risk. Herold and Netzer (2015) deal precisely with this problem as they explore the possibility that the gene chooses a probability weighting function given that the utility function is S-shaped. In this sense, prospect theory functions evolved as a series of kludges (Ely, 2011). All in all, we have to be cautious about extrapolating our result to a risky environment. Having said that, this paper makes us aware of the costs and benefits of assuming limited perception.

Figure 4: Optimal U∗ for different sizes of errors and skewed f (plot of U(x) against outcomes x for σ² = 1, 0.5, 0.1 and Netzer (2009)). Note: The errors ε ∼ N(0, σ²) are compared for σ² = 1, 0.5, 0.1. The outcomes are X ∼ Beta(3, 5), whose density concentrates mass to the left. Netzer (2009) corresponds to the case σ² → 0. The utility functions exhibit loss aversion and diminishing sensitivity around x = 0.375 = E[X].

A Proofs

Proof of Lemma 2. Consider the expression

V_n[U_n] − V_n[S(U_n)] = \frac{2}{n^2} \sum_{j=1}^{n} \sum_{i=j}^{n} (x_i − x_j) f_i f_j [G(u_i − u_j) − G(Su_i − Su_j)].

First, observe that G(u_i − u_j) − G(Su_i − Su_j) = 0 for all i, j ∉ {l, m}. Next, for i = l there are summands j = 1, …, l, and for i = m there are summands j = 1, …, m; similarly, for j = l there are summands i = l, …, n, and for j = m there are summands i = m, …, n. However, we are double-counting three summands, two of which are zero; the nonzero summand corresponds to j = l and i = m. Finally, note that G(u_m − u_j) − G(Su_m − Su_j) = −[G(u_j − u_m) − G(Su_j − Su_m)]. Thus,

\frac{n^2}{2} (V_n[U_n] − V_n[S(U_n)])
= \sum_{i=l}^{n} (x_i − x_l) f_i f_l [G(u_i − u_l) − G(Su_i − Su_l)]
+ \sum_{i=m}^{n} (x_i − x_m) f_i f_m [G(u_i − u_m) − G(Su_i − Su_m)]
+ \sum_{j=1}^{l} (x_l − x_j) f_l f_j [G(u_l − u_j) − G(Su_l − Su_j)]
+ \sum_{j=1}^{m} (x_m − x_j) f_m f_j [G(u_m − u_j) − G(Su_m − Su_j)]
− (x_m − x_l) f_m f_l [G(u_m − u_l) − G(u_l − u_m)]
= \sum_{i=1}^{n} (x_i − x_l) f_i f_l [G(u_i − u_l) − G(Su_i − Su_l)]
+ \sum_{i=1}^{n} (x_i − x_m) f_i f_m [G(u_i − u_m) − G(Su_i − Su_m)]
− (x_m − x_l) f_m f_l [G(u_m − u_l) − G(u_l − u_m)].

At this point, use the definition of S to conclude that the preceding sum equals

\sum_{i=1, i≠l,m}^{n} (x_i − x_l) f_i f_l [G(u_i − u_l) − G(u_i − u_m)]
+ \sum_{i=1, i≠l,m}^{n} (x_i − x_m) f_i f_m [G(u_i − u_m) − G(u_i − u_l)]
+ (x_m − x_l) f_m f_l [G(u_m − u_l) − G(u_l − u_m)]
= \sum_{i=1}^{n} [(x_i − x_l) f_l − (x_i − x_m) f_m] f_i [G(u_i − u_l) − G(u_i − u_m)].

It remains to show that this expression is nonpositive. By assumption u_l > u_m; moreover, G is a nondecreasing cdf and x_l < x_m. Therefore, G(u_i − u_l) − G(u_i − u_m) ≤ 0. On the other hand, (x_i − x_l) f_l − (x_i − x_m) f_m ≥ 0: suppose that f_l > f_m; then (x_i − x_l) f_l > (x_i − x_m) f_l > (x_i − x_m) f_m. Now suppose that f_l ≤ f_m; then f_m (x_m − x_i) > f_m (x_l − x_i) ≥ f_l (x_l − x_i), which implies that (x_i − x_l) f_l − (x_i − x_m) f_m ≥ 0. Therefore, we conclude that V_n[U_n] − V_n[S(U_n)] ≤ 0.

Proof of Lemma 4. Let {U_n} be a sequence in 𝓜 and let {r_k} be a sequence that includes all the rational numbers in [0, 1]. Let {U_n^1} be a subsequence of {U_n} such that {U_n^1(r_1)} converges. From {U_n^1}, choose a further subsequence {U_n^2} such that {U_n^2(r_2)} converges. Continuing in this way, define u_n ≡ U_n^n, so that the sequence of functions {u_n} is a subsequence of the original {U_n} and of each {U_n^k} for each k ∈ ℕ. Thus, {u_n(r_k)}_{n=1}^∞ converges for each k ∈ ℕ.

Now, define the function Û on the set {r_k : k ∈ ℕ} by Û(r_k) ≡ lim_{n→∞} u_n(r_k) for k ∈ ℕ. Extend Û to [0, 1] by defining Û(x) ≡ sup_{r_k < x} Û(r_k) for all x ∈ [0, 1]. Thus, Û ∈ 𝓜. Then, Û has only a countable set Q of discontinuities. Let A ≡ Q ∩ {r_k : k ∈ ℕ} and let {a_k} be a sequence of the elements of A. By another diagonal argument over {a_k}, we can find a subsequence {w_n} of {u_n} such that {w_n} converges at every a_k ∈ A.

By monotonicity, we can show that {w_n} converges at all points of continuity of Û: let x₀ be a point of continuity of Û. Given ε > 0, pick r_k and r_i such that r_k < x₀ < r_i and Û(r_i) − Û(r_k) < ε/2. Fixing r_k and r_i, find n₀ ∈ ℕ such that |u_n(r_k) − Û(r_k)| < ε/2 and |u_n(r_i) − Û(r_i)| < ε/2 for n > n₀. Then, Û(x₀) − ε < u_n(r_k) ≤ u_n(x₀) ≤ u_n(r_i) < Û(x₀) + ε, which implies that u_n → Û at all points of continuity of Û. Then, w_n → Û at all points of continuity of Û, which implies that {w_n} converges at all points of [0, 1] to some function U, which is monotone and nondecreasing.

B Second-order conditions

For the sake of completeness, note that \frac{∂^2}{∂α^2} V[U + α δU] \big|_{α=0} equals

\int_0^1 \int_0^1 (x − y) f(x) f(y) g'(U(x) − U(y)) [δU(x) − δU(y)]^2 \, dx \, dy ≤ 0,

since U ∈ 𝓜 and g is nonincreasing on [0, ∞) when G is concave there.

References

Barberis, Nicholas C. 2013. "Thirty Years of Prospect Theory in Economics: A Review and Assessment." Journal of Economic Perspectives, 27(1): 173–196.

Chernozhukov, V., I. Fernández-Val, and A. Galichon. 2009. "Improving point and interval estimators of monotone functions by rearrangement." Biometrika, 96(3): 559–575.

Dawkins, Richard. 1976. The Selfish Gene. 2006 ed. New York: Oxford University Press.

Elsgolc, Lev D. 2007. Calculus of Variations. Dover ed. Mineola, NY: Dover Publications.

Ely, Jeffrey C. 2011. "Kludged." American Economic Journal: Microeconomics, 3(3): 210–231.

Fehr, Ernst, and Antonio Rangel. 2011.
"Neuroeconomic Foundations of Economic Choice—Recent Advances." Journal of Economic Perspectives, 25(4): 3–30.

Herold, Florian, and Nick Netzer. 2015. "Second-best Probability Weighting."

Kahneman, Daniel, and Amos Tversky. 1979. "Prospect Theory: An Analysis of Decision under Risk." Econometrica, 47(2): 263–292.

Montesinos, Vicente, Peter Zizler, and Václav Zizler. 2015. An Introduction to Modern Analysis. 1st ed. Springer International Publishing.

Netzer, Nick. 2009. "Evolution of Time Preferences and Attitudes toward Risk." The American Economic Review, 99(3): 937–955.

Rayo, Luis, and Arthur J. Robson. 2014. "Biology and the Arguments of Utility."

Rayo, Luis, and Gary S. Becker. 2007. "Evolutionary Efficiency and Happiness." Journal of Political Economy, 115(2): 302–337.

Robson, Arthur J. 2001. "The Biological Basis of Economic Behavior." Journal of Economic Literature, 39(1): 11–33.

Robson, Arthur J., and Larry Samuelson. 2010. "The Evolutionary Foundations of Preferences." In Handbook of Social Economics.

Sinn, Hans-Werner. 2003. "Weber's Law and the Biological Evolution of Risk Preferences: The Selective Dominance of the Logarithmic Utility Function, 2002 Geneva Risk Lecture." The Geneva Papers on Risk and Insurance Theory, 28(2): 87–100.