Dynamic (In)Consistency and the Value of Information
Alexander M. Jakobsen∗
Job Market Paper
November 19, 2016
For the latest version, please visit:
http://www.princeton.edu/~jakobsen/JakobsenJMP.pdf
Abstract
This paper develops a revealed-preference model of information disclosure. One decision maker, DM1, ranks information sources (Blackwell experiments) knowing that a
second decision maker, DM2, uses the information to select an act from a menu. Both
decision makers are subjective expected utility maximizers but may differ in their preferences and/or beliefs. I assume the analyst observes, for each menu of acts, (i) a
preference ordering over all Blackwell experiments (DM1’s preference for information),
and (ii) for each signal, DM2’s choice from the menu. The main result is a representation theorem characterizing DM1’s value of information: the ex-ante expected utility
associated with an experiment. The primitives of the model are sufficient to uniquely
identify the tastes and beliefs of both DM1 and DM2, to establish that DM2 uses
Bayes’ rule to update his beliefs, and to show that DM1 correctly anticipates the behavior of DM2. I also present simple conditions to test whether DM1 and DM2 share a
common prior, common preferences, or both. Hence, examination of a decision maker’s
informational choice provides a useful new method of revealed-preference analysis.
∗ Department of Economics, Fisher Hall, Princeton, NJ 08544-1021; [email protected]. I am extremely grateful to Faruk Gul and Wolfgang Pesendorfer for their continued patience, guidance, and support.
Thanks also to Cristian Alonso, Roland Benabou, Stephen Morris, and Ben Young for helpful conversations
and suggestions. Financial support from SSHRC is gratefully acknowledged.
1 Introduction
A growing body of literature investigates how strategic information provision can be used
to manage the behavior of economic agents. This idea has developed in settings ranging
from pure Bayesian Persuasion (Kamenica and Gentzkow (2011)) to behavioral theories
of cognition and self-regulation (Benabou and Tirole (2002)), among others. A common
theme is that a sender may prefer to withhold information from a receiver (either his future
self or some other agent) if the incentives of the receiver are not aligned with those of
the sender. In this paper, I show how the sender’s preference for commitment—revealed
through informational choice—and the receiver’s signal-contingent choices can be used to
test a large class of information disclosure models and identify the relevant parameters
(priors and preferences) of both agents.
The model involves two decision makers, DM1 and DM2. DM1 ranks information sources
knowing that DM2 uses the information to select an action affecting them both. The two
decision makers are subjective expected utility maximizers but may differ in their preferences
or beliefs. By controlling the flow of information, DM1 attempts to steer DM2’s decision
toward actions that are of higher value to himself. The main result of this paper is a
representation theorem characterizing DM1’s value of information: the ex-ante expected
utility associated with an information structure, given its effect on DM2’s choices.
In the representation, DM2 selects among acts in menus A. An (Anscombe and Aumann
(1963)) act is a profile f = (fω )ω∈Ω assigning lotteries fω ∈ ∆X to states of the world ω ∈ Ω,
where X is a finite set of outcomes. Information structures take the form of Blackwell
experiments (Blackwell (1951, 1953)) assigning probability distributions over finitely many
signals to states of the world. In this setup, DM2’s choice from a menu A only depends on
the observed signal s. So, for each state ω, a Blackwell experiment σ induces a probability distribution over acts f^s(A) ∈ A, where f^s(A) is the act chosen by DM2 when signal s ∈ σ realizes. This, in turn, induces a lottery Σ_{s∈σ} sω f^s_ω(A) ∈ ∆X for each state ω, where sω is the probability of signal s in state ω.¹ Therefore, if DM1 is an expected-utility maximizer with prior ν and utility index v, his expected payoff from an experiment σ is:

    V^A(σ) = Σ_ω νω Σ_{s∈σ} sω v(f^s_ω(A)),    (1)

where νω denotes the probability assigned to state ω by ν, and v(p) := Σ_x v(x)p(x) for lotteries p ∈ ∆X. The function V^A is analogous to an indirect utility function for a sender in Bayesian Persuasion models (Kamenica and Gentzkow (2011)).

¹ The statement 's ∈ σ' means s is a signal that can be generated by σ. For ease of exposition, this formulation ignores the possibility of DM2 being indifferent between two or more acts in A; the main representation handles this more rigorously.
For each menu A, a function of the form (1) represents some ranking %A over the set
of all Blackwell experiments. The main result of this paper is a representation theorem
identifying necessary and sufficient conditions (axioms) on arbitrary orderings %A (one for
each menu A) as well as signal-contingent choices for DM2 such that each %A is represented
by a formula of the form (1), where the prior ν and utility index v do not depend on A. I
call such a family of representations a Value of Information Representation for DM1.
In addition to five axioms that are direct adaptations of the Anscombe-Aumann model to
this setting, two new axioms characterize DM1: Foresight and Consistency. Foresight tests
whether DM1 correctly anticipates the behavior of DM2, while Consistency ensures that his
beliefs and preferences over outcomes are not menu-dependent. Hence, DM1’s preference
for information—combined with the choice behavior of DM2—is sufficient to pin down his
beliefs and preferences over outcomes.
Signal-contingent choices for DM2 take the form of choice correspondences cs where
cs (A) ⊆ A is the (nonempty) set of acts chosen by DM2 from menu A upon observing signal
s. A signal s is identified with a profile (sω )ω∈Ω of likelihoods of occurrence, so that only s
(rather than the whole experiment σ) is needed for DM2 to perform Bayesian updating. The
representation for DM2’s behavior generalizes existing Bayesian representation theorems to
allow non-partitional information, and employs a modified version of the Anscombe-Aumann
axioms. The key axiom, Bayesian Independence, generalizes the standard independence
axiom to incorporate the effect of Bayesian updating on choice. Specifically, it establishes
an equivalence between skewing beliefs and skewing utilities that must be satisfied by any
Bayesian decision maker.
As suggested above, an interesting feature of this environment is that DM1 and DM2
need not be distinct individuals, but may represent a single individual at two different points
in time. If preferences or prior beliefs are not fixed across periods, the individual exhibits
a form of dynamic inconsistency and may have a preference for commitment as in Strotz
(1955). In contrast to well-known models of temptation and commitment (for example,
Gul and Pesendorfer (2001)), agents in this model do not directly reveal a preference for
hard commitment because DM1’s (hypothetical) ranking of menus is not observable—only
his rankings %A of Blackwell experiments, indexed by menus A, are available. However,
informational choice offers a form of commitment and, as the main result shows, reveals
enough about the individual to pin down the priors and preferences for both periods, provided
the choices of DM2 are observable as well. Following this logic, I derive results relating DM1’s
preference for information to the tastes and beliefs of DM1 and DM2. Violations of the
Blackwell ordering on information structures reveal a preference for commitment, indicating
a difference in beliefs, a difference in tastes, or both. I show that DM1’s preference for
information can distinguish between these possibilities; that is, his preference reveals whether
DM1 and DM2 have a common prior, common tastes, both, or neither.
Thus, in addition to characterizing the testable implications of a wide class of information
disclosure models, my results show that examining an individual’s preference for information
offers a new method of revealed-preference analysis. The classic approach to
identification of utilities and beliefs involves direct comparison of acts themselves. These are
interpreted as risky prospects, and the decision maker’s subjective beliefs and utilities are
elicited by determining which gambles are more attractive to him. In contrast, my results
show that beliefs and utilities may be identified by determining which types of information
the individual finds more valuable, and that such preferences can be used to test whether
the individual is a subjective expected utility maximizer or not. Combined with ex-post
choice data, one can also determine if the individual is dynamically consistent, a Bayesian
information processor, and/or sophisticated in the sense of correctly anticipating future
behavior.
Although a preference for information may seem rather abstract, it is not difficult to
see how individuals reveal their informational preferences in different environments. People
choose which television stations to watch, web sites to browse, and newspapers to read. Many
online newspapers allow subscribers to customize their news feeds by selecting categories
(sports, finance, politics, etc.) about which they will be informed of new developments. By
customizing such a news feed, the individual reveals what type of information he considers
to be the most valuable, interesting, or useful. Such data could be used to test whether the
individual is consistent with Bayesian expected utility maximization.
In other environments, one is able to observe not only an individual’s choice of information, but also choices made upon receiving information. For example, online retailers allow
customers to sign up to receive information about new products, services, or sales as they
become available. Again, a customer’s tailoring of such a news feed potentially reveals a
lot about his value of information. But the retailer also observes which signal (news item)
is sent to the customer as well as the customer’s subsequent purchase decisions from the
retailer, because these decisions are tied to the same account as the news feed. Hence, the
retailer observes both ex-ante informational choice and ex-post (signal-contingent) choices.
This data could be used not only to test the Bayesianism and dynamic consistency of an
individual, but also to identify his underlying preferences (tastes and beliefs) across periods.
The paper is organized as follows. In the next subsection, I discuss related literature.
In section 2 I define notation and other concepts used throughout, including Blackwell experiments and their mixtures. In section 3 I define the Value of Information and Bayesian
representations for DM1 and DM2, respectively, before presenting the axioms and representation theorems in sections 3.1 and 3.2. Section 4 outlines the proof of the representation
theorem. In section 5 I present simple conditions to test whether DM1 and DM2 share a
common prior, common preferences over outcomes, or both. Finally, section 6 concludes.
1.1 Related Literature
As indicated above, the model studied here is closely related to the Bayesian Persuasion
framework of Kamenica and Gentzkow (2011), henceforth KG. In their model, a sender
chooses a Blackwell experiment and a receiver takes an action from some set after observing
a signal generated by the experiment. Building on techniques of Aumann and Maschler
(1995), KG examine when it is possible for the sender to improve his own expected payoff
through persuasion (committing to a Blackwell experiment). In my framework, the sender
(DM1) benefits from persuasion when there is some σ that is strictly preferred over the
uninformative experiment that generates the same signal in each state of the world with
certainty.
Instead of studying when the sender might benefit from persuasion, my model investigates
how the informational choice of the sender and the signal-contingent choices of the receiver
can be used to test whether the agents are Bayesian subjective expected utility maximizers
and, if so, what their underlying beliefs and tastes are. Importantly, the KG setting involves
a fixed set of actions, a common prior for the sender and receiver, and state-dependent
utilities. In my model, actions are modeled as Anscombe-Aumann acts; this captures much
of the intuition that the value of an action depends on the realized state of the world
but, as is well-known, falls short of capturing truly state-dependent preferences. I also
examine preferences for a wide range of possible action sets (menus) for the receiver; this
is needed to pin down the underlying parameters of the agents. Finally, my framework
permits the sender and receiver to have different priors. Alonso and Camara (2014) extend
the KG framework to allow this possibility, and find that (generically) the sender benefits
from persuasion under heterogeneous priors. This is consistent with my analysis since, as
Proposition 4 shows, heterogeneous priors necessitate violations of Blackwell monotonicity
in simple decision environments.
More generally, this paper is related to the rapidly growing literature on information
disclosure under commitment power.² Some of the more closely related contributions include Rayo and Segal (2010), who study optimal disclosure rules for a class of models with two-dimensional information (value to the sender and value to the receiver); Kolotilin, Li,
Mylovanov, and Zapechelnyuk (2015), who study persuasion when the receiver has private information; Lipnowski and Mathevet (2016), who consider how a benevolent principal should
disclose information to agents that are susceptible to temptation, reference dependence, or
other behavioral considerations; and Mathevet, Perego, and Taneva (2016), who develop a
generalized epistemic model of persuasion.

² In contrast, the cheap talk literature assumes the sender has no commitment power; see, for example, Crawford and Sobel (1982).
Recently, Taneva (2015) has extended the Bayesian persuasion framework to consider
environments with multiple agents (receivers). The principal seeks to commit to a disclosure
rule such that the resulting Bayes Nash equilibrium maximizes her expected utility. Other
authors have examined how equilibrium play varies with the information structure in games,
leading to various generalizations of the Blackwell ordering and other methods of comparing
information structures in games; see, for example, Gossner (2000), Bassan, Gossner, Scarsini,
and Zamir (2003), Peski (2008), Lehrer, Rosenberg, and Shmaya (2010), and Bergemann and
Morris (2016).
Dynamic inconsistency and related conceptual issues have been studied in a variety of
settings by several authors. For example, Epstein and Le Breton (1993) show that if an individual is dynamically consistent, then his beliefs can be represented by a probability distribution. Karni and Schmeidler (1991) show that preferences over conditional lotteries satisfying
standard assumptions are dynamically consistent if and only if they are expected utility preferences; similarly, Border and Segal (1994) show that upon observing low-probability events,
conditional preferences are well-approximated by expected utility preferences provided the
individual is dynamically consistent and has differentiable ex-ante preferences. Grant, Kajii,
and Polak (2000) examine when a dynamically consistent individual with non-expected utility preferences prefers more information to less. These papers take underlying preferences
(in some cases, corresponding utility representations) as given and examine implications of
dynamic consistency or inconsistency. In contrast, the main primitive of my model is an individual’s ranking of information structures themselves, from which preferences and beliefs
can be identified and dynamic consistency tested.
Behavioral economists have developed models where information suppression or self-signaling can be used to regulate behavior. Carrillo and Mariotti (2000) show that, in
a model of personal equilibrium, time-inconsistent agents may benefit from acquiring less
information. Benabou and Tirole (2002, 2006) consider models where, for instance, players
rationally limit the information available to future selves. In other settings, agents experience
utility (or disutility) depending on how information is revealed over time; see, for example,
Kreps and Porteus (1978), Grant, Kajii, and Polak (1998), Dillenberger (2010), Ely, Frankel,
and Kamenica (2015), and Gul, Natenzon, and Pesendorfer (2016).
The analysis of DM2 builds on existing Bayesian representation theorems to allow non-partitional information. Ghirardato (2002) develops a representation using conditional preferences over acts; that is, families of preferences indexed by events, with the interpretation
that the event represents an observed signal. Karni (2007) uses a similar family of conditional
preferences defined over conditional acts; the extra structure of conditional acts permits both
prior beliefs and state-dependent utilities to be identified, in addition to testing Bayesian updating for the case of partitional information. Wang (2003) axiomatizes Bayes’ rule and some
of its extensions in a setting with conditional preferences over (infinite-horizon) consumption-information profiles; preferences are conditioned on sequences of previously realized events.
Lu (2016) shows how random choice data can be used to derive a representation where a
decision maker’s behavior is explained by the combination of a utility index and a distribution of probability measures (beliefs) over a state space. Since Blackwell experiments induce
distributions over posteriors, these beliefs can be interpreted as posteriors in order to determine both a prior (the average of the posteriors) and an experiment inducing them. Hence,
random choice data may be used to infer an individual’s information. Decision-theoretic
models of rational inattention (Denti, Mihm, de Oliveira, and Ozbek (2016), Ellis (2013),
Caplin and Dean (2015)) also use standard choice primitives to make inferences about an
individual’s preferences, beliefs, and information processing ability. My model takes the opposite approach and uses an individual’s informational choice to make inferences about his
underlying tastes and beliefs.
2 Framework and Notation

2.1 Outcomes, lotteries, acts
Let X denote a finite set of N ≥ 2 outcomes. Elements of X are typically denoted x, y, while elements of ∆X (lotteries) are denoted p, q.³ A lottery p assigns probability p(x) to outcome x.

There is a finite, exogenous state space Ω = {1, . . . , W }, where W ≥ 2 denotes the number of states. Arbitrary states are typically denoted ω, ω′, while members of ∆Ω (probability distributions over Ω) are denoted µ or ν.

As a notational convention, subscripts denote states. So, a distribution µ ∈ ∆Ω may be expressed as µ = (µω)ω∈Ω, where µω is the probability assigned to state ω.

A function f : Ω → ∆X is an (Anscombe-Aumann) act. Let A denote the set of all acts. Acts are typically denoted f, g, h, and may be written as profiles: f = (fω)ω∈Ω, where fω ∈ ∆X. The set A is equipped with the standard mixing operation: if f, g ∈ A and α ∈ [0, 1], then αf + (1 − α)g := (αfω + (1 − α)gω)ω∈Ω.

A menu is a finite, nonempty set of acts. Menus are typically denoted A, B.

³ For any finite set S, ∆S denotes the standard probability simplex over S, equipped with the usual convex mixture operation.
2.2 Blackwell Experiments
Definition 1 (Blackwell Experiment). A matrix σ with entries in [0, 1] is a (finite) Blackwell
experiment if it has exactly W rows, no columns consisting only of zeros and, for each row,
the sum of entries is exactly one. Let E denote the set of all Blackwell experiments.
Implicitly, each column of σ represents a signal that might be generated. This way, each
row represents a state-contingent probability distribution over a finite set of signals. The
assumption that each column contains at least one nonzero entry eliminates signals that
have zero probability of occurrence in each state. Note that entries in any given column are
not required to sum to one.
It will be convenient to express experiments in terms of their columns. Let
    S := {s = (sω)ω∈Ω ∈ [0, 1]^Ω : ∃ω such that sω ≠ 0}    (2)
Elements of S are called signals. Clearly, every column of an experiment σ corresponds to a
signal s where sω is the entry for the column in row ω.
The statement ‘s ∈ σ’ means σ has a column given by s. Note, however, that an experiment σ may have duplicate columns. When quantifying over signals in an experiment,
different columns of σ are distinguished even if they are identical as signals. For example, the requirement that each row in σ has entries summing to one may be expressed as '∀ω, Σ_{s∈σ} sω = 1' because the summation notation implicitly distinguishes between different columns of σ that happen to yield the same s. Similarly, statements like '∀s ∈ σ, y^s ∈ Y' associate (potentially) different members of Y to different columns of σ, even if those columns
are identical as signals.
For each σ and α ∈ (0, 1), let ασ denote the matrix formed by multiplying each entry of σ by α. If σ, σ′ ∈ E and α ∈ (0, 1), then ασ ∪ (1 − α)σ′ denotes the matrix consisting of the columns of ασ together with the columns of (1 − α)σ′. It is easy to verify that this mixture yields a well-defined experiment.⁴ If α ∈ {0, 1}, then ασ ∪ (1 − α)σ′ refers either to σ (when α = 1) or to σ′ (when α = 0).

⁴ Note that this operation is not commutative. Specifically, ασ ∪ (1 − α)σ′ means that the matrix (1 − α)σ′ is appended to the right of matrix ασ. So, typically, ασ ∪ (1 − α)σ′ ≠ (1 − α)σ′ ∪ ασ.
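To fix ideas, here is a minimal computational sketch (illustrative only; the function names are hypothetical and not part of the formal development) encoding an experiment as a W × k matrix, as in Definition 1, and implementing the mixture ασ ∪ (1 − α)σ′ as column concatenation:

```python
import numpy as np

def is_experiment(sigma, tol=1e-9):
    """Definition 1: entries in [0, 1], every row sums to one, no all-zero column."""
    sigma = np.asarray(sigma, dtype=float)
    rows_ok = np.allclose(sigma.sum(axis=1), 1.0, atol=tol)
    cols_ok = bool((sigma.max(axis=0) > 0).all())
    range_ok = bool(((sigma >= -tol) & (sigma <= 1 + tol)).all())
    return rows_ok and cols_ok and range_ok

def mix(sigma, sigma_prime, alpha):
    """The mixture alpha*sigma U (1 - alpha)*sigma_prime: scale each experiment,
    then append the columns of the second to the right of the first
    (the operation is deliberately non-commutative)."""
    if alpha == 1.0:
        return np.asarray(sigma, dtype=float)
    if alpha == 0.0:
        return np.asarray(sigma_prime, dtype=float)
    return np.hstack([alpha * np.asarray(sigma), (1 - alpha) * np.asarray(sigma_prime)])

perfect = np.eye(2)              # each state sends its own signal for sure
uninformative = np.ones((2, 1))  # one signal, sent with probability 1 in every state
assert is_experiment(mix(perfect, uninformative, 0.5))   # rows still sum to one
```

Rows index states and columns index signals, so the probabilistic constraint is on row sums rather than column sums, exactly as the definition requires.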
2.3 Primitives
I assume the analyst obtains two sets of observations:
(1) For each menu A, a preference %A over E; and
(2) For each signal s ∈ S, a choice correspondence cs where, for each menu A, cs (A) is a
nonempty subset of A.
These should be interpreted as ex-ante preference for information and ex-post (signal contingent) choices, respectively. Specifically, an initial decision maker (DM1) ranks Blackwell
experiments knowing that a second decision maker (DM2) will observe a signal generated by
the experiment before choosing an act from A. Only a column s ∈ σ is needed for DM2 to
perform Bayesian updating; this is why the choice correspondences are indexed by signals s
instead of pairs (σ, s).
3 The Representation
The objective is to represent DM1’s preference for information in terms of subjective expected
utility and DM2’s signal-contingent behavior as subjective expected utility with Bayesian
updating.
First, consider DM2. In the representation, DM2 has a full-support prior µ ∈ ∆Ω and a utility index u : X → R. If p ∈ ∆X, let u(p) := Σ_x u(x)p(x).
Definition 2 (Bayesian Representation). A pair (µ, u) is a Bayesian Representation for
DM2 if µ has full support, u : X → R is a non-constant utility index and, for all s ∈ S and
all menus A,
    c^s(A) = { f ∈ A : ∀g ∈ A, Σ_ω u(fω) µ^s_ω ≥ Σ_ω u(gω) µ^s_ω }    (3)

where the posteriors µ^s satisfy Bayes' rule:

    µ^s_ω = µω sω / (Σ_{ω′} µω′ sω′)    (4)
This definition says each choice correspondence cs is rationalized by an expected utility
model with prior µ, utility index u, and Bayesian updating: upon observing signal s, µ is
updated to the Bayesian posterior µs ∈ ∆Ω given by (4).
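To illustrate (3) and (4), the following sketch (hypothetical helper names; acts encoded as W × N arrays whose rows are lotteries) computes DM2's posterior and signal-contingent choice set:

```python
import numpy as np

def posterior(mu, s):
    """Bayes' rule (4): weight the prior by the signal's likelihood profile, renormalize."""
    virtual = np.asarray(mu, dtype=float) * np.asarray(s, dtype=float)
    return virtual / virtual.sum()

def chosen(menu, mu, u, s):
    """Equation (3): acts in the menu that maximize posterior expected utility.
    An act f is a W x N array of lotteries, so f @ u is the vector (u(f_w))_w."""
    mu_s = posterior(mu, s)
    scores = [float(mu_s @ (np.asarray(f) @ np.asarray(u))) for f in menu]
    best = max(scores)
    return [f for f, score in zip(menu, scores) if np.isclose(score, best)]

# A fully revealing signal concentrates the posterior on one state:
assert np.allclose(posterior([0.5, 0.5], [1.0, 0.0]), [1.0, 0.0])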
The key to understanding DM1’s preference for information is to examine how an experiment, a menu, and DM2’s choice behavior combine to yield an Anscombe-Aumann act.
The next definition formalizes this process.
Definition 3 (Induced Acts). The set of induced acts for experiment σ at menu A is given by

    L^A(σ) := { ( Σ_{s∈σ} sω f^s_ω )_{ω∈Ω} : f^s ∈ ∆c^s(A) for all s }    (5)

For each menu A, let E∗(A) denote the set of experiments such that L^A(σ) is a singleton. If σ ∈ E∗(A), then L^A_ω(σ) := fω, where L^A(σ) = {f}.
It is easy to show that L^A(σ) is convex. When the choice correspondences c^s have a Bayesian representation, each operator L^A also satisfies a natural linearity: L^A(ασ ∪ (1 − α)σ′) = αL^A(σ) + (1 − α)L^A(σ′);⁵ for a proof, please see the appendix. Note that L^A(σ) is
definable for any collection of choice correspondences {cs : s ∈ S}, and requires only finitely
many observations to compute. In other words, the choice correspondences do not have to
satisfy any axioms—and the analyst does not have to perform any identification—in order
to determine L A (σ).
Intuitively, f^s ∈ ∆c^s(A) represents a randomization over the acts that DM2 might choose from A upon observing signal s. Since signal s occurs with probability sω in state ω, and since Σ_{s∈σ} sω = 1, this yields a lottery Σ_{s∈σ} sω f^s_ω ∈ ∆X for state ω. Repeating this procedure
for each state ω yields an Anscombe-Aumann act, and letting f^s vary across all members of ∆c^s(A) generates a (convex) set of acts L^A(σ). This set encapsulates the full range of possibilities for DM2's behavior, given that second-period choices must be consistent with the correspondences c^s.
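In the singleton case—no ties for DM2 at any signal—the induced act can be computed directly. A sketch, reusing the hypothetical chosen helper from above:

```python
import numpy as np

def induced_act(sigma, menu, mu, u):
    """L^A(sigma) in the singleton case: in each state, mix DM2's chosen acts
    using the signal probabilities of that state (ties are assumed away here)."""
    sigma = np.asarray(sigma, dtype=float)
    signals = [sigma[:, j] for j in range(sigma.shape[1])]
    picks = [np.asarray(chosen(menu, mu, u, s)[0], dtype=float) for s in signals]
    # f_w = sum_{s in sigma} s_w * f^s_w: a W x N array, i.e., an Anscombe-Aumann act
    return sum(s[:, None] * f for s, f in zip(signals, picks))
```

Because each row of σ sums to one, each row of the output is again a lottery, so the result is a well-defined Anscombe-Aumann act.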
When L A (σ) contains exactly one act, DM1 evaluates the act using subjective expected
utility. If L A (σ) contains more than one act, however, then standard expected utility
cannot unambiguously assign a value to this set. The representation permits a high degree
of freedom regarding the evaluation of such sets; only basic requirements of linearity and
consistency, defined next, are imposed.
Definition 4. A function V A : E → R is:
1. A representation of %A if V^A(σ) ≥ V^A(σ′) ⇔ σ %A σ′

2. Linear if V^A(ασ ∪ (1 − α)σ′) = αV^A(σ) + (1 − α)V^A(σ′)

3. Consistent if V^A(σ) = V^A(σ′) whenever L^A(σ) = L^A(σ′).
In other words, a consistent function V A assigns the same value to experiments that
induce the same set of acts. So, it is as if V A assigns values to convex sets of acts.
⁵ For convex sets X, Y ⊆ A and α ∈ [0, 1], let αX + (1 − α)Y := {αf + (1 − α)g : f ∈ X, g ∈ Y }.
If v : X → R is a utility index and p ∈ ∆X, let v(p) := Σ_x v(x)p(x). The next definition formalizes the desired representation for DM1's preferences.
Definition 5 (Value of Information Representation). A family {V A : E → R} of consistent,
linear representations (one for each A) and a pair (ν, v) constitute a Value of Information
Representation for DM1 if ν ∈ ∆Ω has full support, v : X → R is a non-constant utility
index and, for each menu A and all σ ∈ E ∗ (A),
    V^A(σ) = Σ_ω νω v(L^A_ω(σ))    (6)
As suggested above, a Value of Information Representation is fairly silent regarding DM1’s
attitude toward ties for DM2 (that is, his evaluation of experiments σ inducing non-singleton
sets L A (σ)). He may hold best-case beliefs, worst-case beliefs, or anything in between; he
might also receive bonus utility or disutility from the presence of ties. Other than linearity
and consistency of V A , the model does not impose any particular assumptions about DM1’s
attitude toward ties. Instead, his attitude will be derived from the preferences %A .
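On E∗(A), formula (6) then reduces to a one-line computation; a sketch under the same assumptions (unique induced acts, hypothetical helpers from above):

```python
import numpy as np

def value_of_information(sigma, menu, nu, v, mu, u):
    """Equation (6) on E*(A): DM1 evaluates the unique induced act by expected
    utility, with (nu, v) his own parameters and (mu, u) governing DM2's choices."""
    f = induced_act(sigma, menu, mu, u)       # DM2's anticipated behavior
    return float(np.asarray(nu) @ (f @ np.asarray(v)))
```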
3.1 Characterization of DM1
In this section I focus on DM1. I present seven axioms employing both first-period preferences
%A and second-period choices cs , and show that if DM2 has a Bayesian representation,
then these axioms are necessary and sufficient for DM1 to have a Value of Information
representation with standard uniqueness properties. In section 3.2 I present five axioms on
second-period choices cs and show that they are necessary and sufficient for DM2 to have
a Bayesian representation, also with standard uniqueness properties. Hence, a combined
representation theorem characterizing both decision makers follows immediately.
The first three axioms for DM1 are standard vNM axioms, adapted to operate on Blackwell experiments and their mixtures. The Independence and Continuity axioms hold because L^A(ασ ∪ (1 − α)σ′) = αL^A(σ) + (1 − α)L^A(σ′) whenever DM2 has a Bayesian representation.
Axiom A1 (Rationality). Each %A is complete and transitive.

Axiom A2 (Independence). If σ ≻A σ′ and α ∈ (0, 1), then ασ ∪ (1 − α)σ″ ≻A ασ′ ∪ (1 − α)σ″ for all σ″.

Axiom A3 (Continuity). If σ ≻A σ′ ≻A σ″, then there exist α, β ∈ (0, 1) such that ασ ∪ (1 − α)σ″ ≻A σ′ ≻A βσ ∪ (1 − β)σ″.
Although these axioms are familiar, the Mixture Space Theorem (Herstein and Milnor
(1953)) does not apply because the set E with the given mixture operation does not qualify
as a mixture space. The next axiom will help circumvent this problem.
Axiom A4 (Foresight). If L^A(σ) = L^A(σ′), then σ ∼A σ′.
Foresight says that DM1's ranking of experiments only depends on the corresponding sets of induced acts, provided the menu A is fixed. When σ and σ′ both induce a single act, f, this amounts to the assumption that DM1 cares about f but not which particular experiment induces f. This is clearly satisfied if DM1 is an expected utility maximizer. If σ and σ′ each induce a set X of acts that is not a singleton, the axiom says that DM1 only cares about X. That is, he only cares about the range of possible behavior consistent with DM2's choice correspondences, and in this sense DM1 displays foresight: he correctly anticipates DM2's behavior to the highest extent possible.
For p ∈ ∆X , h ∈ A, and ω ∈ Ω, let p[ω]h denote the act formed by taking h and
replacing hω with p. The next axiom is analogous to the State Independence axiom in the
Anscombe-Aumann model, once again adapted to operate on experiments.6 The version
presented here rules out null states, so that DM1’s prior ν will have full support.
Axiom A5 (State Independence). Suppose L^A(σ) = p[ω]h and L^A(σ′) = q[ω]h while L^A(σ̂) = p[ω′]ĥ and L^A(σ̂′) = q[ω′]ĥ. Then σ %A σ′ implies σ̂ %A σ̂′.
Recall that for each menu A, E ∗ (A) ⊆ E denotes the set of experiments σ such that L A (σ)
is single-valued. The following axiom is necessary (but not sufficient) to obtain uniqueness
of the derived parameters (ν, v) for DM1.
Axiom A6 (Non-Degeneracy). There is a menu A and experiments σ, σ′ ∈ E∗(A) such that σ ≻A σ′.
Finally, Axiom A7 ensures the desired uniqueness properties for DM1. It states that
DM1’s ranking of two induced acts does not depend on which particular menus and/or experiments induce them. This is the only axiom asserting any relationships between different
orderings %A and %B .
⁶ The standard axiom says: if ω, ω′ are non-null and p[ω]h is weakly preferred over q[ω]h, then p[ω′]ĥ is weakly preferred over q[ω′]ĥ for all ĥ.
Axiom A7 (Consistency). If L^A(σ) = f = L^B(σ̂) and L^A(σ′) = g = L^B(σ̂′), then σ %A σ′ implies σ̂ %B σ̂′.
The difference between Consistency and Foresight is that Foresight applies to only one
menu at a time but holds for all experiments, while Consistency restricts attention to experiments inducing single acts but permits comparisons across menus. Consequently, DM1’s
ranking over induced Anscombe-Aumann acts does not depend on the menu A, while his
attitude toward tie-breaking issues (non-singleton sets L A (σ)) may be menu-dependent.
Proposition 1. Suppose DM2 has a Bayesian representation. The relations %A satisfy
Axioms A1–A7 if and only if DM1 has a Value of Information Representation. Moreover, ν
is unique and, for each A, V A (hence, v) is unique up to positive affine transformation.
Although most of the axioms for DM1 are adaptations of the Anscombe-Aumann axioms
to this setting, Proposition 1 does not follow directly from the Anscombe-Aumann theorem.
Different menus A induce different sets of acts, and the parameters (µ, u) of DM2 determine
which acts can be generated. Some menus generate richer sets than others, but in all cases
there is only a subset of the full domain A to work with. So, some care is required both to
derive candidates for (ν, v) and to ensure that these parameters are not menu-dependent.
In section 4, I sketch the main steps needed to prove Proposition 1; for a complete proof,
please see the appendix.
3.2 Characterization of DM2
While the axiomatic characterization of DM1 requires both the ex-ante preferences %A and
ex-post choice correspondences cs of DM2, the characterization of DM2’s Bayesian Representation only requires the correspondences cs .
Axiom B1 (Rationality). Each choice function cs satisfies WARP.
Since the model restricts attention to finite menus, Axiom B1 implies that for each s
there is a complete and transitive relation %s rationalizing cs . Specifically, f %s g if and
only if there exists a menu A such that f, g ∈ A and f ∈ cs (A).
Axiom B2 (Non-Degeneracy). For each s, there are acts f, g such that f ≻s g.
This is a standard axiom in the Anscombe-Aumann model. It says that each relation %s
does not simply assign indifference among all acts, and is necessary to disentangle preferences
and beliefs.
Endow A with the standard Euclidean topology, and endow S with the topology of real projective space.⁷ Let A × S employ the corresponding product topology.

⁷ This is the quotient topology of the standard Euclidean topology on [0, 1]^Ω \ {0} with respect to the equivalence relation s ∼ λs for all λ > 0.
Axiom B3 (Continuity). For each f , the sets {(g, s) ∈ A × S : f %s g} and {(g, s) ∈
A × S : g %s f } are closed.
This axiom expresses two forms of continuity. First, holding s constant, it says that %s satisfies the usual continuity: contour sets of %s are closed. Second, holding f and g constant, it says that if f ≻s g, then it is possible to perturb s while maintaining strict preference for f over g. This holds because the Bayesian posterior µ^s varies continuously with s in the given topology.
If E ⊆ Ω and f, h ∈ A, let f Eh denote the act g such that gω = fω for ω ∈ E and
gω = hω otherwise. Similarly, if s, t ∈ S, let sEt denote the profile r such that rω = sω for
ω ∈ E and rω = tω otherwise. Note that r may not be a well-defined signal (at least one
entry in r must be nonzero).
Axiom B4 (State Independence). Suppose f = p[ω]h and g = q[ω]h while f′ = p[ω′]h′ and g′ = q[ω′]h′. If sω, s′ω′ > 0 and f %s g, then f′ %s′ g′.
This is a slight modification of the standard State Independence axiom used in the Anscombe-Aumann model. It rules out null states for DM2's prior (so the prior µ will have full support) and ensures the existence of a common ranking over lotteries independently of the state and independently of the preference %s under consideration.
Axiom B5 (Bayesian Independence). If f ≻s g, α ∈ (0, 1), and t = sE(αs), then (αf + (1 − α)h)Ef ≻t (αg + (1 − α)h)Eg for all h.
To understand this axiom, note that if E = Ω it reduces to the standard Independence axiom for %s; that is, f ≻s g and α ∈ (0, 1) implies αf + (1 − α)h ≻s αg + (1 − α)h for all h. If E ≠ Ω, observe that only the "virtual" posterior probabilities sωµω matter when comparing f and g at signal s (that is, the denominator in Bayes' rule can be ignored). Relative to s, signal t scales the virtual probabilities for states ω′ ∉ E down by a factor of α. This has the same effect as retaining the virtual probabilities from s but scaling all utility values down by α for states ω′ ∉ E. Clearly, this will not reverse a preference for f over g if the utilities u(fω) and u(gω) are scaled down by α as well. Such a scaling is exactly equivalent to requiring (αf + (1 − α)h)Ef ≻t (αg + (1 − α)h)Eg, since (1 − α)h can be "canceled" from both sides to yield a comparison between "(αf)Ef" and "(αg)Eg". Combined with the scaled probabilities outside of E, this is equivalent to comparing αf and αg. In other words, Bayesian Independence expresses an equivalence between scaling virtual probabilities and scaling utilities themselves.
Proposition 2. The choice functions cs satisfy axioms B1–B5 if and only if they have a
Bayesian representation (µ, u). Furthermore, µ is unique and u is unique up to positive
affine transformation.
The following theorem is an immediate consequence of Propositions 1 and 2.
Theorem 1. The relations %A and the choice correspondences cs satisfy Axioms A1–A7
and B1–B5 if and only if DM1 has a Value of Information representation and DM2 has a
Bayesian representation.
Proof. First apply Proposition 2 to establish a Bayesian representation (µ, u) for DM2 with
the desired uniqueness properties. Then apply Proposition 1 to establish a Value of Information representation for DM1, also with the desired uniqueness properties.
The proof of Proposition 2 is fairly straightforward. First, observe that each %s satisfies
the Anscombe-Aumann axioms. In particular, Bayesian Independence implies the standard
Independence axiom, and the standard Continuity axiom is implied by axiom B3. Hence,
%s has an expected utility representation with parameters (µ^s, u^s), where µ^s is a prior and u^s a utility index. Axiom B4 implies that %s and %s′ have the same ranking over constant acts (lotteries), so it is without loss of generality to assume that u^s = u for all s.
So, the only task is to ensure that the priors µ^s are the correct Bayesian posteriors given signal s for some prior µ. The natural candidate for µ is µ^e, where eω = 1 for all ω, because the signal e provides no new information. Essentially, the Bayesian Independence axiom ensures that the probability ratios satisfy µ^s_ω / µ^s_ω′ = sω µ^e_ω / (sω′ µ^e_ω′), as prescribed by Bayes' rule. For full details, please see the appendix.
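For intuition, note that because posteriors sum to one, the ratio condition alone already pins down the entire posterior; in LaTeX notation (a one-line verification, not from the original text):

\[
  \frac{\mu^s_\omega}{\mu^s_{\omega'}}
    = \frac{s_\omega \, \mu^e_\omega}{s_{\omega'} \, \mu^e_{\omega'}}
  \quad\text{for all } \omega, \omega'
  \;\Longrightarrow\;
  \mu^s_\omega = \frac{s_\omega \, \mu^e_\omega}{\sum_{\omega'} s_{\omega'} \, \mu^e_{\omega'}},
\]

which is exactly Bayes' rule (4) with prior µ = µ^e.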
4 Outline of the Proof
In this section I highlight the main steps needed to prove Proposition 1. The proof is divided
into three steps: establishing a linear representation V^A : E → R for each A, constructing a menu A∗ rich enough to derive candidates for ν and v, and then showing that these candidates represent %A on E∗(A) for all menus A. For full details, please see the appendix.
Step 1: A Linear V^A for each %A
Since E with the given mixture operation does not qualify as a mixture space,8 a small trick
is needed to establish a linear representation. Fix a menu A and let MA := {L A (σ) :
σ ∈ E}. Since each set L A (σ) is a convex set of acts, it is straightforward to show that
the pair (MA , +) constitutes a mixture space, where the mixture operation is given by
αX + (1 − α)Y := {αf + (1 − α)g : f ∈ X, g ∈ Y }.
By Axiom A4 (Foresight), the following binary relation on MA is well-defined:

    X D^A Y ⇔ ∃σ^X, σ^Y such that L^A(σ^X) = X, L^A(σ^Y) = Y, and σ^X %A σ^Y    (7)

Axioms A1–A3 ensure that D^A is a complete and transitive relation satisfying the standard Independence and Continuity axioms. Thus, by the Mixture Space Theorem, D^A has a linear representation W^A : MA → R. This translates into a linear representation V^A : E → R for %A by letting V^A(σ) := W^A(L^A(σ)).
Step 2: Construction of Candidates (ν, v)
This section illustrates the construction for the case of two states; the general case essentially replicates this logic for every pair of states. The idea of the proof is to construct a menu A∗ generating a set of acts A(A∗) := {L^{A∗}(σ′) : σ′ ∈ E∗(A∗)} rich enough to invoke the Anscombe-Aumann theorem. In particular, A(A∗) will contain subsets of the form L∗ × p and p × L∗, where L∗ ⊆ ∆X is full-dimensional. This way, the State Independence axiom is active on a sufficiently rich domain to ensure a common preference over lotteries across states.
Figure 1 illustrates the construction. First, an affinely independent⁹ set P = {p^1, . . . , p^N} of lotteries is chosen such that u^N > · · · > u^1 and u^2 − u^1 > u^3 − u^2 > · · · > u^N − u^{N−1}, where u^i := u(p^i). The symmetric menu for P is defined by

    A∗ := {(p^i, p^{N−i+1}) : i = 1, . . . , N}

Figure 1a plots A∗ in utility space, and Figure 1b shows how DM2's choices from A∗ cut the signal space into convex cones. The colored lines in 1a run perpendicular to their counterparts in 1b and indicate indifference between acts. Figure 1b also plots a symmetric experiment σ = {s^1, s^2, s^3, s^4}; in particular, s^1 and s^4 are reflections across the diagonal line, as are s^2 and s^3. By the choice of A∗, this implies that L^{A∗}(σ) = (p, p) for some p in the convex hull of P.

Figure 1b also shows a (square) neighborhood around each signal. Varying each signal within its bounding square (subject to the constraint that the resulting set of signals constitutes an experiment) generates a rich set of acts. Since P is affinely independent, perturbing the state 1 entries of the signals while holding the state 2 entries fixed generates a set of acts of the form L∗ × p, where L∗ ⊆ ∆X is full-dimensional. These perturbations can be mirrored in the state 2 coordinate (leaving the original state 1 coordinates fixed), thereby generating the set p × L∗ by symmetry of A∗.

[Figure 1: Symmetric Menu when Ω = {1, 2} and N = 4. Panel (a): utility space; panel (b): signal space. The condition u^2 − u^1 > u^3 − u^2 > u^4 − u^3 ensures that for each f ∈ A∗, there is a signal s such that c^s(A∗) = f.]

⁸ For example, ½σ ∪ ½σ ≠ σ.

⁹ A set {w^0, . . . , w^m} ⊆ R^n is affinely independent if {w^1 − w^0, . . . , w^m − w^0} is linearly independent. For additional background on affine algebra, please see the appendix.
Step 3: Spreading the Representation
The purpose of this step is to show that there is a unique, non-degenerate linear preference
% on A such that each preference %A agrees with % on E∗(A). That is, if L^A(σ) = f and L^A(σ′) = g, then σ %A σ′ if and only if f % g.
The Consistency axiom plays a key role here. Each %A gives a linear function on the convex set A(A) := {L^A(σ) : σ ∈ E∗(A)}. So, by Consistency, if two menus A and B yield sets A(A) and A(B) with nonempty intersection, then they have the same linear ordering on A(A) ∩ A(B). If A(A) ∩ A(B) has full dimension, the linear ordering on this intersection has a unique extension to all of A. The goal is to show that the family of full-dimensional sets A(A) is sufficiently connected—if there are isolated components, the Consistency axiom cannot force the orderings on different components to derive from a common linear ordering.

[Figure 2: The relationship between T(A) and the sets S^A(f). Panel (a): utility space; panel (b): signal space. Each act translates into a vertex of T(A); for example, z^f = (µ1u(f1), µ2u(f2)). DM2 chooses f at signal s if s ∈ S^A(f).]
The easiest way to proceed is to restrict attention to menus A where each f ∈ A is strictly
preferred by DM2 for some signals s. Such menus are called k-menus, where |A| = k. Once
a unique representation is established for this class of menus, it is straightforward to extend
it to all menus.
A k-menu A yields a partition of S into convex cones where, for each f ∈ A, there is a full-dimensional convex cone S^A(f) := {s ∈ S : c^s(A) = f}; this is the support of f in A. These supports are used to construct a polytope T(A) ⊆ R^Ω, where each vertex is of the form z = (µω u(fω))ω∈Ω for some f ∈ A, and the faces of T(A) represent indifference. As Figure 2 shows, there is a natural duality between the set of supports S^A(f) and the polytope T(A).
The first step is to show that it is without loss of generality to restrict attention to
independent menus. These are the k-menus A where, for each ω, the set Aω := {fω : f ∈
A} contains a subset of N affinely independent lotteries. This way, the set A(A) is full-dimensional. The idea is that any k-menu A can be extended to an independent menu by
finding an independent menu B such that A(A) and A(A ∪ B) have sufficient overlap. See
the appendix for details.
Independent menus A and B share a representation if there is a unique linear % on A
that agrees with %A on E ∗ (A) and with %B on E ∗ (B). By the previous argument, it will
suffice to show that all independent menus share a representation. A useful special case is
when B is an oriented translation of A. This means T (B) = T (A) + λ for some λ ∈ RΩ (so
that every f ∈ A has a corresponding ψ(f ) ∈ B), and that for each α ∈ [0, 1], the menu
Aα := {(1 − α)f + αψ(f) : f ∈ A} is independent. As Figure 3 shows, not all translations T(B) = T(A) + λ are oriented.

[Figure 3: Orientedness when T(A) = T(B) and N = 3. Panel (a): oriented; panel (b): not oriented. The lotteries Aω = {fω : f ∈ A} are represented by solid dots, while the lotteries Bω = {gω : g ∈ B} are represented by circles. The configuration in (b) is not oriented because the affine path from Aω to Bω (traversing along indifference curves) yields a collinear set of lotteries at α = 1/2.]

[Figure 4: Adding a face to T(A). Panel (a): utility space; panel (b): signal space. The act g from Figure 2 is replaced by two acts, g^1 and g^2, that can be made arbitrarily close to g. Consequently, the act induced by σ = {s^f, s^g, s^h} in the new menu closely approximates the induced act in the original menu.]
When B is a translation of A, every menu Aα gives the same division of the signal space S into convex cones. Therefore, for every experiment σ ∈ E∗(A), the induced act L^{Aα}(σ) varies continuously with α; hence, the sets A(Aα) are continuous in α as well. When the translation is oriented, this continuity ensures full-dimensional overlap between nearby sets A(Aα) and A(Aα′), so that A and B share a representation. This result is required near the end of the proof.
The next step is to show that if A is an independent menu, then it is without loss of
generality to assume that T (A) has a face with normal e, where e ∈ RΩ satisfies eω = 1 for
all ω. Figure 4 illustrates the procedure. The idea is to intersect T (A) with a half-space with
normal e, splitting one or more vertices of T (A) into multiple vertices. These vertices—and
acts corresponding to them—can be made arbitrarily close to the original ones, forming a
menu A0 that shares a representation with A.
[Figure 5: Constructing oriented sets of lotteries. Panels: (a) utility levels; (b) A∗ω and B∗ω; (c) Aα for α = 0.7. The solid (blue) lines are the utility levels for A∗ω, and the dotted (red) lines are utility levels for B∗. With this construction, every set Aα is affinely independent.]
Now let A and B denote arbitrary independent menus such that T (A) and T (B) each
have a face with normal e. The idea is to construct menus A∗ and B ∗ such that (i) the
vertices of T (A∗ ) (resp. T (B ∗ )) lie just beyond the e-face of T (A) (resp. T (B)), and (ii)
B∗ is an oriented translation of A∗. Then A and A′ := A ∪ A∗ share a representation; in fact, A′ and A∗ share a representation as well, because there exists an experiment σ such that c^s(A′) ∈ A∗ for all s ∈ σ (this holds because the vertices of T(A∗) are near the e-face of T(A)). Similar statements hold for B, B′ := B ∪ B∗, and B∗. Since B∗ is an oriented translation of A∗, it follows that A∗ and B∗ share a representation. Therefore A and B share a representation, as desired.
The menus A∗ and B ∗ are constructed by first determining what their polytopes T (A∗ )
and T (B ∗ ) = T (A∗ ) + λ must be, and then choosing acts inducing these polytopes in a way
that makes the translation oriented. Each polytope consists of N vertices, though some care
is needed when constructing them because many conditions must be satisfied—I omit the
details in this sketch.
Having chosen the polytopes, a particular technique is needed to ensure orientedness.
Figure 5 illustrates the procedure for the case N = 3. In a given state ω, plot the utility levels
(hyperplanes) for the ω-coordinates of the vertices for T (A∗ ) and T (B ∗ ). These hyperplanes
share an edge in the simplex ∆X . Pick two lines L1 , L2 in ∆X near (and parallel to)
this edge, and construct the lotteries A∗ω as follows. Let H^i denote the hyperplanes for the utility levels in T(A∗). If u^1 is the highest utility value, let p^1 be the unique intersection of L1 with H^1. Similarly, let p^2 lie at the intersection of L2 and H^2. Finally, take p^3 to be the intersection of L2 with H^3. These are the state ω lotteries for the three acts in A∗.
Now perform a similar procedure for the ω-coordinates of T (B ∗ ) using the same L1 and L2 ,
yielding the state ω-coordinates for the acts in B ∗ . It is now simple to verify that Aα is
independent for all α.
5 Comparing Individuals
In this section, I show how the data %A and cs can be used to make simple comparisons
between the priors and preferences of different individuals. I study two different scenarios.
The first, presented in section 5.1, examines how DM1’s preferences %A may be used to
test whether DM1 and DM2 have a common prior, common preferences, or both. The
second, presented in section 5.2, considers the possibility that two (potentially) dynamically inconsistent decision makers, DM and ḊM, are characterized by data (%A, cs) and (%̇A, ċs). I develop techniques to test whether DM1 and ḊM1 have common tastes or common beliefs, and perform similar analysis for DM2 and ḊM2.
5.1 Comparing DM1 and DM2
Suppose the data %A and cs satisfy axioms A1–A7 and B1–B5, so that DM1 has a Value of
Information representation and DM2 has a Bayesian representation. The goal is to formulate
tests indicating whether DM1 and DM2 have a common prior or a common utility index
without having to explicitly identify these parameters.
For each ω, let eω denote the signal s such that sω = 1 and sω′ = 0 for all ω′ ≠ ω. Let σ∗ := {eω : ω ∈ Ω} denote perfect information; this is the experiment that always reveals the true state.
Proposition 3. DM1 and DM2 have the same preferences (that is, u is a positive affine
transformation of v) if and only if for all menus A, σ ∗ , σ ∈ E ∗ (A) implies σ ∗ %A σ.
Proposition 3 offers a simple way to determine if preferences over ∆X are a source of
disagreement for DM1 and DM2. If, in any menu A, DM1 ranks some σ ∈ E ∗ (A) higher
than σ ∗ , then DM1 and DM2 must have different preferences over lotteries. Conversely, DM1
and DM2 must have the same ranking of lotteries if, independently of the menu A, DM1’s
most-preferred experiment is σ ∗ . This provides a way of comparing the preferences of DM1
and DM2 without explicitly constructing the utility indices v and u.
To understand why Proposition 3 holds, observe that if a signal eω is generated, then
DM2 chooses an act f ∗ ∈ A such that u(fω∗ ) ≥ u(fω ) for all f ∈ A. It follows that if DM1
and DM2 rank lotteries the same way, then they must agree on the optimal act in menu A at
signal eω . Thus, σ ∗ always induces choices for DM2 that are optimal for DM1, and therefore
DM1 satisfies σ ∗ %A σ for all σ ∈ E ∗ (A). If DM1 and DM2 disagreed about the ranking of
some lotteries, then there would be a menu A and a signal eω where DM1 does not agree
with DM2’s choice at eω , making room for some σ 6= σ ∗ to yield higher ex-ante expected
utility for DM1. For full detail, please see the appendix.
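This logic is easy to probe numerically. A heuristic sketch (reusing the hypothetical value_of_information helper from section 3, imposing v = u, and ignoring the measure-zero event of ties) checks that no randomly drawn experiment beats σ∗:

```python
import numpy as np

rng = np.random.default_rng(0)
W, N = 2, 3
mu, nu = np.array([0.5, 0.5]), np.array([0.3, 0.7])   # the priors may differ freely
u = v = rng.random(N)                                 # common tastes: v = u
menu = [rng.dirichlet(np.ones(N), size=W) for _ in range(4)]  # random acts (rows = lotteries)

perfect = np.eye(W)                                   # sigma*: always reveals the true state
best = value_of_information(perfect, menu, nu, v, mu, u)
for _ in range(200):
    sigma = rng.dirichlet(np.ones(3), size=W)         # a random two-state, three-signal experiment
    assert best >= value_of_information(sigma, menu, nu, v, mu, u) - 1e-9
```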
If σ, σ′ ∈ E, let σ ⊒ σ′ denote the Blackwell ordering. That is, σ ⊒ σ′ means σ is more informative than σ′.¹⁰
Definition 6 (Blackwell Monotonicity). A binary relation % on E is Blackwell monotone on E′ ⊆ E if

(i) For all σ, σ′ ∈ E′, σ ⊒ σ′ implies σ % σ′, or

(ii) For all σ, σ′ ∈ E′, σ ⊒ σ′ implies σ ≾ σ′.

In other words, a preference % is Blackwell monotone if, on E′, it always agrees with the Blackwell ordering or always reverses the Blackwell ordering.
If ω ≠ ω′, let E = [ω, ω′] denote an ordering of ω and ω′. For lotteries p, q and any act h, let (p, q)Eh denote the act g such that gω = p, gω′ = q, and gω″ = hω″ for all ω″ ≠ ω, ω′. An E-menu is a menu of the form A = {(p, q)Eh, (q, p)Eh} such that S^A(f) ≠ ∅ for each f ∈ A, where S^A(f) := {s ∈ S : c^s(A) = f} is the support of f in A.
Proposition 4. DM1 and DM2 have a common prior (ν = µ) if and only if, for each E and each E-menu A, %A is Blackwell monotone on E′ = E∗(A).
Proposition 4 says that if DM1’s ranking of experiments in simple decision environments
(E-menus) is Blackwell monotone, then DM1 and DM2 have a common prior. This does
not mean that DM1’s ranking is consistent with the Blackwell ordering in all E-menus;
depending on which lotteries p, q are used in a given E-menu, and depending on how DM1
and DM2 rank p and q, DM1’s preferences either agree with the Blackwell ordering or reverse
it. So, unless their preferences over lotteries coincide, there will be E-menus A where %A
agrees with the Blackwell ordering, and some where %A reverses the Blackwell ordering. As
long as no %A exhibits a strict non-monotonicity with respect to the Blackwell ordering,
DM1 and DM2 must have the same prior. This provides a way of testing the common prior
assumption without explicitly constructing ν and µ.
To understand why Proposition 4 holds, consider the case |Ω| = 2 and suppose u(p) > u(q). Observe that in an E-menu, DM2's choice at signal s is determined by a cutoff value for s1/s2. Specifically, DM2 strictly prefers (p, q) at s if and only if s1µ1 > s2µ2; equivalently, s1/s2 > µ2/µ1. Similarly, DM2 strictly prefers (q, p) if and only if s1/s2 < µ2/µ1. Now suppose ν = µ. Then, although DM1 does not get to choose the act, his preferred choices at arbitrary signals s either all agree or all disagree with DM2's decision (provided v(p) ≠ v(q)) because DM1's (hypothetical) optimal choices are generated by a similar cutoff rule; the cutoff is ν2/ν1 = µ2/µ1. Hence, DM1's preference %A is Blackwell monotone on E∗(A).

¹⁰ By now there are many different characterizations of this ordering. See de Oliveira (2016), Bielinska-Kwapisz (2003), Crémer (1982), or Leshno and Spector (1992) for accessible treatments.
[Figure 6: Illustration of Proposition 4 when Ω = {1, 2}, in signal space. The solid (blue) line is DM2's cutoff, and the dashed line is DM1's (hypothetical) cutoff. If these cutoffs disagree, then Blackwell monotonicity is violated by considering garblings of experiments with signals near the boundary of region C.]
Now suppose ν ≠ µ. Then DM1's (hypothetical) cutoff rule uses cutoff ν2/ν1 ≠ µ2/µ1. This means S contains three convex cones separated by the lines s1µ1 = s2µ2 and s1ν1 = s2ν2. Let C denote the middle region bounded by the two lines. If DM1 and DM2 agree on the optimal act in region C, then they disagree in the other two regions, and vice versa. This wedge makes it possible to construct experiments violating Blackwell monotonicity. If, for example, DM1 and DM2 disagree on the optimal act in region C, and if σ involves a signal inside C but close to the boundary of C, then there is a garbling σ′ of σ such that σ′ ≻A σ; thus, σ and σ′ reverse the Blackwell ordering. But DM1 and DM2 agree on choices outside of C, and in that region %A agrees with the Blackwell ordering. Hence, %A is not Blackwell monotone.
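The wedge argument can be reproduced numerically. The following self-contained sketch (hypothetical names; a two-state E-menu with common tastes u = v but ν ≠ µ) exhibits a garbling σ′ of σ that DM1 strictly prefers:

```python
import numpy as np

mu, nu = np.array([0.2, 0.8]), np.array([0.5, 0.5])    # different priors
u_f, u_g = np.array([1.0, 0.0]), np.array([0.0, 1.0])  # u(p) = v(p) = 1, u(q) = v(q) = 0:
                                                       # f = (p, q) pays in state 1, g = (q, p) in state 2

def dm1_value(sigma):
    """Formula (1) with v = u: DM2 picks f or g by posterior utility; DM1 weighs by nu."""
    total = 0.0
    for s in sigma.T:                                  # each column is a signal
        choice = u_f if (mu * s) @ u_f > (mu * s) @ u_g else u_g
        total += (nu * s) @ choice
    return total

sigma = np.array([[0.30, 0.70, 0.00],                  # state-1 signal probabilities
                  [0.08, 0.02, 0.90]])                 # state-2 signal probabilities
merge = np.array([[1, 0], [1, 0], [0, 1]])             # stochastic garbling matrix
sigma_prime = sigma @ merge                            # so sigma is Blackwell-more-informative

assert dm1_value(sigma_prime) > dm1_value(sigma)       # 0.95 > 0.84: a strict reversal
```

Here the first signal of σ lies inside region C (DM2 picks g where DM1 would pick f); merging it into the second signal pushes the pooled likelihood ratio past DM2's cutoff, so DM2 switches to f and DM1 strictly gains from the less informative experiment.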
The decision maker is dynamically consistent if ν = µ and u is a positive affine transformation of v. The next proposition is a direct consequence of Propositions 3 and 4.
Proposition 5. The decision maker is dynamically consistent if and only if, for all menus A and all σ, σ′ ∈ E∗(A), σ ⊒ σ′ implies σ %A σ′.

Proof. Suppose σ ⊒ σ′ implies σ %A σ′ for all σ, σ′ ∈ E∗(A). Then DM1 and DM2 have a common preference by Proposition 3 because σ∗ ⊒ σ for all σ. Moreover, %A is Blackwell monotone for each E-menu A, so that ν = µ by Proposition 4. The converse is immediate.
5.2 Identification and Comparisons Across Individuals
Suppose there are two individuals, DM and ḊM, and that each is decomposed into two decision makers: DM1 and DM2, and ḊM1 and ḊM2. DM and ḊM are characterized by data (%A, cs) and (%̇A, ċs) satisfying axioms A1–A7 and B1–B5. In this section, I show how this data may be used to compare the priors and preferences of DM and ḊM.
To perform these comparisons, I show how the priors and preferences of a given individual
can be elicited directly using a particular class of menus. So, the techniques developed here
provide not only a method of comparing individuals, but also of eliciting their parameters
directly from a subset of the primitives.
Definition 7. Suppose A is a menu, E = [ω, ω′], and s ∈ S such that |c^s(A)| = 1. If ŝ, t ∈ S, write ŝ − s ≡ t to indicate:

(i) c^ŝ(A) = c^s(A), and

(ii) ŝ − s := (ŝω″ − sω″)ω″∈Ω = tE0.

The idea is that ŝ is a translation of s along the ω, ω′ coordinates (in direction t) that does not affect the choice of DM2. Note that ‖t‖ must be relatively small in order for the translation to yield a well-defined ŝ such that c^ŝ(A) = c^s(A).
An E-menu A is non-degenerate if %A is non-degenerate (that is, there exist σ, σ′ ∈ E∗(A) such that σ ≻A σ′). The idea of the next definition is to find a signal t that reveals the ratio of νω to νω′; specifically, νω/νω′ = tω′/tω.
Definition 8. A signal t calibrates E = [ω, ω′] (for DM) if there is a non-degenerate E-menu A, α > 0, and signals s, s′, ŝ, ŝ′ such that c^s(A) = f ≠ g = c^s′(A) and

(i) {s, s′} ∼A {ŝ, ŝ′}

(ii) ŝ − s ≡ αt and s′ − ŝ′ ≡ αt.

Note that (i) implicitly requires {s, s′} and {ŝ, ŝ′} to be well-defined experiments. A similar definition holds for ḊM by replacing E∗(A) with Ė∗(A) and c with ċ.
It is not difficult to show that every E has a calibrating signal t, and that if t and t′ both calibrate E, then (t′ω, t′ω′) = λ(tω, tω′) for some λ > 0. Using the representation, it is also easy to see that tω νω = tω′ νω′ if t calibrates E = [ω, ω′]; in other words, t reveals the ratio νω/νω′ = tω′/tω. This suggests the following:

Proposition 6. DM1 and DṀ1 have a common prior ν = ν̇ if and only if, for all E, every t that calibrates E for DM1 also calibrates E for DṀ1.
Proposition 6 follows from the fact that calibrating signals t pin down probability ratios in the prior for DM1, as described above. This means that finding calibrating signals offers a direct way of pinning down the prior for DM1, and that Proposition 6 is effectively a corollary of this method of identification. Still, Proposition 6 offers a simple way to refute the hypothesis ν = ν̇: if, for some E = [ω, ω′], calibrating signals t and ṫ for DM1 and DṀ1, respectively, do not satisfy (ṫω, ṫω′) = λ(tω, tω′) for some λ > 0, then ν ≠ ν̇.
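The refutation test just described reduces to a proportionality check. A small Python sketch (the helper and its inputs are hypothetical):

def consistent_with_common_prior(t, t_dot, tol=1e-9):
    # t and t_dot hold the (t_w, t_w') components of signals calibrating the
    # same E = [w, w'] for DM1 and DM1-dot; nu = nu-dot requires that they be
    # positively proportional: (t_dot_w, t_dot_w') = lambda * (t_w, t_w')
    cross = t[0] * t_dot[1] - t[1] * t_dot[0]
    same_direction = t[0] * t_dot[0] + t[1] * t_dot[1] > 0
    return abs(cross) < tol and same_direction

print(consistent_with_common_prior((0.2, 0.1), (0.4, 0.2)))  # True
print(consistent_with_common_prior((0.2, 0.1), (0.1, 0.2)))  # False: nu != nu-dot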
Comparing the utility indices v and v̇ of DM1 and DṀ1 is slightly more involved because the preferences %A and %̇A depend on the preferences and priors of DM2 and DṀ2. In general, DM2 and DṀ2 have different priors and preferences, so some care is needed to elicit v and v̇.
Definition 9. Let p, p′ ∈ ∆X. Then p′ is revealed preferred to p (for DM1) if there exists E = [ω, ω′], a signal t calibrating E for DM1, and a menu A = {(q, p)Eh, (p′, q)Eh} such that:

(i) {s, s′} %A {ŝ, ŝ′}, where c^s(A) = (q, p)Eh and c^{s′}(A) = (p′, q)Eh, and

(ii) ŝ − s ≡ t, and s′ − ŝ′ ≡ t.
Once again, (i) implicitly requires {s, s′} and {ŝ, ŝ′} to be well-defined experiments. Write p′Rp to indicate that p′ is revealed preferred to p for DM1, and p′Ṙp to indicate that p′ is revealed preferred to p for DṀ1.
This method of eliciting DM1's preferences over ∆X is quite direct, and it is robust to the possibility that DM2 is indifferent between p and p′: in such cases, an appropriate q can be found to elicit DM1's ranking of p and p′ provided p and p′ are interior. Since the ranking of interior lotteries is enough to pin down a linear preference on ∆X, the method can elicit DM1's preferences regardless of the beliefs and preferences of DM2.
Proposition 7. DM1 and DṀ1 have a common utility index (that is, v̇ is a positive affine transformation of v) if and only if, for all interior p, p′ ∈ ∆X, p′Rp ⇔ p′Ṙp.

Since signal-contingent choices are observed for both DM2 and DṀ2, it is fairly straightforward to compare u to u̇ and µ to µ̇. Recall that a constant act f such that fω = p for all ω may be denoted p.
Proposition 8.

(i) DM2 and DṀ2 have a common utility index (u̇ is a positive affine transformation of u) if and only if, for all p, q ∈ ∆X,

p %s q ⇔ p %̇s q

(ii) DM2 and DṀ2 have a common prior (µ = µ̇) if and only if, for all E = [ω, ω′] and all p, q, p′, q′ ∈ ∆X such that p ≻s q and p′ ≻̇s q′,

(p, q)Eh %s (q, p)Eh ⇔ (p′, q′)Eh %̇s (q′, p′)Eh
Proof. Part (i) is quite clear: DM2 and DṀ2 have common preferences over ∆X if and only if their rankings of constant acts coincide. For (ii), let E = [ω, ω′]. Since p ≻s q for some s, the representation implies u(p) > u(q). Hence, there is an s such that (p, q)Eh ∼s (q, p)Eh. Using the representation, this means µω/µω′ = sω′/sω. But s also satisfies (p′, q′)Eh ∼̇s (q′, p′)Eh for any p′, q′ such that p′ ≻̇s q′, so that µ̇ω/µ̇ω′ = sω′/sω. Thus, µω/µω′ = µ̇ω/µ̇ω′ for all ω, ω′, forcing µ = µ̇.
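The identification step in part (ii) is easy to verify numerically. In the Python sketch below, all values are hypothetical, with u(p) = 1 > u(q) = 0 on a binary event E = (ω, ω′):

mu = (0.25, 0.75)            # DM2's (unobserved) prior on E
s = (mu[1], mu[0])           # a signal with s_w * mu_w = s_w' * mu_w'

def value_on_E(act, s):      # DM2's signal-contingent expected utility on E
    return s[0] * mu[0] * act[0] + s[1] * mu[1] * act[1]

# DM2 is indifferent between (p, q)Eh and (q, p)Eh at this signal...
assert abs(value_on_E((1.0, 0.0), s) - value_on_E((0.0, 1.0), s)) < 1e-12
# ...and the observed indifference signal reveals the prior ratio:
print(s[1] / s[0], mu[0] / mu[1])   # both 1/3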
6 Conclusion
In this paper, I have developed a revealed-preference model of information disclosure. By
considering both ex-ante preferences for information as well as ex-post (signal-contingent)
choices, the model axiomatically characterizes the testable implications of a large class of
sender-receiver models with commitment power (Bayesian persuasion). Interpreting the
sender and receiver as a single (potentially) dynamically inconsistent individual, I also show
how the sender’s preferences can be used to disentangle sources of dynamic inconsistency;
that is, an individual’s preference for information reveals whether his tastes or beliefs are
time consistent. Since preferences for hard commitment are not directly observed, violations
of the Blackwell ordering on information structures are used instead to perform this analysis.
Methodologically, this paper expands the scope of revealed-preference analysis to include preferences over information structures (Blackwell experiments). This provides a new
method of testing whether an individual is an expected utility maximizer and, if so, identifying what his underlying tastes and beliefs are. The model considered here also tests whether
the individual uses Bayes’ rule to update beliefs, and whether he correctly anticipates future
behavior and plans accordingly.
In principle, there is no reason why preferences for information should be limited to the
domain of Bayesian subjective expected utility theory. There are many other models of
behavior and, while expected utility is a natural starting point, it seems plausible that other
theories of decision might also be characterized in terms of their effects on informational
choice. I hope to explore these possibilities in future work.
References
Alonso, R. and O. Camara (2014). Bayesian persuasion with heterogeneous priors. Available at SSRN 2306820 .
Anscombe, F. J. and R. J. Aumann (1963). A definition of subjective probability. The Annals of Mathematical Statistics 34 (1), 199–205.
Aumann, R. J. and M. Maschler (1995). Repeated games with incomplete information. MIT Press.
Bassan, B., O. Gossner, M. Scarsini, and S. Zamir (2003). Positive value of information
in games. International Journal of Game Theory 32 (1), 17–31.
Benabou, R. (2006). Belief in a just world and redistributive politics. Quarterly Journal of Economics 121 (2).
Benabou, R. and J. Tirole (2002). Self-confidence and personal motivation. The Quarterly
Journal of Economics 117 (3), 871–915.
Bergemann, D. and S. Morris (2016). Bayes correlated equilibrium and the comparison of
information structures in games. Theoretical Economics 11 (2), 487–522.
Bielinska-Kwapisz, A. (2003). Sufficiency in Blackwell's theorem. Mathematical Social Sciences 46 (1), 21–25.
Blackwell, D. (1951). Comparison of experiments. In Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability, Volume 1, pp. 93–102.
Blackwell, D. (1953). Equivalent comparisons of experiments. The Annals of Mathematical Statistics 24 (2), 265–272.
Border, K. C. and U. Segal (1994). Dynamic consistency implies approximately expected
utility preferences. Journal of Economic Theory 63 (2), 170–188.
Caplin, A. and M. Dean (2015). Revealed preference, rational inattention, and costly
information acquisition. The American Economic Review 105 (7), 2183–2203.
Carrillo, J. D. and T. Mariotti (2000). Strategic ignorance as a self-disciplining device.
The Review of Economic Studies 67 (3), 529–544.
Crawford, V. P. and J. Sobel (1982). Strategic information transmission. Econometrica,
1431–1451.
Crémer, J. (1982). A simple proof of Blackwell's comparison of experiments theorem. Journal of Economic Theory 27 (2), 439–443.
de Oliveira, H. (2016). Blackwell's informativeness theorem using category theory. Working Paper.
Denti, T., M. Mihm, H. de Oliveira, and K. Ozbek (2016). Rationally inattentive preferences and hidden information costs. Theoretical Economics.
Dillenberger, D. (2010). Preferences for one-shot resolution of uncertainty and Allais-type behavior. Econometrica 78 (6), 1973–2004.
Ellis, A. (2013). Foundations for optimal inattention.
Ely, J., A. Frankel, and E. Kamenica (2015). Suspense and surprise. Journal of Political
Economy 123 (1), 215–260.
Epstein, L. G. and M. Le Breton (1993). Dynamically consistent beliefs must be Bayesian. Journal of Economic Theory 61 (1), 1–22.
Ghirardato, P. (2002). Revisiting Savage in a conditional world. Economic Theory 20 (1), 83–92.
Gossner, O. (2000). Comparison of information structures. Games and Economic Behavior 30 (1), 44–63.
Grant, S., A. Kajii, and B. Polak (1998). Intrinsic preference for information. Journal of
Economic Theory 83 (2), 233–259.
Grant, S., A. Kajii, and B. Polak (2000). Preference for information and dynamic consistency. Theory and Decision 48 (3), 263–286.
Gul, F., P. Natenzon, and W. Pesendorfer (2016). Random evolving lotteries and intrinsic
preference for information. Working paper .
Gul, F. and W. Pesendorfer (2001). Temptation and self-control. Econometrica 69 (6),
1403–1435.
Herstein, I. N. and J. Milnor (1953). An axiomatic approach to measurable utility. Econometrica, 291–297.
Kamenica, E. and M. Gentzkow (2011). Bayesian persuasion. American Economic Review 101 (6), 2590–2615.
Karni, E. (2007). Foundations of Bayesian theory. Journal of Economic Theory 132 (1), 167–188.
Karni, E. and D. Schmeidler (1991). Atemporal dynamic consistency and expected utility
theory. Journal of Economic Theory 54 (2), 401–408.
Kolotilin, A., M. Li, T. Mylovanov, and A. Zapechelnyuk (2015). Persuasion of a privately
informed receiver. Technical report.
Kreps, D. M. and E. L. Porteus (1978). Temporal resolution of uncertainty and dynamic
choice theory. Econometrica, 185–200.
Lehrer, E., D. Rosenberg, and E. Shmaya (2010). Signaling and mediation in games with
common interests. Games and Economic Behavior 68 (2), 670–682.
Leshno, M. and Y. Spector (1992). An elementary proof of Blackwell's theorem. Mathematical Social Sciences 25 (1), 95–98.
Lipnowski, E. and L. Mathevet (2016). Disclosure to a psychological audience. Working
Paper .
Lu, J. (2016). Random choice and private information. Econometrica (forthcoming).
Mathevet, L., J. Perego, and I. Taneva (2016). Information design: The epistemic approach. Working Paper .
Peski, M. (2008). Comparison of information structures in zero-sum games. Games and
Economic Behavior 62 (2), 732–735.
Rayo, L. and I. Segal (2010). Optimal information disclosure. Journal of Political Economy 118 (5), 949–987.
Strotz, R. H. (1955). Myopia and inconsistency in dynamic utility maximization. The
Review of Economic Studies 23 (3), 165–180.
Taneva, I. A. (2015). Information design. Working Paper .
Wang, T. (2003). Conditional preferences and updating. Journal of Economic Theory 108 (2), 286–321.
A Proof of Proposition 1
Preliminaries
In this section we review some basic definitions and results about affine spaces. Throughout,
we work with (nonempty) subsets of Rn .
If X ⊆ Rn, the affine hull of X is the set

aff(X) = { α0 x0 + · · · + αm xm : x0, . . . , xm ∈ X and α0 + · · · + αm = 1 }
Elements of aff(X) are called affine combinations of X. Clearly, co(X) ⊆ aff(X), where
co(X) is the convex hull of X.
A set X ⊆ Rn is an affine space if X = aff(X). Moreover, every affine space X is of the
form
X = a + Y := {a + y : y ∈ Y }
for some a ∈ Rn and linear subspace Y ⊆ Rn . Since Y is uniquely determined by X, we may
define the dimension of an affine space to be
dim(X) := dim(Y ),
where X = a + Y. We extend this definition to arbitrary convex subsets C ⊆ Rn by letting

dim(C) := dim(aff(C))

That is, the dimension of a convex set is the dimension of its affine hull.
Clearly the set ∆(X ) can be identified with a convex subset of Rn , where n = |X |. It is
easy to see that dim(∆(X )) = |X | − 1. Similarly, the set of Anscombe-Aumann acts can be
identified with the set ∆(X )×. . .×∆(X ) = ∆(X )|Ω| , and has dimension |Ω|(|X |−1). We will
move freely between the lottery/act and vector representations in several proofs. Finally, we
say that a convex subset C ⊆ ∆(X)m (m ≥ 1) has full dimension if dim(C) = dim(∆(X)m );
that is, if ∆(X)m ⊆ aff(C).
A set {x0, . . . , xm} ⊆ Rn is affinely independent if {x1 − x0, . . . , xm − x0} is linearly independent. If X ⊆ Rn is an affine space of dimension m and B = {x0, . . . , xm} ⊆ X is affinely independent, then B is an affine basis for X. In that case, every x ∈ X may be expressed in affine coordinates: for each x ∈ X, there are unique scalars α0, . . . , αm with α0 + · · · + αm = 1 such that x = α0 x0 + · · · + αm xm. Every affine space has an affine basis.
Let C ⊆ Rn be convex. A function T : C → R is linear if T(αx + (1 − α)y) = αT(x) + (1 − α)T(y) whenever x, y ∈ C and α ∈ [0, 1]. A function T∗ : C → R is affine if

T∗(α0 x0 + · · · + αn xn) = α0 T∗(x0) + · · · + αn T∗(xn)

whenever xi ∈ C, α0 x0 + · · · + αn xn ∈ C, and α0 + · · · + αn = 1. Clearly every affine function is linear; the converse also holds.

If C is convex and T : C → R is linear (hence affine), then T has a unique affine extension T∗ : aff(C) → R. That is, T∗ is affine and satisfies T∗(x) = T(x) for all x ∈ C.
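Several of the arguments below reduce to dimension counts. As an illustration (not part of the formal proof), the dimension of the affine hull of a finite point set equals the rank of its difference vectors, which is simple to compute in Python:

import numpy as np

def affine_dim(points):
    # dim(aff(X)) for finite X in R^n: the rank of {x_i - x_0}
    pts = np.asarray(points, dtype=float)
    return int(np.linalg.matrix_rank(pts[1:] - pts[0])) if len(pts) > 1 else 0

# The simplex over |X| = 4 outcomes has affine dimension 3,
# matching dim(Delta(X)) = |X| - 1 in the text.
print(affine_dim(np.eye(4)))   # 3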
Step 1: A linear representation for %A
Lemma 1. For each menu A and experiment σ, L^A(σ) is convex.

Proof. Suppose f, g ∈ L^A(σ) and α ∈ [0, 1]. Then for each s ∈ σ there are acts f^s, g^s ∈ ∆c^s(A) such that f = (Σ_{s∈σ} sω f^s_ω)_{ω∈Ω} and g = (Σ_{s∈σ} sω g^s_ω)_{ω∈Ω}. Thus

αf + (1 − α)g = ( α Σ_{s∈σ} sω f^s_ω + (1 − α) Σ_{s∈σ} sω g^s_ω )_{ω∈Ω}
= ( Σ_{s∈σ} sω [αf^s_ω + (1 − α)g^s_ω] )_{ω∈Ω}
= ( Σ_{s∈σ} sω h^s_ω )_{ω∈Ω}, where h^s := αf^s + (1 − α)g^s ∈ ∆c^s(A),

so that αf + (1 − α)g ∈ L^A(σ).
Definition 10. For each menu A, let MA := {L^A(σ) : σ ∈ E}. If X, Y ∈ MA and α ∈ [0, 1], let αX + (1 − α)Y := {αf + (1 − α)g : f ∈ X, g ∈ Y}.
Lemma 2. If σ, σ′ ∈ E and α ∈ [0, 1], then L^A(ασ ∪ (1 − α)σ′) = αL^A(σ) + (1 − α)L^A(σ′).

Proof. The statement clearly holds if α ∈ {0, 1}. So suppose α ∈ (0, 1) and let σ̂ = ασ ∪ (1 − α)σ′. Recall that L^A(σ) = { (Σ_{s∈σ} sω f^s_ω)_{ω∈Ω} : f^s ∈ ∆c^s(A) } and that L^A(σ′) = { (Σ_{s′∈σ′} s′_ω g^{s′}_ω)_{ω∈Ω} : g^{s′} ∈ ∆c^{s′}(A) }. Since σ̂ = {αs : s ∈ σ} ∪ {(1 − α)s′ : s′ ∈ σ′} we have:

L^A(σ̂) = { ( α Σ_{s∈σ} sω f^s_ω + (1 − α) Σ_{s′∈σ′} s′_ω g^{s′}_ω )_{ω∈Ω} : f^s ∈ ∆c^{αs}(A), g^{s′} ∈ ∆c^{(1−α)s′}(A) }

Note that c^{λt} = c^t for all signals t and scalars λ > 0. Thus

L^A(σ̂) = { ( α Σ_{s∈σ} sω f^s_ω + (1 − α) Σ_{s′∈σ′} s′_ω g^{s′}_ω )_{ω∈Ω} : f^s ∈ ∆c^s(A), g^{s′} ∈ ∆c^{s′}(A) }
= { αf + (1 − α)g : f ∈ L^A(σ), g ∈ L^A(σ′) },

as desired.
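The following Python sketch illustrates Lemma 2 in the simplest case, where all signal-contingent choices are single-valued so that each L^A(σ) is a single act; the menu, prior, and utilities are hypothetical.

mu = (0.5, 0.5)                      # DM2's prior over two states
u = (1.0, 0.0)                       # utility over two outcomes
A = {'f': ((1.0, 0.0), (0.0, 1.0)),  # act -> (lottery in state 0, in state 1)
     'g': ((0.0, 1.0), (1.0, 0.0))}

def chosen(s):
    # DM2's choice; scale-invariant, so c(lambda*s) = c(s) as used in the proof
    def val(a):
        return sum(s[w] * mu[w] * sum(p * q for p, q in zip(A[a][w], u))
                   for w in (0, 1))
    return max(A, key=val)

def L(sigma):
    # with single-valued choices, L^A(sigma) is the single act whose state-w
    # component is the signal-weighted mixture of the chosen lotteries
    return [[sum(s[w] * A[chosen(s)][w][x] for s in sigma) for x in (0, 1)]
            for w in (0, 1)]

s1 = [(0.7, 0.2), (0.3, 0.8)]
s2 = [(0.6, 0.4), (0.4, 0.6)]
alpha = 0.4
mix = ([tuple(alpha * x for x in s) for s in s1] +
       [tuple((1 - alpha) * x for x in s) for s in s2])
lhs = L(mix)                                          # L^A(a*s1 u (1-a)*s2)
rhs = [[alpha * p + (1 - alpha) * q for p, q in zip(rw, sw)]
       for rw, sw in zip(L(s1), L(s2))]               # a*L^A(s1) + (1-a)*L^A(s2)
assert all(abs(x - y) < 1e-12 for rw, sw in zip(lhs, rhs)
           for x, y in zip(rw, sw))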
Definition 11 (Mixture Space). A mixture space is a set M and an operator ⊕ : [0, 1] × M × M → M (where ⊕(α, m, m′) is written αm ⊕ (1 − α)m′) such that:

(i) 1m ⊕ 0m′ = m,

(ii) αm ⊕ (1 − α)m′ = (1 − α)m′ ⊕ αm, and

(iii) α[βm ⊕ (1 − β)m′] ⊕ (1 − α)m′ = (αβ)m ⊕ (1 − αβ)m′.
Lemma 3. For each menu A, the pair (MA, +) is a mixture space.
Proof. First, we verify that the family MA is closed under the proposed mixture operation. If X, Y ∈ MA, then there exist σ^X, σ^Y ∈ E such that L^A(σ^X) = X and L^A(σ^Y) = Y. Let α ∈ [0, 1]. To see that αX + (1 − α)Y ∈ MA, apply Lemma 2 to get L^A(ασ^X ∪ (1 − α)σ^Y) = αL^A(σ^X) + (1 − α)L^A(σ^Y) = αX + (1 − α)Y.

The remainder of the argument is standard and well-known, but reproduced here for completeness. Properties (i) and (ii) are simple to verify. For (iii), let Z = βX + (1 − β)Y. To see that αZ + (1 − α)Y = αβX + (1 − αβ)Y, observe that if h ∈ αZ + (1 − α)Y, then there are acts f ∈ X and g, g′ ∈ Y such that

h = α(βf + (1 − β)g) + (1 − α)g′
= αβf + α(1 − β)g + (1 − α)g′
= αβf + (1 − αβ)[ (α(1 − β)/(1 − αβ))g + ((1 − α)/(1 − αβ))g′ ]
∈ αβX + (1 − αβ)Y

Conversely, if h ∈ αβX + (1 − αβ)Y, then there are acts f ∈ X, g ∈ Y such that

h = αβf + (1 − αβ)g
= αβf + α(1 − β)g + (1 − α)g
= α(βf + (1 − β)g) + (1 − α)g
∈ αZ + (1 − α)Y

Hence, (MA, +) is a mixture space.
Lemma 4. Every %A has a unique (up to positive affine transformation) linear representation V^A : E → R.

Proof. The function L^A maps %A to a complete and transitive relation ⊵A on MA defined by:

X ⊵A Y ⇔ ∃σ^X, σ^Y such that L^A(σ^X) = X, L^A(σ^Y) = Y, and σ^X %A σ^Y

This is well-defined because the Foresight Axiom (A4) forces σ ∼A σ′ whenever L^A(σ) = L^A(σ′). Thus, the induced ranking of X and Y does not depend on the choice of representatives σ^X, σ^Y. Clearly every X ∈ MA has such a representative σ^X (recall that MA := {L^A(σ) : σ ∈ E}), and completeness and transitivity of ⊵A are inherited from %A. Let ▷A denote the strict part of ⊵A.

By Lemma 2 and the Independence Axiom (A2), ⊵A satisfies the standard vNM independence axiom. Indeed, if X ▷A Y and Z ∈ MA, then σ^X ≻A σ^Y for all representatives σ^X, σ^Y of X and Y. Axiom A2 implies ασ^X ∪ (1 − α)σ^Z ≻A ασ^Y ∪ (1 − α)σ^Z for all α ∈ (0, 1) and all representatives σ^Z of Z. Apply Lemma 2 and the definition of ⊵A to get αX + (1 − α)Z ▷A αY + (1 − α)Z, as desired.

A similar argument employing Lemma 2 and Axiom A3 establishes that ⊵A satisfies vNM Continuity: X ▷A Y ▷A Z implies there are α, β ∈ (0, 1) such that αX + (1 − α)Z ▷A Y ▷A βX + (1 − β)Z.

Thus, ⊵A is a preference relation satisfying the vNM axioms on the mixture space (MA, +). By the Mixture Space Theorem, ⊵A has a unique (up to positive affine transformation) linear representation W^A : MA → R. This induces a linear representation V^A : E → R for %A by defining V^A(σ) := W^A(L^A(σ)). Moreover, V^A satisfies

V^A(ασ ∪ (1 − α)σ′) = W^A(L^A(ασ ∪ (1 − α)σ′))
= W^A(αL^A(σ) + (1 − α)L^A(σ′)) (by Lemma 2)
= αW^A(L^A(σ)) + (1 − α)W^A(L^A(σ′)) (by linearity of W^A)
= αV^A(σ) + (1 − α)V^A(σ′),

so that V^A is a linear representation for %A.
Step 2: Construction of candidate representation
Recall that N = |X | and that u, µ denote the utility index and prior, respectively, for DM2.
Lemma 5. There exists an affinely independent set P = {p1 , . . . , pN } of interior lotteries
such that:
(i) u(pN ) > u(pN −1 ) > . . . > u(p1 ), and
(ii) u(p2 ) − u(p1 ) > u(p3 ) − u(p2 ) > . . . > u(pN ) − u(pN −1 ).
Proof. It is easy to find interior lotteries satisfying conditions (i) and (ii). If necessary,
perturb these lotteries along indifference curves (hyperplanes) in ∆X to arrive at an affinely
independent set.
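For instance (a purely hypothetical choice), the utility values u(p^i) = √i satisfy both conditions:

import math
N = 5
u_vals = [math.sqrt(i) for i in range(1, N + 1)]         # (i): strictly increasing
diffs = [b - a for a, b in zip(u_vals, u_vals[1:])]
assert all(d1 > d2 for d1, d2 in zip(diffs, diffs[1:]))  # (ii): decreasing increments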
Definition 12 (Symmetric Menu). Suppose P = {p1, . . . , pN} satisfies the requirements of Lemma 5 and that u(p) > u(p̄) for all p ∈ P. For each E = [ω, ω′], let

AE := { (pi, pN−i+1)Ep̄ : i = 1, . . . , N } = { (p1, pN)Ep̄, (p2, pN−1)Ep̄, . . . , (pN, p1)Ep̄ }

and let A∗ := ∪E AE. Then A∗ is the symmetric menu on (P, p̄).

Throughout the remainder of the proof, we take as given a menu A∗ symmetric on some pair (P, p̄).
Definition 13 (Interior Experiment). Fix a menu A. For each f ∈ A, let S A (f ) := {s ∈ S :
cs (A) = f }. An experiment σ is A-interior if:
(i) cs (A) is single-valued for all s ∈ σ, and
(ii) For each f ∈ A, there is exactly one s ∈ σ such that cs (A) = f .
Similarly, any set σ of signals (not necessarily qualifying as an experiment) is A-interior if
it satisfies conditions (i) and (ii) above. Such a set is necessarily nonempty and finite.
Definition 14 (ε-Neighborhood). Suppose σ ⊆ S is A-interior and let ε > 0. For each s ∈ σ, let Q^{s,ε} := Πω (sω − ε, sω + ε). Let B^ε denote the set of all A-interior sets σ′ ⊆ S such that:

(i) For each ω, Σ_{s′∈σ′} s′_ω = Σ_{s∈σ} sω, and

(ii) If s ∈ σ, s′ ∈ σ′, and c^{s′}(A) = c^{s}(A), then s′ ∈ Q^{s,ε}.

Then B^ε is an ε-neighborhood of σ.

Note that Definition 14 does not require σ to be an experiment, and that B^ε ⊆ E (in fact, B^ε ⊆ E∗(A)) if and only if σ is an experiment.
Definition 15 (Symmetric Experiment). For each E = [ω, ω′], let S^E := {s ∈ S : ω̂ ∉ E ⇒ sω̂ = 0}; these are the signals with support E. An experiment σ is symmetric with respect to E if σ = σ∗ ∪ {eω̂ : ω̂ ∉ E} where σ∗ is AE-interior and

(i) Every s ∈ σ∗ has support E;

(ii) If s = (sω, sω′)E0 ∈ σ∗, then s̄ := (sω, sω′)Ē0 ∈ σ∗, where Ē = [ω′, ω].

Lemma 6. For each E = [ω, ω′] and each f ∈ AE, there is an s ∈ S^E such that c^s(AE) = f. Moreover, if s ∈ S^E is such that c^s(AE) = (pi, pN−i+1)Ep̄, then c^{s̄}(AE) = (pN−i+1, pi)Ep̄.
Proof. First, I show that if (pi, pN−i+1)Ep̄ ≻s (pi+1, pN−(i+1)+1)Ep̄ for some s ∈ S^E, then (pi+1, pN−(i+1)+1)Ep̄ ≻s (pi+2, pN−(i+2)+1)Ep̄.

If s ∈ S^E and (pi, pN−i+1)Ep̄ ≻s (pi+1, pN−(i+1)+1)Ep̄, then

sω µω u(pi) + sω′ µω′ u(pN−i+1) > sω µω u(pi+1) + sω′ µω′ u(pN−(i+1)+1)

Equivalently,

sω′ µω′ [u(pN−i+1) − u(pN−(i+1)+1)] > sω µω [u(pi+1) − u(pi)]

Observe that, by our choice of P,

u(pi+1) − u(pi) > u(pi+2) − u(pi+1)    (8)

and

u(pN−(i+1)+1) − u(pN−(i+2)+1) > u(pN−i+1) − u(pN−(i+1)+1)    (9)

Thus,

sω′ µω′ [u(pN−(i+1)+1) − u(pN−(i+2)+1)] > sω′ µω′ [u(pN−i+1) − u(pN−(i+1)+1)] > sω µω [u(pi+1) − u(pi)] > sω µω [u(pi+2) − u(pi+1)]

so that (pi+1, pN−(i+1)+1)Ep̄ ≻s (pi+2, pN−(i+2)+1)Ep̄.

A similar argument establishes that if (pi, pN−i+1)Ep̄ ≻s (pi−1, pN−(i−1)+1)Ep̄ for some s ∈ S^E, then (pi−1, pN−(i−1)+1)Ep̄ ≻s (pi−2, pN−(i−2)+1)Ep̄.

Thus, for 1 < i < N, we have c^s(AE) = (pi, pN−i+1)Ep̄ if and only if

(pi, pN−i+1)Ep̄ ≻s (pi+1, pN−(i+1)+1)Ep̄ and (pi, pN−i+1)Ep̄ ≻s (pi−1, pN−(i−1)+1)Ep̄

Since s ∈ S^E, it cannot be the case that both sω = 0 and sω′ = 0. Suppose sω′ > 0. Using the representation for DM2, the above conditions are equivalent to

(µω′/µω) · [u(pN−(i−1)+1) − u(pN−i+1)] / [u(pi) − u(pi−1)] < sω/sω′ < (µω′/µω) · [u(pN−i+1) − u(pN−(i+1)+1)] / [u(pi+1) − u(pi)]

By (8) and (9) (with i − 1 in place of i), this yields an interval of values for sω/sω′ such that c^s(AE) = (pi, pN−i+1)Ep̄. Similar algebra shows that for any such s, the signal s̄ yields c^{s̄}(AE) = (pN−i+1, pi)Ep̄.

For i = 1 or i = N, observe that s ∈ S^E satisfies c^s(AE) = (p1, pN)Ep̄ if and only if (p1, pN)Ep̄ ≻s (p2, pN−1)Ep̄, while c^s(AE) = (pN, p1)Ep̄ if and only if (pN, p1)Ep̄ ≻s (pN−1, p2)Ep̄ (this follows from the first two claims established in this proof). Using the representation for DM2 in a similar manner, it is easy to see that there exist signals s ∈ S^E such that c^s(AE) = (p1, pN)Ep̄, and that every such s satisfies c^{s̄}(AE) = (pN, p1)Ep̄.
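The cutoff intervals derived in this proof are easy to check numerically. The Python sketch below uses the hypothetical values u(p^i) = √i (which satisfy Lemma 5) and prior weights (µω, µω′) = (0.4, 0.6):

import math
N, mu_w, mu_wp = 5, 0.4, 0.6
u = {i: math.sqrt(i) for i in range(1, N + 1)}

def dm2_pick(sw, swp):
    # act i pays u(p^i) in state w and u(p^{N-i+1}) in state w'
    return max(range(1, N + 1),
               key=lambda i: sw * mu_w * u[i] + swp * mu_wp * u[N - i + 1])

i = 3   # an interior act; the proof's interval for s_w/s_w' is (lo, hi):
lo = (mu_wp / mu_w) * (u[N - i + 2] - u[N - i + 1]) / (u[i] - u[i - 1])
hi = (mu_wp / mu_w) * (u[N - i + 1] - u[N - i]) / (u[i + 1] - u[i])
r = (lo + hi) / 2
assert dm2_pick(r / 2, 0.5) == i   # a ratio inside the interval selects act i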
Lemma 7. Suppose E = [ω, ω′] and let π : Ω → Ω be a bijection. Then:

(i) There exists an experiment that is symmetric with respect to E;

(ii) If σ is symmetric with respect to E, then L^{A∗}_ω(σ) = L^{A∗}_{ω′}(σ) and L^{A∗}_{ω̂}(σ) = pN for all ω̂ ∉ E.

(iii) If s ∈ S^E and π(s) := (s_{π−1(ω)})_{ω∈Ω}, then

c^{π(s)}(A∗) = π(c^s(A∗)) := (c^s_{π−1(ω)}(A∗))_{ω∈Ω}

(iv) If σ = σ∗ ∪ {eω̂ : ω̂ ∉ E} where σ∗ ⊆ S^E is AE-interior, then

L^{A∗}(π(σ)) = π(L^{A∗}(σ)) := (L^{A∗}_{π−1(ω)}(σ))_{ω∈Ω}

where π(σ) = {π(s) : s ∈ σ}.
Proof.

(i) First, observe that for any s ∈ S^E, c^s(A∗) ∈ AE. This follows from the fact that u(pi) > u(p̄) for all i = 1, . . . , N and the fact that only acts in AE = AĒ assign lotteries from P to both ω and ω′.

If N is even, let M = N/2; otherwise, let M = (N + 1)/2. For each i = 1, . . . , M − 1, choose a signal si ∈ S^E such that c^{si}(A∗) = (pi, pN−i+1)Ep̄. If N is odd, then take sM = eE0; otherwise, any sM ∈ S^E such that c^{sM}(A∗) = (pM, pN−M+1)Ep̄ will suffice. Then s̄i satisfies c^{s̄i}(A∗) = (pN−i+1, pi)Ep̄ for all i = 1, . . . , M. Let σ∗ = {s1, . . . , sM} ∪ {s̄1, . . . , s̄M}.

By construction, there is a scalar λ > 0 such that Σ_{s∈σ∗} s = (λ, λ)E0. If λ ≠ 1, replace σ∗ with the set {(1/λ)s : s ∈ σ∗}. Then Σ_{s∈σ∗} s = eE0.

To complete the experiment, let σ := σ∗ ∪ {eω̂ : ω̂ ∉ E}. Clearly, σ ∈ E is symmetric with respect to E.
(ii) Suppose σ := σ∗ ∪ {eω̂ : ω̂ ∉ E} is symmetric with respect to E. Then, for each i = 1, . . . , N, there is a unique si ∈ σ∗ (with support E) such that c^{si}(A∗) = (pi, pN−i+1)Ep̄. By symmetry, s^{N−i+1} = s̄i; that is, s^i_ω = s^{N−i+1}_{ω′}. Therefore

Σ_{s∈σ∗} sω c^s_ω(A∗) = Σ_{i=1}^N s^i_ω pi = Σ_{i=1}^N s^{N−i+1}_{ω′} pi = Σ_{s∈σ∗} sω′ c^s_{ω′}(A∗)

If signal eω̂ (ω̂ ≠ ω, ω′) is generated, then DM2 is indifferent among the acts f ∈ A∗ such that fω̂ = pN. Thus, every f^{eω̂} ∈ ∆c^{eω̂}(A∗) satisfies f^{eω̂}_{ω̂} = pN. Therefore L^{A∗}_ω(σ) = L^{A∗}_{ω′}(σ) and L^{A∗}_{ω̂}(σ) = pN.
(iii) First note that if s ∈ S^E and E = [ω, ω′], then π(s) = (sω, sω′)π(E)0 ∈ S^{π(E)}, where π(E) = [π(ω), π(ω′)]. It follows that c^{π(s)}(A∗) ∈ A^{π(E)} and that c^{π(s)}(A∗) = (pi, pN−i+1)π(E)p̄ if and only if c^s(A∗) = (pi, pN−i+1)Ep̄ := f^i. Since

(pi, pN−i+1)π(E)p̄ = π(f^i) := (f^i_{π−1(ω)})_{ω∈Ω}

the result follows.

(iv) This follows immediately from (iii).
The next two lemmas provide general results about menus and (neighborhoods of) experiments that induce full-dimensional sets of acts. These will be used in Step 3 of the proof
as well.
Lemma 8. Suppose that, for each ω, L∗ω ⊆ ∆X is full-dimensional. Let f ∗ ∈ A and define
L∗ω [ω]f ∗ := {p[ω]f ∗ : p ∈ L∗ω }. If X ⊆ A is convex and L∗ω [ω]f ∗ ⊆ X for all ω, then X has
full dimension.
Proof. It will suffice to show that every Anscombe-Aumann act is in the affine hull of
X. To begin, note that for each ω, aff(X) contains aff(L∗ω [ω]f ∗ ) = {p[ω]f ∗ : p ∈ ∆X } since
L∗ω has full dimension. Therefore aff(X) contains aff(C), where
C = ∪ω { p[ω]f∗ : p ∈ ∆X }
So, it is enough to find a finite set B ⊆ C such that A ⊆ aff(B). A natural candidate for B
involves the (affinely independent) set P = {p1 , . . . , pN } ⊆ ∆X . In particular, let
B = ∪_{ω∈Ω} { pi[ω]f∗ : i = 1, . . . , N }
Clearly B ⊆ C. To see that A ⊆ aff(B), let f ∈ A and let α = (αω)_{ω∈Ω} ∈ (0, 1)^Ω such that Σω αω = 1. For each ω, we have

( Σ_{ω′≠ω} αω′ f∗_ω ) / (1 − αω) = f∗_ω ∈ ∆X

Therefore, there is some f̂ω ∈ ∆X such that

fω = αω f̂ω + (1 − αω) · ( Σ_{ω′≠ω} αω′ f∗_ω ) / (1 − αω) = αω f̂ω + Σ_{ω′≠ω} αω′ f∗_ω

Since P is affinely independent with dim(aff(P)) = dim(∆X), for each ω there are numbers β^i_ω (i = 1, . . . , N) such that

f̂ω = Σ_{i=1}^N β^i_ω pi  and  Σ_{i=1}^N β^i_ω = 1

Thus

fω = αω f̂ω + Σ_{ω′≠ω} αω′ f∗_ω = αω Σ_{i=1}^N β^i_ω pi + Σ_{ω′≠ω} αω′ f∗_ω = Σ_{i=1}^N α^i_ω pi + Σ_{ω′≠ω} αω′ f∗_ω, where α^i_ω := αω β^i_ω

Note that Σ_{i=1}^N α^i_ω = αω for each ω, so that Σω Σ_{i=1}^N α^i_ω = Σω αω = 1. Then

Σω Σ_{i=1}^N α^i_ω pi[ω]f∗ = ( Σ_{i=1}^N α^i_ω pi + Σ_{ω′≠ω} Σ_{i=1}^N α^i_{ω′} f∗_ω )_{ω∈Ω} = ( αω f̂ω + Σ_{ω′≠ω} αω′ f∗_ω )_{ω∈Ω} = f
Thus, f ∈ aff(B), as desired.
Lemma 9. Suppose σ ∈ E is A-interior and that, for each ω, there is a nonempty B ⊆ A such that |B| = N and Bω := {fω : f ∈ B} is affinely independent. If B^ε is an ε-neighborhood for σ, then:

(i) For each ω, L^A(B^ε) := {L^A(σ′) : σ′ ∈ B^ε} has a subset of the form {p[ω]f∗ : p ∈ L∗}, where L∗ ⊆ ∆X is full-dimensional and f∗ = L^A(σ); and

(ii) L^A(B^ε) contains a full-dimensional ball around L^A(σ).
Proof.

(i) Let f∗ = L^A(σ) and f∗_{−B} := Σ_{s∈σ−B} sω c^s_ω(A), where σ^B := {s ∈ σ : c^s(A) ∈ B} and σ^{−B} := σ\σ^B. Then |σ^B| = N. Without loss of generality, let B^ε denote an ε-neighborhood of σ^B. Let ω ∈ Ω and note that for every σ′ ∈ B^ε, there is a natural bijection between signals of σ and signals of σ′; specifically, s ∈ σ and s′ ∈ σ′ are related if and only if c^s(A) = f^s = c^{s′}(A). For each s ∈ σ, let s′ denote the corresponding signal in σ′.

Fix a state ω and consider σ′ ∈ B^ε such that for all s ∈ σ and all ω′ ≠ ω, sω′ = s′_{ω′}. Thus, every such σ′ induces an act of the form p[ω]f∗, where

p ∈ { Σ_{s′∈σ′} s′_ω f^s_ω + f∗_{−B} : s′_ω ∈ (sω − ε, sω + ε) for all s′ ∈ σ′, and Σ_{s′∈σ′} s′_ω = Σ_{s∈σB} sω }
= { Σ_{s∈σB} (sω + δ^s) f^s_ω + f∗_{−B} : |δ^s| < ε and Σ_{s∈σB} δ^s = 0 }
= { f∗_ω + Σ_{s∈σB} δ^s f^s_ω : |δ^s| < ε and Σ_{s∈σB} δ^s = 0 }

So, it will suffice to show that the set

D := { Σ_{s∈σB} δ^s f^s_ω : |δ^s| < ε and Σ_{s∈σB} δ^s = 0 }

has dimension N − 1 (clearly, D is convex). Note that N − 1 is an upper bound on the dimension of D because D is a translation of a subset of ∆X.

Pick any s∗ ∈ σ^B and note that if Σ_{s∈σB} δ^s = 0, then δ^{s∗} = −Σ_{s∈σB\s∗} δ^s. Thus

D = { Σ_{s∈σB\s∗} δ^s f^s_ω − (Σ_{s∈σB\s∗} δ^s) f^{s∗}_ω : |δ^s| < ε ∀s ≠ s∗, and |Σ_{s∈σB\s∗} δ^s| < ε }
= { Σ_{s∈σB\s∗} δ^s (f^s_ω − f^{s∗}_ω) : |δ^s| < ε ∀s ≠ s∗, and |Σ_{s∈σB\s∗} δ^s| < ε }

Let λ^s := f^s_ω − f^{s∗}_ω for each s ∈ σ^B\s∗. Then {λ^s : s ∈ σ^B\s∗} is linearly independent because Bω = {f^s_ω : s ∈ σ^B} is affinely independent. Let

D′ := {0} ∪ { ((ε/2)/(N − 1)) λ^s : s ∈ σ^B\s∗ }

Then D′ is an affinely independent set of N vectors in R^N, so that its convex hull has dimension N − 1. Moreover, D contains the convex hull of D′ because if λ ∈ co(D′), then there are scalars α^s ∈ [0, 1] (s ∈ σ^B) such that Σ_{s∈σB} α^s = 1 and

λ = α^{s∗} · 0 + Σ_{s∈σB\s∗} α^s ((ε/2)/(N − 1)) λ^s

We have λ ∈ D because α^s(ε/2)/(N − 1) < ε for all s ∈ σ^B\s∗ and Σ_{s∈σB\s∗} α^s(ε/2)/(N − 1) < ε/2. Since D contains the convex hull of D′, and D′ has dimension N − 1, it follows that D has dimension N − 1 (recall that N − 1 is an upper bound on the dimension of D).

(ii) By part (i), L^A(B^ε) (hence A(A)) contains a subset of the form L∗_ω[ω]f∗ for each ω, where each set L∗_ω ⊆ ∆X has full dimension. Since A(A) is convex, apply Lemma 8 to get the result.
We now return to the analysis of the symmetric menu A∗ .
Lemma 10. There is a full-dimensional L∗ ⊆ ∆X such that, for all E = [ω, ω′], L∗EpN := {(p, q)EpN : p, q ∈ L∗} ⊆ A(A∗).
Proof. By Lemma 7, there is an experiment σ that is symmetric with respect to E such that L^{A∗}_ω(σ) = L^{A∗}_{ω′}(σ) and L^{A∗}_{ω̂}(σ) = pN for all ω̂ ≠ ω, ω′. Choose an ε-neighborhood B^ε for σ and apply part (i) of Lemma 9 to generate a set L∗_ω[ω]pN, where L∗_ω ⊆ ∆X has full dimension. Note that by symmetry and the definition of B^ε, any perturbation of σ along the ω coordinate can be mirrored in the ω′ coordinate, producing a new act that assigns the same lottery to states ω and ω′. Thus, the proof of part (i) of Lemma 9 actually yields a set L∗EpN, as desired. To see that this can be generated for every choice of E, apply part (iv) of Lemma 7.
Lemma 11. A(A∗ ) has full dimension.
Proof. This follows immediately from Lemma 10 and Lemma 9.
Lemma 12. Any linear representation W^{A∗} : A(A∗) → R of %^{A∗} on E∗(A∗) has a unique linear extension W : A → R. The extension represents a preference % on A satisfying all of the Anscombe-Aumann axioms except (possibly) the Non-Degeneracy axiom.
Proof. A linear representation W^{A∗} exists by Step 1 of the proof (restrict V^{A∗} to the domain E∗(A∗) to form W^{A∗}). By Lemma 11, W^{A∗} has a unique linear extension W : A → R. Clearly, this induces a complete and transitive relation % on A by letting f % g if and only if W(f) ≥ W(g). The Independence and Continuity axioms are satisfied by linearity of W.

To verify that % satisfies the State Independence axiom, suppose p[ω]h % q[ω]h. We want to show that p[ω′]h′ % q[ω′]h′. By a standard result, there exist linear functions Uω : ∆X → R (unique up to positive affine transformation) such that W(f) = Σω Uω(fω). Thus, p[ω]h % q[ω]h implies Uω(p) ≥ Uω(q).

Since L∗[ω]pN ⊆ A(A∗) for each ω, where L∗ ⊆ ∆X has full dimension, there exist r ∈ L∗ and α ∈ (0, 1) such that αp + (1 − α)r ∈ L∗ and αq + (1 − α)r ∈ L∗. Thus, (αp + (1 − α)r)[ω]pN, (αq + (1 − α)r)[ω]pN, (αp + (1 − α)r)[ω′]pN, and (αq + (1 − α)r)[ω′]pN are elements of A(A∗). Moreover, (αp + (1 − α)r)[ω]pN % (αq + (1 − α)r)[ω]pN because W((αp + (1 − α)r)[ω]pN) ≥ W((αq + (1 − α)r)[ω]pN) if and only if Uω(p) ≥ Uω(q) (recall that each Uω is linear).

Since %^{A∗} satisfies State Independence (Axiom A5) on the domain E∗(A∗), it follows that (αp + (1 − α)r)[ω′]pN % (αq + (1 − α)r)[ω′]pN. Therefore Uω′(p) ≥ Uω′(q), so that Uω′(p) + Σ_{ω̂≠ω′} Uω̂(h′_ω̂) ≥ Uω′(q) + Σ_{ω̂≠ω′} Uω̂(h′_ω̂). Thus, p[ω′]h′ % q[ω′]h′, as desired.
Note that, at this point, we cannot yet invoke the Anscombe-Aumann theorem to derive
unique candidates for ν and v. This is because the Non-Degeneracy axiom only requires
that some menu A (not necessarily A∗ ) has a non-degenerate preference %A . Step 3 of the
proof will show that all preferences %A (restricted to domains E ∗ (A)) derive from the same,
uniquely determined linear preference % on A. Then the Non-Degeneracy axiom will imply
that % does not assign indifference among all acts, so that (combined with the above result
proving that % satisfies State Independence) unique beliefs ν and preferences v can be found.
Step 3: Spreading the representation
Throughout the remainder of the proof, assume that DM2’s utility index u has been normalized to take values in [0, 1].
Definition 16. Let A and B be menus such that E∗(A) and E∗(B) are nonempty.

(i) A relation % on A agrees with %A if, for all σ, σ′ ∈ E∗(A), σ %A σ′ ⇔ L^A(σ) % L^A(σ′).

(ii) A inherits a representation from B if every % on A that agrees with B also agrees with A.

(iii) A and B share a representation if there is a unique % on A that agrees with both A and B.
Lemma 13. Let A and B be menus such that E ∗ (A) and E ∗ (B) are nonempty.
(i) If dim(A(A)) = dim(A(A) ∩ A(B)) ≤ dim(A(B)), then A inherits a representation
from B.
(ii) If dim(A(A)) = dim(A(A) ∩ A(B)) = dim(A(B)) = dim(A), then A and B share a
representation.
Proof. By the Consistency axiom, %A and %B agree on the domain A(A)∩A(B). By Lemma
4, the restriction of V B to A(A) ∩ A(B) is a linear function L. Since A(A) ∩ A(B) is convex,
L has a linear extension to A. Every such extension represents a linear % on A that agrees
with A and B, proving (i). For (ii), note that L has a unique linear extension to A whenever
dim(A(A) ∩ A(B)) = dim(A).
Lemma 14. Suppose A inherits a representation from B. If %B has an expected utility
representation with parameters (v, ν), then so does %A .
Proof. For convenience, let X = A(A) and Y = A(B). Let LB : Y → R denote the expected
utility representation for %B and LA : X → R a linear representation for %A (such an LA
exists by Lemma 4 and the fact that A(A) is convex). By the Consistency axiom, %A and
%B induce the same linear ordering on X ∩ Y . Let L∗ : X ∩ Y → R denote the restriction
of LB to the domain X ∩ Y . Since dim(X ∩ Y ) = dim(X) and X ∩ Y is convex, L∗ has a
unique linear extension to X. Thus, we may assume LA (f ) = L∗ (f ) for all f ∈ X ∩ Y . So,
on X ∩ Y , LA takes the desired form.
Let f ∈ X\Y . Since X is convex and dim(X) = dim(X ∩ Y ), there are g, h ∈ X ∩ Y
and α ∈ (0, 1) such that h = αf + (1 − α)g. Then LA (h) = αLA (f ) + (1 − α)LA (g). Since
LA = L∗ on X ∩ Y , it follows that
LA(f) = (1/α)[L∗(h) − (1 − α)L∗(g)]
= (1/α)[ Σω v(hω)νω − (1 − α) Σω v(gω)νω ]
= (1/α) Σω [ v(αfω + (1 − α)gω) − (1 − α)v(gω) ] νω
= (1/α) Σω [ αv(fω) + (1 − α)v(gω) − (1 − α)v(gω) ] νω
= Σω v(fω)νω,

as desired.
Definition 17. Let A be a menu.
1. If f ∈ A, the support of f is the set S A (f ) := {s ∈ S : cs (A) = f }.
2. A is a k-menu if |A| = k ≥ 2 and each f ∈ A has nonempty support.
3. A is independent if it is a k-menu for some k and, for each ω, there is an N -menu
B ⊆ A such that Bω := {fω : f ∈ B} is affinely independent.
Lemma 15. Suppose A is a k-menu.
(i) If f ∈ A, then S A (f ) is a convex cone and has full dimension;
(ii) There exists an A-interior experiment σ;
(iii) If A is independent, then A(A) has full dimension.
Proof.

(i) First, observe that s ∈ S^A(f) if and only if, for all g ∈ A,

Σω [ sω µω / (Σ_{ω′} sω′ µω′) ] u(fω) > Σω [ sω µω / (Σ_{ω′} sω′ µω′) ] u(gω) ⇔ Σω sω µω u(fω) > Σω sω µω u(gω)

It is now straightforward to verify that if s, t ∈ S^A(f), then λs ∈ S^A(f) for all λ > 0 such that λs ∈ S, and αs + (1 − α)t ∈ S^A(f) for all α ∈ [0, 1]. Thus, S^A(f) is a convex cone. To see that it is a full-dimensional subset of S := [0, 1]^Ω\0, note that since the above inequalities are strict, there is an open ball (in the subspace topology for S derived from the standard topology on R^Ω) around each s ∈ S^A(f); since the open ball has full dimension, the result follows.

(ii) Since A is finite and each set S^A(f) is a convex cone, there are signals s^f (f ∈ A) such that c^{s^f}(A) = f and, for each ω, Σ_{f∈A} s^f_ω ≤ 1 (simply choose any signals s^f ∈ S^A(f) and, if necessary, scale them all down by a factor α ∈ (0, 1) to ensure Σ_{f∈A} s^f_ω ≤ 1). For each ω, there is an f ∈ A such that u(fω) ≥ u(gω) for all g ∈ A. Thus, s^f_ω can be increased as needed to ensure Σ_{f∈A} s^f_ω = 1. Repeat this for each ω to get a well-defined experiment σ = {s^f : f ∈ A}.

(iii) By part (ii), there is an A-interior σ and, hence, an ε-neighborhood around σ. Let ω ∈ Ω. Since A is independent, there is an N-menu B ⊆ A such that Bω = {fω : f ∈ B} is affinely independent. Now apply Lemma 9.
Definition 18. A finite, nonempty set C of convex cones in S is a conic decomposition if C = {S^A(f) : f ∈ A} for some k-menu A. For each k-menu A, the set

C(A) := {S^A(f) : f ∈ A}

is the conic decomposition for A.
Definition 19. For each k-menu A and f ∈ A, let U(f) := (µω u(fω))_{ω∈Ω} denote the (virtual) utility coordinate for f, and let U(A) := {U(f) : f ∈ A} denote the utility profile for A. If a set U ⊆ R^Ω_+ satisfies U = U(A) for some k-menu A, then U is a k-utility profile. Finally, a finite set U ⊆ R^Ω_+ is a utility profile if U is a k-utility profile for some k.
Lemma 16. If A and B are k-menus such that U(A) = U(B), then C(A) = C(B).

Proof. This follows immediately from the definition of U(A) and the fact that s ∈ S^A(f) if and only if, for all g ∈ A, Σω sω µω u(fω) > Σω sω µω u(gω).

By Lemma 16, each utility profile U has an associated conic decomposition C(U). Specifically, C(U) is the unique C such that U(A) = U implies C(A) = C.
Definition 20. Let U be a utility profile and z = (zω)_{ω∈Ω} ∈ U. The support of z in U is the set

S^U(z) := { s ∈ S : Σω sω zω > Σω sω z′_ω for all z′ ∈ U\{z} }    (10)
Definition 21. Let U be a utility profile. For each z ∈ U and s ∈ S^U(z), let H(z, s) := {λ ∈ R^Ω : s · (λ − z) ≤ 0}. The support polytope of z in U, denoted T(z, U), is defined as

T(z, U) := ∩_{s∈S^U(z)} H(z, s).    (11)

The polytope of U, denoted T(U), is given by

T(U) := ∩_{z∈U} T(z, U).    (12)

A polytope T ⊆ R^Ω is a decision polytope if T = T(U) for some utility profile U; it is a k-polytope if T = T(U(A)) for some k-menu A.
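The objects in Definitions 20 and 21 are straightforward to compute. The Python sketch below (with a hypothetical 2-utility profile) tests membership in S^U(z) and in a half-space H(z, s):

import numpy as np

def in_support(s, z, U, tol=1e-12):
    # s is in S^U(z) iff s.z strictly exceeds s.z' for every other z' in U
    return all(np.dot(s, z) > np.dot(s, zp) + tol
               for zp in U if not np.array_equal(zp, z))

def in_halfspace(lam, z, s):
    # lam is in H(z, s) iff s.(lam - z) <= 0
    return float(np.dot(s, np.asarray(lam) - np.asarray(z))) <= 0

U = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
s = np.array([0.8, 0.2])
print(in_support(s, U[0], U))              # True: 0.8 > 0.2
print(in_halfspace([0.5, 0.5], U[0], s))   # True: -0.4 + 0.1 <= 0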
Definition 22. Let T be a decision polytope. For each face F of T, let η^F ∈ S^Ω_+ := {η ∈ R^Ω_+ : ‖η‖ = 1} be the normal to the hyperplane associated with F. Let N(T) := {η^F : F is a face of T} denote the set of normals for T.
The next task is to show that every k-menu inherits a representation from some independent ℓ-menu, and that all independent menus share a representation. The proof is divided into three parts.
Part 1: Every k-menu inherits a representation from some independent menu
Lemma 17 (Vertex Expansion). Let A be a k-menu. There is an act g ∉ A such that B = A ∪ {g} is a (k + 1)-menu and A inherits a representation from B.

Proof. Let σ ∈ E be A-interior and choose 2ε > 0 such that B^{2ε} is a 2ε-neighborhood of σ. Then B^ε is an ε-neighborhood where, for all σ′ ∈ B^ε and all s ∈ σ′, the closure of Q^{s,ε} is in the interior of S^A(f), where f = c^s(A).

Let f ∈ A. For each σ′ ∈ B^ε and each s ∈ σ′, consider the half-space H(f, s) := {λ ∈ R^Ω_+ : s · (λ − U(f)) ≤ 0}. This is the half-space (bounded above but not below) whose bounding hyperplane has normal s and passes through U(f). Let T∗ be (the closure of) the intersection over all H(f, s) where f ∈ A and s ∈ σ′ ∈ B^ε. Notice that for each f, the set B^ε(f) := {s ∈ S : c^s(A) = f and s ∈ σ′ ∈ B^ε} is an (open) convex cone in S, and a strict subset of int(S^A(f)) by our choice of ε. Thus, B^ε(f) and B^ε(f′) are strictly separated whenever f ≠ f′, and therefore T(A) ⊊ T∗. Pick any point u∗ ∈ [T∗\T(A)] ∩ R^Ω_+ and an act g with U(g) = u∗. Then B = A ∪ {g} is the desired menu.

To see why A and B share a representation, note that (by construction) c^s(A) = c^s(B) for all s ∈ σ′ ∈ B^ε. Hence, L^A(σ′) = L^B(σ′) whenever σ′ ∈ B^ε. Since dim(A(A)) = dim(L^A(B^ε)) and L^A(B^ε) = L^B(B^ε) ⊆ A(B), it follows that %A inherits a representation from %B.
Lemma 18. Let A be a k-menu. There exists an independent menu B such that A inherits a representation from B.

Proof. Fix an A-interior experiment σ and a neighborhood B^ε of the form used in the proof of Lemma 17. It is easy to see that a similar argument can be used to add N additional vertices to the region T∗\T(A) to yield a (k + N)-polytope. Moreover, these vertices can be chosen so that for each state ω, the ω coordinates yield N distinct, interior utility values. We are free to pick any N lotteries p^1_ω, . . . , p^N_ω yielding these utility values. Clearly, these can be chosen to form an affinely independent set. Now let f^i = (p^i_ω)_{ω∈Ω} ∈ A, and let B = A ∪ {f^1, . . . , f^N}.
Part 2: Oriented translations share a representation
Definition 23. Let A and B be independent menus. Then B is a translation of A if there
exists λ∗ ∈ RΩ such that T (B) = T (A)+λ∗ := {λ+λ∗ : λ ∈ T (A)}. The notation B = A+λ∗
means T (B) = T (A) + λ∗ .
Lemma 19. If B = A + λ∗, then:

(i) The map ψ : U(A) → U(B) given by ψ(z) := z + λ∗ is a bijection. Hence, there is a bijection ψ : A → B where ψ(f) denotes the unique g ∈ B such that U(g) = U(f) + λ∗.

(ii) C(B) = C(A).

Proof. Part (i) is clear. For part (ii), observe that s ∈ S^A(f) if and only if

Σω sω µω u(fω) > Σω sω µω u(gω) for all g ∈ A\{f}
⇔ Σω sω [µω u(fω) + λ∗_ω] > Σω sω [µω u(gω) + λ∗_ω] for all g ∈ A\{f}
⇔ s ∈ S^B(ψ(f))

It follows that C(B) = C(A).
Definition 24. Suppose B is a translation of A, and let ψ : A → B denote the associated bijection (Lemma 19). The affine path from f to ψ(f) is the map α ↦ f^α := (1 − α)f + αψ(f) for α ∈ [0, 1], and the affine path from A to B is the map α ↦ A^α := {f^α : f ∈ A} for α ∈ [0, 1].
Definition 25. A bijection ϕ : P → Q between two sets of N lotteries is oriented if

(i) For all p, p′ ∈ P, u(p) > u(p′) implies u(ϕ(p)) > u(ϕ(p′)), and

(ii) For each α ∈ [0, 1], the set {(1 − α)p + αϕ(p) : p ∈ P} is affinely independent.

Independent menus A and B are oriented if B is a translation of A and, for each ω, the map ϕω : Aω → Bω given by ϕω(fω) := ψ(f)ω is oriented, where Aω := {fω : f ∈ A}, Bω := {gω : g ∈ B}, and ψ : A → B is the associated bijection (Lemma 19).
Note that not all translations B = A + λ∗ are oriented; in fact, it is possible to construct
menus A and B such that U (A) = U (B) (so that B is trivially a translation of A) but where
A and B are not oriented, so that even T (A) = T (B) is not enough to guarantee that A and
B are oriented. So, some care is needed when applying the following lemma:
Lemma 20. If A and B are oriented menus, then A and B share a representation.

Proof. Since A and B are oriented, there is a λ∗ ∈ R^Ω such that B = A + λ∗ and an associated bijection ψ : A → B (Lemma 19).

Consider the affine path associated with ψ (Definition 24), and note that for each α, A^α = A + αλ∗; that is, T(A^α) = T(A) + αλ∗.

Thus, every A-interior (B-interior) experiment σ is also A^α-interior. Pick such a σ and a corresponding neighborhood B^ε, and let f^α := L^{Aα}(σ). Importantly, L^{Aα}(B^ε) contains a full-dimensional subset of A because A^α is an independent menu (since A and B are oriented).

For every α, f^α is in the interior of L^{Aα}(B^ε). Let δ(α) > 0 denote the radius of the largest open ball around f^α contained in L^{Aα}(B^ε); call this ball B^α. Clearly, f^α and δ(α) are continuous in α. Therefore δ∗ = min_α δ(α) is well-defined.

Now construct a finite sequence α(0), α(1), . . . , α(I) such that α(0) = 0, α(I) = 1, and d(f^{α(i)}, f^{α(i−1)}) < δ∗/2 for all i = 1, . . . , I, where d denotes the standard Euclidean metric. This can be done because f^α is continuous in α. Notice that f^{α(i)} ∈ B^{α(i−1)} for all i = 1, . . . , I. Thus, B^{α(i)} and B^{α(i−1)} intersect in a full-dimensional region, so that A^{α(i)} and A^{α(i−1)} share a representation. Hence, A and B share a representation.
Part 3: Independent menus share a representation
Lemma 21 (Face Expansion). Let A be an independent menu and suppose N = N(A) ∪ {λ} for some λ ∈ S^Ω_+. Then there is an independent menu B such that:

(i) N(B) = N;

(ii) A and B share a representation.
Proof. Fix an A-interior experiment σ and an ε-neighborhood B^ε around σ. Without loss of generality, no s ∈ σ is of the form s = γλ for any γ > 0 (if necessary, choose some other σ′ ∈ B^ε and redefine σ to be σ′). Let f∗ := L^A(σ). Since A is independent, the set {L^A(σ′) : σ′ ∈ B^ε} contains a ball of radius δ around f∗ for some δ > 0.

Let H := {λ′ ∈ R^Ω : λ · λ′ = ζ} denote the (unique) hyperplane with normal λ that intersects the boundary (but not the interior) of T(A). The half-space H∗(ζ) := {λ′ ∈ R^Ω : λ · λ′ ≤ ζ} below H contains T(A). Shifting H∗ toward the origin by a small amount (that is, taking H∗(ζ′) with ζ′ < ζ) and intersecting with T(A) yields a new decision polytope T′ where one or more vertices of T(A) are split into multiple vertices. This means that for at least one f ∈ A, the vertex z^f = U(f) ∈ T(A) is split into vertices z^{f1}, . . . , z^{fn} in T′, and the set S^A(f) is divided into convex cones S(f^i) ⊆ S^A(f) where S(f^i) := {s ∈ S : s · z^{fi} > s · z′ ∀z′ ≠ z^{fi}}.

By construction, T′ has a face with normal λ. By letting ζ′ → ζ, T′ converges to T(A). Therefore, if the vertex z^f ∈ T(A) corresponding to some f ∈ A is split into z^{f1}, . . . , z^{fn} in T′, the coordinates z^{fi} each converge to z^f as ζ′ → ζ. Therefore, acts f^i such that U(f^i) = z^{fi} can be chosen such that f^i → f as ζ′ → ζ. Moreover, the acts corresponding to new vertices can be chosen so that the resulting menu B is independent (perturb the constituent lotteries along indifference curves for u if necessary).

Thus, there is a ζ′ near ζ for which the corresponding menu B satisfies d(f∗, L^B(σ)) < δ; that is, L^B(σ) is in the interior of the ball of radius δ around f∗. Since B is independent, A(B) contains a ball of radius δ′ around L^B(σ) for some δ′ > 0. Thus, dim(A(A) ∩ A(B)) = dim(A), so that A and B share a representation.
Lemma 22. Suppose A is a k-menu and B ⊆ A such that c^e(A) ∈ B. There exists an experiment σ such that c^s(A) ∈ B for all s ∈ σ. Moreover, σ may be chosen so that for each f ∈ B, σ contains a signal s^f such that c^{s^f}(A) = f.

Proof. Let f^e ∈ B denote the act satisfying c^e(A) = f^e. For each f ∈ B\f^e, pick s^f such that c^{s^f}(A) = f; such s^f exist because A is a k-menu. Let s := Σ_{f∈B\f^e} s^f, and choose α ∈ (0, 1) such that e − αs ∈ S^A(f^e). Such an α exists because for small enough α, e − αs is close to e ∈ S^A(f^e), which is a full-dimensional subset of S = [0, 1]^Ω\0. Finally, let σ = {αs^f : f ∈ B\f^e} ∪ {e − αs}. Since c^{λt} = c^t for all λ > 0 such that λt ∈ S, it follows that σ is a well-defined experiment satisfying all desired properties.
Lemma 23. Suppose U is a k-utility profile and U′ is an ℓ-utility profile such that T = T(U) and T′ = T(U′) satisfy e ∈ N(T) ∩ N(T′). For each choice of A and B such that U = U(A) and U′ = U(B), there exists an N-utility profile U∗ and a λ ∈ R^Ω such that:

(i) U ∪ U∗ is a (k + N)-utility profile and U′ ∪ (U∗ + λ) is an (ℓ + N)-utility profile

(ii) There is a z ∈ U∗ such that e ∈ S^{U∪U∗}(z) and e ∈ S^{U′∪(U∗+λ)}(z + λ)

(iii) If U∗ = U(A∗) and U∗ + λ = U(B∗), then A inherits a representation from A ∪ A∗ and B inherits a representation from B ∪ B∗.
Proof. Let A and B satisfy U = U(A) and U′ = U(B). Choose an A-interior experiment σ and a corresponding neighborhood B^ε, and a B-interior σ′ with neighborhood B^{ε′}. As in the proof of Lemma 17, the half-spaces corresponding to signals s ∈ σ̂ ∈ B^ε passing through the point U(f^s) (where f^s = c^s(A)) intersect to form a space T∗(A) such that T(A) ⊆ T∗(A). Moreover, T∗(A)\T(A) contains a full-dimensional subset of R^Ω near the face of T(A) with normal e because every s ∈ σ̂ ∈ B^ε is bounded away from e. In other words, T∗(A)\T(A) contains a full-dimensional subset of the region above the hyperplane corresponding to this face. A similar argument yields a region T∗(B) for which analogous statements hold.

Thus, there is a δ > 0 such that both T∗(A)\T(A) and T∗(B)\T(B) contain an open ball of radius δ. Letting D^A denote such a ball in T∗(A)\T(A) and D^B the ball in T∗(B)\T(B), it follows that D^B = D^A + λ for some λ ∈ R^Ω.

The profile U∗ is constructed as follows. First, pick a point z^1 ∈ D^A. Then z^1 + λ ∈ D^B. By our choice of D^A and D^B, we have that T(U ∪ {z^1}) is a (k + 1)-polytope such that e ∈ S^{U∪{z1}}(z^1); that is, if some act f^1 satisfies U(f^1) = z^1, then c^e(A ∪ {f^1}) = f^1. Since this is a strict preference, there is in fact a full-dimensional, convex set of signals s such that c^s(A ∪ {f^1}) = f^1, and e belongs to the interior of this set. Similar statements hold for B ∪ {g^1} for any g^1 such that U(g^1) = z^1 + λ. Therefore, there is a full-dimensional set of signals s such that c^s(A ∪ {f^1}) = f^1 and c^s(B ∪ {g^1}) = g^1. Call the set of all such s the support of z^1.

We now proceed by induction. Suppose U∗ = {z^1, . . . , z^n} ⊆ D^A is such that each z ∈ U∗ has full-dimensional support. That is, for any A∗ such that U(A∗) = U∗ and each f ∈ A∗, the set S^z = S^{A∪A∗}(f) ∩ S^{B∪(A∗+λ)}(g) has full dimension, where g ∈ B∗ satisfies U(g) = U(f) + λ. Pick an s in the interior of S^z such that s ≠ e and consider the hyperplane H(s; z) with normal s passing through z. Now pick a point z^{n+1} ∈ H(s; z)\z; if z^{n+1} is sufficiently close to z, then z^{n+1} ∈ D^A, T(U ∪ U∗ ∪ {z^{n+1}}) is a (k + n + 1)-polytope, and T(U′ ∪ (U∗ ∪ {z^{n+1}} + λ)) is an (ℓ + n + 1)-polytope. Moreover, z^{n+1} has full-dimensional support.

The resulting set U∗ = {z^1, . . . , z^N} clearly satisfies (i) and (ii). For (iii), note that our original choice of D^A and D^B guarantees that for all s ∈ σ̂ ∈ B^ε, c^s(A ∪ A∗) = c^s(A), and s′ ∈ σ̂′ ∈ B^{ε′} implies c^{s′}(B ∪ B∗) = c^{s′}(B). Thus, L^A(B^ε) ⊆ A(A ∪ A∗) and L^B(B^{ε′}) ⊆ A(B ∪ B∗), so that dim(A(A)) ≤ dim(A(A ∪ A∗)) and dim(A(B)) ≤ dim(A(B ∪ B∗)).
Lemma 24. Suppose U, U′ ⊆ (0, 1) are sets of cardinality N. There exist sets P, Q ⊆ ∆X and a bijection ϕ : P → Q such that

(i) U = {u(p) : p ∈ P} and U′ = {u(q) : q ∈ Q}, and

(ii) ϕ is oriented.
Proof. Consider the indifference curves (hyperplanes) in ∆X corresponding to the utilities
in U ∪ U 0 . There is an edge E of ∆X such that each of these planes intersects the (relative)
interior of E. Specifically, E is any edge connecting lotteries δb and δw for any choice of
b, w ∈ X such that u(b) ≥ u(x) ≥ u(w) for all x ∈ X . Since each utility level is interior,
it can be expressed as a non-degenerate mixture of u(b) and u(w), forcing the associated
hyperplane to intersect the relative interior of E. Parallel to this edge is an interior line L
passing through (the interior of) each hyperplane, so that in fact there is an ε > 0 such that
every parallel ε perturbation of L passes through each hyperplane. Let B ⊆ ∆X denote
the region spanned by these perturbations; clearly, B has dimension equal to that of ∆X
(namely, N − 1).
Now pick N − 1 lines L1 , . . . , LN −1 in B, each parallel to L, such that the convex hull of
{L1, . . . , LN−1} has dimension N − 1. Rank the numbers ui ∈ U so that u1 > u2 > · · · > uN. For i = 1, . . . , N − 1, let pi be the (unique) intersection of Li and the indifference plane for utility ui, and let pN be the unique intersection of LN−1 with the indifference plane for
utility uN . Observe that {p1 , . . . , pN −1 } lie on a hyperplane H in ∆X and that pN is not in
the affine hull of H because LN −1 passes through H at a single point (pN −1 ) while pN lies
at a different point on LN −1 . Thus, P = {p1 , . . . , pN } is affinely independent.
Using the same lines L1, . . . , LN−1 and the same rank-based construction for U′ yields an affinely independent set Q = {q1, . . . , qN} where u(q1) > · · · > u(qN).
Now consider P α := {(1−α)pi +αq i : i = 1, . . . , N }. Observe that (1−α)u(pi )+αu(q i ) >
(1 − α)u(pi+1 ) + αu(q i+1 ) for all i = 1, . . . , N − 1 because u(pi ) > u(pi+1 ) and u(q i ) > u(q i+1 ).
Notice also that (1 − α)pi + αq i is on line Li (i = 1, . . . , N − 1) and (1 − α)pN + αq N is on
LN −1 . Thus, by the same argument, P α is affinely independent. Hence, the map ϕ : P → Q
given by ϕ(pi ) = q i (i = 1, . . . , N ) is oriented.
Lemma 25. If A and B are independent, then A and B share a representation.

Proof. By Lemma 21, we may assume that e ∈ N(A) and e ∈ N(B). Then, by Lemma 23, there is a utility profile U and a λ ∈ R^Ω such that if U = U(A∗) and U′ := U + λ = U(B∗), then A and A′ := A ∪ A∗ share a representation, and B and B′ := B ∪ B∗ share a representation. In fact, by Lemma 22, A′ shares a representation with A∗ provided A∗ is independent. Similarly, B′ shares a representation with B∗ provided B∗ is independent. Therefore, it will suffice to find independent menus A∗ and B∗ such that U = U(A∗), U′ = U(B∗), and such that A∗ and B∗ share a representation.

To do so, choose a state ω and apply Lemma 24 to the sets Uω := {zω : z ∈ U} and U′ω := {z′_ω : z′ ∈ U′} to get affinely independent sets Pω := {p^z_ω : z ∈ U} and Qω := {q^{z′}_ω : z′ ∈ U′} such that u(p^z_ω) = zω and u(q^{z′}_ω) = z′_ω for all z ∈ U and z′ ∈ U′ (if necessary, apply a small perturbation to U and U′ in order to get N distinct utility values in Uω for each ω, and N distinct utility values in U′ω for all ω). Repeating this for each ω yields acts f^z := (p^z_ω)_{ω∈Ω} and g^{z′} := (q^{z′}_ω)_{ω∈Ω} for each z ∈ U and z′ ∈ U′. Then A∗ := {f^z : z ∈ U} and B∗ := {g^{z′} : z′ ∈ U′} are oriented, so that by Lemma 20, A∗ and B∗ share a representation.
Lemma 26. There is a unique, linear L∗ : A → R such that, for all k-menus A, V^A(σ) := L∗(L^A(σ)) represents %A for all σ ∈ E∗(A).

Proof. By Lemma 25, all independent menus share a representation. This means there is a unique linear % on A that agrees with each relation %A for A independent. This % also agrees with %A for every k-menu A, since every k-menu inherits a representation from an independent menu (Lemma 18). To construct L∗, choose any independent menu A and consider the linear representation V^A (Lemma 4) restricted to the domain A(A). Since A(A) has full dimension, V^A has a unique linear extension to A. Take L∗ to be this extension.
Proof of Proposition 1
Let A be an arbitrary menu and consider the utility profile U(A). If U(A) consists of a single point, or if E∗(A) = ∅, there is nothing to prove. Otherwise, let σ, σ′ ∈ E∗(A). Then there is a submenu A′ ⊆ A that is a k-menu (for some k) such that L^{A′}(σ) = L^A(σ) and L^{A′}(σ′) = L^A(σ′). By the Consistency Axiom (A7), σ %A σ′ if and only if σ %^{A′} σ′. Hence, any linear representation for E∗(A′) gives a linear representation for E∗(A). In particular, if some pair (ν, v) gives an expected utility representation on some (any) independent menu A, then it gives a linear representation for E∗(B) for all menus B (Lemma 14).

The only remaining task is to pin down the desired uniqueness properties for ν and v. First, note that by the Non-Degeneracy axiom, the (unique) linear representation L∗ of Lemma 26 must be non-constant; otherwise, by the previous paragraph, every %A assigns indifference among all experiments in E∗(A). Thus, by Lemma 12, %^{A∗} (uniquely) extends to % on A (where A∗ is the symmetric menu constructed in Step 2), so that % satisfies all of the Anscombe-Aumann axioms, including Non-Degeneracy. Thus, % has an expected utility representation with a unique ν and a unique (up to positive affine transformation) utility index v. Since L∗ is a linear representation for %, it follows that the expected utility representation holds for all menus %A on E∗(A).
B Proof of Proposition 2
Axioms B1–B5 imply that for each s, c^s is rationalized by an Anscombe-Aumann representation with prior µ^s and utility index u^s. Axiom B4 implies that u^s is a positive affine transformation of u^{s′} for all s, s′, so we may assume u^s = u for all s. That is, every %s has a representation of the form

f %s g ⇔ Σω µ^s_ω u(fω) ≥ Σω µ^s_ω u(gω)

We will refer to this as the expected utility representation for %s. To complete the proof, we only need to find a (full-support) µ such that, for all s, µ^s is the Bayesian posterior induced by prior µ and signal s.
Lemma 27. If sω > 0, then µ^s_ω > 0.

Proof. If µ^s_ω = 0, then

u(p)µ^s_ω + Σ_{ω′≠ω} u(hω′)µ^s_{ω′} = u(q)µ^s_ω + Σ_{ω′≠ω} u(hω′)µ^s_{ω′}

for all p, q ∈ ∆X and all h ∈ A. Thus p[ω]h ∼s q[ω]h, so that p[ω′]h ∼s q[ω′]h (for all ω′) by Axiom B4. Pick ω′ such that µ^s_{ω′} > 0. Then u(p)µ^s_{ω′} = u(q)µ^s_{ω′}, forcing u(p) = u(q). This holds for all p, q ∈ ∆X, so that f ∼s g for all f, g ∈ A. This contradicts Axiom B2.
Let e ∈ S denote the signal where eω = 1 for all ω.
Lemma 28. For every E = [ω, ω 0 ], there are acts f, g, h such that f Eh ∼e gEh, u(gω ) −
u(fω ) > 0, and u(fω0 ) − u(gω0 ) > 0.
Proof. By Lemma 27, δ := µeω /µeω′ is well-defined. Suppose δ ≥ 1. Let p, p′ be interior such that u(p) − u(p′ ) > 0. Take fω′ = p, gω′ = p′ , gω = p, and fω = αp + (1 − α)p′ . Then u(gω ) − u(fω ) > 0 and u(fω′ ) − u(gω′ ) > 0 for all α ∈ [0, 1). Moreover,

[u(fω′ ) − u(gω′ )]/[u(gω ) − u(fω )] = [u(p) − u(p′ )]/[u(p) − αu(p) − (1 − α)u(p′ )] = 1/(1 − α)

Since δ ≥ 1, take α = 1 − 1/δ ∈ [0, 1). Then

[u(fω′ ) − u(gω′ )]/[u(gω ) − u(fω )] = δ = µeω /µeω′

so that

u(fω )µeω + u(fω′ )µeω′ = u(gω )µeω + u(gω′ )µeω′ .

Therefore f Eh ∼e gEh for all h. The proof for δ ≤ 1 is similar.
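As a numerical check on this construction (the values are hypothetical, not from the text): suppose µe = (2/3, 1/3) on E = [ω, ω′ ], so δ = µeω /µeω′ = 2, and normalize u(p) = 1, u(p′ ) = 0. Taking α = 1 − 1/δ = 1/2 gives u(fω ) = 1/2, and indeed

u(fω )µeω + u(fω′ )µeω′ = (1/2)(2/3) + (1)(1/3) = 2/3 = (1)(2/3) + (0)(1/3) = u(gω )µeω + u(gω′ )µeω′ ,

so f Eh ∼e gEh, as claimed.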
Lemma 29. If f ∼s g, α ∈ (0, 1) and t = sE(αs), then (αf + (1 − α)h)Ef ∼t (αg + (1 −
α)h)Eg for all h.
Proof. Suppose toward a contradiction that

(αf + (1 − α)h)Ef ≻t (αg + (1 − α)h)Eg     (13)

for some h. By the expected utility representation for %t , (13) actually holds for every choice of h. In particular, h = f gives

f ≻t (αg + (1 − α)f )Eg     (14)

Take F = Ω\E and let t′ = tF (αt). Notice that t′ = (αt)Et = (αs)E(αs) = αs. Applying Axiom B5 to (14) with F and t′ gives

(αf + (1 − α)h′ )F f ≻αs [α[(αg + (1 − α)f )Eg] + (1 − α)h′ ]F [(αg + (1 − α)f )Eg]   ∀h′
Subbing in h′ = f yields

f ≻αs [(αg + (1 − α)f )Eg]E[α[(αg + (1 − α)f )Eg] + (1 − α)f ]
  = [αg + (1 − α)f ]E[αg + (1 − α)f ]
  = αg + (1 − α)f

By Continuity (B3), %αs = %s . Thus, f ≻s αg + (1 − α)f . Since %s has an expected utility representation, this contradicts the original assumption that f ∼s g. Thus, (αg + (1 − α)h)Eg %t (αf + (1 − α)h)Ef for all h. A similar argument establishes (αf + (1 − α)h)Ef %t (αg + (1 − α)h)Eg for all h.
Lemma 30. Let E = [ω, ω′ ] and s ∈ S such that sω > 0 or sω′ > 0. Then f Eh ∼e gEh implies (sω′ gω + (1 − sω′ )fω , gω′ )Eh ∼s (fω , sω fω′ + (1 − sω )gω′ )Eh.
Proof. Let α = sω and β = sω′ . Consider E′ = Ω\ω and t′ = eE′ (αe) = (α, 1)[ω, ω′ ]e. Lemma 29 implies

[α(f Eh) + (1 − α)ĥ]E′ [f Eh] ∼t′ [α(gEh) + (1 − α)ĥ]E′ [gEh]   ∀ĥ

Equivalently, for all ĥ,

(fω , αfω′ + (1 − α)ĥω′ )[ω, ω′ ](αh + (1 − α)ĥ) ∼t′ (gω , αgω′ + (1 − α)ĥω′ )[ω, ω′ ](αh + (1 − α)ĥ)

Subbing in ĥ = g gives

(fω , αfω′ + (1 − α)gω′ )[ω, ω′ ](αh + (1 − α)g) ∼t′ (gω , gω′ )[ω, ω′ ](αh + (1 − α)g)     (15)

Using the expected utility representation for %t′ , (15) clearly holds if, on the complement of E, αh + (1 − α)g is replaced with any act. Thus,

(fω , αfω′ + (1 − α)gω′ )Eh ∼t′ gEh     (16)

Now take E″ = Ω\ω′ and let t″ = t′ E″ (βt′ ) = (α, β)[ω, ω′ ]e. Applying Lemma 29 to (16), E″ , and t″ gives g̃(ĥ) ∼t″ f̃(ĥ) for all ĥ, where

g̃(ĥ) := (β(gEh) + (1 − β)ĥ)E″ (gEh)
      = (gEh)[ω′ ](β(gEh) + (1 − β)ĥ)
      = (βgω + (1 − β)ĥω , gω′ )E(βh + (1 − β)ĥ)
and

f̃(ĥ) := [β[(fω , αfω′ + (1 − α)gω′ )Eh] + (1 − β)ĥ]E″ [(fω , αfω′ + (1 − α)gω′ )Eh]
      = (αfω′ + (1 − α)gω′ )[ω′ ][β[(fω , αfω′ + (1 − α)gω′ )Eh] + (1 − β)ĥ]
      = (βfω + (1 − β)ĥω , αfω′ + (1 − α)gω′ )E(βh + (1 − β)ĥ)

Thus, substituting ĥ = f into g̃(ĥ) ∼t″ f̃(ĥ) yields

(βgω + (1 − β)fω , gω′ )E(βh + (1 − β)f ) ∼t″ (fω , αfω′ + (1 − α)gω′ )E(βh + (1 − β)f )     (17)

Using the expected utility representation, (17) holds if, on E c , βh + (1 − β)f is replaced with any other act. Thus,

(βgω + (1 − β)fω , gω′ )Eh ∼t″ (fω , αfω′ + (1 − α)gω′ )Eh     (18)
Since α = sω and β = sω′ , the desired acts are indifferent under signal t″ = (α, β)Ee = sEe. If Ω = {ω, ω′ }, we are done. Otherwise, there is at least one ω̂ ≠ ω, ω′ . We proceed inductively to show that the indifference holds for all signals of the form sEt where tω̂ > 0 for all ω̂ ≠ ω, ω′ . Then a continuity argument will establish indifference under signal s.
Suppose tω̂ > 0. Let F = Ω\ω̂ and t̂ = t″ F (tω̂ t″ ) = (α, β, tω̂ )[ω, ω′ , ω̂]e. Applying Lemma 29 to (18) with F and t̂ gives f̂(ĥ) ∼t̂ ĝ(ĥ) for all ĥ, where

f̂(ĥ) := [tω̂ [(βgω + (1 − β)fω , gω′ )Eh] + (1 − tω̂ )ĥ]F [(βgω + (1 − β)fω , gω′ )Eh]
      = hω̂ [ω̂][(tω̂ (βgω + (1 − β)fω ) + (1 − tω̂ )ĥω , tω̂ gω′ + (1 − tω̂ )ĥω′ )E(tω̂ h + (1 − tω̂ )ĥ)]

and

ĝ(ĥ) := [tω̂ [(fω , αfω′ + (1 − α)gω′ )Eh] + (1 − tω̂ )ĥ]F [(fω , αfω′ + (1 − α)gω′ )Eh]
      = hω̂ [ω̂][(tω̂ fω + (1 − tω̂ )ĥω , tω̂ (αfω′ + (1 − α)gω′ ) + (1 − tω̂ )ĥω′ )E(tω̂ h + (1 − tω̂ )ĥ)]

Using these expressions together with the expected utility representation for %t̂ yields

tω̂ µt̂ω [βu(gω ) + (1 − β)u(fω )] + tω̂ µt̂ω′ u(gω′ ) = tω̂ µt̂ω u(fω ) + tω̂ µt̂ω′ [αu(fω′ ) + (1 − α)u(gω′ )]

Since tω̂ > 0, we may cancel tω̂ and add ∑ω″≠ω,ω′ µt̂ω″ u(hω″ ) to both sides. Thus

(βgω + (1 − β)fω , gω′ )Eh ∼t̂ (fω , αfω′ + (1 − α)gω′ )Eh
So, the desired indifference holds at signal t̂ = (α, β, tω̂ )[ω, ω′ , ω̂]e. If there exists some ω ∗ ∈ Ω\{ω, ω′ , ω̂}, apply the above argument again, this time with F = Ω\ω ∗ and t∗ = t̂F (tω∗ t̂) = (α, β, tω̂ , tω∗ )[ω, ω′ , ω̂, ω ∗ ]e, where tω∗ > 0. Clearly, repeating this procedure yields the desired indifference for all signals of the form sEt where tω″ > 0 for all ω″ ≠ ω, ω′ .

To see that (sω′ gω + (1 − sω′ )fω , gω′ )Eh ∼s (fω , sω fω′ + (1 − sω )gω′ )Eh, suppose that one of these acts is strictly preferred over the other at s. By Axiom B3, there is a neighborhood of s such that every signal in the neighborhood yields the same strict preference. But, as is easily verified, every neighborhood of s in the given topology contains a signal of the form sEt where tω″ > 0 for all ω″ ≠ ω, ω′ . As shown above, such signals yield indifference between the two acts. Thus, indifference must hold at s.
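To illustrate Lemma 30 with the same hypothetical numbers used after Lemma 28: let u(p) = 1, u(p′ ) = 0, fω = (1/2)p + (1/2)p′ , fω′ = gω = p, gω′ = p′ , µe = (2/3, 1/3) on E, and s = (1/2, 1/4) on E. Anticipating the Bayesian posteriors delivered by Proposition 2, µs ∝ (sω µeω , sω′ µeω′ ) = (1/3, 1/12), i.e. µs = (4/5, 1/5). On E, the first act has utilities (sω′ u(gω ) + (1 − sω′ )u(fω ), u(gω′ )) = (5/8, 0) and the second has (u(fω ), sω u(fω′ ) + (1 − sω )u(gω′ )) = (1/2, 1/2); both have µs -expected utility (4/5)(5/8) = (4/5)(1/2) + (1/5)(1/2) = 1/2, consistent with the asserted indifference.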
Lemma 31. If sω = 0, then µsω = 0.
Proof. Since sω = 0 and s ∈ S, there is a state ω′ ≠ ω such that sω′ > 0. Let E = [ω, ω′ ]. By Lemma 28, there are acts f, g, h such that f Eh ∼e gEh and u(gω ) − u(fω ) > 0. Lemma 30 implies

µsω [sω′ u(gω ) + (1 − sω′ )u(fω )] + µsω′ u(gω′ ) = µsω u(fω ) + µsω′ [sω u(fω′ ) + (1 − sω )u(gω′ )]

and so

µsω sω′ [u(gω ) − u(fω )] = µsω′ sω [u(fω′ ) − u(gω′ )].

Substituting sω = 0 gives

µsω sω′ [u(gω ) − u(fω )] = 0.

Since sω′ > 0 and u(gω ) − u(fω ) > 0, this implies µsω = 0.
Lemma 32. If sω > 0 and sω′ > 0, then µsω /µsω′ = (sω µeω )/(sω′ µeω′ ).
Proof. Let E = [ω, ω′ ]. By Lemma 28, there are acts f, g, h such that f Eh ∼e gEh, u(gω ) − u(fω ) > 0, and u(fω′ ) − u(gω′ ) > 0. Moreover, the expected utility representation for %e implies

u(fω )µeω + u(fω′ )µeω′ = u(gω )µeω + u(gω′ )µeω′ .

Since µe has full support (Lemma 27), this implies

[u(fω′ ) − u(gω′ )]/[u(gω ) − u(fω )] = µeω /µeω′

As in the proof of Lemma 31, Lemma 30 implies

µsω sω′ [u(gω ) − u(fω )] = µsω′ sω [u(fω′ ) − u(gω′ )].

Since sω > 0 and sω′ > 0, Lemma 27 implies µsω > 0 and µsω′ > 0. Thus

µsω /µsω′ = (sω /sω′ ) · [u(fω′ ) − u(gω′ )]/[u(gω ) − u(fω )] = (sω µeω )/(sω′ µeω′ ),

as desired.
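For instance (hypothetical numbers): if µe = (2/3, 1/3) and s = (1/2, 1/4) on E, then µsω /µsω′ = (1/2 · 2/3)/(1/4 · 1/3) = (1/3)/(1/12) = 4.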
Proof of Proposition 2

To prove the proposition, let s ∈ S and observe that by Lemmas 27 and 31, µsω = 0 if and only if sω = 0, as prescribed by Bayes' rule. Combined with Lemma 32, this implies that the ratio µsω /µsω′ is pinned down for every choice of ω′ such that sω′ > 0. Notice that for any λ > 0, λµs := (λµsω )ω∈Ω yields the same ratios. Thus, µs is the unique probability distribution on the ray passing through ((sω µeω )/(sω′ µeω′ ))ω∈Ω for any choice of ω′ such that sω′ > 0. In other words, the ratios (sω µeω )/(sω′ µeω′ ) (sω′ > 0) pin down a point in projective space, which corresponds to a ray through the origin in RΩ . This ray intersects the probability simplex at a unique point. Since the probability distribution given by

µsω = (sω µeω )/(∑ω′ sω′ µeω′ )

is a point on this ray, it must coincide with µs . Hence, µs is the Bayesian posterior for signal s and prior µ := µe . This completes the proof.
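In a two-state illustration (hypothetical numbers): with prior µ = µe = (2/3, 1/3) and signal s = (1/2, 1/4), the ray passes through (sω µω )ω = (1/3, 1/12) and meets the probability simplex at (4/5, 1/5), which is exactly the Bayesian posterior: sω µω / ∑ω′ sω′ µω′ = (1/3)/(5/12) = 4/5 for the first state.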
C Proofs for Section 5
Proof of Proposition 3
First, suppose u is a positive affine transformation of v. Without loss of generality, we may
assume that u = v.
If σ ∈ E ∗ (A), then V A (σ) = ∑ω νω v(fωσ ), where f σ = L A (σ). If σ ∗ ∈ E ∗ (A), then for each ω there is a lottery pω such that for all f ∈ ceω (A), fω = pω . Since DM2 has a Bayesian representation, it follows that u(pω ) ≥ u(gω ) for all g ∈ A. Thus, v(pω ) ≥ v(gω ) for all g ∈ A as well. Therefore v(pω ) ≥ v(fωσ ) because fωσ is in the convex hull of {gω : g ∈ A}. Thus,

V A (σ ∗ ) = ∑ω νω v(pω ) ≥ ∑ω νω v(fωσ ) = V A (σ),
as desired.
For the converse, two intermediate claims are needed: for all interior p, p′ ∈ ∆X , (i) u(p) > u(p′ ) implies v(p) ≥ v(p′ ), and (ii) u(p) = u(p′ ) implies v(p) = v(p′ ). From (i) and (ii), it follows that u is a positive affine transformation of v because u and v are already known to be non-constant linear functions on ∆X . To see this, note that (ii) implies that every indifference (hyper)plane H ⊆ ∆X for u is a subset of an indifference (hyper)plane H′ for v. Since u and v are non-constant, both H and H′ must have dimension N − 2. It follows that H = H′ . Thus, u and v yield the same partition of ∆X into indifference classes. By (i), the indifference classes have the same strict ordering, so that u and v represent the same linear preference on ∆X .
To prove (i), suppose u(p) > u(p′ ) and consider the menu A = {(p, p′ )Eh, (p′ , p)Eh} for some E = [ω, ω′ ] and h ∈ A. Then there is an ε > 0 such that if s = (1 − ε, 0)E0 and s′ = (ε, 1)E0, then cs (A) = (p, p′ )Eh and cs′ (A) = (p′ , p)Eh. Thus, σ = {s, s′ } ∈ E ∗ (A), and

V A (σ ∗ ) − V A (σ) = [νω v(p) + νω′ v(p)] − [νω (sω v(p) + s′ω v(p′ )) + νω′ (sω′ v(p′ ) + s′ω′ v(p))]
                  = νω [s′ω (v(p) − v(p′ ))] + νω′ [(1 − s′ω′ )(v(p) − v(p′ ))]
                  = νω ε(v(p) − v(p′ )) + νω′ · 0 · (v(p) − v(p′ ))

This implies v(p) − v(p′ ) ≥ 0 because, by assumption, V A (σ ∗ ) ≥ V A (σ).
For (ii), suppose u(p) = u(p′ ). Since p and p′ are interior, there is a q ∈ ∆X such that u(q) > u(p) = u(p′ ). Let A = {(p, q)Eh, (q, p′ )Eh}. As above, there exists ε > 0 such that if s = (1 − ε, 0)E0 and s′ = (ε, 1)E0, then cs (A) = (q, p′ )Eh and cs′ (A) = (p, q)Eh. Thus, σ = {s, s′ } ∈ E ∗ (A), and

V A (σ ∗ ) − V A (σ) = [νω v(q) + νω′ v(q)] − [νω (sω v(q) + s′ω v(p)) + νω′ (sω′ v(p′ ) + s′ω′ v(q))]
                  = νω [(1 − sω )v(q) − s′ω v(p)] + νω′ [(1 − s′ω′ )v(q) − sω′ v(p′ )]
                  = νω ε(v(q) − v(p)) + νω′ · 0 · (v(q) − v(p′ ))

so that V A (σ ∗ ) ≥ V A (σ) implies v(q) ≥ v(p). Similar algebra for the experiment σ′ = {t, t′ } where t = (1, ε′ )E0 and t′ = (0, 1 − ε′ )E0 gives v(q) ≥ v(p′ ).

To see that v(p) = v(p′ ), suppose toward a contradiction that v(p) > v(p′ ). Since u(p) = u(p′ ), p, p′ are interior, and both v and u are linear, this means there is a q ∈ ∆X such that v(p) > v(q) > v(p′ ) and u(q) > u(p). But, by (i), this implies v(q) ≥ v(p), a contradiction. Hence, v(p) = v(p′ ).
Proof of Proposition 4
First, suppose ν = µ and let A = {(p, q)Eh, (q, p)Eh} be an E-menu for some E = [ω, ω′ ]. Since A is an E-menu, it is without loss of generality to assume u(p) > u(q). If v(p) = v(q), then %A is trivially Blackwell monotone. Otherwise, consider the case v(p) > v(q) (that is, DM1 and DM2 agree on the ranking of p and q).

Observe that DM2 strictly prefers (p, q)Eh over (q, p)Eh at a signal s if and only if sω µω u(p) + sω′ µω′ u(q) > sω µω u(q) + sω′ µω′ u(p); since u(p) > u(q), this is equivalent to sω µω > sω′ µω′ . At such an s, DM1 also strictly prefers (p, q)Eh over (q, p)Eh because ν = µ implies sω µω = sω νω and sω′ µω′ = sω′ νω′ , so that sω νω > sω′ νω′ . Thus, for all s,

argmaxf ∈A ∑ω̂ µsω̂ u(fω̂ ) = argmaxf ∈A ∑ω̂ νsω̂ v(fω̂ )     (19)
where µs = νs is the Bayesian posterior upon observing signal s. It follows that DM1's ex-ante value of information function V A may be written as:

V A (σ) = ∑ω̂ νω̂ ∑s∈σ sω̂ v(fω̂s )   s.t.   f s ∈ argmaxf ∈A ∑ω̂ νsω̂ v(fω̂ )     (20)

for all σ ∈ E ∗ (A). As is well-known, σ %w σ′ if and only if σ yields a weakly greater ex-ante expected utility than σ′ in every admissible decision problem. Thus, σ %w σ′ implies σ %A σ′ .
If instead v(q) > v(p), so that DM1 and DM2 disagree on the ranking of p and q, then for all s,

argmaxf ∈A ∑ω̂ µsω̂ u(fω̂ ) = argminf ∈A ∑ω̂ νsω̂ v(fω̂ )     (21)
                        = argmaxf ∈A − ∑ω̂ νsω̂ v(fω̂ )     (22)

It follows that

−V A (σ) = ∑ω̂ νω̂ ∑s∈σ sω̂ v′ (fω̂s )   s.t.   f s ∈ argmaxf ∈A ∑ω̂ νsω̂ v′ (fω̂ )     (23)

where v′ := −v. Thus, σ %w σ′ implies −V A (σ) ≥ −V A (σ′ ); that is, σ %w σ′ implies σ′ %A σ.
For the converse, suppose µ ≠ ν and pick E = [ω, ω′ ] such that µω /µω′ ≠ νω /νω′ . Let A = {(p, q)Eh, (q, p)Eh} be an E-menu. Suppose u(p) > u(q) and consider the case v(p) > v(q). We will show that %A is not Blackwell monotone by constructing experiments σ, σ′ ∈ E ∗ (A) such that σ %w σ′ but σ′ ≻A σ, and experiments σ̂, σ̂′ ∈ E ∗ (A) such that σ̂ %w σ̂′ and σ̂ ≻A σ̂′ .
Let σ = {s, r, t} and let σ′ = {s′ , r′ , t′ } such that, for all ω̂ ≠ ω, ω′ , each signal in σ and σ′ assigns likelihood 0 to state ω̂. Furthermore, assume σ′ is obtained by the following garbling of σ:

                                             [ α    0    1 − α ]
[ s′ω  r′ω  t′ω  ]    [ sω   rω   tω  ]      [ β  1 − β    0   ]
[ s′ω′ r′ω′ t′ω′ ]  = [ sω′  rω′  tω′ ]      [ 0    0      1   ]     (24)

(For convenience, the entries for states ω̂ are omitted since they are all zero.) Thus, s′ = αs + βr, r′ = (1 − β)r, and t′ = (1 − α)s + t.
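As a check that the garbled signals still form an experiment (the likelihoods are hypothetical): take α = β = 1/2, s = (0.6, 0.2), r = (0.1, 0.6), and t = (0.3, 0.2) on E, so each state's likelihoods sum to one (0.6 + 0.1 + 0.3 = 0.2 + 0.6 + 0.2 = 1). Then s′ = (0.35, 0.4), r′ = (0.05, 0.3), and t′ = (0.6, 0.3), and again 0.35 + 0.05 + 0.6 = 0.4 + 0.3 + 0.3 = 1, as guaranteed by the fact that the rows of the garbling matrix in (24) sum to one.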
Notice that if rω′ /rω is sufficiently large, then both DM1 and DM2 strictly prefer act (q, p)Eh at signal r. Similarly, both strictly prefer (p, q)Eh at signal t if tω′ /tω is sufficiently small. However, since µω /µω′ ≠ νω /νω′ , there is a convex cone C ⊆ S such that if s ∈ C, then DM1 and DM2 disagree about which act in A is (strictly) optimal. Suppose that µω /µω′ > νω /νω′ , so that in C, DM1 prefers (q, p)Eh but DM2 prefers (p, q)Eh (the other case is resolved by a similar argument). Then s ∈ C if and only if νω /νω′ < sω′ /sω < µω /µω′ .

Hence, we can choose s, r, and t such that each signal assigns positive likelihood to states ω and ω′ (and likelihood 0 to all other ω̂) and such that cr (A) = (q, p)Eh, ct (A) = (p, q)Eh, and cs (A) = (p, q)Eh. If σ = {s, r, t}, then
V A (σ) = [νω rω v(q) + νω′ rω′ v(p)] + [νω sω v(p) + νω′ sω′ v(q)] + [νω tω v(p) + νω′ tω′ v(q)]     (25)

Now, we want t′ = (1 − α)s + t to satisfy ct′ (A) = (p, q)Eh and s′ = αs + βr to satisfy cs′ (A) = (q, p)Eh. For any β ∈ (0, 1), r′ = (1 − β)r automatically satisfies cr′ (A) = (q, p)Eh provided cr (A) = (q, p)Eh. Thus, we must choose r, s, and t such that
[(1 − α)sω′ + tω′ ]/[(1 − α)sω + tω ] < µω /µω′   and   [αsω′ + βrω′ ]/[αsω + βrω ] > µω /µω′     (26)

as well as

rω′ /rω > µω /µω′   and   tω′ /tω < νω /νω′   and   νω /νω′ < sω′ /sω < µω /µω′     (27)

It is straightforward to construct an experiment σ = {r, s, t} satisfying (27). To satisfy (26), we shrink s toward the origin by scaling it down by some factor γ ∈ (0, 1); to remain an experiment, add weight (1 − γ)sω to tω and (1 − γ)sω′ to rω′ ; the resulting experiment still satisfies (27). Clearly, for small enough γ,

[γ(1 − α)sω′ + tω′ ]/[γ(1 − α)sω + tω ] < µω /µω′   and   [γαsω′ + βrω′ ]/[γαsω + βrω ] > µω /µω′

are satisfied because (27) is satisfied. Hence, the desired σ′ = {s′ , r′ , t′ } exists. Then
V A (σ′ ) = [νω r′ω v(q) + νω′ r′ω′ v(p)] + [νω s′ω v(q) + νω′ s′ω′ v(p)] + [νω t′ω v(p) + νω′ t′ω′ v(q)]
        = [νω rω v(q) + νω′ rω′ v(p)] + α[νω sω v(q) + νω′ sω′ v(p)] + (1 − α)[νω sω v(p) + νω′ sω′ v(q)] + [νω tω v(p) + νω′ tω′ v(q)]

Therefore

V A (σ′ ) − V A (σ) = α[νω sω v(q) + νω′ sω′ v(p)] − α[νω sω v(p) + νω′ sω′ v(q)] > 0

because νω sω v(q) + νω′ sω′ v(p) > νω sω v(p) + νω′ sω′ v(q) (this is precisely what it means for DM1 to prefer (q, p)Eh over (p, q)Eh at signal s). Thus, σ %w σ′ because σ′ is obtained by garbling σ, but V A (σ′ ) > V A (σ). To establish that %A is not Blackwell monotone, it is enough to find σ̂, σ̂′ ∈ E ∗ (A) such that σ̂ %w σ̂′ and σ̂ ≻A σ̂′ . This is actually quite simple: choose σ̂ = {(1, 0)E0, (0, 1)E0} and note that σ̂′ = {(α, β)E0, (1 − α, 1 − β)E0} is a garbling of σ̂. For α, β ∈ (0, 1), we have V A (σ̂′ ) < v(p) = V A (σ̂). Thus, σ̂ %w σ̂′ and σ̂ ≻A σ̂′ .
The analysis so far assumes DM1 and DM2 have the same strict ranking of p over q. If instead u(p) > u(q) while v(p) < v(q), a symmetric argument applies (they will agree on choices in the corresponding C region but disagree outside of it, so that analogous constructions can be performed). We do not need to consider the case v(p) = v(q) because only the existence of some E-menu violating Blackwell monotonicity is required, and it is always possible to find p and q such that u(p) > u(q) and v(p) ≠ v(q) because u and v are linear and non-constant.
Proof of Proposition 6
First, suppose t calibrates E = [ω, ω′ ] for DM1. Let A = {(p, q)Eh, (q, p)Eh} be the relevant E-menu and σ = {s, s′ }, σ̂ = {ŝ, ŝ′ } the relevant experiments. We may assume without loss of generality that f s := cs (A) = (p, q)Eh = cŝ (A) and f s′ := cs′ (A) = (q, p)Eh = cŝ′ (A). Note that ŝω̂ = sω̂ and ŝ′ω̂ = s′ω̂ for all ω̂ ≠ ω, ω′ since ŝ − s ≡ αt and s′ − ŝ′ ≡ αt for some α > 0. Then

V A (σ) − V A (σ̂) = ∑ω̂ νω̂ [sω̂ v(fω̂s ) + s′ω̂ v(fω̂s′ ) − ŝω̂ v(fω̂s ) − ŝ′ω̂ v(fω̂s′ )]
                 = νω [(sω − ŝω )v(p) + (s′ω − ŝ′ω )v(q)] + νω′ [(sω′ − ŝω′ )v(q) + (s′ω′ − ŝ′ω′ )v(p)]
                 = νω [−αtω v(p) + αtω v(q)] + νω′ [−αtω′ v(q) + αtω′ v(p)]
                 = ανω tω [v(q) − v(p)] − ανω′ tω′ [v(q) − v(p)]

Since A is non-degenerate, v(p) ≠ v(q). Then, since σ ∼A σ̂, it follows that

νω tω = νω′ tω′     (28)

Thus, any t that calibrates E = [ω, ω′ ] pins down νω /νω′ (recall that ν and µ have full support). If the same t calibrates E for DM1̇, then νω /νω′ = µω /µω′ . It is straightforward to find, for each E, a signal t that calibrates E for DM1. Thus, for all ω, ω′ , we have νω /νω′ = µω /µω′ , so that ν = µ.
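For example (hypothetical numbers): if t = (0.2, 0.4) on E calibrates E = [ω, ω′ ] for DM1, then (28) gives νω (0.2) = νω′ (0.4), so νω /νω′ = 2; any decision maker calibrated by the same t must likewise have prior odds of 2 on ω against ω′ .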
Proof of Proposition 7
Suppose p, p′ ∈ ∆X are interior. Then there is a q ∈ ∆X such that u(q) > u(p) and u(q) > u(p′ ). Let E = [ω, ω′ ] and A = {(q, p)Eh, (p′ , q)Eh}. Clearly, there exist interior signals s, s′ such that σ = {s, s′ } ∈ E ∗ (A), f s := cs (A) = (q, p)Eh, and f s′ := cs′ (A) = (p′ , q)Eh. Let t calibrate E for DM1; without loss of generality, there exist ŝ and ŝ′ such that ŝ − s ≡ t, s′ − ŝ′ ≡ t, and σ̂ = {ŝ, ŝ′ } ∈ E ∗ (A). Then cŝ (A) = f s and cŝ′ (A) = f s′ , so that

V A (σ) − V A (σ̂) = ∑ω̂ νω̂ [sω̂ v(fω̂s ) + s′ω̂ v(fω̂s′ ) − ŝω̂ v(fω̂s ) − ŝ′ω̂ v(fω̂s′ )]
                 = νω [(sω − ŝω )v(q) + (s′ω − ŝ′ω )v(p′ )] + νω′ [(sω′ − ŝω′ )v(p) + (s′ω′ − ŝ′ω′ )v(q)]
                 = νω [−tω v(q) + tω v(p′ )] + νω′ [−tω′ v(p) + tω′ v(q)]
                 = νω tω [v(p′ ) − v(q)] − νω′ tω′ [v(p) − v(q)]

Since σ %A σ̂, this implies

νω tω [v(p′ ) − v(q)] ≥ νω′ tω′ [v(p) − v(q)]     (29)

Observe that νω tω = νω′ tω′ because t calibrates E for DM1. Thus, we have v(p′ ) − v(q) ≥ v(p) − v(q), so that p′ Rp.

If DM1 and DM1̇ have identical preferences over lotteries, then the same algebra can be performed for DM1̇ (with a different calibrating signal) to yield p′ Ṙp. So, if v̇ is a positive affine transformation of v, then p′ Rp ⇔ p′ Ṙp. In fact, this algebra also establishes the converse, because if a menu A reveals p′ Rp, then the fact that p′ Ṙp can be used to construct a menu Ȧ = {(q̇, p)Eh, (p′ , q̇)Eh} and analogous experiments (replacing t with a signal ṫ that calibrates E for DM1̇) to derive v̇(p′ ) ≥ v̇(p). Thus, for all interior p, p′ ∈ ∆X , v(p′ ) ≥ v(p) if and only if v̇(p′ ) ≥ v̇(p). Since v and v̇ are linear, it follows that v̇ is a positive affine transformation of v.