Risk, Uncertainty, and Value
Lars Peter Hansen
University of Chicago and NBER
Thomas J. Sargent
New York University and Hoover Institution
February 24, 2013
Contents

List of Figures
1 Statistical Decision Theories
   1.1 Common components
   1.2 Two models of decisions
   1.3 Model Selection Example
   1.4 Bayesian Decision Rules
   1.5 Partial Ordering and Admissibility
   1.6 Max-min Decision Rules
   1.7 Risk and Ambiguity
Bibliography
List of Figures

1.1 This illustrates the max-min problem for two choices of the utility function. The inner concave curve is the decision outcome frontier for a model selection problem with a unit utility for making the correct model choice. The upper right concave curve is the risk frontier when the utility assigned to correctly identifying the θ1 model is increased. The intersection of the 45 degree line with these frontiers depicts risk outcomes with the max-min decision rules applied to the respective decision problems. The negatives of the slopes of the red tangent lines give the implied ratio of worst-case probabilities for the two parameter values.
Chapter 1
Statistical Decision Theories
This chapter explores two alternative approaches to decision theory under uncertainty. An econometrician or statistician poses a model of the probability distribution of outcomes that will be observed and on which decisions will be based. The model depends on an unknown parameter. Conditioned on this parameter, the model provides a complete statement of the probabilities of future outcomes. We think of this as “risk”. The parameter is unknown, and we call this lack of knowledge “ambiguity”. The decision maker represents this ambiguity with either a single prior distribution or multiple prior distributions. We first present a familiar decision theory that does not differentiate between risk and ambiguity and then a more general approach that allows the decision maker to respond differently to these two components of uncertainty. The latter approach nests the first as a special case.
1.1 Common components
Two prominent approaches to statistical decision making are cast within
the following common setting. A decision maker observes a random vector
Y ∈ Y and makes a decision d ∈ D that can depend on the realization y of
Y ; θ is an unknown parameter vector residing in a set Θ and the distribution
of the data Y given θ is described by a probability density ψ(y|θ) relative to
the measure τ over Y, where y denotes a generic element of Y. The decision
maker’s preferences are represented in terms of a utility function U (d, θ, y),
where d ∈ D is a decision. A decision rule is a (Borel measurable) function
D : Y → D. Integrate over y to construct expected utility conditioned on θ:

    Ū(D|θ) = ∫_Y U[D(y), θ, y] ψ(y|θ) τ(dy).
Since the function Ū depends on the unknown parameter θ, it cannot be
used by itself to compute an optimal decision rule. But the function Ū (D|θ)
will be a valuable common ingredient of alternative decision theories.
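To make this construction concrete, here is a minimal numerical sketch of Ū(D|θ). Everything in it is our illustrative assumption rather than the book's specification: two Gaussian models for a scalar outcome, a simple cutoff decision rule, and a Riemann sum standing in for integration against τ.

    import numpy as np
    from scipy.stats import norm

    # Illustrative assumption: Theta = {0.0, 1.0}, and psi(y|theta) is Gaussian with mean theta.
    thetas = [0.0, 1.0]
    psi = lambda y, th: norm.pdf(y, loc=th)

    # Grid over Y; a Riemann sum stands in for the measure tau(dy).
    y_grid = np.linspace(-6.0, 7.0, 2001)
    dy = y_grid[1] - y_grid[0]

    def U(d, th, y):
        # Illustrative utility: reward d when theta_1 is correct, 1 - d when theta_2 is.
        return d if th == thetas[0] else 1.0 - d

    def Ubar(D, th):
        # Ubar(D|theta) = integral over Y of U[D(y), theta, y] psi(y|theta) tau(dy)
        vals = np.array([U(D(y), th, y) * psi(y, th) for y in y_grid])
        return float(np.sum(vals) * dy)

    # A simple cutoff rule: choose the theta_1 model whenever y < 0.5.
    D = lambda y: 1.0 if y < 0.5 else 0.0
    print([Ubar(D, th) for th in thetas])   # (Ubar(D|theta_1), Ubar(D|theta_2))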
1.2 Two models of decisions
Two models of decision making differ in their assumptions about what a
decision maker knows about the statistical model ψ(y|θ). Both models assume that the decision maker does not know the parameter θ, but they
do so in different ways. The Bayesian model to be described in section
1.4 summarizes the decision maker’s ignorance of the statistical model by
a prior probability distribution over the unknown parameter vector θ. The
max-min model to be described in section 1.6 characterizes the decision
maker’s ignorance with a set of prior probability distributions over θ. The
max-min model includes the Bayesian model as a special case.
Section 1.5 defines admissibility, a minimal test for a decision rule to be
optimal. We describe the important result that if a decision rule is to be
admissible, it must be a Bayesian decision rule for some prior distribution.
Section 1.6 explains why max-min decision rules are admissible.
The Bayesian decision theory carries with it a convenient sense in which
the problem of statistical inference separates itself from the problem of
making a decision: optimal inferences about the unknown model are independent of the utility function U . The decision function D(y) can be
designed by a pure decision maker who needs to consult only the outcome
of the process of statistical inference. We can summarize the separation
of the processes of making inferences about θ and making decisions as a
function of data by saying that ex post beliefs about the statistical model
ψ(y|θ) do not depend on the utility function U . Here, by ex post, we mean
‘after observing Y ’. This separation of ex post beliefs from the choice of D
does not prevail under max-min decision theory when the decision maker
has multiple priors. When there is a way to construct a unique ex post probability distribution for θ under max-min decision theory, this distribution
depends on the utility function. Exploring how such ex post beliefs depend
on utility functions under max-min decision theory is an important theme
of this book.
1.3 Model Selection Example
Suppose that Θ = {θ1, θ2} where each value of θ corresponds to an alternative model. A decision maker wants to select the correct model, a task that we capture by the following utility function:

    U(d, θ1, y) = d
    U(d, θ2, y) = 1 − d

where d is the probability of choosing θ1 and 1 − d is the probability of choosing θ2. Thus we allow for randomization in decision making. We set D = [0, 1]. For this decision problem, a special class of decision rules is of particular interest. Partition Y into two disjoint sets Y1 and Y2, and consider a decision rule:

    D(y) = 1 if y ∈ Y1
           0 if y ∈ Y2.
These decision rules do not entail randomization, and the expected utilities conditioned on θ are:

    Ū(D|θ1) = ∫_{Y1} ψ(y|θ1) τ(dy)
    Ū(D|θ2) = ∫_{Y2} ψ(y|θ2) τ(dy).
Since Θ contains only two values, we can represent Ū(D|θ) as an ordered pair for each decision rule D. Notice that Ū(D|θ) is the probability that the decision maker selects model θ when θ is in fact correct, and consequently 1 − Ū(D|θ) is the probability that a mistake is made. Suppose we label θ1 as the null model and θ2 as the alternative model. Then 1 − Ū(D|θ1) is the probability of making a so-called type I error and 1 − Ū(D|θ2) the probability of making a type II error.
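In our running Gaussian illustration (again our own assumption, not the book's), these error probabilities are one-line computations for a cutoff partition of Y:

    from scipy.stats import norm

    # Illustrative assumption: psi(y|theta_1) = N(0,1), psi(y|theta_2) = N(1,1).
    psi1, psi2 = norm(loc=0.0), norm(loc=1.0)

    # Partition Y with a cutoff c: Y1 = {y < c}, Y2 = {y >= c}.
    c = 0.5
    Ubar1 = psi1.cdf(c)          # probability of choosing theta_1 when theta_1 is correct
    Ubar2 = 1.0 - psi2.cdf(c)    # probability of choosing theta_2 when theta_2 is correct

    print("type I error :", 1.0 - Ubar1)   # rejecting the null theta_1 when it is correct
    print("type II error:", 1.0 - Ubar2)   # accepting theta_1 when theta_2 is correct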
1.4 Bayesian Decision Rules
Suppose that we can summarize a decision maker's ignorance about θ by a probability measure π over Θ that we call a prior distribution. A Bayesian decision rule maximizes expected utility, where the mathematical expectation is taken with respect to the prior π over θ:

Problem 1.4.1.

    max_D ∫_Θ Ū(D|θ) π(dθ)
While the decision rule D depends on Y, the objective in decision problem 1.4.1 integrates out the dependence on the data realization y and on the unknown parameter θ. We refer to 1.4.1 as an ex ante decision problem.
An important finding is that we can solve the ex ante decision problem
1.4.1 by breaking it into two subproblems, one that makes an inference
about θ from an observation about Y , and another that, given that inference
about θ, chooses a decision rule. Thus, the ex ante decision problem can be
broken into an estimation part and an optimization part. To demonstrate
this decomposition, let
    π̄(dθ|y) = ψ(y|θ) π(dθ) / ∫_Θ ψ(y|θ̃) π(dθ̃)
be the posterior distribution conditioned on Y = y. The ex post decision
problem is:
Problem 1.4.2.

    max_{d∈D} ∫_Θ U(d, θ, y) π̄(dθ|y).
The maximizer d∗ of problem 1.4.2 depends on y and so induces a function
d∗ = D∗ (y). Provided that the resulting ex ante objective is finite, the
function D∗ (y) also solves the ex ante decision problem.
We can think of decentralizing the ex ante Bayesian decision problem by
assigning a statistician to compute the posterior π̄ without telling him the
utility function U (d, θ, y). Then we can hand the posterior to a pure decision
maker who solves the ex post decision problem. In this way, ‘estimation’
and ‘decision making’ can be separated.
Consider again the model selection example from section 1.3. Let π1 be the prior probability of θ1 and let π2 = 1 − π1 be the prior probability of θ2. Then the posterior probabilities are:

    π̄1(y) = π1 ψ(y|θ1) / [π1 ψ(y|θ1) + π2 ψ(y|θ2)]
    π̄2(y) = π2 ψ(y|θ2) / [π1 ψ(y|θ1) + π2 ψ(y|θ2)].
The objective for the conditional problem is:

    D(y) π̄1(y) + [1 − D(y)] π̄2(y).

A solution to the conditional problem is the threshold rule:

    D(y) = 1 if π̄1(y) > π̄2(y)
           0 if π̄1(y) ≤ π̄2(y)

which corresponds to a particular partition of Y. Since π̄2(y) = 1 − π̄1(y), the optimizing decision rule can equivalently be constructed by setting D(y) = 1 when π̄1(y) > 1/2 and setting D(y) to zero otherwise. In fact, when π̄1(y) = 1/2, any choice of D(y) ∈ [0, 1] attains the same conditional objective. In summary, the Bayesian solution to the model selection problem is simply to select the value of θ with the highest posterior probability. Notice that if the prior probabilities are one half for each model, then the solution is to pick the value of θ that maximizes the likelihood ψ(y|θ) viewed as a function of θ for a fixed y.
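A minimal sketch of this Bayesian recipe for the two-model example (the Gaussian likelihoods and the prior value are our illustrative assumptions):

    from scipy.stats import norm

    # Illustrative assumption: the two Gaussian models used above and a prior on theta_1.
    psi1, psi2 = norm(loc=0.0), norm(loc=1.0)
    pi1 = 0.3

    def posterior_1(y):
        # pibar_1(y) = pi_1 psi(y|theta_1) / [pi_1 psi(y|theta_1) + pi_2 psi(y|theta_2)]
        num = pi1 * psi1.pdf(y)
        return num / (num + (1.0 - pi1) * psi2.pdf(y))

    def D(y):
        # Ex post Bayesian rule: select theta_1 exactly when its posterior exceeds 1/2.
        return 1.0 if posterior_1(y) > 0.5 else 0.0

    y = 0.2
    print(posterior_1(y), D(y))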
1.5 Partial Ordering and Admissibility
The criterion ∫_Θ Ū(D|θ) π(dθ) on the right side of Problem 1.4.1 depends on the particular prior distribution π imputed to the decision maker. But it is worth noting that even without specifying a prior distribution over θ, the function Ū(D|θ) itself induces at least a partial ordering over decision rules. Consider D1 and D2. Under a partial ordering ⪰, D2 ⪰ D1 (i.e., ‘D2 is preferred to D1’) if

    Ū(D2|θ) ≥ Ū(D1|θ) for all θ ∈ Θ.
While some decision rules are dominated by others, ⪰ will not rank all pairs of decision rules, which is why we say that the ordering is only partial. Nevertheless, the partial ordering will identify some decision rules that cannot be dominated. A presumption that it makes sense to restrict our attention to undominated decision rules leads to the following
Definition 1.5.1. A decision rule D is said to be admissible if there does not exist a decision rule D̃ such that D̃ ⪰ D with Ū(D̃|θ̃) > Ū(D|θ̃) for some θ̃ in Θ.
Claim 1.5.2. Suppose that Θ has a finite number of elements. Consider a
prior that assigns positive probability to each element of Θ. Then a Bayesian
decision rule is necessarily admissible. Furthermore, consider a Bayesian
decision rule associated with some prior. If this decision rule is inadmissible, it can be strictly dominated only on a subset of Θ with zero prior
probability.
We sketch a proof and suggest how to avoid the possibility that a Bayesian rule is inadmissible. Suppose in fact that the Bayesian decision rule D is inadmissible. Let D̃ be a decision rule that dominates D in the sense of definition 1.5.1. Using the prior to average across values of θ, we know that the averages of the Ū's for D and D̃ must be the same because the Bayesian decision rule maximizes the criterion stated in problem 1.4.1. It is necessarily the case that

    Ū(D|θ) = Ū(D̃|θ)

for θ's that receive positive probability under π. The strict inequality

    Ū(D|θ) < Ū(D̃|θ)

can apply only to a subset of θ's that has prior probability zero. If all θ's have strictly positive prior probability, we can thus rule out the possibility of domination. Notice that even if D happens to be inadmissible because some of the prior probabilities are zero, we can replace D with D̃ as the solution to the Bayesian problem without altering the objective. Since D and D̃ solve the respective ex post problems, it can be shown that D̃ solves the ex ante problem with the original prior distribution. Thus, if the ex
ante problem has a unique solution (except possibly on sets with τ measure
zero), then the decision rule will be admissible even when some of the θ's
are assigned zero probability under the prior π. This argument
becomes more delicate when there is a continuum of values of θ.
We return again to the example in section 1.3. Consider a decision rule
based on the relative magnitudes of the log-likelihoods:
    D(y) = 1 if log ψ(y|θ2) − log ψ(y|θ1) < log r
           0 if log ψ(y|θ2) − log ψ(y|θ1) ≥ log r        (1.1)
for some threshold log r. This is a Bayesian decision rule for some prior. To
see this, note from our previous analysis that this would follow immediately
if we set
    log r = log π1 − log(1 − π1)        (1.2)
because the event
π2 ψ(y|θ2 ) < π1 ψ(y|θ1 )
coincides with the event
log ψ(y|θ2 ) − log ψ(y|θ1 ) < − log π2 + log π1 .
Given a positive real number r, we can solve (1.2) for an “implicit prior”:
    π1 = r / (1 + r).
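To illustrate the mapping between thresholds and implicit priors, here is a hypothetical numeric check (ours, not the book's):

    import numpy as np

    # By (1.2), a likelihood ratio threshold r corresponds to the implicit prior
    # pi_1 = r / (1 + r); equivalently, log r = log pi_1 - log(1 - pi_1).
    for r in [0.25, 1.0, 4.0]:
        pi1 = r / (1.0 + r)
        assert np.isclose(np.log(r), np.log(pi1) - np.log(1.0 - pi1))
        print(f"r = {r:4.2f}  ->  implicit prior pi_1 = {pi1:.3f}")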
Given that there exists a prior under which this threshold rule is a Bayesian rule, the decision rule is admissible; and this is true for any choice of the threshold r. A Bayesian decision maker interprets this threshold as a reflection of his or her prior beliefs. Degenerate rules that always select model θ1 or always select model θ2, independent of the data, are also admissible, albeit typically not particularly interesting. A Bayesian decision maker arrives at the degenerate rules by assigning all of the prior probability to one of the two possible values of θ.¹
¹Other admissible decision rules are given by replacing ≥ with > and < with ≤ in construction (1.1).
1.6 Max-min Decision Rules
We now present an alternative approach to constructing a decision rule. Let C denote a nonnegative convex function whose domain is the set of hypothetical prior probabilities π. This function can be infinite and equals zero for some choice of π. The convex function C is used to model ambiguity aversion associated with assigning the prior π. The decision maker solves

Problem 1.6.1.

    max_D min_π ∫_Θ Ū(D|θ) π(dθ) + C(π).
The preferences implicit in this decision problem are what Maccheroni et al.
(2006) call variational preferences. As we will discuss later, the function U
used in building Ū can capture risk aversion. The convex function C that
is added to the objective function can capture aversion to the assignment
of a prior π. The solution D∗ of the max-min problem 1.6.1 is a robust
decision rule in the sense that it is designed to guarantee a ‘good enough’
performance over the family of priors affiliated with C.2
We consider two important examples of convex functions C. One possible specification of C uses a convex family of probabilities Π to form

    C(π) = 0 if π ∈ Π
           ∞ if π ∉ Π.

This function imposes the constraint that π ∈ Π. For a Bayesian specification, Π contains only a single element. Gilboa and Schmeidler (1989) provide axioms that imply preferences that can be represented with a C function of this form. Even before their important contribution, a statistical decision theory embodying such preferences had a long history in the statistics literature.
More generally, the decision maker entertains a family of prior distributions by specifying the convex set Π. Another specification of C is called relative entropy. Suppose that Θ has n entries. Let π^o denote a reference prior, and let

    C(π) = κ ∑_{j=1}^n πj (log πj − log πj^o).
²Problem 1.6.1 is a zero-sum two-player game where one player chooses π to maximize −∫_Θ Ū(D|θ)π(dθ) − C(π) and another player chooses D to maximize ∫_Θ Ū(D|θ)π(dθ) + C(π). Notice that with this convention, the two objectives sum to zero.
Then C has a minimum at π = π^o. It is straightforward to verify that this specification of C is convex. By setting the parameter κ to be large, we highly penalize deviations from the baseline prior distribution π^o. In the κ = ∞ limit, this problem reduces to a Bayesian decision problem with prior π^o. This construction of C has a direct extension to more general parameter spaces and associated collections of prior probabilities. Within the economics literature, this choice of C was suggested by Hansen and Sargent (2001, 2007), motivated by an extensive literature from control theory.
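A small numerical sketch of this entropy penalty (our own illustration):

    import numpy as np

    def entropy_penalty(pi, pi_ref, kappa):
        # C(pi) = kappa * sum_j pi_j (log pi_j - log pi_ref_j); zero exactly at pi = pi_ref.
        pi, pi_ref = np.asarray(pi), np.asarray(pi_ref)
        return kappa * np.sum(pi * (np.log(pi) - np.log(pi_ref)))

    pi_ref = np.array([0.5, 0.5])
    print(entropy_penalty([0.5, 0.5], pi_ref, kappa=2.0))  # 0.0 at the reference prior
    print(entropy_penalty([0.9, 0.1], pi_ref, kappa=2.0))  # positive away from it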
The objective in problem 1.6.1 is convex in π for each D. If the objective
function is also concave in D for each π, then by virtue of the famous
Minimax Theorem (for instance, see Fan (1952)), the optimized objective
is the same as that of:
Problem 1.6.2.

    min_{π∈Π} max_D ∫_Θ Ū(D|θ) π(dθ) + C(π).
Since C(π) is additively separable in the objective function of problem 1.6.2,
for each hypothetical π, we compute a Bayesian decision rule. We then
minimize the objective over all such Bayesian decision rules. Provided that
a solution π̂ exists, we obtain a ‘worst-case prior’, that is, a prior such
that D∗ is the corresponding Bayesian decision rule. One interpretation of
this min-max problem is that we pick the prior in a robust way by using
the objective of the decision problem. Under this interpretation, we no
longer have the usual Bayesian separation of inference from decision-making
because the worst-case prior depends on the decision maker’s objective. It
is important to note that the max-min decision cannot be dominated except
possibly on sets to which the worst-case prior assigns probability zero.
We return again to the example from section 1.3, and we set C to be zero. Let

    z = φ(y) = ψ(y|θ2)/ψ(y|θ1),

and suppose for simplicity that the ψ(y|θ1)τ(dy) probability in y space implies a density υ over the positive real numbers in z space. Then

    Ū(D|θ1) = ∫_{φ(y)≤r} ψ(y|θ1) τ(dy) = ∫_0^r υ(z) dz
    Ū(D|θ2) = ∫_{φ(y)>r} φ(y) ψ(y|θ1) τ(dy) = ∫_r^∞ z υ(z) dz
where D is a threshold decision rule with threshold log r. Suppose that Ū(D|θ1) < Ū(D|θ2) for a given r. Then the minimizing decision maker who chooses π will place all of the prior probability on θ1, resulting in an objective Ū(D|θ1). This objective can be improved by the maximizing decision maker by increasing r. Conversely, suppose that Ū(D|θ1) > Ū(D|θ2). Then the minimizing decision maker will place all of the prior probability on θ2, resulting in an objective Ū(D|θ2). This objective can be improved by the maximizing decision maker by decreasing r. This leads us to search for a value of r such that

    Ū(D|θ1) = Ū(D|θ2).

At such an r, the minimizing decision maker has no power to influence the objective, nor is there any way to improve it: changing r lowers either the left or the right side and thereby diminishes the outcome of the minimization. Thus, we have constructed the solution to the max-min decision problem. As we know from our earlier argument, once we find such an r, there is an ex post Bayesian justification for the decision rule. We call it ex post because we select the prior after determining the value of r for which the utilities conditioned on the parameter values are equated.
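A sketch of this equalization step for the running two-Gaussian illustration (the models, and the use of scipy's brentq root finder, are our assumptions, not the book's):

    import numpy as np
    from scipy.stats import norm
    from scipy.optimize import brentq

    # For psi(y|theta_1) = N(0,1) and psi(y|theta_2) = N(1,1), the log likelihood
    # ratio is log psi2 - log psi1 = y - 1/2, so phi(y) < r  <=>  y < log(r) + 1/2.
    psi1, psi2 = norm(loc=0.0), norm(loc=1.0)

    Ubar1 = lambda log_r: psi1.cdf(log_r + 0.5)          # P(choose theta_1 | theta_1)
    Ubar2 = lambda log_r: 1.0 - psi2.cdf(log_r + 0.5)    # P(choose theta_2 | theta_2)

    # Max-min solution: the threshold that equates the two conditional utilities.
    log_r_star = brentq(lambda lr: Ubar1(lr) - Ubar2(lr), -10.0, 10.0)
    r_star = np.exp(log_r_star)
    print("r* =", r_star, " equalized utility =", Ubar1(log_r_star))
    print("implied worst-case prior pi_1* = r*/(1+r*) =", r_star / (1.0 + r_star))

In this symmetric illustration the root is log r* = 0, so r* = 1 and the worst-case prior puts probability one half on each model.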
To amplify this example, consider the curve (ū1(r), ū2(r)) traced out by changing r over the positive real numbers, where

    ū1(r) = Ū(D|θ1)
    ū2(r) = Ū(D|θ2)

for alternative decision rules D indexed by the threshold log r. Compute:

    (d/dr) ū1(r) = υ(r)
    (d/dr) ū2(r) = −r υ(r).
These two equations define a curve that describes the upper right boundary of the feasible utilities conditioned on each of the possible values of θ. The slope of the curve at a given point indexed by r is the ratio of the two derivatives, which equals −r < 0. In other words, the curve is downward sloping. The corresponding second derivative is −1/υ(r) < 0, and hence the curve is concave. Consider a Bayesian with prior probabilities π1 and π2 = 1 − π1 who seeks to maximize expected utility. He or she solves the ex ante
optimization problem by setting

    r = π1/π2.

The max-min utility decision maker chooses r so that

    ū1(r) = ū2(r).        (1.3)
Call this value r∗ and the associated decision rule D∗ .
Consider now the solutions to the zero-sum games defined by 1.6.1 and
1.6.2. For Game 1.6.1, the decision-maker chooses D = D∗ so that the
expected utilities conditioned on θ are equalized for the two values of θ.
The solution to the inner minimization problem is not unique because any
choice of π1 and π2 = 1 − π1 will result in the same objective. For Game
1.6.2, there is a unique worst-case probability
    π1* = r*/(1 + r*)
and with this worst-case π ∗ the decision rule D∗ maximizes (unconditional)
expected utility. Notice that the probability pair (π1∗ , π2∗ ) is among the
solutions for (π1 , π2 ) that solve Game 1.6.1 in conjunction with D∗ .
Figure 1.1 illustrates graphically the max-min solution to model selection problems. The set of possible expected utility risks is given by the
following collection of ordered pairs:
    R = {(Ū(D|θ1), Ū(D|θ2)) : for some decision rule D}.
The upper right boundary of this set gives the expected utility pairs for
threshold decision rules that depend on the difference in the log likelihoods
for the two models. Formally, this boundary is given by:
    {(ū1(r), ū2(r)) : r > 0}.
The 45 degree line from the origin locates points that equate expected utility
risks for the two values of θ. Since the decision maker wishes to increase
the objective as much as possible, the max-min approach directs attention
to the intersection of the 45 degree line with the frontier. The negative of
the slope of this frontier gives the implied ratio of worst-case probabilities.
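The frontier is easy to trace numerically in the illustrative two-Gaussian setting used above (our assumption, not the book's computation):

    import numpy as np
    from scipy.stats import norm

    # Trace (ubar_1(r), ubar_2(r)) as log r varies; phi(y) < r  <=>  y < log(r) + 1/2.
    psi1, psi2 = norm(loc=0.0), norm(loc=1.0)
    log_r = np.linspace(-4.0, 4.0, 401)
    u1 = psi1.cdf(log_r + 0.5)           # probability of correctly selecting theta_1
    u2 = 1.0 - psi2.cdf(log_r + 0.5)     # probability of correctly selecting theta_2

    # The max-min point is where the frontier crosses the 45 degree line u1 = u2;
    # the frontier's slope there is -r, the negative ratio of worst-case probabilities.
    i = np.argmin(np.abs(u1 - u2))
    print("max-min utilities:", u1[i], u2[i], " slope = -r =", -np.exp(log_r[i]))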
Figure 1.1: This illustrates the max-min problem for two choices of the utility function. The inner concave curve is the decision outcome frontier for a model selection problem with a unit utility for making the correct model choice. The upper right concave curve is the risk frontier when the utility assigned to correctly identifying the θ1 model is increased. The intersection of the 45 degree line with these frontiers depicts risk outcomes with the max-min decision rules applied to the respective decision problems. The negatives of the slopes of the red tangent lines give the implied ratio of worst-case probabilities for the two parameter values.
Figure 1.1 also describes outcomes from another decision problem, one
for which
U (d, θ1 , y) = 1.5d
U (d, θ2 , y) = 1 − d.
Thus, there is now a larger reward for correctly identifying the θ1 model than the θ2 model. As a consequence, there is a smaller worst-case probability assigned
to θ1 .
1.7 Risk and Ambiguity
Motivated by a distinction made by Knight (1921), we can think of ψ(y|θ) as capturing ex ante risk. This notion of risk conditions on θ, and curvature in the function U captures risk aversion. We can think of Knight's concept of uncertainty as referring to doubts about how to assign a prior π over θ. We now describe one tractable way to capture this notion of uncertainty.
Before posing a decision problem, consider the following minimization problem. Suppose that Θ contains n states, and let πj, j = 1, 2, ..., n, be possible probabilities assigned to those states. Consider

    min_π ∑_{j=1}^n πj (log πj + log n)
where the πj ’s must add up to one. It can be verified that this objective
function is convex in the probabilities and that scaled versions of it provide
an example of the convex function C used previously. An implication of
the first-order conditions for this minimization is that
log πj∗ = log s
for some positive number s. Since the probabilities add up to one, s = 1/n,
making the minimized objective function equal to zero. It turns out that
the objective for this minimization problem is an example of entropy of π
relative to a uniform probability assignment for the θj ’s.
We now use relative entropy as a device to capture prior uncertainty. We
modify our decision problem to represent how the decision maker responds
to uncertainty about the prior. Formally, we solve
    min_π ∑_{j=1}^n πj [Ū(D|θj) + κ(log πj + log n)].
In this objective function κ serves as a penalty parameter. Large values
of κ more highly penalize alternative choices of the prior distribution that
are “far away” from an equal assignment of probabilities. The penalization
uses relative entropy as a measure of discrepancy between an arbitrary prior
and a benchmark prior taken to be uniform in this case. The first-order
conditions now imply that
    log πj* = −(1/κ) Ū(D|θj) + log s
for some positive number s. Thus
    πj* = exp(−(1/κ) Ū(D|θj)) / ∑_{i=1}^n exp(−(1/κ) Ū(D|θi))        (1.4)
which exponentially tilts the probabilities towards the expected utilities
(conditioned on θ) with the most adverse consequences. Notice that large
values of κ make the resulting probabilities close to being equalized across
the alternative values of θj . The resulting minimized objective function is
    −κ log [ (1/n) ∑_{j=1}^n exp(−(1/κ) Ū(D|θj)) ].        (1.5)
The decision-maker maximizes this ambiguity-adjusted objective by choice
of D.
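A compact sketch of the tilting formula (1.4) and the minimized objective (1.5) (our own illustration; the conditional expected utilities are made-up numbers):

    import numpy as np
    from scipy.special import logsumexp

    def worst_case_prior(Ubar, kappa):
        # Exponential tilting (1.4): pi_j* proportional to exp(-Ubar_j / kappa).
        w = -np.asarray(Ubar) / kappa
        return np.exp(w - logsumexp(w))   # normalize in log space for numerical stability

    def minimized_objective(Ubar, kappa):
        # Formula (1.5): -kappa * log( (1/n) * sum_j exp(-Ubar_j / kappa) ).
        Ubar = np.asarray(Ubar)
        return -kappa * (logsumexp(-Ubar / kappa) - np.log(Ubar.size))

    Ubar = [0.80, 0.60]   # hypothetical Ubar(D|theta_j) for two models
    for kappa in [0.1, 1.0, 100.0]:
        print(kappa, worst_case_prior(Ubar, kappa), minimized_objective(Ubar, kappa))
    # Small kappa tilts nearly all prior weight onto the lowest utility (0.60), and the
    # objective approaches min_j Ubar_j; large kappa pushes the prior toward uniform
    # and the objective toward the average of the Ubar_j.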
As emphasized by Hansen and Sargent (2007), the indirect utility function, i.e., the outcome of this robust choice of priors, gives a special case of the “smooth ambiguity” decision model of Klibanoff et al. (2005). In the Klibanoff et al. (2005) framework, an additional concave function is used to adjust for ambiguity, which in this example is exponential with parameter −1/κ, as is evident in formula (1.5).³ We can think of
ψ(y|θ)π(dθ)
as a compound lottery for y. First we draw θ and then we draw y conditioned on θ. A Bayesian decision maker commits to a specification of π
and integrates out θ leaving a reduced lottery for y. The Bayesian decision
maker only cares about the reduced lottery whereas in the approach taken
here the compound nature of the lottery matters.
The worst-case probabilities given by (1.4) depend on the decision rule D. By reversing the order of maximization and minimization, we can infer worst-case probabilities that satisfy the counterpart to Problem 1.6.2. Using these latter probabilities, a Bayesian decision maker would typically find the same decision rule as the one that emerges from solving the max-min problem in which the indirect utility function (1.5) is maximized by choice of D.

³This exponential formula plays a central role in the literature on risk-sensitive control.
Bibliography

Fan, K. 1952. Fixed Point and Minimax Theorems in Locally Convex Topological Linear Spaces. Proceedings of the National Academy of Sciences 38:121–126.

Fisher, J. 2006. The Dynamic Effects of Neutral and Investment-Specific Technology Shocks. Journal of Political Economy 114:413–451.

Gilboa, I. and D. Schmeidler. 1989. Maxmin Expected Utility with Non-Unique Prior. Journal of Mathematical Economics 18:141–153.

Hansen, L. P. and T. J. Sargent. 2001. Robust Control and Model Uncertainty. American Economic Review 91:60–66.

Hansen, L. P., J. C. Heaton, and N. Li. 2008. Consumption Strikes Back?: Measuring Long Run Risk. Journal of Political Economy.

Hansen, L. P. and T. J. Sargent. 2007. Recursive Robust Estimation and Control without Commitment. Journal of Economic Theory 136:1–27.

Klibanoff, P., M. Marinacci, and S. Mukerji. 2005. A Smooth Model of Decision Making under Ambiguity. Econometrica 73 (6):1849–1892.

Knight, F. H. 1921. Risk, Uncertainty, and Profit. Houghton Mifflin.

Maccheroni, F., M. Marinacci, and A. Rustichini. 2006. Ambiguity Aversion, Robustness, and the Variational Representation of Preferences. Econometrica 74 (6):1447–1498.