Decision Rules and Decision Markets

Abraham Othman and Tuomas Sandholm
Computer Science Department
Carnegie Mellon University
[email protected], [email protected]
ABSTRACT
We explore settings where a principal must make a decision about
which action to take to achieve a desired outcome. The principal elicits the probability of achieving the outcome by following
each action from a self-interested (but decision-agnostic) expert.
We prove results about the relation between the principal’s decision rule and the scoring rules that determine the expert’s payoffs.
For the most natural decision rule (where the principal takes the action with the highest success probability), we prove that no symmetric scoring rule, nor any of Winkler’s asymmetric scoring rules, has desirable incentive properties. We characterize the set of differentiable scoring rules with desirable incentive properties and construct one. We then consider decision markets, where the role of a
single expert is replaced by multiple agents that interact by trading
with an automated market maker. We prove a surprising impossibility for this setting: an agent can always benefit from exaggerating
the success probability of a suboptimal action. To mitigate this, we
construct automated market makers that minimize manipulability.
Finally, we consider two alternative decision market designs. We prove that the first, in which all outcomes live in the same probability universe, has poor incentive properties. The second, in which the experts trade in the probability of the outcome occurring unconditionally, exhibits a new kind of no-trade phenomenon.
Categories and Subject Descriptors
J.4 [Social and Behavioral Sciences]: Economics
General Terms
Economics, Theory
Keywords
Decision Rules, Decision Markets, Mechanism Design, Market Design, Prediction Markets, Elicitation
1. INTRODUCTION
Prediction markets have served as a reliable tool for estimating
the winners of political elections and sports games [Berg et al.,
2001]. However, due to legal restrictions severely limiting their
use, the latest wave of prediction markets has focused not on the
general, but on the specific. Instead of creating large-scale markets
on publicly verifiable events, these services bill themselves as accumulators of organizational knowledge on specific internal (e.g., corporate) events. For instance, the prediction market startup Inkling Markets suggests that companies use their service “to uncover and quantify risk in your organization”. The recently launched Crowdcast lists questions like “When will your product really ship?” and “How much will it really cost?” on their homepage.
We contend that the current corporate prediction markets do not
actually capture the problems most prominent for their clients. The
chief challenge facing businesses is not whether the decisions they
have made will prove successful, but rather what actions to take to
assure the best chance of success.
Our work approaches this issue directly. In our model, a principal has to choose an action from a set of possibilities (e.g., “Hire
additional sales staff", “Double R&D funding") in order to maximize the probability of achieving a desirable outcome (e.g., “Become the top-grossing company in our market space", “Sell more
than a million widgets", or “Achieve profitability by the end of the
year"). The principal elicits from an expert, for each action, the
probability of achieving the desired outcome. Based on those probabilities, the principal then (deterministically) chooses an action according to a decision rule. Upon success or failure in achieving the
desired outcome, the principal rewards the expert according to a
pre-determined scoring rule. Three functions define the space: the
decision rule, the payoff for success, and the payoff for failure.
We first prove general properties of the way these functions relate. We then provide an in-depth study of the most natural deterministic decision rule, the max decision rule, which selects the
action that has the highest reported success probability. For the
single-agent setting, we show that no symmetric scoring rule, nor
the asymmetric ones from the literature, give the agent the right incentives. We construct an asymmetric scoring rule that does, and
provide a characterization of a space of such rules.
Along the same lines as the original construction of market scoring rules [Hanson, 2003, 2007, Pennock and Sami, 2007], we continue by attempting to expand our scoring rule system into an automated market maker for our multiagent setting, to create decision
markets. Surprisingly, we show that every market of this kind suffers from a peculiar type of manipulability, where an agent benefits
from exaggerating the success probability of a suboptimal action.
We show that this kind of manipulability applies even under an infinite stream of self-interested agents visiting the market—even if
each agent’s beliefs about the probabilities are exactly accurate. We
design families of asymptotically-optimal pricing rules for decision
markets that minimize manipulability.
Finally, we study two alternate decision market designs. We
show the first one suffers from significant manipulability of a different kind, and for the second we show a new kind of no-trade
result. We conclude with a discussion of future research directions.
2. ELICITING FROM A SINGLE AGENT
We are ultimately interested in the study of decision markets—
prediction markets used for decision support. These markets are
multiagent tools for aggregating information in the form of prices
which correspond to probabilities. Decision markets therefore have
two components: a multiagent component, and a probability elicitation component. As a building block towards creating decision
markets, in this section we study the elicitation of truthful probabilities from a single agent.
If we relax the need to elicit probabilities, the problem can be
solved straightforwardly. For instance, it is trivial to incentivize an
agent to truthfully report which action has the highest probability
of achieving a desired outcome. This can be done by simply giving
the agent a lump sum payment if the outcome is reached, which
aligns the incentives of the agent and the principal. However, as we
discuss in this section, expanding the query to include probabilities
makes the elicitation problem much more complex.
We first describe the basic framework of the elicitation process.
Then we explore the space of all deterministic decision rules, and
finally characterize the space of scoring rules that should be used
with the max decision rule.
2.1 Setting

In our setting, a principal cares about achieving a beneficial outcome o, for example, unemployment under 8% by the beginning of next year. He needs to choose one of n actions, D = {d1, . . . , dn}, to try to achieve the outcome. He asks an expert to tell him the probability of achieving the outcome o under each of the alternative actions. Based on the agent’s report, the principal makes his decision. The agent is risk neutral and does not care which action is taken or whether the outcome is achieved, except to the extent that it affects her compensation.

Definition 1. A (deterministic) decision rule is a function D : (0, 1)^n → D that maps an expert report to an action.

To avoid degeneracy, we assume that the decision rule maps a positive measure of space to each possible action.

Definition 2. A scoring rule is a pair of functions f, g : (0, 1)^n → ℝ, such that an expert reporting p ∈ (0, 1)^n receives a payoff of f(p) if the outcome is achieved and g(p) if it is not.

Definition 3. A scoring rule is continuous if both f(·) and g(·) are continuous.

Definition 4. A scoring rule/decision rule pair is strictly proper if it strictly incentivizes truthful reporting of the expert’s belief. That is, let the expert have belief p, and let u(p′) represent the expert’s expected utility from a report of p′. Then:

u(p) > u(p′)

for all p′ ≠ p.

2.2 Results

We proceed to prove several restrictions on how scoring rules and deterministic decision rules interact.

Theorem 1. There does not exist any strictly proper scoring rule/decision rule pair.

Proof. Suppose there exists a deterministic decision rule D and a strictly proper scoring rule for it. Now consider two distinct belief vectors p1 and p2 that map to the same chosen action and report the same probability p for that action. Consider first an expert with true belief p1. Since the scoring rule is strictly proper, we have

pf(p1) + (1 − p)g(p1) > pf(p2) + (1 − p)g(p2)

Now consider an expert with true belief p2. Since the scoring rule is strictly proper, we have

pf(p2) + (1 − p)g(p2) > pf(p1) + (1 − p)g(p1)

These two inequalities cannot hold simultaneously, generating a contradiction.

We now relax the requirement of strict incentives to report the truthful estimates for all actions. We will only require that the incentive to report the truth is strict for the action chosen, and that there is no strict incentive to lie about the other probabilities.

Definition 5. Let the agent have true belief p = (p1, . . . , pn). Let di be the action taken by the deterministic decision rule D if the agent reports truthfully. Let u(p) represent the agent’s expected utility from reporting p given the decision rule/scoring rule pair. If

u(p) ≥ u(p′)

for all p′ ≠ p, and

u(p) > u(p̂)

where p̂ represents any report for which the ith component does not equal pi (that is, the agent does not report P(o|di) truthfully), then that scoring rule/decision rule pair is quasi-strictly proper.

Theorem 2. Let x be a belief vector on the boundary of the decision rule D between reports that map to action d1 and those that map to d2. If there exists a quasi-strictly proper scoring rule for D, then it must induce, for an agent with private belief x, the same expected utility whether d1 or d2 is taken.

Proof. Without loss of generality, assume D(x) = d1. Since x is on the boundary of D, we have that for all ε > 0 there exists a vector y such that D(y) = d2 and

|x − y| < ε

Denote the expected utility of an expert with private belief p reporting q by u_p(q). Now suppose for contradiction that for all ε,

u_x(x) > u_x(y),

i.e., the agent has different expected utilities for action d1 or d2 being taken. Since the rule is quasi-strictly proper, for an agent with true belief y we must have

u_y(y) ≥ u_y(x)

We will show that there exists an ε that induces a y such that

u_y(x) > u_y(y)

Because the rule is quasi-strictly proper, the payoff from any report must be finite. This implies that, because |x − y| < ε, there exists some constant M such that for all z

|u_x(z) − u_y(z)| < Mε

So we have some δ > 0 such that u_x(x) − u_x(y) ≥ δ for all ε > 0, and hence

u_y(x) > u_x(x) − Mε ≥ u_x(y) + δ − Mε > u_y(y) + δ − 2Mε

Now choose ε small enough so that

δ > 2Mε

At that ε we have

u_y(x) > u_y(y)

This contradicts the rule being quasi-strictly proper.

This result holds for any payoff rule, including those that are discontinuous. If we restrict the payoff functions to be continuous, we can prove an additional restriction on the types of decision rules that admit quasi-strictly proper scoring rules. Specifically, the decision rule can only switch between actions when their reported success probabilities are equal.

Theorem 3. Let decision rule D have a boundary between some two actions d1 and d2 where P(o|d1) ≠ P(o|d2). Then there exists no continuous quasi-strictly proper scoring rule for D.

Proof. For shorthand, denote the boundary point (P(o|d1), P(o|d2)) by (p, q). We know from the proof of Theorem 2 that we must have, for an agent with belief (p, q),

pf(p, q) + (1 − p)g(p, q) = qf(p, q) + (1 − q)g(p, q)

Rearranging,

f(p, q)(p − q) − g(p, q)(p − q) = 0

If p ≠ q, this holds only if

f(p, q) = g(p, q)

This means the expert receives the same payoff regardless of whether the outcome is achieved or not. Obviously, no rule of this kind can be quasi-strictly proper. This yields a contradiction.

We will now prove another property of quasi-strictly proper scoring rules. It will help us further narrow the space of scoring rules to consider.

Definition 6. In an independence of irrelevant alternatives (IIA) scoring rule, the expert can arbitrarily alter her report of the probabilities of the actions that are not taken without altering her expected utility. Formally, let p1 and p2 represent two reports such that D(p1) = D(p2) = di, and the reported probability of the chosen action di is equal (i.e., p1 and p2 agree on component i). Letting u(p) represent the agent’s expected utility from reporting p, u(p1) = u(p2).

Theorem 4. For any decision rule, any non-IIA scoring rule incentivizes an agent with some beliefs to misreport her beliefs.

Proof. Consider a non-IIA scoring rule (f, g). There exist two vectors p1 and p2 where the reported probability of the chosen action is equal but the expert’s expected payoff from reporting p1 is not equal to that from reporting p2. Then either an expert with belief p1 is incentivized to report p2, or an expert with belief p2 is incentivized to report p1.

This result allows us, without loss of generality, to represent scoring rules not as general functions of the belief vector, but as scalar functions only of the probability of success of the action taken (i.e., f(p_D(p)) and g(p_D(p)) instead of f(p) and g(p)).

2.3 The max decision rule

We will now study the most natural decision rule that satisfies the condition of Theorem 2 (that the rule switches actions at equal probabilities): the max decision rule.

Definition 7. The max decision rule selects the action that has the highest reported probability of achieving the outcome: arg max_i p_i. Ties are broken in some fixed way.

The rest of this section will discuss the creation of quasi-strictly proper scoring rules for the max decision rule. First, we show that symmetric scoring rules are not quasi-strictly proper for the max decision rule. Then we construct an asymmetric scoring rule that is quasi-strictly proper.

2.3.1 Inadequacy of symmetric scoring rules

As Winkler [1994] discusses, most of the literature on scoring rules has studied symmetric scoring rules only. These include the log rule

f(p) = log p,  g(p) = log(1 − p)

and the quadratic rule

f(p) = 1 − (1 − p)²,  g(p) = 1 − p²

A symmetric scoring rule is defined as follows.

Definition 8. A symmetric scoring rule sets

f(1 − p) = g(p)
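Both of the rules above satisfy this condition, which is easy to confirm numerically. The following Python sketch is our own illustration, not part of the paper; it simply checks the identity f(1 − p) = g(p) for the log and quadratic rules on a grid.

```python
import math

# Sketch: confirm that the log rule and the quadratic rule satisfy
# Definition 8's symmetry condition f(1 - p) = g(p).
rules = {
    "log":       (lambda p: math.log(p),      lambda p: math.log(1 - p)),
    "quadratic": (lambda p: 1 - (1 - p) ** 2, lambda p: 1 - p ** 2),
}

for name, (f, g) in rules.items():
    for i in range(1, 100):
        p = i / 100
        assert abs(f(1 - p) - g(p)) < 1e-12, name
print("both rules are symmetric")
```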
Symmetric scoring rules are intuitively desirable because they offer a pleasing equivalence between the positive and negative sides of an event occurring. Consider asking an expert whether it will rain tomorrow or not. Using a symmetric rule, an expert who reports p to the question “Will it rain tomorrow?” achieves the exact same payoff, regardless of the realized outcome, as an expert who reports 1 − p to the question “Will it not rain tomorrow?”. This fits well with our intuitive understanding of how a scoring rule works—in each case the expert has supplied identical information to the principal, and is therefore compensated identically.

However, as the following example shows, symmetric scoring rules are not quasi-strictly proper for the max decision rule, because they can induce an expert to manipulate her report. The following example shows this for the most famous scoring rule.

Example 1. The principal can take one of two actions, d1 or d2, in an effort to achieve beneficial outcome o. The expert agent has true belief:

P(o|d1) = .5 and P(o|d2) = .25

The principal uses the max decision rule, and the log scoring rule (with base-10 logarithms), which is strictly proper for traditional elicitation:

f(x) = 1 + log(x)
g(x) = 1 + log(1 − x)

If the agent reports truthfully, action d1 is taken. The agent receives expected utility

.5f(.5) + .5g(.5) = 1 + log(.5) ≈ .7

If the agent instead reports P(o|d1) < .25 = P(o|d2), then action d2 is taken (because it has the higher reported probability). The agent receives expected utility

.25f(.25) + .75g(.25) = 1 + (.25 log(.25) + .75 log(.75)) ≈ .756

So the agent achieves higher expected utility by manipulating her response and causing the principal to take the inferior action.
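The arithmetic of the example is easy to verify; here is a quick Python check (a sketch of ours, using the base-10 logarithms that the example’s numbers imply).

```python
import math

# Verify Example 1: under the max decision rule with the (base-10) log
# scoring rule f(x) = 1 + log(x), g(x) = 1 + log(1 - x), lying pays.
f = lambda x: 1 + math.log10(x)      # payoff if the outcome is achieved
g = lambda x: 1 + math.log10(1 - x)  # payoff if it is not

# True beliefs: P(o|d1) = .5, P(o|d2) = .25.
truthful    = 0.5  * f(0.5)  + 0.5  * g(0.5)    # d1 chosen, succeeds w.p. .5
manipulated = 0.25 * f(0.25) + 0.75 * g(0.25)   # d2 chosen, succeeds w.p. .25

print(round(truthful, 3), round(manipulated, 3))  # 0.699 < 0.756
```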
We will now expand the manipulation shown in this example into
a proof that covers every symmetric scoring rule, not just the log
rule.
Let us denote the agent’s truthful expected utility by T(p) ≡
pf (p) + (1 − p)g(p). This is the agent’s expected utility when her
maximum probability among her believed probabilities is p and she
reports (all the probabilities) truthfully.
Theorem 5. If a scoring rule is quasi-strictly proper for the max decision rule, then the truthful expected utility function T(p) is strictly increasing in p.

Proof. Suppose the function is not strictly increasing, so there exist pl < pu such that

T(pl) ≥ T(pu)

Then an expert with true belief (pl, pu) is not disincentivized from instead reporting (pl, pl − ε).
(It is interesting that even though the scoring rule motivates an
agent to report truthfully, the agent may wish she had a higher value
to report.)
Using the theorem above, we can now prove a general impossibility about symmetric scoring rules.
Theorem 6. No symmetric scoring rule is quasi-strictly proper for the max decision rule.

Proof. We will show that no symmetric scoring rule admits a strictly increasing truthful expected utility function, and thus, by Theorem 5, cannot be quasi-strictly proper. Suppose for contradiction that such a rule exists, and consider p > .5. Since the truthful expected utility function is strictly increasing, it must be the case that

T(p) > T(1 − p)
pf(p) + (1 − p)g(p) > (1 − p)f(1 − p) + pg(1 − p)
pf(p) + (1 − p)g(p) > (1 − p)g(p) + pf(p)
0 > 0

where the intermediate step substitutes the definition of a symmetric scoring rule (f(1 − p) = g(p), and hence g(1 − p) = f(p)).
2.3.2 Quasi-strictly proper scoring rules for the max decision rule

Since we cannot use a symmetric scoring rule for the max decision rule, we turn our attention to asymmetric rules. Unfortunately, the best-known family of asymmetric scoring rules, those devised by Winkler [1994], are not quasi-strictly proper and thus cannot be used in our setting. As Figure 2 of Winkler [1994] illustrates, the truthful expected utility function T(p) for any of Winkler’s rules reaches a minimum in (0, 1), and is thus not strictly increasing. By Theorem 5, none of Winkler’s asymmetric rules can be quasi-strictly proper.
We proceed to completely characterize the set of differentiable quasi-strictly proper scoring rules, and show that the set is nonempty by constructing a quasi-strictly proper scoring rule for the max decision rule. The following result will guide our characterization.

Theorem 7. If an IIA, differentiable scoring rule has a strictly increasing truthful expected utility function T(p) and it strictly incentivizes an agent to truthfully report the probability of the maximum element of her belief, then it is quasi-strictly proper for the max decision rule.

Proof. Suppose for the purpose of contradiction that T(p) is strictly increasing and the scoring rule strictly incentivizes the agent to report truthfully the probability of her maximum belief element, but the scoring rule is not quasi-strictly proper.

Because the scoring rule is not quasi-strictly proper, yet it strictly incentivizes the report of the probability of the maximum belief element, it must be the case that the agent is motivated to misreport a non-maximum element of her belief. Since the scoring rule is IIA and the principal is using the max decision rule, it must also be the case that the agent would manipulate her report of a non-maximum belief element to increase it beyond the probability of her maximum belief element.

Let the optimal action have probability p, the non-optimal action have true probability pl < p, and the manipulated report be pu > p. Because the agent wants to manipulate her report, we must have

pf(p) + (1 − p)g(p) < pl f(pu) + (1 − pl)g(pu)

Now consider instead if the agent had reported pu as her maximum probability. She would have expected utility

pf(pu) + (1 − p)g(pu) < pf(p) + (1 − p)g(p)

Combining the two inequalities above yields

p(f(pu) − g(pu)) < pl(f(pu) − g(pu))

And because p > pl, this implies f(pu) < g(pu).

We will show that the supposition implies that f(p) > g(p) for all p ∈ (0, 1), yielding a contradiction.

Taking the derivative of T(p) with respect to p for a strictly increasing function yields

T′(p) = f(p) + pf′(p) − g(p) + (1 − p)g′(p) > 0

Now consider that the scoring rule strictly incentivizes truthful report of the selected action. That is, for all p ∈ (0, 1),

arg max_{x∈(0,1)} [pf(x) + (1 − p)g(x)] = p

The first-order condition of this equation is

pf′(p) + (1 − p)g′(p) = 0

Incorporating this into the expression for T′(p), we see that it simplifies to

f(p) > g(p)

In particular, take p = pu to get f(pu) > g(pu), a contradiction.

The above proof also tells us how to characterize the set of quasi-strictly proper scoring rules. In particular, when combined with the encouragement of strictly truthful reporting of the maximum probability, the set of differentiable scoring rules with strictly increasing truthful expected utility functions are those for which

f(p) > g(p)

Furthermore, the set of differentiable rules which strictly encourage truthful reporting of the maximum probability are those for which

f′(p)/g′(p) = (p − 1)/p

The differentiable scoring rules that satisfy these two conditions are exactly those that are quasi-strictly proper for the max decision rule. One scoring rule that satisfies both conditions is

f(p) = −(p − 1)² + 1
g(p) = −p²
Figure 1: An illustration of an asymmetric quasi-strictly proper scoring rule for the max decision rule. The upper line is the expert’s payoff if the objective is achieved, and the lower line is his payoff if it is not achieved.

By inspection, it is easy to see that this scoring rule meets the first-order condition of encouraging truthful reporting. Moreover, as is evident from Figure 1, f(p) > g(p) for p ∈ (0, 1), implying that the truthful expected utility function, T(p), is strictly increasing, and therefore the scoring rule is quasi-strictly proper for the max decision rule.
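A short numeric check (our own sketch, not part of the paper) confirms both characterization conditions for this rule: the first-order condition f′(p)/g′(p) = (p − 1)/p holds exactly, and T(p) = pf(p) + (1 − p)g(p) works out to p², which is strictly increasing on (0, 1).

```python
# Sketch: check the constructed rule f(p) = -(p-1)^2 + 1, g(p) = -p^2
# against the two characterization conditions for quasi-strict properness.
f  = lambda p: -(p - 1) ** 2 + 1
g  = lambda p: -p ** 2
df = lambda p: -2 * (p - 1)   # f'(p)
dg = lambda p: -2 * p         # g'(p)
T  = lambda p: p * f(p) + (1 - p) * g(p)   # truthful expected utility

for i in range(1, 100):
    p = i / 100
    assert abs(p * df(p) + (1 - p) * dg(p)) < 1e-12  # first-order condition
    assert f(p) > g(p)                               # forces T to increase
    assert abs(T(p) - p ** 2) < 1e-12                # here T(p) is exactly p^2
```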
3. DECISION MARKETS

In this section, we discuss replacing our single expert with a pool of experts who produce a probability estimate through the mediation of an automated market maker. A prediction market is used to elicit estimates of the probability of achieving a desirable outcome through each possible action. Rather than acting on the report of an expert agent, the principal sets a deadline, examines the market prices at that deadline, uses them as probability estimates P(o|di), and takes the action di as if P(o|di) had been the probability report from the expert, as in the previous section.

The equating of prices and probabilities is standard and sound for risk-neutral agents because the contract in the market pays out a dollar if outcome o is achieved. Therefore, the expected payoff to an agent for the contract corresponding to the action taken is P(o|di), which is also the price that the agent would be willing to pay for it.

After the principal makes his decision d, the market continues, in that all of the trades placed on P(o|d) remain in effect, but the contract is renamed P(o). All contracts in the other markets are canceled, because P(o|di) has no meaning if d ≠ di has been selected. We might be concerned that expanding out to multiple different markets, one for each P(o|di), might leave agents hopelessly budget constrained. However, a trader who moves some amount of funds into the market can trade as if he had moved that money into the market for every contract, because only one of the markets will persist into the future.

3.1 Setting and prior work

The idea to use markets as tools in decision analysis stems from the work of Berg and Rietz [2003], who used conditional markets relating to the choice of vice-presidential nominees in the 1996 Iowa Electronic Markets. Wolfers and Zitzewitz [2006] also explore conditional prediction markets in an election context with an eye towards teasing out causation from correlations. The difference between our work and these prior efforts is that those markets served as cameras, to capture the state of the outside world, rather than as engines, which shape the decisions made in the world. The markets we describe here tell the principal how to make his decision.

The conversion of scoring rules into automated market makers has been the topic of a flurry of recent research, stemming largely from the work of Hanson [2003, 2007]. A good overview of the automated market maker concept is given by Pennock and Sami [2007]. We provide a brief explanation of how these market makers work before discussing the impacts of their use.

Definition 9. A pricing rule is a continuously differentiable, monotonically increasing function q(p) : (0, 1) → ℝ. If

lim_{p→0} q(p) = −∞

then that pricing rule is unbounded from below.

Since pricing rules are monotonically increasing they are invertible, and we will sometimes refer to them as a function p(q) which maps quantities into prices.

Agents interact in the marketplace by spending their money to obtain shares which will have value at expiry. Say an agent seeks to move the market from aggregate demand q to aggregate demand q′ > q (he is betting for the event in question to occur). The agent pays

∫_q^{q′} p(x) dx

and receives a payoff of q′ − q dollars if the event in question occurs.
Now consider an agent wishing to bet against the event, moving the aggregate demand from q to q″ < q. That agent pays

∫_{q″}^{q} (1 − p(x)) dx

and receives a payoff of q − q″ if the event in question does not occur. Note that an agent who moves the market from price p to price p′ and then back to price p has no expected benefit and no net cost—his costs are precisely equal to the payoff he is guaranteed to receive regardless of whether the event occurs or not.
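To make the mechanics concrete, the following sketch (ours; the paper gives no code) prices trades under the linear rule q(p) = p, so that p(q) = q, and exhibits the round-trip property just described.

```python
# Sketch of trading against a pricing rule, using the linear rule
# q(p) = p (so p(q) = q) that Example 2 below also employs.
def buy_cost(q_from, q_to):
    """Pay the integral of p(x) dx from q_from up to q_to; the position
    then pays q_to - q_from if the outcome occurs."""
    return (q_to ** 2 - q_from ** 2) / 2

def sell_cost(q_to, q_from):
    """Pay the integral of (1 - p(x)) dx from q_to up to q_from; the
    position then pays q_from - q_to if the outcome does not occur."""
    return (q_from - q_to) - (q_from ** 2 - q_to ** 2) / 2

# Round trip: push aggregate demand (= price, here) from .2 to .6 and back.
total_paid = buy_cost(0.2, 0.6) + sell_cost(0.2, 0.6)
print(total_paid)  # 0.4: exactly the 0.4 payoff guaranteed either way
```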
Because every interaction with a pricing rule corresponds exactly to the payoffs of some strictly proper scoring rule, interactions
with the automated market maker are trivially truthful for a myopic
agent who holds a private belief (which she does not change based
on others’ past and future bets).
The impossibility results of this section rely on examining the
behavior of agents at the end of the trading period, where there are
no strategic impacts from the arrival of other agents to the market.
While there is no guarantee that a market mechanism with desirable
incentive properties for its final participant will work well in practice, it is almost certainly true that any market mechanism with poor
incentive properties at its close will exhibit pathologies in practice.
(Furthermore, good incentive properties for the last agent can imply good incentive properties for every agent—for instance, Chen et al. [2007] demonstrated a plausible information aggregation setting in which agents are not incentivized to act strategically, a result built on the fact that the last agent to act does not have these incentives.)
We will focus on decision markets operating under the max decision rule because it is the most natural decision rule to use. However, the model we describe in this section could be used to study
any decision rule.
3.2 A general impossibility result for the max decision rule

The direct approach to creating an automated market maker with good incentive properties would be to convert the quasi-strictly proper scoring rule for the max decision rule that we constructed into a pricing rule. However, as we show next, not only that rule, but every pricing rule, suffers from manipulability in which agents are motivated to misprice actions and force the principal to take a non-optimal action.
Example 2. Consider an agent that arrives at the end of the trading period to a market with two possible actions, d1 and d2. That agent knows that the probability of achieving the outcome by following each action is:

P(o|d1) = .8 and P(o|d2) = .75

(Furthermore, say that these subjective probabilities are actually the correct objective probabilities—that is, our counterexample does not stem from the agent having inaccurate beliefs.) For the purposes of this example, the principal is employing the linear pricing rule q(p) = p. Suppose that the agent arrives at the market and sees the prices (.8, .2). By bringing the market prices to her true belief, (.8, .75), the agent has zero expected utility. This is because the principal will pursue d1, which the agent holds no stake in. If the agent instead raises the price to (.8, .8 + ε), then she has positive expected utility: even though moving the market price beyond .75 incurs negative expected utility, this loss is more than countered by the expected gains from the agent’s purchases at low prices (from .2 to .75). Figure 2 provides a graphical intuition for the agent’s expected utility. She gains positive utility .5(.55²) from raising the price to .75, but loses .5((.05 + ε)²) from raising the price from .75 to .8 + ε. Her net utility is ≈ .15 > 0.
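The example’s arithmetic can be checked directly. In this sketch of ours, eu_of_move gives the expected utility of moving a price from a to b under the linear rule when the true probability is p_true.

```python
# Check Example 2 under the linear pricing rule q(p) = p: moving a price
# from a to b has expected utility  integral_a^b (p_true - x) dx.
def eu_of_move(a, b, p_true):
    return p_true * (b - a) - (b ** 2 - a ** 2) / 2

eps  = 1e-6
gain = eu_of_move(0.20, 0.75, 0.75)        #  .5 * .55^2         =  .15125
loss = eu_of_move(0.75, 0.80 + eps, 0.75)  # -.5 * (.05 + eps)^2 ~ -.00125
print(gain + loss)  # ~ .15 > 0: the manipulation is profitable
```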
Figure 2: A graphical depiction of a manipulating agent raising the probability of achieving the outcome if an action is taken from .2 to .8 given a true belief of .75. The agent makes positive expected utility for P(o|di) < .75 and negative expected utility for .75 < P(o|di) ≤ .8. The areas expressed in the figure correspond to the amount of expected utility for the manipulator.

Theorem 8. All pricing rules q(p) are susceptible to a form of manipulation in which an agent raises the price of a non-optimal action higher than that of the best action.

Proof. Let P(o|d1) = p∗ and P(o|d2) = p∗ + m/2, where

z = ∫_{q(ε)}^{q(p∗)} (p∗ − p(x)) dx

and m solves

∫_{p∗}^{p∗+m/2} (q(p∗ + m/2) − q(x)) dx ≤ z/2

Then an agent facing initial prices (ε, p∗ + m/2) will have greater positive utility from moving prices to cause d1 to be chosen rather than d2: honestly correcting d1’s price to p∗ yields her nothing once d2 is chosen (d1’s contracts are canceled), while pushing d1’s price just past p∗ + m/2 earns her at least z − z/2 > 0.

This type of manipulation can occur even in the face of an infinite stream of agents with accurate beliefs.

Example 3. Imagine a second omniscient agent arriving in the market in our example after the participation of the first agent. That agent would not be motivated to change the prices he sees, and would leave the market at (.8, .8 + ε). By repeating this argument, one can see that even an infinite stream of agents would not correct the market prices, leading the principal to select an action less likely to generate the beneficial outcome.

An alternative manipulation for the agent in the first example would be to bring the price of d2 to its true value and then artificially lower the price of d1 below it. However, this manipulation would be rectified by the next agent, and is not resilient in the way artificially raising d2 beyond d1 is.

3.3 Families of optimal pricing rules

Even though Theorem 8 shows that all markets of this type can be manipulated, we note that the possibility of manipulation relies on an agent balancing the positive expected utility from bringing a price to its true value against the negative expected utility of raising it above its true value. In this section, we derive two families of pricing rules that minimize a risk-neutral principal’s loss from manipulation, which is equivalent to minimizing the amount by which an agent would be incentivized to distort the price.

Let the true maximum probability of achieving the principal’s objective be p, and let the last mover in the market be an agent with correct beliefs. We do not assume anything about the behavior of the other market participants: the market prices faced by the agent we are studying are arbitrary. Let U be the principal’s utility of achieving the beneficial outcome, and let 0 be the principal’s utility of not achieving it, so a risk-neutral principal has expected utility pU from implementing the best action.

Definition 10. A pricing rule has an approximation ratio k > 1 if the expected utility of the principal after the participation of the manipulating agent is never less than pU/k.
Definition 11. A pricing rule has an additive approximation k > 0 if the expected utility of the principal after the participation of the manipulating agent is never less than U(p − k).

Theorem 9. If a pricing rule is unbounded from below, then it has an unbounded worst-case approximation ratio.
Proof. A pricing rule that is unbounded from below allows a manipulating agent to derive an unbounded amount of positive expected utility by raising the price of an action whose current price is near 0. Formally, let there be two possible actions, d1 and d2. Let the true probabilities (and our agent’s beliefs) be P(o|d1) = 1/2 and P(o|d2) = 1/(2k). Let the market prices before our agent participates be 1/2 and ε. Because the pricing rule is unbounded from below,

lim_{q′→−∞} ∫_{q′}^{q(1/(2k))} (1/(2k) − p(x)) dx = ∞

Since the pricing rule q(p) has pre-image (0, 1),

∫_{q(1/(2k))}^{q(1/2)} (p(x) − 1/(2k)) dx < ∞

Therefore, for small enough ε, our agent has positive expected utility from pushing the price of d2 above 1/2, causing action d2 to be chosen, even though d2 is k times worse than d1 for the principal. This holds for any k, so the pricing rule does not have a finite approximation ratio.

This result means that the most commonly used pricing rule in practice, the logarithmic market scoring rule, has bad worst-case incentive properties, because it is unbounded from below [Hanson, 2007, Pennock and Sami, 2007]. We now proceed to derive families of optimal pricing rules that are bounded from below.
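Before doing so, the following sketch (ours) illustrates the effect behind Theorem 9 for the logarithmic market scoring rule, whose pricing rule is unbounded from below: the expected gain from correcting an action priced at ε grows without bound as ε → 0.

```python
import math

# LMSR with liquidity b: instantaneous price p(q) = 1/(1 + exp(-q/b)),
# cost function C(q) = b * log(1 + exp(q/b)), and q(p) = b * log(p/(1-p)).
b = 1.0
q_of_p = lambda p: b * math.log(p / (1 - p))
C      = lambda q: b * math.log(1 + math.exp(q / b))

def expected_gain(eps, t):
    """Expected profit of moving the price from eps up to its true value t."""
    q0, q1 = q_of_p(eps), q_of_p(t)
    return t * (q1 - q0) - (C(q1) - C(q0))

for eps in (1e-2, 1e-4, 1e-8, 1e-16):
    print(expected_gain(eps, 0.25))  # grows roughly like 0.25 * log(1/eps)
```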
3.3.1 Multiplicatively optimal

A pricing rule q(p) with approximation ratio k > 1 satisfies, for all p,

∫_0^p q(x) dx ≤ ∫_p^{kp} (q(kp) − q(x)) dx

Since the two sides are trivially equal at p = 0, the following relationship between their derivatives with respect to p must hold:

q(p) ≤ p(k − 1)kq′(kp) + q(p) − q(kp)
q(kp) ≤ p(k − 1)kq′(kp)

Substituting q(p) = p^n yields

k = 1 + 1/n,

which is asymptotically optimal for large n.
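As a numeric sanity check (a sketch of ours, evaluating both integrals in closed form), q(p) = p^n meets the condition above, with equality, at k = 1 + 1/n:

```python
# Check that q(p) = p^n satisfies, with k = 1 + 1/n,
#   integral_0^p q(x) dx <= integral_p^{kp} (q(kp) - q(x)) dx,
# using the closed forms of both sides.
def holds(n):
    k = 1 + 1 / n
    for i in range(1, 100):
        p = (i / 100) / k   # keep kp inside (0, 1)
        lhs = p ** (n + 1) / (n + 1)
        rhs = (k * p) ** n * (k - 1) * p \
              - ((k * p) ** (n + 1) - p ** (n + 1)) / (n + 1)
        if lhs > rhs + 1e-12:
            return False
    return True

assert all(holds(n) for n in (1, 2, 5, 20))  # k = 2, 1.5, 1.2, 1.05
```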
3.3.2 Additively optimal

A pricing rule q(p) with additive error k satisfies, for all p,

∫_0^p q(x) dx ≤ ∫_p^{p+k} (q(p + k) − q(x)) dx

Since the left side is zero and the right side is nonnegative at p = 0, it is sufficient to consider the derivatives of both sides with respect to p. By the rule for taking derivatives under the integral sign, we can differentiate both sides with respect to p to get

q(p) ≤ q(p) + kq′(p + k) − q(p + k)

which simplifies to

q(p + k) ≤ kq′(p + k)

Taking equality and solving the resulting differential equation shows that the pricing rule

q(p) = e^{p/k}

has additive approximation k, which is optimal as k approaches 0.
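Again the claim can be verified numerically (our own sketch, with Q the antiderivative of q):

```python
import math

# Check that q(p) = exp(p/k) satisfies the additive condition
#   integral_0^p q(x) dx <= integral_p^{p+k} (q(p+k) - q(x)) dx.
k = 0.05
q = lambda x: math.exp(x / k)
Q = lambda x: k * math.exp(x / k)  # antiderivative of q

for i in range(1, 95):
    p = i / 100
    lhs = Q(p) - Q(0)                    # left side in closed form
    rhs = q(p + k) * k - (Q(p + k) - Q(p))  # right side in closed form
    assert lhs <= rhs + 1e-9
```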
3.4 Alternative market designs

We have shown that our market format, regardless of the specific pricing rule used, is susceptible to producing a final price prediction that induces the principal to take a non-optimal course of action. Put another way, it is possible for the market to get stuck at a set of prices that do not produce the optimal action. Though in the last section we developed families of asymptotically worst-case optimal pricing rules for our market format in an effort to mitigate this shortcoming, one might wonder whether some alternative format might have better properties. In this section, we explore two alternative market designs, which we dub joint elicitation and direct elicitation.

3.4.1 Joint elicitation: P(o ∩ di)

One might think the impossibility results of the previous section were a function of the different contracts existing in separate probability universes, and that by collapsing the state space into a single universe, one might avoid manipulation.

In our joint elicitation market, there exist contracts for actions joined with the probability of the outcome being achieved (i.e., P(o ∩ di)). There is also a contract for estimating the chosen action, P(di). These prices enable the principal to determine the probabilities pertinent to his decision by using Bayes’ rule:

P(o|di) = P(o ∩ di) / P(di)
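For concreteness, here is how a principal would read such a market (a sketch with hypothetical prices of our own choosing):

```python
# Read a joint elicitation market: divide the price of each (o and d_i)
# contract by the price of the corresponding d_i contract (Bayes' rule),
# then apply the max decision rule to the conditional estimates.
joint_price  = {"d1": 0.30, "d2": 0.10}  # hypothetical prices for P(o and d_i)
action_price = {"d1": 0.50, "d2": 0.20}  # hypothetical prices for P(d_i)

conditional = {d: joint_price[d] / action_price[d] for d in joint_price}
decision = max(conditional, key=conditional.get)
print(conditional, decision)  # {'d1': 0.6, 'd2': 0.5} -> d1
```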
The fundamental difference between a joint elicitation market and a conditional elicitation market is that all the contracts in the joint market exist within the same probability universe. That is, in a conditional elicitation market attempting to measure P(o|di), the agent receives a null outcome (i.e., does not have to pay and does not get paid anything) if di is not chosen by the principal. On the other hand, in a joint market attempting to measure P(o ∩ di), the agent is guaranteed to receive a payoff regardless of the action ultimately taken.
Unfortunately, joint elicitation has poor incentives for the agent who is last to act in the market. She can bring the price of the currently winning action di to zero (that is, both P(di) = 0 and P(o ∩ di) = 0) and bring another, arbitrary action (say, dj) up to a non-zero probability. The agent benefits from this manipulation since neither di nor o ∩ di then occurs.
3.4.2 Direct elicitation: P(o)
We now consider another market design, which we coin direct
elicitation, in which the principal makes a decision based on the
price of the contract that pays out if the outcome actually occurs,
P (o).
Formally, the principal has a decision rule D : (0, 1) → D, which maps the price of the outcome to an action. This rule can be published along with the market or not; all that matters for the impossibility results of this section is that the agent has an actionable belief about the principal’s decision rule.
The key difference between a market of this form and a traditional prediction market is whether the principal is impacted by
market prices, or whether the market serves only as a reflection of
events, that is, whether the market is an engine or a camera. We
contend that many of the existing corporate prediction markets in
practice are in fact engines while they were designed as cameras.
For example, if a principal sees a high probability that a product
will be delayed months, perhaps he will devote additional expenditures to seeing it arrive on time, or maybe he will cancel the product
entirely. The important thing is that the principal adjusts his behavior in response to market price. This is almost certainly the case
for internal corporate markets—or else why pay a startup to run a
prediction market for you in the first place?
Now, imagine that the principal can exert high effort h or low
effort l toward achieving his desirable outcome o. The agent knows that the true probabilities are P(o|h) = 1 and P(o|l) = 0. Let the market price be .5 before the agent acts (and suppose that no other agents will act after her, or that every agent acting afterwards is rational). Imagine that the principal exerts effort
as follows. If he observes market price P (o) > .5, he exerts low
effort (perhaps because such effort is costly), and ultimately fails to
achieve his goal. Similarly, if he observes price P (o) < .5, he will
exert high effort and achieve his goal. Now, consider how a rational
agent will interact with such a market. If she buys and drives the
price above .5, she will regret buying. If she sells and drives the
price below .5, she will regret selling. Thus, she will not interact
with the market.
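The self-referential logic is mechanical enough to simulate (a sketch of ours): whichever direction the agent moves the price, the principal’s effort rule flips the contract’s value against her.

```python
# Sketch of the no-trade logic: the principal exerts high effort iff the
# observed price is below .5, so the contract is worth 1 below .5 and 0 above.
def contract_value(price):
    return 1.0 if price < 0.5 else 0.0

for new_price, side in ((0.6, "buy"), (0.4, "sell")):
    value = contract_value(new_price)  # worth after the agent's own move
    if side == "buy":
        print("buy :", "regret" if value < new_price else "profit")
    else:
        print("sell:", "regret" if value > new_price else "profit")
# Both trades end in regret, so the rational agent does not trade at all.
```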
Though our contradiction has been couched in the language of
adverse selection, there is no loss of generality here. The decision
to exert high or low effort is in line with any other decision which
could positively or negatively affect the probability of achieving the
goal.
Furthermore, the no-trade result also holds if an automated market maker is not present. In a regular limit order market, a rational agent would be willing to sell at any price above .5 and buy at any price below .5. If the market uses conventional order clearing rules—any time a bid price exceeds an ask price, a trade takes place at the earlier-placed price—no other trader in the market would match the agent’s orders.
Our no-trade result is fundamentally different from the classic no-trade result of Milgrom and Stokey [1982]. In their setting, rational agents are unable to “agree to disagree”, so speculative trade cannot exist. Purely speculative trade is zero-sum. Since a rational agent knows that no other rational agent would offer a counterparty a trade with positive expected value, no trade occurs. In contrast, our no-trade result stems from the fact that prices are endogenous (self-referential): the price affects the action, which affects the success probability, which affects the price.
The difference between the settings is perhaps most clear if we imagine an automated market maker interacting with a rational, risk-neutral agent with correct beliefs. In the Milgrom and Stokey setting, the agent would adjust the market price to the true probability of the event occurring. In our setting, the agent does not interact with the automated market maker at all.
4. CONCLUSION

We initiated the study of decision rules and decision markets in settings where a principal needs to select an action and is advised by a self-interested but decision-agnostic expert about the success probabilities of alternative actions.

We began by investigating the properties of general deterministic decision rules in the context of eliciting from a single expert. We proved results about the relations between the principal’s decision rule and the rules that specify the expert’s payoff if the desired outcome is, and is not, achieved. For the most natural decision rule (where the principal takes the action with the highest success probability), we showed that no symmetric scoring rule, nor any of Winkler’s asymmetric scoring rules, is quasi-strictly proper. We characterized the set of differentiable quasi-strictly proper scoring rules and constructed an asymmetric scoring rule that is quasi-strictly proper.

Moving to decision markets, where multiple experts interact by trading, we showed a surprising impossibility for every automated market maker: an agent is incentivized to artificially raise the price of a non-optimal action (again under the decision rule where the principal takes the action with the highest success probability). To counter this impossibility, we constructed two families of pricing rules that are asymptotically optimal against this form of manipulation, one additively optimal and the other multiplicatively optimal. Finally, we considered two alternative market designs for decision markets. The first, in which all outcomes live in the same probability universe, has even worse incentives (for the final participant). The second, in which the experts trade on the probability of the outcome occurring unconditionally, exhibits a new kind of no-trade result.

There are many interesting research directions in this new area. The first would be examining the space of discontinuous scoring rules, being aware that such rules would nevertheless have to obey Theorem 2. Another area of exploration would be combinatorial decision rules, in which the principal could take more than one of the available actions.

We focused on deterministic decision rules because they are natural. It is known that adding randomness can significantly increase the incentive-compatible feasible space of mechanisms in other settings—prominent examples include voting [Gibbard, 1977] and sybil-proof (false-name-resistant) voting [Wagman and Conitzer, 2008]. Future research should study randomized decision rules in our setting as well. Trivial randomized solutions—like shifting ε probability to each action—do not get around our impossibilities, like the nonexistence of strictly proper scoring rules (because the manipulations in our examples and proofs yield the manipulator benefits that are not infinitesimally small).

It would also be interesting to explore a larger characterization of the peculiar no-trade impossibility of Section 3.4.2. Does it apply in other economic interactions besides markets of this sort? The key seems to be that any action an agent takes has negative utility because of the effects of taking that action.

Another important direction concerns what happens when agents are interested in the action taken by the principal. For instance, an expert could advocate doubling a company’s advertising budget because she works in the marketing department. This type of setting is related to recent work by Shi et al. [2009], who study a setting where an agent can perform an action after a market runs such that the market can incentivize the agent to act counter to the principal’s goal. Whether or not the agents can take actions after the decision market, it would be desirable to use decision markets to balance the utilities of agents impacted by the principal’s decision with the principal’s desire to achieve his goal.
Acknowledgements
This work is supported by NSF IIS-0905390.
References
J. Berg and T. Rietz. Prediction markets as decision support systems. Information Systems Frontiers, 5(1):79–93, 2003.
J. Berg, R. Forsythe, F. Nelson, and T. Rietz. Results from a Dozen
Years of Election Futures Markets Research. Handbook of Experimental Economics Results, 2001.
Y. Chen, D. Reeves, D. Pennock, R. Hanson, L. Fortnow, and
R. Gonen. Bluffing and Strategic Reticence in Prediction Markets. In WINE, 2007.
A. Gibbard. Manipulation of schemes that mix voting with chance.
Econometrica, 45:665–681, 1977.
R. Hanson. Combinatorial information market design. Information
Systems Frontiers, 5(1):107–119, 2003.
R. Hanson. Logarithmic market scoring rules for modular combinatorial information aggregation. Journal of Prediction Markets,
1(1):1–15, 2007.
P. Milgrom and N. Stokey. Information, Trade and Common
Knowledge. Journal of Economic Theory, 26(1):17–27, 1982.
D. Pennock and R. Sami. Computational Aspects of Prediction
Markets. In Algorithmic Game Theory, chapter 26, pages 651–
674. Cambridge University Press, 2007.
P. Shi, V. Conitzer, and M. Guo. Prediction Mechanisms That Do
Not Incentivize Undesirable Actions. In WINE, 2009.
L. Wagman and V. Conitzer. Optimal false-name-proof voting rules
with costly voting. In AAAI, pages 190–195, 2008.
R. Winkler. Evaluating probabilities: Asymmetric scoring rules.
Management Science, pages 1395–1405, 1994.
J. Wolfers and E. Zitzewitz. Five Open Questions About Prediction
Markets. Technical Report 1975, Institute for the Study of Labor
(IZA), Feb. 2006.