Self-Confirming Price-Prediction Strategies for
Simultaneous One-Shot Auctions
Michael P. Wellman (Computer Science & Engineering, University of Michigan)
Eric Sodomka and Amy Greenwald (Computer Science, Brown University)
March 29, 2012
Abstract
Bidding in simultaneous auctions is challenging because an agent's value for a good in one auction may depend on the outcome of other auctions; that is, bidders typically face an exposure problem. Given the gap in understanding (e.g., lack of game-theoretic solutions) of general simultaneous auction games, previous works have tackled the problem of how to bid in these games with heuristic strategies that employ probabilistic price predictions, so-called price-prediction strategies. We introduce a concept of self-confirming prices, and show that within an independent private value model, bidding optimally with respect to self-confirming price predictions is, without loss of generality, in equilibrium. In other words, Bayes-Nash equilibrium can be fully characterized as a profile of optimal price-prediction strategies with self-confirming predictions. We exhibit practical procedures to compute approximately optimal bids given a probabilistic price prediction, and near self-confirming price predictions given a price-prediction strategy. We call the output of our procedures self-confirming price-prediction (SCPP) strategies. An extensive empirical game-theoretic analysis demonstrates that SCPP strategies are effective in simultaneous auction games with both complementary and substitutable preference structures.
1 Introduction
One of the most attractive features of automated trading is the ability to monitor and
participate in many markets simultaneously. Compared to human traders, software
agents can take in data from multiple sources at very high throughput rates. In principle, software agents can also process massive quantities of information relevant to
trading decisions in short time spans. In practice, however, dealing with multiple markets poses one of the greatest strategic challenges for automated trading. When markets interact, a strategy for bidding in one market must consider the implications of and
ramifications for what happens in others.
Markets are interdependent when an agent’s preference for the outcome in one
market depends on results of other markets. For instance, when an agent’s value for
one good is increased by obtaining another, the goods are complements. Dealing with
multiple markets under complementary preferences presents an agent with the classic
exposure problem: before it can obtain a valuable bundle, the agent must risk getting
stuck with a strict subset of the goods, which it may not have wanted at the prevailing
prices. Exposure is a potential issue for substitute goods as well, as the agent risks
obtaining goods it does not want given that it obtains others.
The pitfall of exposure is a primary motivation for combinatorial auctions [Cramton et al., 2005], where the mechanism takes on the responsibility of allocating goods respecting agents' expressed interdependencies. Combinatorial auctions are often infeasible, however, due to the nonexistence of an entity with the authority and capability to
coordinate markets of independent origin. Consequently, interdependent markets are
inevitable. Nonetheless, there is at present very little fundamental understanding of
agent bidding strategies for these markets. Specifically, how should an agent’s bidding
strategy address the exposure problem?
We address this question in what is arguably the most basic form of interdependent
markets: simultaneous one-shot sealed-bid (SimOSSB) auctions. Despite the simplicity of this mechanism and the practical importance of the exposure issue, there is little
available guidance in the auction theory literature on the strategic problem of how to
bid in simultaneous OSSB auctions. We aim to fill this gap by providing computationally feasible methods for constructing bidding strategies for the simultaneous-OSSB-auction environment, which we justify with both theory and evidence from simulation-based analysis. Specifically, we:
• characterize Bayes-Nash equilibria of simultaneous one-shot auctions as best
responses to price predictions (§3);
• provide bounds on approximate Bayes-Nash equilibria in terms of the accuracy
of price predictions and the degree of optimality of responses (§4);
• introduce operational methods to construct bidding strategies that respect the
equilibrium form (§5 and §6); and
• demonstrate through a comprehensive empirical game-theoretic analysis the efficacy of these strategies compared to a wide variety of heuristics proposed in
the literature (§7).
2 Previous Work
Theoretical results about general simultaneous auction games are few and far between. The leading auction theory textbook [Krishna, 2010] treats sequential but not simultaneous auctions,¹ and the most influential comprehensive survey [Klemperer, 2004] addresses simultaneity only in the context of ascending or multi-unit auctions.

¹ This despite the fact that its author is responsible for some of the few results [Krishna and Rosenthal, 1996] in this area.

In the first work to derive an equilibrium of a simultaneous-auction game, Engelbrecht-Wiggans and Weber [1979] tackle an example with perfect substitutes, where each agent is restricted to bid on at most two items. Their analysis was performed in the
large-limit of auctions and agents, and exhibited a mixed equilibrium where the agents
diversify their bids even though the items are indistinguishable. Most of the remaining
published theoretical results on simultaneous OSSB auctions are due to Rosenthal and
colleagues. Krishna and Rosenthal [1996] studied a second-price setup assuming independent private values with two categories of bidders: local bidders who have value
only for a single item, and global bidders who have superadditive values for multiple items. The authors characterize an equilibrium that is symmetric with respect to
the global bidders, and show, somewhat surprisingly, that an increase in the number
of bidders often leads to less aggressive bidding. Rosenthal and Wang [1996] tackled a first-price setup, assuming synergies and common values. Szentes and Rosenthal [2003] studied a class of two-bidder auctions with three identical objects; in their
model, bidders’ marginal values are complete information and are first increasing but
then decreasing in the number of items won.
Recently, Rabinovich et al. [2011] have generalized fictitious play to games of incomplete information with finite actions, and applied their technique to a class of simultaneous second-price auctions. The authors compute approximate equilibria for
environments with complements and substitutes where utilities can be expressed as
linear functions over a one-dimensional type space.
Complementing these theoretical treatments, researchers have also designed trading strategies applicable to simultaneous auctions, which address the exposure problem
through heuristic means. For example, in the Trading Agent Competition (TAC) Travel
game [Wellman et al., 2007], agents face an exposure problem for hotels—they must
obtain a room for each night for the client, otherwise the whole trip is infeasible. Experience from TAC and many other domains has demonstrated the importance of price
prediction for bidding in interdependent markets [Wellman et al., 2004]. Given probabilistic predictions of prices across markets, the agents can manage exposure risk,
choosing bids that trade off the profits and losses of the possible bundles of goods they
stand to win.
Greenwald and Boyan [2004] framed the problem of bidding across interdependent
markets given probabilistic price predictions. Follow-on work [Greenwald et al., 2009,
Wellman et al., 2007] formalized this bidding problem in decision-theoretic terms, and
established properties of optimal bidding strategies given the assumption that bids do
not affect other-agent behaviors. Further experimental comparison was performed by
Greenwald et al. [2010]. These works introduced a taxonomy of heuristic bidding
strategies [Wellman, 2011], which we employ in the current study.
Self-confirming price-prediction bidding strategies were first explored in the context of simultaneous ascending auctions (SimAAs) [Cramton, 2005]. For the SimAA
environment, SCPP strategies were found to be highly effective at tackling the exposure
problem [Wellman et al., 2008]. To our knowledge, no other general price-prediction
methods have been proposed for simultaneous OSSB auctions, other than learning from
historical observations.
3 Price-Prediction Strategies and Equilibrium
We consider a market with m goods, X = {1, . . . , m}, and n agents. Agent i's value for a bundle X ∈ 2^X is given by v_i(X), where v_i(X) ∈ [0, V̄]. We assume free disposal: if X ⊆ X′, then v_i(X) ≤ v_i(X′).
The m goods are allocated to the agents via SimOSSB auctions, one per good. That the mechanism is simultaneous means that each agent i submits a bid vector b_i = (b_i^1, . . . , b_i^m) ∈ R_+^m before a specified closing hour. That the auctions are one-shot
means that all auctions compute and report their results upon the closing hour. That the
bids are sealed means that agents have no information about the bids of other auction
participants until the outcome is revealed.
The second-price sealed-bid (SPSB) auction is an OSSB auction in which the winning bidder pays the second-highest bid rather than its own (highest) bid. The environments studied in our empirical game-theoretic analysis below employ the SPSB
mechanism. For simplicity of description we focus on SPSB throughout, although our
theoretical results hold as well for first-price sealed-bid auctions, or indeed any auction
mechanism where the outcome to agent i (whether it gets the good and the price it
pays) is a function of i’s own bid and the highest other-agent bid.
Our investigation employs the familiar independent private values (IPV) model
(see, for example, [Krishna, 2010]), where each agent i’s values are drawn independently from a probability distribution that is common knowledge. Under IPV, it is a
dominant strategy for an agent to bid its true value in a single SPSB auction. This
result does not generalize to simultaneous SPSB auctions, however, unless the agent’s
value over bundles happens to be additive. When agents’ values for different goods
interdepend (e.g., through complementarity or substitutability), bidding truthfully in
simultaneous auctions is not even an option, as the value for an individual good is not
well-defined.
To deal effectively with interdependent markets, an agent’s bid in each auction
must reflect its beliefs about the outcomes of others. We consider beliefs in the form
of predictions about the prices at which the agent might obtain goods in the respective
auctions. Bidding strategies that are explicitly cast as functions of some input price
prediction are termed price-prediction (PP) strategies.
We denote by p_j ∈ R_+ a price for good j. The vector p = ⟨p_1, . . . , p_m⟩ associates a price with each good. We represent price predictions as probability distributions over the joint price space. We use the symbol Π for such predictions in the form of cumulative probability distributions,

    Π_p(q) = Pr(p ≤ q),    (1)

where p ≤ q holds iff p_j ≤ q_j for all j. We generally omit the subscript p as understood.
We have elsewhere argued [Wellman, 2011, Wellman et al., 2007] that price prediction is a key element of trading agent architecture, for a broad range of complex trading
environments. Here we make a stronger claim: given IPV, PP strategies are necessary
and sufficient for optimal bidding in SimOSSB auctions. The remainder of this section
presents our support for this claim.
Let w(b, q) = {j | b_j > q_j} denote the set of goods an agent would win by bidding b when the highest other-agent bids are q.² Agent i's utility for a bid given others' bids can thus be written as

    u_i(b, q) = v_i(w(b, q)) − Σ_{j ∈ w(b,q)} q_j.    (2)
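The winnings set w(b, q) and the utility of Equation (2) can be sketched directly in code. This is a minimal illustrative rendering (names are ours, not the paper's): the agent wins every good on which it outbids the highest other-agent bid, values the resulting bundle, and, under SPSB, pays the other-agent price q_j for each good won.

```python
def winnings(b, q):
    """w(b, q): indices of goods where the agent's bid beats the highest other bid."""
    return frozenset(j for j in range(len(b)) if b[j] > q[j])

def utility(v, b, q):
    """u_i(b, q) = v_i(w(b, q)) - sum of prices paid (SPSB: pay q_j per good won)."""
    won = winnings(b, q)
    return v(won) - sum(q[j] for j in won)

# Example: two goods valuable only as a pair (perfect complements).
v = lambda X: 10.0 if X == frozenset({0, 1}) else 0.0
print(utility(v, b=[6.0, 6.0], q=[4.0, 5.0]))  # wins both, pays 9 -> 1.0
print(utility(v, b=[6.0, 6.0], q=[4.0, 7.0]))  # exposed: wins good 0 only -> -4.0
```

The second call illustrates the exposure problem: winning a strict subset of the desired bundle yields negative utility.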
Definition 1 (Optimal PP Bidders). An optimal PP bidding strategy s*(Π) submits bids that maximize expected utility given a price prediction Π,

    s*(Π) ∈ arg max_b E_{q∼Π}[u_i(b, q)].    (3)
In games of incomplete information like auctions, each agent's strategy produces actions (here, bids) as a function of its type. Under IPV, knowing other agents' values does not tell agent i anything about its own value. Therefore, the expected utility of bidding b_i depends only on b_i and the marginal distribution of other-agent bids induced by their underlying valuation distributions (agent i need not condition on its own information). It follows that the best response to any profile of other-agent strategies depends only on the distribution of other-agent bids.
Let b_{−i}^{k*} denote the highest bid submitted to the auction for good k by an agent other than i. Since agent i's utility depends only on what it wins and what it pays, the distribution of highest other-agent bids (i.e., the distribution of b_{−i}^{k*}) is a sufficient statistic for the other-agent bid distributions. This distribution can be expressed in the form of a price prediction (1).
Therefore, a best response to other-agent bidding strategies takes the form of an optimal PP bidding strategy (3), where the input PP is the distribution of highest other-agent bids induced by those other-agent strategies. Since a Bayes-Nash equilibrium (BNE) is nothing more than a profile of mutual best-response strategies, any profile of optimal PP strategies, where the PP for each equals the distribution of highest bids induced by the other agents' optimal PP strategies, must constitute a BNE. Moreover, any BNE can be characterized as a profile of optimal PP strategies, or mixtures thereof. We have thus established the following complete characterization of Bayes-Nash equilibria for the simultaneous OSSB auction game.
Theorem 1. Suppose a SimOSSB auction game, with independent private values, where the outcome of each auction to agent i depends only on i's bid and the highest other-agent bid in that auction. Then the strategy profile s = (s_i, s_{−i}) is a Bayes-Nash equilibrium if and only if, for all i, s_i is equivalent to an optimal PP bidding strategy with input Π(q) = Pr((b_{−i}^{1*}, . . . , b_{−i}^{m*}) ≤ q | s_{−i}), or equivalent to a mixture of such optimal PP strategies.
Observe that the price predictions that support strategic equilibrium are themselves in a form of equilibrium. Specifically, the bidding strategies employ price predictions that are actually borne out as correct (i.e., the distributions generated by the strategies are as predicted) assuming that everyone follows the given bidding strategies. This situation can be viewed as a form of rationalizable conjectural equilibrium (RCE) [?], where each agent's conjecture is about the distribution of highest other-agent bids. In this instance, the RCE is also a BNE, since the conjecture provides sufficient information to determine a best response. The import of our result is not the observation that such an RCE exists, but rather that it takes the natural and compact form of a distribution over prices, which, as we demonstrate, leads to a computationally plausible method for developing effective bidding strategies.

² In the case of ties, the winner is chosen uniformly from among the high bidders. For simplicity of notation and exposition, we ignore ties, which are a rare occurrence given our setup of real-valued bids and rich valuation functions. Our theoretical results require that the highest other-agent bid is a sufficient statistic for the outcome, which is true as long as one's bid does not tie for best with two or more other bidders'.
An auction game with IPV is symmetric if all agents have the same probability distribution over valuations. In earlier work on simultaneous ascending auctions [Wellman et al., 2008], we considered the symmetric IPV case and referred to price predictions in this kind of equilibrium relationship as self-confirming.
Definition 2 (Self-Confirming Price Prediction (SCPP)). The prediction Π is self-confirming for PP strategy s in Γ iff Π is equal to the distribution of the highest other-agent prices (b_{−i}^{1*}, . . . , b_{−i}^{m*}) when all agents play s(Π).
The following theorem specializes Theorem 1 for the symmetric case, employing
the language of self-confirming price predictions.
Theorem 2. In symmetric IPV SimOSSB auctions, a symmetric pure BNE comprises
optimal PP bidders employing self-confirming price predictions.
Hence, existence of pure symmetric BNE in such games entails existence of self-confirming price predictions.
In summary, for SimOSSB auctions under IPV, we can restrict attention to optimal PP strategies employing price predictions that are in equilibrium with one another
(which for the symmetric case, means SCPPs). Of course, just because we can does
not mean that we should. To make the positive case, in the remainder of this paper, we
demonstrate that
1. price prediction strategies are amenable to effective approximation;
2. price prediction is a convenient abstraction based on which to design and implement a trading strategy; and
3. price prediction strategies exhibit a high degree of robustness across simulation
environments.
4 Approximate Price Prediction
We have shown that an optimal PP strategy is a best response to the strategies of other
agents, if price predictions exactly reflect the other-agent bid distributions. Since it is
unrealistic to expect perfect price prediction, we examine the consequences of employing PP strategies given price predictions with approximate accuracy.
We measure the distance between probability distributions in two ways. The first is a multivariate form of the Kolmogorov-Smirnov (KS) statistic: KS(Π, Π′) ≡ sup_q |Π(q) − Π′(q)|. Second, we define the bundle probability distance, BP(Π, Π′, b), with respect to a bid b:

    BP(Π, Π′, b) ≡ (1/2) Σ_{X⊆X} |Pr_{q∼Π}(w(b, q) = X) − Pr_{q∼Π′}(w(b, q) = X)|.
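The bundle probability distance can be estimated by Monte Carlo: sample price vectors from each prediction, tabulate how often bid b wins each exact bundle, and take half the L1 distance between the two bundle distributions. The following is an illustrative sketch with hypothetical sampler interfaces, not the paper's implementation.

```python
import random
from collections import Counter

def winnings(b, q):
    """w(b, q): goods on which bid b beats the highest other-agent bid q."""
    return frozenset(j for j in range(len(b)) if b[j] > q[j])

def bundle_prob_distance(sample1, sample2, b, n=20000, seed=0):
    """Monte-Carlo estimate of BP(Pi, Pi', b); sample1/sample2 draw q ~ Pi / Pi'."""
    rng = random.Random(seed)
    def bundle_freqs(sample):
        counts = Counter(winnings(b, sample(rng)) for _ in range(n))
        return {X: c / n for X, c in counts.items()}
    f1, f2 = bundle_freqs(sample1), bundle_freqs(sample2)
    return sum(abs(f1.get(X, 0.0) - f2.get(X, 0.0))
               for X in set(f1) | set(f2)) / 2

# One good, bid 10: under Pi the price is U[0,10] (win almost surely);
# under Pi' it is U[0,20] (win with probability 1/2), so BP is near 0.5.
pi1 = lambda r: [r.uniform(0.0, 10.0)]
pi2 = lambda r: [r.uniform(0.0, 20.0)]
print(bundle_prob_distance(pi1, pi2, b=[10.0]))
```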
To analyze the impact of imperfect price predictions, we separately consider the effects on expected value of winnings and payment. Let us denote the payment for b given highest other-agent bids q by ψ(b, q), so that for SPSB we have ψ(b, q) = Σ_{j∈w(b,q)} q_j. Overall expected utility is the difference between expected value of winnings and expected payment:

    E_q[u_i(b, q)] = E_q[v_i(w(b, q))] − E_q[ψ(b, q)],    (4)

where the first term represents expected value of winnings, and the second expected payment.
We first consider payment. The following bounds the effect on expected payment
of imperfect price predictions.
Lemma 3. Let δ_KS = KS(Π, Π′) and ‖b‖₁ ≡ Σ_{j=1}^m b_j. Then for all b,

    E_{q∼Π}[ψ(b, q)] ≤ E_{q∼Π′}[ψ(b, q)] + 2δ_KS ‖b‖₁.    (5)
Proof. Since the total payment is the sum of the individual-good payments, the expected total payment is the sum of the expected individual-good payments. Let us therefore consider a particular good j. A variant distribution Π can increase expected payment for j in two ways: by increasing the expected price paid for j, or by decreasing prices so that the agent is more likely to win j under Π than under the baseline distribution Π′. The first effect is bounded by a situation where probability mass δ_KS is shifted from p_j = 0 to p_j = b_j − ε, for some arbitrarily small ε > 0. This increases expected payment by no more than δ_KS b_j; any greater shift would violate the constraint on KS distance. At the same time, we can shift another probability mass δ_KS from a point p_j > b_j to p_j = b_j − ε, which by the second effect increases the probability of b_j winning by δ_KS, and thus expected payment by another δ_KS b_j. Doing the same for every good yields a total expected payment increase bounded above by 2δ_KS ‖b‖₁.
This bound is tight. Consider a single good. Suppose the baseline prediction is bimodal: [(0.5, 0); (0.5, 20)] (i.e., with probability 0.5, the price will be zero; and with probability 0.5, the price will be 20), and consider a bid of 10. At the baseline, expected payment is zero: either the good is won at a price of zero, or the good is not won (again, at a cost of zero). Now change the probability distribution to [(0.5 − δ, 0); (2δ, 10 − ε); (0.5 − δ, 20)], for some infinitesimal ε > 0. Observe that this variant distribution satisfies the KS constraint: the CDF is off by −δ in the range [0, 10 − ε) and by +δ in the range [10 − ε, 20). Further, a bid of 10 wins the good at a price of 10 (essentially) with probability 2δ, so the expected payment is 20δ.
We can similarly bound the effect of inaccurate price predictions on expected value of winnings. A variant distribution can degrade expected value of winnings only by decreasing the probability of winning valuable bundles. By constraining BP distance, we can ensure, for any set of bundles, that the total probability of winning a bundle from that set at b can decrease by at most δ_BP. This means that the expected value of winnings can suffer by at most δ_BP V̄.
Lemma 4. Let δ_BP = BP(Π, Π′, b). Then

    E_{q∼Π}[v_i(w(b, q))] ≥ E_{q∼Π′}[v_i(w(b, q))] − δ_BP V̄.    (6)
Combining the lemmas, we have the following bound on expected utility (4).

Theorem 5. Let δ_KS = KS(Π, Π′) and δ_BP = BP(Π, Π′, b). Then for all i,

    E_{q∼Π}[u_i(b, q)] ≥ E_{q∼Π′}[u_i(b, q)] − δ_BP V̄ − 2δ_KS ‖b‖₁.
Henceforth, let us denote by b̄ the maximum payment, that is, the L₁-norm of the greatest possible bid vector. The value of b̄ is bounded above by mV̄ for any rational bidding strategy under any valuation distribution, but typically it will be far less than that.
Theorem 6. Suppose that for all agents i, strategy ŝ_i is a best response to other-agent highest-bid distribution Π̂_{−i}, and that Π_{ŝ_{−i}} is the other-agent bid distribution actually induced by ŝ_{−i}. If for all i, KS(Π̂_{−i}, Π_{ŝ_{−i}}) ≤ δ_KS, and for all b, BP(Π̂_{−i}, Π_{ŝ_{−i}}, b) ≤ δ_BP, then ŝ constitutes an ε-Bayes-Nash equilibrium, for ε = 2δ_BP V̄ + 4δ_KS b̄.
Proof. We present the argument from the perspective of a fixed agent, omitting the i and −i subscripts as understood. Let U(s, Π) denote the expected utility of playing strategy s when the other-agent highest-bid distribution is Π. By the constraints on distance and Theorem 5, we have

    U(ŝ, Π_ŝ) ≥ U(ŝ, Π̂) − δ_BP V̄ − 2δ_KS b̄.

Let s* be the best response to Π_ŝ. By the same reasoning, we have

    U(s*, Π̂) ≥ U(s*, Π_ŝ) − δ_BP V̄ − 2δ_KS b̄.

Combining these two inequalities, we get

    U(ŝ, Π_ŝ) + U(s*, Π̂) ≥ U(ŝ, Π̂) + U(s*, Π_ŝ) − 2δ_BP V̄ − 4δ_KS b̄.

Rearranging,

    U(s*, Π̂) − U(ŝ, Π̂) + 2δ_BP V̄ + 4δ_KS b̄ ≥ U(s*, Π_ŝ) − U(ŝ, Π_ŝ).

Since ŝ is a best response to Π̂, the expected utility difference on the LHS is nonpositive, and so we have

    2δ_BP V̄ + 4δ_KS b̄ ≥ U(s*, Π_ŝ) − U(ŝ, Π_ŝ),

which, when applied to all agents, is the condition for ŝ to be an approximate BNE.
5 Heuristic PP Bidding Strategies
Having shown that optimal PP bidding strategies are theoretically ideal in that they
comprise a BNE, we turn our attention to practical PP bidding strategies. Following
and building on prior work, we present a broad range of heuristic bidding strategies
defined as functions of available price predictions. Our focus in this section is on
describing the bidding strategies considered in this study, along with relevant concepts
necessary to understand how the strategies operate. Our methods for deriving price
predictions are presented in the following section.
5.1 Marginal Values and Optimal Bundles
Interdependence dictates that the value of any individual good must be assessed relative
to a bundle of goods. This idea is captured by the notion of marginal value.
Definition 3 (Marginal Value). Agent i's marginal value, μ_i(x, X), for good x with respect to a fixed bundle of other goods X is given by:

    μ_i(x, X) ≡ v_i(X ∪ {x}) − v_i(X).
Given a fixed vector of prices, p = ⟨p_1, . . . , p_m⟩, let σ_i(X, p) denote agent i's surplus from obtaining the set of goods X at those prices:

    σ_i(X, p) ≡ v_i(X) − Σ_{j | x_j ∈ X} p_j.    (7)
Definition 4 (Acquisition [Boyan and Greenwald, 2001]). Given price vector p, the acquisition problem selects an optimal bundle of goods to acquire:

    X* = ACQ_i(p) ≡ arg max_{X⊆X} σ_i(X, p),

where σ is defined by (7).
Faced with perfect point price predictions, an optimal bidding strategy would be to compute X* = ACQ_i(p) and then to buy precisely those goods in X*. By definition, this strategy yields the optimal surplus at these prices:

    σ_i*(p) ≡ σ_i(ACQ_i(p), p).
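For small m, the acquisition problem can be solved by brute-force enumeration of all 2^m bundles. The sketch below (our names, not the paper's code) computes the surplus of Equation (7) for every bundle and returns a maximizer together with σ*.

```python
from itertools import combinations

def surplus(v, X, p):
    """sigma_i(X, p) = v_i(X) - sum of predicted prices of goods in X."""
    return v(X) - sum(p[j] for j in X)

def acquisition(v, p, m):
    """ACQ_i(p) by enumeration: return (X*, sigma*(p)). Exponential in m."""
    best, best_s = frozenset(), surplus(v, frozenset(), p)
    for r in range(1, m + 1):
        for combo in combinations(range(m), r):
            X = frozenset(combo)
            s = surplus(v, X, p)
            if s > best_s:
                best, best_s = X, s
    return best, best_s

# Perfect complements: the pair is worth 10, every other bundle 0.
v = lambda X: 10.0 if X == frozenset({0, 1}) else 0.0
print(acquisition(v, p=[3.0, 4.0], m=2))  # pair costs 7 < 10: buy both
print(acquisition(v, p=[6.0, 6.0], m=2))  # pair costs 12 > 10: buy nothing
```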
To assess goods with respect to (typically imperfect) point price predictions, we extend the concept of marginal value. Let p[p_j ← q] be a version of the price vector p with the jth element revised as indicated: p[p_j ← q] = ⟨p_1, . . . , p_{j−1}, q, p_{j+1}, . . . , p_m⟩.

Definition 5 (Marginal Value at Prices). Agent i's marginal value μ_i(x_j, p) for good x_j with respect to prices p is given by:

    μ_i(x_j, p) = σ_i*(p[p_j ← 0]) − σ_i*(p[p_j ← ∞]).
Here, σ_i*(p[p_j ← 0]) represents agent i's optimal surplus at the given prices, assuming it receives good x_j for free. Similarly, σ_i*(p[p_j ← ∞]) represents the optimal surplus at the given prices, if x_j were unavailable. The difference is precisely the marginal value of good x_j with respect to its buying opportunities for other goods. Note that Definition 5 generalizes Definition 3, under the interpretation that goods in X have zero price, and all other goods have infinite price.
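Definition 5 translates directly into code given any solver for σ*. This sketch reuses a brute-force optimal surplus (illustrative, exponential in m) and evaluates it with good j priced at zero and at infinity.

```python
import math
from itertools import combinations

def optimal_surplus(v, p, m):
    """sigma*(p): best surplus over all bundles, by enumeration."""
    return max(v(frozenset(c)) - sum(p[j] for j in c)
               for r in range(m + 1) for c in combinations(range(m), r))

def marginal_value(v, j, p):
    """mu_i(x_j, p) = sigma*(p[p_j <- 0]) - sigma*(p[p_j <- inf])."""
    m = len(p)
    p_free = list(p); p_free[j] = 0.0       # good j obtained for free
    p_gone = list(p); p_gone[j] = math.inf  # good j unavailable
    return optimal_surplus(v, p_free, m) - optimal_surplus(v, p_gone, m)

# Perfect substitutes: any nonempty bundle is worth 10.
v = lambda X: 10.0 if X else 0.0
print(marginal_value(v, 0, [4.0, 7.0]))  # 10 (free) - 3 (buy good 1) = 7.0
```

Note how substitutability depresses marginal value: good 0 is worth only 7 here because good 1 offers a fallback at price 7.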
5.2 A Non-Predictive Baseline Strategy
To calibrate the performance of bidding strategies based on price prediction, we introduce a baseline strategy that employs no price information. Without price guidance, there is little basis to choose among alternative bundles. We therefore simply assume the agent selects its most valued set of goods, which is the solution to its acquisition problem at zero prices. Let X* ∈ ACQ_i(0).
The BaselineBidding strategy bids for goods in X*, assuming a total budget of v(X*). To determine the bid price for each individual good, it starts with the value for each good taken individually, and evenly divides the remaining budget among the goods. More precisely, let the excess value be given by

    e = v(X*) − Σ_{x∈X*} v({x}).

For complementary preferences, e is guaranteed to be nonnegative. BaselineBidding then bids for good j as follows:

    BaselineBidding_j = v({x_j}) + e/|X*|  if x_j ∈ X*,  and 0 otherwise.
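The baseline can be sketched in a few lines (illustrative names; the bundle maximization here is brute-force enumeration standing in for ACQ_i(0)):

```python
from itertools import combinations

def baseline_bidding(v, m):
    """Bid v({x}) + e/|X*| on each x in X*, and 0 elsewhere."""
    # X* = ACQ(0): a most-valued bundle (acquisition at zero prices).
    bundles = [frozenset(c) for r in range(m + 1) for c in combinations(range(m), r)]
    X = max(bundles, key=v)
    # Excess value e = v(X*) - sum of standalone values, split evenly.
    e = v(X) - sum(v(frozenset({x})) for x in X)
    share = e / len(X) if X else 0.0
    return [v(frozenset({j})) + share if j in X else 0.0 for j in range(m)]

# Complements: pair worth 10, singletons worth 2 each, so e = 6.
v = lambda X: 10.0 if X == frozenset({0, 1}) else (2.0 if len(X) == 1 else 0.0)
print(baseline_bidding(v, 2))  # -> [5.0, 5.0]
```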
5.3 Bidding with Point Price Predictions
We consider first a set of strategies that employ predictions in the form of a vector of point prices, p = ⟨p_1, . . . , p_m⟩. All strategies considered in this section and the next are described with further motivation and detail by Wellman et al. [2007, Chapter 5].
The strategies in the TargetBidder family restrict their bidding to goods in X* = ACQ_i(p). They differ, however, in the amounts they bid. The TargetPrice strategy bids the predicted price for each good in its optimal bundle:

    TargetPrice_j = p_j  if x_j ∈ X*,  and 0 otherwise.

TargetMV bids at marginal value rather than predicted price:

    TargetMV_j = μ(x_j, p)  if x_j ∈ X*,  and 0 otherwise.

TargetMV* also bids at marginal value, except that it calculates μ under the assumption that goods outside the target bundle are unavailable:

    TargetMV*_j = μ(x_j, p[p_ℓ ← ∞ | x_ℓ ∉ X*])  if x_j ∈ X*,  and 0 otherwise.

The virtue of the TargetBidder strategies is that they never obtain goods outside their choice bundle. A weakness is their fragility: if the agent fails to acquire all the goods in this bundle, there is no recourse to others outside the set. The StraightMV strategy hedges against incorrect price predictions by bidding marginal value for all goods:

    StraightMV_j = μ(x_j, p).
5.4 Bidding with Price Distributions
A point price estimate fails to convey the uncertainty inherent in future prices. Probability distributions over prices provide a more general representation, expressing degrees of belief over the possible prices that might obtain. We therefore expand our
strategy set to incorporate bidding methods that take as input a probability distribution
over prices.
5.4.1 Expected Value Methods
The expected value method [Birge and Louveaux, 1997] approximates a stochastic optimization by collapsing probability distributions into point estimates through expectation. Let p̂_Π = ⟨p̂_1, . . . , p̂_m⟩, where p̂_j = E_{p∼Π}[p_j] is the expectation of p_j under given price prediction Π.

Any bidding strategy defined for point price predictions can be adapted to take as input distribution price predictions through this expected value method, simply by using p̂_Π for the point price prediction. We thus define StraightMU, TargetMU, and TargetMU* as strategies that take as input distribution predictions, but employ corresponding point prediction strategies in this way. "MU" stands for "marginal utility", but its usage here is simply to distinguish the name from its corresponding point strategy ending with "MV".
We implemented two approaches to calculating p̂_Π, given Π. The first computes p̂_Π by sampling from Π, and the second computes the exact expectation of a piecewise approximation of Π. For the sampling method, accuracy depends on the number of samples k drawn; thus we indicate a version of the strategy by appending "k" to the name. For example, StraightMU8 calculates p̂_Π as the mean of eight samples from Π, and employs StraightMV with that average price vector as input.
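The sampling variant can be sketched as follows (helper names are ours): average k draws from Π componentwise, then hand the resulting point vector to any point-prediction strategy.

```python
import random

def sampled_price_point(sample_prices, k, rng):
    """p_hat_Pi: componentwise mean of k price vectors sampled from Pi."""
    draws = [sample_prices(rng) for _ in range(k)]
    m = len(draws[0])
    return [sum(d[j] for d in draws) / k for j in range(m)]

# E.g., StraightMU8 would use k = 8 and pass the result to StraightMV.
# With prices uniform on [0,10] and [0,20], the estimate approaches
# <5, 10> as k grows.
rng = random.Random(0)
sample = lambda r: [r.uniform(0.0, 10.0), r.uniform(0.0, 20.0)]
print(sampled_price_point(sample, 8, rng))
```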
5.4.2 AverageMU
Whereas StraightMU bids the marginal value of the expected price, the PP strategy AverageMU bids the expected marginal value:

    AverageMU_j = E_{p∼Π}[μ(x_j, p)].
Our implementation samples from the price distribution, calculates marginal values for
each sample, and averages the results.
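This sampling procedure can be sketched by combining a brute-force marginal value (Definition 5) with Monte-Carlo averaging. All names are illustrative, and the enumeration stands in for whatever σ* solver an agent actually uses.

```python
import math
import random
from itertools import combinations

def optimal_surplus(v, p, m):
    """sigma*(p) by enumeration over all bundles."""
    return max(v(frozenset(c)) - sum(p[j] for j in c)
               for r in range(m + 1) for c in combinations(range(m), r))

def marginal_value(v, j, p):
    """mu(x_j, p) per Definition 5."""
    free, gone = list(p), list(p)
    free[j], gone[j] = 0.0, math.inf
    return optimal_surplus(v, free, len(p)) - optimal_surplus(v, gone, len(p))

def average_mu(v, j, sample_prices, n=2000, seed=0):
    """AverageMU_j: mean marginal value over n price samples from Pi."""
    rng = random.Random(seed)
    return sum(marginal_value(v, j, sample_prices(rng)) for _ in range(n)) / n

# Substitutes (any nonempty bundle worth 10) with good 1's price ~ U[0,10]:
# good 0's marginal value at a sample is p_1, so AverageMU_0 is near 5.
v = lambda X: 10.0 if X else 0.0
print(average_mu(v, 0, lambda r: [0.0, r.uniform(0.0, 10.0)]))
```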
5.5 Explicit Optimization (with Price Distributions)
Finally, we consider strategies that explicitly attempt to optimize bids given a distribution price prediction, as do the optimal PP bidders introduced above (Definition 1).
5.5.1 BidEval
One heuristic optimization approach is to first generate candidate bid vectors, and then
evaluate them according to the given price distribution. The BidEval strategy uses other
bidding strategies to propose candidates, and estimates the candidates’ performance by
computing their expected utility with respect to the price prediction Π. It then selects
the candidate bid vector with the greatest expected utility estimate.
There are many variations of the BidEval strategy, defined by specifying:
• The method used to generate candidates. When this method is another named
bidding strategy, we indicate this fact in parentheses; for instance, BidEval(SMU8)
generates candidates using StraightMU8. The strategy BidEvalMix employs a
mix of methods to generate candidates.
• Number of candidates generated. Note that a single generation method based on sampling naturally produces a diverse set of candidates. For example, each invocation of StraightMU8 employs a new draw of eight samples from Π to estimate p̂_Π; thus we generally obtain different bids.
• Whether the candidates are evaluated by exact computation on a piecewise version of Π, or by sampling. If by sampling, then how many samples are used.
5.5.2 LocalBid
The LocalBid strategy (see Algorithm 1) employs a local search method in pursuit of optimal bids. Starting with an initial bid vector proposed by another heuristic strategy, LocalBid makes incremental improvements to that bid vector for a configurable number of iterations. Those incremental improvements are made good-by-good, treating all other goods' bids as fixed. Assuming bids for all goods except j are fixed, the agent effectively faces a single auction for good j, with the winnings for other goods determined probabilistically. Under these circumstances, as in a single SPSB auction, it is a dominant strategy to bid one's expected marginal value for good j, which is given by E_{p∼Π}[v(w(b, p) ∪ {j}) − v(w(b, p) \ {j})].
LocalBid is an iterative improvement algorithm: the expected utility of b is nondecreasing with each update. Further, if LocalBid converges, it returns a bid vector that
is consistent (unlike AverageMU), in the sense that each element of the vector is the
average marginal value for its corresponding good given the rest of the bid.
Algorithm 1 LocalBid
Input: price prediction Π, heuristic strategy s, number of iterations K
Output: bid vector b
  Initialize b ← s(Π)
  for k = 1 to K do
    for j = 1 to m do
      b_j ← E_{p∼Π}[v(w(b, p) ∪ {j}) − v(w(b, p) \ {j})]
  return b
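Algorithm 1 can be rendered as runnable Python if the expectation over Π is approximated by averaging over a fixed set of sampled price vectors (a sketch under that assumption; the names are ours):

```python
def local_bid(price_samples, value_fn, init_bid, n_iters):
    """LocalBid sketch: sweep over goods, resetting each bid b_j to the
    (sampled) expected marginal value of good j given the current bids
    for all other goods. value_fn maps a set of won goods to a value;
    a good is won when its bid strictly exceeds the sampled price."""
    b = list(init_bid)
    m = len(b)
    for _ in range(n_iters):
        for j in range(m):
            total = 0.0
            for p in price_samples:
                won = {k for k in range(m) if b[k] > p[k]}
                # marginal value of good j, holding other winnings fixed
                total += value_fn(won | {j}) - value_fn(won - {j})
            b[j] = total / len(price_samples)
    return b
```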
5.5.3 BruteForce
BruteForce considers every possible bid vector within some discretized bid space, and
returns the bid vector with the highest expected utility estimate. For small discretization
factors or a large number of goods, this heuristic is not computationally feasible. It is
therefore not part of our empirical game-theoretic analysis (§7), though we do use it as
a benchmark when evaluating the degree of optimality of other heuristics (§8).
6 Self-Confirming Price Predictions
Now that we have a suite of strategies that employ price predictions, we turn to the question of how to generate such predictions. We propose here to employ self-confirming price predictions (Definition 2), originally introduced and evaluated in the context of simultaneous ascending auctions [Wellman et al., 2008]. Building on the ideas set forth in this prior work, we propose methods to approximate both point and distribution self-confirming price predictions.
6.1 Procedure for Approximating Self-Confirming Point Price Predictions
Given a game Γ, we employ a simple iterative procedure to derive an approximate self-confirming point prediction for point-price-prediction strategy PP. At each time t, this procedure takes as input point price prediction π^{t−1} and outputs point price prediction π^t. We take π^0 = ⟨1, . . . , 1⟩.
At iteration t, we run G instances of game Γ, with all agents playing point-price-prediction strategy PP(π^{t−1}). We tally the prices resulting from each instance, and denote the average price vector by p̄^t. We then update the price prediction by

π^t ← π^{t−1} + κ_t (p̄^t − π^{t−1}).

Here, κ_t is a decaying sequence, which controls the updating of predictions from one iteration to the next. When the distance ∆ between the results of two successive iterations falls below a threshold, the procedure halts and returns π^t. We employ

∆ ≡ max_j |π^t_j − π^{t−1}_j| ≤ τ.
Parameter | Meaning                          | Value
----------|----------------------------------|----------
G         | number of games per iteration    | 10^6
L         | max number of iterations         | 100
τ         | price distance threshold (point) | 0.001·V̄
τ         | KS threshold (distribution)      | 0.01
κ_t       | decay schedule                   | (L − t + 1)/L
Table 1: Parameter settings for self-confirming price prediction procedures.
[Two panels plotting point-wise price distance against number of iterations: with a decaying parameter (left) and with no decaying parameter (right).]
Figure 1: Effect of the decay parameter on the iterative procedure, given PP strategy
TargetMV*.
If this distance never falls below the threshold, then the procedure terminates after L iterations and returns π^L.
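The iterative procedure can be sketched as follows (our illustration; `simulate_prices` stands in for running G game instances with all agents playing PP(π) and averaging the resulting price vectors):

```python
def self_confirming_point_prediction(simulate_prices, pi0, L, tau, decay):
    """Iterative sketch: repeatedly play the game with all agents using
    PP(pi), then move the prediction toward the observed average prices
    with step size kappa_t = decay(t). Halts when the maximum per-good
    change falls below tau, or after L iterations."""
    pi = list(pi0)
    for t in range(1, L + 1):
        p_bar = simulate_prices(pi)   # average prices over G game instances
        kappa = decay(t)
        new_pi = [x + kappa * (p - x) for x, p in zip(pi, p_bar)]
        delta = max(abs(a - b) for a, b in zip(new_pi, pi))
        pi = new_pi
        if delta <= tau:
            break
    return pi
```

Setting `decay = lambda t: 1.0` corresponds to the no-decay case; `decay = lambda t: (L - t + 1) / L` corresponds to the schedule in Table 1.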
The procedure above allows some flexibility in what one considers the prices
resulting from each instance. We explore two versions of the above procedure, one in
which this resulting price is the highest other-agent bid (HB), and a second in which it
is the actual transaction price of the good (price).
We evaluated the convergence of this procedure using the parameter settings shown
in Table 1 with respect to three different PP strategies: StraightMV, TargetMV,
and TargetMV*. Computational results reported in this section employ the U [5, 8]
environment—eight agents with complementary preferences over five goods—described
in detail in Section 7.2.1. These runs also adopt the price convention (not HB) for interpreting price outcomes.
For StraightMV, we reach a fixed point within 20 iterations, even without a decay parameter. (The absence of a decay parameter is expressed by setting κ_t = 1.)
For TargetMV and TargetMV*, the price predictions tend to oscillate, hence we need
a decay schedule to settle the process. For these cases, we do not expect the results
to approximate self-confirmation as well. Nevertheless, the iterative procedure always produces a well-defined prediction, and so bidding strategies incorporating this
prediction-derivation method are likewise well defined.
Figure 1 shows a series of pointwise price distances between consecutive iterations
in deriving point price predictions for TargetMV* with and without a decay parameter.
At this setting, the iterative update of price predictions for TargetMV* fails to converge.
Since TargetMV* bids for goods in its target bundle under the assumption that no
others are available, these bids tend to be at the extremes of its value range. With all
agents playing this strategy, this leads to high prices, which causes changes in targets
in the next iteration (by all agents), causing these same goods to have very low prices
then. TargetMV produces a similar pattern, though attenuated by its consideration
of alternative buying opportunities. With regular oscillation, the effect of the decay
sequence is to return the average of the swings that would otherwise be observed.
6.2 Procedure for Approximating Self-Confirming Distribution Price Predictions
A similar iterative procedure is used to approximate self-confirming distribution price predictions. We employ a discrete approximation, rounding all observed prices to the closest integer. We can thus represent the price prediction distribution using a probability mass function, f(q), corresponding to the probability Pr(p = q). We adopt the simplifying (and incorrect) assumption that prices are probabilistically independent across goods, which allows us to maintain this function as an array of marginal distributions, ⟨f_1(q_1), . . . , f_m(q_m)⟩.
To measure the difference between distributions at successive iterations, we adopt the Kolmogorov-Smirnov statistic, KS(F, F′). Here, F is the cumulative distribution function corresponding to f. Since we maintain our prediction in terms of marginal distributions, our comparison takes the maximum of the KS statistic separately for each good: KS_marg(F, F′) ≡ max_j KS(F_j, F′_j).
Our initial prediction F^0 considers all integer prices in the feasible range equally likely. As in the point-prediction case, at iteration t, we run G instances of game Γ. Here all agents play the distribution-prediction strategy PP(F^{t−1}). We tally the prices resulting from each instance, and denote the sample distribution for good j by f̄^t_j. We then update the price prediction, for all j and q, by

f^t_j(q) ← f^{t−1}_j(q) + κ_t (f̄^t_j(q) − f^{t−1}_j(q)).

When KS_marg(F^{t−1}, F^t) falls below the threshold, we halt and return F^t. Or, if this distance never falls below the threshold, the procedure terminates after L iterations and returns F^L.
A comparison of the self-confirming price predictions derived for StraightMU (distributions) and StraightMV (points) is depicted in Figure 2.
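The per-iteration update and the KS_marg comparison can be sketched as follows (our illustration, operating on per-good probability mass arrays indexed by integer price):

```python
def update_marginals(f, f_bar, kappa):
    """Move each marginal pmf toward the empirical sample distribution:
    f_j^t(q) = f_j^{t-1}(q) + kappa * (fbar_j^t(q) - f_j^{t-1}(q))."""
    return [[fj[q] + kappa * (fbj[q] - fj[q]) for q in range(len(fj))]
            for fj, fbj in zip(f, f_bar)]

def ks_marg(f, g):
    """Maximum over goods of the Kolmogorov-Smirnov statistic between
    the CDFs of the corresponding marginal distributions."""
    worst = 0.0
    for fj, gj in zip(f, g):
        cf = cg = 0.0
        for q in range(len(fj)):
            cf += fj[q]
            cg += gj[q]
            worst = max(worst, abs(cf - cg))
    return worst
```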
6.3 Accuracy of Self-Confirming Price Predictions
The decay parameter ensures the iterative procedure for deriving price predictions will
terminate, but potentially masks violations of self-confirmation in the result. To test the
accuracy of the self-confirming price predictions calculated for this study (both point
and distribution), we ran an extra iteration (i.e., one million additional game instances)
after a supposed self-confirming price prediction was returned by the procedure. The
distance between the supposed self-confirming price predictions for a variety of PP
strategies and the output of an extra iteration (without any attenuation) are reported in
Table 2. As shown in the table, all are self-confirming to a fine level of approximation.
[Two panels: the price distribution from StraightMU (frequency of occurrence in a million simulations vs. price, for Slots 1-5) and the point price prediction from StraightMV (price, for Slots 1-5).]
Figure 2: Self-confirming price predictions derived for StraightMU and StraightMV
for simultaneous SPSB environment U [5, 8].
7 Empirical Game-Theoretic Analysis
The heuristic strategies introduced in §5 represent plausible but not generally optimal approaches to bidding in simultaneous auctions. Even the strategies based on
explicit optimization (§5.5) fall short of ideal due to inaccuracy in price prediction and
non-exhaustive search of bid candidates. To evaluate the performance of these strategies, we conducted an extensive computational study, simulating thousands of strategy
profiles—millions of times each—in five different simultaneous SPSB environments.
Analysis of the game model induced from simulation data provides evidence for the
efficacy and robustness of approximately optimal PP strategies across these environments.
Point Strategy | ∆ Statistic
---------------|------------
StraightMV     | 0.0116
TargetMV       | 0.0660
TargetMV*      | 0.0553

Distribution Strategy | KS_marg Statistic
----------------------|------------------
StraightMU64          | 0.00450
TargetMU64            | 0.00879
TargetMU*64           | 0.00788
AverageMU64           | 0.00516
Table 2: Accuracy of self-confirming point and distribution price predictions.
7.1 Approach
The methodology of applying game-theoretic reasoning to simulation-induced game
models is called empirical game-theoretic analysis (EGTA) [Wellman, 2006]. In EGTA,
we simulate profiles of an enumerated strategy set playing a game, and estimate a
normal-form game from the observed payoffs. The result is a simulation-induced
game model, sometimes called the empirical game. By then applying standard game-theoretic solution concepts to the empirical game, we can draw conclusions about the
strategic properties of the strategies and profiles evaluated. Although the strategy space
considered in the empirical game is necessarily a severely restricted subset of the original, by including a broad set of strategies representing leading ideas from the literature,
we can produce relevant evidence bearing on the relative quality of heuristic strategies
in the simulated environments.
Our EGTA study of simultaneous SPSB environments followed these steps.
1. Define an environment: numbers of goods and agents, and valuation distributions.
2. Specify a set of heuristic strategies. For PP strategies, this includes deriving self-confirming distribution price predictions to be input to these strategies, based
on the environment defined in Step 1. The full set of strategies included in our
EGTA study is described in Appendix A.
3. Simulate select profiles among these strategies, sampling from the valuation distributions for each simulation instance (at least one million per profile, most
profiles two million or more). Calculate mean payoffs for each strategy in each
profile.
4. Analyze the empirical game defined by these mean payoffs, to identify Nash
equilibria, dominance relationships, regret values, and other analytic constructs.
In actuality, Steps 2–4 were applied in an iterative and interleaved manner, with
intermediate analysis results informing the selection of strategies to explore and profiles to sample. The exploration and sample selection were guided manually, generally
driven by the objective of confirming or refuting equilibrium candidates among the profiles already evaluated. The process for each environment was terminated when all of
the following conditions were met: (1) a broadly representative set of heuristic strategies were covered, (2) all symmetric mixed profiles evaluated were either confirmed
or refuted as equilibria, and (3) all strategies showing relative success in at least one
environment were evaluated against the equilibria in all other environments. Overall,
the analysis commanded some tens of CPU-years over a roughly six-month period.
7.2 Environments
We evaluated five simultaneous SPSB environments, involving 3–8 agents bidding on 5
or 6 goods. The environments span two qualitatively different valuation distributions,
from highly complementary to highly substitutable. Both of these assume IPV and
symmetry, so that each agent receives a private valuation drawn independently from
the same distribution.
7.2.1 Scheduling Valuations
The first valuation distribution we employ in this study is based on a model of market-based scheduling [Reeves et al., 2005]. Goods represent time slots of availability for some resource: for example, a machine, a meeting room, a vehicle, or a skilled laborer. Agents have tasks, which require this resource for some duration of time to complete. Specifically, the goods 𝒳 = {x_1, . . . , x_m} comprise a set of m time slots available to be scheduled. Agent i's task requires λ_i time slots to accomplish, and the agent values a set of time slots according to when they enable completion of the task. If agent i acquires λ_i time slots by time t, it obtains value v^t_i. Value with respect to time is a nonincreasing function: for all i, if t < t′ then v^t_i ≥ v^{t′}_i. If it fails to obtain a sufficient set of goods, the agent accrues value v^∞_i = 0. Let X ⊆ 𝒳 denote a set of slots. The expression |{x_j ∈ X | j ≤ t}| represents the number of these that are for time t or earlier. We can thus write

T(X, λ) = min({t s.t. |{x_j ∈ X | j ≤ t}| ≥ λ} ∪ {∞})

to denote the earliest time by which X contains at least λ slots. The overall valuation function for agent i is then

v_i(X) = v_i^{T(X,λ_i)}.
For each agent, a task length λi is drawn uniformly over the integers {1, . . . , m}.
Values associated with task completion times are drawn uniformly over {1, . . . , 50},
then pruned to impose monotonicity [Reeves et al., 2005]. The valuations induced by
this scheduling scenario exhibit strong complementarity among goods. When λ > 1,
the agent gets no value at all for a bundle of fewer than λ goods. On the other hand, there is some degree of substitutability across goods, since there may be multiple ways of acquiring a bundle of the required size.
We denote environments using this valuation by U [m, n], with m and n the numbers of goods and agents, and “U” indicating the uniform distribution over task lengths.
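A sketch of sampling and evaluating one scheduling valuation (our illustration; the exact pruning used by Reeves et al. may differ in detail):

```python
import random

def completion_time(slots, lam):
    """T(X, lam): earliest time t such that at least lam of the won slots
    (1-indexed times) have index <= t; infinity if the task never completes."""
    for t in sorted(slots):
        if sum(1 for x in slots if x <= t) >= lam:
            return t
    return float("inf")

def draw_scheduling_valuation(m, v_max=50, rng=random):
    """Sample one agent's scheduling valuation: task length lam ~ U{1..m};
    completion values drawn uniformly over {1..v_max}, then pruned so that
    v^1 >= v^2 >= ... >= v^m (one simple way to impose monotonicity)."""
    lam = rng.randint(1, m)
    v = [rng.randint(1, v_max) for _ in range(m)]
    for t in range(1, m):
        v[t] = min(v[t], v[t - 1])
    def value(slots):
        t = completion_time(slots, lam)
        return 0 if t == float("inf") else v[int(t) - 1]
    return lam, value
```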
7.2.2 Homogeneous-Good Valuations
The second valuation distribution expresses the polar opposite of complementarity:
goods are perfect substitutes, in that agents cannot distinguish one from another. Agents’
marginal values for units of this good are weakly decreasing. Specifically, valuation is
a function of the number of goods obtained, constructed as follows. Agent i’s value for
obtaining exactly one good, vi ({1}), is drawn uniformly over {0, . . . , 127}. Its value
for obtaining two, vi ({1, 2}), is then drawn from {vi ({1}), vi ({1})+1, . . . , 2vi ({1})}.
In other words, its marginal value for the second good is uniform over {0, . . . , vi ({1})}.
Subsequent marginal values are similarly constrained not to increase. Its marginal
value for the kth good is uniform over {0, . . . , vi ({1, . . . , k−1})−vi ({1, . . . , k−2})}.
We denote environments using this valuation by H[m, n], with “H” an indicator for
homogeneity. Our EGTA study covered environments H[5, 3] and H[5, 5].
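The sampling process described above can be sketched as follows (our illustration):

```python
import random

def draw_homogeneous_valuation(m, rng=random):
    """Sample cumulative values for 0..m units of the homogeneous good:
    the first marginal value is uniform on {0..127}, and each subsequent
    marginal value is uniform on {0, ..., previous marginal}, so marginal
    values are weakly decreasing. cumulative[k] = value of winning k goods."""
    marginal = rng.randint(0, 127)
    cumulative = [0, marginal]
    for _ in range(2, m + 1):
        marginal = rng.randint(0, marginal)
        cumulative.append(cumulative[-1] + marginal)
    return cumulative
```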
7.3 Regret
We evaluate the stability of a strategy profile by measuring regret, the maximal gain a player could achieve by deviating from the profile. Formally, let Γ = {n, S, u(·)} be a symmetric normal-form game with n players, strategy space S (the same for each player, since the game is symmetric), and payoff function u : S × S^{n−1} → ℝ. The expression u(s_i, s_{−i}) represents the payoff to playing strategy s_i in a profile where the other players play strategies s_{−i} ∈ S^{n−1}.
Definition 6 (Regret). The regret ε(s) of a strategy profile s = (s_1, . . . , s_n) is given by

ε(s) = max_i max_{s′_i ∈ S} (u(s′_i, s_{−i}) − u(s_i, s_{−i})).
A Nash equilibrium profile has zero regret, and more generally regret provides a measure of approximation to Nash equilibrium. Using this regret definition, profile s is an ε(s)-Nash equilibrium.
Regret is a property of profiles. Evaluation of a particular strategy is inherently
relative to a context of strategies played by other agents. Jordan et al. [2007] proposed
ranking strategies according to their performance when other agents are playing an
equilibrium.
Definition 7 (NE regret [Jordan, 2010]). Let s^NE be a Nash equilibrium of game Γ. The regret of strategy s_i ∈ S relative to s^NE, u(s^NE_i, s^NE_{−i}) − u(s_i, s^NE_{−i}), is an NE regret of s_i in Γ.
NE regret represents the loss experienced by an agent for deviating to a specified
strategy from a Nash equilibrium of a game. The rationale for this measure comes from
the judgment that all else equal, Nash equilibria provide a compelling strategic context
for evaluating a given strategy. For games with multiple NE, a given strategy may have
multiple NE-regret values.
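Both measures can be sketched for a small normal-form game (our illustration; `payoff(i, profile)` stands in for the empirical mean payoff to player i):

```python
def profile_regret(profile, strategies, payoff):
    """Regret eps(s) of a pure profile: the largest unilateral gain
    max_i max_{s_i'} u(s_i', s_-i) - u(s_i, s_-i). `payoff(i, profile)`
    returns player i's payoff in `profile` (a tuple of strategies)."""
    eps = 0.0
    for i, s_i in enumerate(profile):
        base = payoff(i, profile)
        for dev in strategies:
            alt = profile[:i] + (dev,) + profile[i + 1:]
            eps = max(eps, payoff(i, alt) - base)
    return eps

def ne_regret(i, s, ne_profile, payoff):
    """NE regret of strategy s: player i's loss for deviating to s
    from the Nash equilibrium profile."""
    alt = ne_profile[:i] + (s,) + ne_profile[i + 1:]
    return payoff(i, ne_profile) - payoff(i, alt)
```

For instance, in a two-player prisoner's dilemma the profile where both cooperate has positive regret, while the all-defect equilibrium has regret zero.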
7.4 Results
Table 3 summarizes the extent of simulation coverage of the five SPSB environments
investigated.3 The empirical games comprise between 600 and 14,000 profiles, over
29–34 strategies. Evaluated profiles constitute a small fraction (as little as 0.03%)
of the entire profile space over these strategies. Nevertheless, these are sufficient to
confirm symmetric Nash equilibria for each game. Although it is impossible to rule
out additional equilibria without exhaustively evaluating the profiles, we have either
confirmed or refuted every evaluated symmetric mixed profile (i.e., every subset of
strategies for which all profiles are evaluated). Since this includes every strategy in
conjunction with those most effective in other contexts, we doubt that there are any
other small-support symmetric equilibria, and expect few if any alternative symmetric
equilibria among the explored strategies.
Environment | # Strategies | # Profiles | % Profiles
------------|--------------|------------|-----------
U[6, 4]     | 34           | 1165       | 1.76
U[5, 5]     | 30           | 5219       | 1.88
U[5, 8]     | 29           | 9096       | 0.03
H[5, 3]     | 32           | 608        | 10.16
H[5, 5]     | 34           | 13004      | 2.59

Table 3: Strategies and profiles simulated for the environments addressed in our EGTA study.
As it happens, our process identified exactly one symmetric NE in each game; we
present these in Table 4. Of the 44 distinct strategies explored across environments,
only seven were supported in equilibrium in any environment. Variants of SCLocalBid, a strategy that explicitly optimizes with respect to self-confirming prices, predominate in equilibrium in four out of five environments. LocalBid and the BidEvalMix
strategies also explicitly optimize, but with respect to price predictions that are self-confirming for different strategies (see Appendix A). AverageMU64 HB is the sole
non-optimizing heuristic to appear in equilibria, as it performs remarkably well in the
complementary-valuation environments.
Whether a strategy is in equilibrium or not is a crude binary classification of merit.
We measure relative degrees of effectiveness by NE regret (Definition 7), as reported
in Table 5. The table lists the NE-regret values and NE-regret rank (in parentheses)
for 12 top strategies: all those ranked fourth or better in at least one environment. For
calibration of statistical significance, the table also displays a conservative estimate of
the standard error for each environment.4
3 The final version of this paper will include an online data supplement with full payoff data for the empirical games. The results we report here subsume those of a preliminary study [Yoon and Wellman, 2011].
That study set the groundwork for the current investigation by developing the infrastructure for simulation
and derivation of self-confirming price predictions, and tuning parameters (e.g., number of evaluating samples) for several of the strategies. However, the preliminary results reflected sparser coverage of relevant
profiles and a weaker overall set of strategy candidates.
4 The actual variances are profile-specific, so characterization of standard error is problematic in this
context.
Strategy                   | U[6,4] | U[5,5] | U[5,8] | H[5,3] | H[5,5]
---------------------------|--------|--------|--------|--------|-------
SCLocalBidSearch K16Z HB   | —      | —      | 0.082  | —      | —
SCLocalBidSearch K16 HB    | —      | —      | —      | 0.505  | 0.635
SCLocalBidSearchS5K6 HB    | —      | 0.910  | 0.918  | —      | —
LocalBidSearch K16 HB      | 0.145  | —      | —      | 0.409  | 0.272
AverageMU64 HB             | 0.855  | 0.090  | —      | —      | —
BidEvaluatorMix E8S32K8 HB | —      | —      | —      | 0.086  | —
BidXEvaluatorMix3 K16 HB   | —      | —      | —      | —      | 0.093

Table 4: For each environment, probability of each strategy in symmetric equilibrium, as identified through EGTA.
Our first observation on these results is about the robustness of the SCLocalBid strategy. In the one environment where it fails to participate in equilibrium, its NE regret is still quite low; thus we find it to be a strong all-around strategy. This situation contrasts
starkly with prior findings for bidding in simultaneous ascending auctions [Wellman
et al., 2008], where the best strategies for complementary (scheduling valuation) environments were awful in substitutes (homogeneous good) environments, and vice versa.
The fact that SCLocalBid performs so well aligns with our key theoretical finding, in
support of optimal PP bidders with self-confirming price predictions. As the LocalBid
search method is most effective in optimizing bids (see §8), it is consistent with our
theory to find that both examples of strategies in this class are leaders among explicit
optimizing strategies.
All the remaining top strategies are in the BidEval class (also explicit optimizers), except for AverageMU64. In contrast to the others, however, AverageMU64’s
quality is limited to one of the valuation distributions—the strategy performs poorly
in homogeneous-good environments. This observation is consistent with the results
reported by Boyan and Greenwald [2001], where an example environment with perfect
substitutes was contrived to demonstrate the shortcomings of marginal-utility-based
bidding. Given such examples, it is perhaps unsurprising that the heuristic strategies
based on marginal value have a difficult time competing with explicit optimizers. If
anything it is the observed success of AverageMU64 that is striking, but this outcome
is consistent with past experimental results in an environment that exhibits substantial
complementarity [Stone et al., 2003]. Still, compared to optimal PP bidders, AverageMU lacks robustness across valuation classes.
Finally, all the top strategies but one employ the highest-bid (HB) statistic in deriving self-confirming price distributions, as opposed to the actual transaction price. This, too, is aligned with what the theory would dictate. The lone exception in Table 5 is BidEvaluatorMixA, which performed impressively in one of the homogeneous-good environments but not so well in the rest.

Strategy                    | U[6,4]     | U[5,5]     | U[5,8]     | H[5,3]      | H[5,5]
----------------------------|------------|------------|------------|-------------|------------
SCLocalBidSearch K16Z HB    | 0.050 (5)  | 0.020 (4)  | 0 (1)      | 0.614 (5)   | 0.239 (6)
SCLocalBidSearch K16 HB     | 0.065 (8)  | 0.065 (6)  | 0.028 (3)  | 0 (1)       | 0 (1)
SCLocalBidSearchS5K6 HB     | 0.036 (4)  | 0 (1)      | 0 (1)      | 0.841 (8)   | 1.280 (10)
LocalBidSearch K16 HB       | 0 (1)      | 0.405 (13) | 0.241 (13) | 0 (1)       | 0 (1)
BidXEvaluatorMix3 K16 HB    | 0.083 (9)  | 0.408 (14) | 0.235 (12) | 1.111 (10)  | 0 (1)
BidEvaluatorMix E8S32K8 HB  | 0.282 (19) | 0.535 (20) | 0.306 (16) | 0 (1)       | 1.915 (14)
AverageMU64 HB              | 0 (1)      | 0 (1)      | 0.177 (7)  | 12.245 (23) | 10.889 (22)
AverageMU64Z HB             | 0.054 (6)  | 0.010 (3)  | 0.216 (9)  | —           | —
BidXEvaluatorMixA K16 HB    | 0.021 (3)  | 0.333 (10) | 0.175 (6)  | 2.246 (14)  | 5.851 (16)
SCBidXEvaluatorMixA K16 HB  | 0.163 (12) | 0.186 (7)  | 0.124 (4)  | 0.892 (9)   | 2.228 (5)
SCBidEvaluatorMixA K16 HB   | 0.260 (18) | 0.279 (9)  | 0.204 (8)  | 0.566 (4)   | 1.088 (9)
BidEvaluatorMixA            | 0.534 (29) | 0.517 (18) | 0.354 (19) | 3.429 (18)  | 0.036 (4)
approx. std. error          | 0.012      | 0.016      | 0.006      | 0.166       | 0.066

Table 5: NE regret (ranking) for top strategies across environments.
8 Optimization
Having conducted an empirical game-theoretic analysis, we turn our attention to the
decision-theoretic performance of our price-prediction strategies. That is, we investigate their degree of optimality—how close their expected profit comes to that of the
“best” response, as best as we can estimate it—assuming correct price predictions. We
additionally analyze how and when our different bidding strategies succeed and fail,
with the hope that this understanding leads to new insights about the auction environment structure and what makes a profitable bidding strategy.
8.1 Evaluation Framework
We conduct our evaluation in simulation environment U[5, 5]. We evaluate three price-prediction strategies: AverageMU64 HB, LocalBidSearch K16 HB, and BidXEvaluatorMix3 K16 HB (subsequently referred to as AverageMU, LocalBid, and BidEval, respectively). As input, all bidding strategies were given self-confirming prices generated by AverageMU, shown in Figure 3(a).
For each strategy, we sampled 5000 valuations and generated bids according to that
strategy. We also approximated a best response bid vector for each sampled valuation;
this approximate best response was calculated by running BruteForce with discretization factor 1, and then feeding the optimal discretized bids to LocalBid.
Expected profit for each bidding heuristic was compared to that of the calculated best response bid vector (hereafter referred to as OPT). Given independent price predictions and valuations nonincreasing with time, expected profit for a given bid vector is the difference between expected revenue and expected cost:

EP(b, v) = Σ_j γ^λ_j v_j − Σ_j ∫_0^{b_j} x f_j(x) dx.
The term γ^λ_j denotes the probability that the earliest satisfied deadline was j, defined recursively:

γ^n_j =
  γ^{n−1}_{j−1} F_j(b_j),                                if n = λ, j ≥ 1
  γ^{n−1}_{j−1} F_j(b_j) + γ^n_{j−1} (1 − F_j(b_j)),      if n < λ, j ≥ 1
  1,                                                     if n = 0, j = 0
  0,                                                     otherwise.
Note that neither LocalBid nor AverageMU is guaranteed to return bids with positive expected profit. We thus consider two simple extended strategies, LocalBid-post
and AverageMU-post, which perform an additional post-processing step of calculating expected profit as above for the proposed bid vector, and not bidding in any instance
in which it is negative.
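A discretized version of this computation can be sketched as follows (our illustration; it assumes integer bids, per-good pmfs over integer prices, a win on prices strictly below the bid, and λ ≥ 1):

```python
def expected_profit(b, v, lam, f):
    """Discretized sketch of EP(b, v): expected revenue from completing
    the task at the earliest satisfied deadline, minus expected
    second-price payments. f[j][q] = Pr(price of good j equals q)."""
    m = len(b)
    # F_j(b_j): probability of winning good j (price strictly below bid)
    win = [sum(fj[q] for q in range(min(int(b[j]), len(fj))))
           for j, fj in enumerate(f)]
    # gamma[n] = Pr(exactly n goods won so far, deadline not yet met),
    # n = 0..lam-1; winning from state lam-1 satisfies the deadline at j.
    gamma = [0.0] * lam
    gamma[0] = 1.0
    revenue = 0.0
    for j in range(m):
        revenue += gamma[lam - 1] * win[j] * v[j]
        new = [gamma[n] * (1.0 - win[j]) for n in range(lam)]
        for n in range(1, lam):
            new[n] += gamma[n - 1] * win[j]
        gamma = new
    # expected payment for each good: E[price; price < bid]
    cost = sum(q * fj[q] for j, fj in enumerate(f)
               for q in range(min(int(b[j]), len(fj))))
    return revenue - cost
```

Note that, as in the formula, the cost term accumulates over all goods, including goods won after the deadline is already satisfied; this is exactly the exposure that the post-processing step screens for.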
8.2 OPT Bids
Figures 3(b) and 3(c) present information about the bids computed by OPT for the
given price predictions. As seen in Figure 3(b), profitability decreases with higher λ
values, and not bidding at all is frequently optimal when λ = 5. Intermediate λ values
likely pose the greatest challenge for bidding heuristics, since the instances maintain a
high degree of complementarity, yet there is more profit to potentially lose by bidding
sub-optimally.
Figure 3(c) shows the maximum value an agent could receive for a given instance
(i.e., the value for winning the earliest λ items) versus the sum of the OPT bids for that
instance. It is often most profitable to bid in aggregate about λ times higher than the
maximum value. The high aggregate bids with respect to value demonstrate that OPT
bidders are willing to expose themselves to the risk of paying high prices in order to
avoid the chance of winning goods but still failing to meet their λ threshold.
8.3 Evaluation Results
Table 6 compares the expected profit of each strategy to OPT. The variation of OPT expected profit across strategies is due to variation in sampling the 5000 valuations; these
differences were not statistically significant (p = .452).5 Each strategy similarly has a
high standard deviation in expected profit, although the differences between strategies
in terms of absolute distance from OPT were found to be statistically significant in
5 Checking for a significant difference in at least one of the groups was done using the Kruskal-Wallis Rank Sum Test.
[Three panels: (a) CDFs of highest opponent bids (probability of winning vs. bid, goods 1-5); (b) reverse CDF of expected profits for the optimal strategy (probability of higher expected profit vs. expected profit, λ = 1-5); (c) maximum possible value vs. sum of optimal bids (λ = 1-5).]
Figure 3: (a) Cumulative distribution function over other-agent bids, based on the self-confirming prices generated by AverageMU. (b) Reverse cumulative distribution function over expected profit for OPT. (c) Maximum possible bidder value versus worst-case amount paid by OPT.
all cases (p < .0001) except for the difference between LocalBid and LocalBid-post
(p = .970).6
Strategy       | OPT profit mean (s.d.) | ALG profit mean (s.d.) | (OPT − ALG) profit mean (s.d.) | % OPT
---------------|------------------------|------------------------|--------------------------------|------
LocalBid-post  | 3.94 (7.15)            | 3.90 (7.16)            | 0.05 (0.35)                    | 98.77
LocalBid       | 3.94 (7.15)            | 3.89 (7.16)            | 0.05 (0.35)                    | 98.67
BidEval        | 4.01 (7.18)            | 3.79 (7.13)            | 0.22 (0.61)                    | 94.49
AverageMU-post | 4.34 (7.52)            | 3.73 (7.13)            | 0.61 (1.25)                    | 85.91
AverageMU      | 4.34 (7.52)            | 2.86 (7.71)            | 1.47 (2.53)                    | 66.10
Table 6: Performance of each bidding strategy (ALG) with respect to OPT.
Next, we look at the performance of each bidding heuristic in more detail: Figure 4
6 All p-values in these comparisons were calculated using Pairwise Wilcoxon Rank Sum Tests, adjusting
for multiple measures using the Bonferroni correction.
[Three scatter panels, each plotting optimal profit against a heuristic's profit for the sampled valuation instances, colored by λ = 1-5: (a) AverageMU profit vs. optimal profit; (b) BidEval profit vs. optimal profit; (c) LocalBid profit vs. optimal profit.]
Figure 4: Expected profit for bidding heuristics versus best response, for each of 5000
valuations. Any points above the y = x diagonal are suboptimal instances for the given
heuristic.
shows each strategy’s expected profit compared to that of OPT, for every sampled
valuation instance.
AverageMU. Note from Figure 4(a) that AverageMU frequently places bids with negative expected profit (51.62% of all instances). The majority of these negative
expected profit instances were for higher λ values, meaning greater complementarity
and risk of exposure (see Table 7). As shown in Table 6, not bidding in negative
expected profit instances as done by AverageMU-post improves performance to nearly
86% of OPT.
Strategy  | Negative Instances | λ=1 | λ=2    | λ=3    | λ=4    | λ=5
----------|--------------------|-----|--------|--------|--------|-------
AverageMU | 2581               | 0   | 0.1259 | 0.2534 | 0.3034 | 0.3173
LocalBid  | 17                 | 0   | 0.1765 | 0.2353 | 0.4118 | 0.1765
Table 7: Breakdown of λ values for bids with negative expected profit.
BidEval BidEval improves over AverageMU in terms of expected profit, but there are some valuation instances for which it places no bids even though a set of nonzero bids would have been optimal. BidEval's success or failure hinges on its bid-generation strategy. Because it cannot explore the space of candidate bid vectors exhaustively, it faces a tradeoff between diversifying, so that the space of bid vectors is suitably explored, and focusing on candidate bid vectors suggested by other reasonably profitable heuristics. The instances in which no bids were placed are cases where every bid vector generated by BidEval's inner strategy had non-positive expected profit.
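The selection step just described can be sketched as follows. Candidate generation (the inner strategy and its diversification) is abstracted into a list of candidate bid vectors; the names and the price-sample interface are hypothetical, not the paper's actual API.

```python
def bid_eval_select(candidates, value_fn, price_samples):
    """BidEval-style selection: score each candidate bid vector by
    Monte Carlo expected profit under the price prediction; if every
    candidate scores non-positive, place no bids at all."""
    def score(bids):
        total = 0.0
        for prices in price_samples:
            # Win a good when the bid meets the sampled (second) price.
            won = [j for j, b in enumerate(bids) if b >= prices[j]]
            total += value_fn(won) - sum(prices[j] for j in won)
        return total / len(price_samples)

    best = max(candidates, key=score)
    if score(best) <= 0:
        return [0.0] * len(candidates[0])  # no profitable candidate found
    return best
```

When every generated candidate is exposed (e.g., each bids on only one of two complementary goods), the sketch returns the no-bid vector, mirroring the no-bid instances observed for BidEval.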
BidEval performs worse on instances with lower optimal expected profit. This is likely due to the distribution of expected profits across λ values: as seen in Figure 3(b), lower optimal expected profits occur more frequently at higher λ values, and the greater complementarity in those problems makes finding profitable bids more difficult.
LocalBid The main source of suboptimality for LocalBid was the vertical region at x = 0 in Figure 4(c). This region represents lost opportunities: valuation instances where it would have been profitable to bid, but LocalBid did not do so. Inspecting instances in this vertical region shows that, in most cases, OPT placed bids on the first λ goods at points with high win probabilities. Since LocalBid considers changing only one bid at a time, any initial bids that left it insufficiently certain of receiving at least λ − 1 items caused it to gradually opt out of bidding: if winning λ goods remained unlikely after a bid change, the greedy improvement was to stop bidding on that item altogether.
The horizontal region at y = 0 in Figure 4(c) shows that LocalBid also placed bids with negative expected profit, though in fewer than 1% of all instances. In these cases, the search reached a local maximum because the initial bid vector satisfied the agent's λ threshold at negative expected profit, and the cost saved by dropping any single bid was outweighed by the revenue lost from no longer winning λ items.
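Both failure modes can be reproduced in a stripped-down version of the single-bid local search. This sketch is for illustration only, assuming hypothetical names, a coarse bid grid, and a fixed set of price samples; the actual strategy's bid ranges and sampling are described in §5.5.2.

```python
def local_bid(init_bids, value_fn, price_samples, bid_grid, max_passes=16):
    """Coordinate-ascent local search: sweep over the goods,
    re-optimizing one bid at a time (the grid includes 0, i.e. the
    option of not bidding) with the other bids held fixed, until a
    full pass changes nothing."""
    def score(bids):
        total = 0.0
        for prices in price_samples:
            won = [j for j, b in enumerate(bids) if b >= prices[j]]
            total += value_fn(won) - sum(prices[j] for j in won)
        return total / len(price_samples)

    bids = list(init_bids)
    for _ in range(max_passes):
        changed = False
        for j in range(len(bids)):
            best_b, best_s = bids[j], score(bids)
            for b in bid_grid:
                trial = bids[:j] + [b] + bids[j + 1:]
                if score(trial) > best_s:
                    best_b, best_s, changed = b, score(trial), True
            bids[j] = best_b
        if not changed:
            break
    return bids
```

With a λ = 2 valuation over two goods, starting from zero bids the search never raises any single bid, since winning one good alone loses money, reproducing the opt-out behavior behind the x = 0 region; starting from bids that already win both goods, it stays there.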
While these regions of suboptimality exist, LocalBid solves most valuation instances nearly optimally: 95.98% of the sampled instances yield expected profits within $0.01 of optimal. It is somewhat surprising that a simple local search is so successful, but a visualization of the expected-profit function provides some explanation. Figure 5 shows the expected profit of every possible bid vector for a simplified version of the scheduling problem with only two goods. The expected-profit surface is quite smooth, and often, as in this case, LocalBid converges to the unique global maximum regardless of its initial bids. Several domain features could explain the smoothness of the expected-profit function: (1) the particular values of the self-confirming price predictions, (2) the independence of price predictions across goods, (3) the payment scheme of the second-price auction, and (4) the structure of scheduling valuations. Better understanding how these factors affect the quality of LocalBid's search is left for future investigation.
9 Conclusion
Our theoretical and experimental findings point to two key ingredients for developing effective bidding strategies for simultaneous one-shot auctions with independent
private values.

[Figure 5: expected profit as a function of bid pairs (Bid 1, Bid 2); both axes range 0–50.]
Figure 5: Expected profit visualization for a simplified two-good version of the scheduling problem.
The first is an algorithm for computing approximately optimal bid vectors given
a probabilistic price prediction. We have found that a simple local search approach
(§5.5.2) achieves a high fraction of optimality (§8), and that this translates into superior
performance in strategic simulations (§7).
The second ingredient is a method for computing self-confirming price predictions for a given price-prediction bidding strategy and a specification of a simultaneous-auction environment. We have exhibited a simple iterative estimation procedure, which we have found effective at finding price distributions that are approximately marginally self-confirming across a range of strategies and environments.
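A marginal version of such a procedure might look like the following sketch, where predictions are per-good probability vectors over a discrete price grid, `simulate` stands in for running the auctions with all agents playing the price-prediction strategy under the current prediction, and the damping constant κ is an illustrative assumption rather than a value from our experiments.

```python
def damped_update(pred, observed, kappa=0.5):
    """Mix the current marginal price prediction (per good, over a
    discrete price grid) with the empirical price distribution
    observed when all agents bid against the current prediction."""
    return [[(1 - kappa) * p + kappa * q for p, q in zip(pg, og)]
            for pg, og in zip(pred, observed)]

def iterate_scpp(init_pred, simulate, num_iters=50, kappa=0.5, tol=1e-3):
    """Iterate toward an approximately self-confirming prediction:
    stop when the predicted marginals match, to within `tol`, the
    marginals the strategy itself induces."""
    pred = init_pred
    for _ in range(num_iters):
        observed = simulate(pred)
        gap = max(abs(p - q) for pg, og in zip(pred, observed)
                  for p, q in zip(pg, og))
        if gap < tol:
            break  # approximately marginally self-confirming
        pred = damped_update(pred, observed, kappa)
    return pred
```

When the induced price distribution is a fixed point of the map, the iteration converges to it; in general, damping trades convergence speed against oscillation between prediction and response.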
Our theoretical results say that if these ingredients can achieve their tasks perfectly,
we have a solution (i.e., a Bayes-Nash equilibrium) to the corresponding simultaneous
auction game. Approximations to the ideal in these ingredients yield approximate game
solutions.
Our computational experiments indicate that following this approach produces results as good as or better than any other general method proposed for bidding in simultaneous one-shot auctions. The evidence takes the form of a comprehensive empirical game-theoretic analysis, covering both complementary and substitutable valuation classes and a broad swath of heuristic strategies from the literature.
There is still considerable room for improvement in our proposed methods, as well as opportunity to subject our conclusions to further empirical scrutiny. More sophisticated stochastic search techniques may improve on our best bid-optimization algorithms and allow them to scale to larger environments with more complex valuations. Similarly, we do not consider our simple iterative method the last word on finding self-confirming price distributions; in particular, we expect that substantial improvement could be obtained by accounting for joint dependencies in price predictions. Finally, further testing against alternative proposals, or in alternative environments, would go some way toward bolstering or refuting our positive conclusions.
A Description of Strategies
Table 8 enumerates the strategies employed in our computational experiments and presents their defining features and parameters. For the LocalBid strategies, the column "BE samples" gives the number of iterations through the goods, and "BE candidates" the number of restarts.
Some additional features are encoded in the strategy names. Strategies ending in "HB" employ the highest other-agent bid for purposes of deriving self-confirming predictions; those without "HB" employ the actual good price. BidEval strategies starting with "BidX", as well as expected-value methods ending in "MUa" or "MUStara", evaluate bids exactly on a piecewise version of the price prediction, as opposed to sampling.
Acknowledgments
This research was supported in part by Grant CCF-0905139 from the U.S. National
Science Foundation. Victor Naroditskiy and Jiacui Li provided useful feedback on
the theoretical results. Dong Young Yoon developed the simulator and initiated the
empirical game analysis. Ben Cassell designed and implemented the testbed facility
used to conduct our extensive simulations. Ka Man Lok assisted in managing the
simulation process.
Table 8: Strategies and profiles simulated for the environments addressed in our EGTA study.

Strategy Name                 Class       PP based on       MU samples  BE cand gen  BE samples  BE candidates
AverageMU64                   AverageMU   self              64          —            —           —
AverageMU64 HB                AverageMU   self              64          —            —           —
AverageMU64Z HB               AverageMU   self              64          —            —           —
AverageMU64Zi HB              AverageMU   self              64          —            —           —
BidEvaluator HB               BidEval     TargetMU HB       8           TargetMU     32          4
BidEvaluatorAMU HB            BidEval     AverageMU64 HB    8           AverageMU    32          4
BidEvaluatorMix               BidEval     StraightMU        8           mix          32          4
BidEvaluatorMix E8S32K8 HB    BidEval     StraightMU HB     8           mix          32          8
BidEvaluatorMix HB            BidEval     StraightMU HB     8           mix          32          4
BidEvaluatorMix K16 HB        BidEval     StraightMU HB     8           mix          32          16
BidEvaluatorMixA              BidEval     StraightMUa       8           mix          32          4
BidEvaluatorMixA K16 HB       BidEval     StraightMUa HB    8           mix          32          16
BidEvaluatorSMU               BidEval     StraightMU        8           StraightMU   32          4
BidEvaluatorSMU E4S32K8       BidEval     StraightMU        4           StraightMU   32          8
BidEvaluatorSMU E8S32K8 HB    BidEval     StraightMU HB     8           StraightMU   32          8
BidEvaluatorSMU HB            BidEval     StraightMU HB     8           StraightMU   32          4
BidEvaluatorStar              BidEval     TargetMUStar      8           TargetMU*    32          4
BidEvaluatorStar HB           BidEval     TargetMUStar HB   8           TargetMU*    32          4
BidXEvaluatorMix K16 HB       BidEval     StraightMUa HB    8           mix          —           16
BidXEvaluatorMixA             BidEval     StraightMUa       8           mix          —           4
BidXEvaluatorMixA K16 HB      BidEval     StraightMUa HB    8           mix          —           16
BidXEvaluatorMix3 K16 HB      BidEval     AverageMU64 HB    3           mix          —           16
LocalBidSearch K16 HB         LocalBid    AverageMU64 HB    —           —            —           16
SCBidEvaluatorFix HB          BidEval     self              8           TargetMU     32          4
SCBidEvaluatorMix HB          BidEval     self              8           mix          32          4
SCBidEvaluatorMixA K16 HB     BidEval     self              8           mix          32          16
SCBidXEvaluatorMixA K16       BidEval     self              8           mix          —           16
SCBidXEvaluatorMixA K16 HB    BidEval     self              8           mix          —           16
SCBidXEvaluatorMix3 K16 HB    BidEval     self              3           mix          —           16
SCLocalBidSearch K16 HB       LocalBid    self              —           —            16          1
SCLocalBidSearch K16Z HB      LocalBid    self              —           —            16          1
SCLocalBidSearchS5K6 HB       LocalBid    self              —           —            5           6
StraightMUa                   StraightMU  self              —           —            —           —
StraightMUa HB                StraightMU  self              —           —            —           —
TargetMUa                     TargetMU    self              —           —            —           —
TargetMUa HB                  TargetMU    self              —           —            —           —
TargetMUStara                 TargetMU*   self              —           —            —           —
TargetMUStara HB              TargetMU*   self              —           —            —           —
TargetMUStaraZ HB             TargetMU*   self              —           —            —           —
BaselineBidding               ad hoc      —                 —           —            —           —
StraightMV                    StraightMV  self              —           —            —           —
TargetMV                      TargetMV    self              —           —            —           —
TargetMVStar                  TargetMV*   self              —           —            —           —
TargetMVStar HB               TargetMV*   self              —           —            —           —