Quantal Response Equilibria for Extensive Form Games

P1: ABL
Experimental Economics
KL581-01-Mckelvey
April 25, 1998
9:48
Experimental Economics, 1:9–41 (1998)
c 1998 Economic Science Association
°
Quantal Response Equilibria for Extensive
Form Games
RICHARD D. MCKELVEY AND THOMAS R. PALFREY
Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, CA 91125, USA
email: [email protected], [email protected]
Abstract
This article investigates the use of standard econometric models for quantal choice to study equilibria of extensive
form games. Players make choices based on a quantal-choice model and assume other players do so as well.
We define an agent quantal response equilibrium (AQRE), which applies QRE to the agent normal form of an
extensive form game and imposes a statistical version of sequential rationality. We also define a parametric
specification, called logit-AQRE, in which quantal-choice probabilities are given by logit response functions.
AQRE makes predictions that contradict the invariance principle in systematic ways. We show that these predictions
match up with some experimental findings by Schotter et al. (1994) about the play of games that differ only with
respect to inessential transformations of the extensive form. The logit-AQRE also implies a unique selection
from the set of sequential equilibria in generic extensive form games. We examine data from signaling game
experiments by Banks et al. (1994) and Brandts and Holt (1993). We find that the logit-AQRE selection applied to
these games succeeds in predicting patterns of behavior observed in these experiments, even when our prediction
conflicts with more standard equilibrium refinements, such as the intuitive criterion. We also reexamine data from
the McKelvey and Palfrey (1992) centipede experiment and find that the AQRE model can account for behavior
that had previously been explained in terms of altruistic behavior.
Keywords: game theory, experimental economics, quantal response equilibrium
JEL Classification: C72, C70, C92
1.
Introduction
In an earlier article (McKelvey and Palfrey, 1995) we introduced the idea of quantal response
equilibrium (QRE) and applied this concept to noncooperative games in normal form.
Quantal responses are smoothed-out best responses, in the sense that players are more
likely to choose better strategies than worse strategies but do not play a best response with
a probability one. The idea has its origins in statistical limited dependent variable models
such as discrete choice (in economics and psychology) and stimulus/dosage response and
bioassay (in biology and medical research) and is an essential ingredient of the work on
bounded rationality in games by Rosenthal (1989) and control-cost models of equilibrium
selection by van Damme (1987). The added complication in applying the concept to game
theory, in contrast to individual choice, is that the choice probabilities of the players have an
important interactive component, since they are simultaneously determined in equilibrium.
QRE is an internally consistent equilibrium model, in the sense that the quantal response functions are based on the equilibrium probability distribution of the opponents’
P1: ABL
Experimental Economics
10
KL581-01-Mckelvey
April 25, 1998
9:48
MCKELVEY AND PALFREY
strategy choices rather than simply on arbitrary beliefs the players could have about those
probabilities.1
An interesting property of QRE is that systematic deviations from Nash equilibrium
may be predicted even without introducing systematic features to the error structure. The
systematic effects of the statistical disturbances are (indirect) equilibrium phenomena that
arise because of the strategic response of the players to the noisy environment. This leads
to what can be viewed as a statistical generalization of Nash equilibrium.
In the earlier article we show that for normal form games there is a correspondence
between QRE and Bayesian equilibria of what Harsanyi (1973) calls randomly disturbed
games. Specifically, we show that a particular parametrization of quantal response equilibrium, which we call the logit equilibrium, corresponds to the Bayesian equilibria of the
game with a vector of independent log Weibull-distributed disturbances added to the payoffs of the players, with each individual’s disturbances being private information to that
player. That paper then establishes some properties of QRE in finite n-person games,
and reexamines a diverse collection of experimental bimatrix games that have unique
logit (and Nash) equilibria. Statistical analysis of that data was carried out using standard maximum-likelihood techniques in a structural model that is directly implied by the
quantal response model. We estimate the variance of the Weibull disturbances for that
data and compare the predicted frequencies of strategy choices with the actual frequencies. We find that the theory tracks the data fairly well and can account for some of
the observed systematic deviations from the predictions of the exact Nash equilibrium
model.
This article reconsiders Nash equilibrium in extensive form games from the statistical
point of view and extends the concept of logit equilibrium to the extensive form representation. A number of interesting theoretical and empirical findings emerge. On the theoretical
side, the new formulation predicts systematic violations of what is known in the foundational game theory literature as invariance. It has been argued vigorously by some (e.g.,
Kohlberg and Mertens, 1986) that “the reduced normal form captures all the relevant information for decision purposes,” leading to the traditional game theory view that any good
theoretical equilibrium concept for noncooperative games must satisfy invariance. In our
statistical theory of decision-making in games the compelling arguments for invariance fall
apart. In fact, it is easy to construct examples where we predict systematic differences
in the predicted patterns of play depending on which “equivalent” version of a game is
played. Moreover, the intuition behind these differences is very sensible and supported by
experimental data.
Besides discriminating between different versions of a game that have equivalent strategicform representations, our model also makes different predictions depending on whether the
game is played in its “agent” normal form or its more traditional normal form and makes
different predictions between the expected patterns of play in the normal form of the game
and the reduced normal form of the some game. Loosely speaking, the intuitive reason
our model predicts representation dependence is that the error structure introduces private
information in particular ways that depend on the exact details of how the game is actually
played. Because of the specific statistical predictions that are implied by the dependence,
many aspects of the quantal response equilibrium theory can be tested directly by applying
P1: ABL
Experimental Economics
KL581-01-Mckelvey
April 25, 1998
9:48
QUANTAL RESPONSE EQUILIBRIA FOR EXTENSIVE FORM GAMES
11
standard maximum likelihood techniques to data generated in controlled laboratory experiments. For computational reasons, we focus on a version of QRE which is similar in spirit
to the agent model of how an extensive form game is played, and so we call this the agent
quantal response equilibrium (AQRE).
In addition to predicting systematic violations of the invariance principle, the logit-AQRE
model implies a unique selection from the set of sequential equilibrium components. This
selection is defined by the principal branch of the logit-AQRE correspondence, as explained
in McKelvey and Palfrey (1995). This selection thus generates a unique prediction of a
strictly positive probability distribution over play paths for every value of the logit response
parameter, for almost any finite extensive form game. As demonstrated in our earlier article,
this selection may be inconsistent with trembling-hand perfection as applied to the reduced
normal form of the game, as well as all other refinements of which we are aware.
In signaling games, there is a well-known multiple equilibrium problem to which a great
deal of effort has been directed by modern game theorists, generating a vast literature on
the general problem of deductively based refinements. Several experiments have been
conducted to test whether the refinements developed to distinguish these equilibria are
useful in helping to predict which equilibria (if any) are more likely to be played. This is
a natural setting in which to apply AQRE, since this equilibrium concept does not have the
problem of trying to define behavior off the equilibrium path. This equilibrium concept
also makes multiple equilibrium predictions, but the logit-AQRE has a natural refinement
that generically selects a unique equilibrium in a way that is much different from traditional
(deductive) refinement arguments. In addition to predicting the patterns of play “on the
Nash equilibrium path,” this equilibrium concept also predicts patterns of play off the Nash
equilibrium path as well. Some of the anomalies (concerning more standard theories)
uncovered in these experiments have to do with behavior off the equilibrium path, and our
model successfully accounts for these.
In past centipede game experiments conducted by us (McKelvey and Palfrey, 1992)
frequent violations of Nash predicted play were observed. This fact and the fact that the
observed outcomes were substantial Pareto improvements over Nash play, were rationalized
by a model in which some of the players have altruistic preferences. Using arguments based
on reputation building (Kreps and Wilson, 1982) it is shown that frequent violations of Nash
behavior (similar to what was observed) can be accounted for even if altruistic players are
“rare”. While this explanation of the data does account pretty well for most of the salient
features of that particular data set, it is clearly ad hoc. The explanation involves the invention,
or assumption, of a “deviant type” who systematically violates Nash behavior in exactly
the direction observed in the data. A preferable explanation would be able to account for
this data without resorting to such “adhocery”. The AQRE provides a framework for doing
exactly that and also has the desirable feature of being applicable to arbitrary games without
necessitating the invention of systematically deviant types, tailored to the peculiarities of
specific games.
Accordingly, we also reexamine the centipede data2 in the context of the logit-AQRE.
We find that a two-parameter specification of AQRE can account for the same qualitative
violations of Nash behavior as the five-parameter altruism model, although the exact fit to
the data is not as tight. In another article, we have presented some new data (contradictory
P1: ABL
Experimental Economics
KL581-01-Mckelvey
12
April 25, 1998
9:48
MCKELVEY AND PALFREY
to the altruism explanation) from much different centipede-type experiments where all
outcomes are Pareto optimal (Fey et al. 1996). The logit-AQRE model also fits that data
better than other competing models.
The remainder of the article is divided into seven sections. The next three sections set
out the formal structure of the AQRE model for extensive form games. We also introduce
a parametric version based on the logit model, which we later use for estimation. The fifth
section presents some examples, based on the chain-store paradox stage game, demonstrating one kind of violations of the invariance principle that are implied by our model, and
looks at some recent experimental data reported in Schotter et al. (1994) that are consistent
with our predictions. The sixth section explains the refinement for signaling games based on
logit-AQRE and examines data from experimental signaling games conducted by Brandts
and Holt (1993) and by Banks et al. (1994). The seventh section reexamines the data from
the centipede game, and discusses similar findings by Zauner (1993) and Fey et al. (1996).
We conclude with a few brief remarks about possible extensions of the QRE model.
2.
Definitions and notation
Let N be a finite set of n + 1 players, one of whom (player 0) is designated chance, let X be
a finite set of outcomes, and let A be a finite set of actions. Let Γ = (0, Q) be a topological
tree with 0 the set of nodes and Q ⊆ 0 × 0 a binary relation on 0 representing branches,
and let Q ∗ be the transitive closure of Q. If v Qv 0 we say v immediately follows v 0 (or v 0
immediately precedes v). If v Q ∗ v 0 we say v follows v 0 (or v 0 precedes v). Write Q(v) and
Q ∗ (v) for the set of immediate followers and the set of followers of v ∈ 0, respectively. We
assume that Q is an asymmetric, acyclic binary relation on 0 such that every node v ∈ 0
follows at most one node and such that there is exactly one node, v ∗ ∈ 0, which follows
no node. The element v ∗ is called the root node. A node v ∈ 0 is terminal if Q(v) = ∅;
otherwise it is nonterminal. Let 0 t and 0 0 represent the set of terminal and nonterminal
nodes of 0, respectively.
We define an extensive form game G(N , X, A, Γ) by defining an outcome function ψ :
0 t → X , a partition function P : 0 0 → N × I (here I is the integers), an index function
φ : 0 → A that is 1 − 1 on Q(v) for each v ∈ 0 0 and that satisfies φ(Q(v)) = φ(Q(v 0 ))
whenever P(v) = P(v 0 ), and a probability function λ : 0 0 → IR that is a probability
function on Q(v) for each v ∈ 0 0 with P1 (v) = 0.
For any v ∈ 0 0 , define A(v) = φ(Q(v)) to be the set of actions available at v. For each
v ∈ 0 0 define h(v) = {v 0 ∈ 0 0 : P(v) = P(v 0 )} to be the information set containing v.
Let H = {h : h = h(v) for some v ∈ 0 0 } be the set of all information sets. If h = h(v)
write P(h) = P(v) and A(h) = A(v). Write Hi = {h ∈ H : P1 (h) = i} to be the set of
j
information sets for player i. If P(v) = (i, j), write h i = h(v). An action a ∈ A(h) is said
to precede a node v ∈ 0 if there is an x ∈ h and y ∈ Q(x) with v ∈ Q ∗ (y) and φ(y) = a.
An extensive form game G is said to have perfect recall if for every player i ∈ N , all
information sets h, h 0 ∈ Hi , every action a ∈ A(h), and all nodes x, y ∈ h 0 , a precedes x
iff a precedes y. In a game of perfect recall, a player remembers what he previously knew
and what he previously did. We assume throughout that G has perfect recall.
P1: ABL
Experimental Economics
KL581-01-Mckelvey
April 25, 1998
9:48
13
QUANTAL RESPONSE EQUILIBRIA FOR EXTENSIVE FORM GAMES
A behavior strategy for player i ∈ N is a function bi : Hi → M(A) satisfying bi (h) ∈
M(A(h)) for all h ∈ Hi . Here M(C) indicates the set of all probability measures over
j
the set C. Again, for h ∈ Hi we use the shorthand bi h = bi (h), and bi j = bi (h i ). For
each a ∈ A(h), bi ha = bi h (a) denotes the probability of
Q action a. Let Bi denote the
set of all behavior strategies for player i, and let B = i∈N Bi be the set of behavior
strategy n-tuples. If v ∈ h ∈ H0 is a chanceQnode, and v 0 ∈ Q(v), then we define b by
bh (v 0 ) = λv (v 0 ). We use the notation B−i = j6=i B j and write elements of B in the form
b = (bi , b−i ) ∈ Bi × B−i = B when we want to focus on a particular player i ∈ N . A
pure strategy is a behavior strategy that assigns a degenerate probability distribution to each
information set assigned to a player.
Let B ◦ denote the interior of B. Each behavior strategy n-tuple b ∈ B ◦ determines
a strictly positive realization probability ρ(v | b) for each node v ∈ 0, as follows. If
v = v ∗ is the root node, then define ρ(v | b) = 1. Otherwise, find v 0 with v ∈ Q(v 0 ).
Since v 0 ∈ 0 0 , v 0 ∈ h ∈ Hi forPsome i ∈ N , and define ρ(v | b) = ρ(v 0 )bh (v 0 ). For
∗
any h ∈ H , define ρ(h | b) =
v∈H ρ(v | b). Also, for any h ∈ H , define Q (h) =
∗
∪{Q (v) : v ∈ h}. Then we can define a conditional realization probability, ρ(v | h, b) on
Q ∗ (h) by ρ(v | h, b)ρ(h | b) = ρ(v | b). Note that for any b ∈ B ◦ , that ρ(v | b) defines a
strictly positive probability measure over the terminal nodes. Similarly, ρ(v | h, b) defines
a probability measure over 0 t ∩ Q ∗ (h).
For any i ∈ N and b ∈ B ◦ , define the payoff function u i : B ◦ → IR by
X
ρ(v | b)u i (v).
u i (b) =
v∈0 t
For any h ∈ H, i ∈ N and b ∈ B ◦ , define the conditional payoff function u i : B ◦ → IR
by
u i (b | h) =
X
ρ(v | h, b)u i (v).
v∈0 t ∩Q ∗ (h)
For any interior behavior strategy b ∈ B ◦ , any i and any information set h i ∈ Hi , we
j
j
j
denote u i (a, b | h i ) the conditional payoff to i of playing action a ∈ A(h i ) at h i with
probability one, and playing bi elsewhere.
Define a valuation function x : H → IR K , where for any h ∈ H, K = |A(h)| . We
j
write x ha = x(h)(a). Write X i j = IR | A(h i ) | to represent the space of possible expected
Q
j
payoffs for
set h i ∈ Hi , and X i = j X i j ,
Qactions that player i might adopt at information
and X = i∈N X . We define the function ū : B ◦ → X by
j
ū(b) = (ū 1 (b), . . . , ū n (b)),
where
j
ū i ja (b) = u i (a, b | h i ).
Thus ū(b) is a vector whose components are the conditional expected payoff to each
available action a of each information set j of each player i.
P1: ABL
Experimental Economics
KL581-01-Mckelvey
April 25, 1998
9:48
14
3.
MCKELVEY AND PALFREY
Agent quantal response equilibrium
j
j
For each i, h i , a ∈ A(h i ), let εi ja be a random variable, which represents i’s payoff
j
disturbance from choosing a at information set h i . For any b ∈ B ◦ , let
û i ja = ū i ja (b) + εi ja .
We assume that ε is private information to the players. So player i observes εi ja , but the
other players (or the econometrician) do not.
j
Let εi j be the vector of i’s errors at h i , let εi be the entire error vector across i’s information
sets, and let ε = (ε1 , . . . , εn ) be the vector of errors for all players. We assume ε is
an absolutely continuous random vector (with respect to Lebesgue measure), distributed
according to a joint distribution with density function f (ε). We also assume that the εi j are
statistically independent and that the expected value of εi ja exists for all i, j, a ∈ A(h ij ).
Any probability density function, f , satisfying the above assumptions is called admissible.
We assume that at each information set h ij , player i selects an action that maximizes û i ja .
For any action a ∈ A(h ij ) and any ū ∈ X define
©
¡ ¢ª
Ri ja (ū) = ² : ū i ja + ²i ja ≥ ū i j â + ²i j â ∀â ∈ A h ij
and
Z
σi ja (ū) =
Ri ja (ū)
f (²) d²
Definition 1. For any extensive form game G and error structure f (ε), a behavioral strategy
b∗ ∈ B is an agent quantal response equilibrium (AQRE) if it is a fixed point in B of σ ◦ ū.
j
It is a vector b∗ ∈ B ◦ such that for all i ∈ N , 1 ≤ j ≤ Ji , a ∈ A(h i ), bi∗ja = σi ja (ū(b∗ )).
That is, b∗ is a quantal response equilibrium if, when all players are choosing their best
∗
response at every information set h ij , taking b−i
j as given, this generates a behavior strategy
∗
that is exactly the same as b .
This definition assumes the “agent model” of play in an extensive form game, where
different information sets of a given player are assumed to be played out by different
agents, all of whom share the same payoff function. In our model, each agent i j simply
chooses the maximum of û i ja at information set h ij and acts independently of the other
agents of the same player. For this reason we refer to the equilibrium defined above as the
agent quantal response equilibrium (AQRE).
Theorem 1. For any admissible f , an AQRE exists.
Proof: Define φ = σ ◦ ū : B 7→ B. Note that φ is continuous. Extend φ to have domain
B by
n
o
φi j (b) = co lim φi j (bt ) : {bt } ⊆ B ◦ and lim bt = b .
t→∞
t→∞
P1: ABL
Experimental Economics
KL581-01-Mckelvey
April 25, 1998
9:48
QUANTAL RESPONSE EQUILIBRIA FOR EXTENSIVE FORM GAMES
15
Then B is compact, convex, and φ : B 7→→ B is convex valued and upper hemicontinuous. Hence, by the Kakutani fixed point theorem, φ has a fixed point, b∗ ∈ B. It is easy
to show that there can be no fixed point on the boundary. Hence, there is a b∗ ∈ B ◦ with
2
σ ◦ ū(b∗ ) = b∗ .
The above theorem guarantees existence for AQRE, and the definition guarantees that
the equilibrium behavioral strategies place positive probability on every available action in
every information set of every player. The equilibrium correspondence has the same “nice”
topological properties as the Nash equilibrium correspondence.
The above model can be generalized by dropping the independence assumption or by
specifying an observation function O(h ij ), where for each i, h ij , O(h ij ) specifies a signal i
receives about ε at information set h ij , rather than just assuming i observes only ū i j at h i j .
While these changes would require some additional notation, the basic ideas and properties
of the quantal response model would not change. Some models of O(h ij ) other than the one
we explore here might be interesting in their own right. We leave these issues as a subject
of future work.
4.
The logistic AQRE
In the remainder of the article, we focus mainly on a specialized version of the model,
where each εi ja is independently and identically distributed according to the type I extreme
−λεi ja
. This
value, (or log Weibull) distribution with cumulative density F(εi ja ) = e−e
π2
distribution has mean of γλ , where γ is the Euler constant (0.577), and variance of 6λ
2.
So λ is proportional to the precision of F. This distribution of the disturbances leads to
choice probabilities following a multinomial logit distribution (see e.g., McFadden, 1974).
In particular, at h ij , bij (in the AQRE) is given by
eλū i ja (b)
.
λū i ja 0 (b)
a 0 ∈A(h ij ) e
bij (a) = P
A logit-AQRE is any solution to this set of k equations (one equation for each action in
each information set of each agent).
For each λ ∈ [0, ∞), define the logit-AQRE correspondence b∗ : R+ 7→→ B ◦ as the set
of logit-AQRE behavioral strategies when the precision of F equals λ:
(
∗
◦
b (λ) = b ∈ B :
bij (a)
)
eλū i ja (b)
.
=P
λū i ja 0 (b)
a 0 ∈A(h ij ) e
We establish several properties of the logit-AQRE correspondence as a function of λ. We
already established from Theorem 1 of the previous section that b∗ (λ) 6= ∅, ∀λ. That is,
at least one logit-AQRE exists for every value of λ. A second property is that limit points
of b∗ (·) as λ goes to infinity are behavioral strategy Nash equilibria of the game. Both of
these properties follow from the same arguments as in McKelvey and Palfrey (1995).
P1: ABL
Experimental Economics
KL581-01-Mckelvey
April 25, 1998
9:48
16
MCKELVEY AND PALFREY
A third property is that limit points of b∗ (λ) are not only Nash equilibrium behavior
strategy profiles, but are sequential equilibrium strategy profiles:
Theorem 2. For every finite extensive form game, every limit point of a sequence of
logit-AQREs with λ going to infinity corresponds to the strategy of a sequential equilibrium
assessment of the game.
∗
be a limit point of b∗ (λ). Then there exists consistent beliefs µ∗∞ (asProof: Let b∞
signments of a probability distribution over the nodes at each information set that satisfy
∗
specifies
the Kreps-Wilson, 1982, consistency condition) such that, under those beliefs, b∞
∗
optimal behavior for every continuation game. That is b∞ is sequentially rational given µ∗∞
∗
and µ∗∞ is consistent with b∞
. This is proved below.
∞
∗
Take any sequence {λk }k=1 and {bk∗ }∞
k=1 such that limk→∞ λk = ∞ and limk→∞ bk =
∗
∗
b∞ and bk is a logit equilibrium for λk for each k. First, note that for all k and for all
information sets h, b∗ (a) > 0 ∀ a ∈ A(h). Consequently, every node is reached with
positive probability. Therefore, by Bayes rule, for each k, bk∗ uniquely defines a set of
beliefs µ∗k over the nodes of every information set. Moreover, since µ varies continuously
with b, there is a unique limit µ∗∞ . Therefore, µ∗∞ are consistent beliefs with respect to
∗
∗
. What remains to be shown is that b∞
(h) is optimal given µ∗∞ (h) for all h. (If so,
b∞
∗
∗
∗
then (b∞ , µ∞ ) is a sequential equilibrium, so that b∞
is a sequential equilibrium strategy.)
j
Suppose not. Then there exists ε > 0 such that for some h c there is some pair of actions
j
∗
0
∗
∗
) > ². But then there must
{a, a } ∈ A(h c ) such that bcj∞ (a) > 0 but ū i ja 0 (b∞ ) − ū i ja (b∞
exist K > 0 such that
ū i ja 0 (bk∗ ) − ū i ja (bk∗ ) ≥
²
2
∀ k ≥ K.
Therefore, for all k ≥ K ,
bi∗jk (a)
bi∗jk (a 0 )
²
≤ e−λk 2
∀ k ≥ K.
So
²
bi∗j∞ (a) ≡ lim bi∗jk (a) ≤ lim bi∗jk (a 0 )e−λk 2 ,
k→∞
k→∞
= 0,
which contradicts bi∗j∞ (a) > 0.
2
The next result establishes uniqueness of the AQRE for games of perfect information.
Theorem 3. For every finite extensive form game of perfect information, and all 0 ≤ λ ≤
∞, b∗ (λ) is a singleton.
P1: ABL
Experimental Economics
KL581-01-Mckelvey
April 25, 1998
9:48
QUANTAL RESPONSE EQUILIBRIA FOR EXTENSIVE FORM GAMES
17
Proof: This is proved by construction of b∗ (λ). Since the game has perfect information,
the predecessors of each terminal node all lie in different (singleton) information sets.
Consider one such information set, h, i(h) = i, A(h) = {a1 , . . . , ak } and for each a ∈
A(h), ∃ unique outcome νa ∈ 0 t such that ρ(νa | h, ba ) = 1, where ba is the behavior
strategy profile b, with bi (h) replaced by a, so u i (a | h) = u i (νa ). This uniquely determines
the action probabilities of logit-AQRE at any node immediately preceding a terminal node
by
eλu j (νa j )
bi∗ (a j h) = P K λu (ν ) .
i ai
k=1 e
ThenP
the expected utility to an agent who has reached h but has not chosen an action
at h is ak bi∗ (ak , h)u i (ν(ak , h)) = u i (b | h). Notice that u i (b | h) is therefore uniquely
determined. Next consider the immediate predecessors of the immediate predecessors
of 0 t . Repeating the above argument yields a unique b∗ (h 0 ) for all such h 0 such that
Q | (Q(V )) ⊆ 0 t ∀ v ∈ h 0 . Since the game is finite, we can iterate this argument until b∗
is defined (uniquely) at each information set.
2
A fifth property of the logit-AQRE correspondence is that for generic games there exists
a unique connected path in the graph of logit-AQRE correspondence that includes b0∗ (the
equilibrium when λ = 0). This unique connected path contains a solution b̄λ∗ for all values
of λ ≥ 0, and selects a unique sequential equilibrium component as λ → ∞. Any such
point is called a logit-AQRE solution of the game.
Theorem 4. For almost all finite extensive form games,
1. The logit AQRE correspondence, Q is a one-dimensional manifold.
2. For almost all λ there are an odd number of logit-AQRE.
3. There is a unique branch, B (the principal branch) of the logit AQRE selection connected
to the centroid of the game at λ = 0.
4. The principal branch of the logit AQRE correspondence selects a unique component
from the set of sequential equilibria of the game, in the sense that if {(λt , pt )}t and
{(λ0t , qt )}t are sequences in B with limt→∞ λt = limt→∞ λ0t = ∞, and if p∗ is a limit
point of { pt }t and q ∗ is a limit point of {qt }t , then p ∗ and q ∗ are in the same component
of the set of sequential equilibria of the game.
Proof: See the appendix.
2
Note that example 1 in McKelvey and Palfrey (1995) demonstrates that the converse of
Theorem 2 is not true. That is, there are sequential equilibria (and trembling-hand perfect
equilibria, as well) that are not approachable by a sequence of logit-AQREs. Thus, the
set of limit points logit-AQRE provides a refinement of sequential equilibrium, although
limit points are not necessarily trembling-hand perfect. Whenever there is a unique connected path selection of the logit-AQRE correspondence, this gives a refinement of the
AQRE-approachable sequential equilibria that is an even stronger refinement of sequential
P1: ABL
Experimental Economics
KL581-01-Mckelvey
18
April 25, 1998
9:48
MCKELVEY AND PALFREY
equilibrium. As we show below, it offers a quite different refinement from standard deductive refinements (such as the intuitive criterion),3 and predicts much better than the intuitive
refinement in signaling games that have been conducted as laboratory experiments.
Another nice feature of AQRE is that it makes predictions about the relative likelihood
of all different play paths. In contrast, other refinements tend to predict that only a subset
of the play paths can occur. Because all play paths can occur with positive probability in
AQRE, we don’t have to worry about the arbitrary specification of player beliefs “off the
equilibrium path”.
In the next section we demonstrate how this model of equilibrium behavior in extensive
form games contradicts the invariance principle that all strategically relevant information in
an extensive form game is captured in the normal form. We illustrate this in the context of a
particular game, the one-shot chain-store paradox. The way the game is actually played (as a
game where both players simultaneously choose strategies versus a two-stage game in which
the incumbent first observes whether an entrant has entered before choosing his strategy)
will systematically affect the predicted pattern of play using the logit-AQRE. Furthermore
the differences implied theoretically by logit-AQRE mirror experimental findings reported
recently by Schotter et al. (1994).
5.
Invariance
Consider the well-known chainstore paradox stage game,4 with the extensive form illustrated
in figure 1(a). The normal form representation of this game is given in Table 1.
The logit equilibrium (McKelvey and Palfrey, 1995) for the normal form of Table 1
is equivalent to the logit AQRE of the extensive form game of figure 1(b), which is a
strategically invariant transformation of the figure 1(a) game.
We wish to compare the quantal response equilibrium as it applies to the game in figure 1(a) to the quantal response equilibrium as it applies to the game in figure 1(b). To do
Figure 1. Chainstore paradox. Player 1 is the entrant, player 2 is the incumbent (E = enter, N = not enter,
F = fight, A = acquiesce).
P1: ABL
Experimental Economics
KL581-01-Mckelvey
April 25, 1998
9:48
QUANTAL RESPONSE EQUILIBRIA FOR EXTENSIVE FORM GAMES
Table 1.
19
Normal form of chainstore stage game.
1
2
F
A
N 2, 2 2, 2
E 0, 0 3, 1
this, we compare the logit-AQRE as a function of λ in the first case to the logit equilibrium
(McKelvey and Palfrey, 1995) in the second case.
The equilibrium conditions for the logit-AQRE of the extensive form game in figure 1(a)
are
1
,
1 + e(1−3q)λ
1
,
q=
1 + eλ
p=
(1)
(2)
where p is prob {1 moves N } and q is prob {2 moves F}.
The equilibrium conditions for the extensive form game in figure 1(b) are
e2λ
1
0 )λ =
0 )λ
2λ
3(1−q
(1−3q
e +e
1+e
0
e2 p λ
1
.
q 0 = (2 p0 +(1− p0 ))λ
0λ =
2
p
e
+e
1 + e(1− p0 )λ
p0 =
(3)
(4)
The qualitative features of the AQRE correspondence for ( p, q) and p 0 , q 0 as a function
of λ in (1) and (2) are shown in figures 2 and 3.5 The curves in these figures show how
the AQRE mixing probabilities vary with λ. The numbers and dots in these figures are
explained below.
Several points may be illustrated from this example. First, in the sequentially played
game there is a unique AQRE for every value of λ, since q does not depend on p. This is
generally the case in games of perfect information, as was established in Theorem 3.
This is not true in the game where player 2 chooses a strategy without having observed
player 1’s move. In this case there is an additional QRE component for large values of λ,
which converges to the subgame imperfect equilibrium p = 1, q = .5. Thus, in a sense, the
subgame imperfect entry deterrence equilibrium component (which is also trembling-hand
imperfect) is more plausible in the simultaneous play version of the chain-store game since
it is approachable by a sequence of AQRE’s.
There is another reason why the subgame imperfect outcome (no entry) is more likely to
be observed in the second version of the game. In the AQRE that converges to the perfect
equilibrium, p 0 and q 0 are both higher than p and q, respectively, for all values of λ. It is easy
to see why this is true directly from equations (1) to (4). For any value of p ∈ (0, 1) and for
any value of λ > 0, q < q 0 , since (1 − p 0 ) < 1. Since q < q 0 , it follows that p < p 0 . The
intuition is simple. In the strategic version of the game, the probability that F is suboptimal
P1: ABL
Experimental Economics
KL581-01-Mckelvey
April 25, 1998
20
9:48
MCKELVEY AND PALFREY
Figure 2.
SWW (1994) sequential version.
Figure 3.
SWW (1994) strategy version.
P1: ABL
Experimental Economics
KL581-01-Mckelvey
April 25, 1998
9:48
QUANTAL RESPONSE EQUILIBRIA FOR EXTENSIVE FORM GAMES
Figure 4.
21
A 2 × 2 game of perfect information and its strategy version.
is less than 1, since it is only suboptimal if player 1 chooses E. This reduces the difference
in expected utility between F and A (relative to the sequential version), leading to a higher
probability that player 2 will choose F. This in turn leads to a higher probability player 1
will choose N . This gives the following proposition.
Proposition 1. For all λ > 0, q < q 0 and p < p 0 .
This can be generalized to the class of games illustrated in figure 4.
Suppose that D > 0. In the strategy version of the game, player 2’s (dominated) choice
of F is less costly than in the sequential version of the game. Therefore, q < q 0 < 12 when
D > 0. Similarly, q > q 0 > 12 if D < 0. In general, |q − 12 | > |q 0 − 12 | , which implies that
deviations from the subgame perfect Nash equilibrium prediction should always be greater
when the game is played in the strategy version. How this affects the moves by player 1
is more subtle and depends on whether C and D have the same or different signs. E is a
relatively more attractive option in the strategy version if and only if C > 0 and q 0 > q or
C < 0 and q 0 < q. This requires CD < 0. Similarly E is a relatively less attractive option
if and only if CD > 0. Therefore, p > p0 ⇔ CD < 0.
We turn next to an examination of data from experiments using this type of game. Schotter,
Weigelt and Wilson (1994) (SWW) conducted an experiment with A = 4, B = 3, C = 6,
and D = 2, where some subjects played the figure 4(a) game, other subjects played the
figure 4(b) game, and still other subjects played the game in normal form. Pooling the results
for the figure 4(b) games and the normal form games, they found that entry was deterred
more often with the strategy version of the game than with the sequential version of the game.
Furthermore, the second mover chose Fight more often in the strategy version of the game.6
To fit the AQRE to these data, we use a standard maximum-likelihood approach, as
described in detail in McKelvey and Palfrey (1995). Given an extensive form game, for
any parameter, λ, of the AQRE model, we obtain predicted frequencies of moves at each
information set. Given a data set from a particular experimental game, our estimate, λ̂, is
the value of λ that maximizes the likelihood of that data set.
In order to gauge the success of AQRE in fitting the data, we compare these estimates to
an alternative model, which we call the noisy Nash model (NNM). This alternative model
P1: ABL
Experimental Economics
KL581-01-Mckelvey
April 25, 1998
9:48
22
Table 2.
MCKELVEY AND PALFREY
SWW experimental results (36 subjects).
Sequential version
n
fi
QRE
NNM
N
7
0.0814
0.0674
0.0545
E
79
0.9186
0.9326
0.9455
F
2
0.0253
0.0455
0.0545
A
77
0.9747
0.9545
0.9455
λ|γ
1.5216
0.8910
λlo | γlo
1.2721
0.8080
λhi | γhi
1.8382
−L∗
Strategic version
34.159
0.9460
34.928
N
46
0.5750
0.4706
0.3875
E
34
0.4250
0.5294
0.6125
F
16
0.2000
0.3077
0.3875
A
64
0.8000
0.6923
0.6125
λ|γ
0.7658
0.2250
λlo | γlo
0.5515
0.0720
λhi | γhi
−L∗
0.8891
98.69
0.3710
108.819
(which includes Nash equilibrium and random play as two special cases) posits that at
any information set, players adopt the Nash equilibrium strategy with probability γ , and
with probability 1 − γ they randomize uniformly over all available actions. This differs
from the QRE by not imposing an equilibrium restriction. That is, NNM assumes players
may make errors relative to the Nash predictions, but they do not take into account that
other players also make errors. Thus, it is more in the spirit of the bounded rationality,
nonequilibrium approach to explaining experimental data, as opposed to the equilibrium
approach embodied in QRE. In NNM, the predictions for γ = 1 corresponds to λ = ∞,
and γ = 0 corresponds to λ = 0. The models generally differ (sometimes dramatically)
for intermediate values of γ and λ. We report maximum likelihood estimates for both γ
and λ and compare them.
The logit-AQRE correspondence for both the sequential and strategy versions of the
game7 are given in figures 2 and 3, respectively. Superimposed on those graphs are the
observed data points plotted at the maximum likelihood values of the logit response parameter. Table 2 gives the predicted and actual data aggregated across all of their experiments.
In this table (as well as subsequent tables), λ represents the maximum likelihood estimate
of λ, (λlo , λhi ) is a 95 percent confidence interval for λ, and f i denotes the empirical relative frequency of each strategy choice. The estimation here, as well as the estimations
presented later in this article are performed under the restriction that all subjects (both row
and column players) have the same λ. There are multiple points marked on each graph.
Each numbered point represents a different experience level of the subjects, with higher
numbers indicating greater experience. The black circle represents the pooled estimate,
P1: ABL
Experimental Economics
KL581-01-Mckelvey
April 25, 1998
9:48
QUANTAL RESPONSE EQUILIBRIA FOR EXTENSIVE FORM GAMES
23
over all experience levels. As is evident from the graphs, experience seems to have little
effect in the data.8 One can readily see that the logit-AQRE fits quite well in both versions
of the game, successfully predicting more entry deterrence in the sequential version of the
game. The fit is not perfect, as it underpredicts the frequency of (N , A) outcomes somewhat
in the strategy version of the game. This underprediction partly results from our having
included data from the matrix version of the game as well as the extensive form version.
The NNM estimates exhibit similar underprediction, and generate poorer fits in both games.
6.
Equilibrium selection
As a second application of AQRE, we look at simple sender-receiver signaling games.
Several experimental studies have been conducted using sender-receiver games, notably by
Banks et al. (1994), Brandts and Holt (1992, 1993), and Partow and Schotter (1993). This
is an ideal class of games to study with AQRE.
First, the games typically have multiple sequential equilibria, and logit-AQRE generically
selects a unique one as the limit of the unique connected path in the logit-AQRE graph.
The earlier figures showing the logit-AQRE correspondence for the SWW experiments
illustrates this selection. Notice that in the strategy version of their game there are multiple
logit-AQRE for sufficiently high values of λ. However, there is a unique connected path
starting at the unique equilibrium at λ = 0. The unique connected path selection is the
darker curve in that figure. This is also true for extensive form games as established in
Theorem 4. Moreover, in signaling games the unique connected path need not correspond
to traditional refinements of equilibrium. For reasons of space, we do not reexamine the
results of all the experimental signaling games that have been conducted to date. We focus
on a subset of the experimental games reported by Banks et al. (1994) (BCP) and Brandts
and Holt (1993) (BH).
Banks, Camerer, and Porter (1994) (BCP)
The BCP experiment consisted of a series of two-person signaling games, with two sender
types, three messages, and three responses. Each game was designed to have two Nash
equilibria, one of which was always further up the chain of deductive refinements than the
other. The experiments were intended to give data on the conditions under which subjects
play the more refined equilibrium. We analyze games 2, 3, and 4 of BCP. These games are
given in Table 3. Each of these games has two Nash equilibrium outcomes, with one being
a more refined equilibrium than the other.9
The logit-AQRE graphs are displayed in figures 5 to 7 for these three games. Each triangle
in the figures represents the simplex of mixing probabilities at some information set of the
game. (Recall that players have three possible actions at each information set in these
games.) The centroid of each triangle corresponds to λ = 0. As λ increases without bound,
the Logit solution probabilities converge to one of the vertices, corresponding to one of the
sequential equilibrium pure strategy profiles, as indicated by the solid lines. The aggregate
data from the experiment are represented by a black dot. Observe that convergence from
the centroid to the sequential equilibrium selected by the Logit solution is frequently
P1: ABL
Experimental Economics
Table 3.
m1
KL581-01-Mckelvey
April 25, 1998
9:48
BCP (1994) signaling games.
a1
a2
BCP # 2 (Nash versus
a3
m2
a1
a2
a3
m3
a1
a2
a3
sequential)∗
t1
1, 2
2, 2
0, 3
t1
1, 2
1, 1
2, 1
t1
3, 1
0, 0
2, 1
t2
2, 2
1, 4
3, 2
t2
2, 2
0, 4
3, 1
t2
2, 2
0, 0
2, 1
BCP # 3 (sequential versus
intuitive)†
t1
0, 3
2, 2
2, 1
t1
1, 2
2, 1
3, 0
t1
1, 6
4, 1
2, 0
t2
1, 0
3, 2
2, 1
t2
0, 1
3, 1
2, 6
t2
0, 0
4, 1
0, 6
BCP # 4 (intuitive versus divine)‡
t1
4, 0
0, 3
0, 4
t1
2, 0
0, 3
3, 2
t1
2, 3
1, 0
1, 2
t2
3, 4
3, 3
1, 0
t2
0, 3
0, 0
2, 2
t2
4, 3
0, 4
3, 0
∗ Nash:
(m 1 , a2 , a2 , a2 ) Seq: (m 3 , a2 , a2 , a1 ).
† Sequential: (m , a , a , a ) Intuitive: (m , a , a , a ).
2 1 3 1
1 2 1 1
‡ Intuitive: (m , a , a , a ) Divine: (m , a , a , a ).
2 3 3 2
3 2 2 1
Figure 5.
BCP (1994) Game 2. Sequential versus Nash.
P1: ABL
Experimental Economics
KL581-01-Mckelvey
April 25, 1998
9:48
25
QUANTAL RESPONSE EQUILIBRIA FOR EXTENSIVE FORM GAMES
Table 4.
BCP experimental results.
BCP #2
t1
t2
m1
m2
m3
n
fi
m1
7
.184
m2
0
.000
m3
31
m1
11
m2
m3
QRE
BCP #3
NNM
n
fi
QRE
.103
.092
20
.500
.497
.021
.092
14
.350
.339
.816
.875
.817
6
.150
.324
.176
.092
45
.900
0
.000
.026
.092
4
23
.677
.798
.817
1
BCP #4
NNM
n
fi
QRE
NNM
.817
2
.067
.274
.217
.092
9
.300
.221
.217
.165
.092
19
.633
.504
.567
.927
.817
21
.700
.422
.217
.080
.057
.092
0
.000
.032
.217
.020
.016
.092
9
.300
.546
.567
a1
0
.000
.049
.092
4
.062
.153
.092
9
.391
.298
.217
a2
18
1.000
.837
.817
61
.939
.704
.817
13
.565
.593
.567
a3
0
.000
.113
.092
0
.000
.142
.092
1
.044
.109
.217
a1
0
.184
.092
14
.778
.686
.817
0
.000
.044
.217
a2
0
.797
.817
0
.000
.174
.092
4
.444
.647
.567
a3
0
.019
.092
4
.222
.140
.092
5
.556
.308
.217
a1
53
.841
.726
.817
7
1.000
.999
.817
23
.821
.704
.567
a2
0
.000
.026
.092
0
.000
.001
.092
5
.179
.235
.217
a3
10
.159
0
.000
0
.000
.248
.092
.000
.092
.062
.217
λ|γ
2.249
.725
1.598
.725
1.193
.350
λlo | γlo
1.901
.627
1.441
.634
.099
.216
λhi | γhi
2.715
.809
1.864
.802
−L∗
83.390
92.224
101.050
108.628
1.470
95.65
.479
118.151
nonmonotonic (for example, Type 1 senders and Message 2 receivers in Game 3). This is
in marked contrast to NNM, since the NNM correspondence is simply defined by a straight
line in each graph connecting the centroid to the appropriate vertex.
The estimation of λ and γ is reported in Table 4 for each game. We see that in each of
the three cases, the QRE selects the more refined equilibrium, which corresponds to what
was observed in the data.
Moreover, the QRE predictions pick up quite well which strategies are relatively more
overplayed (or underplayed) compared to the refined equilibrium predictions. In BCP
#2, the predicted equilibrium has both types sending message 3, with the receiver choosing
action 1 in response to message 3, and choosing action 2 in response to messages 1 and 2. For
both types of senders, AQRE predicts that nearly all message errors should involve sending
message 1. Indeed, in all eighteen deviations from this equilibrium, senders use message 1
and never use message 2.10 Response errors to the equilibrium message are predicted by
AQRE to be more than ten times as likely to be action 3 than to be action 2, and indeed
action 3 errors are observed in all ten of the responders’ deviations from equilibrium. Also,
QRE predicts that type 1 senders are less likely to deviate from the sequential equilibrium
than type 2 senders, which was also observed in the data.
P1: ABL
Experimental Economics
KL581-01-Mckelvey
April 25, 1998
26
Figure 6.
9:48
MCKELVEY AND PALFREY
BCP (1994) Game 3. Intuitive versus sequential.
In the NNM, all deviations from the Nash equilibrium are equally likely, so that model
cannot account for the observation that certain errors are systematically more likely than
other errors. Consequently, NNM performs much more poorly than AQRE in these signaling games, where senders have three available messages and receivers have three available
actions.
In BCP #3 the predicted (that is, unique intuitive) equilibrium has both types sending
message 1, with the receiver choosing action 2 in response to message 1, and choosing
action 1 in response to both nonequilibrium messages. Of the thirty three deviations from
the intuitive equilibrium predicted strategies, most were deviations by type 1 senders, who
are predicted by the AQRE model to send the intuitive message approximately half the time.
(Indeed, type 1 senders were observed to send the intuitive message exactly half the time!)
In BCP #4, the predicted (that is, divine) equilibrium has both agents sending message 3,
with the receiver choosing action 1 in response to the equilibrium message 3, and action 2
otherwise. In the quantal response equilibrium, both sender types are predicted to send the
divine equilibrium message only about half the time. Aggregating both types of senders,
P1: ABL
Experimental Economics
KL581-01-Mckelvey
April 25, 1998
9:48
QUANTAL RESPONSE EQUILIBRIA FOR EXTENSIVE FORM GAMES
Figure 7.
27
BCP (1994) Game 4. Divine versus intuitive.
this is what we observe in the data, where message 3 is sent twenty eight times out of sixty
chances; message 1 is predicted to be sent slightly more than a third of the time, and we
observe it sent twenty three out of sixty times. The frequency of actions in response to each
message is predicted even more accurately (see Table 4).
Brandts and Holt (1993) (BH)
As a follow-up to the BCP experiment Brandts and Holt (1993) proposed two alternatives
to game 3 (which differentiated the intuitive equilibrium from a nonintuitive sequential
equilibrium), which they ran using similar11 procedures. The games, which differ slightly
from BCP because the sender has only two available messages, are given in Table 5. The
two games each have a sequential and an intuitive equilibrium at the same strategies. In each
of the BH games, four trials were conducted, each with twelve subjects, matched in pairs
P1: ABL
Experimental Economics
KL581-01-Mckelvey
April 25, 1998
9:48
28
MCKELVEY AND PALFREY
Table 5.
BH (1993) signaling games.
m=I
C
D
m=S
E
C
D
E
BH # 3 (sequential versus intuitive)
A
45, 30
B
30, 30
15, 0
0, 45
30, 15
A
30, 90
0, 15
45, 15
30, 15
B
45, 0
15, 30
30, 15
BH # 4 (sequential versus intuitive)
A
30, 30
0, 0
50, 35
A
45, 90
15, 15
100, 30
B
30, 30
30, 45
30, 0
B
45, 0
0, 30
0, 15
Note: sequential: (S, D, C), intuitive: (I, C, D).
Table 6.
Data and estimates for BH experiments, periods 4 to 6.
BH #3
t=A
t=B
m=I
m=S
n
fi
BH #4
QRE
NNM
n
fi
QRE
NNM
m=I
32
.970
.946
.838
0
.000
.026
.130
m=S
1
.030
.054
.162
38
1.000
.974
.871
m=I
18
.462
.601
.838
15
.441
.280
.130
m=S
21
.539
.399
.162
19
.559
.720
.871
a=C
48
.960
.686
.783
1
.067
.252
.086
a=D
2
.040
.178
.108
13
.867
.729
.827
a=E
0
.000
.136
.108
1
.067
.019
.086
a=C
4
.182
.109
.108
52
.912
.888
.827
a=D
18
.818
.719
.783
2
.035
.050
.086
a=E
0
.000
3
.053
.173
.108
.062
.086
λ(σ 2 )
.108
.675
.095
.741
λlo
.097
.559
.077
.633
λhi
.125
.773
.116
.830
69.529
78.295
56.267
68.034
−L∗
over a sequence of six periods. Each subject participated in two such sequences, once as a
row player, and once as a column player. For details see Brandts and Holt (1993: 286–287).
The point of the BH experiment was to attempt to construct a game in which play might
conceivably converge to the less refined equilibrium. Brandts and Holt proposed a descriptive story of how a particular dynamic of play in early rounds could potentially lead to the
unintuitive equilibrium, if the payoffs in BCP #3 were changed slightly. Their learning dynamic is related to the rationale for the logit AQRE selection, since the rough idea behind our
selection is that if subjects begin an experiment at a QRE with relatively high error rates (that
is, a low value of λ) and “follow” the equilibrium selection as λ increases with experience,
then they should converge to the equilibrium component implied by our selection.
P1: ABL
Experimental Economics
KL581-01-Mckelvey
April 25, 1998
9:48
QUANTAL RESPONSE EQUILIBRIA FOR EXTENSIVE FORM GAMES
Figure 8.
29
BH (1993) Game 3. Sequential versus intuitive.
The data along with the QRE and NNM estimates are given in Table 6 and figures 8
and 9. Table 6 presents the results of the maximum likelihood estimation of the AQRE
and NNM models, and the observed frequencies for the later periods of their experiment
(periods 4 to 6). Notice that again AQRE provides a better fit than NNM. In figures 8
and 9, we present graphically the results of estimation with a breakdown by period. These
figures are similar to those of the BCP experiments for the receiver. However, the sender in
the BH experiments just has two messages. Therefore, we graph the sender data in a unit
square, with type A’s choice frequencies on the horizontal axis and type B’s on the vertical
axis. In those graphs higher numbers correspond to higher levels of experience (in the
same way as in BH (1993: 442, figure 5). The curves in the graphs represent the predicted
move frequencies for the logit-AQRE selection, as a function of λ. As in the BCP figures,
P1: ABL
Experimental Economics
KL581-01-Mckelvey
April 25, 1998
30
9:48
MCKELVEY AND PALFREY
Figure 9.
BH (1993) Game 4. Sequential versus intuitive.
the QRE correspondence starts at the centroid (the center of the square and the center of the
simplices) for low values of λ, and converges to the logit solution (the intuitive equilibrium
for game 3 and the nonintuitive sequential equilibrium for game 4) as λ goes to infinity.
While the estimated values of λ are not strictly increasing in experience levels, there is an
overall trend in the right direction, as is indicated by comparing the first three periods (white
circle) to the last three periods (black circle).
7.
Centipede game experiments
McKelvey and Palfrey (1992) studied experimentally two versions of the centipede game,
displayed in figures 10 and 11, respectively. In that article they proposed a theory of play
P1: ABL
Experimental Economics
KL581-01-Mckelvey
April 25, 1998
9:48
QUANTAL RESPONSE EQUILIBRIA FOR EXTENSIVE FORM GAMES
Figure 10.
The four-move centipede game (McKelvey-Palfrey, 1992).
Figure 11.
The six-move centipede game (McKelvey-Palfrey, 1992).
31
based on incomplete information, altruism, and heterogeneous prior beliefs, and fit a fiveparameter model to the data. The model captured the essential qualitative features of the
data and fit the exact move frequencies of the data fairly well.
In this section we reexamine the centipede data using the logit AQRE model. We describe
the details of the model as it applies to the extensive form-model versions of figure 10. The
computations for the six-move version are similar. Denote player 1’s behavior strategies as
( p1 , p2 ) and player 2’s behavior strategies as (q1 , q2 ), where
p1 = prob{1 chooses T at first move}
p2 = prob{1 chooses T at third move (if reached)}
q1 = prob{2 chooses T at second move (if reached)}
q2 = prob{2 chooses T at fourth move (if reached)}.
At the first move, player 1 estimates the payoff of T and P by
U11 (T ) = .4 + ε1T
U11 (P) = .2q1 + (1 − q1 )[1.6 p2 + (1 − p2 )[.8q2 + 6.4(1 − q2 )]] + ε1P ,
where ε1T and ε1P are independent random variables with a Weibull distribution with
parameter λ. The logit formula then implies
p1 =
1
.
1 + eλ[.2q1 +(1−q1 )[1.6 p2 +(1+ p2 )[.8q2 +6.4(1−q2 )]]−.4]
P1: ABL
Experimental Economics
KL581-01-Mckelvey
April 25, 1998
9:48
32
MCKELVEY AND PALFREY
Similarly, we obtain expressions for p2 , q1 and q2 , given below,
1
1 + eλ[.4 p2 +(1− p2 )[3.2q2 +1.6(1−q2 )]−.8]
1
p2 =
1 + eλ[.8q2 +6.4(1−q2 )−1.6]
1
q2 =
.
λ[1.6−3.2]
1+e
q1 =
This system of four equations and four unknowns is solved recursively (q2 , then p2 ,
then q1 , then p1 ) and has a unique solution. There are several interesting features of
the equilibrium correspondence. First, for most values of λ, p1 < q1 < p2 < q2 . This is
consistent with the experimental findings, where p1 is .07, q1 is .38, p2 is .65, and q2 is
.75. Second, for very low values of λ, p1 , q1 , and p2 are all decreasing in λ. That is, the
equilibrium does not converge monotonically toward the Nash equilibrium as a function of
λ. We estimate the best fitting value of λ and γ for this data set, using standard maximum
likelihood techniques.12 The results are reported in Tables 7 and 8. The NNM model fails
miserably, and in fact does no better than a completely random model.
Our estimates of λ findings mirror results reported in Zauner (1993), using a related
model. That article presents a structural model of the error process by adding independent
Normally distributed payoff disturbances and estimates the variance of the disturbances (see
next to last column of Tables 7 and 8). That article also adopts an “agent model” approach
in that a player does not observe all disturbances to payoffs at the start of the game, but
instead only the disturbance to his own current “take” payoff is observed. That model fits
better than ours, as measured by the likelihood score, although the qualitative findings are
very similar, as both models share two basic properties: they capture the feature in the data
that take probabilities increase as the end of the game is approached; and they overestimate
the take probabilities on the first and last moves. But as Zauner (1993) points out, neither
model includes a heterogeneity parameter, which produced a dramatic improvement in
Table 7.
Four-move centipede game.
n
fi
AQRE
NNM
Zauner
model
2 parameter
model
Take
T
281
.071
.246
.500
.214
.083
Probs
T/P
261
.383
.315
.500
.301
.320
T/PP
161
.644
.683
.500
.677
.718
T/PPP
57
.754
.938
.500
.939
.637
λ | (σ 2 )
1.70
0.00
—
1.655
λlo
1.54
0.00
—
—
λhi
1.87
0.01
—
—
q
—
—
—
−L∗
424.93
527.02
418.3
.96
402.5
P1: ABL
Experimental Economics
KL581-01-Mckelvey
April 25, 1998
9:48
33
QUANTAL RESPONSE EQUILIBRIA FOR EXTENSIVE FORM GAMES
Table 8.
Six-move centipede game.
n
fi
AQRE
NNM
Zauner
model
2 parameter
model
Take
T
281
.007
.236
.500
.186
.029
Probs
T/P
279
.065
.252
.500
.213
.056
.085
T/PP
261
.215
.240
.500
.215
T/PPP
205
.527
.478
.500
.436
.504
T/PPPP
97
.732
.842
.500
.814
.772
T/PPPPP
26
.846
.980
.500
.980
.409
λ|γ
.61
0.00
—
λlo | γlo
.55
0.00
—
—
.64
—
λhi | γhi
0.04
—
q
—
—
—
−L∗
534.03
797.02
506.4
.67
.97
454.3
fit in the models estimated by McKelvey and Palfrey (1992). One might hope that the
inclusion of a heterogeneity parameter would also significantly improve the fits of these two
models.
There are other ways these models might be improved on. As discussed above, these
models assume that a player cannot predict his own future play (the “agent” approach).
Alternatively, one might assume that each player observes all his payoff disturbances at
the beginning of the game. Unfortunately, this “player” approach is difficult to estimate
because the equilibrium can no longer be solved recursively from the last move of the game.
Another possible improvement would be to incorporate persistence in the error structure so
that a player who passed at the beginning is also likely to pass later on. This would mimic
the role of altruism.
We consider here a third possibility, which is to incorporate a second parameter, consistent
with our previous analysis of this data. Namely, we assume that there is some small
probability that players are “altruistic” (and hence choose P at every opportunity). This
gives rise to a game of incomplete information (as described formally in McKelvey and
Palfrey, 1992: 827) where with probability q a player is “rational” and with probability
(1 − q) they are “altruistic”. We then apply the AQRE directly to this modified game,
obtaining a two-parameter model in which we estimate both λ and q. The results of this
analysis appear in the last column of Tables 7 and 8. As one can see, there is a dramatic
improvement in fit over the one parameter model, at any reasonable significance level.
While the log likelihood is still several orders of magnitude worse than the five-parameter
model of McKelvey and Palfrey (1992), one must keep in mind that that model permitted
hetereogeneity and time trends, in addition to altruism.
8.
Conclusions
This article developed a model of equilibrium behavior in extensive form games when players are imperfect best responders. At each information set, players choose better actions
P1: ABL
Experimental Economics
KL581-01-Mckelvey
34
April 25, 1998
9:48
MCKELVEY AND PALFREY
with higher probabilities than worse actions but do not choose best responses with probability one. We developed a general theoretical structure, called agent quantal response
equilibrium, and a more specialized version, called logit AQRE, which we used to fit the
model to several experimental data sets. We summarize the theoretical findings and the
empirical findings separately.
Theoretical findings
1. Agent quantal response equilibria exist in all finite games.
2. Limit points of logit AQRE (as the precision parameter get arbitrarily large) define a
nonempty refinement of sequential equilibria of any finite extensive form game.
3. For generic extensive form games, there exists a unique connected component of the
logit AQRE correspondence, which includes equilibria for all values of the precision
parameter. A limit point of this component (as the precision parameter gets arbitrarily
large) provides a unique selection from the set of sequential equilibrium components of
the underlying game. We call this the logit solution of the game.
4. The logit solution of the game is not logically related to other refinement criteria, such
as the intuitive criterion or trembling hand perfection.
5. Agent quantal response equilibrium depends in systematic ways on inessential transformations of the extensive form.
Empirical findings
1. The AQRE model explains the finding by Schotter et al. (1994) that one-shot play in the
stage game of the chain-store paradox differs systematically depending on whether the
players make decisions sequentially or simultaneously. More entry deterrence occurs
when the decisions are made simultaneously. This is precisely what our model predicts.
No game-theoretic equilibrium concept satisfying invariance can explain this finding.
2. The AQRE model was used to fit data from a variety of signaling games with multiple
equilibria. In all cases, the observed data corresponded to the logit solution to the game,
even in the remarkable game of Brandts and Holt (1993) in which play converges to
an unintuitive sequential equilibrium. In these signaling games, the logit solution also
makes predictions about the pattern of behavior off the equilibrium path, as well as
making predictions about the relative probabilities of reaching different off-equilibrium
paths. The logit solution predictions are born out in the data from all five different
signaling games we examine.
3. The main features of the data from centipede experiments are consistent with the logit
equilibrium. In particular, the model predicts take probabilities to increase as a game
progresses. We fit data from two different variants of the centipede game and also fit
an adapted version of the logit equilibrium in which some fraction of the agents are
altruistic. That two-parameter model fits the data significantly better.
4. For all the experimental data we analyze, we also fit a competing model, called the noisy
Nash model. In that model, the maintained hypothesis is that at each information set the
player deviates from the Nash equilibrium behavior strategy with some fixed probability.
P1: ABL
Experimental Economics
KL581-01-Mckelvey
April 25, 1998
9:48
QUANTAL RESPONSE EQUILIBRIA FOR EXTENSIVE FORM GAMES
35
We obtained maximum likelihood estimates of this error probability. The logit model
outperforms the NNM model without exception, usually by a wide margin.
Appendix
This appendix proves the properties of the quantal response correspondence stated in
Theorem 4.
Theorem 4. For almost all finite extensive form games,
1. The logit AQRE correspondence Q is a one-dimensional manifold.
2. For almost all λ there are an odd number of logit-AQRE.
3. There is a unique branch, B (the principal branch) of the logit AQRE selection connected
to the centroid of the game at λ = 0.
4. The principal branch of the logit AQRE correspondence selects a unique component
from the set of sequential equilibria of the game, in the sense that if {(λt , pt )}t and
{(λ0t , qt )}t are sequences in B with limt→∞ λt = limt→∞ λ0t = ∞, and if p∗ is a limit
point of { pt }t and q ∗ is a limit point of {qt }t , then p ∗ and q ∗ are in the same component
of the set of sequential equilibria of the game.
Proof: For this proof, we fix the extensive form tree 0, and let U represent the set of
possible payoff vectors that could be attached to the terminal nodes of the extensive form.
So elements of U are functions u : 0 t 7→ IRn from the terminal nodes to payoff vectors for
the players.
For each h ∈ H , select ah0 ∈ A(h), and write Ah = A(h) − {ah0 }. Define
(
)
X
ph (a) ≤ 1 .
Dh = ph : Ah 7→ IR | ph (a) ≥ 0 for all a ∈ Ah , and
a∈Ah
P
Write ph (ah0 ) = 1Q− a∈Ah ph (a), and for all a ∈PA(h) write pha = ph (a), and pho =
ph (ah0 ). Let D = h∈H Dh , m h = |Ah |, and m = h∈H m h .
Let X = D ◦ × (0, ∞). Define F : X × U 7→ IRm by, for h ∈ H and a ∈ Ah ,
µ
¶
1
pha
− vha ( p | u) + vh0 ( p | u),
Fha ( p, λ, u) = ln
λ
ph0
where vha is the expected payoff (to the player who chooses at h) of the continuation game
from taking action a, and vh0 is the corresponding payoff from taking action ah0 . For any
u ∈ U , write f u ( p, λ) = F( p, λ, u). For any given u ∈ U , the logit QRE correspondence
for u is given by Q = f u−1 (0), where 0 is the zero vector in IRm .
We will show that for any ( p, λ) ∈ X , that F is surjective. In particular, assume
F( p, λ, u) = x. Then for any ² ∈ IRm , construct u ² ∈ U as follows. For any h ∈ H , pick
ηh : A(h) 7→ IR such that
X
ph (a)ηh (a) = 0
a∈A(h)
P1: ABL
Experimental Economics
KL581-01-Mckelvey
April 25, 1998
9:48
36
MCKELVEY AND PALFREY
and such that ηh (aho ) − ηh (a) = ²ha . Then, for any terminal node v, set
X
u ²i (v) = u i (v) +
ηh (a),
h∈Hi ,a∈A(h),a<v
where a < v means that the action a precedes v. Then F( p, λ, u ² ) = x + ². Hence, for any
fixed ( p, λ) ∈ X, F is surjective. Hence, F( p, λ, u ² ) is transversal to any submanifold of
IRm . Hence, by the transversality theorem (Guillemin and Pollack, 1974: 68), for almost all
u ∈ U , f u ( p, λ) = F( p, λ, u) is transversal to 0. It follows (Guillemin and Pollack, 1974:
28) that f u−1 (0) is a manifold of codimension m, or dimension 1, for almost all u ∈ U . This
completes the proof of part (1) of the theorem.
Note that the above argument can be extended to the case when the domain of F is
bounded. For any c = (c, c̄) with 0 < c < c̄, define X c ⊆ X by X c = D 0 × [c, c̄]. Then
X c is a m + 1 dimensional manifold with boundary. It follows from a similar argument to
above that both F and ∂ F : ∂(X c × U) → <m are surjective for all ( p, λ) ∈ X c . Hence,
by the transversality theorem, for almost all u ∈ U , f u−1 (0) is a manifold with boundary of
dimension 1. We write Qc = f u−1 (0), to denote the Logit-AQRE correspondence restricted
to X c .
Now pick M > 0 so that for all p ∈ B 0 , suph∈H,a,c∈Ah |vha ( p | u) − vh0 ( p | u)| ≤ M.
Define aλ = e−λM , and bλ = eλM . Then it follows that, for any ( p, λ) ∈ Q,
µ
−λ · M ≤ log
pha
ph0
¶
≤ λ · M ⇒ aλ ph0 ≤ pha ≤ bλ ph0 .
But
pha ≤ bλ ph0 ⇒ 1 − ph0 =
X
pha ≤ bλ m h ph0
a∈Ah
⇒ ph0 ≥ 1/(bλ m h + 1)
and
pha ≥ aλ ph0 ⇒ pha ≥ aλ /(bλ m h + 1) = cλ .
Since aλ < 1, it follows that for all a ∈ A(h) (including a = ah0 ), the above inequality
holds. Define
W = {( p, λ) ∈ X : pha ≥ cλ for all h ∈ H, a ∈ A(h)}.
Thus, we have shown that Q ⊆ W ∩ X . Similarly, Qc ⊆ W ∩ X c . In other words, the QRE
graph can only “exit” X at the minimum and maximum values of λ.
Next, we establish that for sufficiently small λ, there is a unique solution. In other words,
there is only one possible exit for small λ.
P1: ABL
Experimental Economics
KL581-01-Mckelvey
April 25, 1998
9:48
QUANTAL RESPONSE EQUILIBRIA FOR EXTENSIVE FORM GAMES
Lemma 1.
37
For sufficiently small λ, π ∗ (λ) is a singleton.
Proof: To see this, define the mapping φ : IRm → IRm to have components
,(
φha ( p) = exp[λvha ( p | u)]
X
)
exp[λvhc ( p | u)] .
c∈A(h)
Then, for any λ, ( p, λ) ∈ Q if and only if p is a fixed point φ. We will show that
for λ sufficiently small, φ is a contraction mapping. We use three facts to prove the
result, each of which follows from easy arguments. We use the notation vhac ( p | u) =
vha ( p | u) − vhc ( p | u).
Fact 1: For any p, q ∈ 10 (the interior of 1), maxl | pl − ql | ≤ maxkl | pl / pk − ql /qk |.
Fact 2: Since the derivative of e x at x = 0 is 1, then for all D > 1, there is a δ such that
whenever |x1 |, |x2 | < δ, |exp[x1 ] − exp[x2 ]| ≤ D · |x1 − x2 |.
Fact 3: For any p, q ∈ B there is an M > 0 such that maxhac |vhac ( p | u) − vhac (q | u)| ≤
M · maxha | pha − qha |.
Pick any D > 1, and let δ be defined as in Fact 2, and M be defined as in Fact 3. We
pick λ to satisfy
1. λvhac ( p | u) < δ for all h, a, c, with h ∈ H, a, c ∈ A(h) and any p ∈ B.
2. λ < 1/(D · M).
Write ρ = λ · D · M. Then pick any p, q ∈ B. Then, letting k · k represent the sup norm,
we have
kφ( p) − φ(q)k = max |φha ( p) − φha (q)| ≤ max |φha ( p)/φhc ( p) − φha (q)/φhc (q)|
ha
hac
= max | exp[λvhac ( p | u)] − exp[λvhac (q | u)]|
hac
≤ λ · D · max |vhac ( p | u) − vhac (q | u)|
hac
≤ λ · D · M · max | pha − qha | = ρ · k p − qk,
ha
where ρ < 1. The steps follow, respectively, by the definition of k · k, fact 1, equation (∗ ),
fact 2, fact 3, and the definition of k · k. It follows that for λ ≤ λ, that φ is a contraction
mapping. Hence, it has a unique fixed point.
2
Claim 3 of the theorem now follows directly from the above lemma. Also, from the
above argument, setting c = (c, c̄), we have shown that Qc is a compact, one-dimensional
manifold with boundary, which for small enough c, has a unique intersection with λ = c.
Any such manifold has a finite number of connected, compact components, each of which
must have an even number of boundary points. We have also shown that any boundary point
P1: ABL
Experimental Economics
38
KL581-01-Mckelvey
April 25, 1998
9:48
MCKELVEY AND PALFREY
must be at λ = c or λ = c̄. Since there is exactly one solution at λ = c, there must be an
odd number of solutions at λ = c̄. This establishes claim 2 of the theorem.
We now show the final claim of the theorem—that as λ → ∞, the branch B of the
manifold that passes through λ converges to a unique component of the set of sequential
equilibria.
Lemma 2. If {(λt , pt )}t and {(λ0t , qt )}t are sequences in B with limt→∞ λt = limt→∞ λ0t =
∞, and if p ∗ is a limit point of { pt }t and q ∗ is a limit point of {qt }t , then p ∗ and q ∗ are in
the same component of the set of sequential equilibria of the game.
Proof: It follows from the arguments above, that for almost all games, there exists an
increasing sequence {λi } with λ < λi for all i, such that if we define ci = (λ, λi ), X i = X ci ,
and Qi = Qci ⊆ Q, that
1. Q is a one-dimensional manifold with a unique point—say, ( p, λ)—for which λ = λ,
and a unique connected branch B that passes through ( p, λ).
2. Qi is a compact one-dimensional mainfold with boundary, which has a finite number of
connected components.
3. Letting Bi be the connected branch of Qi , which begins at ( p, λ), it follows that Bi is
a compact connected one-dimensional manifold with boundary for all i, which has a
unique intersection—say, ( pi , λi ) with λ = λi .
Now for any i, define Ai = {( p, λ) ∈ B : λ > λi }, and Ai to be the closure of the
projection of Ai onto D. Then {Ai } is a decreasing sequence of sets. By Theorem 2, ∩i Ai
is a subset of the set of sequential equilibria of the game. We show that for almost all
games, ∩i Ai must be a (nonempty) subset of a unique connected component of the set of
sequential equilibria. First of all, since D is compact, and each Ai is closed and nonempty,
∩i Ai cannot be empty. Suppose, by way of contradiction, that ∩i Ai contains points p ∗
and q ∗ which are in two distinct components of the set of sequential equilibria of the
game. But if p ∗ and q ∗ are both in ∩i Ai , then we can construct a sequence {( pi , λi )} ⊆ B
with p2i−1 → p ∗ , p2i → q ∗ , λi → ∞ and a homeomorphism, φ : < → B satisfying
φ(i) = ( pi∗ , λi ), and φ(2i) = (qi∗ , ηi ). In particular, start with any δ1 , and find ( p1∗ , λ1 ) ∈ B
with λ1 > 1/δ1 and k p1 − p ∗ k < δ1 . Since B is a connected one-dimensional manifold, it is
path connected (Guillemin and Pollack, 1974: 38, ex. 3), so one can construct φ[0, 1] → B
with φ(0) = ( p, λ) = φ(1) = ( p1∗ , λ1 ). Since φ[0, 1] is compact, it is bounded. Pick δ2 so
that 1/δ2 exceeds the bound, and find ( p2∗ , λ2 ) ∈ B with λ2 > 1/δ2 and k p2 − q ∗ k < δ2 .
By the same reasoning as above, one can construct φ[1, 2] → B with φ(1) = ( p1∗ , λ1 ), and
φ(2) = ( p2∗ , λ2 ). Proceeding in this fashion, one can construct the sequence {( pi , λi )} ⊆ B
and a homeomorphism, φ : < → B with the properties specified. Now for each i, φ[i, i +1]
is a compact one-dimensional manifold with boundary. Moreover, we can pick φ so that
φ(i − 1, i) ∩ φ(i, i + 1) = φ(i).
We have constructed an infinite sequence of compact manifolds with boundary, each of
whose projection on D connects a point near p∗ with a point near q ∗ . Further, for any λi ,
at most a finite number of these manifolds intersect with X i (since B ∩ X i is a compact
one-dimensional mainfold with boundary, which can consist of at most a finite number of
P1: ABL
Experimental Economics
KL581-01-Mckelvey
April 25, 1998
9:48
QUANTAL RESPONSE EQUILIBRIA FOR EXTENSIVE FORM GAMES
39
components.) Now, since p ∗ and q ∗ are in different connected components of the set
of sequential equilibria, we can find an open set E such that E contains the component
containing p ∗ and such that the closure of E does not contain any other component. Now
let ∂ E be the boundary of the closure of E. Then ∂ E is a closed compact subset of D.
Further, each of the infinite sequence of manifolds connecting a point near p∗ with a point
near q ∗ must have a nonempty intersection with ∂ E. Hence, ∂ E must have a nonempty
intersection with ∩i Ai . But this contradicts the fact that ∂ E is disjoint from the set of
sequential equilibria.
2
This lemma proves that the principal branch of the logit AQRE selects a unique component
of the set of sequential equilibria as λ goes to infinity. This establishes the final assertion
of the theorem.
2
Acknowledgment
This article has benefited from comments by two referees and by several participants at
the Conference on Learning in Games (Texas A&M University, February 1994) and the
1995 Summer Workshop on Game theory at Stonybrook, and the European Meetings of
the Econometric Society (Maastricht, September 1994). We thank the participants for their
comments and also thank Peter Coughlan, Mark Fey, Eugene Grayver, and Rob Weber for
their research assistance. We are grateful to Andy Schotter, David Porter, and Charles Holt
for sharing their data. The financial support of the National Science Foundation (Grant
#SBR-9223701) is gratefully acknowledged.
Notes
1. See also Chen et al. (1997), who study learning dynamics in the context of a similar quantal response
equilibrium model, which they call boundedly rational Nash equilibrium. Other applications can be found in
McKelvey and Palfrey (1996), Lopez (1995), Anderson et al. (1998), and elsewhere.
2. Zauner (1993) has independently conducted a reexamination of the centipede data using a similar model, but
with different distributional assumptions about the error structure. His findings are compared with ours in
Section 7.
3. That is, there exist games in which there is a unique intuitive equilibrium that is different from the unique
logit-AQRE selection.
4. This game can also be interpreted as a discrete (binary) version of an ultimatum game in which the offerer
may offer either a 50/50 split or a 75/25 split of the pie. The unfair split (75/25) may be either accepted or
rejected by the other player. The fair split is automatically accepted. Gale et al. (1995) call it the “ultimatum
minigame” and study the dynamic stability of its Nash equilibria under the replicator dynamic.
5. Figures 2 and 3 are the AQRE correspondences for the versions of these games that were run by Schotter et al.
(1994). Those games had different payoffs from the games of figure 1. However, the qualitative features of the
AQRE for the games of figure 1, and the SWW games are similar. So both sets of graphs are not reproduced
here.
6. The authors also found that these differences were magnified by presenting the strategy version to the subjects
in its matrix form. We are not proposing an explanation for this.
7. There are several other games examined by SWW, but none of these others were games of perfect information,
and all have multiple AQREs, even in the sequential versions. We do not analyze them here.
8. This may be partly due to the fact that subjects played the games so few times.
P1: ABL
Experimental Economics
40
KL581-01-Mckelvey
April 25, 1998
9:48
MCKELVEY AND PALFREY
9. In game 2, one equilibrium is sequential and the other is not. In game 3, both are sequential, but only one is
intuitive. In game 4, both are intuitive, but only one is divine.
10. Neither m 1 nor m 2 are dominated strategies.
11. There were some minor procedural differences. BH successfully replicated the results of BCP game 3.
12. A similar analysis is reported in Fey et al. (1996) using an experiment based on a constant sum version of the
centipede game. The details of the maximum likelihood procedures are explained in that article.
References
Anderson, S., Goeree, J., and Holt, C. (1998). “Rent Seeking with Bounded Rationality: An Analysis of the All
Pay Auction.” forthcoming in the Journal of Political Economy.
Banks, J., Camerer, C., and Porter, D. (1994). “Experimental Tests of Nash Refinements in Signaling Games.”
Games and Economic Behavior. 6, 1–31.
Brandts, J. and Holt, C.A. (1992). “An Experimental Test of Equilibrium Dominance in Signaling Games.”
American Economic Review. 82, 1350–1365.
Brandts, J. and Holt, C.A. (1993). “Adjustment Patterns and Equilibrium Selection in Experimental Signaling
Games.” International Journal of Game Theory. 22, 279–302.
Chen, H., Friedman, J.W., and Thisse, J. (1997). “Boundedly Rational Nash Equilibrium: A Probabilistic Choice
Approach.” Games and Economic Behavior. 18, 32–54.
Chen, K.-Y. (1994). “The Strategic Behavior of Rational Novices.” Ph.D. dissertation, California Institute of
Technology.
Cooper, R., DeJong, D., Forsythe, R., and Ross, T. (1989). “Communication in the Battle of the Sexes Game:
Some Experimental Results.” RAND Journal of Economics. 20, 568–587.
Cooper, R., DeJong, D., Forsythe, R., and Ross, T. (1990). “Selection Criteria in Coordination Games: Some
Experimental Results.” American Economic Review. 80, 218–233.
Fey, M., McKelvey, R.D., and Palfrey, T.R. (1996). “Experiments on the Constant-Sum Centipede Game.” International Journal of Game Theory. 25, 269–287.
Gale, J., Binmore, K., and Samuelson, L. (1995). “Learning to be Imperfect: The Ultimatum Game.” Games and
Economic Behavior. 8, 56–90.
Guillemin, V. and Pollack, A. (1974). Differential Topology. Prentice-Hall.
Harsanyi, J. (1973). “Games with Randomly Disturbed Payoffs.” International Journal of Game Theory. 2, 1–23.
Harsanyi, J. and Selten, R. (1988). A General Theory of Equilibrium Selection in Games. Cambridge, MA: MIT
Press.
Kohlberg, E. and Mertens, J.-F. (1986). “On the Strategic Stability of Equilibria.” Econometrica. 54, 1003–1037.
Kreps, D. and Wilson, R. (1982). “Reputation and Imperfect Information.” Journal of Economic Theory. 27,
253–279.
Lopez, G. (1995). “Quantal Response Equilibria for Models of Price Competition.” Ph.D. dissertation, University
of Virginia.
Luce, D. (1959). Individual Choice Behavior. New York: Wesley.
McFadden, D. (1974). “Conditional Logit Analysis of Qualitative Choice Behavior.” In P. Zarembka (ed.), Frontiers
in Econometrics. New York: Academic Press.
McKelvey, R.D. and Palfrey, T.R. (1992). “An Experimental Study of the Centipede Game.” Econometrica. 60,
803–836.
McKelvey, R.D. and Palfrey, T.R. (1995). “Quantal Response Equilibria in Normal Form Games.” Games and
Economic Behavior. 7, 6–38.
McKelvey, R.D. and Palfrey, T.R. (1996). “A Statistical Theory of Equilibrium in Games.” Japanese Economic
Review. 47, 186–209.
Partow, J. and Schotter, A. (1993). “Does Game Theory Predict Well for the Wrong Reasons? An Experimental
Investigation.” C.V. Starr Center for Applied Economics Working Paper 93-46, New York University.
Rosenthal, R. (1989). “A Bounded-Rationality Approach to the Study of Noncooperative Games.” International
Journal of Game Theory. 18, 273–292.
P1: ABL
Experimental Economics
KL581-01-Mckelvey
April 25, 1998
9:48
QUANTAL RESPONSE EQUILIBRIA FOR EXTENSIVE FORM GAMES
41
Schotter, A., Weigelt, K., and Wilson, C. (1994). “A Laboratory Investigation of Multiperson Rationality and
Presentation Effects.” Games and Economic Behavior. 6, 445–468.
Van Damme, E. (1987). Stability and Perfection of Nash Equilibria Berlin: Springer-Verlag.
Zauner, K. (1993). “Bubbles, Speculation, and a Reconsideration of the Centipede Game Experiment.” Mimeo,
University of California at San Diego. Revised as “A Reconsideration of the Centipede Game Experiments”.