P1: ABL Experimental Economics KL581-01-Mckelvey April 25, 1998 9:48

Experimental Economics, 1:9–41 (1998)
© 1998 Economic Science Association

Quantal Response Equilibria for Extensive Form Games

RICHARD D. MCKELVEY AND THOMAS R. PALFREY
Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, CA 91125, USA
email: [email protected], [email protected]

Abstract

This article investigates the use of standard econometric models for quantal choice to study equilibria of extensive form games. Players make choices based on a quantal-choice model and assume other players do so as well. We define an agent quantal response equilibrium (AQRE), which applies QRE to the agent normal form of an extensive form game and imposes a statistical version of sequential rationality. We also define a parametric specification, called logit-AQRE, in which quantal-choice probabilities are given by logit response functions. AQRE makes predictions that contradict the invariance principle in systematic ways. We show that these predictions match up with some experimental findings by Schotter et al. (1994) about the play of games that differ only with respect to inessential transformations of the extensive form. The logit-AQRE also implies a unique selection from the set of sequential equilibria in generic extensive form games. We examine data from signaling game experiments by Banks et al. (1994) and Brandts and Holt (1993). We find that the logit-AQRE selection applied to these games succeeds in predicting patterns of behavior observed in these experiments, even when our prediction conflicts with more standard equilibrium refinements, such as the intuitive criterion. We also reexamine data from the McKelvey and Palfrey (1992) centipede experiment and find that the AQRE model can account for behavior that had previously been explained in terms of altruistic behavior.
Keywords: game theory, experimental economics, quantal response equilibrium

JEL Classification: C72, C70, C92

1. Introduction

In an earlier article (McKelvey and Palfrey, 1995) we introduced the idea of quantal response equilibrium (QRE) and applied this concept to noncooperative games in normal form. Quantal responses are smoothed-out best responses, in the sense that players are more likely to choose better strategies than worse strategies but do not play a best response with probability one. The idea has its origins in statistical limited dependent variable models such as discrete choice (in economics and psychology) and stimulus/dosage response and bioassay (in biology and medical research) and is an essential ingredient of the work on bounded rationality in games by Rosenthal (1989) and control-cost models of equilibrium selection by van Damme (1987). The added complication in applying the concept to game theory, in contrast to individual choice, is that the choice probabilities of the players have an important interactive component, since they are simultaneously determined in equilibrium. QRE is an internally consistent equilibrium model, in the sense that the quantal response functions are based on the equilibrium probability distribution of the opponents’ strategy choices rather than simply on arbitrary beliefs the players could have about those probabilities.¹ An interesting property of QRE is that systematic deviations from Nash equilibrium may be predicted even without introducing systematic features to the error structure. The systematic effects of the statistical disturbances are (indirect) equilibrium phenomena that arise because of the strategic response of the players to the noisy environment. This leads to what can be viewed as a statistical generalization of Nash equilibrium.
In the earlier article we show that for normal form games there is a correspondence between QRE and Bayesian equilibria of what Harsanyi (1973) calls randomly disturbed games. Specifically, we show that a particular parametrization of quantal response equilibrium, which we call the logit equilibrium, corresponds to the Bayesian equilibria of the game with a vector of independent log Weibull-distributed disturbances added to the payoffs of the players, with each individual’s disturbances being private information to that player. That paper then establishes some properties of QRE in finite n-person games, and reexamines a diverse collection of experimental bimatrix games that have unique logit (and Nash) equilibria. Statistical analysis of that data was carried out using standard maximum-likelihood techniques in a structural model that is directly implied by the quantal response model. We estimate the variance of the Weibull disturbances for that data and compare the predicted frequencies of strategy choices with the actual frequencies. We find that the theory tracks the data fairly well and can account for some of the observed systematic deviations from the predictions of the exact Nash equilibrium model. This article reconsiders Nash equilibrium in extensive form games from the statistical point of view and extends the concept of logit equilibrium to the extensive form representation. A number of interesting theoretical and empirical findings emerge. On the theoretical side, the new formulation predicts systematic violations of what is known in the foundational game theory literature as invariance. It has been argued vigorously by some (e.g., Kohlberg and Mertens, 1986) that “the reduced normal form captures all the relevant information for decision purposes,” leading to the traditional game theory view that any good theoretical equilibrium concept for noncooperative games must satisfy invariance. 
In our statistical theory of decision-making in games the compelling arguments for invariance fall apart. In fact, it is easy to construct examples where the model predicts systematic differences in patterns of play depending on which “equivalent” version of a game is played. Moreover, the intuition behind these differences is very sensible and supported by experimental data. Besides discriminating between different versions of a game that have equivalent strategic form representations, our model also makes different predictions depending on whether the game is played in its “agent” normal form or its more traditional normal form, and it predicts different patterns of play in the normal form of the game and in the reduced normal form of the same game. Loosely speaking, the intuitive reason our model predicts representation dependence is that the error structure introduces private information in particular ways that depend on the exact details of how the game is actually played. Because of the specific statistical predictions that are implied by the dependence, many aspects of the quantal response equilibrium theory can be tested directly by applying standard maximum likelihood techniques to data generated in controlled laboratory experiments. For computational reasons, we focus on a version of QRE which is similar in spirit to the agent model of how an extensive form game is played, and so we call this the agent quantal response equilibrium (AQRE). In addition to predicting systematic violations of the invariance principle, the logit-AQRE model implies a unique selection from the set of sequential equilibrium components. This selection is defined by the principal branch of the logit-AQRE correspondence, as explained in McKelvey and Palfrey (1995).
This selection thus generates a unique prediction of a strictly positive probability distribution over play paths for every value of the logit response parameter, for almost any finite extensive form game. As demonstrated in our earlier article, this selection may be inconsistent with trembling-hand perfection as applied to the reduced normal form of the game, as well as all other refinements of which we are aware. In signaling games, there is a well-known multiple equilibrium problem to which a great deal of effort has been directed by modern game theorists, generating a vast literature on the general problem of deductively based refinements. Several experiments have been conducted to test whether the refinements developed to distinguish these equilibria are useful in helping to predict which equilibria (if any) are more likely to be played. This is a natural setting in which to apply AQRE, since this equilibrium concept does not have the problem of trying to define behavior off the equilibrium path. This equilibrium concept also makes multiple equilibrium predictions, but the logit-AQRE has a natural refinement that generically selects a unique equilibrium in a way that is much different from traditional (deductive) refinement arguments. In addition to predicting the patterns of play “on the Nash equilibrium path,” this equilibrium concept predicts patterns of play off the Nash equilibrium path. Some of the anomalies (concerning more standard theories) uncovered in these experiments have to do with behavior off the equilibrium path, and our model successfully accounts for these. In past centipede game experiments conducted by us (McKelvey and Palfrey, 1992) frequent violations of Nash predicted play were observed. This fact, together with the fact that the observed outcomes were substantial Pareto improvements over Nash play, was rationalized by a model in which some of the players have altruistic preferences.
Using arguments based on reputation building (Kreps and Wilson, 1982) it is shown that frequent violations of Nash behavior (similar to what was observed) can be accounted for even if altruistic players are “rare”. While this explanation of the data does account pretty well for most of the salient features of that particular data set, it is clearly ad hoc. The explanation involves the invention, or assumption, of a “deviant type” who systematically violates Nash behavior in exactly the direction observed in the data. A preferable explanation would be able to account for this data without resorting to such “adhocery”. The AQRE provides a framework for doing exactly that and also has the desirable feature of being applicable to arbitrary games without necessitating the invention of systematically deviant types, tailored to the peculiarities of specific games. Accordingly, we also reexamine the centipede data² in the context of the logit-AQRE. We find that a two-parameter specification of AQRE can account for the same qualitative violations of Nash behavior as the five-parameter altruism model, although the exact fit to the data is not as tight. In another article, we have presented some new data (contradictory to the altruism explanation) from much different centipede-type experiments where all outcomes are Pareto optimal (Fey et al., 1996). The logit-AQRE model also fits that data better than other competing models.

The remainder of the article is divided into seven sections. The next three sections set out the formal structure of the AQRE model for extensive form games. We also introduce a parametric version based on the logit model, which we later use for estimation.
The fifth section presents some examples, based on the chain-store paradox stage game, demonstrating one kind of violation of the invariance principle that is implied by our model, and looks at some recent experimental data reported in Schotter et al. (1994) that are consistent with our predictions. The sixth section explains the refinement for signaling games based on logit-AQRE and examines data from experimental signaling games conducted by Brandts and Holt (1993) and by Banks et al. (1994). The seventh section reexamines the data from the centipede game, and discusses similar findings by Zauner (1993) and Fey et al. (1996). We conclude with a few brief remarks about possible extensions of the QRE model.

2. Definitions and notation

Let $N$ be a finite set of $n + 1$ players, one of whom (player 0) is designated chance, let $X$ be a finite set of outcomes, and let $A$ be a finite set of actions. Let $\Gamma = (V, Q)$ be a topological tree with $V$ the set of nodes and $Q \subseteq V \times V$ a binary relation on $V$ representing branches, and let $Q^*$ be the transitive closure of $Q$. If $v Q v'$ we say $v$ immediately follows $v'$ (or $v'$ immediately precedes $v$). If $v Q^* v'$ we say $v$ follows $v'$ (or $v'$ precedes $v$). Write $Q(v)$ and $Q^*(v)$ for the set of immediate followers and the set of followers of $v \in V$, respectively. We assume that $Q$ is an asymmetric, acyclic binary relation on $V$ such that every node $v \in V$ follows at most one node and such that there is exactly one node, $v^* \in V$, which follows no node. The element $v^*$ is called the root node. A node $v \in V$ is terminal if $Q(v) = \emptyset$; otherwise it is nonterminal. Let $V^t$ and $V^0$ represent the set of terminal and nonterminal nodes of $V$, respectively.
We define an extensive form game $G(N, X, A, \Gamma)$ by defining an outcome function $\psi : V^t \to X$, a partition function $P : V^0 \to N \times I$ on the nonterminal nodes $V^0$ (here $I$ is the integers), an index function $\phi : V \to A$ that is one-to-one on $Q(v)$ for each $v \in V^0$ and that satisfies $\phi(Q(v)) = \phi(Q(v'))$ whenever $P(v) = P(v')$, and a probability function $\lambda : V^0 \to \mathbb{R}$ that is a probability function on $Q(v)$ for each $v \in V^0$ with $P_1(v) = 0$. For any $v \in V^0$, define $A(v) = \phi(Q(v))$ to be the set of actions available at $v$. For each $v \in V^0$ define $h(v) = \{v' \in V^0 : P(v) = P(v')\}$ to be the information set containing $v$. Let $H = \{h : h = h(v) \text{ for some } v \in V^0\}$ be the set of all information sets. If $h = h(v)$ write $P(h) = P(v)$ and $A(h) = A(v)$. Write $H_i = \{h \in H : P_1(h) = i\}$ for the set of information sets for player $i$. If $P(v) = (i, j)$, write $h_i^j = h(v)$. An action $a \in A(h)$ is said to precede a node $v \in V$ if there is an $x \in h$ and $y \in Q(x)$ with $v \in Q^*(y)$ and $\phi(y) = a$. An extensive form game $G$ is said to have perfect recall if for every player $i \in N$, all information sets $h, h' \in H_i$, every action $a \in A(h)$, and all nodes $x, y \in h'$, $a$ precedes $x$ iff $a$ precedes $y$. In a game of perfect recall, a player remembers what he previously knew and what he previously did. We assume throughout that $G$ has perfect recall.

A behavior strategy for player $i \in N$ is a function $b_i : H_i \to M(A)$ satisfying $b_i(h) \in M(A(h))$ for all $h \in H_i$. Here $M(C)$ indicates the set of all probability measures over the set $C$. Again, for $h \in H_i$ we use the shorthand $b_{ih} = b_i(h)$, and $b_{ij} = b_i(h_i^j)$. For each $a \in A(h)$, $b_{iha} = b_{ih}(a)$ denotes the probability of action $a$. Let $B_i$ denote the set of all behavior strategies for player $i$, and let $B = \prod_{i \in N} B_i$ be the set of behavior strategy n-tuples. If $v \in h \in H_0$ is a chance node, and $v' \in Q(v)$, then we define $b$ by $b_h(v') = \lambda_v(v')$.
We use the notation $B_{-i} = \prod_{j \neq i} B_j$ and write elements of $B$ in the form $b = (b_i, b_{-i}) \in B_i \times B_{-i} = B$ when we want to focus on a particular player $i \in N$. A pure strategy is a behavior strategy that assigns a degenerate probability distribution to each information set assigned to a player.

Let $B^\circ$ denote the interior of $B$. Each behavior strategy n-tuple $b \in B^\circ$ determines a strictly positive realization probability $\rho(v \mid b)$ for each node $v \in V$, as follows. If $v = v^*$ is the root node, then define $\rho(v \mid b) = 1$. Otherwise, find $v'$ with $v \in Q(v')$. Since $v'$ is nonterminal, we have $v' \in h \in H_i$ for some $i \in N$; define $\rho(v \mid b) = \rho(v' \mid b)\, b_h(v)$. For any $h \in H$, define $\rho(h \mid b) = \sum_{v \in h} \rho(v \mid b)$. Also, for any $h \in H$, define $Q^*(h) = \cup \{Q^*(v) : v \in h\}$. Then we can define a conditional realization probability, $\rho(v \mid h, b)$, on $Q^*(h)$ by $\rho(v \mid h, b)\,\rho(h \mid b) = \rho(v \mid b)$. Note that for any $b \in B^\circ$, $\rho(v \mid b)$ defines a strictly positive probability measure over the terminal nodes $V^t$. Similarly, $\rho(v \mid h, b)$ defines a probability measure over $V^t \cap Q^*(h)$.

For any $i \in N$ and $b \in B^\circ$, define the payoff function $u_i : B^\circ \to \mathbb{R}$ by

$$u_i(b) = \sum_{v \in V^t} \rho(v \mid b)\, u_i(v).$$

For any $h \in H$, $i \in N$ and $b \in B^\circ$, define the conditional payoff function $u_i : B^\circ \to \mathbb{R}$ by

$$u_i(b \mid h) = \sum_{v \in V^t \cap Q^*(h)} \rho(v \mid h, b)\, u_i(v).$$

For any interior behavior strategy $b \in B^\circ$, any $i$ and any information set $h_i^j \in H_i$, we denote by $u_i(a, b \mid h_i^j)$ the conditional payoff to $i$ of playing action $a \in A(h_i^j)$ at $h_i^j$ with probability one, and playing $b_i$ elsewhere.

Define a valuation function $x : H \to \mathbb{R}^K$, where for any $h \in H$, $K = |A(h)|$. We write $x_{ha} = x(h)(a)$. Write $X_{ij} = \mathbb{R}^{|A(h_i^j)|}$ to represent the space of possible expected payoffs for actions that player $i$ might adopt at information set $h_i^j \in H_i$, and $X_i = \prod_j X_{ij}$, and $X = \prod_{i \in N} X_i$. We define the function $\bar{u} : B^\circ \to X$ by

$$\bar{u}(b) = (\bar{u}_1(b), \ldots, \bar{u}_n(b)),$$

where

$$\bar{u}_{ija}(b) = u_i(a, b \mid h_i^j).$$
Thus $\bar{u}(b)$ is a vector whose components are the conditional expected payoff to each available action $a$ of each information set $j$ of each player $i$.

3. Agent quantal response equilibrium

For each $i$, $h_i^j$, $a \in A(h_i^j)$, let $\varepsilon_{ija}$ be a random variable, which represents $i$’s payoff disturbance from choosing $a$ at information set $h_i^j$. For any $b \in B^\circ$, let $\hat{u}_{ija} = \bar{u}_{ija}(b) + \varepsilon_{ija}$. We assume that $\varepsilon$ is private information to the players. So player $i$ observes $\varepsilon_{ija}$, but the other players (or the econometrician) do not.

Let $\varepsilon_{ij}$ be the vector of $i$’s errors at $h_i^j$, let $\varepsilon_i$ be the entire error vector across $i$’s information sets, and let $\varepsilon = (\varepsilon_1, \ldots, \varepsilon_n)$ be the vector of errors for all players. We assume $\varepsilon$ is an absolutely continuous random vector (with respect to Lebesgue measure), distributed according to a joint distribution with density function $f(\varepsilon)$. We also assume that the $\varepsilon_{ij}$ are statistically independent and that the expected value of $\varepsilon_{ija}$ exists for all $i$, $j$, $a \in A(h_i^j)$. Any probability density function, $f$, satisfying the above assumptions is called admissible.

We assume that at each information set $h_i^j$, player $i$ selects an action that maximizes $\hat{u}_{ija}$. For any action $a \in A(h_i^j)$ and any $\bar{u} \in X$ define

$$R_{ija}(\bar{u}) = \{\varepsilon : \bar{u}_{ija} + \varepsilon_{ija} \geq \bar{u}_{ij\hat{a}} + \varepsilon_{ij\hat{a}} \ \ \forall \hat{a} \in A(h_i^j)\}$$

and

$$\sigma_{ija}(\bar{u}) = \int_{R_{ija}(\bar{u})} f(\varepsilon)\, d\varepsilon.$$

Definition 1. For any extensive form game $G$ and error structure $f(\varepsilon)$, a behavioral strategy $b^* \in B$ is an agent quantal response equilibrium (AQRE) if it is a fixed point in $B$ of $\sigma \circ \bar{u}$. It is a vector $b^* \in B^\circ$ such that for all $i \in N$, $1 \leq j \leq J_i$, $a \in A(h_i^j)$, $b^*_{ija} = \sigma_{ija}(\bar{u}(b^*))$.

That is, $b^*$ is a quantal response equilibrium if, when all players are choosing their best response at every information set $h_i^j$, taking $b^*_{-ij}$ as given, this generates a behavior strategy that is exactly the same as $b^*$.
This definition assumes the “agent model” of play in an extensive form game, where different information sets of a given player are assumed to be played out by different agents, all of whom share the same payoff function. In our model, each agent $ij$ simply chooses the maximum of $\hat{u}_{ija}$ at information set $h_i^j$ and acts independently of the other agents of the same player. For this reason we refer to the equilibrium defined above as the agent quantal response equilibrium (AQRE).

Theorem 1. For any admissible $f$, an AQRE exists.

Proof: Define $\phi = \sigma \circ \bar{u} : B^\circ \to B$. Note that $\phi$ is continuous. Extend $\phi$ to have domain $B$ by

$$\phi_{ij}(b) = \mathrm{co}\left\{ \lim_{t \to \infty} \phi_{ij}(b^t) : \{b^t\} \subseteq B^\circ \text{ and } \lim_{t \to \infty} b^t = b \right\}.$$

Then $B$ is compact, convex, and $\phi : B \rightrightarrows B$ is convex valued and upper hemicontinuous. Hence, by the Kakutani fixed point theorem, $\phi$ has a fixed point, $b^* \in B$. It is easy to show that there can be no fixed point on the boundary. Hence, there is a $b^* \in B^\circ$ with $\sigma \circ \bar{u}(b^*) = b^*$. □

The above theorem guarantees existence for AQRE, and the definition guarantees that the equilibrium behavioral strategies place positive probability on every available action in every information set of every player. The equilibrium correspondence has the same “nice” topological properties as the Nash equilibrium correspondence.

The above model can be generalized by dropping the independence assumption or by specifying an observation function $O(h_i^j)$, where for each $i$, $h_i^j$, $O(h_i^j)$ specifies a signal $i$ receives about $\varepsilon$ at information set $h_i^j$, rather than just assuming $i$ observes only $\bar{u}_{ij}$ at $h_i^j$. While these changes would require some additional notation, the basic ideas and properties of the quantal response model would not change. Some models of $O(h_i^j)$ other than the one we explore here might be interesting in their own right.
We leave these issues as a subject of future work.

4. The logistic AQRE

In the remainder of the article, we focus mainly on a specialized version of the model, where each $\varepsilon_{ija}$ is independently and identically distributed according to the type I extreme value (or log Weibull) distribution with cumulative distribution function $F(\varepsilon_{ija}) = e^{-e^{-\lambda \varepsilon_{ija}}}$. This distribution has mean $\gamma/\lambda$, where $\gamma$ is the Euler constant (0.577), and variance $\pi^2/(6\lambda^2)$. So the precision of $F$ increases with $\lambda$ (the inverse variance is proportional to $\lambda^2$). This distribution of the disturbances leads to choice probabilities following a multinomial logit distribution (see e.g., McFadden, 1974). In particular, at $h_i^j$, $b_{ij}$ (in the AQRE) is given by

$$b_{ij}(a) = \frac{e^{\lambda \bar{u}_{ija}(b)}}{\sum_{a' \in A(h_i^j)} e^{\lambda \bar{u}_{ija'}(b)}}.$$

A logit-AQRE is any solution to this set of $k$ equations (one equation for each action in each information set of each agent). For each $\lambda \in [0, \infty)$, define the logit-AQRE correspondence $b^* : \mathbb{R}_+ \rightrightarrows B^\circ$ as the set of logit-AQRE behavioral strategies when the precision parameter of $F$ equals $\lambda$:

$$b^*(\lambda) = \left\{ b \in B^\circ : b_{ij}(a) = \frac{e^{\lambda \bar{u}_{ija}(b)}}{\sum_{a' \in A(h_i^j)} e^{\lambda \bar{u}_{ija'}(b)}} \right\}.$$

We establish several properties of the logit-AQRE correspondence as a function of $\lambda$. We already established from Theorem 1 of the previous section that $b^*(\lambda) \neq \emptyset$, $\forall \lambda$. That is, at least one logit-AQRE exists for every value of $\lambda$. A second property is that limit points of $b^*(\cdot)$ as $\lambda$ goes to infinity are behavioral strategy Nash equilibria of the game. Both of these properties follow from the same arguments as in McKelvey and Palfrey (1995).

A third property is that limit points of $b^*(\lambda)$ are not only Nash equilibrium behavior strategy profiles, but are sequential equilibrium strategy profiles:

Theorem 2. For every finite extensive form game, every limit point of a sequence of logit-AQREs with $\lambda$ going to infinity corresponds to the strategy of a sequential equilibrium assessment of the game.
Proof: Let $b^*_\infty$ be a limit point of $b^*(\lambda)$. Then there exist consistent beliefs $\mu^*_\infty$ (assignments of a probability distribution over the nodes at each information set that satisfy the Kreps-Wilson, 1982, consistency condition) such that, under those beliefs, $b^*_\infty$ specifies optimal behavior for every continuation game. That is, $b^*_\infty$ is sequentially rational given $\mu^*_\infty$ and $\mu^*_\infty$ is consistent with $b^*_\infty$. This is proved below.

Take any sequence $\{\lambda_k\}_{k=1}^\infty$ and $\{b^*_k\}_{k=1}^\infty$ such that $\lim_{k \to \infty} \lambda_k = \infty$ and $\lim_{k \to \infty} b^*_k = b^*_\infty$ and $b^*_k$ is a logit equilibrium for $\lambda_k$ for each $k$. First, note that for all $k$ and for all information sets $h$, $b^*_k(a) > 0$ $\forall a \in A(h)$. Consequently, every node is reached with positive probability. Therefore, by Bayes rule, for each $k$, $b^*_k$ uniquely defines a set of beliefs $\mu^*_k$ over the nodes of every information set. Moreover, since $\mu$ varies continuously with $b$, there is a unique limit $\mu^*_\infty$. Therefore, $\mu^*_\infty$ are consistent beliefs with respect to $b^*_\infty$. What remains to be shown is that $b^*_\infty(h)$ is optimal given $\mu^*_\infty(h)$ for all $h$. (If so, then $(b^*_\infty, \mu^*_\infty)$ is a sequential equilibrium, so that $b^*_\infty$ is a sequential equilibrium strategy.)

Suppose not. Then there exists $\epsilon > 0$ such that for some $h_i^j$ there is some pair of actions $\{a, a'\} \subseteq A(h_i^j)$ such that $b^*_{ij\infty}(a) > 0$ but $\bar{u}_{ija'}(b^*_\infty) - \bar{u}_{ija}(b^*_\infty) > \epsilon$. But then there must exist $K > 0$ such that

$$\bar{u}_{ija'}(b^*_k) - \bar{u}_{ija}(b^*_k) \geq \frac{\epsilon}{2} \quad \forall k \geq K.$$

Therefore, for all $k \geq K$,

$$\frac{b^*_{ijk}(a)}{b^*_{ijk}(a')} \leq e^{-\lambda_k \epsilon / 2}.$$

So

$$b^*_{ij\infty}(a) \equiv \lim_{k \to \infty} b^*_{ijk}(a) \leq \lim_{k \to \infty} b^*_{ijk}(a')\, e^{-\lambda_k \epsilon / 2} = 0,$$

which contradicts $b^*_{ij\infty}(a) > 0$. □

The next result establishes uniqueness of the AQRE for games of perfect information.

Theorem 3. For every finite extensive form game of perfect information, and all $0 \leq \lambda \leq \infty$, $b^*(\lambda)$ is a singleton.

Proof: This is proved by construction of $b^*(\lambda)$.
Since the game has perfect information, the predecessors of each terminal node all lie in different (singleton) information sets. Consider one such information set, $h$, with $i(h) = i$ and $A(h) = \{a_1, \ldots, a_K\}$. For each $a \in A(h)$, there exists a unique terminal node $\nu_a$ such that $\rho(\nu_a \mid h, b_a) = 1$, where $b_a$ is the behavior strategy profile $b$, with $b_i(h)$ replaced by $a$, so $u_i(a \mid h) = u_i(\nu_a)$. This uniquely determines the action probabilities of logit-AQRE at any node immediately preceding a terminal node by

$$b^*_i(a_j \mid h) = \frac{e^{\lambda u_i(\nu_{a_j})}}{\sum_{k=1}^{K} e^{\lambda u_i(\nu_{a_k})}}.$$

Then the expected utility to an agent who has reached $h$ but has not chosen an action at $h$ is $\sum_{a_k} b^*_i(a_k \mid h)\, u_i(\nu_{a_k}) = u_i(b \mid h)$. Notice that $u_i(b \mid h)$ is therefore uniquely determined. Next consider the immediate predecessors of the immediate predecessors of the terminal nodes. Repeating the above argument yields a unique $b^*(h')$ for all $h'$ such that every node in $Q(Q(v))$ is terminal for all $v \in h'$. Since the game is finite, we can iterate this argument until $b^*$ is defined (uniquely) at each information set. □

A fifth property of the logit-AQRE correspondence is that for generic games there exists a unique connected path in the graph of logit-AQRE correspondence that includes $b^*_0$ (the equilibrium when $\lambda = 0$). This unique connected path contains a solution $\bar{b}^*_\lambda$ for all values of $\lambda \geq 0$, and selects a unique sequential equilibrium component as $\lambda \to \infty$. Any such point is called a logit-AQRE solution of the game.

Theorem 4. For almost all finite extensive form games,
1. The logit AQRE correspondence, Q, is a one-dimensional manifold.
2. For almost all $\lambda$ there are an odd number of logit-AQRE.
3. There is a unique branch, B (the principal branch), of the logit AQRE selection connected to the centroid of the game at $\lambda = 0$.
4.
The principal branch of the logit AQRE correspondence selects a unique component from the set of sequential equilibria of the game, in the sense that if $\{(\lambda_t, p_t)\}_t$ and $\{(\lambda'_t, q_t)\}_t$ are sequences in B with $\lim_{t \to \infty} \lambda_t = \lim_{t \to \infty} \lambda'_t = \infty$, and if $p^*$ is a limit point of $\{p_t\}_t$ and $q^*$ is a limit point of $\{q_t\}_t$, then $p^*$ and $q^*$ are in the same component of the set of sequential equilibria of the game.

Proof: See the appendix. □

Note that example 1 in McKelvey and Palfrey (1995) demonstrates that the converse of Theorem 2 is not true. That is, there are sequential equilibria (and trembling-hand perfect equilibria, as well) that are not approachable by a sequence of logit-AQREs. Thus, the set of limit points of logit-AQRE provides a refinement of sequential equilibrium, although limit points are not necessarily trembling-hand perfect. Whenever there is a unique connected path selection of the logit-AQRE correspondence, this gives a refinement of the AQRE-approachable sequential equilibria that is an even stronger refinement of sequential equilibrium. As we show below, it offers a quite different refinement from standard deductive refinements (such as the intuitive criterion),³ and predicts much better than the intuitive refinement in signaling games that have been conducted as laboratory experiments.

Another nice feature of AQRE is that it makes predictions about the relative likelihood of all different play paths. In contrast, other refinements tend to predict that only a subset of the play paths can occur. Because all play paths can occur with positive probability in AQRE, we don’t have to worry about the arbitrary specification of player beliefs “off the equilibrium path”.
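The constructive argument behind Theorem 3 can be sketched in code as backward induction with a logit choice at every decision node. The following is a minimal illustration only, not the authors' implementation: the tree encoding (nested dictionaries with terminal payoff tuples), the function names `logit_choice` and `solve`, and the illustrative two-stage entry game are all assumptions introduced here, players are indexed from 0, and chance moves are not handled.

```python
import math

def logit_choice(values, lam):
    """Logit probabilities proportional to exp(lam * value), computed stably."""
    top = max(values)
    weights = [math.exp(lam * (v - top)) for v in values]
    total = sum(weights)
    return [w / total for w in weights]

def solve(node, lam, path=()):
    """Backward induction for a perfect-information tree (hypothetical encoding).

    A terminal node is a tuple of payoffs, one entry per player.
    A decision node is {"player": index, "actions": {name: child}}.
    Returns (expected payoff tuple, {path: {action: probability}}).
    """
    if isinstance(node, tuple):
        return node, {}
    mover = node["player"]
    strategy, names, continuations = {}, [], []
    for name, child in node["actions"].items():
        payoff, below = solve(child, lam, path + (name,))
        strategy.update(below)
        names.append(name)
        continuations.append(payoff)
    # The mover responds to the uniquely determined continuation payoffs.
    probs = logit_choice([u[mover] for u in continuations], lam)
    strategy[path] = dict(zip(names, probs))
    n_players = len(continuations[0])
    expected = tuple(
        sum(p * u[k] for p, u in zip(probs, continuations))
        for k in range(n_players)
    )
    return expected, strategy

# Illustrative two-stage entry game: player 0 chooses N or E, then player 1
# chooses F or A after E. Payoffs are listed as (player 0, player 1).
GAME = {
    "player": 0,
    "actions": {
        "N": (2.0, 2.0),
        "E": {"player": 1, "actions": {"F": (0.0, 0.0), "A": (3.0, 1.0)}},
    },
}

expected, strategy = solve(GAME, lam=1.0)
```

Because each node's probabilities depend only on continuation payoffs that have already been pinned down, the recursion returns exactly one strategy profile for each finite value of `lam`, which is the uniqueness property the theorem states.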
In the next section we demonstrate how this model of equilibrium behavior in extensive form games contradicts the invariance principle that all strategically relevant information in an extensive form game is captured in the normal form. We illustrate this in the context of a particular game, the one-shot chain-store paradox. The way the game is actually played (as a game where both players simultaneously choose strategies versus a two-stage game in which the incumbent first observes whether an entrant has entered before choosing his strategy) will systematically affect the predicted pattern of play using the logit-AQRE. Furthermore, the differences implied theoretically by logit-AQRE mirror experimental findings reported recently by Schotter et al. (1994).

5. Invariance

Consider the well-known chainstore paradox stage game,⁴ with the extensive form illustrated in figure 1(a). The normal form representation of this game is given in Table 1. The logit equilibrium (McKelvey and Palfrey, 1995) for the normal form of Table 1 is equivalent to the logit AQRE of the extensive form game of figure 1(b), which is a strategically invariant transformation of the figure 1(a) game. We wish to compare the quantal response equilibrium as it applies to the game in figure 1(a) to the quantal response equilibrium as it applies to the game in figure 1(b). To do this, we compare the logit-AQRE as a function of λ in the first case to the logit equilibrium (McKelvey and Palfrey, 1995) in the second case.

Figure 1. Chainstore paradox. Player 1 is the entrant, player 2 is the incumbent (E = enter, N = not enter, F = fight, A = acquiesce).

Table 1. Normal form of chainstore stage game.

                  Player 2
                  F        A
Player 1    N     2, 2     2, 2
            E     0, 0     3, 1
The equilibrium conditions for the logit-AQRE of the extensive form game in figure 1(a) are

$$p = \frac{1}{1 + e^{(1-3q)\lambda}}, \tag{1}$$

$$q = \frac{1}{1 + e^{\lambda}}, \tag{2}$$

where $p$ is prob{1 moves N} and $q$ is prob{2 moves F}. The equilibrium conditions for the extensive form game in figure 1(b) are

$$p' = \frac{e^{2\lambda}}{e^{2\lambda} + e^{3(1-q')\lambda}} = \frac{1}{1 + e^{(1-3q')\lambda}}, \tag{3}$$

$$q' = \frac{e^{2p'\lambda}}{e^{2p'\lambda} + e^{(2p' + (1-p'))\lambda}} = \frac{1}{1 + e^{(1-p')\lambda}}. \tag{4}$$

The qualitative features of the AQRE correspondence for $(p, q)$ and $(p', q')$ as a function of $\lambda$ in (1)–(2) and (3)–(4), respectively, are shown in figures 2 and 3.⁵ The curves in these figures show how the AQRE mixing probabilities vary with $\lambda$. The numbers and dots in these figures are explained below.

Several points may be illustrated from this example. First, in the sequentially played game there is a unique AQRE for every value of $\lambda$, since $q$ does not depend on $p$. This is generally the case in games of perfect information, as was established in Theorem 3. This is not true in the game where player 2 chooses a strategy without having observed player 1’s move. In this case there is an additional QRE component for large values of $\lambda$, which converges to the subgame imperfect equilibrium $p' = 1$, $q' = 0.5$. Thus, in a sense, the subgame imperfect entry deterrence equilibrium component (which is also trembling-hand imperfect) is more plausible in the simultaneous play version of the chain-store game since it is approachable by a sequence of AQREs.

There is another reason why the subgame imperfect outcome (no entry) is more likely to be observed in the second version of the game. In the AQRE that converges to the perfect equilibrium, $p'$ and $q'$ are both higher than $p$ and $q$, respectively, for all values of $\lambda$. It is easy to see why this is true directly from equations (1) to (4). For any value of $p' \in (0, 1)$ and for any value of $\lambda > 0$, $q < q'$, since $(1 - p') < 1$. Since $q < q'$, it follows that $p < p'$. The intuition is simple.
In the strategic version of the game, the probability that F is suboptimal is less than 1, since it is only suboptimal if player 1 chooses E. This reduces the difference in expected utility between F and A (relative to the sequential version), leading to a higher probability that player 2 will choose F. This in turn leads to a higher probability that player 1 will choose N. This gives the following proposition.

Proposition 1. For all $\lambda > 0$, $q < q'$ and $p < p'$.

Figure 2. SWW (1994) sequential version.

Figure 3. SWW (1994) strategy version.

Figure 4. A 2 × 2 game of perfect information and its strategy version.

This can be generalized to the class of games illustrated in figure 4. Suppose that $D > 0$. In the strategy version of the game, player 2's (dominated) choice of F is less costly than in the sequential version of the game. Therefore, $q < q' < \frac{1}{2}$ when $D > 0$. Similarly, $q > q' > \frac{1}{2}$ if $D < 0$. In general, $|q - \frac{1}{2}| > |q' - \frac{1}{2}|$, which implies that deviations from the subgame perfect Nash equilibrium prediction should always be greater when the game is played in the strategy version. How this affects the moves by player 1 is more subtle and depends on whether $C$ and $D$ have the same or different signs. E is a relatively more attractive option in the strategy version if and only if $C > 0$ and $q' > q$, or $C < 0$ and $q' < q$. This requires $CD < 0$. Similarly, E is a relatively less attractive option if and only if $CD > 0$. Therefore, $p > p' \Leftrightarrow CD < 0$.

We turn next to an examination of data from experiments using this type of game.
Schotter, Weigelt and Wilson (1994) (SWW) conducted an experiment with $A = 4$, $B = 3$, $C = 6$, and $D = 2$, where some subjects played the figure 4(a) game, other subjects played the figure 4(b) game, and still other subjects played the game in normal form. Pooling the results for the figure 4(b) games and the normal form games, they found that entry was deterred more often with the strategy version of the game than with the sequential version of the game. Furthermore, the second mover chose Fight more often in the strategy version of the game.$^6$

To fit the AQRE to these data, we use a standard maximum-likelihood approach, as described in detail in McKelvey and Palfrey (1995). Given an extensive form game, for any parameter $\lambda$ of the AQRE model, we obtain predicted frequencies of moves at each information set. Given a data set from a particular experimental game, our estimate, $\hat{\lambda}$, is the value of $\lambda$ that maximizes the likelihood of that data set.

Table 2. SWW experimental results (36 subjects).

Sequential version
                 n      f_i      QRE      NNM
N                7    0.0814   0.0674   0.0545
E               79    0.9186   0.9326   0.9455
F                2    0.0253   0.0455   0.0545
A               77    0.9747   0.9545   0.9455
λ | γ                          1.5216   0.8910
λlo | γlo                      1.2721   0.8080
λhi | γhi                      1.8382   0.9460
−L*                            34.159   34.928

Strategic version
                 n      f_i      QRE      NNM
N               46    0.5750   0.4706   0.3875
E               34    0.4250   0.5294   0.6125
F               16    0.2000   0.3077   0.3875
A               64    0.8000   0.6923   0.6125
λ | γ                          0.7658   0.2250
λlo | γlo                      0.5515   0.0720
λhi | γhi                      0.8891   0.3710
−L*                            98.69    108.819

In order to gauge the success of AQRE in fitting the data, we compare these estimates to an alternative model, which we call the noisy Nash model (NNM). This alternative model (which includes Nash equilibrium and random play as two special cases) posits that at any information set, players adopt the Nash equilibrium strategy with probability $\gamma$, and with probability $1 - \gamma$ they randomize uniformly over all available actions.
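The noisy Nash model is simple enough to compute by hand, and the sketch below does so for the sequential version in Table 2. It is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, the Nash actions are taken to be the subgame perfect ones (E for the entrant, A for the incumbent), and the counts and $\gamma = 0.8910$ are read from Table 2.

```python
import math

def nnm_probs(gamma, n_actions, nash_action):
    """Noisy Nash model: play the Nash action with probability gamma,
    otherwise randomize uniformly over all n_actions actions."""
    base = (1.0 - gamma) / n_actions
    return [gamma + base if k == nash_action else base
            for k in range(n_actions)]

def neg_log_likelihood(counts, probs):
    """Negative multinomial log-likelihood of observed move counts."""
    return -sum(n * math.log(p) for n, p in zip(counts, probs) if n > 0)

# Sequential version of Table 2; actions are ordered (N, E) and (F, A).
GAMMA_HAT = 0.8910
entrant_probs = nnm_probs(GAMMA_HAT, 2, nash_action=1)    # Nash action: E
incumbent_probs = nnm_probs(GAMMA_HAT, 2, nash_action=1)  # Nash action: A
total_nll = (neg_log_likelihood([7, 79], entrant_probs)
             + neg_log_likelihood([2, 77], incumbent_probs))

if __name__ == "__main__":
    print(entrant_probs, incumbent_probs, round(total_nll, 3))
```

Under these assumptions the predicted frequencies and the negative log-likelihood can be compared directly with the NNM column of Table 2.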
The NNM differs from the QRE by not imposing an equilibrium restriction. That is, NNM assumes players may make errors relative to the Nash predictions, but they do not take into account that other players also make errors. Thus, it is more in the spirit of the bounded rationality, nonequilibrium approach to explaining experimental data, as opposed to the equilibrium approach embodied in QRE. The NNM predictions for $\gamma = 1$ correspond to the QRE predictions for $\lambda = \infty$, and those for $\gamma = 0$ correspond to $\lambda = 0$. The models generally differ (sometimes dramatically) for intermediate values of $\gamma$ and $\lambda$. We report maximum likelihood estimates for both $\gamma$ and $\lambda$ and compare them.

The logit-AQRE correspondences for the sequential and strategy versions of the game$^7$ are given in figures 2 and 3, respectively. Superimposed on those graphs are the observed data points plotted at the maximum likelihood values of the logit response parameter. Table 2 gives the predicted and actual data aggregated across all of their experiments. In this table (as well as subsequent tables), $\lambda$ represents the maximum likelihood estimate of $\lambda$, $(\lambda_{lo}, \lambda_{hi})$ is a 95 percent confidence interval for $\lambda$, and $f_i$ denotes the empirical relative frequency of each strategy choice. The estimation here, as well as the estimations presented later in this article, is performed under the restriction that all subjects (both row and column players) have the same $\lambda$.

There are multiple points marked on each graph. Each numbered point represents a different experience level of the subjects, with higher numbers indicating greater experience. The black circle represents the pooled estimate, over all experience levels.
As is evident from the graphs, experience seems to have little effect in the data.$^8$ One can readily see that the logit-AQRE fits quite well in both versions of the game, successfully predicting more entry deterrence in the strategy version of the game. The fit is not perfect, as it underpredicts the frequency of (N, A) outcomes somewhat in the strategy version of the game. This underprediction partly results from our having included data from the matrix version of the game as well as the extensive form version. The NNM estimates exhibit similar underprediction and generate poorer fits in both games.

6. Equilibrium selection

As a second application of AQRE, we look at simple sender-receiver signaling games. Several experimental studies have been conducted using sender-receiver games, notably by Banks et al. (1994), Brandts and Holt (1992, 1993), and Partow and Schotter (1993). This is an ideal class of games to study with AQRE. First, the games typically have multiple sequential equilibria, and logit-AQRE generically selects a unique one as the limit of the unique connected path in the logit-AQRE graph. The earlier figures showing the logit-AQRE correspondence for the SWW experiments illustrate this selection. Notice that in the strategy version of their game there are multiple logit-AQRE for sufficiently high values of $\lambda$. However, there is a unique connected path starting at the unique equilibrium at $\lambda = 0$. The unique connected path selection is the darker curve in that figure. This is also true for extensive form games, as established in Theorem 4. Moreover, in signaling games the unique connected path need not correspond to traditional refinements of equilibrium.

For reasons of space, we do not reexamine the results of all the experimental signaling games that have been conducted to date. We focus on a subset of the experimental games reported by Banks et al. (1994) (BCP) and Brandts and Holt (1993) (BH).
Banks, Camerer, and Porter (1994) (BCP)

The BCP experiment consisted of a series of two-person signaling games, with two sender types, three messages, and three responses. Each game was designed to have two Nash equilibria, one of which was always further up the chain of deductive refinements than the other. The experiments were intended to give data on the conditions under which subjects play the more refined equilibrium. We analyze games 2, 3, and 4 of BCP. These games are given in Table 3. Each of these games has two Nash equilibrium outcomes, with one being a more refined equilibrium than the other.$^9$

Table 3. BCP (1994) signaling games.

BCP #2 (Nash versus sequential)*
              m1                     m2                     m3
        a1    a2    a3         a1    a2    a3         a1    a2    a3
t1     1, 2  2, 2  0, 3       1, 2  1, 1  2, 1       3, 1  0, 0  2, 1
t2     2, 2  1, 4  3, 2       2, 2  0, 4  3, 1       2, 2  0, 0  2, 1

BCP #3 (sequential versus intuitive)†
              m1                     m2                     m3
        a1    a2    a3         a1    a2    a3         a1    a2    a3
t1     0, 3  2, 2  2, 1       1, 2  2, 1  3, 0       1, 6  4, 1  2, 0
t2     1, 0  3, 2  2, 1       0, 1  3, 1  2, 6       0, 0  4, 1  0, 6

BCP #4 (intuitive versus divine)‡
              m1                     m2                     m3
        a1    a2    a3         a1    a2    a3         a1    a2    a3
t1     4, 0  0, 3  0, 4       2, 0  0, 3  3, 2       2, 3  1, 0  1, 2
t2     3, 4  3, 3  1, 0       0, 3  0, 0  2, 2       4, 3  0, 4  3, 0

* Nash: $(m_1, a_2, a_2, a_2)$; Sequential: $(m_3, a_2, a_2, a_1)$.
† Sequential: $(m_2, a_1, a_3, a_1)$; Intuitive: $(m_1, a_2, a_1, a_1)$.
‡ Intuitive: $(m_2, a_3, a_3, a_2)$; Divine: $(m_3, a_2, a_2, a_1)$.

The logit-AQRE graphs are displayed in figures 5 to 7 for these three games. Each triangle in the figures represents the simplex of mixing probabilities at some information set of the game. (Recall that players have three possible actions at each information set in these games.) The centroid of each triangle corresponds to $\lambda = 0$. As $\lambda$ increases without bound, the logit solution probabilities converge to one of the vertices, corresponding to one of the sequential equilibrium pure strategy profiles, as indicated by the solid lines. The aggregate data from the experiment are represented by a black dot. Observe that convergence from the centroid to the sequential equilibrium selected by the logit solution is frequently nonmonotonic (for example, Type 1 senders and Message 2 receivers in Game 3). This is in marked contrast to NNM, since the NNM correspondence is simply defined by a straight line in each graph connecting the centroid to the appropriate vertex.

Figure 5. BCP (1994) Game 2. Sequential versus Nash.

Table 4. BCP experimental results.

                       BCP #2                        BCP #3                        BCP #4
               n     f_i    QRE    NNM       n     f_i    QRE    NNM       n     f_i    QRE    NNM
t1   m1        7    .184   .103   .092      20    .500   .497   .817       2    .067   .274   .217
     m2        0    .000   .021   .092      14    .350   .339   .092       9    .300   .221   .217
     m3       31    .816   .875   .817       6    .150   .165   .092      19    .633   .504   .567
t2   m1       11    .324   .176   .092      45    .900   .927   .817      21    .700   .422   .217
     m2        0    .000   .026   .092       4    .080   .057   .092       0    .000   .032   .217
     m3       23    .677   .798   .817       1    .020   .016   .092       9    .300   .546   .567
m1   a1        0    .000   .049   .092       4    .062   .153   .092       9    .391   .298   .217
     a2       18   1.000   .837   .817      61    .939   .704   .817      13    .565   .593   .567
     a3        0    .000   .113   .092       0    .000   .142   .092       1    .044   .109   .217
m2   a1        0           .184   .092      14    .778   .686   .817       0    .000   .044   .217
     a2        0           .797   .817       0    .000   .174   .092       4    .444   .647   .567
     a3        0           .019   .092       4    .222   .140   .092       5    .556   .308   .217
m3   a1       53    .841   .726   .817       7   1.000   .999   .817      23    .821   .704   .567
     a2        0    .000   .026   .092       0    .000   .001   .092       5    .179   .235   .217
     a3       10    .159   .248   .092       0    .000   .000   .092       0    .000   .062   .217
λ | γ                     2.249   .725                   1.598   .725                   1.193   .350
λlo | γlo                 1.901   .627                   1.441   .634                    .099   .216
λhi | γhi                 2.715   .809                   1.864   .802                   1.470   .479
−L*                      83.390  92.224                101.050  108.628                 95.65  118.151

The estimation of $\lambda$ and $\gamma$ is reported in Table 4 for each game. We see that in each of the three cases, the QRE selects the more refined equilibrium, which corresponds to what was observed in the data.
Moreover, the QRE predictions pick up quite well which strategies are relatively more overplayed (or underplayed) compared to the refined equilibrium predictions. In BCP #2, the predicted equilibrium has both types sending message 3, with the receiver choosing action 1 in response to message 3, and choosing action 2 in response to messages 1 and 2. For both types of senders, AQRE predicts that nearly all message errors should involve sending message 1. Indeed, in all eighteen deviations from this equilibrium, senders use message 1 and never use message 2.$^{10}$ Response errors to the equilibrium message are predicted by AQRE to be more than ten times as likely to be action 3 as to be action 2, and indeed action 3 errors are observed in all ten of the responders' deviations from equilibrium. Also, QRE predicts that type 1 senders are less likely to deviate from the sequential equilibrium than type 2 senders, which was also observed in the data.

Figure 6. BCP (1994) Game 3. Intuitive versus sequential.

In the NNM, all deviations from the Nash equilibrium are equally likely, so that model cannot account for the observation that certain errors are systematically more likely than other errors. Consequently, NNM performs much more poorly than AQRE in these signaling games, where senders have three available messages and receivers have three available actions.

In BCP #3 the predicted (that is, unique intuitive) equilibrium has both types sending message 1, with the receiver choosing action 2 in response to message 1, and choosing action 1 in response to both nonequilibrium messages. Of the thirty-three deviations from the intuitive equilibrium predicted strategies, most were deviations by type 1 senders, who are predicted by the AQRE model to send the intuitive message approximately half the time.
(Indeed, type 1 senders were observed to send the intuitive message exactly half the time!)

In BCP #4, the predicted (that is, divine) equilibrium has both agents sending message 3, with the receiver choosing action 1 in response to the equilibrium message 3, and action 2 otherwise. In the quantal response equilibrium, both sender types are predicted to send the divine equilibrium message only about half the time. Aggregating both types of senders, this is what we observe in the data, where message 3 is sent twenty-eight times out of sixty chances; message 1 is predicted to be sent slightly more than a third of the time, and we observe it sent twenty-three out of sixty times. The frequency of actions in response to each message is predicted even more accurately (see Table 4).

Figure 7. BCP (1994) Game 4. Divine versus intuitive.

Brandts and Holt (1993) (BH)

As a follow-up to the BCP experiment, Brandts and Holt (1993) proposed two alternatives to game 3 (which differentiated the intuitive equilibrium from a nonintuitive sequential equilibrium), which they ran using similar$^{11}$ procedures. The games, which differ slightly from BCP because the sender has only two available messages, are given in Table 5. The two games each have a sequential and an intuitive equilibrium at the same strategies. In each of the BH games, four trials were conducted, each with twelve subjects, matched in pairs over a sequence of six periods. Each subject participated in two such sequences, once as a row player, and once as a column player. For details see Brandts and Holt (1993: 286–287).

Table 5. BH (1993) signaling games.

BH #3 (sequential versus intuitive)
                  m = I                            m = S
           C        D        E              C        D        E
A       45, 30   15, 0    30, 15         30, 90    0, 15   45, 15
B       30, 30    0, 45   30, 15         45, 0    15, 30   30, 15

BH #4 (sequential versus intuitive)
                  m = I                            m = S
           C        D        E              C        D        E
A       30, 30    0, 0    50, 35         45, 90   15, 15  100, 30
B       30, 30   30, 45   30, 0          45, 0     0, 30    0, 15

Note: sequential: (S, D, C), intuitive: (I, C, D).

Table 6. Data and estimates for BH experiments, periods 4 to 6.

                          BH #3                         BH #4
                  n     f_i    QRE    NNM       n     f_i    QRE    NNM
t = A   m = I    32    .970   .946   .838       0    .000   .026   .130
        m = S     1    .030   .054   .162      38   1.000   .974   .871
t = B   m = I    18    .462   .601   .838      15    .441   .280   .130
        m = S    21    .539   .399   .162      19    .559   .720   .871
m = I   a = C    48    .960   .686   .783       1    .067   .252   .086
        a = D     2    .040   .178   .108      13    .867   .729   .827
        a = E     0    .000   .136   .108       1    .067   .019   .086
m = S   a = C     4    .182   .109   .108      52    .912   .888   .827
        a = D    18    .818   .719   .783       2    .035   .050   .086
        a = E     0    .000   .173   .108       3    .053   .062   .086
λ | γ                         .108   .675                   .095   .741
λlo | γlo                     .097   .559                   .077   .633
λhi | γhi                     .125   .773                   .116   .830
−L*                         69.529  78.295                56.267  68.034

The point of the BH experiment was to attempt to construct a game in which play might conceivably converge to the less refined equilibrium. Brandts and Holt proposed a descriptive story of how a particular dynamic of play in early rounds could potentially lead to the unintuitive equilibrium, if the payoffs in BCP #3 were changed slightly. Their learning dynamic is related to the rationale for the logit AQRE selection, since the rough idea behind our selection is that if subjects begin an experiment at a QRE with relatively high error rates (that is, a low value of $\lambda$) and "follow" the equilibrium selection as $\lambda$ increases with experience, then they should converge to the equilibrium component implied by our selection.

Figure 8. BH (1993) Game 3. Sequential versus intuitive.

The data along with the QRE and NNM estimates are given in Table 6 and figures 8 and 9. Table 6 presents the results of the maximum likelihood estimation of the AQRE and NNM models, and the observed frequencies for the later periods of their experiment (periods 4 to 6).
Notice that again AQRE provides a better fit than NNM. In figures 8 and 9, we present graphically the results of estimation with a breakdown by period. These figures are similar to those of the BCP experiments for the receiver. However, the sender in the BH experiments has just two messages. Therefore, we graph the sender data in a unit square, with type A's choice frequencies on the horizontal axis and type B's on the vertical axis. In those graphs higher numbers correspond to higher levels of experience (in the same way as in BH (1993: 442, figure 5)). The curves in the graphs represent the predicted move frequencies for the logit-AQRE selection, as a function of $\lambda$. As in the BCP figures, the QRE correspondence starts at the centroid (the center of the square and the center of the simplices) for low values of $\lambda$, and converges to the logit solution (the intuitive equilibrium for game 3 and the nonintuitive sequential equilibrium for game 4) as $\lambda$ goes to infinity. While the estimated values of $\lambda$ are not strictly increasing in experience levels, there is an overall trend in the right direction, as is indicated by comparing the first three periods (white circle) to the last three periods (black circle).

Figure 9. BH (1993) Game 4. Sequential versus intuitive.

7. Centipede game experiments

McKelvey and Palfrey (1992) studied experimentally two versions of the centipede game, displayed in figures 10 and 11, respectively. In that article they proposed a theory of play based on incomplete information, altruism, and heterogeneous prior beliefs, and fit a five-parameter model to the data.

Figure 10. The four-move centipede game (McKelvey-Palfrey, 1992).

Figure 11. The six-move centipede game (McKelvey-Palfrey, 1992).
The model captured the essential qualitative features of the data and fit the exact move frequencies of the data fairly well.

In this section we reexamine the centipede data using the logit AQRE model. We describe the details of the model as it applies to the four-move version of figure 10. The computations for the six-move version are similar. Denote player 1's behavior strategies as $(p_1, p_2)$ and player 2's behavior strategies as $(q_1, q_2)$, where

$p_1$ = prob{1 chooses T at first move}
$p_2$ = prob{1 chooses T at third move (if reached)}
$q_1$ = prob{2 chooses T at second move (if reached)}
$q_2$ = prob{2 chooses T at fourth move (if reached)}.

At the first move, player 1 estimates the payoff of T and P by

$$U_{11}(T) = .4 + \varepsilon_{1T}$$

$$U_{11}(P) = .2q_1 + (1 - q_1)\bigl[1.6p_2 + (1 - p_2)[.8q_2 + 6.4(1 - q_2)]\bigr] + \varepsilon_{1P},$$

where $\varepsilon_{1T}$ and $\varepsilon_{1P}$ are independent random variables with a Weibull distribution with parameter $\lambda$. The logit formula then implies

$$p_1 = \frac{1}{1 + e^{\lambda\left[.2q_1 + (1 - q_1)[1.6p_2 + (1 - p_2)[.8q_2 + 6.4(1 - q_2)]] - .4\right]}}.$$

Similarly, we obtain expressions for $q_1$, $p_2$, and $q_2$, given below:

$$q_1 = \frac{1}{1 + e^{\lambda\left[.4p_2 + (1 - p_2)[3.2q_2 + 1.6(1 - q_2)] - .8\right]}}$$

$$p_2 = \frac{1}{1 + e^{\lambda[.8q_2 + 6.4(1 - q_2) - 1.6]}}$$

$$q_2 = \frac{1}{1 + e^{\lambda[1.6 - 3.2]}}.$$

This system of four equations in four unknowns is solved recursively ($q_2$, then $p_2$, then $q_1$, then $p_1$) and has a unique solution. There are several interesting features of the equilibrium correspondence. First, for most values of $\lambda$, $p_1 < q_1 < p_2 < q_2$. This is consistent with the experimental findings, where $p_1$ is .07, $q_1$ is .38, $p_2$ is .65, and $q_2$ is .75. Second, for very low values of $\lambda$, $p_1$, $q_1$, and $p_2$ are all decreasing in $\lambda$. That is, the equilibrium does not converge monotonically toward the Nash equilibrium as a function of $\lambda$.

We estimate the best fitting value of $\lambda$ and $\gamma$ for this data set, using standard maximum likelihood techniques.$^{12}$ The results are reported in Tables 7 and 8.
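The recursive solution and the likelihood it induces can be sketched in a few lines. This is an illustrative reconstruction, not the authors' estimation code: the function names are hypothetical, the take and pass counts are inferred from the $n$ and $f_i$ columns of Table 7 (an assumption), and a coarse grid search stands in for whatever optimizer was actually used.

```python
import math

def take_prob(lam, u_pass, u_take):
    """Logit probability of Take given expected payoffs of Pass and Take."""
    return 1.0 / (1.0 + math.exp(lam * (u_pass - u_take)))

def centipede4_aqre(lam):
    """Solve the four-move system backward: q2, then p2, then q1, then p1."""
    q2 = take_prob(lam, 1.6, 3.2)
    p2 = take_prob(lam, 0.8 * q2 + 6.4 * (1 - q2), 1.6)
    q1 = take_prob(lam, 0.4 * p2 + (1 - p2) * (3.2 * q2 + 1.6 * (1 - q2)), 0.8)
    cont = 1.6 * p2 + (1 - p2) * (0.8 * q2 + 6.4 * (1 - q2))
    p1 = take_prob(lam, 0.2 * q1 + (1 - q1) * cont, 0.4)
    return p1, q1, p2, q2

# Counts inferred from Table 7 (assumption): nodes reached n = 281, 261,
# 161, 57, so takes are 20, 100, 104 and, from f_i = .754, 43 at the end.
TAKES = (20, 100, 104, 43)
PASSES = (261, 161, 57, 14)

def neg_log_likelihood(lam):
    """Negative log-likelihood of the move counts, in move order."""
    probs = centipede4_aqre(lam)
    return -sum(t * math.log(pr) + s * math.log(1.0 - pr)
                for t, s, pr in zip(TAKES, PASSES, probs))

def grid_mle(lo=0.01, hi=5.0, step=0.01):
    """Crude grid search for the likelihood-maximizing lambda."""
    n_steps = int(round((hi - lo) / step))
    grid = [lo + k * step for k in range(n_steps + 1)]
    return min(grid, key=neg_log_likelihood)

if __name__ == "__main__":
    print(centipede4_aqre(1.70), neg_log_likelihood(1.70), grid_mle())
```

Evaluating the model at the reported estimate gives take probabilities and a likelihood that can be set against the AQRE column of Table 7; the grid search is only a rough check on where the maximum lies.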
The NNM model fails miserably, and in fact does no better than a completely random model. Our findings mirror results reported in Zauner (1993), using a related model. That article presents a structural model of the error process by adding independent Normally distributed payoff disturbances and estimates the variance of the disturbances (see next to last column of Tables 7 and 8). That article also adopts an "agent model" approach in that a player does not observe all disturbances to payoffs at the start of the game, but instead only the disturbance to his own current "take" payoff is observed. That model fits better than ours, as measured by the likelihood score, although the qualitative findings are very similar, as both models share two basic properties: they capture the feature in the data that take probabilities increase as the end of the game is approached; and they overestimate the take probabilities on the first and last moves. But as Zauner (1993) points out, neither model includes a heterogeneity parameter, which produced a dramatic improvement in fit in the models estimated by McKelvey and Palfrey (1992). One might hope that the inclusion of a heterogeneity parameter would also significantly improve the fits of these two models.

Table 7. Four-move centipede game.

                         n     f_i    AQRE     NNM    Zauner model   2 parameter model
Take     T             281    .071    .246    .500        .214             .083
Probs    T/P           261    .383    .315    .500        .301             .320
         T/PP          161    .644    .683    .500        .677             .718
         T/PPP          57    .754    .938    .500        .939             .637
λ | (σ²)                              1.70    0.00         --             1.655
λlo                                   1.54    0.00         --               --
λhi                                   1.87    0.01         --               --
q                                       --      --         --              .96
−L*                                 424.93  527.02       418.3            402.5

Table 8. Six-move centipede game.

                         n     f_i    AQRE     NNM    Zauner model   2 parameter model
Take     T             281    .007    .236    .500        .186             .029
Probs    T/P           279    .065    .252    .500        .213             .056
         T/PP          261    .215    .240    .500        .215             .085
         T/PPP         205    .527    .478    .500        .436             .504
         T/PPPP         97    .732    .842    .500        .814             .772
         T/PPPPP        26    .846    .980    .500        .980             .409
λ | γ                                  .61    0.00         --              .67
λlo | γlo                              .55    0.00         --               --
λhi | γhi                              .64    0.04         --               --
q                                       --      --         --              .97
−L*                                 534.03  797.02       506.4            454.3

There are other ways these models might be improved on. As discussed above, these models assume that a player cannot predict his own future play (the "agent" approach). Alternatively, one might assume that each player observes all his payoff disturbances at the beginning of the game. Unfortunately, this "player" approach is difficult to estimate because the equilibrium can no longer be solved recursively from the last move of the game. Another possible improvement would be to incorporate persistence in the error structure so that a player who passed at the beginning is also likely to pass later on. This would mimic the role of altruism.

We consider here a third possibility, which is to incorporate a second parameter, consistent with our previous analysis of this data. Namely, we assume that there is some small probability that players are "altruistic" (and hence choose P at every opportunity). This gives rise to a game of incomplete information (as described formally in McKelvey and Palfrey, 1992: 827) where with probability $q$ a player is "rational" and with probability $(1 - q)$ they are "altruistic". We then apply the AQRE directly to this modified game, obtaining a two-parameter model in which we estimate both $\lambda$ and $q$. The results of this analysis appear in the last column of Tables 7 and 8. As one can see, there is a dramatic improvement in fit over the one-parameter model, at any reasonable significance level.
While the log likelihood is still several orders of magnitude worse than the five-parameter model of McKelvey and Palfrey (1992), one must keep in mind that that model permitted heterogeneity and time trends, in addition to altruism.

8. Conclusions

This article developed a model of equilibrium behavior in extensive form games when players are imperfect best responders. At each information set, players choose better actions with higher probabilities than worse actions but do not choose best responses with probability one. We developed a general theoretical structure, called agent quantal response equilibrium, and a more specialized version, called logit AQRE, which we used to fit the model to several experimental data sets. We summarize the theoretical findings and the empirical findings separately.

Theoretical findings

1. Agent quantal response equilibria exist in all finite games.
2. Limit points of logit AQRE (as the precision parameter gets arbitrarily large) define a nonempty refinement of sequential equilibria of any finite extensive form game.
3. For generic extensive form games, there exists a unique connected component of the logit AQRE correspondence, which includes equilibria for all values of the precision parameter. A limit point of this component (as the precision parameter gets arbitrarily large) provides a unique selection from the set of sequential equilibrium components of the underlying game. We call this the logit solution of the game.
4. The logit solution of the game is not logically related to other refinement criteria, such as the intuitive criterion or trembling hand perfection.
5. Agent quantal response equilibrium depends in systematic ways on inessential transformations of the extensive form.

Empirical findings

1. The AQRE model explains the finding by Schotter et al. (1994) that one-shot play in the stage game of the chain-store paradox differs systematically depending on whether the players make decisions sequentially or simultaneously. More entry deterrence occurs when the decisions are made simultaneously. This is precisely what our model predicts. No game-theoretic equilibrium concept satisfying invariance can explain this finding.
2. The AQRE model was used to fit data from a variety of signaling games with multiple equilibria. In all cases, the observed data corresponded to the logit solution to the game, even in the remarkable game of Brandts and Holt (1993) in which play converges to an unintuitive sequential equilibrium. In these signaling games, the logit solution also makes predictions about the pattern of behavior off the equilibrium path, as well as making predictions about the relative probabilities of reaching different off-equilibrium paths. The logit solution predictions are borne out in the data from all five different signaling games we examine.
3. The main features of the data from centipede experiments are consistent with the logit equilibrium. In particular, the model predicts take probabilities to increase as a game progresses. We fit data from two different variants of the centipede game and also fit an adapted version of the logit equilibrium in which some fraction of the agents are altruistic. That two-parameter model fits the data significantly better.
4. For all the experimental data we analyze, we also fit a competing model, called the noisy Nash model. In that model, the maintained hypothesis is that at each information set the player deviates from the Nash equilibrium behavior strategy with some fixed probability. We obtained maximum likelihood estimates of this error probability. The logit model outperforms the NNM model without exception, usually by a wide margin.
Appendix

This appendix proves the properties of the quantal response correspondence stated in Theorem 4.

Theorem 4. For almost all finite extensive form games,

1. The logit AQRE correspondence $Q$ is a one-dimensional manifold.
2. For almost all $\lambda$ there are an odd number of logit-AQRE.
3. There is a unique branch, $B$ (the principal branch), of the logit AQRE selection connected to the centroid of the game at $\lambda = 0$.
4. The principal branch of the logit AQRE correspondence selects a unique component from the set of sequential equilibria of the game, in the sense that if $\{(\lambda_t, p_t)\}_t$ and $\{(\lambda'_t, q_t)\}_t$ are sequences in $B$ with $\lim_{t\to\infty} \lambda_t = \lim_{t\to\infty} \lambda'_t = \infty$, and if $p^*$ is a limit point of $\{p_t\}_t$ and $q^*$ is a limit point of $\{q_t\}_t$, then $p^*$ and $q^*$ are in the same component of the set of sequential equilibria of the game.

Proof: For this proof, we fix the extensive form tree $\Gamma$, and let $U$ represent the set of possible payoff vectors that could be attached to the terminal nodes of the extensive form. So elements of $U$ are functions $u : \Gamma^t \to \mathbb{R}^n$ from the terminal nodes to payoff vectors for the players. For each $h \in H$, select $a_{h0} \in A(h)$, and write $A_h = A(h) - \{a_{h0}\}$. Define

$$D_h = \Bigl\{ p_h : A_h \to \mathbb{R} \;\Big|\; p_h(a) \ge 0 \text{ for all } a \in A_h, \text{ and } \sum_{a \in A_h} p_h(a) \le 1 \Bigr\}.$$

Write $p_h(a_{h0}) = 1 - \sum_{a \in A_h} p_h(a)$, and for all $a \in A(h)$ write $p_{ha} = p_h(a)$, and $p_{h0} = p_h(a_{h0})$. Let $D = \prod_{h \in H} D_h$, $m_h = |A_h|$, and $m = \sum_{h \in H} m_h$. Let $X = D^\circ \times (0, \infty)$. Define $F : X \times U \to \mathbb{R}^m$ by, for $h \in H$ and $a \in A_h$,

$$F_{ha}(p, \lambda, u) = \frac{1}{\lambda} \ln\left(\frac{p_{ha}}{p_{h0}}\right) - v_{ha}(p \mid u) + v_{h0}(p \mid u),$$

where $v_{ha}$ is the expected payoff (to the player who chooses at $h$) of the continuation game from taking action $a$, and $v_{h0}$ is the corresponding payoff from taking action $a_{h0}$. For any $u \in U$, write $f_u(p, \lambda) = F(p, \lambda, u)$. For any given $u \in U$, the logit QRE correspondence for $u$ is given by $Q = f_u^{-1}(0)$, where $0$ is the zero vector in $\mathbb{R}^m$.

We will show that $F$ is surjective for any $(p, \lambda) \in X$. In particular, assume $F(p, \lambda, u) = x$.
Then for any $\epsilon \in \mathbb{R}^m$, construct $u^\epsilon \in U$ as follows. For any $h \in H$, pick $\eta_h : A(h) \to \mathbb{R}$ such that

$$\sum_{a \in A(h)} p_h(a)\,\eta_h(a) = 0$$

and such that $\eta_h(a_{h0}) - \eta_h(a) = \epsilon_{ha}$. Then, for any terminal node $v$, set

$$u^\epsilon_i(v) = u_i(v) + \sum_{h \in H_i,\; a \in A(h),\; a < v} \eta_h(a),$$

where $a < v$ means that the action $a$ precedes $v$. Then $F(p, \lambda, u^\epsilon) = x + \epsilon$. Hence, for any fixed $(p, \lambda) \in X$, $F$ is surjective. Hence, $F(p, \lambda, u^\epsilon)$ is transversal to any submanifold of $\mathbb{R}^m$. Hence, by the transversality theorem (Guillemin and Pollack, 1974: 68), for almost all $u \in U$, $f_u(p, \lambda) = F(p, \lambda, u)$ is transversal to $0$. It follows (Guillemin and Pollack, 1974: 28) that $f_u^{-1}(0)$ is a manifold of codimension $m$, or dimension 1, for almost all $u \in U$. This completes the proof of part (1) of the theorem.

Note that the above argument can be extended to the case when the domain of $F$ is bounded. For any $c = (\underline{c}, \bar{c})$ with $0 < \underline{c} < \bar{c}$, define $X_c \subseteq X$ by $X_c = D^\circ \times [\underline{c}, \bar{c}]$. Then $X_c$ is an $(m + 1)$-dimensional manifold with boundary. It follows from an argument similar to the one above that both $F$ and $\partial F : \partial(X_c \times U) \to \mathbb{R}^m$ are surjective for all $(p, \lambda) \in X_c$. Hence, by the transversality theorem, for almost all $u \in U$, $f_u^{-1}(0)$ is a manifold with boundary of dimension 1. We write $Q_c = f_u^{-1}(0)$ to denote the logit-AQRE correspondence restricted to $X_c$.

Now pick $M > 0$ so that for all $p \in B^\circ$, $\sup_{h \in H,\, a, c \in A_h} |v_{ha}(p \mid u) - v_{h0}(p \mid u)| \le M$. Define $a_\lambda = e^{-\lambda M}$ and $b_\lambda = e^{\lambda M}$. Then it follows that, for any $(p, \lambda) \in Q$,

$$-\lambda \cdot M \le \log\left(\frac{p_{ha}}{p_{h0}}\right) \le \lambda \cdot M \;\Rightarrow\; a_\lambda p_{h0} \le p_{ha} \le b_\lambda p_{h0}.$$

But

$$p_{ha} \le b_\lambda p_{h0} \;\Rightarrow\; 1 - p_{h0} = \sum_{a \in A_h} p_{ha} \le b_\lambda m_h p_{h0} \;\Rightarrow\; p_{h0} \ge 1/(b_\lambda m_h + 1)$$

and

$$p_{ha} \ge a_\lambda p_{h0} \;\Rightarrow\; p_{ha} \ge a_\lambda/(b_\lambda m_h + 1) = c_\lambda.$$

Since $a_\lambda < 1$, it follows that for all $a \in A(h)$ (including $a = a_{h0}$), the above inequality holds. Define $W = \{(p, \lambda) \in X : p_{ha} \ge c_\lambda \text{ for all } h \in H, a \in A(h)\}$. Thus, we have shown that $Q \subseteq W \cap X$. Similarly, $Q_c \subseteq W \cap X_c$.
In other words, the QRE graph can only "exit" $X$ at the minimum and maximum values of $\lambda$. Next, we establish that for sufficiently small $\lambda$ there is a unique solution. In other words, there is only one possible exit for small $\lambda$.

Lemma 1. For sufficiently small $\lambda$, $\pi^*(\lambda)$ is a singleton.

Proof: To see this, define the mapping $\phi : \mathbb{R}^m \to \mathbb{R}^m$ to have components
\[ \phi_{ha}(p) = \frac{\exp[\lambda v_{ha}(p \mid u)]}{\sum_{c \in A(h)} \exp[\lambda v_{hc}(p \mid u)]}. \tag{$*$} \]
Then, for any $\lambda$, $(p, \lambda) \in Q$ if and only if $p$ is a fixed point of $\phi$. We will show that for $\lambda$ sufficiently small, $\phi$ is a contraction mapping. We use three facts to prove the result, each of which follows from easy arguments. We use the notation $v_{hac}(p \mid u) = v_{ha}(p \mid u) - v_{hc}(p \mid u)$.

Fact 1: For any $p, q \in \Delta^0$ (the interior of $\Delta$), $\max_l |p_l - q_l| \le \max_{kl} |p_l / p_k - q_l / q_k|$.
Fact 2: Since the derivative of $e^x$ at $x = 0$ is 1, for all $D > 1$ there is a $\delta$ such that whenever $|x_1|, |x_2| < \delta$, $|\exp[x_1] - \exp[x_2]| \le D \cdot |x_1 - x_2|$.
Fact 3: There is an $M > 0$ such that, for any $p, q \in B$, $\max_{hac} |v_{hac}(p \mid u) - v_{hac}(q \mid u)| \le M \cdot \max_{ha} |p_{ha} - q_{ha}|$.

Pick any $D > 1$, and let $\delta$ be defined as in Fact 2 and $M$ as in Fact 3. We pick $\bar{\lambda}$ to satisfy
1. $\bar{\lambda}\, v_{hac}(p \mid u) < \delta$ for all $h, a, c$ with $h \in H$, $a, c \in A(h)$, and any $p \in B$.
2. $\bar{\lambda} < 1/(D \cdot M)$.

For any $\lambda \le \bar{\lambda}$, write $\rho = \lambda \cdot D \cdot M$. Then pick any $p, q \in B$. Letting $\|\cdot\|$ represent the sup norm, we have
\begin{align*}
\|\phi(p) - \phi(q)\| &= \max_{ha} |\phi_{ha}(p) - \phi_{ha}(q)| \\
&\le \max_{hac} |\phi_{ha}(p)/\phi_{hc}(p) - \phi_{ha}(q)/\phi_{hc}(q)| \\
&= \max_{hac} |\exp[\lambda v_{hac}(p \mid u)] - \exp[\lambda v_{hac}(q \mid u)]| \\
&\le \lambda \cdot D \cdot \max_{hac} |v_{hac}(p \mid u) - v_{hac}(q \mid u)| \\
&\le \lambda \cdot D \cdot M \cdot \max_{ha} |p_{ha} - q_{ha}| = \rho \cdot \|p - q\|,
\end{align*}
where $\rho < 1$. The steps follow, respectively, by the definition of $\|\cdot\|$, Fact 1, equation ($*$), Fact 2, Fact 3, and the definition of $\|\cdot\|$. It follows that for $\lambda \le \bar{\lambda}$, $\phi$ is a contraction mapping. Hence, it has a unique fixed point. □
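The contraction argument of Lemma 1 can be checked numerically. The following is a minimal Python sketch, not the authors' procedure, under assumptions that are not in the original: the game is a hypothetical 2x2 simultaneous-move game, viewed as an extensive form with one information set per player, so that the continuation value $v_{ha}(p \mid u)$ is simply the expected payoff of action $a$ against the other player's mixed strategy. The payoff tables `U1` and `U2` and the helper names `logit`, `phi`, `sup_dist`, and `solve` are illustrative only.

```python
import math

# Hypothetical payoffs (assumption, not from the article).
# Rows index player 1's actions, columns index player 2's actions.
U1 = [[4.0, 0.0], [0.0, 1.0]]
U2 = [[1.0, 0.0], [0.0, 4.0]]


def logit(values, lam):
    """Logit choice probabilities exp(lam*v_a) / sum_c exp(lam*v_c)."""
    top = max(values)  # subtracting the max leaves the ratios unchanged
    weights = [math.exp(lam * (v - top)) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


def phi(p, lam):
    """One application of the logit response map to the profile p = (p1, p2)."""
    p1, p2 = p
    # Expected payoff of each action against the other player's probabilities.
    v1 = [sum(U1[a][b] * p2[b] for b in range(2)) for a in range(2)]
    v2 = [sum(U2[a][b] * p1[a] for a in range(2)) for b in range(2)]
    return (logit(v1, lam), logit(v2, lam))


def sup_dist(p, q):
    """Sup-norm distance between two profiles."""
    return max(abs(x - y) for ph, qh in zip(p, q) for x, y in zip(ph, qh))


def solve(lam, start, tol=1e-12, max_iter=10_000):
    """Iterate phi from `start` until successive iterates differ by less than tol."""
    p = start
    for _ in range(max_iter):
        nxt = phi(p, lam)
        if sup_dist(nxt, p) < tol:
            return nxt
        p = nxt
    raise RuntimeError("no convergence; lam may be too large for a contraction")


if __name__ == "__main__":
    lam = 0.1
    a = solve(lam, ([0.9, 0.1], [0.2, 0.8]))
    b = solve(lam, ([0.01, 0.99], [0.99, 0.01]))
    print("fixed point from first start: ", a)
    print("fixed point from second start:", b)
    print("sup-norm gap between them:    ", sup_dist(a, b))
```

For a small value such as `lam = 0.1`, iteration from two very different starting profiles should settle on the same fixed point, which is what uniqueness in Lemma 1 predicts; at `lam = 0` the map sends every profile to the centroid. For large `lam` the map need not be a contraction, and `solve` may fail to converge, which is consistent with the lemma covering only small $\lambda$.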
Claim 3 of the theorem now follows directly from the above lemma. Also, from the above argument, setting $c = (\underline{c}, \bar{c})$, we have shown that $Q_c$ is a compact, one-dimensional manifold with boundary, which for small enough $\underline{c}$ has a unique intersection with $\lambda = \underline{c}$. Any such manifold has a finite number of connected, compact components, each of which must have an even number of boundary points. We have also shown that any boundary point must be at $\lambda = \underline{c}$ or $\lambda = \bar{c}$. Since there is exactly one solution at $\lambda = \underline{c}$, there must be an odd number of solutions at $\lambda = \bar{c}$. This establishes claim 2 of the theorem.

We now show the final claim of the theorem, namely that as $\lambda \to \infty$, the branch $B$ of the manifold that passes through $\lambda = \bar{\lambda}$ converges to a unique component of the set of sequential equilibria.

Lemma 2. If $\{(\lambda_t, p_t)\}_t$ and $\{(\lambda'_t, q_t)\}_t$ are sequences in $B$ with $\lim_{t \to \infty} \lambda_t = \lim_{t \to \infty} \lambda'_t = \infty$, and if $p^*$ is a limit point of $\{p_t\}_t$ and $q^*$ is a limit point of $\{q_t\}_t$, then $p^*$ and $q^*$ are in the same component of the set of sequential equilibria of the game.

Proof: It follows from the arguments above that, for almost all games, there exists an increasing sequence $\{\lambda_i\}$ with $\bar{\lambda} < \lambda_i$ for all $i$, such that if we define $c_i = (\bar{\lambda}, \lambda_i)$, $X_i = X_{c_i}$, and $Q_i = Q_{c_i} \subseteq Q$, then
1. $Q$ is a one-dimensional manifold with a unique point, say $(\bar{p}, \bar{\lambda})$, for which $\lambda = \bar{\lambda}$, and a unique connected branch $B$ that passes through $(\bar{p}, \bar{\lambda})$.
2. $Q_i$ is a compact one-dimensional manifold with boundary, which has a finite number of connected components.
3. Letting $B_i$ be the connected branch of $Q_i$ that begins at $(\bar{p}, \bar{\lambda})$, it follows that $B_i$ is a compact connected one-dimensional manifold with boundary for all $i$, which has a unique intersection, say $(p_i, \lambda_i)$, with $\lambda = \lambda_i$.

Now for any $i$, define $A_i = \{(p, \lambda) \in B : \lambda > \lambda_i\}$, and $\bar{A}_i$ to be the closure of the projection of $A_i$ onto $D$. Then $\{\bar{A}_i\}$ is a decreasing sequence of sets.
By Theorem 2, $\cap_i \bar{A}_i$ is a subset of the set of sequential equilibria of the game. We show that for almost all games, $\cap_i \bar{A}_i$ must be a (nonempty) subset of a unique connected component of the set of sequential equilibria. First of all, since $D$ is compact, and each $\bar{A}_i$ is closed and nonempty, $\cap_i \bar{A}_i$ cannot be empty. Suppose, by way of contradiction, that $\cap_i \bar{A}_i$ contains points $p^*$ and $q^*$ which are in two distinct components of the set of sequential equilibria of the game. But if $p^*$ and $q^*$ are both in $\cap_i \bar{A}_i$, then we can construct a sequence $\{(p_i, \lambda_i)\} \subseteq B$ with $p_{2i-1} \to p^*$, $p_{2i} \to q^*$, $\lambda_i \to \infty$, and a homeomorphism $\phi : \mathbb{R} \to B$ satisfying $\phi(i) = (p_i^*, \lambda_i)$ and $\phi(2i) = (q_i^*, \eta_i)$.

In particular, start with any $\delta_1$, and find $(p_1^*, \lambda_1) \in B$ with $\lambda_1 > 1/\delta_1$ and $\|p_1^* - p^*\| < \delta_1$. Since $B$ is a connected one-dimensional manifold, it is path connected (Guillemin and Pollack, 1974: 38, ex. 3), so one can construct $\phi : [0, 1] \to B$ with $\phi(0) = (\bar{p}, \bar{\lambda})$ and $\phi(1) = (p_1^*, \lambda_1)$. Since $\phi[0, 1]$ is compact, it is bounded. Pick $\delta_2$ so that $1/\delta_2$ exceeds the bound, and find $(p_2^*, \lambda_2) \in B$ with $\lambda_2 > 1/\delta_2$ and $\|p_2^* - q^*\| < \delta_2$. By the same reasoning as above, one can construct $\phi : [1, 2] \to B$ with $\phi(1) = (p_1^*, \lambda_1)$ and $\phi(2) = (p_2^*, \lambda_2)$. Proceeding in this fashion, one can construct the sequence $\{(p_i, \lambda_i)\} \subseteq B$ and a homeomorphism $\phi : \mathbb{R} \to B$ with the properties specified.

Now for each $i$, $\phi[i, i + 1]$ is a compact one-dimensional manifold with boundary. Moreover, we can pick $\phi$ so that $\phi[i - 1, i] \cap \phi[i, i + 1] = \phi(i)$. We have constructed an infinite sequence of compact manifolds with boundary, each of whose projections onto $D$ connects a point near $p^*$ with a point near $q^*$.
Further, for any $\lambda_i$, at most a finite number of these manifolds intersect with $X_i$ (since $B \cap X_i$ is a compact one-dimensional manifold with boundary, which can consist of at most a finite number of components). Now, since $p^*$ and $q^*$ are in different connected components of the set of sequential equilibria, we can find an open set $E$ such that $E$ contains the component containing $p^*$ and such that the closure of $E$ does not intersect any other component. Now let $\partial E$ be the boundary of the closure of $E$. Then $\partial E$ is a compact subset of $D$. Further, each of the infinite sequence of manifolds connecting a point near $p^*$ with a point near $q^*$ must have a nonempty intersection with $\partial E$. Hence, $\partial E$ must have a nonempty intersection with $\cap_i \bar{A}_i$. But this contradicts the fact that $\partial E$ is disjoint from the set of sequential equilibria. □

This lemma proves that the principal branch of the logit-AQRE selects a unique component of the set of sequential equilibria as $\lambda$ goes to infinity. This establishes the final assertion of the theorem. □

Acknowledgment

This article has benefited from comments by two referees and by several participants at the Conference on Learning in Games (Texas A&M University, February 1994), the European Meetings of the Econometric Society (Maastricht, September 1994), and the 1995 Summer Workshop on Game Theory at Stony Brook. We thank the participants for their comments and also thank Peter Coughlan, Mark Fey, Eugene Grayver, and Rob Weber for their research assistance. We are grateful to Andy Schotter, David Porter, and Charles Holt for sharing their data. The financial support of the National Science Foundation (Grant #SBR-9223701) is gratefully acknowledged.

Notes

1. See also Chen et al.
(1997), who study learning dynamics in the context of a similar quantal response equilibrium model, which they call boundedly rational Nash equilibrium. Other applications can be found in McKelvey and Palfrey (1996), Lopez (1995), Anderson et al. (1998), and elsewhere.
2. Zauner (1993) has independently conducted a reexamination of the centipede data using a similar model, but with different distributional assumptions about the error structure. His findings are compared with ours in Section 7.
3. That is, there exist games in which there is a unique intuitive equilibrium that is different from the unique logit-AQRE selection.
4. This game can also be interpreted as a discrete (binary) version of an ultimatum game in which the offerer may offer either a 50/50 split or a 75/25 split of the pie. The unfair split (75/25) may be either accepted or rejected by the other player. The fair split is automatically accepted. Gale et al. (1995) call it the "ultimatum minigame" and study the dynamic stability of its Nash equilibria under the replicator dynamic.
5. Figures 2 and 3 are the AQRE correspondences for the versions of these games that were run by Schotter et al. (1994). Those games had different payoffs from the games of figure 1. However, the qualitative features of the AQRE for the games of figure 1 and for the SWW games are similar, so both sets of graphs are not reproduced here.
6. The authors also found that these differences were magnified by presenting the strategy version to the subjects in its matrix form. We are not proposing an explanation for this.
7. There are several other games examined by SWW, but none of these others were games of perfect information, and all have multiple AQREs, even in the sequential versions. We do not analyze them here.
8. This may be partly due to the fact that subjects played the games so few times.
9.
In game 2, one equilibrium is sequential and the other is not. In game 3, both are sequential, but only one is intuitive. In game 4, both are intuitive, but only one is divine.
10. Neither $m_1$ nor $m_2$ is a dominated strategy.
11. There were some minor procedural differences. BH successfully replicated the results of BCP game 3.
12. A similar analysis is reported in Fey et al. (1996) using an experiment based on a constant sum version of the centipede game. The details of the maximum likelihood procedures are explained in that article.

References

Anderson, S., Goeree, J., and Holt, C. (1998). "Rent Seeking with Bounded Rationality: An Analysis of the All Pay Auction." Forthcoming in the Journal of Political Economy.
Banks, J., Camerer, C., and Porter, D. (1994). "Experimental Tests of Nash Refinements in Signaling Games." Games and Economic Behavior. 6, 1–31.
Brandts, J. and Holt, C.A. (1992). "An Experimental Test of Equilibrium Dominance in Signaling Games." American Economic Review. 82, 1350–1365.
Brandts, J. and Holt, C.A. (1993). "Adjustment Patterns and Equilibrium Selection in Experimental Signaling Games." International Journal of Game Theory. 22, 279–302.
Chen, H., Friedman, J.W., and Thisse, J. (1997). "Boundedly Rational Nash Equilibrium: A Probabilistic Choice Approach." Games and Economic Behavior. 18, 32–54.
Chen, K.-Y. (1994). "The Strategic Behavior of Rational Novices." Ph.D. dissertation, California Institute of Technology.
Cooper, R., DeJong, D., Forsythe, R., and Ross, T. (1989). "Communication in the Battle of the Sexes Game: Some Experimental Results." RAND Journal of Economics. 20, 568–587.
Cooper, R., DeJong, D., Forsythe, R., and Ross, T. (1990). "Selection Criteria in Coordination Games: Some Experimental Results." American Economic Review. 80, 218–233.
Fey, M., McKelvey, R.D., and Palfrey, T.R. (1996). "Experiments on the Constant-Sum Centipede Game." International Journal of Game Theory. 25, 269–287.
Gale, J., Binmore, K., and Samuelson, L. (1995). "Learning to be Imperfect: The Ultimatum Game." Games and Economic Behavior. 8, 56–90.
Guillemin, V. and Pollack, A. (1974). Differential Topology. Prentice-Hall.
Harsanyi, J. (1973). "Games with Randomly Disturbed Payoffs." International Journal of Game Theory. 2, 1–23.
Harsanyi, J. and Selten, R. (1988). A General Theory of Equilibrium Selection in Games. Cambridge, MA: MIT Press.
Kohlberg, E. and Mertens, J.-F. (1986). "On the Strategic Stability of Equilibria." Econometrica. 54, 1003–1037.
Kreps, D. and Wilson, R. (1982). "Reputation and Imperfect Information." Journal of Economic Theory. 27, 253–279.
Lopez, G. (1995). "Quantal Response Equilibria for Models of Price Competition." Ph.D. dissertation, University of Virginia.
Luce, D. (1959). Individual Choice Behavior. New York: Wiley.
McFadden, D. (1974). "Conditional Logit Analysis of Qualitative Choice Behavior." In P. Zarembka (ed.), Frontiers in Econometrics. New York: Academic Press.
McKelvey, R.D. and Palfrey, T.R. (1992). "An Experimental Study of the Centipede Game." Econometrica. 60, 803–836.
McKelvey, R.D. and Palfrey, T.R. (1995). "Quantal Response Equilibria in Normal Form Games." Games and Economic Behavior. 7, 6–38.
McKelvey, R.D. and Palfrey, T.R. (1996). "A Statistical Theory of Equilibrium in Games." Japanese Economic Review. 47, 186–209.
Partow, J. and Schotter, A. (1993). "Does Game Theory Predict Well for the Wrong Reasons? An Experimental Investigation." C.V. Starr Center for Applied Economics Working Paper 93-46, New York University.
Rosenthal, R. (1989). "A Bounded-Rationality Approach to the Study of Noncooperative Games." International Journal of Game Theory. 18, 273–292.
Schotter, A., Weigelt, K., and Wilson, C. (1994).
"A Laboratory Investigation of Multiperson Rationality and Presentation Effects." Games and Economic Behavior. 6, 445–468.
Van Damme, E. (1987). Stability and Perfection of Nash Equilibria. Berlin: Springer-Verlag.
Zauner, K. (1993). "Bubbles, Speculation, and a Reconsideration of the Centipede Game Experiment." Mimeo, University of California at San Diego. Revised as "A Reconsideration of the Centipede Game Experiments."