Learning From Experience in Beauty Contests

Jeffrey A. Livingston*
Bentley University
Michael K. Price
Georgia State University
and NBER
Susan Skeath
Wellesley College
Abstract: In the "p-beauty" contest, contestants choose a number between zero and 100 and the
winner is the player who selects the number that is closest to some fraction p of the average
chosen by the group. While the Nash equilibrium is for all players to choose 0, subjects
frequently display bounded rationality by choosing numbers that are substantially higher.
However, subjects adjust their choices over repeated plays, typically converging to close to
equilibrium. Theoretical models of this learning process come in two flavors. The first, belief-based models, assumes that players learn via a sophisticated process in which they form beliefs about the strategies followed by their opponents and best respond to those beliefs. The second, choice reinforcement models, assumes that learning is less sophisticated: subjects react to the payoffs associated with past plays and adopt the strategies that proved successful. While past experimental
studies have examined learning in this context, none have employed a design that can distinguish
between these two models. We develop such a design by placing subjects in a circumstance
where the pattern of results they saw in past plays likely does not match the winning strategy in
their next play, and find that players still best respond to the pattern they saw. We interpret this
as evidence in favor of choice reinforcement models.
Keywords: bounded rationality, beauty contests, learning, experience
JEL codes: D01, C7
*Corresponding author. Department of Economics, Bentley University, 175 Forest Street,
Waltham, MA 02138. Phone 781-891-2538, Fax 781-891-2896, Email [email protected]
I. Introduction
In a seminal study of boundedly rational behavior, Nagel (1995) presents an experiment
where subjects play the p-beauty contest game (PBCG), in which contestants choose a number
between zero and 100 and win by being closest to some fraction, p, of the average of all of the
numbers chosen. While the unique equilibrium of the game is for all players to select zero, across
many experimental studies, players typically select numbers above 0. For example, when
p = 2/3, the average number chosen by a group of subjects is typically between 20 and 35.
A number of theoretical models have been proposed to explain this violation of Nash
Equilibrium behavior.1 Nagel (1995) and Stahl and Wilson (1995) initially offered versions of a
model of what has come to be known as level-k thinking. In such a model, a level-0 thinker does
not think strategically and effectively selects a number at random, while a level-k thinker selects
the best response to a belief that all other players are following a level-(k-1) strategy. For
example, when p = 2/3, a level-1 player selects a number near 33 in an effort to be close to 2/3 of
the anticipated average play of 50, while a level-2 player selects a number around 22 in an effort
to be close to 2/3 of the anticipated average play of 33.
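For concreteness, the level-k guesses cited above follow from the iteration g_k = p * g_(k-1) starting from a level-0 anchor of 50; the short Python sketch below reproduces them. The anchor of 50 is the conventional assumption about level-0 play, and, as in the approximations quoted in the text, a player's own effect on the group average is ignored.

    # A minimal sketch of level-k guesses in the p-beauty contest, assuming
    # the conventional level-0 anchor of 50 (the midpoint of [0, 100]) and
    # ignoring each player's own effect on the group average.
    def level_k_guess(k, p=2/3, level0_anchor=50.0):
        return level0_anchor * p ** k

    for k in range(4):
        print(f"level-{k} guess: {level_k_guess(k):.1f}")
    # prints 50.0, 33.3, 22.2, 14.8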
One frequent response to findings of seemingly irrational behavior is that it is likely to
dissipate as actors learn from experience.3 Indeed, when the PBCG is repeated over a
number of rounds, players who have never played before show a convergence toward the Nash
1 Crawford et al. (2010) provide a review of these models of what they term “strategic
sophistication,” including the level-k model detailed below, the related cognitive hierarchy
model of Camerer et al. (2004) in which “Step k” thinkers accurately predict the relative
frequencies of players doing fewer steps of thinking from levels 0 to k-1, equilibrium plus noise,
finitely iterated strict dominance and k-rationalizability (Bernheim (1984) and Pearce (1984)),
quantal response equilibrium (McKelvey and Palfrey (1995)), and noisy introspection (Goeree and
Holt (2004)).
3 See, for example, List (2003) who offers powerful evidence from a field experiment that market
experience attenuates the endowment effect.
equilibrium choice of zero as the rounds proceed.4 Further, Sbriglia (2008) shows that learning
can lead to more advanced levels of thinking. Over the course of six rounds, each round's winning player
leaves the game and the remaining players are given information about the thought process used
by the winner. Compared to when players are not given this information, the game converges to
the predicted equilibrium more quickly.
Several theoretical models have been advanced to explain how players learn in the PBCG and related strategic situations. The models fall into two types, both of which Camerer and Ho (1999) capture as special cases of their general “experience-weighted attraction” model of learning by game players (see also Camerer, Ho and Chong, 2003). In the first type, players form beliefs about the strategies of their opponents and best
respond to those beliefs. They describe this type of model as follows:
“One approach, belief-based models, starts with the premise that players keep track of the
history of previous play by other players and form some belief about what others will do
in the future based on past observation. Then they tend to choose a best-response, a
strategy that maximizes their expected payoffs given the beliefs they formed.”
In the second type, players are less sophisticated, merely reacting to patterns that they witness in
previous outcomes:
“A different approach, choice reinforcement, assumes that strategies are ‘reinforced’ by
their previous payoffs, and the propensity to choose a strategy depends in some way on
its stock of reinforcement. Players who learn by reinforcement do not generally have
beliefs about what other players will do. They care only about the payoffs strategies
yielded in the past, not about the history of play that created those payoffs.”
4 Nagel (1995), Alba-Fernández et al. (2006), Guth et al. (2002), and Ho, Camerer and Weigelt
(1998) are among the papers that have presented evidence on this issue.
Most recent theories of learning in games where level-k behavior is exhibited assume a
sophisticated process in the style of a belief-based model. For example, Ho and Su (2013) study
a level-k model of behavior in centipede games. Their model assumes that players carefully
attempt to intuit the level rule played by their opponents, so that they can play the best response
to the behavior that this level rule entails. Mohlin (2012) explores a more general setting where,
in a level-k framework, the lowest type merely best responds to the average of past play, and
higher types develop their beliefs about how others learn using increasingly complex models.
These models of sophisticated learning enjoy some empirical support from Slonim
(2005).7 He runs an experiment where players play a supergame of nine rounds of a PBCG,
separated into three games of three rounds each, against a new set of two opponents in each
game. Players thus gain experience with the PBCG as they move from round to round and game
to game. He examines two treatments, SAME and MIX. In the SAME treatment, all three players
start the game with the same amount of experience. In the MIX treatment, one player is
experienced and the other two have not seen the game before. He finds that experienced players
choose lower numbers in round one when facing experienced opponents than when facing
inexperienced opponents, often correctly predicting the guesses of their opponents. As a result,
the experienced player is far more likely to win, particularly in the first two rounds. For example,
in the first round of games where one player is experienced and the others are inexperienced, the
7 The extant literature does include several studies of the level of thinking employed by subjects
who play the PBCG. Sbriglia (2008) shows that learning can lead to more advanced levels of
thinking. Over the course of six rounds, each round’s winning player leaves the game but gives
information about her thought process to the remaining players. The game converges more
quickly to the predicted equilibrium, and the levels of thinking advance more quickly, when
players are provided this information than when they are not. Costa-Gomes and Crawford
(2006) define several types of strategies that players might follow when playing a PBCG that are
based on level-k thinking, several of which are rational best responses to non-equilibrium PBCG
strategies. They utilize a series of 16 two-person PBCGs to decipher each player's type, and find
that many players can be neatly classified into their defined types.
experienced player wins 85 percent of the time. Slonim argues that these results are consistent
with “the ‘sophisticated’ learning studied in Cooper and Kagel (2002), Camerer and Ho (1998),
Camerer et al. (2002), Stahl (2000) and others. For example, Cooper and Kagel find that some
players learn about opponent’s reasoning in signaling games and Camerer et al. find that some
players learn that other players are learning.”
However, in his setting, there is a close correlation between experience and the number
the player chooses. For example, in Slonim’s sample, when playing against other inexperienced
players, the average guess in the first round of new players is 33.5 with a standard deviation of
only 2.3. Thus, in his experiment, experience type is an excellent predictor of the number a
player can be expected to guess, resulting in a pattern of target numbers that are relatively easy to
predict without thinking carefully about the strategy that might have led to the choice. The data
are thus consistent with either sophisticated belief-based models or less sophisticated
reinforcement models.
We conduct an experiment that is designed to identify which of these two types of
models is more consistent with the learning observed in PBCGs. Subjects gain experience in a
game of six rounds of the PBCG and then play in a later session against a known mix of
experienced players who had played the PBCG once before and inexperienced players who had
never before played the game. As in past studies, experienced players tend to play lower
numbers than inexperienced players. The average guess of an experienced player in round one in
our sample is 24.40, while the average guess of an inexperienced player is 41.52. Experimental
treatments vary both the proportion of types in the game in which a subject gains experience and
the proportion of types – experienced and inexperienced agents – in the second game played.
We use such variation as a means to randomly shock the level of play first observed by a
subject. Importantly, this allows us to explore how subjects behave when the target number they
witness in their initial play is different from the target number in their second play. For example,
we observe subjects who gained experience while playing the game with five inexperienced
opponents, resulting in an average target number of 27.34 in round one. Some of these subjects
play a second time against another set of five inexperienced opponents, likely resulting in a
similar target to what they witnessed initially, but others play a second time with four experienced
players and one inexperienced player, resulting in an average target number of 18.29 in round
one. If “learning” reflects pattern recognition whereby subjects simply best respond to outcomes
observed in prior repetitions of the game, we would expect subjects that initially observe a lower
target to guess lower numbers at the start of the second game than counterparts who initially
observed a higher target, regardless of the mix of experience types in the new game. Conversely,
if learning reflects an increased understanding of the PBCG and the iterated dominance
reasoning required to “solve” such games, then we would expect play at the start of the second
game to be independent of the level of rationality initially observed by an experienced subject,
and instead vary with the mix of experience types. In this case, for example, players would see
that their new game involves a larger proportion of opponents who also have experience in the
game, anticipate that such players will also choose lower numbers due to their increased
understanding of the game, and best respond accordingly by selecting an even lower number.
The results offer strong support for choice reinforcement-style models over belief-based
models. Four results support this conclusion. First, relative to situations where the experience
mix is the same in both games, experienced players undershoot the target when there are fewer
experienced players than when they played initially, resulting in a new target number that tends
to be higher than they witnessed. Similarly, they overshoot the target when there are more
experienced players than when they played initially, resulting in a new target number that tends
to be lower than they witnessed. Second, experienced players are less likely to choose a number
near the target as the difference between the new target number and the target number they
initially witnessed increases. Third, when players from different initial experience mixes play in
a new game of the same type, they tend to choose different numbers. Finally, when players from
the same initial experience mix play in new games of different types, they tend to choose similar numbers, failing to account for the differences in the experience mix.
The remainder of our analysis proceeds as follows. Section II presents the experimental
design. Section III describes the data obtained from the experiments, presents our strategy for
analyzing the data, and describes the results of our analysis. Section IV concludes by reviewing
our main results and outlining possible extensions of this line of research.
II. Experimental Design
Subjects were recruited from amongst the undergraduate student population at the
University of Tennessee, Knoxville, where the experiments were conducted. At the time of
recruitment, subjects were informed that they would be participating in an experiment that would
take up to 75 minutes to complete. The experiment was conducted in the UT Experimental
Economics Laboratory, which holds 24 networked computer workstations in separate cubicles.
Groups of six subjects played a game, which we define as six consecutive rounds of a PBCG. Up to
four games were played at a time; we define a set of concurrently played games as a session.
Within each round of play, subjects guessed a number between 0 and 100, inclusive. The player
whose guess was closest to 2/3 of the mean guess of the group won the round.
The winner of each round was paid $3. In addition to the prizes awarded to the winner of
each round, all subjects were awarded a $10 participation fee. The games were run by computer
using z-Tree, and average earnings for the experiment, including the participation fee, were
$13.44 per subject.
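As an illustration of the payoff rule, the sketch below computes the round target and splits the $3 prize among winners; the guesses shown are made up for the example, and the function name is ours rather than part of the experimental software.

    # Compute the target (2/3 of the group mean) and each player's payoff
    # for a single round; ties split the prize, as in the instructions.
    def score_round(guesses, p=2/3, prize=3.0):
        target = p * sum(guesses) / len(guesses)
        distances = [abs(g - target) for g in guesses]
        best = min(distances)
        winners = [i for i, d in enumerate(distances) if d == best]
        payoffs = [prize / len(winners) if i in winners else 0.0
                   for i in range(len(guesses))]
        return target, payoffs

    target, payoffs = score_round([41, 35, 50, 22, 33, 45])
    # target is about 25.1, so the player who guessed 22 earns the $3 prize.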
As students arrived for the experiment, they checked in at a table located one floor above
the Experimental Economics Laboratory. Each subject was provided a notecard with an ID
number that corresponded to the computer terminal at which they were ultimately seated. Given
that our design required that groups of six include a specific mix of experienced and
inexperienced subjects, we relied upon this procedure as a way to ensure that we observed the
desired mix of types.9 Once all subjects were seated at a computer and logged into z-Tree, they
were provided a hard copy of the experimental instructions and asked to follow along as the
instructions were read aloud by an experimental monitor (see Appendix A). Once the
instructions were read, subjects were asked if they had any questions. All questions were
answered in private and the game began.
Our basic experimental design requires that we create a series of linked sessions (or
families) containing differing proportions of inexperienced subjects who have not yet played the
PBCG and experienced subjects who had played the PBCG once before. Each family starts with
an initial (or progenitor) session whereby all participants were inexperienced agents playing the
PBCG for the first time. Our experiment includes 26 such progenitor games. Each of these
progenitor sessions is subsequently linked with two to three additional sessions in which each
9 The z-Tree code was set up to form groups of six using computers located at pre-determined
cubicles. At the end of a session, we thus excused individuals from set locations and seated new
participants at these vacated terminals to ensure that each group had the desired mix of
participant types.
group of six contains one, three, five, or six experienced agents who had participated in
the prior session.
Our linked families were created as follows. Every 30 minutes, up to four groups of six
concurrently played the PBCG. At the end of each session, subjects were called individually to
the front of the computer lab and informed about their earnings for that portion of the
experiment. Subjects who were pre-selected to participate in a second session were instructed to return to their computer, where they would participate in a second session of the experiment and compete in a new PBCG.10 The returning players were
informed about the new number of experienced and inexperienced players that they would now
be competing against. Subjects who had completed a second session of play and those who were
pre-selected to participate in only a single session were transported to a second room where they
completed a post-experiment survey. Once they completed the post-experiment survey subjects
were paid their earnings in cash.
After all subjects had been either reseated at their computer terminal or transported to the
secondary room, a new set of inexperienced agents entered the laboratory and were seated at a
pre-determined computer monitor. To ensure that all groups had the desired mix of experienced
and inexperienced agents, the entering subjects were seated at the station whose number matched
the ID number on the card received when checking in for the experiment. Once all participants
were seated, the monitor distributed and read aloud the experimental instructions – which
included information on the mix of experienced and inexperienced agents in each group of six.
This basic process was replicated two to three times within each linked family.
10 Sixty-six subjects played only one game, while 276 subjects played two games.
In total, our experiment includes 103 games organized into 32 sessions, played by 342
unique subjects who made a total of 3,708 guesses. Figure 1 provides a summary of the basic experimental design broken down by generation of play. As noted in the figure, we observe 26 progenitor games that include all inexperienced agents. Second generation sessions include three different mixes of agent types: (i) one experienced and five inexperienced agents, (ii) three experienced and three inexperienced agents, and (iii) five experienced and one inexperienced agent. Third generation sessions include two different mixes of agent types: (i) one experienced and five inexperienced agents and (ii) three experienced and three inexperienced agents. Fourth
generation sessions include sessions with all six agents experienced.
By design, we thus observe significant variation in the mix of agents in the session from
which any agent gains his/her initial experience. All progenitors gain experience in sessions
where they are matched with five other inexperienced agents. Subjects who initially participate
in a second generation session gain experience from groups where they are matched with either
(i) one, (ii) three or (iii) five experienced agents. And subjects who initially participate in a third
generation session gain experience from groups where they are matched with either one or three
experienced agents. As the target number in the first round of play and the subsequent evolution
of this target across the remaining rounds of play are later shown to depend on the mix of subject
types, our design thus provides random variation in the history observed by any inexperienced
agent.
We observe similar variation in the mix of agent types that an agent who initially gained
experience in a given session type faces when repeating the PBCG. For example, agents who
initially participated in a progenitor session are subsequently matched in one of three possible
mixes in the second generation. Similarly, an agent who initially gained experience in a session
with three experienced and two other inexperienced agents could subsequently be matched with
(i) only inexperienced agents, (ii) two other experienced agents, or (iii) all experienced agents.
As such, we observe experienced agents who are randomly matched in their second round of
play with either a higher or lower fraction of experienced types than what was present in the
session in which they gained experience.
As the mix of types correlates with the observed history of play, this allows us to
disentangle the form of learning. While inexperienced agents tend to choose numbers consonant
with level-1 reasoning, experienced counterparts tend to select numbers closer to level-2
reasoning. On average, those who initially gain experience in a session with five experienced
agents thus observe a higher level of rationality than those who gain experience in sessions with
a lower fraction of experienced agents. We are thus able to compare the choices of
experienced agents who initially observed level-1 play with those who initially observed level-2
play when competing against five inexperienced players. Given the tendency for inexperienced
agents to select a number consistent with level-1 reasoning, if experience teaches agents to think
deeper about the game and the underlying solution concept we would expect both types to
recognize this tendency and select a number consistent with level-2 reasoning. In contrast, if
experience simply triggers a form of pattern recognition – e.g., best respond by playing the target
number in the prior round – one would expect to observe those who gained experience in a
session with five experienced agents to best respond to what they observed in round 1 (level-2
play) and select a number that is lower, on average, than that chosen by those who initially
gained experience matched against one or three experienced agents and observed level-1 play in
their first round of play.
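The contrast between the two hypotheses can be made concrete with a small numerical sketch. Under pattern recognition, the predicted opening guess is simply the target the player last observed; under belief-based learning, it is the best response to point beliefs about what each opponent type will choose. The illustrative beliefs below (level-2 play near 22 for experienced opponents, level-1 play near 33 for inexperienced opponents) are assumptions chosen to mirror the approximations used earlier, not estimates from our data.

    P = 2 / 3

    def reinforcement_guess(last_observed_target):
        # Pattern recognition: repeat the winning number last observed.
        return last_observed_target

    def belief_based_guess(opponent_guesses, p=P, group_size=6):
        # Best respond to point beliefs about the five opponents: choose g
        # so that g = p * (g + sum of opponents' guesses) / group_size.
        return p * sum(opponent_guesses) / (group_size - p)

    # An experienced player who last saw a round one target near 27 but now
    # faces four experienced and one inexperienced opponent:
    print(reinforcement_guess(27.3))                 # about 27.3
    print(belief_based_guess([22, 22, 22, 22, 33]))  # about 15.1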
III. Data and Results
Table 1 presents summary statistics of player guesses by round of play, player
experience, and session experience mix. The raw data illustrate our empirical strategy. As shown
in the first two rows, players who have experience with the game tend to play lower numbers.
Thus, as shown in rows three through seven, as the number of experienced players in the session
increases, average guesses decline, particularly in the early rounds. For example, in sessions
played with six inexperienced players, the average round one guess is 41.04, roughly consistent
with level-one play, but in sessions with five experienced players and one inexperienced player,
the average round one guess is 27.43, roughly consistent with level-two play.11 Thus, when a
player gains experience in a session with mostly inexperienced players, but is placed in a new
session with mostly experienced players, the winning strategy is usually inconsistent with the
pattern the player saw in her initial session. Similarly if a player gains experience in a session
with mostly experienced players but is placed in a new session with mostly inexperienced
players, the pattern they witnessed offers incorrect guidance if followed. These players will only
hold an advantage due to their experience if they have learned to carefully anticipate the
strategies that other players will follow.
An initial look at how players learn shows the advantage of experience. Table 2 presents
regression estimates of the following equation:
(1)
PERCig = α + β1ROUND3ig + β2ROUND4ig + β3ROUND5ig + β4ROUND6ig + εig
11 As frequently occurs in PBCG experiments where the target is a function of the mean guess,
as the rounds proceeded, a handful of players decided to play extremely high numbers in order to
bring up the group average and disrupt the results. This occurred in each session type. The
maximum play in each round of each session type was 100, leading to higher averages than
expected in late rounds and in some cases, the average in a round is higher than the average in
previous rounds. Accordingly, in all of the analysis that follows, plays that were clearly not
serious, defined as choosing a number that is more than twice the previous round’s target
number, are dropped from the analysis. Doing so has no effect on the qualitative results.
where PERCig is player i’s guess in game g as a percentage of the previous round’s target value,
ROUND3ig through ROUND6ig are dummy variables indicating whether the guess took place in
rounds three through six, respectively, using round two as the omitted category, and the error
terms are clustered by game. Columns 1 and 2 show the results for inexperienced and
experienced players, respectively, and column 3 pools the data to test whether differences among
the experience types are statistically significant, adding a dummy variable indicating whether the
player is experienced and interactions between this experience indicator and the round indicators.
The results are consistent with the findings of Slonim (2005) and Livingston and Skeath (2014),
who show that on average, experienced players’ guesses reflect one level of thinking deeper than those of
inexperienced players. Inexperienced players in round two choose guesses whose average is very
close to the previous round’s target number (94 percent), while experienced players in round two
play a best response to this move; their average guess is two-thirds of the round one target. This
advantage diminishes as the game proceeds. Inexperienced players learn to lower their guesses,
playing close to two-thirds of the previous round’s target by round five. Experienced players also
adjust their strategy, lowering their play to 56 percent of the previous round’s target by round
four. Overall, the inexperienced players do partially catch up to the experienced players. As the
results in column 3 indicate, the adjustments by inexperienced players in rounds four through
six are statistically significantly larger than the adjustments made by experienced players.
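A regression of this form can be estimated with standard tools. The sketch below mirrors the pooled specification in column 3 of Table 2, clustering standard errors by game; the file and column names are hypothetical, and non-serious plays are assumed to have been dropped beforehand as described in footnote 11.

    import pandas as pd
    import statsmodels.formula.api as smf

    # One row per guess, with columns: perc (guess as a share of the
    # previous round's target), round (2-6), experienced (0/1), and game.
    df = pd.read_csv("guesses.csv")   # hypothetical file
    df = df[df["round"] >= 2]         # round 1 has no previous target

    model = smf.ols("perc ~ C(round, Treatment(reference=2)) * experienced",
                    data=df)
    result = model.fit(cov_type="cluster", cov_kwds={"groups": df["game"]})
    print(result.summary())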
The advantage held by experienced players in the early rounds gives them a greater
chance of winning, but this advantage dissipates in later rounds. Table 3 displays the proportion
of winners who are experienced in each game type, and tests the null hypothesis that this
proportion is equal to the expected proportion if each player has an equal chance of winning. The
proportion of winners who are experienced is statistically significantly higher than expected in
the first four rounds of both the games where one experienced player plays against five
inexperienced players, and the games where three experienced players play against three
inexperienced players, using one-tailed tests of proportions. In rounds five and six, the advantage
held by experienced players is not statistically significant, a function of the fact that
inexperienced players have learned by those rounds to play smaller percentages of the previous
round’s target.
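Each cell of Table 3 is a one-sided test of a single proportion against its benchmark. A hedged sketch of one such test is below; the winner count of 14 out of 17 games is inferred from the reported round one proportion of 0.82 in the 3 inexp., 3 exp. row and is used purely for illustration.

    from statsmodels.stats.proportion import proportions_ztest

    # H0: the share of experienced winners equals 0.50 (the share expected
    # if every player in a 3-vs-3 game had an equal chance of winning).
    # H1: the share of experienced winners is larger than 0.50.
    stat, pvalue = proportions_ztest(count=14, nobs=17, value=0.50,
                                     alternative="larger")
    print(f"z = {stat:.2f}, one-sided p = {pvalue:.4f}")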
These patterns in average play are consistent with the sophisticated strategies assumed by
Ho and Su (2013) and Mohlin (2012). If players learn in this way, we would expect players who
have experience with the game to learn to anticipate that inexperienced opponents are likely to
begin by playing level one-type strategies, and play close to the previous round’s target in early
rounds. Similarly, we would expect inexperienced players to learn from the strategies of more
successful players, and for their strategies to converge over time. Using an experiment similar to ours, Slonim (2005) finds a similar pattern, and interprets this evidence as supportive
of models of sophisticated learning in this type of game.
Livingston and Skeath (2014), however, find that if one looks beyond average play, the
behavior of players when they gain experience is not in line with what one would expect from
sophisticated learners. While average play is consistent with experienced players employing a
level of thinking that is one level deeper than inexperienced players, there is large variance in
their play, leading to frequent mistakes, and when they make mistakes, they do not adjust their
behavior any more effectively than inexperienced players. Further, the results presented above
are also consistent with players simply recognizing patterns of behavior that increase the chance
of winning without thinking carefully about the strategies that others are following.
Inexperienced players see that the winning number is well below what they chose, and may
simply switch to the pattern they saw their opponents have success with in previous rounds.
Thus, to distinguish between belief-based models of sophisticated learning and choice
reinforcement models which assume learning to be less sophisticated, we explore five lines of
analysis. First, we consider how inexperienced players evolve their strategies over the course of
the game. Because average guesses are lower in each round when more experienced players are
involved, one might expect inexperienced players to learn more quickly when paired with a
larger proportion of experienced opponents. The estimates presented in Table 4 examine whether
this is the case. Each column presents the estimates of the following equation separately for each
round:
(2)
PERCig = α + β1TYPE51ig + β2TYPE33ig + β3TYPE15ig + εig
where the dependent variable is again player i’s guess in game g as a percentage of the
previous round’s target, and the control variables TYPE51ig, TYPE33ig and TYPE15ig are dummy
variables indicating whether the game included one, three or five experienced players,
respectively, using zero experienced players as the omitted category. Standard errors are again
clustered by game.
The estimates lead to our first result:
Result 1: Inexperienced players do not react to the previous round’s target
differently depending on the mix of experience.
Curiously, the point estimates suggest that inexperienced players choose guesses that are a higher
percentage of the previous round’s target in games with one experienced player than in games
with all inexperienced opponents. Still, for the most part, the differences in guesses as a
percentage of the previous round’s target between games with different numbers of experienced
opponents are largely statistically insignificant. Only in round four is there evidence that the
experience mix matters, but still, on the most important margin, the difference between games
with all inexperienced opponents and games with all experienced opponents is not significant.
Thus, the evidence suggests that inexperienced players may learn to lower their guesses, but they
do not learn faster in the face of stronger evidence that doing so would be beneficial.
More importantly, however, our design focuses on examining how experienced players
behave when the pattern of results they witnessed does not correspond to the likely winning
strategy. Second, we thus examine by how much experienced players miss the target in round
one as a function of whether their new session has the same, more, or fewer experienced players
than the session where they gained their experience, leading to our second result:
Result 2: Experienced players make larger mistakes in round one when facing a
different experience mix than what they originally encountered.
Figure 2 displays how the amount by which the experienced players miss the target in round one
varies with the number of experienced opponents relative to the game where they gained their
experience. The figure shows that when there are fewer experienced opponents than what the
player originally faced, so that the new round one target is likely to be higher than what the
player witnessed, players undershoot the target by 3.28 on average. But when there are more
experienced opponents than what the player originally faced, so that the new round one target is
likely to be lower than what the player witnessed, players overshoot the target by 6.95 on
average. Table 5 presents a regression which shows that the differences relative to experienced
players who face the same number of experienced opponents in both games are statistically
significant. These results are consistent with choice reinforcement models. The players follow
the pattern they witnessed in their original game, failing to anticipate that experienced players
are likely to play smaller numbers and inexperienced players are likely to play larger numbers.
Thus, the evidence suggests that the experienced players do not anticipate the strategies of their
opponents, as belief-based models assume.
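The pattern in Figure 2 and Table 5 amounts to a comparison of the mean round one miss across three groups of experienced players; a minimal sketch, with hypothetical file and column names, is:

    import pandas as pd

    # miss = round one guess minus round one target; rel_mix records whether
    # the current game has fewer, the same, or more experienced opponents
    # than the game in which the player gained experience.
    exp_df = pd.read_csv("experienced_round1.csv")  # hypothetical file
    print(exp_df.groupby("rel_mix")["miss"].mean())
    # Expected signs: negative for "fewer" (undershooting), positive for
    # "same" and larger still for "more" (overshooting).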
Third, we examine how the likelihood that an experienced player chooses a guess close to the target depends on the degree to which the pattern they witnessed matches the new
game. Nagel (1995) calculates “neighborhood intervals” around the choices that are consistent
with each level of thinking in order to see whether choices are concentrated around those
numbers.12 We follow a similar approach by calculating the neighborhood interval around the
target number for each round and then estimating how often the players select a guess in that
interval. Probits of the following form are estimated:
(3)
Pr(NIig = 1) = Φ(α + β1TARGDIFF1ig + εig),
where NIig is a dummy variable that equals one if player i in game g guesses a number within the
neighborhood interval of the game g target number for a particular round, and TARGDIFF1ig is
the absolute value of the difference between the round one target in game g and the target that
player i saw in round one of her previous session. Standard errors are again clustered by game.13
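A probit of this form, with marginal effects and game-clustered standard errors, can be estimated as sketched below; the data file and column names are hypothetical, and the neighborhood-interval indicator is assumed to have been constructed beforehand along the lines of footnote 12.

    import pandas as pd
    import statsmodels.formula.api as smf

    def neighborhood_interval(number, p=2/3):
        # Bounds of the form number * p**(1/4) and number * p**(-1/4),
        # in the spirit of Nagel's (1995) neighborhood intervals.
        return number * p ** 0.25, number * p ** -0.25

    low, high = neighborhood_interval(25.1)  # e.g., around a target of 25.1

    # One row per experienced player's guess, with columns: ni (1 if the
    # guess lies in the neighborhood interval of the current target),
    # targdiff1 (absolute gap between the current round one target and the
    # round one target observed when gaining experience), and game.
    df = pd.read_csv("experienced_guesses.csv")  # hypothetical file
    fit = smf.probit("ni ~ targdiff1", data=df).fit(
        cov_type="cluster", cov_kwds={"groups": df["game"]})
    print(fit.get_margeff().summary())  # marginal effects, as in Table 6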
The estimates lead to our third result:
Result 3: Experienced players are less likely to choose a number inside the
neighborhood interval of the target as the gap between the target they initially
witnessed and the new target increases.
12 Each interval has the boundaries 50(2/3)^(i+1/4) and 50(2/3)^(i-1/4), where i represents the level of
thinking, rounded to the nearest integer.
13 Including game fixed effects would force the dropping of a large number of observations
because there are many games in each round in which no player’s guess was in the neighborhood
interval of the target number.
Panel A of Table 6 reports the estimated marginal effect of the difference between the targets and
Panel B reports summary statistics on each variable. The results show that as the difference
between the current target and observed target grows, experienced players are less likely to
choose a number in the neighborhood interval of the target. A one point increase in the absolute
value of the difference between the round one target value in the current session and the round
one target value the player observed originally results in a decrease in the probability that the
player’s guess is in the neighborhood interval of the target of 1.6 percentage points in rounds one
and two, and 0.9 percentage points in round three. Thus, witnessing a pattern that does not
correspond to the player’s new circumstances decreases the likelihood that the player’s guess is
close to the target, and this effect persists for three rounds. This evidence is again consistent with
players following patterns without carefully considering the likely strategies of their opponents.
Fourth, we examine whether players coming from different histories play similarly when
they go into the same game type. If players learn to think carefully about the strategies followed
by others, they should realize that other players with experience are also likely to play smaller
numbers that are roughly consistent with level-two type thinking, and inexperienced players are
likely to play higher numbers that are roughly consistent with level-one type thinking. Thus,
when facing the same mix of experienced and inexperienced opponents, in the first round,
players should choose similar numbers regardless of the experience mix they faced in their initial
game. To investigate whether this is the case, we hold the experience mix in the new game
constant, and estimate the effect that the round one target number the player saw when gaining
experience (which largely varies due to the different experience mixes in the players’ original
games) has on the player’s round one guess in the new game. The following equation is
estimated:
(4)
GUESSig = α + β1TARGSEENig + εig
where GUESSig is player i's guess in round one of game g, TARGSEENig is the round one target
that player i in game g witnessed in the game where the player gained experience, and standard
errors are again clustered by game. The estimates lead to our fourth result:
Result 4: Players who witnessed different target numbers when gaining experience
choose different numbers when playing in new games with the same experience mix.
The results are presented in Table 7. Columns 1 through 4 examine play in final games of each
experience profile, and Column 5 examines play in all game types together, but adds as controls
dummy variables indicating the game type, with games with all experienced players used as the
omitted category. In all game types but those with all experienced players, and considering all
games together and controlling for game type, players guess larger numbers in round one when
the round one target they saw when gaining experience was higher. For example, in games with
three inexperienced players and three experienced players, a one point increase in the target
number witnessed by the subject originally is correlated with a 1.16 point increase in the number the player chooses in the new game. These results suggest that players are following the pattern from
their original play, and are not using information about the new mix of experience profiles to
anticipate the strategies their new opponents are likely to follow.
Finally, we examine whether players coming from the same mix of experience play
differently from each other when they go into different game types. Again, if players learn to
think carefully about the strategies followed by others, they should realize that experienced
players generally choose level-two type numbers, and inexperienced players generally choose
level-one type numbers. Thus, when facing different mixes of experienced and inexperienced
opponents, players should choose different numbers depending on the proportion of players who
are experienced in their current game, regardless of the experience mix they faced in the game in
which they played initially. To investigate whether this is the case, the following equation is
estimated:
(5)
GUESSig = α + β1TYPE51ig + β2TYPE33ig + β3TYPE15ig + β4TARGSEENig + εig
where GUESSig and TARGSEENig are as previously defined; and TYPE51ig, TYPE33ig and
TYPE15ig are again dummy variables indicating whether the game included one, three or five
experienced players, respectively. The estimates suggest our final result:
Result 5: Players who gained experience with the same mix of experienced and
inexperienced players choose similar numbers when playing in new games with
different experience mixes.
The estimates are presented in Table 8. Column 1 examines play by subjects who gained their experience in a game with one inexperienced and five experienced players. Column 2 examines play by subjects who gained experience in a game with three experienced and three inexperienced players. Column 3 examines play by subjects who gained experience in a progenitor game with six inexperienced players. The results are consistent. Players with a given experience history do not
play statistically different numbers based on the experience mix in their new game, thus failing
to account for the strategies their new opponents are likely to follow.
IV. Conclusion
While modern theories of learning in strategic games typically come in the flavor of
belief-based models, where players form beliefs about the strategies their opponents are expected
to follow and best respond to those beliefs, the validity of this central assumption has yet to be
sufficiently tested. Past studies employ designs that cannot distinguish between belief-based
models and choice reinforcement models where players do not attempt to deduce the strategies of
the other players, but merely follow signals about what moves have led to higher payoffs in the
past.
Our design permits just such a test. We place agents in situations where the signals they
observe about the payoffs associated with various moves do not offer good advice about the
strategies that are likely to be successful. Thus, if players simply follow the signals, choice
reinforcement models better describe the way that they learn. If, rather, they learn and think
about the game more carefully, and anticipate the play of others more effectively, then belief-based models are likely the superior choice. We find evidence that players respond to signals
about payoffs received from past play regardless of how the mix of players changes, offering
strong support in favor of choice reinforcement models in the context of beauty contests.
Certainly, our research is not the final word on the matter. Future research should consider
carefully the type of learning that takes place in various strategic contexts, so that models of the
learning process in these contexts can be based on the correct set of assumptions.
Figure 1. Experimental Design
[Flow diagram: 26 progenitor games with six inexperienced players (Generation 1) feed experienced subjects into Generation 2 games with mixes of 1 exp./5 inexp., 3 exp./3 inexp., and 5 exp./1 inexp.; Generation 2 games feed Generation 3 games with mixes of 1 exp./5 inexp. and 3 exp./3 inexp.; and Generation 3 games feed Generation 4 games with 6 exp./0 inexp.]
Table 1. Summary statistics, by round and experience mix

                                   Round 1   Round 2   Round 3   Round 4   Round 5   Round 6
Player type:
Inexperienced (N = 342)            41.52     27.45     17.89     12.83     9.62      8.66
                                   (18.23)   (15.54)   (14.78)   (17.87)   (17.18)   (17.72)
Experienced (N = 276)              24.40     12.80     6.84      5.90      6.56      7.11
                                   (12.95)   (7.67)    (5.94)    (14.29)   (16.21)   (18.38)
Session type:
6 inexperienced, 0 experienced     41.04     28.00     17.59     12.59     11.43     10.56
(N = 156)                          (18.91)   (15.27)   (11.72)   (17.51)   (20.51)   (19.12)
5 inexperienced, 1 experienced     39.21     26.57     18.51     12.90     8.41      6.06
(N = 144)                          (18.59)   (15.53)   (15.16)   (14.75)   (12.85)   (12.47)
3 inexperienced, 3 experienced     33.94     20.01     12.77     10.15     6.25      7.88
(N = 102)                          (16.61)   (12.03)   (14.92)   (19.26)   (8.23)    (20.08)
1 inexperienced, 5 experienced     27.43     14.11     7.26      6.37      7.20      6.24
(N = 90)                           (12.07)   (9.06)    (6.49)    (15.45)   (17.78)   (15.25)
0 inexperienced, 6 experienced     23.45     11.24     5.10      4.65      6.52      8.24
(N = 126)                          (15.21)   (9.10)    (5.66)    (14.97)   (19.61)   (21.59)
Standard deviations in parentheses.
Table 2. Learning over time by experienced and inexperienced players

                        Inexperienced players   Experienced players   All players
                        (1)                     (2)                   (3)
Constant                0.94***                 0.67***               0.94***
                        (0.020)                 (0.019)               (0.019)
= 1 if experienced                                                    -0.28***
                                                                      (0.028)
round 3                 -0.08***                -0.06**               -0.08***
                        (0.028)                 (0.027)               (0.026)
round 4                 -0.21***                -0.11***              -0.21***
                        (0.028)                 (0.027)               (0.026)
round 5                 -0.28***                -0.10***              -0.28***
                        (0.028)                 (0.027)               (0.027)
round 6                 -0.28***                -0.10***              -0.28***
                        (0.028)                 (0.027)               (0.027)
experienced*round 3                                                   0.02
                                                                      (0.039)
experienced*round 4                                                   0.10**
                                                                      (0.039)
experienced*round 5                                                   0.18***
                                                                      (0.040)
experienced*round 6                                                   0.18***
                                                                      (0.040)
Observations            1,577                   1,306                 2,883
R2                      0.090                   0.016                 0.121
Dependent variable is the player’s guess as a percentage of the previous round’s target.
Standard errors in parentheses.
*** p<0.01, ** p<0.05, * p<0.1
Table 3. How often is the winner an experienced player?

Session type:                 Round 1   Round 2   Round 3   Round 4   Round 5   Round 6
5 inexp., 1 exp.              0.41***   0.33**    0.32**    0.30*     0.24      0.26
(expected = 0.17, N = 24)     (0.10)    (0.10)    (0.10)    (0.10)    (0.09)    (0.09)
3 inexp., 3 exp.              0.82***   0.71**    0.71**    0.71**    0.59      0.59
(expected = 0.50, N = 17)     (0.09)    (0.11)    (0.11)    (0.11)    (0.12)    (0.12)
1 inexp., 5 exp.              0.93      0.80      0.87      0.93      0.73      0.87
(expected = 0.83, N = 15)     (0.06)    (0.10)    (0.09)    (0.06)    (0.11)    (0.09)
Proportion of winners who are experienced players. Standard errors in parentheses.
Significance tests test the null hypothesis that the proportion of winners who are experienced is
equal to the expected proportion if each player has an equal chance of winning against the
alternative hypothesis that the proportion of winners who are experienced is higher than the
expected proportion if each player has an equal chance of winning.
*** the proportion of experienced winners is significantly different from the expected proportion at the 1% level;
** the proportion of experienced winners is significantly different from the expected proportion at the 5% level;
* the proportion of experienced winners is significantly different from the expected proportion at the 10% level.
Table 4. Inexperienced player guesses as a percentage of the previous round’s target value

                          Round 2   Round 3   Round 4   Round 5   Round 6
                          (1)       (2)       (3)       (4)       (5)
Constant                  0.94***   0.86***   0.71***   0.65***   0.70***
                          (0.03)    (0.02)    (0.03)    (0.04)    (0.05)
Paired with:
1 experienced player      0.03      0.04      0.09*     0.06      -0.07
                          (0.04)    (0.04)    (0.05)    (0.06)    (0.07)
3 experienced players     0.004     -0.05     -0.04     -0.06     -0.02
                          (0.08)    (0.07)    (0.07)    (0.07)    (0.07)
5 experienced players     -0.13     -0.09     -0.05     -0.06     -0.03
                          (0.12)    (0.13)    (0.07)    (0.07)    (0.13)
Observations              320       322       319       305       311
R2                        0.006     0.010     0.025     0.019     0.008
Inexperienced players who are paired with no experienced players and five other
inexperienced players is the omitted category.
Robust standard errors clustered by session in parentheses.
*** p<0.01, ** p<0.05, * p<0.1
Figure 2. Experienced players’ mistakes in the first round
[Bar chart: average difference between an experienced player's round one guess and the round one target, by the number of experienced opponents in the current game relative to the game where experience was gained (fewer, same, more). The average miss is negative for "fewer" (about -3.3) and positive for "same" (about 3.6) and "more" (about 7.0).]
Table 5: Does the difference between a player’s guess and the target value vary depending
on whether their experience profile matches the current game?

                                                      difference
                                                      (1)
More experienced players in current
game than in historical game                          3.33***
                                                      (0.598)
Fewer experienced in current game
than in historical game                               -6.90***
                                                      (1.445)
Constant                                              3.62***
                                                      (0.325)
Observations                                          276
R-squared                                             0.060
Dependent variable is the difference between the player’s round one guess
and the round one target value.
The omitted category is players who had the same number of experienced
players in the current session and in the session from which they gained
experience.
Robust standard errors clustered by game in parentheses.
*** p<0.01, ** p<0.05, * p<0.1
Table 6. Do experienced players whose experience does not match the pattern adapt?

Panel A. Dependent variable: Pr(guess in neighborhood interval of target value)
                        round one  round two  round three  round four  round five  round six
                        (1)        (2)        (3)          (4)         (5)         (6)
Difference              -0.016***  -0.016***  -0.009**     -0.001      -0.001      -0.001
between targets         (0.004)    (0.005)    (0.004)      (0.003)     (0.004)     (0.003)
Observations            276        276        276          276         274         272

Panel B. Summary statistics
= 1 if guess in         0.174      0.174      0.134        0.101       0.105       0.094
NI of target            (0.380)    (0.380)    (0.341)      (0.302)     (0.307)     (0.293)
Difference              8.75
between targets         (5.21)
Independent variable in each regression is the difference between the round one target the player
saw when gaining experience and the round one target value in the current play.
Robust standard errors clustered by game in parentheses in panel A.
Standard deviations in parentheses in panel B.
*** p<0.01, ** p<0.05, * p<0.1
Table 7. Do players coming from different histories play the same when they go into the
same game type?

                          0-6 games   1-5 games   3-3 games   5-1 games   All games
                          (1)         (2)         (3)         (4)         (5)
target seen in            0.28        0.67***     1.16***     0.90***     0.64***
round 1 originally        (0.29)      (0.13)      (0.22)      (0.20)      (0.13)
1-5 game                                                                  1.36
                                                                          (1.63)
3-3 game                                                                  1.82
                                                                          (1.72)
5-1 game                                                                  2.20
                                                                          (2.21)
Constant                  16.15*      7.56*       -4.57       4.01        7.05*
                          (8.31)      (3.63)      (5.65)      (4.64)      (4.02)
Observations              126         75          51          24          276
R2                        0.01        0.11        0.31        0.19        0.07
0-6 games include zero inexperienced players and six experienced players.
1-5 games include one inexperienced player and five experienced players.
3-3 games include three inexperienced players and three experienced players.
5-1 games include five inexperienced players and one experienced player.
6-0 games include six inexperienced players and zero experienced players.
Robust standard errors clustered by game in parentheses.
*** p<0.01, ** p<0.05, * p<0.1
Table 8. Do people coming from the same history play differently when they go into different game types?

                          Experience from   Experience from   Experience from
                          1-5 game          3-3 game          6-0 game
                          (1)               (2)               (3)
                          3-3 game type     1-5 game type     3-3 game type
                          omitted           omitted           omitted
5-1 game                                    -2.16             -7.89
                                            (2.57)            (7.45)
3-3 game                  --                -7.57             --
                                            (8.40)
1-5 game                                    --                -0.61
                                                              (1.91)
target seen in            2.02***           0.53              0.87***
round 1 originally        (0.50)            (0.64)            (0.13)
Constant                  -12.36            18.19             2.65
                          (9.15)            (13.75)           (3.35)
Observations              15                33                114
R-squared                 0.47              0.04              0.18
0-6 games include zero inexperienced players and six experienced players.
1-5 games include one inexperienced player and five experienced players.
3-3 games include three inexperienced players and three experienced players.
5-1 games include five inexperienced players and one experienced player.
6-0 games include six inexperienced players and no experienced players.
Robust standard errors clustered by game in parentheses.
*** p<0.01, ** p<0.05, * p<0.1
Appendix A. Instructions
Experiment Instructions
This is an experiment in economic decision-making. The experiment consists of a series
of six (6) rounds. You will play against a group of 5 other people in each round. The decisions
that you and the 5 other people make will determine the dollar winnings for each of you. Each
player will be paid $5 for participating.
At the start of each round, you will be asked to choose a number between 0 and 100,
inclusive. 0 and 100 are possible choices. Your number can include up to two decimal places,
such as 12.34 or 56.78. At the same time, each of the other 5 people will also choose a number
between 0 and 100. None of you will be able to see anyone else’s number until after your
decision is submitted.
The numbers selected by all 6 people in your group will be averaged, and then the
number that is two-thirds (0.67) of that average will be calculated and announced at the end of
the round.
The person whose number is closest to two-thirds of the average will win $3 for that
round. The 5 other people will earn $0.
If more than one person ties for having a number closest to two-thirds of the average,
then the payment of $3 will be divided equally among those who tied and the others will earn $0.
The website will keep track of the choices of each player in each round. It will also
calculate the target number (two-thirds of the average of the numbers chosen by the 6
participants), identify the winner or winners of each round, and keep track of each player’s
winnings over the six (6) rounds of play.
After the end of the final round, you will be required to complete a short online survey.
Upon completing the survey and logging out of the website, you will present your code card at
the table upstairs near the main door of Smith and collect your winnings. At that time, you will
also need to sign a receipt confirming the amount of the payment that is made to you.
References
Alba-Fernández, V., P. Brañas-Garza, F. Jiménez-Jiménez, and J. Rodero-Cosano. 2006.
“Teaching Nash Equilibrium and Dominance: A Classroom Experiment on the Beauty
Contest.” Journal of Economic Education 37(3), pp. 305-322.
Bernheim, B.D. 1984. “Rationalizable Strategic Behavior.” Econometrica 52(4), pp. 1007-1028.
Binmore, K. 1999. “Why Experiment in Economics?” Economic Journal 109(453), pp. F16-F24.
Burnham, T., Cesarini, D., Johannesson, M., Lichtenstein, P. and Wallace, B. 2009. “Higher
Cognitive Ability is Associated With Lower Entries in a P-beauty Contest.” Journal of
Economic Behavior and Organization 72(1), pp. 171-175.
Camerer, C. 1997. “Taxi Drivers and Beauty Contests.” Engineering and Science 1, pp. 10-19.
Camerer, C. and T. Ho. 1999. “Experience Weighted Attraction Learning in Normal-Form
Games.” Econometrica 67(4), pp. 827-874.
Camerer, C., T. Ho, and K. Chong. 2003. “Models of Thinking, Learning, and Teaching in
Games.” American Economic Review 93(2), pp. 192-195.
Camerer, C., T. Ho, and K. Chong. 2004. “A Cognitive Hierarchy Model of Games.” Quarterly
Journal of Economics 119(3), pp. 861-898.
Crawford, V., M. Costa-Gomes, and N. Iriberri. 2010. “Strategic Thinking.” Mimeo, University
of California at San Diego.
Costa-Gomes, M.A. and V. Crawford. 2006. “Cognition and Behavior in Two-Person Guessing
Games: An Experimental Study.” American Economic Review 96(5), 1737-1768.
Dufwenberg, M., T. Lindqvist, and E. Moore. 2005. “Bubbles and Experience: An Experiment.”
American Economic Review 95(5), pp. 1731-1737.
Goeree, J. and C. Holt. 2004. “A Model of Noisy Introspection.” Games and Economic Behavior
46(2), pp. 365-382.
Guth, W., M. Kocher, and M. Sutter. 2002. “Experimental ‘Beauty Contests’ with Homogeneous
and Heterogeneous Players and with Interior and Boundary Equilibria.” Economics Letters 74,
pp. 219-228.
Ho, T., C. Camerer, and K. Weigelt. 1998. “Iterated Dominance and Iterated Best Response in
Experimental ‘p-Beauty’ Contests.” American Economic Review 88(4), pp. 947-969.
Holt, D. 1999. “An Empirical Model of Strategic Choice with an Application to Coordination
Games.” Games and Economic Behavior 27, pp. 86-105.
Johnson, E., C. Camerer, S. Sen, and T. Rymon. 2002. “Detecting Failures of Backward
Induction: Monitoring Information Search in Sequential Bargaining.” Journal of Economic
Theory 104, pp. 16-47.
Keser, C. and R. Gardner. 1999. “Strategic Behavior of Experienced Subjects in a Common Pool
Resource Game.” International Journal of Game Theory 28, pp. 241-252.
Kocher, M. and M. Sutter. 2006. “Time is Money: Time Pressure, Incentives and the Quality of
Decision-Making.” Journal of Economic Behavior and Organization 61, pp. 375-392.
Kocher, M., M. Sutter, and F. Wakolbinger. 2007. “The Impact of Naïve Advice and
Observational Learning in Beauty-Contest Games.” Tinbergen Institute Discussion Paper
TI2007-01. January.
Levitt, S. and List, J. 2007. “What Do Laboratory Experiments Measuring Social Preferences
Reveal About the Real World?” Journal of Economic Perspectives 21(2), pp. 153-174.
List, J. 2003. “Does Market Experience Eliminate Market Anomalies?” Quarterly Journal of
Economics 118(1), pp. 41-71.
Livingston, J.A. and Skeath, S. 2014. “A Step Ahead? Experienced Play in the P-Beauty
Contest.” Working paper.
McKelvey, R. and T. Palfrey. 1995. “Quantal Response Equilibria for Normal-Form Games.”
Games and Economic Behavior 10(1), pp. 6-38.
Moulin, H. 1986. Game Theory for the Social Sciences. (2nd ed.) New York: New York
University Press.
Nagel, R. 1995. “Unraveling in Guessing Games: An Experimental Study.” American Economic
Review 85(5), pp. 1313-1326.
Pearce, D. 1984. “Rationalizable Strategic Behavior and the Problem of Perfection.”
Econometrica 52(4), pp. 1029-1050.
Sbriglia, P. 2008. “Revealing the Depth of Reasoning in P-Beauty Contest Games.”
Experimental Economics 11, 107-121.
Slonim, R. 2005. “Competing Against Experienced and Inexperienced Players.” Experimental
Economics 8, pp. 55-75.
Sonnemans, J. and J. Tuinstra. 2008. “Positive Expectations Feedback Experiments and Number
Guessing Games as Models of Financial Markets.” Tinbergen Institute Discussion Paper
TI2008-076. August.
Stahl, D. and Wilson, P. 1995. “On Players’ Models of Other Players: Theory and Experimental
Evidence.” Games and Economic Behavior 10(1), 218-254.
Sutter, M. 2005. “Are Four Heads Better Than Two? An Experimental Beauty-Contest Game
with Teams of Different Sizes.” Economics Letters 88, pp. 41-46.
Thaler, R. 1997. “Giving Markets a Human Dimension.” Financial Times: Survey – Mastering
Finance 6, p. 2. June 16.