Learning from experience in beauty contests

Jeffrey A. Livingston* (Bentley University)
Michael K. Price (Georgia State University and NBER)
Susan Skeath (Wellesley College)

Abstract: In the "p-beauty" contest, contestants choose a number between zero and 100, and the winner is the player who selects the number that is closest to some fraction p of the average chosen by the group. While the Nash equilibrium is for all players to choose 0, subjects frequently display bounded rationality by choosing numbers that are substantially higher. However, subjects adjust their choices over repeated plays, typically converging toward equilibrium. Theoretical models of this learning process come in two flavors. The first, belief-based models, assume that players learn via a sophisticated process in which they form beliefs about the strategies followed by their opponents and best respond to those beliefs. The second, choice reinforcement models, assume that learning is less sophisticated: subjects react to the payoffs associated with past plays and adopt the strategies that proved successful. While past experimental studies have examined learning in this context, none have employed a design that can distinguish between these two models. We develop such a design by placing subjects in a circumstance where the pattern of results they saw in past plays likely does not match the winning strategy in their next play, and find that players still best respond to the pattern they saw. We interpret this as evidence in favor of choice reinforcement models.

Keywords: bounded rationality, beauty contests, learning, experience
JEL codes: D01, C7

*Corresponding author. Department of Economics, Bentley University, 175 Forest Street, Waltham, MA 02138. Phone 781-891-2538, Fax 781-891-2896, Email [email protected]

I. Introduction

In a seminal study of boundedly rational behavior, Nagel (1995) presents an experiment where subjects play the p-beauty contest game (PBCG), in which contestants choose a number between zero and 100 and win by being closest to some fraction, p, of the average of all of the numbers chosen. While the unique equilibrium of the game is for all players to select zero, across many experimental studies players typically select numbers above 0. For example, when p = 2/3, the average number chosen by a group of subjects is typically between 20 and 35.

A number of theoretical models have been proposed to explain this violation of Nash equilibrium behavior.[1] Nagel (1995) and Stahl and Wilson (1995) initially offered versions of a model of what has come to be known as level-k thinking. In such a model, a level-0 thinker does not think strategically and effectively selects a number at random, while a level-k thinker selects the best response to a belief that all other players are following a level-(k-1) strategy. For example, when p = 2/3, a level-1 player selects a number near 33 in an effort to be close to 2/3 of the anticipated average play of 50, while a level-2 player selects a number around 22 in an effort to be close to 2/3 of the anticipated average play of 33.

[1] Crawford et al. (2010) provide a review of these models of what they term "strategic sophistication," including the level-k model detailed below, the related cognitive hierarchy model of Camerer et al. (2004) in which "Step k" thinkers accurately predict the relative frequencies of players doing fewer steps of thinking from levels 0 to k-1, equilibrium plus noise, finitely iterated strict dominance and k-rationalizability (Bernheim (1984) and Pearce (1984)), quantal response equilibrium (McKelvey and Palfrey (1995)), and noisy introspection (Goeree and Holt (2004)).
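For concreteness, the level-k choices in the example above follow from iterating a best response to an anticipated average. The snippet below is a minimal illustration of that arithmetic, assuming (as in the example) a level-0 anchor of 50 and p = 2/3; it is not part of the analysis in the paper.

```python
# Illustrative sketch of level-k choices in the p-beauty contest,
# assuming level-0 play is expected to average 50.
def level_k_choice(k, p=2/3, level0_anchor=50.0):
    """Best response of a level-k thinker who believes that all
    opponents follow a level-(k-1) strategy."""
    return level0_anchor * p ** k

for k in (1, 2, 3):
    print(f"level-{k} choice: {level_k_choice(k):.1f}")
# prints roughly 33.3, 22.2, and 14.8
```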
One frequent response to findings of seemingly irrational behavior is that it is likely to dissipate as actors learn from experience.[3] Indeed, when the PBCG is repeated over a number of rounds, players who have never played before show a convergence toward the Nash equilibrium choice of zero as the rounds proceed.[4] Further, Sbriglia (2008) shows that learning can lead to more advanced levels of thinking. In that design, over the course of six rounds, the winning player leaves the game and the remaining players are given information about the thought process used by the winner. Compared to when players are not given this information, the game converges to the predicted equilibrium more quickly.

[3] See, for example, List (2003), who offers powerful evidence from a field experiment that market experience attenuates the endowment effect.

[4] Nagel (1995), Alba-Fernandez et al. (2006), Guth et al. (2002), and Ho, Camerer and Weigelt (1998) are among the papers that have presented evidence on this issue.

Several theoretical models have been advanced to explain how players learn in the PBCG and related strategic situations. The models fall into two types; Camerer and Ho (1999) propose a general "experience-weighted attraction" model of learning by game players that captures each type as a special case (see also Camerer, Ho and Chong, 2003). In the first type, players form beliefs about the strategies of their opponents and best respond to those beliefs. They describe this type of model as follows:

"One approach, belief-based models, starts with the premise that players keep track of the history of previous play by other players and form some belief about what others will do in the future based on past observation. Then they tend to choose a best-response, a strategy that maximizes their expected payoffs given the beliefs they formed."

In the second type, players are less sophisticated, merely reacting to patterns that they witness in previous outcomes:

"A different approach, choice reinforcement, assumes that strategies are 'reinforced' by their previous payoffs, and the propensity to choose a strategy depends in some way on its stock of reinforcement. Players who learn by reinforcement do not generally have beliefs about what other players will do. They care only about the payoffs strategies yielded in the past, not about the history of play that created those payoffs."

Most recent theories of learning in games where level-k behavior is exhibited assume a sophisticated process in the style of a belief-based model. For example, Ho and Su (2013) study a level-k model of behavior in centipede games. Their model assumes that players carefully attempt to intuit the level rule played by their opponents, so that they can play the best response to the behavior that this level rule entails. Mohlin (2012) explores a more general setting where, in a level-k framework, the lowest type merely best responds to the average of past play, and higher types develop their beliefs about how others learn using increasingly complex models.
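To make the distinction between the two classes of models concrete, the sketch below contrasts stylized single-round decision rules for the PBCG. It is an illustrative simplification of our own (a fictitious-play-style belief learner and a simple payoff-reinforcement chooser), not Camerer and Ho's experience-weighted attraction model or any specification estimated later in the paper, and all function and variable names are hypothetical.

```python
import random

def belief_based_guess(opponent_history, p=2/3):
    """Belief-based learner: forms a belief about opponents' next guesses
    from the history of their play and best responds to it (here, simply
    p times the historical mean, ignoring the player's own effect)."""
    expected_mean = sum(opponent_history) / len(opponent_history)
    return p * expected_mean

def reinforcement_guess(attractions):
    """Choice reinforcement learner: picks among its own past strategies
    with probability proportional to their accumulated payoffs, paying no
    attention to what opponents might do."""
    strategies, weights = zip(*attractions.items())
    return random.choices(strategies, weights=weights)[0]

# Opponents guessed 40, 35, 30, 45, and 50 in the previous round.
history = [40, 35, 30, 45, 50]
print(round(belief_based_guess(history), 1))              # about 26.7
print(reinforcement_guess({33: 0.2, 22: 1.0, 15: 0.4}))   # usually 22
```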
These models of sophisticated learning enjoy some empirical support from Slonim (2005).[7] He runs an experiment in which players play a supergame of nine rounds of a PBCG, separated into three games of three rounds each, facing a new set of two opponents in each game. Players thus gain experience with the PBCG as they move from round to round and game to game. He examines two treatments, SAME and MIX. In the SAME treatment, all three players start the game with the same amount of experience. In the MIX treatment, one player is experienced and the other two have not seen the game before. He finds that experienced players choose lower numbers in round one when facing experienced opponents than when facing inexperienced opponents, often correctly predicting the guesses of their opponents. As a result, the experienced player is far more likely to win, particularly in the first two rounds. For example, in the first round of games where one player is experienced and the others are inexperienced, the experienced player wins 85 percent of the time. Slonim argues that these results are consistent with "the 'sophisticated' learning studied in Cooper and Kagel (2002), Camerer and Ho (1998), Camerer et al. (2002), Stahl (2000) and others. For example, Cooper and Kagel find that some players learn about opponent's reasoning in signaling games and Camerer et al. find that some players learn that other players are learning." However, in his setting, there is a close correlation between experience and the number the player chooses. For example, in Slonim's sample, when playing against other inexperienced players, the average first-round guess of new players is 33.5 with a standard deviation of only 2.3. Thus, in his experiment, experience type is an excellent predictor of the number a player can be expected to guess, resulting in a pattern of target numbers that are relatively easy to predict without thinking carefully about the strategy that might have led to the choice. The data are thus consistent with either sophisticated belief-based models or less sophisticated reinforcement models.

[7] The extant literature does include several studies of the level of thinking employed by subjects who play the PBCG. Sbriglia (2008) shows that learning can lead to more advanced levels of thinking. Over the course of six rounds, each round's winning player leaves the game but gives information about her thought process to the remaining players. The game converges more quickly to the predicted equilibrium, and the levels of thinking advance more quickly, when players are provided this information than when they are not. Costa-Gomes and Crawford (2006) define several types of strategies that players might follow when playing a PBCG that are based on level-k thinking, several of which are rational best responses to non-equilibrium PBCG strategies. They utilize a series of 16 two-person PBCGs to decipher each player's type, and find that many players can be neatly classified into their defined types.

We conduct an experiment that is designed to identify which of these two types of models is more consistent with the learning observed in PBCGs. Subjects gain experience in a session of six rounds of the PBCG and then play in another session against a known mix of experienced players, who had played the PBCG once before, and inexperienced players, who had never before played the game. As in past studies, experienced players tend to play lower numbers than inexperienced players.
The average guess of an experienced player in round one of our sample is 24.40, while the average guess of an inexperienced player is 41.52. Experimental treatments vary both the proportion of types in the game in which a subject gains experience and the proportion of types, experienced and inexperienced agents, in the second game played.

We use such variation as a means to randomly shock the level of play first observed by a subject. Importantly, this allows us to explore how subjects behave when the target number they witness in their initial play is different from the target number in their second play. For example, we observe subjects who gained experience while playing the game with five inexperienced opponents, resulting in an average target number of 27.34 in round one. Some of these subjects play a second time against another set of five inexperienced opponents, likely resulting in a target similar to what they witnessed initially, but others play a second time with four experienced players and one inexperienced player, resulting in an average target number of 18.29 in round one.

If "learning" reflects pattern recognition whereby subjects simply best respond to outcomes observed in prior repetitions of the game, we would expect subjects who initially observe a lower target to guess lower numbers at the start of the second game than counterparts who initially observed a higher target, regardless of the mix of experience types in the new game. Conversely, if learning reflects an increased understanding of the PBCG and the iterated dominance reasoning required to "solve" such games, then we would expect play at the start of the second game to be independent of the level of rationality initially observed by an experienced subject, and instead to vary with the mix of experience types. In this case, for example, players would see that their new game involves a larger proportion of opponents who also have experience with the game, anticipate that such players will also choose lower numbers due to their increased understanding of the game, and best respond accordingly by selecting an even lower number.
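The two hypotheses can be summarized as a pair of simple prediction rules for the round-one guess in the second game. The sketch below is purely illustrative rather than a model we estimate: it uses the example targets above (27.34 versus 18.29), treats inexperienced and experienced opponents as playing roughly level-1 and level-2 numbers (about 33 and 22), and every name in it is hypothetical.

```python
P = 2 / 3

def reinforcement_prediction(target_seen_before):
    """Pattern recognition: the guess tracks the target observed in the
    player's own prior game, regardless of the new experience mix
    (stylized here as a best response to that old target)."""
    return P * target_seen_before

def belief_based_prediction(n_experienced, n_inexperienced,
                            level1=33.3, level2=22.2):
    """Sophisticated learning: the guess tracks the announced mix of
    opponents, anticipating level-2 play from experienced opponents and
    level-1 play from inexperienced opponents."""
    anticipated_mean = (n_experienced * level2 + n_inexperienced * level1) / (
        n_experienced + n_inexperienced)
    return P * anticipated_mean

# A subject who saw a round-one target of 27.34 now faces four experienced
# and one inexperienced opponent:
print(round(reinforcement_prediction(27.34), 1))  # ~18.2: moves with the old target
print(round(belief_based_prediction(4, 1), 1))    # ~16.3: moves with the new mix
```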
The results offer strong support for choice reinforcement-style models over belief-based models. Four results support this conclusion. First, relative to situations where the experience mix is the same in both games, experienced players undershoot the target when there are fewer experienced players than when they played initially, resulting in a new target number that tends to be higher than the one they witnessed. Similarly, they overshoot the target when there are more experienced players than when they played initially, resulting in a new target number that tends to be lower than the one they witnessed. Second, experienced players are less likely to choose a number near the target as the difference between the new target number and the target number they initially witnessed increases. Third, when players from different initial experience mixes play in a new game of the same type, they tend to choose different numbers. Finally, when players from the same initial experience mix play in new games of different types, they tend to choose similar numbers, failing to account for the differences in experience mix.

The remainder of our analysis proceeds as follows. Section II presents the experimental design. Section III describes the data obtained from the experiments, presents our strategy for analyzing the data, and describes the results of our analysis. Section IV concludes by reviewing our main results and outlining possible extensions of this line of research.

II. Experimental Design

Subjects were recruited from the undergraduate student population at the University of Tennessee, Knoxville, where the experiments were conducted. At the time of recruitment, subjects were informed that they would be participating in an experiment that would take up to 75 minutes to complete. The experiment was conducted in the UT Experimental Economics Laboratory, which holds 24 networked computer workstations in separate cubicles.

Groups of six subjects played a game, which we define as six consecutive rounds of a PBCG. Up to four games were played at a time; we define a set of concurrently played games as a session. Within each round of play, subjects guessed a number between 0 and 100, inclusive. The player whose guess was closest to 2/3 of the mean guess of the group won the round. The winner of each round was paid $3. In addition to the prizes awarded to the winner of each round, all subjects were awarded a $10 participation fee. The games were run by computer using z-Tree, and average earnings for the experiment, including the participation fee, were $13.44 per subject.

As students arrived for the experiment, they checked in at a table located one floor above the Experimental Economics Laboratory. Each subject was provided a notecard with an ID number that corresponded to the computer terminal at which they were ultimately seated. Given that our design required that groups of six include a specific mix of experienced and inexperienced subjects, we relied upon this procedure as a way to ensure that we observed the desired mix of types.[9] Once all subjects were seated at a computer and logged into z-Tree, they were provided a hard copy of the experimental instructions and asked to follow along as the instructions were read aloud by an experimental monitor (see Appendix A). Once the instructions were read, subjects were asked if they had any questions. All questions were answered in private and the game began.

[9] The z-Tree code was set up to form groups of six using computers located at pre-determined cubicles. At the end of a session, we thus excused individuals from set locations and seated new participants at these vacated terminals to ensure that each group had the desired mix of participant types.

Our basic experimental design requires that we create a series of linked sessions (or families) containing differing proportions of inexperienced subjects who have not yet played the PBCG and experienced subjects who had played the PBCG once before. Each family starts with an initial (or progenitor) session in which all participants were inexperienced agents playing the PBCG for the first time. Our experiment includes 26 such progenitor games. Each of these progenitor sessions is subsequently linked with two to three additional sessions in which each group of six contains either one, three, five, or six experienced agents who had participated in the prior session.

Our linked families were created as follows. Every 30 minutes, up to four groups of six concurrently played the PBCG. At the end of each session, subjects were called individually to the front of the computer lab and informed about their earnings for that portion of the experiment.
Subjects who were pre-selected to participate in a second session were told to return to their computers, where they would participate in a second session of the experiment, competing in a new PBCG.[10] The returning players were informed about the new number of experienced and inexperienced players against whom they would now be competing. Subjects who had completed a second session of play, and those who were pre-selected to participate in only a single session, were taken to a second room where they completed a post-experiment survey. Once they completed the post-experiment survey, subjects were paid their earnings in cash.

[10] 66 subjects played only one game, while 276 subjects played two games.

After all subjects had been either reseated at their computer terminal or moved to the secondary room, a new set of inexperienced agents entered the laboratory and were seated at a pre-determined computer terminal. To ensure that all groups had the desired mix of experienced and inexperienced agents, the entering subjects were seated at the station whose number matched the ID number on the card they received when checking in for the experiment. Once all participants were seated, the monitor distributed and read aloud the experimental instructions, which included information on the mix of experienced and inexperienced agents in each group of six. This basic process was replicated two to three times within each linked family.

In total, our experiment includes 103 games organized into 32 sessions, played by 342 unique subjects who made a total of 3708 guesses. Figure 1 provides a summary of the basic experimental design broken down by generation of play. As shown in the figure, we observe 26 progenitor games that include only inexperienced agents. Second generation sessions include three different mixes of agent types: (i) one experienced and five inexperienced agents, (ii) three experienced and three inexperienced agents, and (iii) five experienced and one inexperienced agent. Third generation sessions include two different mixes of agent types: (i) one experienced and five inexperienced agents and (ii) three experienced and three inexperienced agents. Fourth generation sessions include only sessions with all six agents experienced.

By design, we thus observe significant variation in the mix of agents in the session from which any agent gains his or her initial experience. All progenitors gain experience in sessions where they are matched with five other inexperienced agents. Subjects who initially participate in a second generation session gain experience in groups where they are matched with either (i) one, (ii) three, or (iii) five experienced agents. And subjects who initially participate in a third generation session gain experience in groups where they are matched with either one or three experienced agents. As the target number in the first round of play, and the subsequent evolution of this target across the remaining rounds of play, is later shown to depend on the mix of subject types, our design thus provides random variation in the history observed by any inexperienced agent.
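The linked-family structure summarized in Figure 1 can be restated compactly. The snippet below is simply a restatement of the generation-by-generation mixes described above, not code used to administer the sessions.

```python
# (experienced, inexperienced) mixes per group of six, by generation of play,
# restating the design summarized in Figure 1.
DESIGN = {
    "generation 1 (progenitor)": [(0, 6)],
    "generation 2": [(1, 5), (3, 3), (5, 1)],
    "generation 3": [(1, 5), (3, 3)],
    "generation 4": [(6, 0)],
}

for generation, mixes in DESIGN.items():
    for n_exp, n_inexp in mixes:
        assert n_exp + n_inexp == 6  # every group contains six subjects
        print(f"{generation}: {n_exp} experienced, {n_inexp} inexperienced")
```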
We observe similar variation in the mix of agent types that an agent who initially gained experience in a given session type faces when repeating the PBCG. For example, agents who initially participated in a progenitor session are subsequently matched in one of three possible mixes in the second generation. Similarly, an agent who initially gained experience in a session with three experienced and two other inexperienced agents could subsequently be matched with (i) only inexperienced agents, (ii) two other experienced agents, or (iii) all experienced agents. As such, we observe experienced agents who are randomly matched in their second game with either a higher or lower fraction of experienced types than was present in the session in which they gained experience.

As the mix of types correlates with the observed history of play, this variation allows us to disentangle the form of learning. While inexperienced agents tend to choose numbers consonant with level-1 reasoning, experienced counterparts tend to select numbers closer to level-2 reasoning. On average, those who initially gain experience in a session with five experienced agents thus observe a higher level of rationality than those who gain experience in sessions with a lower fraction of experienced agents. We are thus able to compare the choices of experienced agents who initially observed level-1 play with those who initially observed level-2 play when competing against five inexperienced players. Given the tendency for inexperienced agents to select a number consistent with level-1 reasoning, if experience teaches agents to think more deeply about the game and the underlying solution concept, we would expect both types to recognize this tendency and select a number consistent with level-2 reasoning. In contrast, if experience simply triggers a form of pattern recognition (e.g., best responding by playing the prior round's target number), one would expect those who gained experience in a session with five experienced agents to best respond to what they observed in round 1 (level-2 play) and select a number that is lower, on average, than that chosen by those who initially gained experience matched against one or three experienced agents and observed level-1 play in their first round of play.

III. Data and Results

Table 1 presents summary statistics of player guesses by round of play, player experience, and session experience mix. The raw data illustrate our empirical strategy. As shown in the first two rows, players who have experience with the game tend to play lower numbers. Thus, as shown in rows three through seven, as the number of experienced players in the session increases, average guesses decline, particularly in the early rounds. For example, in sessions played with six inexperienced players, the average round one guess is 41.04, roughly consistent with level-one play, but in sessions with five experienced players and one inexperienced player, the average round one guess is 27.43, roughly consistent with level-two play.[11] Thus, when a player gains experience in a session with mostly inexperienced players but is placed in a new session with mostly experienced players, the winning strategy is usually inconsistent with the pattern the player saw in her initial session. Similarly, if a player gains experience in a session with mostly experienced players but is placed in a new session with mostly inexperienced players, the pattern she witnessed offers incorrect guidance if followed. These players will hold an advantage due to their experience only if they have learned to carefully anticipate the strategies that other players will follow.

[11] As frequently occurs in PBCG experiments where the target is a function of the mean guess, as the rounds proceeded a handful of players decided to play extremely high numbers in order to bring up the group average and disrupt the results. This occurred in each session type. The maximum play in each round of each session type was 100, leading to higher averages than expected in late rounds; in some cases, the average in a round is higher than the average in previous rounds. Accordingly, in all of the analysis that follows, plays that were clearly not serious, defined as choosing a number that is more than twice the previous round's target number, are dropped from the analysis. Doing so has no effect on the qualitative results.

An initial look at how players learn shows the advantage of experience.
Table 2 presents regression estimates of the following equation:

(1) PERC_ig = α + β1·ROUND3_ig + β2·ROUND4_ig + β3·ROUND5_ig + β4·ROUND6_ig + ε_ig

where PERC_ig is player i's guess in game g as a percentage of the previous round's target value, ROUND3_ig through ROUND6_ig are dummy variables indicating whether the guess took place in rounds three through six, respectively, with round two as the omitted category, and the error terms are clustered by game. Columns 1 and 2 show the results for inexperienced and experienced players, respectively, and column 3 pools the data to test whether differences among the experience types are statistically significant, adding a dummy variable indicating whether the player is experienced and interactions between this experience indicator and the round indicators.

The results are consistent with the findings of Slonim (2005) and Livingston and Skeath (2014), who show that, on average, experienced players' guesses appear one level of thinking deeper than those of inexperienced players. Inexperienced players in round two choose guesses whose average is very close to the previous round's target number (94 percent), while experienced players in round two play a best response to this move; their average guess is two-thirds of the round one target. This advantage diminishes as the game proceeds. Inexperienced players learn to lower their guesses, playing close to two-thirds of the previous round's target by round five. Experienced players also adjust their strategy, lowering their play to 56 percent of the previous round's target by round four. Overall, the inexperienced players partially catch up to the experienced players. As the results in column 3 indicate, the adjustments by inexperienced players in rounds four through six are statistically significantly larger than the adjustments made by experienced players.

The advantage held by experienced players in the early rounds gives them a greater chance of winning, but this advantage dissipates in later rounds. Table 3 displays the proportion of winners who are experienced in each game type, and tests the null hypothesis that this proportion is equal to the expected proportion if each player has an equal chance of winning. Using one-tailed tests of proportions, the proportion of winners who are experienced is statistically significantly higher than expected in the first four rounds of both the games where one experienced player plays against five inexperienced players and the games where three experienced players play against three inexperienced players. In rounds five and six, the advantage held by experienced players is not statistically significant, a reflection of the fact that by those rounds inexperienced players have learned to play smaller percentages of the previous round's target.
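As a guide to how these estimates can be reproduced, equation (1) amounts to an ordinary least squares regression with game-clustered standard errors, applied after the screening rule in footnote 11. The sketch below assumes a long-format dataset with hypothetical column names (guess, prev_target, round, game, experienced); it illustrates the estimation approach rather than reproducing the exact code behind Table 2.

```python
import pandas as pd
import statsmodels.formula.api as smf

def estimate_learning_curve(df: pd.DataFrame):
    """Estimate equation (1): the guess as a percentage of the previous
    round's target, regressed on round dummies, with standard errors
    clustered by game. Column names are hypothetical placeholders."""
    df = df[df["round"] >= 2].copy()
    # Screening rule from footnote 11: drop plays that exceed twice the
    # previous round's target.
    df = df[df["guess"] <= 2 * df["prev_target"]]
    df["perc"] = df["guess"] / df["prev_target"]
    # Round two is the lowest remaining category, so it is the omitted
    # reference level of C(round), as in Table 2.
    model = smf.ols("perc ~ C(round)", data=df)
    return model.fit(cov_type="cluster", cov_kwds={"groups": df["game"]})

# Column 3 of Table 2 pools both player types and adds interactions, e.g.:
#   smf.ols("perc ~ experienced * C(round)", data=df)
```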
These patterns in average play are consistent with the sophisticated strategies assumed by Ho and Su (2013) and Mohlin (2012). If players learn in this way, we would expect players who have experience with the game to learn to anticipate that inexperienced opponents are likely to begin by playing level one-type strategies, choosing guesses close to the previous round's target in early rounds. Similarly, we would expect inexperienced players to learn from the strategies of more successful players, and for their strategies to converge over time. Using an experiment similar to what we employ, Slonim (2005) finds a similar pattern, and interprets this evidence as supportive of models of sophisticated learning in this type of game.

Livingston and Skeath (2014), however, find that if one looks beyond average play, the behavior of players when they gain experience is not in line with what one would expect from sophisticated learners. While average play is consistent with experienced players employing a level of thinking that is one level deeper than that of inexperienced players, there is large variance in their play, leading to frequent mistakes, and when they make mistakes, they do not adjust their behavior any more effectively than inexperienced players. Further, the results presented above are also consistent with players simply recognizing patterns of behavior that increase the chance of winning without thinking carefully about the strategies that others are following. Inexperienced players see that the winning number is well below what they chose, and may simply switch to the pattern they saw their opponents have success with in previous rounds.

Thus, to distinguish between belief-based models of sophisticated learning and choice reinforcement models, which assume learning to be less sophisticated, we explore five lines of analysis.

First, we consider how inexperienced players evolve their strategies over the course of the game. Because average guesses are lower in each round when more experienced players are involved, one might expect inexperienced players to learn more quickly when paired with a larger proportion of experienced opponents. The estimates presented in Table 4 examine whether this is the case. Each column presents the estimates of the following equation separately for each round:

(2) PERC_ig = α + β1·TYPE51_ig + β2·TYPE33_ig + β3·TYPE15_ig + ε_ig

where the dependent variable is again player i's guess in game g as a percentage of the previous round's target, and TYPE51_ig, TYPE33_ig and TYPE15_ig are dummy variables indicating whether the game included one, three, or five experienced players, respectively, with zero experienced players as the omitted category. Standard errors are again clustered by game. The estimates lead to our first result:

Result 1: Inexperienced players do not react to the previous round's target differently depending on the mix of experience.

Curiously, the point estimates suggest that inexperienced players choose guesses that are a higher percentage of the previous round's target in games with one experienced player than in games with all inexperienced opponents. Still, for the most part, the differences in guesses as a percentage of the previous round's target between games with different numbers of experienced opponents are statistically insignificant.
Only in round four is there evidence that the experience mix matters, but even there, on the most important margin, the difference between games with all inexperienced opponents and games in which all opponents are experienced is not significant. Thus, the evidence suggests that inexperienced players may learn to lower their guesses, but they do not learn faster in the face of stronger evidence that doing so would be beneficial.

More importantly, however, our design focuses on examining how experienced players behave when the pattern of results they witnessed does not correspond to the likely winning strategy. Thus, second, we examine by how much experienced players miss the target in round one as a function of whether their new session has the same, more, or fewer experienced players than the session where they gained their experience, leading to our second result:

Result 2: Experienced players make larger mistakes in round one when facing a different experience mix than the one they originally encountered.

Figure 2 displays how the amount by which experienced players miss the target in round one varies with the number of experienced opponents relative to the game where they gained their experience. The figure shows that when there are fewer experienced opponents than the player originally faced, so that the new round one target is likely to be higher than what the player witnessed, players undershoot the target by 3.28 on average. But when there are more experienced opponents than the player originally faced, so that the new round one target is likely to be lower than what the player witnessed, players overshoot the target by 6.95 on average. Table 5 presents a regression which shows that the differences relative to experienced players who face the same number of experienced opponents in both games are statistically significant. These results are consistent with choice reinforcement models. The players follow the pattern they witnessed in their original game, failing to anticipate that experienced players are likely to play smaller numbers and inexperienced players are likely to play larger numbers. Thus, the evidence suggests that the experienced players do not anticipate the strategies of their opponents, as belief-based models assume.

Third, we examine how the likelihood that an experienced player plays a guess close to the target is affected by the degree to which the pattern they witnessed matches the new game. Nagel (1995) calculates "neighborhood intervals" around the choices that are consistent with each level of thinking in order to see whether choices are concentrated around those numbers.[12] We follow a similar approach by calculating the neighborhood interval around the target number for each round and then estimating how often players select a guess in that interval. Probits of the following form are estimated:

(3) Pr(NI_ig = 1) = Φ(α + β1·TARGDIFF1_ig)

where Φ denotes the standard normal distribution function, NI_ig is a dummy variable that equals one if player i in game g guesses a number within the neighborhood interval of the game g target number for a particular round, and TARGDIFF1_ig is the absolute value of the difference between the round one target in game g and the target that player i saw in round one of her previous session. Standard errors are again clustered by game.[13]

[12] Each interval has boundaries 50(2/3)^(i+1/4) and 50(2/3)^(i-1/4), rounded to the nearest integer, where i represents the level of thinking.

[13] Including game fixed effects would force the dropping of a large number of observations because there are many games in each round in which no player's guess was in the neighborhood interval of the target number.

The estimates lead to our third result:

Result 3: Experienced players are less likely to choose a number inside the neighborhood interval of the target as the gap between the target they initially witnessed and the new target increases.
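For reference, the interval indicator and the probit in equation (3) might be constructed as in the sketch below, which adapts the quarter-step boundaries of footnote 12 to each round's target number; the column names (ni, targdiff1, game) are hypothetical placeholders rather than the variables in our data files.

```python
import pandas as pd
import statsmodels.formula.api as smf

P = 2 / 3

def neighborhood_interval(center, p=P):
    """Interval around a number (here, a round's target), using the
    quarter-step boundaries of footnote 12 and rounding to integers."""
    return round(center * p ** 0.25), round(center * p ** -0.25)

def in_neighborhood(guess, target_value):
    """Indicator: does the guess fall inside the target's interval?"""
    lower, upper = neighborhood_interval(target_value)
    return int(lower <= guess <= upper)

def estimate_probit(df: pd.DataFrame):
    """Equation (3): probit of the interval indicator on the absolute gap
    between the current and previously observed round-one targets, with
    standard errors clustered by game; Table 6 reports marginal effects."""
    fit = smf.probit("ni ~ targdiff1", data=df).fit(
        cov_type="cluster", cov_kwds={"groups": df["game"]}, disp=0)
    return fit.get_margeff()
```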
Panel A of Table 6 reports the estimated marginal effect of the difference between the targets, and Panel B reports summary statistics for each variable. The results show that as the difference between the current target and the observed target grows, experienced players are less likely to choose a number in the neighborhood interval of the target. A one point increase in the absolute value of the difference between the round one target value in the current session and the round one target value the player observed originally reduces the probability that the player's guess is in the neighborhood interval of the target by 1.6 percentage points in rounds one and two, and by 0.9 percentage points in round three. Thus, witnessing a pattern that does not correspond to the player's new circumstances decreases the likelihood that the player's guess is close to the target, and this effect persists for three rounds. This evidence is again consistent with players following patterns without carefully considering the likely strategies of their opponents.

Fourth, we examine whether players coming from different histories play similarly when they go into the same game type. If players learn to think carefully about the strategies followed by others, they should realize that other players with experience are also likely to play smaller numbers that are roughly consistent with level-two type thinking, and that inexperienced players are likely to play higher numbers that are roughly consistent with level-one type thinking. Thus, when facing the same mix of experienced and inexperienced opponents, players should choose similar first-round numbers regardless of the experience mix they faced in their initial game. To investigate whether this is the case, we hold the experience mix in the new game constant, and estimate the effect that the round one target number the player saw when gaining experience (which varies largely due to the different experience mixes in the players' original games) has on the player's round one guess in the new game. The following equation is estimated:

(4) GUESS_ig = α + β1·TARGSEEN_ig + ε_ig

where GUESS_ig is player i's guess in round one of game g, TARGSEEN_ig is the round one target that player i in game g witnessed in the game where the player gained experience, and standard errors are again clustered by game. The estimates lead to our fourth result:

Result 4: Players who witnessed different target numbers when gaining experience choose different numbers when playing in new games with the same experience mix.

The results are presented in Table 7. Columns 1 through 4 examine play in final games of each experience profile, and column 5 examines play in all game types together, adding as controls dummy variables indicating the game type, with games with all experienced players used as the omitted category. In all game types except those with all experienced players, and considering all games together while controlling for game type, players guess larger numbers in round one when the round one target they saw when gaining experience was higher.
For example, in games with three inexperienced players and three experienced players, a one point increase in the target number the subject originally witnessed is associated with a 1.16 increase in the number the player guesses in the new game. These results suggest that players are following the pattern from their original play, and are not using information about the new mix of experience profiles to anticipate the strategies their new opponents are likely to follow.

Finally, we examine whether players coming from the same mix of experience play differently from each other when they go into different game types. Again, if players learn to think carefully about the strategies followed by others, they should realize that experienced players generally choose level-two type numbers and inexperienced players generally choose level-one type numbers. Thus, when facing different mixes of experienced and inexperienced opponents, players should choose different numbers depending on the proportion of players who are experienced in their current game, regardless of the experience mix they faced in the game in which they played initially. To investigate whether this is the case, the following equation is estimated:

(5) GUESS_ig = α + β1·TYPE51_ig + β2·TYPE33_ig + β3·TYPE15_ig + β4·TARGSEEN_ig + ε_ig

where GUESS_ig and TARGSEEN_ig are as previously defined, and TYPE51_ig, TYPE33_ig and TYPE15_ig are again dummy variables indicating whether the game included one, three, or five experienced players, respectively. The estimates suggest our final result:

Result 5: Players who gained experience with the same mix of experienced and inexperienced players choose similar numbers when playing in new games with different experience mixes.

The estimates are presented in Table 8. Column 1 examines play by players who gained their experience in games with one inexperienced and five experienced players. Column 2 examines play by players who gained their experience in games with three experienced and three inexperienced players. Column 3 examines play by players who gained their experience in progenitor games with six inexperienced players. The results are consistent. Players with a given experience type do not play statistically different numbers based on the experience mix in their new game, thus failing to account for the strategies their new opponents are likely to follow.

IV. Conclusion

While modern theories of learning in strategic games typically come in the flavor of belief-based models, where players form beliefs about the strategies their opponents are expected to follow and best respond to those beliefs, the validity of this central assumption has yet to be sufficiently tested. Past studies employ designs that cannot distinguish between belief-based models and choice reinforcement models, in which players do not attempt to deduce the strategies of the other players but merely follow signals about what moves have led to higher payoffs in the past.

Our design permits just such a test. We place agents in situations where the signals they observe about the payoffs associated with various moves do not offer good advice about the strategies that are likely to be successful. Thus, if players simply follow the signals, choice reinforcement models better describe the way that they learn. If, rather, they learn and think about the game more carefully, and anticipate the play of others more effectively, then belief-based models are likely the superior choice.
We find evidence that players respond to signals about payoffs received from past play regardless of how the mix of players changes, offering strong support in favor of choice reinforcement models in the context of beauty contests. Certainly, our research is not the final word on the matter. Future research should consider carefully the type of learning that takes place in various strategic contexts, so that models of the learning process in these contexts can be based on the correct set of assumptions.

Figure 1. Experimental Design
[Diagram: progenitor games with six inexperienced players are linked to second generation sessions with one, three, or five experienced players; third generation sessions with one or three experienced players; and fourth generation sessions with six experienced players and no inexperienced players.]

Table 1. Summary statistics, by round and experience mix

                                    Round 1   Round 2   Round 3   Round 4   Round 5   Round 6
Player type:
  Inexperienced (N = 342)            41.52     27.45     17.89     12.83      9.62      8.66
                                    (18.23)   (15.54)   (14.78)   (17.87)   (17.18)   (17.72)
  Experienced (N = 276)              24.40     12.80      6.84      5.90      6.56      7.11
                                    (12.95)    (7.67)    (5.94)   (14.29)   (16.21)   (18.38)
Session type:
  6 inexperienced, 0 experienced     41.04     28.00     17.59     12.59     11.43     10.56
  (N = 156)                         (18.91)   (15.27)   (11.72)   (17.51)   (20.51)   (19.12)
  5 inexperienced, 1 experienced     39.21     26.57     18.51     12.90      8.41      6.06
  (N = 144)                         (18.59)   (15.53)   (15.16)   (14.75)   (12.85)   (12.47)
  3 inexperienced, 3 experienced     33.94     20.01     12.77     10.15      6.25      7.88
  (N = 102)                         (16.61)   (12.03)   (14.92)   (19.26)    (8.23)   (20.08)
  1 inexperienced, 5 experienced     27.43     14.11      7.26      6.37      7.20      6.24
  (N = 90)                          (12.07)    (9.06)    (6.49)   (15.45)   (17.78)   (15.25)
  0 inexperienced, 6 experienced     23.45     11.24      5.10      4.65      6.52      8.24
  (N = 126)                         (15.21)    (9.10)    (5.66)   (14.97)   (19.61)   (21.59)

Standard deviations in parentheses.

Table 2. Learning over time by experienced and inexperienced players

                         Inexperienced players   Experienced players   All players
                                 (1)                     (2)               (3)
Constant                       0.94***                 0.67***           0.94***
                              (0.020)                 (0.019)           (0.019)
= 1 if experienced                                                      -0.28***
                                                                        (0.028)
round 3                       -0.08***                -0.06**           -0.08***
                              (0.028)                 (0.027)           (0.026)
round 4                       -0.21***                -0.11***          -0.21***
                              (0.028)                 (0.027)           (0.026)
round 5                       -0.28***                -0.10***          -0.28***
                              (0.028)                 (0.027)           (0.027)
round 6                       -0.28***                -0.10***          -0.28***
                              (0.028)                 (0.027)           (0.027)
experienced*round 3                                                      0.02
                                                                        (0.039)
experienced*round 4                                                      0.10**
                                                                        (0.039)
experienced*round 5                                                      0.18***
                                                                        (0.040)
experienced*round 6                                                      0.18***
                                                                        (0.040)
Observations                   1,577                   1,306             2,883
R2                             0.090                   0.016             0.121

Dependent variable is the player's guess as a percentage of the previous round's target. Standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1

Table 3. How often is the winner an experienced player?

Session type                       Round 1   Round 2   Round 3   Round 4   Round 5   Round 6
5 inexp., 1 exp.                   0.41***   0.33**    0.32**    0.30*     0.24      0.26
(expected = 0.17, N = 24)          (0.10)    (0.10)    (0.10)    (0.10)    (0.09)    (0.09)
3 inexp., 3 exp.                   0.82***   0.71**    0.71**    0.71**    0.59      0.59
(expected = 0.50, N = 17)          (0.09)    (0.11)    (0.11)    (0.11)    (0.12)    (0.12)
1 inexp., 5 exp.                   0.93      0.80      0.87      0.93      0.73      0.87
(expected = 0.83, N = 15)          (0.06)    (0.10)    (0.09)    (0.06)    (0.11)    (0.09)

Proportion of winners who are experienced players. Standard errors in parentheses. Significance tests test the null hypothesis that the proportion of winners who are experienced is equal to the expected proportion if each player has an equal chance of winning against the alternative hypothesis that the proportion of winners who are experienced is higher than the expected proportion.
*** the proportion of experienced winners is significantly different from the expected proportion at the 1% level; ** at the 5% level; * at the 10% level.

Table 4. Inexperienced player guesses as a percentage of the previous round's target value

                              Round 2   Round 3   Round 4   Round 5   Round 6
                                (1)       (2)       (3)       (4)       (5)
Constant                      0.94***   0.86***   0.71***   0.65***   0.70***
                              (0.03)    (0.02)    (0.03)    (0.04)    (0.05)
Paired with:
  1 experienced player         0.03      0.04      0.09*     0.06     -0.07
                              (0.04)    (0.04)    (0.05)    (0.06)    (0.07)
  3 experienced players        0.004    -0.05     -0.04     -0.06     -0.02
                              (0.08)    (0.07)    (0.07)    (0.07)    (0.07)
  5 experienced players       -0.13     -0.09     -0.05     -0.06     -0.03
                              (0.12)    (0.13)    (0.07)    (0.07)    (0.13)
Observations                   320       322       319       305       311
R2                             0.006     0.010     0.025     0.019     0.008

Inexperienced players who are paired with no experienced players and five other inexperienced players are the omitted category. Robust standard errors clustered by session in parentheses. *** p<0.01, ** p<0.05, * p<0.1

Figure 2. Experienced players' mistakes in the first round
[Bar chart: average difference between guess and target (vertical axis, roughly -4 to 8) by number of experienced opponents relative to the game in which experience was gained (fewer, same, more).]

Table 5. Does the difference between a player's guess and the target value vary depending on whether their experience profile matches the current game?

                                                                        (1)
More experienced players in current game than in historical game      3.33***
                                                                      (0.598)
Fewer experienced players in current game than in historical game    -6.90***
                                                                      (1.445)
Constant                                                               3.62***
                                                                      (0.325)
Observations                                                            276
R-squared                                                              0.060

Dependent variable is the difference between the player's round one guess and the round one target value. The omitted category is players who had the same number of experienced players in the current session and in the session from which they gained experience. Robust standard errors clustered by game in parentheses. *** p<0.01, ** p<0.05, * p<0.1

Table 6. Do experienced players whose experience does not match the pattern adapt?

Panel A. Dependent variable: Pr(guess in neighborhood interval of target value)
                              Round 1    Round 2    Round 3    Round 4    Round 5    Round 6
                                (1)        (2)        (3)        (4)        (5)        (6)
Difference between targets   -0.016***  -0.016***  -0.009**   -0.001     -0.001     -0.001
                              (0.004)    (0.005)    (0.004)    (0.003)    (0.004)    (0.003)
Observations                   276        276        276        276        274        272

Panel B. Summary statistics
                              Round 1    Round 2    Round 3    Round 4    Round 5    Round 6
= 1 if guess in NI of target   0.174      0.174      0.134      0.101      0.105      0.094
                              (0.380)    (0.380)    (0.341)    (0.302)    (0.307)    (0.293)
Difference between targets     8.75
                              (5.21)

Independent variable in each regression is the difference between the round one target the player saw when gaining experience and the round one target value in the current play. Robust standard errors clustered by game in parentheses in Panel A. Standard deviations in parentheses in Panel B. *** p<0.01, ** p<0.05, * p<0.1

Table 7. Do players coming from different histories play the same when they go into the same game type?
                               0-6 games  1-5 games  3-3 games  5-1 games  All games
                                  (1)        (2)        (3)        (4)        (5)
target seen in round 1           0.28       0.67***    1.16***    0.90***    0.64***
originally                      (0.29)     (0.13)     (0.22)     (0.20)     (0.13)
1-5 game                                                                     1.36
                                                                            (1.63)
3-3 game                                                                     1.82
                                                                            (1.72)
5-1 game                                                                     2.20
                                                                            (2.21)
Constant                        16.15*      7.56*     -4.57       4.01       7.05*
                                (8.31)     (3.63)     (5.65)     (4.64)     (4.02)
Observations                     126         75         51         24        276
R2                               0.01       0.11       0.31       0.19       0.07

0-6 games include zero inexperienced players and six experienced players. 1-5 games include one inexperienced player and five experienced players. 3-3 games include three inexperienced players and three experienced players. 5-1 games include five inexperienced players and one experienced player. 6-0 games include six inexperienced players and zero experienced players. Robust standard errors clustered by game in parentheses. *** p<0.01, ** p<0.05, * p<0.1

Table 8. Do people coming from the same history play differently when they go into different game types?

                              Experience from   Experience from   Experience from
                                 1-5 game          3-3 game          6-0 game
                                   (1)               (2)               (3)
                              3-3 game type     1-5 game type     3-3 game type
                                 omitted           omitted           omitted
5-1 game                         -2.16                               -7.89
                                 (2.57)                              (7.45)
3-3 game                          --               -7.57               --
                                                   (8.40)
1-5 game                                                             -0.61
                                                                     (1.91)
target seen in round 1            2.02***           0.53              0.87***
originally                       (0.50)            (0.64)            (0.13)
Constant                        -12.36             18.19              2.65
                                 (9.15)           (13.75)            (3.35)
Observations                      15                33                114
R-squared                         0.47              0.04              0.18

0-6 games include zero inexperienced players and six experienced players. 1-5 games include one inexperienced player and five experienced players. 3-3 games include three inexperienced players and three experienced players. 5-1 games include five inexperienced players and one experienced player. 6-0 games include six inexperienced players and no experienced players. Robust standard errors clustered by game in parentheses. *** p<0.01, ** p<0.05, * p<0.1

Appendix A. Instructions

Experiment Instructions

This is an experiment in economic decision-making. The experiment consists of a series of six (6) rounds. You will play against a group of 5 other people in each round. The decisions that you and the 5 other people make will determine the dollar winnings for each of you. Each player will be paid $5 for participating.

At the start of each round, you will be asked to choose a number between 0 and 100, inclusive. 0 and 100 are possible choices. Your number can include up to two decimal places, such as 12.34 or 56.78. At the same time, each of the other 5 people will also choose a number between 0 and 100. None of you will be able to see anyone else's number until after your decision is submitted.

The numbers selected by all 6 people in your group will be averaged, and then the number that is two-thirds (0.67) of that average will be calculated and announced at the end of the round. The person whose number is closest to two-thirds of the average will win $3 for that round. The 5 other people will earn $0. If more than one person ties for having a number closest to two-thirds of the average, then the payment of $3 will be divided equally among those who tied and the others will earn $0.

The website will keep track of the choices of each player in each round. It will also calculate the target number (two-thirds of the average of the numbers chosen by the 6 participants), identify the winner or winners of each round, and keep track of each player's winnings over the six (6) rounds of play. After the end of the final round, you will be required to complete a short online survey.
Upon completing the survey and logging out of the website, you will present your code card at the table upstairs near the main door of Smith and collect your winnings. At that time, you will also need to sign a receipt confirming the amount of the payment that is made to you.

References

Alba-Fernández, V., P. Brañas-Garza, F. Jiménez-Jiménez, and J. Rodero-Cosano. 2006. "Teaching Nash Equilibrium and Dominance: A Classroom Experiment on the Beauty Contest." Journal of Economic Education 37(3), pp. 305-322.

Bernheim, D. 1984. "Rationalizable Strategic Behavior." Econometrica 52(4), pp. 1007-1028.

Binmore, K. 1999. "Why Experiment in Economics?" Economic Journal 109(453), pp. F16-F24.

Burnham, T., D. Cesarini, M. Johannesson, P. Lichtenstein, and B. Wallace. 2009. "Higher Cognitive Ability is Associated With Lower Entries in a P-beauty Contest." Journal of Economic Behavior and Organization 72(1), pp. 171-175.

Camerer, C. 1997. "Taxi Drivers and Beauty Contests." Engineering and Science 1, pp. 10-19.

Camerer, C. and T. Ho. 1999. "Experience Weighted Attraction Learning in Normal-Form Games." Econometrica 67(4), pp. 827-874.

Camerer, C., T. Ho, and K. Chong. 2003. "Models of Thinking, Learning, and Teaching in Games." American Economic Review 93(2), pp. 192-195.

Camerer, C., T. Ho, and K. Chong. 2004. "A Cognitive Hierarchy Model of Games." Quarterly Journal of Economics 119(3), pp. 861-898.

Costa-Gomes, M.A. and V. Crawford. 2006. "Cognition and Behavior in Two-Person Guessing Games: An Experimental Study." American Economic Review 96(5), pp. 1737-1768.

Crawford, V., M. Costa-Gomes, and N. Iriberri. 2010. "Strategic Thinking." Mimeo, University of California at San Diego.

Dufwenberg, M., T. Lindqvist, and E. Moore. 2005. "Bubbles and Experience: An Experiment." American Economic Review 95(5), pp. 1731-1737.

Goeree, J. and C. Holt. 2004. "A Model of Noisy Introspection." Games and Economic Behavior 46(2), pp. 365-382.

Guth, W., M. Kocher, and M. Sutter. 2002. "Experimental 'Beauty Contests' with Homogeneous and Heterogeneous Players and with Interior and Boundary Equilibria." Economics Letters 74, pp. 219-228.

Ho, T., C. Camerer, and K. Weigelt. 1998. "Iterated Dominance and Iterated Best Response in Experimental 'p-Beauty' Contests." American Economic Review 88(4), pp. 947-969.

Holt, D. 1999. "An Empirical Model of Strategic Choice with an Application to Coordination Games." Games and Economic Behavior 27, pp. 86-105.

Johnson, E., C. Camerer, S. Sen, and T. Rymon. 2002. "Detecting Failures of Backward Induction: Monitoring Information Search in Sequential Bargaining." Journal of Economic Theory 104, pp. 16-47.

Keser, C. and R. Gardner. 1999. "Strategic Behavior of Experienced Subjects in a Common Pool Resource Game." International Journal of Game Theory 28, pp. 241-252.

Kocher, M. and M. Sutter. 2006. "Time is Money: Time Pressure, Incentives and the Quality of Decision-Making." Journal of Economic Behavior and Organization 61, pp. 375-392.

Kocher, M., M. Sutter, and F. Wakolbinger. 2007. "The Impact of Naïve Advice and Observational Learning in Beauty-Contest Games." Tinbergen Institute Discussion Paper TI2007-01. January.

Levitt, S. and J. List. 2007. "What Do Laboratory Experiments Measuring Social Preferences Reveal About the Real World?" Journal of Economic Perspectives 21(2), pp. 153-174.

List, J. 2003. "Does Market Experience Eliminate Market Anomalies?" Quarterly Journal of Economics 118(1), pp. 41-71.

Livingston, J.A. and S. Skeath. 2014. "A Step Ahead? Experienced Play in the P-Beauty Contest." Working paper.
McKelvey, R. and T. Palfrey. 1995. "Quantal Response Equilibria for Normal-Form Games." Games and Economic Behavior 10(1), pp. 6-38.

Moulin, H. 1986. Game Theory for the Social Sciences. 2nd ed. New York: New York University Press.

Nagel, R. 1995. "Unraveling in Guessing Games: An Experimental Study." American Economic Review 85(5), pp. 1313-1326.

Pearce, D. 1984. "Rationalizable Strategic Behavior and the Problem of Perfection." Econometrica 52(4), pp. 1029-1050.

Sbriglia, P. 2008. "Revealing the Depth of Reasoning in P-Beauty Contest Games." Experimental Economics 11, pp. 107-121.

Slonim, R. 2005. "Competing Against Experienced and Inexperienced Players." Experimental Economics 8, pp. 55-75.

Sonnemans, J. and J. Tuinstra. 2008. "Positive Expectations Feedback Experiments and Number Guessing Games as Models of Financial Markets." Tinbergen Institute Discussion Paper TI2008-076. August.

Stahl, D. and P. Wilson. 1995. "On Players' Models of Other Players: Theory and Experimental Evidence." Games and Economic Behavior 10(1), pp. 218-254.

Sutter, M. 2005. "Are Four Heads Better Than Two? An Experimental Beauty-Contest Game with Teams of Different Sizes." Economics Letters 88, pp. 41-46.

Thaler, R. 1997. "Giving Markets a Human Dimension." Financial Times: Survey – Mastering Finance 6, p. 2. June 16.