Predicting Bargaining Behaviors: Out-of-Sample Estimates from a Social Utility Model with Quantal Response

Arnaud De Bruyn ♣
The Pennsylvania State University

Gary E. Bolton
The Pennsylvania State University

Draft version 3.9
June 2003

♣ 701 Business Administration Building, Smeal College of Business Administration, The Pennsylvania State University, University Park, PA 16802. Tel.: (+1) (814) 865-4091. Fax: (+1) (814) 865-3015. Email: [email protected].

Substantial evidence from experimental game theory suggests that people are not only motivated by their selfish monetary gains when they play games, but also take into account non-monetary considerations such as equity or reciprocity. In this paper, we lay the groundwork to quantify these motives. We fit a model to 1-round sequential (ultimatum) bargaining game data, and use the model to obtain out-of-sample estimates of play in multiple-round sequential bargaining games. The model embeds a social utility function in a quantal response framework and has 3 fitted parameters: 1 to capture utility trade-offs, 1 to represent heterogeneity and randomness of behaviors, and 1 to capture experience effects. The data come from 6 previously reported studies, encompassing 19 distinct parameterizations of the sequential bargaining game. The model is remarkably accurate with respect to directional findings and to out-of-sample estimates of average first offers, accounting for 90% of the variability in the data. Out-of-sample estimates of rejection behavior account for nearly half the variability. Furthermore, the parameter estimates achieve reasonable stability when fitted to different datasets.
The results suggest that the influence of social utility on bargainer decisions can be reliably quantified for forecasting, and that the model can be fairly generalized to predict actual behaviors in very different game settings.

TABLE OF CONTENTS

INTRODUCTION
A UTILITY-DECISION FRAMEWORK MODEL FOR THE ULTIMATUM GAME
    Operationalization of the ERC function
    The ultimatum game
    The seller
    The buyer
    A closer look at the decision parameters
ESTIMATING THE PARAMETERS OF THE MODEL
    Optimization procedure and results
    A note on scaling
    Summary
REPLICATIONS OF OTHER GAMES
    Binmore, Shaked and Sutton (1985)
    Neelin, Sonnenschein and Spiegel (1988)
    Güth and Tietz (1988)
    Ochs and Roth (1989)
    Bolton (1991)
        Two-round bargaining game
        Truncation game
    Güth and van Damme (1998)
    Summary
STABILITY OF THE PARAMETER ESTIMATES
ULTIMATUM GAME SIMULATIONS WITH ALTERNATIVE ASSUMPTIONS ABOUT PLAYERS' CHARACTERISTICS
DISCUSSION AND CONCLUSIONS
    Comparisons to reinforcement learning models
    Limitations and directions for future research
    Contributions and conclusion
REFERENCES

INTRODUCTION

When agents' behaviors are hypothesized to be solely driven by their self-interested material gains, classical equilibrium game theory fails to explain numerous experimental findings. People do reject positive offers in ultimatum games; they do give money to other players in dictator games; and they make decisions that are by all means inconsistent with what classical game theory would have predicted. It is now widely acknowledged that people do not seem to be motivated only by their selfish, monetary interests when they make decisions, and the urge to better understand such phenomena has triggered an entire new stream of research in business economics and game theory.

In the last decade, various models have been proposed to include a variety of non-monetary motivations in people's behaviors, such as fairness (or aversion to unfairness), equity, reciprocity, or altruism (Bolton 1991; Bolton and Ockenfels 2000; Camerer 1990; Fehr et al. 1993; Fehr and Schmidt 1997; Ochs and Roth 1989; Rabin 1993). Others have introduced reinforcement-learning models to explain how players adapt their behaviors (sometimes not optimally) to a competitive game environment (Camerer and Ho 1999; Cheung and Friedman 1997; Cheung and Friedman 1995; Cooper and Feltovich 1996; Cox et al. 1995; Erev et al. 1999; Erev and Roth 1995; Erev and Roth 1998; Feltovich 2000; Grosskopf 1999; Hopkins 1999; Rapoport and Erev 1998; Swarthout and Walker 1999).
In this paper, we investigate one particular model of social utility that may explain numerous so-called "inconsistent" behaviors in games, namely the Equity-Reciprocity-Competition theory, hereafter ERC (Bolton 1991; Bolton and Ockenfels 2000). This theory suggests that people gauge the outcome of a game in terms of both absolute and relative money: when people form an opinion about how well they performed in a game, they not only take into account their actual payoffs, but also compare their payoffs to others'.

Bolton and Ockenfels (2000) propose a specific ERC utility functional form where the outcome of a game is described by a scaling factor c (i.e., the size of the pie) and the proportion σ of the pie the player actually gets. A player's utility is modeled as a weighted sum of his absolute gains (the more the better) and his relative gains (the more equitable the better, that is, the closer to a 50:50 split if only two players are involved). Two amplitude parameters, a and b, measure the relative importance of these two conflicting motivations. If a is equal to 0, the player's only motivation is fairness and equity; if b is equal to 0, the utility function is equivalent to the purely self-interested material gain model.

$$U(\sigma) = a c \sigma - \frac{b}{2}\left(\sigma - \frac{1}{2}\right)^{2}$$

Equation 1 – ERC social utility function (Bolton and Ockenfels 2000). Utility is a function of the size of the pie c, the proportion σ of the pie the player gets, and two parameters a and b measuring the relative importance of absolute and relative gains. a is usually set equal to 1.

This model has several interesting properties. First, the pure-pecuniary model can be expressed as a special case of this more general utility function (i.e., b=0). Second, Equation 1 spans a large variety of mixed self-interested and altruistic motivations, which can be scaled by simply modifying the amplitude of either a or b.
Finally, the authors have demonstrated that this particular functional form could explain several surprising findings in various versions of the ultimatum game (including the Güth-van Damme 3-person bargaining version) as well as in the dictator game (Bolton and Ockenfels 2000; Bolton and Ockenfels 1998), at least theoretically.

But is the ERC utility function only a normative or descriptive model, or can it be used to predict actual players' behaviors? In other words, is it possible to find the actual value of the parameter b (a being fixed to a constant) in one particular game, and if so, is this parameter stable enough to predict how players might behave in other games? The aim of this paper is to show that this is actually the case, and to demonstrate the good predictive accuracy of the ERC theory in a large variety of games.

This paper is organized in five sections. First, we develop a utility-decision framework, based on the underlying ERC theory, to describe how people make their decisions when they play the simple version of the ultimatum game. In other words, we build a model that links the ERC utility function to the actual decisions made by players through a decision rule that incorporates players' heterogeneity, uncertainty and choice randomness. Then, in the second section, we fit the parameters of this model to the observations made by Roth et al. in their multi-country bargaining experiments (Roth et al. 1991), and describe the model's fit and face validity properties.
In the third section, we show that the proposed framework (i.e., ERC utility function + Gibbs decision rule) can predict to a very large extent how people actually played a variety of games (even games played at different points in time with different populations), namely (i) the 2-round, (ii) the 3-round and (iii) the 5-round versions of the ultimatum game, (iv) the truncation game, and (v) the Güth-van Damme 3-person bargaining game. Out-of-sample predictions are compared to observations reported in 6 distinct papers, totaling 19 different experimental conditions. All major experimental findings are directionally replicated, often with surprising accuracy. In the fourth section, we re-estimate the parameters of the model on two different, large datasets, and show an adequate stability of the parameter estimates. We conclude the paper with a discussion of our results and suggest several directions for future research.

A UTILITY-DECISION FRAMEWORK MODEL FOR THE ULTIMATUM GAME

Operationalization of the ERC function

In this paper, we amend the ERC utility function (Equation 1) in three ways. First, we set a equal to 1, without loss of generality. Second, one can note that, in the original model, only the absolute term of the utility function was multiplied by c. This did not affect the theoretical results (c was set equal to 1 for the sake of the argument throughout Bolton and Ockenfels' paper), but we argue that the model is more realistic when both the absolute and relative terms of the utility function are multiplied by the size of the pie.
Without this modification, (i) players would be upset by an unfair split of a pie of size 0 (that is, players would be upset by having 0% of nothing, but not by having 50% of nothing); and (ii) players would be proportionally more driven by their selfish interests as the pie size grows, an assumption that has also been partly contradicted in the literature. Multiplying the relative element of the utility function by the size of the pie takes care of these two inconsistencies without affecting the theoretical discussion originally proposed by the authors.

Finally, it has been argued that the ERC function is not necessarily symmetric, and that the importance of positive reciprocity (disutility generated by unfair splits in disfavor of others) and negative reciprocity (disutility generated by unfair splits in one's disfavor) might differ. In his first model, Bolton (1991) argued that unfair splits generated disutility if and only if the inequity was in disfavor of the player (i.e., no positive reciprocity, b=0 if σ>½). Later, Bolton and Ockenfels (2000) relaxed this assumption and introduced a perfectly symmetric reciprocity effect in the ERC function. Others have argued that the truth might lie in between: players' behaviors might be driven by both positive and negative reciprocity, but the amplitude of negative reciprocity is likely to be larger. In any case, the games we intend to fit and replicate in this paper are likely to involve negative reciprocity only, and mainly trigger competitive behaviors among players. Although we do believe players might be driven by positive reciprocity considerations too, the datasets we analyze do not give us the chance to estimate such considerations, and we therefore assume perfect asymmetry in the utility function.
The ERC utility function we will use throughout this paper is therefore as follows:

$$U(\sigma) = \begin{cases} c\left[\sigma - \dfrac{b}{2}\left(\sigma - \dfrac{1}{2}\right)^{2}\right] & \text{if } \sigma < \tfrac{1}{2} \\[6pt] c\,\sigma & \text{if } \sigma \ge \tfrac{1}{2} \end{cases}$$

Equation 2 – Utility function (ERC functional form) used throughout this paper. c is the size of the pie, σ the proportion of the pie the player gets, and b measures the relative importance of relative gains (without positive reciprocity).

It contains only one parameter to be estimated, namely b, while both the size of the pie and the proportion the player gets are exogenous to the utility function.

The ultimatum game

In the simplest version of the ultimatum game (one round per game), say α are buyers and β are sellers. α makes an offer to β and proposes that he keep a proportion σ of a pie of size c. If β accepts, the pie is divided accordingly: β gets σc and α gets (1−σ)c. If β refuses, both α and β get nothing. For convenience, we consider that σ can only take a finite number of values, and varies between 0 and 1 in increments of 0.1. All possible values for σ are noted σi, with 0 ≤ i ≤ 10, and σ0=0, σ1=0.1, …, σ10=1.

The seller

Say Pβ(σi) is the aggregate probability for β, the sellers, to accept an offer of σi. For instance, Pβ(0.5) = .95 means that, on average, 95% of players β accept an offer of σ=0.5 (i.e., an equal split of the pie). By definition, Pβ(σi) ∈ [0,1] ∀i.

If the player accepts the offer, the pie is split accordingly, and his utility is a function of both "absolute" and "relative" money as given by the ERC utility function (see Equation 2). If he refuses the offer (noted ∅), no player receives any payoff. Since the size of the pie shrinks to nothing, U(∅), the utility associated with a refusal, can be obtained by setting c=0 in the ERC equation. Consequently, U(∅) = 0, ∀σi.

How can we model the probability that sellers will accept a particular offer?
One of the basic assumptions of most mathematical learning theories proposed in psychology is that choice behavior is probabilistic (Bush and Mosteller 1955; Estes 1950; Luce 1959; Suppes and Atkinson 1960). At an aggregate level, we therefore hypothesize that Pβ(σi), the probability of accepting a particular offer σi, can be expressed by a Gibbs distribution:

$$P_\beta(\sigma_i) = \frac{e^{\tau_\beta U(\sigma_i)}}{e^{\tau_\beta U(\varnothing)} + e^{\tau_\beta U(\sigma_i)}}$$

where U(∅) and U(σi) are the utilities of rejecting or accepting the offer, respectively. Since U(∅) = 0, it follows that:

$$P_\beta(\sigma_i) = \frac{e^{\tau_\beta U(\sigma_i)}}{1 + e^{\tau_\beta U(\sigma_i)}}$$

Equation 3 – Sellers' probability to accept an offer of σi.

In Equation 3, τβ is a positive coefficient of certitude¹. We will also use the term "decision parameter" interchangeably. If τβ → ∞, Pβ(σi) → 1 ⇔ U(σi) > U(∅), and Pβ(σi) → 0 otherwise; the larger τβ, the higher the probability that the seller follows the strategy that produces the highest utility. At the other extreme, if τβ=0, the seller has a 50:50 chance of accepting any offer, independently of the actual value of σi.

In this model, the meaning of τβ is twofold. First, it is an indicator of individuals' choice consistency. It has been observed in various experiments that some players are inconsistent over time, accepting an offer of .4 in one game and refusing a better offer of .5 in the very next game. The probabilistic nature of the decision rule, introduced when τβ takes a relatively small value, takes care of that phenomenon and introduces some uncertainty in the strategy the same player will follow over time. In other words, a small value for τβ underlines the fact that players might very well not be certain of their own preferences, or might show some inconsistencies in their choices.
Also, since Equation 3 is an aggregated decision rule, introducing some uncertainty takes care of the heterogeneity of different players' strategies to either accept or refuse the same offers. Consequently, the coefficient of amplitude τβ is an elegant way to aggregate both individuals' uncertainty and choice inconsistencies as well as players' heterogeneity.

¹ This term is similar to the "alpha-rule" used in some Logit models to link products' preferences to their actual market shares.

The buyer

Say Pα(σi) is the aggregate probability for α, the buyers, to make an offer of σi to the sellers. For instance, Pα(.5) = .2 indicates that buyers propose on average to split the pie equally 20% of the time. By definition,

$$\sum_{i=0}^{I} P_\alpha(\sigma_i) = 1.$$

We hypothesize that buyers' decision to offer σi (and to propose to keep 1−σi for themselves) follows a Gibbs distribution, too:

$$P_\alpha(\sigma_i) = \frac{e^{\tau_\alpha E\left(U(1-\sigma_i)\right)}}{\sum_{j=0}^{I} e^{\tau_\alpha E\left(U(1-\sigma_j)\right)}}$$

Equation 4 – Buyers' decision rule: probability for the buyers to make an offer of σi.

where E(U(1−σi)) is the expected utility of offering σi to the seller. If buyers have perfect knowledge of the true probabilities for the sellers to accept any particular offer and if they are risk-indifferent (two assumptions we will make for the time being), the expected utility of an offer σi is equal to Pβ(σi)·U(1−σi), where U(1−σi) is given by the ERC utility function (with the same parameters for buyers and sellers), and thus:

$$P_\alpha(\sigma_i) = \frac{e^{\tau_\alpha P_\beta(\sigma_i)\, U(1-\sigma_i)}}{\sum_{j=0}^{I} e^{\tau_\alpha P_\beta(\sigma_j)\, U(1-\sigma_j)}}$$

Equation 5 – Buyers' decision rule: probability for the buyers to make an offer of σi, re-expressed as a function of sellers' probability to accept such offer.

This expression guarantees that the probabilities Pα(σi) sum to 1 over i = 0, …, I, and that the offers with the highest expected utilities are likely to be chosen more often.
Again, if τα → ∞, buyers systematically make the offer σi that procures the highest expected utility. If τα=0, Pα(σi) = Pα(σj), ∀i, j.

A closer look at the decision parameters

So far, we have hypothesized that buyers and sellers had two different decision parameters, τα and τβ, and that these parameters were independent of players' experience and constant over time. We now relax these hypotheses.

Independently of the psychological aspects of human decision-making they capture, decision parameters are also influenced by the number of alternatives players have to choose from. For instance, if the number of irrelevant alternatives were artificially increased while keeping the decision parameter constant, the probability of choosing the action with the highest utility would decrease. To reconcile players' decision parameters while taking into account such "dilution" effects, we hypothesize that α and β players' decision parameters are equal to a common "root", multiplied by the natural logarithm of the number of alternatives players have to choose from (that is, either 11 for the first player or 2 for the second).

Furthermore, the model as is does not take into account learning effects that are likely to occur, and does not differentiate decisions made during the first games from decisions made later in the experiment, when players had more experience. We present here what are in our opinion the two most likely learning phenomena. First, buyers probably begin the game with a large variety of expectations about sellers' likelihood to accept various offers, and these expectations converge toward more reliable estimates after a few games. That creates a greater homogeneity of buyers' beliefs about the most likely outcomes of a given offer, and thus a greater homogeneity of buyers' behaviors (i.e., offers).
Second, more experienced sellers should learn what a "fair offer" is. Players should eventually become more certain whether they should accept or refuse any given offer or, in other words, they should learn their own preferences and make more consistent decisions after a few games.

Given the interpretation of the decision parameter (i.e., an indicator of players' heterogeneity and choice randomness), these two learning phenomena should translate into decision parameters that increase as a function of players' experience. Therefore, we re-express the decision parameters as a linear function of the number of games played, where τ0 is the intercept (i.e., the value of the decision parameter at the very first game), and τ1 is the slope representing the increase in the decision parameter due to players' experience. Thus, we have:

$$\tau_\alpha = (\tau_0 + \tau_1 g)\cdot\ln(11), \qquad \tau_\beta = (\tau_0 + \tau_1 g)\cdot\ln(2)$$

Equation 6 – Decision parameters of the players re-expressed as a function of a common parameter τ0 and a learning trend τ1, scaled to the number of alternatives (g is the number of games already played).

ESTIMATING THE PARAMETERS OF THE MODEL

Optimization procedure and results

The above decision-utility framework attempts to normatively model how people play the simple version of the ultimatum game. It has three parameters to be estimated: b, the unique parameter of the ERC utility function, and τ0 and τ1, the two parameters that drive players' coefficients of certitude. Note that the parameter c (the size of the pie) is a function of the game design and is given a priori.

We fit the model to the multi-country bargaining experiment conducted by Roth and his colleagues (Roth and Erev 1995; Roth et al. 1991). In this well-known experiment, 270 participants from 4 different countries played 10 games each of the simple ultimatum game, either as α or as β.
The dataset contains 1,350 observations, each with an offer (σi) and an outcome (i.e., the seller either accepts or rejects the offer). The size of the pie c was $10, divided into 1,000 tokens of 1¢ each, but for estimation purposes we simplify the dataset to the case of 10 tokens of $1 each.

We find the optimal values of the three parameters of the model using maximum likelihood estimation. As nothing is assumed about the underlying process that generated the data, standard deviations are estimated using nonparametric bootstrap variance estimation (for a review of the advantages of this method, see Davison and Hinkley 1997; Efron and Tibshirani 1993; Manly 1997; Shao and Tu 1996), and are shown in parentheses.

The parameter estimates we obtained are b=10.742 (.995), τ0=.3478 (.0189) and τ1=.0159 (.0038). All parameters are significant at p<.01. As expected, b is positive (players are not only greedy but also seem to evaluate their gains in terms of "relative" payoffs), τ0 is positive but relatively small, and τ1 is positive and significant (there is a learning trend).

The correlations between observations and predictions are high, with Rα=.907 and Rβ=.973. Specifically, the model predicts that the average offer will be 4.06, with an average rejection rate of 30.9%. These numbers are actually 4.07 and 26.4% in the original dataset. Successful Student t-tests at p<0.05 on these two measures confirm a good statistical fit of the model.
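The fitted model can be evaluated directly from Equations 2, 3, 5 and 6 with the estimates reported above. The following is a minimal sketch of that forward computation, not the estimation procedure itself. The function names are hypothetical, and two details are assumptions rather than statements from the text: g is taken to be 0 for the first game, and the summary statistics are simple averages over 10 games.

```python
import math

# Estimates reported in the text; pie of 10 tokens of $1 each.
B, TAU0, TAU1 = 10.742, 0.3478, 0.0159
C = 10.0
SIGMAS = [i / 10 for i in range(11)]  # offers 0.0, 0.1, ..., 1.0


def utility(share, c=C, b=B):
    """Equation 2: linear at or above an equal split, penalized below it."""
    if share >= 0.5:
        return c * share
    return c * (share - 0.5 * b * (share - 0.5) ** 2)


def p_accept(sigma, g):
    """Equation 3, with tau_beta from Equation 6 (two alternatives)."""
    tau_beta = (TAU0 + TAU1 * g) * math.log(2)
    return 1.0 / (1.0 + math.exp(-tau_beta * utility(sigma)))


def p_offer(g):
    """Equation 5, with tau_alpha from Equation 6 (eleven alternatives)."""
    tau_alpha = (TAU0 + TAU1 * g) * math.log(len(SIGMAS))
    expected = [p_accept(s, g) * utility(1 - s) for s in SIGMAS]
    top = max(expected)  # subtracting the max keeps the exponentials stable
    weights = [math.exp(tau_alpha * (e - top)) for e in expected]
    total = sum(weights)
    return [w / total for w in weights]


def summary(games=10):
    """Mean offer (as a share of the pie) and rejection rate, averaged over games."""
    mean_offer = rejection = 0.0
    for g in range(games):  # assumption: g = 0 in the first game
        probs = p_offer(g)
        mean_offer += sum(p * s for p, s in zip(probs, SIGMAS)) / games
        rejection += sum(p * (1 - p_accept(s, g))
                         for p, s in zip(probs, SIGMAS)) / games
    return mean_offer, rejection


if __name__ == "__main__":
    offer, reject = summary()
    print(f"mean offer: {10 * offer:.2f} tokens, rejection rate: {100 * reject:.1f}%")
```

Under these assumptions the output should land close to the predicted average offer and rejection rate quoted above, although exact agreement depends on how the authors indexed g and averaged across games.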
σi     Pα(σi): Observations   Pα(σi): Model   Pβ(σi): Observations   Pβ(σi): Model
0.0    .006                   .006            .000                   .026
0.1    .020                   .012            .333                   .113
0.2    .107                   .053            .424                   .312
0.3    .076                   .223            .534                   .561
0.4    .416                   .359            .714                   .734
0.5    .338                   .253            .928                   .811
0.6    .025                   .083            .855                   .847
0.7    .003                   .010            1.000                  .873
0.8    .002                   .000            1.000                  .893
0.9    .000                   .000            1.000                  .909
1.0    .007                   .000            1.000                  .924

Table 1 – Probability for the buyer to make an offer of σi, and probability for the seller to accept such offer: observations vs. model (source: Roth et al., 1991).

[Chart: probability of making an offer of σi, for σi from 0.0 to 1.0; series: Observations, Model]

Figure 1 – Probability for the buyer to make an offer of σi: observations vs. model (source: Roth et al., 1991).

[Chart: probability of accepting an offer of σi, for σi from 0.0 to 1.0; series: Observations, Model]

Figure 2 – Probability for the seller to accept an offer of σi: observations vs. model (source: Roth et al., 1991).

The following figure shows the shape of the estimated ERC utility function.

[Chart: utility (from −15 to 10) for σi from 0.0 to 1.0; series: b=0, b=10.742]

Figure 3 – Seller's utility function for different values of σi (and a pie size of $10). The buyer's utility function is symmetric (i.e., the main argument is [1−σi] instead of σi).

The game we used to fit the model did not seem appropriate to estimate possible positive reciprocity considerations, and since no estimation of the right part of the curve could be performed given the data at hand, we assumed perfect linearity of the social utility function when σ>.5. We now check the statistical validity of this assumption. The log-likelihood of the asymmetric model was −5,266.5, with a standard deviation of 80.1 (estimated using bootstrap variance estimation).
The log-likelihood of the model with perfectly symmetric reciprocity considerations (b≠0, ∀σ) was −5,254.2, with a standard deviation of 76.2. Since we cannot reject the null hypothesis that no positive reciprocity considerations are involved in this game, and since no statistical difference could be found between the two models, the original, asymmetric model was retained.

In terms of learning, the model predicts that buyers will learn to offer more to sellers over time, which will in turn increase the likelihood of acceptance. Figure 4 shows how learning affects the mean opening offer (model versus observations).

[Two charts: probability of making an offer of σi, for σi from 0.0 to 1.0. Left chart (model): series Round 1, Round 10. Right chart (observations): series Round 1-2, Round 9-10.]

Figure 4 – Probability for the buyer to make an offer of σi during the first and last periods, model (left chart) versus observations (right chart). Players learn that small offers are likely to be rejected, and learn to make offers that tend to converge around 40% of the pie.

A note on scaling

What if the size of the pie were not $10 but $1, or $100? The shape of the ERC utility function (see Equation 2) would be unaltered, but its amplitude would change. This effect is intuitively easy to support: having half of a $100 pie should procure more satisfaction than having half of a $10 pie. Likewise, a player should be more upset by an unfair split of a large pie than by a similarly unfair split of a smaller one.

But what about the Gibbs decision rules employed to transform expected levels of utility into probabilities of actions? Obviously, scale does matter in Gibbs distributions.
If the amplitude of the utility function were increased, the distribution of actions' likelihoods would be much more concentrated around the ones with the highest expected rewards. At the very extreme, if the size of the pie were infinite, the player would choose the action with the highest expected utility with a probability of 1. In fact, increasing the size of the pie has the same effect in this model as increasing the decision parameter. Consequently, the model structure implies that players select actions with the highest expected rewards more frequently when the size of the pie is large. In other words, players' decisions are expected to be more consistent and less erratic when the game's stakes are high.

Summary

We have presented evidence that the ERC utility function, linked to actual players' decisions by a Gibbs distribution that incorporates both players' heterogeneity and individuals' choice randomness, offers an appropriate fit for the simple version of the ultimatum game. We have also shown that the strategies followed by both buyers and sellers tended to become more homogeneous and less random over time. Players learn what the most likely outcomes of the game are, and make more clear-cut and consistent decisions when experienced, which translates into statistically significant learning trends in the decision parameters.

REPLICATIONS OF OTHER GAMES

The model presented herein is based on the solid theoretical framework of the ERC utility function, and offers an appropriate representation of the specific dataset we used to compute the parameters of the model. But can it accurately predict outcomes of other games? In other words, to what extent can the specific parameters we computed to fit this particular dataset be generalized to other games, other populations, other experiments?
To assess the generalization properties of our model (and of its estimated parameters), we compute the out-of-sample predictions of the model for 5 other games (the 2-round, 3-round and 5-round versions of the ultimatum game, the truncation game, and the Güth-van Damme 3-person bargaining game), and compare the outcomes we obtain to well-known experiments in the literature. Expected utilities are computed using perfect backward induction and, unless specified otherwise, simulations of repeated games are averaged over the same number of games (with learning trend) as in the experiment used as a benchmark.

Although comparisons between predictions made by the model and the actual outcomes observed in these games are performed only ex post, we apply our model without any additional information specific to the games played except for the structure (i.e., rules) of the games themselves, and we show that our predictions are often very similar to observations.

Binmore, Shaked and Sutton (1985)

The first experiment on the so-called ultimatum game was run by Güth and his colleagues in 1982. The results are consistent with the ones replicated later by Roth et al. in their multi-country bargaining experiment, and are therefore not reproduced here (Güth et al. 1982). The main finding, similar to what has been shown in the previous section, was that the pecuniary model of game theory had little predictive power and could not explain subjects' behaviors in the simple ultimatum game. This work triggered many responses, the first of which was made by Binmore, Shaked and Sutton (1985).

In Binmore et al.'s experiment, 164 subjects played a 2-round version of the ultimatum game (Binmore et al. 1985), and bargained over a pie of 100 pence (approximately equivalent to $1.15 at the time of the experiment, i.e., c=1.15).
In the 2-round version of the ultimatum game, also known as the 2-round sequential bargaining game, the game does not stop if the second player refuses the offer. Instead, a second round is played in which the roles are inverted: the second player has the opportunity to make a counteroffer to the first player, and the latter can either accept or refuse it. If both the first and the second player's offers are rejected, players get no payoff.

The game is made more complex by incorporating discount factors in the second round. If players do not agree in the first round, the actual portion of the pie the two players receive (if they eventually agree) is multiplied by δα and δβ (∈ [0,1]), respectively. The closer to 0 the discount factors are, the more the pie shrinks, which is a strong incentive for both players to reach an agreement during the first round. Note that the combination of discount factors (0, 0) corresponds to the simple version of the ultimatum game.

The discount factor was ¼ (i.e., the pie size was 25 pence in the second round), and was the same for both players α and β. Game theory suggests that players β should accept any offer greater than or equal to 25% of the pie, allowing the first players to keep 75% for themselves in the first round. Binmore and his colleagues report a mean opening offer of 41.6 pence in the first game; our model predicts a mean opening offer of 41.4 pence, very closely matching the authors' findings.

For completeness, we should also mention that the authors asked the β players what offer they would hypothetically make if they had to play a second game, this time as proposers. The average answer was 33.2 pence.

Neelin, Sonnenschein and Spiegel (1988)

Neelin, Sonnenschein and Spiegel challenged Binmore, Shaked and Sutton's findings by extending the bargaining sequences to more than 2 rounds of negotiations (Neelin et al. 1988), and played 2-, 3- and 5-round bargaining games.
The 3-round version of the ultimatum game is an extension of the 2-round version, in which players α can refuse the counteroffer and go to a third round of negotiations by making a counter-counteroffer that players β can either accept or reject. The 5-round version is a still longer version of the game that relies on the same sequential bargaining mechanisms. Discount factors are applied cumulatively at each round.

This experiment is one of the few that extend the ultimatum game up to five rounds of negotiations (Neelin et al. 1988), and is therefore worth mentioning.

In this experiment, 80 participants played 3 multi-round ultimatum games, either as player α or as player β (they did not switch roles). The pie size was $5 for all 3 games (c=5). The first game was a 2-round version with symmetric discount factors of ¼ (i.e., δα = δβ = 0.25); the second was a 3-round version with a discount factor of ½; and the last was a 5-round ultimatum game with a discount factor of 1/3. The discount rates were chosen so that the seller's Rubinstein mean opening offer was $1.25 in every game (Rubinstein 1982).

The observed mean opening offer was $1.37 (27.4%) in the 2-round ultimatum game, replicating the finding of Binmore and his colleagues that the pecuniary model of game theory had some predictive power. With a pie size of $5, our model predicts $1.73 (34.6%). As a reminder, Binmore's results were 41.6% in the first game, and 33.2% in their hypothetical second game.

This finding did not hold for the 3-round and 5-round versions, though:

R1: In the 3-round and 5-round versions of the ultimatum game, neither “gamesmanship” nor “fairmanship” theories predict the actual outcomes.

Neelin et al. found that mean opening offers in the 3-round and 5-round versions of the ultimatum game were inconsistent with both the pecuniary model of game theory and the equal split predictions.
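The pecuniary benchmark for these three games can be reproduced with a few lines of backward induction. The sketch below is only an illustration under stated assumptions: the function name is hypothetical, the pie is treated as perfectly divisible, both players share the same discount factor, and a responder who is indifferent is assumed to accept. Under these assumptions the 2-round and 3-round games give exactly one quarter of the pie ($1.25 of $5), and the 5-round game gives 20/81 of the pie, which is close to one quarter.

```python
from fractions import Fraction


def pecuniary_opening_offer(rounds, delta):
    """Round-1 offer to the responder, as a share of the original pie,
    when both players maximize money only (illustrative helper)."""
    # In the last round the proposer keeps the whole (shrunken) pie.
    proposer_share = delta ** (rounds - 1)
    # Walk back to round 1: the proposer of round r must leave the
    # responder what the responder would keep as proposer in round r + 1.
    for r in range(rounds - 1, 0, -1):
        pie = delta ** (r - 1)
        proposer_share = pie - proposer_share
    return 1 - proposer_share


pie_dollars = 5
for rounds, delta in [(2, Fraction(1, 4)), (3, Fraction(1, 2)), (5, Fraction(1, 3))]:
    share = pecuniary_opening_offer(rounds, delta)
    print(rounds, "rounds:", share, "of the pie, about $%.2f" % float(share * pie_dollars))
```

Exact fractions are used so that the shares can be compared without rounding noise.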
The mean opening offers were $2.36 in the second game (3 rounds), and $1.71 in the third (5 rounds). Our model predicts $1.80 and $1.77, respectively.

Interestingly, Neelin and his colleagues concluded that players were acting myopically, and wrote that “[sellers] always act as if they were in the 2-round game.” They seem to suggest that the increasing number of rounds (and players' inability to find the subgame perfect equilibrium as the game becomes more complex) was the principal factor explaining the discrepancies between observations and Rubinstein's predictions. In our opinion, this explanation has some merit; one might find it difficult to believe that players are able to mentally simulate up to five rounds of negotiations using backward induction and correctly find the subgame perfect equilibrium.

Another explanation of Neelin et al.'s findings could be that the key driver of the results was the discount factors, not the number of rounds. This alternative explanation could be formulated as follows:

R2: Mean opening offers increase as discount factors increase.

In other words, independently of the number of rounds played, the mean opening offer gets closer to the 50:50 split as the discount factors increase and favor multiple rounds of negotiations. Although our model fails to replicate the amplitude of this trend, it predicts a similar phenomenon:

Discount factor   0.25   0.33   0.50
# of rounds       2      5      3
Observations      1.37   1.71   2.36
Simulations       1.73   1.77   1.80

Table 2 – Mean opening offers in the 2-, 3- and 5-round versions of the ultimatum game, with different discount factors, observations versus model (source: Neelin, Sonnenschein and Spiegel, 1988).

Güth and Tietz (1988)

Güth and Tietz had a similar intuition, and ran a 2-round version of the ultimatum game, only with much more extreme values for the discount factors (Güth and Tietz 1988).
42 players participated in the experiment, and played the 2-round ultimatum game under different combinations of pie size (DM 5, 15 and 35) and discount factors (the discount factor was the same for both players, and equal to either 0.1 or 0.9). Note that the pecuniary model of game theory suggests that in the δ=0.9 condition, player α should propose 90% of the pie to player β and keep 10% for himself.

R3: The mean demanded shares are always greater than 0.5.

On average and across all conditions, players α demanded 64.6% of the pie and proposed 35.4% to the second player. Our model predicts 62.1% and 37.9%, respectively.

R4: When the time costs of bargaining are rather low, subjects tend to bargain longer (p.10).

The authors note a dramatic difference in the first-round rejection rate across conditions. In the δ=0.1 condition, the rejection rate is 19.0%, but it increases to 61.9% in the δ=0.9 condition. Our model predicts an increase from 38.1% to 55.1%.

Note that the data were not sufficient to highlight a clear influence of the size of the pie on either the rejection rates or the mean opening offers.

Because subjects played the game twice, the authors were able to notice that experience seemed to induce a tendency to play fair, that is, to make offers closer to the equal split. This trend was previously mentioned (and replicated) in the simple version of the ultimatum game. However, in Güth and Tietz's experiment, subjects played under completely different conditions during the second game: they swapped places and played the role of the other player, and were invited to bargain over a pie of a different size, with a different discount factor. It is quite challenging to postulate (and to replicate in computer simulations) how these changes affected learning. Fortunately, Ochs and Roth's systematic analysis of multi-round ultimatum games will shed some light on these aspects of the model.
Ochs and Roth (1989)

Ochs and Roth tested four combinations of discount factors (δα, δβ) for both 2-round and 3-round ultimatum games, namely (.4, .4), (.4, .6), (.6, .4) and (.6, .6). Participants bargained over a pie of $30 (i.e., c=30) divided into 100 tokens. For instance, with a discount factor δ of .6, the face value of a token was 30¢ in the first round, 18¢ (=30 × .6) in the second, and 11¢ (≈30 × .6²) in the third round, if any.

There were a total of 8 conditions (2-round or 3-round versions, 4 combinations of discount factors per version), referred to by the authors as cell 1 to cell 8. Each game was played 10 times. There were between 8 and 10 α- and β-bargainers per cell (Ochs and Roth 1989).

Since the authors find many empirical similarities between the 2-round and 3-round bargaining games, we study them jointly. We find the equilibrium of the game by backward induction, in the form of a decision tree in which each node represents a decision made by one of the players based on the expected utilities of the strategies he can follow. We then compare our predictions to several regularities found by Ochs and Roth (Ochs and Roth 1989), and later commented on by Bolton and Ockenfels (Bolton and Ockenfels 2000).

R5: There is a consistent first-mover advantage: α bargainers receive more than β bargainers, regardless of the value of δβ.

Although the authors do not report a particular measure of first-mover advantage per se, we computed the ratio of α bargainers' payoffs over β bargainers' payoffs (including the cases where no payoffs are received, or where they are discounted). In the 2-round version of the ultimatum game, the model suggests that first movers' payoffs are on average 18.9% higher than second movers' payoffs, ranging from 5.7% to 29.9% depending on the particular combination of discount factors.
That confirms Ochs and Roth's observation that α bargainers receive more than β bargainers, regardless of the value of δβ. More specifically, the model predicts that the first-mover advantage is largest when the second player's payoff is heavily discounted (22% to 33% when δβ=.4, versus a 6% to 16% first-mover advantage when δβ=.6). In other words, the first-player advantage is maximized when the second player has a great deal to lose by going to the second round of negotiation, and thus has less negotiation power. A similar pattern of findings is found with the 3-round version.

Ochs and Roth also note that for each of the four combinations of discount factors they tested, the mean opening offer favored the first player to the detriment of the second; that is, the first player systematically proposed to keep more than half of the pie. This finding is linked with the second regularity they found, which we replicate too:

R6: Observed mean opening offers deviate from the pecuniary equilibrium in the direction of the equal money division.

We compare the mean opening offer (the offer made by the first player) to observations averaged over all conditions (8 cells, 10 games per cell, 8 to 10 observations per game). The model predicts that the first player's average offer will range between $12.7 and $13.9, with a mean of $13.1. In the original experiment, the range was between $12.4 and $14.6, with a mean of $13.6. Both the observations and the model's simulations are closer to the equal-split division than the pure pecuniary model would have suggested.
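The ranges and means quoted above can be checked directly against the per-cell values reported in Figure 5. The short sketch below simply re-enters those eight observed and eight predicted values (in dollars, cells 1 to 8) and recomputes the summary figures; the variable names are illustrative only.

```python
from statistics import mean

# Mean opening offers per cell (dollars), as listed in Figure 5, cells 1 to 8.
observed = [12.4, 14.6, 14.2, 13.7, 13.0, 13.4, 13.6, 14.0]
model = [12.8, 13.0, 13.9, 13.4, 12.7, 12.7, 13.2, 13.1]

for label, values in (("observations", observed), ("model", model)):
    print(label, "min", min(values), "max", max(values), "mean", round(mean(values), 1))

# Largest gap between observation and prediction, and the cell where it occurs.
gaps = [round(o - m, 1) for o, m in zip(observed, model)]
worst_cell = gaps.index(max(gaps, key=abs)) + 1
print("largest gap:", max(gaps, key=abs), "in cell", worst_cell)
```

The largest gap falls in cell 2, the (.6, .4) condition of the 2-round game.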
               Two-round ultimatum game                Three-round ultimatum game
               cell 1    cell 2    cell 3    cell 4    cell 5    cell 6    cell 7    cell 8
(δα, δβ)       (.4, .4)  (.6, .4)  (.6, .6)  (.4, .6)  (.4, .4)  (.6, .4)  (.6, .6)  (.4, .6)
Observations   12.4      14.6      14.2      13.7      13.0      13.4      13.6      14.0
Model          12.8      13.0      13.9      13.4      12.7      12.7      13.2      13.1

Figure 5 – Mean opening offer in the 2-round and 3-round ultimatum game, observations versus model (source: Ochs and Roth, 1989), averaged over 10 games. The first player offers less than half of the pie to the second player, whatever the discount factors (pie size=30). The mean opening offers are closer to the equal-split division than suggested by the pecuniary model.

Note that the model suggests that the first player is likely to offer the smallest share of the pie to the second player in the (.4, .4) condition, that is, when both players have the most to lose by going to further rounds of negotiation, both for the 2-round and 3-round versions. This particular phenomenon is also observed in the original experiment.

R7: There are learning trends.

Independently of the mean opening offers, which vary from one condition to another, the authors have observed important learning trends over time. Players modify their behaviors in light of recently gained experience. Our model replicates many of these trends.

For instance, in the (.4, .4) condition of the 2-round ultimatum game, first players decrease their opening offers over time, from 13.2 in the first game to 12.0 in the last. Our predictions are 13.0 and 12.6, respectively. At the other extreme, in the (.6, .6) condition of the 2-round ultimatum game, first players learn to increase the offer they make to the second player, from 13.9 in the first game to 14.7 in the tenth game, on average. Our model predicts this upward trend, too, from 13.6 to 14.2 after 10 games.

[Figure 6: four panels, one per condition (δα, δβ) = (.4, .4), (.6, .4), (.6, .6) and (.4, .6); mean opening offer (scale 10 to 16) plotted against games 1 to 10.]

Figure 6 – Mean opening offers in the 2-round version of the ultimatum game, for the first ten games, with different discount factors for players α and β, observations versus model (source: Ochs and Roth, 1989).

In our simulations of the 2-round ultimatum games, though, we fail to replicate the learning involved in the (.4, .6) condition during the first few games (although both simulations and observations converge to the same equilibrium). Furthermore, the model correctly predicts the direction of the learning trend in the (.6, .4) condition, but underestimates the mean opening offer by an average of $1.60.

[Figure 7: four panels, one per condition (δα, δβ) = (.4, .4), (.6, .4), (.6, .6) and (.4, .6); mean opening offer (scale 10 to 16) plotted against games 1 to 10.]

Figure 7 – Mean opening offers in the 3-round version of the ultimatum game, for the first ten games, with different discount factors for players α and β, observations versus model (source: Ochs and Roth, 1989).

The replications of the learning trends observed during 3-round ultimatum games are quite satisfactory for all conditions except the (.4, .6) condition, where neither the amplitude nor the direction of learning is correctly replicated.

R8: A substantial proportion of first-period offers are rejected.

The authors report an average 15.8% first-period rejection rate across all conditions in the 2-round ultimatum game. The model predicts a very similar rejection rate of 16.5%.
Besides, the predictions are directionally consistent with observations: the first-period rejection rate is minimum in the (.4, .4) condition and maximum in the (.6, .6) condition, in both the model's predictions and the observations.

               Two-round ultimatum game                Three-round ultimatum game
               cell 1    cell 2    cell 3    cell 4    cell 5    cell 6    cell 7    cell 8
(δα, δβ)       (.4, .4)  (.6, .4)  (.6, .6)  (.4, .6)  (.4, .4)  (.6, .4)  (.6, .6)  (.4, .6)
Observations   .100      .150      .188      .200      .120      .140      .144      .289
Model          .130      .167      .202      .170      .137      .164      .226      .180

Table 3 – Average rejection rate of first-period offers in the 2-round and 3-round versions of the ultimatum game, with different discount factors for players α and β, observations versus model (source: Ochs and Roth, 1989).

Besides, rejection rates in the 3-round version of the ultimatum game are slightly higher than in the 2-round version, both in the observations (17.1% vs. 15.8%) and in our predictions (17.5% vs. 16.5%).

Finally, as noted by the authors, first-offer rejection rates are higher when the second player's discount factor is high. In other words, second players are more likely to reject an offer when they have less to lose by going to the second round. This finding is replicated by the model, too: whether in the 2-round or the 3-round version, rejection rates are higher when δβ=.6 than when δβ=.4.

R9: A substantial proportion of rejected first-period offers are followed by disadvantageous counteroffers.

In Ochs and Roth's 2-round ultimatum game (1989), 101 of the 125 first-round rejections (observed across all conditions) are followed by disadvantageous counteroffers (81.0%). Players end up refusing offers only to make counteroffers that ultimately give them less money (because of the discount factors). This ratio is comparable to findings in other experiments (Binmore et al. 1985; Neelin et al. 1988). The model predicts 96.8% of disadvantageous counteroffers.
Despite the overestimation, this is not inconsistent with the similarly high ratios of disadvantageous counteroffers found in other experiments. For instance, Bolton (1991) reports in his 2-round bargaining experiment that 24 of the 25 observed counteroffers (93.3%) were disadvantageous (p.1103).

R10: The value of δα influences the outcome.

By the pecuniary equilibrium, the proportional allocation should depend exclusively on the value of δβ. Still, actual payoffs and observed rejection rates are influenced by δα, as shown in both the observations and the model's predictions.

Bolton (1991)

Two-round bargaining game

Bolton also ran a 2-round version of the ultimatum game (Bolton 1991), although players bargained over a pie of $12, and 2 different combinations of discount factors were tested, namely (2/3, 1/3) and (1/3, 2/3). The game was played 8 and 7 times, respectively.

Findings similar to the ones reported by Ochs and Roth were replicated by Bolton (1991). We briefly report the results of our simulations, and compare them to the observations made by the author.

The mean opening offers were 4.80 (40.0%) in the (2/3, 1/3) condition, and 5.78 (48.2%) in the (1/3, 2/3) condition, on average across all games. Simulations predict 4.97 (41.4%) and 5.21 (43.4%), respectively.

The rejection rates in the original experiment were similar across both conditions, at 18.8% and 18.4% respectively. The model does not predict any variation across conditions in the rejection rates either, but largely overestimates these figures, at 32.8% and 33.5% respectively.

The proportion of disadvantageous counteroffers was 85% and 20% respectively in the observations, against 95.5% and 69.0% predicted by the model. Predictions are directionally consistent with observations (disadvantageous counteroffers are not rare, and occur more often in the (2/3, 1/3) condition than in the (1/3, 2/3) one) but overestimated.
One has to remember, however, that the original ratios were computed on only about 10 observations each.

[Figure 8: two panels, one per condition (δα, δβ) = (2/3, 1/3) and (1/3, 2/3); mean opening offer (scale 1 to 10) plotted against games 1 to 8 and games 1 to 7, respectively.]

Figure 8 – Mean opening offers in the 2-round version of the ultimatum game, for the first eight or seven games, with different discount factors for players α and β, observations versus model (source: Bolton, 1991).

Truncation game

The truncation game is very similar to the 2-round version of the ultimatum game, except that, if the second player refuses the first-round offer and decides to make a counteroffer, the first player has no choice but to accept (i.e., in the second round, the second player becomes a dictator). As in the 2-round ultimatum game, discount factors apply in the second round.

In Bolton (1991), players bargained over a pie of $12 divided into 100 tokens (c=12), and two combinations of discount factors were tested, namely (2/3, 1/3) and (1/3, 2/3). Each game was played 8 times.

R11: In the (2/3, 1/3) condition, observed mean opening offers deviate from the pecuniary equilibrium in the direction of the equal money division. The difference widens with experience.

The pecuniary equilibrium of the (2/3, 1/3) condition is for the second player to accept any offer above 1/3 of the pie, and therefore for the first player to make an offer of $4.08 (in Bolton (1991), each of the 100 tokens had a face value of 12¢, and therefore 34 tokens, with a total dollar value of $4.08, was the smallest offer above $4 a player could possibly make). On average, however, opening offers were equal to $4.62 in the first game, and increased up to $5.21 in the last game. Our model predicts $5.59 and $5.85, respectively. The observed rejection rate was 39.1%. Our model predicts 37.0%.
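The token arithmetic behind the pecuniary benchmark of the truncation game can be sketched as follows, for both discount-factor conditions. This is an illustration only: the function name is hypothetical, and it assumes (as the text above states) that the second player accepts any offer strictly above what he could secure as a dictator in the second round, i.e., δβ times the pie.

```python
import math
from fractions import Fraction

TOKENS = 100        # the $12 pie is divided into 100 tokens
TOKEN_CENTS = 12    # face value of one token in the first round


def smallest_acceptable_offer(delta_beta):
    """Smallest whole number of tokens strictly above delta_beta * pie,
    and its first-round value in cents (illustrative helper)."""
    secured = Fraction(delta_beta) * TOKENS   # tokens beta secures in round 2
    tokens = math.floor(secured) + 1          # strictly above that amount
    return tokens, tokens * TOKEN_CENTS


for label, delta_beta in (("(2/3, 1/3)", Fraction(1, 3)), ("(1/3, 2/3)", Fraction(2, 3))):
    tokens, cents = smallest_acceptable_offer(delta_beta)
    print(label, tokens, "tokens =", "$%.2f" % (cents / 100))
```

Working in whole cents and exact fractions avoids floating-point surprises at the 1/3 and 2/3 thresholds.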
R12: In the (1/3, 2/3) condition, observed mean opening offers deviate from the pecuniary equilibrium in the direction of the equal money division. The difference narrows with experience.

The pecuniary equilibrium of the (1/3, 2/3) condition is for the second player to accept any offer above $8, and therefore for the first player to make an offer of $8.04 (67 tokens). In the observations, the mean opening offer was $7.64 in the first game, and became exactly equal to the equilibrium at $8.04 after 7 games. Our model predicts $6.77 in the first game and $7.22 in the last. Since the predictions underestimate the mean opening offer, the model also overestimates the rejection rate (26.6% in observations versus 53.5% in simulations). Although the model overestimates the bias toward the equal split, both the patterns of findings and the directions of learning are replicated.

[Figure 9: two panels, one per condition (δα, δβ) = (2/3, 1/3) and (1/3, 2/3); mean opening offer (scale 1 to 10) plotted against games 1 to 8.]

Figure 9 – Mean opening offers in the truncation game, for the first eight games, with different discount factors for players α and β, observations versus model (source: Bolton, 1991).

Güth and van Damme (1998)

The Güth-van Damme three-person bargaining game is similar to the simple, one-round version of the ultimatum game, except that there is a third player with whom the pie has to be shared. The first player proposes to the second player a division of the pie among all three players, and the latter either accepts or rejects the offer. If the proposal is rejected, players do not receive any payoffs; otherwise the pie is split accordingly. In any case, the third player has no say.

This game challenges many conventional theories about fairness and equity, and is therefore worth studying (Bolton and Ockenfels 1998; Güth et al. 2002; Güth and van Damme 1998).
We compare our predictions to the simplest version (i.e., essential information condition, constant mode) of the original experiment conducted by Güth and van Damme (Güth and van Damme 1998). Players had to share a pie of 24 Dutch guilders (divided into 120 tokens), which was worth approximately $13.6 at the time (c=13.6).

Applying the model we have developed so far to a 3-person game requires a small modification of the utility function, though. Since 3 players are involved, each player's reference share of the pie is expected to be one-third instead of one-half, and the deviation in terms of "relative" money has to be modified accordingly; that is, the term (σ − 1/2) in Equation 2 is replaced by (σ − 1/3). No other modification is made, and the parameter b of the equation remains unchanged.

R13: The amount the dummy receives is very small.

Proposers generally offer much less than a third of the pie to the third player (the dummy). On average during the first six games, the dummy's share was 7.8 out of 120 tokens in the observations (6.5%). Our model predicts 8.4 (7.0%), as shown below.

                  Observations   Model
Proposer (x)      79.1           76.3
Responder (y)     33.1           35.3
Dummy (z)         7.8            8.4

Table 4 – Average amounts (pie size=120 tokens) allocated to the three players by the proposer in the essential information condition of the Güth-van Damme game, observations versus model (source: Güth and van Damme, 1998).

R14: Rejection rates are lower in the 3-person Güth-van Damme game than in the 2-person ultimatum game.

The average rejection rate in the original simple ultimatum game dataset we used to fit our model (Roth et al. 1991) was .264 (.309 predicted), a rate that is consistent with, although higher than, the typical 15-20 percent rejection rate observed in 2-person ultimatum games (Roth 1995). It is .079 in the original Güth-van Damme dataset (p.241).
Our model does not capture this finding, and predicts a very high rejection rate of .281.

R15: There is a learning trend.

We have shown that a learning trend could be described by expressing the decision parameter of the decision rule as a linear function of players' experience. We apply the same scheme in this model to replicate players' learning by predicting the game's outcome at the first, sixth, twentieth and fiftieth games. As shown below, this does not affect y's payoff much, but increases the proposer's payoff to the detriment of the dummy's. In other words, the proposer learns that he can keep the dummy's share of the pie without affecting the responder's likelihood of accepting his proposals. The very same pattern was found during Güth and van Damme's actual experiment (p.239).

Model             Round 1   Round 6   Round 20   Round 50
Proposer (x)      75.9      76.3      78.7       82.8
Responder (y)     35.0      35.3      35.2       32.0
Dummy (z)         9.2       8.4       6.1        5.2

Table 5 – Division of the pie when players gain experience. The model replicates observations: dummy's payoff decreases and proposer's payoff increases with learning.

Note that the model correctly estimates both the nature and the direction of players' learning, but underestimates the pace at which it occurs. Actually, the predictions made by the model for the fiftieth game are very close to the observations already made during the sixth game of Güth and van Damme's experiment, namely x=80.8, y=33.3 and z=5.8.

Furthermore, the model predicts that the rejection rate will dramatically decrease to .076 after 50 games, comparable to the actual rejection rate observed during the experiment. It seems that the large overestimation of the rejection rate (.281 predicted versus .079 observed) is mainly the consequence of the model's inability to predict the pace at which learning occurs, rather than the direction, nature or effects of such learning.
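To make the learning mechanism concrete, the sketch below evaluates a linear learning schedule at the first, sixth, twentieth and fiftieth games and feeds it into a Gibbs (logit) choice rule. Several elements are assumptions rather than the exact specification of the model: the functional form P(a) ∝ exp(τ·u(a)), the convention that experience is counted as the game number, the reading of the one-round estimates in Table 7 as (τ0, τ1) = (0.3478, 0.0159), and the three expected utilities, which are purely hypothetical numbers chosen for illustration.

```python
import math

# One-round estimates listed in Table 7, read here as tau_0 and tau_1 (assumption).
TAU_0 = 0.3478
TAU_1 = 0.0159


def decision_parameter(game, tau_0=TAU_0, tau_1=TAU_1):
    """Linear learning schedule: the decision parameter grows with experience."""
    return tau_0 + tau_1 * game


def choice_probabilities(utilities, tau):
    """Gibbs (logit) rule, assumed form P(a) proportional to exp(tau * u(a))."""
    best = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp(tau * (u - best)) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical expected utilities of three candidate actions.
utilities = [10.0, 9.0, 6.0]
for game in (1, 6, 20, 50):
    tau = decision_parameter(game)
    probs = choice_probabilities(utilities, tau)
    print(game, round(tau, 4), [round(p, 3) for p in probs])
```

Under this assumed form, the probability of the best action rises as experience accumulates, which is the sense in which choices become less random. Multiplying all utilities by a constant is also equivalent to multiplying τ by the same constant, which matches the earlier remark that a larger pie acts like a larger decision parameter.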
Summary

In this section, we have simulated 5 games (the 2-round, 3-round and 5-round versions of the ultimatum game, the truncation game, and the Güth-van Damme 3-person bargaining game) and compared our out-of-sample predictions to existing experiments in the literature. Most of the 15 major findings are replicated, and predictions are not only directionally consistent with observations, but also often accurate. These results seem to underline the generalization properties and predictive power of the ERC utility-decision framework we have developed.

                                                        Observations   Model
GAME USED TO FIT THE MODEL
ONE-ROUND ULTIMATUM GAME (Roth et al., 1991)
  Mean opening offer                                    40.6%          40.7%
  Average rejection rate                                26.4%          30.9%

Table 6 – Summary of findings. Roth et al.'s simple version of the ultimatum game has been used to fit the parameters of the model.

                                                        Observations   Model
OUT-OF-SAMPLE PREDICTIONS OF OTHER GAMES
TWO-ROUND ULTIMATUM GAME (Binmore, Shaked and Sutton, 1985)
  Mean opening offer                                    41.6%          41.4%
TWO-, THREE- AND FIVE-ROUND ULTIMATUM GAMES (Neelin et al., 1988)
  Mean opening offer (2R, δ=1/4)                        27.4%          34.6%
  Mean opening offer (5R, δ=1/3)                        34.2%          35.4%
  Mean opening offer (3R, δ=1/2)                        47.2%          36.0%
  Mean opening offer increases with discount factor
TWO-ROUND ULTIMATUM GAME (Güth and Tietz, 1988)
  Mean opening offer                                    35.4%          37.9%
  Demanded shares always greater than 50%
  Rejection rate in (δ=.1) condition                    19.0%          38.1%
  Rejection rate in (δ=.9) condition                    61.9%          55.1%
  Rejection rates increase with discount factors
TWO- AND THREE-ROUND ULTIMATUM GAMES (Ochs and Roth, 1989)
  Consistent first-mover advantage
  Mean opening offer (2R)                               45.8%          44.3%
  Mean opening offer (3R)                               45.0%          44.0%
  Mean opening minimum in (.4, .4) condition
  There are learning trends
  Average rejection rate (2R)                           15.8%          16.5%
  Average rejection rate (3R)                           17.1%          17.5%
  Rejection rate minimum in (.4, .4) condition
  Rejection rate maximum in (.6, .6) condition
  Disadvantageous counteroffers                         81.0%          96.8%
  Value of δα influences the outcome
TWO-ROUND ULTIMATUM GAME (Bolton, 1991)
  Mean opening offer in (2/3, 1/3) condition            40.0%          41.4%
  Mean opening offer in (1/3, 2/3) condition            48.2%          43.4%
  Mean opening minimum in (2/3, 1/3) condition
  Average rejection rate                                18.6%          33.2%
  Disadv. counteroffers in (2/3, 1/3) condition         85.0%          95.5%
  Disadv. counteroffers in (1/3, 2/3) condition         20.0%          69.0%
  Disadv. counteroffers maximum in (2/3, 1/3) condition
TRUNCATION GAME (Bolton, 1991)
  Mean opening in (2/3, 1/3) condition                  40.7%          47.8%
  Mean opening in (1/3, 2/3) condition                  65.3%          58.3%
  Mean opening deviates from equilibrium
  Widens over time in (2/3, 1/3) condition
  Narrows over time in (1/3, 2/3) condition
GÜTH-VAN DAMME ULTIMATUM GAME (Güth and van Damme, 1998)
  Proposer's share                                      65.9%          63.6%
  Responder's share                                     27.6%          29.4%
  Dummy's share                                         6.5%           7.0%
  Average rejection rate (1)                            7.9%           28.1%
  Rejection rate lower than ultimatum game                             no
  There is a learning trend

Note: Unless indicated otherwise, all figures relate to offers (including the ones eventually rejected) made during the first round (if more than one).
(1) The model predicts the average rejection rate will eventually decrease to 7.6% after 50 games.

Table 6 (cont'd) – Summary of findings. Major experimental findings are replicated, and predictions are not only directionally consistent with observations, but often very accurate.

By plotting observations versus model predictions across all the games, for both mean opening offers and rejection rates, one can see that mean opening offers are often very accurately predicted across games. Rejection rates, however, are often overestimated. One reason might be that the game we used to fit the model (i.e., Roth et al. 1991) already has a high rejection rate (26.4%) compared to the 15-20% usually found in one-round ultimatum games.

[Figure 10: two scatter plots; horizontal axis: Observations (0% to 70%); vertical axis: Predictions (0% to 70%).]

Figure 10 – Plots of mean opening offers (left chart) and first-round rejection rates (right chart), observations versus predictions. Squared-correlation statistics are .895 and .474, respectively. Mean opening offers are predicted very satisfactorily across games and conditions, but rejection rates are usually overestimated by the model. Note that rejection rates are not always reported in all experiments (e.g., Binmore et al., 1985).

STABILITY OF THE PARAMETER ESTIMATES

We estimated the parameters of our model using Roth et al.'s multi-country bargaining dataset (Roth et al. 1991), and then used these parameter estimates to make out-of-sample predictions for several other games. The reasons why we used this particular dataset to fit our model are twofold.
First, the one-round ultimatum game is the keystone of all bargaining games; it is therefore natural to fit the model on the simplest setting possible, and then test the out-of-sample validity of the predictions on more elaborate versions of the game. Another, more practical reason to use Roth et al.’s dataset is that it is one of the rare very large datasets available. Since maximum likelihood estimates can be heavily biased for small samples, the latter reason is not trivial.

One might wonder, however, whether the parameter estimates we obtained would have been different if we had fitted the model to another dataset. To suit our needs, such a dataset should have two desirable characteristics. First, since one of the parameters of the model is a learning parameter, the game should have been played repeatedly (i.e., over several periods) with the same players, to allow the model to capture learning trends. Second, since the desirable properties of maximum likelihood estimates are only achieved asymptotically (Eliason 1993), the dataset should have a large number of observations per period. For instance, Eliason insists that “in the typical ML estimation procedure, one would want to have a large sample size because the desirable properties of the MLE (…) are justified only in large sample situations.”

Fortunately, although candidate datasets are scarce, Ochs and Roth’s datasets meet both criteria (Ochs and Roth 1989). We therefore re-estimated the parameters of the model on the observations made by Ochs and Roth in their 2-round and 3-round ultimatum games separately. To maximize the size of each dataset, we estimated one set of parameters to fit simultaneously all 4 conditions of each game (i.e., 4 combinations of discount factors), thus leading to two datasets of 380 observations each.
To avoid artificial biases, observations were weighted within each dataset so that each condition would contribute equally to the dataset's log-likelihood function. Table 7 reports the parameter estimates fitted on the 3 datasets (the original ultimatum game from Roth et al. used in the first sections of this paper, and the 2-round and 3-round versions of the ultimatum game from Ochs and Roth), using maximum likelihood estimation. Although most differences are statistically significant (the b estimate, however, is not statistically different between the first and the second datasets, nor is τ1 between the first and the third datasets), parameter estimates seem to be reasonably stable across games.

                                                        b          τ0         τ1
ONE-ROUND ULTIMATUM GAME (Roth et al., 1991)         10.742     0.3478     0.0159
                                                     (.995)     (.0189)    (.0038)
TWO-ROUND ULTIMATUM GAME (Ochs & Roth, 1989)         10.566     0.2704     0.0016
                                                     (2.258)    (.0267)    (.0038)
THREE-ROUND ULTIMATUM GAME (Ochs & Roth, 1989)       12.579     0.2206     0.0180
                                                     (1.485)    (.0190)    (.0050)

Table 7 – Parameter estimates of the model, using maximum likelihood estimation and 3 different datasets. Estimates seem to be reasonably stable across games.

ULTIMATUM GAME SIMULATIONS WITH ALTERNATIVE ASSUMPTIONS ABOUT PLAYERS’ CHARACTERISTICS

In the first sections of this paper, we have assumed that players’ utility function followed an asymmetric ERC shape (i.e., presence of negative reciprocity, but no positive reciprocity), and that choices were the outcome of a random process. The statistical fit obtained on the simple version of the ultimatum game seemed to confirm our hypotheses.
First, introducing positive reciprocity did not improve the overall fit of the model, and the null hypothesis that positive reciprocity plays no role in players’ behaviors could not be rejected; second, the decision parameter estimate was sufficiently small to allow a great deal of variability in the decisions’ outcomes.

Taking the simple ultimatum game for illustration purposes, we can modify the parameter values of the model for one or more players, and see how these modifications affect the outcome of the game. The following modifications can be applied to the model:

1. The decision parameter can be set to infinity (τ→∞), transforming the probabilistic decision rule into a deterministic one; the strategy with the highest expected utility would then be chosen with a probability of 1.

2. The parameter b of the utility function can be set equal to 0, transforming the nonlinear ERC utility function into a standard linear function; players then become purely self-interested and care about monetary gains only, without equity considerations, as suggested by the pecuniary model of classic game theory (“greedy” utility function).

3. Alternatively, the ERC function can be made symmetric, and b can be set positive even for σ>½, introducing some altruistic considerations in players’ motivations (“symmetric ERC” utility function).

These changes can be readily applied to the proposer, the responder, or both. Table 8 shows the mean opening offer in the ultimatum game as predicted by our simulations for all possible combinations. Cells in bold are not statistically different from the observations made by Roth et al. (1991), used to fit the model in the first place.
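The first modification above can be illustrated with a minimal sketch. It assumes the Gibbs decision rule assigns each strategy a probability proportional to exp(τ × utility); this functional form, the function names, and the utility and τ values are illustrative assumptions, not the paper's estimates. The sketch shows how the rule moves from uniform randomness toward a deterministic choice as τ grows.

```python
import math


def gibbs_choice_probabilities(utilities, tau):
    """Choice probabilities proportional to exp(tau * utility).

    Assumed form of the Gibbs decision rule; tau is the decision parameter.
    """
    # Subtracting the highest utility leaves the probabilities unchanged
    # and keeps the exponentials from overflowing for large tau.
    highest = max(utilities)
    weights = [math.exp(tau * (u - highest)) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def deterministic_choice_probabilities(utilities):
    """Limit case tau -> infinity: all mass on the best strategy (ties split)."""
    highest = max(utilities)
    winners = [1.0 if u == highest else 0.0 for u in utilities]
    count = sum(winners)
    return [w / count for w in winners]


# Hypothetical expected utilities of three strategies.
example_utilities = [1.0, 2.0, 3.0]
for tau in (0.0, 0.35, 5.0, 50.0):
    probs = gibbs_choice_probabilities(example_utilities, tau)
    print(tau, [round(p, 3) for p in probs])
print("limit", deterministic_choice_probabilities(example_utilities))
```

With τ = 0 every strategy is equally likely; as τ increases, probability mass concentrates on the strategy with the highest expected utility, which is the deterministic rule described in modification 1.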
Responder Greedy Asymmetric ERC (b>0 ⇔ σ<½) (b=0) Proposer τ→∞ τ→∞ Greedy τ→∞ .001 .039 Asymmetric ERC τ→∞ Symmetric ERC τ→∞ .410 (e) .440 (e) .410 .407 (e) .435 (e) .410 .280 .400 (d) .280 .400 (d) .191 .319 .419 (d) .319 .415 (d) .001 .140 .280 .400 (d) .280 .400 (d) .039 .189 .319 .408 (c) .319 .408 (d) (e) .450 (d) .410 (e) .450 (d) (e) .455 (d) .410 (e) .454 (d) (a) .140 Symmetric ERC (b>0, ) τ→∞ (b)

Table 8 – Mean opening offer (in percent) in the simple ultimatum game (c=$10), as given by simulations, with different hypotheses about the parameters. The cells in bold are not statistically different from the results obtained by Roth et al. (1991).

For illustration purposes, cell (a) corresponds to the subgame perfect equilibrium: the proposer and the responder are both motivated only by their pecuniary gains (b=0), and they systematically choose the strategy with the highest (expected) utility. Therefore, the first player offers the second player the smallest possible share of the pie, and the latter accepts with a probability of 1. This illustrates the fact that the subgame perfect equilibrium can be viewed as a special case of our model, where b=0 and the decision parameter is set to an extremely large value.

It has been suggested that, under some conditions, even “greedy” and “rational” players might choose to offer more than the minimum suggested by game theory, in order to secure a rational response from the second player. For instance, Binmore et al. wrote that “the first player might be dissuaded from making an opening demand at, or close to, the ‘optimum’ level, because his opponent would then incur a negligible cost in making an ‘irrational’ rejection” (Binmore et al. 1985, p.1180).
Cell (b) replicates this reasoning: although both players are greedy, the randomness associated with the second player’s decision to reject small offers has to be compensated (by increasing the share offered to the second player), so that the cost of an irrational rejection increases. Our simulations suggest that this effect alone could explain an increase in the mean opening offer up to .080, but is far from sufficient to explain the .408 observed in the original experiment.

Cell (c) is the standard model we used throughout this paper: players share the same utility function (an asymmetric ERC curve), and both players’ decisions are probabilistic.

Interestingly, as shown in cells (d), only two conditions suffice to replicate the observations made by Roth et al. (1991). First, responders must have an aversion to game outcomes that are in their disfavor (b>0 if σ<½), and second, responders’ choices must be probabilistic. Nothing else matters. For instance, the characteristics associated with the first player are irrelevant to explain the results. If the two above conditions were met, even a “greedy” (b=0) and “rational” (τ→∞) robot that maximized its expected gains by systematically following the strategy with the highest expected utility would still have to select a mean opening offer not statistically different from what Roth et al. observed in their multi-country bargaining experiment.

Also, as shown in cells (e), the proposer’s altruistic motives could by themselves explain a large mean opening offer, although such a hypothesis could be refuted by observations made in other games (see for instance the Güth-van Damme game, where the proposer does not seem to care about the dummy).
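The reasoning behind cell (b) can be illustrated with a small sketch. It is not a reproduction of the simulations behind Table 8: it assumes a responder with a linear ("greedy") utility whose accept/reject decision follows a two-option Gibbs rule, and a proposer who deterministically maximizes expected monetary payoff over a grid of offers. The function names, the grid search, and the τ values are hypothetical choices made for illustration.

```python
import math


def acceptance_probability(offer_share, pie, tau):
    """Gibbs rule over {accept, reject} for a greedy responder.

    Accepting is worth offer_share * pie and rejecting is worth 0
    (assumed linear utility, b = 0).
    """
    return 1.0 / (1.0 + math.exp(-tau * offer_share * pie))


def best_offer(pie, tau, grid_size=1000):
    """Offer share maximizing the proposer's expected monetary payoff."""
    best_share, best_value = 0.0, -1.0
    for k in range(grid_size + 1):
        share = k / grid_size
        # Expected payoff: what the proposer keeps, times the chance of acceptance.
        value = (1.0 - share) * pie * acceptance_probability(share, pie, tau)
        if value > best_value:
            best_share, best_value = share, value
    return best_share


# Hypothetical decision parameters: a noisy responder and a nearly deterministic one.
noisy = best_offer(pie=10.0, tau=0.35)
sharp = best_offer(pie=10.0, tau=100.0)
print("noisy responder:", noisy, "nearly deterministic responder:", sharp)
```

Under these assumptions, the payoff-maximizing offer shrinks toward the minimum as the responder's rule becomes deterministic, and rises above it when the responder is noisy, which is the compensation effect described for cell (b). The magnitudes depend entirely on the assumed τ and payoff scaling.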
DISCUSSION AND CONCLUSIONS

In this paper, we have developed a utility-decision framework to explain players’ behaviors in various games, based on the underlying ERC utility function (to take into account players’ consideration for fairness and equity) linked to actual decisions by a probabilistic Gibbs distribution (to incorporate choice randomness and players’ heterogeneity). After fitting the model to the ultimatum game and obtaining a good fit, we have used this model to predict players’ behaviors in 5 other games, totaling 19 different experimental conditions, and replicated 15 major experimental findings, often with surprising accuracy.

This model has many elements in common with other models proposed in the literature to explain players’ behaviors, and yet presents original properties that we highlight by comparing it to two well-known reinforcement learning models: Roth and Erev’s simple reinforcement learning model (Erev and Roth 1998; Roth and Erev 1995), hereafter RE; and Sarin and Vahid’s dynamic model of choice (Sarin and Vahid 1999; Sarin and Vahid 2001), hereafter SV.

Comparisons to reinforcement learning models

RE models players’ propensities $q_t(i)$ to choose action i at time t: at the beginning of the game, each player has an initial propensity to play each possible strategy (i.e., the $q_0(i)$ are exogenous to the model). The chosen strategy is then determined by a linear probabilistic decision rule, where the probability $p_t(i)$ of choosing action i at time t is defined by:

$$p_t(i) = \frac{q_t(i)}{\sum_{j=1}^{N} q_t(j)}$$

Equation 7 – Linear probabilistic decision rule in the RE model (Roth and Erev 1995). The probability of taking each action i at time t is proportional to its propensity $q_t(i)$.
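The linear decision rule of Equation 7 can be sketched in a few lines. This is an illustrative sketch only: the function names and the propensity values are hypothetical, and propensities are assumed to be non-negative with a positive sum.

```python
import random


def choice_probabilities(propensities):
    """Equation 7: p_t(i) = q_t(i) / sum_j q_t(j)."""
    total = sum(propensities)
    if total <= 0:
        raise ValueError("propensities must have a positive sum")
    return [q / total for q in propensities]


def draw_action(propensities, rng):
    """Draw the index of one action according to the linear rule."""
    probs = choice_probabilities(propensities)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical initial propensities q_0(i) for three actions.
initial_propensities = [1.0, 3.0, 4.0]
print(choice_probabilities(initial_propensities))
print(draw_action(initial_propensities, random.Random(0)))
```

Because the rule only normalizes the propensities, an action with zero propensity is never chosen, and doubling all propensities leaves the choice probabilities unchanged.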
Then, the actual payoff $x_t(i)$ that results from having chosen the ith action at time t is observed, and the propensities are updated by the following formula:

$$q_{t+1}(i) = q_t(i) + x_t(i) - x_{\min}$$

Equation 8 – Updating procedure in the RE model (Roth and Erev 1995). The propensity q(i) is linearly augmented with the actual, observed payoff for taking this action.

Here $x_{\min}$ is the minimum payoff that can be experienced, so that choosing an action that leads to the smallest possible reward (usually zero for most games) is not reinforced.

The SV model, although based on a similar set of equations, differs in many ways from the RE model. First, SV does not model players’ propensities to choose different actions, but rather the expected utilities of these actions (Sarin and Vahid 2001), where expected utilities “represent the subjective assessment of the player regarding the payoff she would obtain from the choice of any strategy at any time” (p.105). Second, players’ choices are not hypothesized to be probabilistic: they choose the action with the highest expected utility with a probability of 1. In that sense, the SV model is myopic.² Finally, the updating procedure also differs in the sense that, after choosing the action i at time t and observing the actual payoff $x_t(i)$ of this choice, expected utilities are updated by comparing them to actual payoffs as follows:

$$u_{t+1}(i) = (1 - \lambda)\, u_t(i) + \lambda\, x_t(i)$$

Equation 9 – Updating procedure in the SV model (Sarin and Vahid 2001). Expected utilities u(i) are iteratively shifted by a constant fraction λ toward the latest observed payoff of taking this action.

Here λ is a small positive constant smaller than 1 (usually equal to 0.01). At the very extreme, if λ=0, there are no updates in expected utilities and $u_t(i) = u_0(i), \forall t$.

Our model diverges noticeably from these two models in three ways.
First, although the utility-decision framework model we have presented in this paper has some learning components embedded in the decision rule (i.e., the coefficient of certitude is a linear function of players’ experience), it is not a learning model per se. Players’ behaviors are not reinforced by trial and error (as in the RE model), and changes in strategy are not explained by estimates of expected utilities that are continuously refined through experience (as in the SV model). In a sense, our model incorporates a learning component that is independent of the actual choices made by players, which is a possible limitation of our model.

Second, our model explicitly incorporates a nonlinear social utility function, while RE and SV implicitly use a linear utility function: RE updates the propensities of choices, and SV the expected utilities, through a linear transformation of game payoffs, without taking into account decreasing marginal utilities, expectations of fairness or sense of equity.

Third, our model does not need prior knowledge of any sort to be able to replicate and simulate new games. While RE³ needs initial propensities $q_0$, and SV needs initial expected utilities $u_0$, our model does not have these prerequisites, and therefore is more general and less information sensitive.

² In learning theory, it is also said that players follow a “greedy” decision rule.

³ There has been a recent attempt by Roth and Erev to circumvent this model limitation, although the presence of initial propensities still seems to be critical to predict the outcome of a large family of games.

Limitations and directions for future research

The model we have developed, despite its good statistical fit and predictive power, has several limitations we address here.

First, the learning component of the ERC utility-decision framework is independent of players’ past actions, which seems unrealistic.
The learning trend embedded in the model is expected to arise from a decrease in players’ heterogeneity and in choice randomness, but the way these phenomena are linked to experience and trial and error (i.e., reinforcement learning) has still to be investigated.

A second limitation of this model is the way variability is accounted for. One can argue that players’ heterogeneity and individuals’ choice randomness (the two main components that explain why game outcomes are probabilistic) are not very well captured by our model, because they are confounded in the parameter of the decision rule, and therefore cannot be analyzed and estimated separately.

A third limitation is that it is unclear so far whether the parameters of the model are actually stable, and whether they should be modified (if at all) to better account for certain populations (e.g., students vs. other types of participants) or certain game configurations (e.g., varying number of players).

Contributions and conclusion

The first contribution of our model is that it does not rely upon any kind of prior knowledge about players’ initial expectations, utilities, or propensities to play certain strategies. Most of the criticisms that have been made against reinforcement learning models are indeed about the critical role of these initial values in the models: where they come from, and the amount of information they contain (that is, whether the models proposed to explain players’ behaviors only “magnify” information already contained in these initial values external to the model, and therefore do not explain much). Erev and Roth (1998) say their model is “agnostic about where the initial propensities come from”.
Similarly, Sarin and Vahid (2001) write that “initial assessments $u_0$ may have been formed by hearsay, strategy labels, or by similarity of the decision situation to other decision problems that the individual may have faced in the past […]”. We offer an alternative to these ad hoc approaches by linking behaviors to an underlying utility function that is stable over time, constant whatever the game played, and shared by all players whatever their specific roles in the game. Consequently, unlike existing models, our model can hypothetically be used to predict ex ante the outcomes of never-before-played games.

The second contribution of this paper is to address one of the major criticisms made of social utility theory, namely its lack of quantification. By quantifying the unique parameter of the utility function we employ, our model gives a sense of magnitude about the players’ motivations. Although most of the findings presented here have already been explained by an exploration of the theoretical properties of the ERC utility function (Bolton 1991; Bolton and Ockenfels 2000), these properties could not be quantified without computing the most likely values of the parameters.

A third contribution is achieved by the remarkable generalization properties of our model when applied across different datasets, collected at different points in time by different authors and for different games. This finding seems to underline the relatively good stability of the model’s parameters.

Finally, this paper also lays the groundwork for more sophisticated models that should investigate a different way to model variability by disentangling heterogeneity of players and randomness of individuals’ choices, and investigate further the possible asymmetry between positive and negative reciprocities, possibly by studying games that involve both types of behaviors.
REFERENCES

Binmore, Kenneth, Avner Shaked, and J. Sutton (1985), "Testing Noncooperative Bargaining Theory: A Preliminary Study," The American Economic Review, 75 (December), 1178-80.

Bolton, Gary E. (1991), "A Comparative Model of Bargaining: Theory and Evidence," The American Economic Review, 81 (5), 1096-136.

Bolton, Gary E. and Axel Ockenfels (2000), "ERC: A Theory of Equity, Reciprocity, and Competition," The American Economic Review, 90 (1), 166-93.

---- (1998), "Strategy and Equity: An ERC-Analysis of the Güth-van Damme Game," Journal of Mathematical Psychology, 42 (2/3), 215-26.

Bush, Robert and Frederick Mosteller (1955), Stochastic Models for Learning. New York: Wiley.

Camerer, Colin (1990), "Behavioral Game Theory," in Insights in Decision Making: A Tribute to Hillel J. Einhorn, Robin Hogarth, Ed. Chicago: University of Chicago Press.

Camerer, Colin and Teck Hua Ho (1999), "Experience-weighted Attraction Learning in Normal Form Games," Econometrica, 67 (4), 827-74.

Cheung, Yin-Wong and Daniel Friedman (1997), "Individual Learning in Normal Form Games: Some Laboratory Results," Games and Economic Behavior (19), 46-76.

---- (1995), "Individual Learning in Normal Form Games: Some Laboratory Results." Santa Cruz: Mimeo, University of California.

Cooper, David and Nick Feltovich (1996), "Reinforcement-Based Learning vs. Bayesian Learning: A Comparison." Mimeo, University of Pittsburgh.

Cox, James, Jason Shachat, and Mark Walker (1995), "An Experiment to Evaluate Bayesian Learning of Nash Equilibrium." Mimeo, University of Arizona.

Davison, A.C. and D.V. Hinkley (1997), Bootstrap Methods and Their Application. Cambridge Series in Statistical and Probabilistic Mathematics: Cambridge University Press.

Efron, B. and R. J. Tibshirani (1993), An Introduction to the Bootstrap. Monographs on Statistics and Applied Probability. London: Chapman & Hall/CRC.

Eliason, Scott R. (1993), Maximum Likelihood Estimation: Logic and Practice. Sage University Paper on Quantitative Applications in the Social Sciences. Thousand Oaks, CA: Sage.

Erev, Ido, Yoella Bereby-Meyer, and Alvin E. Roth (1999), "The Effect of Adding a Constant to All Payoffs: Experimental Investigation, and a Reinforcement Learning Model with Self-Adjusting Speed of Learning," Journal of Economic Behavior and Organization, 39 (1), 111-28.

Erev, Ido and Alvin E. Roth (1995), "On the need for low rationality, cognitive game theory: Reinforcement learning in experimental games with unique, mixed strategy equilibria." Mimeo, University of Pittsburgh.

---- (1998), "Predicting how people play games: Reinforcement learning in experimental games with unique, mixed strategy equilibria," American Economic Review, 88 (4), 848-81.

Estes, William K. (1950), "Toward a Statistical Theory of Learning," Psychological Review, 57 (2), 94-107.

Fehr, Ernst, Georg Kirchsteiger, and Arno Riedl (1993), "Does Fairness Prevent Market Clearing? An Experimental Investigation," The Quarterly Journal of Economics, 108 (2), 437-59.

Fehr, Ernst and Klaus Schmidt (1997), "How to Account for Fair and Unfair Outcomes - A Model of Biased Inequality Aversion," in Symposium on Economic Theory. Gerzensee, Switzerland.

Feltovich, Nick (2000), "Reinforcement-based vs. belief-based learning models in experimental asymmetric-information games," Econometrica, 68 (3), 605-41.

Grosskopf, Brit (1999), "Competition, Aspiration and Learning in the Ultimatum Game: An Experimental Investigation," in 1999 European Economics Association Meetings. Universitat Pompeu Fabra.

Güth, Werner, Carsten Schmidt, and Matthias Sutter (2002), "Bargaining Outside The Lab – A Newspaper Experiment Of A Three Person Ultimatum Game."

Güth, Werner, R. Schmittberger, and B. Schwarze (1982), "An Experimental Analysis of Ultimatum Bargaining," Journal of Economic Behavior and Organization, 3, 367-88.

Güth, Werner and Reinhard Tietz (1988), "Ultimatum Bargaining for a Shrinking Cake, An Experimental Analysis," Working paper.

Güth, Werner and Eric van Damme (1998), "Information, Strategic Behavior and Fairness in Ultimatum Bargaining, An Experimental Study," Journal of Mathematical Psychology, 42 (2/3), 227-47.

Hopkins, Ed (1999), "Learning, Matching, and Aggregation," Games and Economic Behavior, 26, 79-110.

Luce, Duncan R. (1959), Individual choice behaviour. New York: Wiley.

Manly, Bryan F. J. (1997), Randomization, Bootstrap and Monte Carlo Methods in Biology, Second Edition: CRC Press.

Neelin, Janet, Hugo Sonnenschein, and Matthew Spiegel (1988), "A Further Test of Noncooperative Bargaining Theory," The American Economic Review, 78 (September), 824-36.

Ochs, Jack and Alvin E. Roth (1989), "An Experimental Study of Sequential Bargaining," The American Economic Review, 79 (June), 355-84.

Rabin, Matthew (1993), "Incorporating Fairness into Game Theory and Economics," American Economic Review, 83 (5), 1281-302.

Rapoport, Amnon and Ido Erev (1998), "Coordination, "magic", and reinforcement learning in a market entry game," Games and Economic Behavior, 23 (2), 146-75.

Roth, Alvin E. (1995), "Bargaining Experiments," in Handbook of Experimental Economics, J. Kagel and A. E. Roth, Ed. Princeton: Princeton University Press.

Roth, Alvin E. and Ido Erev (1995), "Learning in Extensive-Form Games: Experimental Data and Simple Dynamic Models in the Intermediate Term," Games and Economic Behavior, 8 (Special Issue: Nobel Symposium), 164-212.

Roth, Alvin E., Vesna Prasnikar, Masahiro Okuno-Fujiwara, and Shmuel Zamir (1991), "Bargaining and Market Behavior in Jerusalem, Ljubljana, Pittsburgh, and Tokyo," American Economic Review, 81 (5), 1068-95.

Rubinstein, A. (1982), "Perfect Equilibrium in a Bargaining Model," Econometrica, January.

Sarin, Rajiv and Farshid Vahid (1999), "Payoff Assessments Without Probabilities: A Simple Dynamic Model of Choice," Games and Economic Behavior, 28, 294-309.

---- (2001), "Predicting How People Play Games: A Simple Dynamic Model of Choice," Games and Economic Behavior, 34 (1), 104-22.

Shao, Jun and Dongsheng Tu (1996), The Jackknife and Bootstrap. Springer Series in Statistics: Springer Verlag.

Suppes, P. and R. C. Atkinson (1960), "Markov Learning Models for Multiperson Interactions," Review of Metaphysics, 15, 196.

Swarthout, Todd and Mark Walker (1999), "Reinforcement, Belief Learning, and Information Processing," in Summer 1999 ESA Meeting.