DEPARTMENT OF ECONOMICS AND FINANCE
COLLEGE OF BUSINESS AND ECONOMICS
UNIVERSITY OF CANTERBURY
CHRISTCHURCH, NEW ZEALAND

Biases and Strategic Behavior in Performance Evaluation: The Case of the FIFA’s Best Soccer Player Award

Tom Coupé 1†
Olivier Gergaud 2
Abdul Noury 3

WORKING PAPER No. 24/2016

Department of Economics and Finance, College of Business and Economics, University of Canterbury, Private Bag 4800, Christchurch, New Zealand

27 October 2016

Abstract: In this paper, we study biases in performance evaluation by analyzing votes for the FIFA Ballon d’Or award for best soccer player, the most prestigious award in the sport. We find that ‘similarity’ biases are substantial, with jury members disproportionately voting for candidates from their own country, own national team, own continent, and own league team. Further, we show that the impact of these biases on the total number of votes a candidate receives is fairly limited and hence is likely to affect the outcome of this competition only on rare occasions where the difference in quality between the leading candidates is small. Finally, analyzing the incidence of ‘strategic voting’, we find jury members who vote for one leading candidate are more, rather than less, likely to also give points to his main competitor, as compared with neutral jury members. We discuss the implications of our findings for the design of awards, elections and performance evaluation systems in general, and for the FIFA Ballon d’Or award in particular.
Keywords: Award; Bias; Voting; Soccer JEL Classifications: D72, Z2 1 Kyiv School of Economics, UKRAINE; and the Department of Economics and Finance, University of Canterbury, Christchurch, NEW ZEALAND 2 KEDGE Business School & LIEPP, Sciences Po, FRANCE 3 Department of Politics, New York University-Abu Dhabi, UAE † Corresponding author is Tom Coupé. Email: [email protected] 1. Introduction For a long time, economists have been studying incentive mechanisms ranging from monetary compensation to fringe benefits, and, more recently, awards. For these incentive mechanisms to be effective in making people work hard, they must reward people for performance rather than luck or other non-performance-related outputs (see Prendergast, 1999). In practice, however, organizations often fail to reward for performance only. For example, Bertrand and Mullainathan (2001) show that how much chief executive officers (CEOs) are paid depends on luck, such as fluctuations in oil prices or changes in industry performance. Similarly, Blanchard et al. (1994) find that firms use part of their cash windfalls to increase executive compensation. One of the reasons for such ‘pay without performance’ (Bebchuk and Fried, 2004) is that the people in charge of evaluating performance are often biased and take into account factors other than past (or expected) performance when judging how good performance has been (or is likely to be), or who performed best (or is likely to perform best). In this paper, we investigate the size and direction of biases in performance evaluation, by analyzing the voting process for the most prestigious best player award in soccer, known as the Ballon d’Or (Golden Ball) 1, awarded annually by the International Federation of Association Football (FIFA). Many major sports associations use awards to reward and promote performance. 
FIFA and the National Basketball Association (NBA), for example, use elections to choose the best athlete of the year, while the Association of Tennis Players (ATP) and the International Cycling Union (UCI) use a mathematical formula that gives more weight to victories in more prestigious tournaments than those gained in lesser competitions. 2 Awards also play an important role for many international organizations, and in some cases giving an award is a key element of their functioning (see Frey and Gallus, 2015). Awards are a popular incentive mechanism in various fields of business, particularly where information asymmetry is important, ranging from advertising and finance to medicine, public affairs, technology, and transportation. According to Awards, Honors & Prizes (Gale, 2015), a primary source of information on awards first published 35 years ago, the number of awards has been growing steadily over time, with the 2015 edition containing references to some 20,000 awards and prizes. Despite this increase in the importance of awards as incentive mechanisms, economists, and other social scientists have only recently begun to study the various dimensions of awards. In a workplace setting, awards have been shown to reduce absenteeism (Markham et al., 2002), improve call center workers’ performance (Neckermann et al., 2014) and positively affect health worker trainees’ exam scores (Ashraf et al., 2014). Awards can improve the welfare of both the giver and the receiver. For the awarding bodies, they can provide a way to promote their values, while awardees can use them to reveal or signal their talent. Other economic agents who are engaged in the activities covered by the awards can also benefit as they can bask in the reflected glory and adoration of the receiver (Frey and Gallus, 2015). 
Biased decision-making serves to impair these positive impacts, resulting in a series of undesirable effects, not only reducing incentives for current or future applicants but also damaging the reputation of the institution organizing the award.

1 According to FIFA’s ‘rules of allocation’ (2014), this award is ‘bestowed according to on-field performance and overall behavior on and off the pitch’.
2 In soccer, there is also a best African player award, a best South American player award and a best Asian player award. L'Equipe, the leading French sports newspaper, and the Guardian, a major UK newspaper, also both publish their own ranking of the top 100 best soccer players of the year.

In this paper, we focus on two kinds of biases in performance evaluation, which have been identified by the academic literature. First, we evaluate how ‘similarity’ between voters and candidates affects evaluations. 3 A large body of literature in the social sciences documents that subjects tend to be attracted to people who are similar to them (see, for example, Montoya et al., 2008). This also extends to how people evaluate others’ performance. For example, Giuliano et al. (2011) document the presence of an own-race bias in the behavior of managers of a large retail firm, while Parsons et al. (2011) show such bias in the behavior of umpires in baseball. Zitzewitz (2006) documents an own-nationality bias in the behavior of judges of various winter sports, while Ginsburgh and Noury (2008) show that linguistic and cultural proximities between singers and jury members are important predictors of success in the Eurovision Song Contest. Similarity biases have been shown to exist, too, in settings where evaluators need to predict candidates’ future performance, such as in hiring decisions or elections. Combes, Linnemer, and Visser (2008) find that links between jury members and candidates, such as working at the same university or the jury member being the Ph.D.
advisor of the candidate, matter for the hiring of economics professors, while Stoll et al. (2004) observe own-race bias in the behavior of hiring agents in a large sample of US firms. Using experiments, Bailenson et al. (2008) show that voters prefer candidates with faces similar to their own, especially for unfamiliar candidates. Using survey data, Caprara et al. (2007) find that voters prefer candidates whose traits they rate most similar to their own, while Webster and Pierce (2015) show that voters are more likely to vote for candidates of a similar age to themselves. Similarly, Cutler (2002) shows Canadian voters are more likely to support candidates with whom they share a common gender, location or language. One possible explanation of why similarity might matter is that similarity can validate the evaluator’s own characteristics (Kaptein et al. 2014): in other words, if a candidate with your characteristics wins an award or an election, your characteristics will be deemed more valuable. Second, we evaluate the importance of ‘strategic’ behavior of evaluators. Strategic voting occurs when the ranking of candidates by an evaluator does not correspond to the evaluator’s true preferences. Several papers provide evidence for the importance of strategic voting. Fujiwara (2011), for instance, uses Brazilian data to show that lower placed candidates get higher vote shares in the first round of a two-round election (dual ballot) than when there is only a single-round election, suggesting that voters vote for the top candidates rather than their preferred candidates in single-round elections. Similarly, Alvarez et al. (2006) find that in the UK, voters are less likely to support a political party that is perceived as being unlikely to win. In rank-order elections like the ones used for the FIFA Ballon d’Or award, voters have an incentive to behave strategically. 
Indeed, a major criticism of ranked order voting methods is that they are vulnerable to strategic behavior. Rational voters can increase the winning chance of their preferred candidate by ranking that candidate's direct competitors as low as possible.

3 Other studies have focused on biases not related to similarity. For example, for musical competitions, Van Ours and Ginsburgh (2003) find that jury members are influenced by the order of appearance of candidates, while Tsay (2013) documents that judges are influenced more by what they see than what they hear. Similarly, for academic awards, Hamermesh and Schmidt (2003) find that descriptive characteristics of candidates (such as affiliation or subspecialty) affect the voters, while in the context of political elections, Berggren et al. (2006) show that a candidate’s perceived physical beauty matters.
4 There have been allegations of votes being changed ex-post, jury members being influenced by politicians, being biased against specific countries, being biased in favor of some clubs and being biased against defenders.

In addition to contributing to the academic literature on performance evaluation, voting behavior and awards, this paper further adds to the popular debate on biased evaluations in the Ballon d’Or, an award whose reputation has been marred by scandals and allegations of biased voting. 4 As information is made public on how each jury member voted, every year articles appear in the popular press that list the unexpected votes of high-profile jury members, accompanied by allegations of both strategic voting and similarity voting. For example, a Guardian blog article commented: ‘The two frontrunners were not the only men guilty of tactical voting. The Portugal coach did not rate [Argentine player] Messi in his top three; the Argentina coach voted for three Argentinians; the Brazil coach picked [Brazilian player] Neymar; and the Germany manager selected three Germans.
The levels of bias and favouritism in the voting make the Eurovision Song Contest look positively objective’ (Campbell, 2015). Similarly, a South African soccer news website observed: ‘In fact, it was only the captains, whose compatriots and teammates weren’t amongst the nominees, that seemed to make remotely objective votes’ (Smith, 2015). In this paper, we contribute to this popular discussion by presenting estimates of the impact of different biases, both on an individual candidate’s chance of being selected as best player by a given voter, and on the overall outcome of these elections. We find clear evidence that jury members are much more likely to vote for ‘similar’ candidates, defined here as candidates with whom they share a national team, nationality, or continent, than for candidates with whom they share no such similarity. We also show that the size of the bias is larger for closer ties. We document the existence of such bias in each of the four best player elections we have analyzed, and among all types of voters: coaches, players, and media representatives. Hence, the recent decision to limit the electorate to media representatives only (Lacombe, 2016) is unlikely to result in an electoral process free of bias. Most importantly, however, we find that while some of the biases we detect are sizeable, the overall impact on election outcomes is limited. This supports FIFA’s decision in 2011 to simplify the electoral process by allowing jury members to vote for candidates from their own national team. As far as strategic behavior is concerned, we find some evidence that is consistent with voters engaging in strategic voting. For example, a sizeable fraction of Messi fans avoid giving points to his biggest rival, Ronaldo (and vice versa). However, compared with ‘neutral’ voters (who vote for neither Messi nor Ronaldo), jury members placing Messi first are more, rather than less, likely to put Ronaldo second. 
Still, replacing the current ranked order voting system with an alternative system such as approval voting, which is arguably less vulnerable to strategic behavior, could reduce worries about strategic biases in FIFA’s Ballon d’Or award. The rest of this paper is organized as follows. Section 2 presents some background information on FIFA’s Ballon d’Or award, its rules and participants, and provides data on our similarity variables. Section 3 analyzes the degree of similarity bias in the Ballon d’Or voting. Section 4 examines the extent to which this bias affects the outcome of the Ballon d’Or competition. Section 5 analyzes the degree of strategic voting, and Section 6 provides conclusions. 2. Background and data Since 2010, FIFA, together with France Football, a popular French soccer magazine, has organized a best player of the year award, known as the Ballon d’Or (or Golden Ball), with a jury of captains and coaches of national teams, and media representatives. 5 Each jury member has three votes, worth respectively one, three and five points for third, second and first place, and they can choose to vote from 23 pre-selected candidates. 6 The candidate with the highest overall score receives the award. 5 Prior to 2010, France Football organized its own player of the year awards, with voting conducted by journalists, while FIFA’s equivalent award was voted for by national team coaches (and later also captains). In 2010, these two awards were merged. We have data for 2010 but, as FIFA rules in 2010 did not allow jury members to vote for candidates from their own national team, we excluded 2010 from our analysis. 6 We focus on the voting for the 23 candidates. It is not clear how FIFA selects these 23 candidates. We cannot exclude the possibility that there are already biases at the initial selection stage. 3 From FIFA’s website we collected individual-level voting data for all jury members, for each of the elections from 2011 to 2014. 
In addition, we gathered data on various characteristics of each jury member and candidate. For each individual, we know which country they represent from the FIFA voting files. For coaches, using Google searches, we also collected data on nationality. While players for the national team are by definition nationals of the country they represent, more than 40% of the coaches manage a team of a country that is different from their country of nationality. For captains, trainers and candidates, we used Google to search for data on their year of birth and their position on the field: forward, midfielder, defender, or goalkeeper. For each of the captains and each of the candidates, we further used Google to collect data on the team for which they play as well as the country where they play, for the year relevant to the vote. Finally, for each candidate, we collected data on their popularity using Google Trends (GT). GT is an online search tool that enabled us to measure how often a specific player had been searched for over a specific period of time. We used GT scores for the month preceding the election, i.e. October of the election year as the voting happens in November. We also collected the ‘player rating’, a widely used performance indicator calculated by WhoScored.com (WS), which states that its ratings are ‘the most accurate, respected and well-known performance indicators in the world of football [soccer]’. Rating is a variable that ranges from 6 to 10 and is calculated based on a large number of raw statistics and weighted according to their influence within the game. Although the GT scores and Rating are correlated, they capture different dimensions of candidate quality. Whereas the GT score is a measure of popularity, Rating measures performance regardless of how popular a player is. 
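The scoring rule described at the start of this section (five, three and one points for first, second and third place, with the highest total winning) can be illustrated with a minimal sketch. The ballots below are made-up examples rather than actual FIFA votes, and the function name `tally` is purely illustrative.

```python
from collections import Counter

# Points for first, second and third place, as described in the text.
POINTS = (5, 3, 1)

def tally(ballots):
    """Sum points over ballots; each ballot is a (first, second, third) tuple."""
    scores = Counter()
    for ballot in ballots:
        # Reject ballots that do not name exactly three distinct candidates.
        if len(ballot) != 3 or len(set(ballot)) != 3:
            raise ValueError("a ballot must name three distinct candidates")
        for candidate, points in zip(ballot, POINTS):
            scores[candidate] += points
    return scores

# Hypothetical ballots, for illustration only.
ballots = [
    ("Messi", "Ronaldo", "Xabi Alonso"),
    ("Ronaldo", "Messi", "Xabi Alonso"),
    ("Messi", "Xabi Alonso", "Ronaldo"),
]
scores = tally(ballots)
winner = max(scores, key=scores.get)
print(dict(scores), winner)
```

Because a ballot only awards points to three of the 23 candidates, leaving a rival off the ballot entirely costs him at least one point relative to ranking him third, which is the channel through which strategic voting can operate under this rule.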
Based on the data we collected, we created variables that reflect similarity 7 with respect to:

• National team (1 if jury member and candidate are affiliated with the same national team, 0 otherwise. In about 0.6% of candidate-jury member pairs, the jury member and candidate share the same national team)
• Nationality (1 if jury member and candidate share nationality, 0 otherwise. In about 1.3% of candidate-jury member pairs, the jury member and candidate share the same nationality)
• Continent (1 if jury member and candidate share continent, 0 otherwise; see footnote 8. In about 29% of candidate-jury member pairs, the jury member and candidate share the same continent)
• Competition (1 if jury member and candidate play in the same soccer competition, 0 otherwise. In about 6% of candidate-jury member pairs, the jury member and candidate play in the same competition)
• League team (1 if jury member and candidate play for the same league team, 0 otherwise. In about 0.1% of candidate-jury member pairs, the jury member and candidate play for the same league team)
• Position on the field (1 if jury member and candidate share position, 0 otherwise. In about 27% of candidate-jury member pairs, the jury member and candidate play the same position)
• Younger (1 if the jury member is younger than the candidate, 0 otherwise. In about 19% of candidate-jury member pairs, the jury member is younger than the candidate)

7 We focus on interactions of given characteristics of jury members and candidates: for example, do older jury members select older candidates? We do not consider interactions between different characteristics, such as whether or not older jury members choose goalkeepers more often than other players. In the regressions in this paper we control for candidate characteristics. That is, jury members could vote for better players. We do not include voter characteristics, however, as such effects would be candidate-specific (for example, whether older voters prefer Messi).
8 We use nationality to determine continent. 4 3. Similarity biases in performance evaluation The data we use consist of choices made by various jury members in the FIFA Ballon d’Or competition. Each jury member has a choice of 23 candidates and has to select one candidate as ‘best’ player. 9 This means that for each jury member, we have 23 observations, together representing a single vote. Given that there are approximately 500 jury members and we use data from four Ballon d’Or elections, we have around 47,000 observations in total. In this section, we focus on the top choice made by voters. Each observation of our dependent variable consists of a value for a binary dependent variable: that is, 1 if the candidate is selected as the best player, and 0 otherwise. Our set of explanatory variables describes how the jury member and the candidate are related. Having a binary dependent variable calls for the use of non-linear regression methods such as logit or probit to make sure that the predicted probabilities lie within the zero–one interval. A drawback of these non-linear methods is that the marginal effects of the explanatory variables on the probability of being selected as best player depend on the values of all explanatory variables. As a consequence, the marginal effect of a similarity link will be different for Messi than for Ronaldo, for instance. It is important to realize that the 23 observations of each vote are not independent from one another. When a jury member chooses one candidate as best player, by definition none of the other 22 candidates can then be chosen as best player. A model that captures this feature is McFadden’s Discrete Choice model (1974) which can be estimated by the (alternative specific) conditional logit model. Note that this further complicates the interpretation of the marginal effects: anything that increases the chance of one candidate being selected will simultaneously decrease the chance of other candidates being selected. 
For example, assigning Messi an extra (positive) similarity link with a jury member will increase his chance of winning, but will decrease Ronaldo’s and other candidates’ winning chance. The discussion above suggests that using the correct statistical model comes at the cost of ease of interpretation: The (alternative specific) conditional logit model provides us with odds ratios that have the right properties but are not easy to grasp, while models that are easy to interpret are likely to be less accurate. For example, given the binary nature of the dependent variable, we could estimate a linear probability model (LPM) with robust standard errors. The advantage of doing so is that LPM is very easy to interpret, with coefficients reflecting how the probability of being selected as best player changes as one changes the value of a single explanatory variable. However, we know that several of the assumptions behind LPM will be violated: First, LPM assumes observations are independent, which in our case does not hold, as explained above. Second, LPM does not restrict predicted values to be in the zero–one interval. As a solution to the above problem, we provide in this paper the results of a conditional logit analysis, present the odds ratios and specific examples to illustrate the economic significance of these odds ratios, and in the appendix provide the results of the LPM model. Overall, both models lead to the same conclusions, but we point out if, where, and when the results differ. In our basic specification, we use three similarity variables: National Team, Nationality, and Continent. As control variables, we further include alternative specific variables including age of the candidate, his Google Trends score, his WhoScored.com (WS) rating, and his nationality (captured by a country dummy). Ideally, we would include alternative specific dummies to capture all characteristics of the candidates. 
However, this sometimes leads to non-convergence of the maximum likelihood estimation procedure. Therefore, for comparability across specifications, we show the results of regressions with a fixed set of candidate characteristics. The bias introduced by this is likely to be small: for the cases where including candidate-specific dummies did lead to convergence, we get results very similar to those when we use the set of candidate characteristics rather than the candidate-fixed effects.

9 They also choose one candidate as second best, and another one as third best. Section 5 analyzes the vote for the second and third place.

Column 1 of Table I presents the results of a regression using the votes from all jury members regardless of their types. Column 2 presents the results of a regression using votes from the media representatives only. Since these jury members are not affiliated with the national team, they can only share the same nationality and the same continent with candidates. Column 3 presents the results of a regression using votes from coaches. As coaches can be of a different nationality from the national team they train, we can separate the effect of belonging to the same national team from the effect of sharing nationality. Column 5 presents the results of a regression using the votes of captains only. Since captains have the nationality of the national team, one cannot distinguish between the impact of being players of the same national team and sharing nationality, and hence the coefficient of the National Team in this regression also incorporates the effect of Nationality. Columns 4, 6, and 7 show the results of including additional similarity variables as explanatory variables. For coaches and captains, we have dummies that indicate whether or not jury members and candidates share the same position on the field, and whether or not the jury member is younger than the candidate (columns 4 and 6).
In addition, for captains, we have dummies for sharing the same league team with the candidate and playing in the same competition (league) as the candidate (column 7). [Table I here] Column 1 shows that the odds to be voted best player by a given jury member are about 9.3 times higher if a player plays for the same national team as the jury member, compared with a case where the player is not playing for the same national team as the jury member. This effect is statistically significant but to gauge how sizeable it is we compute, for all candidates, the expected chance to be selected as best player by a jury member under two opposing scenarios. For both scenarios, we assume that all other candidates keep their actual sample values. For the first scenario, we assume the candidate under consideration has no similarity link with the jury member. For the second scenario, we assume the candidate under consideration has a similarity link to the jury member. Comparing these two scenarios for a given candidate gives us, for that candidate, the average (across observations for that candidate) marginal effect of moving from not being similar to being similar on a given dimension with a given jury member. 10 In the text, we discuss these average marginal effects for three players: Messi, Ronaldo, and Xabi Alonso. 11 We also discuss the overall average marginal effect, which is the average of these candidate-specific ‘average’ marginal effects across all candidates. 12 10 Note that while for each individual vote this marginal effect will correspond to the odds ratio, the average of these marginal effects (which we use here) does have an odds ratio that is similar in magnitude but not exactly the same, because the average of a ratio is not the ratio of the averages. 
Alternatively, one could compute the marginal effects at the ‘average’ values of the explanatory variables but, as this average player would be a player with several different nationalities, it makes little sense to do this.
11 Messi and Ronaldo were the main candidates in the four elections we have studied, while Xabi Alonso is an example of a candidate with a relatively low likelihood of winning.
12 Note we use an unweighted average and do not weigh a candidate’s (average) marginal effect by the number of observations this candidate represents in the sample.

Tables AI and AII in the appendix give the values for these statistics, for the different regressions we present below. Starting with the impact of the jury member and the candidate being affiliated with the same national team, we find that under the first scenario, Messi would get 40% of the votes as best player, while under the second scenario, he would get 84% of the votes. Hence, the marginal effect of representing the same national team is 44 percentage points for Messi. Ronaldo would get 28% under the first scenario, and 74% under the second scenario, giving a marginal effect of 46 percentage points. Xabi Alonso would get 2% under the first scenario, and 16% under the second scenario: a marginal effect of 14 percentage points. These examples show that the effect of being affiliated with the same national team as a jury member is sizeable. Not surprisingly, the percentage points difference between the two scenarios tends to be bigger for the candidates who, even without being similar to the jury member, have a higher chance of getting the best player vote. The overall average marginal effect, which is the average marginal effect of candidates, averaged across all candidates, is 9.8 percentage points. Further, column 1 of Table I shows that belonging to the same national team is not the only variable that matters: candidates also benefit substantially from sharing nationality with a jury member.
The odds to be voted best player by a given jury member are about five times higher if a candidate shares his nationality with the jury member, than if the candidate and jury member are of different nationalities. Using the scenarios described above, for Messi, not sharing nationality with a jury member would get him 40% of the votes, compared with 75% in the case of sharing nationality. For Ronaldo, these percentages would be 28% and 65% respectively, while for Xabi Alonso, these percentages would be 2% and 9%. The overall average marginal effect of sharing nationality is about 6 percentage points, which is smaller than the average marginal effects across candidates who play for the same national team (which we found to be about 9.8 percentage points), but is still a sizeable effect. Finally, the impact of candidate sharing continent with a jury member is also positive and significant, though the odds ratio is fairly small. The odds to be voted best player by a given jury member are about 29% higher if a candidate shares his continent with the jury member, compared to the case where the candidate is not sharing a continent with the jury member. Using the scenarios described above, for Messi, not sharing his continent with a jury member would get him 40% of the votes, compared with 45% in the case of sharing a continent with the jury member. For Ronaldo, these percentages would be 27% and 32% respectively, while for Xabi Alonso, these percentages would be 2% and 2.5%. While in relative terms these are still sizeable impacts, for candidates with a low chance of being selected as the best player, in the case of absence of similarity with the candidate the absolute change in probability is small. The overall average marginal effect across candidates is about 0.6 percentage points. So far, we have assumed all jury members behave in a similar way. Next, we allow for differences between media representatives (column 2), coaches (column 3), and captains (column 5). 
All types of jury members show biases, though the magnitude of specific biases varies somewhat across types of voters. For coaches we can distinguish between the three types of similarity and confirm the significant and sizeable impact of sharing nationality (odds ratio approximately 5) and national team (odds ratio approximately 16), while the impact of sharing continent is again positive (odds ratio approximately 1.1) but insignificant at the 10% significance level. For captains and media representatives, we find significant effects of sharing continent (odds ratios of approximately 1.5), and significant effects of sharing nationality (odds ratio of approximately 9 for media representatives) and of sharing national team/nationality (which cannot be separated for captains and has an odds ratio of approximately 23). Note the need to be careful when comparing odds ratios across equations as the denominators, the probabilities under the first scenario, can be quite different across equations. Tables AIa and AIb in the appendix provide the percentage of votes for the three above-mentioned players under the two scenarios for the different columns of Table I. They show that in our case the denominators (the first scenario) are fairly similar across equations, and hence that rough comparisons of odds ratios across specifications can be informative. The estimate of the average marginal effect of being part of the same national team varies (across columns of Table AI) between 9.8 and 19.3 percentage points, of having nationality in common between 3.2 and 8.5 percentage points, and of sharing a continent between 0.2 and 1.1 percentage points. This suggests that the effect of similarity is bigger if this similarity is ‘geographically’ closer, which intuitively makes sense.
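The link between the reported odds ratios and the scenario percentages can be made concrete with a small sketch of conditional-logit choice probabilities within a single choice set. The baseline utilities and the four-candidate choice set are hypothetical values chosen for illustration; only the odds ratio of 9.3 is taken from the text, and the helper names are ours, not part of the original analysis.

```python
import numpy as np

def choice_probabilities(utilities):
    """Conditional-logit probabilities over one jury member's choice set."""
    z = np.exp(utilities - np.max(utilities))  # shift for numerical stability
    return z / z.sum()

def odds(p):
    """Odds corresponding to a probability p."""
    return p / (1.0 - p)

# Hypothetical baseline utilities for a 4-candidate choice set (illustrative).
baseline = np.array([1.2, 0.8, -1.5, -2.0])
odds_ratio = 9.3            # national-team odds ratio reported for column 1
beta = np.log(odds_ratio)   # implied coefficient on the similarity dummy

linked = baseline.copy()
linked[0] += beta           # give candidate 0 a similarity link

p0 = choice_probabilities(baseline)  # no similarity link (first scenario)
p1 = choice_probabilities(linked)    # with similarity link (second scenario)

# Closed form for the linked candidate within a single choice set.
p1_closed = odds_ratio * p0[0] / (1.0 - p0[0] + odds_ratio * p0[0])

print(p0.round(3), p1.round(3), round(odds(p1[0]) / odds(p0[0]), 3))
```

In this sketch the linked candidate's odds are multiplied by exactly the odds ratio, while every other candidate's probability falls. Applying the same closed form to a baseline of 40% with an odds ratio of 9.3 gives roughly 86%, close to but not identical to the 84% reported for Messi, since the reported figures are averages across observations rather than a single choice-set calculation.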
Columns 4 and 6 add two more similarity indicators: one indicator that is 1 if the jury member is younger than the candidate, and 0 otherwise, and one indicator that is 1 if the candidate and the jury member play in the same position on the field, and 0 otherwise. We find both variables have a negative effect on the chance of a candidate being selected as the best player, but only significantly so when we use the votes of captains. When voted for by jury members who are captains, the odds of being selected as best player when the jury member is younger than the candidate are approximately 75% of the odds when the jury member is older than the candidate. Similarly, the odds of being selected as best player when the candidate plays the same position on the field as the jury member are approximately 60% of the odds when the candidate plays a different position from the jury member. Hence, we find that captain jury members have a tendency both to avoid voting for candidates who are older than they are themselves and to avoid voting for players who play in the same position as they do. Table AI in the Appendix illustrates these effects by showing that Messi would get 42% of votes of jury members who are not forwards while only 31% of votes of jury members who are forwards (based on column 6). For Ronaldo, who is also a forward, these percentages are 32% versus 23% respectively. Xabi Alonso, a midfielder, in contrast, would get about 3% of the votes of jury members who are not midfielders, and 2% of votes from jury members who play as midfielders. The marginal effects for the younger dummy are similar in nature but somewhat smaller. Finally, column 7 adds two more similarity variables to the model that analyzes the voting behavior of the captains. We add dummies for jury members and candidates who play in the same league team, and for those playing in the same league.
Adding these variables reduces the odds ratio of sharing the national team/nationality and shows a sizeable advantage of playing for the same league team (odds ratio of approximately 4.6) and playing in the same league (odds ratio of approximately 1.7). For Messi, this implies that he will get 72% of the votes of Barcelona jury members, compared with 38.5% of non-Barcelona jury members. Similarly, while approximately 51% of jury members in the Spanish League would vote for Messi, only approximately 38% of jury members active in other leagues would do so. Table AI in the Appendix further shows that 11% of Real Madrid jury members would select Xabi Alonso, compared with approximately 3% of non-Real Madrid jury members. Similarly, while approximately 5% of jury members in the Spanish League would vote for Xabi Alonso, only approximately 3% of jury members active in other leagues would do so.

[Table II here]

In the above regressions, we pooled the data from four best player elections. Next, we run our basic regression using one election at a time. Table II repeats column 1 of Table I but then splits the sample by year in columns 2 to 5. The direction of the effects in each year is the same as in the regression that pooled data from all years, though some coefficients are not significant in 2011. This could be because in 2010 votes for national teammates had been forbidden and the votes of jury members who had voted for their national team members anyway had been made invalid (Volkskrant, 2011), which could have led to uncertainty about whether or not voting for teammates was allowed in 2011. Over time, there is indeed an increase in the share of best player votes going to candidates linked by national team or nationality. Moreover, 2011 had very concentrated voting, with 78% of all first-place votes going to Messi. The other years (2012, 2013 and 2014) all give significant effects, but the magnitudes of the odds ratios do vary across years.
In 2011 and 2012 the odds ratio of shared nationality is bigger than the odds ratio of shared national team, while in 2013 and 2014 the opposite is true. Table IIa gives the corresponding estimates of the average marginal effects. Tables AIII and AIV provide the results of OLS regressions with jury member fixed effects, clustered and robust standard errors, and the same set(s) of explanatory variables as used in Tables I and II. Overall, qualitative results are similar with the relative ordering of the impact of various similarities typically being maintained, though significance differs somewhat for the similarities in terms of age and position and the ordering of impact does not change across year-specific regressions. The biggest difference is in the impact of playing in the same competition, which is significant and positive based on the conditional logit analysis, but insignificant based on the OLS analysis. Given that many of the OLS model assumptions are violated by the data, the conditional logit results are more convincing, however.

4. The overall impact of similarity biases

While the impact of the biases on the chance a given candidate gets selected as best player by a given jury member is sizeable, this does not necessarily mean the biases have a sizeable impact on the overall election outcome. Indeed, as Table III shows, the variation in the number of jury members with which a candidate shares a given feature is fairly limited for some dimensions and is inversely related to the extent of bias.

[Table III here]

For example, while the impact of having a shared national team on the probability a given candidate receives a vote from a given jury member is sizeable, all candidates typically have only two potential jury members from their national team: the captain and the coach of the national team. Given there are several hundred jury members, this means that the overall impact on the election outcome must be small.
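A rough calculation illustrates why a large individual bias need not move the aggregate result. The sketch below multiplies the maximum number of linked jury members by the average marginal effect for the three main similarity dimensions, using the figures reported in Table III. Treating this product as an approximate upper bound on extra first-place votes is a simplification made for illustration, and the jury size of 500 is a hypothetical stand-in for ‘several hundred’.

```python
# (maximum linked jury members, average marginal effect), as reported in Table III.
DIMENSIONS = {
    "national team": (2, 0.098),
    "nationality": (14, 0.057),
    "continent": (190, 0.006),
}
JURY_SIZE = 500  # hypothetical; the text only says "several hundred"


def extra_votes(max_links: int, marginal_effect: float) -> float:
    """Approximate upper bound on extra first-place votes from one bias."""
    return max_links * marginal_effect


for name, (links, effect) in DIMENSIONS.items():
    votes = extra_votes(links, effect)
    share = 100.0 * votes / JURY_SIZE
    print(f"{name}: about {votes:.2f} extra votes ({share:.2f}% of the jury)")
```

Under these assumptions each bias shifts well under 1% of first-place votes, either because few jury members are linked to a candidate or because the effect per linked jury member is small.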
There is more variation across candidates in the number of jury members linked through the continent, but as column 3 shows, the effect of being linked through continent is small. To illustrate the overall impact of the similarities, we compare the actual percentage of votes received by a top-three player of a given year with the predicted percentage of best player votes in a scenario where jury members and candidates would not be linked by nationality, national team or continent. That is, we simultaneously set all linkages for all candidates to zero in the year-specific models (the various columns of Table II).13

[Table IV here]

13 This is different from the player-specific marginal effects we computed earlier by varying a single similarity dimension for a single candidate from one to zero, while keeping the values for all other candidates as they were in the sample.

As Table IV shows, the differences between actual outcomes and predicted outcomes based on the zero-links scenario are small, showing that similarities, while affecting individual jury members, do not affect in any meaningful way who gets most votes as best player. This suggests that only if there is very little consensus on who is the best player, could similarities affect the election outcome. As an example, in 2014 Ronaldo was a clear winner, with 37.66% of the overall vote count, compared with Messi who took 15.76% of the vote count. German player Manuel Neuer was third, with 15.72% of the vote count.14 While the difference between the number one and two was clearly too big to be caused by biased voting, biases could potentially be sufficiently large to affect who won second place, though in this case, as Table IV suggests, Messi’s lead would have been bigger if there had not been any nationality, national team or continent similarity links.

5. Strategic voting

Besides biased voting, where jury members use criteria other than candidate quality to define their preferred candidate, jury members can also behave strategically when voting. For example, jury members can make choices that benefit their preferred candidate by not voting for that candidate’s direct competitors. Supporters of Ronaldo, for instance, will maximize their support if they not only rank him first (thus allocating him 5 points) but also do not put Lionel Messi, Ronaldo’s arch rival, in second or third place (which receive respectively 3 points and 1 point), even if they did think Messi was the second or third best player. In fact, of those who are ranking Ronaldo first, 45.93% put Messi second and another 18.81% put Messi third; hence almost 65% of Ronaldo fans give points to Messi. Similarly, about 57.89% of Messi fans put Ronaldo second, and another 17.11% put Ronaldo third; hence, almost 75% of Messi fans award points to Ronaldo. Our conclusion about strategic voting thus depends on how strictly one defines strategic voting: We find that about 65% of Ronaldo fans do give points to Messi, so we could argue that 35% of Ronaldo-voting jury members vote strategically by not giving points to Messi. Similarly, one could argue that 25% of Messi jury members vote strategically by not giving points to Ronaldo. However, if it is assumed that one can rationally have different evaluations of soccer quality, then one should compare the behavior of Messi and Ronaldo voters with the voting behavior of the quarter of jury members who do not put either of these players in the top place, jury members whom we will call ‘neutral’ jury members. Of these neutral jury members only 53.85% put Messi in the top three and only 51.44% put Ronaldo in the top three, suggesting, if anything, that voting for Messi goes together with voting for Ronaldo and vice versa.
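The comparison between fans and ‘neutral’ jury members can be sketched as a simple tabulation. The ballots below are invented for illustration and are not the FIFA data; the 5-3-1 point scheme is the one described above, and the function names and the ‘Other A’/‘Other B’ candidates are hypothetical.

```python
from collections import Counter

POINTS = (5, 3, 1)  # points for first, second and third place


def tally(ballots):
    """Total points per player across ranked ballots (first, second, third)."""
    totals = Counter()
    for ballot in ballots:
        for player, points in zip(ballot, POINTS):
            totals[player] += points
    return totals


def share_giving_points(ballots, to_player, first_choice=None, exclude_first=()):
    """Share of selected jury members who rank to_player second or third.

    Selects voters whose first choice is first_choice or, when first_choice
    is None, voters whose first choice is not in exclude_first.
    """
    if first_choice is not None:
        group = [b for b in ballots if b[0] == first_choice]
    else:
        group = [b for b in ballots if b[0] not in exclude_first]
    if not group:
        return float("nan")
    return sum(to_player in b[1:] for b in group) / len(group)


# Invented ballots, for illustration only.
ballots = [
    ("Ronaldo", "Messi", "Neuer"),
    ("Ronaldo", "Neuer", "Other A"),
    ("Messi", "Ronaldo", "Neuer"),
    ("Messi", "Neuer", "Ronaldo"),
    ("Neuer", "Messi", "Other A"),
    ("Neuer", "Other A", "Other B"),
]

fans = share_giving_points(ballots, "Messi", first_choice="Ronaldo")
neutral = share_giving_points(ballots, "Messi", exclude_first=("Messi", "Ronaldo"))
print(tally(ballots).most_common(3))
print(f"Ronaldo fans giving Messi points: {fans:.0%}; neutral: {neutral:.0%}")
```

If fans voted strategically, the first share would fall below the neutral share; the figures reported in the text show the opposite pattern.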
That is, there are many jury members who are choosing quality above all and thus give both contenders points. At the same time, in relative terms, Ronaldo fans are less generous to Messi (65% voting for Messi, relative to 53.85% of ‘neutral’ jury members) than Messi fans are to Ronaldo (75% vote for Ronaldo, versus only 51.44% of ‘neutral’ members). However, jury members who are fans of either star player are more likely to vote for the other star player than are ‘neutral’ jury members. Table V checks this comparison more formally, using regression analysis. We first analyze the choice for second place made by jury members who did not vote for Messi first and compare the choice of those jury members who put Ronaldo first with the choice of those who put neither Ronaldo nor Messi first. We then repeat a similar regression, analyzing the choice for second place of jury members who did not vote for Ronaldo first, and compare the choice of those who put Messi first with that of those who put neither Ronaldo nor Messi first.

14 The vote count combines first, second, and third places while Table IV focuses only on the first place vote share.

[Table V here]

As control variables, we use dummies for jury members affiliated with Messi’s (Ronaldo’s) national team, for jury members who are Argentinian (Portuguese), and for jury members who are South American (European). We find that Ronaldo fans are 12 percentage points more likely than neutral fans to put Messi second, while Messi fans are 30 percentage points more likely than neutral fans to put Ronaldo second. As far as control variables are concerned, we find, for the sample that excludes jury members who put Messi first, that members of the Portuguese national team are less likely to put Messi second, while Argentinian nationals are more likely to put Messi second.

6. Conclusion

This paper analyzes in detail the determinants of the vote for the most popular soccer award, the FIFA Ballon d’Or, thereby adding to the expanding literature on the economics of awards (see Frey and Gallus, 2015). This paper also adds to the literature on the provision of incentives by analyzing biases in performance evaluation, and to the literature on voting behavior by analyzing biased and strategic voting. In addition, this paper contributes to the popular debate about FIFA’s Ballon d’Or by providing empirical estimates of the extent and origins of biased voting in the elections for this award. We show that voting for the FIFA Ballon d’Or is indeed subject to sizeable biases, and thus provide support to the academic literature on the importance of ‘similarity’ as a determinant of biased performance evaluations or biased voting behavior. Our results suggest that closer ties between jury members and candidates lead to bigger biases. Our basic specification, for example, suggests that a candidate who is affiliated with the same national team as the jury member will, on average, be about 10 percentage points more likely to be chosen as best player by that jury member than a candidate not affiliated with the same national team as that jury member. It also suggests that a candidate who has the same nationality as the jury member will, on average, be about 6 percentage points more likely to be chosen as best player by that jury member than a candidate of a different nationality from the jury member. Sharing a continent with a jury member, in contrast, only leads to a 0.6 percentage-point difference. Further, our results suggest that all types of voters (players, coaches and media representatives) are affected by these biases and that such biases were present in all four election years we analyzed.
In September 2016, France Football, the French magazine that between 2010 and 2015 co-organized the Ballon d’Or elections with FIFA, announced it would stop cooperating with FIFA and revert to its pre-2010 independent Ballon d’Or award and the pre-2010 award procedure, which only allows media representatives to vote. This decision was motivated by ‘the hope that the award will gain impartiality as journalists do not have fellow team members to defend nor a dressing room to keep happy, while certain team captains or coaches might show their friendship or their desire to keep social peace’ (Lacombe, 2016).15 Our results suggest, however, that media representatives, too, have reasons to vote for candidates of the same nationality or continent as themselves.

15 Translated from the original French.

We do not find, however, that similarity biases are likely to affect the overall outcome of the FIFA best player elections, as for those similarity dimensions for which the bias is sizeable, candidates tend to have similarity links with only a small number of jury members. For those similarity dimensions on which candidates tend to be linked with many jury members, the size of the bias is small or the variation in the number of links across candidates is small. The fact that in 2011 FIFA changed the voting rules by allowing voters to vote for candidates affiliated with their own national team is thus unlikely to have affected the election outcome in any meaningful way. The experience of the FIFA Ballon d’Or award thus provides an interesting lesson for designers of awards, elections or performance evaluation systems (and for reporters covering such systems). Having a rule against biased (national team) voting, as existed in 2010, led to some votes being invalidated, as voters had not been aware of the rule. At the same time, the impact of such national team bias on who won the award would have been negligible.
Hence, the designers of bias-proof electoral systems should be aware that there is a cost–benefit trade-off: Complicating electoral rules to avoid biases only makes sense if the advantages in terms of reducing biases outweigh the costs in terms of confusing voters. Our results suggest that even while biases can be sizeable, voting procedures can be fairly robust against such biases. If the number of voters is sufficiently large and the degree to which candidates benefit from biases is small, or the variation across candidates in the degree to which they benefit from such biases is limited, the cost of trying to avoid biases might be bigger than the benefits. In this paper, we also investigate whether jury members behave strategically by not giving points to the direct competitors of their most preferred candidate. While we find that a sizeable number of jury members vote for Messi but not Ronaldo (and vice versa), compared with ‘neutral’ voters, jury members who vote for Messi (Ronaldo) are more, rather than less, likely to vote for Ronaldo (Messi). To reduce strategic voting, or at least the suspicion that it affects the voting outcome, FIFA (and France Football, since both will offer their own award from 2016 onwards) might want to consider changing the current rank-order vote to an electoral process that is based on approval voting. The theoretical literature indeed suggests that approval voting, whereby voters can approve any number of candidates but do not rank them, is less sensitive to strategic voting (see Brams and Fishburn, 1978). It would be interesting to investigate the extent to which our findings are influenced by the fact that the votes are made public. One could speculate that because jury members know their votes will be made public, they are pushed to vote in a less biased way. This argument was indeed one of the motivations for the Professional Basketball Writers Association to make votes public in 2014.
In 2013, player LeBron James missed an MVP sweep because one voter voted for an outsider, something the Association wanted to prevent from happening again (Draper, 2014). However, when fellow player Stephen Curry did achieve such a sweep in 2016, one commentator on Reddit (Redmond24, 2016) argued exactly the opposite, that voters no longer dare to deviate and ‘risk the internet’s wrath upon themselves’. There is an emerging literature in political science and political economy that compares theoretically and empirically the results of secret and public ballots. Morton and Ou (2015), for example, using experimental data, compare the effects on voters’ electoral choices. They find that when voting is public, individuals are significantly more likely to make ethical rather than selfish choices. This suggests that, if Ballon d’Or votes had not been made public, biases in voting for the award would have been bigger than the ones we document here. Finally, FIFA and France Football could reduce controversy over the Ballon d’Or award in the future by following the example of the International Cycling Union or the International Tennis Federation. These sports organizations created a ranking system that is based not on elections but rather on a mathematical equation that aggregates various performance statistics. Similarly, FIFA and France Football could use statistics of player performance in international competitions to rank players. While such a ranking methodology would be controversial at the development stage, eventually it could become widely accepted and debates at the time of the annual award might diminish. After all, discussing statistical methodology is much less fun than discussing how strangely this or that famous person voted.

References

Alvarez R.M., Boehmke F.J. and Nagler J. (2006). “Strategic Voting In British Elections.” Electoral Studies, 25(1), pp. 1–19.
Ashraf N., Bandiera O. and Lee S.S. (2014).
“Awards unbundled: evidence from a natural field experiment.” Journal of Economic Behavior & Organization, 100, pp. 44–63.
Bailenson J.N., Iyengar S., Yee N. and Collins N.A. (2008). “Facial Similarity between Voters and Candidates Causes Influence.” Public Opinion Quarterly, 72(5), pp. 935–961.
Bebchuk L.A. and Fried J.M. (2004). “Pay without Performance, The Unfulfilled Promise of Executive Compensation, Part II: Power and Pay.” Retrieved at http://www.law.harvard.edu/faculty/bebchuk/pdfs/Performance-Part2.pdf
Berggren N., Jordahl H. and Poutvaara P. (2006). “The Looks of a Winner: Beauty, Gender and Electoral Success.” IZA Discussion Paper No. 2311. Available at SSRN: http://ssrn.com/abstract=933639
Bertrand M. and Mullainathan S. (2001). “Are CEOs Rewarded For Luck? The Ones Without Principals Are.” The Quarterly Journal of Economics, pp. 901–932.
Blanchard O.J., Lopez de Silanes F. and Shleifer A. (1994). “What do firms do with cash windfalls?” Journal of Financial Economics, vol. 36, pp. 337–360.
Brams S. and Fishburn P. (1978). “Approval Voting.” The American Political Science Review, 72(3), pp. 831–847.
Campbell P. (2015). “The strange world of Ballon d'Or voting: starring Ronaldo, Messi and Mascherano.” The Guardian, January 13 2015. Retrieved on 04/09/2016 at http://www.theguardian.com/football/blog/2015/jan/13/strange-ballon-dor-voting-cristiano-ronaldolionel-messi-javier-mascherano
Caprara G.V., Vecchione M., Barbaranelli C. and Fraley C.R. (2007). “When Likeness Goes with Liking: The Case of Political Preference.” Political Psychology, 28(5), pp. 609–632.
Cutler F. (2002). “The Simplest Shortcut of all: Sociodemographic Characteristics and Electoral Choice.” Journal of Politics, 64(2), pp. 466–90.
Combes P.P., Linnemer L. and Visser M. (2008). “Publish or peer-rich? The role of skills and networks in hiring economics professors.” Labour Economics, 15(3), pp. 423–441.
Draper K. (2014).
“Increased Transparency Has Revealed that Awards Voting is More Broken Than We Thought.” The Diss, April 23, 2014.
Lacombe R. (2016). “Ballon d’Or: Retour a la Maison.” France Foot, September 20, 2016.
Fujiwara T. (2011). “A Regression Discontinuity Test of Strategic Voting and Duverger’s Law.” Quarterly Journal of Political Science, 2011, 6: 197–233.
FIFA (2014). “Rules of Allocation.” Retrieved on 03/09/2016 at http://resources.fifa.com/mm/document/ballon-dor/playeroftheyear-men/02/46/27/12/rulesofallocation2014en_neutral.pdf
Frank R. and Cook P.J. (1995). The Winner-Take-All Society, New York: Martin Kessler Books at The Free Press, 1995.
Frey B. and Gallus J. (2015). “Towards an Economics of Awards.” Journal of Economic Surveys, doi:10.1111/joes.12127.
Gale (2015). ‘Awards, Honors & Prizes.’ 36th Edition.
Ginsburgh V. and Noury A.G. (2008). “The Eurovision song contest. Is voting political or cultural?” European Journal of Political Economy, 24 (1), pp. 41–52.
Giuliano L., Levine D.I. and Leonard J. (2011). “Racial Bias in the Manager-Employee Relationship: An Analysis of Quits, Dismissals, and Promotions at a Large Retail Firm.” Journal of Human Resources, 46 (1), pp. 26–52.
Hamermesh D.S. and Schmidt P. (2003). “The Determinants of Econometric Society Fellows Elections.” Econometrica, 71(1), pp. 399–407.
Kaptein M., Castaneda D., Fernandez N., and Nass C. (2014). “Extending the Similarity-Attraction Effect: The Effects of When-Similarity in Computer-Mediated Communication.” Journal of Computer-Mediated Communication, 19(3), pp. 342–357.
Markham S.E., Dow Scott K., and McKee G.H. (2002). “Recognizing Good Attendance: A Longitudinal, Quasi-Experimental Field Study.” Personnel Psychology, 55 (3), September 2002, pp. 639–660.
McFadden D.L. (1974). “Conditional logit analysis of qualitative choice behavior”. In Frontiers in Econometrics, ed. P. Zarembka, pp. 105–142. New York: Academic Press.
Montoya R.M., Horton R.S., Kirchner J. (2008).
“Is actual similarity necessary for attraction? A meta-analysis of actual and perceived similarity.” Journal of Social and Personal Relationships, 25(6), pp. 889–922.
Morton R.B. and Ou K. (2015). “The Secret Ballot and Prosocial Behavior.” Working paper.
Neckermann S., Cueni R. and Frey B.S. (2014). “Awards at work.” Labour Economics, 31, pp. 205–217.
Obstfeld M. and Rogoff K. (2000). “The Six Major Puzzles in International Macroeconomics: Is There a Common Cause?” NBER Working Paper, No. 7777.
Parsons C.A., Sulaeman J., Yates M.C. and Hamermesh D.S. (2011). “Strike Three: Discrimination, Incentives, and Evaluation.” American Economic Review, 101(4), pp. 1410–35.
Prendergast C. (1999). “The Provision of Incentives in Firms.” Journal of Economic Literature, 37 (1), pp. 7–63.
Redmond24 (2016). “A Question about Stephen Curry's unanimous MVP award by a non NBAfollower”. Retrieved on 03/09/2016 https://m.reddit.com/r/nba/comments/4j71gt/a_question_about_stephen_currys_unanimous_mvp/
Smith K. (2015). “How Biased Is The Ballon d’Or Voting? Bias In Ballon d’Or Voting Revealed”, Soccerladuma, January 14, 2015. Retrieved on 04/09/2016 at http://webcache.googleusercontent.com/search?q=cache:GrgAULpDYYJ:www.soccerladuma.co.za/news/articles/categories/generic/bias-in-ballon-d-or-votingrevealed/197939+&cd=1&hl=en&ct=clnk&gl=ua
Stoll M.A., Raphael S., and Holzer H.J. (2004). “Black Job Applicants And The Hiring Officer’s Race”, Industrial And Labor Relations Review, 57 (2), pp. 267–287.
Tsay C.-J. (2013). “Sight over sound in the judgment of music performance.” Proceeding of the National Academy of Sciences. www.pnas.org/cgi/doi/10.1073/pnas.1221454110.
Van Ours J. and Ginsburgh V. (2003). “Expert opinion and compensation: evidence from a musical competition.” American Economic Review, 93 (1), pp. 289–296.
Volkskrant (2011). “Per ongeluk ongeldig gestemd”. January 12, 2011.
Retrieved on 03/09/2016 http://www.volkskrant.nl/archief/per-ongeluk-ongeldig-gestemd~a1823539/
Webster S.W. and Pierce A.W. (2015). “Older, Younger, or More Similar? The Use of Age as a Voting Heuristic.” Working paper.
Zitzewitz E. (2006). “Nationalism in Winter Sports Judging and Its Lessons for Organizational Decision Making.” Journal of Economics & Management Strategy, 15 (1), pp. 67–99.

Table I: Conditional Logit Regression of being selected as best player on similarity links and candidate characteristics

                           All jury    Media       Coaches     Coaches     Captains    Captains    Captains
                           members (I) (II)        (III)       (IV)        (V)         (VI)        (VII)
National Team              9.290***    -           15.842***   24.352***   23.730***   25.209***   15.370***
                           (3.12)                  (6.99)      (12.12)     (9.24)      (9.73)      (6.49)
Nationality                5.027***    9.016***    5.034***    3.033***    -           -           -
                           (1.07)      (4.19)      (1.23)      (0.9)
Continent                  1.293***    1.479***    1.108       1.126       1.559***    1.600***    1.563***
                           (0.1)       (0.22)      (0.14)      (0.18)      (0.23)      (0.24)      (0.23)
Position                   -           -           -           0.779       -           0.591***    0.594***
                                                               (0.13)                  (0.08)      (0.09)
Younger                    -           -           -           0           -           0.757*      0.760*
                                                               (0)                     (0.12)      (0.13)
League Team                -           -           -           -           -           -           4.632***
                                                                                                   (2.17)
Competition                -           -           -           -           -           -           1.723*
                                                                                                   (0.54)
Candidate Characteristics  YES         YES         YES         YES         YES         YES         YES
R Adj sq.                  0.441       0.482       0.437       0.453       0.425       0.43        0.434
N                          46,141      15,160      15,545      10,043      15,436      15,052      14,852

Standard errors within parentheses. * p<0.10, ** p<0.05, *** p<0.01. We excluded cases where no information on some of the player characteristics was available. Votes of coaches, captains and press representatives are used. Note these regressions are framed as the probability a given candidate gets a vote (not a given jury member votes) as the unit of observation is the candidate. Numbers in the table are odds ratios based on a conditional logit regression.
Table II: Conditional Logit Regression of being selected as best player on similarity links and candidate characteristics – by year

All columns: All Jury members, Top 1

                           All years   2011        2012        2013        2014
National Team              9.290***    1.92        5.908**     20.250***   19.506***
                           (3.12)      (1.86)      (4.8)       (12.49)     (13.5)
Nationality                5.027***    7.372***    26.892***   3.726***    6.422***
                           (1.07)      (4.53)      (13.78)     (1.51)      (2.89)
Continent                  1.293***    1.07        1.381**     1.860***    2.246***
                           (0.1)       (0.21)      (0.22)      (0.33)      (0.54)
Candidate Characteristics  YES         YES         YES         YES         YES
R Adj sq.                  0.441       0.677       0.532       0.454       0.476
N                          46141       10098       11088       12443       12512

Standard errors within parentheses. * p<0.10, ** p<0.05, *** p<0.01. We excluded cases where no information on some of the player characteristics was available. Votes of coaches, captains and press representatives are used. Note these regressions are framed as the probability a given candidate gets a vote (not a given jury member votes) as the unit of observation is the candidate. Numbers in the table are odds ratios based on a conditional logit regression.

Table III: Variation in the number of similar jury members

                (1) Max links   (2) Min links   (3) Avg Marg Eff
National Team   2               0               0.098
Nationality     14              1               0.057
Continent       190             33              0.006
Position        116             21              -0.01
Younger         159             2               -0.006
League team     4               0               0.054
Competition     27              1               0.014

The number of links reflects the number of jury members being similar to a candidate on a given criterion.
The average marginal effects come from Table AI (column I for the top 3 similarity variables and column VII for the bottom 4; the maximum and minimum number of links is based on the corresponding specification).

Table IV – Actual versus predicted outcomes in case similarities are set to zero (%)

2011                   Actual    Predicted w/o links
Messi Lionel           78        79.2
Ronaldo Cristiano      7.2       6.7
Hernández Xavi         4.3       4.1

2012                   Actual    Predicted w/o links
Messi Lionel           58.7      62.9
Ronaldo Cristiano      17.4      16
Iniesta Andres         4.1       3.7

2013                   Actual    Predicted w/o links
Ronaldo Cristiano      30.9      30.6
Messi Lionel           22        24.7
Ribery Franck          30.1      29.4

2014                   Actual    Predicted w/o links
Ronaldo Cristiano      55.7      56.3
Messi Lionel           8.9       9.8
Neuer Manuel           10.2      10

The numbers in the table are the percentage of jury members who chose a given player as best player.

Table V – How are the first and second place choices related – a regression analysis

                             Messi Second                          Ronaldo Second
Ronaldo First                0.12*** (0.03)                        -
Messi First                  -                                     0.30*** (0.02)
Argentina National Team      . .                                   -0.22 (0.23)
Portuguese National Team     -0.49** (0.24)                        -2.97 (116.09)
Argentina Nationality        0.47* (0.27)                          -0.04 (0.18)
Portuguese Nationality       0.23 (0.17)                           2.92 (116.09)
European Jury Member         -0.05 (0.03)                          0.02 (0.03)
South American Jury member   -0.05 (0.06)                          0.04 (0.05)
Sample                       Excluding those voting Messi First    Excluding those voting Ronaldo First
Pseudo R²                    0.02                                  0.08
# of observations            1210                                  1460

Standard errors within parentheses. * p<0.10, ** p<0.05, *** p<0.01. We excluded cases where characteristics perfectly predict outcomes. Votes of coaches, captains and press representatives are used. Note these regressions are framed as the probability Messi/Ronaldo gets a vote. Numbers in the table are marginal effects based on a logit regression.
19 Table AIa: Estimates of marginal effects under different scenarios National Team Nationality Continent Column (I) Scenario I Scenario II Messi 0.400 0.841 Ronaldo 0.284 0.745 Xabi 0.021 0.162 Overall Average Marginal Effect 0.098 Messi 0.398 0.747 Ronaldo 0.281 0.625 Xabi 0.020 0.094 Overall Average Marginal Effect 0.057 Messi 0.397 0.455 Ronaldo 0.268 0.316 Xabi 0.019 0.025 Overall Average Marginal Effect 0.006 Column (II) Scenario I Scenario II . . . . . . 0.386 0.297 0.017 0.085 0.382 0.279 0.015 0.009 . . . Estimates are based on the conditional logit regressions of table I 20 0.826 . 0.743 . 0.133 . 0.469 . 0.353 . 0.022 . Column (III) Scenario I Scenario II 0.420 0.900 0.254 0.802 0.017 0.215 0.150 0.416 0.758 0.249 0.586 0.017 0.079 0.056 0.420 0.443 0.248 0.265 0.017 0.019 0.002 . . . . . . . . . Table AIb: Estimates of marginal effects under different scenarios Column (IV) Scenario I Scenario II National Team Nationality Continent Same Position Younger Same League team Messi Ronaldo Xabi Overall Average Marginal Effect Messi Ronaldo Xabi Overall Average Marginal Effect Messi Ronaldo Xabi Overall Average Marginal Effect 0.430 0.243 0.016 Messi Ronaldo Xabi Overall Average Marginal Effect Messi Ronaldo Xabi Overall Average Marginal Effect 0.445 0.257 0.019 0.931 0.842 0.273 Column (V) Scenario I Scenario II 0.388 0.295 0.027 0.193 0.429 0.241 0.016 0.926 0.886 0.393 Column (VI) Scenario I Scenario II 0.388 0.296 0.027 0.206 0.929 0.891 0.397 Column (VII) Scenario I Scenario II 0.385 0.299 0.027 0.210 0.887 0.836 0.292 0.148 0.672 0.459 0.046 0.032 0.432 0.236 0.016 0.458 0.256 0.018 0.002 0.386 0.275 0.024 0.488 0.362 0.037 0.385 0.273 0.024 0.010 0.391 0.217 0.015 0.382 0.277 0.025 0.011 0.418 0.323 0.031 -0.005 0.434 0.251 0.017 0.493 0.365 0.037 0.010 0.305 0.228 0.019 0.413 0.324 0.032 -0.011 0.000 0.000 0.000 0.402 0.317 0.033 -0.029 0.342 0.265 0.025 0.398 0.319 0.034 0.339 0.268 0.026 -0.006 0.385 0.299 0.027 21 0.303 0.231 0.020 -0.010 -0.006 
Messi Ronaldo Xabi 0.483 0.364 0.038 0.720 0.629 0.114 Column (IV) Scenario I Scenario II Same League Column (V) Scenario I Scenario II Overall Average Marginal Effect Messi Ronaldo Xabi Overall Average Marginal Effect Column (VI) Scenario I Scenario II Column (VII) Scenario I Scenario II 0.054 0.384 0.298 0.028 0.507 0.409 0.047 0.014

Estimates are based on the conditional logit regressions of table I.

Table AII: Estimates of marginal effects under different scenarios – year by year

National Team Messi Ronaldo Xabi Overall Average Marginal Effect Nationality Messi Ronaldo Xabi Overall Average Marginal Effect Continent Messi Ronaldo Xabi Overall Average Marginal Effect 2011 2012 2013 2014 Scenario I Scenario II Scenario I Scenario II Scenario I Scenario II Scenario I Scenario II 0.779 0.870 0.587 0.882 0.219 0.830 0.086 0.609 0.071 0.126 0.170 0.535 0.308 0.890 0.557 0.953 0.020 0.038 0.063 0.279 . . . . 0.013 0.778 0.067 0.019 0.072 0.962 0.344 0.128 0.584 0.166 0.060 0.057 0.779 0.070 0.020 0.176 0.966 0.830 0.623 0.218 0.306 . 0.197 0.790 0.074 0.021 0.583 0.155 0.057 0.001 0.496 0.614 . 0.084 0.554 . 0.056 0.656 0.200 0.077 0.009 0.190 0.212 0.269 . 0.085 0.328 0.402 . 0.022 0.354 0.878 . 0.081 0.492 . 0.162 0.674 . 0.025

Estimates are based on the conditional logit regressions of table II.

Table AIII: OLS Regression of being selected as best player on similarity links and candidate characteristics

All Jury members (I) National Team Nationality Continent Media (II) Coaches (III) Coaches (IV) Captains (V) Captains (VI) Captains (VII) 0.177*** (0.04) 0.081*** (0.02) 0.013*** (0.00) 0.222*** (0.06) 0.056*** (0.02) 0.008 (0.01) -0.016*** (0.00) -0.076*** (0.01) 0.205*** (0.05) 0.206*** (0.05) 0.187*** (0.06) 0.108*** (0.03) 0.016*** (0.01) 0.233*** (0.06) 0.079*** (0.02) 0.008 (0.01) 0.021*** (0.01) 0.022*** (0.01) -0.016*** (0.00) -0.009* (0.01) YES YES 0.201 46,141 YES YES 0.212 15,160 YES YES 0.2 15,545 YES YES 0.207 10,043 YES YES 0.196 15,436 YES YES 0.197 15,052 0.021*** (0.01) -0.015*** (0.00) -0.009 (0.01) 0.089*** (0.03) -0.004 (0.01) YES YES 0.198 14,852 Position Younger League Team Competition Candidate Characteristics Jury member Fixed Effects R Adj sq. N

Standard errors within parentheses under the coefficient estimates. * p<0.10, ** p<0.05, *** p<0.01. Votes of coaches, captains and press representatives are used. Numbers in the table are coefficients of OLS regression, with robust standard errors clustered at the jury member.

Table AIV: OLS Regression of being selected as best player on similarity links and candidate characteristics – by year

| All Jury members, Top 1 | all years | 2011 | 2012 | 2013 | 2014 |
| --- | --- | --- | --- | --- | --- |
| National Team | 0.177*** (0.04) | 0.038 (0.07) | 0.150** (0.07) | 0.289*** (0.09) | 0.212*** (0.08) |
| Nationality | 0.081*** (0.02) | 0.044 (0.03) | 0.118*** (0.03) | 0.078** (0.03) | 0.093*** (0.03) |
| Continent | 0.013*** (0.00) | 0.004 (0.01) | 0.019** (0.01) | 0.021*** (0.01) | 0.017*** (0.00) |
| Candidate Characteristics | YES | YES | YES | YES | YES |
| Jury member Fixed Effects | YES | YES | YES | YES | YES |
| Adj. R sq. | 0.201 | 0.46 | 0.276 | 0.215 | 0.308 |
| N | 46,141 | 10,098 | 11,088 | 12,443 | 12,512 |

Standard errors within parentheses. * p<0.10, ** p<0.05, *** p<0.01.
Votes of coaches, captains and press representatives are used. Numbers in the table are coefficients of OLS regression, with robust standard errors clustered at the jury member.
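For orientation, the linear probability specification behind Tables AIII and AIV can be sketched as follows. The notation is an assumption introduced here for illustration and does not appear in the tables: $y_{ijt}$ equals one if jury member $i$ selects candidate $j$ as best player in year $t$, the three indicators mark a shared national team, nationality and continent, $X_{jt}$ collects the candidate characteristics, and $\alpha_i$ is a jury member fixed effect.

```latex
y_{ijt} = \beta_1\,\mathrm{NationalTeam}_{ijt}
        + \beta_2\,\mathrm{Nationality}_{ijt}
        + \beta_3\,\mathrm{Continent}_{ijt}
        + X_{jt}'\gamma + \alpha_i + \varepsilon_{ijt}
```

Read this way, each coefficient is a change in the probability of being selected as best player. The all-years National Team estimate of 0.177, for instance, would correspond to an increase of about 17.7 percentage points when the jury member and the candidate belong to the same national team.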