DEPARTMENT OF ECONOMICS AND FINANCE
COLLEGE OF BUSINESS AND ECONOMICS
UNIVERSITY OF CANTERBURY
CHRISTCHURCH, NEW ZEALAND
Biases and Strategic Behavior in Performance Evaluation:
The Case of the FIFA’s Best Soccer Player Award
Tom Coupé
Olivier Gergaud
Abdul Noury
WORKING PAPER
No. 24/2016
Department of Economics and Finance
College of Business and Economics
University of Canterbury
Private Bag 4800, Christchurch
New Zealand
Tom Coupé 1†
Olivier Gergaud 2
Abdul Noury 3
27 October 2016
Abstract: In this paper, we study biases in performance evaluation by analyzing votes for the
FIFA Ballon d’Or award for best soccer player, the most prestigious award in the sport. We
find that ‘similarity’ biases are substantial, with jury members disproportionately voting for
candidates from their own country, own national team, own continent, and own league team.
Further, we show that the impact of these biases on the total number of votes a candidate
receives is fairly limited and hence is likely to affect the outcome of this competition only on
rare occasions where the difference in quality between the leading candidates is small. Finally,
analyzing the incidence of ‘strategic voting’, we find jury members who vote for one leading
candidate are more, rather than less, likely to also give points to his main competitor, as
compared with neutral jury members. We discuss the implications of our findings for the design
of awards, elections and performance evaluation systems in general, and for the FIFA Ballon
d’Or award in particular.
Keywords: Award; Bias; Voting; Soccer
JEL Classifications: D72, Z2
1
Kyiv School of Economics, UKRAINE; and the Department of Economics and Finance,
University of Canterbury, Christchurch, NEW ZEALAND
2
KEDGE Business School & LIEPP, Sciences Po, FRANCE
3
Department of Politics, New York University-Abu Dhabi, UAE
† Corresponding author is Tom Coupé.
1. Introduction
For a long time, economists have been studying incentive mechanisms ranging from monetary
compensation to fringe benefits, and, more recently, awards. For these incentive mechanisms to be
effective in making people work hard, they must reward people for performance rather than luck or
other non-performance-related outputs (see Prendergast, 1999). In practice, however, organizations
often fail to reward for performance only. For example, Bertrand and Mullainathan (2001) show that
how much chief executive officers (CEOs) are paid depends on luck, such as fluctuations in oil prices
or changes in industry performance. Similarly, Blanchard et al. (1994) find that firms use part of their
cash windfalls to increase executive compensation. One of the reasons for such ‘pay without
performance’ (Bebchuk and Fried, 2004) is that the people in charge of evaluating performance are
often biased and take into account factors other than past (or expected) performance when judging
how good performance has been (or is likely to be), or who performed best (or is likely to perform
best).
In this paper, we investigate the size and direction of biases in performance evaluation, by analyzing
the voting process for the most prestigious best player award in soccer, known as the Ballon d’Or
(Golden Ball) 1, awarded annually by the International Federation of Association Football (FIFA).
Many major sports associations use awards to reward and promote performance. FIFA and the
National Basketball Association (NBA), for example, use elections to choose the best athlete of the
year, while the Association of Tennis Professionals (ATP) and the International Cycling Union (UCI) use
a mathematical formula that gives more weight to victories in more prestigious tournaments than
those gained in lesser competitions. 2
Awards also play an important role for many international organizations, and in some cases giving
an award is a key element of their functioning (see Frey and Gallus, 2015). Awards are a popular
incentive mechanism in various fields of business, particularly where information asymmetry is
important, ranging from advertising and finance to medicine, public affairs, technology, and
transportation. According to Awards, Honors & Prizes (Gale, 2015), a primary source of information
on awards first published 35 years ago, the number of awards has been growing steadily over time,
with the 2015 edition containing references to some 20,000 awards and prizes.
Despite this increase in the importance of awards as incentive mechanisms, economists and other
social scientists have only recently begun to study the various dimensions of awards. In a workplace
setting, awards have been shown to reduce absenteeism (Markham et al., 2002), improve call center
workers’ performance (Neckermann et al., 2014) and positively affect health worker trainees’ exam
scores (Ashraf et al., 2014). Awards can improve the welfare of both the giver and the receiver. For
the awarding bodies, they can provide a way to promote their values, while awardees can use them
to reveal or signal their talent. Other economic agents who are engaged in the activities covered by
the awards can also benefit as they can bask in the reflected glory and adoration of the receiver (Frey
and Gallus, 2015). Biased decision-making serves to impair these positive impacts, resulting in a
series of undesirable effects, not only reducing incentives for current or future applicants but also
damaging the reputation of the institution organizing the award.
1
According to FIFA’s ‘rules of allocation’ (2014), this award is ‘bestowed according to on-field performance and
overall behavior on and off the pitch’.
2
In soccer, there is also a best African player award, a best South American player award and a best Asian player award.
L'Equipe, the leading French sports newspaper, and the Guardian, a major UK newspaper, also both publish their own
ranking of the top 100 best soccer players of the year.
In this paper, we focus on two kinds of biases in performance evaluation that have been identified
in the academic literature. First, we evaluate how ‘similarity’ between voters and candidates affects
evaluations. 3 A large body of literature in the social sciences documents that subjects tend to be
attracted to people who are similar to them (see, for example, Montoya et al., 2008). This also extends
to how people evaluate others’ performance. For example, Giuliano et al. (2011) document the
presence of an own-race bias in the behavior of managers of a large retail firm, while Parsons et al.
(2011) show such bias in the behavior of umpires in baseball. Zitzewitz (2006) documents an own-nationality bias in the behavior of judges of various winter sports, while Ginsburgh and Noury (2008)
show that linguistic and cultural proximities between singers and jury members are important
predictors of success in the Eurovision Song Contest.
Similarity biases have been shown to exist, too, in settings where evaluators need to predict
candidates’ future performance, such as in hiring decisions or elections. Combes, Linnemer, and
Visser (2008) find that links between jury members and candidates, such as working at the same
university or the jury member being the Ph.D. advisor of the candidate, matter for the hiring of
economics professors, while Stoll et al. (2004) observe own-race bias in the behavior of hiring agents
in a large sample of US firms. Using experiments, Bailenson et al. (2008) show that voters prefer
candidates with faces similar to their own, especially for unfamiliar candidates. Using survey data,
Caprara et al. (2007) find that voters prefer candidates whose traits they rate most similar to their
own, while Webster and Pierce (2015) show that voters are more likely to vote for candidates of a
similar age to themselves. Similarly, Cutler (2002) shows Canadian voters are more likely to support
candidates with whom they share a common gender, location or language. One possible explanation
of why similarity might matter is that similarity can validate the evaluator’s own characteristics
(Kaptein et al., 2014): in other words, if a candidate with your characteristics wins an award or an
election, your characteristics will be deemed more valuable.
Second, we evaluate the importance of ‘strategic’ behavior of evaluators. Strategic voting occurs
when the ranking of candidates by an evaluator does not correspond to the evaluator’s true
preferences. Several papers provide evidence for the importance of strategic voting. Fujiwara (2011),
for instance, uses Brazilian data to show that lower placed candidates get higher vote shares in the
first round of a two-round election (dual ballot) than when there is only a single-round election,
suggesting that voters vote for the top candidates rather than their preferred candidates in single-round
elections. Similarly, Alvarez et al. (2006) find that in the UK, voters are less likely to support a
political party that is perceived as being unlikely to win. In rank-order elections like the ones used
for the FIFA Ballon d’Or award, voters have an incentive to behave strategically. Indeed, a major
criticism of rank-order voting methods is that they are vulnerable to strategic behavior: rational
voters can increase the winning chance of their preferred candidate by ranking that candidate’s direct
competitors as low as possible.
In addition to contributing to the academic literature on performance evaluation, voting behavior and
awards, this paper further adds to the popular debate on biased evaluations in the Ballon d’Or, an
award whose reputation has been marred by scandals and allegations of biased voting. 4 As
information is made public on how each jury member voted, every year articles appear in the popular
press that list the unexpected votes of high-profile jury members, accompanied by allegations of both
strategic voting and similarity voting. For example, a Guardian blog article commented: ‘The two
frontrunners were not the only men guilty of tactical voting. The Portugal coach did not rate
[Argentine player] Messi in his top three; the Argentina coach voted for three Argentinians; the Brazil
coach picked [Brazilian player] Neymar; and the Germany manager selected three Germans. The
levels of bias and favouritism in the voting make the Eurovision Song Contest look positively
objective’ (Campbell, 2015). Similarly, a South African soccer news website observed: ‘In fact, it
was only the captains, whose compatriots and teammates weren’t amongst the nominees, that seemed
to make remotely objective votes’ (Smith, 2015).
3
Other studies have focused on biases not related to similarity. For example, for musical competitions, Van Ours and
Ginsburgh (2003) find that jury members are influenced by the order of appearance of candidates, while Tsay (2013)
documents that judges are influenced more by what they see than what they hear. Similarly, for academic awards,
Hamermesh and Schmidt (2003) find that descriptive characteristics of candidates (such as affiliation or subspecialty)
affect the voters, while in the context of political elections, Berggren et al. (2006) show that a candidate’s perceived
physical beauty matters.
4
There have been allegations of votes being changed ex-post, jury members being influenced by politicians, being
biased against specific countries, being biased in favor of some clubs and being biased against defenders.
In this paper, we contribute to this popular discussion by presenting estimates of the impact of
different biases, both on an individual candidate’s chance of being selected as best player by a given
voter, and on the overall outcome of these elections. We find clear evidence that jury members are
much more likely to vote for ‘similar’ candidates, defined here as candidates with whom they share
a national team, nationality, or continent, than for candidates with whom they share no such similarity.
We also show that the size of the bias is larger for closer ties. We document the existence of such bias
in each of the four best player elections we have analyzed, and among all types of voters: coaches,
players, and media representatives. Hence, the recent decision to limit the electorate to media
representatives only (Lacombe, 2016) is unlikely to result in an electoral process free of bias. Most
importantly, however, we find that while some of the biases we detect are sizeable, the overall impact
on election outcomes is limited. This supports FIFA’s decision in 2011 to simplify the electoral
process by allowing jury members to vote for candidates from their own national team.
As far as strategic behavior is concerned, we find some evidence that is consistent with voters
engaging in strategic voting. For example, a sizeable fraction of Messi fans avoid giving points to his
biggest rival, Ronaldo (and vice versa). However, compared with ‘neutral’ voters (who vote for
neither Messi nor Ronaldo), jury members placing Messi first are more, rather than less, likely to put
Ronaldo second. Still, replacing the current ranked order voting system with an alternative system
such as approval voting, which is arguably less vulnerable to strategic behavior, could reduce worries
about strategic biases in FIFA’s Ballon d’Or award.
The rest of this paper is organized as follows. Section 2 presents some background information on
FIFA’s Ballon d’Or award, its rules and participants, and provides data on our similarity variables.
Section 3 analyzes the degree of similarity bias in the Ballon d’Or voting. Section 4 examines the
extent to which this bias affects the outcome of the Ballon d’Or competition. Section 5 analyzes the
degree of strategic voting, and Section 6 provides conclusions.
2. Background and data
Since 2010, FIFA, together with France Football, a popular French soccer magazine, has organized a
best player of the year award, known as the Ballon d’Or (or Golden Ball), with a jury of captains and
coaches of national teams, and media representatives. 5 Each jury member has three votes, worth
respectively one, three and five points for third, second and first place, and they choose among
23 pre-selected candidates. 6 The candidate with the highest overall score receives the award.
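The scoring rule just described can be illustrated with a minimal Python sketch. Only the 5/3/1 point scheme comes from the text; the helper name `tally_points` and the three ballots are hypothetical and are not taken from the FIFA voting files.

```python
from collections import Counter

# Points for first, second and third place, as described in the text.
POINTS = (5, 3, 1)

def tally_points(ballots):
    """Sum points per candidate; each ballot is a (first, second, third) tuple."""
    totals = Counter()
    for ballot in ballots:
        for candidate, points in zip(ballot, POINTS):
            totals[candidate] += points
    return totals

# Hypothetical ballots, for illustration only (not actual votes).
ballots = [
    ("Messi", "Ronaldo", "Xabi Alonso"),
    ("Ronaldo", "Messi", "Xabi Alonso"),
    ("Messi", "Xabi Alonso", "Ronaldo"),
]
totals = tally_points(ballots)
# The candidate with the highest overall score receives the award.
winner = max(totals, key=totals.get)
```

With these made-up ballots, ranking a rival third instead of second costs that rival two points, which is the channel through which strategic ranking could matter.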
5
Prior to 2010, France Football organized its own player of the year awards, with voting conducted by journalists, while
FIFA’s equivalent award was voted for by national team coaches (and later also captains). In 2010, these two awards
were merged. We have data for 2010 but, as FIFA rules in 2010 did not allow jury members to vote for candidates from
their own national team, we excluded 2010 from our analysis.
6
We focus on the voting for the 23 candidates. It is not clear how FIFA selects these 23 candidates. We cannot exclude
the possibility that there are already biases at the initial selection stage.
From FIFA’s website we collected individual-level voting data for all jury members, for each of the
elections from 2011 to 2014. In addition, we gathered data on various characteristics of each jury
member and candidate.
For each individual, we know which country they represent from the FIFA voting files.
For coaches, using Google searches, we also collected data on nationality. While players for the
national team are by definition nationals of the country they represent, more than 40% of the coaches
manage a team of a country that is different from their country of nationality.
For captains, coaches and candidates, we used Google to search for data on their year of birth and
their position on the field: forward, midfielder, defender, or goalkeeper.
For each of the captains and each of the candidates, we further used Google to collect data on the
team for which they play as well as the country where they play, for the year relevant to the vote.
Finally, for each candidate, we collected data on their popularity using Google Trends (GT). GT is
an online search tool that enabled us to measure how often a specific player had been searched for
over a specific period of time. We used GT scores for the month preceding the election, i.e., October
of the election year, as the voting happens in November. We also collected the ‘player rating’, a widely
used performance indicator calculated by WhoScored.com (WS), which states that its ratings are ‘the
most accurate, respected and well-known performance indicators in the world of football [soccer]’.
Rating is a variable that ranges from 6 to 10 and is calculated based on a large number of raw statistics
and weighted according to their influence within the game. Although the GT scores and Rating are
correlated, they capture different dimensions of candidate quality. Whereas the GT score is a measure
of popularity, Rating measures performance regardless of how popular a player is.
Based on the data we collected, we created variables that reflect similarity 7 with respect to:
• National team (1 if jury member and candidate are affiliated with the same national team, 0
  otherwise. In about 0.6% of candidate-jury member pairs, the jury member and candidate
  represent the same country)
• Nationality (1 if jury member and candidate share nationality, 0 otherwise. In about 1.3% of
  candidate-jury member pairs, the jury member and candidate share the same nationality)
• Continent (1 if jury member and candidate share continent, 8 0 otherwise. In about 29% of
  candidate-jury member pairs, the jury member and candidate share the same continent)
• Competition (1 if jury member and candidate play in the same soccer competition, 0
  otherwise. In about 6% of candidate-jury member pairs, the jury member and candidate play
  in the same competition)
• League team (1 if jury member and candidate play for the same league team, 0 otherwise. In
  about 0.1% of candidate-jury member pairs, the jury member and candidate play for the same
  league team)
• Position on the field (1 if jury member and candidate share position, 0 otherwise. In about
  27% of candidate-jury member pairs, the jury member and candidate play the same position)
• Younger (1 if the jury member is younger than the candidate, 0 otherwise. In about 19% of
  candidate-jury member pairs, the jury member is younger than the candidate)
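As an illustration of how such indicators could be built for one candidate-jury member pair, here is a minimal Python sketch. The `Person` record, its field names and the example values are assumptions made for illustration and do not describe the authors' actual data files. The Competition and League team indicators, which are available only for captains, are omitted for brevity.

```python
from dataclasses import dataclass

@dataclass
class Person:
    # Hypothetical record layout; field names are illustrative assumptions.
    national_team: str  # country represented in the voting files
    nationality: str
    continent: str      # derived from nationality (see footnote 8)
    position: str       # forward, midfielder, defender or goalkeeper
    birth_year: int

def similarity_dummies(juror: Person, candidate: Person) -> dict:
    """Return 0/1 similarity indicators for one candidate-jury member pair."""
    return {
        "national_team": int(juror.national_team == candidate.national_team),
        "nationality": int(juror.nationality == candidate.nationality),
        "continent": int(juror.continent == candidate.continent),
        "position": int(juror.position == candidate.position),
        # A later birth year means the jury member is younger than the candidate.
        "younger": int(juror.birth_year > candidate.birth_year),
    }

# Made-up example: a foreign coach who manages the candidate's national team.
juror = Person("Country A", "Country B", "Europe", "midfielder", 1970)
candidate = Person("Country A", "Country A", "Europe", "forward", 1987)
dummies = similarity_dummies(juror, candidate)
```

The example mirrors the case of coaches who manage a team of a country other than their own: the pair shares a national team and a continent but not a nationality.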
7
We focus on interactions of given characteristics of jury members and candidates—for example, do older jury
members select older candidates? We do not consider interactions between different characteristics, such as whether or
not older jury members choose goalkeepers more often than other players. In the regressions in this paper we control for
candidate characteristics. That is, jury members could vote for better players. We do not include voter characteristics,
however, as such effects would be candidate-specific, for example, do older voters prefer Messi.
8
We use nationality to determine continent.
3. Similarity biases in performance evaluation
The data we use consist of choices made by various jury members in the FIFA Ballon d’Or
competition. Each jury member has a choice of 23 candidates and has to select one candidate as ‘best’
player. 9 This means that for each jury member, we have 23 observations, together representing a
single vote. Given that there are approximately 500 jury members and we use data from four Ballon
d’Or elections, we have around 47,000 observations in total.
In this section, we focus on the top choice made by voters. For each observation, the dependent
variable is binary: it equals 1 if the candidate is selected as the best
player, and 0 otherwise. Our set of explanatory variables describes how the jury member and the
candidate are related. Having a binary dependent variable calls for the use of non-linear regression
methods such as logit or probit to make sure that the predicted probabilities lie within the zero–one
interval. A drawback of these non-linear methods is that the marginal effects of the explanatory
variables on the probability of being selected as best player depend on the values of all explanatory
variables. As a consequence, the marginal effect of a similarity link will be different for Messi than
for Ronaldo, for instance.
It is important to realize that the 23 observations of each vote are not independent of one another.
When a jury member chooses one candidate as best player, by definition none of the other 22
candidates can then be chosen as best player. A model that captures this feature is McFadden’s
Discrete Choice model (1974) which can be estimated by the (alternative specific) conditional logit
model. Note that this further complicates the interpretation of the marginal effects: anything that
increases the chance of one candidate being selected will simultaneously decrease the chance of other
candidates being selected. For example, assigning Messi an extra (positive) similarity link with a jury
member will increase his chance of winning, but will decrease Ronaldo’s and other candidates’
winning chance.
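For reference, the choice probability in a conditional logit model takes the standard form below. The notation is an assumption introduced here for illustration and does not appear in the paper: $x_{ij}$ collects the similarity and candidate variables for jury member $i$ and candidate $j$, $\beta$ is the coefficient vector, and $\beta_s$ is the coefficient on a single similarity dummy.

```latex
\[
P(y_i = j) \;=\; \frac{\exp(x_{ij}'\beta)}{\sum_{k=1}^{23} \exp(x_{ik}'\beta)},
\qquad j = 1, \dots, 23 .
\]
% Switching a similarity dummy of candidate j from 0 to 1 multiplies
% exp(x_{ij}'beta) by exp(beta_s), so the odds of j relative to any
% other candidate k are multiplied by the odds ratio exp(beta_s):
\[
\frac{P(y_i = j)}{P(y_i = k)} \;=\; \exp\bigl((x_{ij} - x_{ik})'\beta\bigr).
\]
```

Because the probabilities sum to one over the 23 candidates, raising the numerator for one candidate necessarily lowers the probabilities of all others, which is the property discussed above.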
The discussion above suggests that using the correct statistical model comes at the cost of ease of
interpretation: The (alternative specific) conditional logit model provides us with odds ratios that have
the right properties but are not easy to grasp, while models that are easy to interpret are likely to be
less accurate. For example, given the binary nature of the dependent variable, we could estimate a
linear probability model (LPM) with robust standard errors. The advantage of doing so is that LPM
is very easy to interpret, with coefficients reflecting how the probability of being selected as best
player changes as one changes the value of a single explanatory variable. However, we know that
several of the assumptions behind LPM will be violated: First, LPM assumes observations are
independent, which in our case does not hold, as explained above. Second, LPM does not restrict
predicted values to be in the zero–one interval.
As a solution to the above problem, we provide in this paper the results of a conditional logit analysis,
present the odds ratios and specific examples to illustrate the economic significance of these odds
ratios, and in the appendix provide the results of the LPM model. Overall, both models lead to the
same conclusions, but we point out if, where, and when the results differ.
In our basic specification, we use three similarity variables: National Team, Nationality, and
Continent. As control variables, we further include alternative specific variables including age of the
candidate, his Google Trends score, his WhoScored.com (WS) rating, and his nationality (captured
by a country dummy). Ideally, we would include alternative specific dummies to capture all
characteristics of the candidates. However, this sometimes leads to non-convergence of the maximum
likelihood estimation procedure. Therefore, for comparability across specifications, we show the
results of regressions with a fixed set of candidate characteristics. The bias introduced by this is likely
to be small: for the cases where including candidate-specific dummies did lead to convergence, we
get results very similar to those when we use the set of candidate characteristics rather than the
candidate-fixed effects.
9
They also choose one candidate as second best, and another one as third best. Section 5 analyzes the vote for the
second and third place.
Column 1 of Table I presents the results of a regression using the votes from all jury members
regardless of their types. Column 2 presents the results of a regression using votes from the media
representatives only. Since these jury members are not affiliated with the national team, they can only
share the same nationality and the same continent with candidates. Column 3 presents the results of
a regression using votes from coaches. As coaches can be of a different nationality from the national
team they train, we can separate the effect of belonging to the same national team from the effect of
sharing nationality. Column 5 presents the results of a regression using the votes of captains only.
Since captains have the nationality of the national team, one cannot distinguish between the impact
of being players of the same national team and sharing nationality, and hence the coefficient of the
National Team in this regression also incorporates the effect of Nationality.
Columns 4, 6, and 7 show the results of including additional similarity variables as explanatory
variables. For coaches and captains, we have dummies that indicate whether or not jury members and
candidates share the same position on the field, and whether or not the jury member is younger than
the candidate (columns 4 and 6). In addition, for captains, we have dummies for sharing the same
league team with the candidate and playing in the same competition (league) as the candidate (column
7).
[Table I here]
Column 1 shows that the odds of being voted best player by a given jury member are about 9.3 times
higher if a player plays for the same national team as the jury member, compared with a case where
the player is not playing for the same national team as the jury member. This effect is statistically
significant but to gauge how sizeable it is we compute, for all candidates, the expected chance to be
selected as best player by a jury member under two opposing scenarios. For both scenarios, we
assume that all other candidates keep their actual sample values. For the first scenario, we assume the
candidate under consideration has no similarity link with the jury member. For the second scenario,
we assume the candidate under consideration has a similarity link to the jury member. Comparing
these two scenarios for a given candidate gives us, for that candidate, the average (across observations
for that candidate) marginal effect of moving from not being similar to being similar on a given
dimension with a given jury member. 10 In the text, we discuss these average marginal effects for three
players: Messi, Ronaldo, and Xabi Alonso. 11 We also discuss the overall average marginal effect,
which is the average of these candidate-specific ‘average’ marginal effects across all candidates. 12
10
Note that while for each individual vote this marginal effect will correspond to the odds ratio, the average of these
marginal effects (which we use here) implies an odds ratio that is similar in magnitude but not exactly the same,
because the average of a ratio is not the ratio of the averages. Alternatively, one could compute the marginal effects at
the ‘average’ values of the explanatory variables but, as this average player would be a player with several different
nationalities, it makes little sense to do this.
11
Messi and Ronaldo were the main candidates in the four elections we have studied, while Xabi Alonso is an example
of a candidate with a relatively low likelihood of winning.
12
Note we use an unweighted average and do not weigh a candidate’s (average) marginal effect by the number of
observations this candidate represents in the sample.
Tables AI and AII in the appendix give the values for these statistics, for the different regressions we
present below.
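The two-scenario comparison can be sketched in Python for a single vote. This is an illustration under stated assumptions, not the authors' estimation code: the function names are hypothetical, the example collapses the field to three alternatives, and the baseline indices are chosen only so that the first candidate starts at 40%.

```python
import math

def choice_probabilities(utilities):
    """Conditional logit probabilities from a list of linear indices x'beta."""
    top = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp(u - top) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]

def scenario_effect(utilities, j, log_odds_ratio):
    """Probability of candidate j without and with a similarity link.

    `utilities` must exclude the similarity term for candidate j; all other
    candidates keep their values in both scenarios.
    """
    without = choice_probabilities(utilities)[j]
    linked = list(utilities)
    linked[j] += log_odds_ratio  # scenario 2: add the similarity coefficient
    with_link = choice_probabilities(linked)[j]
    return without, with_link, with_link - without

# Hypothetical single vote: candidate 0 starts at 40%; odds ratio of 9.3.
baseline = [math.log(0.40), math.log(0.28), math.log(0.32)]
p_without, p_with, effect = scenario_effect(baseline, 0, math.log(9.3))
```

For this single hypothetical vote the probability rises from 0.40 to roughly 0.86. The figures reported in the text are averages of such effects across observations, so they need not coincide with a calculation at one baseline value (see footnote 10).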
Starting with the impact of the jury member and the candidate being affiliated with the same national
team, we find that under the first scenario, Messi would get 40% of the votes as best player, while
under the second scenario, he would get 84% of the votes. Hence, the marginal effect of representing
the same national team is 44 percentage points for Messi. Ronaldo would get 28% under the first
scenario, and 74% under the second scenario, giving a marginal effect of 46 percentage points. Xabi
Alonso would get 2% under the first scenario, and 16% under the second scenario: a marginal effect
of 14 percentage points. These examples show that the effect of being affiliated with the same national
team as a jury member is sizeable. Not surprisingly, the percentage points difference between the two
scenarios tends to be bigger for the candidates who, even without being similar to the jury member,
have a higher chance of getting the best player vote. The overall average marginal effect, which is
the average of the candidate-specific average marginal effects across all candidates, is 9.8 percentage points.
Further, column 1 of Table I shows that belonging to the same national team is not the only variable
that matters: candidates also benefit substantially from sharing nationality with a jury
member. The odds of being voted best player by a given jury member are about five times higher if a
candidate shares his nationality with the jury member than if the candidate and jury member are of
different nationalities. Using the scenarios described above, for Messi, not sharing nationality with a
jury member would get him 40% of the votes, compared with 75% in the case of sharing nationality.
For Ronaldo, these percentages would be 28% and 65% respectively, while for Xabi Alonso, these
percentages would be 2% and 9%. The overall average marginal effect of sharing nationality is about
6 percentage points, which is smaller than the overall average marginal effect of being affiliated with
the same national team (which we found to be about 9.8 percentage points), but is still a sizeable
effect.
Finally, the impact of a candidate sharing a continent with a jury member is also positive and significant,
though the odds ratio is fairly small. The odds of being voted best player by a given jury member are
about 29% higher if a candidate shares his continent with the jury member, compared to the case
where the candidate does not share a continent with the jury member. Using the scenarios described
above, for Messi, not sharing his continent with a jury member would get him 40% of the votes,
compared with 45% in the case of sharing a continent with the jury member. For Ronaldo, these
percentages would be 27% and 32% respectively, while for Xabi Alonso, these percentages would be
2% and 2.5%. While in relative terms these are still sizeable impacts, for candidates who have a low
chance of being selected as the best player in the absence of similarity with the jury member, the absolute
change in probability is small. The overall average marginal effect across candidates is about 0.6
percentage points.
So far, we have assumed all jury members behave in a similar way. Next, we allow for differences
between media representatives (column 2), coaches (column 3), and captains (column 5). All types
of jury members show biases, though the magnitude of specific biases varies somewhat across types
of voters. For coaches we can distinguish between the three types of similarity and confirm the
significant and sizeable impact of sharing nationality (odds ratio approximately 5) and national team
(odds ratio approximately 16), while the impact of sharing continent is again positive (odds ratio
approximately 1.1) but insignificant at the 10% significance level. For captains and media
representatives, we find significant effects of sharing continent (odds ratios of approximately 1.5),
and significant effects of sharing nationality (odds ratio of approximately 9 for media representatives)
and of sharing national team/nationality (which cannot be separated for captains and has an odds ratio
of approximately 23). Note the need to be careful when comparing odds ratios across equations as
the denominators, the probabilities under the first scenario, can be quite different across equations.
Tables AIa and AIb in the appendix provide the percentage of votes for the three above-mentioned
players under the two scenarios for the different columns of Table I. They show that in our case the
denominators (the first scenario) are fairly similar across equations, and hence that rough comparisons
of odds ratios across specifications can be informative. The estimate of the average marginal effect
of being part of the same national team varies (across columns of Table AI) between 9.8 and 19.3
percentage points, of having nationality in common between 3.2 and 8.5 percentage points, and of
sharing a continent between 0.2 and 1.1 percentage points. This suggests that the effect of similarity
is bigger if this similarity is ‘geographically’ closer, which intuitively makes sense.
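The caution about comparing odds ratios can be made concrete with a small sketch. It applies the continent odds ratio from column 1 of Table I (1.293) to two baseline probabilities quoted in the text (27% for Ronaldo and 2% for Xabi Alonso). The function name is hypothetical, and the calculation treats the baseline as a single probability rather than an average over jury members, so it illustrates the mechanics rather than reproducing the estimates in Table AI.

```python
def shift_probability(p_base: float, odds_ratio: float) -> float:
    """Probability obtained after multiplying the odds p / (1 - p) by odds_ratio."""
    odds = p_base / (1.0 - p_base)
    new_odds = odds * odds_ratio
    return new_odds / (1.0 + new_odds)


CONTINENT_ODDS_RATIO = 1.293  # Table I, column (I)

# Baseline probabilities taken from the text: Ronaldo (27%) and Xabi Alonso (2%).
for name, p_base in [("Ronaldo", 0.27), ("Xabi Alonso", 0.02)]:
    p_new = shift_probability(p_base, CONTINENT_ODDS_RATIO)
    print(f"{name}: {p_base:.3f} -> {p_new:.3f} (change {p_new - p_base:+.3f})")
```

The same odds ratio moves the high-baseline candidate by roughly five percentage points and the low-baseline candidate by about half a percentage point, which is why odds ratios should be read together with the baseline probabilities.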
Columns 4 and 6 add two more similarity indicators: one indicator that is 1 if the jury member is
younger than the candidate, and 0 otherwise, and one indicator that is 1 if the candidate and the jury
member play in the same position on the field, and 0 otherwise. We find both variables have a
negative effect on the chance of a candidate being selected as the best player, but only significantly
so when we use the votes of captains. When voted for by jury members who are captains, the odds of
being selected as best player when the jury member is younger than the candidate is approximately
75% of the odds when the jury member is older than the candidate. Similarly, the odds of being
selected as best player when the candidate plays the same position on the field as the jury member is
approximately 60% of the odds when the candidate is playing a different position than the jury
member. Hence, we find that captain jury members have a tendency both to avoid voting for
candidates who are older than they are themselves and to avoid voting for players who are playing in
the same position as they do. Table AI in the Appendix illustrates these effects by showing that Messi
would get 42% of votes of jury members who are not forwards while only 31% of votes of jury
members who are forwards (based on column 6). For Ronaldo, who is also a forward, these
percentages are 32% versus 23% respectively. Xabi Alonso, a midfielder, in contrast, would get about
3% of the votes of jury members who are not midfielders, and 2% of votes from jury members who
play as midfielders. The marginal effects for the younger dummy are similar in nature but somewhat
smaller.
Finally, column 7 adds two more similarity variables to the model that analyzes the voting behavior
of the captains. We add dummies for jury members and candidates who play in the same league team,
and for those playing in the same league. Adding these variables reduces the odds ratio of sharing the
national team/nationality and shows a sizeable advantage of playing for the same league team (odds
ratio of approximately 4.6) and playing in the same league (odds ratio of approximately 1.7). For
Messi, this implies that he will get 72% of the votes of Barcelona jury members, compared with
38.5% of non-Barcelona jury members. Similarly, while approximately 51% of jury members in the
Spanish League would vote for Messi, only approximately 38% of jury members active in other
leagues would do so. Table AI in the Appendix further shows that 11% of Real Madrid jury members
would select Xabi Alonso, compared with approximately 3% of non-Real Madrid jury members.
Similarly, while approximately 5% of jury members in the Spanish League would vote for Xabi
Alonso, only approximately 3% of jury members active in other leagues would do so.
[Table II here]
In the above regressions, we pooled the data from four best player elections. Next, we run our basic
regression using one election at a time. Table II below repeats column 1 of Table I but then splits the
sample by year in columns 2 to 5. The direction of the effects in each year is the same as in the
regression that pooled data from all years, though some coefficients are not significant in 2011. This
could be because in 2010 votes for national teammates had been forbidden and
the votes of jury members who had voted for their national team members anyway had been made
invalid (Volkskrant, 2011), which could have led to uncertainty about whether or not voting for
teammates was allowed in 2011. Over time, there is indeed an increase in the share of best player
votes going to candidates linked by national team or nationality. Moreover, 2011 had very
concentrated voting, with 78% of all first-place votes going to Messi. The other years, 2012, 2013,
2014, all give significant effects, but magnitudes of the odds ratios do vary across years. In 2011 and
2012 the odds ratio of shared nationality is bigger than the odds ratio of shared national team, while
in 2013 and 2014 the opposite is true. Table AII gives the corresponding estimates of the average
marginal effects.
Tables AIII and AIV provide the results of OLS regressions with jury member fixed effects, clustered
and robust standard errors, and the same set(s) of explanatory variables as used in Tables I and II.
Overall, the qualitative results are similar: the relative ordering of the impact of the various similarities
is typically maintained, though significance differs somewhat for the similarities in terms of age
and position, and the ordering of impact does not change across year-specific regressions. The biggest
difference is in the impact of playing in the same competition, which is significant and positive in
the conditional logit analysis but insignificant in the OLS analysis. However, given that many of the
OLS model assumptions are violated by the data, the conditional logit results are more convincing.
4. The overall impact of similarity biases
While the impact of the biases on the chance a given candidate gets selected as best player by a given
jury member is sizeable, this does not necessarily mean the biases have a sizeable impact on the
overall election outcome. Indeed, as Table III shows, the variation in the number of jury members
with which a candidate shares a given feature is fairly limited for some dimensions and is inversely
related to the extent of bias.
[Table III here]
For example, while the impact of having a shared national team on the probability a given candidate
receives a vote from a given jury member is sizeable, all candidates typically have only two potential
jury members from their national team: the captain and the coach of the national team. Given there
are several hundred jury members, this means that the overall impact on the election outcome must
be small. There is more variation across candidates in the number of jury members linked through
the continent, but as column 3 shows, the effect of being linked through continent is small.
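A back-of-the-envelope calculation makes this argument concrete. The sketch below multiplies the maximum number of linked jury members by the average marginal effect (both from Table III) and divides by the size of the jury. The jury size of 500 is an assumption made only for illustration (the text says "several hundred"), and the function name is hypothetical.

```python
# (maximum number of linked jury members, average marginal effect), from Table III.
TABLE_III = {
    "national team": (2, 0.098),
    "nationality": (14, 0.057),
    "continent": (190, 0.006),
}

ASSUMED_JURY_SIZE = 500  # hypothetical; the paper only states "several hundred"


def approx_vote_share_shift(max_links: int, avg_marginal_effect: float, jury_size: int) -> float:
    """Rough gain in first-place vote share (percentage points) from similarity links."""
    expected_extra_votes = max_links * avg_marginal_effect
    return 100.0 * expected_extra_votes / jury_size


for dimension, (max_links, ame) in TABLE_III.items():
    shift = approx_vote_share_shift(max_links, ame, ASSUMED_JURY_SIZE)
    print(f"{dimension}: about {shift:.2f} percentage points")
```

Under this assumed jury size, even the best-connected candidate gains well under half a percentage point of the first-place vote on any single dimension.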
To illustrate the overall impact of the similarities, we compare the actual percentage of votes received
by a top-three player of a given year with the predicted percentage of best player votes in a scenario
where jury members and candidates would not be linked by nationality, national team or continent.
That is, we simultaneously set all linkages for all candidates to zero in the year-specific models (the
various columns of Table II). 13
[Table IV here]
13
This is different from the player-specific marginal effects we computed earlier by varying a single similarity
dimension for a single candidate from one to zero, while keeping the values for all other candidates as they were in the
sample.
As Table IV shows, the differences between actual outcomes and predicted outcomes based on the
zero-links scenario are small, showing that similarities, while affecting individual jury members, do
not affect in any meaningful way who gets most votes as best player.
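The mechanics of the zero-links scenario can be sketched for a single jury member as follows. The coefficients are the logarithms of the odds ratios in column 1 of Table I; the three candidates, their baseline utilities and the link pattern are hypothetical values chosen only for illustration, so the output does not correspond to any number in Table IV.

```python
import math

# Log odds ratios implied by Table I, column (I).
BETA = {
    "national_team": math.log(9.290),
    "nationality": math.log(5.027),
    "continent": math.log(1.293),
}


def choice_probabilities(base_utility, links):
    """Conditional logit choice probabilities for one jury member.

    base_utility: candidate -> utility from candidate characteristics (hypothetical)
    links: candidate -> similarity dimensions shared with this jury member
    """
    scores = {}
    for candidate, utility in base_utility.items():
        bonus = sum(BETA[dim] for dim in links.get(candidate, ()))
        scores[candidate] = math.exp(utility + bonus)
    total = sum(scores.values())
    return {candidate: score / total for candidate, score in scores.items()}


base_utility = {"A": 1.0, "B": 0.5, "C": -1.0}  # hypothetical candidates
links = {"B": ("nationality", "continent")}     # hypothetical link pattern

with_links = choice_probabilities(base_utility, links)
zero_links = choice_probabilities(base_utility, {})  # all linkages set to zero

for candidate in base_utility:
    print(f"{candidate}: {with_links[candidate]:.3f} with links, "
          f"{zero_links[candidate]:.3f} without links")
```

Averaging such probabilities over all jury members, once with the observed links and once with all links set to zero, gives the kind of comparison reported in Table IV.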
This suggests that similarities could affect the election outcome only if there is very little consensus
on who is the best player. As an example, in 2014 Ronaldo was a clear winner, with 37.66% of the
overall vote count, compared with Messi who took 15.76% of the vote count. German player Manuel
Neuer was third, with 15.72% of the vote count. 14 While the difference between the number one and
two was clearly too big to be caused by biased voting, biases could potentially be sufficiently large
to affect who won second place, though in this case, as Table IV suggests, Messi’s lead would have
been bigger if there had not been any nationality, national team or continent similarity links.
5. Strategic voting
Besides biased voting, where jury members use criteria other than candidate quality to define their
preferred candidate, jury members can also behave strategically when voting. For example, jury
members can make choices that benefit their preferred candidate by not voting for that candidate’s
direct competitors. For example, supporters of Ronaldo will maximize their support if they do not
only rank him first (thus allocating him 5 points) but also do not put Lionel Messi, Ronaldo’s arch
rival, in second or third place (which receive respectively 3 points and 1 point), even if they did think
Messi was the second or third best player.
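The incentive can be illustrated with the 5-3-1 point scheme described above. In the sketch below, the two ballots are hypothetical: a sincere Ronaldo supporter who ranks Messi second, and a strategic one who leaves Messi off the ballot. The function names are illustrative only.

```python
from collections import Counter

POINTS = (5, 3, 1)  # points for first, second and third place, as described in the text


def tally(ballots):
    """Sum the points each candidate receives over a list of ranked ballots."""
    totals = Counter()
    for ballot in ballots:
        for candidate, points in zip(ballot, POINTS):
            totals[candidate] += points
    return totals


def ronaldo_lead(ballot):
    """Ronaldo's point lead over Messi generated by a single ballot."""
    totals = tally([ballot])
    return totals["Ronaldo"] - totals["Messi"]


sincere = ("Ronaldo", "Messi", "Neuer")     # hypothetical sincere ballot
strategic = ("Ronaldo", "Neuer", "Ribery")  # hypothetical ballot omitting Messi

print(ronaldo_lead(sincere), ronaldo_lead(strategic))
```

On the sincere ballot Ronaldo gains two points on Messi; by dropping Messi the same voter raises that gain to five points.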
In fact, of those who are ranking Ronaldo first, 45.93% put Messi second and another 18.81% put
Messi third—hence almost 65% of Ronaldo fans give points to Messi. Similarly, about 57.89% of
Messi fans put Ronaldo second, and another 17.11% put Ronaldo third—hence, almost 75% of Messi
fans award points to Ronaldo.
Our conclusion about strategic voting thus depends on how strictly one defines strategic voting: We
find that about 65% of Ronaldo fans do give points to Messi, so we could argue that 35% of
Ronaldo-voting jury members vote strategically by not giving points to Messi. Similarly, one could argue that
25% of Messi jury members vote strategically by not giving points to Ronaldo. However, if it is
assumed that one can rationally have different evaluations of soccer quality, then one should compare
the behavior of Messi and Ronaldo voters with the voting behavior of the quarter of jury members
who do not put either of these players in the top place—jury members whom we will call ‘neutral’
jury members. Of these neutral jury members only 53.85% put Messi in the top three and only 51.44%
put Ronaldo in the top three—suggesting, if anything, that voting for Messi goes together with voting
for Ronaldo and vice versa. That is, there are many jury members who are choosing quality above all
and thus give both contenders points. At the same time, in relative terms, Ronaldo fans are less
generous to Messi (65% voting for Messi, relative to 53.85% of ‘neutral’ jury members) than Messi
fans are to Ronaldo (75% vote for Ronaldo, versus only 51.44% of ‘neutral’ members). However,
fans of either star player are more likely to vote for the other star player than
are ‘neutral’ jury members.
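The comparison with neutral jury members amounts to the following arithmetic, using only the percentages reported above. The variable names are illustrative.

```python
# Shares (in %) reported in the text.
ronaldo_fans_to_messi = 45.93 + 18.81   # Ronaldo-first voters placing Messi second or third
messi_fans_to_ronaldo = 57.89 + 17.11   # Messi-first voters placing Ronaldo second or third
neutral_to_messi = 53.85                # neutral voters with Messi in the top three
neutral_to_ronaldo = 51.44              # neutral voters with Ronaldo in the top three

# Positive gaps mean fans are more, not less, generous to the rival than neutral voters.
gap_ronaldo_fans = ronaldo_fans_to_messi - neutral_to_messi
gap_messi_fans = messi_fans_to_ronaldo - neutral_to_ronaldo

print(f"Ronaldo fans vs neutral: {gap_ronaldo_fans:+.2f} percentage points")
print(f"Messi fans vs neutral: {gap_messi_fans:+.2f} percentage points")
```

Both gaps are positive, which is the raw pattern that the regression analysis then examines more formally.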
Table V checks this comparison more formally, using regression analysis. We first analyze the choice
for second place made by jury members who did not vote for Messi first and compare the choice of
those jury members who put Ronaldo first with the choice of those who put neither Ronaldo nor
Messi first. We then repeat a similar regression, analyzing the choice for second place of jury
members who did not vote for Ronaldo first, and compare the choice of those who put Messi first
with that of those who put neither Ronaldo nor Messi first.
14
The vote count combines first, second, and third places while Table IV focuses only on the first place vote share.
[Table V here]
As control variables, we use dummies for jury members affiliated with Messi’s (Ronaldo’s) national
team, for jury members who are Argentinian (Portuguese), and for jury members who are South
American (European). We find that Ronaldo fans are 12 percentage points more likely than neutral fans to put
Messi second, while Messi fans are 30 percentage points more likely than neutral fans to put Ronaldo second.
As far as control variables are concerned, we find, for the sample that excludes jury members who
put Messi first, that members of the Portuguese national team are less likely to put Messi second,
while Argentinian nationals are more likely to put Messi second.
6. Conclusion
This paper analyzes in detail the determinants of the vote for the most popular soccer award, the FIFA
Ballon d’Or, thereby adding to the expanding literature on the economics of awards (see Frey and
Gallus, 2015). This paper also adds to the literature on the provision of incentives by analyzing biases
in performance evaluation, and to the literature on voting behavior by analyzing biased and strategic
voting. In addition, this paper contributes to the popular debate about FIFA’s Ballon d’Or by
providing empirical estimates of the extent and origins of biased voting in the elections for this award.
We show that voting for the FIFA Ballon d’Or is indeed subject to sizeable biases, and thus provide
support to the academic literature on the importance of ‘similarity’ as a determinant of biased
performance evaluations or biased voting behavior. Our results suggest that closer ties between jury
members and candidates lead to bigger biases. Our basic specification, for example, suggests that a
candidate who is affiliated with the same national team as the jury member will, on average, be about
10 percentage points more likely to be chosen as best player by that jury member than a candidate
not affiliated with the same national team as that jury member. It also suggests that a candidate who
has the same nationality as the jury member will, on average, be about 6 percentage points more likely
to be chosen as best player by that jury member than a candidate of a different nationality from the
jury member. Sharing a continent with a jury member, in contrast, only leads to a 0.6
percentage-point difference.
Further, our results suggest that all types of voters (players, coaches and media representatives) are
affected by these biases and that such biases were present in all four election years we analyzed. In
September 2016, France Football, the French magazine that between 2010 and 2015 co-organized the
Ballon d’Or elections with FIFA, announced it would stop cooperating with FIFA and revert to its
pre-2010 independent Ballon d’Or award and the pre-2010 award procedure, which only allows
media representatives to vote. This decision was motivated by ‘the hope that the award will gain
impartiality as journalists do not have fellow team members to defend nor a dressing room to keep
happy, while certain team captains or coaches might show their friendship or their desire to keep
social peace’ (Lacombe, 2016). 15 Our results suggest, however, that media representatives, too, have
reasons to vote for candidates of the same nationality or continent as themselves.
We do not find, however, that similarity biases are likely to affect the overall outcome of the FIFA
best player elections, as for those similarity dimensions for which the bias is sizeable, candidates tend
to have similarity links with only a small number of jury members. For those similarity dimensions
on which candidates tend to be linked with many jury members, the size of the bias is small or the
variation in the number of links across candidates is small. The fact that in 2011 FIFA changed the
15
Translated from the original French.
voting rules by allowing voters to vote for candidates affiliated with their own national team is thus
unlikely to have affected the election outcome in any meaningful way.
The experience of the FIFA Ballon d’Or award thus provides an interesting lesson for designers of
awards, elections or performance evaluation systems (and for reporters covering such systems).
Having a rule against biased (national team) voting, as existed in 2010, led to some votes being
invalidated, as voters had not been aware of the rule. At the same time, the impact of such national
team bias on who won the award would have been negligible. Hence, the designers of bias-proof
electoral systems should be aware that there is a cost–benefit trade-off: Complicating electoral rules
to avoid biases only makes sense if the advantages in terms of reducing biases outweigh the costs in
terms of confusing voters. Our results suggest that even while biases can be sizeable, voting
procedures can be fairly robust against such biases. If the number of voters is sufficiently large and
the degree to which candidates benefit from biases is small, or the variation across candidates in the
degree to which they benefit from such biases is limited, the cost of trying to avoid biases might be
bigger than the benefits.
In this paper, we also investigate whether jury members behave strategically by not giving points to
the direct competitors of their most preferred candidate. While we find that a sizeable number of jury
members vote for Messi but not Ronaldo (and vice versa), compared with ‘neutral’ voters, jury
members who vote for Messi (Ronaldo) are more, rather than less, likely to vote for Ronaldo (Messi).
To reduce strategic voting, or at least the suspicion that it affects the voting outcome, FIFA (and
France Football, since both will offer their own award from 2016 onwards) might want to consider
changing the current rank-order vote to an electoral process that is based on approval voting. The
theoretical literature indeed suggests that approval voting, whereby voters can approve any number
of candidates but do not rank them, is less sensitive to strategic voting (see Brams and Fishburn,
1978).
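As a minimal sketch of how an approval-voting tally differs from the current rank-order count, the code below counts one approval per candidate per ballot, with no ranks and no point weights. The ballots and the function name are hypothetical, and the example only illustrates the counting rule, not the theoretical result on strategic voting.

```python
from collections import Counter


def approval_tally(ballots):
    """Count approvals: each ballot approves any number of candidates, unranked."""
    totals = Counter()
    for approved in ballots:
        totals.update(set(approved))  # a candidate counts at most once per ballot
    return totals


# Hypothetical ballots.
ballots = [
    {"Ronaldo", "Messi"},
    {"Messi"},
    {"Ronaldo", "Messi", "Neuer"},
]

print(approval_tally(ballots).most_common())
```

Because an approval for a rival does not reduce the support recorded for a voter's favorite in the way a lower rank does under a point scheme, a voter who rates both leading candidates highly can simply approve both.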
It would be interesting to investigate the extent to which our findings are influenced by the fact that
the votes are made public. One could speculate that because jury members know their votes will be
made public, they are pushed to vote in a less biased way. This argument was indeed one of the
motivations for the Professional Basketball Writers Association to make votes public in 2014. In
2013, player LeBron James missed an MVP sweep because one voter voted for an outsider, something
the Association wanted to prevent from happening again (Draper, 2014). However, when fellow
player Stephen Curry did realize such a sweep in 2016, one commentator on Reddit (Redmond24,
2016) argued exactly the opposite, that voters no longer dare to deviate and ‘risk the internet’s wrath
upon themselves’. There is an emerging literature in political science and political economy that
compares theoretically and empirically the results of secret and public ballots. Morton and Ou (2015),
for example, using experimental data, compare the effects on voters’ electoral choices. They find that
when voting is public, individuals are significantly more likely to make ethical rather than selfish
choices. This suggests that, if Ballon d’Or votes had not been made public, biases in voting for the
award would have been bigger than the ones we document here.
Finally, FIFA and France Football could reduce controversy over the Ballon d’Or award in the future
by following the example of the International Cycling Union or the International Tennis Federation.
These sports organizations created a ranking system that is based not on elections but rather on a
mathematical equation that aggregates various performance statistics. Similarly, FIFA and France
Football could use statistics of player performance in international competitions to rank players.
While such a ranking methodology would be controversial at the development stage, eventually it could
become widely accepted and debates at the time of the annual award might diminish. After all,
discussing statistical methodology is much less fun than discussing how strangely this or that famous
person voted.
References
Alvarez R.M., Boehmke F.J. and Nagler J. (2006). “Strategic Voting In British Elections.” Electoral
Studies, 25(1), pp. 1–19.
Ashraf N., Bandiera O. and Lee S.S. (2014). “Awards unbundled: evidence from a natural field
experiment.” Journal of Economic Behavior & Organization, 100. pp. 44–63.
Bailenson J.N., Iyengar S., Yee N. and Collins N.A., (2008) “Facial Similarity between Voters and
Candidates Causes Influence.” Public Opinion Quarterly, 72(5), pp. 935–961.
Bebchuk L.A. and Fried J.M. (2004). “Pay without Performance, The Unfulfilled Promise of
Executive Compensation, Part II: Power and Pay.” Retrieved at
http://www.law.harvard.edu/faculty/bebchuk/pdfs/Performance-Part2.pdf
Berggren N., Jordahl H. and Poutvaara P. (2006) “The Looks of a Winner: Beauty, Gender and
Electoral Success.” IZA Discussion Paper No. 2311. Available at SSRN:
http://ssrn.com/abstract=933639
Bertrand M. and Mullainathan S. (2001). “Are CEOs Rewarded For Luck? The Ones Without
Principals Are.” The Quarterly Journal of Economics, pp. 901–932.
Blanchard O.J., Lopez de Silanes F. and Shleifer A. (1994), “What do firms do with cash windfalls?”
Journal of Financial Economics, vol. 36, pp. 337–360.
Brams S. and Fishburn P. (1978). “Approval Voting.” The American Political Science Review,
72(3), pp. 831-847.
Campbell, P. (2015), “The strange world of Ballon d'Or voting: starring Ronaldo, Messi and
Mascherano.” The Guardian, January 13 2015. Retrieved on 04/09/2016 at
http://www.theguardian.com/football/blog/2015/jan/13/strange-ballon-dor-voting-cristiano-ronaldolionel-messi-javier-mascherano
Caprara G.V., Vecchione M., Barbaranelli C. and Fraley C.R. (2007) “When Likeness Goes with
Liking: The Case of Political Preference.” Political Psychology, 28(5), pp. 609–632.
Cutler F. (2002) “The Simplest Shortcut of all: Sociodemographic Characteristics and Electoral
Choice.” Journal of Politics, 64(2), pp. 466-90.
Combes P.P., Linnemer L. and Visser M. (2008). “Publish or peer-rich? The role of skills and
networks in hiring economics professors.” Labour Economics, 15(3), pp. 423–441.
Draper K. (2014). “Increased Transparency Has Revealed that Awards Voting is More Broken Than
We Thought.” The Diss, April 23, 2014.
Lacombe R. (2016). “Ballon d’Or: Retour a la Maison.” France Foot, September 20, 2016.
Fujiwara T. (2011). “A Regression Discontinuity Test of Strategic Voting and Duverger’s Law.”
Quarterly Journal of Political Science, 2011, 6: 197–233.
FIFA (2014). “Rules of Allocation.” Retrieved on 03/09/2016 at
http://resources.fifa.com/mm/document/ballon-dor/playeroftheyear-men/02/46/27/12/rulesofallocation2014en_neutral.pdf
Frank, R. and Cook P.J. (1995). The Winner-Take-All Society, New York: Martin Kessler Books at
The Free Press, 1995.
Frey B. and Gallus J. (2015). “Towards an Economics of Awards.” Journal of Economic Surveys,
doi:10.1111/joes.12127.
Gale (2015). ‘Awards, Honors & Prizes.’ 36th Edition.
Ginsburgh V. and Noury A.G. (2008). “The Eurovision song contest. Is voting political or cultural?”
European Journal of Political Economy, 24 (1), pp. 41–52.
Giuliano L., Levine D.I. and Leonard J. (2011). “Racial Bias in the Manager-Employee Relationship:
An Analysis of Quits, Dismissals, and Promotions at a Large Retail Firm.” Journal of Human
Resources, 46 (1), pp. 26–52.
Hamermesh D.S. and Schmidt P. (2003). “The Determinants of Econometric Society Fellows
Elections.” Econometrica, 71(1), pp. 399–407.
Kaptein M., Castaneda D., Fernandez N., and Nass C. (2014). “Extending the Similarity-Attraction
Effect: The Effects of When-Similarity in Computer-Mediated Communication.” Journal of
Computer-Mediated Communication, 19(3), pp. 342–357.
Markham S.E., Dow Scott K., and McKee G.H. (2002). “Recognizing Good Attendance: A
Longitudinal, Quasi-Experimental Field Study.” Personnel Psychology, 55 (3), September 2002, pp.
639–660.
McFadden D.L. (1974). “Conditional logit analysis of qualitative choice behavior”. In Frontiers in
Econometrics, ed. P. Zarembka, pp. 105–142. New York: Academic Press.
Montoya R.M., Horton R.S., Kirchner J. (2008). “Is actual similarity necessary for attraction? A
meta-analysis of actual and perceived similarity.” Journal of Social and Personal Relationships, 25(6), pp.
889–922.
Morton R.B. and Ou K. (2015). “The Secret Ballot and Prosocial Behavior.” Working paper.
Neckermann S., Cueni R. and Frey B.S. (2014). “Awards at work.” Labour Economics, 31, pp. 205–
217.
Obstfeld M. and Rogoff K. (2000). “The Six Major Puzzles in International Macroeconomics: Is
There a Common Cause?” NBER Working Paper, No. 7777.
Parsons C.A., Sulaeman J., Yates M.C. and Hamermesh D.S. (2011). “Strike Three: Discrimination,
Incentives, and Evaluation.” American Economic Review, 101(4), pp. 1410–35.
Prendergast C. (1999). “The Provision of Incentives in Firms.” Journal of Economic Literature, 37
(1), pp. 7–63.
Redmond24 (2016). “A Question about Stephen Curry's unanimous MVP award by a non NBA-follower”.
Retrieved on 03/09/2016 at
https://m.reddit.com/r/nba/comments/4j71gt/a_question_about_stephen_currys_unanimous_mvp/
Smith K. (2015). “How Biased Is The Ballon d’Or Voting? Bias In Ballon d’Or Voting Revealed”,
Soccerladuma, January 14, 2015. Retrieved on 04/09/2016 at
http://webcache.googleusercontent.com/search?q=cache:GrgAULpDYYJ:www.soccerladuma.co.za/news/articles/categories/generic/bias-in-ballon-d-or-votingrevealed/197939+&cd=1&hl=en&ct=clnk&gl=ua
Stoll M.A., Raphael S., and Holzer H.J. (2004). “Black Job Applicants And The Hiring Officer’s
Race”, Industrial And Labor Relations Review, 57 (2), pp. 267–287.
Tsay C.-J. (2013). “Sight over sound in the judgment of music performance.” Proceedings of the
National Academy of Sciences. www.pnas.org/cgi/doi/10.1073/pnas.1221454110.
Van Ours J. and Ginsburgh V. (2003). “Expert opinion and compensation: evidence from a musical
competition.” American Economic Review, 93 (1), pp. 289–296.
Volkskrant (2011). “Per ongeluk ongeldig gestemd”. January 12, 2011. Retrieved on 03/09/2016
http://www.volkskrant.nl/archief/per-ongeluk-ongeldig-gestemd~a1823539/
Webster S.W. and Pierce, A.W. (2015). “Older, Younger, or More Similar? The Use of Age as a
Voting Heuristic.” Working paper.
Zitzewitz E. (2006). “Nationalism in Winter Sports Judging and Its Lessons for Organizational
Decision Making.” Journal of Economics & Management Strategy, 15 (1), pp. 67–99.
Table I: Conditional Logit Regression of being selected as best player on similarity links and candidate characteristics
                           All Jury
                           members    Media      Coaches    Coaches    Captains   Captains   Captains
                           (I)        (II)       (III)      (IV)       (V)        (VI)       (VII)
National Team              9.290***              15.842***  24.352***  23.730***  25.209***  15.370***
                           (3.12)                (6.99)     (12.12)    (9.24)     (9.73)     (6.49)
Nationality                5.027***   9.016***   5.034***   3.033***
                           (1.07)     (4.19)     (1.23)     (0.9)
Continent                  1.293***   1.479***   1.108      1.126      1.559***   1.600***   1.563***
                           (0.1)      (0.22)     (0.14)     (0.18)     (0.23)     (0.24)     (0.23)
Position                                                    0.779                 0.591***   0.594***
                                                            (0.13)                (0.08)     (0.09)
Younger                                                     0                     0.757*     0.760*
                                                            (0)                   (0.12)     (0.13)
League Team                                                                                  4.632***
                                                                                             (2.17)
Competition                                                                                  1.723*
                                                                                             (0.54)
Candidate Characteristics  YES        YES        YES        YES        YES        YES        YES
R Adj sq.                  0.441      0.482      0.437      0.453      0.425      0.43       0.434
N                          46,141     15,160     15,545     10,043     15,436     15,052     14,852
Standard errors within parentheses. * p<0.10, ** p<0.05, *** p<0.01. We excluded cases where no information on some of the player characteristics was available. Votes of coaches, captains and press
representatives are used. Note these regressions are framed as the probability a given candidate gets a vote (not a given jury member votes) as the unit of observation is the candidate. Numbers in the table are
odds ratios based on a conditional logit regression.
Table II: Conditional Logit Regression of being selected as best player
on similarity links and candidate characteristics – by year
Sample in every column: All Jury members, Top 1
                           all years   2011        2012        2013        2014
National Team              9.290***    1.92        5.908**     20.250***   19.506***
                           (3.12)      (1.86)      (4.8)       (12.49)     (13.5)
Nationality                5.027***    7.372***    26.892***   3.726***    6.422***
                           (1.07)      (4.53)      (13.78)     (1.51)      (2.89)
Continent                  1.293***    1.07        1.381**     1.860***    2.246***
                           (0.1)       (0.21)      (0.22)      (0.33)      (0.54)
Candidate Characteristics  YES         YES         YES         YES         YES
R Adj sq.                  0.441       0.677       0.532       0.454       0.476
N                          46141       10098       11088       12443       12512
Standard errors within parentheses. * p<0.10, ** p<0.05, *** p<0.01. We excluded cases where no information on some of the player characteristics was available. Votes of coaches, captains and press
representatives are used. Note these regressions are framed as the probability a given candidate gets a vote (not a given jury member votes) as the unit of observation is the candidate. Numbers in the table are
odds ratios based on a conditional logit regression.
Table III: Variation in the number of similar jury members
                (1)         (2)         (3)
                Max links   Min links   Avg Marg Eff
National Team   2           0           0.098
Nationality     14          1           0.057
Continent       190         33          0.006
Position        116         21          -0.01
Younger         159         2           -0.006
League team     4           0           0.054
Competition     27          1           0.014
The number of links reflects the number of jury members being similar to a candidate on a given criterion. The average marginal effects come from table AI (column I for
the top 3 similarity variables and column VII for the bottom 4 – the maximum and minimum number of links is based on the corresponding specification)
Table IV – Actual versus predicted outcomes in case similarities are set to zero (%)
2011                 Actual   Predicted w/o links
Messi Lionel         78       79.2
Ronaldo Cristiano    7.2      6.7
Hernández Xavi       4.3      4.1
2012                 Actual   Predicted w/o links
Messi Lionel         58.7     62.9
Ronaldo Cristiano    17.4     16
Iniesta Andres       4.1      3.7
2013                 Actual   Predicted w/o links
Ronaldo Cristiano    30.9     30.6
Messi Lionel         22       24.7
Ribery Franck        30.1     29.4
2014                 Actual   Predicted w/o links
Ronaldo Cristiano    55.7     56.3
Messi Lionel         8.9      9.8
Neuer Manuel         10.2     10
The numbers in the table are the percentage of jury members who chose a given player as best player.
Table V – How are the first and second place choices related – a regression analysis
                             Messi Second                          Ronaldo Second
Ronaldo First                0.12***
                             (0.03)
Messi First                                                        0.30***
                                                                   (0.02)
Argentina National Team      .                                     -0.22
                             .                                     (0.23)
Portuguese National Team     -0.49**                               -2.97
                             (0.24)                                (116.09)
Argentina Nationality        0.47*                                 -0.04
                             (0.27)                                (0.18)
Portuguese Nationality       0.23                                  2.92
                             (0.17)                                (116.09)
European Jury Member         -0.05                                 0.02
                             (0.03)                                (0.03)
South American Jury member   -0.05                                 0.04
                             (0.06)                                (0.05)
Sample                       Excluding those voting Messi First    Excluding those voting Ronaldo First
Pseudo R²                    0.02                                  0.08
# of observations            1210                                  1460
Standard errors within parentheses. * p<0.10, ** p<0.05, *** p<0.01. We excluded cases where characteristics perfectly predict outcomes. Votes of coaches, captains and press representatives are used. Note these
regressions are framed as the probability Messi/Ronaldo gets a vote. Numbers in the table are marginal effects based on a logit regression.
Table AIa: Estimates of marginal effects under different scenarios
                                                Column (I)                Column (II)               Column (III)
                                                Scenario I   Scenario II  Scenario I   Scenario II  Scenario I   Scenario II
National Team  Messi                            0.400        0.841        .            .            0.420        0.900
               Ronaldo                          0.284        0.745        .            .            0.254        0.802
               Xabi                             0.021        0.162        .            .            0.017        0.215
               Overall Average Marginal Effect  0.098                     .                         0.150
Nationality    Messi                            0.398        0.747        0.386        0.826        0.416        0.758
               Ronaldo                          0.281        0.625        0.297        0.743        0.249        0.586
               Xabi                             0.020        0.094        0.017        0.133        0.017        0.079
               Overall Average Marginal Effect  0.057                     0.085                     0.056
Continent      Messi                            0.397        0.455        0.382        0.469        0.420        0.443
               Ronaldo                          0.268        0.316        0.279        0.353        0.248        0.265
               Xabi                             0.019        0.025        0.015        0.022        0.017        0.019
               Overall Average Marginal Effect  0.006                     0.009                     0.002
Estimates are based on the conditional logit regressions of table I
Table AIb: Estimates of marginal effects under different scenarios
Column (IV)
Scenario I
Scenario II
National
Team
Nationality
Continent
Same
Position
Younger
Same League
team
Messi
Ronaldo
Xabi
Overall Average
Marginal Effect
Messi
Ronaldo
Xabi
Overall Average
Marginal Effect
Messi
Ronaldo
Xabi
Overall Average
Marginal Effect
0.430
0.243
0.016
Messi
Ronaldo
Xabi
Overall Average
Marginal Effect
Messi
Ronaldo
Xabi
Overall Average
Marginal Effect
0.445
0.257
0.019
0.931
0.842
0.273
Column (V)
Scenario I
Scenario II
0.388
0.295
0.027
0.193
0.429
0.241
0.016
0.926
0.886
0.393
Column (VI)
Scenario I
Scenario II
0.388
0.296
0.027
0.206
0.929
0.891
0.397
Column (VII)
Scenario I
Scenario II
0.385
0.299
0.027
0.210
0.887
0.836
0.292
0.148
0.672
0.459
0.046
0.032
0.432
0.236
0.016
0.458
0.256
0.018
0.002
0.386
0.275
0.024
0.488
0.362
0.037
0.385
0.273
0.024
0.010
0.391
0.217
0.015
0.382
0.277
0.025
0.011
0.418
0.323
0.031
-0.005
0.434
0.251
0.017
0.493
0.365
0.037
0.010
0.305
0.228
0.019
0.413
0.324
0.032
-0.011
0.000
0.000
0.000
0.402
0.317
0.033
-0.029
0.342
0.265
0.025
0.398
0.319
0.034
0.339
0.268
0.026
-0.006
0.385
0.299
0.027
0.303
0.231
0.020
-0.010
-0.006
Messi
Ronaldo
Xabi
0.483
0.364
0.038
0.720
0.629
0.114
Column (IV)
Scenario I
Scenario II
Same League
Column (V)
Scenario I
Scenario II
Overall Average
Marginal Effect
Messi
Ronaldo
Xabi
Overall Average
Marginal Effect
Column (VI)
Scenario I
Scenario II
Column (VII)
Scenario I
Scenario II
0.054
0.384
0.298
0.028
0.507
0.409
0.047
0.014
Estimates are based on the conditional logit regressions of table I.
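To illustrate how scenario probabilities of this kind can be derived from a conditional logit, the sketch below computes choice probabilities as exp(v_j) / sum_k exp(v_k) with and without a similarity link. It assumes that Scenario I corresponds to no link and Scenario II to a link with the candidate in question, which is a reading of the table rather than something this appendix states. The candidate labels, utilities and link coefficient are hypothetical and are not the paper's estimates.

```python
import math


def choice_probabilities(utilities):
    """Conditional-logit choice probabilities: exp(v_j) / sum_k exp(v_k)."""
    m = max(utilities.values())  # subtract the max for numerical stability
    weights = {name: math.exp(v - m) for name, v in utilities.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Hypothetical baseline utilities and link coefficient (illustration only).
baseline = {"A": 1.0, "B": 0.6, "C": -2.0}
link_coef = 2.0

# Assumed Scenario I: the jury member has no link to any candidate.
scenario_1 = choice_probabilities(baseline)

# Assumed Scenario II: the jury member has a link to candidate A only.
linked = dict(baseline, A=baseline["A"] + link_coef)
scenario_2 = choice_probabilities(linked)

for name in baseline:
    print(name, round(scenario_1[name], 3), round(scenario_2[name], 3))
```

Because the probabilities sum to one across candidates, a link that raises one candidate's probability necessarily lowers the probabilities of the others.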
Table AII: Estimates of marginal effects under different scenarios – year by year
National Team Messi
Ronaldo
Xabi
Overall Average
Marginal Effect
Nationality
Messi
Ronaldo
Xabi
Overall Average
Marginal Effect
Continent
Messi
Ronaldo
Xabi
Overall Average
Marginal Effect
2011
2012
2013
2014
Scenario I Scenario II Scenario I Scenario II Scenario I Scenario II Scenario I Scenario II
0.779
0.870
0.587
0.882
0.219
0.830
0.086
0.609
0.071
0.126
0.170
0.535
0.308
0.890
0.557
0.953
0.020
0.038
0.063
0.279
.
.
.
.
0.013
0.778
0.067
0.019
0.072
0.962
0.344
0.128
0.584
0.166
0.060
0.057
0.779
0.070
0.020
0.176
0.966
0.830
0.623
0.218
0.306
.
0.197
0.790
0.074
0.021
0.583
0.155
0.057
0.001
Estimates are based on the conditional logit regressions of table II.
0.496
0.614
.
0.084
0.554
.
0.056
0.656
0.200
0.077
0.009
0.190
0.212
0.269
.
0.085
0.328
0.402
.
0.022
0.354
0.878
.
0.081
0.492
.
0.162
0.674
.
0.025
Table AIII: OLS Regression of being selected as best player
on similarity links and candidate characteristics
All Jury members
(I)
National Team
Nationality
Continent
Media
(II)
Coaches
(III)
Coaches
(IV)
Captains
(V)
Captains
(VI)
Captains
(VII)
0.177***
(0.04)
0.081***
(0.02)
0.013***
(0.00)
0.222***
(0.06)
0.056***
(0.02)
0.008
(0.01)
-0.016***
(0.00)
-0.076***
(0.01)
0.205***
(0.05)
0.206***
(0.05)
0.187***
(0.06)
0.108***
(0.03)
0.016***
(0.01)
0.233***
(0.06)
0.079***
(0.02)
0.008
(0.01)
0.021***
(0.01)
0.022***
(0.01)
-0.016***
(0.00)
-0.009*
(0.01)
YES
YES
0.201
46,141
YES
YES
0.212
15,160
YES
YES
0.2
15,545
YES
YES
0.207
10,043
YES
YES
0.196
15,436
YES
YES
0.197
15,052
0.021***
(0.01)
-0.015***
(0.00)
-0.009
(0.01)
0.089***
(0.03)
-0.004
(0.01)
YES
YES
0.198
14,852
Position
Younger
League Team
Competition
Candidate Characteristics
Jury member Fixed Effects
R Adj sq.
N
Standard errors in parentheses under the coefficient estimates. * p<0.10, ** p<0.05, *** p<0.01. Votes of coaches, captains and press representatives are used. Numbers in the table are
coefficients of an OLS regression, with robust standard errors clustered at the jury-member level.
Table AIV: OLS Regression of being selected as best player
on similarity links and candidate characteristics – by year
All columns: All Jury members, Top 1

                              All years    2011         2012         2013         2014
National Team                 0.177***     0.038        0.150**      0.289***     0.212***
                              (0.04)       (0.07)       (0.07)       (0.09)       (0.08)
Nationality                   0.081***     0.044        0.118***     0.078**      0.093***
                              (0.02)       (0.03)       (0.03)       (0.03)       (0.03)
Continent                     0.013***     0.004        0.019**      0.021***     0.017***
                              (0.00)       (0.01)       (0.01)       (0.01)       (0.00)
Candidate Characteristics     YES          YES          YES          YES          YES
Jury member Fixed Effects     YES          YES          YES          YES          YES
R Adj sq.                     0.201        0.46         0.276        0.215        0.308
N                             46,141       10,098       11,088       12,443       12,512
Standard errors in parentheses. * p<0.10, ** p<0.05, *** p<0.01. Votes of coaches, captains and press representatives are used. Numbers in the table are coefficients of an OLS regression, with robust standard
errors clustered at the jury-member level.