MIT Sloan Sports Analytics Conference 2012
March 2-3, 2012, Boston, MA, USA

Do Two Wrongs Make a Right in NBA Officiating? An Analysis of Referee Bias in Make-Up Call Situations

Paul Gift
Pepperdine University, Graziadio School of Business and Management
Los Angeles, CA, USA, 90045
Email: [email protected]

Abstract

A make-up call is a particularly enigmatic type of potential referee bias in sports. Examples could include a wrong call to balance a prior wrong call or a questionable call to balance a prior questionable call. Motivation for a make-up call may derive from the rationalization that “two wrongs make a right,” from crowd or team pressures, or from a league’s explicit or implicit incentives. In this paper, I investigate whether NBA referees may consciously or subconsciously be affected by such factors, using play-by-play data of over 1.1 million possessions from 6,538 games played during five seasons from 2006-2011. I examine the probability of various judgment call turnovers on one team when a judgment call was recently made against the opposing team, using information on non-judgment turnovers to control for possible changes in player aggression and awareness. Findings support a make-up call hypothesis, whether the calls are intentional or not. Results do not support a hypothesis of make-up non-calls; i.e., that a referee is less likely to make a judgment call because a potentially incorrect call had recently been made on the same team. This paper sheds important light on an oft-suggested but seldom-studied type of potential behavioral bias.

1 Introduction

On Jan. 24, 2011, reporter Jon Krawczynski tweeted, “Ref Bill Spooner told Rambis he'd 'get it back' after a bad call. Then he made an even worse call on Rockets. That's NBA officiating folks.” Referee Spooner subsequently sued the Associated Press and its sportswriter, and a settlement was reached in December 2011. He claimed he told coach Rambis he would “get back” to him after reviewing the tape.
The NBA investigated and concluded that the referee had acted properly. [1] Regardless of the outcome, this story is one of numerous anecdotes and suggestions that can be found regarding the possible existence of make-up calls. In general, a make-up call can be thought of as a referee’s conscious or subconscious balancing of a wrong/questionable call on one team with a subsequent wrong/questionable call on the opposing team. In this paper, I move out of the realm of anecdotes and empirically investigate make-up call situations in the NBA using actual in-game decisions of referees for more than 6,500 games played over five seasons.

The sports world can be a useful economic research lab because there is typically very detailed and precise reporting of individual behavior and decisions in a variety of situations. It is a fitting setting to investigate referee bias, as allegations of make-up calls are quite common in a number of sports. This may involve the judgment calls of pass interference or roughing the passer in football, strikes and balls in baseball, cross-checking or two-man advantages in hockey, red/yellow cards or offside in soccer, point deductions in mixed martial arts and boxing, and offensive fouls or certain violations in basketball.

Prior research on NBA referees has examined home and losing team bias [2], racial bias [3], and omission bias [4]. Sequential judgment effects have been analyzed for penalty decisions in soccer [5], gymnastics judging [6], and ball/strike calls in baseball [4]. Some were done in an experimental lab setting and others analyzed actual judge/referee decisions.

I model and analyze make-up call situations in the NBA using in-game play-by-play data to investigate the changes in probability of certain referee judgment calls following recent judgment calls on the opposing team.
While make-up calls cannot be “proven,” my findings are consistent with a make-up call hypothesis. I also investigate the changes in probability of certain referee judgment calls following recent judgment calls on the same team. Results initially appear to support the corresponding make-up non-call hypothesis, but a more detailed analysis suggests that they are likely explained by changes in player awareness.

The NBA may have a latent history with make-up calls. [7] Shortly after the Tim Donaghy scandal in the summer of 2007, the league hired former federal prosecutor Lawrence Pedowitz to conduct an investigation of its refereeing structure. Among other things, Donaghy alleged that an incorrect call had been made in a Minnesota/New Orleans game and he was later told by another referee that they “could have made something up at the other end…calling a traveling violation on Kevin Garnett.” [8] After interviewing every referee [9] as well as other team and league personnel, Pedowitz found that there were two historical refereeing philosophies, the old and the new. Under the old philosophy, “…if a referee recognized that he or his crew had made an incorrect call, a referee might whistle a ‘make-up call’ soon thereafter.” [7] But the league changed its officiating philosophy in 2003, establishing 16 performance standards where referees were to “strive for the unattainable goal of perfection,” “get the calls right,” and make “accurate calls, regardless of the circumstances of the game.” [7] This would seem to at least discourage conscious make-up calls, but the effect on subconscious ones remains debatable.

2 Data

Play-by-play data were obtained from basketballvalue.com for five NBA regular and post seasons from 2006-07 to 2010-11, firmly within the time period of the new refereeing philosophy.
The data contain complete information for 6,538 games and over 1.2 million possessions.¹ More than 300 random spot checks were conducted and compared to online play-by-play information from nba.com (the original source), espn.com, and cbssports.com. The data were then distilled down to the possession-by-possession level.² Possessions were dropped if there were less than 24 seconds remaining in the period or less than two minutes remaining in the 4th quarter or overtime. This was done in an effort to exclude situations where the offense does not execute a typical full possession or the defense commits an intentional foul when trailing towards the end of a game. The final dataset contains over 1.1 million possessions.

¹ 18 games were missing and seven were dropped due to incomplete data.
² I use the definition of a possession that is associated with the Points Per Possession statistic. Under this definition, a team’s possession ends with a made basket or free throw, a turnover, or a defensive rebound or out of bounds to the opposing team. This differs from the NBA’s definition of a team possession, which “ends when the defensive team gains possession or there is a field goal attempt which hits the rim.” [10]

The outcome of interest for each possession is whether it ends in a turnover, and, if so, which type. The make-up call situations I examine involve offensive fouls and violations. While defensive foul calls could technically be “made up,” their impact on the game can vary greatly. The effect on a player’s foul total may be de minimis, it may put a star or bench player in foul trouble, or it may possibly add points as a shooting foul. However, every single time an offensive foul or violation is called it results in a turnover and loss of possession.
This could be easily remedied on a conscious or subconscious level by whistling an erroneous or questionable offensive foul or violation on the opposing team, thus “evening things out.”

Researchers have noted [11] that time pressure and ambiguity are two conditions under which implicit attitudes may arise. The best candidates for ambiguous calls are those involving the most judgment. I classify offensive fouls, traveling violations, and 3 second violations as judgment calls (JCs). In the sample, these occur on average 2.08, 1.28, and 0.36 times per 100 possessions, respectively. I classify 24 second violations, step out-of-bounds turnovers, bad pass turnovers, lost ball turnovers, bad pass steals, and lost ball steals as non-judgment calls (NJCs). These occur on average 0.64, 0.28, 1.58, 0.94, 4.70, and 3.09 times per 100 possessions, respectively.³ These classifications do not imply that NJCs involve a lack of judgment, just significantly less than JCs.⁴

3 Hypotheses and Method

Hypothesis 1 (Make-Up Calls): The probability of a judgment call turnover on one team will increase following a recent judgment call turnover on the opposing team.

Hypothesis 2 (Make-Up Non-Calls): The probability of a judgment call turnover on one team will decrease following a recent judgment call turnover on the same team.

The probability of a particular turnover is not solely dependent on referee behavior. It is also affected by team characteristics such as the players, their motivation to play the opposing team, and the coaching styles; game/possession characteristics such as the score differential, time remaining, home/away possession, and whether it is a playoff game; the current aggression level of the players; and the current awareness of the players.
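The sample restriction from the Data section and the JC/NJC classification above can be written down compactly. The following Python sketch is only an illustration: the turnover labels, the function names, and the period numbering (1-4 for quarters, 5 and above for overtime) are assumptions made for the example, not the coding used in the original dataset. The per-100-possession rates are the ones reported in the text.

```python
# Per-100-possession frequencies as reported in the text.
# The dictionary keys are hypothetical labels chosen for this sketch.
JUDGMENT_CALLS = {
    "offensive_foul": 2.08,
    "traveling": 1.28,
    "three_second_violation": 0.36,
}
NON_JUDGMENT_CALLS = {
    "twenty_four_second_violation": 0.64,
    "step_out_of_bounds": 0.28,
    "bad_pass_turnover": 1.58,
    "lost_ball_turnover": 0.94,
    "bad_pass_steal": 4.70,
    "lost_ball_steal": 3.09,
}


def classify(turnover):
    """Return 'JC', 'NJC', or None for a turnover label."""
    if turnover in JUDGMENT_CALLS:
        return "JC"
    if turnover in NON_JUDGMENT_CALLS:
        return "NJC"
    return None


def keep_possession(period, seconds_left_in_period):
    """Apply the sample restriction described in the Data section.

    Assumes periods 1-4 are quarters and 5+ are overtime periods.
    """
    if seconds_left_in_period < 24:
        return False  # less than 24 seconds left in any period
    if period >= 4 and seconds_left_in_period < 120:
        return False  # less than two minutes left in the 4th quarter or overtime
    return True


# Aggregate rates implied by the reported frequencies.
jc_rate = sum(JUDGMENT_CALLS.values())
njc_rate = sum(NON_JUDGMENT_CALLS.values())
possessions_per_three_second_call = 100 / JUDGMENT_CALLS["three_second_violation"]
```

Summing the reported rates gives roughly 3.7 judgment calls and 11.2 non-judgment turnovers per 100 possessions, and a 3 second violation about once every 278 possessions.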
³ These nine turnovers have the highest frequency of occurrence in the data.
⁴ Prior research [2] has classified the JCs as calls involving more referee discretion and five of the six NJCs as events involving less discretion. In addition, whether true or not, Tim Donaghy used an erroneous traveling call as an example of a make-up call situation. And former NBA player and current TV analyst Chris Webber recently stated [12], “I'd just take anything subjective out of the game, so the charge [an offensive foul] is something that there's not a definite rule on…It has to be the same thing every time.” (emphasis and brackets added) The data do not reveal the quality of individual calls.

I model the probability of a turnover in the current possession as

$$\Pr(Y) = f_Y\big(\mathit{team},\ \mathit{ref}(X, Z),\ \mathit{pag}(X, Z),\ \mathit{paw}_Y(X, Z)\big)$$

where Y is an indicator variable equal to one if there is a particular turnover in the current possession and zero otherwise, X is a vector of game/possession characteristics, Z is an indicator variable equal to one if there was a particular turnover in the opponent’s previous possession and zero otherwise, team is a vector of observable and unobservable team characteristics, ref is a referee behavior function, pag is a player aggression function, and paw is a player awareness function.

I control for team factors using a regression model with offense-defense-season fixed effects. The fixed effects account for any invariant characteristics among the two teams in a season; e.g., during possessions of the Lakers’ offense against the Warriors’ defense in 2010-11. The regression model is

$$y_{ij} = \alpha_i + z_{ij}\beta_1 + x_{ij}'\beta_2 + \varepsilon_{ij} \qquad (1)$$

for i = 1…4,350 (30×29×5) offense-defense-season combinations and j = 1…J_i possessions. The parameter of interest is β1, the marginal impact of a particular recent turnover by the opposing team on the probability of a particular turnover by the current team.
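To make the roles of Z and the fixed effects concrete, the sketch below builds the indicator z from a possession sequence and computes the within (fixed-effects) estimate of β1 for the linear version of Equation 1. It is a simplified illustration under stated assumptions: the controls X are omitted, the function names and team labels are hypothetical, the special handling of steals described in the notes to Table 1 is not implemented, and the toy outcome is noise-free rather than binary so that the estimate can be checked exactly.

```python
from collections import defaultdict


def opponent_previous_indicator(teams, turnovers, call):
    """z[j] = 1 if the preceding possession belonged to the other team
    and ended in the turnover type `call`, else 0."""
    z = [0] * len(teams)
    for j in range(1, len(teams)):
        if teams[j - 1] != teams[j] and turnovers[j - 1] == call:
            z[j] = 1
    return z


def within_slope(y, z, groups):
    """Fixed-effects (within) estimate of beta1 with a single regressor.

    Demeans y and z inside each group (here, a stand-in for one
    offense-defense-season combination), then runs OLS through the origin.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for yi, zi, g in zip(y, z, groups):
        s = sums[g]
        s[0] += yi
        s[1] += zi
        s[2] += 1
    num = 0.0
    den = 0.0
    for yi, zi, g in zip(y, z, groups):
        sum_y, sum_z, n = sums[g]
        y_dev = yi - sum_y / n
        z_dev = zi - sum_z / n
        num += z_dev * y_dev
        den += z_dev * z_dev
    return num / den


# Toy possession sequence (hypothetical labels).
teams = ["LAL", "GSW", "LAL", "GSW", "LAL"]
turnovers = ["offensive_foul", None, None, "traveling", None]
z_demo = opponent_previous_indicator(teams, turnovers, "offensive_foul")

# Toy panel: two groups with different intercepts and a common slope of 0.01.
groups = ["a"] * 4 + ["b"] * 4
z_panel = [0, 1, 0, 1, 0, 0, 1, 1]
alpha = {"a": 0.02, "b": 0.05}
y_panel = [alpha[g] + 0.01 * zi for g, zi in zip(groups, z_panel)]
beta1_hat = within_slope(y_panel, z_panel, groups)
```

Because demeaning removes each group's intercept, the estimate recovers the common slope regardless of how different the group-level turnover rates are, which is the purpose of the offense-defense-season fixed effects.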
Interactions of Z and X were tested but were mostly insignificant and did not pass the likelihood ratio test.

To identify the probable cause of a nonzero β1, a few basic assumptions are needed to disentangle changes in referee behavior from possible changes in player aggression or awareness.

Assumption 1: Referee behavior neither affects nor is affected by non-judgment calls.
Assumption 2: Recent turnovers do not affect player awareness of different turnovers.
Assumption 3: If there are no changes in player awareness from recent non-judgment calls, then there are also no changes from recent judgment calls.
Assumption 4: If there are no changes in player aggression from recent non-judgment calls, then there are also no changes from recent judgment calls.

By way of example, Assumption 1 implies that referee behavior neither affects nor is affected by 24 second violations or lost ball turnovers. Assumption 2 implies that a bad pass turnover should not make players think about traveling. Assumption 3 implies that if a 24 second violation does not make players more aware of 24 second violations, then a 3 second violation does not make players more aware of 3 second violations. Assumption 4 implies that if stepping out of bounds does not make players more aggressive with respect to charging, then traveling does not make players more aggressive with respect to charging.

4 Results

Equation 1 was estimated using a fixed effects logit (FE Logit) and a fixed effects linear probability model (FELPM). The FE Logit avoids the possible nonsense probability problem but does not allow for estimation of average partial effects. Estimates of β1 alone do not have much intuitive value for interested parties such as league and team personnel, the media, or fans. The FELPM allows for estimation of average partial effects, but may lead to some nonsense estimates if a probability is near zero or one.
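The trade-off between the two estimators can be illustrated with a small numerical sketch. It is not the paper's estimation code: it uses a plain logit index rather than the conditional fixed effects logit, and both coefficient values are hypothetical numbers chosen only to show the mechanics. The baseline rate is the 0.36 per 100 possessions reported for 3 second violations.

```python
import math


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def logistic(v):
    """Inverse of logit; always returns a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def lpm_probability(baseline, beta1, z):
    """Linear probability model: the effect is added to the probability itself."""
    return baseline + beta1 * z


def logit_probability(baseline, beta1, z):
    """Logit model: the effect is added to the log-odds, then mapped back."""
    return logistic(logit(baseline) + beta1 * z)


baseline = 0.0036  # 3 second violations: 0.36 per 100 possessions (from the text)

# Hypothetical coefficients, chosen only for illustration.
p_lpm = lpm_probability(baseline, -0.0045, 1)
p_logit = logit_probability(baseline, -3.0, 1)
```

With a rare event, a moderately negative linear coefficient pushes the fitted probability below zero, while even a large negative logit coefficient leaves it strictly between zero and the baseline. The cost is that the logit coefficient is on the log-odds scale and is harder to communicate as a change in probability.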
In what follows, I report sign and significance from the FE Logit and the average partial effect (in percent change format) from the FELPM.⁵ Sign and significance results for β1 were qualitatively similar for both models, so this distinction is largely academic.

Hypothesis 1 (Make-Up Calls) is tested by examining the average percent change in the probability of certain current turnovers (Y) associated with various turnovers in the opponent’s previous possession (Z). These results are presented in Table 1. The upper-left panel shows the probability changes of current JCs associated with various JCs in the previous possession. An offensive foul is associated with a significant increase in probability of all three JCs in the next possession. Traveling and 3 second violations have some positive, significant associations in the next possession, and all JCs have a positive, significant association with subsequent 3 second violations.⁶

⁵ I make this distinction believing the FE Logit to be the preferred statistical model but the FELPM more useful for interpreting the results.
⁶ The percent changes in probability of current NJCs associated with JCs in the previous possession are all insignificant or negative and have been excluded due to space limitations.

Having controlled for team factors and game/situational factors, these probability increases may be due to changes in referee behavior or player behavior. If changes in player aggression are affecting current turnovers, this should be apparent upon examination of the upper-right panel of Table 1. This panel shows the probability changes of current JCs associated with various NJCs in the previous possession. These NJCs should not affect referee behavior nor should they affect player awareness of JCs.
Seventeen of the 18 statistics in this area are negative or insignificant, supporting the notion that changes in player aggression are not driving the earlier results.

If changes in player awareness are affecting current turnovers, this should be apparent upon examination of the lower-right panel of Table 1. This panel shows the probability changes of current NJCs associated with the same NJC in the previous possession. These NJCs should not affect referee behavior, and changes in player aggression have been previously rejected. All statistics in this area are insignificant, supporting the notion that changes in player awareness are not driving the initial results.

Table 1: Percent Change in Probability of Current Turnovers Associated with Various Turnovers in the Opponent’s Previous Possession

Columns, Previous Possession of Opposing Team (Z): Judgment Calls (JC): Offensive Foul, Traveling, 3 Second Violation; Non-Judgment Calls (NJC): 24 Second Violation, Step Out of Bounds, Bad Pass Turnover, Lost Ball Turnover, Bad Pass Steal, Lost Ball Steal.
Rows, Current Turnover (Y): Offensive Foul, Traveling, 3 Second Violation (upper panels; NJC columns labeled Player Aggression); Same variable as Z (lower panels; NJC columns labeled Player Awareness).

Offensive Foul 21.7% ** 5.2% 11.5% 8.1% -1.0% -9.1% 11.3% -0.3% 1.6% Traveling 16.4% ** 66.5% ** 30.2% ** 49.2% ** 23.6% 6.6% 3.7% 0.3% 8.1% -8.5% * -4.8% 10.0% -19.1% -11.8% 3 Second Violation 52.3% * 53.1% ** -14.3% -5.3% 0.5% -2.2% Player Awareness Same variable as Z 21.7% ** 30.2% ** 52.3% * -25.0% 28.2% 1.1% -15.4%

Notes: Current turnovers (Y) are on the vertical axis and previous turnovers (Z) are on the horizontal axis. The second most recent possession by the opposing team is used for both steal categories since these turnovers are often immediately followed by fastbreaks. ** and * indicate significance at the 1 and 5 percent levels, respectively.

Hypothesis 2 (Make-Up Non-Calls) is tested by examining the average percent change in the probability of certain current turnovers (Y) associated with various turnovers in the previous possession of the same team (Z).
These results are presented in Table 2. Examination of the upper-left panel reveals that the probability of a current JC decreases only when the same JC was whistled in the team’s previous possession. Estimates in the upper-right panel are all weak and insignificant, suggesting that changes in player aggression are not meaningful. Results thus far are consistent with the idea that referees are less likely to whistle the same JC on the same team two possessions in a row. However, evidence from the lower-right panel strongly suggests that player awareness adjusts after a turnover, making the team much less likely to commit the same turnover again in its next possession.⁷ This holds for every single NJC and is therefore a very plausible explanation for the observed JC results. Thus, one cannot disentangle the effects of possible changes in referee behavior from changes in player awareness. The strong results in this panel do not support Hypothesis 2 and suggest that changes in player awareness explain most, if not all, of the observed JC probability declines.

⁷ Notice the two nonsense probability changes of -126.2% and -118.7%. Since 3 second violations and step out of bounds are the most infrequent of the nine JC and NJC turnovers and their probability of occurrence is sufficiently close to zero, the FELPM yields nonsense probabilities when there is a strong negative association of Z to Y. This does not pose a concern because the numeric estimates are not important in this case. What matters is the significant, negative estimate of β1, and both the FE Logit and FELPM models have this result.
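The arithmetic behind a percent change below -100% can be made explicit. The sketch below assumes that each reported percent change is the FELPM coefficient divided by the unconditional sample rate of the turnover from Section 2; the paper does not spell out the exact denominator, so the implied values are illustrative rather than a reproduction of the estimates. The function names are hypothetical.

```python
def implied_probability(baseline, percent_change):
    """Probability implied by applying a percent change to a baseline rate."""
    return baseline * (1.0 + percent_change / 100.0)


def implied_coefficient(baseline, percent_change):
    """Linear-probability coefficient implied by a reported percent change,
    assuming the percent change equals 100 * beta1 / baseline."""
    return baseline * percent_change / 100.0


# Baseline rates per possession, from the per-100-possession figures in Section 2.
three_sec = implied_probability(0.0036, -126.2)  # 3 second violation after the same call
step_out = implied_probability(0.0028, -118.7)   # step out of bounds after the same turnover
off_foul = implied_probability(0.0208, -25.8)    # offensive foul after the same call
```

Under this assumption the two rarest turnovers end up with slightly negative implied probabilities (about -0.09 and -0.05 per 100 possessions), while the more frequent offensive foul stays comfortably positive at about 1.54 per 100 possessions. A coefficient of only about -0.0045 is enough to produce the -126.2% figure, which is why the sign and significance matter more than the magnitude here.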
Table 2: Percent Change in Probability of Current Turnovers Associated with Various Turnovers in the Previous Possession of the Same Team

Previous Possession of Same Team (Z)

                      Judgment Calls (JC)                    Non-Judgment Calls (NJC)
Current Turnover (Y)  Offensive                 3 Second     24 Second    Step Out     Bad Pass     Lost Ball    Bad Pass     Lost Ball
                      Foul         Traveling    Violation    Violation    of Bounds    Turnover     Turnover     Steal        Steal
Upper panels (NJC columns: Player Aggression)
Offensive Foul        -25.8% **    -0.4%        -0.2%        7.4%         -4.9%        -0.1%        11.5%        3.0%         1.7%
Traveling             3.5%         -34.2% **    2.0%         -17.6%       -7.8%        0.7%         -4.6%        0.6%         -1.4%
3 Second Violation    -6.9%        -6.0%        -126.2% **   4.1%         -4.9%        6.0%         -13.8%       -0.9%        -9.2%
Lower panels (NJC columns: Player Awareness)
Same variable as Z    -25.8% **    -34.2% **    -126.2% **   -58.1% **    -118.7% *    -35.0% **    -10.3% **    -44.9% **    -6.0% **

Notes: Current turnovers (Y) are on the vertical axis and previous turnovers (Z) are on the horizontal axis. ** and * indicate significance at the 1 and 5 percent levels, respectively.

5 Summary and Conclusions

The mysterious make-up call is something that can evoke strong opinions and emotions from any sports fan or anyone who has ever played organized sports. In this study, I find evidence consistent with a make-up call hypothesis but not supportive of a make-up non-call hypothesis at the highest level of professional basketball (the NBA).

The most likely source for a make-up call appears to be an offensive foul. After an offensive foul on one team, the likelihood of a judgment call in the next possession of the opposing team increases by 16-66% depending on the type of call. Increases of this magnitude lend credence to the position that the make-up call balancing effect is subconscious. For example, a 3 second violation occurs approximately one time every 300 possessions.
A conscious attempt to increase scrutiny of this violation on the opposing team would likely increase the probability of this event by substantially more than 66% because of its infrequency of occurrence. On the other hand, the statistics in this paper use information on all turnover calls. Thus, they are a likely lower bound to the probability increases that would be estimated with knowledge of truly questionable calls.

Due to space limitations, a thorough examination of the time trend of subsequent probability increases was excluded from this paper. In general, I find significant and long-lasting increases in the probability of offensive foul calls and 3 second violations on one team following an offensive foul on the opposing team. These increases can remain noteworthy for up to 10-15 possessions (a team has 23 possessions in a typical quarter). Significant increases in the probability of traveling and 3 second violations on one team following a traveling violation on the opposing team tend to disappear after 3-6 possessions. All probability increases on one team following a 3 second violation on the opposing team are transitory. This appears to be consistent with a theory of subconscious make-up calls. A 3 second violation is often away from the ball, so the initial call may be less questionable, with less referee desire to subsequently balance. Traveling and offensive fouls occur mostly on or near the ball, where time and social pressures may be the greatest. In particular, when a block/charge event occurs, everyone in the arena knows that something just happened, but there might be differences of opinion as to what it was. The implicit desire to subsequently balance may be greater in these situations.

The subject of make-up calls in the NBA has important implications not only for employee (referee) behavior and training, but also for the reputation and perceived quality of a multibillion dollar business enterprise.
There are also important strategic implications for coaches and players. This paper attempts to shed a little light on this frequently-suggested but seldom-examined topic.

Acknowledgments

Excellent research assistance was provided by Matthew “MGM” Morgan. I am grateful for helpful comments and suggestions from Mike Beauregard, seminar participants at Pepperdine University, and conference participants at the Academy of Business Research.

References

[1] Associated Press, “AP and NBA Referee Reach Settlement in Lawsuit over Reporter’s Twitter Message,” Washingtonpost.com, 7 Dec. 2011, Web, 14 Dec. 2011. <http://www.washingtonpost.com/national/ap-and-nba-referee-reach-settlement-in-lawsuit-overreporters-twitter-message/2011/12/07/gIQAiXMqcO_story.html>
[2] Price et al., “Sub-Perfect Game: Profitable Biases of NBA Referees,” Working Paper, Oct. 2010.
[3] J. Price and J. Wolfers, “Racial Discrimination among NBA Referees,” The Quarterly Journal of Economics, vol. 125, no. 4, pp. 1859-1877, Nov. 2010.
[4] T. Moskowitz and L. Wertheim, Scorecasting: The Hidden Influences Behind How Sports Are Played and Games Are Won, 1st Edition, New York, NY, 2011.
[5] H. Plessner and T. Betsch, “Sequential Effects in Important Referee Decisions: The Case of Penalties in Soccer,” Journal of Sport and Exercise Psychology, vol. 23, no. 3, pp. 254-259, Sept. 2001.
[6] Damisch et al., “Olympic Medals as Fruits of Comparison? Assimilation and Contrast in Sequential Performance Judgments,” Journal of Experimental Psychology: Applied, vol. 12, no. 3, pp. 166-178, Sept. 2006.
[7] Lawrence Pedowitz, “Report to the Board of Governors of the National Basketball Association,” Wachtell, Lipton, Rosen & Katz, 2008.
[8] Tim Donaghy, Personal Foul: A First-Person Account of the Scandal that Rocked the NBA, 1st Edition, Sarasota, FL, 2009.
[9] Marc Stein, “League Won't Immediately Release Probe's Findings,” Espn.com, 29 Jul.
2008, Web, 4 Jan. 2012. <http://sports.espn.go.com/nba/news/story?id=3509550>
[10] Official Rules 2010-2011, National Basketball Association, Aug. 2010.
[11] Bertrand et al., “Implicit Discrimination,” The American Economic Review, vol. 95, no. 2, pp. 94-98, May 2005.
[12] “If I Was Commish,” Open Court, NBA TV, 21 Dec. 2011, Television.