The Generalist Bias: Estimating the Value of Three-Point Shooting in the National Basketball Association

A Thesis Presented to The Established Interdisciplinary Committee for Mathematics and Economics, Reed College

In Partial Fulfillment of the Requirements for the Degree Bachelor of Arts

Torrey Payne
May 2014

Approved for the Committee (Mathematics and Economics): Jeffrey Parker and Albert Kim

Table of Contents
Chapter 1
    1.1 Introduction
    1.2 Literature Review
Chapter 2: Data and Models
    2.1 Data
    2.2 Models
        2.2.1 Basic Model
        2.2.2 Full Performance Model
Chapter 3: Results and Discussion
    3.1.1 Basic Performance Model: Two-point Shooting, Three-point Shooting, and Standard Performance Statistics
    3.1.2 Full Performance Model: Non-scoring Box Score Statistics & Advanced Player Statistics
    3.2 Discussion
Conclusion
Appendix
Bibliography

Abstract

This paper investigates the effect of scoring on real average NBA salaries, using observations of newly signed contracts and the previous season's performance statistics. I use both box-score statistics and advanced statistics to analyze my data. Results suggest that scoring from beyond the three-point line has a slightly larger impact on wage than two-point scoring, but these results are not strongly confirmed. The true impact of three-point shooting on salary is still unclear, but the evidence suggests it has at least a mild impact.

Chapter 1

1.1 Introduction

This thesis investigates a field of economics that perhaps doesn't receive as much attention as its sports analytics cousin: sports economics. Specifically, it investigates the value of three-point shooting in the NBA labor market. The three-point shot has only recently started to be fully utilized by basketball players around the world; its introduction has fundamentally changed the pace of the game and the spacing on offensive and defensive plays, and it has created countless highlights and moments.

Fig. 1. The layout of a standard NBA basketball court (Britannica 2013).

The basic rules of basketball address the number of players, positions, scoring, violations, and fouls.
At competitive levels, a basketball team is made up of 5 players on the court and 5 players on the bench who can be substituted in throughout the game. Each player is assigned a position on the court, which is usually determined by the height of the player. The tallest player on the team usually plays “center”, also known as position 5, while the shortest players play “guard”, or positions 1 and 2. The “forwards” are of medium height and play positions 3 and 4 (FIBA 2014). A player scores when he manages to throw, or “shoot”, the ball into the basket, with the ball passing through the hoop from above. Scoring a basket increases the team's score by 3, 2, or 1 point. If the player successfully shoots from outside the three-point line, the basket is worth 3 points; otherwise it is worth 2 points. A player can also score one point when shooting from the free-throw line after a personal foul, technical foul, or other violation (FIBA 2014).

Perhaps most interesting in the economics of professional basketball is the impact of the 3-point line. The 3-point line was first used in a professional sports league in 1961 by the American Basketball League, which folded after less than two years (Wood 2013). In the NBA, the line is 22 ft. from the rim in the corners and 23 ft. 9 in. elsewhere. For the WNBA and international play, the line is 20 ft. 6 in. In the NCAA, the line is 20 ft. 9 in. for men's basketball and 19 ft. 9 in. for women's basketball (women's and high school lines are the same).

Fig. 2. The distance of the three-point line in different basketball leagues and competitions (Condotta 2008).

The American Basketball Association, or ABA, adopted the line in 1967 as part of its experimentation with fan-friendly ideas. “‘We called it the home run, because the 3-pointer was exactly that,’ George Mikan said, ‘it brought fans out of their seats’” (Wood 2013).
The ABA and NBA merged in 1976, and in 1979 the NBA finally adopted the 3-point line. After its implementation, a whole generation of basketball coaches had to rethink their fundamental understanding of the game, since the line gave a new incentive to long-range shooting. Hubie Brown, a former ABA and NBA coach, is noted for saying in the book Loose Balls: “Don't give them the 25-footer, which is something players had been conditioned to do all their lives. [And] as a coach, if you have a shooter with range, you have to give him the freedom to take the 25-footer, which is probably a philosophy that goes against what you learned as a young coach, namely, pound the ball inside.” (Wood 2013)

Use of three-point shooting has exploded in the past decade. Ignoring the three-year period in which the NBA shortened the 3-point line (it was restored in the 1997-98 regular season), 3-pt. shooting attempts increased steadily soon after the 1997-98 season. In 1992-93, not a single team attempted more than 1,100 three-pointers. In the 2012-13 season, “each and every team attempted more than 1,100 three-pointers” (Beer 2013). The New York Knicks set all-time records for most three-point attempts and makes in a season, shooting 891 of 2,371, followed by the Houston Rockets with 867 of 2,369, 350 more than the entire league in 1979-80. Stephen Curry of the Golden State Warriors set the NBA record for most individual three-point shots made with 272, 178 more than the entire 1982-83 San Antonio Spurs, the league leader in three-point shots made for that season. Additionally, the top two seeds in each of the league's conferences (Miami Heat, New York Knicks, San Antonio Spurs, and Oklahoma City Thunder) all finished in the top five in three-point percentage last season. Two of those teams, the Miami Heat and San Antonio Spurs, met in the NBA Finals.
Of the 16 teams to qualify for the playoffs (8 in the Eastern Conference and 8 in the Western Conference), 11 were in the top half of the league in 3-pt. shots made, and the top 5 all made the playoffs. For attempts, 8 of the top 9 teams in 3-pointers attempted qualified for the playoffs. A similar story is true for 3-pt. efficiency: the top 5 all qualified for the playoffs, as did 9 of the top 15 (Basketball-Reference 2013).

1.2 Literature Review

Scoring is one of the most emphasized statistics in basketball. Berri, Brook, and Schmidt (2007) summarize the current economic literature on scoring by commenting, “points scored dominates the evaluation of player productivity in the NBA… The only factor consistently found to be correlated with player evaluation in the NBA is points scored.” Their study uses the “standard approach” in the relevant sports economics literature, following Becker (1971), by using the following model:

$Y = \alpha_0 + \alpha_1 X + \alpha_2 R + \varepsilon_i$

where $Y$ equals a decision variable such as salary, employment, or playing time, $X$ equals the measures of worker productivity (player characteristics and performance data, as well as market variables), $R$ is a dummy variable for a worker's race, and $\varepsilon_i$ is an error term. The study references a survey done by Berri in 2006 that surveyed twelve studies examining racial discrimination in the NBA, each of which employed a model similar to the above equation. In 14 of the 15 models examined in Berri's survey, points scored was found to have the expected sign and to be statistically significant. Even though efficiency in utilizing shot attempts would also be an indicator of a player's worth to a team, field goal percentage was not statistically significant in the majority of studies where it was considered. “In other words, a player who scores points can expect to receive a higher salary.
Evidence that scoring needs to be achieved via efficient shooting is not quite as clear… Given the ambiguous results uncovered with respect to everything else besides a player's points scored per game, these results suggest that a player interested in maximizing salary, draft position, employment tenure, and playing time should primarily focus upon taking as many shots as a coach allows.”

Berri, Brook, and Schmidt (2007) follow Jenkins (1996) by restricting the study of salary in professional sports to recent free agents. Researchers often regressed current salary upon current player statistics. However, NBA teams often sign players to multi-year contracts; in the 2002-2003 season, for example, 70% of players were under contracts at least three years in length. Therefore, it was argued that to determine the relationship between productivity and salary one must consider information available at the time the salary was determined. Interestingly, all these models fail to account for 3-pt. shooting. The regressions in the paper found that NBA Efficiency per game (an official NBA statistic) explains 64% of player salary, Wins Produced has explanatory power of 41%, and points scored per game, used as a sole measure, explains 59% of a player's average wage. When all the vectors of performance data were used, only scoring, rebounds, and blocked shots had a statistically significant impact on player compensation. In terms of elasticity measures, a 10% increase in points scored per game increases average salary by 7.7%. A similar increase in rebounds leads to only a 4.8% increase in compensation. In conclusion, the analysis shows that player evaluation in the NBA seems overly focused on scoring.

Berri (1999) investigates how to measure the productivity of an individual participating in basketball. Berri creates a model that links the player's statistics in the NBA to team wins. This model is then employed in the measurement of each player's marginal product.
He begins with a fixed-effects model, estimated using aggregate team data from the 1994-95 through 1997-98 seasons. In that model, the f_i are team-specific fixed effects. Using this model, Berri finds that the total points a team scores and surrenders in a season explain 95% of the variation in team wins. Such findings suggest that how many points a team scores and surrenders per game is a good approximation for team wins; hence the value of a player should simply be a function of how many points he scores and allows the opponent to score per contest.

However, Berri acknowledges that scoring is determined by various factors that can be quantified; a team's scoring should be a function of how the team acquires the ball, its efficiency of ball handling, and its ability to convert possessions into points. Berri introduces two additional equations: Y2 is virtually identical to the previous equation for wins, and Y3 represents the opponent's scoring. The equations include all the typical performance statistics. This model presents a basic theory of basketball, with the primary determinants of offense and defense laid forth and then connected to wins.

Berri uses a factor, team tempo, which was not generally accounted for in previous academic studies but is crucial in accurately measuring a player's statistical output from the philosophical view of a coaching staff. The number of shots a team takes on average per game plays a significant role in determining how many opportunities the opponent will have. A slower tempo implies the opponent will have less time of possession; given less time the opponent will, ceteris paribus, score less. Controlling for rebounds and turnovers further explains possessions and scoring opportunities. Since a team playing at a faster pace will have more opportunities, players on these teams will accumulate greater statistical totals. Weighting by tempo mitigates this bias. The results of the regressions show an interesting case for players in the 1996-97 season.
According to the paper, Dennis Rodman actually outproduced Michael Jordan due to his incredible rebounding abilities. The results also seemed to weigh win contribution accurately. Differences between actual wins and predicted wins (obtained by summing the wins contributed by each player) are relatively small, and for over five teams this difference is less than 1.

Goldman and Rao (2013) attempt to determine the right proportion of 2- and 3-point shots to take. This paper is significant because it quantifies some of the in-game impact of three-point shooting on a possession-by-possession basis. In their study, they investigate optimal two-point and three-point shot selection. As time remaining decreases, the trailing team should place an increasingly positive value on risk, and the leading team the opposite (a negative value on risk). Hence, a testable optimality condition: the 3-point success rate must fall relative to the 2-point success rate when a team's preference for risk increases. This should be true since trailing teams should be forcing more 3-pt. shots to shorten the lead. For teams with a lead, as the gap in score decreases, the team should become more risk-neutral. Their findings show that this condition holds only for the trailing team. The leading team in fact does not allocate its shots efficiently, and hence score differentials become tighter than they should be. Their paper also shows that if the offense shoots more 3's as it becomes risk-loving, this implies the attack can be varied more readily than the defensive adjustment. In their analysis, they exclude situations in which one team has less than a 5% chance of winning (“garbage time”), end-of-quarter shots, and fast-break shots, as all these situations tend to involve very different strategies than a half-court offensive set.

The study revolves around a parameter, α, defined in terms of win values: the increases in win probability from adding 2 or 3 points to the team's current score are denoted WV2 and WV3, respectively.
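The formal definition of α is not reproduced in this text. A reconstruction consistent with the surrounding description (α compared against a benchmark of 1.5, the nominal ratio of a 3-pointer to a 2-pointer) is the ratio of the two win values. This is an assumption inferred from context, not a quotation from Goldman and Rao (2013):

```latex
\alpha \;=\; \frac{WV_3}{WV_2},
\qquad
\alpha = 1.5 \ \text{(risk-neutral benchmark)},\quad
\alpha > 1.5 \ \text{(3-pointer worth more than nominal)},\quad
\alpha < 1.5 \ \text{(3-pointer worth less than nominal)}.
```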
α defines the degree to which the win value of a 3-pointer diverges from that of 1.5 2-pointers. When α > 1.5, the win value of a 3-pointer exceeds its nominal value. This occurs for the trailing team, especially late in the game. The opposite is true for the leading team, where α < 1.5; here a 3-pointer is worth less than usual since the team should be risk-averse. Using this parameter, as well as a basketball analysis concept called a “usage curve”, Goldman and Rao (2013) create and analyze an optimization problem centered on the fraction of shots attempted as 3's, with the first-order condition that the marginal returns to 2-pointers and 3-pointers should be equal. The above graph gives a representation of the maximization problem.

Lutz (2012) attempts to redefine positions on the basketball court and observe the contributions of types of basketball player through cluster analysis. This study helps to clarify the role that a three-point shooter has on a basketball team. Lutz uses data on games played, minutes played per game, percent of made field goals that are assisted, assist rate, turnover rate, offensive rebound rate, defensive rebound rate, steals per 40 minutes, blocks per 40 minutes, and the number of shots attempted per 40 minutes at each of the following locations: at the rim, from 3-9 feet, from 10-15 feet, from 16-23 feet, and beyond the 3-point line. All the variables are standardized using z-scores in order to put them on the same scale and thus give equal weight to each variable. An expectation-maximization algorithm for Gaussian mixture models is employed to do the clustering. The Mclust function is used with the Bayesian Information Criterion to determine the parameters of the model and how many clusters to use, which explains the 10 categories the paper settles on. Fisher's Linear Discriminant is utilized to place players into one of the 10 clusters.
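The standardization step in Lutz's procedure can be illustrated with a minimal sketch. The player values below are hypothetical and are not taken from Lutz's data, and the use of the population standard deviation is an assumption, since the text does not say which variant was used:

```python
from statistics import mean, pstdev


def standardize(values):
    """Return z-scores, (x - mean) / standard deviation, for a list of numbers.

    Uses the population standard deviation (an assumption for this sketch).
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(x - mu) / sigma for x in values]


# Hypothetical per-40-minute rates for four players (illustrative only).
three_point_attempts = [6.5, 1.0, 3.2, 0.1]
blocks = [0.2, 2.1, 0.5, 1.4]

z_attempts = standardize(three_point_attempts)
z_blocks = standardize(blocks)
```

After this step each variable has mean 0 and standard deviation 1, so a variable measured on a large scale (shot attempts) cannot dominate the clustering simply because of its units.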
In the investigation, Lutz finds that players in the Durable Shooters cluster “are most often members of winning organizations… 66.7% of these players are on a winning team.” A typical member of the Durable Shooters cluster is Ray Allen, who is the all-time leader in 3-pt. field goals made in NBA history (NBA). These players can be differentiated statistically from the other clusters by a high number of 3-pt. field goal attempts, above-average minutes played, above-average steal rates, low rebound and turnover rates, and many more games played than average (these are represented by the z-scores in Table 3). In second was the Combo Guard cluster, containing players who attempt more 3s than average but mainly accumulate high assist and turnover rates, high steal rates, and very high assist ratios (defined as the assist-to-turnover ratio); 62% were on winning teams. The next closest clusters, Defensive Bigs and Elite Bigs, have 55% and 52% of their players on winning teams, respectively. Every other group had percentages under 50%. The lowest were the Big Bodies and Active Bigs (41% and 38%, respectively). Active Bigs tend to shoot a lot more, be more active in rebounding, and have good rebound and block rates compared to Big Bodies, but both groups tend to miss more games and play fewer minutes than average, and both have below-average assist ratios and above-average rebound rates. Coming in last was the Ball Handlers cluster; these players find themselves on winning teams 44% of the time. When comparing the abundance of players in each cluster, Durable Shooters are scarce and Ball Handlers are quite abundant (57 Durable Shooters and 172 Ball Handlers). When looking at the p-values of the percentages on winning teams, only 3 out of 10 clusters have a p-value less than .05, and 4 out of 10 have a p-value less than .1.

Lutz also considers groups of clusters and their “interaction effect” on point differential.
The Durable Shooters combination was found the most on winning teams, with a p-value of .007, and the top four pairs are combinations with Durable Shooters. Of the top 10 3-way combinations found on winning teams, five contained a Durable Shooters cluster, and all but one contained one cluster from Durable Shooters or Combo Guards.

Michaelides (2010) stresses the importance of testing for unobserved heterogeneity in analyzing basketball compensating differences. The results from his investigation indicate that the quality of empirical results is distorted when important measures of player skills are omitted from the specifications. Michaelides uses data on all professional basketball players employed in the NBA between 1999 and 2003. The data contain on-court performance (minutes played, points, rebounds, etc.), race, age, height, place of birth, year entered the league, and draft pick number at the annual league draft. The paper uses the classical hedonic wage equation of other studies (Berri 1999), which includes all available measures of player productivity and team-specific characteristics that capture employer heterogeneity. A major contribution of his paper is the specification for firm heterogeneity. In the context of professional basketball, this would include location amenities, quality of the team's coaching staff, and team success. Michaelides obtains measures that capture coach and team quality through official reports of the NBA such as the Association of Professional Basketball Research. To account for location amenities, he obtains weather conditions from the National Climatic Data Center and the Meteorological Service of Canada. Additional specifications were included to control for the salary structures of rookie contracts and veteran contracts.

Finally, there is Wang and Murnighan's (2001) paper on generalist bias. Their paper investigates a tendency to reward and select people with general skills when complementary, specialized skills are needed.
The paper includes five studies to investigate these effects. Their second study investigated compensation of NBA players, comparing two-point scoring to three-point scoring. The study identifies three-point shooters as specialists because they have special skills and are typically not as individually productive as players who are generalists with a wider variety of skills. Three-point shooters' long-range shooting abilities allow their contributions to complement those of their teammates, who may be more skilled at scoring overall. According to their study, two-point shooting accounted for 82.3% of team scoring in 2005 and 81.6% in 2006. On average, 21% of the players made 76% of their team's three-point shots in 2005, and 23% of the players made 75% of these shots in 2006. This is all evidence that three-point shooting is a specialized skill, that most players on most teams focus on two-point shooting, and that teams may evaluate their players more on the basis of their two-point scoring than their three-point scoring. The study identified three-point shooters as guards whose three-point scoring represented more than 20% of their overall scoring. Using a subset of 35 players from the NBA player pool who met their definition of a three-point shooter, they found that three-point scoring was statistically insignificant and two-point scoring was significant with p < 0.01. Among two-point shooters, only two-point scoring was significant, with p < 0.001. Their results suggested the bias is restricted to true specialists.

Chapter 2: Data and Models

2.1 Data

To examine the market value of various NBA performance statistics, data are combined from numerous sources. The core performance data, which include season totals of the main NBA performance measures, such as points, rebounds, games played, and minutes, come from BasketballReference.com, a subsidiary of SportsReference.com.
The salary data, which include contract length, contract amount, and final year of contract, were collected from University of Michigan professor Rodney Fort. Each year's data were collected from news sources such as USAToday or basketball websites like Hoopsdata.com. The data contain relevant seasonal information on NBA players who signed with an NBA franchise during the NBA free-agency/re-signing periods from the 2002-2003 NBA regular season and 2004-05 through 2007-08 seasons. This would include restricted free agents and unrestricted free agents, and would exclude rookies. We only included players who played at least 12 minutes per game, since playing fewer than 12 minutes a game may lead to skewed performance statistics. The year 2004-05 is excluded from the models due to restrictions on the availability of the data. The data do not contain observations of players who were released and signed with another team in the same season, as this created an issue with matching a particular set of performance statistics to a certain team and salary. Hence, we only have the most recent year's performance information linked to the contract that was signed after it. Some players may appear multiple times in the data in different years, but not in the same year.

The salary measures are calculated in 2008 U.S. dollars; earlier years are inflated according to the CPI. Salary is calculated by dividing the total contract amount by contract length. This method avoids the unbalanced contract structure that many players agree to in their negotiations; players may have a back-weighted or front-weighted salary structure that may otherwise appear to be correlated with year-to-year performance. The models used in my investigation take many core elements from Berri (2007); the models will include additional independent variables and a longer timespan. The dependent variable in my models is real (logged) average salary.
This is due to the fact that the NBA functions on a season-by-season basis, and different contract lengths may be determined by the age and durability of a player. Average salary ideally should represent some portion of the expected marginal product that the player contributes to a team in a particular upcoming season, regardless of whether the player is expected to do this for multiple seasons or one. For this reason, age may not have as strong an effect on salary as on contract length or structure; older players may simply sign shorter contracts but still get paid according to their expected marginal labor product based on the previous season's performance.

The independent variables used in the data fall into roughly two categories: non-performance variables and performance variables. Non-performance variables include contract year, team, and position. The year variable controls for year-by-year changes in the free agent market, whether from changes in the collective bargaining agreement, the salary cap ceiling, or the scarcity of talented players in the free agent market. The team categorical variable controls for any relevant factors that may be due to a specific market location. The year variable indicates the regular season immediately before the (year)-(year+1) contract was signed (2005 would indicate player performance in the 2005-06 NBA regular season and a contract for the 2006-07 regular season and/or further seasons).

Table 1: Summary Statistics

Variable        Obs   Mean        Std. Dev.   Min         Max
year            248   2002.665    1.888835    2000        2007
g               248   62.1129     21.2539     2           82
age             248   27.58871    4.073241    18          38
per             248   13.65565    4.018597    1.6         27
ts              248   0.5170887   0.0549687   0.326       0.85
efg             248   0.4745685   0.0569163   0.28        0.8
ftr             248   0.3225605   0.1590199   0           1.1
par             248   0.1843992   0.1875033   0           0.735
orb             248   5.880645    4.10655     0           17.2
drb             248   13.95847    5.655695    3.1         33
trb             248   9.918548    4.596802    3.2         22.5
ast             248   14.28427    9.70982     0           49.3
stl             248   1.672581    0.7321245   0           5.3
tov             248   14.34476    4.08967     3.6         42.9
usg             248   18.08145    4.694011    6           32.9
ortg            248   104.5323    9.066944    74          129
drtg            248   104.9194    4.337198    94          115
ws48            248   0.0915726   0.0545501   -0.115      0.248
contractyrs     248   3.028226    1.937069    1           7
contractamt     248   2.16E+07    3.01E+07    231625.7    1.50E+08
mpg             248   24.21865    8.047507    12          42.5
ppg             248   8.993101    5.131943    1.405063    27.58537
rpg             248   4.163952    2.47007     0.7058824   14.14286
apg             248   2.166794    1.800436    0           10.5122
spg             248   0.7736303   0.450692    0           2.885246
bpg             248   0.4952926   0.6187575   0           3.317073
topg            248   1.382882    0.7126767   0.2         3.7
fpg             248   2.226145    0.6709745   0.625       4.012821
fgmpg           248   3.34372     1.872388    0.5189874   11.22857
fgapg           248   7.581034    4.10485     1.253165    23.65854
ftpct           245   0.7421754   0.1013789   0.4328358   1
fgpct           248   0.4422836   0.0584132   0.28        0.7
tpmpg           248   0.5015793   0.5973277   0           2.756098
tpapg           248   1.427805    1.581141    0           6.804878
threeptpct      210   0.2743201   0.1482258   0           0.6666667
salary          248   4744826     4896657     231625.7    2.20E+07
lsal            248   14.86526    1.032914    12.35288    16.90655
ftmpg           248   1.804082    1.372363    0           6.95122
twoptscoring    248   5.68428     3.545835    1.037975    22.4
threeptscoring  248   1.504738    1.791983    0           8.268292

Notes: ts: True Shooting percentage; g: Games Played; per: Player Efficiency Rating (Hollinger); efg: Effective Field-Goal Percentage; ftr: Free-Throw Rating; par: 3-pt. Shooting Rating; ast-tov: rates for analogous performance statistics, based on overall rates by team; ws: Win Shares; ppg-drpg: per-game performance statistics; lsal: logged salary; tpmpg: three-pointers made per game; tp12, tp23: indicator variables for amount of 3-pt. shots made per game; ftpct: free-throw percentage.
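The construction of the dependent variable described in Section 2.1 (total contract amount divided by contract length, inflated to 2008 dollars, then logged) can be sketched as follows. The contract and CPI figures are hypothetical placeholders rather than values from the data set, and the function name is introduced only for illustration. The use of the natural logarithm is consistent with Table 1, where the minimum of lsal (12.35288) matches the natural log of the minimum salary (231625.7):

```python
import math


def real_average_salary(contract_amount, contract_years, cpi_signing_year, cpi_base_year):
    """Average annual salary expressed in base-year dollars.

    contract_amount / contract_years gives the nominal average salary;
    multiplying by cpi_base_year / cpi_signing_year inflates it to base-year dollars.
    """
    average_nominal = contract_amount / contract_years
    return average_nominal * (cpi_base_year / cpi_signing_year)


# Hypothetical contract: $30M over 5 years, with made-up CPI index values.
salary = real_average_salary(
    contract_amount=30_000_000,
    contract_years=5,
    cpi_signing_year=200.0,  # placeholder index value, not an actual CPI figure
    cpi_base_year=215.0,     # placeholder index value for 2008
)
lsal = math.log(salary)  # natural log, matching the lsal variable

# Consistency check against Table 1: log of the minimum salary.
log_min_salary = math.log(231625.7)
```

Averaging over the contract length means a back-loaded and a front-loaded contract of the same total value and duration produce the same salary observation.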
The performance variables include all major performance statistics used by the NBA in its box scores, such as points per game, rebounds per game, and minutes per game, as well as some advanced statistics either utilized by the NBA or constructed by sports statisticians and economists. These statistics include Win Shares, Win Shares per 48 minutes, Hollinger PER, True Shooting percentage, and three-point shooting rate. Per-game rates are used instead of season totals because they are generally recognized as better indicators of player performance; the 82-game season is long enough that even generally healthy players may miss a few games. The WS48 (Win Shares per 48 minutes) and PER independent variables are significant additions to the data, since these are considered to be statistics that evaluate overall player contribution. The three-point shooting rate is also significant, since it captures how heavily a player depends on the three-point line for his field goals.

Twoptscoring and threeptscoring are two variables constructed specifically for the models. Twoptscoring represents the average number of points scored through two-point field goals per game. It was constructed using the following equation:

twoptscoring = 2*(fgm - tpm)/g

where fgm is the total number of field goals made in a season, and tpm is the total number of three-point field goals made in a season. Multiplying by 2 gives the point value of the shots, and dividing by g gives a per-game rate. Analogously, I constructed a threeptscoring variable:

threeptscoring = 3*tpm/g

These two variables, together with ftmpg, the number of free throws made per game, account for each player's total points per game. Therefore, if calculated correctly, these three variables should sum exactly to ppg. The means in the summary table confirm this. By breaking up points per game, we can more closely inspect the variation in scoring and better explain the variance.
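The decomposition above can be checked with a short sketch. The season totals are hypothetical; the three means are the values reported in Table 1, and the function names are introduced here only for illustration:

```python
def two_point_scoring(fgm, tpm, g):
    """Points per game from two-point field goals: 2 * (fgm - tpm) / g."""
    return 2 * (fgm - tpm) / g


def three_point_scoring(tpm, g):
    """Points per game from three-point field goals: 3 * tpm / g."""
    return 3 * tpm / g


# Hypothetical season: 400 field goals (100 of them threes), 150 free throws, 80 games.
fgm, tpm, ftm, g = 400, 100, 150, 80
two = two_point_scoring(fgm, tpm, g)
three = three_point_scoring(tpm, g)
ftmpg = ftm / g
ppg = (2 * (fgm - tpm) + 3 * tpm + ftm) / g  # total points per game, computed directly

# Means reported in Table 1 for twoptscoring, threeptscoring, ftmpg, and ppg.
table_sum = 5.68428 + 1.504738 + 1.804082
table_ppg = 8.993101
```

Because the mean of a sum equals the sum of the means, the three component means in Table 1 should add up to the mean of ppg, and they do so to within rounding.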
In the estimating equation, α represents the intercept term and the final term represents the error term in year t.

2.2 Models

2.2.1 Basic Model

A variation of the model will attempt to estimate three-point shooting specialization with the threeptscoring performance variable, using only basic performance statistics. The model aims to quantify how much an increase of 1 point in each of the scoring statistics increases real average salary. Finally, I will rerun the regression on the observations that fall into the three-point shooting category to investigate any effects within this specific subset. I will obtain estimates using only standard performance explanatory variables, and also with the advanced performance statistics.

Table 1: Basic Model. Dependent variable is player's average wage, regressed on some performance explanatory variables.

Dependent Variable        Indep. Variables of Interest   Other Explanatory Variables
Logged Avg. Real Salary   Threeptscoring                 Team
                          Twoptscoring                   Minutes per game
                                                         Year
                                                         Position

According to the relevant literature, this model should explain much of the variation in average salary. Of concern would be the potential collinearity between the scoring variables and minutes per game; we would expect players who play more minutes to score more points, due to increased opportunities. However, if the results are robust to the inclusion of mpg, then we can conclude that these statistics are appropriate explanatory variables for describing variation in salary.

2.2.2 Full Performance Model

The Full Performance model will follow in the footsteps of the Basic Performance model, but will additionally include the major box score statistics and the more advanced performance statistics that do not appear in the box score. The final regression will include a combination of the performance statistics. The expected result of including these statistics is to control more accurately for the skill sets of these players, especially their defensive impact.
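One way to write the specification described in this section is sketched below. This is a reconstruction under stated assumptions, not the exact estimating equation: the coefficient symbols β, γ, and δ, the subscripts, and the vector Z are notation introduced here for illustration.

```latex
\ln(\text{salary}_{it}) \;=\; \alpha
  + \beta_1\,\text{threeptscoring}_{it}
  + \beta_2\,\text{twoptscoring}_{it}
  + \beta_3\,\text{mpg}_{it}
  + \gamma' Z_{it}
  + \delta^{\text{team}}_{i}
  + \delta^{\text{year}}_{t}
  + \delta^{\text{position}}_{i}
  + \varepsilon_{it}
```

Here $Z_{it}$ stands for whatever additional performance statistics a given specification includes (for the Full Performance model, the non-scoring box score statistics and the advanced player statistics), and the δ terms are the team, year, and position indicators. Because the dependent variable is in logs, each β is read as an approximate proportional change in salary per one-unit change in the regressor.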
Table 2: Harnessing all performance statistics.

Dependent Variable         Indep. Variable of Interest   Other Explanatory Variables
Logged Avg. Real Salary    Threeptscoring                Team
                           Twoptscoring                  Minutes per game
                                                         Year
                                                         Position
                                                         Non-scoring Box Score Statistics
                                                         Advanced Player Statistics

Chapter 3: Results and Discussion

In this chapter, the regression results of the models described in the previous chapter are presented and discussed.

3.1.1 Basic Performance Model: Two-point Shooting, Three-point Shooting, and standard performance statistics.

The basic model is useful because it utilizes variables that until recently were the core explanatory variables for describing anything that happens on the basketball court. These variables are presented in the box scores for every basketball game, and many awards are based on these statistics. From the perspective of an NBA front office, these variables are the first ones seen when investigating a player’s marginal labor product.

Table 3: Regression Results of Basic Model – Dependent Variable: Log Avg. Salary (lsal). Standard errors in parentheses.

Variable         (1)               (2)               (3)               (4)               (5)
twoptscoring     0.197 (0.014)**   0.132 (0.023)**   0.076 (0.028)**   0.068 (0.026)**   0.056 (0.026)*
threeptscoring                     0.046 (0.027)+    -0.011 (0.033)    0.080 (0.034)*    0.040 (0.034)+
ftmpg                              0.206 (0.057)**   0.168 (0.054)**   0.175 (0.049)**   0.171 (0.059)**
mpg                                                  0.038 (0.012)**   0.037 (0.011)**   0.043 (0.010)**
Center                                                                 0.726 (0.138)**   0.608 (0.153)**
PF                                                                     0.415 (0.141)**   0.375 (0.149)*
SF                                                                     -0.108 (0.149)    0.004 (0.158)
SG                                                                     -0.037 (0.128)    -0.034 (0.130)
_cons            13.744 (0.085)**  13.676 (0.090)**  13.226 (0.153)**  12.959 (0.166)**  12.691 (0.257)**
R2               0.46              0.49              0.52              0.58              0.66
N                248               248               248               248               248

Team and year indicator variables enter model (5) only:

2bn.Tm   -0.021 (0.309)     3.Tm    0.187 (0.310)     4.Tm    0.049 (0.326)
5.Tm      0.286 (0.253)     6.Tm    0.293 (0.320)     7.Tm   -0.020 (0.332)
8.Tm      0.350 (0.281)     9.Tm    0.084 (0.273)     10.Tm   0.395 (0.414)
11.Tm     0.312 (0.277)     12.Tm   0.354 (0.432)     13.Tm  -0.094 (0.319)
14.Tm     0.096 (0.421)     15.Tm   0.548 (0.548)     16.Tm  -0.004 (0.290)
17.Tm     0.199 (0.254)     18.Tm   0.099 (0.241)     19.Tm  -0.093 (0.400)
20.Tm     0.600 (0.332)     21.Tm   0.428 (0.395)     22.Tm   0.002 (0.267)
23.Tm     0.603 (0.250)*    24.Tm   0.186 (0.283)     25.Tm   0.450 (0.323)
26.Tm     0.185 (0.287)     27.Tm  -0.127 (0.290)     28.Tm   0.491 (0.321)
29.Tm     0.111 (0.246)     30.Tm   0.312 (0.260)

2001bn.year  -0.356 (0.257)     2002.year  -0.045 (0.139)     2004.year   0.245 (0.135)
2005.year     0.208 (0.178)     2006.year  -0.470 (0.379)     2007.year   0.517 (0.224)*

+ p<0.10; * p<0.05; ** p<0.01

The results from the basic model regressions corroborate some expected results when it comes to scoring. Twoptscoring is positive and significant at the 1% level in four out of five regressions, and significant at the 5% level in all five. In the first regression, a one-point increase in two-point scoring per game is associated with roughly a 20% increase in salary (coefficient 0.197), and in the final regression with an increase of roughly 5.6% (coefficient 0.056). Meanwhile, the R2 rises from 0.46 in the first model to 0.66 in the last, a difference of 0.2. The coefficient on twoptscoring falls by nearly half, from 0.132 to 0.076, when mpg (minutes per game) is included in the model.
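Because the dependent variable is the log of salary, a coefficient is only approximately a percentage change. The short sketch below, with a hypothetical helper name, compares the usual first-order reading (100 times the coefficient) with the exact conversion 100*(exp(b) - 1) for the twoptscoring coefficients reported in models (1) and (5) of Table 3.

```python
import math


def pct_change_from_log_coef(beta, delta=1.0):
    """Exact percentage change in salary implied by a log-salary coefficient.

    beta:  coefficient from a regression with log salary as the dependent variable
    delta: size of the change in the explanatory variable (default: one unit)
    """
    return 100.0 * (math.exp(beta * delta) - 1.0)


# twoptscoring coefficients as reported in Table 3, models (1) and (5)
for label, beta in [("model (1)", 0.197), ("model (5)", 0.056)]:
    approx = 100.0 * beta                   # first-order approximation
    exact = pct_change_from_log_coef(beta)  # exact conversion
    print(f"{label}: approx {approx:.1f}%, exact {exact:.1f}%")
```

For coefficients of this size the two readings differ by about two percentage points at most, so the approximate figures quoted in the text are reasonable; the gap grows quickly for larger effects.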
The correlation coefficient between mpg and twoptscoring is 0.7860; this indicates that twoptscoring remains an effective explanatory variable even when strong collinearity is present. The twoptscoring coefficient was an expected result; PPG overall had a significant coefficient in most of the regressions run by Berri.

The coefficient on threeptscoring is not as significant as that on twoptscoring; in three of the four models that include it, threeptscoring is statistically significant at the 10% level, but it is significant at the 5% level in only one of the three (p < 0.02). In the one model where threeptscoring is not statistically significant, the sign is slightly negative. In an F-test on the coefficients in the second model, the p-value for the null hypothesis that the twoptscoring and threeptscoring coefficients are equal was approximately 1%; an analogous F-test on twoptscoring and ftmpg had a p-value of 0.34, and the test on threeptscoring and ftmpg had a p-value of 0.02. The F-test of whether all three coefficients were the same had a p-value under 1%.

Ftmpg had the largest coefficient of the scoring explanatory variables in all four regressions and was also statistically significant at the 1% level. After controlling for minutes per game, the coefficient did not change much in magnitude across variations of the model. This is an unexpected result; free throws are worth the least in terms of point value in an NBA game. However, it is not uncommon for players to shoot free throws at a percentage as high as 80% or 90%. Also, the correlation coefficient between ftmpg and twoptscoring is 0.8244, which is very strong. It is therefore likely that the coefficients on both variables describe similar effects on salary, most likely skill set (players who shoot more free throws are likely to be more talented on offense and draw more personal fouls). This is corroborated by the fact that introducing both ftmpg and threeptscoring into the model increased the R2 by only 0.03, a 6% increase in explanatory power.
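Each pairwise comparison of scoring coefficients reported above tests a single linear restriction. As a sketch of the standard form of such a test (the notation is assumed here, not taken from the thesis), write b_2 and b_3 for the estimated coefficients on twoptscoring and threeptscoring, N for the number of observations, and k for the number of estimated parameters:

```latex
H_0:\ \beta_{2} = \beta_{3},
\qquad
F \;=\; \frac{\left(b_{2} - b_{3}\right)^{2}}
             {\widehat{\mathrm{Var}}(b_{2}) + \widehat{\mathrm{Var}}(b_{3})
              - 2\,\widehat{\mathrm{Cov}}(b_{2}, b_{3})}
\;\sim\; F_{1,\,N-k} \quad \text{under } H_0 .
```

The probabilities quoted in the text are the p-values of this statistic. The covariance term does not appear in the regression tables, so the tests cannot be reproduced from the reported standard errors alone.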
The position categorical variables had two statistically significant coefficients, which were on the Power Forward and Center indicator variables. This result was robust throughout all the variations of the basic model that included position; the Center indicator was significant at the 1% level in both, and the Power Forward indicator at the 5% level or better. These positions also represent the tallest and largest players, and comprise 37.5% of the observations. Adding these explanatory variables increased R2 by approximately 12% (from 0.52 to 0.58) from model (3) to model (4).

3.1.2 Full Performance Model: Non-scoring Box Score Statistics & Advanced Player Statistics

Table 4: Regressions with all performance statistics. Dependent variable: log average salary (lsal). Standard errors in parentheses.

Variable         (1)               (2)               (3)
twoptscoring     0.056 (0.028)     0.137 (0.038)**   0.098 (0.043)*
threeptscoring   0.067 (0.036)     0.212 (0.052)**   0.230 (0.067)**
ftmpg            0.160 (0.062)*    0.256 (0.096)**   0.217 (0.106)*
mpg              0.020 (0.016)
apg              0.100 (0.065)                       -0.085 (0.101)
rpg              0.041 (0.040)
spg              -0.027 (0.140)
bpg              0.285 (0.105)**                     0.223 (0.106)*
topg             -0.084 (0.136)                      0.188 (0.259)
fpg              0.060 (0.102)                       0.102 (0.096)
Center           0.423 (0.231)     0.443 (0.268)     0.312 (0.268)
Power Forward    0.304 (0.214)     0.295 (0.243)     0.256 (0.238)
Small Forward    0.050 (0.215)     0.052 (0.234)     0.029 (0.228)
Shooting Guard   0.087 (0.182)     0.023 (0.187)     0.049 (0.183)
2bn.Tm           0.027 (0.293)     0.135 (0.283)     0.171 (0.294)
3.Tm             0.191 (0.295)     0.121 (0.276)     0.227 (0.296)
4.Tm             -0.036 (0.309)    0.202 (0.298)     0.164 (0.299)
5.Tm             0.217 (0.258)     0.035 (0.257)     0.015 (0.283)
6.Tm             0.268 (0.327)     0.294 (0.318)     0.303 (0.329)
7.Tm             -0.156 (0.295)    -0.038 (0.296)    -0.159 (0.297)
8.Tm             0.259 (0.293)     0.390 (0.300)     0.398 (0.314)
9.Tm             0.055 (0.304)     0.002 (0.269)     0.043 (0.293)
10.Tm            0.386 (0.416)     0.299 (0.379)     0.329 (0.396)
11.Tm            0.239 (0.296)     0.223 (0.309)     0.217 (0.328)
12.Tm            0.224 (0.391)     0.102 (0.426)     0.054 (0.409)
13.Tm            -0.062 (0.330)    0.043 (0.323)     0.036 (0.335)
14.Tm            0.070 (0.440)     0.095 (0.436)     0.150 (0.468)
15.Tm            0.439 (0.521)     0.584 (0.535)     0.460 (0.521)
16.Tm            -0.097 (0.307)    -0.050 (0.301)    -0.076 (0.314)
17.Tm            0.210 (0.267)     0.201 (0.277)     0.272 (0.296)
18.Tm            -0.004 (0.267)    0.095 (0.240)     0.063 (0.266)
19.Tm            -0.073 (0.387)    0.008 (0.418)     0.060 (0.423)
20.Tm            0.588 (0.342)     0.545 (0.334)     0.549 (0.335)
21.Tm            0.392 (0.396)     0.398 (0.379)     0.408 (0.389)
22.Tm            0.029 (0.287)     0.004 (0.281)     0.036 (0.299)
23.Tm            0.512 (0.251)*    0.584 (0.258)*    0.559 (0.269)*
24.Tm            0.149 (0.305)     0.193 (0.285)     0.191 (0.292)
25.Tm            0.419 (0.329)     0.541 (0.342)     0.559 (0.340)
26.Tm            0.189 (0.326)     0.068 (0.317)     0.160 (0.327)
27.Tm            -0.205 (0.307)    -0.252 (0.287)    -0.293 (0.310)
28.Tm            0.442 (0.332)     0.527 (0.313)     0.511 (0.336)
29.Tm            0.024 (0.269)     0.046 (0.243)     0.024 (0.268)
30.Tm            0.333 (0.280)     0.399 (0.280)     0.437 (0.301)
2001bn.year      -0.309 (0.262)    -0.289 (0.263)    -0.285 (0.267)
2002.year        -0.038 (0.139)    -0.071 (0.139)    -0.081 (0.137)
2004.year        0.210 (0.140)     0.363 (0.146)*    0.317 (0.148)*
2005.year        0.182 (0.204)     0.412 (0.225)     0.351 (0.236)
2006.year        -0.525 (0.384)    -0.355 (0.384)    -0.394 (0.383)
2007.year        0.475 (0.228)*    0.762 (0.234)**   0.728 (0.233)**
ts                                 -0.373 (1.069)    -0.029 (1.118)
ftr                                -0.228 (0.591)    -0.213 (0.589)
par                                -1.125 (0.567)*   -1.535 (0.621)*
trb                                -0.016 (0.023)    -0.021 (0.023)
ast                                0.013 (0.011)     0.028 (0.018)
stl                                -0.159 (0.084)    -0.127 (0.078)
tov                                0.005 (0.014)     -0.019 (0.023)
usg                                -0.056 (0.015)**  -0.059 (0.022)**
drtg                               -0.062 (0.018)**  -0.053 (0.018)**
_cons            12.774 (0.294)**  21.009 (2.370)**  20.053 (2.334)**
R2               0.68              0.69              0.70
N                248               248               248

* p<0.05; ** p<0.01

In the Full Performance Model, we see that twoptscoring, threeptscoring, and ftmpg are all statistically significant at the 5% level or better in two out of three regressions. The first regression uses the standard performance statistics that are reported in the box score for every basketball game. When we include all these variables, and additionally control for team-specific, position-specific, and year-specific effects, we get an R2 of 0.68, only 0.02 higher than model (5) of the Basic Performance Model.
Furthermore, the only statistically significant scoring variable in this regression is ftmpg, which is significant at the 5% level. Given the high correlation between mpg, ftmpg, and twoptscoring, collinearity is likely obscuring the explanatory power of certain scoring variables.

Amongst the newly introduced explanatory variables, 3Par, drtg, and usg are statistically significant at the 5% level or better in both models in which they are included. 3Par describes a player’s attempts from the three-point line as a share of his overall field goal attempts; an increase in this variable means a player relies more heavily on the three-point line. The coefficient on this variable is quite large, negative, and significant only at the 5% level. In conjunction with the threeptscoring variable, this coefficient is somewhat confusing and counterintuitive. Usg, or usage rate, is an estimate of the percentage of team plays run through a specific player, and is intended to capture how heavily a team relies on that player. One would expect this variable to be positively correlated with better skill sets and more productive players, yet the coefficient is negative and statistically significant. However, if we assume the other variables are capturing player skill, then usg may be controlling for the statistical inflation a player receives by being the only good player on his team (if a team is not very talented, the most talented player may receive a heavier load of plays and have inflated performance statistics).

Overall, the three models have roughly the same R2 and do not have much stronger explanatory power than the Basic Performance Model. The addition of non-scoring performance explanatory variables barely aids in describing the variation in real average salary.

3.2 Discussion

Inspecting the results of both models, the Basic Performance Model explains roughly two-thirds (R2 = 0.66) of the variation in logged real average salary with relatively few performance statistics.
Controlling for team-specific, time-specific, and position-specific effects, we can conclude from the model that all three forms of scoring have roughly the same impact on salary. Testing for differences between coefficients, we cannot reject the null hypothesis that the coefficients are equal, but the model suggests that the scoring variables are likely appropriate explanatory variables for salary. Even though minutes per game is highly correlated with scoring, our results for two-point scoring and free-throw scoring were robust to the inclusion of mpg; the effect of three-point shooting was milder but significant as well.

In terms of estimating the impact of three-point shooting on salary, we see a mild but positive effect in both models. In the first model this effect is weak, and in the second it is strong. This mildly supports the generalist bias assumption prevalent in the literature. Three-point shooting specialization does have market value, and that value could perhaps be slightly higher than for two-point shooting. This would be a sensible conclusion, since three-point shooting is more difficult physically and more valuable from a game theory perspective.

There should still be concern as to how strong this effect is. The advanced model utilized explanatory variables that partially control for skill level and overall scoring ability. Introducing the advanced performance variables, most notably 3Par, caused the coefficient on threeptscoring to more than triple in magnitude (from 0.067 to 0.212). In this model, a player who attempts 20% of his shots from beyond the three-point line would see a decrease of roughly 0.2 log points, or about a 20% decrease in salary. If the player scores three points a game from the three-point line (that is, only one three-point field goal per game), then this would increase his salary by 79%, for an overall net increase of roughly 59% in his salary.
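The net-effect arithmetic above adds two effects measured in log points. The sketch below spells out that step; the function name is hypothetical, and the inputs are simply the two effects quoted in the text (a 0.20 decrease and a 0.79 increase), not values re-estimated from the data.

```python
import math


def net_log_effect(*effects):
    """Sum several effects on log salary (effects are additive in logs)."""
    return sum(effects)


# Effects as quoted in the text: reliance on the three-point line, then
# three points per game scored from beyond the line.
net = net_log_effect(-0.20, 0.79)
exact_pct = 100.0 * (math.exp(net) - 1.0)  # exact percentage change implied by the net effect

print(f"net effect: {net:.2f} log points")
print(f"exact percentage change: {exact_pct:.1f}%")
```

Reading 0.59 log points as a 59% increase is a first-order approximation. For an effect this large the exact conversion is noticeably bigger, at roughly 80%, so the percentage figures in this passage are best treated as approximate.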
Controlling for reliance on the three-point line, this result would imply a very high market value for three-point shooting. It would also imply that we reject the generalist bias almost completely, and instead conclude that two-point field goals are valued less than three-point shooting; two-point shooting only has a strong effect on salary because it is the main scoring mechanism for most players. This would also accurately reflect the relative scarcity of elite three-point shooters. In addition, this result would suggest that players who rely on the three-point line but do not score much from there will be negatively evaluated for their inefficiency; bad or ineffective scoring is not rewarded in the market.

Finally, it is important to note that the explanatory power of the models was not greatly improved by the inclusion of more performance variables beyond the scoring ones.

Conclusion

The purpose of my thesis was to determine whether three-point shooting is valued differently than two-point shooting in NBA labor markets. Using a small regression model, three-point shooting was found to have a statistically significant effect on real average salaries for NBA players, and there is some evidence to suggest this coefficient may differ from that on two-point scoring. When the model included minutes per game, the results were much milder; when 3Par was included, the effect was amplified. In multiple regressions, the coefficient on three-point scoring was higher than that on two-point scoring. These results were not expected, but they are in line with the expected value of the two shots in a competitive basketball game.

There are issues with the results of the models when it comes to the estimates for three-point shooting. For one, the sample size of 248 players is relatively small; a sample spanning over a decade would further improve the precision of the results.
Also, it is difficult to determine which variables are explaining the same underlying effects on the court; perhaps all the variables just describe overall talent when it comes to the superstar players, and since these players are in a class of their own, they skew the results in ways that would not hold for players who are average or below average. The model also could have benefited from better explanatory variables. Instead of using age, it would have been better to have years of experience as a variable; players can come into the league at any age over 18 or 19 and be at different levels of experience. Finally, it would have been more interesting to include more recent years of performance in the data. From the results presented, it is clear that utilization of the three-point line has been on the increase, and it would be interesting to determine whether the market value of three-point baskets has increased in recent years.

Overall, the results suggest that three-point shooting has a positive, significant impact on salary. On a point-by-point scale, three-point shooting has at least as much impact on per-year wage as two-point shooting. The fact that the coefficients are slightly larger on three-point shooting suggests that NBA front offices correctly put slightly more value on the three-point shot.

Appendix

Bibliography

Wang, Long, and J. Keith Murnighan. 2013. “The Generalist Bias.” Organizational Behavior and Human Decision Processes 120: 47-61.
Lazear, Edward P., and Sherwin Rosen. 1981. “Rank-Order Tournaments as Optimum Labor Contracts.” Journal of Political Economy 89.5: 841. Print.
Lutz, Dwight. 2012. “A Cluster Analysis of NBA Players.” MIT Sloan Conference.
Berri, David J. 1999. “Who is ‘Most Valuable’? Measuring the Player’s Production of Wins in the National Basketball Association.” Managerial and Decision Economics 20.8: 411-27. JSTOR.
Wood, Ryan. 2013. “The History of the 3-Pointer.” IHoops. Youth USAB.
Goldman, Matthew, and Justin M. Rao. 2012. “Live by the Three, Die by the Three? The Price of Risk in the NBA.” MIT Sloan Conference.
“NBA Player Salaries – National Basketball Association – ESPN.” 2013. ESPN.com.
Michaelides, M. “A Test of Compensating Differences: Evidence on the Importance of Unobserved Heterogeneity.” Journal of Sports Economics 11.5: 475-95. Sagepub.
Condotta, Bob. 2008. “College Men’s New 3-point Line Sparks Debate.” College Sports. Seattle Times.
Berri, David J., Stacey L. Brook, and Martin B. Schmidt. “Does One Simply Need to Score to Score.” International Journal of Sport Finance 2, no. 4: 190-205.
“Three-point Line: Basketball.” 2006. Image. Encyclopedia Britannica – Kids Encyclopedia.
Gaines, Cork. 2013. “Does The NBA Need To Move Back The Three-Point Line?” Business Insider.
“NBA & ABA Basketball Statistics & History.” 2013. Basketball-Reference.com. Sports Reference LLC.
“NBA.com, Official Site of the National Basketball Association.” 2013. NBA.com.