Measuring Talent through Benchmarking: Babe Ruth is the Home Run King

Peter A. Groothuis
Walker College of Business
Appalachian State University
Boone, NC 28608

Kurt W. Rotthoff
Seton Hall University
Stillman School of Business

Spring 2012

Abstract

Comparing talent across generations may be like comparing apples to oranges because, over time, the benchmark of what determines talent changes. On the other hand, the ability to compare talent across generations could be very useful. We develop a benchmarking technique to measure talent accurately over different periods of time. Utilizing Major League Baseball data from 1871-2010, we measure talent given that the hitting, pitching, and defensive abilities of the league as a whole have fluctuated over time. Using this approach, we find that Babe Ruth is indeed the home run king across all generations.

JEL Classifications:
Keywords: Benchmarking, Major League Baseball, Home Runs

Kurt Rotthoff at: [email protected], Seton Hall University, JH 621, 400 South Orange Ave, South Orange, NJ 07079. We would like to thank Brian Henderson. Any mistakes are our own.

I. Introduction

In finance, performance is not measured simply by the absolute return of a portfolio; true performance is measured relative to some benchmark, typically the market portfolio or the Security Market Line (Roll 1978). Every portfolio manager's goal is thus to 'beat the market'. Benchmarking is used in many other ways as well: salaries are benchmarked to find relative pay, and technological development, research output, and teaching performance are all benchmarked. Benchmarking reveals performance.

The use of benchmarking, however, can be expanded to provide a more detailed analysis of talent levels: not just of today's performances, but a viable measure of talent across different eras. Talent is highly valued, so accurate measures of relative talent today, and comparisons across time, are also highly valued. When a CEO makes a big deal, or a professor publishes in a top-tier journal, are they compared to other CEOs and professors today, or to all current and past CEOs and professors?

Talent is highly valued and, in some industries, hard to measure. The measurement becomes more difficult as the window widens, for instance to 1871-2010. The problem is further complicated by the fact that when the opportunity to reveal talent is limited, true talent does not have the opportunity to reveal itself (Terviö 2009). Most sports restrict the revelation of talent as well. The National Football League (NFL) and the National Basketball Association (NBA) both use the National Collegiate Athletic Association's (NCAA) college sports system as their talent training program. In collegiate sports there is a high variance of talent, both within and across teams, as well as a limited ability to transfer to another team. Major League Baseball (MLB) uses two training systems: the NCAA and a minor league system. MLB recruits talent from college, but it also has a well-developed minor league system in which it develops its own talent. This structure allows for more revelation of talent and skills before players reach the top level. It reduces the talent-restriction problem and, matched with a benchmark talent measure, makes for a superior method of identifying superstars.
We use a benchmark measure, similar to that found in finance, to find the distribution of talent relative to all players in a given season and thereby reveal the superstars of MLB. Because talent changes over time, a moving benchmark yields an accurate measure of a player's true talent level at any given point in time. Throughout the years, skills, strength, and training have changed. The benchmark measure increases the accuracy of measurement and the ability to truly compare talent over time, and it provides a method to identify superstars in baseball. The next section describes the data and methodology. Section three examines various measures of hitting performance, followed by a look at structural changes over time. The last section concludes.

II. Data and Methodology

The question of superstars and their relative performance is oft debated and hard to measure, particularly when the comparison spans different periods of time. Over time, technologies, strategies, skills, size, and speed have changed, making accurate comparisons nearly impossible. An accurate way to measure these talents across time allows conclusions about who the truly great stars are. Given a seemingly endless set of debates and lists of superstars, we propose a measurement technique, adopted from the finance literature, that compares stars on a relative performance measure within their own generational cohort.

In finance, all funds want their returns to be positive; however, the true measure of success is the ability to 'beat the market'. The overall ranking is thus relative to a moving measure over time, referred to as the market average. Applying this relative measure to sporting events allows us to compare groups of individuals who played in very different eras. Talent levels have changed over time, in both batting and pitching, and the net effect is unclear. Add the technology changes (in bats, training, clothes, fields, and so on), and the ability to compare superstars becomes muddier still.

Major League Baseball attracts the best baseball players in the world. The league started in the late 1800s and continues its play today. We use data from Sean Lahman's Baseball Database, analyzing the hitting of all baseball players from 1871-2010 with at least 100 at-bats, to identify superstars in baseball.[1] Because the game has changed over time, a hitter's standard deviations above the mean, a measure of relative performance at each point in time, gives an accurate measure of superstars. With 35,728 single-season observations, we find that the average player hit 7 home runs per year (with a max of 73), had 342 at-bats and 42.5 runs batted in (RBI) on average, and had a slugging percentage of .379.

[1] http://baseball1.com/2011/01/baseball-database-updated-2010/

We break the data into five time periods: the eighteen hundreds (1871-1899), pre-World War II (1900-1939), post-World War II (1940-1976), the free agency era (1977-1994), and the modern era (1995-2010). Table 1 reports the average of the yearly means and standard deviations in each period.

Table 1: Average of the yearly mean and standard deviations

                          HR per 100 AB     RBIs per 100 AB   Slugging Pct
                          Mean    Std Dev   Mean     Std Dev  Mean    Std Dev
1800s (1871-1899)         0.486   0.588     11.797   4.190    0.343   0.080
Pre-WWII (1900-1939)      0.876   0.993     11.420   3.984    0.363   0.078
Post-WWII (1940-1976)     2.141   1.719     11.690   4.196    0.379   0.079
Free Agency (1977-1994)   2.326   1.762     11.842   3.979    0.387   0.075
Modern Era (1995-2010)    2.995   1.983     13.079   4.308    0.419   0.082
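Before turning to the results, the benchmark computation itself can be sketched in a few lines. The following is a minimal illustration in Python, not the authors' code: it assumes the Lahman batting table is available as a CSV whose columns (playerID, yearID, AB, HR) follow the public Lahman schema, and the file name is illustrative.

```python
import pandas as pd

# Lahman-style batting table (file name illustrative); columns assumed
# to follow the public Lahman schema: playerID, yearID, AB, HR.
bat = pd.read_csv("Batting.csv")

# Keep player-seasons with at least 100 at-bats, as in the paper.
bat = bat[bat["AB"] >= 100].copy()

# Home runs per 100 at-bats, the rate being benchmarked.
bat["hr_rate"] = 100 * bat["HR"] / bat["AB"]

# Benchmark each season against its own cohort: standard deviations
# above that year's mean among qualifying hitters.
grp = bat.groupby("yearID")["hr_rate"]
bat["hr_z"] = (bat["hr_rate"] - grp.transform("mean")) / grp.transform("std")

# The superstar ranking: seasons furthest above their own era's benchmark.
print(bat.nlargest(10, "hr_z")[["playerID", "yearID", "HR", "hr_z"]])
```

The same standardize-within-season step applies to RBIs per 100 at-bats and slugging percentage; only the rate column changes.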
For home runs, both the mean and the standard deviation increase over time. Similarly, the mean slugging percentage increases over time. For RBIs, however, there is no noticeable change in the mean or the standard deviation.

III. Hitting

Taking home runs per at-bat for each individual player in a given year, and ranking players by standard deviations above that year's mean, the top-ranked home run hitter is Babe Ruth in 1920 (Yankees), 1921 (Yankees), 1919 (Boston), and 1927 (Yankees); he was 10.58, 8.07, 7.26, and 7.04 standard deviations above the mean, respectively. The fifth-highest ranked player is Ned Williamson (1884 Chicago), followed by Ruth (1926), Ruth (1924), Buck Freeman (1899 Washington Senators), Ruth (1928), and Gavvy Cravath (1915 Phillies). From the modern era, the highest-ranked players are Barry Bonds (2001 San Francisco), in 13th place at 5.85 standard deviations above the mean, and Mark McGwire (1998 and 1997 St. Louis), in 19th and 20th place at roughly 5.4 standard deviations above the mean (table 2).

Table 2: Home runs, standard deviations above the mean

Rank  Player           Year   SDs above mean
1     Babe Ruth        1920   10.58
2     Babe Ruth        1921   8.07
3     Babe Ruth        1919   7.26
4     Babe Ruth        1927   7.04
5     Ned Williamson   1884   7.01
6     Babe Ruth        1926   6.83
7     Babe Ruth        1924   6.50
8     Buck Freeman     1899   6.41
9     Babe Ruth        1928   6.11
10    Gavvy Cravath    1915   6.08
13    Barry Bonds      2001   5.85
19    Mark McGwire     1998   5.42
20    Mark McGwire     1997   5.41

Babe Ruth, in his 1920 season with the New York Yankees, was 10.58 standard deviations above the mean. This is simply amazing and displays his level of performance relative to the competition he faced. To put this in perspective, if Babe Ruth had been 10.58 standard deviations above the mean in 2001, when Barry Bonds set the single-season home run record, and had taken the same 476 at-bats that Bonds did, he would have hit 120 home runs. Barry Bonds still holds the single-season mark with 73.

Although Babe Ruth is the home run king, other measures of hitting talent cannot be ignored. Next, we measure each player's RBIs per at-bat relative to the mean of the year in which he played, again requiring at least 100 at-bats.

Table 3: RBIs, standard deviations above the mean

Rank  Player            Year   SDs above mean
1     Reb Russell       1922   4.93
2     Cap Anson         1886   4.74
3     Babe Ruth         1920   4.65
4     Babe Ruth         1919   4.21
5     Babe Ruth         1921   4.21
6     Babe Ruth         1926   4.20
7     Charlie Ferguson  1887   4.17
8     Gavvy Cravath     1913   4.05
9     Joe Wood          1921   4.04
10    Babe Ruth         1932   4.03
13    Manny Ramirez     1999   3.92
20    Manny Ramirez     2008   3.80

In table 3, we report the superstars as measured by standard deviations above the mean. Reb Russell, playing for the Pittsburgh Pirates, has the highest RBI ranking, at 4.93 standard deviations above the mean.[2] Babe Ruth holds 5 of the top ten rankings, and Manny Ramirez, ranked 13th and 20th, is the highest-ranked modern era player.

[2] Reb Russell was a pitcher from 1912-1917 with the Chicago White Sox. He did not become a big hitter until after developing arm trouble and finding his hitting stroke in the minor leagues.

Slugging percentage is another widely used batting statistic: total bases divided by at-bats (table 4).
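As a concrete check of that definition, the short sketch below computes total bases per at-bat from counting stats. Ruth's 1920 hits and at-bats come from the text; the doubles, triples, and home run split is taken from the standard record rather than this paper, so treat it as an assumption of the illustration.

```python
# Slugging percentage: total bases divided by at-bats. Lahman's hit
# column counts all hits, so singles are backed out of the total.
def slugging(hits, doubles, triples, home_runs, at_bats):
    singles = hits - doubles - triples - home_runs
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats

# Babe Ruth, 1920: 172 hits in 458 at-bats (from the text); the 36/9/54
# extra-base split is the standard record, assumed here for illustration.
print(round(slugging(172, 36, 9, 54, 458), 3))  # -> 0.847
```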
Table 4: Slugging percentage, standard deviations above the mean

Rank  Player        Year   SDs above mean
1     Babe Ruth     1920   5.77
2     Babe Ruth     1921   5.21
3     Barry Bonds   2001   5.03
4     Barry Bonds   2004   4.91
5     Barry Bonds   2002   4.79
6     Babe Ruth     1927   4.57
7     Babe Ruth     1926   4.50
8     Lou Gehrig    1927   4.49
9     Ted Williams  1941   4.36
10    Babe Ruth     1924   4.35

Babe Ruth is again at the top of this ranking. Measured in standard deviations above the seasonal mean, Babe Ruth ranks first, second, sixth, seventh, and tenth. This is the first measure in which a modern era player ranks in the top ten: Barry Bonds is third, fourth, and fifth, at 5.03, 4.91, and 4.79 standard deviations above the mean for the 2001, 2004, and 2002 seasons, respectively.

IV. Structural Changes over Time

Benchmarking is valuable because, over time, the game has changed. This analysis measures each player relative to the type of play occurring during the years in which he played. For home runs, the mean and standard deviation have been changing over time.

[Figure: Home runs per 100 at-bats, yearly mean and standard deviation over time]

The other measures, including RBIs per 100 at-bats and slugging percentage, show no noticeable changes in the mean and standard deviation over time.

[Figure: RBIs per 100 at-bats, yearly mean and standard deviation over time]

[Figure: Slugging percentage, yearly mean and standard deviation over time]

The nature of the game changes. Another way to analyze how the game has changed over time is to look at the distribution of talent across the different eras. The 1800s, as well as the pre- and post-WWII periods, featured a fundamentally different game than the free agency and modern eras. At one time it was a gentleman's game, whereas now it is a profession. This can be seen clearly in the differing distributions of talent, measured by home runs per 100 at-bats, over time. The distribution has become closer to normal over time. This also supports the use of a benchmark to accurately compare talent as these changes have occurred.

[Figure: Density of player home runs per 100 at-bats, by era (1800s, Pre-WWII, Post-WWII, Free Agency, Modern Era), with normal overlay]

[Figure: Density of player RBIs per 100 at-bats, by era (1800s, Pre-WWII, Post-WWII, Free Agency, Modern Era), with normal overlay]

However, looking at RBIs per 100 at-bats and slugging percentage, the distribution of performance has been, and continues to be, relatively normal over time. Because the distribution is not noticeably changing over time, it is possible to use absolute measures of performance, rather than a benchmark strategy, to measure performance over time.

[Figure: Density of player slugging percentage, by era (1800s, Pre-WWII, Post-WWII, Free Agency, Modern Era), with normal overlay]

The benchmarking strategy shows that Babe Ruth is the highest ranked in terms of slugging percentage, at 5.77 standard deviations above the mean in 1920. That is the season in which Babe Ruth had 172 hits in 458 at-bats, posting a record slugging percentage of .847. He also holds the MLB career slugging percentage record at .670.

V. Conclusion

Works Cited

Roll, R. (1978) "Ambiguity when Performance is Measured by the Securities Market Line," Journal of Finance 33, 1051-1069.

Terviö, Marko (2009) "Superstars and Mediocrities: Market Failure in the Discovery of Talent," Review of Economic Studies 76(2), 829-850.