Performance Variation among Major League Baseball Closers:
Field Evidence of Situational Pressure Effects
Anthony Bakshi
Brown University
April 17, 2013
Abstract
Despite immense analytical advances in Major League Baseball (MLB), some traditional methods
of player labor allocation continue to be used. Closers, a subset of pitchers, are primarily
substituted into particular game states that qualify as save situations (SS), and conventional
wisdom suggests that they enjoy motivational benefits that lead to improved performance in
these states. This study investigates the persistence of the performance discrepancy after the
inclusion of key variables, including controls for pitcher and hitter skill levels, and examines other
potential causes. Analysis of a data set of more than 26,000 plate appearances supports a
significant and positive effect of the SS state on closer performance. The results further suggest a
significant and positive effect of the SS state when combined with high-pressure at-bats that
considerably impact the game’s outcome. The study also contributes a “clutchness” ranking
system that quantifies the heterogeneous effects of situational pressure on the performance
levels of individual closers.
_____________________________
I would like to thank my adviser, Professor Pedro Dal Bó, for his guidance and support throughout the research
process. Additional thanks to Professor Jeremy Kahn for serving as the second reader from the mathematics
department, Professors Kenneth Chay and Anna Aizer for their suggestions, and Nicholas Coleman for providing a
wealth of valuable advice. Thank you to the Office of the Dean of the College for providing funding for necessary
data through a Research at Brown grant. I’d finally like to thank Russell A. Carleton, Matthew Goldman and Justin
Rao for their voluntary guidance and insights that significantly contributed to this paper.
1. Introduction
Major League Baseball (MLB) is a professional sports organization that lies at the heart of
American culture. With origins in the late 19th century, “America’s pastime” has developed into a multibillion-dollar industry1 with a strong players union that protects its members’ guaranteed, multi-year
and often multimillion-dollar contracts. As player salaries have risen dramatically since the 1980s, so
have the stakes involved in efficiently allocating wages. With the influx of statistical analysis into
the sport over the last decade, a movement brought to the public sphere by Michael Lewis’
Moneyball, baseball seems destined to continue evolving toward a data-driven, wholly efficient
industry.
Despite the analytical advances, some traditions persist. Closers, usually the highest-skilled relief
pitchers on teams, are deployed in a rigid manner, substituted into games to earn “saves,” an old-fashioned statistic. A save situation (SS) occurs when a pitcher enters with his team leading by one, two
or three runs and attempts to convert at least the final three outs of the game.2 A non-save situation
(NSS) is any other situation; the closer’s team may be losing, tied or ahead by four runs or more (Figure
1). The importance attributed to this arbitrary set of game situations by teams, players and fans helps
form an opportune area of study. There has been extensive analysis of the pitfalls of closer usage by
prominent “sabermetricians,” those who apply analytical methods to baseball. Bill James, the founder of
the movement and creator of the term, wrote that “using your relief ace to protect a three-run lead is
like a business using its top executive to negotiate fire insurance.” But the number of saves earned by
closers remains an important statistic, reflected in closer salary levels that are the highest among relief
pitchers,3 the group of pitchers substituted into games to “relieve” starting pitchers who pitch the
majority of innings.
1 The MLB generated revenues of $7 billion in 2010, according to Reuters.
2 There are two less common SS: when a pitcher enters the game with the potential tying run already on base, at bat, or next to
bat, or pitches at least three innings to finish the game while his team is ahead by any margin.
The relationship between saves and closer pay may contribute to a historical performance
premium: closers typically pitch better in SS than in NSS. An MLB.com study of closers who recorded 40
or more saves in a season between 2001 and 2010 showed a marked discrepancy in Earned Run Average
(ERA)4; the closers had a 2.28 ERA in SS and a 2.99 ERA in NSS (Singer 2011).5 This performance variation
is commonly attributed to behavioral effects by fans, baseball media members and, most curiously, the
closers themselves. Chris Perez, a closer for the Cleveland Indians who had a 2.75 ERA in SS and a 4.18
ERA in NSS in 2011, said of NSS: "Obviously, there isn't as much intensity. The game isn't on the line and
you don't feel like your back is up against the wall." Matt Capps, then a closer for the Minnesota Twins,
echoed Perez, observing that closers “have to find a way to make it the same. You have to find a way to
get over all of it, because there is a difference. The intensity level, the hitters' focus, there are a lot of
differences.” (Meisel, 2012)
This study investigates the persistence of the performance discrepancy after the inclusion of key
variables, including a control for pitcher-batter matchup, and potential other causes of the variation.
After testing a sample of more than 26,000 closer-batter interactions (plate appearances) across two
regular seasons, the findings support a significant effect of the SS state on at-bat outcome. Closers are
12 percent less likely to allow a batter to reach base in an SS than in an NSS, all else equal. The data also
support a significant effect of changes in Leverage Index (LI), a more precise measure of situational
importance that encapsulates the effect of every possible result of an at-bat on the game’s ultimate
outcome (team win or loss). Though closers are about four percent more likely to allow a hit or walk for
every unit increase of LI in an NSS, an identical increase in an SS state significantly improves the
likelihood of pitcher success by a magnitude of about one percent.
3 Closers have signed approximately 90 percent (24 of 27) of contracts worth $5.5M or more annually in baseball history (Cot’s
Baseball Contracts) and make approximately $2.5M more annually than relievers who are primarily used in the eighth inning
(Carleton 2008).
4 Earned Run Average measures the average number of earned runs allowed by a pitcher every nine innings.
5 This difference matters in the context of game outcome, as teams scored an average of 4.7 runs each per game during the ten
seasons studied.
These findings lend credence to the importance of game pressure and heightened intensity in SS
states that has been anecdotally communicated by closers. They also provide support for the oft-beleaguered status quo management of relief pitchers by MLB teams. If closers indeed internalize the
mantra of SS importance, as suggested by the results, and thus reap the benefits of internally driven
increases in motivation, it is sensible to continue using closers in this set of game states.
This study contributes to existing literature on the effects of psychological pressure and
motivation on agents in high-stakes environments. The results provide mixed evidence regarding the
impact of psychological pressure on athletes that has recently also been studied in professional soccer
(Apesteguia and Palacios-Huerta, 2010) and basketball (Goldman and Rao, 2012). The findings do not
support the existence of other behavioral biases among closers, such as the loss-averse preferences
found by Pope and Schweitzer (2011) among professional golfers. This paper also aims to provide an
explanatory extension to the main findings by creating a method to rank closers based on their
sensitivities to game state changes. This “clutchness” statistic identifies individual situational pressure
effects on each pitcher in the sample, and the calculated coefficients quantify the effects and allow for
inter-pitcher comparisons. This measure is developed and discussed later in the paper, which proceeds
as follows.
Section 2 details the motivations for this study and provides background on related literature.
Section 3 describes the data analyzed, and Section 4 explains the chosen econometric methods and
tested hypotheses. Section 5 provides detailed explanations of the results. Section 6 explains the player
ranking system, and Section 7 discusses robustness measures, alternative hypotheses, and potential
caveats.
2. Motivation and Background
2.1 Effort and Performance
Recent research has applied psychological theories to test the normative economic convention
of a strictly positive relationship between effort and performance. Ariely et al. (2009) conducted
experiments involving games and found that high incentives can lead to decreases in performance. The
authors’ findings regarding the relationship between moderate incentives and optimal performance
echo the Yerkes-Dodson Law, a psychological theory that states that optimal performance requires an
intermediate level of “arousal,” or emotional intensity (Yerkes and Dodson, 1908). Chib et al. (2012) use
fMRI technology in a neuroscience study to observe agents performing motor task experiments and also
find evidence that larger incentives can lead to performance decreases.6 Rauh and Seccia (2006) develop
a theoretical model for performance that incorporates anxiety into a framework composed of individual
skill and effort level.
Goldman and Rao (2012) contribute to this area of literature by distinguishing between the
effects of psychological pressure based on the type of game action in basketball. The authors find that
players do not benefit from playing at home (a “home-court advantage”) in all situations, despite the
increased pressure and presumed beneficial effects on performance that arise from performing in front
of a supportive audience. Butler and Baumeister (1998) previously observed, through experimental
results, that the presence of supportive audiences can lead to declines in performance. Goldman and
Rao found that basketball players benefit from home crowds when grabbing rebounds, a high-effort task
that takes place in a fluid game state. But home players were negatively impacted when shooting free
throws in tense moments, as that task requires concentration that can lend itself to the detrimental self-focus observed in previous experimental work (Lewis and Linder, 1997; Baumeister and Steinhilber,
1984). Cao et al. (2011) also studied free-throw shooting in the NBA and found an average performance
deterioration of 5-10 percent in the final moments of close games.
6 The study also relates the observed performance declines to loss-averse preferences, discussed in the next section.
A study of field goal kick conversions in professional football (Clark et al., 2013) found no
significant effect of psychological factors, while Apesteguia and Palacios-Huerta (2010) observed a
negative impact of psychological pressure on penalty kicks in soccer. They observed a higher-than-expected winning percentage among soccer teams shooting first in the penalty kick rounds. Their
empirical results matched survey data collected from professional players that overwhelmingly stated a
preference to kick first to place the mental pressure on the other team. Baseball at-bats, the
game action studied in this paper, are quite similar to such concentration-heavy tasks, which provides
some context for the effects observed in high-pressure game situations in the MLB.
2.2 Loss Aversion
Closer performance also provides a template for testing for evidence of loss aversion in a field
environment. Loss aversion is a tenet of prospect theory, a foundational set of behavioral economic
concepts developed by Kahneman and Tversky (1979). Behavioral economics incorporates psychological
influences and heuristic biases, or rules of thumb, into the rational decision-making process, thereby
diverging from traditional neoclassical theory in which rational agents maximize their individual utilities.
Prospect theory maintains that agents’ choices depend not just on total utility, but also on reference
points against which decisions, and the potential subsequent gains or losses, are compared on a relative
scale. Departures from reference points are not weighed equally, and this idea that “losses loom larger
than corresponding gains” (Kahneman and Tversky, 1991) is formalized in the prospect theory value
function:
(1)  $v(x) = \begin{cases} x^{\beta}, & x \geq 0 \\ -\lambda \, |x|^{\beta}, & x < 0 \end{cases}$
The utility of a gain of amount x is less than the disutility of a loss of –x. The discrepancy
depends on lambda, a measure of an individual’s degree of loss aversion found experimentally to be
approximately 2.25 (Kahneman and Tversky, 1992). Figure 2 depicts a typical value function diagram
with the reference point located at the origin. The function is steeper in the loss domain (x<0) than in
the gain domain (x>0), which indicates a higher marginal benefit of effort in the loss domain. The beta
exponent term, found to have a median value of 0.88 in the same set of experiments, gives the value
function its “S-shape.” The resulting convexity of the function in the loss domain and concavity in the
gain domain relate to the concept of diminishing sensitivity, which suggests that the impact of a gain or
a loss lessens as an agent moves further away from the reference point. In such a situation, facing a loss
of $110 instead of $100 is less painful than facing a $20 loss after expecting a $10 loss. The authors also
found experimental evidence of varying risk attitudes in the two domains: agents tend to be risk-seeking
when faced with potential losses and risk-averse when faced with potential gains.
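The properties described above can be checked numerically. The following is a minimal sketch, assuming the standard piecewise power form of the value function (x^β for gains and −λ|x|^β for losses) with the parameter values quoted in the text; the function name and the dollar amounts are illustrative only.

```python
LAMBDA = 2.25  # loss-aversion coefficient quoted in the text
BETA = 0.88    # curvature exponent quoted in the text


def value(x, lam=LAMBDA, beta=BETA):
    """Value of outcome x measured relative to a reference point at zero.

    Assumes the piecewise power form: x**beta for gains,
    -lam * |x|**beta for losses.
    """
    if x >= 0:
        return x ** beta
    return -lam * (-x) ** beta


# Losses loom larger than corresponding gains: the ratio equals lambda.
loss_gain_ratio = abs(value(-100)) / value(100)

# Diminishing sensitivity: an extra $10 of loss hurts less far from the
# reference point (100 -> 110) than close to it (10 -> 20).
extra_pain_far = value(-100) - value(-110)
extra_pain_near = value(-10) - value(-20)

print(round(loss_gain_ratio, 2))
print(extra_pain_far < extra_pain_near)
```

The second comparison mirrors the $110 versus $100 example in the text: both quantities are positive, and the one computed far from the reference point is the smaller of the two.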
Extensive contributions to prospect theory literature have been made through both
experiments and fieldwork (Kahneman et al., 1990; List, 2003; List, 2004; Fryer et al., 2012; Levitt et al.,
2012). The studies of Thaler (1999) and Haigh and List (2005) investigate myopic loss aversion, a theory
that incorporates the narrow bracketing of decisions examined in mental accounting literature. The
impact of goals, expectations and endogenous changes on reference points has been studied
theoretically (Köszegi and Rabin, 2006), experimentally (Heath et al., 1999), and in relation to effort
provision in labor markets (Camerer et al., 1997; Farber, 2005; Fehr and Goette, 2007; Abeler et al.,
2011).
The loss aversion framework discussed in Section 4.1 incorporates the ideas of Pope and
Schweitzer (2011), who found a significant difference between accuracy on comparable par and birdie
putts by golfers on the PGA Tour. This difference was attributed to loss-averse preferences, in which
golfers exerted more effort to avoid falling below the reference point of par on individual holes, an
irrational behavioral bias since all strokes are equally valued in the final score. This effect diminished in
later rounds of tournaments, as the par reference point lost salience and the aggregate tournament
score gained importance. Berger and Pope (2011) and Goldman and Rao (2013) both find significant
effects of score margins on basketball teams. The former study found that college and professional
teams losing by small margins at halftime were more likely to ultimately win the game, and the latter
found an improvement in shot-making efficiency among trailing teams.
Several aspects of baseball have also been used to study reference-dependent preferences.
Moskowitz and Wertheim (2011) found evidence of loss aversion within MLB at-bats through the use of
Pitch f/x data, a tool for graphical analysis of pitch-by-pitch data7. The authors claimed that both
pitchers and batters adjusted their risk-seeking behaviors depending on the count of balls and strikes
within individual at-bats. Pope and Simonsohn (2011) found evidence in MLB regular season batting
statistics that supported the use of round numbers as reference points, concluding that significantly
more batters finished regular seasons with averages of .300 than with averages of .299. Pedace and
Smith (2012) observed loss-averse tendencies in baseball managerial decisions, finding that general
managers were more likely to retain poor performers in whom they had originally invested.
2.3 Baseball
A complementary goal of this study is to investigate the effectiveness of common closer usage.
Closers have been used in a confined role, mostly restricted to one-inning outings in save situations, for
about the last 20 years. An incongruous trend has developed, as this increased role specification, which
leads to fewer innings pitched, has been coupled with increasing salary levels. This raises questions of
labor allocation efficiency. Previous research has delved into such issues: Jazayerli (2000) finds that
closers have the second-most potential impact in tie games (an NSS in which they are thus less frequently
used), while the protection of a three-run lead in the final inning (an SS) has less impact on the game’s
outcome than achieving a scoreless first inning. Wyers (2012) finds that the fixation of managers on
preserving closers for SS game states in the final inning often backfires; it yields too few opportunities
for closers to pitch, and thus relegates the highly skilled relievers to waste labor on low-leverage
situations after stretches of inactivity.
7 See Brooks Baseball (Brooks) for examples of the Pitch f/x tool.
Despite these drawbacks, the current system may generate non-obvious benefits that lead to
increased performance, and this is the area to which this study contributes. The player quotes cited in
the introduction suggest that closers attach particular importance to save situations; if a closer is not as
personally motivated to perform in tied game situations, his incentives are misaligned in a way that
may detract from team success. The potential benefits of the well-defined role of closers have also
been considered in previous literature (Tango et al., 2006; Carleton, 2012a). This paper’s analysis sheds
empirical light on the effects of game states on the group of closers studied.
3. Data
To analyze the effects of game situations on closers, this study examines data at the plate
appearance level. A plate appearance8 is an interaction that involves a pitcher and a batter and occurs at
least six times every inning. Since one of the goals of the paper is to examine the effect of the SS game
state, the pitcher pool in a given season was limited to those who had been closers in a significant
capacity at some point during the season. The sample of pitchers includes only those who earned at
least five saves and finished at least ten games in the season. Games finished simply means the pitcher
was the last to pitch for his team in the given game. The data used to refine the sample were acquired
from Thebaseballcube.com, a source of baseball statistics.
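The sample cutoffs can be expressed as a simple filter. The sketch below is illustrative only: the record layout, field names and the three pitcher-seasons are hypothetical, and only the two thresholds (at least five saves and at least ten games finished) come from the text.

```python
from dataclasses import dataclass

MIN_SAVES = 5            # cutoff stated in the text
MIN_GAMES_FINISHED = 10  # cutoff stated in the text


@dataclass
class PitcherSeason:
    """Hypothetical season-level record for one pitcher."""
    name: str
    season: int
    saves: int
    games_finished: int


def is_sample_closer(p: PitcherSeason) -> bool:
    """True when the pitcher-season meets both sample cutoffs."""
    return p.saves >= MIN_SAVES and p.games_finished >= MIN_GAMES_FINISHED


# Made-up pitcher-seasons, used only to exercise the filter.
pitcher_seasons = [
    PitcherSeason("Pitcher A", 2011, saves=32, games_finished=55),
    PitcherSeason("Pitcher B", 2011, saves=4, games_finished=30),
    PitcherSeason("Pitcher C", 2000, saves=7, games_finished=9),
]

sample = [p.name for p in pitcher_seasons if is_sample_closer(p)]
print(sample)
```

Only the first record passes, since the second falls short on saves and the third on games finished.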
8 Though the terms at-bats and plate appearances are used interchangeably throughout the text, it should be noted that the
data sample consists of all plate appearances involving the closers. Walks are included in the sample.
Two seasons were selected for inclusion in the study: 2000 and 2011. The latter was chosen
because it was the final completed season when the project was started, and the former was selected as
a counterbalance in terms of league-wide batting success. Offensive output has been trending
downward in the MLB in recent years, potentially because of the end of widespread steroid usage (Stark,
2012). The league-average On-Base Percentage (OBP), a measure of offensive success, was 0.321 in
2011, then the lowest in 23 seasons. The 2000 season, meanwhile, had a league-average OBP of 0.345,
tied for the highest mark in the previous 62 seasons. Though the model specification includes a control
for player skill, which incorporates these league-wide averages within seasons, this extra measure was
taken. The season choices also generated a varied sample of pitchers. The cutoffs specified above
yielded 48 pitchers in each season9. Only two pitchers, Jason Isringhausen and Mariano Rivera, were
featured in both sample seasons. Appendix Table A-1 contains descriptive data regarding all of the
pitchers in the sample.
Details about each plate appearance were acquired through the Play Index tool of BaseballReference.com. Specifically, the “Pitching Event Finder” sub-tool was used to isolate all plate
appearances involving each pitcher within the two sample seasons. In total, 26,223 plate appearances
were collected in this manner across the 96 pitcher-seasons. The Play Index output provided almost all
of the necessary information for the specifications described in Section 4, including opposing batter,
score, inning, outcome of the plate appearance, and the Leverage Index (LI).
The LI is a sabermetric statistic that plays a major role in this study’s specifications and analysis.
The statistic, as defined by its creator, Tom Tango (2006), is the “swing in the possible change in win
probability” of an at-bat. The measure depends on the score, the inning, the number of outs, and the
number and position of runners on base10. The LI is found through a summation of the probability of the
occurrence of each at-bat outcome multiplied by the corresponding change in win probability of the
team. LI is a standardized statistic, which provides valuable intuition: a typical at-bat has an LI of one, so
it can be said that an at-bat with an LI of two is twice as important to the game outcome as a typical at-bat. The average LI of plate appearances in the ninth inning or later, the game period in which closers
are primarily used, is 1.33, approximately 37 percent more important than the average event during the
first eight innings (Wyers, 2012). The benefit of this statistic is that it quantifies the importance of game
situations, which provides a way of measuring the stakes, and changes in implied pressure, faced by
closers in individual plate appearances.
9 A 49th in the 2000 season, Jose Jimenez of the Colorado Rockies, was omitted because of difficulties with data obtainment.
10 For tables that include Leverage Index statistics for every possible game situation, see Tango (2007).
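A rough sketch of the Leverage Index calculation described above follows. It rests on two assumptions that the text does not spell out: that the swing is measured as the absolute change in win probability, and that standardization divides by the average swing across all situations. The outcome probabilities, win probability changes and the average swing of 0.03 are invented for illustration and are not taken from Tango's tables.

```python
def expected_swing(outcomes):
    """Probability-weighted absolute change in win probability.

    `outcomes` is a list of (probability, change_in_win_probability)
    pairs covering every possible result of the at-bat.
    """
    return sum(prob * abs(delta_wp) for prob, delta_wp in outcomes)


def leverage_index(outcomes, average_swing):
    """Expected swing scaled so that a typical at-bat has an LI of one."""
    return expected_swing(outcomes) / average_swing


AVERAGE_SWING = 0.03  # hypothetical league-wide average swing

# Hypothetical two-outcome at-bats (batter out vs. batter reaches base).
tense_at_bat = [(0.5, 0.06), (0.5, -0.06)]
quiet_at_bat = [(0.5, 0.015), (0.5, -0.015)]

print(leverage_index(tense_at_bat, AVERAGE_SWING))
print(leverage_index(quiet_at_bat, AVERAGE_SWING))
```

Under these made-up inputs the first at-bat carries twice the typical stakes and the second carries half, which matches the interpretation of LI values given in the text.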
The contract statuses of the closers were one statistic not directly available through the Play
Index. These data points were collected from Cot’s Baseball Contracts, which is found on
baseballprospectus.com.
4. Empirical Methodology and Hypotheses
4.1 Loss Aversion
The uncontrolled statistical evidence and anecdotes about closers’ performance in SS and NSS,
cited in the introduction, do not suggest the presence of loss-averse preferences. While the studies cited
in Section 2.2 involve improvements in performance by players on trailing teams, closers excel in
pressure-packed situations that seem to place them in the gain domain, since their teams are ahead in
SS. But such a straightforward loss aversion model is difficult to apply to closers. Since they are
traditionally deployed to protect leads, it can be argued that SS are in the loss domain; the closer can, at
best, maintain the score margin to not lose ground, and, at worst, lose the lead for the team. In addition,
the gain domain of closers is not a true gain domain, because there is nothing non-preventative that can
be attained in an appearance. This identifies a downside of aggregating closer performance on the
appearance level, as opposed to the at-bat level used for analysis in subsequent sections. While Pope
and Schweitzer narrowly bracketed golfers’ effort selection, thus creating individual utility functions
weighing the marginal cost and benefit of effort on each putt, the aggregate nature of a closer’s
success or failure makes such utility maximization unsuitable for the scenarios studied.
However, the margins of team leads and deficits at which closers enter games do provide a
testable context for loss aversion. All else equal, a loss-averse group of agents would yield fewer runs
when the lead margin is at its narrowest; the benefit of a closer’s effort exertion would be highest when
he is attempting to prevent the tying run from scoring. Extending this idea, the cost of yielding a run
would be significantly smaller if the closer enters in a more relaxed SS state, such as when his team is
ahead by three runs. The benefit of effort exertion would be smaller, initially, and a loss-averse closer
may be expected to yield one or two runs more frequently in this scenario. Pitchers have acknowledged
such a mindset: Jose Valverde, a closer for the Detroit Tigers included in the sample, said, “If I give up a
couple of runs, it doesn’t bother me. I just want to get the save. As long as I get a save so my team wins,
it doesn’t matter” (Meisel, 2012). This statement, however, may also be interpreted as an indication of
rational maximization behavior, in which the only goal of the player is a team victory.
Scatter plots of earned runs allowed per appearance, depending on the run margin at game
entry, are plotted in Figure 3. The 2,940 game appearances of the 48 closers from the 2011 season
compose the sample points. The lowess11 regression curve in Figure 3a does not resemble an S-shaped value
function that would arise under the behavioral bias discussed. There is no evident significant increase in
runs allowed as the team’s lead increases. Figure 3b restricts the sample to appearances in which at
least one run was allowed, which compose only 21 percent of the total sample. The slope of the fitted
curve again does not reveal any particular trends regarding the average number of runs yielded in the
different run margins of SS and NSS states.
Figure 4 plots the same data points, now with adjustments for observation frequency. The size
of each point is conditional on its frequency in the sample. As evident, the largest clusters form in the SS
region on the horizontal axis, where the run difference is +1 to +3, and at the zero runs allowed mark on
the vertical axis. The three most frequent observations in the sample of 2011 closers are margins of one, two and
three runs in the pitcher’s favor without any earned runs allowed in the appearance. There are 1,443 of
these points combined, composing 49 percent of the total data set. Such a distribution hints at several
themes already discussed, such as the tendency to use closers in SS states and the generally high skill
level of this group of pitchers.
11 Locally weighted scatterplot smoothing is a form of nonparametric regression.
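The frequency adjustment behind such a plot amounts to counting how often each (run margin, earned runs) pair occurs. The sketch below uses a small set of invented appearances, not the 2,940 appearances in the sample, and approximates a save situation by an entry margin of +1 to +3 only, ignoring the less common SS definitions.

```python
from collections import Counter

# Each tuple is (run margin at entry, earned runs allowed); all rows are
# hypothetical and serve only to illustrate the counting step.
appearances = (
    [(1, 0)] * 4 + [(2, 0)] * 3 + [(3, 0)] * 2
    + [(3, 1), (0, 0), (-1, 2), (4, 0)]
)

# Frequency of each point; these counts would set the marker sizes.
counts = Counter(appearances)
top_three = counts.most_common(3)


def is_save_margin(margin):
    """Simplified SS check based on the entry margin alone."""
    return 1 <= margin <= 3


# Appearances entered with a +1 to +3 lead and no earned runs allowed.
clean_ss = sum(
    n for (margin, runs), n in counts.items()
    if is_save_margin(margin) and runs == 0
)
share = clean_ss / len(appearances)

print(top_three)
print(clean_ss, round(share, 2))
```

The same share computed on the actual sample corresponds to the 1,443 of 2,940 appearances reported in the text.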
4.2 Primary Specification
The primary goal in this study was to test whether closers’ improved performance in SS persists
after appropriate controls are included, or whether other situational variables affected performance
more significantly. A logistic regression with a binary dependent variable was applied to the data, similar
to the specifications used in the Clark et al. (2013) study of field goals in football and the baseball
studies of Carleton (2009). Specifically, a binary logit regression was used, which incorporates a
dependent variable that represents a proportion of “successes” between zero and one in the sample
data. The dependent variable takes on the value below, with the variable p representing probability of
success. The logit function takes the log of the odds of the “success” outcome, as evident in (2). This is
the reason that the logit function is also known as “log-odds.”
(2)  $\operatorname{logit}(p) = \ln\left[\frac{p}{1 - p}\right]$
A generalized logit regression is presented below, with k independent variables and $p_i$ taking on
the value shown in (4). The logit model estimates the probability of the dependent variable taking on
the value of one, or, in other words, of a “success” occurring. The model uses the cumulative standard
logistic distribution, represented by F in (5).
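(3)  $\ln\left[\frac{p_i}{1 - p_i}\right] = \beta_0 + \beta_1 x_{1,i} + \cdots + \beta_k x_{k,i}$
(4)  $p_i = \Pr(Y_i = 1 \mid X_i)$
(5)  $\Pr(Y_i = 1 \mid X_i) = F(\beta_0 + \beta_1 x_{1,i} + \cdots + \beta_k x_{k,i}) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_{1,i} + \cdots + \beta_k x_{k,i})}}$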
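The relationship between the log-odds and the cumulative logistic distribution can be illustrated with a few lines of code. This is a generic sketch of the logit link rather than the estimation code used in the study; the function names and the coefficient values in the example are hypothetical.

```python
import math


def logit(p):
    """Log of the odds of success for a probability p in (0, 1)."""
    return math.log(p / (1 - p))


def logistic_cdf(z):
    """Cumulative standard logistic distribution, the inverse of logit."""
    return 1 / (1 + math.exp(-z))


def predicted_probability(coefs, x):
    """Pr(Y = 1 | x) for coefficients [b0, b1, ..., bk] and covariates x."""
    linear_index = coefs[0] + sum(b * xi for b, xi in zip(coefs[1:], x))
    return logistic_cdf(linear_index)


# Round trip: applying F to the log-odds recovers the probability.
print(logistic_cdf(logit(0.3)))

# Hypothetical coefficients with one covariate, for illustration only.
print(predicted_probability([-0.7, -0.1], [1.0]))
```

Because F is the inverse of the logit function, a linear index of zero always maps to a probability of one half.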
For the main effect independent variables in such a specification12, the coefficients of the logit
model can be intuitively interpreted through exponentiation with base e. This calculation provides odds
ratios that more clearly describe the relationship between the variables of interest and the binary
outcome variable. In this study, the outcome variable is OnBase, which takes on two values at the
conclusion of each at-bat in the sample.
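(6)  $OnBase_i = \begin{cases} 1, & \text{if the batter reaches base in at-bat } i \\ 0, & \text{otherwise} \end{cases}$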
This outcome variable was selected to allow for analysis at the at-bat level, which is amenable to
the inclusion of key controls and more precise observation of the situational impact. Note, however,
that the outcome variable in the introductory discussion was aggregate Earned Run Average (ERA). To
assuage potential concerns regarding the switch to at-bats, Table 1 shows the mean OnBase
percentage13 yielded by pitchers in the sample in SS and in NSS. The pattern seen in ERA holds; pitchers
are significantly less likely to allow baserunners in SS than in NSS, t(26221) = 3.88, p<0.001.
12 The intercept and interaction variables cannot be interpreted directly, which is discussed further in Section 5.
13 This percentage should not be confused with the previously cited statistic OBP (On-base percentage).
The primary variable of interest, SaveSituation, is also binary, and it can take on the two potential
values in (7). The naïve regression incorporating the primary variable of interest is expressed in (8),
where F is again the cumulative logistic distribution function.
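(7)  $SaveSituation_i = \begin{cases} 1, & \text{if at-bat } i \text{ occurs in a save situation} \\ 0, & \text{otherwise} \end{cases}$
(8)  $\Pr(OnBase_i = 1 \mid SaveSituation_i) = F(\beta_0 + \beta_1 SaveSituation_i)$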
This naïve specification can be considered a representation of the evidence previously used to
remark on the disparity in closers’ performance, decomposed to an at-bat level. A negative β1 coefficient
yielded by (8) would suggest that the chance of a closer allowing someone to reach base decreases in an
SS, while a positive coefficient would suggest the opposite.
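For a logit model with only an intercept and a single binary regressor, the maximum likelihood estimates have a closed form: the intercept is the log-odds of reaching base in NSS, and the slope is the difference in log-odds between SS and NSS. The sketch below uses that property to show how the sign of the coefficient and its exponentiated odds ratio are read. The counts are hypothetical and are not the study's data.

```python
import math


def log_odds(successes, total):
    """Log-odds implied by an observed success rate."""
    p = successes / total
    return math.log(p / (1 - p))


def naive_logit(on_base_nss, pa_nss, on_base_ss, pa_ss):
    """Closed-form coefficients for a logit with one binary regressor."""
    beta0 = log_odds(on_base_nss, pa_nss)       # NSS baseline
    beta1 = log_odds(on_base_ss, pa_ss) - beta0  # shift in SS
    return beta0, beta1


# Hypothetical counts: batters reaching base out of plate appearances.
beta0, beta1 = naive_logit(on_base_nss=330, pa_nss=1000,
                           on_base_ss=300, pa_ss=1000)

# Exponentiating the SS coefficient gives the odds ratio of SS to NSS.
odds_ratio = math.exp(beta1)

print(round(beta1, 3), round(odds_ratio, 3))
```

With these made-up counts the coefficient is negative and the odds ratio is below one, which is the pattern that would indicate fewer baserunners allowed in save situations.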
The complete specification is shown in (9). The subscripts i,p,s represent at-bat i involving
pitcher p that occurred in season s, and X is the vector of covariates. The two terms preceding the error
term are fixed effects for pitchers and seasons, respectively. Standard errors are robust and clustered by
individual pitchers.
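(9)  $\Pr(OnBase_{i,p,s} = 1 \mid X_{i,p,s}) = F(\beta_0 + \beta_1 SaveSituation_{i,p,s} + \beta_2 LeverageIndex_{i,p,s} + \beta_3 RunDiff_{i,p,s} + \beta_4 Home_{i,p,s} + \beta_5 Contract_{i,p,s} + \beta_6 \ln[ExpectedOddsRatio_{i,p,s}] + \alpha_p + \gamma_s + \varepsilon_{i,p,s})$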
LeverageIndex is the plate appearance Leverage Index value, which, as discussed previously,
conveys the implicit pressure in an at-bat through its potential effect on the ultimate game outcome.
RunDiff, or run differential, measures the difference in team scores at each at-bat from the perspective
of the pitcher’s team. The next two terms, Home and Contract, are binary variables taking on the values
shown in (10) and (11). The inclusion of Home was motivated by the findings of Goldman and Rao (2012)
discussed in Section 2.1, as well as the work of Moskowitz and Wertheim (2011), who found an away-pitcher disadvantage linking increases in situational Leverage Index to decreases in the rate of strike
calls made by umpires during home at-bats. Contract was included to provide insight into the ambiguous
effect of “contract years,” or the final year of a player contract before he reaches free agency, on player
performance.14
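(10)  $Home_{i,p,s} = \begin{cases} 1, & \text{if the pitcher's team is the home team} \\ 0, & \text{otherwise} \end{cases}$
(11)  $Contract_{i,p,s} = \begin{cases} 1, & \text{if the pitcher is in the final year of his contract} \\ 0, & \text{otherwise} \end{cases}$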
The final variable, ExpectedOddsRatio, is critical. It controls for batter and pitcher skill by
generating an expected outcome for each at-bat using the season statistics of the players involved. This
control method is adapted from Carleton (2009). The variable is constructed by creating odds ratios (OR)
for both players using season On-Base Percentages (OBP)15. The statistic used for pitchers is OBP against,
which measures the average success rate of the opponent batters faced. OBP is already a probability
in [0,1], so the odds ratio formula is directly applied, as shown in (12).
(12)
14
Van-Riper (2010) suggests no impact of contract years on batter performance, while Huckabay (2003) found a significant
difference in performance among batters but no significant difference in performance among pitchers.
15
The equation for OBP is (hits + walks + hit by pitches) / (at-bats + walks + hit by pitches + sacrifice flies). The statistic is more
comprehensive than batting average (BA), which is simply hits/at-bats.
Combining the batter, pitcher, and league average odds ratios16 yields (13), and solving for the
variable of interest yields (14). Note that the natural log of the result is then taken before applying to
the binary logistic model in (9). This variable controls for player ability, which allows for more precise
analysis of the variables of interest. As discussed in Carleton (2009), the coefficient of this control term should
be approximately equal to one.
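$$\frac{\mathit{ExpectedOddsRatio}}{\mathit{OR}_{league}} = \frac{\mathit{OR}_{batter}}{\mathit{OR}_{league}} \times \frac{\mathit{OR}_{pitcher}}{\mathit{OR}_{league}} \tag{13}$$
$$\mathit{ExpectedOddsRatio} = \frac{(\mathit{OR}_{batter})(\mathit{OR}_{pitcher})}{\mathit{OR}_{league}} \tag{14}$$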
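The construction of the skill control in (12) through (14) can be illustrated with a short sketch. It assumes the standard odds-ratio combination rule, in which the batter and pitcher odds are multiplied and divided by the league odds; the function names and the numbers in the usage line are hypothetical and are not taken from the study's data.

```python
import math


def odds(p: float) -> float:
    """Convert a probability such as OBP into odds, p / (1 - p)."""
    if not 0.0 < p < 1.0:
        raise ValueError("probability must lie strictly between 0 and 1")
    return p / (1.0 - p)


def expected_log_odds(batter_obp: float,
                      pitcher_obp_against: float,
                      league_obp: float) -> float:
    """Natural log of the expected odds of the batter reaching base.

    Assumption: batter and pitcher odds are combined multiplicatively
    and scaled by the league-average odds.
    """
    expected = odds(batter_obp) * odds(pitcher_obp_against) / odds(league_obp)
    return math.log(expected)


# Hypothetical matchup: strong batter, strong pitcher, average league.
control = expected_log_odds(0.380, 0.300, 0.330)
```

Under this rule, a league-average batter facing a league-average pitcher yields exactly the league-average odds, which is why a coefficient near one on the logged control is the expected benchmark.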
Pitcher fixed effects are included to account for individual, time-invariant pitcher characteristics
that may otherwise affect the results but get overlooked on the at-bat scale. Season fixed effects are
included to represent league-wide changes in hitting and pitching performance.17
4.3
Interaction Specification
Another purpose of this study is to investigate the effect of other situational variables on the
variation in closers’ performance levels in their two overarching game states, SS and NSS. To study this
topic, the specification in (9) is expanded to include interaction terms, as seen in (15).
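$$\Pr(\mathit{OnBase}_{ips} = 1 \mid X) = F\big(\beta_0 + \beta_1 \mathit{SS}_{ips} + \beta_2 \mathit{LI}_{ips} + \beta_3 \mathit{RunDiff}_{ips} + \beta_4 \mathit{Home}_{ips} + \beta_5 \mathit{Contract}_{ips} + \beta_6 \ln(\mathit{ExpectedOddsRatio}_{ips}) + \beta_7 (\mathit{LI} \times \mathit{SS})_{ips} + \beta_8 (\mathit{RunDiff} \times \mathit{SS})_{ips} + \beta_9 (\mathit{Home} \times \mathit{SS})_{ips} + \beta_{10} (\mathit{Contract} \times \mathit{SS})_{ips} + \gamma_p + \delta_s + \varepsilon_{ips}\big) \tag{15}$$
16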
The league average OR uses the average OBP of all hitters in the sample season. The league average OBP against for pitchers
is identical.
17
The average number of runs per game in 2011, the final year of the sample, was the lowest since the 1992 season.
Four independent variables are interacted with SS to study the potential effects on performance
of these joint situations. There is one interaction between a continuous and a binary variable (LI*SS),
one interaction between a categorical and a binary variable (RunDiff*SS), and two interactions between
two binary variables (Home*SS, Contract*SS). One noteworthy change is the new restriction on RunDiff
value in the interaction term; by definition, the only potential run differences in an SS are one, two and
three. The main effect variables that correspond to the interacted terms now represent the effects of
the respective variables in the NSS state.
4.4
Hypotheses
The primary coefficient of interest in the main specification is β1, the coefficient of the SS
variable. A straightforward Wald test of significance on this coefficient is the first hypothesis tested, with
a null hypothesis of β1 =0. The uncontrolled data would suggest that β1 < 0, meaning that pitchers allow
fewer baserunners in SS than in NSS. Similar Wald tests for significant differences from zero are
calculated for all of the regressor coefficients in (9). For the interacted specification, the focus is on the
coefficients of the four interacted terms and their main effects companions. The significance levels of
these variables can provide insight regarding their varying effects by game state. Section 5 also contains
discussions of additional specifications that contribute to the analysis, and these incorporate other
significance tests that are discussed when appropriate.
5.
Results
5.1
Summary Statistics
Table 2 reports summary statistics for the variables used in both the primary and interaction
specifications. Table 3 provides frequencies and means of key variables in the SS and NSS game states.
The latter table provides a clearer picture of the differences in the two game states; for instance, it
presents accurate mean values for the interacted terms that do not include the zeroes accumulated
through NSS sample at-bats.
The mean Leverage Index cited in Table 3 is of particular interest. The mean LI for closers has
previously been reported to be 1.8 (Cameron, 2010), and this study’s sample mean is quite similar with a
rate of 1.7 for SS and NSS combined. The data suggests that in SS, the average plate appearance is about
138 percent more important to the game’s eventual outcome than an average plate appearance.18
Figure 5 adds further clarity to the increased pressure experienced by closers in the sample, as it
compares the percentage of at-bats closers face in subdivisions of LI to the typical breakdown of at-bat
importance in a full game (Appelman, 2008). About a third of all at-bats in the sample are more than
twice as important to the game outcome as a typical at-bat (LI ≥2). This breakdown suggests that though
closers are not substituted in games based on measures of LI, the SS acts as an imperfect proxy that still
tends to place the best relief pitchers in the most important game situations. Figure 6 depicts kernel
density plots for the LI distribution in SS and NSS states within the sample plate appearances. The
distribution is more positively skewed for NSS, which indicates that closers tend to face higher LI levels
in SS.
Figures 7 and 8 incorporate run margin into the discussion of the sample data. The distribution
of plate appearances by run difference levels shows that the four most common margins in the sample
are tied games and plate appearances in the +1 to +3 range that constitute SS. The vertical axis in Figure
18
Part of this wide discrepancy is endogenous to the definition of Leverage Index, since it depends on the inning in which the
at-bat took place, among other factors. This fact is discussed in Section 7.
8 represents the plate appearance Leverage Index, which shows the expected decreases in LI as the
magnitude of run margin increases in both the positive and negative directions.
The pairwise correlation coefficients of potentially related variables are reported in Table 4.
Many of the variables indeed exhibit a significant (p < 0.05) correlation. Among the correlations
between the outcome variable and regressors, both SS and RunDiff have a significant negative
correlation with On-base, which partly confirms previously discussed hypotheses. In addition, there are
significant positive correlations between LI and SS, RunDiff and SS, and RunDiff and LI. These correlations
raise multicollinearity concerns, though these are largely eased in the robustness discussion of Section 7.
5.2
Primary Specification Results
Table 5 presents the regression results of the naïve specification (8) and of a specification that
only adds the control for pitcher-batter matchup. Though the naïve specification, as expected, suggests
significant omitted variable bias through its significant constant term (p<0.001), Columns (3) and (4)
show a controlled significant effect of SS on plate appearance outcome. The results suggest a significant
negative effect of the SS game state on the probability of a pitcher yielding a baserunner, as suggested
by the results reported in Table 1. The binary logit specification allows for intuitive interpretation of the
effect’s magnitude, both in this simple pair of regressions and in the full specification. The coefficients
are exponentiated with base e (for any coefficient β_n = x, e^{β_n}) and this transformation allows the
regressor’s effect to be interpreted as a percentage or factor change. The odds ratio in Column (4)
suggests that a pitcher is 8.7 percent (1-0.913) more likely to get a batter out in an SS than in an NSS,
controlling for the skill levels of both players.
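The conversion from a logit coefficient to an odds ratio and a percentage change can be sketched as follows. The coefficient is backed out from the reported odds ratio of 0.913 rather than read from Table 5, and the helper names are illustrative. Strictly, the percentage describes the change in the odds of the batter reaching base.

```python
import math


def odds_ratio(beta: float) -> float:
    """Exponentiate a logit coefficient to obtain its odds ratio."""
    return math.exp(beta)


def percent_change_in_odds(beta: float) -> float:
    """Percentage change in the odds for a one-unit increase in the regressor."""
    return (math.exp(beta) - 1.0) * 100.0


# Coefficient implied by the reported Column (4) odds ratio (assumed, not quoted).
beta_ss = math.log(0.913)
change = percent_change_in_odds(beta_ss)  # about -8.7, i.e. 8.7 percent lower odds
```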
The results of the full specification of (9), along with several variations, are presented in Table 6.
The output is organized into four pairs, the left column of each showing the binary logit coefficient
output and the right column showing the odds ratio coefficients. The most striking result of the full
specification is the negative and significant (p<0.01) coefficient on SS. The effect observed in the naïve
results persisted and increased in magnitude after the addition of covariates that may have affected the
observed effect of the SS game state on at-bat outcome. By switching from an NSS state to an SS state, a
pitcher is 11.7 percent (1-0.883) more likely to get the batter out, all else equal.
LI has a positive and significant (at the 10 percent level) coefficient that indicates the effect of
increases in situational game pressure on at-bat outcome. The data suggests that for a unit increase in
Leverage Index, the chance of a pitcher allowing a baserunner increases; specifically, column (2) shows
that the magnitude of this increase is approximately 1.7 percent. Columns (3) and (4) present a
specification that omits the LI variable, as it had a significant and positive pairwise correlation with SS in
Table 4. The SS coefficient remains significant and negative, though the magnitude of the effect
decreases slightly. The Home and Contract coefficients, meanwhile, become significant at the 10 percent
level with opposite signs in this specification. The data suggests that pitching at home results in a 5.3
percent decrease in the chance of yielding a baserunner, while pitchers in their contract years suffer a
2.7 percent increase. In Columns (5) and (6), which report results of a specification that omits the SS
variable, the only significant result19 is a similarly deleterious effect of Contract on at-bat outcome, from
the pitcher’s perspective.
The final pair of columns presents a specification that includes fixed effects for each margin of
run difference. This eliminates all significant effects of the independent variables of interest. The
run difference coefficients (not reported) vary widely in sign, magnitude, and
significance. The individual pitcher fixed effects, which were included in each of the four specifications,
are also not reported. These individual fixed effects yielded coefficients with varying signs, as well as
wide-ranging magnitudes and significances, and the heterogeneity among individual pitchers is explored
further in Section 6. The low pseudo R² values for these binary logit specifications are not a concern;
19
The coefficient on ln(ExpectedOddsRatio) is also significantly different from zero (p<0.001) in each of the specifications, but it
is a control variable. As expected, the coefficient on the control term is approximately equal to one in the logit model.
logistic regression models tend to have low pseudo R² (Lunt, 2012). The robustness discussion in Section
7 reports goodness-of-fit test results that support the use of this logistic regression specification.
A potential drawback of logit specifications is the misinterpretation of coefficients caused by
changing derivatives along the logit curve, particularly at its non-linear, extreme values. To test this
possibility and confirm the consistency of the results discussed, marginal effects were measured20 at the
mean values of the regressors, as well as at the means of four different subsections of the data. The
coefficients represent the instantaneous rate of change of the outcome variable, as expressed in (16),
where x_k represents independent variable k. The results, shown in Table 7, remain consistent both at the
mean values of all variables and within the limited samples of Columns (2)-(6). This suggests no
significant changes in observed effects caused by the use of a logit specification. Apart from the negative
effect of the SS state on at-bat outcome, it is worth noting the persistent and significant positive
marginal effect of the contract variable.
$$\frac{\partial \Pr(\mathit{OnBase} = 1 \mid X)}{\partial x_k} \tag{16}$$
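For a logit model, the derivative in (16) has the closed form β_k · p · (1 − p), where p is the predicted probability at the evaluation point. The sketch below illustrates that formula only; it is not the dlogit2 routine, and the evaluation point used in the example is an assumption made for illustration.

```python
import math


def logistic(z: float) -> float:
    """Logistic cumulative distribution function."""
    return 1.0 / (1.0 + math.exp(-z))


def logit_marginal_effect(beta_k: float, z: float) -> float:
    """Instantaneous change in Pr(y = 1) per unit of x_k at linear index z."""
    p = logistic(z)
    return beta_k * p * (1.0 - p)


# Illustration: evaluate at the sample on-base rate of 0.314 with the SS
# coefficient implied by the reported odds ratio of 0.883 (both assumed inputs).
z_at_base_rate = math.log(0.314 / (1.0 - 0.314))
me_ss = logit_marginal_effect(math.log(0.883), z_at_base_rate)
```

With these inputs the marginal effect is roughly -0.027. Because p · (1 − p) changes slowly near the sample on-base rate, the effect is stable across evaluation points in that region.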
Overall, the regression results stemming from the primary specification suggest a significant,
beneficial effect of the SS game state on the performance of closers. Closers are significantly more likely,
on average, to get the opposing batter out in matchups in the SS state than in the NSS state.
5.3
Interaction Specification Results
The discussion now turns to potential interaction effects between the covariates and these two
game states. Table 8 shows the regression output of (15) and of several variations of the model.
The results in column (1) suggest that LI and LI*SS are the two variables with coefficients
significantly different from zero. The data suggests that for a one-unit increase in Leverage Index in an
20
The function dlogit2 was used in Stata to generate the marginal effects (Sribney, 1996).
NSS, the chance of a pitcher allowing a batter to reach base increases. Specifically, column (2) shows
that for a one-unit increase in LI in an NSS, the odds of a pitcher allowing a baserunner increase by a
factor of 1.038. In percentage change terms, a batter is 3.8 percent more likely to successfully reach
base with a unit increase of LI in an NSS. As the situational pressure rises in an NSS, the batter is more
likely to succeed.
The LI*SS coefficient is also significant, but it must be interpreted with caution. Interaction
effects in binary logit specifications present a challenge in proper interpretation. The multiplicative
effect of the variable, as termed by Buis (2010), is discussed first, as it can be observed directly from
the column (2) odds ratio results. The effect of a one-unit increase in LI in an SS is 0.94 times the effect
of a one-unit increase in LI in an NSS. For further analysis, however, regarding the interaction variables’
statistical significance and marginal effects, additional calculations must be reported. As discussed in
Norton et al. (2003), the interaction effect of non-linear models can vary in magnitude, significance, and
sign (positive or negative) across observations, so the reported coefficients in Table 8 do not equal the
true marginal effects of the interactions21. Norton et al. (2004) presents a solution for this problem with
a Stata command, inteff, which correctly calculates the mean magnitudes of interaction effects in the
logit model with appropriate signs and significance levels.
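The point that an interaction coefficient is not the interaction effect can be made concrete for one continuous variable (LI) and one binary variable (SS). The sketch below computes the difference between the LI marginal effect with SS = 1 and with SS = 0. It illustrates the formula only and is not the inteff implementation; all argument names and the numbers in the example are hypothetical.

```python
import math


def logistic_pdf(z: float) -> float:
    """Derivative of the logistic CDF at z."""
    p = 1.0 / (1.0 + math.exp(-z))
    return p * (1.0 - p)


def interaction_effect(b_li: float, b_ss: float, b_li_ss: float,
                       li: float, rest: float) -> float:
    """Change in the LI marginal effect when SS switches from 0 to 1.

    `rest` is the part of the linear index contributed by the constant
    and all other covariates at the evaluation point.
    """
    z_ss = rest + b_ss + (b_li + b_li_ss) * li
    z_nss = rest + b_li * li
    return (b_li + b_li_ss) * logistic_pdf(z_ss) - b_li * logistic_pdf(z_nss)


# Even with a zero interaction coefficient, the effect is generally nonzero.
effect = interaction_effect(b_li=0.5, b_ss=1.0, b_li_ss=0.0, li=1.0, rest=0.0)
```

Because the result depends on `li` and `rest`, it differs across observations, which is why the mean effect and its distribution are reported rather than a single coefficient.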
Table 9, and the accompanying Figures 9 and 10, provide an accurate portrayal of the marginal
effect of a one-unit increase in LI in an SS state22. The mean marginal effect is -0.013, and the effect is
significantly different from zero (mean z=-8.46). In addition, as seen in Figure 9, no complications arise
from varying signs of the interaction effect across the sample data. The effect of the increase in
situational pressure conveyed through LI in the SS state is negative across all of the observed outcomes.
21
Precisely, the interaction effect in a non-linear model equals the cross partial derivative of the expected value of the
dependent variable with respect to the two interacted variables, not the partial derivative of only the interacted term. Refer to
Norton et al. (2004) for a complete discussion.
22
The proper mean interaction effects for the insignificant interaction variables are presented in Appendix Table A-2.
The significance and uniformity of the sign allows conclusions to be drawn from the reported magnitude.
The data suggests that the chance of a pitcher allowing a baserunner decreases by a factor of 0.987
(e^{-0.013}) in an SS relative to an NSS. In percentage change terms, a pitcher is 1.3 percent (1-0.987) less
likely to yield a baserunner in an SS. Therefore, the overall effect of a one-unit increase in LI in an SS is a
2.5 percent increase in probability of the opposing batter reaching base. When compared to the 3.8
percent increase in probability of batter success in an NSS, it is evident that pitchers perform better
under rising situational pressure when it occurs in an SS game state. This result supports the main
effects discussed in Section 5.2.
The variation specifications presented in Table 8 present several noteworthy results. The
exclusion of all terms including LI in Columns (3) and (4) does not lead to significance in the other
interacted variables, while the exclusion of the RunDiff terms in Columns (5) and (6) does not lead to
insignificance in the LI terms.
Figures 10 and 11 graphically represent the disparate effects of LI in the NSS and SS states. The
vertical axes show the average on-base proportion yielded in plate appearances within bounds of
Leverage Index. The means of these LI ranges, which were [-0.5,4] in intervals of 0.5 for Figure 10 and
[0,4] in intervals of 1 for Figure 11, were calculated and are used as the precise x-coordinates for the
data points. The first graph of each figure shows a linear regression and the second shows a lowess
curve fit to the data. The figures reinforce the primary finding in this regression specification: they
present an evident positive relationship between increases in LI and on-base proportion in NSS, but a
relative negative effect of increases in LI in the SS game state on the probability of reaching base.
Table 10 presents the marginal effects of the interacted specification at the means and within
the restricted samples also used in Table 7. Again, there are no apparent issues with the logit
specification, as the LI and LI*SS coefficients remain almost identical across the specifications. The
marginal effects of the other variables, which previously had insignificant coefficients, remain
insignificantly different from zero.
5.4
Alternate Specifications
Linear regression specifications were also applied to both the primary and interaction models to
observe if the significant effects persisted. The specifications for both linear models are presented
below.
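$$\mathit{OnBase}_{ips} = \beta_0 + \beta_1 \mathit{SS}_{ips} + \beta_2 \mathit{LI}_{ips} + \beta_3 \mathit{RunDiff}_{ips} + \beta_4 \mathit{Home}_{ips} + \beta_5 \mathit{Contract}_{ips} + \beta_6 \ln(\mathit{ExpectedOddsRatio}_{ips}) + \gamma_p + \delta_s + \varepsilon_{ips} \tag{17}$$
$$\mathit{OnBase}_{ips} = \beta_0 + \beta_1 \mathit{SS}_{ips} + \beta_2 \mathit{LI}_{ips} + \beta_3 \mathit{RunDiff}_{ips} + \beta_4 \mathit{Home}_{ips} + \beta_5 \mathit{Contract}_{ips} + \beta_6 \ln(\mathit{ExpectedOddsRatio}_{ips}) + \beta_7 (\mathit{LI} \times \mathit{SS})_{ips} + \beta_8 (\mathit{RunDiff} \times \mathit{SS})_{ips} + \beta_9 (\mathit{Home} \times \mathit{SS})_{ips} + \beta_{10} (\mathit{Contract} \times \mathit{SS})_{ips} + \gamma_p + \delta_s + \varepsilon_{ips} \tag{18}$$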
Pitcher and season fixed effects are included in both linear regressions. The regression results
are reported in Tables 11 and 12, respectively. In the primary specification, the SS coefficient maintains
its significance and sign, though the coefficient is smaller in magnitude. A pitcher is about 2.6 percent
more likely to get a batter out in an SS state than in an NSS state, according to the full linear
specification. For the specification variations in Columns (2) and (3) of Table 11, this significance and
magnitude remain similar despite the omission of the LI and RunDiff variables, respectively. As in the
marginal results of the logit specification, the coefficient on the contract variable is positive and
significant.
Table 12 also presents no changes in the significance of coefficients of the variables of interest,
though the magnitude of the coefficients collectively decreased. The interpretation of the interaction
effect between LI and SS is now different. Whereas the overall effect of a one-unit increase in LI in the SS
game state previously favored the batter, it now favors the pitcher. The data suggests that the chance of
a pitcher allowing a baserunner after an LI unit increase increases by 0.8 percent in NSS and decreases
by 1.3 percent in SS. The overall effect of a one-unit increase in LI in an SS state is therefore negative;
the pitcher is 0.5 percent less likely to yield a baserunner in an SS, all else equal. Though the magnitudes
of the effects differ, both the logit and linear interacted specifications suggest an improvement in closer
performance in pressure-packed situations that occur in SS states. Partial regression plots23 of the LI and
LI*SS coefficients, generated from the linear specification, are presented in Figure 12. The divergent
relationships between these two regressors and the outcome variable, as previously discussed, are seen
to persist in the linear specification.
Table 13 presents several adjustments to the included fixed effects in the interaction variable
specification. Column (1) reproduces the full model for reference, which includes pitcher and season
fixed effects. Columns (2) and (3) each use one of the two fixed effects variables from the full model.
Fixed effects for team run difference during each plate appearance are introduced in Column (4), and the
final column presents the logit regression with no fixed effects included. Again, the two significant
variables, LI and LI*SS, remain significant, and no other variables become significantly different from
zero because of these adjustments. Noticeable changes occur among the variable coefficients in the
specification with run difference fixed effects; for instance, the coefficients on both LI and LI*SS increase in
magnitude in their respective directions.
5.5
Further Analysis: Restricted Specifications
The effects of game situation on closer performance can be explored further within the sample.
The strict definition of an SS game state provides an opportunity to extend the study to three distinct
game states, or zones, in which a closer may be deployed. Referring back to Figure 1, the two non-save
situation (NSS) ranges can be differentiated: run differences of four runs or greater (in favor of the
23
These plots are also called adjusted variable plots.
pitcher’s team) will henceforth be called WinNSS, while the range including tied games and all margins
of deficit will be called LoseNSS. The purpose of this refinement is to pose questions regarding the true
drivers of performance variation: does the impact of increases in situational pressure differ across the
two newly defined zones?
The results of two adjusted regressions incorporating the NSS zones are presented in Table 14.
Both a binary logit and a linear specification are included, and the adjusted logit specification is
presented below, where F is the logistic cumulative distribution function.
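$$\Pr(\mathit{OnBase}_{ips} = 1 \mid X) = F\big(\beta_0 + \beta_1 \mathit{LI}_{ips} + \beta_2 \mathit{WinNSS}_{ips} + \beta_3 \mathit{LoseNSS}_{ips} + \beta_4 (\mathit{LI} \times \mathit{WinNSS})_{ips} + \beta_5 (\mathit{LI} \times \mathit{LoseNSS})_{ips} + \beta_6 \ln(\mathit{ExpectedOddsRatio}_{ips}) + \gamma_p + \delta_s + \varepsilon_{ips}\big) \tag{19}$$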
Column (2) will be the focus of the analysis regarding coefficients on the interacted variables.
The results suggest a curious disparity in the effects of LI increases in the two NSS zones. The LI*WinNSS
variable has a negative and significant coefficient in the reported linear results, while the LI*LoseNSS
coefficient is positive and not significantly different from zero. The LI variable, now restricted to SS game
states, yields a positive and insignificant coefficient. Relative to a one-unit increase of LI in an SS state,
pitchers are significantly less likely to allow hits and walks when the pressure increases in the WinNSS
zone. The overall effect is negative and thus favors the pitchers; adding the coefficients of LI and
LI*WinNSS yields -2.8 percent, which suggests a decrease in the chance of a batter reaching base when a
one-unit increase in LI occurs in the WinNSS zone. This result provides further clarity regarding the
effects of situational pressure and game state on closer performance. The benefits to pitcher
performance as situational pressure rises, as measured through increases in Leverage Index, are not
restricted to SS game states, but also extends into higher lead margins. This finding suggests that closers
tend to respond positively to protecting their team’s lead as pressure increases, even when the
individual reward of earning a save is unattainable.
The data is parsed in a different manner in the specifications shown in Table 15. Instead of zones
of NSS, the regressions are restricted by several margins of run difference. Columns (1), (2), and (3) show
symmetric ranges of run differences that each incorporate both SS and NSS situations. Column (4) then
restricts the sample to all positive run margins, which account for about 60 percent of the data, and
Column (5) includes all of the other potential margins (tied or trailing). The first three columns display
results similar to those reviewed in Section 5.3, including the positive, significant coefficients on LI in
NSS states. Column (4) presents a newly significant interaction variable; the negative coefficient on
RunDiff*SS suggests that relative to run margin increases in NSS, a lead increase of one run in the SS
state positively affects closer performance. It is noteworthy that the LI values, in this situation of
increasing run difference, would on average be decreasing, as depicted in Figure 8, so this effect is
complementary to several themes discussed throughout the analyses.
6.
Extension: Pitcher Rankings
6.1
Ranking System
The discussions in this study have thus far focused on the average effects of changes in game
situations observed across the entire sample. It is also valuable to examine the effects of such changes
on individual closers, in order to observe the effects of psychological pressure (and its potential
motivational effects) on performance. This section investigates heterogeneity within the sample to
generate “Clutchness” rankings24 for each closer25.
The first ranking system applies specification (17) to each pitcher’s sample at-bats individually.
Each ranking coefficient incorporates the individual estimates of unit increases in SS and LI, as well as
the unique constant that is, in effect, each pitcher’s fixed effect coefficient. Though the goal here is to
24
The sabermetric website FanGraphs publishes a clutchness measure for hitters; their definition, which differs from the
discussion regarding pitchers here, is explained by Seidman (2008).
25
Pope and Schweitzer (2011) analyze individual regression coefficients in their examination of heterogeneous loss aversion
among golfers in their sample.
assemble information about a closer’s performance response to the specific situational pressure
increase of LI and the game state effects of SS, the fixed effects must be considered to present an
accurate portrayal of performance. An effective ranking system should incorporate pitchers’ baseline
levels of performance, regardless of the relative changes that high-stakes environments may cause.
The average effects of the SS and LI changes, as reported in Table 11, are -2.6 percent and 0.3
percent, respectively. But the individual regressions yield point estimates that vary considerably by
player. To generate the ranking coefficients, a 25 percent weight is allocated to the SS and LI coefficients
each, and the remaining 50 percent weight is allocated to the individual pitcher fixed effects. Table 16
and Table 17 present the compiled coefficients for closers in the 2000 and 2011 seasons, respectively.
The interpretation of these averaged values is slightly counterintuitive: increasingly negative numbers
correspond to improved performance and higher levels of clutchness. As previously, the dependent
variables in the individual regressions equal one if the pitcher yielded a hit or a walk, so negative
numbers represent decreases in the probability of yielding baserunners in save situations and in
situations of increased Leverage Index. The top-ranked pitcher represents the most clutch performer, as
determined by the model; he wields the highest combination of baseline performance and performance
improvement in response to the game scenario changes of interest.
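The weighting scheme described above amounts to a simple weighted average of three per-pitcher estimates. The sketch below applies the stated 25/25/50 weights; the pitcher names and coefficient values are invented for illustration and do not come from Tables 16 or 17.

```python
from typing import List, NamedTuple


class PitcherEstimates(NamedTuple):
    name: str
    ss_coef: float       # individual estimate of the SS effect
    li_coef: float       # individual estimate of the LI effect
    fixed_effect: float  # individual constant, i.e. baseline on-base rate


def clutchness(p: PitcherEstimates,
               w_ss: float = 0.25, w_li: float = 0.25, w_fe: float = 0.50) -> float:
    """Weighted average of the three estimates; lower means more clutch."""
    return w_ss * p.ss_coef + w_li * p.li_coef + w_fe * p.fixed_effect


def rank(pitchers: List[PitcherEstimates]) -> List[PitcherEstimates]:
    """Order pitchers from most clutch (most negative score) to least."""
    return sorted(pitchers, key=clutchness)


# Hypothetical inputs only.
a = PitcherEstimates("Pitcher A", ss_coef=-0.10, li_coef=0.02, fixed_effect=0.30)
b = PitcherEstimates("Pitcher B", ss_coef=0.05, li_coef=-0.01, fixed_effect=0.28)
ordering = rank([b, a])
```

In this example Pitcher A ranks first despite the weaker baseline, because his larger improvement in save situations outweighs it at these weights. The ordering is sensitive to the chosen weights.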
The individual regressions generate sample size concerns that may affect the accuracy of the
rankings. For instance, two of the top four clutch pitchers in 2000, and three of the top four in 2011,
converted eight saves or fewer overall in the season, as noted in Tables 16 and 17. Such a small sample
size can create overweighted estimates of the improved performance of these pitchers when they are
deployed in SS. Tables 18 and 19 omit all pitchers who converted eight saves or fewer in the given
seasons to present a condensed Clutchness ranking. These rankings also omit closers with a sample size
of fewer than 100 plate appearances in the season, as well as those who made 30 percent or fewer of
their appearances in SS game states. These two latter cutoffs affected significantly fewer pitchers.26
Overall, these refinements reduced the pool of pitchers to 33 in the 2000 season and 35 in the 2011
season, reducing the total sample size by approximately 29 percent to 68 total pitcher-seasons.
Rankings are also generated with the linear interacted variable specification presented in (18).
The coefficients of interest are now those on the LI and LI*SS variables, as these were the significant
results and primary topic of discussion in Section 5.3. A similar weighted average is calculated: 50
percent is allocated to the individual pitcher fixed effects, and 25 percent apiece is now allocated to the
LI and LI*SS coefficients. The observed effects in the individual regressions again varied significantly
from the previously reported average effects. Tables 20 and 21 present the clutchness coefficients for
the interacted specification, in which the focus is now on the performance difference between high-pressure situations in SS and in NSS states. The rankings of the pitchers with sufficient sample sizes, as
determined for the first set of rankings, are reported in the two tables.
6.2
Ranking Applications
The devised ranking system can be used to better understand the effects of shifting game
pressure on pitchers. It would be telling if the rankings possessed predictive power on other key
characteristics of individual closers. Two such performance-based attributes, both related to individual
measures, were compared to the calculated clutchness data points. Figure 15 presents scatter plots of
each pitcher’s save opportunity conversion rate, annual salary, and logarithm of annual salary plotted
against his Clutchness coefficient calculated through the non-interacted specification. The statistics that
generated the data points for each player can be found in Appendix Table A-1. The conversion rate of
save opportunities was included instead of total saves to standardize the outcome across unequal
appearance sample sizes among the closers.
26
See table notes for details.
The fitted lines in each of the three plots of Figure 13 are positively sloped, which contradicts
the expected relationship in each case. A higher clutchness coefficient indicates worse performance, so
those with lower coefficients would intuitively be more likely to have higher save conversion
percentages and salaries. However, there are no statistically significant relationships between the
outcome variables and the clutchness measurements. T-tests do not reveal significant effects of the
clutchness coefficients on the rate of save opportunities converted (t=1.95, p=0.05), on annual salaries
(t=0.85, p=0.40), or on the logs of annual salaries (t=0.67, p=0.51).
7.
Robustness, Competing Hypotheses and Caveats
7.1
Goodness-of-fit
Beyond the alternate specifications examined in Section 5, several additional goodness-of-fit
tests were applied to the data to ensure the appropriateness of a logistic regression model. A Hosmer-Lemeshow test was conducted; the test divides the data set into groups and then compares the observed and
predicted numbers of positive outcomes of the dependent variable within each. It tests the null
hypothesis that the difference between the outcomes is zero across the subgroups (H0: Observed
outcomes - Predicted outcomes = 0). Using the convention of 10 groups (Lunt, 2012), the null hypothesis
fails to be rejected (χ²(8) = 5.9, p=0.66), which supports the fit of the full specification.
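The mechanics of the test can be sketched in a few lines. This is a minimal illustration, not the Stata routine used in the study. It assumes equal-sized groups ordered by predicted probability and uses a closed-form chi-square tail that is valid only for an even number of degrees of freedom.

```python
import math
from typing import Sequence, Tuple


def hosmer_lemeshow(y: Sequence[int], p_hat: Sequence[float],
                    groups: int = 10) -> Tuple[float, int]:
    """Return the test statistic and its degrees of freedom (groups - 2)."""
    pairs = sorted(zip(p_hat, y), key=lambda t: t[0])  # order by predicted risk
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(p for p, _ in chunk)
        observed = sum(o for _, o in chunk)
        mean_p = expected / size
        denom = size * mean_p * (1.0 - mean_p)
        if denom > 0:
            stat += (observed - expected) ** 2 / denom
    return stat, groups - 2


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail chi-square probability; closed form for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


p_value = chi2_sf_even_df(5.9, 8)
```

Evaluating the tail probability at the reported statistic of 5.9 with 8 degrees of freedom gives approximately 0.66, which agrees with the p-value reported above.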
Tests for potential collinearity, a primary concern considering the associations between the
regressors SS, LI, and RunDiff, also did not yield any concerning results that would require model
adjustments. Variance inflation factors (VIFs) were calculated for both the binary logit and linear models.
VIF values of greater than 10 are cause for concern (Chen et al., 2003), but there were no such values in
the models. In the binary logit specification, the three potentially collinear regressors had a mean VIF27 of
1.12. The mean VIF of the linear regression model that included season and pitcher fixed effects was
6.68, and a linear model without fixed effects yielded a 4.15 mean VIF.
Additional model diagnostics are also presented in Appendix A-3. These include tables
containing various measures of fit for the binary logit and linear models that include the interacted
variables, as well as a classification table that shows the logit model’s rate of accuracy in predicting the
outcome variable28.
7.2
Competing Hypotheses
The choice of timeframe for the primary analyses within this study can be debated. While the
sample of individual at-bats provides an excellent avenue for controlling for individual player skill and
observing the effects of slight changes in situational pressure, it is less suitable for understanding larger
trends within a pitcher’s season. The performance changes attributable to in-game situations may
instead be more closely related to large-scale themes.
The Wald-Wolfowitz, or one-sample, runs test is one method of testing for game-level trends in
closer performance. The runs test has been applied to sports in previous research to test the validity of
the “hot hand effect,” a claim of positive dependence between outcomes most notably applied to
basketball shots in Gilovich et al. (1985). A run is defined as a series of consecutive hits or misses —
saves converted and saves blown in this case — with a minimum length of one. For instance, if C is a
converted save and B is a blown save, the streak CCBBBCBC has five runs. The null hypothesis for the
test is H0: Observed # of runs = Expected # of runs.
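The run-counting rule and the null hypothesis above can be sketched in Python as follows. This is an illustrative implementation using the standard normal approximation for the Wald-Wolfowitz test, not the routine used in the study; the function names are hypothetical, and the approximation is crude for a sequence as short as the eight-game example.

```python
import math

def count_runs(sequence):
    """Count maximal blocks of identical consecutive symbols."""
    if not sequence:
        return 0
    return 1 + sum(1 for prev, cur in zip(sequence, sequence[1:]) if prev != cur)


def runs_test(sequence, a="C", b="B"):
    """Return (observed runs, expected runs, z) under the Wald-Wolfowitz null.

    Uses the normal approximation and assumes both symbols occur often
    enough for the variance to be positive.
    """
    n1, n2 = sequence.count(a), sequence.count(b)
    n = n1 + n2
    expected = 2 * n1 * n2 / n + 1
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    observed = count_runs(sequence)
    return observed, expected, (observed - expected) / math.sqrt(variance)


# The example streak from the text: C = converted save, B = blown save.
observed, expected, z = runs_test("CCBBBCBC")
print(observed, expected, z)
```

For the example streak the observed count of five runs equals the expected count for four converted and four blown saves, so the statistic is zero. A negative statistic would indicate fewer runs (longer streaks) than expected.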
27. A user-generated command, collin, was used in Stata to calculate the VIF values of the non-linear regression. Additional
information can be found in Chen et al. (2003).
28. The estat class command was used to generate this classification table in Stata. While the default cutoff for the test is 0.5, the
threshold was changed to 0.314, the proportion of batters who reached base in the sample. This cutoff better reflects the true
proportion of successes (hits, walks) and failures (outs) that the model aimed to predict.
A runs test was administered to each of the 96 pitcher-seasons in the sample, with sequences of
save opportunities ordered chronologically from April to September in each regular season. Of the 96
pitcher-seasons tested, only six (6.25 percent) had seasons in which their observed number of runs of
converted and blown saves significantly differed from the number of expected runs. The test statistics of
these six players are presented in Table 22, and the entire list of results is found in Appendix A-4. The
runs of the overwhelming majority of players did not significantly differ from the expected binomial
distribution, which suggests that the closers are not affected by streakiness that carries over between
game appearances. It is noteworthy that the six players with significant results all had fewer runs of
converted and blown saves than would be expected. Fewer runs imply longer streaks of like outcomes,
which is consistent with a “hot hand effect” in this small subgroup; a significant deviation in the
other direction (more runs than expected) would instead suggest a different type of performance inconsistency.
7.3 Caveats and Research Implications
Simultaneity is the most pressing issue affecting the analyses discussed in this paper. Unlike
some of the reviewed literature that also examined situations in sports (golf putts, free throws), the
interactions studied herein are between two players. It is therefore difficult to make definitive
conclusions regarding psychological effects on one of the competing groups. Though the study strove to
isolate the effects on pitchers — by including and clustering by pitcher fixed effects, for instance — it by
no means accounted for all of the impact the game situation changes may have had on the batters.
Carleton (2012b) remarks on similar concerns regarding a study that also involves closers. The
Apesteguia and Palacios-Huerta (2010) paper on soccer penalty kicks, which likewise examines a game event
between two players, provides a somewhat comparable situation. The authors note that the majority
of penalty kicks are scored, and batter-pitcher interactions are similarly lopsided: just as most penalty kicks
are converted, most batters do not reach base (see Table 3). However, randomness plays a larger role in
the outcome of at-bats, which presents a challenge in the attribution of the majority of the situational
psychological pressure to one side of the interaction. The pitcher ranking system developed in Section 6
works toward ameliorating such issues. The line of reasoning discussed there could be extended with
the further isolation of game state effects on individual pitcher performance, and a more robust sample
pool may lead to more substantive results.
It would also be valuable to expand the sample to include playoff plate appearances. This would
offer a new avenue for analysis, since the playoffs could be expected to increase pressure across all
types of in-game situations. Otten and Barrett (2013) recently conducted an observational study
comparing regular season and playoff performance among baseball players, and applying the
empirical approach described in this paper to playoff data would provide an intriguing complement to their findings. In a
similar vein, the inclusion of game-level pressure indices, such as the “Season Leverage Index”
developed by Studeman (2008), could provide a more complete picture of the motivations and
pressures experienced by players at different stages of the regular season.
This study broadly addressed the possibility of loss-averse preferences among closers, but such
analysis could be extended with the use of PITCHf/x pitch-by-pitch data. By expanding the study into
intra-plate appearance patterns, the risk preferences of pitchers could be examined. For instance, the
pitch selection, velocity, accuracy, and movement could be compared in SS, NSS, and moments of
heightened LI. Such work would complement the previously mentioned analysis of Moskowitz and
Wertheim (2011) regarding pitch selection that varied after extreme changes in batting count.
Though a powerful statistic, Leverage Index is not a panacea that can be used independently in
player performance analysis. The LI can be thought of as a snapshot measure of game pressure, so it
cannot account for a player’s personal effect on the situational pressure he later experiences. As
discussed in Tango (2006b), a pitcher who performs well can reduce the LI of ensuing at-bats. Variations
of the plate appearance LI (paLI) statistic used in this study, such as average Leverage Index (aLI) that
measures the average pressure across sets of at-bats, may be considered for inclusion in study
extensions to account for such changes.
An ideal sample would provide several improvements to the dataset used in this study. As
evident in Section 6, the generally low number of innings pitched by closers in a given season can be
troublesome when analyzing thin slices of the dataset. There is literature on consistency measures and
stabilization rates of baseball statistics (Carleton, 2012b), and closers’ season statistics rarely meet those
standards. In addition, there is potential for a bandwidth problem regarding the OBP values used to
calculate the odds ratio control variables for player skill. The end-of-season OBP statistics were used for
both pitchers and batters, which does not reflect potential swings in performance throughout the
lengthy regular season. A refinement of the study could include the same control variable calculated
with different ranges of OBP (e.g. the three months around the at-bat’s occurrence) to observe if such a
change impacts the results.
8. Conclusion
This study examined the causes of performance variation among professional baseball closers, a
highly skilled and well-compensated set of agents. A binary logit regression model was developed to
consider the drivers of closer performance. It included controls for individual pitcher and batter skill
levels, which provided a chance to clearly examine the potential psychological effects caused by changes
in situational pressure. The findings support a significant and positive effect of the save situation (SS)
game state on closer performance, which reinforces the conventional wisdom held by players, fans and
the media. After controlling for player matchups and other potential situational influences, the heralded
motivational effects of SS on pitchers persist in the average performance of the nearly 100 closers
included in the study. The findings also support a beneficial effect of the SS state on pitchers within the
more specific context of higher-pressure situations, as measured by the Leverage Index statistic, that
considerably affect the game’s ultimate outcome.
The study also contributes a statistic that isolates the effects of various situational pressures on
individual pitchers. This line of reasoning allows for the study of game pressure impact on an
individualized level, and it presents a set of heterogeneous situational effects that are more precise than
the broad-stroke claims previously applied to closers as an entity. Further refinement of such individual
“clutchness” measures could provide teams with an analytical tool to help them optimally allocate their
rosters based on the situational factors encountered in games.
This paper contributes to the literature on the relationship between incentives, effort level and
performance, as well as to recent work examining the influence of situational pressure on outcomes of
professional sporting events. The study’s findings provide empirical support for the implicit motivational
effects of the SS game state on pitchers. The conclusions suggest that the relationship between
situational factors and closer performance is genuine and remains an area ripe for further investigation.
References
Apesteguia, Jose and Ignacio Palacios-Huerta. 2010. “Psychological Pressure in Competitive
Environments: Evidence from a Randomized Natural Experiment,”
American Economic Review 100:5, 2548–2564.
Appelman, David. "Get to Know: Leverage Index." FanGraphs (2008),
http://www.fangraphs.com/blogs/index.php/get-to-know-leverage-index/.
Ariely, Dan, Uri Gneezy, George Loewenstein, and Nina Mazar. 2009. “Large Stakes and Big Mistakes.”
Review of Economic Studies, Vol.76, No. 2, pp. 451-469.
Baumeister, R.F. and Steinhilber, A. (1984). "Paradoxical effects of supportive audiences on
performance under pressure: The home field disadvantage in sports championships."
Journal of Personality and Social Psychology, 47(1): 85-93
Berger, Jonah, and Devin Pope. "Can Losing Lead to Winning?"
Management Science 57(5) (2011): 817-827.
Brooks, Dan. "Brooks Baseball." http://www.brooksbaseball.net/.
Buis, M. L. "Predict and Adjust with Logistic Regression." Stata Journal 7 2 (2007): 221-26.
———. "Stata Tip 87: Interpretation of Interactions in Nonlinear Models."
Stata Journal 10, no. 2 (2010): 305-08.
Butler, J. L., & Baumeister, R. F. (1998). The trouble with friendly faces: Skilled performance
with a supportive audience. Journal of Personality and Social Psychology, 75(5), 1213-1230.
Camerer, Colin, Linda Babcock, George Loewenstein, and Richard Thaler. 1997.
“Labor Supply of New York City Cab Drivers: One Day at a Time.” Quarterly Journal of Economics,
112(2): 407–441.
Cameron, Dave. WAR and Relievers. 2010. Available from
http://www.fangraphs.com/blogs/index.php/war-and-relievers/.
Carleton, Russell A. "A Modest Proposal for the Use of Closers." (2008),
http://statspeakmvn.wordpress.com/2008/02/16/a-modest-proposal-for-the-use-of-closers/.
———. “If you’re happy and you know it, get on base.” (2009),
http://www.hardballtimes.com/main/blog_article/if-youre-happy-and-you-know-it-get-on-base/.
———. "In Praise of the Modern Bullpen." Baseball Prospectus (2012a),
http://www.baseballprospectus.com/article.php?articleid=18835.
———. “It's a Small Sample Size After All.” (2012b),
http://www.baseballprospectus.com/article.php?articleid=17659.
Cao, Zheng, Joseph Price, and Daniel F. Stone. "Performance under Pressure in the NBA."
Journal of Sports Economics 12 3 (2011): 231-52.
Chen, X., Ender, P., Mitchell, M. and Wells, C. 2003. Regression with Stata,
from http://www.ats.ucla.edu/stat/stata/webbooks/reg/default.htm .
Chib, V. S., B. De Martino, S. Shimojo, and J. P. O'Doherty, 2012, Neural mechanisms underlying
paradoxical performance for monetary incentives are driven by loss aversion,
Neuron 74, 582-594.
Johnson, Aaron W., Alexander J. Stimpson, and Torin K. Clark. "Going for Three:
Predicting the Likelihood of Field Goal Success with Logistic Regression." Sloan Sports Analytics
Conference. 2013, Boston http://www.sloansportsconference.com/wp-content/uploads/2013/
Cot's Baseball Contracts. 2012.
http://www.baseballprospectus.com/compensation/cots/.
Farber, Henry S. 2005. “Is Tomorrow Another Day? The Labor Supply of New York City Cab Drivers.”
Journal of Political Economy, 113(1): 46–82.
Fehr, Ernst, and Lorenz Goette. 2007. “Do Workers Work More If Wages Are High?
Evidence from a Randomized Field Experiment.” American Economic Review, 97(1): 298–317.
Fryer, Roland G., Steven D. Levitt, John List and Sally Sadoff. 2012. “Enhancing the Efficacy of
Teacher Incentives through Loss Aversion: A Field Experiment.”
NBER Working Paper No. 18237
Gilovich, Thomas, Robert Vallone, and Amos Tversky. 1985. “The Hot Hand in Basketball:
On the Misperception of Random Sequences.” Cognitive Psychology 17: 295-314.
Goldman, Matthew and Justin M. Rao. 2012. “Effort vs. Concentration: The Asymmetric Impact
of Pressure on NBA Performance.” Submission to the MIT Sloan Sports Analytics Conference
http://www.sloansportsconference.com/wpcontent/uploads/2012/02/Goldman_Rao_Sloan2012.pdf
Goldman, Matthew and Justin M. Rao. 2013. "Live by the Three, Die by the Three? The Price of
Risk in the NBA." Submission to the MIT Sloan Sports Analytics Conference
http://www.sloansportsconference.com/wp-content/uploads/2013/
Gullickson, Aaron. "Logistic Regression." University of Oregon,
http://pages.uoregon.edu/aarong/teaching/G4075_Outline/node16.html.
Haigh, M. and List J. 2005. “Do professional traders exhibit myopic loss aversion?
An experimental analysis.” Journal of Finance 60(1), 523–534.
Heath, Chip, Richard P. Larrick, and George Wu. 1999. “Goals as Reference Points.”
Cognitive Psychology. 38(1): 79–109.
Huckabay, Gary. Hitters love the 'walk year'. 2003. ESPN,
http://sports.espn.go.com/mlb/columns/story?id=1608344.
James, Bill. 2003. The New Bill James Historical Baseball Abstract. New York: Simon and Schuster.
http://books.google.com/books/about/The_New_Bill_James_Historical_Baseball_A.html?id=3uSbqUm8hSAC.
Jazayerli, Rany. "The Impact of Closers: Moving Away from Save Situation Specialization."
Baseball Prospectus (2000), http://www.baseballprospectus.com/article.php?articleid=648.
Kahneman, Daniel, Jack L. Knetsch, and Richard H. Thaler. 1990. "Experimental Tests of the
Endowment Effect and the Coase Theorem." Journal of Political Economy 98 (6): 1325-48.
Kahneman, Daniel, and Amos Tversky. 1992. “Advances in Prospect Theory:
Cumulative Representation of Uncertainty.” Journal of Risk and Uncertainty 5: 297-323.
Kahneman, Daniel, and Amos Tversky. 1991. “Loss Aversion in Riskless Choice:
A Reference-Dependent Model.” The Quarterly Journal of Economics, 106(4): 1039-1061
Kahneman, Daniel, and Amos Tversky. 1979. “Prospect Theory: An Analysis of Decision under Risk.”
Econometrica, 47(2):263–91.
Klayman, Ben. 2010. Analysis: No perfect game but MLB to post record revenue. Reuters,
http://www.reuters.com/article/2010/10/25/us-baseball-economics-idUSTRE69O4GQ20101025.
Köszegi, Botond, and Matthew Rabin. 2006. “A Model of Reference-Dependent Preferences.”
Quarterly Journal of Economics, 121(4): 1133–65.
League Year-By-Year Batting--Averages. 2012. [cited 12/16 2012]. Available from
http://www.baseball-reference.com/leagues/MLB/bat.shtml.
Levitt, Steven D., John A. List, Susanne Neckermann and Sally Sadoff. 2012.
“The Behavioralist Goes to School: Leveraging Behavioral Economics to Improve Educational
Performance.” NBER Working Paper, No.18165.
Lewis, Brian P., and Darwyn E. Linder. "Thinking About Choking? Attentional Processes and Paradoxical
Performance." Personality and Social Psychology Bulletin 23 9 (1997): 937-44.
List, John A. 2003. “Does Market Experience Eliminate Market Anomalies?.”
Quarterly Journal of Economics, 118:41-71.
List, John A. 2004. “Neoclassical Theory vs. Prospect Theory: Evidence from the Marketplace.”
Econometrica 72:615–25.
Lunt, Mark. "Modelling Binary Outcomes."
http://personalpages.manchester.ac.uk/staff/mark.lunt/stats_course.html.
Meisel, Zack. "Non-Save Situations No Easy Task for Closers." (2012),
http://mlb.mlb.com/news/article.jsp?ymd=20120612&content_id=33161766&vkey=news_mlb&c_id=mlb.
MLB Closer Report - 2012. 2012. [cited 12/16 2012]. Available from
http://espn.go.com/mlb/stats/closers.
Moskowitz, Tobias, and L. Jon Wertheim. Scorecasting: The Hidden Influences Behind How Sports Are
Played and Games Are Won. New York: Random House, 2011.
Norton, E. C., H. Wang, and C. Ai. "Computing Interaction Effects and Standard Errors in
Logit and Probit Models." Stata Journal 4 2 (2004): 154-67.
———. "Interaction terms in logit and probit models." Economics Letters 80 1 (2003): 123-129
Otten, M. P., & Barrett, M. E. (2013). Pitching and clutch hitting in Major League Baseball:
What 109 years of statistics reveal. Psychology of Sport and Exercise, 14(4), 531-537.
Pedace, Roberto, and Janet Smith. 2012. “Loss Aversion and Managerial Decisions:
Evidence from Major League Baseball.” Economic Inquiry,
doi: 10.1111/j.1465-7295.2012.00463.x
Pope, Devin G., and Maurice E. Schweitzer. 2011. "Is Tiger Woods Loss Averse?
Persistent Bias in the Face of Experience, Competition, and High Stakes."
American Economic Review 101 1: 129-57.
Pope, Devin, and Uri Simonsohn. 2011. "Round Numbers as Goals." Psychological Science 22(1): 71-79.
Rauh, Michael T., and Giulio Seccia. "Anxiety and Performance: An Endogenous
Learning-by-Doing Model*." International Economic Review 47 2 (2006): 583-609.
Read, Daniel, George Loewenstein, and Matthew Rabin. "Choice Bracketing."
Journal of Risk and Uncertainty 19 1-3 (1999): 171-97.
Seidman, Eric. "All About Clutch." FanGraphs (2008),
http://www.fangraphs.com/blogs/index.php/all-about-clutch/.
Singer, Tom. "Valverde Vulnerable in Non-Save Situations." (2011),
http://detroit.tigers.mlb.com/news/article.jsp?ymd=20111013&content_id=25637296&c_id=det.
Sribney, Bill. Dlogit2: Stata Modules to Compute Marginal Effects for Logit, Probit, and Mlogit.
Computer software. Boston College Department of Economics, 1996.
Stark, Jayson. "The Age of the Pitcher." ESPN (2012),
http://espn.go.com/mlb/story/_/id/8048897/the-age-pitcher-how-got-here-mlb.
Studeman, Dave. "Season Leverage Index." The Hardball Times (2008),
http://www.hardballtimes.com/main/article/season-leverage-index/.
Tango, Tom. “Crucial Situations” (2006a),
http://www.hardballtimes.com/main/article/crucial-situations.
———. "Crucial Situations: Part 3." The Hardball Times (2006b),
http://www.hardballtimes.com/main/article/crucial-situations-part-three/.
———. "Crucial Situations: Leverage Index (LI)." (2007), http://www.insidethebook.com/li.shtml.
Tango, Tom and Mitchel Lichtman and Andrew Dolphin. "Excerpt: The Book - the Right –
and Wrong -- Time to Use Your Ace Reliever." Sports Illustrated (2006),
http://sportsillustrated.cnn.com/2006/baseball/mlb/04/17/thebook.excerpt/index.html.
Thaler, Richard H. 1999. “Mental Accounting Matters.” Journal of Behavioral
Decision-making. 12, 183-206.
Thaler, R. H., & Johnson, E. J. 1990. Gambling with the house money and trying to break even:
The effects of prior outcomes on risky choices. Management Science, 36: 643-660.
Torres-Reyna, Oscar. "Getting Started in Logit and Ordered Logit Regression."
Princeton University, http://dss.princeton.edu/training/Logit.pdf.
Van-Riper, Tom. The Myth Of The Contract Year Slugger. Forbes 2010. Available from
http://www.forbes.com/2010/04/13/yankees-phillies-astros-business-sports-bloomberg-baseball.html/.
Wyers, Colin. "Extra Innings Excerpt Are Relievers Being Used Properly?"
Baseball Prospectus (2012), http://www.baseballprospectus.com/article.php?articleid=16287.
Yerkes, R. M. and Dodson, J. D. 1908. “The Relationship of Strength of Stimulus to
Rapidity of Habit-Formation”, Journal of Comparative Neurology and Psychology, 18 (5), 459–482.
Tables and Figures
Figure 1: Breakdown of save situations (SS) and non-save situations (NSS) depending on the
game run difference
Figure 2: An “S-shaped” value function
Figure 3a
Figure 3b
Figure 3: Scatterplots of earned runs allowed in appearances by closers in the 2011 season against the
run difference margin faced at game entry. Lowess curves are fitted to the data points in both figures.
Figure 4: Scatterplot of earned runs allowed in appearances by closers in the 2011 season against the
run difference margin faced at game entry. The size of each circle is conditional on the frequency of
the observation within the sample. The three points most observed were (1,0), (2,0), and (3,0),
respectively.
Table 1
Proportion of On-base Occurrences

Game State                  Proportion of batters who reached base (%)
Non-save situation (NSS)    32.2
Save situation (SS)         29.8
T-test                      t=3.88***

Note: This table reports summary statistics detailing the proportion of
plate appearances that resulted in a batter reaching base, depending on
the game state. The total sample size is 26,223 plate appearances. The t-test
tested if the difference in proportions was significantly different from
zero: t(26221) = 3.88. * p < 0.05, ** p < 0.01, *** p < 0.001
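The size of the test statistic in Table 1 can be roughly reproduced with a pooled two-proportion test. The Python sketch below is illustrative only: it uses the rounded proportions from Table 1 and the group sizes from Table 3, so it will not match the reported value exactly, and the function name is hypothetical.

```python
import math

def two_proportion_z(p1, n1, p2, n2):
    """z statistic for the difference in two proportions, using a pooled variance."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


# Rounded on-base proportions from Table 1; NSS and SS sample sizes from Table 3.
z = two_proportion_z(0.322, 17305, 0.298, 8918)
print(round(z, 2))
```

With these rounded inputs the statistic falls just under 4, close to the reported t = 3.88; the gap is attributable to rounding of the published proportions.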
Table 2
Summary Statistics

                                  Observations  Mean      Standard Deviation  Minimum   Maximum
On-base                           26223         0.3135    0.4639              0         1
Save Situation                    26223         0.3401    0.4737              0         1
Leverage Index                    26223         1.6576    1.5649              -0.34     11.04
Run Difference                    26223         0.8351    3.1054              -18       15
Home                              26223         0.5234    0.4995              0         1
Contract                          26223         0.1429    0.3500              0         1
Player Skill Control              26223         -0.8171   0.2751              -3.42     1.80
Leverage Index x Save Situation   26223         0.8102    1.5152              0         11.04
Run Difference x Save Situation   26223         0.6256    0.9866              0         3
Home x Save Situation             26223         0.1579    0.3646              0         1
Contract x Save Situation         26223         0.0574    0.2327              0         1

Note: This table reports summary statistics for the variables used in the primary and interacted variable
regression specifications described in the text. Data correspond to the collected sample of 26,223
at-bats involving 96 closers in the 2000 and 2011 Major League Baseball regular seasons. On-base, Save
Situation, Home, and Contract are binary variables.
Table 3
Summary Statistics by Game State

Game State                  Number of Observations  Percent of Total Observations  Mean Leverage Index  Mean Run Difference
Non-save situation (NSS)    17,305                  65.99                          1.28                 0.32
Save situation (SS)         8,918                   34.01                          2.38                 1.84
Total                       26,223                  100                            1.66                 0.84

Plate appearance outcome    Number of Observations  Percent of Total Observations
Out                         18,001                  68.65
Reached Base                8,222                   31.35
Total                       26,223                  100

Note: This table reports summary statistics depending on the game state, specifically non-save situations (NSS) and save
situations (SS). The table also presents frequency statistics for the binary dependent variable. The results correspond to
the values taken on by the dependent variable, Onbase, in subsequent regressions. “Out” corresponds to a coding of
zero and “Reached Base” corresponds to a coding of one.
Figure 5: A comparison of Leverage Index ranges
[Bar chart titled “Leverage Index Ranges.” Horizontal axis: Leverage Index; vertical axis: percent of at-bats (%).
Closers in Sample: Low (<1) 43.05, Medium (1-2) 23.68, High (>2) 33.27.
All Game Situations: Low (<1) 60, Medium (1-2) 30, High (>2) 10.]
Figure 6: Kernel density plots of Leverage Index distribution by game state
Figure 7: Histogram of run differences faced by closers in the sample
Figure 8: Scatter plot of Leverage Index values against run differences
Table 4
Matrix of Correlation Coefficients

                            On-base               Save situation (SS)   Leverage Index (LI)   Run Difference (RunDiff)
On-base                     1
Save situation (SS)         -0.0240*              1
                            (0.0001)
Leverage Index (LI)         0.0066                0.3324*               1
                            (0.2841)              (0)
Run Difference (RunDiff)    -0.0122*              0.2322*               0.0591*               1
                            (0.0489)              (0)                   (0)

Note: This table reports pairwise correlation coefficients for the variables of interest.
P-values in parentheses: * p < 0.05.
Table 5
Naïve Specifications

On-Base                 Naive                       Control Added
                        Logit         Odds Ratio    Logit         Odds Ratio
                        (1)           (2)           (3)           (4)
Save Situation (SS)     -0.110***     0.896***      -0.0911**     0.913**
                        (0.0283)      (0.0254)      (0.0285)      (0.0260)
Player Skill Control                                0.962         2.617
                                                    (0.0517)      (0.135)
Constant                -0.747***                   0.0200
                        (0.0163)                    (0.0437)
Observations            26223         26223         26223         26223
Pseudo R2               0.000         0.000         0.012         0.012

Note: This table reports regression results for two binary logit specifications. Column (1)
reports the naïve specification results and Column (3) reports results after the inclusion of
the player matchup control variable. The effect of the Player Skill Control variable is also
significantly different from zero (p<0.001) in each of the specifications, but it is a control
variable that, as expected, has a coefficient approximately equal to one in the logit model.
The coefficients for the odds ratio columns are equal to base e raised to the logit
coefficients. Constant terms are omitted in the OR columns because that relationship does
not hold. Standard errors in parentheses: * p < 0.05, ** p < 0.01, *** p < 0.001
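The relationship between the logit and odds ratio columns described in the note can be checked numerically. The Python sketch below exponentiates the Table 5 logit coefficients and compares them with the reported odds ratios. The standard error conversion (odds ratio times the logit standard error, a delta-method approximation) is an assumption about how the odds ratio standard errors were obtained, not something the text states.

```python
import math

# (logit coefficient, logit SE, reported odds ratio, reported OR SE) from Table 5.
rows = {
    "SS, naive": (-0.110, 0.0283, 0.896, 0.0254),
    "SS, control added": (-0.0911, 0.0285, 0.913, 0.0260),
    "Player Skill Control": (0.962, 0.0517, 2.617, 0.135),
}


def to_odds_ratio(coef, se):
    """Exponentiate a logit coefficient; approximate its SE by the delta method."""
    odds_ratio = math.exp(coef)
    return odds_ratio, odds_ratio * se


for name, (coef, se, reported_or, reported_se) in rows.items():
    odds_ratio, or_se = to_odds_ratio(coef, se)
    print(f"{name}: OR {odds_ratio:.3f} (reported {reported_or}), "
          f"SE {or_se:.4f} (reported {reported_se})")
```

Under these assumptions, each computed odds ratio and standard error agrees with the tabulated value to within rounding.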
Table 6
The Effects of Game Situations and Other Determinants on Probability of Reaching Base

On-Base                       Full Specification      LI Omitted              SS Omitted              RunDiff FEs
                              Logit       Odds Ratio  Logit       Odds Ratio  Logit       Odds Ratio  Logit       Odds Ratio
                              (1)         (2)         (3)         (4)         (5)         (6)         (7)         (8)
Save Situation (SS)           -0.124***   0.883***    -0.107**    0.898**                             -0.0572     0.944
                              (0.0349)    (0.0308)    (0.0328)    (0.0295)                            (0.0515)    (0.0486)
Leverage Index (LI)           0.0166      1.017                               0.00708     1.007       0.0127      1.013
                              (0.00940)   (0.00956)                           (0.00891)   (0.00897)   (0.0130)    (0.0131)
Run Difference (RunDiff)      -0.00212    0.998       -0.00236    0.998       -0.00538    0.995
                              (0.00443)   (0.00442)   (0.00438)   (0.00437)   (0.00425)   (0.00423)
Home                          -0.0505     0.951       -0.0548     0.947       -0.0440     0.957       -0.0554     0.946
                              (0.0316)    (0.0300)    (0.0321)    (0.0304)    (0.0312)    (0.0299)    (0.0325)    (0.0307)
Contract                      0.0273      1.028       0.0269      1.027       0.0423*     1.043*      0.0264      1.027
                              (0.0170)    (0.0174)    (0.0157)    (0.0161)    (0.0207)    (0.0216)    (0.0209)    (0.0215)
Player Skill Control          0.942       2.564       0.943       2.569       0.941       2.562       0.945       2.572
                              (0.0732)    (0.188)     (0.0731)    (0.188)     (0.0731)    (0.187)     (0.0727)    (0.187)
Season Fixed Effects (2011)   0.0108      1.011       0.0140      1.014       0.0136      1.014       0.0124      1.012
                              (0.0134)    (0.0135)    (0.0117)    (0.0119)    (0.0183)    (0.0185)    (0.0130)    (0.0132)
Constant                      0.0292                  0.0566                  -0.0305                 -0.682*
                              (0.0552)                (0.0530)                (0.0531)                (0.322)
Observations                  26223       26223       26223       26223       26223       26223       26223       26221
Pseudo R2                     0.012       0.012       0.012       0.012       0.012       0.012       0.013       0.013

Note: This table reports regression results for binary logit specifications and the odds ratio results of those same specifications. The
coefficients for LI in (1) and (2) are significant at the 10% level. The coefficients for the odds ratio columns are equal to base e raised to the
logit coefficients. Constant terms are omitted in the OR columns because that relationship does not hold. Fixed effects for individual pitchers
were included in each regression (not reported). The specifications reported in (7) and (8) include run difference fixed effects (not reported).
The effect of the Player Skill Control variable is also significantly different from zero (p<0.001) in each of the specifications, but it is a control
variable that, as expected, has a coefficient approximately equal to one in the logit model. Standard errors are robust and adjusted for
clustering at the pitcher level.
Robust standard errors in parentheses: * p < 0.05, ** p < 0.01, *** p < 0.001.
Table 7
Marginal Effects of Primary Specification Variables

On-Base                                     Restricted Samples
                              Means         LI ≥ 2        LI ≤ 2        LI ≥ 1        LI ≤ 1
                              (1)           (2)           (3)           (4)           (5)
Save Situation (SS)           -0.0266***    -0.0266***    -0.0265***    -0.0268***    -0.0265***
                              (0.00746)     (0.00748)     (0.00744)     (0.00754)     (0.00743)
Leverage Index (LI)           0.00354       0.00355       0.00354       0.00357       0.00353
                              (0.00201)     (0.00203)     (0.00199)     (0.00205)     (0.00200)
Run Difference (RunDiff)      -0.000455     -0.000456     -0.000454     -0.000458     -0.000453
                              (0.000948)    (0.000950)    (0.000946)    (0.000955)    (0.000945)
Home                          -0.0108       -0.0108       -0.0108       -0.0109       -0.0108
                              (0.00676)     (0.00677)     (0.00674)     (0.00682)     (0.00673)
Contract                      0.0159***     0.0160***     0.0159***     0.0161***     0.0159***
                              (0.00385)     (0.00384)     (0.00386)     (0.00384)     (0.00385)
Player Skill Control          0.202         0.202         0.201         0.203         0.201
                              (0.0156)      (0.0156)      (0.0156)      (0.0156)      (0.0156)
Season Fixed Effects (2011)   0.00231       0.00232       0.00231       0.00233       0.00231
                              (0.00286)     (0.00287)     (0.00286)     (0.00288)     (0.00285)
Constant                      -0.00383      -0.00384      -0.00382      -0.00386      -0.00382
                              (0.0113)      (0.0113)      (0.0113)      (0.0114)      (0.0112)
Observations                  26223         26223         26223         26223         26223

Note: This table reports the marginal effects of the independent variables. Column (1) reports marginal effects at the mean value of each of
the variables, and Columns (2)-(5) report the marginal effects at the means of subsections of the data. The marginal effects were calculated
using the dlogit2 command in Stata. Fixed effects for individual pitchers were included in each regression (not reported). Standard errors
are robust and adjusted for clustering at the pitcher level. Robust standard errors in parentheses: * p < 0.05, ** p < 0.01, *** p < 0.001
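For intuition, the marginal effect of a regressor in a logit model equals its coefficient multiplied by p(1 - p), where p is the predicted probability at the evaluation point. The Python sketch below applies this to the full-specification coefficients from Table 6. It assumes that the predicted probability at the means can be approximated by the sample on-base rate from Table 2, and it treats the binary SS variable as if it were continuous, so it only approximates what dlogit2 computes; the function name is hypothetical.

```python
def logit_marginal_effect(coef, prob):
    """Marginal effect of a regressor in a logit model, evaluated at probability `prob`."""
    return coef * prob * (1 - prob)


# Assumption: the predicted probability at the means is approximated by the
# sample on-base rate reported in Table 2.
base_rate = 0.3135

ss_effect = logit_marginal_effect(-0.124, base_rate)   # Save Situation, Table 6 column (1)
li_effect = logit_marginal_effect(0.0166, base_rate)   # Leverage Index, Table 6 column (1)
print(round(ss_effect, 4), round(li_effect, 5))
```

Both approximations fall close to the column (1) marginal effects in Table 7 (-0.0266 for SS and 0.00354 for LI).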
Table 8
The Interacted Effects of Game State and Other Determinants on On-Base Probability

On-Base                          Full Specification      LI Omitted              RunDiff Omitted         LI, RunDiff Omitted
                                 Logit       Odds Ratio  Logit       Odds Ratio  Logit       Odds Ratio  Logit       Odds Ratio
                                 (1)         (2)         (3)         (4)         (5)         (6)         (7)         (8)
Save Situation (SS)              0.0899      1.094       -0.0814     0.922       -0.0501     0.951       -0.121*     0.886*
                                 (0.111)     (0.121)     (0.0668)    (0.0616)    (0.0672)    (0.0640)    (0.0492)    (0.0436)
Leverage Index (LI)              0.0369***   1.038***                            0.0365**    1.037**
                                 (0.0112)    (0.0116)                            (0.0111)    (0.0115)
Run Difference (RunDiff)         -0.00254    0.997       -0.00187    0.998
                                 (0.00447)   (0.00446)   (0.00439)   (0.00438)
Home                             -0.0454     0.956       -0.0475     0.954       -0.0438     0.957       -0.0462     0.955
                                 (0.0385)    (0.0367)    (0.0385)    (0.0367)    (0.0387)    (0.0371)    (0.0388)    (0.0370)
Contract                         -0.0351     0.966       -0.0397     0.961       -0.0275     0.973       -0.0342     0.966
                                 (0.0440)    (0.0425)    (0.0441)    (0.0424)    (0.0445)    (0.0433)    (0.0444)    (0.0429)
Leverage Index x Save Situation  -0.0600**   0.942**                             -0.0438*    0.957*
                                 (0.0222)    (0.0209)                            (0.0187)    (0.0179)
Run Difference x Save Situation  -0.0516     0.950       -0.0209     0.979
                                 (0.0333)    (0.0316)    (0.0273)    (0.0267)
Home x Save Situation            -0.0353     0.965       -0.0199     0.980       -0.0284     0.972       -0.0217     0.978
                                 (0.0635)    (0.0613)    (0.0617)    (0.0605)    (0.0632)    (0.0615)    (0.0619)    (0.0606)
Contract x Save Situation        0.126       1.135       0.128       1.136       0.127       1.135       0.127       1.136
                                 (0.0768)    (0.0871)    (0.0769)    (0.0873)    (0.0767)    (0.0870)    (0.0770)    (0.0875)
Player Skill Control             0.941       2.563       0.944       2.570       0.941       2.562       0.944       2.570
                                 (0.0736)    (0.189)     (0.0731)    (0.188)     (0.0734)    (0.188)     (0.0730)    (0.188)
Season Fixed Effects (2011)      0.00458     1.005       0.0120      1.012       0.0088      1.009       0.0142      1.014
                                 (0.0138)    (0.0138)    (0.0144)    (0.0145)    (0.0141)    (0.0142)    (0.0139)    (0.0141)
Constant                         0.0134                  0.0675                  0.0073                  0.0636
                                 (0.0620)                (0.0589)                (0.0615)                (0.0585)
Observations                     26223       26223       26223       26223       26223       26223       26223       26223
Pseudo R2                        0.013       0.013       0.012       0.012       0.013       0.013       0.012       0.012

Note: This table reports regression results for binary logit specifications and the odds ratio results of those same specifications. The coefficients
for the odds ratio columns are base e raised to the logit coefficients. Constant terms are omitted in the odds ratio columns because that
relationship does not hold. Fixed effects for individual pitchers were included in each regression (not reported). The effect of the Player Skill
Control variable is also significantly different from zero (p<0.001) in each of the specifications, but it is a control variable that, as expected, has a
coefficient approximately equal to one in the logit model. Standard errors are robust and adjusted for clustering at the pitcher level.
Robust standard errors in parentheses: * p < 0.05, ** p < 0.01, *** p < 0.001
Table 9
Effect of Leverage Index*Save Situation Interaction Variable

LI*SS Interaction     Observations   Mean       Std. Dev.   Minimum    Maximum
Interaction Effect    26223          -0.0126    0.0013      -0.0148    -0.0019
Standard error        26223          0.0018     0.0010      0.0004     0.0115
Z                     26223          -8.46      2.90        -13.05     -1.03

Note: This table reports the results of the inteff Stata command, which correctly calculates the coefficient, sign and
significance of interacted variables in non-linear models, such as the binary logit specification (full results reported in
Table 8).
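To illustrate why an interaction effect in a logit model varies across observations, the sketch below computes the cross-difference of the predicted probability with respect to a continuous variable (LI) and a 0/1 dummy (SS) at a single observation. This is a simplified Python illustration of the general idea, not the inteff implementation. The function names, the LI value of 1.0, and the `other_index` value of -0.78 (standing in for the contribution of all other covariates) are assumptions chosen for the example, and the coefficient inputs are illustrative values that resemble the logit estimates reported in the tables.

```python
import math


def logistic_cdf(u: float) -> float:
    """Logistic distribution function: maps a linear index to a probability."""
    return 1.0 / (1.0 + math.exp(-u))


def logistic_pdf(u: float) -> float:
    """Derivative of the logistic distribution function."""
    p = logistic_cdf(u)
    return p * (1.0 - p)


def li_ss_interaction_effect(b_li, b_ss, b_li_ss, li, other_index):
    """Cross-difference of the predicted probability with respect to LI
    (continuous) and SS (0/1 dummy), evaluated at one observation."""
    index_ss = (b_li + b_li_ss) * li + b_ss + other_index   # linear index when SS = 1
    index_nss = b_li * li + other_index                     # linear index when SS = 0
    slope_ss = (b_li + b_li_ss) * logistic_pdf(index_ss)    # dP/dLI when SS = 1
    slope_nss = b_li * logistic_pdf(index_nss)              # dP/dLI when SS = 0
    return slope_ss - slope_nss


# Illustrative inputs (assumptions): coefficients, LI value, and other_index.
effect = li_ss_interaction_effect(
    b_li=0.0369, b_ss=0.0899, b_li_ss=-0.0600, li=1.0, other_index=-0.78
)
print(f"Interaction effect at this observation: {effect:.4f}")
```

Because the result depends on the linear index of each observation, the effect takes a different value at every sample point, which is why the table summarizes it with a mean, standard deviation, minimum and maximum.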
Figure 9 (panels 9a and 9b): Distributions of the LI*SS interaction effect and its significance level plotted against predicted probabilities of the dependent variable
Figures 10 and 11 (panels 10a, 10b, 11a and 11b): Scatter plots of the proportion of on-base outcomes against ranges of LI and LI in save situations (LI*SS). Figures 10a and 11a show
linear regressions and Figures 10b and 11b show lowess curves fit to the data.
Table 10
Marginal Effects of Interacted Variables
On-Base

                               Means        Restricted Samples
                                            Save         Non-Save     LI ≥ 2       LI ≤ 2       LI ≥ 1       LI ≤ 1
                                            Situations   Situations
                               (1)          (2)          (3)          (4)          (5)          (6)          (7)
Save Situation (SS)            0.0193       0.0187       0.0195       0.0194       0.0192       0.0193       0.0191
                               (0.0238)     (0.0231)     (0.0241)     (0.0240)     (0.0237)     (0.0239)     (0.0236)
Leverage Index (LI)            0.00789***   0.00767***   0.00800***   0.00796**    0.00785***   0.00793**    0.00784***
                               (0.00239)    (0.00232)    (0.00242)    (0.00244)    (0.00236)    (0.00242)    (0.00235)
Run Difference (RunDiff)       -0.00054     -0.000529    -0.000551    -0.000549    -0.000541    -0.000547    -0.000540
                               (0.00095)    (0.000930)   (0.000969)   (0.000965)   (0.000952)   (0.000961)   (0.000950)
Home                           -0.00973     -0.00945     -0.00986     -0.00982     -0.00968     -0.00978     -0.00966
                               (0.00823)    (0.00799)    (0.00835)    (0.00831)    (0.00819)    (0.00827)    (0.00817)
Contract                       -0.00463     -0.00450     -0.00469     -0.00467     -0.00461     -0.00465     -0.00460
                               (0.00948)    (0.00921)    (0.00961)    (0.00957)    (0.00944)    (0.00953)    (0.00942)
Leverage Index x               -0.0128**    -0.0125**    -0.0130**    -0.0130**    -0.0128**    -0.0129**    -0.0128**
Save Situation                 (0.00475)    (0.00461)    (0.00482)    (0.00478)    (0.00473)    (0.00477)    (0.00471)
Run Difference x               -0.0111      -0.0107      -0.0112      -0.0112      -0.0110      -0.0111      -0.0110
Save Situation                 (0.00713)    (0.00692)    (0.00724)    (0.00720)    (0.00710)    (0.00717)    (0.00708)
Home x                         -0.00755     -0.00734     -0.00765     -0.00762     -0.00751     -0.00759     -0.00750
Save Situation                 (0.0136)     (0.0132)     (0.0138)     (0.0137)     (0.0135)     (0.0137)     (0.0135)
Contract x                     0.0271       0.0263       0.0275       0.0273       0.0269       0.0272       0.0269
Save Situation                 (0.0164)     (0.0160)     (0.0167)     (0.0166)     (0.0164)     (0.0165)     (0.0163)
Player Skill Control           0.201        0.196        0.204        0.203        0.201        0.202        0.200
                               (0.0156)     (0.0151)     (0.0160)     (0.0158)     (0.0156)     (0.0157)     (0.0156)
Season Fixed Effects (2011)    0.000980     0.000952     0.000993     0.000989     0.000975     0.000985     0.000973
                               (0.00295)    (0.00287)    (0.00299)    (0.00298)    (0.00294)    (0.00296)    (0.00293)
Constant                       -0.000018    -0.000018    -0.000018    -0.000018    -0.000018    -0.000018    -0.000018
                               (0.0176)     (0.0171)     (0.0179)     (0.0178)     (0.0175)     (0.0177)     (0.0175)
Observations                   26223        8918         17305        8724         17499        14933        11290

Note: This table reports the marginal effects of the independent variables in the specification including interaction variables.
Column (1) reports marginal effects at the mean value of each of the variables, and Columns (2)-(7) report the marginal effects at
the means of subsections of the data. The marginal effects were calculated using the dlogit2 command in Stata. Fixed effects for
individual pitchers were included in each regression (not reported). Standard errors are robust and adjusted for clustering at the
pitcher level. Robust standard errors in parentheses: * p < 0.05, ** p < 0.01, *** p < 0.001
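As a rough guide to how a marginal effect at the means relates to a logit coefficient, the sketch below applies the textbook formula for a logit model, in which the marginal effect of a variable equals its coefficient multiplied by p(1 - p) at the evaluation point. This is an illustrative Python approximation and not the dlogit2 routine. The function name is hypothetical, the probability of 0.314 is assumed from the sample proportion of on-base outcomes cited in the goodness-of-fit note, and the coefficients 0.0899 (SS) and 0.0369 (LI) are taken from the logit estimates reported in Table 13.

```python
def logit_marginal_effect(beta: float, probability: float) -> float:
    """Marginal effect of a covariate in a logit model, evaluated at a given
    predicted probability: beta * p * (1 - p)."""
    return beta * probability * (1.0 - probability)


# Assumed evaluation point: the sample proportion of on-base outcomes.
p_at_means = 0.314

ss_effect = logit_marginal_effect(0.0899, p_at_means)  # Save Situation
li_effect = logit_marginal_effect(0.0369, p_at_means)  # Leverage Index

print(f"SS marginal effect: {ss_effect:.4f}")
print(f"LI marginal effect: {li_effect:.5f}")
```

The results land close to the Column (1) entries for SS and LI. Exact agreement is not expected, since the predicted probability at the covariate means need not equal the sample proportion.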
Table 11
Primary Specification Linear Regression
On-Base

                               Linear        LI Omitted    RunDiff Omitted   RunDiff FEs
                               (1)           (2)           (3)               (4)
Save Situation (SS)            -0.0262***    -0.0227**     -0.0269***        -0.0119
                               (0.00731)     (0.00688)     (0.00704)         (0.0109)
Leverage Index (LI)            0.00346                     0.00350           0.00265
                               (0.00202)                   (0.00201)         (0.00279)
Run Difference (RunDiff)       -0.000501     -0.000560
                               (0.000956)    (0.000950)
Home                           -0.0110       -0.0119       -0.0107           -0.0120
                               (0.00669)     (0.00680)     (0.00672)         (0.00689)
Contract                       0.0155***     0.0177***     0.0166***         0.0133*
                               (0.00402)     (0.00357)     (0.00407)         (0.00522)
Player Skill Control           0.189         0.189         0.189             0.189
                               (0.0134)      (0.0133)      (0.0133)          (0.0132)
Season Fixed Effects (2011)    0.00208       0.00272       0.00218           0.00251
                               (0.00344)     (0.00295)     (0.00346)         (0.00319)
Constant                       0.471***      0.474***      0.469***          0.330***
                               (0.0105)      (0.0102)      (0.0102)          (0.0553)
Observations                   26223         26223         26223             26223
R2                             0.015         0.015         0.015             0.016

Note: This table reports regression results for linear specifications including the non-interacted variables of interest. The
results of the binary logit specification of the same variables are reported in Table 6. Fixed effects for individual pitchers
were included in the regression (not reported). The run difference fixed effects of Column (4) are also not reported. The
effect of the Player Skill Control variable is also significantly different from zero (p<0.001) in each of the specifications, but
it is a control variable that is expected to differ from zero. Standard errors are robust and adjusted for clustering at the
pitcher level.
Robust standard errors in parentheses: * p < 0.05, ** p < 0.01, *** p < 0.001
Table 12
Linear Regression of Interacted Variable Specification
On-Base

                                   Linear
                                   (1)
Save Situation (SS)                0.0182
                                   (0.0230)
Leverage Index (LI)                0.00799**
                                   (0.00246)
Run Difference (RunDiff)           -0.000592
                                   (0.000957)
Home                               -0.0101
                                   (0.00829)
Contract                           -0.00255
                                   (0.00932)
Leverage Index x Save Situation    -0.0128**
                                   (0.00463)
Run Difference x Save Situation    -0.0106
                                   (0.00687)
Home x Save Situation              -0.00676
                                   (0.0133)
Contract x Save Situation          0.0265
                                   (0.0163)
Player Skill Control               0.188
                                   (0.0134)
Season Fixed Effects (2011)        0.000789
                                   (0.00357)
Constant                           0.472***
                                   (0.0161)
Observations                       26223

Note: This table reports regression results for a linear specification that includes all of the
covariates used in the binary logit specifications of Table 8. Fixed effects for individual
pitchers were included in the regression (not reported). The effect of the Player Skill Control
variable is also significantly different from zero (p<0.001) in the specification, but it is a
control variable that is expected to differ from zero. Standard errors are robust and
adjusted for clustering at the pitcher level.
Robust standard errors in parentheses: * p < 0.05, ** p < 0.01, *** p < 0.001
Figure 12 (panels 12a and 12b): Partial regression plots of the LI and LI*SS variables
Table 13
Variation of Fixed Effects Specifications in Binary Logit Models
On-Base
Full Logit
Pitcher FE only
Year FE only
RunDiff FE
No FE
(1)
0.0899
(0.111)
0.0369***
(0.0112)
(2)
0.0900
(0.111)
0.0369***
(0.0112)
(3)
0.108
(0.117)
0.0370**
(0.0124)
(4)
0.0804
(0.154)
0.0625***
(0.0157)
(5)
0.103
(0.116)
0.0368**
(0.0124)
Run Difference
(RunDiff)
-0.00254
(0.00447)
-0.00254
(0.00447)
-0.00259
(0.00456)
Home
-0.0454
(0.0385)
-0.0454
(0.0385)
-0.0442
(0.0332)
-0.0494
(0.0399)
-0.0447
(0.0332)
Contract
-0.0351
(0.0440)
-0.0397
(0.0415)
-0.0386
(0.0494)
-0.0338
(0.0466)
-0.0423
(0.0490)
Leverage Index x
Save Situation
-0.0600**
(0.0222)
-0.0600**
(0.0222)
-0.0598**
(0.0218)
-0.0837**
(0.0257)
-0.0592**
(0.0217)
Run Difference x
Save Situation
-0.0516
(0.0333)
-0.0517
(0.0333)
-0.0518
(0.0385)
0.0284
(0.0608)
-0.0510
(0.0385)
Home x
Save Situation
-0.0353
(0.0635)
-0.0353
(0.0635)
-0.0338
(0.0584)
-0.0303
(0.0653)
-0.0328
(0.0584)
Contract x
Save Situation
0.126
(0.0768)
0.126
(0.0768)
0.100
(0.0786)
0.128
(0.0787)
0.101
(0.0785)
Player Skill Control
0.941
(0.0736)
0.941
(0.0732)
0.955
(0.0547)
0.945
(0.0729)
0.964
(0.0520)
Season Fixed
Effects (2011)
0.00458
(0.0138)
-0.0165
(0.0289)
0.0155
(0.0123)
Constant
0.0134
(0.0620)
0.0179
(0.0697)
0.00293
(0.0507)
-0.675*
(0.309)
0.00471
(0.0506)
26223
0.013
26223
0.013
26223
0.012
26221
0.014
26223
0.012
Save Situation (SS)
Leverage Index (LI)
Observations
Pseudo R2
-0.00266
(0.00456)
Note: This table reports regression results for logit specifications that vary based on the included fixed effects. The results of
the various individual fixed effects are not reported. The effect of the Player Skill Control variable is also significantly
different from zero (p<0.001) in each of the specifications, but it is a control variable that, as expected, has a coefficient
approximately equal to one in the logit model. Standard errors are robust and adjusted for clustering at the pitcher level.
Robust standard errors in parentheses: * p < 0.05, ** p < 0.01, *** p < 0.001
Table 14
Regression of On-Base Probability on Ranges of NSS States
On-Base

                                     Logit        Linear
                                     (1)          (2)
Win Non-Save Situation (WinNSS)      0.125*       0.0258*
                                     (0.0605)     (0.0119)
Lose Non-Save Situation (LoseNSS)    0.0863       0.0183
                                     (0.0458)     (0.00966)
Leverage Index (LI)                  0.00947      0.00200
                                     (0.0131)     (0.00254)
Leverage Index * WinNSS              -0.149       -0.0300*
                                     (0.0770)     (0.0151)
Leverage Index * LoseNSS             0.0281       0.00592
                                     (0.0190)     (0.00430)
Home                                 -0.0598      -0.0129*
                                     (0.0324)     (0.00581)
Contract                             0.0503**     -0.00172
                                     (0.0183)     (0.00822)
Player Skill Control                 0.945        0.197
                                     (0.0734)     (0.00998)
Season Fixed Effects (2011)          0.0225
                                     (0.0171)
Constant                             -0.0880      0.466***
                                     (0.0582)     (0.0119)
Observations                         26223        26223

Note: This table reports regression results for binary logit and linear
specifications that divide the NSS variable into two subsections. Fixed
effects for individual pitchers were included in the logit regression (not
reported). The linear regression omitted pitcher and season fixed effects.
Standard errors are robust in both specifications and adjusted for
clustering at the pitcher level in Column (1). The coefficients for LoseNSS in
(1) and (2), and the coefficients for LI*WinNSS and Home in (1), are
significant at the 10% level. The effect of the Player Skill Control variable is
also significantly different from zero (p<0.001) in each of the specifications, but
it is a control variable that, as expected, has a coefficient approximately equal
to one in the logit model.
Robust standard errors in parentheses: * p < 0.05, ** p < 0.01, *** p < 0.001
Table 15
Regressions of On-Base Probability on Run Difference Ranges
On-Base
Run Range:
[-3,3]
(3)
Run Range:
[-1,1]
(1)
Run Range:
[-2,2]
(2)
Save Situation (SS)
0.163
(0.154)
0.104
(0.156)
0.156
(0.118)
0.357*
(0.141)
Leverage Index (LI)
0.0662***
(0.0196)
0.0666***
(0.0159)
0.0707***
(0.0142)
0.0757**
(0.0252)
0.0335
(0.0183)
Run Difference
(RunDiff)
-0.0546
(0.0412)
-0.0396
(0.0225)
-0.0508**
(0.0163)
0.0407*
(0.0165)
0.00464
(0.00883)
Home
-0.118*
(0.0507)
-0.0983*
(0.0439)
-0.0820*
(0.0414)
0.00767
(0.0566)
-0.0898
(0.0467)
Contract
0.109
(0.0725)
0.0277
(0.109)
-0.000324
(0.0798)
-0.155*
(0.0649)
0.343***
(0.0273)
-0.0968**
(0.0337)
-0.0964***
(0.0289)
-0.0949***
(0.0246)
-0.101**
(0.0317)
0.0356
(0.0619)
-0.00433
(0.0362)
-0.0978**
(0.0360)
Leverage Index x
Save Situation
Run Difference x
Save Situation
Leading
(RunDiff > 0)
(4)
Trailing or Tied
(RunDiff ≤ 0)
(5)
Home x
Save Situation
-0.0363
(0.0911)
-0.0324
(0.0704)
0.000449
(0.0643)
-0.0924
(0.0752)
Contract x
Save Situation
0.0552
(0.115)
0.121
(0.0932)
0.120
(0.0872)
0.0940
(0.0796)
Player Skill Control
1.052
(0.112)
1.069
(0.0906)
1.015
(0.0819)
0.928
(0.0840)
0.995
(0.106)
0.0278
(0.0210)
0.0374
(0.0915)
0.0308
(0.0562)
0.0377*
(0.0152)
0.0335*
(0.0168)
-0.0738
(0.0917)
11346
0.018
-0.0631
(0.114)
16691
0.017
-0.0745
(0.0883)
20263
0.015
-0.236*
(0.119)
15896
0.014
-0.0508
(0.0831)
10327
0.016
Season Fixed
Effects
(2011)
Constant
Observations
Pseudo R2
Note: This table reports regression results for logit specifications that vary depending on ranges of team run differences at the time
of the plate appearance sample points. Fixed effects for individual pitchers were included in each regression (not reported).
Standard errors are robust and adjusted for clustering at the pitcher level. The effect of the Player Skill Control variable is also
significantly different from zero (p<0.001) in each of the specifications, but it is a control variable that, as expected, has a
coefficient approximately equal to one in the logit model. Robust standard errors in parentheses: * p < 0.05, ** p < 0.01, *** p < 0.001
Table 16
Clutchness Rankings: 2000

Rank  Name                    Team  Coefficient
1     Steve Kline             MON   0.1049
2     Jose Paniagua#          SEA   0.1133
3     Keith Foulke            CHW   0.1185
4     Jerry Spradlin*#        KCR   0.1382
5     Danny Graves            CIN   0.1388
6     Robb Nen                SFG   0.1426
7     Mike Williams           PIT   0.1502
8     Mike Fetters#           LAD   0.1658
9     Rick Aguilera           CHC   0.1724
10    Mike Remlinger          ATL   0.1766
11    Gabe White#             COL   0.1935
12    Eddie Guardado*#        MIN   0.2013
13    Shigetoshi Hasegawa#    ANA   0.2030
14    Steve Karsay            CLE   0.2035
15    Wayne Gomes*#           PHI   0.2092
16    Bob Howry#              CHW   0.2093
17    John Wetteland          TEX   0.2111
18    Jason Isringhausen      OAK   0.2143
19    LaTroy Hawkins*         MIN   0.2144
20    Ugueth Urbina^#         MON   0.2152
21    Mike Morgan*#           ARI   0.2153
22    Mike Timlin             BAL   0.2170
23    Armando Benitez         NYM   0.2407
24    Octavio Dotel*          HOU   0.2433
25    Trevor Hoffman          SDP   0.2462
26    Jeff Shaw               LAD   0.2510
27    Scott Strickland        MON   0.2529
28    Bob Wells               MIN   0.2533
29    Bob Wickman             CLE   0.2555
30    John Rocker             ATL   0.2607
31    Mariano Rivera          NYY   0.2635
32    Curtis Leskanic*        MIL   0.2674
33    Billy Koch              TOR   0.2704
34    Jeff Brantley           PHI   0.2729
35    Ryan Kohlmeier          BAL   0.2740
36    Troy Percival           ANA   0.2779
37    Dave Veres              STL   0.2813
38    Kazuhiro Sasaki         SEA   0.2867
39    Ricky Bottalico         KCR   0.2879
40    Scott Williamson*#      CIN   0.2896
41    Byung-Hyun Kim          ARI   0.2897
42    Todd Jones              DET   0.3076
43    Roberto Hernandez       TBR   0.3105
44    Kerry Ligtenberg        ATL   0.3310
45    Billy Wagner#           HOU   0.3449
46    Derek Lowe              BOS   0.3670
47    Antonio Alfonseca       FLA   0.3758
48    Matt Mantei             ARI   0.3906

Note: This table reports the “Clutchness” coefficients of the 48 pitchers in the sample who pitched in the 2000 season. The coefficients were calculated by
taking the weighted average of the SS, LI, and the pitcher-specific constant term obtained through linear regressions individually run for each pitcher. The
pitcher-specific fixed effect receives a 50 percent weight and the other two receive 25 percent each. The symbols following some pitcher names
correspond to the following caveats: # if the pitcher earned eight or fewer saves in the season, * if the pitcher made 30 percent or fewer of his
appearances in SS game states, ^ if the pitcher participated in fewer than 100 plate appearances in the season.
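The weighting scheme described in the note can be sketched as follows. This is a minimal Python illustration, not the code used to build the rankings. The function names and the per-pitcher regression inputs are hypothetical values invented for the example, and the ascending sort reflects the fact that rank 1 in the table has the lowest coefficient.

```python
def clutchness(ss_coef: float, li_coef: float, pitcher_constant: float) -> float:
    """Weighted average described in the table note: 25 percent SS coefficient,
    25 percent LI coefficient, 50 percent pitcher-specific constant term."""
    return 0.25 * ss_coef + 0.25 * li_coef + 0.50 * pitcher_constant


def rank_pitchers(scores: dict[str, float]) -> list[str]:
    """Order pitchers from lowest to highest coefficient (rank 1 first)."""
    return sorted(scores, key=scores.get)


# Hypothetical per-pitcher regression outputs (not taken from the paper's data).
scores = {
    "Pitcher A": clutchness(ss_coef=-0.02, li_coef=0.01, pitcher_constant=0.25),
    "Pitcher B": clutchness(ss_coef=0.03, li_coef=0.02, pitcher_constant=0.30),
}

for rank, name in enumerate(rank_pitchers(scores), start=1):
    print(rank, name, round(scores[name], 4))
```

Since the dependent variable is an on-base outcome, a lower coefficient corresponds to better pitcher performance under this weighting.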
Table 17
Clutchness Rankings: 2011

Rank  Name                    Team  Coefficient
1     J.J. Putz               ARI   0.1151
2     Sean Marshall#          CHC   0.1335
3     Eduardo Sanchez#        STL   0.1425
4     Bobby Parnell#          NYM   0.1536
5     Francisco Cordero       CIN   0.1604
6     Joel Hanrahan           PIT   0.1613
7     Chris Sale#             CHW   0.1616
8     Brandon League          SEA   0.1631
9     Ryan Madson             PHI   0.1689
10    Joel Peralta#           TBR   0.1748
11    Sergio Santos           CHW   0.1801
12    Carlos Marmol           CHC   0.1810
13    Joakim Soria            KCR   0.1843
14    David Hernandez         ARI   0.1965
15    Kevin Gregg             BAL   0.1998
16    Jonathan Papelbon       BOS   0.2017
17    Jason Isringhausen#     NYM   0.2047
18    Francisco Rodriguez     NYM   0.2054
19    Heath Bell              SDP   0.2233
20    Matt Capps              MIN   0.2325
21    Jason Motte#            STL   0.2365
22    Fernando Salas          STL   0.2391
23    Frank Francisco         TOR   0.2391
24    John Axford             MIL   0.2393
25    Brian Fuentes           OAK   0.2438
26    Jordan Walden           LAA   0.2440
27    Drew Storen             WAS   0.2475
28    Antonio Bastardo#       PHI   0.2496
29    Jose Valverde           DET   0.2531
30    Jim Johnson             BAL   0.2556
31    Kyle Farnsworth         TBR   0.2557
32    Neftali Feliz           TEX   0.2576
33    Kenley Jansen*#         LAD   0.2669
34    Rafael Betancourt#      COL   0.2682
35    Javy Guerra             LAD   0.2689
36    Mariano Rivera          NYY   0.2709
37    Jon Rauch               TOR   0.2762
38    Chris Perez             CLE   0.2809
39    Santiago Casilla*#      SFG   0.3019
40    Jonathan Broxton^#      LAD   0.3043
41    Brian Wilson            SFG   0.3062
42    Jose Contreras^#        PHI   0.3066
43    Juan Carlos Oviedo      FLA   0.3183
44    Craig Kimbrel           ATL   0.3188
45    Mark Melancon           HOU   0.3237
46    Huston Street           COL   0.3259
47    Andrew Bailey           OAK   0.3359
48    Joe Nathan              MIN   0.3665

Note: This table reports the “Clutchness” coefficients of the 48 pitchers in the sample who pitched in the 2011 season. The coefficients were calculated by taking
the weighted average of the SS, LI, and the pitcher-specific constant term obtained through linear regressions individually run for each pitcher. The pitcher-specific
fixed effect receives a 50 percent weight and the other two receive 25 percent each. The symbols following some pitcher names correspond to the
following caveats: # if the pitcher earned eight or fewer saves in the season, * if the pitcher made 30 percent or fewer of his appearances in SS game states, ^ if
the pitcher participated in fewer than 100 plate appearances in the season.
Table 18
Clutchness Rankings: 2000 (limited sample)

Rank  Name                    Team  Coefficient
1     Steve Kline             MON   0.1049
2     Keith Foulke            CHW   0.1185
3     Danny Graves            CIN   0.1388
4     Robb Nen                SFG   0.1426
5     Mike Williams           PIT   0.1502
6     Rick Aguilera           CHC   0.1724
7     Mike Remlinger          ATL   0.1766
8     Steve Karsay            CLE   0.2035
9     John Wetteland          TEX   0.2111
10    Jason Isringhausen      OAK   0.2143
11    Mike Timlin             BAL   0.2170
12    Armando Benitez         NYM   0.2407
13    Trevor Hoffman          SDP   0.2462
14    Jeff Shaw               LAD   0.2510
15    Scott Strickland        MON   0.2529
16    Bob Wells               MIN   0.2533
17    Bob Wickman             CLE   0.2555
18    John Rocker             ATL   0.2607
19    Mariano Rivera          NYY   0.2635
20    Billy Koch              TOR   0.2704
21    Jeff Brantley           PHI   0.2729
22    Ryan Kohlmeier          BAL   0.2740
23    Troy Percival           ANA   0.2779
24    Dave Veres              STL   0.2813
25    Kazuhiro Sasaki         SEA   0.2867
26    Ricky Bottalico         KCR   0.2879
27    Byung-Hyun Kim          ARI   0.2897
28    Todd Jones              DET   0.3076
29    Roberto Hernandez       TBR   0.3105
30    Kerry Ligtenberg        ATL   0.3310
31    Derek Lowe              BOS   0.3670
32    Antonio Alfonseca       FLA   0.3758
33    Matt Mantei             ARI   0.3906

Note: This table reports the Clutchness rankings of the 33 pitchers in the sample in the 2000 season who did not have sample size-related caveats. The coefficients were
calculated by taking the weighted average of the SS, LI, and the pitcher-specific constant term obtained through linear regressions individually run for each pitcher. The
pitcher-specific fixed effect receives a 50 percent weight and the other two receive 25 percent each. The following caveats led to the omission of 15 pitchers: the pitcher
earned eight or fewer saves in the season, the pitcher made 30 percent or fewer of his appearances in SS game states, and/or the pitcher participated in fewer than
100 plate appearances in the season.
Table 19
Clutchness Rankings: 2011 (limited sample)

Rank  Name                    Team  Coefficient
1     J.J. Putz               ARI   0.1151
2     Francisco Cordero       CIN   0.1604
3     Joel Hanrahan           PIT   0.1613
4     Brandon League          SEA   0.1631
5     Ryan Madson             PHI   0.1689
6     Sergio Santos           CHW   0.1801
7     Carlos Marmol           CHC   0.1810
8     Joakim Soria            KCR   0.1843
9     David Hernandez         ARI   0.1965
10    Kevin Gregg             BAL   0.1998
11    Jonathan Papelbon       BOS   0.2017
12    Francisco Rodriguez     NYM   0.2054
13    Heath Bell              SDP   0.2233
14    Matt Capps              MIN   0.2325
15    Fernando Salas          STL   0.2391
16    Frank Francisco         TOR   0.2391
17    John Axford             MIL   0.2393
18    Brian Fuentes           OAK   0.2438
19    Jordan Walden           LAA   0.2440
20    Drew Storen             WAS   0.2475
21    Jose Valverde           DET   0.2531
22    Jim Johnson             BAL   0.2556
23    Kyle Farnsworth         TBR   0.2557
24    Neftali Feliz           TEX   0.2576
25    Javy Guerra             LAD   0.2689
26    Mariano Rivera          NYY   0.2709
27    Jon Rauch               TOR   0.2762
28    Chris Perez             CLE   0.2809
29    Brian Wilson            SFG   0.3062
30    Juan Carlos Oviedo      FLA   0.3183
31    Craig Kimbrel           ATL   0.3188
32    Mark Melancon           HOU   0.3237
33    Huston Street           COL   0.3259
34    Andrew Bailey           OAK   0.3359
35    Joe Nathan              MIN   0.3665

Note: This table reports the Clutchness rankings of the 35 pitchers in the sample in the 2011 season who did not have sample size-related caveats. The coefficients were
calculated by taking the weighted average of the SS, LI, and the pitcher-specific constant term obtained through linear regressions individually run for each pitcher. The
pitcher-specific fixed effect receives a 50 percent weight and the other two receive 25 percent each. The following caveats led to the omission of 13 pitchers: the pitcher
earned eight or fewer saves in the season, the pitcher made 30 percent or fewer of his appearances in SS game states, and/or the pitcher participated in fewer than
100 plate appearances in the season.
Table 20
Leverage Index Clutchness Rankings: 2000 (refined sample)

Rank  Name                    Team  Coefficient
1     Steve Kline             MON   0.0580
2     Robb Nen                SFG   0.1045
3     Keith Foulke            CHW   0.1092
4     Mike Williams           PIT   0.1456
5     Danny Graves            CIN   0.1463
6     Mike Remlinger          ATL   0.1720
7     Steve Karsay            CLE   0.1875
8     Troy Percival           ANA   0.1928
9     John Wetteland          TEX   0.1958
10    Rick Aguilera           CHC   0.2119
11    Ryan Kohlmeier          BAL   0.2225
12    Ricky Bottalico         KCR   0.2248
13    Mike Timlin             BAL   0.2332
14    Trevor Hoffman          SDP   0.2345
15    Mariano Rivera          NYY   0.2377
16    John Rocker             ATL   0.2409
17    Armando Benitez         NYM   0.2421
18    Jeff Shaw               LAD   0.2455
19    Scott Strickland        MON   0.2480
20    Dave Veres              STL   0.2549
21    Bob Wells               MIN   0.2619
22    Jason Isringhausen      OAK   0.2692
23    Jeff Brantley           PHI   0.2855
24    Billy Koch              TOR   0.2856
25    Todd Jones              DET   0.2914
26    Roberto Hernandez       TBR   0.2968
27    Kazuhiro Sasaki         SEA   0.3037
28    Bob Wickman             CLE   0.3076
29    Byung-Hyun Kim          ARI   0.3135
30    Kerry Ligtenberg        ATL   0.3238
31    Derek Lowe              BOS   0.3357
32    Antonio Alfonseca       FLA   0.3812
33    Matt Mantei             ARI   0.4184

Note: This table reports the Leverage Index Clutchness rankings of the 33 pitchers in the sample in the 2000 season who did not have sample size-related caveats. The
coefficients were calculated by taking the weighted average of the LI*SS, LI, and the pitcher-specific constant term obtained through linear regressions (including the
interaction terms) that were individually run for each pitcher. The pitcher-specific fixed effect receives a 50 percent weight and the other two receive 25 percent each. The
following caveats led to the omission of 15 pitchers: the pitcher earned eight or fewer saves in the season, the pitcher made 30 percent or fewer of his appearances in SS
game states, and/or the pitcher participated in fewer than 100 plate appearances in the season.
Table 21
Leverage Index Clutchness Rankings: 2011 (refined sample)

Rank  Name                    Team  Coefficient
1     Joel Hanrahan           PIT   0.1302
2     Heath Bell              SDP   0.1509
3     Brandon League          SEA   0.1525
4     Francisco Cordero       CIN   0.1744
5     Ryan Madson             PHI   0.1756
6     Matt Capps              MIN   0.1798
7     Jonathan Papelbon       BOS   0.1834
8     J.J. Putz               ARI   0.1846
9     Jordan Walden           LAA   0.1853
10    Drew Storen             WAS   0.1855
11    Carlos Marmol           CHC   0.1932
12    Joakim Soria            KCR   0.2139
13    Francisco Rodriguez     NYM   0.2181
14    Kevin Gregg             BAL   0.2225
15    Sergio Santos           CHW   0.2231
16    Fernando Salas          STL   0.2249
17    Jose Valverde           DET   0.2259
18    John Axford             MIL   0.2265
19    Mariano Rivera          NYY   0.2418
20    Brian Fuentes           OAK   0.2477
21    Andrew Bailey           OAK   0.2516
22    Kyle Farnsworth         TBR   0.2549
23    Frank Francisco         TOR   0.2566
24    Juan Carlos Oviedo      FLA   0.2650
25    David Hernandez         ARI   0.2659
26    Jim Johnson             BAL   0.2701
27    Jon Rauch               TOR   0.2710
28    Neftali Feliz           TEX   0.2884
29    Javy Guerra             LAD   0.3189
30    Mark Melancon           HOU   0.3228
31    Chris Perez             CLE   0.3270
32    Huston Street           COL   0.3272
33    Joe Nathan              MIN   0.3348
34    Craig Kimbrel           ATL   0.3432
35    Brian Wilson            SFG   0.3669

Note: This table reports the Leverage Index Clutchness rankings of the 35 pitchers in the sample in the 2011 season who did not have sample size-related caveats. The
coefficients were calculated by taking the weighted average of the LI*SS, LI, and the pitcher-specific constant term obtained through linear regressions (including the
interaction terms) that were individually run for each pitcher. The pitcher-specific fixed effect receives a 50 percent weight and the other two receive 25 percent each.
The following caveats led to the omission of 13 pitchers: the pitcher earned eight or fewer saves in the season, the pitcher made 30 percent or fewer of his
appearances in SS game states, and/or the pitcher participated in fewer than 100 plate appearances in the season.
Figure 13 (panels 13a, 13b and 13c): Scatter plots of save opportunity conversion percentage, annual salary, and log(annual salary) against individual Clutchness rankings
Table 22
Wald-Wolfowitz Runs Test: Significant Results

Name                   Team   Year   Runs   Expected Runs   z          Save Opportunities
Mike Fetters           LAD    2000   2      3.9             -1.97*     7
Eddie Guardado         MIN    2000   2      4.3             -2.64**    11
Byung-Hyun Kim         ARI    2000   5      9.4             -2.43**    20
Brandon League         SEA    2011   7      9.8             -2.17*     42
Joe Nathan             MIN    2011   3      5.9             -2.67*     17
Francisco Rodriguez    NYM    2011   5      10.5            -3.24**    29

Note: This table reports the six significant results yielded by the Wald-Wolfowitz runs tests. The results of all 96 Wald-Wolfowitz tests can be found
in Appendix Table A-5. Significance symbols are as follows: * p < 0.05, ** p < 0.01, *** p < 0.001
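For readers unfamiliar with the runs test, the sketch below computes the expected number of runs and the z statistic from counts of converted and blown saves, using the standard normal approximation without a continuity correction. It is an illustrative Python version and not the routine used in the study. The function name is hypothetical, and the save and blown-save counts for Mike Fetters (5 and 2) and Eddie Guardado (9 and 2) are the values listed in the appendix runs-test table.

```python
import math


def runs_test(successes: int, failures: int, runs: int) -> tuple[float, float]:
    """Wald-Wolfowitz runs test (normal approximation).

    Returns the expected number of runs and the z statistic for a sequence
    with the given counts of successes, failures and observed runs."""
    n = successes + failures
    product = 2 * successes * failures
    expected = product / n + 1
    variance = product * (product - n) / (n ** 2 * (n - 1))
    z = (runs - expected) / math.sqrt(variance)
    return expected, z


# Mike Fetters, 2000: 5 saves, 2 blown saves, 2 observed runs.
fetters_expected, fetters_z = runs_test(successes=5, failures=2, runs=2)
print(f"Fetters:  expected runs {fetters_expected:.1f}, z {fetters_z:.2f}")

# Eddie Guardado, 2000: 9 saves, 2 blown saves, 2 observed runs.
guardado_expected, guardado_z = runs_test(successes=9, failures=2, runs=2)
print(f"Guardado: expected runs {guardado_expected:.1f}, z {guardado_z:.2f}")
```

A negative z indicates fewer runs than expected under independence, meaning that successes and failures tend to cluster in streaks.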
Appendix
Table A-1
Descriptive Statistics of Sample Pitchers
Name
Rick Aguilera
Antonio Alfonseca
John Axford
Andrew Bailey
Antonio Bastardo
Heath Bell
Armando Benitez
Rafael Betancourt
Ricky Bottalico
Jeff Brantley
Jonathan Broxton
Matt Capps
Santiago Casilla
Jose Contreras
Francisco Cordero
Octavio Dotel
Kyle Farnsworth
Neftali Feliz
Mike Fetters
Keith Foulke
Frank Francisco
Brian Fuentes
Wayne Gomes
Danny Graves
Kevin Gregg
Eddie Guardado
Javy Guerra
Joel Hanrahan
Shigetoshi Hasegawa
Season
2000
2000
2011
2011
2011
2011
2000
2011
2000
2000
2011
2011
2011
2011
2011
2000
2011
2011
2000
2000
2011
2011
2000
2000
2011
2000
2011
2011
2000
Team
CHC
FLA
MIL
OAK
PHI
SDP
NYM
COL
KCR
PHI
LAD
MIN
SFG
PHI
CIN
HOU
TBR
TEX
LAD
CHW
TOR
OAK
PHI
CIN
BAL
MIN
LAD
PIT
ANA
Saves
29
45
46
24
8
43
41
8
16
23
7
15
6
5
37
16
25
32
5
34
17
12
7
30
22
9
21
40
9
Blown
Saves
8
4
2
2
1
5
5
4
7
5
1
9
1
0
6
7
6
6
2
5
4
3
4
5
7
2
2
4
9
Conversion
Rate (%)
78
92
96
92
89
90
89
67
70
82
88
63
86
100
86
70
81
84
71
87
81
80
64
86
76
82
91
91
50
72
Season Salary
($)
3,500,000
380,000
442,500
465,000
419,000
7,500,000
3,437,500
3,775,000
1,500,000
500,000
7,000,000
7,150,000
1,300,000
2,500,000
12,125,000
240,000
2,600,000
457,160
550,000
375,000
4,000,000
5,000,000
925,000
2,100,000
4,200,000
875,000
488,000
1,400,000
900,000
SS
AB
140
217
195
108
84
185
192
123
144
132
34
131
51
34
169
111
117
147
72
207
88
97
72
186
120
66
84
177
217
NSS AB
70
94
110
62
141
71
112
114
175
124
28
143
160
26
105
448
114
105
129
143
130
153
252
202
155
196
111
97
98
Total AB
209
310
305
170
225
256
304
236
319
256
62
274
210
60
273
559
231
252
205
350
217
250
323
386
274
262
194
274
415
LaTroy Hawkins
David Hernandez
Roberto Hernandez
Trevor Hoffman
Bob Howry
Jason Isringhausen
Jason Isringhausen
Kenley Jansen
Jim Johnson
Todd Jones
Steve Karsay
Byung-Hyun Kim
Craig Kimbrel
Steve Kline
Billy Koch
Ryan Kohlmeier
Brandon League
Curtis Leskanic
Kerry Ligtenberg
Derek Lowe
Ryan Madson
Matt Mantei
Carlos Marmol
Sean Marshall
Mark Melancon
Mike Morgan
Jason Motte
Joe Nathan
Robb Nen
Juan Carlos Oviedo
Jose Paniagua
Jonathan Papelbon
Bobby Parnell
Joel Peralta
2000
2011
2000
2000
2000
2000
2011
2011
2011
2000
2000
2000
2011
2000
2000
2000
2011
2000
2000
2000
2011
2000
2011
2011
2011
2000
2011
2011
2000
2011
2000
2011
2011
2011
MIN
ARI
TBR
SDP
CHW
OAK
NYM
LAD
BAL
DET
CLE
ARI
ATL
MON
TOR
BAL
SEA
MIL
ATL
BOS
PHI
ARI
CHC
CHC
HOU
ARI
STL
MIN
SFG
FLA
SEA
BOS
NYM
TBR
14
11
32
43
7
33
7
5
9
42
20
14
46
14
33
13
37
12
12
42
32
17
34
5
20
5
9
14
41
36
5
31
6
6
0
3
8
7
5
7
4
1
5
4
9
6
8
4
5
1
5
1
2
5
2
3
10
4
5
1
4
3
5
6
3
3
6
2
100
79
80
86
58
83
64
83
64
91
69
70
85
78
87
93
88
92
86
89
94
85
77
56
80
83
69
82
89
86
63
91
50
75
73
1,115,000
423,500
6,000,000
6,600,000
325,000
825,000
700,000
416,000
975,000
3,650,000
1,200,000
762,500
419,000
355,000
333,333
200,000
2,250,000
1,450,000
255,000
625,000
4,833,333
2,831,000
2,533,333
1,600,000
421,000
800,000
435,000
11,250,000
5,500,000
3,650,000
275,000
12,000,000
433,500
925,000
101
149
177
196
121
182
122
42
166
187
198
128
210
141
189
64
157
97
90
232
142
80
196
155
121
78
103
97
175
167
110
136
98
105
269
142
138
95
168
122
78
147
200
84
131
177
96
208
137
56
93
236
127
147
104
120
131
152
188
276
165
94
81
101
234
119
170
151
370
290
315
290
288
304
198
218
366
270
329
318
299
347
325
120
250
330
216
379
246
200
327
306
309
445
267
191
256
268
343
255
267
256
Troy Percival
Chris Perez
J.J. Putz
Jon Rauch
Mike Remlinger
Mariano Rivera
Mariano Rivera
John Rocker
Francisco Rodriguez
Fernando Salas
Chris Sale
Eduardo Sanchez
Sergio Santos
Kazuhiro Sasaki
Jeff Shaw
Joakim Soria
Jerry Spradlin
Drew Storen
Huston Street
Scott Strickland
Mike Timlin
Ugueth Urbina
Jose Valverde
Dave Veres
Billy Wagner
Jordan Walden
Bob Wells
John Wetteland
Gabe White
Bob Wickman
Mike Williams
Scott Williamson
Brian Wilson
2000
2011
2011
2011
2000
2000
2011
2000
2011
2011
2011
2011
2011
2000
2000
2011
2000
2011
2011
2000
2000
2000
2011
2000
2000
2011
2000
2000
2000
2000
2000
2000
2011
ANA
CLE
ARI
TOR
ATL
NYY
NYY
ATL
NYM
STL
CHW
STL
CHW
SEA
LAD
KCR
KCR
WAS
COL
MON
BAL
MON
DET
STL
HOU
LAA
MIN
TEX
COL
CLE
PIT
CIN
SFG
32
36
45
11
12
36
44
24
23
24
8
5
30
37
27
28
7
43
29
9
12
8
49
29
6
32
10
34
5
30
24
6
36
10
4
4
5
4
5
5
3
6
6
2
2
6
3
7
7
4
5
4
4
6
2
0
7
9
10
10
9
4
7
5
2
5
76
90
92
69
75
88
90
89
79
80
80
71
83
93
79
80
64
90
88
69
67
80
100
81
40
76
50
79
56
81
83
75
88
74
2,350,000
2,225,000
4,000,000
3,500,000
1,400,000
7,250,000
14,911,700
290,000
12,166,666
425,000
425,000
425,000
435,000
4,000,000
5,383,333
4,000,000
962,500
418,000
7,300,000
202,500
4,250,000
3,200,000
7,000,000
1,366,667
3,200,000
414,000
700,000
6,500,000
630,000
2,400,000
1,000,000
300,000
6,500,000
172
145
163
81
170
201
178
147
184
145
122
67
151
166
134
154
84
209
139
67
112
40
192
176
77
183
158
188
125
155
120
78
168
49
103
47
144
141
110
55
104
123
150
166
51
109
99
115
102
287
94
100
133
183
14
109
134
52
70
193
81
204
154
187
413
75
221
248
210
225
310
311
222
251
307
294
287
118
260
264
248
256
389
303
239
198
293
54
301
310
129
253
350
269
326
308
307
491
243
Table A-2.1
Effect of Run Difference*Save Situation Interaction Variable

RunDiff*SS Interaction    Observations   Mean          Std. Dev.    Minimum       Maximum
Interaction Effect        26223          -0.0109862    0.0012338    -0.0128623    -0.0010391
Standard error            26223          0.0021855     0.0015608    0.0005329     0.015625
Z                         26223          -6.720539     2.892888     -12.00559     -0.3968737

Table A-2.2
Effect of Home*Save Situation Interaction Variable

Home*SS Interaction       Observations   Mean          Std. Dev.    Minimum       Maximum
Interaction Effect        26223          -0.0073779    0.0006544    -0.0082426    -0.0012449
Standard error            26223          0.0013126     0.0001505    0.0005213     0.0024951
Z                         26223          -5.705046     0.889937     -7.206858     -1.768682

Table A-2.3
Effect of Contract*Save Situation Interaction Variable

Contract*SS Interaction   Observations   Mean          Std. Dev.    Minimum       Maximum
Interaction Effect        26223          0.0215835     0.0022451    0.0036194     0.0251713
Standard error            26223          0.0021625     0.0006044    0.0010952     0.0073849
Z                         26223          10.72835      3.17989      1.75991       22.23528

Note: These tables report the results of the inteff Stata command, which correctly calculates the
coefficient, sign and significance of interacted variables in non-linear models, such as the binary logit
specification (full results reported in Table 8). The results of the LI*SS interaction are reported in Table 9,
and these tables report the results of the remaining three interactions in the related specification.
Table A-3
Goodness-of-fit Test Results

Table A-3.1
Collinearity Test

Collinearity Diagnostics
Variable          VIF     SQRT VIF   Tolerance   R-Squared
Save Situation    1.18    1.09       0.8441      0.1559
Leverage Index    1.12    1.06       0.8891      0.1109
Run Difference    1.06    1.03       0.9457      0.0543
Mean VIF          1.12

      Eigenvalue   Condition Index
1     2.4370       1
2     0.8632       1.6802
3     0.4302       2.3801
4     0.2696       3.0067
Condition Number           3.0067
Det(correlation matrix)    0.8412

Table A-3.2
Classification

Logistic model for ob

              True
Classified    D        ~D       Total
+             4636     8195     12831
-             3586     9806     13392
Total         8222     18001    26223

Classified + if predicted Pr(D) >= .314
True D defined as ob != 0

Sensitivity                      Pr( +| D)    56.39%
Specificity                      Pr( -|~D)    54.47%
Positive predictive value        Pr( D| +)    36.13%
Negative predictive value        Pr(~D| -)    73.22%

False + rate for true ~D         Pr( +|~D)    45.53%
False - rate for true D          Pr( -| D)    43.61%
False + rate for classified +    Pr(~D| +)    63.87%
False - rate for classified -    Pr( D| -)    26.78%

Correctly classified                          55.07%
Table A-3.3
Logit goodness-of-fit tests

Measures of Fit for logit of ob
Log-Lik Intercept Only:     -16308.217     Log-Lik Full Model:             -16102.038
D(26119):                   32204.076      LR(9):                          412.357
                                           Prob > LR:                      0
McFadden's R2:              0.013          McFadden's Adj R2:              0.006
ML (Cox-Snell) R2:          0.016          Cragg-Uhler(Nagelkerke) R2:     0.022
McKelvey & Zavoina's R2:    0.023          Efron's R2:                     0.016
Variance of y*:             3.368          Variance of error:              3.29
Count R2:                   0.687          Adj Count R2:                   0
AIC:                        1.236          AIC*n:                          32412.076
BIC:                        -233540.873    BIC':                           -320.788
BIC used by Stata:          32305.82       AIC used by Stata:              32224.076

Table A-3.4
Linear goodness-of-fit tests

Measures of Fit for regress of ob
Log-Lik Intercept Only:     -17069.129     Log-Lik Full Model:             -16866.944
D(26119):                   33733.887      LR(9):                          404.37
                                           Prob > LR:                      0
R2:                         0.015          Adjusted R2:                    0.011
AIC:                        1.294          AIC*n:                          33941.887
BIC:                        -232011.062    BIC':                           -312.801
BIC used by Stata:          33835.631      AIC used by Stata:              33753.887
Note: Tables A-3.1 through A-3.4 report goodness-of-fit tests for the binary logit and linear models used
throughout the paper. Table A-3.1 presents variance inflation factors (VIFs) to test for collinearity in the
logit model; the collin command was used in Stata. Table A-3.2 reports classification results, generated
through the estat command, in the form of a prediction table based on the binary logit specification. As noted in
the text, the classification threshold was changed from the baseline 0.5 to 0.314 to reflect the sample proportion
of positive on-base outcomes in the data. Tables A-3.3 and A-3.4 present a number of goodness-of-fit measures for
the logit and linear models, respectively, using the fitstat command in Stata.
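
As an illustration of how the summary rates in Table A-3.2 follow from the four cell counts of the prediction table, a minimal Python sketch is given below. It is not the Stata routine used in the paper; the function name classification_rates and the dictionary keys are assumptions introduced for this example, and only the cell counts are taken from the table.

```python
# Cell counts copied from Table A-3.2 (classification threshold 0.314).
TP, FP = 4636, 8195   # classified +: true D, true ~D
FN, TN = 3586, 9806   # classified -: true D, true ~D


def classification_rates(tp, fp, fn, tn):
    """Return summary rates of a 2x2 prediction table as fractions.

    Hypothetical helper written for this illustration only.
    """
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),                # Pr(+ | D)
        "specificity": tn / (tn + fp),                # Pr(- | ~D)
        "positive_predictive_value": tp / (tp + fp),  # Pr(D | +)
        "negative_predictive_value": tn / (tn + fn),  # Pr(~D | -)
        "correctly_classified": (tp + tn) / total,
        "sample_share_of_D": (tp + fn) / total,       # basis for the threshold
    }


rates = classification_rates(TP, FP, FN, TN)
for name, value in rates.items():
    print(f"{name}: {value:.2%}")
```

The last entry is the sample proportion of positive on-base outcomes, which rounds to the 0.314 threshold described in the note.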
Table A-4.1
Wald-Wolfowitz Runs Test: 2000 Results

Name                  Team  Runs  Expected Runs  Z         Saves  Blown Saves
Shigetoshi Hasegawa   ANA   11    10.0           0.49      9      9
Troy Percival         ANA   19    16.2           1.20      32     10
Byung-Hyun Kim        ARI   5     9.4            -2.43**   14     6
Matt Mantei           ARI   6     6.1            -0.10     17     3
Mike Morgan           ARI   2     2.7            -1.41     5      1
Kerry Ligtenberg      ATL   4     4.4            -0.54     12     2
Mike Remlinger        ATL   6     7.0            -0.71     12     4
John Rocker           ATL   6     6.3            -0.35     24     3
Ryan Kohlmeier        BAL   3     2.9            0.41      13     1
Mike Timlin           BAL   6     9.0            -1.65     12     6
Derek Lowe            BOS   9     9.9            -0.75     42     5
Rick Aguilera         CHC   13    13.5           -0.27     29     8
Keith Foulke          CHW   10    9.7            0.21      34     5
Bob Howry             CHW   8     6.8            0.73      7      5
Danny Graves          CIN   11    9.6            1.03      30     5
Scott Williamson      CIN   3     4.0            -1.08     6      2
Steve Karsay          CLE   11    13.4           -1.07     20     9
Bob Wickman           CLE   15    12.4           1.47      30     7
Gabe White            COL   4     5.4            -1.04     5      4
Todd Jones            DET   9     8.3            0.69      42     4
Antonio Alfonseca     FLA   9     8.3            0.66      45     4
Octavio Dotel         HOU   9     10.7           -0.88     16     7
Billy Wagner          HOU   8     8.2            -0.11     6      9
Ricky Bottalico       KCR   11    10.7           0.13      16     7
Jerry Spradlin        KCR   8     6.1            1.32      7      4
Mike Fetters          LAD   2     3.9            -1.97*    5      2
Jeff Shaw             LAD   10    12.1           -1.15     27     7
Curtis Leskanic       MIL   17    20.0           -1.10     35     13
Eddie Guardado        MIN   2     4.3            -2.64**   9      2
LaTroy Hawkins        MIN   1     1.0            -         14     0
Bob Wells             MIN   12    11.0           0.46      10     10
Steve Kline           MON   7     7.2            -0.16     14     4
Scott Strickland      MON   6     6.5            -0.37     9      4
Ugueth Urbina         MON   4     4.2            -0.23     8      2
Armando Benitez       NYM   11    9.9            0.87      41     5
Mariano Rivera        NYY   11    9.8            0.93      36     5
Jason Isringhausen    OAK   11    12.6           -0.88     33     7
Jeff Brantley         PHI   8     9.2            -0.82     23     5
Wayne Gomes           PHI   7     6.1            0.63      7      4
Mike Williams         PIT   11    9.3            1.18      24     5
Trevor Hoffman        SDP   13    13.0           -0.02     43     7
Robb Nen              SFG   7     9.9            -2.33     41     5
Jose Paniagua         SEA   3     4.8            -1.44     5      3
Kazuhiro Sasaki       SEA   7     6.6            0.56      37     3
Dave Veres            STL   15    12.3           1.50      29     7
Roberto Hernandez     TBR   13    13.8           -0.41     32     8
John Wetteland        TEX   14    15.2           -0.58     34     9
Billy Koch            TOR   10    9.7            0.24      33     5
Note: This table reports the results of the Wald-Wolfowitz runs tests for closers in the sample from the 2000 season.
Significance symbols are as follows: * p < 0.05, ** p < 0.01, *** p < 0.001
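
For readers who wish to check individual rows, the following Python sketch computes the expected number of runs and the Z statistic under the standard large-sample normal approximation to the Wald-Wolfowitz runs test. This is an assumption about the computation rather than the code used in the paper, although it appears consistent with the reported figures: the counts in the first two rows of Table A-4.1 reproduce the tabulated values. The function names count_runs and runs_test and the toy outcome string are hypothetical.

```python
import math


def count_runs(outcomes):
    """Count maximal blocks of identical consecutive outcomes."""
    if not outcomes:
        return 0
    return 1 + sum(1 for prev, cur in zip(outcomes, outcomes[1:]) if prev != cur)


def runs_test(n_saves, n_blown, runs):
    """Return (expected_runs, z) under the null of random ordering.

    z is None when the variance is zero, for example when a closer
    has no blown saves (reported as "-" in the table).
    """
    n = n_saves + n_blown
    if n == 0:
        raise ValueError("at least one outcome is required")
    pairs = 2 * n_saves * n_blown
    expected = pairs / n + 1
    if n_saves == 0 or n_blown == 0:
        return expected, None
    variance = pairs * (pairs - n) / (n ** 2 * (n - 1))
    if variance <= 0:
        return expected, None
    return expected, (runs - expected) / math.sqrt(variance)


# Toy sequence (hypothetical): S = save, B = blown save.
toy_runs = count_runs("SSBSSSBB")

# Counts from the first two rows of Table A-4.1.
hasegawa = runs_test(9, 9, 11)
percival = runs_test(32, 10, 19)
print(toy_runs, hasegawa, percival)
```

A negative Z indicates fewer runs than expected under random ordering, that is, outcomes that cluster into streaks; a positive Z indicates more frequent alternation.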
Table A-4.2
Wald-Wolfowitz Runs Test: 2011 Results

Name                  Team  Runs  Expected Runs  Z         Saves  Blown Saves
J.J. Putz             ARI   9     8.3            0.66      45     4
David Hernandez       ARI   4     5.7            -1.48     11     3
Craig Kimbrel         ATL   12    14.6           -1.46     46     8
Jim Johnson           BAL   6     7.4            -0.87     9      5
Kevin Gregg           BAL   15    11.6           1.77      22     7
Jonathan Papelbon     BOS   34    6.5            -0.55     31     3
Carlos Marmol         CHC   18    16.5           0.68      34     10
Sean Marshall         CHC   7     5.4            1.12      5      4
Sergio Santos         CHW   13    11.0           1.25      30     6
Chris Sale            CHW   3     4.2            -1.36     8      2
Francisco Cordero     CIN   9     11.3           -1.54     37     6
Chris Perez           CLE   7     8.2            -1.12     36     4
Rafael Betancourt     COL   4     6.3            -1.61     8      4
Huston Street         COL   8     8.0            -0.03     29     4
Jose Valverde         DET   1     1.0            -         49     0
Juan Carlos Oviedo    FLA   11    11.3           -0.19     36     6
Mark Melancon         HOU   10    9.0            0.65      20     5
Joakim Soria          KCR   11    12.2           -0.65     28     7
Jordan Walden         LAA   14    16.2           -0.97     32     10
Jonathan Broxton      LAD   3     2.8            0.58      7      1
Javy Guerra           LAD   4     4.7            -0.98     21     2
Kenley Jansen         LAD   3     2.7            0.71      5      1
John Axford           MIL   4     4.8            -1.73     46     2
Joe Nathan            MIN   3     5.9            -2.67**   14     3
Matt Capps            MIN   11    12.3           -0.56     15     9
Jason Isringhausen    NYM   4     6.1            -1.45     7      4
Bobby Parnell         NYM   7     7.0            0.00      6      6
Francisco Rodriguez   NYM   5     10.5           -3.24**   23     6
Mariano Rivera        NYY   9     10.0           -0.80     44     5
Brian Fuentes         OAK   7     5.8            1.05      12     3
Andrew Bailey         OAK   4     4.7            -1.10     24     2
Jose Contreras        PHI   1     1.0            -         5      0
Ryan Madson           PHI   5     4.8            0.42      32     2
Antonio Bastardo      PHI   2     2.8            -1.87     8      1
Joel Hanrahan         PIT   9     8.3            0.71      40     4
Heath Bell            SDP   11    10.0           0.85      43     5
Santiago Casilla      SFG   2     2.7            -1.58     6      1
Brian Wilson          SFG   8     9.8            -1.36     36     5
Brandon League        SEA   7     9.8            -2.17*    37     5
Jason Motte           STL   4     6.5            -1.75     9      4
Fernando Salas        STL   10    10.6           -0.36     24     6
Eduardo Sanchez       STL   3     3.9            -0.91     5      2
Kyle Farnsworth       TBR   11    10.7           0.19      25     6
Joel Peralta          TBR   4     4.0            0.00      6      2
Neftali Feliz         TEX   11    11.1           -0.07     32     6
Frank Francisco       TOR   7     7.5            -0.36     17     4
Jon Rauch             TOR   7     7.9            -0.53     11     5
Drew Storen           WSN   11    10.0           0.85      43     5

Note: This table reports the results of the Wald-Wolfowitz runs tests for closers in the sample from the 2011 season.
Significance symbols are as follows: * p < 0.05, ** p < 0.01, *** p < 0.001