Measured Strength: Estimating Alliance in the International
System, 1816-2000*
Brett V. Benson†
Joshua D. Clinton‡
May 15, 2013
Word Count: 10,800
Keywords: Alliances; Measurement; Formal Treaty Terms; Reliability
Abstract
Alliances play a critical role in the international system, and understanding the determinants and consequences of their strength is an important task. Many have argued
that the strength of an alliance is determined by both the power of the signatories involved and the formal terms of the agreement, but using these insights to measure the
strength of alliances is difficult. We use a Bayesian statistical measurement model to
estimate the strength of all alliances signed between 1816 and 2000 along two theoretically
derived dimensions: the strength of the signatories involved and the strength of the
formal terms of the alliance. The resulting estimates provide a measure of
alliance strength based on the terms of the alliance itself, which allows for the investigation of many possible questions. Exploring the validity of the resulting estimates also
reveals support for some core intuitions that were previously hard to verify regarding
the relationship between signatory strength and the formal terms of a treaty, the extent
to which alliance balancing occurs, and whether alliances with stronger treaty terms are
also those in which allies are less likely to renege if conflict occurs.
* The authors would like to thank Ashley Leeds and Michaela Mattes for comments on an earlier version of this manuscript.
† Associate Professor of Political Science, Vanderbilt University. E-mail: [email protected]. PMB 505, 230 Appleton Place, Nashville TN, 37203-5721.
‡ Associate Professor of Political Science and Co-Director of the Center for the Study of Democratic Institutions, Vanderbilt University. E-mail: [email protected]. PMB 505, 230 Appleton Place, Nashville TN, 37203-5721.
Interstate alliances are a critical feature of the international system and understanding
their causes and consequences is important for better understanding the conditions under
which international security can be achieved. Many of the main questions related to military
alliances entail a conceptualization of alliance strength. For example, in considering the effect
of an alliance on interstate conflict, an important question is whether the strength of an
alliance affects the military decisions of both the alliance members and the prospective
targets of the alliance. To investigate such a question, scholars might begin by asking how
powerful the allies are relative to their targets and whether the obligations the signatories
have agreed to require them to expend costs that enhance the military capacity of the alliance
prior to or during war. In this illustration, alliance strength is clearly related to both the
characteristics of the signatories and the content of the associated agreement between the
alliance members. Because many factors might be related to these two dimensions of alliance
strength, the question of how best to characterize the dimensions – and therefore also of the
relationship between them – is an elusive one. In this paper, we characterize the strength of
an alliance along theoretically implied dimensions using available data and accounting for the
uncertainty in the measures.
Such a measure of alliance strength would provide the ability to ask and answer many
important, but elusive, questions in the study of international politics. For example, does the
strength of the alliance signatories relate to the strength of the treaty terms? Do powerful
alliances, both in terms of the signatory capabilities and the alliance terms, deter conflicts
(Smith 1995, Benson et al 2013a)? Are treaties with stronger terms less likely to be violated
(perhaps because the likelihood of violation resulted in the creation of a weaker alliance)
(Leeds 2003a)? How does alliance strength affect the distribution of power and the likelihood
of war in the international system (Mearsheimer 1990, Organski and Kugler 1980, Powell
1996)? Does fear of opportunism among allies lead to weaker treaty terms (Benson 2012,
Snyder 1997, Fearon 1997)? Empirical investigations of many of these questions have either
produced mixed results or have not been attempted directly because of measurement difficulties.
We lack a measure of alliance strength that reflects its nuanced nature.
Our goal is to generate a measure of alliance strength that helps advance research on
these and other questions related to alliances and international security. Because prominent
theories raise questions related to both the signatory characteristics and treaty terms, our
measure depends on two dimensions of strength: 1) the potential military capacity of an
alliance given the combined strength and other characteristics of the signatories, and 2) the
specific terms of the treaty that obligate signatories to expend costs prior to or during war
for the purpose of limiting or enhancing the wartime military capacity of the alliance. As an
illustration of the relevance of both dimensions, consider a comparison between the 1958
treaty signed between the United Arab Republic and the Kingdom of Yemen (UAR-Yemen)
and the 1975 Helsinki Final Act. In the UAR-Yemen alliance, the signatories agreed to
strong treaty terms, calling for integrated military command and joint military bases for both
offensive and defensive purposes. Yet the combined strength of the signatories was relatively
weak. By contrast, the 1975 Helsinki Final Act was signed by both NATO and Warsaw Pact
members, making the combined military capacity of the signatories unparalleled by other
alliances at the time, but the treaty terms of this nonaggression pact merely required states
to “respect each other’s sovereign equality and individuality as well as all the rights inherent
in and encompassed by its sovereignty, including in particular the right of every State to
juridical equality, to territorial integrity, and to freedom and political independence” (Article
1). There were no obligations in the formal agreement of the Helsinki Final Act that would
require signatories to utilize their military strength against any other state. While extreme,
these examples illustrate two points: 1) the strength of an alliance varies both in terms of
the military capacity of the signatories involved and the extent to which the formal terms
of a treaty require signatories to expend costs that enhance or limit their combined military
power, and 2) the military capacity of the signatories and the strength of the treaty terms
may not always be correlated.
Based on theoretical arguments about the correlates of alliance strength and using observable characteristics of an alliance and the associations between these characteristics, we
measure the strength of every available military alliance – including multilateral alliances and
alliances without a treaty – signed between 1816 and 2000.1 Our approach is similar to the
approach taken by scholars interested in measuring the ideology of elected and unelected officials (e.g., Poole and Rosenthal 1997; Martin and Quinn 2002; Clinton, Jackman and Rivers
2004), the positions of a political party in an underlying policy space (Budge et al. 2001),
the ideology of a congressional district in the United States (Levendusky, Pope and Jackman
2008), the extent to which a country is democratic (Pemstein, Meserve, and Melton 2010), or
the positions taken by a country in the United Nations General Assembly (Voeten 2000). We
use the formal terms of the alliance agreement and characteristics of the countries involved at
the time of the signing to construct a measure of alliance strength based on the underlying
associations of the observable measures.
We make several contributions. First, we show how a Bayesian latent trait model (Quinn
2004) can recover a theoretically informed, multi-dimensional estimate of alliance strength
that reflects both the terms of the formal alliance agreement and the characteristics of the signatories. Our measure can quantify the influence of various treaty terms on the strength of an
alliance, a task that has previously eluded scholars interested in alliances and forced existing
work to rely on coarse indicators that cannot discriminate among alliances within the same broad
classification (e.g., “defensive alliances”). We can also show that, across all alliances, there is
almost no relationship between the strength of the involved signatories and the strength of
the formal terms. Second, because we provide a measure for a concept that lacks an
agreed-upon measure, our estimates allow us to test existing intuitions about alliance strength.
For example, our measure allows us to compare the relative strength of alliances, and doing so reveals clear evidence of “balancing” among post-war East Asian alliances. Moreover,
whereas prior work interested in the impact of treaty strength on durability was forced to
rely on proxy measures due to the lack of a systematic manner of measuring what various
treaty terms imply about the strength of an alliance, our measure of strength confirms that
alliances with weaker formal obligations are also less durable in times of war. Confirming core
intuitions not only helps establish the validity of our measure but also illustrates the potential
of our measure for additional questions. Third, because any assessment of alliance strength
is inherently ambiguous, we can quantify how certain we are about the resulting estimates
(Jackman 2009b). Fourth, our method is sufficiently general that we can extend the model
to all alliance treaties – including multilateral alliances and alliances without a target. Finally, although we think our estimates are based on strong theoretical foundations and possess
strong conceptual validity, the statistical measurement model empowers scholars to construct
their own measures if their questions of interest are sufficiently different or if they choose to
make alternative assumptions about the underlying relationships.
In validating our measure using qualitative and quantitative information, we are able to
draw several substantive conclusions. First, we show that there is a very weak relationship
between signatory strength and treaty strength. This finding suggests that relatively weak
signatories may occasionally wish to strengthen their alliance through treaty terms that are
designed to expand their combined capabilities through costly peacetime military coordination
and sweeping wartime obligations. On the other hand, relatively strong alliance signatories
may wish to curtail allies’ access to the aggregate capabilities of the alliance by designing
provisions that make it costless to escape or that restrict conditions for intervention. Second,
we also show that prominent alliances within a theatre of operations show some evidence of
“balancing” both in terms of the strength of the signatories involved and the commitment
created by the treaty. Confirming our estimation of the strength of some prominent alliances
against our historical expectations lends face validity to the measure. Finally, we use our
measure to replicate existing work where previously only indirect measures of treaty strength
were used to analyze the reliability of alliances. Applying our measure to an existing question
where it would be of value further confirms its usefulness and the validity of the estimates that
we recover. This last point is important because whereas existing work is forced to “black
box” the measure of treaty strength and use characteristics that are thought to predict treaty
strength, the estimate we provide is a direct measure of the concept of interest.
The outline of the paper is as follows. Section 2 briefly recaps the extensive literature
dealing with the strength of international alliances to extract the primary dimensions that
scholars have identified as influencing the strength of an alliance – the terms of the alliance
and the characteristics of the signatories. Section 3 describes the Bayesian latent variable
model we use to measure alliance strength and it describes the observable characteristics we
use to estimate our two-dimensional estimate of alliance strength. Section 4 establishes the
predictive validity of the measure and confirms Leeds’ (2003) findings that alliances with
weaker treaty terms are more likely to be violated in times of war. Section 5 concludes by
discussing the possible uses and extensions of both our estimates of alliance strength and also
the Bayesian latent variable measurement model we employ.
1 Conceptualizing Alliance Strength
For many prominent research programs, a common way to conceptualize the strength of an
interstate alliance is to consider the military capacity that can be generated as a result of the
joint cooperation of the allies in an interstate war. This notion entails a calculation of the
sum capacity of the signatories. Balance of power theories privilege the role of alliances in
world politics by considering that the distribution of power that results from the joint military strength of competing alliance networks affects the stability of the international system
(Morgenthau 1948, Organski 1968, Waltz 1979, Walt 1987). There is a long debate, which
has yielded mixed results in the empirical literature, about the effect of alliances in various
theories of the distribution of power and stability.2 In these empirical analyses, scholars typically utilize some measure of aggregate capabilities of alliance signatories to approximate the
strength of an alliance.
The same conception of alliance strength is relevant for research on the deterrence effect of
military alliances. Theories of alliances and deterrence examine how the combined potential
military force of the allies might affect an adversary’s calculation to challenge one of the
allies (Morrow 1994, Smith 1995, Zagare and Kilgour 2003, Yuen 2009, Benson 2012, Benson
et al 2013b). Empirical investigations of the relationship between alliances and deterrence
have been forced to rely on coarse indicators of the presence of an alliance and then control
for factors that may be related to alliance strength such as a signatory’s capabilities and its
presence in alliance networks (Leeds 2003b, Benson 2011).
One reason for this approach is that recent studies of the deterrent effects of alliances have
sought to examine how the content of an alliance also affects conflict, holding fixed factors
related to the military capacity of the signatory.3 Scholars have long recognized that the
terms of the promises made between countries can affect the expected military strength of
their commitments to one another.4 By including provisions that require signatories either to
pay ex ante costs to facilitate military coordination or to behave in particular ways subject
to the payment of costs for reneging, alliance members can design their agreements either
to enhance or to limit the expected strength of their alliance. A non-aggression pact, for
example, should limit expectations of potential military collusion between signatories while,
by contrast, an offensive or defensive pact should raise expectations of the potential for an
alliance member to make full use of the combined military capabilities of the alliance under
particular circumstances. Additionally, many alliance agreements require signatories to pay
costs prior to war by taking such measures as integrating military command, exchanging
military and economic aid, and establishing military bases in one another’s territory. Paying
such costs in advance of war may create cooperative synergies between signatories that expand
their alliance strength beyond what is estimated by a raw calculation of their summed military
capacity (Morrow 1994). If the terms of a treaty matter, focusing only on the military capacity
of the involved countries will fail to account for the differences in treaty terms that can expand
or limit strength as well as increase or decrease the likelihood that the military capacity of
the involved countries will be exercised to further the objectives of the alliance.
In spite of the relevance of treaty content to the overall strength of an alliance, scholars lack
a measure that might advance empirical research on many important questions. As we have
seen in the deterrence literature, a standard work-around for the missing measure is to use
coarse measures of different types of alliance provisions that might serve as indicators of strong
military treaties with high expectation of ally intervention or powerful military coordination.
Another approach is to consider alliance content indirectly by examining correlates of treaty
strength. For example, because a good measure of treaty strength does not exist, Leeds (2003a)
is forced to study the reliability of alliances by analyzing the relationship between signatory
characteristics thought to correlate with treaty content, and whether signatories deliver on
their promises during wars. In this study of reliability – as well as other important research
programs such as deterrence, the balance of power, and determinants of alliance formation
– a measure of alliance strength that accounts for many of the factors that theorists believe
affect alliance strength would be most valuable.
Given that we think that alliances differ both in terms of signatories involved and the
particular terms of the alliance, we seek a measure of alliance strength that reflects these
two notions of potential military capacity while also accounting for the inevitable uncertainty
of any such measure given the concepts’ inherent ambiguity. As a starting point, it is natural to categorize existing available measures of alliance strength according to two separate
dimensions. One dimension, signatory characteristics, consists of particular qualities of the
allies that contribute to the military capacity of the signatories if there is a war and all the
allies cooperate and join the war. The second dimension, treaty characteristics, includes the
provisions of a military alliance that limit or enhance the ability of the signatories to jointly
mobilize their combined military capabilities to fight a war as well as the expectation that
they will make their military capacity available. The first dimension gives a measure of the
total adjusted potential military strength of an alliance, but the actual military strength of
the alliance is only a fraction of the total potential strength if the treaty provisions restrict
the amount of capabilities that any signatory can expect to gain access to if there is a war.
That is, the first dimension can be interpreted as providing an estimate of the total potential
adjusted military capacity of an alliance, while the second dimension characterizes how
much of the potential military capacity the signatories agree to make available for the benefit
of the alliance and whether additional military coordination can expand the total adjusted
military capacity. While there may certainly be a relationship between the two dimensions,
nothing requires a relationship to exist. It is certainly possible for two countries to enter into
a variety of arrangements that entail a differing level of commitment to one another.
1.1 Measuring Signatory Strength
To aggregate signatories’ military capabilities, scholars typically sum the capabilities of the
alliance partners using the Composite Index of National Capabilities (CINC scores) (Bueno de
Mesquita 1983; Reiter 1996; Wagner 2007). While aggregate capabilities provide one estimate
of the raw potential military capability of the alliance, other factors related to specific characteristics of the signatories might also enhance or constrain the military capacity of the alliance.
The presence of a major power in an alliance may also affect the overall military capacity of
an alliance. Scholars claim that major powers possess unique characteristics that give such
alliances a distinctive military advantage – e.g., they possess significantly greater economic
resources, have more economic and security interests, possess advanced weapons systems (such
as nuclear weapons in the post-WWII era), and wield influence in international institutions such as
the United Nations Security Council (Gibler and Vasquez 1998). Although scholars generally
agree that major powers are qualitatively distinct from other powers, measuring their impact
on the military capacity of an interstate alliance is not straightforward – scholars typically use
a separate indicator to control for the presence of a major power (Levy 1981, Siverson and
Tennefoss 1984, Morrow 1991, Leeds 2003a, Benson 2011).
Distances between allies may degrade the signatories’ combined capabilities because of
the cost related to projecting military forces and coordinating long-distance military actions
(Boulding 1962; Starr and Most 1976; Bueno de Mesquita 1983; Bueno de Mesquita and
Lalman 1986; Smith 1996; Weidmann et al. 2010; Bennett and Stam 2000b, Poast et al
2012).5 The distance to a target country may also be relevant for assessing the strength of
some alliances, but given the difficulty of identifying threats and the aspiration to estimate
the strength of alliances lacking a specified threat we omit this variable.6
The size of the alliance may also affect its military capacity. Multiple signatories may
provide advantages in conflict bargaining, yield potential gains from division of labor and
specialization, and enhance the credible use of allies’ military capabilities beyond the additive
advantage of simply summing individual military capabilities. However, the relationship is
somewhat unclear as more signatories may complicate logistical coordination and increase the
chances that allies’ opinions and interests will diverge.
Finally, the commonality of security interests may affect the resolve of signatories to contribute in a war. One common approach is to use “s-scores” to measure the closeness
of foreign policy interests (Signorino and Ritter 1999) based on the similarity of countries’
alliance portfolios. Another measure of shared interests is the commonality of regime type;
shared domestic political regimes may produce stronger alliances if security interests are also
shared or if domestic audience costs make jointly democratic alliances more credible (Lai
and Reiter 2000; Leeds et al. 2002; Gibler and Sarkees 2004; Leeds et al. 2009; Mattes 2012).
On the other hand, the relationship is not entirely clear because democracies may prefer not
to ally with one another (Simon and Gartzke 1996; Gibler and Wolford 2006) because the veto points created by domestic political institutions may create difficulties for taking action (e.g.,
Tsebelis 2002) or because election-induced leadership turnover may make them unreliable
(Gartzke and Gleditsch 2004).
1.2 Measuring Treaty Terms
Many treaty provisions are thought to reflect the strength of an alliance. Despite a wealth
of relevant and available data detailing various treaty provisions collected by Alliance Treaty
Obligations and Provisions (Leeds et al. 2002), the various provisions do not necessarily
directly reflect the strength of a treaty and it is unclear how the various features interact to
affect signatories’ ability to access the potential military strength of the alliance.
To identify and estimate the second dimension we rely on the insights of scholars who argue
that some types of alliances are stronger than others either because different types have more
or less impact on deterrence (Benson 2011, Benson 2012, Benson et al 2013a, Leeds 2003b) or
because the type of agreement affects the likelihood signatories will intervene (Leeds 2003a,
Sabrosky 1980, Siverson and King 1980, Smith 1996). To measure the influence of the formal
terms of an agreement on alliance strength, we use ATOP data (Leeds et al. 2002) and
Benson’s (2011) typology. Alliance agreements are coded in the ATOP data as being offensive,
defensive, neutrality, consultation, and non-aggression. We allow for the different alliance types to
impact alliance strength differently, but we do not assume anything about the ordering of
alliance strength among these types – e.g., we assume that there are similarities within
offensive treaties and within non-aggression pacts, but we do not impose any a priori assumptions
about which is stronger.
Because an alliance agreement can contain multiple provisions, we also want to allow for
the differential impact of non-exclusive designations. Benson’s (2011) typology, for example,
is based on the expressed objective of the provision to provide military assistance and whether
the obligation to deliver military assistance is guaranteed and conditioned on an action in a
dispute. Conditions limiting the application of military force to specified situations may
affect the strength of an alliance. To allow for the possibility that an unconditional guarantee
of military support in any circumstances may imply a different amount of strength than a
commitment of support only if an adversary attacks and an alliance member did not provoke
(or a promise to possibly only intervene), we allow for treaty commitments containing both
compellent and deterrent objectives to differ from those containing just compellent or just
deterrent objectives. To be clear, while we want to permit these types of alliances to differ,
we do not impose any assumptions about how they may differ.
Many other aspects are plausibly related to the strength of the formal terms of an
alliance. In particular, we examine the impact of whether there are: mentions of the possibility of conflict between the members of the alliance (CONWTIN), an integrated military
command (INTCOM), the exchange of economic aid (ECAID), the exchange of military aid
(MILAID), provisions for an increase or reduction of arms (ARMRED), and joint troop placements (BASE). We also account for whether the formal obligations vary across the alliance
partners (ASYMMETRY), whether it was formed in secret (SECRECY), whether it was
formally ratified (ESTMODE), whether it formed an organization or required formal meetings between the signatories (ORGAN), whether it allows a signatory to renounce obligations
under an alliance agreement during the term of the agreement (RENOUNCE), whether the
obligations are conditional (CONDITIO), and whether the alliance provided for a specific
term (SPECLGTH).7
2 A Statistical Measurement Model
The issues scholars confront when attempting to measure whether an alliance presents a
formidable obstacle to potential assailants because of preferences, circumstances and military
capacity, or the extent to which the formal terms of an alliance agreement bind the signatories
together are issues that are endemic to social sciences. How do observable features relate to
the unobservable strength of an alliance? We may know an alliance is offensive and commits
signatories to establish bases in each other’s territories, but is this stronger than a defensive
alliance that establishes an integrated military command? To motivate the exposition, consider the task of measuring alliance strength – hereafter denoted by $x^*$ – using the observable
indicator variables $x_1$ and $x_2$ (e.g., major power signatory, offensive alliance).
One possibility is to choose a single characteristic to proxy for alliance strength. Using either
$x_1$ or $x_2$ to measure $x^*$ is potentially problematic because doing so ignores information in other
characteristics. The terms of the average “offensive” alliance may reflect stronger commitments than the terms of an average “nonaggression” commitment, but there is still important
variation within each alliance type; the “Pact of Steel” and the 1816 alliance between the
Netherlands and Spain are both “offensive” alliances, but the strength of the commitments
implied by the former is considerably greater.8
Creating an index based on multiple characteristics does not solve the problem because
there is often no theoretical guidance for combining measures and it is hard to interpret the
resulting scale. Adding characteristics to create an index makes extremely strong assumptions
– even if the indicator variables $x_1$ and $x_2$ are both related to the strength of an alliance, on
what basis can we conclude that an alliance possessing only characteristic $x_1$ is as strong as the
alliance that possesses only characteristic $x_2$? Moreover, is the alliance containing both $x_1$ and
$x_2$ twice as strong as an alliance containing one feature but not the other? It seems difficult
to rationalize the relationships that are assumed by an additive index, and such assumed
equivalences only increase as the number of variables used to construct the measure increases.
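To make these assumed equivalences concrete, the following minimal sketch (not part of the original analysis) scores every combination of two hypothetical binary indicators with an equally weighted additive index. The function name and the unit weights are illustrative assumptions.

```python
from itertools import product

def additive_index(indicators):
    """Sum binary indicators with equal (unit) weights."""
    return sum(indicators)

# Every combination of two hypothetical binary indicators x1 and x2.
profiles = list(product([0, 1], repeat=2))
scores = {profile: additive_index(profile) for profile in profiles}

for (x1, x2), score in scores.items():
    print(f"x1={x1} x2={x2} index={score}")

# The index cannot tell an x1-only alliance from an x2-only alliance,
# and it forces the both-features alliance to score exactly twice as high.
ties = scores[(1, 0)] == scores[(0, 1)]
doubling = scores[(1, 1)] == 2 * scores[(1, 0)]
```

Both `ties` and `doubling` hold by construction, whatever the indicators actually measure, which is the sense in which the index imposes rather than estimates these relationships.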
If the goal is to predict the effects of alliance strength on an outcome of interest – $y$
– we can use the regression specification to control for multiple features of an alliance. For
example, if we are predicting the effect of alliance strength on outcome $y$, the typical regression
specification is $y = \alpha + \beta_1 x_1 + \beta_2 x_2$, which allows the right-hand side to measure alliance
strength – as a linear function of $x_1$ and $x_2$ – and its relation to $y$.9 Note that including
multiple measures in a regression changes measurement issues into specification issues and the
degrees of freedom that analysts have may be quickly reduced given the number of potential
indicators of alliance strength. Interpreting the effects from such a saturated regression model
(Ray 2003; Achen 2005) may also be difficult, particularly if the model includes multiple
interactions (Braumoeller 2004; Brambor, Clark, and Golder 2006). A shortcoming of all
three approaches is that they fail to reflect our uncertainty about how the observed concepts
relate to the underlying dimensions of alliance strength and the precision with which we are
able to estimate the strength of an alliance.
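The loss of degrees of freedom can be illustrated with a simple count. The sketch below is illustrative only: the function names and the example values (10 indicators, 200 alliances) are assumptions, not figures from this paper. It counts the coefficients in a regression with an intercept, K indicators, and all interactions up to a chosen order.

```python
from math import comb

def n_coefficients(k, max_order=1):
    """Intercept plus all main effects and interactions up to max_order."""
    return 1 + sum(comb(k, order) for order in range(1, max_order + 1))

def residual_df(n_obs, k, max_order=1):
    """Degrees of freedom left after estimating every coefficient."""
    return n_obs - n_coefficients(k, max_order)

# Hypothetical example: 10 indicators observed on 200 alliances.
for order in (1, 2, 3):
    print(order, n_coefficients(10, order), residual_df(200, 10, order))
```

Under these assumed values, moving from main effects only to all three-way interactions raises the coefficient count from 11 to 176 and leaves 24 residual degrees of freedom.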
A Bayesian latent variable model provides a framework for measuring alliance strength that
uses the information contained in the many measures that researchers have already collected
that are plausibly related to the strength of an alliance while also allowing researchers to make
weaker assumptions about the nature of the relationships involved. Non-Bayesian methods
are certainly available (e.g., Bollen 1989), but for both theoretical (see the arguments of Gill
2002 and Jackman 2009a) and practical reasons we adopt a Bayesian approach. Unlike a
frequentist approach, a Bayesian latent variable approach allows us to directly measure the
precision of the resulting estimates using the posterior distributions of estimated parameters.
To focus our exposition, suppose we are interested in measuring the strength of an alliance
at the time of its founding and let $x^*_i$ denote the unobserved strength of alliance $i$. Our
task is to use observable characteristics that are theorized to correlate with the strength of
alliance $i$ to construct an estimate of $x^*_i$ that not only describes the strength of the
alliance relative to other alliances but also shows how much uncertainty we have regarding our
estimate of alliance strength. Suppose further that we have $k \in 1 \ldots K$ observable measures of
alliance strength, and let the observed value for variable $k$ for alliance $i$ be denoted by $x_{ik}$.
Our observed measures may include continuous, binary and ordinal measures.10
Similar issues arise when using observed characteristics to measure how “democratic”
a country is or how “liberal” a district or a member of the US Congress is. We observe
characteristics that are related to the concept of interest, and we must use the observed
characteristics and a statistical measurement model to make inferences about the latent traits.
Bayesian latent variable models provide a statistical measurement model that is able to
extract the latent dimensions that are assumed to be responsible for generating the association
between and within the distribution of observed characteristics (see, for example, Quinn 2004;
Jackman 2009b). Scholars have used related models to measure latent traits critical for
studying the politics of the United States (e.g., Clinton and Lewis 2008; Levendusky and
Pope 2010) and comparative politics (e.g., Rosenthal and Voeten 2007; Rosas 2009; Pemstein,
Meserve and Melton 2010; Treier and Jackman 2008; Hoyland, Moene and Willumsen 2012),
but scholars have only recently begun to apply the models to concepts in international relations
(see, for example, Schnakenberg and Fariss 2009 and Gray and Slapin 2011).
Figure 1 provides a graphic representation of the measurement model for the case of 3
measures. The model assumes $x_i^*$ is related to $x_{i1}$, $x_{i2}$, and $x_{i3}$ across all alliances, but the
relationship may differ between variables. For example, $x_{i1}$ and $x_{i2}$ may not be identically
related to $x_i^*$, and these differences are captured by $\beta_1$, $\beta_2$, $\sigma_1^2$, and $\sigma_2^2$.
Given the number of parameters to be estimated, recovering the latent measure of alliance
strength ($x^*$) from the matrix of observed characteristics $x$ requires some additional structure.
The structure we use is provided by a Bayesian latent variable specification (see, for example,
Jackman 2009a,b). For all alliances $i \in 1 \ldots N$ we assume:
$$x_{ik} \sim N(\beta_{k0} + \beta_{k1} x_i^*, \sigma_k^2). \tag{1}$$
Figure 1: Directed Acyclic Graph: Bayesian latent variable model: Circles
denote observed variables, squares denote parameters to be estimated.
[Diagram: the latent trait $x_i^*$ is linked to the observed measures $x_{i1}$, $x_{i2}$, and $x_{i3}$ through the parameters $\beta_1$, $\beta_2$, $\beta_3$ and $\sigma_1^2$, $\sigma_2^2$, $\sigma_3^2$.]

The measurement model of equation (1) assumes that the observed correlates of alliance
strength $x$ are related to alliance strength in identical ways across the $N$ alliances, but different
measures may be related to alliance strength in different ways. Not only may the mean
value of $x_k$ and $x^*$ differ (as will be reflected in the estimate of $\beta_{k0}$), but the scale of
the observed variable and the latent variable may also differ (captured by $\beta_{k1}$). $\beta_{k1} > 1$
implies that a one-unit change in the latent scale of $x^*$ corresponds to more than a one-unit
change in the observed measure $x_k$, $0 < \beta_{k1} < 1$ implies that a one-unit change in the latent scale
corresponds to less than a one-unit change in the observed measure, and $\beta_{k1} < 0$ implies that
the orientation of the observed and unobserved measures is “flipped” (i.e., positive values
of $x_k$ correspond to negative values of $x^*$). Moreover, if an included measure is unrelated to
the latent trait revealed in the other included measures, the model can also account for that
possibility: $\beta_{k1} = 0$ means there is no relationship between $x^*$ and $x_k$. The model also allows
the relationship to be more or less precise; the $\sigma_k^2$ term allows varying amounts of error in
the mapping between the observed and unobserved variable. Finally, because we estimate a
version of equation (1) for each of the $K$ observed measures, we allow for the relationship
to vary across observed traits, and we can use all available measures to help uncover the
underlying latent trait.
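The roles of the intercept, slope, and error variance in equation (1) can be illustrated with a small simulation. The sketch below is not the authors' code; the parameter values, the sample size, and the helper names `simulate_indicators` and `pearson` are assumptions made purely for illustration. It draws three indicators from one latent trait: one that is stretched (slope above 1), one that is flipped (negative slope), and one that is unrelated (slope of 0).

```python
import math
import random


def simulate_indicators(latent, intercepts, slopes, sigmas, seed=0):
    """Draw x_ik = beta_k0 + beta_k1 * x_i^* + e_ik, with e_ik ~ N(0, sigma_k^2)."""
    rng = random.Random(seed)
    return [
        [b0 + b1 * x_star + rng.gauss(0.0, s)
         for b0, b1, s in zip(intercepts, slopes, sigmas)]
        for x_star in latent
    ]


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical values, chosen only to illustrate the three cases in the text.
latent_rng = random.Random(1)
latent = [latent_rng.gauss(0.0, 1.0) for _ in range(500)]
intercepts = [0.0, 2.0, -1.0]
slopes = [1.5, -0.8, 0.0]    # stretched, flipped, unrelated
sigmas = [0.5, 0.5, 1.0]

x = simulate_indicators(latent, intercepts, slopes, sigmas)

# Correlation of each simulated indicator with the latent trait.
correlations = [pearson(latent, [row[k] for row in x]) for k in range(3)]
```

In this toy setting the first indicator tracks the latent trait closely, the second tracks it with the sign reversed, and the third carries essentially no information about it, which mirrors the interpretation of the slope given above.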
Note that these assumptions are silent about causality: nothing requires that the latent
trait $x^*$ causes the observed phenomena or vice versa. All that is assumed is that, for whatever
reason, there is a correlation between the observed and unobserved traits and that we can
therefore use this correlation and the relationship between the observable traits to learn about
the unobserved trait. For example, the Unified Democracy Scores of Pemstein, Meserve, and
Melton (2010) measure “democracy” using 12 existing expert assessments even though the
analyzed expert assessments certainly do not “cause” democracy – the assessments are all
likely correlated with the extent to which a country is democratic because they are presumably
based on observable manifestations of what the expert thinks is indicative of a democratic
state. Similarly, the work of Levendusky, Pope and Jackman (2008) uses various aspects of a
congressional district that are related to the underlying ideology, but which do not cause it.
Another strength of this approach is that we can use the observed data and the specification
of equation (1) to recover estimates of not only the latent trait $x^*$ (sometimes called the “factor
score”), but also the extent to which the observed matrix of variables $x$ is related to the
latent trait, $\beta_k$ (i.e., the coefficient matrix $\beta$, sometimes called the “factor loadings”). As a
result, we can characterize both the latent strength of alliances as revealed in the matrix of
observable characteristics and which of the observed characteristics are most influential
for structuring the latent trait that is recovered.
Given the unknown parameters $x^*$ and $\beta$ that are to be estimated from the observed
covariate matrix $x$, the likelihood function that is to be maximized is given by:
$$L(x^*, \beta) = p(x \mid [x^*, \beta]) \propto \prod_{i=1}^{N} \prod_{k=1}^{K} \frac{1}{\sigma_k}\, \phi\!\left(\frac{x_{ik} - (\beta_{k0} + \beta_{k1} x_i^*)}{\sigma_k}\right) \tag{2}$$
where $\phi(\cdot)$ is the pdf of the standard normal distribution. To complete the specification and form the
posterior distribution of the factors $x^*$ and factor loadings $\beta$, we assume the typical diffuse
conjugate prior distributions.11
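Under the normal model of equation (1), with observations treated as independent given the latent trait, the log of the likelihood is a sum of normal log densities over alliances and indicators. The following is a minimal sketch of that computation for continuous indicators only; the function name `log_likelihood` and the toy data are assumptions for illustration, and the sketch does not cover the ordinal measures or the priors used in the actual estimation.

```python
import math


def log_likelihood(x, latent, intercepts, slopes, sigmas):
    """Sum of normal log densities over alliances i and indicators k."""
    total = 0.0
    for row, x_star in zip(x, latent):
        for x_ik, b0, b1, sigma in zip(row, intercepts, slopes, sigmas):
            # Standardized residual of indicator k for alliance i.
            z = (x_ik - (b0 + b1 * x_star)) / sigma
            total += -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - 0.5 * z * z
    return total


# Hypothetical toy data: two alliances, two continuous indicators.
x = [[1.0, 0.2], [-0.5, 1.1]]
latent = [0.8, -0.6]
value = log_likelihood(x, latent, [0.0, 0.5], [1.2, -0.7], [1.0, 0.5])
```

Parameter values that place the predicted mean closer to the observed indicators yield a higher value, which is what a sampler or optimizer exploits.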
As specified, the model is unidentified. Because every parameter in equation (2) except
for $x_{ik}$ has to be estimated, it is possible to generate an infinite number of parameter values
that yield the same likelihood by appropriately adjusting $\beta_{k0}$, $\beta_{k1}$, $x_i^*$ and $\sigma_k$. As Rivers
(2003) shows, in one dimension, two constraints are required to achieve local identification
and fix the scale and location of the space; the orientation of the space can be fixed by
constraining a factor to be positively or negatively related to the latent trait. Typically, this
involves assuming that the mean of $x^*$ is 0 and the variance of $x^*$ is 1 (see, for example,
Clinton, Jackman and Rivers 2004). In multiple dimensions the number of required constraints
increases to $d(d + 1)$, where $d$ denotes the dimensionality of the latent space.
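The identification problem can be made concrete with a one-indicator example. In the sketch below (hypothetical numbers and helper names, not taken from the paper), the latent scores are stretched and shifted while the intercept and slope are adjusted to compensate, and the log-likelihood is unchanged. Standardizing the latent scores to mean 0 and variance 1 then selects a single member of this family of equivalent solutions.

```python
import math


def log_lik(x, latent, b0, b1, sigma):
    """Normal log-likelihood of one indicator given latent scores."""
    return sum(
        -0.5 * math.log(2.0 * math.pi) - math.log(sigma)
        - 0.5 * ((x_i - (b0 + b1 * z)) / sigma) ** 2
        for x_i, z in zip(x, latent)
    )


def standardize(values):
    """Rescale values to mean 0 and variance 1 (population variance)."""
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / sd for v in values]


# Hypothetical observed indicator, latent scores, and parameters.
x = [0.3, 1.9, -0.7, 2.4]
latent = [-0.5, 0.8, -1.2, 1.1]
b0, b1, sigma = 0.5, 1.5, 0.7

# Stretch and shift the latent scale, then compensate in the coefficients.
a, c = 2.0, 3.0
latent_alt = [a * z + c for z in latent]
b1_alt = b1 / a
b0_alt = b0 - b1_alt * c

ll_original = log_lik(x, latent, b0, b1, sigma)
ll_alternative = log_lik(x, latent_alt, b0_alt, b1_alt, sigma)

# Imposing mean 0 and variance 1 picks one member of the equivalent family.
latent_identified = standardize(latent_alt)
```

The two log-likelihood values agree up to floating-point error, so the data alone cannot distinguish the two parameterizations; only the location and scale constraints do.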
Given the discussion of section 1, we seek to estimate alliance strength ($x^*$) in two dimensions; let $x[1]_i^*$ denote the latent strength of alliance $i$ in the first dimension (with estimates
given by $\hat{x}[1]_i$) and let $x[2]_i^*$ denote the latent strength in the second dimension (with estimates $\hat{x}[2]_i$). To identify the center of the latent parameter space, we assume that the means
of $x[1]^*$ and $x[2]^*$ are both 0. This assumption is innocuous and it centers the unobserved
latent space. To fix the scale of the recovered space, we assume that the variances of $x[1]^*$ and
$x[2]^*$ are both 1. To fix the rotation of the latent space and prevent “flipping”, we assume
that higher values of the summed capacity of signatories correspond to positive values in the
first dimension, and offensive alliances receive positive values in the second dimension.
We do not need to know the precise nature of the relationship between the observed
characteristics and the strength of the alliance to implement the model, but we do need
to identify which measures are, and are not, related to each of the two dimensions we are
interested in. For every characteristic pertaining to the written terms of the alliance we assume
that $\beta[1] = 0$, and for every characteristic related to the alliance partners themselves we assume
that $\beta[2] = 0$. That is, characteristics related to the signatories themselves determine only the
first dimension, and characteristics of the formal agreement affect alliance strength only in
the second dimension.12
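These exclusion restrictions amount to a pattern of free and fixed entries in the loading matrix. A minimal sketch of how such a pattern might be recorded follows; the variable names and the `loading_constraints` helper are hypothetical and are not the input format of any particular estimation package.

```python
def loading_constraints(signatory_vars, treaty_vars):
    """Map each indicator to (dim1_free, dim2_free); False fixes that loading at 0."""
    constraints = {}
    for name in signatory_vars:
        constraints[name] = (True, False)   # loads on signatory strength only
    for name in treaty_vars:
        constraints[name] = (False, True)   # loads on formal treaty terms only
    return constraints


# Hypothetical indicator names, loosely based on measures discussed in the text.
signatory_vars = ["log_summed_capacity", "major_power_signs", "log_ally_count"]
treaty_vars = ["offensive", "unconditional", "integrated_command"]

constraints = loading_constraints(signatory_vars, treaty_vars)

# Sign restrictions that fix the orientation of each dimension, as described above.
positive_loadings = {"log_summed_capacity": 1, "offensive": 2}
```

Every indicator has exactly one free loading, so the two dimensions are defined by which measures are allowed to inform them rather than by any assumption about how the dimensions relate to each other.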
To be clear, we are not assuming anything about how alliances are located within the
two dimensions we recover. In fact, a question of substantive interest is how $x[1]^*$ and $x[2]^*$
are related, which is why we identify the dimensions by placing constraints on $\beta$ rather
than by making assumptions about the relationship between $x[1]^*$ and $x[2]^*$. Because we
identify the latent dimensions using characteristics of the alliances rather than an assumption about
the relationship between the latent dimensions, our measurement model can shed light
on the relationship between the formal terms of an alliance and the characteristics
of the signatories and reveal whether stronger signatories systematically form alliances with
stronger or weaker formal terms.
Given these measures and identification constraints, we use the Bayesian latent factor
model that can accommodate both continuous and ordinal measures described by Quinn
(2004) and implemented via MCMCpack (Martin, Quinn, and Park 2011). We discard the first 100,000
iterations as “burn-in,” and we
retain one out of every 1,000 of the subsequent 1,000,000 iterations to characterize
the estimates’ posterior distribution. Parameter convergence was assessed using diagnostics
implemented in CODA (Plummer et al. 2006). The Appendix summarizes the results of
estimating measurement models using slightly different specifications; estimates from the
different specifications correlate in excess of .95.
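The burn-in and thinning settings determine how many draws are available to summarize the posterior. The arithmetic is sketched below; the function name and the convention of keeping the last iteration of each block of 1,000 are assumptions for illustration, since the exact iterations stored depend on the software.

```python
def retained_iterations(burn_in, n_iter, thin):
    """Return 1-based indices of the iterations kept after burn-in and thinning."""
    return list(range(burn_in + thin, burn_in + n_iter + 1, thin))


# Settings reported in the text: 100,000 burn-in, 1,000,000 iterations, thin by 1,000.
kept = retained_iterations(100_000, 1_000_000, 1_000)
n_kept = len(kept)
```

With these settings, 1,000 draws are retained for each parameter, and every posterior mean and highest posterior density region reported below would be a summary of those draws.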
3 Estimates of Alliance Strength
Our Bayesian latent variable model of alliance strength produces estimates about the strength
of alliances in each of the two theoretically derived dimensions and how the various observable
features are related to the dimensions that we recover. Both are of interest in assessing the
validity of the resulting estimates and we validate our measures in three ways.
First, we examine the estimates for alliances that are known to vary in each of the two
dimensions to see if our scores reflect known variation. We focus on the entire population
of alliances as well as some prominent alliances whose strength can be ascertained based
on careful historical and qualitative work using “out-of-sample” information. The scores we
recover provide sensible orderings of prominent alliances even though no prior information was
used to identify the strength of the alliances we investigate. Second, we explore how the various
correlates of alliance strength mentioned in section 1 relate to the estimates we generate. These
relationships provide another validity check by revealing which characteristics are affecting
the variation in alliance strengths that we uncover and whether the correlates are sensible
given our prior beliefs.
Third, in Section 4, as a check on the predictive validity of the measure, we replicate Leeds’
(2003a) work, which argues that alliances with weaker treaty terms are more likely to be violated
in times of war. This investigation serves as a further validity
check of our measure, and it also highlights a gap in the scholarship that our measure
fills: whereas existing work uses a series of variables to try to approximate the binding nature
of a treaty, our measure provides a single point estimate that enables research on a number
of new and existing questions.
3.1 Depicting Alliance Strength
Figure 2 plots the distribution of estimates from the measurement model described in Section
2 along the dimensions defined by signatory strength (x-axis) and formal treaty terms (y-axis).
A score is estimated for each of the 587 alliances signed between 1816 and 2000 for which we
have data on the observable characteristics (plotted in grey), but we focus our attention on
a few selected alliances to illustrate the face validity of our estimates. (The online appendix
contains the full set of estimates and standard errors.)
Figure 2: Distribution of Alliance Scores, 1815-2000. Points denote the posterior
mean of the estimated alliance strength of each of the 587 alliances we analyze. The ellipses
denote the 95% regions of highest posterior density for the selected alliances.
[Scatterplot: x-axis, Characteristics of Signatories; y-axis, Characteristics of Agreement; labeled alliances: U.A.E−Yemen (1958), WWII Allies, NATO, Helsinki Final Acts (1975), Belarus−Bulgaria (1993).]

As Figure 2 reveals, one of the strongest alliances in terms of both signatory characteristics
and treaty terms is the Allied agreement in World War II. This alliance is a joint declaration by
39 countries, including the United States, Russia, the United Kingdom, and China, to devote
their full resources, military or economic, against those members of the Tripartite Pact and
its adherents with which such government is at war. There are no conditions or termination
dates imposed on the terms of the agreement; it is a sweeping declaration of war, offensive
and defensive, by the most powerful coalition in the international system.
An alliance that is strong in terms of the signatories involved, but which has weak treaty
terms is the Helsinki Final Acts signed in 1975. Thirty-five countries signed the Accords
– including the United States, the USSR, the UK, and most of Europe – and together the
signatories possessed the preponderance of military strength on earth at the time. This
strength is reflected in the fact that the signatories of the Helsinki Final Acts are estimated to
have more combined military capacity than any other alliance in the sample. The signatories’
combined military strength is offset, however, by the fact that the obligations of the treaty are
very weak. The main objective of the agreement is to set forth a bargain between respecting
territorial boundaries and human rights, and the terms (especially the military obligations)
are non-binding. Because the agreement lacks the legal status of a formal treaty and it would
not be governed by international law, scholars classify the Helsinki Final Acts as an example
of soft law (Abbott and Snidal 2000). Reassuringly, our measure estimates the Helsinki Final
Acts to be among the weakest alliances on the dimension of formal treaty terms.
One of the weakest alliances in both treaty terms and signatory characteristics is the
Belarus-Bulgaria alliance of 1993. This alliance was a bilateral treaty that reaffirmed the
nonaggression promise made in the Helsinki Final Act. In addition, the signatories pledged to
refrain from using force in their international relations, to consult with one another when their
security has been breached, and to remain neutral in any hostilities that may be directed at the
other alliance member. The treaty states that it is to be in effect for a period of 20 years, but
either side may unilaterally terminate the agreement with a one-year advance notice. As the
regions of highest posterior density in Figure 2 make clear, the strength of the formal terms is statistically
indistinguishable from that of the Helsinki Final Acts, as we would hope, but the strength of the
signatories is far weaker.
In contrast, the 1958 United Arab Republic (UAR) is an example of a strong formal
agreement among weak signatories. The UAR was formed as an effort to unite the Arab
community against the expansion of communism in Syria and elsewhere in the Arab world
(Walt 1987, pp. 71-80). It included Egypt, Syria, and Yemen, though Yemen’s inclusion was
regarded as merely a cosmetic gesture (p. 72). The agreement called for the integration of the
allies’ militaries and unified command over those forces. Gamal Abdel Nasser, former President
of Egypt and the President of the United Arab Republic, insisted on a full union and control
over both countries in exchange for his agreement to halt the rising influence of the Syrian
Communist Party. Consequently, the terms of the agreement granted full military power of
the signatories to a unified command and authorized a Commander-in-Chief to pursue the
unified foreign policy drawn by the Union, which could extend to both defensive and offensive
campaigns. Nasser responded by seizing control of Syria and banning all political parties.13
While the location of these alliances along both dimensions offers assurance of the face
validity of the measure, many other important and interesting insights come to light as a
result of the measure. First, we can now explore the relationship between the two dimensions
of alliance strength to ask whether stronger signatories form stronger alliances. Because we
assume nothing about the relationship between the two dimensions that we estimate, we can
compare how the estimates correlate to answer this important question. We find that there
is a weak positive relationship, implying that as the strength of the signatories increases so
too does the strength of the formal terms of the alliance, but the relationship is relatively
modest (correlation of .256). Moreover, it is possible to make interesting comparisons by
using variation in the two dimensions. For example, the signatories of the
Helsinki Final Acts are stronger than those of the NATO alliance (because of the addition of Warsaw
Pact countries such as the USSR in the Helsinki Final Acts), but the terms of the NATO
alliance are far stronger, as we might expect given the divergent preferences of the signatories of
the Helsinki Final Acts.
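A correlation between the two dimensions can be computed from the posterior means, and the retained MCMC draws also allow the uncertainty in that correlation to be described. The sketch below uses made-up draws for four alliances and hypothetical helper names; it is an illustration of the idea and does not reproduce the reported value of .256.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def column_means(draws):
    """Posterior mean for each alliance, where draws[s][i] is draw s for alliance i."""
    n_draws = len(draws)
    return [sum(col) / n_draws for col in zip(*draws)]


# Hypothetical retained draws: rows are MCMC draws, columns are alliances.
dim1_draws = [[0.1, 1.2, -0.8, 2.0], [0.3, 1.0, -0.9, 1.8], [0.2, 1.1, -0.7, 2.2]]
dim2_draws = [[0.5, 0.4, -0.2, 1.0], [0.1, 0.9, -0.6, 0.7], [0.3, 0.2, 0.1, 1.3]]

# Correlation of the point estimates (posterior means) across alliances.
corr_of_means = pearson(column_means(dim1_draws), column_means(dim2_draws))

# One correlation per draw describes the uncertainty in the correlation itself.
corr_per_draw = [pearson(d1, d2) for d1, d2 in zip(dim1_draws, dim2_draws)]
```

Computing the statistic draw by draw is one way to carry measurement uncertainty forward instead of treating the point estimates as known quantities.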
Second, there is a great deal of heterogeneity between alliances, even between alliances
belonging to the same traditional “type.” Figure 3 graphs the distribution of estimates within
each ATOP alliance categorization and reveals that although offensive alliances are, on average,
associated with higher estimates than defensive alliances, there is considerable variation within
each grouping.
It is important to highlight the fact that the variation evident in Figure 3 is variation that
existing measures cannot easily characterize. This variation can be used to explore many old
and new questions – e.g., why are some defensive alliances stronger than others?14
Figure 3: Distribution of Treaty Strength by Alliance Type, 1815-2000.
Distribution of second dimension estimates by ATOP alliance category.
[Density plots in three panels: Offensive, Defensive, Non−Aggression.]

Third, a strength of our measurement model is the ability to account for the uncertainty
that we have about the estimates themselves. For every alliance, we can quantify how certain
we are about the estimated strength of the alliance in each of the dimensions we estimate.
Moreover, the uncertainty may vary across alliances and across dimensions. For example, as
Figure 2 illustrates, while we can distinguish the 1993 nonaggression alliance between Belarus
and Bulgaria from the 1975 Helsinki Final Acts in terms of the strength of the signatories
because the 95% regions of highest posterior density do not overlap on the first dimension, we
cannot be certain that the formal terms of the two alliances are distinguishable. Because our
Bayesian latent variable model recovers how precisely we are able to estimate the strength
of an alliance, we can use such information to characterize the statistical confidence we have
in our assessments. This is important because we have more confidence in our ability to
distinguish alliances according to the strength of the signatories than we do using the terms
of the agreement.
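The overlap comparison described above can be sketched for a single dimension. The code below finds the shortest interval containing 95% of a set of posterior draws and checks whether two such intervals overlap. The draws are simulated stand-ins, and the function names are hypothetical; the ellipses in Figure 2 are two-dimensional regions, whereas this sketch treats one dimension at a time.

```python
import math
import random


def hpd_interval(draws, mass=0.95):
    """Shortest interval containing at least `mass` of the draws."""
    s = sorted(draws)
    n = len(s)
    k = max(1, math.ceil(mass * n))  # number of draws inside the interval
    start = min(range(n - k + 1), key=lambda j: s[j + k - 1] - s[j])
    return s[start], s[start + k - 1]


def intervals_overlap(a, b):
    """True if closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Simulated stand-ins for posterior draws of two alliances on each dimension.
rng = random.Random(42)
strong_dim1 = [rng.gauss(2.0, 0.3) for _ in range(1000)]
weak_dim1 = [rng.gauss(-1.0, 0.3) for _ in range(1000)]
terms_a = [rng.gauss(0.0, 0.5) for _ in range(1000)]
terms_b = [rng.gauss(0.2, 0.5) for _ in range(1000)]

distinct_on_dim1 = not intervals_overlap(hpd_interval(strong_dim1), hpd_interval(weak_dim1))
distinct_on_dim2 = not intervals_overlap(hpd_interval(terms_a), hpd_interval(terms_b))
```

In this toy example the two alliances are clearly separated on the first dimension but not on the second, which is the same pattern the text describes for the Belarus-Bulgaria alliance and the Helsinki Final Acts.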
3.2 Alliances in the World Wars and East Asia
To take a closer look at the estimates and further explore the face validity of our estimates, we
examine the alliances involved in World War I, World War II, and the post-WWII alliances in
East Asia for which we have strong priors regarding their relative ranking in each dimension.15
Figure 4 plots the estimated strength of each of the relevant alliances in terms of the first
dimension (left graph) and the second dimension (right graph) in temporal order.
Figure 4: Selected Alliance Scores. Estimated strengths for alliances involved in
World War I, World War II, and post-WWII East Asian security in terms of the first
dimension (left) and second dimension (right) are plotted. The points denote the estimate
in each dimension for each alliance and the lines show 95% regions of highest posterior
density for the selected alliances.
[Panels: Dimension 1 (Signatory Strength), left; Dimension 2 (Alliance Terms), right. Alliances plotted, from top to bottom: China−North Korea (1961), USSR−North Korea (1961), US−Japan (1960), US−Republic of China (1954), US−South Korea (1953), US−Japan (1951), USSR−China (1950), WWII Allies (1942), WWII Axis (1941), Pact of Steel (1939), WWI Axis (1915), WWI Allies (1915), Franco−Russian Alliance (1893), Triple Alliance (1882).]
Consider first the alliances that were involved in World War II and which are plotted
in the middle of Figure 4. As Figure 2 revealed above, the strongest alliance in the first
dimension involved the alliance formed between the Allies in 1942. Notice that the 1940
Tripartite Alliance, which was targeted by the Allied Pact, is estimated to be weaker on both
dimensions than the alliance formed by the Allies during World War II. This is reassuring given
that it is a multilateral defensive pact signed during World War II between countries whose
combined capabilities are not as great as those of the Allied powers. Additionally, the terms of the
defensive obligation are conditional upon one of the signatories being attacked by a party not
involved in World War II at the time the alliance was signed. However, the antecedent for this
defensive pact was the 1939 Pact of Steel between Germany and Italy. As an unconditional
pledge to undertake shared offensive and defensive military campaigns, it is on par with the
Allied Pact in the strength of its agreement terms. The 1940 Tripartite Pact was replaced by
a more aggressive agreement, which, like the Allied Pact, is also a wartime alliance containing
similar terms. In Figure 4, it is in approximately the same position in the second dimension
as the Allied Pact, indicating the similarity in the strength of the terms of the agreements
between the opposing World War II alliances. Like the signatories of the Allied Pact, the three signatories to the
Tripartite Alliance pledged to use all means, offensive and defensive, to pursue the war. The
strength score for the terms of the Allied Pact is likely slightly higher because the Tripartite
Pact specifies a termination date (a stipulation initiated in the Pact of Steel and passed down
to the successor alliances). Even though the terms of the formal agreements are at parity,
the measure of the combined strength of the Allies according to the members’ characteristics
is greater than that of the Tripartite Pact.
The estimated strengths of the alliances involved in World War I reported in the bottom of
Figure 4 also comport with prior expectations. There is clear parity in the prewar alliances.
The 1882 Triple Alliance between Germany, Italy, and Austria-Hungary was similar to the
1893 Franco-Russian alliance both on alliance terms and signatory strength. The motivation
for the Franco-Russian alliance may have been, as Snyder (1997) suggests, a desire by France
and Russia to gain parity of strength with the growing relative strength of the Triple Alliance.
The terms of the wartime treaty signed by France, Russia, the United Kingdom, and Italy
are similar to those of the opposing declaration agreed to by Germany, Austria-Hungary, and Bulgaria.
However, the addition of the United Kingdom and Italy to the alliance with France and
Russia shifted the signatory strength significantly in favor of the Allies. The similarities of
the opposing alliance systems in WWI and WWII are noteworthy given recent evidence that
opposing continental alliances are especially prone to balance each other (Levy and Thompson
2010).
The estimates of post-WWII deterrent alliances in Asia also meet our expectations. The
1950 USSR-China alliance, the 1951 US-Japan alliance, the 1961 USSR-North Korea alliance, and
the 1961 China-North Korea alliance all obligate alliance partners to defend each other if a
fellow ally is attacked. By comparison, the other three East Asian alliances in Figure 4
(1953 US-South Korea, 1954 US-Republic of China, 1960 US-Japan) all contain provisions
that enable alliance members to escape their defensive obligations if there is war.16 In
our measure, the former alliances are all estimated to have stronger treaty terms than the
latter three. However, although these differences comport with our prior expectations, they
are statistically indistinguishable.
Another feature of interest in Figure 4 is the parity in signatory strength between rival
alliances. Comparing the 1950 USSR-China alliance to the 1951 US-Japan alliance, for example, we see that the signatory strength of each alliance is approximately the same. The
US-South Korea alliance was signed after the Korean War to deter North Korea, China, and
the USSR. The signatory strength of the US-South Korea alliance is within the 95% region of
highest posterior density of the USSR-China alliance. The 1954 US-ROC alliance was signed during the first
Taiwan Strait Crisis. It was also designed to deter China and the USSR while restraining Chiang Kai-shek. Its relatively weaker treaty terms reflect the motivation to restrain an
alliance partner, and its signatory strength is on par with that of the USSR-China alliance.
Moreover, the parity of the East Asian alliances remained consistent even as the schism
between China and the Soviet Union grew during the early 1960s. During this period, China
and North Korea formed a separate alliance, as did the USSR and North Korea. The US and
Japan renewed and revised the terms of their alliance. Even though the signatory strength of
all the alliances decreased, the opposing alliances remained approximately similar to each other
both in terms of signatory strength and treaty terms. The 1960 US-Japan alliance and the
1953 US-South Korea alliance are roughly at parity with the 1950 USSR-China alliance, the
1961 USSR-North Korea alliance, and the 1961 China-North Korea alliance. These empirical
characterizations are consistent with claims that some alliances may be designed to balance
threats (Morgenthau 1948; Waltz 1979).
An advantage of the Bayesian approach is that we can compute posteriors of any statistic
of interest. For example, we are 94% certain that the 1951 US-Japan alliance is a stronger
alliance in terms of treaty terms than the 1953 US-ROC alliance, and we can be 92% certain
that the estimated strength of the 1951 US and Japan alliance is stronger than the 1960 US
and Japan alliance.
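Statements of this kind can be read off the retained MCMC draws directly: the posterior probability that one alliance is stronger than another is approximated by the share of draws in which its score is higher. A minimal sketch follows, with a hypothetical function name and toy draws that are not the paper's estimates.

```python
def prob_greater(draws_a, draws_b):
    """Share of joint posterior draws in which alliance A scores above alliance B."""
    if len(draws_a) != len(draws_b):
        raise ValueError("draws must come from the same MCMC iterations")
    return sum(a > b for a, b in zip(draws_a, draws_b)) / len(draws_a)


# Toy draws for two alliances on the treaty-terms dimension (illustrative only).
alliance_a = [1.0, 2.0, 3.0, 4.0]
alliance_b = [0.0, 3.0, 2.0, 1.0]
p_a_stronger = prob_greater(alliance_a, alliance_b)
```

The comparison pairs draws from the same iteration, so any posterior dependence between the two alliances' scores is respected.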
3.3 Components of Alliance Strength
To further validate our measure it is instructive to compare how the various measures relate
to the estimate in each of the two dimensions we recover. This examination also highlights
two advantages of the latent variable approach we take: 1) our measure of alliance strength
reflects the characteristics of multiple measures without having to take a position on exactly
what the relationship is, and 2) because our measure does not necessarily depend on a single
proxy variable (e.g., “offensive alliances”), we can recover variation between alliances that
possess the same value for any particular measure.
Figure 5 reports the relationship of variables that are assumed to potentially structure
the first dimension (the dimension we interpret as designating the strength of the alliance
signatories), and Figure 6 reports the relationship for variables that may potentially affect the
second dimension (the dimension we interpret as reflecting the strength of the commitments
required by the treaty itself). Note that while we defined the dimensions by assuming that
each variable affects either the first or second dimension, we assumed nothing about how each
variable structures these recovered dimensions. Our measurement model imposes no constraints
on the parameters graphed in Figures 5 and 6.
Figure 5: Dimension 1 Factor Loadings: “Signatory Strength”: Circles denote
posterior mean, and lines denote 95% HPD regions.
[Variables plotted, from top to bottom: log(Sum. Mil. Capacity), Major Power Signs, log(Ally Count), Avg. Distance btwn Signatories, Avg. Polity IV, Avg. SLGO; horizontal axis from −0.5 to 1.0.]

A strength of a Bayesian latent variable model is that we can also assess the precision
of these estimated relationships. For example, while we can be confident that the log of
the summed military capacity of the involved signatories is positively related to the
first dimension we recover, there is no obvious relationship between that dimension and the
average Polity IV score of alliance signatories (Avg. Polity IV). Substantively, Figure 5
reveals that the logged total military capacity of the signatories is the strongest determinant
of the first-dimension estimate and that the estimate is also increasing in the number of
signatories (logged) and whether a major power signs.
More novel are the correlates of the second dimension because, unlike the case of the first
dimension where scholars already have decent measures (e.g., the summed military capacity),
the options available to scholars interested in assessing the strength of a treaty are far
cruder. Figure 6 reveals the aspects of a treaty that affect the estimated strength of an alliance
according to our measurement model.
We find that, as expected, unconditional alliances are estimated to be stronger, as are
deterministic alliances, offensive alliances, and those that establish military bases, integrate the
command of military forces, and provide for military exchanges. Non-Aggression Pacts are
sensibly estimated to be the weakest, as are alliances with specified termination dates and
provisions for conflict within the alliance itself.
Figure 6: Dimension 2 Factor Loadings: “Formal Terms”: Circles denote posterior mean, and lines denote 95% HPD regions.
[Variables plotted, from top to bottom: Unconditional, Deterministic, Offensive Pact, Compellent, Defensive Pact, Asymmetric Oblig., Military Bases, Integrated Command, Conditional, Secrecy, Military Aid, Economic Aid, Arms Reduction, Renounce, Consultation Pact, Organization, Neutrality Pact, Formal Treaty, Specified Lgth, Conflict w/in, Non−Aggression Pact; horizontal axis from −0.5 to 1.0.]
The fact that the relationship between our estimates and these observed measures appears
sensible increases our confidence in the validity of the estimates we recover, especially given the
fact that we impose no relationship a priori. The variation our measurement model recovers
is variation that exists in the data itself.
4 Alliance Strength and Reliability
In spite of the face validity of our measure as demonstrated by comparisons of individual
historical alliances, we seek to take the additional step of applying it to the estimation of
a model in which we might expect treaty strength to be a relevant measure. Consider, for
example, the question of alliance reliability. Should we expect the strength of the terms of
an alliance treaty to be related to a government’s decision to break its alliance commitment
when its ally becomes involved in war? According to Leeds (2003a), the content of a treaty is
critical, because it signals the costs signatories pay to form and break the alliance. The risk
of violating an alliance decreases with the inclusion of provisions that strengthen an alliance
by institutionalizing these costs. Such provisions include factors included in our measure of
treaty strength, e.g., integrated military command, basing, the conditions of conflict under
which the alliance comes into effect, the actions that the members are required to take in
the event of war, the provision of aid, and conditions limiting the provision of
assistance and intervention.
The inclusion or exclusion of such provisions should predict whether an alliance is likely to
be violated, but previous efforts have been forced to analyze this relationship only indirectly
because, until now, no measure of the strength of treaty content has existed (Leeds 2003a;
Sabrosky 1980; Siverson and King 1980; Smith 1996). For example, Leeds (2003a) focuses on
major powers, democratic signatories, and the change in these factors after alliance formation
because of the expectation that these factors might affect the content of an alliance when it
is formed. A measure of treaty strength like the one we provide allows us to analyze both the
determinants of treaty strength and the effect of the treaty contents on the likelihood that
signatories deliver their promises when required to do so in war.
To demonstrate the value of our measure, we evaluate both Leeds’ (2003a) determinants
of treaty strength and the effect of the strength of the treaty on the likelihood that
signatories deliver their promises when required to do so in war. We begin with the
determinants of treaty strength. Based on Leeds’ argument, we should expect democracies to be
more likely to form stronger alliances and major powers to be less likely
to form strong alliances. The democracies and major powers examined in Leeds’ data are
those whose treaty obligations have been activated by war. Thus, in replicating the data and
model using our measure of treaty strength, we may restate the expectations as follows: (1)
democracies who might expect to be called upon to support an ally in war are more likely to
form alliances with stronger treaty terms, and (2) major powers who might expect to be called
upon to support an ally in war are more likely to form alliances with weaker treaty terms.
Leeds also analyzes the effects of changes in signatories’ capabilities and domestic
political institutions. These factors may be indirect determinants of reliability, because how
an alliance is designed at the time it is formed may depend upon whether signatories expect
their own or their alliance partners’ capabilities and domestic political institutions to change
after the alliance is formed. Hence, to determine whether our measure of treaty strength is
useful for investigating questions such as this, we can estimate whether Leeds’ four factors are
related to our measure of treaty strength.
Table 1: Correlates of treaty strength, 1816-1944.
                             (1)            (2)         (3)
                             Benson Only    ATOP        Both
Democracy                    0.6061*        0.4415*     0.5812*
                             (0.235)        (0.196)     (0.287)
Major Power                  -0.4158+       -0.2069     -0.2865
                             (0.229)        (0.239)     (0.287)
Change in Capabilities       -0.6230*       -0.5456*    -0.8285**
                             (0.249)        (0.263)     (0.257)
Change in Domestic           0.1173         0.0243      0.3835
Political Institutions       (0.299)        (0.353)     (0.264)
N                            136            136         136
R²                           0.1254         0.0795      0.1430
Standard errors in parentheses
+ p<0.10, * p<0.05, ** p<0.01
Table 1 shows the estimation of treaty strength using the four factors in Leeds (2003a). We
report the results of three models using three different measures of treaty strength to indicate
the robustness of the measure to the inclusion of different factors that might be indicative
of the strength of a treaty. The measure of alliance strength in Model 1 excludes the ATOP
classifications of offensive and defensive alliances and instead uses Benson’s (2011) measures
of treaty categories based on whether the alliance is deterrent, compellent or probabilistic in
its commitments. In contrast, Model 2 excludes these measures and relies only on the ATOP
categories. The measure of treaty strength analyzed using Model 3 includes both sets of
measures. The appendix compares the measures in greater detail but, as is shown below, the
choice of measure does not greatly affect the substantive conclusions.
The results in Table 1 largely support Leeds’ (2003a) argument regarding the determinants of
treaty strength, and they do not depend on the particular measure of treaty strength that is
used. Democracies tend to favor forming stronger treaties, and signatories that anticipate a
shift in one of the alliance members’ future capabilities tend to favor weaker treaty terms.
The exceptions to Leeds’ expectations are the effects of major power status and of anticipated
shifts in domestic political institutions after signing an alliance, neither of which is
significant at the p<0.05 level. However, the coefficient on major power is,
as Leeds argues, negative in all three models, and it is significant at the p<0.074 level in Model
1. The consistency of these results with Leeds’ argument and expectations gives us further
confidence in the predictive validity of our measure of formal treaty strength – a measure that
we have previously lacked.
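As a rough illustration of the kind of estimation summarized in Table 1, the following sketch regresses a treaty-strength score on four indicators by ordinary least squares. It is a minimal sketch, assuming a linear model (as the reported R2 suggests); the data are simulated, and the variable names, the 0/1 coding, and the values in `true_beta` are hypothetical stand-ins rather than the replication data or the estimates reported above.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients and standard errors; X must contain an intercept column."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - k)          # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)     # classical covariance matrix
    return beta, np.sqrt(np.diag(cov))

# Simulated stand-in data (NOT the data used in the paper).
rng = np.random.default_rng(0)
n = 136
democracy = rng.integers(0, 2, n)       # hypothetical 0/1 indicators
major_power = rng.integers(0, 2, n)
cap_change = rng.integers(0, 2, n)
inst_change = rng.integers(0, 2, n)

# Hypothetical coefficients: intercept, democracy, major power,
# change in capabilities, change in domestic institutions.
true_beta = np.array([0.0, 0.6, -0.4, -0.6, 0.0])
X = np.column_stack([np.ones(n), democracy, major_power, cap_change, inst_change])
treaty_strength = X @ true_beta + rng.normal(0.0, 1.0, n)

beta_hat, se = ols(X, treaty_strength)
for name, b, s in zip(["const", "democracy", "major_power", "cap_change", "inst_change"],
                      beta_hat, se):
    print(f"{name:12s} {b: .3f} ({s:.3f})")
```

Swapping in a different treaty-strength score as the outcome, while holding the regressors fixed, corresponds to moving between the columns of the table.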
Table 2: Treaty strength and alliance commitment violation in war, 1816-1944.
                             (1)            (2)         (3)
                             Benson Only    ATOP        Both
Treaty Strength              -0.4509+       -0.3700     -0.6441**
                             (0.241)        (0.196)     (0.198)
Ally is Original             1.320*         1.286*      1.398*
Target in War                (0.239)        (0.577)     (0.596)
N                            136            136         136
Chi2                         7.06           6.31        15.12
Standard errors in parentheses
+ p<0.10, * p<0.05, ** p<0.01
Our second check applies the measure of treaty strength
to the likelihood that governments renege on their alliance commitments. Leeds argues that
stronger treaties are costlier to break, and we should therefore expect our measure of
treaty strength to be negatively correlated with the likelihood of violation. Table 2 presents the
results of a replication of the logit estimation found in Leeds (2003a) after replacing her four
indirect measures of treaty strength with our direct measure and including her variable for
whether the ally is the original target in a war (which Leeds argues is important for controlling
for the selection effect related to challengers’ propensity to target governments that they
believe have unreliable allies).
The results of Table 2 are consistent with the claim that weaker treaty terms are associated
with a higher likelihood of violating alliance commitments. Our measure of treaty strength
is negative in all three models, and it is significant at the p<0.061 level using the measure
based on Benson’s (2011) categories in Model 1 and at the p<0.001 level using the measure that
includes both the Benson (2011) and ATOP factors. Leeds’ variable indicating whether
the ally is the original target in war is also positive and significant, as expected.
These results further solidify our confidence in the validity and usefulness of the measure:
the measure is related to factors that we would expect it to be associated with, it
is relatively robust to the inclusion of different factors for specifying the measure, and it fills
a gap in research on important questions about alliances and conflict, such as whether
alliance commitments are reliable.
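The significance levels quoted for treaty strength can be checked approximately from the coefficients and standard errors in Table 2. The sketch below assumes a two-sided test based on the standard normal distribution, which is the usual choice for logit coefficients but is an assumption here; the function name is hypothetical, and the coefficients are entered as negative, following the text. Because the table entries are rounded, the results should only be expected to land near the quoted levels (roughly 0.06 and 0.001).

```python
import math

def two_sided_p(coef, se):
    """Two-sided p-value for coef/se under a standard normal reference distribution."""
    z = coef / se
    return math.erfc(abs(z) / math.sqrt(2))

# Treaty Strength, Model 1 (Benson only) and Model 3 (both), from Table 2.
p_model1 = two_sided_p(-0.4509, 0.241)
p_model3 = two_sided_p(-0.6441, 0.198)

print(f"Model 1: p = {p_model1:.4f}")
print(f"Model 3: p = {p_model3:.4f}")
```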
5
Conclusion, Caveats, and Implications
Scholars of alliances have been blessed with a tremendous amount of data due to the generous
and impressive efforts of those who have collected data on both the formal terms of the
various alliances as well as the characteristics of the signatories involved. The amount of
available data prompts the question: how can we best measure the strength of international
alliances given the wealth of available data and our theoretical conceptions of alliance strength
while also accounting for the inevitable ambiguity that must necessarily accompany such a
determination?
We show how a Bayesian latent variable trait model – a model that has been applied
to many other important concepts in political science – can be used to integrate observable
measures with theoretical arguments about the determinants of alliance strength to provide
estimates of: how the various observable factors relate to the theoretically implied dimensions, how the strengths of alliances in the two dimensions relate to one another, and how
certain we are about all of the estimated parameters. Applying the model to all alliances
in the international system between 1816 and 2000 provides estimates of alliance strength
in terms of the strength of signatories and the strength of the formal terms of the alliance
agreement. In total, we estimate the strength of 582 alliances – including those that are multilateral and lack a specific target. The estimates have strong face validity – familiar alliances
are located as we would expect, and exploring the post-World War II alliances of East Asia
provides reassuringly reasonable estimates.
Not only do our estimates provide scholars with measures that can be used to explore
questions such as the determinants of alliance formation, the persistence of alliances, and their
relationship with conflict; but the estimates also reveal important insights into the nature of
alliances. For example, although there is a relationship between the strength of signatories
and the strength of the formal terms of an alliance, the relationship is modest (.256), and there
is considerable variation of potential interest in the relationship. Moreover, when comparing
the estimates of temporally and geographically proximate alliances, the estimates suggest that
the alliances that are formed are nearly balanced. Perhaps reflecting the theoretical claims of
balance of power theorists (Morgenthau 1948 and Waltz 1979), alliances formed in response to
one another are very similar in strength. In addition, our ability to distinguish between alliances on
the basis of the strength of signatories is better than our ability to do so using the formal terms
of the alliance – we have more uncertainty about the estimates of the latter than we do about
the former. Relatedly, the ability to assess the certainty with which we are able to estimate
the location of alliances provides the ability to account for the precision of our estimates
when comparing alliances or using the estimates in secondary analyses. Finally, although we
think our estimates are theoretically sound and possess strong face validity, the measurement
model we employ is sufficiently general that scholars can use the model to generate their own
estimates using different assumptions or different measures.
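To make the point about precision concrete, the following sketch compares alliances using posterior draws rather than point estimates. It is an illustration only: the draws are simulated from normal distributions, the alliance names and function names are hypothetical, and the comparison assumes that draws for different alliances come from the same iterations of the sampler so that they can be paired.

```python
import numpy as np

def prob_stronger(draws_a, draws_b):
    """Share of paired posterior draws in which alliance A is stronger than alliance B."""
    return float(np.mean(np.asarray(draws_a) > np.asarray(draws_b)))

def credible_interval(draws, level=0.95):
    """Equal-tailed credible interval computed from posterior draws."""
    tail = (1.0 - level) / 2.0
    lo, hi = np.quantile(draws, [tail, 1.0 - tail])
    return float(lo), float(hi)

# Simulated posterior draws of alliance strength (NOT the paper's estimates).
rng = np.random.default_rng(1)
alliance_a = rng.normal(0.50, 0.30, 5000)
alliance_b = rng.normal(0.45, 0.30, 5000)   # close to A: hard to distinguish
alliance_c = rng.normal(-1.50, 0.30, 5000)  # far from A: clearly weaker

print("P(A > B):", prob_stronger(alliance_a, alliance_b))
print("P(A > C):", prob_stronger(alliance_a, alliance_c))
print("95% interval for A:", credible_interval(alliance_a))
```

In a secondary analysis, the same idea extends to re-estimating the model of interest once per posterior draw and pooling the results, so that uncertainty in the strength estimates carries through to the final coefficients.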
We think our measure and the method provide an important advance for scholars interested
in alliances and the international system, but some caveats are worth noting. First,
while our statistical model provides a principled way of extracting information from multiple
measures related to theoretically relevant dimensions with a minimal number of assumptions,
the resulting estimates still depend on the relationships among the observable measures used to extract the latent dimensions. For example, the fact that CINC scores are relative to
the global capacity in each year means that comparisons across long periods of time may be
difficult and the estimates are likely most appropriate for temporally bounded comparisons or
when temporally-related differences are accounted for (similar concerns can emerge whenever
these scores are used in any regression that explores variation over time).
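A small numerical example shows why share-based scores complicate comparisons over long periods. The sketch below is a simplification: it averages a state's shares of the world total over two made-up components, whereas CINC averages shares over six components; the function name and all numbers are hypothetical.

```python
def cinc_like_score(state, world):
    """Average of a state's shares of the world total across capability components."""
    shares = [state[name] / world[name] for name in state]
    return sum(shares) / len(shares)

# The state's absolute capabilities are identical in both years (hypothetical values).
state = {"military_personnel": 100.0, "energy_consumption": 50.0}

# World totals double between the early and the late year.
world_early = {"military_personnel": 1000.0, "energy_consumption": 500.0}
world_late = {"military_personnel": 2000.0, "energy_consumption": 1000.0}

score_early = cinc_like_score(state, world_early)
score_late = cinc_like_score(state, world_late)
print(score_early, score_late)
```

The state's score is halved even though its absolute capabilities have not changed, which is why the estimates are best suited to temporally bounded comparisons.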
Second, the ability to use a Bayesian latent variable model to extract the latent dimensions structuring the strength of an alliance does not ameliorate potential concerns about the
endogeneity of alliances. While we have been careful not to include variables in our measure
of alliance strength that scholars may seek to correlate with the strength of an alliance in
the hopes of better understanding alliance formation, nothing in our analysis discounts the
fact that the formation of an alliance is presumably a strategic act based on the assessment
of expected consequences and those interested in using the estimates should be careful of the
potential pitfalls.
While no measure is perfect, our model is able to use theoretical insights to: identify
how features are related to the strength of an alliance in theoretically relevant dimensions,
measure the strength of alliances formed between 1816 and 2000 along dimensions defined by
the strength of the signatories involved and the strength of the formal terms of the agreement,
and quantify how certain we are about the resulting estimates. Moreover, it also provides
a principled way to integrate alternative measures and assumptions into the measurement
task. Our measurement model and the estimates we produce will hopefully allow scholars
to focus on better understanding the determinants and consequences of alliances rather than
continually having to grapple with the question of how best to measure the relative strength
of international alliances.
Notes
1
This is the time period covered by the databases collected by Leeds et al. (2002), Gibler
and Sarkees (2004), Benson (2011) and integrated by Bennett and Stam (2000a) in EUGene
v3.204.
2
See Powell 1996 and Levy 1989 for a review of the debate over the relationship between
alliances, distribution of power, and stability.
3
For an overview of the progression of this research, see Leeds 2003b, Johnson and Leeds
2011, Benson 2011, and Benson et al. 2013a.
4
This notion goes back at least to Schelling 1966, who claimed that weak and flexible
commitments diminished opponents’ perception of the expected military strength of the commitment. The idea is echoed in Snyder 1984, Snyder 1997, Fearon 1997, Zagare and Kilgour
2003, Benson 2012, and Benson et al. 2013b.
5
Scholars have long recognized the need for adjusting aggregate military capacity for distance.
Most existing research degrades strength linearly (see, for example, Bueno de Mesquita
and Lalman 1986 and Smith 1996). However, research has not established the actual mathematical relationship between distance and capabilities, and we also do not know if the rate
of degradation is sensitive to the technological sophistication and geography of a country.
Consequently, it is not clear how distance should affect alliance strength.
6
That said, scholars could certainly use our framework to estimate a more limited measure
for alliances with targets.
7
When the variables were coded to contain multiple categories, we often collapsed the
categories to a binary measure to denote the presence or absence of each feature because
the ordering of categories was unclear. For example, the coding of MILAID is “if there are
no provisions regarding military aid, the variable is coded 0. If the agreement provides for
general or unspecified military assistance, the variable is coded 1. If the agreement provides
for grants or loans, the variable is coded 2. If the agreement provides for military training
and/or provision or transfer of technology, the variable is coded 3. If the agreement provides
for both grants and/or loans and training and/or technology, the variable is coded 4” (p. 27).
It is unclear whether terms that denote specific loans or grants (MILAID=2) are stronger
than terms that provide for unspecified military assistance (MILAID=1), or half as strong as
terms that include both grants and/or loans and training and/or technology (MILAID=4).
As a consequence, we use whether there are any provisions for military aid of any kind (i.e.,
if MILAID ≥ 1).
8
Among other provisions, the “Pact of Steel” requires that: “The contracting parties will
remain in constant touch with one another in order to reach agreement upon all questions
affecting their common interests or the European situation as a whole” (Art.1), “In the event
of the common interests of the contracting parties being injured by international events of any
kind whatever, they will immediately consult as to the measures to be taken for the protection
of those interests” (Art.2), “If contrary to the wishes and the hopes of the contracting parties
it should occur that one of them becomes involved in warlike complications with another power
or powers, the other contracting party will at once assist it as an ally and will support it with
all its military forces on land, sea and in the air” (Art.3), and “The parties are conscious of
the importance of their common relations to the powers friendly to them. They are resolved
to maintain these relations in the future and to conduct them in common in accordance with
the identical interests which bind them to these powers” (Art.6). The terms of the alliance
between Spain and the Netherlands note that: “This alliance is purely defensive and its
object is to protect the commerce of the two parties” (Art.I), “This alliance exists until the
Regents of Algeria, Tunis and Tripoli do not renounce their offensive system with respect to
the properties of the subjects of the two parties” (Art.II), and “If one of those pirate states
causes an offense to the parties, the allies consuls and representatives shall demand reparation
from the government of the offender through legal means; if the offenders government does
not abide by law, the allied powers will agree on proceeding to take reprisals corresponding
to the quantity that the offender has taken” (Art.III).
9
We can treat $y = \alpha + \beta_1 x_1 + \beta_2 x_2$ as accounting for the regression of $y$ on the unobserved
alliance strength $x^*$ given the true specification $y = \alpha_0 + \beta_0 x^*$ if we can assume that $x^*$ is a
linear function of $x_1$ and $x_2$. If, for example, $x^* = \gamma_1 x_1 + \gamma_2 x_2$ and $y = \alpha_0 + \beta_0 x^*$, the regression
of $y = \alpha + \beta_1 x_1 + \beta_2 x_2$ is equivalent to the regression of $y = \alpha_0 + \beta_0 (\gamma_1 x_1 + \gamma_2 x_2)$ because $\alpha = \alpha_0$,
$\beta_1 = \beta_0 \times \gamma_1$, and $\beta_2 = \beta_0 \times \gamma_2$, even though we do not observe $x^*$! Note, however, that this
decomposition relies on the extremely strong – and implausible – assumption that $x^* = \gamma_1 x_1 + \gamma_2 x_2$.
This requires that not only must the latent trait be a function of observables that are
correctly specified in the regression specification, but also that the relationship between $x^*$ and
the observables is without error. If there is error in this relationship – say $x^* = \gamma_1 x_1 + \gamma_2 x_2 + \epsilon$
– then we are in a classic error-in-variables situation and the estimated regression coefficients
are inconsistent (see, for example, the nice review by Hausman 2001).
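The error-free case described in this note can be verified numerically. In the sketch below, which uses simulated data and hypothetical values (`alpha0`, `beta0`, and the weights `w1` and `w2` that build the latent trait from the observables), the latent trait is an exact linear function of two observables, and regressing the outcome on those observables returns the products of the outcome slope and the weights. Outcome noise is omitted so that the identity holds exactly.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)

alpha0, beta0 = 1.0, 2.0      # hypothetical intercept and slope on the latent trait
w1, w2 = 0.5, -0.25           # hypothetical weights linking observables to the trait

x_star = w1 * x1 + w2 * x2    # latent trait as an exact linear function (no error)
y = alpha0 + beta0 * x_star   # outcome depends only on the latent trait

# Regress y on the observables, never using x_star directly.
X = np.column_stack([np.ones(n), x1, x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

print("estimated:", coef)
print("expected: ", [alpha0, beta0 * w1, beta0 * w2])
```

Only the products are recovered; the slope on the latent trait and the weights cannot be separated from this regression alone.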
10
Technically, if variable $k$ is a discrete variable, the observed value for alliance $i$ in variable
$k$ is the category $c$, which is generated according to $x_{ik} = c$ if $x^*_{ik} \in (\tau_{k(c-1)}, \tau_{kc}]$, where $x^*_{ik}$
is the latent continuous value underlying $x_{ik}$ and $\tau_k$
is the vector of cut points for the $C$ categories in variable $k$ (Quinn 2004).
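The mapping from a latent continuous value to an observed category can be sketched as follows, using right-closed intervals between adjacent cut points as in the rule above. The function name and the cut-point values are hypothetical, and the outermost cut points are taken to be minus and plus infinity, so only the interior thresholds are listed.

```python
from bisect import bisect_left

def to_category(latent_value, cut_points):
    """Map a latent value to a category in 1..len(cut_points)+1.

    cut_points holds the sorted interior thresholds. Category c is returned when
    the latent value lies in (cut_points[c-2], cut_points[c-1]], with the outer
    bounds treated as minus and plus infinity.
    """
    return bisect_left(cut_points, latent_value) + 1

cuts = [-1.0, 0.0, 1.5]   # hypothetical thresholds defining four categories

for value in [-2.0, -1.0, -0.5, 0.0, 1.5, 2.0]:
    print(value, "->", to_category(value, cuts))
```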
11
Specifically, the prior distribution of $\beta_k$ conditional on $\sigma^2_k$ is normally distributed and the
prior distribution for $\sigma^2_k$ is an inverse-Gamma distribution (Jackman 2009a).
12
Because we impose theoretically derived parameter constraints to identify the dimensions
being estimated, our estimator is similar to “confirmatory” factor analysis where theoretical
insights are used to define the dimensions of interest a priori. “Exploratory” factor analysis
places fewer constraints on the measurement model and lets (possibly spurious) relationships
present within the data define the recovered dimensions.
13
Even though the alliance was weak in terms of the signatory characteristics, it was powerful
enough to satisfy the primary objective of the Union, which was to crack down on the Syrian
Communists. The larger hope, which was explicitly expressed in the alliance, was that other
Arab states would also join to enhance the combined influence of the Arab community. Instead,
Jordan and Iraq felt threatened by the UAR and formed the short-lived Iraq-Jordan Federal
Union. Walt (1987) argues that the Federal Union was designed to balance against the UAR.
At least in terms of our measure of overall alliance strength, the UAR and Federal Union
appear to be matched. The Federal Union lies within the confidence interval of the UAR,
implying that the strength of the Federal Union is statistically indistinguishable from the
UAR.
14
Benson (2011, 2012), for example, disaggregates deterrent types of alliances and studies
why differences exist among these alliances.
15
To be clear, although we use a Bayesian framework for theoretical and practical reasons,
the priors we use are diffuse and contain no information about the relationships we examine.
Our posteriors are almost entirely driven by the likelihood function.
16
See Benson 2012 for a case study analysis of these East Asian alliances.
References
Abbott, Kenneth W. and Duncan Snidal. 2000. “Hard and Soft Law in International Governance.” International Organization 54(3):421–456.
Achen, Christopher H. 2005. “Let’s Put Garbage-Can Regressions and Garbage-Can Probits
Where They Belong.” Conflict Management and Peace Science 22(4):327–339.
Bennett, D. Scott and Allan C. Stam. 2000a. “EUGene: A Conceptual Manual.” International Interactions 26(2):179–204.
Bennett, D. Scott and Allan C. Stam. 2000b. “A Universal Test of an Expected Utility Theory
of War.” International Studies Quarterly 44(3):451–480.
Benson, Brett V. 2011. “Unpacking Alliances: Deterrent and Compellent Alliances and Their
Relationship with Conflict, 1816–2000.” The Journal of Politics 73(4):1111–1127.
Benson, Brett V. 2012. Constructing International Security: Alliances, Deterrence, and Moral
Hazard. New York, NY: Cambridge University Press.
Benson, Brett V., Adam Meirowitz and Kris Ramsay. 2013. “Inducing Deterrence Through
Moral Hazard in Alliance Contracts.” Journal of Conflict Resolution .
Benson, Brett V., Patrick Bentley and Jim Ray. 2013. “Ally Provocateur: Why Allies Do Not
Always Behave.” Journal of Peace Research 50(1):47–58.
Boulding, Kenneth Ewart. 1962. Conflict and Defense: A General Theory. Harper.
Budge, Ian, Hans-Dieter Klingemann, Andrea Volkens, Judith Bara, Eric Tanenbaum, Richard
C. Fording, Derek J. Hearl, Hee Min Kim, Michael McDonald and Silvia Mendez. 2001.
Mapping Policy Preferences. Estimates for Parties, Electors, and Governments 1945-1998.
Oxford: Oxford University Press.
Bueno de Mesquita, Bruce. 1983. The War Trap. New Haven, CT: Yale University Press.
Bueno de Mesquita, Bruce and David Lalman. 1986. “Reason and War.” The American
Political Science Review pp. 1113–1129.
Clinton, Joshua D. and David E. Lewis. 2008. “Expert Opinion, Agency Characteristics, and
Agency Preferences.” Political Analysis 16(1):3–20.
Clinton, Joshua D., Simon Jackman and Doug Rivers. 2004. “The Statistical Analysis of Roll
Call Data.” American Political Science Review 98(2):355–370.
Fearon, James D. 1997. “Signaling Foreign Policy Interests: Tying Hands versus Sinking
Costs.” The Journal of Conflict Resolution 41(1):68–90.
Gartzke, Erik and Kristian Skrede Gleditsch. 2004. “Why Democracies May Actually Be Less
Reliable Allies.” American Journal of Political Science 48(4):775–795.
Gibler, Douglas M. and John A. Vasquez. 1998. “Uncovering the Dangerous Alliances, 1495-1980.” International Studies Quarterly 42(4):785–807.
Gibler, Douglas M. and Meredith Reid Sarkees. 2004. “Measuring Alliances: The Correlates of
War Formal Interstate Alliance Dataset, 1816-2000.” Journal of Peace Research 41(2):211–
222.
Gibler, Douglas M. and Scott Wolford. 2006. “Alliances, Then Democracy: An Examination
of the Relationship between Regime Type and Alliance Formation.” The Journal of Conflict
Resolution 50(1):129–153.
Gill, Jeff. 2002. Bayesian Methods: A Social and Behavioral Sciences Approach. Boca Raton,
FL: Chapman & Hall/CRC.
Gray, Julia and Jonathan Slapin. 2011. “How Effective are Preferential Trade Agreements?
Ask the Experts.” The Review of International Organizations pp. 1–25.
Hausman, Jerry. 2001. “Mismeasured Variables in Econometric Analysis: Problems from the
Right and Problems from the Left.” Journal of Economic Perspectives 15(4):57–67.
Hoyland, Bjorn, Karl Moene and Fredrik Willumsen. 2012. “The tyranny of international
index rankings.” Journal of Development Economics 97(1):1–14.
Jackman, Simon. 2009a. Bayesian Analysis for the Social Sciences. New York, NY: Wiley & Sons.
Jackman, Simon. 2009b. Measurement. In The Oxford Handbook of Political Methodology, ed.
Janet M. Box-Steffensmeier, Henry E. Brady and David Collier. Oxford: Oxford University
Press chapter 6.
Lai, Brian and Dan Reiter. 2000. “Democracy, Political Similarity, and International Alliances,
1816-1992.” The Journal of Conflict Resolution 44(2):203–227.
Leeds, Brett Ashley. 2003a. “Alliance Reliability in Times of War: Explaining State Decisions
to Violate Treaties.” International Organization 57(4):801–827.
Leeds, Brett Ashley. 2003b. “Do Alliances Deter Aggression? The Influence of Military
Alliances on the Initiation of Militarized Interstate Disputes.” American Journal of Political
Science 47(3):427–439.
Leeds, Brett, Jeffrey Ritter, Sara Mitchell and Andrew Long. 2002. “Alliance Treaty Obligations and Provisions, 1815-1944.” International Interactions 28(3):237–260.
Levendusky, Matthew S. and Jeremy C. Pope. 2010. “Measuring Aggregate-Level Ideological
Heterogeneity.” Legislative Studies Quarterly 35(2):259–282.
Levendusky, Matthew S., Jeremy C. Pope and Simon D. Jackman. 2008. “Measuring District-Level Partisanship with Implications for the Analysis of U.S. Elections.” The Journal
of Politics 70(3):736–753.
Levy, Jack. 1989. The Causes of War: A Review of Theories and Evidence. In Behavior,
Society, and Nuclear War, ed. Philip Tetlock et al. Vol. 1 New York, NY: Oxford University
Press.
Levy, Jack S. 1981. “Alliance Formation and War Behavior: An Analysis of the Great Powers,
1495-1975.” The Journal of Conflict Resolution 25(4):581–613.
Levy, Jack S. and William R. Thompson. 2010. “Balancing on Land and at Sea: Do States
Ally Against the Leading Power?” International Security 35(1):7–43.
Marshall, Monty G., Keith Jaggers and Ted Robert Gurr. 2002. Polity IV Project Political
Regime Characteristics and Transitions, 1800-2002. College Park, Maryland: Center for
International Development and Conflict Management, University of Maryland.
Martin, Andrew D., Kevin M. Quinn and Jong Hee Park. 2011. “MCMCpack: Markov Chain
Monte Carlo in R.” Journal of Statistical Software 42(9):1–21.
Mattes, Michaela. 2012. “Democratic Reliability, Precommitment of Successor Governments,
and the Choice of Alliance Commitment.” International Organization 66(1):153–172.
Mearsheimer, John. 1990. “Back to the Future: Instability in Europe After the Cold War.”
International Security 15(1):5–56.
Morgenthau, Hans J. 1948. Politics Among Nations: The Struggle for Power and Peace. New
York: Knopf.
Morrow, James D. 1991. “Alliances and Asymmetry: An Alternative to the Capability Aggregation Model of Alliances.” American Journal of Political Science pp. 904–933.
Morrow, James D. 1994. “Alliances, Credibility, and Peacetime Costs.” Journal of Conflict
Resolution 38(2):270–297.
Organski, Abramo F. K. and Jacek Kugler. 1980. The War Ledger. Chicago, IL: University
of Chicago Press.
Pemstein, Daniel, Stephen A. Meserve and James Melton. 2010. “Democratic Compromise:
A Latent Variable Analysis of Ten Measures of Regime Type.” Political Analysis 18(4):426–
449.
Plummer, Martyn, Nicky Best, Kate Cowles and Karen Vines. 2006. “CODA: Convergence
Diagnosis and Output Analysis for MCMC.” R News 6(1):7–11.
Poast, Paul, Alexander von Hagen-Jamar and James D. Morrow. 2012. “Does Capability
Aggregation Explain Alliance Formation?” Working Paper .
Poole, Keith T. and Howard Rosenthal. 1997. Congress: A Political-Economic History of Roll
Call Voting. New York: Oxford University Press.
Quinn, Kevin M. 2004. “Bayesian Factor Analysis for Mixed Ordinal and Continuous Responses.” Political Analysis 12(4):338–353.
Ray, James Lee. 2003. “Explaining Interstate Conflict and War: What Should Be Controlled
for?” Conflict Management and Peace Science 20(2):1–31.
Reiter, Dan. 1996. Crucible of Beliefs: Learning, Alliances, and World Wars. Cornell University Press.
Rivers, Doug. 2003. “Identification of Multidimensional Spatial Voting Models.” Stanford
University Working Paper .
Rosas, Guillermo. 2009. “Dynamic Latent Trait Models: An Application to Latin American
Banking Crises.” Electoral Studies 28(3):375 – 387.
Rosenthal, Howard and Erik Voeten. 2007. “Measuring Legal Systems.” Journal of Comparative Economics 35(4):711 – 728.
Sabrosky, Alan N. 1980. Interstate Alliances: Their Reliability and the Expansion of War. In
The Correlates of War II: Testing Some Realpolitik Models, ed. J. D. Singer. New York:
The Free Press.
Schnakenberg, Keith E. and Christopher J. Fariss. 2011. “A Dynamic Ordinal Item Response
Theory Model with Application to Human Rights Data.” APSA 2009 Toronto Meeting
Paper .
Senese, Paul D. and John A. Vasquez. 2008. The Steps to War: An Empirical Study. Princeton:
Princeton University Press.
Signorino, Curtis S. and Jeff M. Ritter. 1999. “Tau-b or Not Tau-b: Measuring the Similarity
of Foreign Policy Positions.” International Studies Quarterly 43(1):115–144.
Simon, Michael W. and Erik Gartzke. 1996. “Political System Similarity and the Choice of
Allies: Do Democracies Flock Together, or Do Opposites Attract?” The Journal of Conflict
Resolution 40(4):617–635.
Singer, J. David, and Melvin Small. 1966. “Formal Alliances, 1815-1939: A Quantitative
Description.” Journal of Peace Research 3:1–31.
Siverson, Randolph M. and Joel King. 1980. “Attributes of National Alliance Membership
and War Participation, 1815-1965.” American Journal of Political Science 24(1):1–15.
Siverson, Randolph M. and Michael R. Tennefoss. 1984. “Power, Alliance, and the Escalation
of International Conflict, 1815-1965.” American Political Science Review 78(4):1057–1069.
Smith, Alastair. 1995. “Alliance Formation and War.” International Studies Quarterly
39(4):405–425.
Smith, Alastair. 1996. “To Intervene or Not to Intervene: A Biased Decision.” The Journal
of Conflict Resolution 40(1):16–40.
Snyder, Glenn H. 1984. “The Security Dilemma in Alliance Politics.” World Politics 36(4):461–
495.
Snyder, Glenn Herald. 1997. Alliance Politics. Ithaca, NY: Cornell University Press.
Starr, Harvey and Benjamin A. Most. 1976. “The Substance and Study of Borders in International Relations Research.” International Studies Quarterly pp. 581–620.
Treier, Shawn and Simon Jackman. 2008. “Democracy as a Latent Variable.” American
Journal of Political Science 52(1):201–217.
Tsebelis, George. 2002. Veto Players: How Political Institutions Work. Princeton, NJ: Princeton University Press.
Vasquez, John A. 1993. The War Puzzle. Cambridge: Cambridge University Press.
Voeten, Erik. 2000. “Clashes in the Assembly.” International Organization 54:185–215.
Wagner, Robert Harrison. 2007. War And the State: The Theory of International Politics.
University of Michigan Press.
Walt, Stephen M. 1987. The Origins of Alliances. Ithaca, NY: Cornell University Press.
Waltz, Kenneth Neal. 1979. Theory of International Politics. Addison-Wesley Pub. Co.
Weidmann, N. B., D. Kuse and K. S. Gleditsch. 2010. “The geography of the international
system: The cshapes dataset.” International Interactions 36(1):86–106.
Yuen, A. 2009. “Target Concessions in the Shadow of Intervention.” Journal of Conflict
Resolution 53(5):745–773.
Zagare, F. C. and D. Marc Kilgour. 2003. “Alignment Patterns, Crisis Bargaining,
and Extended Deterrence: A Game-Theoretic Analysis.” International Studies Quarterly
47(4):587–615.