Measured Strength: Estimating Alliance in the International System, 1816-2000*

Brett V. Benson†
Joshua D. Clinton‡

May 15, 2013

Word Count: 10,800

Keywords: Alliances; Measurement; Formal Treaty Terms; Reliability

Abstract

Alliances play a critical role in the international system, and understanding the determinants and consequences of their strength is an important task. Many have argued that the strength of an alliance is determined by both the power of the signatories involved and the formal terms of the agreement, but using these insights to measure the strength of alliances is difficult. We use a Bayesian statistical measurement model to estimate the strength of all alliances signed between 1816 and 2000 along two theoretically derived dimensions: the strength of the signatories involved and the strength of the formal terms of the alliance. The resulting estimates provide a measure of alliance strength based on the terms of the alliance itself, which allows for the investigation of many possible questions. Exploring the validity of the resulting estimates also reveals support for some core intuitions that were previously hard to verify regarding the relationship between signatory strength and the formal terms of a treaty, the extent to which alliance balancing occurs, and whether alliances with stronger treaty terms are also those in which allies are less likely to renege if conflict occurs.

* The authors would like to thank Ashley Leeds and Michaela Mattes for comments on an earlier version of this manuscript.
† Associate Professor of Political Science, Vanderbilt University, E-mail: [email protected]. PMB 505, 230 Appleton Place, Nashville TN, 37203-5721.
‡ Associate Professor of Political Science and Co-Director of the Center for the Study of Democratic Institutions, Vanderbilt University. E-mail: [email protected]. PMB 505, 230 Appleton Place, Nashville TN, 37203-5721.
Interstate alliances are a critical feature of the international system and understanding their causes and consequences is important for better understanding the conditions under which international security can be achieved. Many of the main questions related to military alliances entail a conceptualization of alliance strength. For example, in considering the effect of an alliance on interstate conflict, an important question to ask is whether the strength of an alliance affects the military decisions of both the alliance members as well as the prospective targets of the alliance. To investigate such a question, scholars might begin by asking how powerful the allies are relative to their targets and whether the obligations the signatories have agreed to require them to expend costs that enhance the military capacity of the alliance prior to or during war. In this illustration, alliance strength is clearly related to both the characteristics of the signatories and the content of the associated agreement between the alliance members. Because many factors might be related to these two dimensions of alliance strength, the question of how best to characterize the dimensions – and therefore also of the relationship between them – is an elusive one.

In this paper, we characterize the strength of an alliance along theoretically implied dimensions using available data and accounting for the uncertainty in the measures. Such a measure of alliance strength would provide the ability to ask and answer many important, but elusive, questions in the study of international politics. For example, does the strength of the alliance signatories relate to the strength of the treaty terms? Do powerful alliances, both in terms of the signatory capabilities and the alliance terms, deter conflicts (Smith 1995, Benson et al 2013a)?
Are treaties with stronger terms less likely to be violated (perhaps because the likelihood of violation resulted in the creation of a weaker alliance) (Leeds 2003a)? How does alliance strength affect the distribution of power and the likelihood of war in the international system (Mearsheimer 1990, Organski and Kugler 1980, Powell 1996)? Does fear of opportunism among allies lead to weaker treaty terms (Benson 2012, Snyder 1997, Fearon 1997)? Many of these questions have either produced mixed empirical results or have not been directly examined because of measurement difficulties. We lack a measure of alliance strength that reflects its nuanced nature.

Our goal is to generate a measure of alliance strength that helps advance research on these and other questions related to alliances and international security. Because prominent theories raise questions related to both the signatory characteristics and treaty terms, our measure depends on two dimensions of strength: 1) the potential military capacity of an alliance given the combined strength and other characteristics of the signatories, and 2) the specific terms of the treaty that obligate signatories to expend costs prior to or during war for the purpose of limiting or enhancing the wartime military capacity of the alliance. As an illustration of the relevance of both dimensions, consider a comparison between the 1958 treaty signed between the United Arab Republic and the Kingdom of Yemen (UAR-Yemen) and the 1975 Helsinki Final Acts. In the UAR-Yemen alliance, the signatories agreed to strong treaty terms, calling for integrated military command and joint military bases for both offensive and defensive purposes. Yet, the combined strength of the signatories was relatively weak.
By contrast, the 1975 Helsinki Final Acts was signed by both NATO and Warsaw Pact members, making the combined military capacity of the signatories unparalleled by other alliances at the time, but the treaty terms of this nonaggression pact merely required states to “respect each other’s sovereign equality and individuality as well as all the rights inherent in and encompassed by its sovereignty, including in particular the right of every State to juridical equality, to territorial integrity, and to freedom and political independence” (Article 1). There were no obligations in the formal agreement of the Helsinki Final Acts that would require signatories to utilize their military strength against any other state. While extreme, these examples illustrate two points: 1) the strength of an alliance varies both in terms of the military capacity of the signatories involved and the extent to which the formal terms of a treaty require signatories to expend costs that enhance or limit their combined military power, and 2) the military capacity of the signatories and the strength of the treaty terms may not always be correlated.

Based on theoretical arguments about the correlates of alliance strength and using observable characteristics of an alliance and the associations between these characteristics, we measure the strength of every available military alliance – including multilateral alliances and alliances without a target – signed between 1816 and 2000.¹ Our approach is similar to the approach taken by scholars interested in measuring the ideology of elected and unelected officials (e.g., Poole and Rosenthal 1997; Martin and Quinn 2002; Clinton, Jackman and Rivers 2004), the positions of a political party in an underlying policy space (Budge et al.
2001), the ideology of a congressional district in the United States (Levendusky, Pope and Jackman 2008), the extent to which a country is democratic (Pemstein, Meserve, and Melton 2010), or the positions taken by a country in the United Nations General Assembly (Voeten 2000). We use the formal terms of the alliance agreement and characteristics of the countries involved at the time of the signing to construct a measure of alliance strength based on the underlying associations of the observable measures.

We make several contributions. First, we show how a Bayesian latent trait model (Quinn 2004) can recover a theoretically informed, multi-dimensional estimate of alliance strength that reflects both the terms of the formal alliance agreement and the characteristics of the signatories. Our measure can quantify the influence of various treaty terms on the strength of an alliance – a task that has previously eluded scholars interested in alliances and forced existing work to rely on coarse indicators that cannot discriminate between alliances belonging to broad classifications (e.g., “defensive alliances”) – and we can show that, across all alliances, there is almost no relationship between the strength of the involved signatories and the strength of the formal terms. Second, because we provide a measure for a concept for which we lack an agreed-upon measure, our estimates allow us to test existing intuitions about alliance strength. For example, our measure allows us to compare the relative strength of alliances, and doing so reveals clear evidence of “balancing” among post-war East Asian alliances. Moreover, whereas prior work interested in the impact of treaty strength on durability was forced to rely on proxy measures due to the lack of a systematic manner of measuring what various treaty terms imply about the strength of an alliance, our measure of strength confirms that alliances with weaker formal obligations are also less durable in times of war.
Confirming core intuitions not only helps establish the validity of our measure but also illustrates the potential of our measure for additional questions. Third, because any assessment of alliance strength is inherently ambiguous, we can quantify how certain we are about the resulting estimates (Jackman 2009b). Fourth, our method is sufficiently general that we can extend the model to all alliance treaties – including multilateral alliances and alliances without a target. Finally, although we think our estimates are based on strong theoretical foundations and possess strong conceptual validity, the statistical measurement model empowers scholars to construct their own measures if their questions of interest are sufficiently different or if they choose to make alternative assumptions about the underlying relationships.

In validating our measure using qualitative and quantitative information, we are able to draw several substantive conclusions. First, we show that there is a very weak relationship between signatory strength and treaty strength. This finding suggests that relatively weak signatories may occasionally wish to strengthen their alliance through treaty terms that are designed to expand their combined capabilities through costly peacetime military coordination and sweeping wartime obligations. On the other hand, relatively strong alliance signatories may wish to curtail allies’ access to the aggregate capabilities of the alliance by designing provisions that make it costless to escape or that restrict conditions for intervention. Second, we also show that prominent alliances within a theatre of operations show some evidence of “balancing” both in terms of the strength of the signatories involved and the commitment created by the treaty. Confirming our estimation of the strength of some prominent alliances against our historical expectations lends face validity to the measure.
Finally, we use our measure to replicate existing work where previously only indirect measures of treaty strength were used to analyze the reliability of alliances. Applying our measure to an existing question where it would be of value further confirms its usefulness and the validity of the estimates that we recover. This last point is important because whereas existing work is forced to “black box” the measure of treaty strength and use characteristics that are thought to predict treaty strength, the estimate we provide is a direct measure of the concept of interest.

The outline of the paper is as follows. Section 2 briefly recaps the extensive literature dealing with the strength of international alliances to extract the primary dimensions that scholars have identified as influencing the strength of an alliance – the terms of the alliance and the characteristics of the signatories. Section 3 describes the Bayesian latent variable model we use to measure alliance strength and it describes the observable characteristics we use to estimate our two-dimensional estimate of alliance strength. Section 4 establishes the predictive validity of the measure and confirms Leeds’ (2003) findings that alliances with weaker treaty terms are more likely to be violated in times of war. Section 5 concludes by discussing the possible uses and extensions of both our estimates of alliance strength and also the Bayesian latent variable measurement model we employ.

1 Conceptualizing Alliance Strength

For many prominent research programs, a common way to conceptualize the strength of an interstate alliance is to consider the military capacity that can be generated as a result of the joint cooperation of the allies in an interstate war. This notion entails a calculation of the sum capacity of the signatories.
Balance of power theories privilege the role of alliances in world politics by considering that the distribution of power that results from the joint military strength of competing alliance networks affects the stability of the international system (Morgenthau 1948, Organski 1968, Waltz 1979, Walt 1987). There is a long debate, which has yielded mixed results in the empirical literature, about the effect of alliances in various theories of the distribution of power and stability.² In these empirical analyses, scholars typically utilize some measure of aggregate capabilities of alliance signatories to approximate the strength of an alliance.

The same conception of alliance strength is relevant for research on the deterrence effect of military alliances. Theories of alliances and deterrence examine how the combined potential military force of the allies might affect an adversary’s calculation to challenge one of the allies (Morrow 1994, Smith 1995, Zagare and Kilgour 2003, Yuen 2009, Benson 2012, Benson et al 2013b). Empirical investigations of the relationship between alliances and deterrence have been forced to rely on coarse indicators of the presence of an alliance and then control for factors that may be related to alliance strength such as a signatory’s capabilities and its presence in alliance networks (Leeds 2003b, Benson 2011).
One reason for this approach is that recent studies of the deterrent effects of alliances have sought to examine how the content of an alliance also affects conflict, holding fixed factors related to the military capacity of the signatory.³ Scholars have long recognized that the terms of the promises made between countries can affect the expected military strength of their commitments to one another.⁴ By including provisions that require signatories either to pay ex ante costs to facilitate military coordination or to behave in particular ways subject to the payment of costs for reneging, alliance members can design their agreements either to enhance or to limit the expected strength of their alliance. A non-aggression pact, for example, should limit expectations of potential military collusion between signatories while, by contrast, an offensive or defensive pact should raise expectations of the potential for an alliance member to make full use of the combined military capabilities of the alliance under particular circumstances. Additionally, many alliance agreements require signatories to pay costs prior to war by taking such measures as integrating military command, exchanging military and economic aid, and establishing military bases in one another’s territory. Paying such costs in advance of war may create cooperative synergies between signatories that expand their alliance strength beyond what is estimated by a raw calculation of their summed military capacity (Morrow 1994). If the terms of a treaty matter, focusing only on the military capacity of the involved countries will fail to account for the differences in treaty terms that can expand or limit strength as well as increase or decrease the likelihood that the military capacity of the involved countries will be exercised to further the objectives of the alliance.
In spite of the relevance of treaty content to the overall strength of an alliance, scholars lack a measure that might advance empirical research on many important questions. As we have seen in the deterrence literature, a standard work-around for the missing measure is to use coarse measures of different types of alliance provisions that might serve as indicators of strong military treaties with high expectation of ally intervention or powerful military coordination. Another approach is to consider alliance content indirectly by examining correlates of treaty strength. For example, because a good measure of treaty strength does not exist, Leeds (2003a) is forced to study the reliability of alliances by analyzing the relationship between signatory characteristics thought to correlate with treaty content, and whether signatories deliver on their promises during wars. In this study of reliability – as well as other important research programs such as deterrence, the balance of power, and determinants of alliance formation – a measure of alliance strength that accounts for many of the factors that theorists believe affect alliance strength would be most valuable.

Given that we think that alliances differ both in terms of signatories involved and the particular terms of the alliance, we seek a measure of alliance strength that reflects these two notions of potential military capacity while also accounting for the inevitable uncertainty of any such measure given the concepts’ inherent ambiguity. As a starting point, it is natural to categorize existing available measures of alliance strength according to two separate dimensions. One dimension, signatory characteristics, consists of particular qualities of the allies that contribute to the military capacity of the signatories if there is a war and all the allies cooperate and join the war.
The second dimension, treaty characteristics, includes the provisions of a military alliance that limit or enhance the ability of the signatories to jointly mobilize their combined military capabilities to fight a war as well as the expectation that they will make their military capacity available. The first dimension gives a measure of the total adjusted potential military strength of an alliance, but the actual military strength of the alliance is only a fraction of the total potential strength if the treaty provisions restrict the amount of capabilities that any signatory can expect to gain access to if there is a war. That is, the first dimension can be interpreted as providing an estimate of the total potential adjusted military capacity of an alliance, while the second dimension characterizes how much of the potential military capacity the signatories agree to make available for the benefit of the alliance and whether additional military coordination can expand the total adjusted military capacity. While there may certainly be a relationship between the two dimensions, nothing requires a relationship to exist. It is certainly possible for two countries to enter into a variety of arrangements that entail a differing level of commitment to one another.

1.1 Measuring Signatory Strength

To aggregate signatories’ military capabilities, scholars typically sum the capabilities of the alliance partners using the Composite Index of National Capabilities (CINC scores) (Bueno de Mesquita 1983; Reiter 1996; Wagner 2007). While aggregate capabilities provide one estimate of the raw potential military capability of the alliance, other factors related to specific characteristics of the signatories might also enhance or constrain the military capacity of the alliance. The presence of a major power in an alliance may also affect the overall military capacity of an alliance.
Scholars claim that major powers possess unique characteristics that give such alliances a distinctive military advantage – e.g., they possess significantly greater economic resources, have more economic and security interests, possess advanced weapons systems (such as nuclear weapons in the post-WWII era), and wield influence in international institutions such as the United Nations Security Council (Gibler and Vasquez 1998). Although scholars generally agree that major powers are qualitatively distinct from other powers, measuring their impact on the military capacity of an interstate alliance is not straightforward – scholars typically use a separate indicator to control for the presence of a major power (Levy 1981, Siverson and Tennefoss 1984, Morrow 1991, Leeds 2003a and Benson 2011).

Distances between allies may degrade the signatories’ combined capabilities because of the cost related to projecting military forces and coordinating long-distance military actions (Boulding 1962; Starr and Most 1976; Bueno de Mesquita 1983; Bueno de Mesquita and Lalman 1986; Smith 1996; Weidmann et al. 2010; Bennett and Stam 2000b, Poast et al 2012).⁵ The distance to a target country may also be relevant for assessing the strength of some alliances, but given the difficulty of identifying threats and the aspiration to estimate the strength of alliances lacking a specified threat we omit this variable.⁶

The size of the alliance may also affect its military capacity. Multiple signatories may provide advantages in conflict bargaining, yield potential gains from division of labor and specialization, and enhance the credible use of allies’ military capabilities beyond the additive advantage of simply summing individual military capabilities. However, the relationship is somewhat unclear as more signatories may complicate logistical coordination and increase the chances that allies’ opinions and interests will diverge.
Finally, the commonality of security interests may affect the resolve of signatories to contribute in a war. One common approach is to use “s-scores” to measure the closeness of foreign policy interests (Signorino and Ritter 1999) based on the similarity of countries’ alliance portfolios. Another measure of shared interests is the commonality of regime type; shared domestic political regimes may produce stronger alliances if security interests are also shared or if domestic audience costs make jointly democratic alliances more credible (Lai and Reiter 2000; Leeds et al. 2002; Gibler and Sarkees 2004; Leeds et al. 2009; Mattes 2012). On the other hand, the relationship is not entirely clear because democracies may prefer not to ally with one another (Simon and Gartzke 1996; Gibler and Wolford 2006), because the veto points created by domestic political institutions may create difficulties for taking action (e.g., Tsebelis 2002), or because election-induced leadership turnover may make them unreliable (Gartzke and Gleditsch 2004).

1.2 Measuring Treaty Terms

Many treaty provisions are thought to reflect the strength of an alliance. Despite a wealth of relevant and available data detailing various treaty provisions collected by Alliance Treaty Obligations and Provisions (Leeds et al. 2002), the various provisions do not necessarily directly reflect the strength of a treaty and it is unclear how the various features interact to affect signatories’ ability to access the potential military strength of the alliance. To identify and estimate the second dimension we rely on the insights of scholars who argue that some types of alliances are stronger than others either because different types have more or less impact on deterrence (Benson 2011, Benson 2012, Benson et al 2013a, Leeds 2003b) or because the type of agreement affects the likelihood signatories will intervene (Leeds 2003a, Sabrosky 1980, Siverson and King 1980, Smith 1996).
To measure the influence of the formal terms of an agreement on alliance strength, we use ATOP data (Leeds et al. 2002) and Benson’s (2011) typology. Alliance agreements are coded in the ATOP data as being offensive, defensive, neutrality, consultation, or non-aggression. We allow for the different alliances to impact alliance strength differently, but we do not assume anything about the ordering of alliance strength between these types – e.g., we assume that there are similarities within offensive treaties and non-aggression pacts, but we do not impose any a priori assumptions about which is stronger.

Because an alliance agreement can contain multiple provisions, we also want to allow for the differential impact of non-exclusive designations. Benson’s (2011) typology, for example, is based on the expressed objective of the provision to provide military assistance and whether the obligation to deliver military assistance is guaranteed and conditioned on an action in a dispute; conditions limiting the application of military force to specified situations may also affect the strength of an alliance. To allow for the possibility that an unconditional guarantee of military support in any circumstances may imply a different amount of strength than a commitment of support only if an adversary attacks and an alliance member did not provoke (or a promise to possibly only intervene), we allow for treaty commitments containing both compellent and deterrent objectives to differ from those containing just compellent or just deterrent objectives. To be clear, while we want to permit these types of alliances to differ, we do not impose any assumptions about how they may differ.

Many other aspects are plausibly related to the strength of the formal terms of an alliance.
In particular, we examine the impact of whether there are: mentions of the possibility of conflict between the members of the alliance (CONWTIN), an integrated military command (INTCOM), the exchange of economic aid (ECAID), the exchange of military aid (MILAID), provisions for an increase or reduction of arms (ARMRED), and joint troop placements (BASE). We also account for whether the formal obligations vary across the alliance partners (ASYMMETRY), whether it was formed in secret (SECRECY), whether it was formally ratified (ESTMODE), whether it formed an organization or required formal meetings between the signatories (ORGAN), whether it allows a signatory to renounce obligations under an alliance agreement during the term of the agreement (RENOUNCE), whether the obligations are conditional (CONDITIO), and whether the alliance provided for a specific term (SPECLGTH).⁷

2 A Statistical Measurement Model

The issues scholars confront when attempting to measure whether an alliance presents a formidable obstacle to potential assailants because of preferences, circumstances and military capacity, or the extent to which the formal terms of an alliance agreement bind the signatories together, are issues that are endemic to social sciences. How do observable features relate to the unobservable strength of an alliance? We may know an alliance is offensive and commits signatories to establish bases in each others’ territories, but is this stronger than a defensive alliance that establishes an integrated military command?

To motivate the exposition, consider the task of measuring alliance strength – hereafter denoted by $x^*$ – using the observable indicator variables $x_1$ and $x_2$ (e.g., major power signatory, offensive alliance). One possibility is to choose a single characteristic to proxy for alliance strength. Using either $x_1$ or $x_2$ to measure $x^*$ is potentially problematic because so doing ignores information in other characteristics.
The terms of the average “offensive” alliance may reflect stronger commitments than the terms of an average “nonaggression” commitment, but there is still important variation within each alliance type; the “Pact of Steel” and the 1816 alliance between the Netherlands and Spain are both “offensive” alliances, but the strength of the commitments implied by the former is considerably greater.⁸

Creating an index based on multiple characteristics does not solve the problem because there is often no theoretical guidance for combining measures and it is hard to interpret the resulting scale. Adding characteristics to create an index makes extremely strong assumptions – even if the indicator variables $x_1$ and $x_2$ are both related to the strength of an alliance, on what basis can we conclude that an alliance possessing only characteristic $x_1$ is as strong as the alliance that possesses only characteristic $x_2$? Moreover, is the alliance containing both $x_1$ and $x_2$ twice as strong as an alliance containing one feature but not the other? It seems difficult to rationalize the relationships that are assumed by an additive index, and such assumed equivalences only increase as the number of variables used to construct the measure increases.

If the goal is to predict the effects of alliance strength on an outcome of interest – $y$ – we can use the regression specification to control for multiple features of an alliance. For example, if we are predicting the effect of alliance strength on outcome $y$, the typical regression specification is $y = \alpha + \beta_1 x_1 + \beta_2 x_2$, which allows the right-hand side to measure alliance strength – as a linear function of $x_1$ and $x_2$ – and its relation to $y$.⁹ Note that including multiple measures in a regression changes measurement issues into specification issues and the degrees of freedom that analysts have may be quickly reduced given the number of potential indicators of alliance strength.
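The equivalences assumed by an additive index can be made explicit by writing the index as a restricted version of the regression specification, where $\beta_1$ and $\beta_2$ denote the coefficients on $x_1$ and $x_2$. The following is only an illustrative sketch in the notation of this section; the symbols $I$ (the index) and $\gamma$ (its coefficient) are introduced here for illustration and do not appear elsewhere in the paper.

```latex
\begin{align*}
I &= x_1 + x_2, \\
y &= \alpha + \gamma I \;=\; \alpha + \gamma x_1 + \gamma x_2 ,
\end{align*}
% i.e., the unrestricted regression y = \alpha + \beta_1 x_1 + \beta_2 x_2
% with the constraint \beta_1 = \beta_2 = \gamma imposed.
% For binary indicators, I takes values in \{0, 1, 2\}:
%   (x_1, x_2) = (1, 0) and (0, 1) both give I = 1  (treated as equally strong),
%   (x_1, x_2) = (1, 1) gives I = 2                 (treated as exactly twice as strong).
```

Under this reading, the index is the special case of the regression in which every indicator is forced to carry the same weight, whereas the regression leaves the weights free at the cost of one additional parameter per indicator.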
Interpreting the effects from such a saturated regression model (Ray 2003; Achen 2005) may also be difficult, particularly if the model includes multiple interactions (Braumoeller 2004; Brambor, Clark, and Golder 2006).

A shortcoming of all three approaches is that they fail to reflect our uncertainty about how the observed concepts relate to the underlying dimensions of alliance strength and the precision with which we are able to estimate the strength of an alliance. A Bayesian latent variable model provides a framework for measuring alliance strength that uses the information contained in the many measures that researchers have already collected that are plausibly related to the strength of an alliance while also allowing researchers to make weaker assumptions about the nature of the relationships involved. Non-Bayesian methods are certainly available (e.g., Bollen 1989), but for both theoretical (see the arguments of Gill 2002 and Jackman 2009a) and practical reasons we adopt a Bayesian approach. Unlike a frequentist approach, a Bayesian latent variable approach allows us to directly measure the precision of the resulting estimates using the posterior distributions of estimated parameters.

To focus our exposition, suppose we are interested in measuring the strength of an alliance at the time of its founding and let $x^*_i$ denote the unobserved strength of alliance $i$. Our task is to use observable characteristics that are theorized to correlate with the strength of alliance $i$ to construct an estimate of $x^*_i$ that not only describes the strength of the alliance relative to other alliances but also shows how much uncertainty we have regarding our estimate of alliance strength. Suppose further that we have $k \in 1 \ldots K$ observable measures of alliance strength, and let the observed value for variable $k$ for alliance $i$ be denoted by $x_{ik}$.
Our observed measures may include continuous, binary and ordinal measures.¹⁰

Similar issues arise when using observed characteristics to measure how “democratic” a country is or how “liberal” a district or a member of the US Congress is. We observe characteristics that are related to the concept of interest, and we must use the observed characteristics and a statistical measurement model to make inferences about the latent traits. Bayesian latent variable models provide a statistical measurement model that is able to extract the latent dimensions that are assumed to be responsible for generating the association between and within the distribution of observed characteristics (see, for example, Quinn 2004; Jackman 2009b), and scholars have used related models to measure latent traits critical for studying the politics of the United States (e.g., Clinton and Lewis 2008; Levendusky and Pope 2010) and comparative politics (e.g., Rosenthal and Voeten 2007; Rosas 2009; Pemstein, Meserve and Melton 2010; Treier and Jackman 2008; Hoyland, Moene and Willumsen 2012), but scholars have only recently begun to apply the models to concepts in international relations (see, for example, Schnakenberg and Fariss (2009) and Gray and Slapin (2011)).

Figure 1 provides a graphic representation of the measurement model for the case of 3 measures. The model assumes $x^*_i$ is related to $x_{i1}$, $x_{i2}$, and $x_{i3}$ across all alliances, but the relationship may differ between variables. For example, $x_{i1}$ and $x_{i2}$ may not be identically related to $x^*_i$, and these differences are captured by $\beta_1$, $\beta_2$, $\sigma_1^2$, and $\sigma_2^2$.

Given the number of parameters to be estimated, recovering the latent measure of alliance strength ($x^*$) from the matrix of observed characteristics $x$ requires some additional structure. The structure we use is provided by a Bayesian latent variable specification (see, for example, Jackman 2009a,b). For all alliances $i \in 1 \ldots N$ we assume:

$$x_{ik} \sim N\left(\beta_{k0} + \beta_{k1} x^*_i,\ \sigma_k^2\right). \tag{1}$$

[Figure 1: Directed Acyclic Graph: Bayesian latent variable model, with latent trait $x^*_i$, observed measures $x_{i1}$, $x_{i2}$, $x_{i3}$, and parameters $\beta_1$, $\beta_2$, $\beta_3$, $\sigma_1^2$, $\sigma_2^2$, $\sigma_3^2$. Circles denote observed variables, squares denote parameters to be estimated.]

The measurement model of equation (1) assumes that the observed correlates of alliance strength $x$ are related to alliance strength in identical ways across the $N$ alliances, but different measures may be related to alliance strength in different ways. Not only may the mean value of $x_k$ and $x^*$ differ (as will be reflected in the estimate of $\beta_{k0}$), but the scale of the observed variable and the latent variable may also differ (captured by $\beta_{k1}$). $\beta_{k1} > 1$ implies that a one-unit change in the latent scale of $x^*$ corresponds to more than a one-unit change in the observed measure $x_k$, $0 < \beta_{k1} < 1$ implies that a one-unit change in the latent scale corresponds to less than a one-unit change in the observed measure, and $\beta_{k1} < 0$ implies that the orientations of the observed and unobserved measures are “flipped” (i.e., positive values of $x_k$ correspond to negative values of $x^*$). Moreover, if an included measure is unrelated to the latent trait revealed in the other included measures, the model can also account for that possibility – $\beta_{k1} = 0$ means there is no relationship between $x^*$ and $x_k$. The model also allows the relationship to be more or less precise; the $\sigma_k^2$ term allows varying amounts of error in the mapping between the observed and unobserved variable. Finally, because we estimate a version of equation (1) for each of the $K$ observed measures, we allow for the relationship to vary across observed traits, and we can use all available measures to help uncover the underlying latent trait.

Note that these assumptions are silent about causality – nothing requires that the latent trait $x^*$ causes the observed phenomena or vice versa.
All that is assumed is that, for whatever reason, there is a correlation between the observed and unobserved traits and that we can therefore use this correlation and the relationship between the observable traits to learn about the unobserved trait. For example, the Unified Democracy Scores of Pemstein, Meserve, and Melton (2010) measure “democracy” using 12 existing expert assessments even though the analyzed expert assessments certainly do not “cause” democracy – the assessments are all likely correlated with the extent to which a country is democratic because they are presumably based on observable manifestations of what the expert thinks is indicative of a democratic state. Similarly, the work of Levendusky, Pope and Jackman (2008) uses various aspects of a congressional district that are related to the underlying ideology, but which do not cause it.

Another strength of this approach is that we can use the observed data and the specification of equation (1) to recover estimates not only of the latent trait $x^*$ (sometimes called the “factor score”), but also of the extent to which the observed matrix of variables $x$ is related to the latent trait, $\beta_k$ (i.e., the coefficient matrix sometimes called the “factor loadings”). As a result, we can characterize both the latent strength of alliances as revealed in the matrix of observable characteristics, and also which of the observed characteristics are most influential for structuring the latent trait that is recovered.

Given the unknown parameters $x^*$ and $\beta$ that are to be estimated from the observed covariate matrix $x$, the likelihood function that is to be maximized is given by:

$$L(x^*, \beta) = p(x \mid [x^*, \beta]) \propto \prod_{i=1}^{N} \prod_{k=1}^{K} \phi\left(\frac{x_{ik} - (\beta_{k0} + \beta_{k1} x_i^*)}{\sigma_k}\right) \qquad (2)$$

where $\phi(\cdot)$ is the pdf of the normal distribution. To complete the specification and form the posterior distribution of the factors $x^*$ and factor loadings $\beta$, we assume the typical diffuse conjugate prior distributions.11

As specified, the model is unidentified.
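The lack of identification can be seen numerically. The following is a minimal sketch, not the estimation code used for the paper: it evaluates the log of the likelihood in equation (2) on hypothetical simulated data, then shows that shifting and stretching the latent scale while compensating in the intercepts and loadings leaves the likelihood unchanged. The function names, the toy parameter values, and the one-dimensional setup are all assumptions made for illustration.

```python
import math
import random
import statistics


def log_likelihood(x, x_star, beta0, beta1, sigma):
    """Log of the product of normal densities for x[i][k] ~ N(beta0[k] + beta1[k] * x_star[i], sigma[k]^2)."""
    total = 0.0
    for i, row in enumerate(x):
        for k, x_ik in enumerate(row):
            z = (x_ik - (beta0[k] + beta1[k] * x_star[i])) / sigma[k]
            total += -0.5 * z * z - math.log(sigma[k]) - 0.5 * math.log(2.0 * math.pi)
    return total


def standardize(x_star, beta0, beta1):
    """Rescale x_star to mean 0 and variance 1, adjusting intercepts and loadings to compensate."""
    m = statistics.fmean(x_star)
    s = statistics.pstdev(x_star)
    z = [(v - m) / s for v in x_star]
    return z, [b0 + b1 * m for b0, b1 in zip(beta0, beta1)], [b1 * s for b1 in beta1]


# Hypothetical toy data: N = 50 alliances and K = 3 observed measures.
rng = random.Random(0)
true_x_star = [rng.gauss(0.0, 1.0) for _ in range(50)]
beta0 = [0.5, -1.0, 2.0]
beta1 = [1.5, 0.4, -0.8]
sigma = [1.0, 0.5, 2.0]
x = [[rng.gauss(beta0[k] + beta1[k] * xs, sigma[k]) for k in range(3)] for xs in true_x_star]

ll_original = log_likelihood(x, true_x_star, beta0, beta1, sigma)

# Shift and stretch the latent scale, then undo the change in the intercepts and loadings.
shifted_x_star = [3.0 + 2.0 * v for v in true_x_star]
shifted_beta0 = [b0 - 1.5 * b1 for b0, b1 in zip(beta0, beta1)]
shifted_beta1 = [b1 / 2.0 for b1 in beta1]
ll_shifted = log_likelihood(x, shifted_x_star, shifted_beta0, shifted_beta1, sigma)

# Fixing the mean at 0 and the variance at 1 selects one member of this family of solutions.
z, norm_beta0, norm_beta1 = standardize(shifted_x_star, shifted_beta0, shifted_beta1)
ll_normalized = log_likelihood(x, z, norm_beta0, norm_beta1, sigma)

print(ll_original, ll_shifted, ll_normalized)
```

The three printed values agree up to floating-point error, which is why location and scale restrictions are needed before the latent scores can be interpreted.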
Because every parameter in equation (2) except for $x_{ik}$ has to be estimated, it is possible to generate an infinite number of parameter values that yield the same likelihood by appropriately adjusting $\beta_{k0}$, $\beta_{k1}$, $x_i^*$ and $\sigma_k$. As Rivers (2003) shows, in one dimension, two constraints are required to achieve local identification and fix the scale and location of the space – the orientation of the space can be fixed by constraining a factor to be positively or negatively related to the latent trait. Typically, this involves assuming that the mean of $x^*$ is 0 and the variance of $x^*$ is 1 (see, for example, Clinton, Jackman and Rivers 2004). In multiple dimensions the number of required constraints increases to $d(d + 1)$ where $d$ denotes the dimensionality of the latent space.

Given the discussion of section 1, we seek to estimate alliance strength ($x^*$) in two dimensions; let $x[1]_i^*$ denote the latent strength of the alliance in the first dimension (with estimates given by $\hat{x}[1]_i$) and let $x[2]_i^*$ denote the latent strength in the second dimension (with estimates given by $\hat{x}[2]_i$). To identify the center of the latent parameter space, we assume that the means of $x[1]^*$ and $x[2]^*$ are both 0. This assumption is innocuous and it centers the unobserved latent space. To fix the scale of the recovered space, we assume that the variances of $x[1]^*$ and $x[2]^*$ are both 1. To fix the rotation of the latent space and prevent “flipping”, we assume that higher values of the summed capacity of signatories correspond to positive values in the first dimension, and offensive alliances receive positive values in the second dimension.

We do not need to know the precise nature of the relationship between the observed characteristics and the strength of the alliance to implement the model, but we do need to identify which measures are, and are not, related to each of the two dimensions we are interested in.
For every characteristic pertaining to the written terms of the alliance we assume that $\beta[1] = 0$, and for every characteristic related to the alliance partners themselves we assume that $\beta[2] = 0$. That is, characteristics related to the signatories themselves determine only the first dimension, and characteristics of the formal agreement affect alliance strength only in the second dimension.12

To be clear, we are not assuming anything about how alliances are located within the two dimensions we recover. In fact, a question of substantive interest is how $x[1]^*$ and $x[2]^*$ are related – which is why we identify the dimensions by placing constraints on $\beta$ rather than by making assumptions about the relationship between $x[1]^*$ and $x[2]^*$. Because we identify the latent dimensions using characteristics of the alliances rather than an assumption about the relationship between the latent dimensions, our measurement model can offer important insights into the relationship between the formal terms of an alliance and the characteristics of the signatories and reveal whether stronger signatories systematically form alliances with stronger or weaker formal terms.

Given these measures and identification constraints, we use the Bayesian latent factor model that can accommodate both continuous and ordinal measures described by Quinn (2004) and implemented via MCMCpack (Martin, Quinn, and Park 2011). We use 100,000 iterations as “burn-in” to find the posterior distribution of the estimated parameters, and we use one out of every 1,000 iterations of the subsequent 1,000,000 iterations to characterize the estimates’ posterior distribution. Parameter convergence was assessed using diagnostics implemented in CODA (Plummer et al. 2006). The Appendix summarizes the results of estimating measurement models using slightly different specifications, but estimates from the different specifications correlate in excess of .95.
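The zero restrictions on the loadings and the sampling schedule can be written down compactly. The sketch below is illustrative only and does not reproduce the MCMCpack call used for estimation: `loading_pattern` and `retained_iterations` are hypothetical helper names, the measure names are a small subset of those analyzed, and the choice of which iteration within each block of 1,000 is kept is an assumption.

```python
def loading_pattern(signatory_measures, treaty_measures):
    """Map each measure to (loads_on_dim1, loads_on_dim2); False marks a loading fixed at 0."""
    pattern = {name: (True, False) for name in signatory_measures}
    pattern.update({name: (False, True) for name in treaty_measures})
    return pattern


def retained_iterations(burn_in, sampled, thin):
    """Iterations kept after discarding the burn-in and keeping one of every `thin` draws."""
    return range(burn_in + thin, burn_in + sampled + 1, thin)


pattern = loading_pattern(
    ["log(Sum. Mil. Capacity)", "Major Power Signs"],  # subset of the signatory measures
    ["Offensive Pact", "Specified Lgth"],  # subset of the treaty-term measures
)
kept = retained_iterations(burn_in=100_000, sampled=1_000_000, thin=1_000)

for name, (dim1, dim2) in pattern.items():
    print(name, "dimension 1" if dim1 else "dimension 2")
print(len(kept))  # number of posterior draws retained per parameter
```

Under this schedule 1,000 draws per parameter remain to summarize each posterior distribution.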
3 Estimates of Alliance Strength

Our Bayesian latent variable model of alliance strength produces estimates of the strength of alliances in each of the two theoretically derived dimensions and of how the various observable features are related to the dimensions that we recover. Both are of interest in assessing the validity of the resulting estimates and we validate our measures in three ways. First, we examine the estimates for alliances that are known to vary in each of the two dimensions to see if our scores reflect known variation. We focus on the entire population of alliances as well as some prominent alliances whose strength can be ascertained based on careful historical and qualitative work using “out-of-sample” information. The scores we recover provide sensible orderings of prominent alliances even though no prior information was used to identify the strength of the alliances we investigate. Second, we explore how the various correlates of alliance strength mentioned in section 1 relate to the estimates we generate. These relationships provide yet another validity check by revealing which characteristics are affecting the variation in alliance strengths that we uncover and whether the correlates are sensible given our prior beliefs. Third, in Section 4, as a check on the predictive validity of the measure, we replicate Leeds’ (2003a) work that argues that alliances with weaker treaty terms are more likely to be violated in times of war. This investigation is important because it serves as a validity check of our measure, and also because it highlights a gap in the scholarship that our measure fills – whereas existing work uses a series of variables to try to approximate the binding nature of a treaty, our measure provides a single point estimate that enables research on a number of new and existing questions.
3.1 Depicting Alliance Strength

Figure 2 plots the distribution of estimates from the measurement model described in Section 2 along the dimensions defined by signatory strength (x-axis) and formal treaty terms (y-axis). A score is estimated for each of the 587 alliances signed between 1816 and 2000 for which we have data on the observable characteristics (plotted in grey), but we focus our attention on a few selected alliances to illustrate the face validity of our estimates. (The online appendix contains the full set of estimates and standard errors.)

As Figure 2 reveals, one of the strongest alliances in terms of both signatory characteristics and treaty terms is the Allied agreement in World War II. This alliance is a joint declaration by 39 countries, including the United States, Russia, the United Kingdom, and China, to devote their full resources, military or economic, against those members of the Tripartite Pact and its adherents with which such government is at war. There are no conditions or termination dates imposed on the terms of the agreement – it is a sweeping declaration of war, offensive and defensive, by the most powerful coalition in the international system.

Figure 2: Distribution of Alliance Scores, 1815-2000. x-axis: Characteristics of Signatories; y-axis: Characteristics of Agreement. Labeled alliances: U.A.E-Yemen (1958), WWII Allies, NATO, Helsinki Final Acts (1975), Belarus-Bulgaria (1993). Points denote the posterior mean of the estimated alliance strength of each of the 587 alliances we analyze. The ellipses denote the 95% regions of highest posterior density for the selected alliances.

An alliance that is strong in terms of the signatories involved, but which has weak treaty terms is the Helsinki Final Acts signed in 1975. Thirty-five countries signed the Accords – including the United States, the USSR, the UK, and most of Europe – and together the signatories possessed the preponderance of military strength on earth at the time.
This strength is reflected in the fact that the signatories of the Helsinki Final Acts are estimated to have more combined military capacity than any other alliance in the sample. The signatories’ combined military strength is offset, however, by the fact that the obligations of the treaty are very weak. The main objective of the agreement is to set forth a bargain between respecting territorial boundaries and human rights and the terms (especially the military obligations) are non-binding. Because the agreement lacks the legal status of a formal treaty and it would not be governed by international law, scholars classify the Helsinki Final Acts as an example of soft law (Abbott and Snidal 2000). Reassuringly, our measure estimates the Helsinki Final Acts to be among the weakest alliances on the dimension of formal treaty terms.

One of the weakest alliances in both treaty terms and signatory characteristics is the Belarus-Bulgaria alliance of 1993. This alliance was a bilateral treaty that reaffirmed the nonaggression promise made in the Helsinki Final Act. In addition, the signatories pledged to refrain from using force in their international relations, to consult with one another when their security has been breached, and to remain neutral in any hostilities that may be directed at the other alliance member. The treaty states that it is to be in effect for a period of 20 years, but either side may unilaterally terminate the agreement with a one-year advance notice. As the confidence intervals of Figure 2 make clear, the strength of the formal terms is statistically indistinguishable from that of the Helsinki Final Acts, as we would hope, but the strength of the signatories is far weaker.

In contrast, the 1958 United Arab Republic (UAR) is an example of a strong formal agreement among weak signatories. The UAR was formed as an effort to unite the Arab community against the expansion of communism in Syria and elsewhere in the Arab world (Walt 1987, pp. 71-80).
It included Egypt, Syria, and Yemen, though Yemen’s inclusion was regarded as merely a cosmetic gesture (p. 72). The agreement called for the integration of the allies’ militaries and unified command over those forces. Gamal Abdel Nasser, former President of Egypt and the President of the United Arab Republic, insisted on a full union and control over both countries in exchange for his agreement to halt the rising influence of the Syrian Communist Party. Consequently, the terms of the agreement granted full military power of the signatories to a unified command and authorized a Commander-in-Chief to pursue the unified foreign policy drawn by the Union, which could extend to both defensive and offensive campaigns. Nasser responded by seizing control of Syria and banning all political parties.13

While the location of these alliances along both dimensions offers assurance of the face validity of the measure, many other important and interesting insights come to light as a result of the measure. First, we can now explore the relationship between the two dimensions of alliance strength to ask if stronger signatories form stronger alliances or not. Because we assume nothing about the relationship between the two dimensions that we estimate, we can compare how the estimates correlate to answer this important question. We find that there is a weak positive relationship – implying that as the strength of the signatories increases so too does the strength of the formal terms of the alliance – but the relationship is relatively modest (correlation of .256). Moreover, it is possible to make interesting comparisons by using variation in the two dimensions.
For example, the signatories of the Helsinki Final Acts are stronger than those of the NATO alliance (because of the addition of Warsaw Pact countries such as the USSR in the Helsinki Final Acts), but the terms of the NATO alliance are far stronger, as we might expect given the divergent preferences of the signatories of the Helsinki Final Acts.

Second, there is a great deal of heterogeneity between alliances – even between alliances belonging to the same traditional “type.” Figure 3 graphs the distribution of estimates within an ATOP alliance categorization and reveals that although offensive alliances are, on average, associated with higher estimates than defensive alliances, there is considerable variation within each grouping. It is important to highlight the fact that the variation evident in Figure 3 is variation that existing measures cannot easily characterize. This variation can be used to explore many old and new questions – e.g., why are some defensive alliances stronger than others?14

Third, a strength of our measurement model is the ability to account for the uncertainty that we have about the estimates themselves. For every alliance, we can quantify how certain we are about the estimated strength of the alliance in each of the dimensions we estimate. Moreover, the uncertainty may vary across alliances and across dimensions. For example, as Figure 2 illustrates, while we can distinguish the 1993 nonaggression alliance between Belarus and Bulgaria from the 1975 Helsinki Final Acts in terms of the strength of the signatories because the 95% regions of highest posterior density do not overlap on the first dimension, we cannot be certain that the formal terms of the two alliances are distinguishable. Because our Bayesian latent variable model recovers how precisely we are able to estimate the strength of an alliance, we can use such information to characterize the statistical confidence we have in our assessments. This is important because we have more confidence in our ability to distinguish alliances according to the strength of the signatories than we do using the terms of the agreement.

Figure 3: Distribution of Treaty Strength by Alliance Type, 1815-2000. Density plots in three panels (Offensive, Defensive, Non-Aggression). Distribution of second dimension estimates by ATOP alliance category.

3.2 Alliances in the World Wars and East Asia

To take a closer look at the estimates and further explore their face validity, we examine the alliances involved in World War I, World War II, and the post-WWII alliances in East Asia for which we have strong priors regarding their relative ranking in each dimension.15 Figure 4 plots the estimated strength of each of the relevant alliances in terms of the first dimension (left graph) and the second dimension (right graph) in temporal order.

Figure 4: Selected Alliance Scores. Panels: Dimension 1 (Signatory Strength), left, and Dimension 2 (Alliance Terms), right. Alliances plotted, from top to bottom: China-North Korea (1961), USSR-North Korea (1961), US-Japan (1960), US-Republic of China (1954), US-South Korea (1953), US-Japan (1951), USSR-China (1950), WWII Allies (1942), WWII Axis (1941), Pact of Steel (1939), WWI Axis (1915), WWI Allies (1915), Franco-Russian Alliance (1893), Triple Alliance (1882). Estimated strengths for alliances involved in World War I, World War II, and post-WWII East Asian security in terms of the first dimension (left) and second dimension (right) are plotted. The points denote the estimate in each dimension for each alliance and the lines show 95% regions of highest posterior density for the selected alliances.
Consider first the alliances that were involved in World War II and which are plotted in the middle of Figure 4. As Figure 2 revealed above, the strongest alliance in the first dimension involved the alliance formed between the Allies in 1942. Notice that the 1940 Tripartite Alliance, which was targeted by the Allied Pact, is estimated to be weaker on both dimensions than the alliance formed by the Allies during World War II. This is reassuring given that it is a multilateral defensive pact signed during World War II between countries whose combined capabilities are not as great as those of the Allied powers. Additionally, the terms of the defensive obligation are conditional upon one of the signatories being attacked by a party not involved in World War II at the time the alliance was signed. However, the antecedent for this defensive pact was the 1939 Pact of Steel between Germany and Italy. As an unconditional pledge to undertake shared offensive and defensive military campaigns, it is on par with the Allied Pact in the strength of its agreement terms.

The 1940 Tripartite Pact was replaced by a more aggressive agreement, which, like the Allied Pact, is also a wartime alliance containing similar terms. In Figure 4, it is in approximately the same position in the second dimension as the Allied Pact, indicating the similarity in the strength of the terms of the agreements between the opposing World War II alliances. Like the Allied Pact, the three signatories to the Tripartite Alliance pledged to use all means, offensive and defensive, to pursue the war. The strength score for the terms of the Allied Pact is likely slightly higher because the Tripartite Pact specifies a termination date (a stipulation initiated in the Pact of Steel and passed down to the successor alliances). Even though the terms of the formal agreements are at parity, the measure of the combined strength of the Allies according to the members’ characteristics is greater than that of the Tripartite Pact.
The estimated strength of the alliances involved in World War I reported in the bottom of Figure 4 also comports with prior expectations. There is clear parity in the prewar alliances. The 1882 Triple Alliance between Germany, Italy, and Austria-Hungary was similar to the 1893 Franco-Russian alliance both on alliance terms and signatory strength. The motivation for the Franco-Russian alliance may have been, as Snyder (1997) suggests, a desire by France and Russia to gain parity of strength with the growing relative strength of the Triple Alliance. The terms of the wartime treaty signed by France, Russia, the United Kingdom, and Italy are similar to the opposing declaration agreed to by Germany, Austria-Hungary, and Bulgaria. However, the addition of the United Kingdom and Italy to the alliance with France and Russia shifted the signatory strength significantly in favor of the Allies. The similarities of the opposing alliance systems in WWI and WWII are noteworthy given recent evidence that opposing continental alliances are especially prone to balance each other (Levy and Thompson 2010).

The estimates of post-WWII deterrent alliances in Asia also meet our expectations. The 1950 USSR-China alliance, the 1951 US-Japan alliance, the 1961 USSR-North Korea alliance, and the 1961 China-North Korea alliance all obligate alliance partners to defend each other if a fellow ally is attacked. By comparison, the other three East Asian alliances in Figure 4 – 1953 US-South Korea, 1954 US-Republic of China, 1960 US-Japan – all contain provisions that enable alliance members to escape their defensive obligations if there is war.16 In our measure, the former alliances are all estimated to have stronger treaty terms than the latter three. However, although these differences comport with our prior expectations, they are statistically indistinguishable.

Another feature of interest in Figure 4 is the parity in signatory strength between rival alliances.
Comparing the 1950 USSR-China alliance to the 1951 US-Japan alliance, for example, we see that the signatory strength of each alliance is approximately the same. The US-South Korea alliance was signed after the Korean War to deter North Korea, China, and the USSR. The signatory strength of the US-South Korea alliance is within the confidence interval of the USSR-China alliance. The 1954 US-ROC alliance was signed during the first Taiwan Strait Crisis. It was also designed to deter China and the USSR while also restraining Chiang Kai-shek. Its relatively weaker treaty terms reflect the motivation to restrain an alliance partner, and its signatory strength is on par with that of the USSR-China alliance.

Moreover, the parity of the East Asian alliances remained consistent even as the schism between China and the Soviet Union grew during the early 1960s. During this period, China and North Korea formed a separate alliance, as did the USSR and North Korea. The US and Japan renewed and revised the terms of their alliance. Even though the signatory strength of all the alliances decreased, the opposing alliances remained approximately similar to each other both in terms of signatory strength and treaty terms. The 1960 US-Japan alliance and the 1953 US-South Korea alliance are roughly at parity with the 1950 USSR-China alliance, the 1961 USSR-North Korea alliance, and the 1961 China-North Korea alliance. These empirical characterizations are consistent with claims that some alliances may be designed to balance threats (Morgenthau 1948; Waltz 1979).

An advantage of the Bayesian approach is that we can compute the posterior distribution of any statistic of interest. For example, we are 94% certain that the 1951 US-Japan alliance is a stronger alliance in terms of treaty terms than the 1954 US-ROC alliance, and we can be 92% certain that the estimated strength of the 1951 US-Japan alliance is greater than that of the 1960 US-Japan alliance.
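Statements of this kind follow directly from the retained posterior draws. The sketch below shows one way such quantities could be computed; it is not the authors' code, the draws are hypothetical simulated values rather than estimates from the paper, and `prob_greater` and `hpd_interval` are illustrative names. The interval function assumes a unimodal posterior, for which the shortest interval covering the requested mass approximates the region of highest posterior density.

```python
import math
import random


def prob_greater(draws_a, draws_b):
    """Share of paired posterior draws in which alliance A's score exceeds alliance B's."""
    if len(draws_a) != len(draws_b):
        raise ValueError("draws must be paired, one pair per retained MCMC iteration")
    return sum(a > b for a, b in zip(draws_a, draws_b)) / len(draws_a)


def hpd_interval(draws, mass=0.95):
    """Shortest interval containing at least `mass` of the draws."""
    ordered = sorted(draws)
    n = len(ordered)
    width = min(n, max(1, math.ceil(mass * n)))  # number of draws inside the interval
    start = min(range(n - width + 1), key=lambda j: ordered[j + width - 1] - ordered[j])
    return ordered[start], ordered[start + width - 1]


# Hypothetical stand-ins for 1,000 retained draws of two alliances' treaty-term scores.
rng = random.Random(1)
alliance_a = [rng.gauss(1.2, 0.4) for _ in range(1000)]
alliance_b = [rng.gauss(0.6, 0.4) for _ in range(1000)]

print(prob_greater(alliance_a, alliance_b))  # posterior probability that A is stronger than B
print(hpd_interval(alliance_a))  # approximate 95% HPD interval for A
```

Pairing the draws by iteration matters: it compares the two alliances within the same draw from the joint posterior rather than treating their scores as independent.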
3.3 Components of Alliance Strength

To further validate our measure it is instructive to compare how the various measures relate to the estimate in each of the two dimensions we recover. This examination also highlights two advantages of the latent variable approach we take: 1) our measure of alliance strength reflects the characteristics of multiple measures without having to take a position on exactly what the relationship is, and 2) because our measure does not necessarily depend on a single proxy variable (e.g., “offensive alliances”), we can recover variation between alliances that possess the same value for any particular measure.

Figure 5 reports the relationship of variables that are assumed to potentially structure the first dimension – the dimension we interpret as designating the strength of the alliance signatories – and Figure 6 reports the relationship for variables that may potentially affect the second dimension – the dimension we interpret as reflecting the strength of the commitments required by the treaty itself. Note that while we defined the dimensions by assuming that each variable affects either the first or second dimension, we assumed nothing about how each variable structures these recovered dimensions. Our measurement model imposes no constraints on the parameters graphed in Figures 5 and 6.

A strength of a Bayesian latent variable model is that we can also assess the precision of these estimated relationships. For example, while we can be confident that the log of the summed military capacity of the involved signatories is positively related to the latent dimension we recover in the first dimension, there is no obvious relationship between the first dimension and the average Polity IV score of alliance signatories (Avg. Polity IV). Substantively, Figure 5 reveals that the logged total military capacity of the signatories is the strongest determinant of the first-dimension estimate and that the estimate is also increasing in the number of signatories (logged) and whether a major power signs.

Figure 5: Dimension 1 Factor Loadings: “Signatory Strength”. Variables plotted: log(Sum. Mil. Capacity), Major Power Signs, log(Ally Count), Avg. Distance btwn Signatories, Avg. Polity IV, Avg. SLGO. Circles denote posterior mean, and lines denote 95% HPD regions.

More novel are the correlates of the second dimension because, unlike the case of the first dimension where scholars already have decent measures (e.g., the summed military capacity), the options available to scholars interested in assessing the strength of a treaty are far more crude. Figure 6 reveals the aspects of a treaty that affect the estimated strength of an alliance according to our measurement model. We find that, as expected, unconditional alliances are estimated to be stronger, as are deterministic alliances, offensive alliances, and those that establish military bases, integrate the command of military forces, and provide for military exchanges. Non-Aggression Pacts are sensibly estimated to be the weakest, as are alliances with specified termination dates, and provisions for conflict within the alliance itself.

Figure 6: Dimension 2 Factor Loadings: “Formal Terms”. Variables plotted: Unconditional, Deterministic, Offensive Pact, Compellent, Defensive Pact, Asymmetric Oblig., Military Bases, Integrated Command, Conditional, Secrecy, Military Aid, Economic Aid, Arms Reduction, Renounce, Consultation Pact, Organization, Neutrality Pact, Formal Treaty, Specified Lgth, Conflict w/in, Non-Aggression Pact. Circles denote posterior mean, and lines denote 95% HPD regions.

The fact that the relationship between our estimates and these observed measures appears sensible increases our confidence in the validity of the estimates we recover, especially given the fact that we impose no relationship a priori. The variation our measurement model recovers is variation that exists in the data itself.
4 Alliance Strength and Reliability

In spite of the face validity of our measure as demonstrated by comparisons of individual historical alliances, we seek to take the additional step of applying it to the estimation of a model in which we might expect treaty strength to be a relevant measure. Consider, for example, the question of alliance reliability. Should we expect the strength of the terms of an alliance treaty to be related to a government’s decision to break its alliance commitment when its ally becomes involved in war? According to Leeds (2003a), the content of a treaty is critical, because it signals the costs signatories pay to form and break the alliance. The risk of violating an alliance decreases with the inclusion of provisions that strengthen an alliance by institutionalizing these costs. Such provisions include factors included in our measure of treaty strength, e.g., integrated military command, basing, the conditions of conflict under which the alliance comes into effect, the actions that the members are required to take in the event of war, the aid to be provided, and conditions limiting the provision of assistance and intervention.

The inclusion or exclusion of such provisions should predict whether an alliance is likely to be violated, but previous efforts have been forced to analyze this relationship only indirectly because, prior to now, no measure of the strength of treaty content has existed (Leeds 2003a; Sabrosky 1980; Siverson and King 1980; Smith 1996). For example, Leeds (2003a) focuses on major powers, democratic signatories, and the change in these factors after alliance formation because of the expectation that these factors might affect the content of an alliance when it is formed. A measure of treaty strength like the one we provide allows us to analyze both the determinants of treaty strength and the effect of the treaty contents on the likelihood that signatories deliver on their promises when required to do so in war.
To demonstrate the value of our measure, we evaluate both Leeds’ (2003a) determinants of treaty strength as well as the e↵ect of the strength of the treaty on the likelihood that signatories deliver their promises when required to do so in war. Let us first examine the first half of the analysis. Based on Leeds’ argument, we should expect democracies to be more likely to be more likely to form stronger alliances and major powers to be less likely to form strong alliances. The democracies and major powers examined in Leeds’ data are those whose treaty obligations have been activated by war. Thus, in replicating the data and model using our measure of treaty strength, we may restate the expectations as follows: (1) 29 democracies who might expect to be called upon to support an ally in war are more likely to form alliances with stronger treaty terms, and (2) major powers who might expect to be called upon to support an ally in war are more likely to form alliances with weaker treaty terms. Additionally, Leeds also analyzes the e↵ects of changes in signatories’ capabilities and domestic political institutions. These factors may be indirect determinants of reliability, because how an alliance is designed at the time it is formed may depend upon whether signatories expect their own or their alliance partners’ capabilities and domestic political institutions to change after the alliance is formed. Hence, to determine whether our measure of treaty strength is useful for investigating questions such as this, we can estimate whether Leeds’ four factors are related to our measure of treaty strength. Table 1: Correlates of treaty strength, 1816-1944. 
                                            (1) Benson Only    (2) ATOP           (3) Both
Democracy                                   0.6061* (0.235)    0.4415* (0.196)    0.5812* (0.287)
Major Power                                 0.4158+ (0.229)    0.2069 (0.239)     0.2865 (0.287)
Change in Capabilities                      0.6230* (0.249)    0.5456* (0.263)    0.8285** (0.257)
Change in Domestic Political Institutions   0.1173 (0.299)     0.0243 (0.353)     0.3835 (0.264)
N                                           136                136                136
R2                                          0.1254             0.0795             0.1430
Standard errors in parentheses. + p<0.10, * p<0.05, ** p<0.01

Table 1 shows the estimation of treaty strength using the four factors in Leeds (2003a). We report the results of three models using three different measures of treaty strength to indicate the robustness of the measure to the inclusion of different factors that might be indicative of the strength of a treaty. The measure of alliance strength in Model 1 excludes the ATOP classifications of offensive and defensive alliances and instead uses Benson’s (2011) measures of treaty categories based on whether the alliance is deterrent, compellent or probabilistic in its commitments. In contrast, Model 2 excludes these measures and relies only on the ATOP categories. The measure of treaty strength analyzed using Model 3 includes both sets of measures. The appendix compares the measures in greater detail but, as is shown below, the choice of measure does not greatly affect the substantive conclusions. The results of Table 1 largely support Leeds’ (2003a) argument regarding the determinants of treaty strength, and they do not depend on the particular measure of treaty strength that is used. Democracies tend to favor forming stronger treaties, and signatories that anticipate a shift in one of the alliance members’ future capabilities tend to favor weaker treaty terms. The exceptions to Leeds’ expectations are the effect of major powers and the anticipation of shifts in domestic political institutions after signing an alliance. However, major power is, as Leeds argues, negative in all three models, and it is significant at the p<0.074 level in Model 1.
The consistency of these results with Leeds’ argument and expectations gives us further confidence in the predictive validity of our measure of formal treaty strength, a measure that we have previously lacked.

Table 2: Treaty strength and alliance commitment violation in war, 1816-1944.

                                  (1) Benson Only    (2) ATOP          (3) Both
Treaty Strength                   0.4509+ (0.241)    0.3700 (0.196)    0.6441** (0.198)
Ally is Original Target in War    1.320* (0.239)     1.286* (0.577)    1.398* (0.596)
N                                 136                136               136
Chi2                              7.06               6.31              15.12
Standard errors in parentheses. + p<0.10, * p<0.05, ** p<0.01

Our second check applies the measure of treaty strength to the likelihood that governments renege on their alliance commitments. Leeds argues that stronger treaties are costlier to break, and we should therefore expect our measure of treaty strength to be negatively correlated with the likelihood of violation. Table 2 presents the results of a replication of the logit estimation found in Leeds (2003a) after replacing her four indirect measures of treaty strength with our direct measure and including her variable for whether the ally is the original target in a war (which Leeds argues is important for controlling for the selection effect related to challengers’ propensity for targeting governments that they believe have unreliable allies). The results of Table 2 are consistent with the claim that weaker treaty terms are associated with a higher likelihood of violating alliance commitments. Our measure of treaty strength is negative in all three models, and it is significant at the p<0.061 level using the measure based on Benson’s (2011) categories in Model 1 and at the p<0.001 level using the measure that includes both the Benson (2011) and ATOP factors. Leeds’ variable indicating whether the ally is the original target in war is also positive and significant, as expected.
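As a rough check on how significance levels such as those quoted for Tables 1 and 2 follow from a coefficient and its standard error, the sketch below computes a two-sided p-value under a standard normal approximation. The function name `two_sided_p` and the dictionary `reported` are hypothetical, the inputs are the magnitudes printed in the tables, and the choice of a normal reference distribution is an assumption; the results should land near the reported levels, with small differences expected from rounding and from the reference distribution actually used.

```python
import math


def two_sided_p(coef, se):
    """Two-sided p-value for coef/se under a standard normal approximation.

    This is an illustrative assumption, not necessarily the reference
    distribution used to produce the levels reported in the text.
    """
    z = abs(coef) / se
    # For a standard normal Z, P(|Z| > z) = erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2.0))


# Magnitudes of (coefficient, standard error) as printed in Tables 1 and 2.
reported = {
    "Table 1, Model 1, Major Power": (0.4158, 0.229),
    "Table 2, Model 1, Treaty Strength": (0.4509, 0.241),
    "Table 2, Model 3, Treaty Strength": (0.6441, 0.198),
}

for label, (coef, se) in reported.items():
    print(f"{label}: z = {coef / se:.3f}, p = {two_sided_p(coef, se):.4f}")
```

Because the p-value depends only on the absolute ratio of the coefficient to its standard error, the calculation is the same whatever the sign of the estimate.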
These results further solidify our confidence in the validity and usefulness of the measure: it is related to factors that we would expect it to be associated with, it is relatively robust to the inclusion of different factors for specifying the measure, and it fills a gap in research on important questions about alliances and conflict, such as whether alliance commitments are reliable.

5 Conclusion, Caveats, and Implications

Scholars of alliances have been blessed with a tremendous amount of data due to the generous and impressive efforts of those who have collected data on both the formal terms of the various alliances and the characteristics of the signatories involved. The amount of available data prompts the question: how can we best measure the strength of international alliances given the wealth of available data and our theoretical conceptions of alliance strength while also accounting for the ambiguity that inevitably accompanies such a determination? We show how a Bayesian latent trait model, a model that has been applied to many other important concepts in political science, can be used to integrate observable measures with theoretical arguments about the determinants of alliance strength to provide estimates of: how the various observable factors relate to the theoretically implied dimensions, how the strengths of alliances in the two dimensions relate to one another, and how certain we are about all of the estimated parameters. Applying the model to all alliances in the international system between 1816 and 2000 provides estimates of alliance strength in terms of the strength of signatories and the strength of the formal terms of the alliance agreement. In total, we estimate the strength of 582 alliances, including those that are multilateral and lack a specific target.
The estimates have strong face validity: familiar alliances are located as we would expect, and exploring the post-World War II alliances of East Asia provides reassuringly reasonable estimates. Not only do our estimates provide scholars with measures that can be used to explore questions such as the determinants of alliance formation, the persistence of alliances, and their relationship with conflict, but the estimates also reveal important insights into the nature of alliances. For example, although there is a relationship between the strength of signatories and the strength of the formal terms of an alliance, the relationship is modest (0.256), and there is considerable variation of potential interest in the relationship. Moreover, when comparing the estimates of temporally and geographically proximate alliances, the estimates suggest that the alliances that are formed are nearly balanced. Perhaps reflecting the theoretical claims of balance of power theorists (Morgenthau 1948, Waltz 1979), alliances formed in response to one another are very similar in strength. In addition, our ability to distinguish between alliances on the basis of the strength of signatories is better than our ability to do so using the formal terms of the alliance; we have more uncertainty about the estimates of the latter than we do about the former. Relatedly, because we can assess the certainty with which we estimate the location of alliances, we can account for the precision of our estimates when comparing alliances or using the estimates in secondary analyses. Finally, although we think our estimates are theoretically sound and possess strong face validity, the measurement model we employ is sufficiently general that scholars can use the model to generate their own estimates using different assumptions or different measures.
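One way to account for estimation uncertainty when comparing two alliances is to work directly with posterior draws rather than point estimates. The following is a minimal sketch under stated assumptions: the draws are synthetic stand-ins generated from normal distributions (real draws would come from the fitted measurement model), and the names `credible_interval`, `prob_stronger`, `alliance_a`, and `alliance_b` are hypothetical.

```python
import random
import statistics


def credible_interval(draws, level=0.95):
    """Equal-tailed interval taken from the sorted posterior draws."""
    ordered = sorted(draws)
    n = len(ordered)
    lower = ordered[int((1 - level) / 2 * (n - 1))]
    upper = ordered[int((1 + level) / 2 * (n - 1))]
    return lower, upper


def prob_stronger(draws_a, draws_b):
    """Share of paired draws in which alliance A is stronger than B."""
    return sum(a > b for a, b in zip(draws_a, draws_b)) / len(draws_a)


# Synthetic draws for two hypothetical alliances (illustration only).
rng = random.Random(0)
alliance_a = [rng.gauss(0.40, 0.30) for _ in range(5000)]
alliance_b = [rng.gauss(0.55, 0.35) for _ in range(5000)]

low_a, high_a = credible_interval(alliance_a)
mean_b = statistics.mean(alliance_b)

print(f"A: 95% interval = ({low_a:.2f}, {high_a:.2f})")
print(f"B: posterior mean = {mean_b:.2f}")
print(f"B's mean falls inside A's interval: {low_a <= mean_b <= high_a}")
print(f"P(A stronger than B) = {prob_stronger(alliance_a, alliance_b):.2f}")
```

Comparing draws pair by pair keeps the full posterior uncertainty in the comparison, whereas checking whether one point estimate falls inside another alliance's interval is a coarser summary of the same information.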
We think our measure and the method provide an important advance for scholars interested in alliances and the international system, but some caveats are worth noting. First, while our statistical model provides a principled way of extracting information from multiple measures related to theoretically relevant dimensions with a minimal number of assumptions, the resulting estimates still depend on the relationships among the observable measures used to extract the latent dimensions. For example, the fact that CINC scores are relative to the global capacity in each year means that comparisons across long periods of time may be difficult, and the estimates are likely most appropriate for temporally bounded comparisons or when temporally related differences are accounted for (similar concerns can emerge whenever these scores are used in any regression that explores variation over time). Second, the ability to use a Bayesian latent variable model to extract the latent dimensions structuring the strength of an alliance does not ameliorate potential concerns about the endogeneity of alliances. While we have been careful not to include variables in our measure of alliance strength that scholars may seek to correlate with the strength of an alliance in the hopes of better understanding alliance formation, nothing in our analysis discounts the fact that the formation of an alliance is presumably a strategic act based on the assessment of expected consequences, and those interested in using the estimates should be careful of the potential pitfalls. While no measure is perfect, our model is able to use theoretical insights to: identify how features are related to the strength of an alliance in theoretically relevant dimensions, measure the strength of alliances formed between 1816 and 2000 along dimensions defined by the strength of the signatories involved and the strength of the formal terms of the agreement, and quantify how certain we are about the resulting estimates.
Moreover, it also provides a principled way to integrate alternative measures and assumptions into the measurement task. Our measurement model and the estimates we produce will hopefully allow scholars to focus on better understanding the determinants and consequences of alliances rather than continually having to grapple with the question of how best to measure the relative strength of international alliances.

Notes

1 This is the time period covered by the databases collected by Leeds et al. (2002), Gibler and Sarkees (2004), Benson (2011) and integrated by Bennett and Stam (2000a) in EUGene v3.204.

2 See Powell 1996 and Levy 1989 for a review of the debate over the relationship between alliances, distribution of power, and stability.

3 For an overview of the progression of this research, see Leeds 2003b, Johnson and Leeds 2011, Benson 2011, Benson et al 2013a.

4 This notion goes back at least to Schelling 1966, who claimed that weak and flexible commitments diminished opponents’ perception of the expected military strength of the commitment. The idea is echoed in Snyder 1984, Snyder 1997, Fearon 1997, Zagare and Kilgour 2003, Benson 2012, and Benson et al 2013b.

5 Scholars have long recognized the need for adjusting aggregate military capacity for distance. Most existing research degrades strength linearly (see for example Bueno de Mesquita and Lalman 1986 and Smith 1996). However, research has not established the actual mathematical relationship between distance and capabilities, and we also do not know if the rate of degradation is sensitive to the technological sophistication and geography of a country. Consequently, it is not clear how distance should affect alliance strength.

6 That said, scholars could certainly use our framework to estimate a more limited measure for alliances with targets.
7 When the variables were coded to contain multiple categories, we often collapsed the categories to a binary measure to denote the presence or absence of each feature because the ordering of categories was unclear. For example, the coding of MILAID is “if there are no provisions regarding military aid, the variable is coded 0. If the agreement provides for general or unspecified military assistance, the variable is coded 1. If the agreement provides for grants or loans, the variable is coded 2. If the agreement provides for military training and/or provision or transfer of technology, the variable is coded 3. If the agreement provides for both grants and/or loans and training and/or technology, the variable is coded 4” (p. 27). It is unclear whether terms that denote specific loans or grants (MILAID=2) are stronger than terms that provide for unspecified military assistance (MILAID=1), or half as strong as terms that include both grants and/or loans and training and/or technology (MILAID=4). As a consequence, we use whether there are any provisions for military aid of any kind (i.e., if MILAID ≥ 1).

8 Among other provisions, the “Pact of Steel” requires that: “The contracting parties will remain in constant touch with one another in order to reach agreement upon all questions affecting their common interests or the European situation as a whole” (Art.1), “In the event of the common interests of the contracting parties being injured by international events of any kind whatever, they will immediately consult as to the measures to be taken for the protection of those interests” (Art.2), “If contrary to the wishes and the hopes of the contracting parties it should occur that one of them becomes involved in warlike complications with another power or powers, the other contracting party will at once assist it as an ally and will support it with all its military forces on land, sea and in the air” (Art.3), and “The parties are conscious of the importance of their common relations to the powers friendly to them. They are resolved to maintain these relations in the future and to conduct them in common in accordance with the identical interests which bind them to these powers” (Art.6). The terms of the alliance between Spain and the Netherlands note that: “This alliance is purely defensive and its object is to protect the commerce of the two parties” (Art.I), “This alliance exists until the Regents of Algeria, Tunis and Tripoli do not renounce their offensive system with respect to the properties of the subjects of the two parties” (Art.II), and “If one of those pirate states causes an offense to the parties, the allies consuls and representatives shall demand reparation from the government of the offender through legal means; if the offenders government does not abide by law, the allied powers will agree on proceeding to take reprisals corresponding to the quantity that the offender has taken” (Art.III).
9 We can treat $y = \alpha + \beta_1 x_1 + \beta_2 x_2$ as accounting for the regression of $y$ on the unobserved alliance strength $x^*$ given the true specification $y = \alpha_0 + \beta_0 x^*$ if we can assume that $x^*$ is a linear function of $x_1$ and $x_2$. If, for example, $x^* = \gamma_1 x_1 + \gamma_2 x_2$ and $y = \alpha_0 + \beta_0 x^*$, the regression of $y = \alpha + \beta_1 x_1 + \beta_2 x_2$ is equivalent to the regression of $y = \alpha_0 + \beta_0 (\gamma_1 x_1 + \gamma_2 x_2)$ because $\alpha = \alpha_0$, $\beta_1 = \beta_0 \times \gamma_1$, and $\beta_2 = \beta_0 \times \gamma_2$, even though we do not observe $x^*$! Note, however, that this decomposition relies on the extremely strong (and implausible) assumption that $x^* = \gamma_1 x_1 + \gamma_2 x_2$. This requires not only that the latent trait be a function of observables that are correctly specified in the regression specification, but also that the relationship between $x^*$ and the observables is without error. If there is error in this relationship, say $x^* = \gamma_1 x_1 + \gamma_2 x_2 + \epsilon$, then we are in a classic error-in-variables situation and the estimated regression coefficients are inconsistent (see, for example, the nice review by Hausman 2001).

10 Technically, if variable $k$ is a discrete variable, the observed value for alliance $i$ in variable $k$ is the category $c$, which is generated according to $x_{ik} = c$ if $x^*_{ik} \in (\tau_{k(c-1)}, \tau_{kc}]$, where $\tau_k$ is the vector of cut points for the $C$ categories in variable $k$ (Quinn 2004).

11 Specifically, the prior distribution of $\beta_k$ conditional on $\sigma^2_k$ is normally distributed and the prior distribution for $\sigma^2_k$ is an inverse-Gamma distribution (Jackman 2009a).

12 Because we impose theoretically derived parameter constraints to identify the dimensions being estimated, our estimator is similar to “confirmatory” factor analysis, where theoretical insights are used to define the dimensions of interest a priori. “Exploratory” factor analysis places fewer constraints on the measurement model and lets (possibly spurious) relationships present within the data define the recovered dimensions.
13 Even though the alliance was weak in terms of the signatory characteristics, it was powerful enough to satisfy the primary objective of the Union, which was to crack down on the Syrian Communists. The larger hope, which was explicitly expressed in the alliance, was that other Arab states would also join to enhance the combined influence of the Arab community. Instead, Jordan and Iraq felt threatened by the UAR and formed the short-lived Iraq-Jordan Federal Union. Walt (1987) argues that the Federal Union was designed to balance against the UAR. At least in terms of our measure of overall alliance strength, the UAR and Federal Union appear to be matched. The Federal Union lies within the confidence interval of the UAR, implying that the strength of the Federal Union is statistically indistinguishable from the UAR.

14 Benson 2011 and 2012, for example, disaggregates deterrent types of alliances and studies why differences exist among these alliances.

15 To be clear, although we use a Bayesian framework for theoretical and practical reasons, the priors we use are diffuse and contain no information about the relationships we examine. Our posteriors are almost entirely driven by the likelihood function.

16 See Benson 2012 for a case study analysis of these East Asian alliances.

References

Abbott, Kenneth W. and Duncan Snidal. 2000. “Hard and Soft Law in International Governance.” International Organization 54(3):421–456.
Achen, Christopher H. 2005. “Let’s Put Garbage-Can Regressions and Garbage-Can Probits Where They Belong.” Conflict Management and Peace Science 22(4):327–339.
Bennett, D. Scott and Allan C. Stam. 2000a. “EUGene: A Conceptual Manual.” International Interactions 26(2):179–204.
Bennett, D. Scott and Allan C. Stam. 2000b. “A Universal Test of an Expected Utility Theory of War.” International Studies Quarterly 44(3):451–480.
Benson, Brett V. 2011.
“Unpacking Alliances: Deterrent and Compellent Alliances and Their Relationship with Conflict, 1816–2000.” The Journal of Politics 73(4):1111–1127.
Benson, Brett V. 2012. Constructing International Security: Alliances, Deterrence, and Moral Hazard. New York, NY: Cambridge University Press.
Benson, Brett V., Adam Meirowitz and Kris Ramsay. 2013. “Inducing Deterrence Through Moral Hazard in Alliance Contracts.” Journal of Conflict Resolution.
Benson, Brett V., Patrick Bentley and Jim Ray. 2013. “Ally Provocateur: Why Allies Do Not Always Behave.” Journal of Peace Research 50(1):47–58.
Boulding, Kenneth Ewart. 1962. Conflict and Defense: A General Theory. Harper.
Budge, Ian, Hans-Dieter Klingemann, Andrea Volkens, Judith Bara, Eric Tanenbaum, Richard C. Fording, Derek J. Hearl, Hee Min Kim, Michael McDonald and Silvia Mendez. 2001. Mapping Policy Preferences. Estimates for Parties, Electors, and Governments 1945-1998. Oxford: Oxford University Press.
Bueno de Mesquita, Bruce. 1983. The War Trap. New Haven, CT: Yale University Press.
Bueno de Mesquita, Bruce and David Lalman. 1986. “Reason and War.” The American Political Science Review pp. 1113–1129.
Clinton, Joshua D. and David E. Lewis. 2008. “Expert Opinion, Agency Characteristics, and Agency Preferences.” Political Analysis 16(1):3–20.
Clinton, Joshua D., Simon Jackman and Doug Rivers. 2004. “The Statistical Analysis of Roll Call Data.” American Political Science Review 98(2):355–370.
Fearon, James D. 1997. “Signaling Foreign Policy Interests: Tying Hands versus Sinking Costs.” The Journal of Conflict Resolution 41(1):68–90.
Gartzke, Erik and Kristian Skrede Gleditsch. 2004. “Why Democracies May Actually Be Less Reliable Allies.” American Journal of Political Science 48(4):775–795.
Gibler, Douglas M. and John A. Vasquez. 1998. “Uncovering the Dangerous Alliances, 1495-1980.” International Studies Quarterly 42(4):785–807.
Gibler, Douglas M. and Meredith Reid Sarkees. 2004.
“Measuring Alliances: The Correlates of War Formal Interstate Alliance Dataset, 1816-2000.” Journal of Peace Research 41(2):211–222.
Gibler, Douglas M. and Scott Wolford. 2006. “Alliances, Then Democracy: An Examination of the Relationship between Regime Type and Alliance Formation.” The Journal of Conflict Resolution 50(1):129–153.
Gill, Jeff. 2002. Bayesian Methods: A Social and Behavioral Sciences Approach. Boca Raton, FL: Chapman & Hall/CRC.
Gray, Julia and Jonathan Slapin. 2011. “How Effective are Preferential Trade Agreements? Ask the Experts.” The Review of International Organizations pp. 1–25.
Hausman, Jerry. 2001. “Mismeasured Variables in Econometric Analysis: Problems from the Right and Problems from the Left.” Journal of Economic Perspectives 15(4):57–67.
Hoyland, Bjorn, Karl Moene and Fredrik Willumsen. 2012. “The Tyranny of International Index Rankings.” Journal of Development Economics 97(1):1–14.
Jackman, Simon. 2009a. Bayesian Analysis for the Social Sciences. New York, NY: Wiley & Sons.
Jackman, Simon. 2009b. Measurement. In The Oxford Handbook of Political Methodology, ed. Henry E. Brady, Janet M. Box-Steffensmeier and David Collier. Oxford: Oxford University Press, chapter 6.
Lai, Brian and Dan Reiter. 2000. “Democracy, Political Similarity, and International Alliances, 1816-1992.” The Journal of Conflict Resolution 44(2):203–227.
Leeds, Brett Ashley. 2003a. “Alliance Reliability in Times of War: Explaining State Decisions to Violate Treaties.” International Organization 57(4):801–827.
Leeds, Brett Ashley. 2003b. “Do Alliances Deter Aggression? The Influence of Military Alliances on the Initiation of Militarized Interstate Disputes.” American Journal of Political Science 47(3):427–439.
Leeds, Brett, Jeffrey Ritter, Sara Mitchell and Andrew Long. 2002. “Alliance Treaty Obligations and Provisions, 1815-1944.” International Interactions 28(3):237–260.
Levendusky, Matthew S. and Jeremy C. Pope. 2010.
“Measuring Aggregate-Level Ideological Heterogeneity.” Legislative Studies Quarterly 35(2):259–282.
Levendusky, Matthew S., Jeremy C. Pope and Simon D. Jackman. 2008. “Measuring District-Level Partisanship with Implications for the Analysis of U.S. Elections.” American Journal of Political Science 70(3):736–753.
Levy, Jack. 1989. The Causes of War: A Review of Theories and Evidence. In Behavior, Society, and Nuclear War, ed. Philip Tetlock et al. Vol. 1. New York, NY: Oxford University Press.
Levy, Jack S. 1981. “Alliance Formation and War Behavior: An Analysis of the Great Powers, 1495-1975.” The Journal of Conflict Resolution 25(4):581–613.
Levy, Jack S. and William R. Thompson. 2010. “Balancing on Land and at Sea: Do States Ally Against the Leading Power?” International Security 35(1):7–43.
Marshall, Monty G., Keith Jaggers and Ted Robert Gurr. 2002. Polity IV Project Political Regime Characteristics and Transitions, 1800-2002. College Park, Maryland: Center for International Development and Conflict Management, University of Maryland.
Martin, Andrew D., Kevin M. Quinn and Jong Hee Park. 2011. “MCMCpack: Markov Chain Monte Carlo in R.” Journal of Statistical Software 42(9):1–21.
Mattes, Michaela. 2012. “Democratic Reliability, Precommitment of Successor Governments, and the Choice of Alliance Commitment.” International Organization 66(1):153–172.
Mearsheimer, John. 1990. “Back to the Future: Instability in Europe After the Cold War.” International Security 15(1):5–56.
Morgenthau, Hans J. 1948. Politics Among Nations: The Struggle for Power and Peace. New York: Knopf.
Morrow, James D. 1991. “Alliances and Asymmetry: An Alternative to the Capability Aggregation Model of Alliances.” American Journal of Political Science pp. 904–933.
Morrow, James D. 1994. “Alliances, Credibility, and Peacetime Costs.” Journal of Conflict Resolution 38(2):270–297.
Organski, Abramo F. K. and Jacek Kugler. 1980. The War Ledger. Chicago, IL: University of Chicago Press.
Pemstein, Daniel, Stephen A. Meserve and James Melton. 2010. “Democratic Compromise: A Latent Variable Analysis of Ten Measures of Regime Type.” Political Analysis 18(4):426–449.
Plummer, Martyn, Nicky Best, Kate Cowles and Karen Vines. 2006. “CODA: Convergence Diagnosis and Output Analysis for MCMC.” R News 6(1):7–11.
Poast, Paul, Alexander Von-Hagen Jamar and James D. Morrow. 2012. “Does Capability Aggregation Explain Alliance Formation?” Working Paper.
Poole, Keith T. and Howard Rosenthal. 1997. Congress: A Political-Economic History of Roll Call Voting. New York: Oxford University Press.
Quinn, Kevin M. 2004. “Bayesian Factor Analysis for Mixed Ordinal and Continuous Responses.” Political Analysis 12(4):338–353.
Ray, James Lee. 2003. “Explaining Interstate Conflict and War: What Should Be Controlled for?” Conflict Management and Peace Science 20(2):1–31.
Reiter, Dan. 1996. Crucible of Beliefs: Learning, Alliances, and World Wars. Cornell University Press.
Rivers, Doug. 2003. “Identification of Multidimensional Spatial Voting Models.” Stanford University Working Paper.
Rosas, Guillermo. 2009. “Dynamic Latent Trait Models: An Application to Latin American Banking Crises.” Electoral Studies 28(3):375–387.
Rosenthal, Howard and Erik Voeten. 2007. “Measuring Legal Systems.” Journal of Comparative Economics 35(4):711–728.
Sabrosky, Alan N. 1980. Interstate Alliances: Their Reliability and the Expansion of War. In The Correlates of War II: Testing Some Realpolitik Models, ed. J. D. Singer. New York: The Free Press.
Schnakenberg, Keith E. and Christopher J. Fariss. 2011. “A Dynamic Ordinal Item Response Theory Model with Application to Human Rights Data.” APSA 2009 Toronto Meeting Paper.
Senese, Paul D. and John A. Vasquez. 2008. The Steps to War: An Empirical Study. Princeton: Princeton University Press.
Signorino, Curtis S. and Jeff M. Ritter. 1999.
“Tau-b or Not Tau-b: Measuring the Similarity of Foreign Policy Positions.” International Studies Quarterly 43(1):115–144.
Simon, Michael W. and Erik Gartzke. 1996. “Political System Similarity and the Choice of Allies: Do Democracies Flock Together, or Do Opposites Attract?” The Journal of Conflict Resolution 40(4):617–635.
Singer, J. David and Melvin Small. 1966. “Formal Alliances, 1815-1939: A Quantitative Description.” Journal of Peace Research 3:1–31.
Siverson, Randolph M. and Joel King. 1980. “Attributes of National Alliance Membership and War Participation, 1815-1965.” American Journal of Political Science 24(1):1–15.
Siverson, Randolph M. and Michael R. Tennefoss. 1984. “Power, Alliance, and the Escalation of International Conflict, 1815-1965.” American Political Science Review 78(4):1057–1069.
Smith, Alastair. 1995. “Alliance Formation and War.” International Studies Quarterly 39(4):405–425.
Smith, Alastair. 1996. “To Intervene or Not to Intervene: A Biased Decision.” The Journal of Conflict Resolution 40(1):16–40.
Snyder, Glenn H. 1984. “The Security Dilemma in Alliance Politics.” World Politics 36(4):461–495.
Snyder, Glenn Herald. 1997. Alliance Politics. Ithaca, NY: Cornell University Press.
Starr, Harvey and Benjamin A. Most. 1976. “The Substance and Study of Borders in International Relations Research.” International Studies Quarterly pp. 581–620.
Treier, Shawn and Simon Jackman. 2008. “Democracy as a Latent Variable.” American Journal of Political Science 52(1):201–217.
Tsebelis, George. 2002. Veto Players: How Political Institutions Work. Princeton, NJ: Princeton University Press.
Vasquez, John A. 1993. The War Puzzle. Cambridge: Cambridge University Press.
Voeten, Erik. 2000. “Clashes in the Assembly.” International Organization 54:185–215.
Wagner, Robert Harrison. 2007. War and the State: The Theory of International Politics. University of Michigan Press.
Walt, Stephen M. 1987. The Origins of Alliances. Ithaca, NY: Cornell University Press.
Waltz, Kenneth Neal. 1979. Theory of International Politics. Addison-Wesley Pub. Co.
Weidmann, N. B., D. Kuse and K. S. Gleditsch. 2010. “The Geography of the International System: The CShapes Dataset.” International Interactions 36(1):86–106.
Yuen, A. 2009. “Target Concessions in the Shadow of Intervention.” Journal of Conflict Resolution 53(5):745–773.
Zagare, F. C. and D. Marc Kilgour. 2003. “Alignment Patterns, Crisis Bargaining, and Extended Deterrence: A Game-Theoretic Analysis.” International Studies Quarterly 47(4):587–615.