International Interactions, 33:347–382, 2007 (Vol. 33, No. 4, October 2007)
Copyright © Taylor & Francis Group, LLC
ISSN: 0305-0629, 1547-7444
DOI: 10.1080/03050620701681809

Dispute Settlement Design for Unequal Partners: A Game Theoretic Perspective

JEAN-PIERRE P. LANGLOIS
Department of Mathematics, San Francisco State University, San Francisco, California, USA

CATHERINE C. LANGLOIS
McDonough School of Business, Georgetown University, Georgetown, District of Columbia, USA

When signatories of international agreements fail to comply unintentionally, sanctioning rules designed to deter intentional noncompliance are tested. To provide signatories with the best treaty value, we find that remedies in case of unilateral defection must account for the nature of the inequality between treaty partners, as well as the type of mixed motive game they are engaged in. Trigger-type schemes, which rely on punishment by mutual defection, are the norm for sanctioning in treaty texts. Inequality is addressed by proposing that the process leading to retaliation be accelerated when a weaker partner faces the noncompliance of a stronger partner. Our analysis suggests instead that the prescription depends on the source of the inequality. If inequality stems from differences in the costs associated with compliance, the stronger partner, with the lower compliance costs, should be given more time, not less, to settle in the shadow of the law if he deviates. Despite their prevalence, trigger schemes are not well suited to the handling of Chicken or Called Bluff games that may define the stakes in environmental accords. This motivates our analysis of an alternative sanctioning scheme that builds in redress for the victim of a unilateral defection.
In addition to its ability to handle alternative game structures, we find that this scheme provides better treaty value than trigger-type schemes, as well as credible deterrence, to signatories engaged in a Prisoner’s Dilemma game. We conclude that, in the design of sanctioning schemes, redress for the injured party is better than punishment by defection.

KEYWORDS dispute settlement design, treaty value, game theory

The listing order of the authors’ names is not indicative of their respective contributions, which they consider to be equal. Address correspondence to Professor Jean-Pierre P. Langlois, San Francisco State University, Department of Mathematics, 1600 Holloway Ave., Thornton Hall, Ninth Floor, Office: TH 926, San Francisco, CA 94132, USA. E-mail: [email protected]

Nations of unequal size, power or influence sign international agreements and devise ways to punish noncompliance that take explicit account of the differences in signatory circumstance.1 If mixed motives are present, the game theoretic principles of credible deterrence should inform the design of such dispute settlement rules. Indeed, treaty sanctioning provisions should credibly deter willful breach, and many possible schemes can achieve this goal. But if signatories can deviate from treaty provisions unintentionally, the choice of a sanctioning rule will be consequential because sanctioning will necessarily come to pass. Punishment moves the parties away from the benefits of full cooperation, but some sanctioning provisions do this more than others. Our objective is to compare credible sanctioning rules, including those that mimic the rules typically prescribed by international treaties, according to their ability to keep unequal partners cooperating as much as possible throughout the life of the agreement.
From a game theoretic viewpoint, the unequal status of treaty signatories shows up in the structure of payoffs, and this impacts the design and effectiveness of dispute settlement rules. In this paper, we identify and evaluate subgame perfect designs for dispute settlement in two-player asymmetric games. Such designs ensure that, despite the parties’ differences, willful breach is in the interest of neither. Our analysis embraces the premise that even if special and differential treatment is applied to compensate for nations’ unequal stakes in the benefits of cooperation,2 many of the issues regulated by international treaty involve interests that would give an advantage to a unilateral defector. In this we find ample support in the literature, including the writings of international lawyers such as Jacobson and Brown Weiss (1998). Following the writings of authors such as Chayes and Chayes (1991, 1993, 1995), we also recognize that noncompliance will arise from misunderstanding, lack of transparency, or lack of indigenous resources in support of compliance.

The existence of mixed motives, when associated with possibly unintentional noncompliance, has profound consequences for a game theoretic analysis. Under these circumstances, game theoretic provisions must account for the observation of noncompliance, regardless of intent, and the possible testing of the sanctioning rules in place as a result. Indeed, the implementation of sanctions is consequential for signatory progress toward full cooperation. Clearly, if a grim trigger were proposed as a sanctioning scheme, a single instance of noncompliance, if punished, would preclude cooperation forever more. By contrast, a trigger scheme that would only punish for a period of two years would allow for a return to cooperation thereafter. Intuitively, signatories are better off if they can return to full cooperation sooner rather than later.
This simple insight enables us to define a formal criterion that measures the value to signatories of dispute settlement designs that credibly deter willful breach. This, together with an application of the criterion to a selection of possible game theoretic designs for two unequal partners, is the main contribution of the paper.

Punishment schemes that credibly deter intentional noncompliance share the same characteristic: the noncompliant signatory stands to lose in punishment at least as much as he or she has gained from deviant behavior. But the possible schemes take on a bewildering variety of forms. Punishment of the deviating signatory can be achieved by implementing a trigger-like mechanism. Retaliation in kind, the recognized mode of punishment in such schemes, imposes the payoffs of mutual defection on both parties. The injured party therefore incurs costs when punishing the defector. But punishment can also involve at least partial redress for the injured party’s losses.

Trigger-like schemes are the norm for sanctioning in treaty texts, and inequality is addressed by proposing that the process leading to retaliation be accelerated when a weaker partner faces the noncompliance of a stronger partner. Our analysis suggests, instead, that the prescription depends on the source of the inequality. If inequality stems from differences in the costs associated with compliance, the stronger partner, with the lower compliance costs, should be given more time, not less, to settle in the shadow of the law if he deviates. Our analysis also points to the inadequacy of trigger-like schemes if the parties are not engaged in a classic Prisoner’s Dilemma but in alternative mixed motive games such as Chicken or Called Bluff. In such cases, we show that a tit-for-tat-like scheme, in which the injured party gets at least partial redress, offers credible deterrence.
But most importantly, even if the signatories are engaged in a Prisoner’s Dilemma, this same scheme can be calibrated to provide all parties with better treaty value than if they adopted a trigger-like scheme.

COMPLIANCE WITH INTERNATIONAL AGREEMENTS AS A GAME THEORETIC ISSUE

Compliance with International Agreements: An Overview

The literature on compliance is vast and a comprehensive overview is not our purpose here. However, it is of interest to note, in broad strokes, the prominent themes that mark the field. Treaty design and the issue of compliance have been approached by two groups of scholars that stand in stark contrast. Rationalist scholars such as Downs, Rocke, and Barsoom (1996) argue for the central role of deterrent punishment,3 while international lawyers such as Chayes and Chayes (1991, 1993, 1995) question the usefulness of game theory and the sanctioning propensities that stem from it. The managerial school recognizes that sanctioning is the appropriate response to willful breach, but points to “the error of conceptualizing most compliance problems as being due to intentional violations” (Chayes et al., 1998, p. 39). Instead, Chayes and Chayes emphasize the willingness of nations to comply with international agreements most of the time, and attribute noncompliance to a lack of means or a misunderstanding of the terms of the agreement. If noncompliance is mostly unintentional, the failure to cooperate should be managed and not punished (Chayes and Chayes, 1991, 1993, 1995). Downs, Rocke, and Barsoom (1996) respond with arguments about the depth of treaties. If a treaty prescribes a course of action that signatories would have followed in the absence of an agreement, compliance will indeed be observed. But if treaties are deep, exposing a widening benefit to unilateral defection, signatories will defect in exploitative self-interest unless the threat of sanctions is powerful enough to deter such behavior.
While proponents of the managerial school downplay the relevance of willful breach, rationalists argue that cooperation cannot be achieved without coercive means. Simmons (1998) argues that these approaches to compliance “are not mutually exclusive, and the less one is willing to straw-man the arguments of the major proponents, the clearer become the numerous points of overlap” (Simmons, 1998, p. 76). We agree wholeheartedly. The stark contrast between managerial and rationalist prescriptions only exposes the surface of the compliance debate. Indeed it is necessary to understand the nature and context of compliance in order to devise means to encourage it. Those that have analyzed the complexities of the phenomenon are perhaps more inclined to be managerial than to be rationalist. Yet their conclusions bring much grist to the mill of a game theoretic approach to cooperation.

Jacobson and Brown Weiss’s (1998) analysis of compliance with environmental accords is of particular interest. These authors conclude a major review of compliance, by eight nations and the European Union with five environmental treaties, by proposing that the measures that will most effectively enhance compliance depend on the relative strength of a signatory’s intent and ability to comply with the terms of the agreement. These authors offer a measured view of appropriate strategies where sanctions are deemed “crucial in agreements where free riding is possible and could carry significant rewards” (Jacobson and Brown Weiss, 1998, p. 548), but where positive inducements and monitoring also play a major role. Thus the threat of coercive action typical of a game theoretic approach is clearly recognized as necessary if mixed motives are present. Yet, the recognition that noncompliance can rarely be construed as an intentional and self-interested attempt at exploitation of the treaty’s terms, tempers the call for coercive measures.
A game theoretic approach to treaty design needs to take into account the possibility of unintentional noncompliance, while ensuring that willful breach is effectively deterred.

Nations certainly appear to fail to comply with the terms of a variety of treaties and agreements, even when these include provisions for the punishment of noncompliant behavior. Over three hundred cases have been brought to World Trade Organization (WTO) dispute resolution bodies since 1995, despite the WTO’s strengthening of the sanctioning rules that prevailed under the General Agreement on Tariffs and Trade (GATT). Numerous infractions of the Convention on International Trade in Endangered Species (CITES) are reported by Sand (1997) despite provisions for dispute settlement that allow for retaliatory trade restrictions (Jacobson and Brown Weiss, 1998, p. 527). All or most of these instances of apparent breach may be unintentional, due to a lack of means rather than a lack of will. But their very existence brings up the issue of sanctioning. However, it is the stakes involved that determine the need for sanctioning rules rather than any clear evidence that nations do fail to comply. Indeed, if cooperation is the best outcome for all, signatories would all have a strong willingness to comply even if compliance were not always possible. In such a case, sanctioning rules could play no useful role. But if mixed motives are present, then game theoretic principles dictate that sanctioning of observed noncompliance is necessary to prevent intentional breach. The relevance and nature of sanctioning possibilities depend on signatory incentives. What, then, is the structure of the games that nations play and regulate?

Do Treaties Regulate a Game of Mixed Motives?
MIXED MOTIVES IN TRADE, SECURITY, AND THE ENVIRONMENT

Mixed motives come in various shapes and sizes, from the incentives present in the classic Prisoner’s Dilemma to the risky temptations of the Chicken or the Called Bluff games. While many authors have examined the mixed motives of a Prisoner’s Dilemma, the structure of payoffs depends on the issue area. The Prisoner’s Dilemma has been the game structure of choice to describe security or trade games. Downs and Rocke (1990) and Brams and Kilgour (1988) use the Prisoner’s Dilemma to describe the stakes inherent in the arms control game. Weber (1991) justifies this choice in a careful analysis of the stakes involved. While the security game played by the ex-Soviet Union with the United States may have been construed as a game of equal partners, the current security relationship between the United States and Russia clearly is not. Indeed, Russia is at a severe economic disadvantage in the race to modernize arsenals while paring the existing stockpile of arms. As a result Russia would now lose more from a unilateral defection of the U.S. than the U.S. would from Russia’s defection, while mutual defection would be more costly to Russia than to the U.S. Using numerical values for parameters, the stage game has shifted from a symmetric Prisoner’s Dilemma as represented in Figure 1a to one that resembles Figure 1b.

FIGURE 1 Changing Stakes in the Security Game.

Downs et al. (1996), in their discussion of compliance with the WTO agreements, assume that the rewards from trade liberalization are also characterized by the mixed motives of the Prisoner’s Dilemma. The assumption is supported by a number of arguments from the economics literature: Giving additional weight to business interests over consumer welfare gains in the determination of the overall welfare gains from trade will reveal a unilateral incentive to adopt protectionist measures.
Staiger (1995) incorporates such weighting in a simple partial equilibrium tariff-setting model highlighting the Prisoner’s Dilemma structure of the game created as a result.4 The argument for mixed motives in trade, when extended to the relationship between a developed and a developing trading partner, highlights the unequal burden on developing countries of retaliation (Hoekman and Mavroidis, 2000). To the extent that developing country export bases are less diversified, restricted access to a developed market will impose severe costs. Developed countries, when facing similar restrictions from a developing partner, are less likely to lose a determining outlet for their products and are more able to compensate for the loss by boosting trade in other goods and switching to alternative markets. Thus a developed partner imposes more harm on the developing country by defecting than its smaller, poorer partner can impose by doing likewise. As in the security game, asymmetric payoffs would resemble those of Figure 1b.

Games involving collaboration to produce a public good such as pollution abatement come with special characteristics that transform the nature of the stakes. In such games the participating nations incur a cost when cooperating, but also benefit from their own cooperation as well as that of others. While it has been shown that the redistribution of the gains from collective action can always ensure that treaty signatories are collectively better off cooperating on pollution abatement, free-riding on other countries’ abatement efforts can make individual signatories still better off (Missfeldt, 1999). Most countries are both polluters and victims of pollution but “national costs and benefits of abatement efforts are distributed asymmetrically across nations” (Schmidt, 2000, p. 44).

FIGURE 2 Payoff Structures in Environmental Games.
Lipnowski and Maital (1983) suggest that in trans-boundary pollution games, it may be worse to defect in the presence of a free rider than to continue to cooperate. Signatories would then be playing a game of Chicken. Such a structure is also suggested by Finus (2001) for pollution abatement games if the value of clean air is high enough relative to the costs of producing it. We illustrate an asymmetric Chicken game numerically in Figure 2a. Aggarwal and Dupont (1999) provide a comprehensive characterization of the structure of public goods provision games in which countries incur a cost of provision that is shared if the parties cooperate. Unequal benefits coupled with the unilateral incentive to free-ride on the other’s provision can lead to a situation where one party prefers to produce the public good alone despite the other side’s defection, while the other prefers mutual defection to a situation where he continues to cooperate when faced with free-riding behavior. Such a situation might well be relevant to the sadly timely provision of monitoring and control of terrorist activities. This is a Called Bluff game whose incentives are illustrated numerically in Figure 2b.

We conclude from this overview that the presence of mixed motives in salient international issue areas is widely acknowledged. Moreover, the above discussion points to the importance of asymmetry, and to the variety of game structures that may lurk beneath a treaty’s terms.

THE GENERIC STAGE GAME AND PLAYER OBJECTIVES

The various game structures that we have discussed find formal representation in the following two-player stage game, in which each cell lists the payoff to player 1 followed by the payoff to player 2:

\[
\begin{array}{c|cc}
\text{Player 1} \,\backslash\, \text{Player 2} & \text{Cooperate } (C) & \text{Defect } (D) \\ \hline
\text{Cooperate } (C) & 0,\ 0 & -b_1,\ 1 \\
\text{Defect } (D) & 1,\ -b_2 & -c_1,\ -c_2
\end{array}
\]

FIGURE 3 The Generic Stage Game.

The players’ decisions are to either comply or violate the terms of the treaty, cooperate or defect.
We have normalized the payoffs to full cooperation to 0 and to unilateral defection to 1, and parameters \(c_i\) and \(b_i\) are strictly positive (\(c_i, b_i > 0\)). To ensure that mutual cooperation is the best long-run outcome if it can be sustained, we assume that \(b_1, b_2 > 1\).5 The nature of the mixed motive game depends on the relative stakes \(b_i\) and \(c_i\). If \(b_i > c_i\) for both \(i\), the players face a classic Prisoner’s Dilemma. If \(b_i < c_i\) for both \(i\), they are engaged in a game of Chicken, and if (say) \(b_1 > c_1\) and \(b_2 < c_2\), the game is one of Called Bluff, and player 2 is called. Players engaged in the repetition of the stage game of Figure 3 will base their decisions on the expected discounted payoffs (for player \(i\) at integer time \(t \ge 0\)):

\[
U_i(t) = \sum_{s=0}^{\infty} w^s \, u_i(t+s) \tag{1}
\]

where \(u_i(t+s)\) is the utility that player \(i\) expects to derive at time \((t+s)\) and \(w\) is the discount factor.6 Utilities \(u_i(t+s)\) are expectations that result from player intentions, the utility parameters of Figure 3, and the probability of mismatch between observed behavior and player intention. Before we provide an explicit formulation for \(u_i(t+s)\), we turn to the modeling of unintentional noncompliance.

UNINTENTIONAL NONCOMPLIANCE AND TREATY VALUE

Unintentional Noncompliance: Some Examples of Its Meaning

Failure to comply can fall within legally acknowledged signatory capacity constraints, and be explicitly permitted under the terms of the international agreement. Conceptually distinct is the noncompliance that might result from a conflict with other legal obligations, or a misreading of the agreement’s terms. Game theory’s concern is with the latter because, if it results in de facto noncompliance, it may be difficult to distinguish from willful breach in exploitation of the treaty’s terms. From a game theoretic viewpoint, punishment of unintentional noncompliance is necessary if intentional noncompliance is to be deterred.
Thus it is de facto noncompliance that is punishable, regardless of intent. In this, game theoretic and legal principles converge. A few examples will illustrate instances of possibly unintentional noncompliance that trigger the remedies game theory has to propose.

Ruling out the case where a signatory’s capacity to comply is so weak that defection is the best course of action, observed noncompliance must still be measured against the special and differential treatment afforded to weaker international partners. For example, when India’s imposition of import restraints was brought to the WTO by the United States in the India – Quantitative Restrictions case,7 the issue of noncompliance was to be weighed against the possibility for a developing country to restrict imports in order to maintain a favorable balance of payments position. If import restrictions are found necessary to support certain macroeconomic equilibria, maintaining them does not constitute a failure to comply. But there is noncompliance with the terms of the WTO agreements if the country’s balance of payments situation is too favorable to support imposing such import restrictions. Noncompliance then comes from a misinterpretation of the extent of the special measures that apply to a developing member of the WTO, rather than a willful attempt at self-interested exploitation of the treaty’s terms. But it is punishable nevertheless.

Allegedly unintentional noncompliance as the byproduct of competing obligations is also frequently observed in trade disputes. Examples in the WTO are many, and panel rulings go to great lengths to judge a practice on its face rather than in terms of any legislative intent.
In a case brought against Chile’s taxation of alcoholic beverages by the European Communities (EC),8 the EC contended that by taxing spirits according to alcohol content, imports of liquors such as whisky, gin or tequila were improperly inhibited although they constituted direct substitutes for the slightly weaker domestic pisco. On appeal the WTO panel ruled against Chile although the taxation system was acknowledged to be “facially neutral.” The fact that imports fell disproportionately in the high tax brackets made the measure de facto discriminatory and, the panel stated, “the subjective intentions inhabiting the minds of legislators or regulators do not bear on the inquiry.”9 The implicit presumption was that signatories do not intend to violate the terms of the WTO agreements in exploitative self-interest.10 But the WTO panel commentary can also be interpreted as meaning that only deviation matters regardless of the nature of signatory intentions.

While domestic regulatory considerations may inadvertently interfere with compliance with international agreements, nations may also disagree on the criteria for compliance. A nation’s interpretation of the terms of the agreement may then lead to a judgment of noncompliance by relevant legal bodies, although the intent is arguably not self-interested exploitation. The differing view of what constitutes protection of an endangered species in the Shrimp–Turtle case, brought against the United States by a number of shrimp exporting countries, is an example.11 In this case, a number of countries protested the United States’ decision to prohibit importation of shrimp from nations that did not use Turtle Excluder Devices (TEDs) when fishing for shrimp in certain waters. The U.S. was alleging improper compliance with the CITES agreement and imposing trade sanctions as a result.
One of the injured parties, Malaysia, “while recognizing that the use of TEDs was a step in contributing to the conservation of turtles, . . . considered that it was just one of the many accepted methods for the conservation of turtles.”12 And Malaysia claimed “a comprehensive legal framework on the conservation and management of marine turtles,” including prohibition of capture and establishment of sanctuaries. Other plaintiffs were similarly adamant about their efforts to protect the species. While countries, in good faith, can claim compliance with the same standards, their practices may well differ (Jasanoff, 1998). But as signatories differ in their interpretation of treaty obligations, so do interpretations of behavior, and signatories will periodically be presumed to defect on their treaty obligations even if they meant to comply.

From a game theoretic viewpoint, it matters little what the particular misunderstanding that leads to noncompliance might be. What matters is that defection will be observed in a context where purity of intent cannot be assumed since mixed motives are present. A priori, signatories can expect to observe noncompliance by the other side with some likelihood, presumably small, since it is established that all parties want to comply. Our goal will now be to highlight the properties of alternative game theoretic remedies within the simplest model that captures the essence of the incentive structure and behavioral patterns that prevail among treaty signatories. Our perspective is that of a designer of dispute settlement regimes. We do not seek to apply a mathematical mirror to legal processes that might already be in place.13

Modeling Unintentional Noncompliance and Treaty Value

Unintentional failure to comply with the terms of an international agreement will show up in observations that are similarly shrouded in uncertainty about their true nature.
A signatory may then observe that a treaty partner is failing to comply when in fact that same partner’s intent was to cooperate. But by the same token, actual deviations from compliant behavior may not be viewed as such. As a result our signatories operate in a noisy environment. We formalize this situation as follows: the observation of player \(i\)’s move \(x_i \in \{C, D\}\) reflects this player’s intent with probability \((1 - e)\). But with some probability \(e\), observation and intent do not coincide. Thus player \(i\), with probability \(e\), will be observed to defect while he intended to cooperate, an event that triggers implementation of the sanctioning scheme in place. Thus if the players intended to cooperate at date \(t + s\), player \(i\)’s utility \(u_i(t+s)\) will read:

\[
u_i(t+s) = u_i(C, C) = (1-e)^2 \times 0 + e(1-e) \times (-b_i) + (1-e)e \times 1 + e^2 \times (-c_i) = e(1-e)(1-b_i) - e^2 c_i \tag{2}
\]

Here both players are observed to cooperate with probability \((1-e)^2\) and both are observed to defect with probability \(e^2\). Utility \(u_i(t+s)\) thus accounts for player intentions (to cooperate in this case), the utility parameters of Figure 3, and the probability of mismatch between observed behavior and player intention.14

When engaged in a mixed motive game, the best that a player can achieve is the payoff to mutual cooperation as long as neither party exploits the other by defecting unilaterally. One of the purposes of strategy is therefore to prevent such illegal defection by threatening appropriate punishment. But if each player is inevitably faced, at some point, with the other’s apparent noncompliance, she will need to be poised and ready to implement punishment. Sanctioning moves the players away from full cooperation and reduces the value of the treaty to the signatories. If retaliation means that both parties defect for some time, they will be receiving the payoffs of mutual defection rather than those they could reach by cooperating.
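The expectation in (2) can be checked numerically. The following is a minimal sketch, not the authors’ code: it assumes that the two players’ observation errors are independent (as the product form of the probabilities in (2) suggests) and that an error flips the observed move regardless of intent. The function names `stage_payoff`, `expected_utility` and `closed_form_cc` are hypothetical.

```python
from itertools import product

def stage_payoff(i, observed, b, c):
    """Payoff to player i (0 or 1) in the stage game of Figure 3 for an observed profile."""
    own, other = observed[i], observed[1 - i]
    if own == "C" and other == "C":
        return 0.0        # mutual cooperation, normalized to 0
    if own == "D" and other == "C":
        return 1.0        # unilateral defection, normalized to 1
    if own == "C" and other == "D":
        return -b[i]      # exploited by the other side
    return -c[i]          # mutual defection

def expected_utility(i, intent, e, b, c):
    """Expected stage payoff to player i given both intents and error probability e.

    Assumption: each observed move differs from intent with probability e,
    independently across the two players.
    """
    flip = {"C": "D", "D": "C"}
    total = 0.0
    for err1, err2 in product((False, True), repeat=2):
        prob = (e if err1 else 1 - e) * (e if err2 else 1 - e)
        observed = (flip[intent[0]] if err1 else intent[0],
                    flip[intent[1]] if err2 else intent[1])
        total += prob * stage_payoff(i, observed, b, c)
    return total

def closed_form_cc(i, e, b, c):
    """Right-hand side of equation (2): e(1 - e)(1 - b_i) - e^2 c_i."""
    return e * (1 - e) * (1 - b[i]) - e ** 2 * c[i]

if __name__ == "__main__":
    b, c, e = (2.0, 3.0), (1.0, 4.0), 0.1   # illustrative values only
    for i in (0, 1):
        print(i, expected_utility(i, ("C", "C"), e, b, c), closed_form_cc(i, e, b, c))
```

With these illustrative parameters the enumeration and the closed form agree for both players, and player 1’s expected stage payoff under mutual intent to cooperate is approximately −0.10. The parameter values are chosen only to satisfy \(b_i > 1\) and are not taken from the paper’s figures.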
The frequency of noncompliance, together with the type of sanctioning scheme in place, will determine the long-run probability with which the players find themselves in one payoff state or another. Formally, let \(m_k\) be the long-run frequency with which signatories visit state \(k\), and \(U_{ik}\) be player \(i\)’s expected discounted payoff when in state \(k\).15 If the particular design for dispute resolution under study determines \(n\) possible states of the game, treaty value for player \(i\) reads:

\[
V_i = \sum_{k=1}^{n} m_k U_{ik} \tag{3}
\]

What credible sanctioning schemes maximize treaty value for each of the signatories? Game theorists have dealt with the credibility issue by requiring that strategy pairs (in a two-player game) form a subgame perfect equilibrium (SPE). This means that the strategic plan adopted by each party must be the best possible, given player objectives, and regardless of the circumstances. But it also means that mutual cooperation in intent maximizes signatory payoffs. The best dispute settlement procedure from a game theoretic viewpoint therefore maximizes treaty value as defined in (3) by keeping both signatories cooperating with the highest long-run likelihood. The strategies that we examine in what follows all participate in an SPE. This guarantees that it is in the best interest of the signatories to implement the sanctioning rules if noncompliance is observed.16

We now turn to our dispute resolution schemes. We first describe trigger designs, arguing that dispute resolution procedures implemented under agreements such as the WTO or CITES can be interpreted as trigger schemes. We then examine an alternative scheme that calls for redress for the compliant signatory at the retaliation stage. Such a scheme can provide both sides with better treaty value.
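As an illustration of how the treaty value in (3) might be computed, the sketch below treats the states of a dispute settlement design as a finite Markov chain and takes the long-run frequencies \(m_k\) to be its stationary distribution. This Markov interpretation, the two-state example, the transition probabilities and the state payoffs are all assumptions made for illustration; the paper’s own state definitions are given in its notes and later sections.

```python
def stationary_distribution(P, iterations=1000):
    """Approximate the stationary distribution of a row-stochastic matrix P by power iteration.

    Assumption: the chain is ergodic, so the iteration converges to the
    long-run visit frequencies m_k.
    """
    n = len(P)
    m = [1.0 / n] * n
    for _ in range(iterations):
        m = [sum(m[j] * P[j][k] for j in range(n)) for k in range(n)]
    return m

def treaty_value(m, U):
    """Equation (3): V_i = sum over k of m_k * U_ik for one player i."""
    if len(m) != len(U):
        raise ValueError("m and U must list the same states")
    if abs(sum(m) - 1.0) > 1e-9:
        raise ValueError("long-run frequencies must sum to 1")
    return sum(mk * uk for mk, uk in zip(m, U))

if __name__ == "__main__":
    # Hypothetical two-state design: state 0 = cooperation mode, state 1 = punishment mode.
    P = [[0.9, 0.1],   # from cooperation: stay with 0.9, enter punishment with 0.1
         [0.5, 0.5]]   # from punishment: return with 0.5, stay with 0.5
    U_i = [-0.1, -2.0]  # hypothetical expected discounted payoffs U_ik for player i
    m = stationary_distribution(P)
    print(m, treaty_value(m, U_i))
```

In this hypothetical example the chain spends five sixths of the time in cooperation mode. A design that lowers the entry probability or raises the return probability shifts weight toward the cooperative state and raises \(V_i\), which is the sense in which the criterion rewards keeping signatories cooperating with the highest long-run likelihood.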
TRIGGER DESIGNS FOR UNEQUAL PARTNERS

Treaty Dispute Resolution Procedures as Trigger Mechanisms

Negotiated settlement is the preferred solution to disputes over compliance with a treaty’s terms. If settlement fails, then elaborate provisions typically involving third parties come into play. Thus, disputes over the Law of the Sea are brought to special arbitral tribunals, the International Tribunal for the Law of the Sea, or the International Court of Justice, while trade disputes can be brought to a WTO panel. While the decisions of these bodies are binding, their implementation is left in the hands of the states involved. As Koremenos, Lipson, and Snidal (2001) put it, “most international organizations have relatively decentralized enforcement arrangements. They specify possible punishments for rule violations but leave it up to the members to apply them.” When states engage in formal dispute settlement, they obtain the terms of legitimate remedy and the allowed timing of their imposition, while enforcement remains the decision of the parties in the dispute. Thus, a WTO panel ruling against the perpetrator of a trade restriction opens the official door to retaliatory action, but does not preclude further attempts at negotiated settlements. Victims have, of course, also been known to impose, or threaten, punishing sanctions unilaterally. As early as 1989, for example, the U.S. unilaterally invoked the Pelly Amendment to impose trade sanctions on Taiwan for trading in endangered Rhino horn.17

From a game theoretic perspective, two features of treaty dispute resolution processes are noteworthy: observed defection may not lead to retaliation at all, but if it does, punishment is imposed for an uncertain but possibly long period of time.
The few instances of retaliation under WTO rules are instructive: In April of 1999, the United States imposed $191.4 million in retaliatory sanctions against European agricultural products to protest Europe’s refusal to adequately modify its banana importation regime which extended trade preferences to the less developed, banana-producing, African, Caribbean, and Pacific countries.18 A satisfactory resolution of the conflict was reached in April, 2001 (New York Times, April 12, 2001) and the United States removed retaliatory sanctions worth $191 million on July 1, 2001. The banana dispute had been ongoing since 1993, and retaliatory measures had been in place for just over two years. In another case against the European Union, the U.S. imposed $100 million in retaliatory sanctions as early as 1989 to protest Europe’s refusal to import hormone-treated beef. This particular set of retaliatory measures was rescinded by the United States in July of 1996 (Inside US Trade, July 19, 1996), under European threat to call for an official WTO investigation of U.S. unilateral sanctions under Section 301. The measures had been in place for seven years. A new set of U.S. retaliatory sanctions worth $116.8 million was imposed in July of 1999 and is in force at the time of writing (Inside US Trade, July 23, 1999). The dispute on Europe’s ban of hormone-treated beef has now lasted for thirteen years, and retaliatory measures have been in place for ten of these years.

Many disputes do not lead to retaliation at all. In those that do, retaliatory measures are in place for uncertain periods of time. These features of dispute resolution are characteristic of probabilistic trigger schemes.19 Probabilistic trigger schemes operate as follows: observed unilateral defection by signatory i is followed, with some probability $q_i$, by reversion to “punishment mode” where defection is expected from both sides.
Once in punishment mode, a return to cooperation by both parties occurs only after a punishment period of probabilistic or deterministic length $T_i$. In the probabilistic case, the expected length can be specified by a return probability $r_i$ (the expected length is then $T_i = 1/r_i$). The lower the trigger probability $q_i$, the longer it will take ($1/q_i$ in an expected sense) before a continuing unilateral defection will be punished. In the meantime, mutual cooperation can be reestablished through negotiated settlement, in which case the defection goes unpunished. This represents generosity in the face of noncompliance. Clearly, unequal partners need not abide by the same trigger probability rules to ensure credibility of retaliatory threats. The time spent in punishment mode, $T_i$ (or equivalently the likelihood of return to cooperation $r_i$), depends on the identity of the presumed defector. And the signatories need not behave with the same generosity, giving the other more or less time to reestablish cooperation if he has failed to comply.

Interestingly, the WTO dispute resolution understanding, arguably the most developed of international dispute settlement protocols (Cameron and Campbell, 1998), also calls for differential delays for less developed member countries. The focus of the WTO has been to help developing countries move through dispute resolution procedures faster than developed countries. In game theoretic terms, the WTO would prescribe that a developing country retaliate against a unilateral defector with a likelihood $q_j$ that is higher than the one prevailing for its developed partner. But Busch and Rheinhardt (2003) suggest that these differential rules have disadvantaged less developed signatories. These authors argue that developing nations obtain larger concessions “in the shadow of the law,” before a dispute is paneled. An increase in likelihood $q_j$ reduces the time afforded to the signatories to settle before costly legal proceedings are engaged.
This, according to Busch and Rheinhardt (2003), works against less developed country interests, and developing nations should be given more, not less, time to settle disputes against a noncompliant developed partner before a panel is convened.

Our game theoretic analysis of trigger type mechanisms for the settlement of disputes suggests a more complex reality. Differences in signatory circumstances can lead to a variety of possible payoff configurations. These in turn dictate the parameters of the trigger mechanisms that signatories can use. In some cases, developing partners are better off with more time to settle with a deviating developed partner. But in others it is better to speed up the dispute settlement process for the weaker partner. It all depends on the nature of the asymmetry between signatories.

Identifying Best Trigger Mechanisms for Unequal Partners

Trigger schemes, whether probabilistic or deterministic, call for the identification of a clear noncooperative state that is detrimental to the defector. Mutual defection in the Prisoner’s Dilemma, a Nash equilibrium of the stage game, provides a natural reversion point in case of unilateral defection of one signatory.

TRIGGER SCHEMES AND TREATY VALUE FOR THE PRISONER’S DILEMMA

Trigger designs for unequal partners must be player specific. Such schemes therefore distinguish three states of the game: Cooperation CO, in which both sides are expected to choose C, and two states $P_i$ (i = 1,2) of reversion in which both signatories are expected to defect, following the observed noncompliance of one or the other signatory. The rules for reaching $P_i$ and returning to CO are tailored to each player i: $q_i$ is the probability that an observed unilateral defection by i will trigger a reversion to state $P_i$, and $r_i$ is the probability of return from $P_i$ to CO.
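The three-state structure just described can be sketched as a small Markov chain, to see how the trigger and return probabilities shape the time spent in each state. The sketch is only illustrative. It assumes that an unintended unilateral defection by each signatory is observed with a small per-period probability e, it ignores simultaneous defections, and it uses made-up values for e, the trigger probabilities and the return probabilities. The function name is hypothetical.

```python
def trigger_state_frequencies(e, q, r, steps=2000):
    """Approximate long-run frequencies of states (CO, P1, P2).

    Assumption (not from the article): each signatory's unintended
    defection is observed with probability e per period, and terms
    of order e**2 are ignored.
    """
    q1, q2 = q
    r1, r2 = r
    # Row-stochastic transition matrix over the states (CO, P1, P2).
    transition = [
        [1.0 - e * (q1 + q2), e * q1, e * q2],  # from CO
        [r1, 1.0 - r1, 0.0],                    # from P1
        [r2, 0.0, 1.0 - r2],                    # from P2
    ]
    freq = [1.0, 0.0, 0.0]  # start in cooperation
    for _ in range(steps):
        # One step of power iteration: freq <- freq * transition.
        freq = [
            sum(freq[s] * transition[s][t] for s in range(3))
            for t in range(3)
        ]
    return freq


# Illustrative values only.
freq = trigger_state_frequencies(e=0.01, q=(0.3, 0.6), r=(1.0, 0.5))
```

Under these assumptions the long-run frequency of each punishment state, relative to that of CO, equals e times the ratio of the trigger probability to the return probability for that state, so to first order in e only these ratios matter.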
The technical conditions of credibility on $r_i$ and $q_i$ (worked out in Proposition 1 in the appendix) emerge from a comparison of player discounted payoff from cooperation to the payoff he can expect from unilateral defection given the likelihood that such behavior will be punished. If $U_i^{CO}$ is player i’s discounted payoff in mutual cooperation and $U_i^{P_i}$ his discounted payoff in punishment, probabilities $r_i$ and $q_i$ must be set to deter intentional unilateral defection for i at CO by ensuring that $U_i^{CO} \ge U_i^{P_i}$. Intuitively, there will be many possible combinations of $q_i$ and $r_i$ that ensure deterrence in a noisy environment. However, the choice of trigger and return probabilities now impacts the likelihood with which players will inevitably find themselves in punishment mode as a result of the other side’s observed unilateral defection. For this reason, all possible schemes are no longer equal. While any scheme that provided enough deterrence to ensure perpetual cooperation was worth $U_i^{CO} = 0$ in a noiseless environment, when noise is present, a scheme’s value must account for the likelihood of being in any one of the three possible states CO and $P_i$. Thus treaty value, $V_i$, decreases when noise is introduced and becomes negative, because signatories will necessarily find themselves in the configurations that punish observed unilateral defection. If the sanctioning rules are modeled according to a probabilistic trigger scheme, we show in Proposition 3 in the appendix that the rate $v_i$ at which $V_i$ decreases with noise is given by:

$$v_i^{trigger} = 1 - b_i - \left(\frac{q_1}{r_1} + \frac{q_2}{r_2}\right) c_i \qquad (4)$$

The best sanctioning rules reduce treaty value the least as noise is introduced.
The best probabilistic trigger schemes, characterized in the corollary to proposition 2, therefore maximize negative rate $v_i^{trigger}$ by minimizing $\left(\frac{q_1}{r_1} + \frac{q_2}{r_2}\right)$ under deterrence requirements for both i.20 This yields for each i = 1,2 (as e → 0):

$$\text{if } c_i \le \frac{1}{w}: \quad q_i = 1 \ \text{ and } \ r_i = c_i - \frac{1-w}{w} \qquad (5a)$$

$$\text{if } c_i \ge \frac{1}{w}: \quad q_i = \frac{1}{w c_i} \ \text{ and } \ r_i = 1 \qquad (5b)$$

Conditions (5a-b) on probabilities $q_i$ and $r_i$, derived in the corollary to Proposition 3, are also conditions on settlement delays and they depend on the payoffs that each party would receive if both defect.

IMPLICATIONS FOR UNEQUAL PARTNERS

If the cost of mutual defection, $c_i$, is small for both parties, the conditions of (5a) hold. Treaty-maximizing sanctioning rules should then require prompt retaliation in case of observed noncompliance, with a return to cooperation that is faster in expected terms for the party who suffers the most from mutual defection. This is because $r_i$, the probability that the signatories will return to cooperation when in punishment mode, increases with $c_i$. Perhaps of more interest are the conditions of (5b). When the cost of mutual defection is high enough, the best triggers require that the signatory suffering the most from mutual defection be given more time, on average, to settle before retaliatory sanctions are implemented. Indeed, expected settlement time, measured by $1/q_i$, captures the expected delay between the observation of noncompliance by i and the implementation of retaliatory measures against i. It is, conceptually, the expected time that signatories have to settle “in the shadow of the law.” Following our analysis, the party who suffers the most from time spent in punishment mode should get more time to settle in the shadow of the law.

Inequality between treaty partners can stem from unequal capacity to comply. Compliance for the weaker signatory is then more costly.
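Conditions (5a-b) can be turned into a short calculation. The sketch below is a hedged illustration: the function name is hypothetical, the formulas are the e → 0 limits stated in (5a-b), and the guard against a non-positive return probability is an added assumption rather than part of the article. The inputs c = 4 and c = 5 with w = 0.95 are the mutual-defection costs that the article reports later for Figure 5c.

```python
def best_trigger(c_i, w):
    """Return (q_i, r_i) from conditions (5a)-(5b), in the limit e -> 0."""
    if not 0.0 < w < 1.0:
        raise ValueError("discount factor w must lie strictly between 0 and 1")
    if c_i <= 1.0 / w:
        # (5a): retaliate at once and tune the return probability.
        r_i = c_i - (1.0 - w) / w
        if r_i <= 0.0:
            # Added guard (assumption): the formula no longer gives a probability.
            raise ValueError("c_i is too small for (5a) to yield a probability")
        return 1.0, r_i
    # (5b): return to cooperation at once and delay retaliation.
    return 1.0 / (w * c_i), 1.0


w = 0.95
q_developed, _ = best_trigger(4.0, w)   # developed partner, c = 4
q_developing, _ = best_trigger(5.0, w)  # developing partner, c = 5

# Expected time to settle in the shadow of the law is 1 / q_i.
delay_developed = 1.0 / q_developed
delay_developing = 1.0 / q_developing
```

With these inputs the trigger probabilities round to 0.26 and 0.21, and the expected delays to 3.8 and 4.75 periods, which are the figures quoted in the discussion of Figure 5c.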
For example, compliance with Sanitary and Phytosanitary Measures (SPS) can be particularly costly for developing countries. This is due to “the predominance of agricultural and food products in total exports and the technical capability of developing countries to comply with SPS requirements,” (Henson and Loader, 2001, p. 89). Finger and Schuler (1999) point out that compliance with SPS is more costly to developing countries than developed countries because “while the SPS agreement does not require that a country’s domestic standards meet the agreement’s requirements, it does require that the standards the country applies at the border meet those requirements.” (Finger and Schuler, 1999, p. 18). These authors cite costs to Argentina of $82.7 million over five years to ensure disease and pest-free exports of meat, and costs of $112 million to Algeria for locust control among other examples. Damodaran (2002) estimates that SPS compliance increases Indian Coffee farm replanting and operational costs by 46%. How do differential compliance costs affect payoffs all other things being equal? Starting from a symmetric payoff situation, a higher cost of compliance for player i, here assumed to be the column player, is a cost associated to cooperation regardless of what signatory j is doing. Compared to a symmetric situation, player i now receives less from cooperation than does j. Figure 4 illustrates the situation assuming that compliance costs for player i exceed j’s by 1. Then, if in the symmetric case full cooperation yields a payoff of 0 for both parties, it now only yields −1 for i. Similarly, i’s payoff when j defects unilaterally is also reduced by 1. This yields the payoff structure pictured in 4b. In Figure 4c, the payoffs of Figure 4b are once more normalized so that mutual cooperation yields 0 to both parties and unilateral defection yields a payoff of 1 to each signatory:21 FIGURE 4 Compliance Costs Determine Inequality. 
As can be seen in Figure 4c, normalized coefficient $c_i$ for the developing partner is lower than it is for the developed partner.22 This is because defection relieves the developing partner of the costs of compliance. If the developed signatory fails to comply, best triggers dictate that the expected time for settlement in the shadow of the law should be longer for the developed nation than it is for a noncompliant developing nation. Indeed, given the parameters of Figure 4c and assuming that discount factor w = 0.95, failure to comply by the developed partner leads to retaliation with probability $q_j = \frac{1}{w c_j} = 0.26$. The expected delay between an observed instance of noncompliance by the developed nation and retaliation by the developing partner is then $\frac{1}{q_j} = 3.85$ years if the year is the relevant unit to measure delays. By contrast, if the developing country fails to comply, the developed country retaliates with probability $q_i = \frac{1}{w c_i} = 0.70$, or with an expected delay of 1.43 years.23 The deviating developed country, in this case, is given more time to settle than its developing partner before sanctioning is imposed. This is in line with the recommendations of Busch and Rheinhardt (2003).

Prescriptions are reversed if the source of the unequal stakes between treaty signatories is an unequal ability to benefit from the other side’s cooperation. Consider a trade agreement between a developed and a developing nation. The developing partner may be highly dependent on the developed partner’s market to grow its exports while the developed nation, with a wider portfolio of exportable goods, is less dependent on the developing country’s market for its exports. As a consequence, mutual cooperation is more valuable to the developing signatory than it is to its developed partner. The heated debate over NAFTA was in part linked to the unequal benefits of trade liberalization between developing and developed partners.
Indeed, Kouparitsas’ macroeconomic analysis of NAFTA pointed to “welfare gains to all North American participants, with the greatest gains accruing to Mexico” (Kouparitsas, 1997, p. 25, italics are ours). Starting from the symmetric payoff situation, asymmetric benefits in favor of the developing partner show up as a larger payoff to developing signatory i from the cooperation of developed country j than j enjoys when i cooperates. In Figure 5b, column player i now enjoys payoff 1 from mutual cooperation instead of the symmetric payoff 0 of Figure 5a. He also enjoys a payoff of 2 if he defects unilaterally, instead of the symmetric payoff of 1 in Figure 5a. In Figure 5c, payoffs to mutual cooperation are normalized to 0 while payoffs to unilateral defection are normalized to 1 by simply subtracting 1 from column player’s payoffs in 5b.

FIGURE 5 Differential Gains from Cooperation Determine Inequality.

When developing partner i has a higher stake in j’s cooperation, mutual defection ends up being more costly for i than it is for j.24 Now the best trigger design for sanctioning gives the developed signatory less time to settle in the shadow of the law, before the developing partner is expected to retaliate. This is in line with the differential treatment proposed under WTO dispute settlement procedures. Indeed, given the parameters of Figure 5c and setting w = 0.95, developed country j retaliates against a noncompliant developing signatory i with probability 0.21, while the developing partner retaliates against noncompliant signatory j with probability 0.26.
This leaves the developed signatory with an expected 3.8 years to settle before any sanctioning takes place, while the same expected delay increases to 4.75 years if it is the developing country that fails to comply.25

In conclusion, the best trigger designs for the settlement of disputes prescribe differential treatment of unequal partners, but the nature of that difference depends on the source of the inequality. To be sure, differences in compliance costs and differences in the ability to enjoy the fruits of international cooperation may simultaneously determine the inequality in treaty signatory stakes. But because these factors impact the design of dispute settlement procedures in opposing directions, their relative weight must be accounted for in determining differential treatments. These features notwithstanding, trigger type designs may not provide signatories with the best treaty value. We propose, below, an alternative sanctioning scheme that no longer relies on mutual defection for sanctioning.

AN ALTERNATIVE DESIGN FOR CHICKEN, BLUFFS AND PRISONERS

When Triggers are Hard to Use

The Prisoner’s Dilemma has one pure-strategy Nash Equilibrium, mutual defection, and this allows for the construction of simple subgame perfect trigger mechanisms. But things are not quite as simple if the parties are engaged in other mixed motive games. For example, the Chicken game has two pure-strategy Nash equilibria. In each of these, one party cooperates while the other defects. Clearly neither can individually serve as a threat for both sides. One could conceive of a punishment scheme where signatories would alternate between these two states, generating a mixed equilibrium that would serve the theoretical purpose of imposing punishment on a unilateral defector.
Such a punishment scheme would have the signatories alternate between the two Nash equilibria of the Chicken game: I withdraw concessions while you cooperate for, say, one month, then it’s your turn to withdraw concessions while I cooperate. We then repeat for the time it takes to punish the initial defector. This is hardly an attractive prescription. In the Called Bluff game, the situation is even more problematic since the single Nash equilibrium of the stage game, which involves cooperation of one party while the other defects, can only be conceived as punishment for one of the players. Thus, cooperation cannot be supported by the threat of reversion to a simple reversion point that potentially punishes either player. As a result, no simple trigger type schemes can be designed. Of course, a reversion point could theoretically be constructed in order to define subgame perfect equilibrium strategies for a repeated game of Called Bluff. But in engineering complex schemes, behavioral interpretations are lost, and the implementation of sanctioning rules becomes hard to define. It turns out that simple sanctioning rules can be devised for the Called Bluff and Chicken games if, instead of a trigger mechanism, we turn to a particular Tit-for-Tat like scheme. Moreover, this alternative scheme can also be implemented in Prisoner’s Dilemma type situations, and it can be calibrated so that it systematically provides better treaty values to both signatories than trigger schemes.

Contrite Tit-for-Tat

A dispute resolution scheme that improves treaty value must avoid long spells of punishing defections. This is first achieved by delaying retaliation itself, ensuring that the signatories have some time to settle the dispute before escalating to punishing measures. This is the generosity that is built into a probabilistic trigger scheme by setting trigger probabilities $q_i$ below 1 when the cost of mutual defection is high enough.
Generosity in the response to noncompliance addresses one of the problems associated with retaliation by avoiding it, but the organization of the retaliatory phase, if it comes to pass, is also critical to treaty value. The alternative scheme we will discuss also builds in the opportunity to settle, but in contrast to trigger designs, the retaliation phase actually brings redress to the victim of the unilateral defection.26 The subgame perfect scheme that we propose, Contrite Tit-for-Tat (CTFT), expands upon a scheme introduced by Sugden (1986).27

Under Contrite Tit-for-Tat (CTFT), three states are relevant: CO in which both sides are expected to cooperate, and $G_i$ for i = 1,2 in which one of the players has been judged as failing to comply for whatever reason. Panels judging WTO dispute cases make determinations of this kind, as do the international tribunals that hear cases related to CITES or the Law of the Sea. CTFT uses the vocabulary of guilt and innocence although intent is not presumed. At the outset, signatories are presumed innocent and the rules of play are as follows: a signatory should always cooperate with an innocent opponent. A signatory then becomes guilty if he defects while his partner is innocent. A guilty player is always expected to cooperate, reestablishing innocence if he does, but remaining guilty if he doesn’t. An innocent player can righteously defect against a guilty opponent without compromising his innocence, although he remains innocent if he doesn’t. We add generosity to Sugden’s original scheme, so that guilt no longer results automatically from observed defection against an innocent party, but it is established with some player specific probability $q_i$. Boyd (1989) was first to point out that Sugden’s scheme forms an SPE under noise. Our study extends his result to the case where guilt and subsequent innocence are established with some player specific probabilities $q_i$ and $r_i$.
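The rules of guilt and innocence stated above can be written as a small state update. The sketch below covers only the original, deterministic version of the rules, in which an observed defection against an innocent partner establishes guilt with certainty. The generous variant would replace that certainty with the player specific probabilities. The function name and the encoding of actions as "C" and "D" are illustrative choices, not the authors' notation.

```python
def ctft_update(guilty, actions):
    """Update guilt flags after one round of play.

    guilty  -- pair of booleans, True if that player is currently guilty
    actions -- pair of "C" (cooperate) or "D" (defect)
    Deterministic rules only: guilt follows an observed defection
    against an innocent partner with certainty.
    """
    updated = []
    for me in (0, 1):
        other = 1 - me
        if guilty[me]:
            # A guilty player regains innocence only by cooperating.
            updated.append(actions[me] != "C")
        else:
            # An innocent player becomes guilty by defecting on an
            # innocent partner; defecting on a guilty one is righteous.
            updated.append(actions[me] == "D" and not guilty[other])
    return tuple(updated)


# Player 1 defects on an innocent partner, then accepts punishment.
after_defection = ctft_update((False, False), ("D", "C"))
after_redress = ctft_update(after_defection, ("C", "D"))
```

In this short run the first player becomes guilty after the unilateral defection, and both players are innocent again once he cooperates while his partner defects.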
Under the CTFT scheme we propose, state $G_i$, in which player i is found guilty, is reached with probability $q_i$ when i’s defection is observed and provided j is not similarly guilty. In state $G_i$ a ($C_i$,$D_j$) play is expected, followed by a return with probability $r_i$ to state CO, where a (C,C) play is expected. If i does not comply with that expected punishment he remains guilty (in state $G_i$) with certainty. When found guilty, CTFT requires that the defector allow his compliant partner to enjoy the benefits of unilateral defection for some time. Conceptually, play ($C_i$,$D_j$) in state $G_i$ is equivalent to asking the defector to incur sufficient costs to wipe out the benefits of noncompliance, while compensating, at least in part, his compliant partner for the past losses incurred as a result of his failure to comply. Because the CTFT scheme is subgame perfect it is in the interest of both parties to abide by the scheme.

The calibration of CTFT schemes is extremely flexible. Indeed, as shown in Proposition 1, to be subgame perfect, CTFT schemes require that (as e → 0):

$$r_i \le b_i q_i - \frac{1-w}{w} \quad \text{and} \quad (1 - w + w r_i)\, c_i \ge (1 - w)\, b_i \qquad (6)$$

conditions that hold with any $r_i \ge 0$ whenever $c_i \ge b_i$ (for Chicken and player j in Called Bluff) and if $c_i \ge (1 - w) b_i$ in the Prisoner’s Dilemma and for player i in Called Bluff.28

How should a CTFT scheme be calibrated to maximize treaty value? As shown in Proposition 4 in the appendix, the rate at which treaty value decreases with noise is player specific and given by:

$$v_i^{ctft} = 1 + \frac{q_j}{r_j} - \left(1 + \frac{q_i}{r_i}\right) b_i \qquad (7)$$

In order to minimize the rate at which the treaty’s value declines with noise, i would like to see $\frac{q_j}{r_j}$ maximum and $\frac{q_i}{r_i}$ minimum. But increasing $\frac{q_j}{r_j}$ can always be achieved by choosing a smaller value for $r_j$, making the punishment phase as long as possible and ensuring that, with noise, treaty value would actually increase for i.
With low $r_j$, signatory j would need to suffer the payoffs characteristic of i’s unilateral defection for a very long time, and meanwhile i would receive compensation that could rise above the losses incurred as a result of j’s past unilateral defection. As $r_j \to 0$ the CTFT scheme would resemble a grim trigger from j’s point of view, and while treaty value for i rises with j’s plight in the inevitable event of defection, treaty value for j would be very low. The setting of $\frac{q_i}{r_i}$ and $\frac{q_j}{r_j}$ must therefore emerge from the negotiations on the terms of the treaty itself, and the critical issue will be whether or not the parties can choose $\frac{q_i}{r_i}$ and $\frac{q_j}{r_j}$ to enhance treaty value for each, given alternative sanctioning schemes.

The Merits of Redress

A CTFT type scheme punishes the deviating signatory by imposing the payoffs to the other side’s unilateral defection. Such a scheme can be implemented even if treaty signatories are engaged in a Chicken or Called Bluff game, since the scheme only requires that unilateral defection be beneficial to the defector while it hurts the other side. But, perhaps more importantly, a sanctioning scheme based on CTFT can always provide better treaty value to both signatories, if engaged in a Prisoner’s Dilemma, than if they adopted a trigger-like scheme. A formal proof is given in appendix (Proposition 5). In order to capture the source of CTFT’s advantage, we now compare this scheme to treaty maximizing triggers for a range of parameter values.

In order to compare a CTFT type sanctioning scheme to triggers, it is necessary to select values for $\frac{q_i}{r_i}$ and $\frac{q_j}{r_j}$ within the wide range that meets credibility requirements. We chose to compare treaty maximizing triggers to CTFT schemes that minimize factors $\frac{q_i}{r_i}$ and $\frac{q_j}{r_j}$ for both parties, ensuring that the rate at which treaty value declines with noise is small for both signatories under the CTFT scheme considered.
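The choice of a CTFT scheme with small trigger-to-return ratios can be illustrated numerically. The sketch below restates the two inequalities of condition (6) in the e → 0 limit and rearranges the first one to find the smallest admissible ratio. That rearrangement is offered here for illustration and is not a formula quoted from the article. Function names are hypothetical, and the payoff values b = 5 and c = 4 with w = 0.95 are taken from the parameters the article reports for player 1 in Table 1.

```python
def ctft_is_credible(q_i, r_i, b_i, c_i, w, tol=1e-12):
    """Check both inequalities of condition (6), in the limit e -> 0."""
    # Defection at CO must not pay.
    deters_defection = r_i <= b_i * q_i - (1.0 - w) / w + tol
    # A guilty player must prefer accepting punishment to defecting again.
    accepts_punishment = (1.0 - w + w * r_i) * c_i >= (1.0 - w) * b_i - tol
    return deters_defection and accepts_punishment


def smallest_ratio_ctft(b_i, w):
    """Smallest q_i / r_i compatible with the first inequality of (6).

    Rearranging gives q_i / r_i >= 1/b_i + (1 - w) / (w * b_i * r_i),
    which falls as r_i rises, so the minimum is reached at r_i = 1.
    """
    r_i = 1.0
    q_i = (r_i + (1.0 - w) / w) / b_i  # equals 1 / (w * b_i)
    if q_i > 1.0:
        # Added guard (assumption): q_i must remain a probability.
        raise ValueError("b_i is too small for r_i = 1 to be feasible")
    return q_i, r_i


w = 0.95
q1, r1 = smallest_ratio_ctft(5.0, w)
credible = ctft_is_credible(q1, r1, 5.0, 4.0, w)
too_generous = ctft_is_credible(0.1, 1.0, 5.0, 4.0, w)
```

For b = 5 the limiting trigger probability is about 0.21, close to but not identical with the exact values of Table 1, which also account for the noise level e = 0.01.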
Given the particular parameter values that we examine below, the CTFT scheme we exhibit sets probabilities $r_i$ and $r_j$ to 1 (see proposition 5 in the appendix). This is not the only CTFT scheme that can potentially provide signatories with better value than the best trigger scheme. But by showing that this particular choice of CTFT design dominates the best trigger, we prove existence of better CTFT type schemes in general. Table 1 below provides data on trigger and CTFT designs given the parameter values of Figures 4c and 5c. In all cases, noise e = 0.01 and w = 0.95.

Table 1 gives the characteristics of a subgame perfect CTFT scheme that maximizes treaty value for each party, given the values of $q_i$ and $r_i$ chosen by the other side. Both sides would benefit from adopting the CTFT scheme. Indeed, recalling that treaty value under noise will be negative because unintentional deviation is inevitable, Table 1 illustrates that treaty value can decline less if a CTFT design is chosen for sanctioning. Given parameter values, the schemes of Table 1 all require swift return from punishment mode ($r_i = 1$, i = 1,2). The difference between the schemes lies in the values of $q_i$. In CTFT, $q_i$ is the probability that signatory i is found guilty if observed to deviate. For trigger schemes $q_i$ is the probability that observed deviation by i will lead to mutual defection. Since guilt in CTFT implies certain punishment, $\frac{1}{q_i}$ represents the expected delay between an observation of noncompliance and its punishment for both schemes.

CTFT’s ability to provide signatories with higher treaty value stems from the redress that is built in, as well as its generosity in case of deviation. In fact the two features are linked. Because the injured party gets some redress, more generosity can be built in without compromising treaty value.
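The Figure 5c rows of Table 1 can be checked against the rate formulas (4) and (7) together with the first-order approximation given in the table note. The sketch below is a hedged illustration with hypothetical function names. It takes the reported exact probabilities as inputs rather than deriving them, and it indexes the two players as 0 and 1.

```python
def rate_trigger(i, b, c, q, r):
    """Rate (4): loss of treaty value per unit of noise under a trigger."""
    return 1.0 - b[i] - (q[0] / r[0] + q[1] / r[1]) * c[i]


def rate_ctft(i, b, q, r):
    """Rate (7): loss of treaty value per unit of noise under CTFT."""
    j = 1 - i
    return 1.0 + q[j] / r[j] - (1.0 + q[i] / r[i]) * b[i]


def first_order_value(rate, e, w):
    """First-order treaty value, V_i approximately e * v_i / (1 - w)."""
    return e * rate / (1.0 - w)


# Figure 5c parameters and probabilities as reported in Table 1.
b, c = (5.0, 6.0), (4.0, 5.0)
e, w, r = 0.01, 0.95, (1.0, 1.0)
q_trigger, q_ctft = (0.275, 0.215), (0.221, 0.179)

v_trigger = tuple(
    round(first_order_value(rate_trigger(i, b, c, q_trigger, r), e, w), 3)
    for i in (0, 1)
)
v_ctft = tuple(
    round(first_order_value(rate_ctft(i, b, q_ctft, r), e, w), 3)
    for i in (0, 1)
)
```

With these inputs the trigger values come out at −1.192 and −1.490 and the CTFT values at −0.985 and −1.171, matching the Figure 5c rows of the table.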
TABLE 1 Treaty Value Under Alternative Designs*

| Conditions | Design | r1 | r2 | q1 | q2 | V1 | V2 |
|---|---|---|---|---|---|---|---|
| Figure 4c: b1 = 5, b2 = 5/2, c1 = 4, c2 = 3/2 | The best trigger design | 1 | 1 | 0.281 | 0.733 | −1.611 | −0.604 |
| Figure 4c: b1 = 5, b2 = 5/2, c1 = 4, c2 = 3/2 | A possible CTFT design | 1 | 1 | 0.214 | 0.430 | −1.211 | −0.472 |
| Figure 5c: b1 = 5, b2 = 6, c1 = 4, c2 = 5 | The best trigger design | 1 | 1 | 0.275 | 0.215 | −1.192 | −1.490 |
| Figure 5c: b1 = 5, b2 = 6, c1 = 4, c2 = 5 | A possible CTFT design | 1 | 1 | 0.221 | 0.179 | −0.985 | −1.171 |

*Probabilities $r_i$ and $q_i$ are exact values that account for noise magnitude e. Treaty values are first order approximations: $V_i \approx \frac{e v_i}{1 - w}$.

As can be seen in Table 1, $q_i$ for the CTFT scheme examined is systematically lower than it is for the best trigger scheme. While the inequality between partners is treated in the same way by both schemes, under CTFT, all parties are given more time to return to cooperation before any sanctioning takes place.

How can a CTFT scheme be implemented in practice? The punishment phase in CTFT requires that the deviating party, if found guilty, must return to cooperation while the injured party is allowed to deviate, enjoying the benefits of unilateral defection. The same payoff outcome can be achieved by, for example, assessing a fine against the defector to wipe out any expected advantage to unilateral noncompliance, and using part of the proceeds to compensate the compliant signatory. Both parties then return to cooperation if one party is judged guilty, and monetary fines and compensations ensure that the payoffs of unilateral defection by the compliant signatory are realized. Interestingly, in an effort to finalize the U.S.–Jordan trade agreement, the Bush administration proposed that monetary fines be the preferred enforcement mechanism for bilateral trade agreements (Inside US Trade, April 2001).
And calls for the compensation of injured parties in the face of signatory noncompliance have occasionally been made. Jackson mentions the call for compensation of injured domestic producers in antidumping cases in the House version of the 1987 trade bill, a measure that “would dramatically change the trade policy impact of the antidumping (and subsidy) rules” (Jackson, 1999, p. 274). Our analysis suggests that monetary fines in lieu of retaliatory sanctions, together with compensation of the injured party, deserve more attention.

CONCLUSION

Signatories of international treaties and agreements will sometimes fail to comply unintentionally. If what is at stake defines a mixed motive game, unintentional noncompliance must be sanctioned to avoid defection in self-interested exploitation of a treaty’s terms. However, the very fact that noncompliance can occur by accident tests the sanctioning rules and requires that they be chosen to provide signatories with the best treaty outcome. But the inequalities between treaty signatories will impact the nature of the remedies in case of unilateral defection, and so will the nature of the mixed motive game that nations play.

A reading of treaty texts reveals that sanctioning typically proceeds by retaliatory defection against the noncompliant signatory. The norm for sanctioning is therefore the implementation of trigger-like schemes, and inequality is addressed by proposing that the process leading to retaliation be accelerated when a weaker partner faces the noncompliance of a stronger signatory. Our game theoretic analysis of trigger-like sanctioning schemes suggests that all parties enjoy the best treaty value if settlement delays in the shadow of the law are adjusted according to the particular source of signatory inequality.
If inequality stems from differences in the costs associated to compliance, as is often the case when developing countries need to comply with Sanitary and Phytosanitary measures, the developed partner with the lower compliance costs should be given more time to settle in the shadow of the law if he deviates than his higher compliance cost partner. Prescriptions are reversed if the source of the unequal stakes between treaty signatories is an unequal ability to benefit from the other side’s cooperation. In trade agreements between developed and developing countries such as NAFTA, the stakes are typically higher for the developing partner. Our analysis therefore suggests that, in this case, the developing partner be given more time to settle in the shadow of the law if he deviates than the developed partner. Trigger-like sanctioning schemes are easy to implement if signatories are engaged in Prisoner’s Dilemma stage games. But, despite their prevalence, they are not well suited to the handling of Chicken or Called Bluff games that may define the stakes in environmental accords. This motivates our analysis of Contrite Tit-for-Tat that builds in redress for the victim of a unilateral defection. While this scheme can handle alternative mixed motive stage games, we also find that it can always provide better treaty value to signatories engaged in a Prisoner’s Dilemma stage game than trigger-type schemes. Schemes based on Contrite Tit-for-Tat build in more time for settlement in the shadow of the law than the trigger-like schemes that are models for treaty dispute settlement designs. If, as Busch and Rheinhardt (2003) argue, developing countries do better in early settlements than they do when a dispute moves to a WTO panel ruling, extending the time allowed for such settlements to take place would seem desirable. 
If retaliation must come to pass, however, Contrite Tit-for-Tat advocates that punishment of the deviation be accompanied by compensation for the injured party. While compensation for damages to the injured party is not a prescription under WTO rules, a number of authors suggest that it might be appropriate in disputes involving developing signatories. As stated by Hoekman and Mavroidis (2000), “violations of the WTO are disproportionally burdensome for developing countries given the fragility of many of their export industries and the fact that their export base is generally much less diversified than in high income countries.” In 2003, Ecuador’s banana exports represented almost 20 percent of the value of its exports.29 The never-ending banana dispute over the European Union’s preferential treatment of banana imports from its ex-colonies would have led to payment of damages to Ecuador under the Contrite Tit-for-Tat rule. Instead, the WTO authorized Ecuador to implement retaliatory measures against the EU that neither help Ecuador develop one of its major export sectors nor foster the free trade ideal of the WTO. If implemented in the interests of all parties involved, a Contrite Tit-for-Tat type scheme can increase treaty value for all. In the design of sanctioning schemes, our game theoretic analysis concludes that redress for the injured party is better than punishment by defection.

CONTRIBUTORS

Catherine Langlois teaches economics and game theory at the McDonough School of Business at Georgetown University. Her research interests include conflict and cooperation, treaty design and rationalist explanations of war. Her work has appeared in the American Journal of Political Science, International Studies Quarterly, the British Journal of Political Science and the Journal of Conflict Resolution.

Jean-Pierre Langlois is an applied mathematician with research interests in game theory.
His current work includes modeling in international relations and computational methods. He is the author of GamePlan, a game theory software package.

NOTES

1. For example, the WTO agreements on trade specify special dispute settlement conditions for developing countries (Footer, 2001; Michalopoulos, 2000).
2. In international law, special and differential treatment has emerged to alleviate differences in the capability to comply with international agreements (Cullet, 1999). However, while these measures have modified the stakes for unequal partners, they have not necessarily removed individual incentives to cheat, and the conditions for self enforcement must still be worked out.
3. Authors such as Koremenos, Lipson, and Snidal (2001) also embrace a rationalist perspective in their analysis of the design of international agreements and therefore adhere to the principle of deterrent punishment. Indeed, a full issue of International Organization (Vol. 55, No. 4 (Autumn) 2001) is dedicated to the rational design of institutions. While a number of authors explore rational design verbally, Kydd (2001) and Rosendorff and Milner (2001) adopt more formal approaches. Kydd presents a game theoretic analysis of the trust issues involved in NATO enlargement, while Rosendorff and Milner show that, in the presence of political uncertainty over the domestic pressure for trade barriers, a trade agreement that includes an escape clause Pareto dominates one that does not have it. We do not seek to model the specific design features of any particular international agreement. Instead we seek to understand some of the generic properties of treaty enforcement regimes.
4. Authors such as Brander (1986) or Krugman (1984) analyze imperfectly competitive situations that lead to welfare enhancing unilateral protectionism. If firms learn by doing, for example, protection of the domestic market moves domestic firms down their learning curves, and helps them to lower their costs faster than their rivals. This, argues Krugman (1984), enables domestic firms to gain share in unprotected foreign markets. Under such circumstances the consumer welfare lost from protectionism can be more than compensated for by enhanced profits made by the protected domestic firms.
5. This ensures that no alternating mix of $(C_i, D_j)$ and $(D_i, C_j)$ could achieve better discounted payoffs than constant cooperation $(C, C)$. Geometrically, this means that the point $(0, 0)$ lies above the line joining $(1, -b_2)$ to $(-b_1, 1)$ in the payoff space so that no mix of the two points could be better than cooperation for both sides.
6. We assume that signatory utility parameters remain constant over time and concentrate on the design features of punishment schemes. Relaxing this assumption requires consideration of future renegotiation and leads to the choice of a finite duration for the agreement. This aspect is examined by Koremenos et al. (2001).
7. India - Quantitative Restrictions on Imports of Agricultural, Textile and Industrial Products DS90/R.
8. Chile - Taxes on Alcoholic Beverages DS87/AB/R and DS110/AB/R.
9. Chile - Taxes on Alcoholic Beverages AB-1999–6, p. 21, para. 62.
10. Indeed, the appeals panel also commented on the suggestion that Chile’s taxation system was designed to continue providing once acknowledged import protection in the following terms: “Members of the WTO should not be assumed, in any way, to have continued previous protection or discrimination through the adoption of a new measure. This would come close to a presumption of bad faith.”
11. United States - Import Prohibition of Certain Shrimp and Shrimp Products DS58.
12. United States - Import Prohibition of Certain Shrimp and Shrimp Products DS58, p. 12, para. 3.7.
13. This has been done successfully by authors such as Reinhardt (2001) for the WTO Dispute Settlement Understanding or authors such as Rosendorff and Milner (2001) for escape clauses in trade agreements.
14. Table A1 in the appendix gives noisy utilities for all possible intended moves by the players.
15. Precise definition of $U_i^k$ requires specifying the strategy under consideration since the strategy determines the states of the game. For example, a trigger scheme defines three states: CO, in which both parties are expected to cooperate, and two reversion states $P_i$, in which both parties are expected to defect subsequent to the defection of one party or the other. $U_i^{CO}$ is then the expected discounted payoff to $i$ when in state CO.
16. We will limit our discussion to Markov strategies for which players distinguish a finite (rather than an infinite) set of possible states of the game. Our exclusive consideration of Markov strategies rules out strategies that build on the whole of past history but admits consideration of a history of any length as long as it is finite. Although we focus on Markov strategies, and therefore identify Markov Perfect equilibria (MPE), the equilibrium result is not limited to Markov strategies. A signatory cannot do better by deviating from its Markov strategy in an MPE by using a non-Markov strategy. In other words, an MPE is also an SPE within the set of all possible strategies. The reader is referred to Fudenberg and Tirole (1991, pp. 513–515) for a more technical discussion of these issues.
17. The U.S. threatened China with trade sanctions for continued traffic in rhino horns and tiger bone in 1993, and imposed trade sanctions on Taiwan in April 1994. The sanctions on Taiwan were lifted in June 1995 (www.glo.gov.tw). The U.S. also threatened to impose trade sanctions on Japan for its trade in endangered hawksbill sea turtles in 1991.
While these threats and sanctions were CITES related, they were undertaken under the Pelley Agreement that allows the USTR to restrict trade in wildlife products originating from a country suspected of noncompliance with an international regime such as CITES (Glennon and Stuart, 1998).
18. European Communities - Regime for the Importation, Sale and Distribution of Bananas WT/DS27.
19. It is worth noting here that probabilistic trigger schemes were actually developed to handle situations in which intent and observation could differ. Porter (1983), Green and Porter (1984) and Abreu, Pearce, and Stacchetti (1986) are standard references.
20. This assumes that noise is small enough and that the equilibrium designs are stable under noise. These assumptions are elaborated upon conceptually in the nontechnical preamble in the appendix.
21. Normalization involves two steps. First, add 1 to all of $i$’s payoffs to normalize the payoff to mutual cooperation to 0. This yields payoff matrix
$$\begin{pmatrix} 0,\ 0 & -5,\ 2 \\ 1,\ -5 & -4,\ -3 \end{pmatrix}.$$
Second, divide all of $i$’s payoffs by 2 to yield
$$\begin{pmatrix} 0,\ 0 & -5,\ 1 \\ 1,\ -\tfrac{5}{2} & -4,\ -\tfrac{3}{2} \end{pmatrix},$$
which is payoff matrix 4c.
22. In the general case, starting from the normalized symmetric payoff matrix
$$\begin{pmatrix} 0,\ 0 & -b,\ 1 \\ 1,\ -b & -c,\ -c \end{pmatrix}$$
and introducing a compliance cost $\kappa$ for signatory $i$ as the column player leads to payoff matrix
$$\begin{pmatrix} 0,\ -\kappa & -b,\ 1 \\ 1,\ -b-\kappa & -c,\ -c \end{pmatrix}.$$
Normalizing first involves adding $\kappa$ to $i$’s payoffs, yielding payoff matrix
$$\begin{pmatrix} 0,\ 0 & -b,\ 1+\kappa \\ 1,\ -b & -c,\ -c+\kappa \end{pmatrix}.$$
Dividing $i$’s payoffs by $1+\kappa$ to normalize the payoff to unilateral defection to 1 yields
$$\begin{pmatrix} 0,\ 0 & -b,\ 1 \\ 1,\ \dfrac{-b}{1+\kappa} & -c,\ \dfrac{-c+\kappa}{1+\kappa} \end{pmatrix}.$$
Clearly, $c_i = \dfrac{c-\kappa}{1+\kappa} < c = c_j$.
23. We have approximated $q_i$ and $q_j$ by their values as noise $\varepsilon$ goes to 0 following (5a–b). Exact values given $\varepsilon = 0.01$, for example, are $q_i = 0.733$ and $q_j = 0.284$. Mathematica notebooks that enable these calculations are available from the authors upon request.
24. In the general case, starting from the normalized symmetric payoff matrix
$$\begin{pmatrix} 0,\ 0 & -b,\ 1 \\ 1,\ -b & -c,\ -c \end{pmatrix},$$
an extra benefit $\beta$ from partner $j$’s cooperation, accruing to signatory $i$ as the column player, leads to payoff matrix
$$\begin{pmatrix} 0,\ \beta & -b,\ 1+\beta \\ 1,\ -b & -c,\ -c \end{pmatrix}.$$
Normalizing by subtracting $\beta$ from $i$’s payoffs yields
$$\begin{pmatrix} 0,\ 0 & -b,\ 1 \\ 1,\ -b-\beta & -c,\ -c-\beta \end{pmatrix}.$$
Clearly, $c_i = c + \beta > c = c_j$.
25. Again, we approximate $q_i$ and $q_j$ by their values as noise $\varepsilon$ goes to 0 following (5a and b). Exact values given $\varepsilon = 0.01$, for example, are $q_i = 0.220$ and $q_j = 0.274$.
26. The idea of rewarding the victim of a unilateral defection is not new to game theorists and has been associated with trigger schemes in various guises. Fudenberg and Maskin (1991) and Morrow (1994) describe strategies that call for two phases. In the punishment phase, victim and perpetrator defect long enough to ensure that the perpetrator loses any short-term gain from unilateral defection. This phase is then followed by a “reward” phase during which the perpetrator cooperates while the victim is allowed to defect enough to reap his reward for carrying out punishment in the first place. But such a scheme would reduce treaty values since it would involve more defection on the part of the victim without avoiding the spate of joint defection by which trigger designs punish the perpetrator.
27. Note that while ordinary Tit-for-Tat is not subgame perfect, the scheme we propose is.
28. See the Corollary of Proposition 3 in the appendix for proof.
29. http://www.intracen.org/countries/structural05//ecu_8.pdf
30. A sharpening of our approximation of $V_i$ by extending the calculation to involve second order terms, or any number of (nonlinear) higher order terms, would accommodate larger assumed noise magnitudes.
31. Condition (iii) can be ensured by assuming that the derivative $\Psi'(\varepsilon)$ is bounded.
Tit-for-Tat and the grim trigger are typical of a failure of condition (iv). That $m$ is differentiable can be inferred from the previous conditions.

REFERENCES

Abreu, Dilip, David Pearce, and Ennio Stacchetti (1986). “Optimal Cartel Equilibria with Imperfect Monitoring.” Journal of Economic Theory, Vol. 39 (June), pp. 251–269.
Aggarwal, Vinod K. and Cedric Dupont (1999). “Goods, Games, and Institutions.” International Political Science Review, Vol. 20, No. 4, October, pp. 393–409.
Boyd, Robert (1989). “Mistakes Allow Evolutionary Stability in the Repeated Prisoner’s Dilemma Game.” Journal of Theoretical Biology, Vol. 136, No. 1, pp. 47–56.
Brams, Steven and Marc Kilgour (1988). Game Theory and National Security, New York: Basil Blackwell.
Brander, James A. (1986). “Rationales for Strategic Trade and Industrial Policy,” in Strategic Trade Policy and the New International Economics, Paul R. Krugman, ed., Cambridge, MA: MIT Press.
Busch, Marc L. and Eric Reinhardt (2003). “Developing Countries and GATT/WTO Dispute Settlement.” Manuscript, forthcoming, Journal of World Trade.
Cameron, James and Karen Campbell, eds. (1998). Dispute Resolution in the World Trade Organization, London: Cameron May.
Chayes, Abram and Antonia Chayes (1991). “Compliance Without Enforcement.” Negotiation Journal, Vol. 7 (July), pp. 311–331.
Chayes, Abram and Antonia Chayes (1993). “On Compliance.” International Organization, Vol. 47 (Spring), pp. 175–205.
Chayes, Abram and Antonia Handler Chayes (1995). The New Sovereignty, Cambridge, MA: Harvard University Press.
Chayes, Abram, Antonia Handler Chayes, and Ronald B. Mitchell (1998). “Managing Compliance: A Comparative Perspective,” in Engaging Countries: Strengthening Compliance with International Environmental Accords, Edith Brown Weiss and Harold K. Jacobson, eds., Cambridge, MA: MIT Press.
Cullet, Philippe (1999). “Differential Treatment in International Law: Towards a New Paradigm of Inter-state Relations.” European Journal of International Law, Vol. 10, No. 3, pp. 549–582.
Damodaran, A. (2002). “Conflict of Trade Facilitating Environmental Regulations with Biodiversity Concerns: The Case of Coffee-Farming Units in India.” World Development, Vol. 30, No. 7, pp. 1123–1135.
Downs, George W. and David M. Rocke (1990). Tacit Bargaining, Arms Races, and Arms Control, Ann Arbor: University of Michigan Press.
Downs, George W., David M. Rocke, and Peter N. Barsoom (1996). “Is the Good News About Compliance Good News About Cooperation?” International Organization, Vol. 50 (Summer), pp. 379–406.
Finger, J. M. and P. Schuler (1999). “Implementation of Uruguay Round Commitments: The Development Challenge.” World Bank Policy Research Working Paper No. 2215, October.
Finus, Michael (2001). Game Theory and International Environmental Cooperation, Cheltenham, UK: Edward Elgar Publishing Ltd.
Footer, Mary E. (2001). “Developing Country Practice in the Matter of WTO Dispute Settlement.” Journal of World Trade, Vol. 35, No. 1, February, pp. 55–98.
Fudenberg, Drew and Eric Maskin (1991). “On the Dispensability of Public Randomization in Discounted Repeated Games.” Journal of Economic Theory, Vol. 53, pp. 428–438.
Glennon, Michael J. and Alison L. Stuart (1998). “The United States: Taking Environmental Treaties Seriously,” in Engaging Countries: Strengthening Compliance with International Environmental Accords, Edith Brown Weiss and Harold K. Jacobson, eds., Cambridge, MA: MIT Press, pp. 173–213.
Green, Edward J. and Robert H. Porter (1984). “Noncooperative Collusion Under Imperfect Price Information.” Econometrica, Vol. 52 (January), pp. 87–100.
Henson, Spencer and Rupert Loader (2001). “Barriers to Agricultural Exports from Developing Countries: The Role of Sanitary and Phytosanitary Requirements.” World Development, Vol. 29, No. 1, pp. 85–102.
Hoekman, Bernard M. and Petros C. Mavroidis (2000). “WTO Dispute Settlement, Transparency and Surveillance.” The World Economy, Vol. 23, No. 4, April, pp. 527–542.
Inside US Trade (1996). “U.S. Rescinds Retaliatory Sanctions against EU for Hormone Ban,” Vol. 14 (July 19).
Inside US Trade (1999). “Unlikely to Lift Beef Hormone Ban; U.S. Set to Retaliate,” Vol. 17 (July 23).
Inside US Trade (2001). “Draft USTR Paper on Monetary Fines,” Vol. 19 (April 27).
Jackson, John H. (1999). The World Trading System: Law and Policy of International Economic Relations, 2nd ed., Cambridge, MA: MIT Press.
Jacobson, Harold K. and Edith Brown Weiss (1998). “Assessing the Record and Designing Strategies to Engage Countries,” in Engaging Countries: Strengthening Compliance with International Environmental Accords, Edith Brown Weiss and Harold K. Jacobson, eds., Cambridge, MA: MIT Press.
Jasanoff, Sheila (1998). “Contingent Knowledge: Implications for Implementation and Compliance,” in Engaging Countries: Strengthening Compliance with International Environmental Accords, Edith Brown Weiss and Harold K. Jacobson, eds., Cambridge, MA: MIT Press.
Koremenos, Barbara, Charles Lipson, and Duncan Snidal (2001). “The Design of International Institutions.” International Organization, Vol. 55, No. 4 (Autumn), pp. 761–799.
Kouparitsas, Michael (1997). “A Dynamic Macroeconomic Analysis of NAFTA.” Economic Perspectives, Vol. 21, Issue 1, Jan./Feb., pp. 14–36.
Krugman, Paul R. (1984). “Import Protection as Export Promotion,” in Monopolistic Competition and International Trade, H. Kierzkowski, ed., Oxford: Oxford University Press.
Kydd, Andrew (2001). “Trust Building, Trust Breaking: The Dilemma of NATO Enlargement.” International Organization, Vol. 55, No. 4 (Autumn), pp. 801–828.
Lipnowski, Irwin and Shlomo Maital (1983). “Voluntary Provision of a Pure Public Good as a Game of ‘Chicken’.” Journal of Public Economics, Vol. 20, No. 2, April, pp. 381–386.
Michalopoulos, Constantine (2000). “Trade and Development in the GATT and WTO: The Role of Special and Differential Treatment for Developing Countries.” World Bank Paper, April 19.
Missfeldt, Fanny (1999). “Game Theoretic Modeling of Transboundary Pollution.” Journal of Economic Surveys, Vol. 13, No. 3, pp. 287–321.
Morrow, James (1994). Game Theory for Political Scientists, Princeton, NJ: Princeton University Press.
Nottage, Hunter (2003). “Trade and Competition under the WTO: Pondering the Applicability of Special and Differential Treatment.” Journal of International Economic Law, Vol. 6, No. 1 (March), pp. 23–47.
Porter, Robert H. (1983). “Optimal Cartel Trigger Price Strategies.” Journal of Economic Theory, Vol. 29, No. 2, pp. 313–338.
Reinhardt, Eric (2001). “Adjudication without Enforcement in GATT Disputes.” Journal of Conflict Resolution, Vol. 45, No. 2 (April), pp. 174–196.
Rosendorff, Peter B. and Helen V. Milner (2001). “The Optimal Design of International Trade Institutions: Uncertainty and Escape.” International Organization, Vol. 55, No. 4 (Autumn), pp. 829–857.
Sand, P. H. (1997). “Commodity or Taboo? International Regulation of the Trade in Endangered Species,” in The Green Globe Yearbook 1997, H. O. Bergesen and G. Parmann, eds., Oxford: Oxford University Press, pp. 19–36.
Schmidt, Carsten (2000). Designing International Environmental Agreements: Incentive Compatible Strategies for Cost-Effective Cooperation, Northampton, MA, and Cheltenham, UK: Edward Elgar.
Simmons, Beth A. (1998). “Compliance with International Agreements.” Annual Review of Political Science, Vol. 1, pp. 75–93.
Staiger, Robert W. (1995). “International Rules and Institutions for Trade Policy,” in Handbook of International Economics, Volume 3, Gene M. Grossman and Kenneth Rogoff, eds., Handbooks in Economics, Amsterdam, New York and Oxford: Elsevier, North Holland, pp. 1495–1551.
Sugden, R. (1986). The Economics of Rights, Cooperation and Welfare, Oxford: Basil Blackwell.
Weber, Steve (1991). Cooperation and Discord in US-Soviet Arms Control, Princeton, NJ: Princeton University Press.

APPENDIX

1. Non Technical Preamble

An important characteristic of all the designs we will examine here is their behavior as noise is introduced. In all cases the formulation of strategy remains constant while its exact parameters may vary with the introduction of noise. Definition 1 in the appendix calls such strategies “stable under noise.” Stability under noise refers to the structural stability of the design, to the smoothness of adjustment of its strategic parameters, and to the changing long-run frequency with which the states distinguished by the design are visited, as noise is introduced. In particular, structure, parameters, and long-run frequencies approach those of the standard noiseless version of the design as noise approaches zero. In our perspective as “designers” of desirable strategic equilibria, noise is an inevitable nuisance that must be taken into account at treaty design time, knowing that the signatories will be largely unable to change the design without substantial renegotiation. Our concept of stability under noise ensures that the design is structurally stable both as noise is introduced and for different noise magnitudes. Propositions 3 and 4 in the appendix verify that the strategies we examine are “stable under noise” and exploit that fact to derive treaty values.

Our technical developments also assume that noise magnitude $\varepsilon$ is small. This assumption has a number of technical advantages but it also reflects a point of view on the behavior of treaty signatories. If the managerial school is to be believed, treaty negotiators strive to make texts as transparent as possible, and signatories want and intend to abide by the agreement signed. Signatories might then be observed to defect some of the time but certainly not most of the time or even much of the time.
Moreover, if the mismatch between intent and realization occurred too often (say, half of the time), the consequences of intentional cooperation would be indistinguishable from those of intentional defection and game theory would be powerless.

As a technical matter, a small noise parameter $\varepsilon$ is helpful for the following two reasons. First, various schemes can be compared with reference to the rate $\nu_i$ at which treaty value $V_i$ declines when noise is introduced. Second, the analytical difficulties involved in finding explicit formulae under noise drive a need to approximate treaty value $V_i$. If $\varepsilon$ is small, rate $\nu_i$ can be used in a standard “first order approximation” of $V_i$. The technical conditions for the first order approximation to exist are spelled out in Theorem 1 in the appendix and supported by the formal expression for $V_i$ given in Proposition 2. If $\varepsilon$ is small, the first order approximation is reliable.30 Explicit formulae for the rates $\nu_i$ at which long-run values decline with noise, given the treaty designs we consider, are derived in the corollary to Proposition 3 and in Proposition 4.

2. Mathematical Appendix

In vector form, noisy expected utilities $U_i^k$ must satisfy
$$U_i = W_i + wMU_i \tag{A1}$$
or
$$U_i = [I - wM]^{-1} W_i \tag{A2}$$
where $W_i$ and $M$ are the appropriate vector and matrix representations of the states’ noisy payoffs and noisy transition probabilities. Noisy payoffs are given in Table A1 below.

TABLE A1 Noisy Payoffs $W_i^k$

Intentions (by $i$, by $j$) | Prob. observed (C,C) | Prob. observed (D,C) | Prob. observed (C,D) | Prob. observed (D,D) | Noisy payoff to $i$
(C,C) | $(1-\varepsilon)^2$ | $\varepsilon(1-\varepsilon)$ | $\varepsilon(1-\varepsilon)$ | $\varepsilon^2$ | $u_i(C,C) = \varepsilon(1-\varepsilon)(1-b_i) - \varepsilon^2 c_i$
(D,C) | $\varepsilon(1-\varepsilon)$ | $(1-\varepsilon)^2$ | $\varepsilon^2$ | $\varepsilon(1-\varepsilon)$ | $u_i(D,C) = (1-\varepsilon)^2 - \varepsilon^2 b_i - \varepsilon(1-\varepsilon)c_i$
(C,D) | $\varepsilon(1-\varepsilon)$ | $\varepsilon^2$ | $(1-\varepsilon)^2$ | $\varepsilon(1-\varepsilon)$ | $u_i(C,D) = \varepsilon^2 - (1-\varepsilon)^2 b_i - \varepsilon(1-\varepsilon)c_i$
(D,D) | $\varepsilon^2$ | $\varepsilon(1-\varepsilon)$ | $\varepsilon(1-\varepsilon)$ | $(1-\varepsilon)^2$ | $u_i(D,D) = \varepsilon(1-\varepsilon)(1-b_i) - (1-\varepsilon)^2 c_i$

$W_i$ in (A1) and (A2) depends on the Markov strategy $\Psi$ under consideration. In Trigger, $W_i = \langle u_i(C,C),\, u_i(D,D),\, u_i(D,D) \rangle$ and, in CTFT, $W_i = \langle u_i(C,C),\, u_i(C,D),\, u_i(D,C) \rangle$ for $i = 1$, with $u_i(C,D)$ and $u_i(D,C)$ exchanged for $i = 2$. The solution formulae for $U_i^k$ are, in general, complicated and hardly tractable for theoretical results. Instead, results can be based on the observation that noise magnitude $\varepsilon$ is “small” and that for stable definitions of strategy much can be learnt from a differential analysis (as $\varepsilon \to 0$). Proposition 1, Definition 1, and Theorem 1 are the basis for this approach.

Proposition 1: As $\varepsilon \to 0$, credibility conditions obtain for pairs $(q_i(\varepsilon), r_i(\varepsilon))$ with limit $(q_i, r_i)$ as $\varepsilon \to 0$ satisfying

1) For trigger:
$$r_i \le c_i q_i - \frac{1-w}{w} \tag{A3}$$
2) For CTFT:
$$r_i \le b_i q_i - \frac{1-w}{w} \quad \text{and} \quad (1 - w + wr_i)\,c_i \ge (1-w)\,b_i \tag{6}$$

Proof: As $\varepsilon \to 0$, (A2) yields

1) For trigger:
$$U_i^{CO} = 0 + wU_i^{CO} = 0$$
$$U_i^{P_i} = -c_i + w\left\{(1-r_i)U_i^{P_i} + r_i U_i^{CO}\right\} = \frac{-c_i}{1 - w + wr_i}$$
Deterrence (of player $i$) is achieved when $1 + wq_i U_i^{P_i} \le U_i^{CO} = 0$, or (A3). It is optimal for player $i$ to comply (play D while $j$ plays D) with punishment in state $P_i$ when $-c_i + wU_i^{P_i} \le U_i^{P_i}$, which always holds since
$$-c_i \le (1-w)U_i^{P_i} = -(1-w)\frac{c_i}{1 - w + wr_i}.$$

2) For CTFT:
$$U_i^{CO} = 0 + wU_i^{CO} = 0$$
$$U_i^{G_i} = -b_i + w\left\{(1-r_i)U_i^{G_i} + r_i U_i^{CO}\right\} = \frac{-b_i}{1 - w + wr_i}$$
Deterrence (of player $i$) is achieved when $1 + wq_i U_i^{G_i} \le 0$, or:
$$r_i \le b_i q_i - \frac{1-w}{w} \tag{A4}$$
It is optimal for player $i$ to comply (play C while $j$ plays D) with punishment in state $G_i$ when $-c_i + wU_i^{G_i} \le U_i^{G_i}$, which holds provided that
$$-c_i \le (1-w)U_i^{G_i} = -(1-w)\frac{b_i}{1 - w + wr_i}$$
or
$$(1 - w + wr_i)\,c_i \ge (1-w)\,b_i \tag{A5}$$
It is optimal for player $j$ to play D against C in state $G_i$ and to comply when returning to state CO in order to avoid becoming guilty. Indeed, if player $j$ fails to use D while in state $G_i$, this does not affect status or return probability $r_i$. Q.E.D.
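The limiting conditions of Proposition 1 can be checked numerically. The following is a minimal sketch in Python, not the authors' code (their calculations use Mathematica notebooks). It evaluates the noiseless punishment-state values from the proof and tests the deterrence and compliance inequalities directly. The names `Scheme`, `punishment_value`, `trigger_credible`, and `ctft_credible` are hypothetical, as is the small tolerance `TOL` used to absorb floating-point error when a constraint is saturated.

```python
from dataclasses import dataclass

TOL = 1e-12  # assumed tolerance for saturated constraints


@dataclass
class Scheme:
    w: float  # discount factor, 0 < w < 1
    b: float  # b_i: loss to i when i cooperates and j defects
    c: float  # c_i: loss to i under mutual defection
    q: float  # q_i: limiting probability that i's observed defection is sanctioned
    r: float  # r_i: limiting probability of returning to state CO


def punishment_value(loss: float, w: float, r: float) -> float:
    """Solve U = -loss + w * ((1 - r) * U + r * 0), with U_CO = 0."""
    return -loss / (1 - w + w * r)


def trigger_credible(s: Scheme) -> bool:
    u_p = punishment_value(s.c, s.w, s.r)          # U_i^{P_i}
    deterrence = 1 + s.w * s.q * u_p <= TOL        # equivalent to (A3)
    compliance = -s.c + s.w * u_p <= u_p + TOL     # always holds in the trigger
    return deterrence and compliance


def ctft_credible(s: Scheme) -> bool:
    u_g = punishment_value(s.b, s.w, s.r)          # U_i^{G_i}
    deterrence = 1 + s.w * s.q * u_g <= TOL        # equivalent to (A4)
    compliance = -s.c + s.w * u_g <= u_g + TOL     # equivalent to (A5)
    return deterrence and compliance


if __name__ == "__main__":
    # Illustrative parameter values only; they are not taken from the paper.
    print(trigger_credible(Scheme(w=0.9, b=3.0, c=0.5, q=1.0, r=0.3)))
    print(ctft_credible(Scheme(w=0.5, b=3.0, c=0.1, q=1.0, r=0.1)))
```

Testing the inequalities on the state values, rather than on the rearranged forms (A3) to (A5), keeps the sketch close to the proof and makes it easy to see which of the two conditions fails for a given design.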
Proposition 2: The long-run value $V_i$ of a Markov strategy pair $\Psi$ reads
$$V_i = \frac{1}{1-w}\sum_k m_k W_i^k$$
where $W_i^k$ is $i$’s noisy expected payoff determined by player intentions in state $k$ (see Table A1) according to Markov strategy $\Psi$, and $m_k$ is the long-run frequency with which state $k$ is visited.

Proof: Following standard Markov chain theory, if player $i$ receives discounted payoff $U_i^k$ in state $k$, the long-run value to player $i$ of the strategy characterized by the set of states $\{1, 2, \ldots, k, \ldots, n\}$ and transition matrix $M$ is $V_i = \sum_{k=1}^{n} m_k U_i^k$. In dot product form, using (A1) and $mM = m$:
$$V_i = m \cdot U_i = m \cdot W_i + w\,mMU_i = m \cdot W_i + w\,m \cdot U_i = m \cdot W_i + wV_i, \quad \text{so that} \quad V_i = \frac{1}{1-w}\, m \cdot W_i \tag{A6}$$
Q.E.D.

Definition 1: A family $\Psi(\varepsilon)$ of Markov strategy pairs is “stable under noise” (SUN) if: (i) $\Psi(\varepsilon)$ is differentiable for $\varepsilon > 0$; (ii) all $\Psi(\varepsilon)$ share the same set of states; (iii) there exists a unique $\Psi = \lim_{\varepsilon \to 0}\Psi(\varepsilon)$, meaning that the probability of each move at each state in $\Psi(\varepsilon)$ approaches the probability of that move in $\Psi$; (iv) there is a unique invariant distribution $m$ of the transition matrix $M$ of $\Psi$; and (v) $m(\varepsilon)$ is differentiable in $\varepsilon$ and its derivative has a limit $m' = \lim_{\varepsilon \to 0} m'(\varepsilon)$.31

Theorem 1: If $\Psi(\varepsilon)$ is SUN then: (i) $m$ satisfies $m' = m'M + mM'$; (ii) player $i$’s long-run value $V_i(\varepsilon)$ is differentiable in $\varepsilon$; (iii) $\lim_{\varepsilon \to 0} V_i'(\varepsilon)$ exists and is given by
$$V_i'(0) = \frac{1}{1-w}\,\nu_i \quad \text{with} \quad \nu_i = m_0' \cdot W_i^0 + m_0 \cdot w_i^0,$$
where $W_i^0 = W_i(0)$ is the noiseless utility vector and $w_i^0$ is the derivative $\left.\dfrac{dW_i}{d\varepsilon}\right|_{\varepsilon = 0}$; and (iv) there exists $O(\varepsilon)$ such that $\lim_{\varepsilon \to 0} O(\varepsilon) = 0$ and
$$V_i(\varepsilon) = \frac{\varepsilon}{1-w}\,\nu_i + \varepsilon\,O(\varepsilon).$$

Proof: Since $m$ and $M$ are differentiable, the product rule yields the first formula. Moreover, since $W_i$ is also differentiable, by Proposition 2:
$$V_i'(\varepsilon) = \frac{1}{1-w}\left\{ m' \cdot W_i + m \cdot \frac{dW_i}{d\varepsilon} \right\} \tag{A7}$$
and by stability of $\Psi$ under noise $V_i'$ has the given limit. By the Mean Value Theorem of Calculus, $V_i(\varepsilon) = V_i(0) + \varepsilon V_i'(n_i)$ for $0 < n_i = n_i(\varepsilon) < \varepsilon$. But by the above limit there exists $o_i(n)$ such that $V_i'(n) = V_i'(0) + o_i(n)$ with $\lim_{n \to 0} o_i(n) = 0$. Letting $O(\varepsilon) = \max_i \{o_i(n_i(\varepsilon))\}$ and observing that $V_i(0) = 0$ yield the result. Q.E.D.

Proposition 3: The family of triggers with fixed $q_i$ and $r_i$ is SUN. Moreover, $\nu_i^{trigger}$ is given by
$$\nu_i^{trigger} = 1 - b_i - c_i\left(\frac{q_1}{r_1} + \frac{q_2}{r_2}\right).$$

Proof: Conditions (i), (ii), and (iii) of Definition 1 are satisfied since $\Psi$ is constant in $\varepsilon$. Under noise magnitude $\varepsilon$ the transition matrix (on states CO, $P_1$, $P_2$) reads:
$$M(\varepsilon) = \begin{pmatrix} 1 - \varepsilon(1-\varepsilon)(q_1 + q_2) & \varepsilon(1-\varepsilon)q_1 & \varepsilon(1-\varepsilon)q_2 \\ r_1 & 1 - r_1 & 0 \\ r_2 & 0 & 1 - r_2 \end{pmatrix}$$
and
$$M_0' = \begin{pmatrix} -(q_1 + q_2) & q_1 & q_2 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$
Writing $m_1'$ and $m_2'$ for the components of $m_0'$ on states $P_1$ and $P_2$, the vector $m_0'$ satisfying $m_0'[I - M_0] = m_0 M_0' = \langle -(r_1 m_1' + r_2 m_2'),\, r_1 m_1',\, r_2 m_2' \rangle$ is
$$m_0' = \left\langle -\left(\frac{q_1}{r_1} + \frac{q_2}{r_2}\right),\ \frac{q_1}{r_1},\ \frac{q_2}{r_2} \right\rangle.$$
Treaty value and $\nu_i^{trigger}$ then result from Theorem 1. Q.E.D.

Corollary: As $\varepsilon \to 0$, subgame perfect triggers with maximum long-run values obtain for pairs $(q_i(\varepsilon), r_i(\varepsilon))$ with limit $(q_i, r_i)$ as $\varepsilon \to 0$ satisfying $r_i = 1$ and $q_i = \dfrac{1}{wc_i}$ if $wc_i > 1$, and $r_i = c_i - \dfrac{1-w}{w}$ and $q_i = 1$ if $wc_i < 1$. The corresponding minimum value $a_i^{trigger}$ of $\dfrac{q_i}{r_i}$ is given by
$$a_i^{trigger} = \begin{cases} \dfrac{w}{wc_i - (1-w)} & \text{if } wc_i \le 1 \\[2ex] \dfrac{1}{wc_i} & \text{if } wc_i \ge 1 \end{cases} \tag{A8}$$

Proof: $\nu_i^{trigger}$ is maximum when both $\dfrac{q_i}{r_i}$ are minimum. According to Proposition 1 we examine the limit case as $\varepsilon \to 0$. For $\varepsilon = 0$ the minimum of $\dfrac{q_i}{r_i}$ is found when a credibility constraint is saturated, which means
$$\frac{q_i}{r_i} = \frac{wq_i}{wc_i q_i - (1-w)} \quad \text{with} \quad q_i > \frac{1-w}{wc_i} \ \text{(to ensure } r_i > 0\text{)}.$$
The minimum of $\dfrac{q_i}{r_i}$ occurs for the maximum $q_i$ that allows both $q_i$ and $r_i$ to be probabilities. Either this means $q_i = 1$ and the given $r_i$ when it is a probability, or it means $r_i = 1$ and the given $q_i$, which must then be a probability. The minimum value $a_i^{trigger}$ follows immediately. Q.E.D.

Proposition 4: The family of CTFT with constant $q_i$ and $r_i$ is SUN. Moreover,
$$\nu_i^{ctft} = 1 + \frac{q_j}{r_j} - \left(1 + \frac{q_i}{r_i}\right) b_i.$$

Proof: Conditions (i), (ii), and (iii) of Definition 1 clearly hold since $\Psi(\varepsilon)$ is constant in $\varepsilon$. Condition (iv) is just as obvious with $m_{CO} = 1$. To obtain $m'$ we need $M'$, for which we need $M = M(\varepsilon)$. When CC is intended (at CO) a unilateral defection by $i$ occurs with probability $\varepsilon(1-\varepsilon)$ and a bilateral defection occurs with probability $\varepsilon^2$. The probability that player $i$ alone becomes guilty from state CO as a result of noise is:
$$q_i\,\varepsilon(1-\varepsilon) + q_i(1-q_j)\,\varepsilon^2 = \varepsilon q_i(1 - \varepsilon q_j).$$
The first term corresponds to an observed unilateral defection by $i$ and the second to a bilateral defection with $i$ alone being found guilty. The probability of return from $G_i$ to CO is simply $r_i(1-\varepsilon)$ since $i$ will be observed to cooperate as expected with probability $(1-\varepsilon)$. The transition matrix under noise $\varepsilon$ therefore reads:
$$M(\varepsilon) = \begin{pmatrix} 1 - \varepsilon(q_1 + q_2 - 2\varepsilon q_1 q_2) & \varepsilon q_1(1 - \varepsilon q_2) & \varepsilon q_2(1 - \varepsilon q_1) \\ r_1(1-\varepsilon) & 1 - r_1(1-\varepsilon) & 0 \\ r_2(1-\varepsilon) & 0 & 1 - r_2(1-\varepsilon) \end{pmatrix}$$
Thus
$$M_0' = \begin{pmatrix} -(q_1 + q_2) & q_1 & q_2 \\ -r_1 & r_1 & 0 \\ -r_2 & 0 & r_2 \end{pmatrix}.$$
The vector $m_0'$ satisfying $m_0'[I - M_0] = m_0 M_0' = \langle -(q_1 + q_2),\, q_1,\, q_2 \rangle$ is therefore:
$$m_0' = \left\langle -\left(\frac{q_1}{r_1} + \frac{q_2}{r_2}\right),\ \frac{q_1}{r_1},\ \frac{q_2}{r_2} \right\rangle.$$
Treaty value and $\nu_i^{ctft}$ then result from Theorem 1. Q.E.D.

Proposition 5: For any Prisoner’s Dilemma there exists a CTFT scheme such that $\nu_i^{ctft} > \nu_i^{trigger}$ for both $i$.

Proof: In CTFT, the minimum value of $\dfrac{q_i}{r_i}$ subject to the credibility conditions (A4) and (A5) is given by
$$\frac{q_i}{r_i} = a_i^{ctft} = \begin{cases} \dfrac{w}{wb_i - (1-w)} & \text{if } wb_i \le 1 \\[2ex] \dfrac{1}{wb_i} & \text{if } wb_i \ge 1 \end{cases}$$
We verify that $\nu_i^{ctft} > \nu_i^{trigger}$ for the choice $a_i^{ctft}$ against the best trigger choice $a_i^{trigger}$:
$$\nu_i^{ctft} = 1 + a_j^{ctft} - b_i\left(1 + a_i^{ctft}\right) > 1 - b_i - c_i\left(a_i^{trigger} + a_j^{trigger}\right) = \nu_i^{trigger}$$
clearly holds if $b_i a_i^{ctft} - c_i a_i^{trigger} \le 0$. This last inequality must be verified in three cases:

1. If $wb_i > wc_i \ge 1$ it reduces to $\dfrac{b_i}{wb_i} - \dfrac{c_i}{wc_i} = 0$;
2. If $wc_i \le wb_i \le 1$ it reduces to $\dfrac{wb_i}{wb_i - (1-w)} - \dfrac{wc_i}{wc_i - (1-w)} \le 0$, which holds since $b_i > c_i$ and the expression $\dfrac{x}{x - (1-w)}$ is decreasing in $x > 1 - w$.
3. If $wc_i \le 1$ and $wb_i \ge 1$ it reduces to $\dfrac{b_i}{wb_i} - \dfrac{wc_i}{wc_i - (1-w)} = -\dfrac{(1-w)(1 - wc_i)}{w\,(wc_i - (1-w))} \le 0$ since $wc_i > (1-w)$ is implied by (3).

Q.E.D.
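The comparison in Proposition 5 can be illustrated numerically. The sketch below, written in Python as an assumption-laden illustration rather than the authors' own computation, takes the closed forms for $a_i^{trigger}$ (A8), $a_i^{ctft}$, and the rates of Propositions 3 and 4 as given and checks the inequality on a small grid of Prisoner's Dilemma parameters with $b_i > c_i$ and $wc_i > 1 - w$. The function names (`a_min`, `nu_trigger`, `nu_ctft`, `ctft_dominates`) and the grid values are hypothetical, and the sketch does not re-derive or re-check the credibility conditions (A4) and (A5).

```python
from itertools import product


def a_min(w: float, loss: float) -> float:
    """Smallest ratio q/r from (A8); loss is c_i (trigger) or b_i (CTFT)."""
    if w * loss <= 1 - w:
        raise ValueError("requires w * loss > 1 - w")
    if w * loss <= 1:
        return w / (w * loss - (1 - w))  # q = 1, r = loss - (1 - w) / w
    return 1 / (w * loss)                # r = 1, q = 1 / (w * loss)


def nu_trigger(w, b, c, i):
    """Rate from Proposition 3 at the best trigger design; b, c are pairs."""
    j = 1 - i
    return 1 - b[i] - c[i] * (a_min(w, c[i]) + a_min(w, c[j]))


def nu_ctft(w, b, c, i):
    """Rate from Proposition 4 at the best CTFT design (c is unused here)."""
    j = 1 - i
    return 1 + a_min(w, b[j]) - b[i] * (1 + a_min(w, b[i]))


def ctft_dominates(w, b, c):
    """True when CTFT gives both players a strictly higher rate."""
    return all(nu_ctft(w, b, c, i) > nu_trigger(w, b, c, i) for i in (0, 1))


def check_grid():
    """Check Proposition 5 on an illustrative grid (values are assumptions)."""
    discounts = (0.5, 0.7, 0.9, 0.99)
    losses = (0.6, 0.9, 1.2, 1.5, 2.0, 3.0, 5.0)
    for w in discounts:
        valid = [x for x in losses if w * x > 1 - w]
        for c1, c2, b1, b2 in product(valid, repeat=4):
            if b1 > c1 and b2 > c2 and not ctft_dominates(w, (b1, b2), (c1, c2)):
                return False
    return True


if __name__ == "__main__":
    w, b, c = 0.9, (3.0, 2.5), (2.0, 1.5)
    for i in (0, 1):
        print(i, nu_trigger(w, b, c, i), nu_ctft(w, b, c, i))
    print(check_grid())
```

A grid check of this kind is no substitute for the three-case proof, but it is a quick way to catch a sign or index error when transcribing the formulas.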