Dispute Settlement Design for Unequal Partners

International Interactions, 33:347–382, 2007
Copyright © Taylor & Francis Group, LLC
ISSN: 0305-0629
DOI: 10.1080/03050620701681809
Dispute Settlement Design for Unequal
Partners: A Game Theoretic Perspective
International Interactions, Vol. 33, No. 4, October 2007
JEAN-PIERRE P. LANGLOIS
Department of Mathematics, San Francisco State University, San Francisco,
California, USA
CATHERINE C. LANGLOIS
McDonough School of Business, Georgetown University, Georgetown,
District of Columbia, USA
The listing order of the authors’ names is not indicative of their respective contributions, which they consider to be equal.
Address correspondence to Professor Jean-Pierre P. Langlois, San Francisco State University, Department of Mathematics, 1600 Holloway Ave., Thornton Hall, Ninth Floor, Office: TH 926, San Francisco, CA 94132, USA. E-mail: [email protected]
When signatories of international agreements fail to comply unintentionally, sanctioning rules designed to deter intentional noncompliance are tested. To provide signatories with the best treaty
value, we find that remedies in case of unilateral defection must
account for the nature of the inequality between treaty partners,
as well as the type of mixed motive game they are engaged in. Trigger-type schemes, which rely on punishment by mutual defection,
are the norm for sanctioning in treaty texts. Inequality is
addressed by proposing that the process leading to retaliation be
accelerated when a weaker partner faces the noncompliance of a
stronger partner. Our analysis suggests instead that the prescription depends on the source of the inequality. If inequality stems
from differences in the costs associated with compliance, the stronger
partner, with the lower compliance costs, should be given more
time, not less, to settle in the shadow of the law if he deviates.
Despite their prevalence, trigger schemes are not well suited to the
handling of Chicken or Called Bluff games that may define the
stakes in environmental accords. This motivates our analysis of an
alternative sanctioning scheme that builds in redress for the victim
of a unilateral defection. In addition to its ability to handle alternative game structures, we find that this scheme provides better
treaty value than trigger-type schemes, as well as credible deterrence, to signatories engaged in a Prisoner’s Dilemma game. We
conclude that, in the design of sanctioning schemes, redress for the
injured party is better than punishment by defection.
KEYWORDS dispute settlement design, treaty value, game theory
Nations of unequal size, power or influence sign international agreements
and devise ways to punish noncompliance that take explicit account of the
differences in signatory circumstance.1 If mixed motives are present, the
game theoretic principles of credible deterrence should inform the design of
such dispute settlement rules. Indeed, treaty sanctioning provisions should
credibly deter willful breach, and many possible schemes can achieve this
goal. But, if signatories can deviate from treaty provisions unintentionally,
the choice of a sanctioning rule will be consequential because sanctioning
will necessarily come to pass. Punishment moves the parties away from the
benefits of full cooperation, but some sanctioning provisions do this more
than others. Our objective is to compare credible sanctioning rules, including those that mimic the rules typically prescribed by international treaties,
according to their ability to keep unequal partners cooperating as much as
possible throughout the life of the agreement.
From a game theoretic viewpoint, the unequal status of treaty signatories shows up in the structure of payoffs, and this impacts the design and
effectiveness of dispute settlement rules. In this paper, we identify and evaluate subgame perfect designs for dispute settlement in two-player asymmetric games. Such designs ensure that, despite their differences, willful breach
is in the interest of neither party. Our analysis embraces the premise that
even if special and differential treatment is applied to compensate for
nations’ unequal stakes in the benefits of cooperation,2 many of the issues
regulated by international treaty determine interests that would give advantage to a unilateral defector. In this we find ample support in the literature,
including the writings of international lawyers such as Jacobson and Brown
Weiss (1998). Following the writings of authors such as Chayes and Chayes
(1991, 1993, 1995), we also recognize that noncompliance will arise from
misunderstanding, lack of transparency, or lack of indigenous resources in
support of compliance.
The existence of mixed motives, when associated with possibly unintentional noncompliance, has profound consequences for a game theoretic
analysis. Under these circumstances, game theoretic provisions must
account for the observation of noncompliance, regardless of intent, and the
possible testing of the sanctioning rules in place as a result. Indeed, the
implementation of sanctions is consequential for signatory progress toward
full cooperation. Clearly, if a grim trigger were proposed as a sanctioning
scheme, a single instance of noncompliance, if punished, would preclude
cooperation forever more. By contrast, a trigger scheme that would only
punish for a period of two years would allow for a return to cooperation
thereafter. Intuitively, signatories are better off if they can return to full
cooperation sooner rather than later. This simple insight enables us to
define a formal criterion that measures the value to signatories of dispute
settlement designs that credibly deter willful breach. This, together with an
application of the criterion to a selection of possible game theoretic designs
for two unequal partners, is the main contribution of the paper.
Punishment schemes that credibly deter intentional noncompliance
share the same characteristic: the noncompliant signatory stands to lose in
punishment at least as much as he or she has gained from deviant behavior.
But the possible schemes take on a bewildering variety of forms. Punishment of the deviating signatory can be achieved by implementing a trigger-like mechanism. Retaliation in kind, the recognized mode of punishment in
such schemes, imposes the payoffs of mutual defection on both parties.
The injured party therefore incurs costs when punishing the defector. But
punishment can also involve at least partial redress for the injured party’s
losses.
Trigger-like schemes are the norm for sanctioning in treaty texts, and
inequality is addressed by proposing that the process leading to retaliation
be accelerated when a weaker partner faces the noncompliance of a stronger partner. Our analysis suggests, instead, that the prescription depends on
the source of the inequality. If inequality stems from differences in the costs
associated with compliance, the stronger partner, with the lower compliance
costs, should be given more time, not less, to settle in the shadow of the
law if he deviates. Our analysis also points to the inadequacy of trigger-like
schemes if the parties are not engaged in a classic Prisoner’s Dilemma but in
alternative mixed motive games such as Chicken or Called Bluff. In such
cases, we show that a tit-for-tat-like scheme, in which the injured party gets
at least partial redress, offers credible deterrence. But most importantly,
even if the signatories are engaged in a Prisoner’s Dilemma, this same
scheme can be calibrated to provide all parties with better treaty value than
if they adopted a trigger-like scheme.
COMPLIANCE WITH INTERNATIONAL AGREEMENTS AS A GAME THEORETIC ISSUE
Compliance with International Agreements: An Overview
The literature on compliance is vast and a comprehensive overview is not
our purpose here. However, it is of interest to note, in broad strokes, the
prominent themes that mark the field. Treaty design and the issue of compliance have been approached by two groups of scholars that stand in stark
contrast. Rationalist scholars such as Downs, Rocke, and Barsoom (1996)
argue for the central role of deterrent punishment,3 while international lawyers such as Chayes and Chayes (1991, 1993, 1995) question the usefulness
of game theory and the sanctioning propensities that stem from it.
The managerial school recognizes that sanctioning is the appropriate
response to willful breach, but points to “the error of conceptualizing most
compliance problems as being due to intentional violations” (Chayes et al.,
1998, p. 39). Instead, Chayes and Chayes emphasize the willingness of
nations to comply with international agreements most of the time, and
attribute noncompliance to a lack of means or a misunderstanding of the
terms of the agreement. If noncompliance is mostly unintentional, the failure to cooperate should be managed and not punished (Chayes and
Chayes, 1991, 1993, 1995). Downs, Rocke and Barsoom (1996) respond
with arguments about the depth of treaties. If a treaty prescribes a course of
action that signatories would have followed in the absence of an agreement,
compliance will indeed be observed. But if treaties are deep, exposing a
widening benefit to unilateral defection, signatories will defect in exploitative self-interest unless the threat of sanctions is powerful enough to deter
such behavior. While proponents of the managerial school downplay the
relevance of willful breach, rationalists argue that cooperation cannot be
achieved without coercive means. Simmons (1998) argues that these
approaches to compliance “are not mutually exclusive, and the less one is
willing to straw-man the arguments of the major proponents, the clearer
become the numerous points of overlap,” (Simmons, 1998, p. 76). We agree
wholeheartedly.
The stark contrast between managerial and rationalist prescriptions
only exposes the surface of the compliance debate. Indeed it is necessary to
understand the nature and context of compliance in order to devise means
to encourage it. Those who have analyzed the complexities of the phenomenon are perhaps more inclined to be managerial than to be rationalist. Yet
their conclusions bring much grist to the mill of a game theoretic approach
to cooperation. Jacobson and Brown Weiss’s (1998) analysis of compliance
with environmental accords is of particular interest. These authors conclude
a major review of the compliance of eight nations and the European Union
with five environmental treaties by proposing that the measures that will
most effectively enhance compliance depend on the relative strength of a
signatory’s intent and ability to comply with the terms of the agreement.
These authors offer a measured view of appropriate strategies where sanctions are deemed “crucial in agreements where free riding is possible and
could carry significant rewards” (Jacobson and Brown Weiss, 1998, p. 548),
but where positive inducements and monitoring also play a major role.
Thus the threat of coercive action typical of a game theoretic approach is
clearly recognized as necessary if mixed motives are present. Yet the recognition that noncompliance can rarely be construed as an intentional and
self-interested attempt at exploitation of the treaty’s terms tempers the call
for coercive measures. A game theoretic approach to treaty design needs to
take into account the possibility of unintentional noncompliance, while
ensuring that willful breach is effectively deterred.
Nations certainly project the appearance of compliance failure with the
terms of a variety of treaties and agreements, even when these include provisions for the punishment of noncompliant behavior. To date, over three
hundred cases have been brought to World Trade Organization (WTO) dispute resolution bodies since
1995, despite the WTO’s strengthening of the sanctioning rules that prevailed under the General Agreement on Tariffs and Trade
(GATT). Numerous infractions of the Convention on International Trade in
Endangered Species (CITES) are reported by Sand (1997) despite provisions
for dispute settlement that allow for retaliatory trade restrictions (Jacobson
and Brown Weiss, 1998, p. 527). All or most of these instances of apparent
breach may be unintentional, due to a lack of means rather than a lack of
will. But their very existence brings up the issue of sanctioning. However, it
is the stakes involved that determine the need for sanctioning rules rather
than any clear evidence that nations do fail to comply. Indeed, if cooperation is the best outcome for all, signatories would all have a strong willingness to comply even if compliance was not always possible. In such a case,
sanctioning rules could play no useful role. But if mixed motives are
present, then game theoretic principles dictate that sanctioning of observed
noncompliance is necessary to prevent intentional breach. The relevance
and nature of sanctioning possibilities depend on signatory incentives.
What, then, is the structure of the games that nations play and regulate?
Do Treaties Regulate a Game of Mixed Motives?
MIXED MOTIVES IN TRADE, SECURITY, AND THE ENVIRONMENT
Mixed motives come in various shapes and sizes, from the incentives
present in the classic Prisoner’s Dilemma to the risky temptations of the
Chicken or the Called Bluff games. While many authors have examined the
mixed motives of a Prisoner’s Dilemma, the structure of payoffs depends on
the issue area. The Prisoner’s Dilemma has been the game structure of
choice to describe security or trade games. Downs and Rocke (1990) and
Brams and Kilgour (1988) use the Prisoner’s Dilemma to describe the stakes
inherent in the arms control game. Weber (1991) justifies this choice in a
careful analysis of the stakes involved. While the security game played by
the ex-Soviet Union with the United States may have been construed as a
game of equal partners, the current security relationship between the United
States and Russia clearly is not. Indeed, Russia is at a severe economic disadvantage in the race to modernize arsenals while paring the existing stockpile of arms. As a result, Russia would now lose more from a unilateral
defection of the U.S. than the U.S. would from Russia’s defection, while
mutual defection would be more costly to Russia than to the U.S. Using
numerical values for parameters, the stage game has shifted from a symmetric Prisoner’s Dilemma as represented in Figure 1a to one that resembles
Figure 1b.
FIGURE 1 Changing Stakes in the Security Game.
Downs et al. (1996), in their discussion of compliance with the WTO
agreements, assume that the rewards from trade liberalization are also characterized by the mixed motives of the Prisoner’s Dilemma. The assumption
is supported by a number of arguments from the economics literature: Giving additional weight to business interests over consumer welfare gains in
the determination of the overall welfare gains from trade will reveal a unilateral incentive to adopt protectionist measures. Staiger (1995) incorporates
such weighting in a simple partial equilibrium tariff-setting model highlighting the Prisoner’s Dilemma structure of the game created as a result.4 The
argument for mixed motives in trade when extended to the relationship
between a developed and developing trading partner highlights the unequal
burden on developing countries of retaliation (Hoekman and Mavroidis,
2000). To the extent that developing country export bases are less diversified, restricted access to a developed market will impose severe costs.
Developed countries, when facing similar restrictions from a developing
partner, are less likely to lose a determining outlet for their products and are
more able to compensate for the loss by boosting trade in other goods and
switching to alternative markets. Thus a developed partner imposes more
harm on the developing country by defecting than its smaller, poorer partner can impose by doing likewise. As in the security game, asymmetric payoffs would resemble those of Figure 1b.
Games involving collaboration to produce a public good such as pollution abatement come with special characteristics that transform the nature of
the stakes. In such games the participating nations incur a cost when cooperating, but also benefit from their own cooperation as well as that of others. While it has been shown that the redistribution of the gains from
collective action can always ensure that treaty signatories are collectively
better off cooperating on pollution abatement, free-riding on other countries’ abatement efforts can make individual signatories still better off (Missfeldt, 1999). Most countries are both polluters and victims of pollution but
“national costs and benefits of abatement efforts are distributed asymmetrically across nations” (Schmidt, 2000, p. 44). Lipnowski and Maital (1983)
suggest that in trans-boundary pollution games, it may be worse to defect in
the presence of a free rider than to continue to cooperate. Signatories
would then be playing a game of Chicken. Such a structure is also suggested by Finus (2001) for pollution abatement games if the value of clean
air is high enough relative to the costs of producing it. We illustrate an
asymmetric Chicken game numerically in Figure 2a. Aggarwal and Dupont
(1999) provide a comprehensive characterization of the structure of public
goods provision games in which countries incur a cost of provision that is
shared if the parties cooperate. Unequal benefits coupled with the unilateral
incentive to free-ride on the other’s provision can lead to a situation where
one party prefers to produce the public good alone despite the other side’s
defection, while the other prefers mutual defection to a situation where he
continues to cooperate when faced with free-riding behavior. Such a situation might well be relevant to the sadly timely provision of monitoring and
control of terrorist activities. This is a Called Bluff game whose incentives
are illustrated numerically in Figure 2b.
FIGURE 2 Payoff Structures in Environmental Games.
We conclude from this overview that the presence of mixed motives in
salient international issue areas is widely acknowledged. Moreover, the
above discussion points to the importance of asymmetry, and to the variety
of game structures that may lurk beneath a treaty’s terms.
THE GENERIC STAGE GAME AND PLAYER OBJECTIVES
The various game structures that we have discussed find formal representation in the following two player stage game:
                                    Player 2
                           Cooperate C      Defect D
Player 1   Cooperate C     0, 0             −b1, 1
           Defect D        1, −b2           −c1, −c2
FIGURE 3 The Generic Stage Game.
The players’ decisions are to either comply with or violate the terms of the
treaty, that is, to cooperate or defect. We have normalized the payoff to full cooperation to 0 and the payoff to unilateral defection to 1, and the parameters ci and bi are strictly
positive (ci, bi > 0). To ensure that mutual cooperation is the best long-run
outcome if it can be sustained, we assume that b1, b2 > 1 (see note 5). The nature of the
mixed motive game depends on the relative stakes bi and ci. If bi > ci for
both i, the players face a classic Prisoner’s Dilemma. If bi < ci for both i,
they are engaged in a game of Chicken, and if (say) b1 > c1 and b2 < c2, the
game is one of Called Bluff, and player 2 is called.
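As an illustration only, the classification rule just stated can be written as a short Python sketch. The function name classify_stage_game and its error handling are our own assumptions and not part of the formal model. The case b1 < c1 and b2 > c2 is treated as Called Bluff with player 1 called, by symmetry with the case described in the text, and ties (bi = ci) are left unclassified because the text does not cover them.

```python
def classify_stage_game(b1, c1, b2, c2):
    """Classify the generic stage game of Figure 3 from its parameters.

    Payoffs (player 1, player 2):
      (C, C) = (0, 0)      (C, D) = (-b1, 1)
      (D, C) = (1, -b2)    (D, D) = (-c1, -c2)
    """
    if min(b1, c1, b2, c2) <= 0:
        raise ValueError("b_i and c_i must be strictly positive")
    if b1 <= 1 or b2 <= 1:
        raise ValueError("the text assumes b1, b2 > 1")
    if b1 == c1 or b2 == c2:
        # Assumption: boundary cases are not classified in the text.
        return "unclassified (b_i = c_i)"

    # Player i has Prisoner's Dilemma stakes when being exploited (-b_i)
    # is worse than mutual defection (-c_i).
    pd1, pd2 = b1 > c1, b2 > c2
    if pd1 and pd2:
        return "Prisoner's Dilemma"
    if not pd1 and not pd2:
        return "Chicken"
    # Mixed case: the player with b_i < c_i is the one who is "called".
    called = 2 if pd1 else 1
    return f"Called Bluff (player {called} is called)"


print(classify_stage_game(3, 2, 3, 2))  # Prisoner's Dilemma
print(classify_stage_game(2, 3, 2, 3))  # Chicken
print(classify_stage_game(3, 2, 2, 3))  # Called Bluff (player 2 is called)
```

The numerical values in the three calls are hypothetical and are not those of Figures 1 and 2.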
Players engaged in the repetition of the stage game of Figure 3 will
base their decisions on the expected discounted payoffs (for player i at integer time t ≥ 0):
\[ U_i(t) = \sum_{s=0}^{\infty} w^{s}\, u_i(t+s) \tag{1} \]
where ui (t + s) is the utility that player i expects to derive at time (t + s) and
w is the discount factor.6 Utilities ui (t + s) are expectations that result from
player intentions: the utility parameters of Figure 3 and the probability of
mismatch between observed behavior and player intention. Before we provide explicit formulation for ui (t + s), we turn to the modeling of unintentional noncompliance.
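A minimal numerical sketch of the discounted payoff in (1) may help fix ideas. It assumes a finite list of per-period utilities, so the infinite sum is truncated, and the function names discounted_payoff and constant_stream_value are hypothetical. For a constant per-period utility u, the infinite sum has the familiar closed form u / (1 − w), which the sketch uses as a check.

```python
def discounted_payoff(utilities, w):
    """Truncated version of equation (1): sum over s of w**s * u_i(t + s).

    `utilities` lists u_i(t), u_i(t + 1), ... for finitely many periods.
    """
    if not 0 <= w < 1:
        raise ValueError("the discount factor w must satisfy 0 <= w < 1")
    return sum((w ** s) * u for s, u in enumerate(utilities))


def constant_stream_value(u, w):
    """Closed form of equation (1) when u_i(t + s) = u in every period."""
    if not 0 <= w < 1:
        raise ValueError("the discount factor w must satisfy 0 <= w < 1")
    return u / (1 - w)


# A long truncated stream is numerically indistinguishable from the closed form.
approx = discounted_payoff([1.0] * 2000, 0.9)
exact = constant_stream_value(1.0, 0.9)
print(abs(approx - exact) < 1e-9)  # True
```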
UNINTENTIONAL NONCOMPLIANCE AND TREATY VALUE
Unintentional Noncompliance: Some Examples of Its Meaning
Failure to comply can fall within legally acknowledged signatory capacity
constraints, and be explicitly permitted under the terms of the international
agreement. Conceptually distinct is the noncompliance that might result
from a conflict with other legal obligations, or a misreading of the agreement’s terms. Game theory’s concern is with the latter because, if it results
in de facto noncompliance, it may be difficult to distinguish from willful
breach in exploitation of the treaty’s terms. From a game theoretic
viewpoint, punishment of unintentional noncompliance is necessary if
intentional noncompliance is to be deterred. Thus it is de facto noncompliance that is punishable, regardless of intent. In this, game theoretic and
legal principles converge. A few examples will illustrate instances of possibly unintentional noncompliance that trigger the remedies game theory has
to propose.
Ruling out the case where a signatory’s capacity to comply is so weak
that defection is the best course of action, observed noncompliance must
still be measured against the special and differential treatment afforded to
weaker international partners. For example, when India’s imposition of import
restraints was brought to the WTO by the United States in the India – Quantitative Restrictions case,7
the issue of noncompliance was to be weighed
against the possibility for a developing country to restrict imports in order to
maintain a favorable balance of payments position. If import restrictions are
found necessary to support certain macroeconomic equilibria, maintaining
them does not constitute a failure to comply. But there is noncompliance
with the terms of the WTO agreements if the country’s balance of payments
situation is too favorable to support imposing such import restrictions. Noncompliance then comes from a misinterpretation of the extent of the special
measures that apply to a developing member of the WTO, rather than a
willful attempt at self-interested exploitation of the treaty’s terms. But it is
punishable nevertheless.
Allegedly unintentional noncompliance as the byproduct of competing
obligations is also frequently observed in trade disputes. Examples in the
WTO are many, and panel rulings go to great lengths to judge a practice on
its face rather than in the terms of any legislative intent. In a case brought
against Chile’s taxation of alcoholic beverages by the European Communities
(EC),8 the EC contended that by taxing spirits according to alcohol content,
imports of liquors such as whisky, gin or tequila were improperly inhibited
although they constituted direct substitutes to the slightly weaker domestic
pisco. On appeal the WTO panel ruled against Chile although the taxation
system was acknowledged to be “facially neutral.” The fact that imports fell
disproportionately in the high tax brackets made the measure de facto discriminatory and, the panel stated, “the subjective intentions inhabiting the
minds of legislators or regulators do not bear on the inquiry.”9 The implicit
presumption was that signatories do not intend to violate the terms of the
WTO agreements in exploitative self interest.10 But the WTO panel commentary can also be interpreted as meaning that only deviation matters
regardless of the nature of signatory intentions.
While domestic regulatory considerations may inadvertently interfere
with compliance with international agreements, nations may also disagree
on the criteria for compliance. A nation’s interpretation of the terms of the
agreement may then lead to a judgment of noncompliance by relevant legal
bodies, although the intent is arguably not self-interested exploitation. The
differing view of what constitutes protection of an endangered species in
the Shrimp–Turtle case, brought against the United States by a number of
shrimp exporting countries, is an example.11 In this case, a number of countries protested the United States’ decision to prohibit importation of shrimp
from nations that did not use Turtle Excluder Devices (TEDs) when fishing
for shrimp in certain waters. The U.S. was alleging improper compliance
with the CITES agreement and imposing trade sanctions as a result. One of
the injured parties, Malaysia, “while recognizing that the use of TEDs was
a step in contributing to the conservation of turtles, . . . . . considered that
it was just one of the many accepted methods for the conservation of
turtles.”12 And Malaysia claimed “a comprehensive legal framework on the
conservation and management of marine turtles,” including prohibition of
capture and establishment of sanctuaries. Other plaintiffs were similarly
adamant about their efforts to protect the species. While countries, in good
faith, can claim compliance with the same standards, their practices may
well differ (Jasanoff, 1998). But as signatories differ in their interpretation of
treaty obligations, so do interpretations of behavior, and signatories will
periodically be presumed to defect on their treaty obligations even if they
meant to comply.
From a game theoretic viewpoint, it matters little what the particular
misunderstanding that leads to noncompliance might be. What matters is
that defection will be observed in a context where purity of intent cannot
be assumed since mixed motives are present. A priori, signatories can
expect to observe noncompliance by the other side with some likelihood,
presumably small, since it is established that all parties want to comply. Our
goal will now be to highlight the properties of alternative game theoretic
remedies within the simplest model that captures the essence of the incentive structure and behavioral patterns that prevail among treaty signatories.
Our perspective is that of a designer of dispute settlement regimes. We do
not seek to apply a mathematical mirror to legal processes that might
already be in place.13
Modeling Unintentional Noncompliance and Treaty Value
Unintentional failure to comply with the terms of an international agreement
will show up in observations that are similarly shrouded in uncertainty
about their true nature. A signatory may then observe that a treaty partner is
failing to comply when in fact that same partner’s intent was to cooperate.
But by the same token, actual deviations from compliant behavior may not
be viewed as such. As a result our signatories operate in a noisy environment. We formalize this situation as follows: the observation of player i’s
move x_i ∈ {C, D} reflects this player’s intent with probability (1 − e). But with
some probability e, observation and intent do not coincide. Thus player i,
with probability e, will be observed to defect while he intended to cooperate, an event that triggers implementation of the sanctioning scheme in
place. Thus if the players intended to cooperate at date t + s, player i’s utility ui (t + s) will read:
\[ u_i(t+s) = u_i(C,C) = (1-e)^2 \times 0 + e(1-e) \times (-b_i) + (1-e)e \times 1 + e^2 \times (-c_i) = e(1-e)(1-b_i) - e^2 c_i \tag{2} \]
Utility ui (t + s) thus accounts for player intentions (to cooperate in this
case), the utility parameters of Figure 3, and the probability of mismatch
between observed behavior and player intention.14
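The expectation in (2) can be checked by enumerating the four possible observation outcomes. The Python sketch below is illustrative only. It assumes, as the product probabilities in (2) suggest, that observation errors are independent across players and occur with the same probability e for either intent, and that payoffs accrue according to observed moves. The function names are hypothetical.

```python
from itertools import product


def payoff(own, other, b_i, c_i):
    """Stage payoff to player i from Figure 3, given the observed moves."""
    table = {
        ("C", "C"): 0.0,
        ("D", "C"): 1.0,    # player i defects unilaterally
        ("C", "D"): -b_i,   # player i is exploited
        ("D", "D"): -c_i,   # mutual defection
    }
    return table[(own, other)]


def flip(move):
    """Return the opposite move."""
    return "D" if move == "C" else "C"


def expected_utility(intent_own, intent_other, b_i, c_i, e):
    """Expected per-period utility of player i under observation noise e."""
    total = 0.0
    for own_err, other_err in product((False, True), repeat=2):
        # Assumption: the two observation errors are independent.
        prob = (e if own_err else 1 - e) * (e if other_err else 1 - e)
        own = flip(intent_own) if own_err else intent_own
        other = flip(intent_other) if other_err else intent_other
        total += prob * payoff(own, other, b_i, c_i)
    return total


# Hypothetical parameter values, for illustration only.
b_i, c_i, e = 2.0, 1.5, 0.05
closed_form = e * (1 - e) * (1 - b_i) - e ** 2 * c_i  # right-hand side of (2)
print(abs(expected_utility("C", "C", b_i, c_i, e) - closed_form) < 1e-12)  # True
```

The same function can be evaluated for other intent pairs, such as ("D", "D") in a punishment phase, under the same assumptions.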
When engaged in a mixed motive game, the best that a player can
achieve is the payoff to mutual cooperation as long as neither party exploits
the other by defecting unilaterally. One of the purposes of strategy is therefore to prevent such illegal defection by threatening appropriate punishment.
But if each player is inevitably faced, at some point, with the other’s apparent
noncompliance, she will need to be poised and ready to implement punishment. Sanctioning moves the players away from full cooperation and reduces
the value of the treaty to the signatories. If retaliation means that both parties
defect for some time, they will be receiving the payoffs of mutual defection
rather than those they could reach by cooperating. The frequency of noncompliance, together with the type of sanctioning scheme in place, will determine
the long run probability with which the players find themselves in one payoff
state or another. Formally, let m_k be the long-run frequency with which signatories visit state k, and U_i^k be player i’s expected discounted payoff when in
state k.15 If the particular design for dispute resolution under study determines
n possible states of the game, treaty value for player i reads:
\[ V_i = \sum_{k=1}^{n} m_k U_i^{k} \tag{3} \]
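To make the computation in (3) concrete, the following Python sketch is offered as an illustration rather than as the procedure used in the paper. It assumes that the states of a sanctioning scheme evolve as a finite Markov chain, so that the long-run frequencies m_k can be approximated by power iteration on a transition matrix. Power iteration converges for the aperiodic chain used here, but not for every chain. The two-state transition matrix, the state payoffs, and the function names are all hypothetical.

```python
def stationary_distribution(P, iterations=1000):
    """Approximate long-run state frequencies of a Markov chain.

    P[j][k] is the (hypothetical) probability of moving from state j to k.
    """
    n = len(P)
    m = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iterations):
        m = [sum(m[j] * P[j][k] for j in range(n)) for k in range(n)]
    return m


def treaty_value(frequencies, state_payoffs):
    """Equation (3): V_i = sum over k of m_k * U_i^k."""
    if len(frequencies) != len(state_payoffs):
        raise ValueError("one payoff is needed per state")
    if abs(sum(frequencies) - 1.0) > 1e-9:
        raise ValueError("long-run frequencies must sum to 1")
    return sum(m * U for m, U in zip(frequencies, state_payoffs))


# Hypothetical two-state scheme: state 0 = cooperation, state 1 = punishment.
P = [[0.9, 0.1],
     [0.5, 0.5]]
m = stationary_distribution(P)      # approximately [5/6, 1/6]
V = treaty_value(m, [0.0, -6.0])    # hypothetical U_i^k values
print(m, V)
```

In this hypothetical example, raising the probability of leaving the punishment state increases the long-run frequency of cooperation and therefore the treaty value, which is the intuition behind the criterion.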
What credible sanctioning schemes maximize treaty value for each of the
signatories? Game theorists have dealt with the credibility issue by requiring that
strategy pairs (in a two-player game) form a subgame perfect equilibrium (SPE).
This means that the strategic plan adopted by each party must be the best possible, given player objectives, and regardless of the circumstances. But it also
means that mutual cooperation in intent maximizes signatory payoffs. The best
dispute settlement procedure from a game theoretic viewpoint therefore maximizes treaty value as defined in (3) by keeping both signatories cooperating
with the highest long run likelihood. The strategies that we examine in what follows all participate in an SPE. This guarantees that it is in the best interest of the
signatories to implement the sanctioning rules if noncompliance is observed.16
We now turn to our dispute resolution schemes. We first describe trigger designs, arguing that dispute resolution procedures implemented under
agreements such as the WTO, or CITES can be interpreted as trigger
schemes. We then examine an alternative scheme that calls for redress for
the compliant signatory at the retaliation stage. Such a scheme can provide
both sides with better treaty value.
TRIGGER DESIGNS FOR UNEQUAL PARTNERS
Treaty Dispute Resolution Procedures as Trigger Mechanisms
Negotiated settlement is the preferred solution to disputes over compliance
with a treaty’s terms. If settlement fails, then elaborate provisions typically
involving third parties come into play. Thus, disputes over the Law of the
Sea are brought to special arbitral tribunals, the International Tribunal for
the Law of the Sea, or the International Court of Justice while trade disputes
can be brought to a WTO panel. While the decisions of these bodies are
binding, their implementation is left in the hands of the states involved. As
Koremenos, Lipson, and Snidal (2001) put it, “most international organizations have relatively decentralized enforcement arrangements. They specify
possible punishments for rule violations but leave it up to the members to
apply them.” When states engage in formal dispute settlement, they obtain
the terms of legitimate remedy and the allowed timing of their imposition,
while enforcement remains the decision of the parties in the dispute. Thus,
a WTO panel ruling against the perpetrator of a trade restriction opens the
official door to retaliatory action, but does not preclude further attempts at
negotiated settlements. Victims have, of course, also been known to impose,
or threaten, punishing sanctions unilaterally. As early as 1989, for example,
the U.S. unilaterally invoked the Pelly Amendment to impose trade sanctions on Taiwan for trading in endangered rhino horn.17
From a game theoretic perspective, two features of treaty dispute resolution processes are noteworthy: observed defection may not lead to retaliation at all, but if it does, punishment is imposed for an uncertain but
possibly long period of time. The few instances of retaliation under WTO
rules are instructive: In April of 1999, the United States imposed $191.4 million in retaliatory sanctions against European agricultural products to protest
Europe’s refusal to adequately modify its banana importation regime which
extended trade preferences to the less developed, banana-producing,
African, Caribbean, and Pacific countries.18 A satisfactory resolution of the
conflict was reached in April, 2001 (New York Times, April 12, 2001) and the
United States removed retaliatory sanctions worth $191 million on July 1,
2001. The banana dispute had been ongoing since 1993, and retaliatory
measures had been in place for just over two years.
In another case against the European Union, the U.S. imposed $100
million in retaliatory sanctions as early as 1989 to protest Europe’s refusal to
import hormone-treated beef. This particular set of retaliatory measures was
rescinded by the United States in July of 1996 (Inside US Trade, July 19, 1996),
under European threat to call for an official WTO investigation of U.S. unilateral sanctions under Section 301. The measures had been in place for seven
years. A new set of U.S. retaliatory sanctions worth $116.8 million was
imposed in July of 1999 and is in force at the time of writing (Inside US Trade,
July 23, 1999). The dispute on Europe’s ban of hormone-treated beef has now
lasted for thirteen years, and retaliatory measures have been in place for ten
of these years. Many disputes do not lead to retaliation at all. In those that do,
retaliatory measures are in place for uncertain periods of time. These features
of dispute resolution are characteristic of probabilistic trigger schemes.19
Probabilistic trigger schemes operate as follows: observed unilateral
defection by signatory i is followed, with some probability qi, by reversion
to “punishment mode” where defection is expected from both sides. Once
in punishment mode, a return to cooperation by both parties occurs only
after a punishment period of probabilistic or deterministic length $T_i$. In the
probabilistic case, the expected length can be specified by a return probability
$r_i$ (the expected length is then $T_i = 1/r_i$). The lower the trigger probability $q_i$,
the longer it will take ($1/q_i$ in an expected sense) before a continuing unilateral
defection will be punished. In the meantime, mutual cooperation can
be reestablished through negotiated settlement, in which case the defection
goes unpunished. This represents generosity in the face of noncompliance.
Clearly, unequal partners need not abide by the same trigger probability
rules to ensure credibility of retaliatory threats. The time spent in punishment mode, Ti (or equivalently the likelihood of return to cooperation ri),
depends on the identity of the presumed defector. And the signatories need
not behave with the same generosity, giving the other more or less time to
reestablish cooperation if he has failed to comply.
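
The timing implied by a probabilistic trigger can be illustrated with a short sketch. It assumes that retaliation is triggered independently in each period with probability q, and that punishment ends independently in each period with probability r, so that both waiting times are geometric. This is one reading of the description above, not a procedure taken from any treaty text, and the function names are illustrative only.

```python
import random


def expected_delay(prob):
    """Mean number of periods until an event that occurs with
    probability `prob` in each period (geometric waiting time)."""
    if not 0 < prob <= 1:
        raise ValueError("probability must lie in (0, 1]")
    return 1.0 / prob


def simulate_delay(prob, trials=100_000, seed=0):
    """Monte Carlo estimate of the same mean, as a check on the closed form."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        periods = 1
        while rng.random() >= prob:  # the event did not occur this period
            periods += 1
        total += periods
    return total / trials
```

Under these assumptions, `expected_delay(q)` is the expected time left for settlement before retaliation, and `expected_delay(r)` is the expected length of the punishment phase.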
Interestingly, the WTO dispute resolution understanding, arguably the
most developed of international dispute settlement protocols (Cameron and
Campbell, 1998), also calls for differential delays for less developed member
countries. The focus of the WTO has been to help developing countries
move through dispute resolution procedures faster than developed countries. In game theoretic terms, the WTO would prescribe that a developing
country retaliate against a unilateral defector with a likelihood qj that is
higher than the one prevailing for its developed partner. But Busch and
Rheinhardt (2003) suggest that these differential rules have disadvantaged
less developed signatories. These authors argue that developing nations
obtain larger concessions “in the shadow of the law,” before a dispute is
paneled. An increase in likelihood qj reduces the time afforded to the signatories to settle before costly legal proceedings are engaged. This, according
to Busch and Rheinhardt (2003), works against less developed country
interests, and developing nations should be given more, not less, time to
settle disputes against a noncompliant developed partner before a panel is
convened. Our game theoretic analysis of trigger type mechanisms for the
settlement of disputes suggests a more complex reality. Differences in signatory circumstances can lead to a variety of possible payoff configurations.
These in turn dictate the parameters of the trigger mechanisms that signatories can use. In some cases, developing partners are better off with more
time to settle with a deviating developed partner. But in others it is better to
speed up the dispute settlement process for the weaker partner. It all
depends on the nature of the asymmetry between signatories.
Identifying Best Trigger Mechanisms for Unequal Partners
Trigger schemes, whether probabilistic or deterministic, call for the identification of a clear noncooperative state that is detrimental to the defector. Mutual
defection in the Prisoner’s Dilemma, a Nash equilibrium of the stage game, provides a natural reversion point in case of unilateral defection of one signatory.
TRIGGER SCHEMES AND TREATY VALUE FOR THE PRISONER’S DILEMMA
Trigger designs for unequal partners must be player specific. Such schemes
therefore distinguish three states of the game: Cooperation CO, in which
both sides are expected to choose C, and two states Pi (i = 1,2) of reversion
in which both sides are expected to defect, following the observed noncompliance of one or the other signatory. The rules for reaching Pi and
returning to CO are tailored to each player i: qi is the probability that an
observed unilateral defection by i will trigger a reversion to state Pi, and ri
is the probability of return from Pi to CO. The technical conditions of credibility on $r_i$ and $q_i$ (worked out in Proposition 1 in the appendix) emerge
from a comparison of the player's discounted payoff from cooperation to the
payoff he can expect from unilateral defection given the likelihood that
such behavior will be punished. If $U_i^{CO}$ is player i’s discounted payoff in
mutual cooperation and $U_i^{P_i}$ his discounted payoff in punishment, probabilities $r_i$ and $q_i$ must be set to deter intentional unilateral defection for i at CO
by ensuring that $U_i^{CO} \geq U_i^{P_i}$.
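
One way to see where such credibility conditions come from is the following sketch in the noiseless limit. It assumes the normalization used for the figures in this section (mutual cooperation pays 0 and a unilateral defector earns 1), writes $-c_i$ for player i's per-period payoff under mutual defection and $w$ for the discount factor, and considers a single deviation followed by a return to the prescribed play. It is an illustration under these assumptions, not the proof of Proposition 1.

```latex
\begin{align*}
U_i^{P_i} &= -c_i + w\,(1 - r_i)\,U_i^{P_i}
  \quad\Longrightarrow\quad
  U_i^{P_i} = -\frac{c_i}{1 - w + w r_i}, \\[4pt]
1 + w\,q_i\,U_i^{P_i} &\le 0 = U_i^{CO}
  \quad\Longleftrightarrow\quad
  r_i \le q_i c_i - \frac{1 - w}{w}.
\end{align*}
```

The first line values the punishment state, which pays $-c_i$ each period and ends with probability $r_i$. The second line requires that the one-period gain of 1 from defecting be offset by the discounted risk $q_i$ of entering punishment.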
Intuitively, there will be many possible combinations of qi and ri that
ensure deterrence in a noisy environment. However, the choice of trigger
and return probabilities now impacts the likelihood with which players will
inevitably find themselves in punishment mode as a result of the other
side’s observed unilateral defection. For this reason, all possible schemes
are no longer equal. While any scheme that provided enough deterrence to
ensure perpetual cooperation was worth $U_i^{CO} = 0$ in a noiseless environment,
when noise is present, a scheme’s value must account for the likelihood of
being in any one of the three possible states CO and Pi. Thus treaty value,
Vi, decreases when noise is introduced and becomes negative, because
signatories will necessarily find themselves in the configurations that punish
observed unilateral defection. If the sanctioning rules are modeled according to a probabilistic trigger scheme, we show in Proposition 3 in the appendix
that the rate $v_i$ at which $V_i$ decreases with noise is given by:
$$v_i^{trigger} = 1 - b_i - \left(\frac{q_1}{r_1} + \frac{q_2}{r_2}\right) c_i \tag{4}$$
The best sanctioning rules reduce treaty value the least as noise is
introduced. The best probabilistic trigger schemes, characterized in the
corollary to Proposition 2, therefore maximize negative rate $v_i^{trigger}$ by minimizing
$\left(\frac{q_1}{r_1} + \frac{q_2}{r_2}\right)$ under deterrence requirements for both i.20 This yields for
each i = 1,2 (as e → 0):
$$\text{if } c_i \leq \frac{1}{w}: \quad q_i = 1 \ \text{ and } \ r_i = c_i - \frac{1-w}{w} \tag{5a}$$
$$\text{if } c_i \geq \frac{1}{w}: \quad q_i = \frac{1}{w c_i} \ \text{ and } \ r_i = 1 \tag{5b}$$
Conditions (5a-b) on probabilities qi and ri, derived in the corollary to
Proposition 3, are also conditions on settlement delays and they depend on
the payoffs that each party would receive if both defect.
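
Conditions (5a-b) translate directly into a small function. The sketch below is a minimal illustration in the limit e → 0. The function name and the error handling are assumptions made for the example, and the exact probabilities that account for noise will differ slightly from what it returns.

```python
def best_trigger(c, w):
    """Return (q, r) for one player from conditions (5a-b), as e -> 0.

    c: that player's normalized cost of mutual defection (c > 0)
    w: discount factor, 0 < w < 1
    """
    if not 0 < w < 1:
        raise ValueError("discount factor w must lie in (0, 1)")
    if c <= 0:
        raise ValueError("cost of mutual defection c must be positive")
    if c <= 1.0 / w:
        # (5a): retaliate at once, tune the return probability
        r = c - (1.0 - w) / w
        if r < 0:
            raise ValueError("c is too small for any trigger to deter")
        return 1.0, r
    # (5b): delay retaliation, return to cooperation after one period
    return 1.0 / (w * c), 1.0
```

With w = 0.95, `best_trigger(4, 0.95)` gives q of about 0.263 and r = 1, an expected settlement window 1/q of 3.8 periods, while `best_trigger(1.5, 0.95)` gives q of about 0.702. The two cases agree at c = 1/w, where both return q = 1 and r = 1.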
IMPLICATIONS FOR UNEQUAL PARTNERS
If the cost of mutual defection is small for both parties, the conditions of
(5a) hold. Treaty-maximizing sanctioning rules should then require prompt
retaliation in case of observed noncompliance, with a return to cooperation
that is faster in expected terms for the party who suffers the most from
mutual defection. This is because ri, the probability that the signatories will
return to cooperation when in punishment mode, increases with ci. Perhaps
of more interest are the conditions of (5b). When the cost of mutual defection is high enough, the best triggers require that the signatory suffering the
most from mutual defection be given more time, on average, to settle before
retaliatory sanctions are implemented. Indeed, expected settlement time,
measured by $1/q_i$, captures the expected delay between the observation of
noncompliance by i and the implementation of retaliatory measures against
i. It is, conceptually, the expected time that signatories have to settle “in the
shadow of the law.” Following our analysis, the party who suffers the most
from time spent in punishment mode should get more time to settle in the
shadow of the law.
Inequality between treaty partners can stem from unequal capacity to
comply. Compliance for the weaker signatory is then more costly. For
example, compliance with Sanitary and Phytosanitary Measures (SPS) can
be particularly costly for developing countries. This is due to “the predominance of agricultural and food products in total exports and the technical
capability of developing countries to comply with SPS requirements,” (Henson
and Loader, 2001, p. 89). Finger and Schuler (1999) point out that compliance with SPS is more costly to developing countries than developed countries because “while the SPS agreement does not require that a country’s
domestic standards meet the agreement’s requirements, it does require that
the standards the country applies at the border meet those requirements.”
(Finger and Schuler, 1999, p. 18). These authors cite costs to Argentina of
$82.7 million over five years to ensure disease and pest-free exports of
meat, and costs of $112 million to Algeria for locust control among other
examples. Damodaran (2002) estimates that SPS compliance increases
Indian coffee farm replanting and operational costs by 46%.
How do differential compliance costs affect payoffs all other things
being equal? Starting from a symmetric payoff situation, a higher cost of
compliance for player i, here assumed to be the column player, is a cost
associated with cooperation regardless of what signatory j is doing. Compared
to a symmetric situation, player i now receives less from cooperation than
does j. Figure 4 illustrates the situation assuming that compliance costs for
player i exceed j’s by 1. Then, if in the symmetric case full cooperation
yields a payoff of 0 for both parties, it now only yields −1 for i. Similarly, i’s
payoff when j defects unilaterally is also reduced by 1. This yields the payoff structure pictured in Figure 4b. In Figure 4c, the payoffs of Figure 4b are once
more normalized so that mutual cooperation yields 0 to both parties and
unilateral defection yields a payoff of 1 to each signatory:21
FIGURE 4 Compliance Costs Determine Inequality.
As can be seen in Figure 4c, normalized coefficient ci for the developing partner is lower than it is for the developed partner.22 This is because
defection relieves the developing partner of the costs of compliance. If the
developed signatory fails to comply, best triggers dictate that the expected
time for settlement in the shadow of the law should be longer for the developed nation than it is for a noncompliant developing nation.
Indeed, given the parameters of Figure 4c and assuming that discount
factor w = 0.95, failure to comply by the developed partner leads to retaliation
with probability $q_j = 1/(w c_j) = 0.26$. The expected delay between an
observed instance of noncompliance by the developed nation and retaliation
by the developing partner is then $1/q_j = 3.85$ years if the year is the relevant
unit to measure delays. By contrast, if the developing country fails to
comply, the developed country retaliates with probability $q_i = 1/(w c_i) = 0.70$,
or with an expected delay of 1.43 years.23 The deviating developed country,
in this case, is given more time to settle than its developing partner before
sanctioning is imposed. This is in line with the recommendations of Busch
and Rheinhardt (2003).
Prescriptions are reversed if the source of the unequal stakes
between treaty signatories is an unequal ability to benefit from the other
side’s cooperation. Consider a trade agreement between a developed
and a developing nation. The developing partner may be highly dependent on the developed partner’s market to grow its exports while the
developed nation, with a wider portfolio of exportable goods, is less
dependent on the developing country’s market for its exports. As a consequence, mutual cooperation is more valuable to the developing signatory than it is to its developed partner. The heated debate over NAFTA
was in part linked to the unequal benefits of trade liberalization between
developing and developed partners. Indeed, Kouparitsas’ macroeconomic analysis of NAFTA pointed to “welfare gains to all North American
participants, with the greatest gains accruing to Mexico” (Kouparitsas,
1997, p. 25, italics are ours).
Starting from the symmetric payoff situation, asymmetric benefits in
favor of the developing partner show up as a larger payoff to developing
signatory i from the cooperation of developed country j than j enjoys when
i cooperates. In Figure 5b, column player i now enjoys payoff 1 from
mutual cooperation instead of the symmetric payoff 0 of Figure 5a. He also
enjoys a payoff of 2 if he defects unilaterally, instead of the symmetric
payoff of 1 in Figure 5a. In Figure 5c, payoffs to mutual cooperation are
normalized to 0 while payoffs to unilateral defection are normalized to 1 by
simply subtracting 1 from column player’s payoffs in 5b.
FIGURE 5 Differential Gains from Cooperation Determine Inequality.
When developing partner i has a higher stake in j’s cooperation,
mutual defection ends up being more costly for i than it is for j.24 Now the
best trigger design for sanctioning gives the developed signatory less time to
settle in the shadow of the law, before the developing partner is expected
to retaliate. This is in line with the differential treatment proposed under
WTO dispute settlement procedures. Indeed, given the parameters of Figure
5c and setting w = 0.95, developed country j retaliates against a noncompliant developing signatory i with probability 0.21, while the developing partner retaliates against noncompliant signatory j with probability 0.26. This
leaves the developed signatory with an expected 3.8 years to settle before
any sanctioning takes place, while the same expected delay increases to
4.75 years if it is the developing country that fails to comply.25
In conclusion, the best trigger designs for the settlement of disputes
prescribe differential treatment of unequal partners, but the nature of that
difference depends on the source of the inequality. To be sure, differences
in compliance costs and differences in the ability to enjoy the fruits of international cooperation may simultaneously determine the inequality in treaty
signatory stakes. But because these factors impact the design of dispute settlement procedures in opposing directions, their relative weight must be
accounted for in determining differential treatments. These features notwithstanding, trigger type designs may not provide signatories with the best
treaty value. We propose, below, an alternative sanctioning scheme that no
longer relies on mutual defection for sanctioning.
AN ALTERNATIVE DESIGN FOR CHICKEN, BLUFFS AND PRISONERS
When Triggers are Hard to Use
The Prisoner’s Dilemma has one pure-strategy Nash Equilibrium, mutual
defection, and this allows for the construction of simple subgame perfect
trigger mechanisms. But things are not quite as simple if the parties are
engaged in other mixed motive games. For example, the Chicken game has
two pure-strategy Nash equilibria. In each of these, one party cooperates
while the other defects. Clearly neither can individually serve as a threat for
both sides. One could conceive of a punishment scheme where signatories
would alternate between these two states, generating a mixed equilibrium
that would serve the theoretical purpose of imposing punishment on a unilateral defector. Such a punishment scheme would have the signatories
alternate between the two Nash equilibria of the Chicken game: I withdraw
concessions while you cooperate for, say, one month, then it’s your turn to
withdraw concessions while I cooperate. We then repeat for the time it
takes to punish the initial defector. This is hardly an attractive prescription.
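
The contrast between the two stage games can be checked by enumerating pure-strategy Nash equilibria. The sketch below does this for any 2x2 game. The payoff numbers are hypothetical values chosen only to have the Prisoner's Dilemma and Chicken orderings under the normalization used in this paper (mutual cooperation pays 0, unilateral defection pays 1 to the defector); they are not taken from the paper's figures.

```python
from itertools import product

ACTIONS = ("C", "D")


def pure_nash(payoffs):
    """List the pure-strategy Nash equilibria of a two-player game.

    payoffs[(a1, a2)] = (u1, u2) for row action a1 and column action a2.
    """
    equilibria = []
    for a1, a2 in product(ACTIONS, repeat=2):
        u1, u2 = payoffs[(a1, a2)]
        # Neither player can gain by deviating unilaterally.
        row_best = all(u1 >= payoffs[(d, a2)][0] for d in ACTIONS)
        col_best = all(u2 >= payoffs[(a1, d)][1] for d in ACTIONS)
        if row_best and col_best:
            equilibria.append((a1, a2))
    return equilibria


# Hypothetical payoffs: being the victim costs 5, mutual defection costs 4.
PRISONERS_DILEMMA = {
    ("C", "C"): (0, 0), ("C", "D"): (-5, 1),
    ("D", "C"): (1, -5), ("D", "D"): (-4, -4),
}
# Hypothetical payoffs: mutual defection (4) is worse than being the victim (1).
CHICKEN = {
    ("C", "C"): (0, 0), ("C", "D"): (-1, 1),
    ("D", "C"): (1, -1), ("D", "D"): (-4, -4),
}
```

With these numbers, `pure_nash(PRISONERS_DILEMMA)` returns only mutual defection, while `pure_nash(CHICKEN)` returns the two asymmetric outcomes in which one party cooperates and the other defects, so no single stage-game equilibrium is available as a common reversion point.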
In the Called Bluff game, the situation is even more problematic since
the single Nash equilibrium of the stage game, which involves cooperation
of one party while the other defects, can only be conceived as punishment
for one of the players. Thus, cooperation cannot be supported by the threat
of reversion to a simple reversion point that potentially punishes either
player. As a result, no simple trigger type schemes can be designed. Of
course, a reversion point could theoretically be constructed in order to
define subgame perfect equilibrium strategies for a repeated game of Called
Bluff. But in engineering complex schemes, behavioral interpretations are
lost, and the implementation of sanctioning rules becomes hard to define. It
turns out that simple sanctioning rules can be devised for the Called Bluff
and Chicken games if, instead of a trigger mechanism, we turn to a particular Tit-for-Tat like scheme. Moreover, this alternative scheme can also be
implemented in Prisoner’s Dilemma type situations, and it can be calibrated
so that it systematically provides better treaty values to both signatories
than trigger schemes.
Contrite Tit-for-Tat
A dispute resolution scheme that improves treaty value must avoid long
spells of punishing defections. This is first achieved by delaying retaliation
itself, ensuring that the signatories have some time to settle the dispute
before escalating to punishing measures. This is the generosity that is built
into a probabilistic trigger scheme by setting trigger probabilities qi below 1
when the cost of mutual defection is high enough. Generosity in the
response to noncompliance addresses one of the problems associated with
retaliation by avoiding it, but the organization of the retaliatory phase, if it
comes to pass, is also critical to treaty value. The alternative scheme we will
discuss also builds in the opportunity to settle, but in contrast to trigger
designs, the retaliation phase actually brings redress to the victim of the unilateral defection.26 The subgame perfect scheme that we propose, Contrite
Tit-for-Tat (CTFT), expands upon a scheme introduced by Sugden (1986).27
Under Contrite Tit-for-Tat (CTFT), three states are relevant: CO in
which both sides are expected to cooperate, and Gi for i = 1,2 in which one
of the players has been judged as failing to comply for whatever reason.
Panels judging WTO dispute cases make determinations of this kind, as do
the international tribunals that hear cases related to CITES or the Law of the
Sea. CTFT uses the vocabulary of guilt and innocence although intent is not
presumed. At the outset, signatories are presumed innocent and the rules of
play are as follows: a signatory should always cooperate with an innocent
opponent. A signatory then becomes guilty if he defects while his partner is
innocent. A guilty player is always expected to cooperate, reestablishing
innocence if he does, but remaining guilty if he doesn’t. An innocent player
can righteously defect against a guilty opponent without compromising his
innocence, although he remains innocent if he doesn’t. We add generosity
to Sugden’s original scheme, so that guilt no longer results automatically
from observed defection against an innocent party, but it is established with
some player specific probability qi. Boyd (1989) was first to point out that
Sugden’s scheme forms an SPE under noise. Our study extends his result to
the case where guilt and subsequent innocence are established with some
player specific probabilities qi and ri.
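
The rules of play just described can be read as a three-state transition rule. The following sketch is one possible encoding, with the random draw passed in by the caller so that the function is deterministic. The function name, the state labels as strings, and the treatment of mutual defection between two innocent players (left in state CO here) are assumptions made for illustration; the text above does not settle that last case.

```python
def ctft_next_state(state, moves, q, r, draw):
    """One-period transition of the generous CTFT scheme (illustrative).

    state: "CO", "G1" or "G2"
    moves: dict {1: "C" or "D", 2: "C" or "D"} of observed plays
    q, r:  dicts mapping player (1 or 2) to guilt and return probabilities
    draw:  a uniform(0, 1) number supplied by the caller
    """
    if state not in ("CO", "G1", "G2"):
        raise ValueError("unknown state")
    if state == "CO":
        for i, j in ((1, 2), (2, 1)):
            # Defecting against an innocent, cooperating partner
            # establishes guilt with probability q[i].
            if moves[i] == "D" and moves[j] == "C":
                return f"G{i}" if draw < q[i] else "CO"
        # ASSUMPTION: (C, C) and (D, D) both leave the players innocent.
        return "CO"
    guilty = 1 if state == "G1" else 2
    if moves[guilty] == "C":
        # A guilty player who cooperates regains innocence with probability r.
        return "CO" if draw < r[guilty] else state
    # A guilty player who defects again stays guilty with certainty.
    return state
```

Whatever the innocent player does in state G1 or G2, the transition depends only on the guilty player's move, which reflects the rule that an innocent player may defect against a guilty opponent without compromising his innocence.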
Under the CTFT scheme we propose, state Gi, in which player i is
found guilty, is reached with probability qi when i’s defection is observed
and provided j is not similarly guilty. In state Gi a (Ci,Dj) play is expected,
followed by a return with probability ri to state CO, where a (C,C) play is
expected. If i does not comply with that expected punishment he remains
guilty (in state Gi) with certainty. When found guilty, CTFT requires that the
defector allow his compliant partner to enjoy the benefits of unilateral
defection for some time. Conceptually, play (Ci,Dj) in state Gi is equivalent
to asking the defector to incur sufficient costs to wipe out the benefits of
noncompliance, while compensating, at least in part, his compliant partner
for the past losses incurred as a result of his failure to comply. Because the
CTFT scheme is subgame perfect it is in the interest of both parties to abide
by the scheme. The calibration of CTFT schemes is extremely flexible.
Indeed, as shown in Proposition 1, to be subgame perfect, CTFT schemes
require that (as e → 0):
$$r_i \leq b_i q_i - \frac{1-w}{w} \quad \text{and} \quad (1 - w + w r_i)\, c_i \geq (1 - w)\, b_i \tag{6}$$
conditions that hold with any $r_i \geq 0$ whenever $c_i \geq b_i$ (for Chicken and
player j in Called Bluff) and if $c_i \geq (1 - w) b_i$ in the Prisoner’s Dilemma and
for player i in Called Bluff.28
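
The two inequalities in (6) are easy to check numerically. The sketch below is a minimal illustration in the limit e → 0; the function names are hypothetical, and `min_guilt_probability` simply solves the first inequality for the smallest admissible q given r.

```python
def ctft_is_credible(q, r, b, c, w):
    """Check both inequalities of condition (6) for one player, as e -> 0.

    q, r: that player's guilt and return probabilities
    b, c: that player's normalized payoff parameters
    w:    discount factor
    """
    first = r <= b * q - (1.0 - w) / w
    second = (1.0 - w + w * r) * c >= (1.0 - w) * b
    return first and second


def min_guilt_probability(r, b, w):
    """Smallest q satisfying the first inequality of (6) for a given r."""
    return (r + (1.0 - w) / w) / b
```

For instance, with r = 1, b = 5 and w = 0.95, `min_guilt_probability` returns 1/(w b), about 0.2105, so a guilt probability of 0.22 passes the check while 0.20 does not.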
How should a CTFT scheme be calibrated to maximize treaty value? As
shown in Proposition 4 in the appendix, the rate at which treaty value
decreases with noise is player specific and given by:
$$v_i^{ctft} = 1 + \frac{q_j}{r_j} - \left(1 + \frac{q_i}{r_i}\right) b_i \tag{7}$$
In order to minimize the rate at which the treaty’s value declines with
noise, i would like to see $q_j/r_j$ maximum and $q_i/r_i$ minimum. But increasing $q_j/r_j$
can always be achieved by choosing a smaller value for $r_j$, making the punishment phase as long as possible and ensuring that, with noise, treaty
value would actually increase for i. With low $r_j$, signatory j would need to
suffer the payoffs characteristic of i’s unilateral defection for a very long
time, and meanwhile i would receive compensation that could rise above
the losses incurred as a result of j’s past unilateral defection. As $r_j \to 0$ the
CTFT scheme would resemble a grim trigger from j’s point of view, and
while treaty value for i rises with j’s plight in the inevitable event of defection,
treaty value for j would be very low. The setting of $q_i/r_i$ and $q_j/r_j$ must
therefore emerge from the negotiations on the terms of the treaty itself, and
the critical issue will be whether or not the parties can choose $q_j/r_j$ and $q_i/r_i$ to
enhance treaty value for each, given alternative sanctioning schemes.
The Merits of Redress
A CTFT type scheme punishes the deviating signatory by imposing the payoffs to the other side’s unilateral defection. Such a scheme can be implemented even if treaty signatories are engaged in a Chicken or Called Bluff
game, since the scheme only requires that unilateral defection be beneficial
to the defector while it hurts the other side. But, perhaps more importantly,
a sanctioning scheme based on CTFT can always provide better treaty value
to both signatories, if engaged in a Prisoner’s Dilemma, than if they adopted
a trigger-like scheme. A formal proof is given in appendix (Proposition 5).
In order to capture the source of CTFT’s advantage, we now compare this
scheme to treaty maximizing triggers for a range of parameter values.
In order to compare a CTFT type sanctioning scheme to triggers, it is
necessary to select values for $q_i/r_i$ and $q_j/r_j$ within the wide range that meets
credibility requirements. We chose to compare treaty maximizing triggers to
CTFT schemes that minimize factors $q_j/r_j$ and $q_i/r_i$ for both parties, ensuring
that the rate at which treaty value declines with noise is small for both
signatories under the CTFT scheme considered. Given the particular parameter values that we examine below, the CTFT scheme we exhibit sets probabilities ri and rj to 1 (see proposition 5 in the appendix). This is not the
only CTFT scheme that can potentially provide signatories with better value
than the best trigger scheme. But by showing that this particular choice of
CTFT design dominates the best trigger, we prove existence of better CTFT
type schemes in general. Table 1 below provides data on trigger and CTFT
designs given the parameter values of Figures 4c and 5c. In all cases, noise
e = 0.01 and w = 0.95:
TABLE 1 Treaty Value Under Alternative Designs*

| Conditions | Design | r1 | r2 | q1 | q2 | V1 | V2 |
|---|---|---|---|---|---|---|---|
| Figure 4c: b1 = 5, b2 = 5/2, c1 = 4, c2 = 3/2 | The best trigger design | 1 | 1 | 0.281 | 0.733 | −1.611 | −0.604 |
| Figure 4c: b1 = 5, b2 = 5/2, c1 = 4, c2 = 3/2 | A possible CTFT design | 1 | 1 | 0.214 | 0.430 | −1.211 | −0.472 |
| Figure 5c: b1 = 5, b2 = 6, c1 = 4, c2 = 5 | The best trigger design | 1 | 1 | 0.275 | 0.215 | −1.192 | −1.490 |
| Figure 5c: b1 = 5, b2 = 6, c1 = 4, c2 = 5 | A possible CTFT design | 1 | 1 | 0.221 | 0.179 | −0.985 | −1.171 |

*Probabilities $r_i$ and $q_i$ are exact values that account for noise magnitude e. Treaty values are first order approximations: $V_i \approx e\, v_i / (1 - w)$.

Table 1 gives the characteristics of a subgame perfect CTFT scheme
that maximizes treaty value for each party, given the values of $q_i$ and $r_i$ chosen by the other side. Both sides would benefit from adopting the CTFT
scheme. Indeed, recalling that treaty value under noise will be negative
because unintentional deviation is inevitable, Table 1 illustrates that treaty
value can decline less if a CTFT design is chosen for sanctioning. Given
parameter values, the schemes of Table 1 all require swift return from punishment mode ($r_i = 1$, i = 1,2). The difference between the schemes lies in
the values of $q_i$. In CTFT, $q_i$ is the probability that signatory i is found guilty
if observed to deviate. For trigger schemes $q_i$ is the probability that
observed deviation by i will lead to mutual defection. Since guilt in CTFT
implies certain punishment, $1/q_i$ represents the expected delay between an
observation of noncompliance and its punishment for both schemes. CTFT’s
ability to provide signatories with higher treaty value stems from the redress
that is built in, as well as its generosity in case of deviation. In fact the two
features are linked. Because the injured party gets some redress, more generosity can be built in without compromising treaty value. As can be seen in
Table 1, $q_i$ for the CTFT scheme examined is systematically lower than it is
for the best trigger scheme. While the inequality between partners is treated
in the same way by both schemes, under CTFT, all parties are given more
time to return to cooperation before any sanctioning takes place.
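
The comparison between the two designs can be reproduced from equations (4) and (7) together with the first order approximation stated in the note to Table 1. The sketch below is a hedged illustration: the function names are hypothetical, and the probabilities fed to it in the usage note are the ones reported in Table 1 for the conditions of Figure 5c, not values it derives.

```python
def trigger_rate(b_i, c_i, q, r):
    """Equation (4): rate at which player i's treaty value falls with noise
    under a trigger scheme. q and r are dicts keyed by player (1, 2)."""
    return 1.0 - b_i - (q[1] / r[1] + q[2] / r[2]) * c_i


def ctft_rate(b_i, q_i, r_i, q_j, r_j):
    """Equation (7): the corresponding rate for player i under CTFT."""
    return 1.0 + q_j / r_j - (1.0 + q_i / r_i) * b_i


def treaty_value(rate, e=0.01, w=0.95):
    """First order approximation of treaty value: e * rate / (1 - w)."""
    return e * rate / (1.0 - w)
```

For the conditions of Figure 5c (b1 = 5, b2 = 6, c1 = 4, c2 = 5, all return probabilities equal to 1), evaluating `treaty_value(trigger_rate(5, 4, {1: 0.275, 2: 0.215}, {1: 1, 2: 1}))` and `treaty_value(ctft_rate(5, 0.221, 1, 0.179, 1))` gives player 1's value under each design, and the CTFT figure is the less negative of the two.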
How can a CTFT scheme be implemented in practice? The punishment
phase in CTFT requires that the deviating party, if found guilty, must return
to cooperation while the injured party is allowed to deviate, enjoying the
benefits of unilateral defection. The same payoff outcome can be achieved
by, for example, assessing a fine against the defector to wipe out any
expected advantage to unilateral noncompliance, and using part of the proceeds to compensate the compliant signatory. Both parties then return to
cooperation if one party is judged guilty, and monetary fines and compensations ensure that the payoffs of unilateral defection by the compliant signatory are realized. Interestingly, in an effort to finalize the U.S.–Jordan trade
agreement, the Bush administration proposed that monetary fines be the
preferred enforcement mechanism for bilateral trade agreements (Inside US
Trade, April 2001). And calls for the compensation of injured parties in the
face of signatory noncompliance have occasionally been made. Jackson
mentions the call for compensation of injured domestic producers in antidumping cases in the House version of the 1987 trade bill, a measure that
“would dramatically change the trade policy impact of the antidumping
(and subsidy) rules” (Jackson, 1999, p. 274). Our analysis suggests that
monetary fines in lieu of retaliatory sanctions, together with compensation
of the injured party deserve more attention.
CONCLUSION
Signatories of international treaties and agreements will sometimes fail to
comply unintentionally. If what is at stake defines a mixed motive game,
unintentional noncompliance must be sanctioned to avoid defection in self-interested exploitation of a treaty’s terms. However, the very fact that noncompliance can occur by accident tests the sanctioning rules and requires
that they be chosen to provide signatories with the best treaty outcome. But
the inequalities between treaty signatories will impact the nature of the remedies in case of unilateral defection, and so will the nature of the mixed
motive game that nations play.
A reading of treaty texts reveals that sanctioning typically proceeds by
retaliatory defection against the noncompliant signatory. The norm for sanctioning is therefore the implementation of trigger-like schemes, and inequality is addressed by proposing that the process leading to retaliation be
accelerated when a weaker partner faces the noncompliance of a stronger
signatory. Our game theoretic analysis of trigger-like sanctioning schemes
suggests that all parties enjoy the best treaty value if settlement delays in the
shadow of the law are adjusted according to the particular source of signatory inequality. If inequality stems from differences in the costs associated to
compliance, as is often the case when developing countries need to comply
with Sanitary and Phytosanitary measures, the developed partner with the
lower compliance costs should be given more time to settle in the shadow
of the law if he deviates than his higher compliance cost partner. Prescriptions are reversed if the source of the unequal stakes between treaty signatories is an unequal ability to benefit from the other side’s cooperation. In
trade agreements between developed and developing countries such as
NAFTA, the stakes are typically higher for the developing partner. Our analysis
therefore suggests that, in this case, the developing partner be given more
time to settle in the shadow of the law if he deviates than the developed
partner.
Trigger-like sanctioning schemes are easy to implement if signatories
are engaged in Prisoner’s Dilemma stage games. But, despite their prevalence, they are not well suited to the handling of Chicken or Called Bluff
games that may define the stakes in environmental accords. This motivates
our analysis of Contrite Tit-for-Tat that builds in redress for the victim of a
unilateral defection. While this scheme can handle alternative mixed motive
stage games, we also find that it can always provide better treaty value to
signatories engaged in a Prisoner’s Dilemma stage game than trigger-type
schemes.
Schemes based on Contrite Tit-for-Tat build in more time for settlement
in the shadow of the law than the trigger-like schemes that are models for
treaty dispute settlement designs. If, as Busch and Rheinhardt (2003) argue,
developing countries do better in early settlements than they do when a dispute moves to a WTO panel ruling, extending the time allowed for such settlements to take place would seem desirable. If retaliation must come to
pass, however, Contrite Tit-for-Tat advocates that punishment of the deviation be accompanied by compensation for the injured party. While compensation for damages to the injured party is not a prescription under WTO
rules, a number of authors suggest that it might be appropriate in disputes
involving developing signatories. As stated by Hoekman and Mavroidis
(2000), “violations of the WTO are disproportionally burdensome for developing countries given the fragility of many of their export industries and the
fact that their export base is generally much less diversified than in high
income countries.” In 2003, Ecuador’s banana exports represented almost 20
percent of the value of its exports.29 The never-ending banana dispute over
the European Union’s preferential treatment of banana imports from its ex-colonies would have led to payment of damages to Ecuador under the Contrite Tit-for-Tat rule. Instead the WTO authorized Ecuador to implement
retaliatory measures against the EU, which do not help Ecuador develop one
of its major export sectors and do not foster the free trade ideal of the
WTO either. If implemented in the interests of all parties involved, a Contrite
Tit-for-Tat type scheme can increase treaty value for all. In the design of
sanctioning schemes, our game theoretic analysis concludes that redress for
the injured party is better than punishment by defection.
CONTRIBUTORS
Catherine Langlois teaches economics and game theory at the McDonough
School of Business at Georgetown University. Her research interests include
conflict and cooperation, treaty design and rationalist explanations of war.
Her work has appeared in the American Journal of Political Science, International Studies Quarterly, the British Journal of Political Science and the
Journal of Conflict Resolution.
Jean-Pierre Langlois is an applied mathematician with research interests in
game theory. His current work includes modeling in international relations and
computational methods. He is the author of GamePlan, a game theory software.
NOTES
1. For example, the WTO agreements on trade specify special dispute settlement conditions for
developing countries (Footer, 2001; Michalopoulos, 2000).
2. In international law, special and differential treatment has emerged to alleviate differences in
the capability to comply with international agreements (Cullet, 1999). However, while these measures
have modified the stakes for unequal partners, they have not necessarily removed individual incentives
to cheat, and the conditions for self enforcement must still be worked out.
3. Authors such as Koremenos, Lipson, and Snidal (2001) also embrace a rationalist perspective in
their analysis of the design of international agreements and therefore adhere to the principle of deterrent
punishment. Indeed, a full issue of International Organization (Vol. 55, No. 4 (Autumn) 2001), is dedicated to the rational design of institutions. While a number of authors explore rational design verbally,
Kydd (2001) and Rosendorff and Milner (2001) adopt more formal approaches. Kydd presents a game
theoretic analysis of the trust issues involved in NATO enlargement while Rosendorff and Milner show
that, in the presence of political uncertainty over the domestic pressure for trade barriers, a trade agreement that includes an escape clause Pareto dominates one that does not have it. We do not seek to
model the specific design features of any particular international agreement. Instead we seek to understand some of the generic properties of treaty enforcement regimes.
4. Authors such as Brander (1986) or Krugman (1984) analyze imperfectly competitive situations
that lead to welfare enhancing unilateral protectionism. If firms learn by doing, for example, protection
of the domestic market moves domestic firms down their learning curves, and helps them to lower their
costs faster than their rivals. This, argues Krugman (1984), enables domestic firms to gain share in unprotected foreign markets. Under such circumstances the consumer welfare lost from protectionism can be
more than compensated for by enhanced profits made by the protected domestic firms.
5. This ensures that no alternating mix of (Ci, Dj ) and (Di, Cj) could achieve better discounted payoffs than constant cooperation (C,C). Geometrically, this means that the point (0,0) lies above the line
joining (1,–b2) to (–b1,1) in the payoff space so that no mix of the two points could be better than cooperation for both sides.
6. We assume that signatory utility parameters remain constant over time and concentrate on the
design features of punishment schemes. Relaxing this assumption requires consideration of future renegotiation and leads to the choice of a finite duration for the agreement. This aspect is examined by
Koremenos et al., 2001.
7. India - Quantitative Restrictions on Imports of Agricultural, Textile and Industrial Products DS90/R.
8. Chile-Taxes on Alcoholic Beverages DS87/AB/R and DS110/AB/R.
9. Chile-Taxes on Alcoholic Beverages AB-1999–6, p. 21, para. 62.
10. Indeed, the appeals panel also commented on the suggestion that Chile’s taxation system was
designed to continue providing once acknowledged import protection in the following terms: “Members
of the WTO should not be assumed, in any way, to have continued previous protection or discrimination through the adoption of a new measure. This would come close to a presumption of bad faith.”
11. United States-Import prohibition of Certain Shrimp and Shrimp Products DS58.
12. United States-Import prohibition of Certain Shrimp and Shrimp Products DS58, p. 12, Para. 3.7.
13. This has been done successfully by authors such as Reinhardt (2001) for the WTO Dispute Settlement Understanding or authors such as Rosendorff and Milner (2001) for escape clauses in trade agreements.
14. Table A1 in the appendix gives noisy utilities for all possible intended moves by the players.
15. Precise definition of $U_i^k$ requires specifying the strategy under consideration since strategy will
determine the states of the game. For example, a trigger scheme defines three states: CO, in which both
parties are expected to cooperate, and two states $P_i$ of reversion in which both parties are expected to
defect subsequent to the defection of one party or the other. $U_i^{CO}$ is then the expected discounted payoff to i when in state CO.
16. We will limit our discussion to Markov strategies for which players distinguish a finite (rather
than an infinite) set of possible states of the game. Our exclusive consideration of Markov strategies
rules out strategies that build on the whole of past history but admits consideration of a history of any
length as long as it is finite. Although we focus on Markov strategies, and therefore identify Markov Perfect equilibria (MPE), the equilibrium result is not limited to Markov strategies. A signatory cannot do
better by deviating from its Markov strategy in an MPE by using a non-Markov strategy. In other words,
a MPE is also a SPE within the set of all possible strategies. The reader is referred to Fudenberg and
Tirole (1991, pp. 513–515) for a more technical discussion of these issues.
17. The U.S. threatened China with trade sanctions for continued traffic in rhino horns and tiger
bone in 1993, and imposed trade sanctions on Taiwan in April 1994. The sanctions on Taiwan were
lifted in June 1995 (www.glo.gov.tw ). The U.S. also threatened to impose trade sanctions on Japan for
its trade in endangered hawksbill sea turtles in 1991. While these threats and sanctions were CITES
related, they were undertaken under the Pelley Agreement that allows the USTR to restrict trade in wildlife products originating from a country suspected of noncompliance with an international regime such
as CITES (Glennon and Stuart, 1998).
18. European Communities-Regime for the Importation Sale and Distribution of Bananas WT/DS27.
19. It is worth noting here that probabilistic trigger schemes were actually developed to handle situations in which intent and observation could differ. Porter (1983), Green and Porter (1984) and Abreu,
Pearce, and Stacchetti (1986) are standard references.
20. This assumes that noise is small enough and that the equilibrium designs are stable under
noise. These assumptions are elaborated upon conceptually in the nontechnical preamble in the appendix.
21. Normalization involves two steps. First, add 1 to all of i’s payoffs to normalize the payoff to
mutual cooperation to 0. This yields payoff matrix $\begin{pmatrix} 0,\,0 & -5,\,2 \\ 1,\,-5 & -4,\,-3 \end{pmatrix}$. Secondly, divide all of i’s payoffs by 2
to yield $\begin{pmatrix} 0,\,0 & -5,\,1 \\ 1,\,-\frac{5}{2} & -4,\,-\frac{3}{2} \end{pmatrix}$, which is payoff matrix 4c.
22. In the general case, starting from the normalized symmetric payoff matrix $\begin{pmatrix} 0,\,0 & -b,\,1 \\ 1,\,-b & -c,\,-c \end{pmatrix}$ and
introducing a compliance cost κ for signatory i as the column player leads to payoff matrix
$\begin{pmatrix} 0,\,-\kappa & -b,\,1 \\ 1,\,-b-\kappa & -c,\,-c \end{pmatrix}$. Normalizing first involves adding κ to i’s payoffs, yielding payoff matrix
$\begin{pmatrix} 0,\,0 & -b,\,1+\kappa \\ 1,\,-b & -c,\,-c+\kappa \end{pmatrix}$. Dividing i’s payoffs by 1 + κ to normalize the payoff to unilateral defection to
1 yields $\begin{pmatrix} 0,\,0 & -b,\,1 \\ 1,\,\frac{-b}{1+\kappa} & -c,\,\frac{-c+\kappa}{1+\kappa} \end{pmatrix}$. Clearly, $c_i = \frac{c-\kappa}{1+\kappa} < c = c_j$.
23. We have approximated qi and qj by their values as noise ε goes to 0 following (5a–b). Exact
values given ε = 0.01, for example, are qi = 0.733 and qj = 0.284. Mathematica notebooks that enable these
calculations are available from the authors upon request.
24. In the general case, starting from the normalized symmetric payoff matrix $\begin{pmatrix} 0,\,0 & -b,\,1 \\ 1,\,-b & -c,\,-c \end{pmatrix}$, an
extra benefit β from partner j’s cooperation, accruing to signatory i as the column player, leads to payoff
matrix $\begin{pmatrix} 0,\,\beta & -b,\,1+\beta \\ 1,\,-b & -c,\,-c \end{pmatrix}$. Normalizing by subtracting β from i’s payoffs yields $\begin{pmatrix} 0,\,0 & -b,\,1 \\ 1,\,-b-\beta & -c,\,-c-\beta \end{pmatrix}$.
Clearly, $c_i = c + \beta > c = c_j$.
25. Again, we approximate qi and qj by their values as noise ε goes to 0 following (5a and b).
Exact values given ε = 0.01, for example, are qi = 0.220 and qj = 0.274.
26. The idea of rewarding the victim of a unilateral defection is not new to game theorists
and has been associated with trigger schemes in various guises. Fudenberg and Maskin (1991)
and Morrow (1994) describe strategies that call for two phases. In the punishment phase, victim and
perpetrator defect long enough to ensure that the perpetrator loses any short-term gain from unilateral defection. This phase is then followed by a “reward” phase during which the perpetrator cooperates while the victim is allowed to defect enough to reap his reward for carrying out punishment in
the first place. But such a scheme would reduce treaty values since it would involve more defection
on the part of the victim without avoiding the spate of joint defection by which trigger designs punish the perpetrator.
27. Note that while ordinary Tit-for-Tat is not subgame perfect, the scheme we propose is.
28. See the Corollary of Proposition 3 in the appendix for proof.
29. http://www.intracen.org/countries/structural05//ecu_8.pdf
30. A sharpening of our approximation of Vi by extending the calculation to involve second
order terms, or any number of (nonlinear) higher order terms, would accommodate larger assumed
noise magnitudes.
31. Condition (iii) can be ensured by assuming that the derivative Ψ′ (e) is bounded. Tit-for-Tat
and the grim trigger are typical of a failure of condition (iv). That m is differentiable can be inferred from
the previous conditions.
REFERENCES
Abreu, Dilip, David Pearce, and Ennio Stacchetti (1986). “Optimal Cartel Equilibria with
Imperfect Monitoring.” Journal of Economic Theory, Vol. 39 (June), pp. 251–269.
Aggarwal, Vinod K. and Cedric Dupont (1999). “Goods, Games, and Institutions.”
International Political Science Review, Vol. 20, No. 4, October, pp. 393–409.
Boyd, Robert (1989). “Mistakes Allow Evolutionary Stability in the Repeated Prisoner’s
Dilemma Game.” Journal of Theoretical Biology, Vol. 136, No. 1, pp. 47–56.
Brams, Steven and Marc Kilgour (1988). Game Theory and National Security, New
York: Basil Blackwell
Brander, James A. (1986). “Rationales for Strategic Trade and Industrial Policy,” in
Strategic Trade Policy and the New International Economics, Paul R. Krugman,
ed., Cambridge, MA, MIT Press.
Busch, Marc L. and Eric Reinhardt (2003). “Developing Countries and GATT/WTO
Dispute Settlement.” Manuscript, forthcoming, Journal of World Trade.
Cameron, James and Karen Campbell, eds. (1998). Dispute Resolution in the World
Trade Organization, London: Cameron May.
Chayes, Abram and Antonia Chayes (1991). “Compliance Without Enforcement.”
Negotiation Journal, Vol. 7 (July), pp. 311–331.
Chayes, Abram and Antonia Chayes (1993). “On Compliance.” International
Organization, Vol. 47 (Spring), pp. 175–205.
Chayes, Abram and Antonia Handler Chayes (1995). The New Sovereignty,
Cambridge, MA: Harvard University Press.
Chayes, Abram, Antonia Handler Chayes, and Ronald B. Mitchell (1998). “Managing
Compliance: A Comparative Perspective,” in Engaging Countries: Strengthening Compliance with International Environmental Accords, Edith Brown Weiss
and Harold K. Jacobson, eds., Cambridge, MA: MIT Press.
Cullet, Philippe (1999). “Differential treatment in International Law: Towards a New
Paradigm of Inter-state Relations.” European Journal of International Law,
Vol. 10, No. 3, pp. 549–582.
Damodaran, A. (2002). “Conflict of Trade Facilitating Environmental Regulations
with Biodiversity Concerns: The Case of Coffee-Farming Units in India.” World
Development, Vol. 30, No. 7, pp. 1123–1135.
Downs, George W. and David M. Rocke (1990). Tacit Bargaining, Arms Races, and
Arms Control, Ann Arbor: University of Michigan Press.
Downs, George W., David M. Rocke, and Peter N. Barsoom (1996). “Is the Good
News About Compliance Good News About Cooperation?” International Organization, Vol. 50 (Summer), pp. 379–406.
Finger, J.M. and P. Schuler (1999). “Implementation of Uruguay Round Commitments: The Development Challenge.” World Bank Policy Research Working
Paper No. 2215, October.
Finus, Michael (2001). Game Theory and International Environmental Cooperation,
Cheltenham, UK: Edward Elgar Publishing Ltd.
Footer, Mary E. (2001). “Developing Country Practice in the Matter of WTO Dispute
Settlement.” Journal of World Trade, Vol. 35, No. 1, February, pp. 55–98.
Fudenberg, Drew and Eric Maskin (1991). “On the Dispensability of Public Randomization in Discounted Repeated Games.” Journal of Economic Theory, Vol. 53,
pp. 428–438.
Glennon, Michael J. and Alison L. Stuart (1998). “The United States: Taking Environmental Treaties Seriously,” in Engaging Countries: Strengthening Compliance
with International Environmental Accords, Edith Brown Weiss and Harold K
Jacobson, eds., Cambridge MA: MIT Press, pp. 173–213.
Green, Edward. J. and Robert H. Porter (1984). “Noncooperative Collusion
Under Imperfect Price Information.” Econometrica, Vol. 52 (January),
pp. 87–100.
Henson, Spencer and Rupert Loader (2001). “Barriers to Agricultural Exports from
Developing Countries: The Role of Sanitary and Phytosanitary Requirements.”
World Development, Vol. 29, No. 1. pp. 85–102.
Hoekman, Bernard M. and Petros C. Mavroidis (2000). “WTO Dispute Settlement,
Transparency and Surveillance.” The World Economy, Vol. 23, No. 4, April,
pp. 527–542.
Inside US Trade (1996). “U.S. Rescinds Retaliatory Sanctions against EU for
Hormone Ban,” Vol. 14 (July 19).
Inside US Trade (1999). “Unlikely to Lift Beef Hormone Ban; U.S. Set to Retaliate,”
Vol. 17 (July 23).
Inside US Trade (2001). “Draft USTR Paper on Monetary Fines,” Vol. 19 (April 27).
Jackson, John H. (1999). The World Trading System: Law and Policy of International Economic Relations, 2nd ed., Cambridge, MA: MIT Press.
Jacobson, Harold K. and Edith Brown Weiss (1998). “Assessing the Record and
Designing Strategies to Engage Countries,” in Engaging Countries: Strengthening Compliance with International Environmental Accords, Edith Brown Weiss
and Harold K. Jacobson, eds., Cambridge, MA: MIT Press.
Jasanoff, Sheila (1998). “Contingent Knowledge: Implications for Implementation
and Compliance,” in Engaging Countries: Strengthening Compliance with
International Environmental Accords, Edith Brown Weiss and Harold K.
Jacobson, eds., Cambridge, MA: MIT Press.
Koremenos, Barbara, Charles Lipson, and Duncan Snidal (2001). “The Design of
International Institutions,” International Organization, Vol. 55, No. 4,
(Autumn), pp. 761–799.
Kouparitsas, Michael (1997). “A Dynamic Macroeconomic Analysis of NAFTA.”
Economic Perspectives, Vol. 21, Issue 1, Jan./Feb. pp. 14–36.
Krugman, Paul R. (1984). “Import Protection as Export Promotion,” in Monopolistic
Competition and International Trade, H. Kierzkowski, ed., Oxford: Oxford
University Press.
Kydd, Andrew (2001). “Trust Building, Trust Breaking: The Dilemma of NATO
Enlargement.” International Organization, Vol. 55, No. 4 (Autumn),
pp. 801–828.
Lipnowski, Irwin and Shlomo Maital (1983). “Voluntary Provision of a Pure Public
Good as a Game of ‘Chicken’.” Journal of Public Economics, Vol. 20, No. 2,
April, pp. 381–386.
Michalopoulos, Constantine (2000). “Trade and Development in the GATT and
WTO: The Role of Special and Differential Treatment for Developing Countries.” World Bank Paper, April 19.
Missfeldt, Fanny (1999). “Game Theoretic Modeling of Transboundary Pollution.”
Journal of Economic Surveys, Vol. 13, No. 3, pp. 287–321.
Morrow, James (1994). Game Theory for Political Scientists, Princeton NJ: Princeton
University Press.
Nottage, Hunter (2003). “Trade and Competition under the WTO: Pondering the
Applicability of Special and Differential Treatment.” Journal of International
Economic Law, Vol. 6, No. 1, (March), pp. 23–47.
Porter, Robert H. (1983). “Optimal Cartel Trigger Price Strategies.” Journal of
Economic Theory, Vol. 29, No. 2, pp. 313–338.
Reinhardt, Eric (2001). “Adjudication without Enforcement in GATT Disputes.”
Journal of Conflict Resolution, Vol. 45, No. 2, (April), pp. 174–196.
Rosendorff, Peter B. and Helen V. Milner (2001). “The Optimal Design of International Trade Institutions: Uncertainty and Escape.” International Organization,
Vol. 55, No. 4 (Autumn), pp. 829–857.
Sand, P.H. (1997). “Commodity or Taboo? International Regulation of the Trade in
Endangered Species,” in The Green Globe Yearbook 1997, H.O. Bergensen and
G. Parmann, eds., Oxford: Oxford University Press, pp. 19–36.
Schmidt, Carsten (2000). Designing International Environmental Agreements: Incentive Compatible Strategies for Cost-Effective Cooperation, Cheltenham, UK and
Northampton, MA: Edward Elgar.
Simmons, Beth A. (1998). “Compliance with International Agreements.” Annual
Review of Political Science, Vol. 1, pp. 75–93.
Staiger, Robert W. (1995). “International Rules and Institutions for Trade Policy,” in
Handbook of International Economics, Vol. 3, Gene M. Grossman and Kenneth
Rogoff, eds., Amsterdam: Elsevier, North-Holland, pp. 1495–1551.
Sugden, R. (1986). The Economics of Rights, Cooperation and Welfare, Oxford: Basil
Blackwell.
Weber, Steve (1991). Cooperation and Discord in US-Soviet Arms Control, Princeton,
NJ: Princeton University Press.
APPENDIX
1. Non Technical Preamble
An important characteristic of all the designs we will examine here is their
behavior as noise is introduced. In all cases the formulation of strategy
remains constant while its exact parameters may vary with the introduction
of noise. Definition 1 in the appendix calls such strategies “stable under
noise.” Stability under noise refers to the structural stability of the design, to
the smoothness of adjustment of its strategic parameters, and to the changing long-run frequency with which the states distinguished by the design
are visited, as noise is introduced. In particular, structure, parameters, and
long-run frequencies approach those of the standard noiseless version of
the design as noise approaches zero. In our perspective as “designers” of
desirable strategic equilibria, noise is an inevitable nuisance that must be
taken into account at treaty design time knowing that the signatories will be
largely unable to change the design without substantial renegotiation. Our
concept of stability under noise ensures that the design is structurally stable
both as noise is introduced, and for different noise magnitudes. Propositions
3 and 4 in the appendix verify that the strategies we examine are “stable
under noise” and exploit that fact to derive treaty values.
Our technical developments also assume that noise magnitude e is small.
This assumption has a number of technical advantages but it also reflects a
point of view on the behavior of treaty signatories. If the managerial school is
to be believed, treaty negotiators strive to make texts as transparent as possible,
and signatories want and intend to abide by the agreement signed. Signatories
might then be observed to defect some of the time but certainly not most of the
time or even much of the time. Moreover, if the mismatch between intent and
realization occurred too often—say half of the time—the consequences of
intentional cooperation would be indistinguishable from those of intentional
defection and game theory would be powerless. As a technical matter, a small
noise parameter e is helpful for the following two reasons: First, various
schemes can be compared with reference to the rate vi at which treaty value Vi
declines when noise is introduced. Secondly, the analytical difficulties involved
in finding explicit formulae under noise drive a need to approximate treaty
value Vi. If e is small, rate vi can be used in a standard “first order approximation” of Vi. The technical conditions for the first order approximation to exist
are spelled out in Theorem 1 in the appendix and supported by the formal
expression for Vi given in Proposition 2. If e is small, the first order approximation is reliable.30 Explicit formulae for rates vi at which long run values decline
with noise, given the treaty designs we consider, are derived in the corollary to
Proposition 3 and in Proposition 4.
2. Mathematical Appendix
In vector form, noisy expected utilities $U_i^k$ must satisfy
$$U_i = W_i + w M U_i \tag{A1}$$
or
$$U_i = [I - wM]^{-1} W_i \tag{A2}$$
where $W_i$ and $M$ are the appropriate vector representations of the
states’ noisy payoffs and noisy transition probabilities. Noisy payoffs are
given in Table A1 below:
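TABLE A1 Noisy Payoffs Wi

Columns 3 to 6 give the observation probabilities for each observed pair, given the intentions in columns 1 and 2.

| Intention by i | Intention by j | Obs. (C,C) | Obs. (D,C) | Obs. (C,D) | Obs. (D,D) | Noisy payoff to i |
|---|---|---|---|---|---|---|
| C | C | $(1-e)^2$ | $e(1-e)$ | $e(1-e)$ | $e^2$ | $u_i(C,C) = e(1-e)(1-b_i) - e^2 c_i$ |
| D | C | $e(1-e)$ | $(1-e)^2$ | $e^2$ | $e(1-e)$ | $u_i(D,C) = (1-e)^2 - e^2 b_i - e(1-e)c_i$ |
| C | D | $e(1-e)$ | $e^2$ | $(1-e)^2$ | $e(1-e)$ | $u_i(C,D) = e^2 - (1-e)^2 b_i - e(1-e)c_i$ |
| D | D | $e^2$ | $e(1-e)$ | $e(1-e)$ | $(1-e)^2$ | $u_i(D,D) = e(1-e)(1-b_i) - (1-e)^2 c_i$ |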
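As a minimal sketch of how the entries of Table A1 arise, the following Python snippet recomputes each noisy payoff as the probability-weighted sum of the four noiseless payoffs (0, 1, −bi, −ci) and compares it with the closed forms in the last column. The function names and any sample values of e, bi, and ci are illustrative assumptions rather than part of the original analysis.

```python
from itertools import product

def noiseless_payoff(own, other, b, c):
    """Normalized stage payoff to i: (C,C)=0, (D,C)=1, (C,D)=-b, (D,D)=-c."""
    return {("C", "C"): 0.0, ("D", "C"): 1.0,
            ("C", "D"): -b, ("D", "D"): -c}[(own, other)]

def flip(move):
    return "D" if move == "C" else "C"

def noisy_payoff(own, other, e, b, c):
    """Expected payoff to i when each intended move is independently
    realized as the opposite move with probability e."""
    total = 0.0
    for flip_own, flip_other in product((False, True), repeat=2):
        prob = (e if flip_own else 1 - e) * (e if flip_other else 1 - e)
        seen_own = flip(own) if flip_own else own
        seen_other = flip(other) if flip_other else other
        total += prob * noiseless_payoff(seen_own, seen_other, b, c)
    return total

def closed_form(own, other, e, b, c):
    """Closed forms listed in the last column of Table A1."""
    return {
        ("C", "C"): e * (1 - e) * (1 - b) - e**2 * c,
        ("D", "C"): (1 - e)**2 - e**2 * b - e * (1 - e) * c,
        ("C", "D"): e**2 - (1 - e)**2 * b - e * (1 - e) * c,
        ("D", "D"): e * (1 - e) * (1 - b) - (1 - e)**2 * c,
    }[(own, other)]
```

Calling both functions on the same intended pair, for instance `noisy_payoff("D", "C", 0.01, 2.0, 1.0)` and `closed_form("D", "C", 0.01, 2.0, 1.0)`, should return the same number up to rounding.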
$W_i$ in (A1) and (A2) depends on the Markov strategy Ψ under consideration.
In Trigger, $W_i = \langle u_i(C,C), u_i(D,D), u_i(D,D) \rangle$ and, in CTFT, $W_i = \langle u_i(C,C), u_i(C,D), u_i(D,C) \rangle$ for i = 1, with $u_i(C,D)$ and $u_i(D,C)$ exchanged for
i = 2. The solution formulae for $U_i^k$ are, in general, complicated and hardly
tractable for theoretical results. Instead, results can be based on the observation that noise magnitude e is “small” and that for stable definitions of strategy much can be learnt from a differential analysis (as e → 0). Proposition 1,
Definition 1, and Theorem 1 are the basis for this approach.
Proposition 1: As e → 0, credibility conditions obtain for pairs $(q_i(e), r_i(e))$ with limit $(q_i, r_i)$ as e → 0 satisfying
1) For trigger:
$$r_i \le c_i q_i - \frac{1-w}{w} \tag{A3}$$
2) For CTFT:
$$r_i \le b_i q_i - \frac{1-w}{w} \quad \text{and} \quad (1 - w + w r_i)\,c_i \ge (1-w)\,b_i \tag{6}$$
Proof: As e → 0, (A2) yields
1) For trigger:
$$U_i^{CO} = 0 + w U_i^{CO} = 0$$
$$U_i^{P_i} = -c_i + w\{(1-r_i)U_i^{P_i} + r_i U_i^{CO}\} = \frac{-c_i}{1 - w + w r_i}$$
Deterrence (of player i) is achieved when $1 + w q_i U_i^{P_i} \le U_i^{CO} = 0$, or (A3).
It is optimal for player i to comply (play D while j plays D) with punishment in state $P_i$ when $-c_i + w U_i^{P_i} \le U_i^{P_i}$, which always holds since
$$-c_i \le (1-w)U_i^{P_i} = -(1-w)\frac{c_i}{1 - w + w r_i}$$
2) For CTFT:
$$U_i^{CO} = 0 + w U_i^{CO} = 0$$
$$U_i^{G_i} = -b_i + w\{(1-r_i)U_i^{G_i} + r_i U_i^{CO}\} = \frac{-b_i}{1 - w + w r_i}$$
Deterrence (of player i) is achieved when $1 + w q_i U_i^{G_i} \le 0$ or:
$$r_i \le b_i q_i - \frac{1-w}{w} \tag{A4}$$
It is optimal for player i to comply (play C while j plays D) with
punishment in state $G_i$ when $-c_i + w U_i^{G_i} \le U_i^{G_i}$, which holds provided
that
$$-c_i \le (1-w)U_i^{G_i} = -(1-w)\frac{b_i}{1 - w + w r_i}$$
or
$$(1 - w + w r_i)\,c_i \ge (1-w)\,b_i \tag{A5}$$
It is optimal for player j to play D against C in state $G_i$ and to comply
when returning to state CO in order to avoid becoming guilty. Indeed, if
player j fails to use D while in state $G_i$, this does not affect status or return
probability $r_i$. Q.E.D.
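The credibility conditions of Proposition 1 can be checked numerically from the noiseless state values derived in the proof. The following Python sketch is an illustration under assumed names (`punishment_value`, `trigger_credible`, `ctft_credible`) and an assumed small tolerance `TOL` for floating-point comparisons; it evaluates the inequalities exactly as stated in the proof rather than the rearranged forms (A3) to (A5).

```python
TOL = 1e-12  # assumed tolerance for floating-point comparisons

def punishment_value(loss, w, r):
    """Noiseless value of the punished state: -loss / (1 - w + w*r).
    loss = c_i for the trigger state P_i, b_i for the CTFT state G_i."""
    return -loss / (1 - w + w * r)

def trigger_credible(w, q, r, c):
    """Deterrence condition behind (A3): 1 + w*q*U_P <= 0."""
    return 1 + w * q * punishment_value(c, w, r) <= TOL

def ctft_credible(w, q, r, b, c):
    """Deterrence (A4) plus acceptance of punishment (A5)."""
    u_g = punishment_value(b, w, r)
    deterred = 1 + w * q * u_g <= TOL
    # Defecting in G_i yields -c now and leaves i in G_i next period.
    accepts = -c + w * u_g <= u_g + TOL
    return deterred and accepts
```

For example, with w = 0.9, ci = 1 and qi = 1, condition (A3) requires ri ≤ 1 − 0.1/0.9 ≈ 0.889, so `trigger_credible(0.9, 1.0, 0.8, 1.0)` holds while `trigger_credible(0.9, 1.0, 0.95, 1.0)` does not.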
Proposition 2: The long-run value $V_i$ of a Markov strategy pair Ψ
reads $V_i = \frac{1}{1-w}\sum_k m^k W_i^k$, where $W_i^k$ is i’s noisy expected payoff determined by player intentions in state k (see Table A1) according to Markov
strategy Ψ, and $m^k$ is the long-run frequency with which state k is visited.
Proof: Following standard Markov chain theory, if player i receives
discounted payoff $U_i^k$ in state k, the long-run value to player i of the strategy
characterized by the set of states {1, 2, .., k, .., n} and transition matrix M is
$V_i = \sum_{k=1}^{n} m^k U_i^k$. In dot product form:
$$V_i = m \cdot U_i = m \cdot W_i + w\, m M U_i = m \cdot W_i + w\, m \cdot U_i = m \cdot W_i + w V_i = \frac{1}{1-w}\, m \cdot W_i \tag{A6}$$
Q.E.D.
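Identity (A6) can be illustrated numerically on any small chain. The Python sketch below is a hedged illustration, not the authors' computation: it solves (A1) by fixed-point iteration and approximates the invariant distribution by repeated multiplication, both of which are assumed choices made to avoid external libraries. The function names and the iteration count are hypothetical.

```python
def discounted_utilities(M, W, w, iters=2000):
    """Solve U = W + w*M*U (A1) by fixed-point iteration (converges for w < 1)."""
    n = len(W)
    U = [0.0] * n
    for _ in range(iters):
        U = [W[k] + w * sum(M[k][l] * U[l] for l in range(n)) for k in range(n)]
    return U

def invariant_distribution(M, iters=2000):
    """Approximate the row vector m with m = m*M by repeated multiplication."""
    n = len(M)
    m = [1.0 / n] * n
    for _ in range(iters):
        m = [sum(m[k] * M[k][l] for k in range(n)) for l in range(n)]
    return m

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def long_run_value_direct(M, W, w):
    """V = m . U, the definition used at the start of the proof."""
    return dot(invariant_distribution(M), discounted_utilities(M, W, w))

def long_run_value_a6(M, W, w):
    """V = (m . W) / (1 - w), the right-hand side of (A6)."""
    return dot(invariant_distribution(M), W) / (1 - w)
```

On a three-state chain such as `M = [[0.97, 0.02, 0.01], [0.5, 0.5, 0.0], [0.4, 0.0, 0.6]]` with `W = [0.0, -1.0, -1.0]` and `w = 0.9`, the two functions should agree up to numerical error.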
Definition 1: A family Ψ(e) of Markov strategy pairs is “stable under noise”
(SUN) if: (i) Ψ(e) is differentiable for e > 0; (ii) all Ψ(e) share the same set of
states; (iii) there exists a unique $\Psi = \lim_{e \to 0}\Psi(e)$, meaning that the probability
of each move at each state in Ψ(e) approaches the probability of that move in
Ψ; (iv) there is a unique invariant distribution m of the transition matrix M of Ψ;
and (v) m(e) is differentiable in e and its derivative has a limit $m' = \lim_{e \to 0} m'(e)$.31
Theorem 1: If Ψ(e) is SUN then: (i) m satisfies $m' = m'M + mM'$; (ii) player
i’s long run value $V_i(e)$ is differentiable in e; (iii) $\lim_{e \to 0} V_i'(e)$ exists and is given
by $V_i'(0) = \frac{1}{1-w}\, n_i$ with $n_i = m_0' \cdot W_i^0 + m_0 \cdot w_i^0$, where $W_i^0 = W_i(0)$ is the
noiseless utility vector and $w_i^0$ is the derivative $\left.\frac{dW_i}{de}\right|_{e=0}$; and (iv) there exists
O(e) such that $\lim_{e \to 0} O(e) = 0$ and $V_i(e) = \frac{e}{1-w}\, n_i + e\, O(e)$.
Proof: Since m and M are differentiable, the product rule applied to m = mM yields the first
formula. Moreover, since $W_i$ is also differentiable, differentiating (A6) of Proposition 2 gives:
$$V_i'(e) = \frac{1}{1-w}\left\{ m' \cdot W_i + m \cdot \frac{dW_i}{de} \right\} \tag{A7}$$
and by stability of Ψ under noise $V_i'$ has the given limit. By the Mean Value Theorem of Calculus, $V_i(e) = V_i(0) + e\,V_i'(\eta_i)$ for $0 < \eta_i = \eta_i(e) < e$. But by the above
limit there exists $o_i(\eta)$ such that $V_i'(\eta) = V_i'(0) + o_i(\eta)$ with $\lim_{\eta \to 0} o_i(\eta) = 0$. Letting $O(e) = \max_i\{o_i(\eta_i(e))\}$ and observing that $V_i(0) = 0$ yields the result. Q.E.D.
Proposition 3: The family of triggers with fixed $q_i$ and $r_i$ is SUN. Moreover, $n_i^{trigger}$ is given by
$$n_i^{trigger} = 1 - b_i - c_i\left(\frac{q_1}{r_1} + \frac{q_2}{r_2}\right).$$
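Proof: Conditions (i), (ii), and (iii) of Definition 1 are satisfied since Ψ is
constant in e. Under noise magnitude e the transition matrix (on states CO,
P1, P2) reads:
$$M(e) = \begin{pmatrix} 1 - e(1-e)(q_1+q_2) & e(1-e)q_1 & e(1-e)q_2 \\ r_1 & 1-r_1 & 0 \\ r_2 & 0 & 1-r_2 \end{pmatrix}$$
and
$$M_0' = \begin{pmatrix} -(q_1+q_2) & q_1 & q_2 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$
With $m_0 = \langle 1, 0, 0 \rangle$ and $m_0' = \langle m_{CO}', m_1', m_2' \rangle$, the condition $m_0'[I - M_0] = m_0 M_0'$ reads $\langle -(r_1 m_1' + r_2 m_2'),\ r_1 m_1',\ r_2 m_2' \rangle = \langle -(q_1+q_2),\ q_1,\ q_2 \rangle$. Since the components of $m_0'$ sum to zero, this gives
$$m_0' = \left\langle -\left(\frac{q_1}{r_1} + \frac{q_2}{r_2}\right),\ \frac{q_1}{r_1},\ \frac{q_2}{r_2} \right\rangle.$$
Treaty value and $n_i^{trigger}$ then result from
Theorem 1. Q.E.D.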
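The rate in Proposition 3 can be cross-checked against the exact invariant distribution of M(e) for a small noise level. The sketch below is an illustration under stated assumptions: players are indexed 0 and 1, the function names are hypothetical, and the default noise level e = 1e-6 is an arbitrary choice. By Proposition 2, (1 − w)Vi(e)/e equals m(e) · Wi(e)/e, which should approach the formula as e shrinks.

```python
def trigger_rate_formula(i, b, c, q, r):
    """Rate from Proposition 3: 1 - b_i - c_i * (q_1/r_1 + q_2/r_2)."""
    return 1 - b[i] - c[i] * (q[0] / r[0] + q[1] / r[1])

def trigger_rate_numeric(i, b, c, q, r, e=1e-6):
    """(1 - w) * V_i(e) / e = m(e) . W_i(e) / e for the trigger chain M(e)."""
    # Balance at state P_k: m_Pk * r_k = m_CO * e * (1 - e) * q_k.
    weights = [1.0] + [e * (1 - e) * q[k] / r[k] for k in range(2)]
    total = sum(weights)
    m = [x / total for x in weights]
    # Noisy payoffs from Table A1 for intended (C,C) and intended (D,D).
    u_cc = e * (1 - e) * (1 - b[i]) - e**2 * c[i]
    u_dd = e * (1 - e) * (1 - b[i]) - (1 - e)**2 * c[i]
    W = [u_cc, u_dd, u_dd]  # states CO, P1, P2
    return sum(mk * wk for mk, wk in zip(m, W)) / e
```

With the illustrative values b = (2, 2), c = (1, 1), q = (1, 1) and r = (0.5, 0.5), the formula gives −5 for either player, and the numeric value differs from it only by a term of order e.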
Corollary: As e → 0, subgame perfect triggers with maximum long-run
values obtain for pairs $(q_i(e), r_i(e))$ with limit $(q_i, r_i)$ as e → 0 satisfying $r_i = 1$
and $q_i = \frac{1}{w c_i}$ if $w c_i > 1$, and $r_i = c_i - \frac{1-w}{w}$ and $q_i = 1$ if $w c_i < 1$. The corresponding minimum value $a_i^{trigger}$ of $\frac{q_i}{r_i}$ is given by
$$a_i^{trigger} = \begin{cases} \dfrac{w}{w c_i - (1-w)} & \text{if } w c_i \le 1 \\[2ex] \dfrac{1}{w c_i} & \text{if } w c_i \ge 1 \end{cases} \tag{A8}$$
Proof: $n_i^{trigger}$ is maximum when both $\frac{q_i}{r_i}$ are minimum. According to
Proposition 1 we examine the limit case as e → 0. For e = 0 the minimum
of $\frac{q_i}{r_i}$ is found when a credibility constraint is saturated, which means
$\frac{q_i}{r_i} = \frac{w q_i}{w c_i q_i - (1-w)}$ with $q_i > \frac{1-w}{w c_i}$ (to ensure $r_i > 0$). The minimum of $\frac{q_i}{r_i}$
occurs for the maximum $q_i$ that allows both $q_i$ and $r_i$ to be probabilities.
Either this means $q_i = 1$ and the given $r_i$ when it is a probability, or it means
$r_i = 1$ and the given $q_i$ which must then be a probability. The minimum
value $a_i^{trigger}$ follows immediately. Q.E.D.
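The case distinction in the Corollary translates directly into a small function. The following Python sketch, with the hypothetical name `best_trigger`, returns the limit pair (qi, ri) that saturates (A3) together with the ratio qi/ri; raising an error when wci ≤ 1 − w is an assumed convention for the case where no positive ri satisfies (A3).

```python
def best_trigger(w, c):
    """Limit (q_i, r_i) saturating (A3) with the smallest q_i / r_i.

    Returns (q, r, q / r). Requires w*c > 1 - w so that r > 0 is feasible.
    """
    if w * c <= 1 - w:
        raise ValueError("no credible trigger: need w*c > 1 - w")
    if w * c >= 1:
        # r_i = 1 and q_i = 1/(w c_i), which is a probability here.
        q, r = 1 / (w * c), 1.0
    else:
        # q_i = 1 and r_i = c_i - (1 - w)/w, which is a probability here.
        q, r = 1.0, c - (1 - w) / w
    return q, r, q / r
```

For instance, with w = 0.9 the call `best_trigger(0.9, 2.0)` falls in the second branch of (A8) and returns a ratio of 1/1.8, while `best_trigger(0.9, 0.8)` falls in the first branch and returns 0.9/0.62.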
Proposition 4: The family of CTFT with constant $q_i$ and $r_i$ is SUN.
Moreover,
$$n_i^{ctft} = 1 + \frac{q_j}{r_j} - \left(1 + \frac{q_i}{r_i}\right) b_i$$
Proof: Conditions (i), (ii), and (iii) of Definition 1 clearly hold
since Ψ(e) is constant in e. Condition (iv) is just as obvious with $m^{CO} = 1$.
To obtain m′ we need M′ for which we need M = M(e). When CC is
intended (at CO) a unilateral defection by i occurs with probability e(1
– e) and a bilateral defection occurs with probability $e^2$. The probability that player i alone becomes guilty from state CO as a result of
noise is:
$$q_i e(1-e) + q_i(1-q_j)e^2 = e\,q_i(1 - e\,q_j)$$
The first term corresponds to an observed unilateral defection by i and
the second to a bilateral defection with i alone being found guilty. The
probability of return from $G_i$ to CO is simply $r_i(1-e)$ since i will be
observed to cooperate as expected with probability (1 – e). The transition
matrix under noise e therefore reads:
$$M(e) = \begin{pmatrix} 1 - e(q_1 + q_2 - 2e\,q_1 q_2) & e\,q_1(1 - e\,q_2) & e\,q_2(1 - e\,q_1) \\ r_1(1-e) & 1 - r_1(1-e) & 0 \\ r_2(1-e) & 0 & 1 - r_2(1-e) \end{pmatrix}$$
Thus
$$M_0' = \begin{pmatrix} -(q_1+q_2) & q_1 & q_2 \\ -r_1 & r_1 & 0 \\ -r_2 & 0 & r_2 \end{pmatrix}$$
The vector $m_0'$ satisfying $m_0'[I - M_0] = m_0 M_0' = \langle -(q_1+q_2),\ q_1,\ q_2 \rangle$ is
therefore:
$$m_0' = \left\langle -\left(\frac{q_1}{r_1} + \frac{q_2}{r_2}\right),\ \frac{q_1}{r_1},\ \frac{q_2}{r_2} \right\rangle.$$
Treaty value and $n_i^{ctft}$ then result
from Theorem 1. Q.E.D.
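The CTFT rate of Proposition 4 can be cross-checked in the same way as the trigger rate, using the exact invariant distribution of the CTFT matrix M(e) at a small noise level. This is a minimal sketch under assumptions: players are indexed 0 and 1, states are ordered (CO, G of player 0, G of player 1), and the function names and default e are hypothetical.

```python
def ctft_rate_formula(i, b, q, r):
    """Rate from Proposition 4: 1 + q_j/r_j - (1 + q_i/r_i) * b_i."""
    j = 1 - i
    return 1 + q[j] / r[j] - (1 + q[i] / r[i]) * b[i]

def ctft_rate_numeric(i, b, c, q, r, e=1e-6):
    """(1 - w) * V_i(e) / e = m(e) . W_i(e) / e for the CTFT chain M(e)."""
    j = 1 - i
    # Balance at state G_k: m_Gk * r_k * (1 - e) = m_CO * e * q_k * (1 - e * q_other).
    weights = [1.0] + [e * q[k] * (1 - e * q[1 - k]) / (r[k] * (1 - e))
                       for k in range(2)]
    total = sum(weights)
    m = [x / total for x in weights]
    # Noisy payoffs to i from Table A1.
    u_cc = e * (1 - e) * (1 - b[i]) - e**2 * c[i]
    u_cd = e**2 - (1 - e)**2 * b[i] - e * (1 - e) * c[i]  # i cooperates, j defects
    u_dc = (1 - e)**2 - e**2 * b[i] - e * (1 - e) * c[i]  # i defects, j cooperates
    W = [0.0, 0.0, 0.0]
    W[0] = u_cc       # state CO
    W[1 + i] = u_cd   # i is guilty and makes amends
    W[1 + j] = u_dc   # j is guilty and i collects redress
    return sum(mk * wk for mk, wk in zip(m, W)) / e
```

With the illustrative values b = (2, 1.5), c = (1, 0.8), q = (1, 1) and r = (0.5, 1), the formula gives −4 for player 0 and 0 for player 1, and the numeric values agree up to a term of order e.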
Proposition 5: For any Prisoner’s Dilemma there exists a CTFT
scheme such that $n_i^{ctft} > n_i^{trigger}$ for both i.
Proof: In CTFT, the minimum value of $\frac{q_i}{r_i}$ subject to the credibility conditions (A4) and (A5) is given by
$$\frac{q_i}{r_i} = a_i^{ctft} = \begin{cases} \dfrac{w}{w b_i - (1-w)} & \text{if } w b_i \le 1 \\[2ex] \dfrac{1}{w b_i} & \text{if } w b_i \ge 1 \end{cases}$$
We verify that $n_i^{ctft} > n_i^{trigger}$ for the choice $a_i^{ctft}$ against the best trigger choice
$a_i^{trigger}$:
$$n_i^{ctft} = 1 + a_j^{ctft} - b_i(1 + a_i^{ctft}) > 1 - b_i - c_i(a_i^{trigger} + a_j^{trigger}) = n_i^{trigger}$$
clearly holds if $b_i a_i^{ctft} - c_i a_i^{trigger} \le 0$. This last inequality must be verified in three cases:
1. If $w b_i > w c_i \ge 1$ it reduces to $\frac{b_i}{w b_i} - \frac{c_i}{w c_i} = 0$;
2. If $w c_i \le w b_i \le 1$ it reduces to $\frac{w b_i}{w b_i - (1-w)} - \frac{w c_i}{w c_i - (1-w)} \le 0$, which holds
since $b_i > c_i$ and the expression $\frac{x}{x - (1-w)}$ is decreasing in x > 1 – w.
3. If $w c_i \le 1$ and $w b_i \ge 1$ it reduces to
$\frac{b_i}{w b_i} - \frac{w c_i}{w c_i - (1-w)} = -\frac{(1-w)(1 - w c_i)}{w\,(w c_i - (1-w))} \le 0$ since $w c_i > (1-w)$ is implied
by (3). Q.E.D.
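The comparison in Proposition 5 can be illustrated numerically by plugging the minimum ratios into the two rate formulas. The Python sketch below uses hypothetical function names and assumes players indexed 0 and 1 with w·ci > 1 − w; it does not check condition (A5), which the proof treats separately.

```python
def min_ratio(w, x):
    """Smallest credible q/r when punishment costs x per period:
    x = c_i gives the trigger ratio (A8), x = b_i gives the CTFT ratio."""
    if w * x <= 1 - w:
        raise ValueError("need w*x > 1 - w")
    if w * x >= 1:
        return 1 / (w * x)
    return w / (w * x - (1 - w))

def best_rates(w, b, c):
    """Return ([ctft rate for i=0,1], [trigger rate for i=0,1]) at the minimum ratios."""
    a_trig = [min_ratio(w, c[k]) for k in range(2)]
    a_ctft = [min_ratio(w, b[k]) for k in range(2)]
    trigger = [1 - b[i] - c[i] * (a_trig[0] + a_trig[1]) for i in range(2)]
    ctft = [1 + a_ctft[1 - i] - b[i] * (1 + a_ctft[i]) for i in range(2)]
    return ctft, trigger
```

For a Prisoner’s Dilemma with the illustrative values w = 0.9, b = (2, 1.5) and c = (1, 0.8), `best_rates` returns CTFT rates that exceed the trigger rates for both players, which is what the proposition asserts.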