Appendix - Jeffrey Kaplow

Simulating Compliance Equilibria in International Institutions
Jeffrey M. Kaplow
One implication of my theory is that a state may fluctuate in its compliance with an
institution when its assessment of the extent to which other states are cheating is close to the
minimum level it is willing to tolerate while staying in compliance. A discovery of a new
violation may send a state’s assessments below its minimum requirement, causing the state
to drop out of compliance with the institution. Some period of success for the institution may
then cause the state’s assessment of overall compliance to rise, prompting the state to return
to compliance. This effect may be particularly pronounced when there are still high levels of
uncertainty around the true effectiveness of the treaty (for example, early in the life of the
international organization or when the information provided by the institution’s track record
has been contradictory). When uncertainty is greater, each piece of new information is likely
to carry more weight and lead to greater adjustments in states’ expectations of overall
compliance.
These decisions by individual states to comply or not to comply with international
agreements, when aggregated to the level of the international organization (IO), determine
the overall effectiveness of international institutions. This theory suggests that IOs can settle
into different equilibria based on the institution’s design, the compliance and enforcement
track record of the IO, and variation in member states’ requirement for reciprocity. It is easy
to see that these factors can lead to a spiral of IO decline. As violations of an agreement are
discovered, states see others as less likely to comply and thus are less likely to do so
themselves. This leads to additional violations, making states even less likely to comply, and
so on. Ultimately, such IOs are reduced to zombie institutions; they may continue to exist
(IOs are very difficult to dispose of) but they no longer exert any constraining power on their
members.
On the other hand, confidence in an institution can breed more confidence, leading to
a virtuous cycle of self-reinforcing compliance. As time passes without news of violations
uncovered by strong verification mechanisms, states revise upward their assessments of the
compliance of others, and become more likely to comply themselves. These IOs become
constraining institutions, with high levels of compliance.
IOs may also settle into a “good enough” equilibrium that occupies a middle ground,
reaching neither universal nor zero compliance. This equilibrium is likely under reasonable
assumptions about the set of state requirements for reciprocity, for example when these
approximate a bimodal distribution. In a final equilibrium, states vacillate between
compliance and noncompliance. This can be the result of a high level of compliance
uncertainty caused either by weak verification mechanisms or by a narrow distribution of
state requirements for reciprocity. These wavering institutions can exhibit a very different
constraining effect on member states depending on where we look in the cycle of compliance
and noncompliance.
A simulated n-player prisoner’s dilemma
To illustrate the way in which these different end-states can arise as a result of the
decisions of individual member states, I simulate a version of an iterated n-player prisoner’s
dilemma, with incomplete information and a continuum of player types.1 These simulations
do not constitute a test of my theory; they are merely one way to highlight some of the
implications of the theory. At most, this approach can show that particular outcomes are not
logically inconsistent with the theory I have proposed. Because it is the complexity of this
strategic interaction that drives the theorized track record mechanism (particularly variation
in the payoffs among a large number of players and uncertainty over outcomes), simulations
can be particularly helpful in understanding the system-level effects of decisions made at the
level of the individual state.
1. See Molander (1992) for a closed-form solution to a simplified version of this game with
fewer than 5 players.
I simulate the choice of member states to comply with or violate an international
agreement, using a very simple decision rule: if the state’s assessment of overall compliance
exceeds a particular level, it complies, and if not, it violates. I now describe the functioning
of the underlying model, explain the simulation’s parameters, and finally present results with
a focus on the determinants of end-states for IO effectiveness.
The players in this game are the n member states of an international institution. In
each period of the infinitely repeated game, each member state decides whether to comply
with or violate the rules of the institution, and each receives a payoff equal to a function of
the number of other member states that choose to comply. Payoffs for compliance
incorporate the present value of the stream of payoffs into the future, while the value states
receive for violation is the payoff for the present period only. States share the same discount
rate. The payoffs conform to the structure of a prisoner’s dilemma, so the payoff for the
current period is always greater for violating than for complying. The specific payoff function
is assigned by nature according to some distribution.
As described above, any payoff in this game can be expressed in terms of the
minimum number of other states that must comply for the state to prefer compliance to
violation. States thus play a simple strategy: comply if at least i other states comply, and violate
otherwise, where i is the state’s minimum requirement for reciprocity based on the payoff
function assigned by nature. Prior to the first period of play, nature assigns each member
state a prior expectation about overall compliance with the institution, according to some
distribution. In subsequent periods, each member state revises its beliefs about overall
compliance based on some function of the actual overall compliance in the previous period.
At the beginning of the simulation, then, some number of states is assumed to have
joined a new international institution. These states each have a particular level of overall
compliance that they require in order to comply themselves (their required reciprocity), as
well as some prior expectation about overall compliance (perhaps based on the monitoring
and verification measures present in the IO). If a state’s expectation for overall compliance is
greater than its required reciprocity, it will choose to comply in period 1. If a state’s
expectation for overall compliance is less than its required reciprocity, it will choose to
violate the agreement in period 1. After period 1, each state receives some information about
the number of states that complied in period 1 (the track record of the IO). States then revise
their previous expectation of overall compliance to reach a new assessment for period 2. If
that revised assessment is greater than a state’s required reciprocity, the state will comply in
period 2; if the revised assessment is less than its required reciprocity, it will violate the
agreement in period 2. After period 2, each state updates its assessment again, and this
process continues for some number of periods.
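The per-period decision rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the reported results: the function names and the three example states are hypothetical, and treating an exact tie between assessment and requirement as a violation is an assumption, since the text specifies only the strictly greater and strictly less cases.

```python
def decide(assessments, requirements):
    """Return one boolean per state: True = comply, False = violate.

    A state complies only if its assessment of overall compliance is
    strictly greater than its required reciprocity (ties count as
    violations; that tie rule is an assumption of this sketch).
    """
    return [a > r for a, r in zip(assessments, requirements)]


def compliance_rate(decisions):
    """Share of states that chose to comply in the period."""
    return sum(decisions) / len(decisions)


# Three hypothetical states with different expectations and requirements.
assessments = [0.60, 0.45, 0.30]
requirements = [0.40, 0.50, 0.20]

decisions = decide(assessments, requirements)
print(decisions)                   # [True, False, True]
print(compliance_rate(decisions))  # two of the three states comply
```

The resulting compliance rate is the quantity that feeds the track record observed by states before the next period.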
Table 1 lists the model’s parameters, with baseline values or distributions.2 The
baseline value for the number of states is set arbitrarily to 100, thus modeling an inclusive
but not universal international agreement. The number of periods is set to 100, enough for
the simulation to reach equilibrium under most specifications.3 Changing either of these
values does not substantively affect the results of the simulation.
2. These baseline values are reasonable starting points, but they represent inherently arbitrary
assumptions that are relaxed in the analysis that follows.
The required level of reciprocity is by default a normal distribution with mean 0.4
and standard deviation 0.2, providing a reasonable range of values.4 The initial assessment
of overall compliance is similarly a normal distribution with mean 0.6 and standard
deviation 0.2. The baseline values, then, represent a case in which, on average, member
states plan to comply in period 1 and expect that most states will do the same. At the same
time, there is a fairly broad range of expectations and requirements for compliance,
reflecting substantial uncertainty about outcomes.
After each period, states receive a noisy signal of IO effectiveness based on the actual
compliance in the previous round. In the baseline case, this information is normally
distributed around the actual level of compliance in the previous period with a standard
deviation of 0.1. The standard deviation of this distribution can be thought of as
representing the strength of verification measures within the IO. When verification is strong,
the information provided by the track record of the institution will be more accurate. When
verification is weak, the standard deviation will be high and the signal will be difficult to
distinguish from the noise.
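The role of verification strength can be illustrated with a short Python sketch that draws track-record signals around a fixed true compliance level. The helper names are hypothetical, the two σ values and the true level of 0.8 are chosen only for illustration, and signals are reset to the [0, 1] interval in the same way the model treats its other bounded values.

```python
import random


def clip01(x):
    """Reset values below 0 to 0 and values above 1 to 1."""
    return min(1.0, max(0.0, x))


def noisy_signal(actual_compliance, sigma_track_record, rng):
    """One track-record signal: normal around actual compliance, then bounded."""
    return clip01(rng.gauss(actual_compliance, sigma_track_record))


def mean_abs_error(signals, truth):
    """Average distance between the signals and the true compliance level."""
    return sum(abs(s - truth) for s in signals) / len(signals)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
truth = 0.8             # illustrative true compliance level

strong = [noisy_signal(truth, 0.02, rng) for _ in range(1000)]  # strong verification
weak = [noisy_signal(truth, 0.50, rng) for _ in range(1000)]    # weak verification

print(mean_abs_error(strong, truth))
print(mean_abs_error(weak, truth))
```

The second error is far larger than the first: with weak verification, a single observation says little about the true level of compliance.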
3. We often treat each period as a calendar year, but there is no overriding reason that this
should be so. We might, for example, think of the baseline 100 periods as representing a 25-year
treaty that is evaluated quarterly by each of its members.
4. For this and other bounded values in the model, values greater than 1 and less than 0 are
reset to 1 and 0, respectively. Beta and logistic normal distributions, bounded between 0 and
1, yield similar results.
Table 1: Parameters for a simulation of IO compliance
Parameter                                                                 Baseline Value/Distribution
Number of states                                                          100
Number of periods                                                         100
Required level of reciprocity                                             N(μ = 0.4, σ = 0.2)
Initial assessment of overall compliance                                  N(μ = 0.6, σ = 0.2)
Information about previous round’s overall compliance (track record)      N(μ = actual compliance in previous round, σ = 0.1)
Weight on previous round’s assessed compliance versus earlier rounds (α)  0.05
State assessments of the overall level of compliance with the IO are an exponentially
weighted moving average of the initial expectation and prior assessments of compliance.
New information is weighted according to the α parameter, such that
$A_t = \alpha C_{t-1} + (1 - \alpha) A_{t-1}$,
where $A_t$ is the assessed level of overall compliance for period t, $C_{t-1}$ is the noisy
signal of compliance in the previous period, $A_{t-1}$ is the state’s assessment for the previous
period, and $A_1$ is the state’s initial assessment of compliance. Higher values for α mean that
new information is weighted more heavily, at the risk of overreaction. Low values of α
suggest states have a longer memory or are more concerned with longer-term trends.5 The
baseline value of α, 0.05, makes only small adjustments to states’ assessments of compliance
based on the most recent compliance information they receive.
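The updating rule can be made concrete with a short Python sketch. The function names are hypothetical; the example applies three consecutive updates to an initial assessment of 0.6 when every signal reports full compliance, and then computes the weight that the initial assessment still carries after a given number of updates, which is (1 − α) raised to that number.

```python
def update_assessment(previous_assessment, signal, alpha=0.05):
    """One step of the exponentially weighted moving average.

    The new assessment puts weight alpha on the latest noisy signal and
    weight (1 - alpha) on the previous assessment.
    """
    return alpha * signal + (1.0 - alpha) * previous_assessment


def weight_on_initial(alpha, n_updates):
    """Weight the initial assessment retains after n_updates revisions."""
    return (1.0 - alpha) ** n_updates


assessment = 0.6  # initial assessment of overall compliance
for _ in range(3):
    # Each period the state observes a signal of full compliance (1.0).
    assessment = update_assessment(assessment, signal=1.0)
print(assessment)  # roughly 0.657: three good signals move the assessment only slightly

# After 99 updates (periods 2 through 100 of the baseline run):
print(weight_on_initial(0.05, 99))  # below 1 percent, but never exactly zero
print(weight_on_initial(0.20, 99))  # vanishingly small when alpha is higher
```

At the baseline α of 0.05, three consecutive reports of full compliance raise the assessment by less than six percentage points, which is why adjustment in the baseline case is gradual.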
5. Note that the exponentially weighted moving average does not allow players to “forget”
about past assessments or the initial expectation of compliance: these earlier judgments
always exert some influence over current assessments, even many periods later, as long as α
is less than 1.
Compliance equilibria in international institutions
The baseline values for the model parameters produce a self-reinforcing cycle of
compliance. States’ initial expectations with regard to overall compliance are generally
higher than their required levels of reciprocity. Because of this, states are likely to comply
with the IO from the start. Their decision is then reinforced by the positive signal provided
by the track record of the regime; still more states then comply, creating an even more
positive signal. The end-state suggested by these simulations is one of near-universal
compliance.
Figure 1 shows the result of 1,000 simulations using the baseline values for the
model. In this and other figures depicting model output, the solid line tracks overall
compliance with the agreement, averaged over the simulations, and the dashed line shows
the average of states’ assessments of overall compliance in each period. The histogram on the
left side of the chart illustrates the distribution of required reciprocity among the states.
Shaded bars on the right side of the chart show the relative frequency of IO end-states—the
overall compliance with the agreement in the final period of each simulation. In the default
case shown in Figure 1, compliance starts high and gets higher (the solid line) as state
assessments increase (the dashed line) based on the track record of the IO. The speed with
which the IO reaches an equilibrium of universal compliance is largely governed by the α
parameter; higher values for α lead to more rapid adjustments in compliance in response to
new information.
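The baseline case can be approximated with the following Python sketch, which combines the decision rule, the noisy track record, and the moving-average update using the Table 1 defaults. It is an illustration under stated assumptions rather than the code that produced Figure 1: the function and argument names are hypothetical, all states are assumed to observe the same signal each period (independent per-state draws are an equally plausible reading of the text), ties count as violations, and only 200 runs are averaged to keep the example quick.

```python
import random


def clip01(x):
    """Reset values below 0 to 0 and values above 1 to 1."""
    return min(1.0, max(0.0, x))


def simulate(n_states=100, n_periods=100,
             req_mu=0.4, req_sigma=0.2,
             init_mu=0.6, init_sigma=0.2,
             sigma_track=0.1, alpha=0.05, seed=None):
    """Run one simulation and return overall compliance in each period."""
    rng = random.Random(seed)
    # Nature assigns each state a required reciprocity and a prior assessment.
    requirements = [clip01(rng.gauss(req_mu, req_sigma)) for _ in range(n_states)]
    assessments = [clip01(rng.gauss(init_mu, init_sigma)) for _ in range(n_states)]
    history = []
    for _ in range(n_periods):
        # A state complies when its assessment exceeds its requirement.
        compliance = sum(a > r for a, r in zip(assessments, requirements)) / n_states
        history.append(compliance)
        # Assumption: one shared noisy signal of this period's compliance.
        signal = clip01(rng.gauss(compliance, sigma_track))
        # Every state revises its assessment with weight alpha on the signal.
        assessments = [alpha * signal + (1 - alpha) * a for a in assessments]
    return history


runs = [simulate(seed=s) for s in range(200)]
mean_path = [sum(run[t] for run in runs) / len(runs) for t in range(100)]
print(round(mean_path[0], 2), round(mean_path[-1], 2))
```

Passing different values for `req_mu`, `init_mu`, `sigma_track`, or `alpha` allows the same sketch to be used to explore the other cases discussed in this appendix.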
Figure 1: A positive cycle of compliance
[Figure: line plot of compliance (0.0 to 1.0) against period (0 to 100), showing mean actual compliance and mean assessed compliance, with marginal distributions of required reciprocity and of compliance end-states. Required reciprocity: N(µ = 0.4, σ = 0.2); initial assessment: N(µ = 0.6, σ = 0.2); track record σ = 0.1; α = 0.05; number of simulations: 1000.]
The race to universal compliance in the baseline example is driven by the relationship
between states’ initial expectations of compliance and states’ required reciprocity. In
Figure 1, initial expectations are on average higher than states’ required reciprocity. If the
reverse is true, the IO descends into an equilibrium where few or no states comply.6 The top
panel of Figure 2 shows such a negative cycle of decline. For these 1,000 simulations, I have
reversed the distributions of states’ required reciprocity and their initial expectations of
compliance from the baseline values, so that most states now expect that overall compliance
will not be high enough in the initial period to justify their own compliance.
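The relationship between the two distributions can be made concrete by computing the share of states expected to comply in the first period. The sketch below is a back-of-the-envelope calculation, not part of the simulation itself: it assumes that each state’s initial assessment and required reciprocity are drawn independently, and it ignores the resetting of values to the [0, 1] interval. Under those assumptions, the difference between the two draws is normal, so the share is a normal cumulative probability. The function names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_initial_compliance(init_mu, req_mu, init_sigma=0.2, req_sigma=0.2):
    """Probability that a state's initial assessment exceeds its requirement.

    Assumes independent normal draws and ignores truncation at 0 and 1.
    """
    spread = math.sqrt(init_sigma ** 2 + req_sigma ** 2)
    return normal_cdf((init_mu - req_mu) / spread)


baseline = expected_initial_compliance(init_mu=0.6, req_mu=0.4)       # Figure 1
reversed_case = expected_initial_compliance(init_mu=0.4, req_mu=0.6)  # Figure 2, top
near_case = expected_initial_compliance(init_mu=0.6, req_mu=0.5)      # Figure 2, bottom

print(round(baseline, 2), round(reversed_case, 2), round(near_case, 2))
```

Roughly three quarters of states comply at the outset in the baseline case and roughly one quarter in the reversed case, so the first signals from the track record already point in opposite directions.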
6. In reality, there is likely to be some subset of states that complies with an agreement under
any circumstances. Setting such a compliance “floor” in the simulation does not change the
analysis.
Figure 2: Negative cycles of IO compliance
[Figure: two panels, each plotting compliance (0.0 to 1.0) against period (0 to 100), showing mean actual compliance and mean assessed compliance, with marginal distributions of required reciprocity and of compliance end-states.
Top panel: required reciprocity N(µ = 0.6, σ = 0.2); initial assessment N(µ = 0.4, σ = 0.2); track record σ = 0.1; α = 0.05; number of simulations: 1000.
Bottom panel (annotated “Shock of increased violations”): required reciprocity N(µ = 0.5, σ = 0.2); initial assessment N(µ = 0.6, σ = 0.2); track record σ = 0.1; α = 0.05; number of simulations: 1000.]
While it is certainly possible for states to join a treaty with the knowledge that they
will violate it, it seems more likely that many of the states in the previous example would
simply choose not to sign such an agreement in the first place.7 A perhaps more plausible
story is shown in the bottom panel of Figure 2. Here, the mean required reciprocity (0.5) is
closer to the mean initial assessment of overall compliance (0.6), but most states still enter
the agreement with the intent to comply. An exogenous shock is then introduced in period
10 in the form of a 20-percentage-point decrease in overall compliance.8 This shock, due,
say, to an international financial crisis or new military activity in some global flashpoint,
changes the trend of overall compliance from a positive cycle to a negative cycle. As states
adjust to the shock by incorporating information from the track record of the IO, compliance
drops further. The modal result is an equilibrium of zero compliance.9 An external shock that
leads to an increase in violations, then, is one way in which a majority of states can join a
treaty with the intent to comply, but later choose to violate the treaty because too few
member states are in compliance.
7. I do not attempt to model treaty accession here, but we should not lose sight of the fact
that states which choose to participate in an IO are likely to be different in important ways
from those that decline to join.
8. Such a shock need not be so dramatic nor occur at only a single point. Relatively small and
repeated adjustments in the baseline level of compliance or violation can have the same
effect if the average initial assessment of compliance is close enough to the average required
reciprocity, and if enough weight is placed on new events (i.e., if α is high enough).
9. The timing of the shock matters. When α is low, so that new information affects states’
assessments only slightly, early shocks have a much greater effect on the IO’s end state. In
the example shown in the lower panel of Figure 2, the modal outcome shifts above zero
compliance when the shock occurs after period 30.
To keep the focus on the question of member state compliance, my simulation does
not allow states to exit the IO. Violation of an international agreement, however, is properly
seen as one in a portfolio of options available to states for which compliance has become too
costly. These options include not just exit, but also the execution of flexibility mechanisms
and attempts to change the obligations required under the agreement.10 To the extent that
violation is most damaging to the IO overall (because it is more likely to cause other member
states to fall out of compliance in turn), flexibility mechanisms or even treaty exit may be
important tools to prevent the IO from falling into a zero-compliance equilibrium.
Of course, the world is not divided into those IOs enjoying universal compliance and
those suffering with zero compliance. The simulation illuminates many results in the wide
middle ground between these two extremes. Whether compliance with IOs will spiral all the
way up or down depends on the distribution of states’ requirement for reciprocity. If this
parameter is normally distributed, compliance will be pushed up or down depending on
states’ initial assessment of overall compliance. But a more realistic set of requirements for
reciprocity might be a bimodal distribution with spikes close to 0 and 1. We might expect
such a distribution in IOs, for example, that are only weakly constraining. Most member
states in these institutions either have no willingness or no capability to violate the
international agreement, but a minority is weakly constrained and will comply as long as
almost all others do so. The upper panel in Figure 3 shows 1,000 simulations of such a case.
Here, compliance is relatively high, but the states with the highest levels of required
reciprocity will choose not to comply.
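A bimodal set of reciprocity requirements of the kind described here can be sketched as a two-component mixture. The text does not specify the mixture, so everything in the example is an assumption made for illustration: the component means of 0.05 and 0.95, the common standard deviation of 0.05, the 80/20 split between the two groups, and the function name itself.

```python
import random


def clip01(x):
    """Reset values below 0 to 0 and values above 1 to 1."""
    return min(1.0, max(0.0, x))


def bimodal_requirements(n_states, share_low, rng,
                         low_mu=0.05, high_mu=0.95, sigma=0.05):
    """Draw required reciprocity from a mixture with spikes near 0 and 1.

    With probability share_low a state belongs to the low-requirement
    group; otherwise it belongs to the high-requirement group. All
    numeric defaults are hypothetical.
    """
    draws = []
    for _ in range(n_states):
        mu = low_mu if rng.random() < share_low else high_mu
        draws.append(clip01(rng.gauss(mu, sigma)))
    return draws


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Weakly constraining IO: most states comply almost regardless of others.
mostly_low = bimodal_requirements(1000, share_low=0.8, rng=rng)
# The opposite case: most states require near-universal compliance.
mostly_high = bimodal_requirements(1000, share_low=0.2, rng=rng)

print(sum(r < 0.5 for r in mostly_low) / 1000)
print(sum(r < 0.5 for r in mostly_high) / 1000)
```

Substituting draws like these for the normally distributed requirements in a simulation is one way to explore middle-ground end-states, since the low-requirement group keeps complying even when the high-requirement group does not.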
10. On withdrawal from treaties, see Helfer (2005). On flexibility mechanisms, see Hafner-Burton,
Helfer, and Fariss (2011) and Rosendorff and Milner (2001). On renegotiation, see
Koremenos (2001, 2005).
Alternatively, a majority of states might have requirements for reciprocity that are
clustered around 1, with a minority near 0, as shown in the lower panel of Figure 3. We
might find such a distribution in IOs with a two-tiered set of obligations, where a minority of
states faces fewer costs to complying, or simply in agreements, such as those governing free
trade, where a majority of states are very sensitive to the compliance of their fellow members
while a minority are very insensitive. In this example, compliance falls, but the small group
of member states with low requirements for reciprocity will remain in compliance with the
treaty.
Figure 3: Middle ground end-states of IO compliance
Finally, the simulation illustrates that IOs may vacillate between positive and
negative cycles of compliance. This is most likely when verification measures are weak,
represented in the model by a high standard deviation associated with the track record of the
IO. Figure 4 gives an example from a single simulation.11 Here, the mean requirement for
reciprocity (0.5) is close to the mean initial assessment of compliance (0.6).12 The track
record is more noise than signal (σTrack Record = 1), and states pay relatively more attention to
recent events (α = 0.2). The result is a curve that vacillates between cycles of violation and
compliance, with changes driven by the very noisy signal of the IO track record.13
11. Multiple simulations cancel out the vacillating pattern in IO compliance, leaving a straight
line. Figure 4 is a representative example of this category of wavering IO.
12. The vacillating effect is more pronounced under more narrow distributions of states’
required reciprocity (i.e. when the standard deviation of states’ requirement for reciprocity is
lower than the baseline of 0.2).
13. Shifts in cycles of violation and compliance could also be driven by external shocks, as
discussed above.
Figure 4: Positive and negative cycles of IO compliance
[Figure: line plot of compliance (0.0 to 1.0) against period (0 to 100) for a single simulation, showing mean actual compliance and mean assessed compliance, with the marginal distribution of required reciprocity. Required reciprocity: N(µ = 0.5, σ = 0.2); initial assessment: N(µ = 0.6, σ = 0.2); track record σ = 1; α = 0.2; number of simulations: 1.]
This simulation of IO compliance illustrates the dynamic nature of compliance
decisions implied by the track record mechanism. The theory is consistent with IO equilibria
in which compliance is high, in which violations are rampant, or in which IOs vacillate
between these two extremes. Simulating the emergent properties of state behavior in this
way also clarifies the importance within the theory of state-level factors that lead to variation
in states’ requirements for reciprocity. The proportion of states that are willing and capable
of violating the treaty plays a key role in pushing the IO either toward a positive, self-reinforcing equilibrium with high levels of compliance, or toward a spiral of IO decline.
References
Hafner-Burton, Emilie M., Laurence R. Helfer, and Christopher J. Fariss. 2011. “Emergency
and Escape: Explaining Derogations from Human Rights Treaties.” International
Organization 65(4): 673–707.
Helfer, Laurence R. 2005. “Exiting Treaties.” Virginia Law Review 91(7): 1579–1648.
Koremenos, Barbara. 2001. “Loosening the Ties That Bind: A Learning Model of Agreement
Flexibility.” International Organization 55(2): 289–325.
Koremenos, Barbara. 2005. “Contracting around International Uncertainty.” The American Political Science
Review 99(4): 549–65.
Molander, Per. 1992. “The Prevalence of Free Riding.” Journal of Conflict Resolution 36(4):
756–71.
Rosendorff, B. Peter, and Helen V. Milner. 2001. “The Optimal Design of International Trade
Institutions: Uncertainty and Escape.” International Organization 55(4): 829–57.