Cooperative and Non-Cooperative Equilibrium
Problems with Equilibrium Constraints:
Applications in Economics and Transportation
Andrew Koh⋆
Institute for Transport Studies,
University of Leeds,
Leeds, LS2 9JT, United Kingdom,
[email protected]
http://www.its.leeds.ac.uk
Abstract. In recent years, a plethora of multi-objective evolutionary
algorithms (MOEAs) have been proposed which are able to effectively
handle complex multi-objective problems. In this paper, we focus on
Equilibrium Problems with Equilibrium Constraints. We show that one
interpretation of the game can also be handled by MOEAs and then discuss a simple methodology to map the non-cooperative outcome to the
cooperative outcome. We demonstrate our proposed methodology with
examples sourced from the economics and transportation systems management literature. In doing so we suggest resulting policy implications
which will be of importance to regulatory authorities.
Keywords: Equilibrium Problems with Equilibrium Constraints, Multi-Objective Evolutionary Algorithms, Nash Equilibrium, Collusion, Transportation
1 Introduction
The focus of this paper is on hierarchical optimization problems. Fig. 1 shows the
structure of a single-leader follower game/bilevel optimization problem [8], which
has attracted attention in the Evolutionary Computation community in recent
years¹. In this paper we study a generalized variant of such problems known as
Equilibrium Problems with Equilibrium Constraints (EPECs), as illustrated in
Fig. 2. Both are hierarchical games where the followers take the leaders' variables
as given and their responses are subsequently imposed as a nonlinear binding
constraint on the actions of the leader(s). The difference is that EPECs are
characterized by the presence of multiple leaders.
In this multi-leader generalization of the classic Stackelberg [18] game, researchers
have conjectured that there could be two possible behaviors of the leaders [11, 15].
At one extreme, they could cooperate, and such a postulate leads naturally to a
Multi-Objective EPEC (hereinafter termed MOPEC) [20]. At the other extreme, these
leaders could engage in a non-cooperative Nash [14] game amongst themselves,
thereby resulting in a Non-Cooperative EPEC (NCEPEC).

Fig. 1. Bilevel Program or Mathematical Problem with Equilibrium Constraint (MPEC)
Fig. 2. Equilibrium Problem with Equilibrium Constraint (EPEC)

⋆ The author is grateful for financial support by the Engineering and Physical Sciences
Research Council of the UK under Grant EP/H021345/1.
¹ E.g. a Special Session on Bilevel Optimization was convened at the 2012 IEEE
Congress on Evolutionary Computation (CEC) (June 10-15) in Brisbane, Australia.
Under either postulate of leader behavior, we argue that meta-heuristics offer a
powerful solution methodology for EPECs, which are usually tackled using tools of
generalized calculus [11]. For MOPECs, population-based Multi-Objective
Evolutionary Algorithms (MOEAs) are particularly suitable due to their
inherent ability to identify multiple Pareto Optimal solutions in a single run [2].
For the latter class of NCEPECs, a Differential Evolution [16] based algorithm
exploiting a concept from [10] has been proposed in [9]. In this paper we suggest
a methodology that maps the Non-Cooperative outcome to the Cooperative
outcome by modification of the algorithm proposed in [9].
The rest of this paper is structured as follows. We give an overview of the notation
used in this paper in the next section. In Section 3, fundamental notions of
Multi-Objective optimization are reviewed before an evolutionary algorithm for solving
MO problems is given. Section 4 outlines a solution algorithm for NCEPECs.
Section 5 discusses a simple method to map the Non-Cooperative outcome to
the Cooperative outcome. Section 6 illustrates the concepts with numerical
examples. Section 7 summarizes and provides directions for further research.
2 Notation
In this paper we consider $\rho$-person games. Focusing on the leaders, each game is
defined by a tuple $\{N, X_i, U_i\}$ where $N$ is the set of leaders $\{1, 2, \ldots, \rho\}$,
$X_i$ is the strategy/action space for leader $i$, $i \in N$, and $U_i$ is the payoff
function (or reward), $U_i : \mathbb{R}^\rho \to \mathbb{R}^1$, that a leader gets by playing an
action/strategy, dependent on the actions which all others take. The collective action
of all leaders, often referred to as a strategy profile, is denoted by
$x = [x_1, \ldots, x_i, \ldots, x_\rho]^\top$. It is convenient to write $x_{-i}$ when referring to
the strategies of every leader excluding leader $i$, i.e.
$x_{-i} = [x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_\rho]^\top$. With a slight abuse of notation, we also
write $x = [x_i, x_{-i}]^\top$. Note that $[x_i, x_{-i}]$ does not mean that the components of
$x$ are reordered so that the strategies of leader $i$ become the first block. Unless
otherwise specified, all vectors are assumed to be column vectors.
The response of the followers that affects the actions of the leaders is assumed
to take the form of a Variational Inequality (VI) constraint that defines
equilibrium in some parametric system. Following [12] we assume that the solution of
this VI exists and is unique for a given vector of the leaders’ strategies.
3 Cooperative EPECs (MOPECs)
A generic MOPEC is shown in Equation 1. Except for the variational inequality
constraint, this problem takes the form of a generic multi objective optimization
problem conventionally handled by MOEAs [1],[2].
Program MOPEC

$$\max_{x \in X} \; \left( U_1(x, y), \ldots, U_\rho(x, y) \right)^\top \tag{1a}$$

where for given $x$, $y$ is the unique solution of the Variational Inequality in 1b:

$$L(x, y)^\top (y - y^*) \ge 0, \quad \forall y \in \Upsilon(x) \tag{1b}$$
MOEAs apply stochastic operators to a parent population to evolve a fitter
child population to solve multi-objective problems. During the selection phase,
a comparison is made between a chromosome a from the parent population and
a chromosome b from the child population on the basis of fitness and the weaker
of the two is discarded. Since one of the tasks of an MOEA is to identify the
entire Pareto front [2], fitness is assigned based on Pareto Domination: a Pareto
Dominates b if a is no worse off than b in all objectives and a is strictly better
than b in at least one objective ([2], Definition 2.5, p. 28).
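The Pareto Domination test described above can be sketched in a few lines. This is a minimal illustration assuming every objective is maximized (as in Program MOPEC); the function name `pareto_dominates` is hypothetical and is not taken from [2] or [5].

```python
from typing import Sequence


def pareto_dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if payoff vector a Pareto-dominates b (all objectives maximized)."""
    if len(a) != len(b):
        raise ValueError("payoff vectors must have the same length")
    # a must be no worse than b in every objective ...
    no_worse = all(ai >= bi for ai, bi in zip(a, b))
    # ... and strictly better in at least one.
    strictly_better = any(ai > bi for ai, bi in zip(a, b))
    return no_worse and strictly_better
```

Two payoff vectors that each win on a different objective do not dominate one another, which is why an MOEA has to keep a set of solutions rather than a single best one.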
Algorithm 1 outlines the Multi-Objective Self Adaptive Differential Evolution (MOSADE) Algorithm [5] that was used to generate the Pareto Fronts for
the MOPECs to be described in Section 6. MOSADE uses an archive to store
solutions as they are discovered during the search process. To evaluate the candidate, it is necessary to solve the lower level VI problem in Eq. 1b to maintain
the leader-follower paradigm implicit in such hierarchical optimization problems
[8].
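The archive bookkeeping in Steps 4(b) and 4(c) of Algorithm 1 can be sketched as follows. This is a simplified illustration and not the MOSADE implementation of [5]: the crowding rule of Step 4(a) and the maximum archive size are deliberately omitted, payoff vectors are assumed to have been computed already (i.e. the lower level VI has been solved for each candidate), and the names `dominates` and `update_archive` are hypothetical.

```python
from typing import List, Tuple

Payoffs = Tuple[float, ...]


def dominates(a: Payoffs, b: Payoffs) -> bool:
    """Pareto domination for maximization problems."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def update_archive(archive: List[Payoffs], candidate: Payoffs) -> List[Payoffs]:
    """Offer a candidate to the archive and return the updated archive.

    Step 4(c): the candidate is rejected if any archive member dominates it.
    Step 4(b): archive members dominated by the candidate are removed,
               and the candidate is accepted.
    Step 4(a) (crowding when the archive is full) is NOT modelled here.
    """
    if candidate in archive or any(dominates(m, candidate) for m in archive):
        return list(archive)
    kept = [m for m in archive if not dominates(candidate, m)]
    kept.append(candidate)
    return kept
```

By construction the archive only ever holds mutually non-dominated payoff vectors, which is what allows it to approximate the Pareto Front at the end of a run.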
4 Non-Cooperative EPECs (NCEPECs)
In the NCEPEC each leader i treats his competitors' strategic variables as exogenous when maximizing his payoff, as in Eq. 2.
$$\forall i \in N, \ \text{Player } i \text{ solves:} \quad \max_{x \in X} \; U_i(x, y) \tag{2}$$

where for given $x$, $y$ is the unique solution of the Variational Inequality in 1b.
Algorithm 1 Multi-Objective Self Adaptive Differential Evolution (MOSADE) [5]
1. Evaluate initial population P of |P| random individuals.
2. Set archive A to ∅.
3. While stopping criterion not met, do:
   For each individual P_i, i ∈ {1, . . . , |P|}, repeat:
   (a) Use DE to create candidate C_i from parent P_i.
   (b) Evaluate C_i by solving lower level VI 1b.
   (c) If P_i dominates C_i, discard C_i; else go to Step 4.
4. Compare C_i with each member of A:
   (a) if the maximum size of A is reached, choose between C_i or each member of A
       depending on which occupies the less crowded region of function space;
   (b) if C_i dominates any member of A, remove the member of A so dominated and
       accept C_i into A;
   (c) if C_i is dominated by any member of A, reject C_i.
5. Update DE control parameters as described in [5].
It can be shown that the solution of Eq. 2, if one exists, is a Nash Equilibrium
(NE) which is obtained when the condition in Eq. 3 is satisfied [14].
$$U_i(x_i^*, x_{-i}^*) \ge U_i(x_i, x_{-i}^*) \quad \forall x_i \in X_i, \ \forall i \in N \tag{3}$$
Traditional approaches for locating NE are based on fixed point algorithms
(e.g. non-linear Gauss-Seidel [4]) or on the resolution of a Complementarity
Problem formulation [6]. If players can benefit (i.e. increase payoff) from
deviating from their current action, then that action cannot be a NE. By counting the
number of players that can potentially benefit from deviating, we can compare
two chromosomes (representing two strategy profiles a, b ∈ X) to determine
which is closer to a NE and thus deemed “fitter”. We say that a Nash Dominates
b if fewer players benefit from unilaterally deviating (to b) when playing a
than benefit from unilaterally deviating (to a) when playing b. Based on this principle,
an evolutionary algorithm for NCEPECs called Nash Domination Evolutionary
Multiplayer Optimization (NDEMO) was proposed in [9], as summarized in
Algorithm 2. From the proof in [10], the convergence of NDEMO to the NE, if one
exists, is theoretically assured for any arbitrary NCEPEC.
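The Nash Domination comparison can be sketched as below. This is an illustrative reading of the rule stated above, not the NDEMO code of [9]. The payoff functions are passed in as plain callables, and the two-leader payoffs in `toy_payoffs` (a duopoly with price 12 − q1 − q2 and no costs) are a hypothetical example with no followers. In an actual NCEPEC each payoff evaluation would first require solving the lower level VI 1b.

```python
from typing import Callable, List, Sequence

Payoff = Callable[[Sequence[float]], float]


def deviation_count(a: Sequence[float], b: Sequence[float], payoffs: List[Payoff]) -> int:
    """Count the leaders who gain by unilaterally switching from a_i to b_i
    while every other leader keeps playing a."""
    count = 0
    for i, u in enumerate(payoffs):
        deviated = list(a)
        deviated[i] = b[i]          # only leader i changes strategy
        if u(deviated) > u(a):
            count += 1
    return count


def nash_dominates(a: Sequence[float], b: Sequence[float], payoffs: List[Payoff]) -> bool:
    """a Nash-dominates b if fewer leaders want to leave a (for b) than leave b (for a)."""
    return deviation_count(a, b, payoffs) < deviation_count(b, a, payoffs)


def toy_payoffs() -> List[Payoff]:
    """Hypothetical duopoly: profit_i = q_i * (12 - q_1 - q_2)."""
    return [
        lambda q: q[0] * (12.0 - q[0] - q[1]),
        lambda q: q[1] * (12.0 - q[0] - q[1]),
    ]
```

In this toy game the profile (4, 4) is the Nash Equilibrium, so no leader gains by moving from it towards (6, 6), whereas both gain by moving from (6, 6) towards (4, 4). The comparison therefore ranks (4, 4) as the fitter chromosome.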
5 Mapping Non-Cooperative to Cooperative Solution
We can map the non-cooperative to the cooperative solution by modifying the
objective function of leader i such that he takes into account a proportion (α, 0 ≤
α ≤ 1) of the payoff of all other leaders when optimizing his own payoff U_i, as
shown in Eq. 4. It is not hard to see that with α = 0 we recover the objective
function in Eq. 2, while with α = 1 every leader maximizes the sum of all leaders'
payoffs.
$$\forall i \in N, \ \text{Player } i \text{ solves:} \quad \max_{x \in X} \Big( U_i(x, y) + \sum_{j=1,\, j \neq i}^{N} \alpha \, U_j(x, y) \Big) \tag{4}$$

where for given $x$, $y$ is the unique solution of the Variational Inequality in 1b.

Algorithm 2 Nash Domination Evolutionary Multiplayer Optimization (NDEMO) [9]
1. Evaluate initial population P of |P| random individuals.
2. While stopping criterion not met, do:
   For each individual P_i, i ∈ {1, . . . , |P|}, repeat:
   (a) Use DE to create candidate C_i from parent P_i.
   (b) Evaluate C_i by solving lower level VI 1b.
   (c) If C_i Nash Dominates P_i, C_i replaces P_i.
       Else discard C_i.
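The modified objective in Eq. 4 amounts to a simple blend of payoffs once they have been evaluated. The sketch below assumes the payoffs U_j(x, y) of all leaders are already available for the strategy profile in question; the function name `blended_payoff` is hypothetical.

```python
from typing import Sequence


def blended_payoff(i: int, payoffs: Sequence[float], alpha: float) -> float:
    """Objective of leader i in Eq. 4: own payoff plus alpha times the others' payoffs."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    others = sum(u for j, u in enumerate(payoffs) if j != i)
    return payoffs[i] + alpha * others
```

With α = 0 the function returns the leader's own payoff (the objective of Eq. 2), and with α = 1 it returns the sum of all payoffs, so the same NDEMO comparison can be reused for any α by swapping in this objective.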
6 Numerical Examples
In this section, we present two numerical examples that aim to demonstrate the
mapping of the NCEPEC to the MOPEC solution.
6.1 Example 1: Competition between Producers
We first consider a 5 player model from [12] using data found in [4] and [13].
The case where 2 of the 5 players emerged as the leaders was discussed in [12]
who reported two possible solutions of the resulting MOPEC².
The leaders’ optimization problem is subject to the followers’ response manifesting in a VI that is imposed as a constraint on the actions of the leaders. In
practical implementation, the PATH solver [3] is used to resolve the lower-level
VI for a given tuple of the leaders’ strategies x to obtain the responses of the
lower-level followers y. [9] shows that the solutions reported in [12] are only two
out of all possible Pareto non dominated solutions found by MOSADE (see Fig.
3 and Table 1).
Table 1. The two solutions reported in [12] and indicated on Fig. 3 with ⋆

                      Solution 1   Solution 2
Profit of Leader 1      840.86       978.89
Profit of Leader 2      485.63       410.97
In [9], we also computed the case in which these two leaders engaged in
non-cooperative behavior, resulting in a NCEPEC. As reported in [9], the NCEPEC
solution results in production levels of 97.70 units for Leader 1 and 42.14 units
for Leader 2, with corresponding profits of 950.56 and 414.72. In profit space,
this point is indicated as × on Fig. 3. The arrows indicate that the NCEPEC
solution is not Pareto Optimal, since either leader can be made better off (i.e.
increase profit) without making the other worse off. Note that while the solution
with α = 1 lies on the Pareto Front (and is hence Pareto Optimal), it is different
from Solution 2 reported in [12] and shown in Fig. 4 with ⋆.

² Recall that this is the case where we assumed that the leaders cooperated.
Fig. 3. Example 1: Pareto Front generated by MOSADE alongside solutions in [12]
indicated by ⋆, NCEPEC solution indicated by ×
Fig. 4. Example 1: (zoomed) Pareto Front (Cooperative EPEC) and “collusion path”
mapping NCEPEC (α = 0) to Pareto Front (α = 1)
[Figure axes: Profit: Leader 1 (horizontal), Profit: Leader 2 (vertical); labeled
points: α = 0 (Nash), α = 0.4, α = 1]
In order to map the non-cooperative outcome to the cooperative outcome,
NDEMO (Algorithm 2) was applied each time fixing α, in steps of 0.2, between
0 and 1. The results are indicated on Fig. 4 with detailed results in Table 2.
Notice that the NCEPEC solution reported in [9] is obtained with α = 0. It is
clear that as α increases, the profits accruing to the leaders tend towards the
Pareto Front. We term the path, from the NCEPEC solution (α = 0) to the
Pareto Front (α = 1) the “collusion path”.
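The α sweep can be illustrated on a deliberately simple stand-in problem. The sketch below is not Example 1: it is a hypothetical two-leader quantity game with no followers (so no lower level VI), price 12 − q1 − q2 and quadratic costs, chosen because the equilibrium of the α-modified game of Eq. 4 can be written down in closed form instead of being searched for with NDEMO. All parameter values and function names are assumptions made for illustration.

```python
from typing import List, Tuple

A = 12.0          # hypothetical demand intercept (price = A - q1 - q2)
C = (1.0, 2.0)    # hypothetical quadratic cost coefficients of the two leaders


def profits(q1: float, q2: float) -> Tuple[float, float]:
    """Profit of each leader: revenue minus 0.5 * c_i * q_i**2."""
    price = A - q1 - q2
    return q1 * price - 0.5 * C[0] * q1 ** 2, q2 * price - 0.5 * C[1] * q2 ** 2


def equilibrium(alpha: float) -> Tuple[float, float]:
    """Quantities at which both leaders' Eq. 4 objectives are stationary.

    The first-order conditions are linear:
        (2 + c1) q1 + (1 + alpha) q2 = A
        (1 + alpha) q1 + (2 + c2) q2 = A
    and are solved here by Cramer's rule.
    """
    d1, d2, off = 2.0 + C[0], 2.0 + C[1], 1.0 + alpha
    det = d1 * d2 - off * off
    return (A * d2 - off * A) / det, (d1 * A - off * A) / det


def collusion_path(steps: int = 5) -> List[Tuple[float, float, float, float, float]]:
    """Rows of (alpha, q1, q2, profit1, profit2) for alpha = 0, 1/steps, ..., 1."""
    path = []
    for k in range(steps + 1):
        alpha = k / steps
        q1, q2 = equilibrium(alpha)
        u1, u2 = profits(q1, q2)
        path.append((alpha, q1, q2, u1, u2))
    return path


if __name__ == "__main__":
    for alpha, q1, q2, u1, u2 in collusion_path():
        print(f"alpha={alpha:.1f}  q=({q1:.3f}, {q2:.3f})  profit=({u1:.3f}, {u2:.3f})")
```

In this toy game both quantities fall and joint profit rises between α = 0 and α = 1, while the higher-cost leader ends up with less profit at α = 1 than at α = 0. The numbers have no bearing on the values reported for Example 1; the sketch only shows the mechanics of tracing a path by re-solving the game for each fixed α.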
Table 2. Example 1: Production Quantities and Profits for Leaders as α increases

                         Leader 1              Leader 2
α                    Quantity   Profit     Quantity   Profit
0 (NCEPEC)             97.70    950.56       42.14    414.73
0.2                    95.81    956.95       40.58    417.53
0.4                    94.04    962.76       39.01    418.70
0.6                    92.40    968.08       37.43    418.19
0.8                    90.90    973.07       35.84    416.01
1 (Pareto Front)       89.56    977.93       34.23    412.06
This example shows that it is possible for leaders to signal to each other their
intention to cooperate and maximize profit (through the α parameter) and hence
engage in “tacit collusion”. In doing so, a leader can reduce output, resulting in
increased total profit. Such a reduction would signal to the other leader(s) its
willingness to collude.
Notice in Fig. 4 that the “collusion path” does not lie on a straight line
between the NCEPEC solution and the Pareto Front. While Leader 1’s profit
continues to rise as α increases, so that Leader 1 has more to gain from collusion,
this is not true for Leader 2. In particular, Leader 2’s profit along the “collusion
path” peaks at α = 0.4 (cf. Table 2) and decreases beyond that. In
fact, the profit for Leader 2 at α = 1 is lower than that obtained under the NCEPEC,
even though that solution lies on the Pareto Front.
This implies the stability of any collusion might be undermined. This could
eventually lead to an all out quantity war for which Leader 1 could be made
worse off if conditions reverted back to the non-cooperative situation. Leader 1
could potentially compensate Leader 2 (e.g. by allocating Leader 2 a share of
the profits gained) in such a way that Leader 2 would still be incentivised to
cooperate. Regulators clearly need to be aware of such behavior when enforcing
anti-trust legislation.
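Whether such compensation is feasible can be checked directly from Table 2. The sketch below is only arithmetic on the reported profits at α = 0 and α = 1; it says nothing about how a transfer would be negotiated or enforced, and the helper `side_payment_range` is a hypothetical name.

```python
from typing import Dict, Optional, Tuple

# Profits from Table 2 at alpha = 0 (NCEPEC) and alpha = 1 (Pareto Front).
NASH: Dict[str, float] = {"leader1": 950.56, "leader2": 414.73}
COOP: Dict[str, float] = {"leader1": 977.93, "leader2": 412.06}


def side_payment_range(
    nash: Dict[str, float], coop: Dict[str, float], payer: str, payee: str
) -> Optional[Tuple[float, float]]:
    """Range of transfers from payer to payee that leaves both at least as well
    off as under the non-cooperative outcome, or None if no such transfer exists."""
    loss = max(nash[payee] - coop[payee], 0.0)   # what the payee gives up by cooperating
    gain = coop[payer] - nash[payer]             # what the payer earns by cooperating
    if gain < loss:
        return None
    return loss, gain
```

Calling `side_payment_range(NASH, COOP, "leader1", "leader2")` gives a lower bound equal to Leader 2's loss (about 2.67) and an upper bound equal to Leader 1's gain (about 27.37), so on these figures Leader 1's gain is roughly ten times the amount needed to keep Leader 2 whole.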
6.2 Example 2: Competition between Authorities
A situation in which two city transportation authorities were the leaders at the
upper level was studied in [21]. In this context the strategic variable was the
toll price to charge on traffic using road(s) on the network upon which each city
exercised jurisdiction with the aim of maximising individual city welfare.
City welfare is a function of the traffic flows due to the routing of traffic on the road network which is in turn influenced by the toll levels charged.
Traffic routing must satisfy Wardrop’s equilibrium principle [19] which states
that traffic arranges on the network such that at an equilibrium, the cost of all
used routes/paths connecting any individual Origin-Destination pair is equalized. Wardrop’s equilibrium principle can be expressed as a VI [17]. The usual
way of obtaining the traffic flows, once the tolls are input, is through traffic
assignment [7].
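Wardrop's principle is easiest to see on the smallest possible network. The sketch below is a hypothetical illustration, not the network of [21] and not a general traffic assignment routine: one Origin-Destination pair is served by two parallel routes with linear cost functions, and a toll (expressed in the same time units as the cost) is simply added to the cost of the route it applies to. The function name and all parameter values are assumptions.

```python
from typing import Tuple


def two_route_equilibrium(
    demand: float,
    free_flow: Tuple[float, float],
    slope: Tuple[float, float],
    toll: Tuple[float, float] = (0.0, 0.0),
) -> Tuple[Tuple[float, float], Tuple[float, float]]:
    """Wardrop flows and costs for cost_k = free_flow[k] + slope[k] * flow_k + toll[k].

    Both slopes must be positive. Returns ((flow1, flow2), (cost1, cost2)).
    """
    if slope[0] <= 0.0 or slope[1] <= 0.0:
        raise ValueError("slopes must be positive")
    a1 = free_flow[0] + toll[0]
    a2 = free_flow[1] + toll[1]
    b1, b2 = slope
    # Interior solution: equal costs on both routes with flow1 + flow2 = demand.
    f1 = (a2 - a1 + b2 * demand) / (b1 + b2)
    # If one route is dearer even when empty, it stays unused (corner solution).
    f1 = min(max(f1, 0.0), demand)
    f2 = demand - f1
    return (f1, f2), (a1 + b1 * f1, a2 + b2 * f2)
```

For example, with `demand=10`, `free_flow=(10, 14)` and `slope=(1, 1)`, the untolled equilibrium puts 7 units on route 1 and 3 on route 2 at a common cost of 17. Adding a toll of 4 to route 1 shifts the split to 5 and 5 at a common cost of 19, which is the mechanism by which an authority's toll reshapes the flows that enter its welfare function.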
Among several different governance models studied in [21], two cases of most
relevance to this work are as follows:
1. the authorities engage in a Nash game by setting tolls with each maximising
individual city welfare subject to Wardrop’s equilibrium expressed as a VI
i.e. a NCEPEC.
2. the authorities cooperate to set tolls to maximise both cities’ welfare simultaneously subject to Wardrop’s equilibrium expressed as a VI i.e. a MOPEC.
The network is shown in Fig. 5, where the dashed arcs are the tolled arcs in
each authority's jurisdiction. The Pareto Front for the MOPEC, where the cities
cooperate to maximize welfare, is shown in Fig. 6. The single solution reported
in [21] also lies on this Pareto Front (indicated by ⋆ on Fig. 6).
Fig. 7 shows the result (indicated by ×) when α = 0 i.e. the NCEPEC
solution where the leaders played a Nash game instead. This point is not Pareto
Optimal since one city can increase welfare without making the other worse off.
The “collusion path” mapping the NCEPEC to the Pareto Front (Fig. 7) and Table
3 show that the welfare for City I marginally decreases as α increases (and
the opposite holds for City II). Hence whether the cooperative solution is sustainable
(because City II benefits but City I marginally loses out) is an issue that warrants
further research. We also notice that the tolls fall as α rises, again as an indication
of signalling behavior to the other authority.
We notice that City I’s welfare (cf. Table 3) continuously decreases as we
move from the NCEPEC solution to the MOPEC solution (and ends up lower
than the NCEPEC outcome at α = 1). Such a situation implies that City II might
have to compensate City I so that the latter would be incentivised to cooperate.
Again, this opens up a plethora of further research possibilities studying the
policy implications in such situations, e.g. stability of agreements.
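As a rough check on the scope for such compensation, the welfare figures in Table 3 can be differenced between α = 0 and α = 1 (in units of 10,000 secs). This is only arithmetic on the reported values, not an analysis of how a transfer between authorities would be implemented:

$$\begin{aligned}
\text{City I loss} &= 7656.28 - 7653.56 = 2.72, \\
\text{City II gain} &= 11636.27 - 11630.22 = 6.05, \\
\text{net joint gain} &= 6.05 - 2.72 = 3.33.
\end{aligned}$$

Since City II's gain exceeds City I's loss, any transfer between 2.72 and 6.05 would leave both cities no worse off than under the NCEPEC.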
Fig. 5. Directed network [21] for Example 2 with the line down the middle demarcating
authority jurisdiction (City I and City II). Dashed arcs are tolled arcs in each city.
Fig. 6. Example 2: Pareto Front (Cooperative EPEC) Cities I and II, Cooperative
Solution from [21] indicated with ⋆
Fig. 7. Example 2: (zoomed) Pareto Front (Cooperative EPEC) and “collusion path”
mapping NCEPEC (×, α = 0) to Pareto Front (α = 1)
[Figure axes: Welfare: City I (horizontal, ×10^7), Welfare: City II (vertical, ×10^8);
labeled points: α = 0 (Nash), α = 0.6, α = 1]
Table 3. Example 2: Tolls and Welfare for Leaders as α increases

                             City I                      City II
α                      Toll     Welfare            Toll     Welfare
                       (secs)   (10,000 secs)      (secs)   (10,000 secs)
0.00 (NCEPEC)          4943     7656.28            4957     11630.22
0.20                   4847     7656.25            4931     11631.40
0.40                   4748     7655.97            4902     11632.59
0.60                   4648     7655.45            4871     11633.80
0.80                   4544     7654.65            4838     11635.03
1.00 (Pareto Front)    4439     7653.56            4802     11636.27

7 Conclusions
In this paper, we studied a class of hierarchical optimization problems with
multiple leaders characterized by the presence of a binding variational inequality.
Such problems are collectively referred to as Equilibrium Problems with
Equilibrium Constraints. Two assumptions of leader behavior were discussed, depending
on whether the leaders cooperated to optimize their objectives or otherwise. We
showed that advances in multi-objective evolutionary algorithms could be used
to generate Pareto Fronts that represent the situation in which the leaders
cooperate. In addition, we had already proposed an algorithm for the
non-cooperative situation in our earlier research. The main contribution of this paper
is to demonstrate the potential mapping of the non-cooperative solution to the
cooperative outcome through the use of evolutionary algorithms. We term the
mapping between these two solutions the “collusion path” since it traces the
collusion possibilities between leaders in a game. With numerical examples drawn
from both the economics and transportation systems management literature, we
demonstrated the role that this path can play in assisting policy makers in
developing anti-trust legislation.
In terms of policy research, further work should be undertaken to understand
the “collusion path” as this will affect the incentives for cooperative action. With
regard to evolutionary algorithms, though we have theoretical assurance of the
convergence of NDEMO to a NE for an arbitrary NCEPEC, each run will still take
some time, not least because every candidate evaluation requires solving the
lower level VI. Hence investigating methodologies to speed up convergence of the
NDEMO algorithm for the non-cooperative case would continue to be a useful
area of research.
References
1. Coello-Coello, C., Lamont, G.: Applications of multi-objective evolutionary algorithms. World Scientific, Singapore (2004)
2. Deb, K.: Multi-objective Optimization using Evolutionary Algorithms. John Wiley, Chichester (2001)
3. Ferris, M., Munson, T.: Complementarity problems in GAMS and the PATH
solver. Journal of Economic Dynamics and Control, 24(2), 165–188 (2000)
4. Harker, P. T.: A variational inequality approach for the determination of
Oligopolistic market equilibrium. Math. Program. 30(1), 105–111 (1984)
5. Huang, V. L., Qin, A. K., Suganthan, P. N., Tasgetiren, M. F.: Multi-objective
optimization based on self-adaptive differential evolution algorithm. Proceedings
of IEEE CEC, 3601–3608. IEEE Press, Piscataway, New Jersey (2007)
6. Karamardian, S.: Generalized complementarity problems. J. Optimiz. Theory
App. 8(3), 161–168 (1971)
7. Koh, A., Watling, D.: Traffic Assignment Modelling. In: Button, K., Vega,
H., Nijkamp, P. (eds.) A Dictionary of Transport Analysis, pp. 418–420. Edward Elgar,
Cheltenham (2010)
8. Koh, A.: Solving transportation bi-level programs with Differential Evolution.
Proceedings of the IEEE Congress on Evolutionary Computation, pp. 2243–2250.
IEEE Press, Piscataway, New Jersey (2007)
9. Koh, A.: An evolutionary algorithm based on Nash dominance for equilibrium
problems with equilibrium constraints. Appl. Soft Comput. 12(1), 161–173 (2012)
10. Lung, R. I., Dumitrescu, D.: Computing Nash equilibria by means of evolutionary
computation. Int. J. Comput. Commun. III (Suppl. Issue - ICCCC 2008) 364–368
(2008)
11. Mordukhovich, B. S.: Variational Analysis and Generalized Differentiation, I: Basic Theory. Springer, Berlin (2006)
12. Mordukhovich, B. S., Outrata, J. V., Červinka, M.: Equilibrium problems with
complementarity constraints: Case study with applications to oligopolistic markets. Optimization 56(4), 479–494 (2007)
13. Murphy, F. H., Sherali, H. D., Soyster, A. L.: A mathematical programming
approach for determining oligopolistic market equilibrium. Math. Prog. 24(1),
92–106 (1982)
14. Nash, J.: Non-Cooperative games. Ann. Math. Second Series 54(2), 286–295
(1951)
15. Outrata, J. V.: A note on a class of equilibrium problems with equilibrium constraints. Kybernetika 40(5), 585–594 (2004)
16. Price, K., Storn, R., Lampinen, J.: Differential evolution: a practical approach to
global optimization. Springer, Berlin (2005)
17. Smith, M. J.: The existence, uniqueness and stability of traffic equilibria. Transport. Res. B-Meth. 13(4), 295–304 (1979)
18. von Stackelberg, H. H.: The theory of the market economy. William Hodge, London (1952)
19. Wardrop, J. G.: Some theoretical aspects of road traffic research. P. I. Civil Eng.
Pt. 2., 1(36), 325–378 (1952)
20. Ye, J. J., Zhu, Q. J.: Multiobjective optimization problem with variational inequality constraints. Math. Prog. 96A(1), 139–160 (2003)
21. Zhang, X.N., Zhang, H.M., Huang, H.J., Sun, L.J., Tang, T.Q.: Competitive, cooperative and Stackelberg congestion pricing for multiple regions in transportation
networks. Transportmetrica 7(4), 297–320 (2011)