Cooperative and Non-Cooperative Equilibrium Problems with Equilibrium Constraints: Applications in Economics and Transportation

Andrew Koh⋆
Institute for Transport Studies, University of Leeds, Leeds, LS2 9JT, United Kingdom,
[email protected]
http://www.its.leeds.ac.uk

Abstract. In recent years, a plethora of multi-objective evolutionary algorithms (MOEAs) have been proposed which are able to effectively handle complex multi-objective problems. In this paper, we focus on Equilibrium Problems with Equilibrium Constraints. We show that one interpretation of the game can also be handled by MOEAs and then discuss a simple methodology to map the non-cooperative outcome to the cooperative outcome. We demonstrate our proposed methodology with examples sourced from the economics and transportation systems management literature. In doing so we suggest resulting policy implications which will be of importance to regulatory authorities.

Keywords: Equilibrium Problems with Equilibrium Constraints, Multi-Objective Evolutionary Algorithms, Nash Equilibrium, Collusion, Transportation

1 Introduction

The focus of this paper is on hierarchical optimization problems. Fig. 1 shows the structure of a single-leader follower game/bilevel optimization problem [8], which has attracted attention in the Evolutionary Computation community in recent years¹. In this paper we study a generalized variant of such problems known as Equilibrium Problems with Equilibrium Constraints (EPECs), as illustrated in Fig. 2. Both are hierarchical games where the followers take the leader's variables as given and their responses are subsequently imposed as a nonlinear binding constraint on the actions of the leader(s). The difference is that EPECs are characterized by the presence of multiple leaders.
In this multi-leader generalization of the classic Stackelberg [18] game, researchers have conjectured that there could be two possible behaviors of the leaders [11, 15]. At one extreme, they could cooperate, and such a postulate leads naturally to a Multi-Objective EPEC (hereinafter termed MOPEC) [20]. At the other extreme, these leaders could engage in a non-cooperative Nash [14] game amongst themselves, thereby resulting in a Non Cooperative EPEC (NCEPEC).

Fig. 1. Bilevel Program or Mathematical Problem with Equilibrium Constraint (MPEC)

Fig. 2. Equilibrium Problem with Equilibrium Constraint (EPEC)

⋆ The author is grateful for financial support by the Engineering and Physical Sciences Research Council of the UK under Grant EP/H021345/1.
¹ E.g. a Special Session on Bilevel Optimization was convened at the 2012 IEEE Congress on Evolutionary Computation (CEC) (June 10-15) in Brisbane, Australia.

Under either postulate of leader behavior, we argue that meta-heuristics offer a powerful solution methodology for EPECs, which are usually tackled using tools of generalized calculus [11]. For MOPECs, population based Multi-Objective Evolutionary Algorithms (MOEAs) are particularly suitable due to their inherent ability to identify multiple Pareto Optimal solutions in a single run [2]. For the latter class of NCEPECs, a Differential Evolution [16] based algorithm exploiting a concept from [10] has been proposed in [9]. In this paper we suggest a methodology that maps the Non Cooperative outcome to the Cooperative outcome by modification of the algorithm proposed in [9].

The rest of this paper is structured thus. We give an overview of notation used in this paper in the next section. In Section 3 fundamental notions of Multi-Objective optimization are reviewed before an evolutionary algorithm for solving MO problems is given. Section 4 outlines a solution algorithm for NCEPECs.
Section 5 discusses a simple method to map the Non-Cooperative outcome to the Cooperative outcome. Section 6 illustrates the concepts with numerical examples. Section 7 summarizes and provides directions for further research.

2 Notation

In this paper we consider $\rho$-person games. Focusing on the leaders, each game is defined by a tuple $\{N, X_i, U_i\}$ where $N$ is the set of leaders $\{1, 2, \ldots, \rho\}$, $X_i$ is the strategy/action space for leader $i$, $i \in N$, and $U_i$ is the payoff function (or reward), $U_i : \mathbb{R}^{\rho} \to \mathbb{R}^{1}$, that a leader gets by playing an action/strategy, dependent on the actions which all others take. The collective action of all leaders, often referred to as a strategy profile, is denoted by $x = [x_1, \ldots, x_i, \ldots, x_\rho]^\top$. It is convenient to write $x_{-i}$ when referring to the strategies of every leader excluding leader $i$, i.e. $x_{-i} = [x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_\rho]^\top$. With a slight abuse of notation, we also write $x = [x_i, x_{-i}]^\top$. Note that $[x_i, x_{-i}]$ does not mean that the components of $x$ are reordered such that the strategies of leader $i$ become the first block. Unless otherwise specified, all vectors are assumed to be column vectors.

The response of the followers that affects the actions of the leaders is assumed to take the form of a Variational Inequality (VI) constraint that defines equilibrium in some parametric system. Following [12] we assume that the solution of this VI exists and is unique for a given vector of the leaders' strategies.

3 Cooperative EPECs (MOPECs)

A generic MOPEC is shown in Equation 1. Except for the variational inequality constraint, this problem takes the form of a generic multi-objective optimization problem conventionally handled by MOEAs [1], [2].

Program MOPEC:

$$\max_{x \in X} \; \left( U_1(x, y), \ldots, U_\rho(x, y) \right)^\top \tag{1a}$$

where for given $x$, $y$ is the unique solution of the Variational Inequality in Eq. 1b:

$$L(x, y)^\top (y - y^*) \ge 0, \quad \forall y \in \Upsilon(x) \tag{1b}$$

MOEAs apply stochastic operators to a parent population to evolve a fitter child population to solve multi-objective problems. During the selection phase, a comparison is made between a chromosome $a$ from the parent population and a chromosome $b$ from the child population on the basis of fitness, and the weaker of the two is discarded. Since one of the tasks of an MOEA is to identify the entire Pareto front [2], fitness is assigned based on Pareto Domination: $a$ Pareto Dominates $b$ if $a$ is no worse off than $b$ in all objectives and $a$ is strictly better than $b$ in at least one objective ([2], Definition 2.5, p. 28).

Algorithm 1 outlines the Multi-Objective Self Adaptive Differential Evolution (MOSADE) Algorithm [5] that was used to generate the Pareto Fronts for the MOPECs to be described in Section 6. MOSADE uses an archive to store solutions as they are discovered during the search process. To evaluate a candidate, it is necessary to solve the lower level VI problem in Eq. 1b to maintain the leader-follower paradigm implicit in such hierarchical optimization problems [8].

Algorithm 1 Multi-Objective Self Adaptive Differential Evolution (MOSADE) [5]
1. Evaluate initial population $P$ of $|P|$ random individuals.
2. Set archive $A$ to $\emptyset$.
3. While stopping criterion not met, do: for each individual $P_i$, $i \in \{1, \ldots, |P|\}$, repeat:
   (a) Use DE to create candidate $C_i$ from parent $P_i$.
   (b) Evaluate $C_i$ by solving the lower level VI in Eq. 1b.
   (c) If $P_i$ dominates $C_i$, discard $C_i$; else go to Step 4.
4. Compare $C_i$ with each member of $A$:
   (a) if the maximum size of $A$ is reached, choose between $C_i$ or each member of $A$ depending on which occupies the less crowded region of function space;
   (b) if $C_i$ dominates any member of $A$, remove the member of $A$ so dominated and accept $C_i$ into $A$;
   (c) if $C_i$ is dominated by any member of $A$, reject $C_i$.
5. Update DE control parameters as described in [5].

4 Non Cooperative EPECs (NCEPECs)

In the NCEPEC each leader $i$ treats his competitors' strategic variables as exogenous when maximizing his payoff as in Eq. 2.

$$\forall i \in N, \text{ Player } i \text{ solves:} \quad \max_{x \in X} \; U_i(x, y) \tag{2}$$

where for given $x$, $y$ is the unique solution of the Variational Inequality in Eq. 1b.

It can be shown that the solution of Eq. 2, if one exists, is a Nash Equilibrium (NE), which is obtained when the condition in Eq. 3 is satisfied [14].

$$U_i(x_i^*, x_{-i}^*) \ge U_i(x_i, x_{-i}^*) \quad \forall x_i \in X_i, \; \forall i \in N \tag{3}$$

Traditional approaches for locating NE are based on fixed point algorithms (e.g. non-linear Gauss-Seidel [4]) or on resolution of a Complementarity Problem formulation [6].

If players can benefit (i.e. increase payoff) from deviating from their current action, then the current strategy profile cannot be a NE. By counting the number of players that can potentially benefit from deviating, we can compare two chromosomes (representing two strategy profiles, $a, b \in X$) to determine which is closer to a NE and thus deemed "fitter". We say that $a$ Nash Dominates $b$ if fewer players benefit from unilaterally deviating from $a$ (to their strategy in $b$) than benefit from unilaterally deviating from $b$ (to their strategy in $a$). Based on this principle an evolutionary algorithm for NCEPECs called Nash Domination Evolutionary Multiplayer Optimization (NDEMO) was proposed in [9], as summarized in Algorithm 2. From the proof in [10], the convergence of NDEMO to the NE, if one exists, is theoretically assured for any arbitrary NCEPEC.

5 Mapping Non Cooperative to Cooperative Solution

We can map the non-cooperative to the cooperative solution by modifying the objective function of leader $i$ such that he takes into account a proportion ($\alpha$, $0 \le \alpha \le 1$) of the payoff of all other leaders when optimizing his own payoff $U_i$, as shown in Eq. 4. It is not hard to see that with $\alpha = 0$ we recover the objective function in Eq. 2.
$$\forall i \in N, \text{ Player } i \text{ solves:} \quad \max_{x \in X} \left( U_i(x, y) + \sum_{j \in N,\, j \neq i} \alpha \, U_j(x, y) \right) \tag{4}$$

where for given $x$, $y$ is the unique solution of the Variational Inequality in Eq. 1b.

Algorithm 2 Nash Domination Evolutionary Multiplayer Optimization (NDEMO) [9]
1. Evaluate initial population $P$ of $|P|$ random individuals.
2. While stopping criterion not met, do: for each individual $P_i$, $i \in \{1, \ldots, |P|\}$, repeat:
   (a) Use DE to create candidate $C_i$ from parent $P_i$.
   (b) Evaluate $C_i$ by solving the lower level VI in Eq. 1b.
   (c) If $C_i$ Nash Dominates $P_i$, $C_i$ replaces $P_i$. Else discard $C_i$.

6 Numerical Examples

In this section, we present two numerical examples that aim to demonstrate the mapping of the NCEPEC to the MOPEC solution.

6.1 Example 1: Competition between Producers

We first consider a 5 player model from [12] using data found in [4] and [13]. The case where 2 of the 5 players emerged as the leaders was discussed in [12], who reported two possible solutions of the resulting MOPEC². The leaders' optimization problem is subject to the followers' response manifesting in a VI that is imposed as a constraint on the actions of the leaders. In practical implementation, the PATH solver [3] is used to resolve the lower-level VI for a given tuple of the leaders' strategies $x$ to obtain the responses of the lower-level followers $y$. [9] shows that the solutions reported in [12] are only two out of all possible Pareto non dominated solutions found by MOSADE (see Fig. 3 and Table 1).

Table 1. The two solutions reported in [12] and indicated on Fig. 3 with ⋆

              Profit of Leader 1   Profit of Leader 2
Solution 1    840.86               485.63
Solution 2    978.89               410.97

In [9], we also computed the case in which these two leaders engaged in non-cooperative behavior, resulting in a NCEPEC. As reported in [9], the NCEPEC solution results in production levels of 97.70 units for Leader 1 and 42.14 units for Leader 2 with corresponding profits of 950.56 and 414.72.
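As a quick check on the figures quoted above, the Pareto Domination test of Section 3 can be applied directly to the reported profit pairs. The following is a minimal sketch rather than code from MOSADE [5]: the function name `pareto_dominates` is hypothetical, and the only data used are the profits in Table 1, the NCEPEC profits from [9], and the α = 0.2 row of Table 2.

```python
from typing import Sequence


def pareto_dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every objective and strictly
    better in at least one (maximization, as in Eq. 1a)."""
    no_worse = all(ai >= bi for ai, bi in zip(a, b))
    strictly_better = any(ai > bi for ai, bi in zip(a, b))
    return no_worse and strictly_better


# (profit of Leader 1, profit of Leader 2), values quoted in the text
solution_1 = (840.86, 485.63)   # Table 1
solution_2 = (978.89, 410.97)   # Table 1
ncepec = (950.56, 414.72)       # non-cooperative outcome reported in [9]
alpha_02 = (956.95, 417.53)     # Table 2, alpha = 0.2

points = {
    "Solution 1": solution_1,
    "Solution 2": solution_2,
    "alpha = 0.2": alpha_02,
}
for name, point in points.items():
    print(f"{name} dominates NCEPEC: {pareto_dominates(point, ncepec)}")
```

With these numbers neither solution from [12] dominates the NCEPEC point, since each gives one leader less profit than the Nash outcome. The α = 0.2 point does dominate it, which is enough to show that the NCEPEC outcome is not Pareto Optimal.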
In profit space, this point is indicated as × on Fig. 3. The arrows indicate that the NCEPEC solution is not Pareto Optimal since either leader can be made better off (i.e. increase profit) without making the other worse off. Note that while the solution with α = 1 lies on the Pareto Front (and is hence Pareto Optimal), it is different from Solution 2 reported in [12] and shown in Fig. 4 with ⋆.

² Recall that this is the case where we assumed that the leaders cooperated.

Fig. 3. Example 1: Pareto Front generated by MOSADE alongside solutions in [12] indicated by ⋆, NCEPEC solution indicated by ×

Fig. 4. Example 1: (zoomed) Pareto Front (Cooperative EPEC) and "collusion path" mapping NCEPEC (α = 0) to Pareto Front (α = 1)
[Plot: Profit: Leader 1 (horizontal axis) against Profit: Leader 2 (vertical axis), with points labelled α = 0 (Nash), α = 0.4 and α = 1]

In order to map the non-cooperative outcome to the cooperative outcome, NDEMO (Algorithm 2) was applied with α fixed at each value from 0 to 1 in steps of 0.2. The results are indicated on Fig. 4 with detailed results in Table 2. Notice that the NCEPEC solution reported in [9] is obtained with α = 0. It is clear that as α increases, the profits accruing to the leaders tend towards the Pareto Front. We term the path from the NCEPEC solution (α = 0) to the Pareto Front (α = 1) the "collusion path".

Table 2. Example 1: Production Quantities and Profits for Leaders as α increases

                     Leader 1              Leader 2
α                    Quantity   Profit     Quantity   Profit
0 (NCEPEC)           97.70      950.56     42.14      414.73
0.2                  95.81      956.95     40.58      417.53
0.4                  94.04      962.76     39.01      418.70
0.6                  92.40      968.08     37.43      418.19
0.8                  90.90      973.07     35.84      416.01
1 (Pareto Front)     89.56      977.93     34.23      412.06

This example shows that it is possible for leaders to signal to each other their intention to cooperate and maximize profit (through the α parameter) and hence engage in "tacit collusion". In doing so, a leader can reduce output, resulting in increased total profit.
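To make the role of α in the comparison used by NDEMO concrete, the sketch below counts unilateral deviators under the α-weighted payoff of Eq. 4 and applies the Nash Domination test of Section 4. It is illustrative only and rests on loudly stated assumptions: the function names are hypothetical, and `toy_duopoly` is a textbook linear duopoly with no followers (so the lower level VI of Eq. 1b is omitted); it is not the 5 player model of [12].

```python
from typing import Callable, List, Sequence

Payoffs = Callable[[Sequence[float]], List[float]]


def weighted_payoff(u: Sequence[float], i: int, alpha: float) -> float:
    """Objective of leader i in Eq. 4: own payoff plus alpha times the others'."""
    return u[i] + alpha * sum(uj for j, uj in enumerate(u) if j != i)


def count_deviators(payoffs: Payoffs, a: Sequence[float],
                    b: Sequence[float], alpha: float) -> int:
    """Number of leaders who gain by unilaterally switching from a[i] to b[i]."""
    base = payoffs(a)
    gainers = 0
    for i in range(len(a)):
        trial = list(a)
        trial[i] = b[i]  # only leader i deviates
        if weighted_payoff(payoffs(trial), i, alpha) > weighted_payoff(base, i, alpha):
            gainers += 1
    return gainers


def nash_dominates(payoffs: Payoffs, a: Sequence[float],
                   b: Sequence[float], alpha: float = 0.0) -> bool:
    """a Nash Dominates b if fewer leaders gain by leaving a than by leaving b."""
    return count_deviators(payoffs, a, b, alpha) < count_deviators(payoffs, b, a, alpha)


def toy_duopoly(q: Sequence[float]) -> List[float]:
    """ASSUMPTION: price = 100 - total output, zero cost, no followers."""
    price = 100.0 - sum(q)
    return [price * qi for qi in q]


cournot = [100.0 / 3.0, 100.0 / 3.0]  # Nash outputs of the toy game (alpha = 0)
joint = [25.0, 25.0]                  # joint-profit-maximizing outputs of the toy game

print(nash_dominates(toy_duopoly, cournot, joint, alpha=0.0))
print(nash_dominates(toy_duopoly, joint, cournot, alpha=1.0))
print(sum(toy_duopoly(cournot)), sum(toy_duopoly(joint)))
```

In this toy game the higher-output Nash profile is the fitter one when α = 0, whereas with α = 1 the lower-output profile Nash Dominates it and yields the larger total profit. This mirrors the direction of the quantities and profits in Table 2, although the toy numbers have no connection to that table.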
Reducing output in this way would signal to the other leader(s) a willingness to collude.

Notice in Fig. 4 that the "collusion path" does not lie on a straight line between the NCEPEC solution and the Pareto Front. While Leader 1's profit continues to rise as α increases, so that Leader 1 has more to gain from collusion, this is not true for Leader 2. In particular, Leader 2's profit along the "collusion path" peaks at α = 0.4 (cf. Table 2) and decreases beyond that. In fact the profit for Leader 2 at α = 1 is lower than that obtained under NCEPEC even though that solution lies on the Pareto Front. This implies that the stability of any collusion might be undermined. This could eventually lead to an all-out quantity war in which Leader 1 could be made worse off if conditions reverted to the non-cooperative situation. Leader 1 could potentially compensate Leader 2 (e.g. by allocating Leader 2 a share of the profits gained) in such a way that Leader 2 would still be incentivised to cooperate. Regulators clearly need to be aware of such behavior when enforcing anti-trust legislation.

6.2 Example 2: Competition between Authorities

A situation in which two city transportation authorities were the leaders at the upper level was studied in [21]. In this context the strategic variable was the toll to charge on traffic using road(s) on the part of the network over which each city exercised jurisdiction, with the aim of maximising individual city welfare. City welfare is a function of the traffic flows due to the routing of traffic on the road network, which is in turn influenced by the toll levels charged. Traffic routing must satisfy Wardrop's equilibrium principle [19], which states that traffic arranges itself on the network such that, at equilibrium, the costs of all used routes/paths connecting any individual Origin-Destination pair are equalized. Wardrop's equilibrium principle can be expressed as a VI [17].
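Wardrop's principle can be illustrated on the smallest possible case. The sketch below is a toy under stated assumptions, not the network of [21]: a single Origin-Destination pair is served by two parallel routes with linear costs, a toll is added to the cost of the first route in the same units as travel time, and the function name `wardrop_two_routes` and all parameter values are hypothetical.

```python
from typing import Tuple


def wardrop_two_routes(demand: float,
                       free_flow: Tuple[float, float],
                       slope: Tuple[float, float],
                       toll: float = 0.0):
    """Equilibrium flows on two parallel routes with linear costs.

    Route k costs free_flow[k] + slope[k] * flow_k, plus `toll` on route 0.
    ASSUMPTION: a toy network, not the network studied in [21].
    """
    t0 = free_flow[0] + toll
    t1 = free_flow[1]
    # Equal-cost condition t0 + s0*f0 = t1 + s1*(demand - f0), solved for f0.
    f0 = (t1 - t0 + slope[1] * demand) / (slope[0] + slope[1])
    # If the equal-cost flow is infeasible, all traffic uses the cheaper route.
    f0 = min(max(f0, 0.0), demand)
    f1 = demand - f0
    costs = (t0 + slope[0] * f0, t1 + slope[1] * f1)
    return (f0, f1), costs


untolled = wardrop_two_routes(100.0, (10.0, 15.0), (0.125, 0.125))
tolled = wardrop_two_routes(100.0, (10.0, 15.0), (0.125, 0.125), toll=2.5)
print("no toll:", untolled)
print("toll 2.5 on route 0:", tolled)
```

In both runs the two used routes end up with equal cost, and the toll shifts flow from the tolled route to the other one. For general networks no such closed form exists, and the flows are obtained by traffic assignment.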
The usual way of obtaining the traffic flows, once the tolls are input, is through traffic assignment [7]. Among several different governance models studied in [21], the two cases of most relevance to this work are as follows:

1. the authorities engage in a Nash game by setting tolls with each maximising individual city welfare subject to Wardrop's equilibrium expressed as a VI, i.e. a NCEPEC.
2. the authorities cooperate to set tolls to maximise both cities' welfare simultaneously subject to Wardrop's equilibrium expressed as a VI, i.e. a MOPEC.

The network is shown in Fig. 5, where the dashed arcs are the tolled arcs in each authority's jurisdiction. The Pareto Front for the MOPEC where the cities cooperate to maximize welfare is shown in Fig. 6. The single solution reported in [21] also lies on this Pareto Front (indicated by ⋆ on Fig. 6). Fig. 7 shows the result (indicated by ×) when α = 0, i.e. the NCEPEC solution where the leaders played a Nash game instead. This point is not Pareto Optimal since one city can increase welfare without making the other worse off.

Fig. 7 also shows the "collusion path" mapping the NCEPEC to the Pareto Front, and Table 3 shows that the welfare for City I marginally decreases as α increases (and the opposite for City II). Hence whether the cooperative solution is sustainable (because City II benefits but City I marginally loses out) is an issue that warrants further research. We also notice that the tolls fall as α rises, again as an indication of signalling behavior to the other authority.

City I's welfare (cf. Table 3) decreases continuously as we move from the NCEPEC solution to the MOPEC solution, so that at α = 1 it is lower than under the NCEPEC outcome. Such a situation implies that City II might have to compensate City I so that the latter would be incentivised to cooperate. Again, this opens up a plethora of further research possibilities studying the policy implications in such situations, e.g. stability of agreements.
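The compensation argument can be made concrete with the α = 0 and α = 1 rows of Table 3. The sketch below is only an arithmetic illustration: it assumes, beyond anything stated in [21], that welfare can be transferred one-for-one between the two cities, and the variable names are hypothetical.

```python
# Welfare in units of 10,000 secs, copied from the alpha = 0 and alpha = 1 rows of Table 3.
welfare_nash = {"City I": 7656.28, "City II": 11630.22}   # alpha = 0 (NCEPEC)
welfare_coop = {"City I": 7653.56, "City II": 11636.27}   # alpha = 1 (Pareto Front)

# Change in each city's welfare when moving from the Nash to the cooperative outcome.
change = {city: welfare_coop[city] - welfare_nash[city] for city in welfare_nash}
net_gain = sum(change.values())

# ASSUMPTION: welfare is transferable one-for-one between the cities.
min_transfer = max(0.0, -change["City I"])   # smallest payment that makes City I whole
max_transfer = max(0.0, change["City II"])   # largest payment City II can afford

for city, delta in change.items():
    print(f"{city}: {delta:+.2f}")
print(f"net gain: {net_gain:+.2f}")
print(f"transfer from City II to City I between {min_transfer:.2f} and {max_transfer:.2f}")
```

Under that assumption City I loses about 2.72 units while City II gains about 6.05, so any transfer between those two amounts would leave both cities at least as well off as under the NCEPEC outcome.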
Fig. 5. Directed network [21] for Example 2 with the line down the middle demarcating authority jurisdiction. Dashed arcs are tolled arcs in each city.
[Network diagram with the two regions labelled City I and City II]

Fig. 6. Example 2: Pareto Front (Cooperative EPEC) Cities I and II, Cooperative Solution from [21] indicated with ⋆

Fig. 7. Example 2: (zoomed) Pareto Front (Cooperative EPEC) and "collusion path" mapping NCEPEC (×, α = 0) to Pareto Front (α = 1)
[Plot: Welfare: City I (horizontal axis, ×10^7) against Welfare: City II (vertical axis, ×10^8), with points labelled α = 0 (Nash), α = 0.6 and α = 1]

Table 3. Example 2: Tolls and Welfare for Leaders as α increases

                       City I                                City II
α                      Toll (secs)   Welfare (10,000 secs)   Toll (secs)   Welfare (10,000 secs)
0.00 (NCEPEC)          4943          7656.28                 4957          11630.22
0.20                   4847          7656.25                 4931          11631.40
0.40                   4748          7655.97                 4902          11632.59
0.60                   4648          7655.45                 4871          11633.80
0.80                   4544          7654.65                 4838          11635.03
1.00 (Pareto Front)    4439          7653.56                 4802          11636.27

7 Conclusions

In this paper, we studied a class of hierarchical optimization problems with multiple leaders characterized by the presence of a binding variational inequality constraint. This class of problems is collectively referred to as Equilibrium Problems with Equilibrium Constraints. Two assumptions of leader behavior were discussed, depending on whether the leaders cooperated to optimize their objectives or otherwise. We showed that advances in multi-objective evolutionary algorithms could be used to generate Pareto Fronts that represented the situation in which the leaders cooperate. In addition, an algorithm for the non-cooperative situation was proposed in our earlier research [9]. The main contribution of this paper is to demonstrate the potential mapping of the non-cooperative solution to the cooperative outcome through the use of evolutionary algorithms. We term the mapping between these two solutions the "collusion path" since it traces the collusion possibilities between leaders in a game.
With numerical examples drawn from both the economics and transportation systems management literature, we demonstrated the role that this path plays in assisting policy makers in developing anti-trust legislation.

In terms of policy research, further work should be undertaken to understand the "collusion path" as this will affect the incentives for cooperative action. With regard to evolutionary algorithms, though we have theoretical assurance of the convergence of NDEMO to a NE for an arbitrary NCEPEC, each run is still computationally demanding. Hence investigating methodologies to speed up convergence of the NDEMO algorithm for the non-cooperative case would continue to be a useful area of research.

References

1. Coello-Coello, C., Lamont, G.: Applications of multi-objective evolutionary algorithms. World Scientific, Singapore (2004)
2. Deb, K.: Multi-objective Optimization using Evolutionary Algorithms. John Wiley, Chichester (2001)
3. Ferris, M., Munson, T.: Complementarity problems in GAMS and the PATH solver. Journal of Economic Dynamics and Control 24(2), 165–188 (2000)
4. Harker, P. T.: A variational inequality approach for the determination of Oligopolistic market equilibrium. Math. Program. 30(1), 105–111 (1984)
5. Huang, V. L., Qin, A. K., Suganthan, P. N., Tasgetiren, M. F.: Multi-objective optimization based on self-adaptive differential evolution algorithm. Proceedings of IEEE CEC, pp. 3601–3608. IEEE Press, Piscataway, New Jersey (2007)
6. Karamardian, S.: Generalized complementarity problems. J. Optimiz. Theory App. 8(3), 161–168 (1971)
7. Koh, A., Watling, D.: Traffic Assignment Modelling. In: Button, K., Vega, H., Nijkamp, P. (eds.) A Dictionary of Transport Analysis, pp. 418–420. Edward Elgar, Cheltenham (2010)
8. Koh, A.: Solving transportation bi-level programs with Differential Evolution. Proceedings of the IEEE Congress on Evolutionary Computation, pp. 2243–2250. IEEE Press, Piscataway, New Jersey (2007)
9. Koh, A.: An evolutionary algorithm based on Nash dominance for equilibrium problems with equilibrium constraints. Appl. Soft Comput. 12(1), 161–173 (2012)
10. Lung, R. I., Dumitrescu, D.: Computing Nash equilibria by means of evolutionary computation. Int. J. Comput. Commun. III (Suppl. Issue - ICCCC 2008), 364–368 (2008)
11. Mordukhovich, B. S.: Variational Analysis and Generalized Differentiation, I: Basic Theory. Springer, Berlin (2006)
12. Mordukhovich, B. S., Outrata, J. V., Červinka, M.: Equilibrium problems with complementarity constraints: Case study with applications to oligopolistic markets. Optimization 56(4), 479–494 (2007)
13. Murphy, F. H., Sherali, H. D., Soyster, A. L.: A mathematical programming approach for determining oligopolistic market equilibrium. Math. Program. 24(1), 92–106 (1982)
14. Nash, J.: Non-Cooperative games. Ann. Math. Second Series 54(2), 286–295 (1951)
15. Outrata, J. V.: A note on a class of equilibrium problems with equilibrium constraints. Kybernetika 40(5), 585–594 (2004)
16. Price, K., Storn, R., Lampinen, J.: Differential evolution: a practical approach to global optimization. Springer, Berlin (2005)
17. Smith, M. J.: The existence, uniqueness and stability of traffic equilibria. Transport. Res. B-Meth. 13(4), 295–304 (1979)
18. von Stackelberg, H. H.: The theory of the market economy. William Hodge, London (1952)
19. Wardrop, J. G.: Some theoretical aspects of road traffic research. P. I. Civil Eng. Pt. 2, 1(36), 325–378 (1952)
20. Ye, J. J., Zhu, Q. J.: Multiobjective optimization problem with variational inequality constraints. Math. Program. 96A(1), 139–160 (2003)
21. Zhang, X. N., Zhang, H. M., Huang, H. J., Sun, L. J., Tang, T. Q.: Competitive, cooperative and Stackelberg congestion pricing for multiple regions in transportation networks. Transportmetrica 7(4), 297–320 (2011)