Decentralized and distributed control
Introduction to dynamic non-cooperative games

M. Farina (1), G. Ferrari Trecate (2)
(1) Dipartimento di Elettronica, Informazione e Bioingegneria (DEIB), Politecnico di Milano, Italy, [email protected]
(2) Dipartimento di Ingegneria Industriale e dell'Informazione (DIII), Università degli Studi di Pavia, Italy, [email protected]

EECI-HYCON2 Graduate School on Control 2015, Supélec, France

Outline
1 Dynamic non-cooperative games
2 Connections with distributed control
3 Example
4 Conclusions
5 Suggested readings

Dynamic non-cooperative games

To classify and understand the main available distributed optimization-based control algorithms, we need to introduce non-cooperative dynamic games.

Game theory
Game theory is the study of the interactions among different agents in multi-person decision-making problems.

Dynamic games
A game is dynamic (or, in continuous time, differential) if the order in which decisions are taken is relevant. That is, the decision taken by an agent at instant t may depend on the state of the system (the environment), which in turn depends on the decisions taken by the "competing" agents at previous time instants.
Non-cooperative game
A game is said to be non-cooperative when each player pursues its own interests, which can lead to conflicting goals among players. Each player takes its decisions based on its own utility (or payoff); since the agents have, in general, different utility functions, conflicting situations can arise.

Definition
A normal-form game is a tuple $(\mathcal{M}, A, g)$, where
- $\mathcal{M}$ is a finite set of $M$ players, indexed by $i$;
- $A = A_1 \times \dots \times A_M$, where $A_i$ is the (finite) set of actions available to player $i$; any vector $a = (a_1, \dots, a_M) \in A$ is an action profile;
- $g = (g_1, \dots, g_M)$, where $g_i : A \to \mathbb{R}$ is the utility (gain or payoff) function of player $i$.

Remark: in general, the payoff $g_i$ of agent $i$ depends on both
- the action of agent $i$, $a_i \in A_i$;
- the actions of the "competing" agents, $a_j \in A_j$, $j \neq i$.
We denote by $A_{-i}$ the set of action profiles of the competing agents,
$A_{-i} = A_1 \times \dots \times A_{i-1} \times A_{i+1} \times \dots \times A_M$,
and by $a_{-i} \in A_{-i}$ the action profile of the agents competing with $i$.
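As a rough illustration of the definition above, a small normal-form game can be written down directly as data. This is only a sketch: the two-player game, its action labels and its payoffs are hypothetical placeholders that do not come from the slides, and players are indexed from 0 instead of 1.

```python
from itertools import product

# Hypothetical two-player game; labels and payoffs are illustrative only.
actions = [("a", "b"), ("x", "y", "z")]   # A_1 and A_2 (players indexed 0 and 1)

# A = A_1 x A_2: all action profiles a = (a_1, a_2).
profiles = list(product(*actions))

def competing_profiles(i):
    """Enumerate A_{-i}, the action profiles of every player except i."""
    others = [acts for j, acts in enumerate(actions) if j != i]
    return list(product(*others))

def split(profile, i):
    """Split a profile a into the pair (a_i, a_{-i})."""
    return profile[i], profile[:i] + profile[i + 1:]

# g = (g_1, g_2): each g_i maps a full action profile to a real number.
# Placeholder payoffs: player 0 wants ("a", "x"), player 1 prefers "z".
g = [
    lambda a: 1.0 if a == ("a", "x") else 0.0,
    lambda a: 1.0 if a[1] == "z" else 0.0,
]

for a in profiles:
    print(a, [g_i(a) for g_i in g])
```

Note that each payoff takes the whole profile as its argument, which is exactly the point of the remark: a player's gain is a function of everybody's actions, not only of its own.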
Strategy
A strategy is a set of decision rules defining the action to be taken by a player in each situation. It can depend on the state of the system (especially in dynamic games: e.g., a control law!) and can be
I) fixed (i.e., a pure strategy);
II) probabilistic (i.e., a mixed strategy), when decisions in $A_i$ are not deterministic but are taken according to a given probability distribution.

Notation:
- $s_i$ and $S_i$ denote a strategy and the set of strategies, respectively, of agent $i$;
- $s = (s_1, \dots, s_M)$ is a strategy profile, and $S = S_1 \times \dots \times S_M$;
- $s_{-i} = (s_1, \dots, s_{i-1}, s_{i+1}, \dots, s_M)$ is the strategy profile of the agents competing with $i$, and $S_{-i}$ is the set where $s_{-i}$ lies; note that $s = (s_i, s_{-i}) \in S$.

Optimality in a single-player framework
The optimal strategy is the one that selects the action $a$ maximizing the utility function $g$ for the given environment in which the (single) agent operates, i.e.,
$g(a) \geq g(a')$ for all $a' \in A$.
And in a multi-player framework, what is an optimal strategy?

Desired properties of an optimal strategy:
- it should optimize the "system-wide" outcome of the game;
- it should be invariant with respect to additive or scaling operations on the single players' utility functions.
Different solution concepts, i.e., different "definitions" of optimality, have been proposed.

Solution concepts

Pareto-optimal strategies
A strategy profile $s$ is said to Pareto-dominate the strategy profile $s'$ if $g_i(s) \geq g_i(s')$ for all $i$, and the inequality is strict for at least one $i \in \mathcal{M}$.
A strategy profile $s$ is Pareto-optimal if there is no other strategy profile $s' \in S$ that Pareto-dominates $s$.
Pareto optimality gives an unambiguous criterion to establish that a strategy profile is not globally dominated by any other.

Nash equilibria
A strategy profile $s = (s_1, \dots, s_M)$ is a Nash equilibrium if, for all agents $i$, $s_i$ is $i$'s best response to $s_{-i}$, meaning that
$g_i(s_i, s_{-i}) \geq g_i(s_i', s_{-i})$ for all $s_i' \in S_i$.
A Nash equilibrium defines optimality from a single player's point of view, given the strategies of all the other agents. It is possible to prove that every finite normal-form game (finitely many players and actions) has at least one (possibly mixed) Nash equilibrium.
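The two definitions above can be checked by brute force on a small finite game. The sketch below is restricted to pure strategies (it does not search for mixed equilibria), and the coordination-type game used to exercise it is a hypothetical example, not one taken from the slides; the function names are likewise illustrative.

```python
from itertools import product

def pure_nash_equilibria(payoffs, actions):
    """Return the pure profiles where no player gains by deviating alone."""
    equilibria = []
    for s in product(*actions):
        stable = True
        for i, acts in enumerate(actions):
            for alt in acts:
                deviated = s[:i] + (alt,) + s[i + 1:]
                if payoffs[deviated][i] > payoffs[s][i]:
                    stable = False   # player i has a profitable deviation
        if stable:
            equilibria.append(s)
    return equilibria

def pareto_dominates(u, v):
    """True if payoff vector u is >= v everywhere and > v somewhere."""
    return (all(x >= y for x, y in zip(u, v))
            and any(x > y for x, y in zip(u, v)))

def pareto_optimal(payoffs):
    """Return the profiles whose payoff vector no other profile dominates."""
    return [s for s, u in payoffs.items()
            if not any(pareto_dominates(v, u) for v in payoffs.values())]

# Hypothetical two-player coordination game: payoffs[(a_1, a_2)] = (g_1, g_2).
actions = [("L", "R"), ("L", "R")]
payoffs = {
    ("L", "L"): (2, 1),
    ("L", "R"): (0, 0),
    ("R", "L"): (0, 0),
    ("R", "R"): (1, 2),
}

print(pure_nash_equilibria(payoffs, actions))  # [('L', 'L'), ('R', 'R')]
print(pareto_optimal(payoffs))                 # [('L', 'L'), ('R', 'R')]
```

In this particular game the two solution concepts select the same profiles, but that is a property of the example, not a general fact: Nash equilibria need not be Pareto-optimal.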
Max-min strategies
The maxmin strategy of player $i$ is a (not necessarily unique or fixed) strategy that maximizes $i$'s worst-case utility. For player $i$, it is defined as
$\arg\max_{s_i \in S_i} \min_{s_{-i} \in S_{-i}} g_i(s_i, s_{-i})$.
It is the choice made to maximize one's own (expected) utility in the worst case, without having to make any assumption on the strategies adopted by the other players.

Connections with distributed control

Why are the basics of game theory useful for the study of distributed optimization-based control?
- To classify and understand the main rationale underlying the methods proposed in the literature: basically all the proposed methods have a clear game-theoretical characterization.
- To provide stimulating starting points for the development of novel control schemes: many recent control schemes are explicitly inspired by game-theoretical solution concepts.

Type of games in distributed MPC:
I) they are dynamic infinite games, where each player has an infinite set of available actions, i.e., $u_i \in \mathbb{R}^{m_i}$;
II) a pure (fixed) strategy, for each player, is represented as a real input vector $u_i$ for subsystem $i$;
III) the cost functions $V_i = -g_i$, $i = 1, \dots, M$, are strictly convex;
IV) the utility functions $g_i(u_1, \dots, u_M)$, $i = 1, \dots, M$, are jointly continuous in all their arguments and strictly concave (generally quadratic) in $u_i$ for every $u_j$, $j \neq i$;
V) mixed strategies are not likely to be implemented, since this would mean using control laws with randomly varying parameters.

Three types of algorithms are studied:
- Nash equilibrium solutions of general non-cooperative games where the utility functions of the players differ from each other: Nash solution of a non-cooperative game;
- maxmin solutions of general non-cooperative games where the utility functions of the players differ from each other: robust solution of a non-cooperative game;
- Pareto-optimal (i.e., Nash) solutions of non-cooperative games where the utility functions are the same for all players ($g_i = g$ for all $i = 1, \dots, M$): solution of a "cooperative game".

Example: a variation of the prisoner's dilemma
- Each of two players ($P_1$ and $P_2$) is asked to choose, independently and anonymously, whether or not to provide a gift of 10 Euros to the other player, at a cost of 2 Euros.
- Each player has two choices: cooperate (C, i.e., pay 2 Euros) or defect (D, i.e., pay 0 Euros).
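The rules of the example can be turned into net gains mechanically, and the maxmin definition given earlier can then be evaluated by enumeration. The sketch below is an illustration under stated assumptions: a player who cooperates pays 2 Euros and the other player then receives 10 Euros; only pure strategies are considered; the names `net_gain` and `maxmin_action` are hypothetical.

```python
GIFT = 10   # Euros received when the other player cooperates
COST = 2    # Euros paid by a player who cooperates
ACTIONS = ("C", "D")

def net_gain(own, other):
    """Net gain of a player choosing `own` while the other chooses `other`."""
    received = GIFT if other == "C" else 0
    paid = COST if own == "C" else 0
    return received - paid

# payoffs[(a1, a2)] = (net gain of P1, net gain of P2)
payoffs = {(a1, a2): (net_gain(a1, a2), net_gain(a2, a1))
           for a1 in ACTIONS for a2 in ACTIONS}

def maxmin_action(player):
    """Pure action maximizing the player's worst-case net gain (0 = P1, 1 = P2)."""
    def worst_case(own):
        outcomes = []
        for other in ACTIONS:
            profile = (own, other) if player == 0 else (other, own)
            outcomes.append(payoffs[profile][player])
        return min(outcomes)
    return max(ACTIONS, key=worst_case)

for profile, gains in payoffs.items():
    print(profile, gains)
print(maxmin_action(0), maxmin_action(1))  # D D
```

Cooperating exposes a player to a worst case of -2, while defecting guarantees at least 0, which is why the enumeration selects D for both players.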
The utilities $g_i$, $i = 1, 2$, are the net gains of player $P_i$, as indicated in the table (rows: choice of $P_1$; columns: choice of $P_2$; entries: $g_1, g_2$).

P1 \ P2 |   C    |   D
C       |  8, 8  | -2, 10
D       | 10, -2 |  0, 0

- The maxmin solution is the choice each player makes to maximize its own utility (or minimize its own loss) in the face of the worst choice that the other player can make.
- The Nash equilibrium is the pair of choices such that, if one player at a time changes its mind, its utility does not increase.
In both cases, the solution is DD. Indeed, D guarantees at least 0 whereas C may yield -2, and a cooperating player always gains 2 Euros by switching to D.

To change the problem into a "cooperative" one, we assume that the utility is the same for both players, $g_i = g$, where $g$ is the total collective utility (gain for $P_1$ + gain for $P_2$). The corresponding table is

P1 \ P2 |   C    |   D
C       | 16, 16 |  8, 8
D       |  8, 8  |  0, 0

Clearly, the Nash equilibrium (which here coincides with both the Pareto-optimal solution and the maxmin solution) corresponds to the choices CC.

Conclusions

Take-home messages:
- game theory provides useful tools to understand, classify, and inspire optimization-based distributed control algorithms;
- to avoid conflicts, the agents must have the same utility function (i.e., the same goal): cooperative agents.

Suggested readings

Books
1. T. Başar and G. J. Olsder. Dynamic Noncooperative Game Theory.
   Academic Press, 2nd edition, 1995.
2. Y. Shoham and K. Leyton-Brown. Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations. Cambridge University Press, 2009. http://www.masfoundations.org/download.html.

Recent papers
3. J. B. Rawlings and B. T. Stewart. Coordinating multiple optimization-based controllers: New opportunities and challenges. Journal of Process Control, 18(9):839-845, October 2008.
4. J. M. Maestre, D. Muñoz de la Peña, and E. F. Camacho. Distributed model predictive control based on a cooperative game. Optimal Control Applications and Methods, 32(2):153-176, 2011.
5. J. M. Maestre, D. Muñoz de la Peña, E. F. Camacho, and T. Alamo. Distributed model predictive control based on agent negotiation. Journal of Process Control, 21(5):685-697, 2011.
6. K. Stankova and B. De Schutter. Stackelberg equilibria for discrete-time dynamic games. Part I: Deterministic games. Part II: Stochastic games with deterministic information structure. In Proceedings of the IEEE International Conference on Networking, Sensing and Control, ICNSC 2011, 2011.