Introduction to dynamic non-cooperative games

Decentralized and distributed control
M. Farina (1), G. Ferrari Trecate (2)
(1) Dipartimento di Elettronica, Informazione e Bioingegneria (DEIB), Politecnico di Milano, Italy, [email protected]
(2) Dipartimento di Ingegneria Industriale e dell'Informazione (DIII), Università degli Studi di Pavia, Italy, [email protected]
EECI-HYCON2 Graduate School on Control 2015
Supélec, France
Farina, Ferrari Trecate ()
Decentralized and distributed control
EECI-HYCON2 School 2015
1 / 23
Outline
1. Dynamic non-cooperative games
2. Connections with distributed control
3. Example
4. Conclusions
5. Suggested readings
Dynamic non-cooperative games
To classify and understand the main available distributed
optimization-based control algorithms, we need to introduce
non-cooperative dynamic games.
Game theory
It is the study of the interactions among different agents, involving
multi-person decision-making.
Dynamic games
A game is dynamic (or differential) if the order in which decisions are
taken is relevant.
That is, the decision taken by an agent at instant t may depend on the
state of the system (the environment), which in turn depends on the
decisions taken at previous time instants, including those of the
"competing" agents.
Non-cooperative game
A game is said to be non-cooperative when each player pursues its
own interests.
This can lead to conflicting goals among players.
In fact, since
each player takes decisions based on its own utility, or payoff, and
agents have, in general, different utility functions,
a conflicting situation can arise.
Definition
A normal-form game is a tuple $(\mathcal{M}, A, g)$, where
$\mathcal{M}$ is a finite set of $M$ players, indexed by $i$;
$A = A_1 \times \cdots \times A_M$, where $A_i$ is the (finite) set of actions available to player $i$;
any vector $a = (a_1, \dots, a_M) \in A$ is an action profile;
$g = (g_1, \dots, g_M)$, where $g_i : A \to \mathbb{R}$ is the utility (gain or payoff) function for player $i$.
Remark: in general, the payoff $g_i$ of an agent depends both on
the action of agent $i$, $a_i \in A_i$, and
the actions of the "competing" agents, $a_j \in A_j$, $j \neq i$.
We denote by $A_{-i}$ the set of action profiles of the competing agents:
$A_{-i} = A_1 \times \cdots \times A_{i-1} \times A_{i+1} \times \cdots \times A_M$
and by $a_{-i} \in A_{-i}$ the action profile of the agents competing with $i$.
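As a rough illustration of the definition, a small normal-form game can be stored as plain Python data. This is only a sketch: the two-player game, its payoff numbers, and the helper names `g` and `others` are hypothetical and not taken from the slides.

```python
from itertools import product

# Hypothetical two-player game: every name and number here is illustrative.
players = [0, 1]                       # the player set, indexed by i
actions = [("C", "D"), ("C", "D")]     # A_i, the finite action set of player i

# A = A_1 x ... x A_M: all action profiles a = (a_1, ..., a_M)
profiles = list(product(*actions))

# g_i : A -> R, stored as one payoff tuple (g_1(a), g_2(a)) per profile a
payoffs = {
    ("C", "C"): (3.0, 3.0),
    ("C", "D"): (0.0, 4.0),
    ("D", "C"): (4.0, 0.0),
    ("D", "D"): (1.0, 1.0),
}


def g(i, a):
    """Utility of player i under the action profile a."""
    return payoffs[a][i]


def others(i, a):
    """a_{-i}: the actions of all players except i."""
    return a[:i] + a[i + 1:]


print(profiles)
print(g(1, ("C", "D")), others(0, ("C", "D")))
```

Note that `g(i, a)` takes the whole profile `a`, which mirrors the remark that the payoff of agent i depends on the competing agents' actions as well as its own.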
Strategy
A strategy is a set of decision rules defining the actions to be taken by a player in
each situation. It
can depend on the state of the system (especially in dynamic games, e.g., a
control law!);
can be
I) fixed (i.e., a pure strategy),
II) probabilistic (i.e., a mixed strategy), when decisions in $A_i$ are not
deterministic, but are taken according to a given probability
distribution.
$s_i$ and $S_i$ denote a strategy and a set of strategies, respectively, for agent $i$;
$s = (s_1, \dots, s_M)$ is a strategy profile, and $S = S_1 \times \cdots \times S_M$;
$s_{-i} = (s_1, \dots, s_{i-1}, s_{i+1}, \dots, s_M)$ is the strategy profile of the agents competing with
$i$, and $S_{-i}$ is the set where $s_{-i}$ lies;
note that $s = (s_i, s_{-i}) \in S$.
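To make the difference between pure and mixed strategies concrete, the sketch below evaluates a player's expected utility when each player draws its action independently from its own distribution. The payoff numbers and the function name `expected_utility` are assumptions made for illustration only.

```python
from itertools import product

# Hypothetical 2x2 payoffs (g_1, g_2); the numbers are illustrative only.
payoffs = {
    ("C", "C"): (3.0, 3.0), ("C", "D"): (0.0, 4.0),
    ("D", "C"): (4.0, 0.0), ("D", "D"): (1.0, 1.0),
}


def expected_utility(i, s):
    """Expected g_i when player j draws its action independently from s[j].

    s is a strategy profile: one dict {action: probability} per player.
    """
    total = 0.0
    for a in product(*(s_j.keys() for s_j in s)):
        prob = 1.0
        for s_j, a_j in zip(s, a):
            prob *= s_j[a_j]          # probability of the joint profile a
        total += prob * payoffs[a][i]
    return total


pure_d = {"C": 0.0, "D": 1.0}   # a pure strategy is a degenerate distribution
coin = {"C": 0.5, "D": 0.5}     # a mixed strategy

s = (coin, pure_d)              # s = (s_1, s_2)
print(expected_utility(0, s), expected_utility(1, s))
```

Treating a pure strategy as a distribution with all its mass on one action lets the same function cover both cases I) and II).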
Optimality in a single-player framework
The optimal strategy is the strategy that defines the action $a$ that maximizes the
utility function $g$ for a given environment where the (single) agent operates, i.e.,
$g(a) \geq g(a')$ for all $a' \in A$
... and in a multi-player framework
what is an optimal strategy?
Desired properties: an optimal strategy
optimizes the "system-wide" outcome of a game;
should be invariant with respect to additive or scaling operations on the individual
players' utility functions.
Different solution concepts have been defined, i.e., different "definitions" of
optimality.
Solution concepts
Pareto-optimal strategies
A given strategy profile $s$ is said to Pareto-dominate the strategy profile $s'$ if
$g_i(s) \geq g_i(s')$ for all $i \in \mathcal{M}$, and this inequality is strict for at least one $i \in \mathcal{M}$.
A strategy profile $s$ is Pareto-optimal if there does not exist any other strategy
profile $s' \in S$ that Pareto-dominates $s$.
Pareto-optimality defines an unambiguous way to establish that a given strategy profile is
not globally dominated by any other.
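The two definitions above translate almost directly into code. The following is a minimal sketch over pure strategy profiles of a hypothetical 2x2 game; the payoff numbers and the function names are illustrative assumptions.

```python
def pareto_dominates(u, v):
    """True if payoff vector u Pareto-dominates payoff vector v."""
    no_one_worse = all(x >= y for x, y in zip(u, v))
    someone_better = any(x > y for x, y in zip(u, v))
    return no_one_worse and someone_better


def pareto_optimal(payoffs):
    """Profiles whose payoff vector is not Pareto-dominated by any other."""
    return [
        s for s, u in payoffs.items()
        if not any(pareto_dominates(v, u) for v in payoffs.values())
    ]


# Hypothetical payoffs (g_1, g_2), for illustration only.
payoffs = {
    ("C", "C"): (3, 3), ("C", "D"): (0, 4),
    ("D", "C"): (4, 0), ("D", "D"): (1, 1),
}

print(pareto_optimal(payoffs))
```

In this example only ("D", "D") is excluded, because ("C", "C") is better for both players. Several profiles can be Pareto-optimal at once, so the concept does not single out one solution.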
Nash equilibria
A strategy profile $s = (s_1, \dots, s_M)$ is a Nash equilibrium if, for all agents $i$, $s_i$ is $i$'s best
response to $s_{-i}$, meaning that $g_i(s_i, s_{-i}) \geq g_i(s_i', s_{-i})$ for all $s_i' \in S_i$.
A Nash equilibrium defines optimality from a single player's point of view, with respect
to the strategies of all the other agents.
It is possible to prove that every finite game (finitely many players and actions) has at
least one (possibly mixed) Nash equilibrium.
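For small finite games, pure-strategy Nash equilibria can be found by checking every profile for profitable unilateral deviations. The sketch below does exactly that. The two example games and their payoffs are hypothetical, and the search covers pure strategies only, so it can return an empty list even though a mixed equilibrium exists.

```python
from itertools import product


def pure_nash_equilibria(actions, payoffs):
    """All pure profiles from which no player gains by deviating alone."""
    equilibria = []
    for s in product(*actions):
        stable = True
        for i, a_i_set in enumerate(actions):
            for dev in a_i_set:
                s_dev = s[:i] + (dev,) + s[i + 1:]   # (s_i', s_{-i})
                if payoffs[s_dev][i] > payoffs[s][i]:
                    stable = False
        if stable:
            equilibria.append(s)
    return equilibria


# Hypothetical dilemma-type game (illustrative numbers).
dilemma = {
    ("C", "C"): (3, 3), ("C", "D"): (0, 4),
    ("D", "C"): (4, 0), ("D", "D"): (1, 1),
}
# Matching pennies: player 1 wins on a match, player 2 on a mismatch.
pennies = {
    ("H", "H"): (1, -1), ("H", "T"): (-1, 1),
    ("T", "H"): (-1, 1), ("T", "T"): (1, -1),
}

print(pure_nash_equilibria([("C", "D"), ("C", "D")], dilemma))
print(pure_nash_equilibria([("H", "T"), ("H", "T")], pennies))
```

Matching pennies has no pure equilibrium, which is why the existence result is stated for possibly mixed strategies.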
Max-min strategies
The maxmin strategy of player $i$ is a (not necessarily unique or fixed) strategy
that maximizes $i$'s worst-case utility.
For player $i$, it is defined as
$\arg\max_{s_i \in S_i} \min_{s_{-i} \in S_{-i}} g_i(s_i, s_{-i})$
It is the choice taken with the aim of maximizing one's expected utility without
having to make any assumption on the strategies adopted by the other players.
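Restricted to pure strategies, the maxmin definition becomes a nested loop: for each own action take the worst case over the others' actions, then keep the best of those worst cases. The sketch below assumes a hypothetical "stag hunt" style game. The payoffs and the function name `maxmin_pure` are illustrative, and mixed maxmin strategies are not covered.

```python
from itertools import product


def maxmin_pure(i, actions, payoffs):
    """Pure maxmin action of player i and the utility it guarantees."""
    other_sets = [a_j for j, a_j in enumerate(actions) if j != i]
    best_action, best_value = None, float("-inf")
    for a_i in actions[i]:
        # Worst case of g_i over all action profiles a_{-i} of the others.
        worst = min(
            payoffs[a_minus[:i] + (a_i,) + a_minus[i:]][i]
            for a_minus in product(*other_sets)
        )
        if worst > best_value:
            best_action, best_value = a_i, worst
    return best_action, best_value


# Hypothetical stag-hunt payoffs (g_1, g_2), for illustration only.
actions = [("S", "H"), ("S", "H")]
payoffs = {
    ("S", "S"): (4.0, 4.0), ("S", "H"): (0.0, 3.0),
    ("H", "S"): (3.0, 0.0), ("H", "H"): (3.0, 3.0),
}

print(maxmin_pure(0, actions, payoffs))
print(maxmin_pure(1, actions, payoffs))
```

Here the cautious choice "H" guarantees 3 whatever the other player does, although both players would obtain 4 at ("S", "S"). The maxmin solution trades performance for robustness.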
Connections with distributed control
Why are the basics of game theory useful for the study of distributed
optimization-based control?
to classify and understand the main rationale underlying the
methods proposed in the literature: basically all the proposed
methods have a clear game-theoretical characterization;
to provide stimulating starting points for the development of novel
control schemes: many recent control schemes are explicitly
inspired by game-theoretical solution concepts.
Types of games in distributed MPC:
I) dynamic infinite games, where players have an infinite number
of actions to choose from, i.e., $u_i \in \mathbb{R}^{m_i}$;
II) a pure (fixed) strategy, for each player, is represented as a real
input vector $u_i$ for subsystem $i$;
III) the cost functions $V_i = -g_i$, $i = 1, \dots, M$, are strictly convex;
IV) the utility functions $g_i(u_1, \dots, u_M)$, $i = 1, \dots, M$, are jointly
continuous in all their arguments and strictly concave (generally
quadratic in $u_i$ for every $u_j$, $j \neq i$);
V) mixed strategies are not likely to be implemented, since this would
mean using control laws with randomly changing parameters.
Three types of algorithms are studied:
Nash equilibrium solutions of general non-cooperative games
where the utility functions of the players differ from each
other: Nash solution of a non-cooperative game;
Maxmin solutions of general non-cooperative games where the
utility functions of the players differ from each other: robust
solution of a non-cooperative game;
Pareto-optimal solutions (which in this case are also Nash equilibria) of
non-cooperative games where the utility function is the same for all
players ($g_i = g$ for all $i = 1, \dots, M$): solution of a "cooperative game".
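To contrast the Nash and the "cooperative" solutions in the infinite-game setting with quadratic costs, the sketch below uses a toy problem with two scalar inputs. The cost $V_i(u) = (u_i - 1)^2 + \frac{1}{2}(u_1 + u_2)^2$ is a hypothetical choice, not one from the slides. The Nash point is approached by simultaneous best-response iteration, while the cooperative point minimizes the common cost $V_1 + V_2$.

```python
def V(i, u):
    """Hypothetical strictly convex quadratic cost of player i (i = 0 or 1)."""
    return (u[i] - 1.0) ** 2 + 0.5 * (u[0] + u[1]) ** 2


def best_response(i, u):
    """Minimizer of V(i, .) over u[i] with the other input held fixed.

    dV_i/du_i = 2(u_i - 1) + (u_i + u_j) = 0  gives  u_i = (2 - u_j) / 3.
    """
    return (2.0 - u[1 - i]) / 3.0


# Nash solution: iterate simultaneous best responses (contraction, factor 1/3).
u = [0.0, 0.0]
for _ in range(60):
    u = [best_response(0, u), best_response(1, u)]
nash = tuple(u)

# Cooperative solution: both players minimize the common cost V_0 + V_1.
# Zero gradient:  4 u_0 + 2 u_1 = 2  and  2 u_0 + 4 u_1 = 2  (Cramer's rule).
det = 4.0 * 4.0 - 2.0 * 2.0
coop = ((2.0 * 4.0 - 2.0 * 2.0) / det, (4.0 * 2.0 - 2.0 * 2.0) / det)

print("Nash:", nash, "costs:", V(0, nash), V(1, nash))
print("Cooperative:", coop, "costs:", V(0, coop), V(1, coop))
```

In this toy problem the iteration settles at $u_1 = u_2 = 1/2$, while the common-cost minimizer is $u_1 = u_2 = 1/3$, which gives each player a lower individual cost than the Nash point. Convergence of the best-response iteration is a property of this particular example and is not guaranteed in general.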
Example: a variation of the prisoner’s dilemma
Each of two players ($P_1$ and $P_2$) is asked to choose, independently and
anonymously, whether or not to provide a gift of 10 Euros to the other player, at a
cost of 2 Euros to itself.
Each player has two choices: cooperate (C, i.e., pay 2 Euros) and defect (D, i.e.,
pay 0 Euros).
The utilities $g_i$, $i = 1, 2$, are
$g_i$ = net gain for player $P_i$,
which are indicated in the table
P1 \ P2 |   C    |   D
C       |  8, 8  | -2, 10
D       | 10, -2 |  0, 0
(rows: choice of P1; columns: choice of P2; entries: g1, g2)
the maxmin solution is the choice each player makes to maximize its own utility
(or minimize its own loss) in the face of the worst choice that the other player can
make;
the Nash equilibrium is the pair of choices such that neither player can increase its
utility by unilaterally changing its choice.
In both cases, the solution is DD.
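The payoff table and the DD result can be rechecked from the rules of the game alone. The sketch below rebuilds the table from the 10 Euro gift and the 2 Euro cost. It then verifies that defecting strictly dominates cooperating, which is why both the Nash and the maxmin solutions are DD. The function and variable names are illustrative.

```python
GIFT, COST = 10, 2   # from the example: a 10 Euro gift costs the giver 2 Euros


def net_gain(own, other):
    """Net gain of a player choosing `own` while the other chooses `other`."""
    received = GIFT if other == "C" else 0
    paid = COST if own == "C" else 0
    return received - paid


choices = ("C", "D")

# table[(a1, a2)] = (g_1, g_2), with a1 chosen by P1 and a2 chosen by P2
table = {
    (a1, a2): (net_gain(a1, a2), net_gain(a2, a1))
    for a1 in choices for a2 in choices
}

# D strictly dominates C: whatever the other player does, defecting pays more.
d_dominates = all(
    net_gain("D", other) > net_gain("C", other) for other in choices
)

# Worst-case utility of each choice (the quantity maximized by maxmin).
worst_case = {
    own: min(net_gain(own, other) for other in choices) for own in choices
}

print(table)
print(d_dominates, worst_case)
```

Since D is each player's best reply to any choice of the opponent, DD is the only profile from which no unilateral change pays off. D also has the better worst case (0 instead of -2).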
Example: the prisoner’s dilemma
To change the problem into a "cooperative" one, we assume that
the utility $g_i = u$ is the same for both players,
$u$ is the total collective utility (gain for $P_1$ + gain for $P_2$).
The corresponding table is
P1 \ P2 |   C    |   D
C       | 16, 16 |  8, 8
D       |  8, 8  |  0, 0
Evidently, the Nash equilibrium (which here coincides with both the Pareto-optimal
solution and the maxmin solution) corresponds to the choices CC.
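The cooperative variant can be checked in the same way. In this sketch both players share the collective utility $u$, and the code confirms that CC maximizes $u$, that no unilateral deviation from CC increases it, and that C has the better worst case. The helper names are illustrative.

```python
GIFT, COST = 10, 2   # same rules as in the non-cooperative example


def net_gain(own, other):
    """Individual net gain of a player choosing `own` against `other`."""
    return (GIFT if other == "C" else 0) - (COST if own == "C" else 0)


def u(a1, a2):
    """Common utility: gain of P1 plus gain of P2."""
    return net_gain(a1, a2) + net_gain(a2, a1)


choices = ("C", "D")
profiles = [(a1, a2) for a1 in choices for a2 in choices]

# Pareto-optimal profile: with a common utility this is the maximizer of u.
best = max(profiles, key=lambda a: u(*a))

# Nash check: neither player can increase u by deviating alone from `best`.
is_nash = (
    all(u(dev, best[1]) <= u(*best) for dev in choices)
    and all(u(best[0], dev) <= u(*best) for dev in choices)
)

# Maxmin check: worst-case common utility of each own choice.
worst_case = {own: min(u(own, other) for other in choices) for own in choices}

print(best, u(*best), is_nash, worst_case)
```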
Conclusions
Take-home messages:
game theory provides useful tools to understand, classify, and
inspire optimization-based distributed control
algorithms;
to avoid conflicts, the agents must have the same utility function
(i.e., the same goal), that is, they must be cooperative agents.
Suggested readings
Books
1. T. Başar and G. J. Olsder. Dynamic Noncooperative Game Theory.
Academic Press, 2nd edition, 1995.
2. Y. Shoham and K. Leyton-Brown. Multiagent Systems: Algorithmic,
Game-Theoretic, and Logical Foundations. Cambridge University
Press, 2009. http://www.masfoundations.org/download.html.
Recent papers
3. J. B. Rawlings and B. T. Stewart. Coordinating multiple
optimization-based controllers: New opportunities and challenges.
Journal of Process Control, 18(9):839-845, October 2008.
4. J. M. Maestre, D. Muñoz de la Peña, and E. F. Camacho. Distributed
model predictive control based on a cooperative game. Optimal Control
Applications and Methods, 32(2):153-176, 2011.
5. J. M. Maestre, D. Muñoz de la Peña, E. F. Camacho, and T. Alamo.
Distributed model predictive control based on agent negotiation. Journal
of Process Control, 21(5):685-697, 2011.
6. K. Stankova and B. De Schutter. Stackelberg equilibria for discrete-time
dynamic games. Part I: Deterministic games / Part II: Stochastic games
with deterministic information structure. In Proceedings of the IEEE
International Conference on Networking, Sensing and Control, ICNSC
2011, 2011.