Dynamic Coalitional TU Games:
Distributed Bargaining among Players’ Neighbors
Angelia Nedić
Industrial and Enterprise Systems Engineering Department
University of Illinois at Urbana-Champaign
joint work with Dario Bauso
Università di Palermo, Italy
Game Theory Workshop, ACC’15 Chicago
June 29, 2015
Introduction
• A tutorial paper
Coalitional Game Theory for Communication Networks
by W. Saad, Z. Han, M. Debbah, A. Hjorungnes, and T. Basar, 2009
Coalitional Transferable Utility (TU) Game
A TU game (N, v) is specified by
• Set N = {1, 2, . . . , n} of players
• Characteristic function v that assigns a scalar value $v_S$ to every nonempty coalition $S \subseteq N$.
• Formally: let $P(N)$ be the set of all nonempty subsets of $N$ and let $m = 2^n - 1$ be its cardinality.
• Then the characteristic function is a map $v : P(N) \to \mathbb{R}$, which we identify with the vector $v = (v_S)_{S \in P(N)} \in \mathbb{R}^m$.
• Value vS can be thought of as a monetary value that the players in S will distribute
among themselves in some fair manner.
• The grand coalition N is stable when no player has an incentive to leave coalition N, i.e., when the core C of the game is nonempty:
$$C = \{x \in \mathbb{R}^n \mid e_N^\top x = v_N,\ e_S^\top x \ge v_S \text{ for } S \subset N\},$$
where x is an allocation vector for the players and $e_S \in \mathbb{R}^n$ is the incidence vector of coalition S:
$$[e_S]_i = \begin{cases} 1 & \text{when } i \in S, \\ 0 & \text{when } i \notin S. \end{cases}$$
• Bargaining is an allocation process by which the players reach an agreement on some
allocation in the core.
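The core definition above can be checked mechanically. The following is a minimal sketch, not part of the original slides: the function name `in_core`, the 0-based player indexing, and the convention that missing coalitions have value 0 are all assumptions made for illustration. The game values are the upper interval ends from the numerical example later in the deck.

```python
from itertools import combinations

def in_core(x, v, tol=1e-9):
    """Return True if allocation x lies in the core of the TU game (N, v).

    x : list of n payoffs, one per player (players are indexed 0..n-1 here).
    v : dict mapping each nonempty coalition (a sorted tuple of player
        indices) to its value v_S; missing coalitions default to 0
        (an assumption of this sketch).
    """
    n = len(x)
    grand = tuple(range(n))
    # Efficiency: e_N' x = v_N
    if abs(sum(x) - v.get(grand, 0.0)) > tol:
        return False
    # Coalitional constraints: e_S' x >= v_S for every proper coalition S
    for size in range(1, n):
        for S in combinations(range(n), size):
            if sum(x[i] for i in S) < v.get(S, 0.0) - tol:
                return False
    return True

# Upper interval ends of the robust game in the numerical example.
v_max = {(0,): 7.0, (1,): 3.0, (2,): 0.0,
         (0, 1): 0.0, (0, 2): 0.0, (1, 2): 0.0,
         (0, 1, 2): 10.0}

print(in_core([7.0, 3.0, 0.0], v_max))  # True
print(in_core([5.0, 5.0, 0.0], v_max))  # False: first player gets less than v_{1}
print(in_core([7.0, 3.0, 1.0], v_max))  # False: total differs from v_N
```

The check enumerates all 2^n - 2 proper coalitions, so it is only practical for small n.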
Bargaining: determining an allocation in the core
• This is a feasibility problem of finding a point in the polyhedral set C:
$$C = \{x \in \mathbb{R}^n \mid e_N^\top x = v_N,\ e_S^\top x \ge v_S \text{ for } S \subset N\}$$
• Dynamic allocation is of interest when the players “negotiate” without a “central” entity
• J.C. Cesco (1998) proposed a “transfer scheme” in which coalitions perform the updates
• E. Lehrer (2002) considered a “gradient-based” scheme in which a randomly selected player updates at each time
• Both consider a “repeated” static game (N, v); the optimization aspect is obscured
Bargaining: Optimization perspective
• Solving feasibility problem distributedly among the players
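• Define a bounding set $X_i$ of player i:
$$X_i = \{x \in \mathbb{R}^n \mid e_N^\top x = v_N,\ e_S^\top x \ge v_S \text{ for } S \subset N \text{ with } i \in S\}$$
• Note that
$$C = \{x \in \mathbb{R}^n \mid e_N^\top x = v_N,\ e_S^\top x \ge v_S \text{ for } S \subset N\} = \bigcap_{i=1}^n X_i$$
• Possible formulation
$$\text{minimize} \quad \sum_{i=1}^n \mathrm{dist}^2(x, X_i) = \sum_{i=1}^n \|x - \Pi_{X_i}[x]\|^2,$$
where $\mathrm{dist}(x, X)$ is the Euclidean distance from a point x to the set X and $\Pi_X[x]$ is the projection of x on X.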
• Iterative gradient method will work with
Incremental (cyclic) update or random player update
• Distributed gradient method over a network will also work
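A short worked equation may help connect the formulation above to projection-based updates. It relies on the standard fact (stated here without proof) that for a closed convex set the squared distance function is differentiable with gradient $2(x - \Pi_X[x])$. The stepsize $1/2$ is an illustrative choice and is not stated on the slide.

```latex
\nabla\, \mathrm{dist}^2(x, X_i) = 2\bigl(x - \Pi_{X_i}[x]\bigr),
\qquad
x - \tfrac{1}{2}\,\nabla\, \mathrm{dist}^2(x, X_i)
  = x - \bigl(x - \Pi_{X_i}[x]\bigr)
  = \Pi_{X_i}[x].
```

Under this reading, projecting onto $X_i$ is a gradient step on the i-th term of the objective.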
TU Game over a Dynamic Network
• Players are viewed as nodes in a graph (N, E(t))
• Player j is a neighbor of i at time t if (j, i) ∈ E(t)
• Ni (t) is the set of neighbors of i at time t
• Allocation of player i at time t is $x^i(t)$
• Player i can see the allocations $x^j(t)$ of his neighbors
Distributed bargaining over the network, where every player i updates using its bounding set $X_i$ and the allocations $x^j(t)$ of his neighbors $j \in N_i(t)$:
$$w^i(t+1) = \sum_{j \in N_i(t)} a_{ij}(t)\, x^j(t), \qquad a_{ij}(t) \ge 0, \quad \sum_{j \in N_i(t)} a_{ij}(t) = 1 \qquad \text{(alignment of allocations with neighbors)}$$
$$x^i(t+1) = \Pi_{X_i}[w^i(t+1)] \qquad \text{(gradient step to minimize } \mathrm{dist}^2(x, X_i)\text{)}$$
Convergence of such a scheme will follow from a more general optimization setting∗
∗ A. Nedić, A. Ozdaglar, P.A. Parrilo, “Constrained consensus and optimization in multi-agent networks,” IEEE Trans. on Automatic Control, 55(4):922–938, 2010.
A. Nedić, J. Liu, “On Convergence Rate of Weighted-Averaging Dynamics for Consensus Problems,” under review, 2014.
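The two-step update on this slide (average with neighbors, then project onto the own bounding set) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the projection is computed numerically with Dykstra's algorithm (a swapped-in method, the slides do not say how projections are evaluated), the function names are hypothetical, players are indexed from 0, the "selfish" start is read as each player claiming the whole value, and the game and weight matrices are those of the numerical example later in the deck.

```python
from itertools import combinations

def project_bounding_set(w, i, v, sweeps=50):
    """Approximate Euclidean projection of w onto X_i (Dykstra's algorithm).

    X_i = {x : sum(x) = v_N, sum_{k in S} x_k >= v_S for proper S containing i}.
    """
    n = len(w)
    v_N = v[tuple(range(n))]
    # Step 1: exact projection onto the hyperplane sum(x) = v_N.
    shift = (v_N - sum(w)) / n
    x = [wk + shift for wk in w]
    # Step 2: inside the hyperplane, e_S'x >= v_S reads d'x >= b with
    # d = e_S - |S|/n, so moves along d never leave the hyperplane.
    halfspaces = []
    for size in range(1, n):
        for S in combinations(range(n), size):
            if i in S:
                d = [(1.0 if k in S else 0.0) - size / n for k in range(n)]
                halfspaces.append((d, v[S] - size * v_N / n))
    corrections = [[0.0] * n for _ in halfspaces]  # Dykstra increments
    for _ in range(sweeps):
        for h, (d, b) in enumerate(halfspaces):
            y = [xk + ck for xk, ck in zip(x, corrections[h])]
            gap = b - sum(dk * yk for dk, yk in zip(d, y))
            step = max(gap, 0.0) / sum(dk * dk for dk in d)
            x = [yk + step * dk for yk, dk in zip(y, d)]
            corrections[h] = [yk - xk for yk, xk in zip(y, x)]
    return x

def bargain(v, weights, x0, steps):
    """Iterate w^i = sum_j a_ij x^j, then x^i = projection of w^i onto X_i."""
    n = len(x0)
    x = [list(xi) for xi in x0]
    for t in range(steps):
        A = weights[t % len(weights)]  # a_ij = 0 encodes "j is not a neighbor"
        w = [[sum(A[i][j] * x[j][k] for j in range(n)) for k in range(n)]
             for i in range(n)]
        x = [project_bounding_set(w[i], i, v) for i in range(n)]
    return x

weights = [
    [[1, 0, 0], [0, 0.5, 0.5], [0, 0.5, 0.5]],  # A(0)
    [[0.5, 0, 0.5], [0, 1, 0], [0.5, 0, 0.5]],  # A(1)
    [[0.5, 0.5, 0], [0.5, 0.5, 0], [0, 0, 1]],  # A(2)
]
v_max = {(0,): 7.0, (1,): 3.0, (2,): 0.0,
         (0, 1): 0.0, (0, 2): 0.0, (1, 2): 0.0, (0, 1, 2): 10.0}
x0 = [[10.0, 0.0, 0.0], [0.0, 10.0, 0.0], [0.0, 0.0, 10.0]]  # assumed "selfish" start

x = bargain(v_max, weights, x0, steps=300)
for xi in x:
    print([round(c, 4) for c in xi])
```

For this static game the core is the single allocation (7, 3, 0), so all three players' vectors should end up close to it. Each individual matrix leaves one player isolated; the graphs are connected only over a cycle of three steps.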
Bargaining: What if the players are not honest?
What if the characteristic functions are random?
In some applications (supply chains, network-controlled flows)
• The players may have an incentive to provide false information about the $v_i$'s in an attempt to increase their own allocation values $x_i$
• The characteristic function v may depend on some random demand for a service of the players at any given time
This leads us to consider a dynamic TU game (N, {v(t)}), specified by
• Set N = {1, 2, . . . , n} of players
• Characteristic function v(t) defining the instantaneous game (N, v(t)) at time t
In order to accommodate both situations, we assume that v(t) is random and investigate
• Robust game: when the uncertainty in v(t) is unknown but bounded (in the sense of Assumption 1)
• Averaging game: under an ergodicity assumption on {v(t)} (Assumption 4)
Robust Game
Interested in a bargaining process for dynamic TU game (N, {v(t)}) over a network.
Assumption 1 There exists $v^{\max} \in \mathbb{R}^m$ such that for all $t \ge 0$:
$$v_S(t) \le v_S^{\max} \ \text{ for all } S \subset N, \qquad v_N(t) = v_N^{\max}.$$
The robust TU game is the game $(N, v^{\max})$.
Assumption 2 The core $C(v^{\max})$ of the robust game is nonempty, i.e.,
$$C(v^{\max}) = \{x \in \mathbb{R}^n \mid e_N^\top x = v_N^{\max},\ e_S^\top x \ge v_S^{\max} \text{ for } S \subset N\} \ne \emptyset$$
• Instantaneous game (N, v(t)): the players' bargaining involves time-varying bounding sets
$$X_i(t) = \{x \in \mathbb{R}^n \mid e_N^\top x = v_N^{\max},\ e_S^\top x \ge v_S(t) \text{ for } S \subset N \text{ with } i \in S\}$$
• Bargaining protocol, with arbitrary initial $x^i(0) \in \mathbb{R}^n$:
$$w^i(t+1) = \sum_{j \in N_i(t)} a_{ij}(t)\, x^j(t), \qquad a_{ij}(t) \ge 0, \quad \sum_{j \in N_i(t)} a_{ij}(t) = 1 \qquad \text{(alignment of allocations with neighbors)}$$
$$x^i(t+1) = \Pi_{X_i(t)}[w^i(t+1)] \qquad \text{(gradient step to minimize } \mathrm{dist}^2(x, X_i(t))\text{)}$$
• It is well defined. Does it converge? If it does - where are the limit points?
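One observation that bears on these questions: under Assumption 1, any allocation in the robust core $C(v^{\max})$ satisfies every constraint of every time-varying bounding set $X_i(t)$, because $v_S(t) \le v_S^{\max}$. The sketch below checks this numerically. It is an illustration only: the sampling rule and intervals are taken from the numerical example later in the deck, while the function names, the 0-based indexing, and the seed are assumptions.

```python
import random
from itertools import combinations

# Upper and lower interval ends of the robust game in the numerical example.
V_MAX = {(0,): 7.0, (1,): 3.0, (2,): 0.0,
         (0, 1): 0.0, (0, 2): 0.0, (1, 2): 0.0, (0, 1, 2): 10.0}
V_MIN = {**V_MAX, (0,): 4.0, (1,): 0.0}

def sample_game(rng):
    """Coin flip: heads gives a uniform draw from the intervals, tails gives v^max."""
    if rng.random() < 0.5:
        return {S: rng.uniform(V_MIN[S], V_MAX[S]) for S in V_MAX}
    return dict(V_MAX)

def in_bounding_set(x, i, v, v_N, tol=1e-9):
    """Membership test for X_i(t) = {x : sum(x) = v_N, e_S'x >= v_S(t), i in S}."""
    n = len(x)
    if abs(sum(x) - v_N) > tol:
        return False
    for size in range(1, n):
        for S in combinations(range(n), size):
            if i in S and sum(x[k] for k in S) < v[S] - tol:
                return False
    return True

rng = random.Random(0)           # fixed seed, chosen arbitrarily
x_star = [7.0, 3.0, 0.0]         # the allocation in C(v^max) for this example
always_inside = True
for t in range(1000):
    v_t = sample_game(rng)       # instantaneous game (N, v(t))
    for i in range(3):
        if not in_bounding_set(x_star, i, v_t, V_MAX[(0, 1, 2)]):
            always_inside = False

print(always_inside)  # True
```

This only shows that points of the robust core are feasible for every instantaneous bounding set; it does not by itself settle convergence.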
Impact of the Network Connectivity Graphs
Assumption 3 Assume that the graph (N, E(t)) is strongly connected†. Also, assume that $a_{ij}(t) \ge 0$ and $\sum_{j \in N_i(t)} a_{ij}(t) = 1$ for all i and t. In addition, there exists an $\alpha > 0$ such that $a_{ii}(t) \ge \alpha$ for all t and $a_{ij}(t) \ge \alpha$ whenever $a_{ij}(t) > 0$.
• If only averaging
$$w^i(t+1) = \sum_{j \in N_i(t)} a_{ij}(t)\, w^j(t), \qquad a_{ij}(t) \ge 0, \quad \sum_{j \in N_i(t)} a_{ij}(t) = 1 \qquad \text{(alignment of allocations with neighbors)}$$
• Semi-linear dynamics
$$w(t+1) = A(t)\, w(t),$$
where $[A(t)]_{ij} = a_{ij}(t)$, with zero entries when $j \notin N_i(t)$.
• Under Assumption 3, such dynamics converge at a geometric rate
• The limit point $w^*$ is of the form $w_1^* = \cdots = w_n^*$
† Not critical. Strong connectivity over a period of time will work.
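The pure averaging dynamics above can be tried on a small case. The sketch below is an assumption-laden illustration: it uses the three weight matrices from the numerical example later in the deck, cycled periodically, with scalar states and hypothetical initial values. Each single graph leaves one node isolated, so this is the situation of the footnote (connectivity over a period of time) rather than of Assumption 3 as stated.

```python
def average_step(A, w):
    """One step w(t+1) = A(t) w(t) for a list of scalar states w."""
    n = len(w)
    return [sum(A[i][j] * w[j] for j in range(n)) for i in range(n)]

weights = [
    [[1, 0, 0], [0, 0.5, 0.5], [0, 0.5, 0.5]],  # A(0)
    [[0.5, 0, 0.5], [0, 1, 0], [0.5, 0, 0.5]],  # A(1)
    [[0.5, 0.5, 0], [0.5, 0.5, 0], [0, 0, 1]],  # A(2)
]

w = [9.0, 3.0, 0.0]  # hypothetical initial values, mean 4.0
for t in range(60):
    w = average_step(weights[t % 3], w)

print(w)  # all three entries agree (up to rounding) on the initial mean 4.0
```

Because these particular matrices are doubly stochastic, the common limit is the average of the initial values; with weights that are only row stochastic, consensus is still reached but the limit is in general a different weighted combination.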
Bargaining Dynamic
$$w^i(t+1) = \sum_{j \in N_i(t)} a_{ij}(t)\, x^j(t), \qquad a_{ij}(t) \ge 0, \quad \sum_{j \in N_i(t)} a_{ij}(t) = 1 \qquad \text{(alignment of allocations with neighbors)}$$
$$x^i(t+1) = \Pi_{X_i(t)}[w^i(t+1)] \qquad \text{(gradient step to minimize } \mathrm{dist}^2(x, X_i(t))\text{)}$$
• Isolate the “linear” part
$$x^i(t+1) = w^i(t+1) + \underbrace{\Pi_{X_i(t)}[w^i(t+1)] - w^i(t+1)}_{\text{nonlinear error: } e^i(t)}$$
• Define $A(t)$ by $[A(t)]_{ij} = a_{ij}(t)$, with $a_{ij}(t) = 0$ when $j \notin N_i(t)$
• Write it as a “perturbed” semi-linear dynamic
$$x(t+1) = A(t)\, x(t) + e(t)$$
• Under Assumption 3, the convergence of such dynamics will depend on the behavior of e(t)
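The size of the perturbation has a direct geometric meaning, which follows from the definitions of the projection and of the distance function; the identity below is spelled out only as a reading aid.

```latex
\|e^i(t)\|
  = \bigl\|\Pi_{X_i(t)}[w^i(t+1)] - w^i(t+1)\bigr\|
  = \mathrm{dist}\bigl(w^i(t+1),\, X_i(t)\bigr).
```

So the error term of player i vanishes exactly when the averaged vector $w^i(t+1)$ already lies in the bounding set $X_i(t)$.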
Convergence to the Core of the Robust Game
Let Assumptions 1–3 hold. Also, assume that
$$\mathrm{Prob}\{v(t) = v^{\max} \text{ infinitely often}\} = 1.$$
Then, the bargaining protocol converges to a (random) allocation in the core $C(v^{\max})$ with probability 1.
• Consider $y(t) = \frac{1}{n} \sum_{i=1}^n x^i(t)$
• $\|y(t) - x^i(t)\| \to 0$ for all i (non-emptiness of the core $C(v^{\max})$)
• $\{\|y(t) - x\|\}$ is convergent w.p. 1 for any $x \in C(v^{\max})$
• Critical observation: the bounding sets (and the core) of the instantaneous game have the same “normals”
$$X_i(t) = \{x \in \mathbb{R}^n \mid e_N^\top x = v_N^{\max},\ e_S^\top x \ge v_S(t) \text{ for } S \subset N \text{ with } i \in S\}$$
As a consequence (by Hoffman's bound), for some constant $\mu > 0$,
$$\mathrm{dist}^2(y(t+1), C(v(t))) \le \mu \sum_{i=1}^n \mathrm{dist}^2(y(t+1), X_i(t))$$
Dynamic Average Game
Consider dynamic TU game (N, {v(t)})
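• Define
$$\bar v(t) = \frac{1}{t+1} \sum_{k=0}^{t} v(k) \quad \text{for all } t \ge 0$$
• Average instantaneous game $(N, \bar v(t))$
• Bounding sets
$$\bar X_i(t) = \{x \in \mathbb{R}^n \mid e_N^\top x = \bar v_N(t),\ e_S^\top x \ge \bar v_S(t) \text{ for } S \subset N \text{ with } i \in S\}$$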
• Bargaining protocol:
$$w^i(t+1) = \sum_{j \in N_i(t)} a_{ij}(t)\, x^j(t), \qquad a_{ij}(t) \ge 0, \quad \sum_{j \in N_i(t)} a_{ij}(t) = 1 \qquad \text{(alignment of allocations with neighbors)}$$
$$x^i(t+1) = \Pi_{\bar X_i(t)}[w^i(t+1)] \qquad \text{(gradient step to minimize } \mathrm{dist}^2(x, \bar X_i(t))\text{)}$$
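The running average that defines the average instantaneous game does not require storing the whole history of v(k); it can be updated recursively. The sketch below shows one way to do this. The helper name and the sample vectors are hypothetical, and each vector lists only three coalition values for brevity.

```python
def update_average(v_bar, v_new, t):
    """Given v_bar = average of v(0..t), return the average of v(0..t+1)."""
    return [vb + (vn - vb) / (t + 2) for vb, vn in zip(v_bar, v_new)]

# Hypothetical observations of (v_{1}, v_{2}, v_N) at times 0, 1, 2, 3.
samples = [[4.0, 0.0, 10.0],
           [9.0, 5.0, 10.0],
           [5.0, 1.0, 10.0],
           [6.0, 2.0, 10.0]]

v_bar = samples[0]                        # average at t = 0 is v(0) itself
for t in range(len(samples) - 1):
    v_bar = update_average(v_bar, samples[t + 1], t)

batch = [sum(col) / len(samples) for col in zip(*samples)]
print(v_bar)   # [6.0, 2.0, 10.0]
print(batch)   # [6.0, 2.0, 10.0]
```

The recursion is just the definition rearranged: the new average weights the old one by (t+1)/(t+2) and the new sample by 1/(t+2).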
Average TU Game
Assumption 4 With probability 1, we have
$$\lim_{t \to \infty} \bar v(t) = v^{\mathrm{mean}}, \qquad v_N(t) = v_N^{\mathrm{mean}} \ \text{ for all } t \ge 0$$
• Average game $(N, v^{\mathrm{mean}})$ with the core $C(v^{\mathrm{mean}})$:
$$C(v^{\mathrm{mean}}) = \{x \in \mathbb{R}^n \mid e_N^\top x = v_N^{\mathrm{mean}},\ e_S^\top x \ge v_S^{\mathrm{mean}} \text{ for } S \subset N\} \ne \emptyset$$
Let Assumptions 3 and 4 hold. Assume also that $\dim C(v^{\mathrm{mean}}) = n - 1$. Then, the bargaining process converges to a (random) allocation in the core $C(v^{\mathrm{mean}})$ of the average game with probability 1.
Proof Sketch
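• Consider $y(t) = \frac{1}{n} \sum_{i=1}^n x^i(t)$
• The assumption that $\dim C(v^{\mathrm{mean}}) = n - 1$ implies: for every $z \in \mathrm{relint}\, C(v^{\mathrm{mean}})$ there is $t_z$ large enough so that $z \in C(\bar v(t))$ for all $t \ge t_z$ with probability 1
• The preceding helps establish
• $\{\|y(t) - z\|\}$ is convergent w.p. 1 for any $z \in \mathrm{relint}\, C(v^{\mathrm{mean}})$
• $\|y(t) - x^i(t)\| \to 0$ for all i
$\implies \mathrm{dist}(y(t+1), \bar X_i(t)) \to 0$
• Bounding sets of the instantaneous average game have the same “normals”
$$\bar X_i(t) = \{x \in \mathbb{R}^n \mid e_N^\top x = v_N^{\mathrm{mean}},\ e_S^\top x \ge \bar v_S(t) \text{ for } S \subset N \text{ with } i \in S\}$$
As a consequence, for some $L_i > 0$, w.p. 1, for any x, any i and all t,
$$\mathrm{dist}(x, \bar X_i) \le \mathrm{dist}(x, \bar X_i(t)) + L_i \|\bar v(t) - v^{\mathrm{mean}}\|,$$
where $\bar X_i$ is the bounding set of player i for the average game $(N, v^{\mathrm{mean}})$.
• This yields $\mathrm{dist}(y(t+1), \bar X_i) \to 0$ for all i w.p. 1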
Numerical Examples
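[Figure: three graphs on the nodes v1, v2, v3, panels (a), (b), (c)]
$$A(0) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \tfrac{1}{2} & \tfrac{1}{2} \\ 0 & \tfrac{1}{2} & \tfrac{1}{2} \end{bmatrix}, \qquad
A(1) = \begin{bmatrix} \tfrac{1}{2} & 0 & \tfrac{1}{2} \\ 0 & 1 & 0 \\ \tfrac{1}{2} & 0 & \tfrac{1}{2} \end{bmatrix}, \qquad
A(2) = \begin{bmatrix} \tfrac{1}{2} & \tfrac{1}{2} & 0 \\ \tfrac{1}{2} & \tfrac{1}{2} & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$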
• Robust game: flip a fair coin; if heads, draw v(t) from the uniform distribution over the given intervals; otherwise set $v(t) = v^{\max}$
• For the robust game, $v^{\max} = (7, 3, 0, 0, 0, 0, 10)$, so the core is $C(v^{\max}) = \{(7, 3, 0)\}$
• Average game: we always use the uniform distribution over the given intervals
Coalition value | Robust game | Average game
v{1} | [4, 7] | [4, 9]
v{2} | [0, 3] | [0, 5]
v{3} | 0 | 0
v{1,2} | 0 | 0
v{1,3} | 0 | 0
v{2,3} | 0 | 0
v{1,2,3} | 10 | 10
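For the average game in this table, the limiting game and its core can be worked out by hand. The computation below assumes the draws are independent over time, so that the time averages converge to the interval midpoints; independence is an assumption here, the slides only say that the uniform distribution is always used.

```latex
v^{\mathrm{mean}} = \Bigl(\tfrac{4+9}{2},\ \tfrac{0+5}{2},\ 0,\ 0,\ 0,\ 0,\ 10\Bigr)
                  = (6.5,\ 2.5,\ 0,\ 0,\ 0,\ 0,\ 10),
\\[4pt]
C(v^{\mathrm{mean}}) = \{x \in \mathbb{R}^3 \mid x_1 + x_2 + x_3 = 10,\ x_1 \ge 6.5,\ x_2 \ge 2.5,\ x_3 \ge 0\}
                     = \mathrm{conv}\{(7.5, 2.5, 0),\ (6.5, 3.5, 0),\ (6.5, 2.5, 1)\}.
```

The two-player coalition constraints are implied by the others. The core is a triangle, so its dimension is 2 = n - 1, which matches the dimension assumption of the convergence result for the average game. By contrast, the robust game's core is a single point.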
Figure 1: Robust Game Results blue - player 1, green - player 2,
Initial Allocation: selfish
Figure 2: Average Game Results blue - player 1, green - player 2,
Initial Allocation: selfish
Conclusion
• Considered dynamic TU games over networks: dynamic both in the game and in the players' network
• Main assumption: the grand coalition is stable for some well-defined “limiting game”
• Bargaining protocols converge to an allocation in the core of the limiting game
Future Directions
• Other dynamic games such as zero-sum games, potential games etc.
• Framework and algorithms needed