Graph-Based Strategies for Multi-player Pursuit Evasion Games
Dongxu Li, Member, IEEE, and Jose B. Cruz, Jr., Life Fellow, IEEE
Abstract— Maximization of the second smallest eigenvalue of
the graph Laplacian has recently been studied in the field of
cooperative control. Instead of the second smallest eigenvalue,
we design a gradient-based control law for multiple agents
to maximize an arbitrary nonzero eigenvalue. The gradient
of an eigenvalue is derived through a standard sensitivity
analysis. Furthermore, connections are drawn between the
connectivity control and Pursuit-Evasion (PE) problems with
multiple players. Gradient-based strategies are designed and
the performance is verified by simulations. A comparison with
the previously designed suboptimal strategy is provided. This is
a preliminary study of a graph-theoretic approach to multi-player PE problems.
I. INTRODUCTION
Cooperative control of multiple Autonomous Vehicles (AVs) or agents has become an active research area, covering consensus, formation control, tracking, and related problems [1]–[3]. One interesting approach is to use dynamic graphs to model mobile agents such that a desirable group behavior can be translated into a certain property of the underlying graph [4], [5]. The second smallest eigenvalue of the graph Laplacian, also called the algebraic connectivity [6], turns out to be an important factor in many networked systems [1], [7]; in particular, it is related to the stability and robustness of the system [7], [8]. Motivated by these observations, Kim and Mesbahi studied the problem of finding a graph with the maximum second smallest eigenvalue of the graph Laplacian [5]. They proposed an iterative algorithm based on Semi-Definite Programming (SDP), in which at each step a direction towards a local maximum of the eigenvalue is searched over the graph space.
Another area of research relevant to AV applications that has recently drawn much attention [9], [10] is Pursuit-Evasion (PE) games involving multiple players. The motivation mainly comes from military scenarios in which a team of AVs is tasked to track or strike a group of mobile targets. In [9], Hespanha et al. formulated PE games in discrete time under a probabilistic framework, in which greedy and one-step Nash equilibrium strategies are derived, respectively. The system structure and implementation issues are discussed in [10]. General multi-player PE differential games are studied in [11], [12]. In [11], a suboptimal solution is obtained by a hierarchical decomposition method, and it is further generalized as a class of “structured” suboptimal methods [12]. In [12], optimization based on limited look-ahead is used to improve the suboptimal solutions iteratively. The performance enhancement by limited look-ahead is further analyzed in [13].
This work was supported by the Collaborative Center of Control Science at the Ohio State University under Grant F33615-01-2-3154 from the Air Force Research Laboratory (AFRL/VA) and the Air Force Office of Scientific Research (AFOSR).
D. Li and J.B. Cruz are with the Department of Electrical and Computer Engineering, The Ohio State University, 205 Dreese Lab, 2015 Neil Ave, Columbus, OH 43202, USA; Email: [email protected], [email protected].
Rather than focusing only on the second smallest eigenvalue of the graph Laplacian, we generalize the concept of connectivity and show that the generalization corresponds to all the nonzero eigenvalues (the spectrum). We derive the gradient of an arbitrary nonzero eigenvalue, based on which the control of the agents is designed to maximize that eigenvalue. Moreover, inspired by the connectivity control of state-dependent graphs that are constructed from physical proximity relations such as communication, sensing and pairwise distance, we study the potential application of the graph-based control law to multi-player PE problems. A bipartite graph model is constructed and simulation results are presented.
The paper is organized as follows. In Section II, we derive the gradient of an arbitrary nonzero eigenvalue of the graph Laplacian using a standard sensitivity analysis, and design a gradient-based control law to maximize that eigenvalue. In Section III, we first model multi-player PE problems by a bipartite graph, based on which graph-based strategies for the players are designed. The performance of the approach is demonstrated by simulation. Concluding remarks are presented in Section IV with suggestions for future work.
II. ON MAXIMIZATION OF THE SPECTRUM OF STATE-DEPENDENT GRAPH LAPLACIAN
A. Dynamic Graph Model
Consider $N$ mobile agents in $\mathbb{R}^n$. Denote by $x_i \in \mathbb{R}^n$ the position of agent $i$. The dynamics of agent $i$ are given by
$$\dot{x}_{it} = f_i(x_{it}, u_{it}),$$
where $u_i \in \mathbb{R}^m$ is the control input. Let $x = [x_1^T, \cdots, x_N^T]^T$ be the $nN \times 1$ vector obtained by stacking $x_i$ for each $i = 1, \cdots, N$, and similarly $u = [u_1^T, \cdots, u_N^T]^T$. In vector notation, we write the aggregate dynamics for all $N$ agents as
$$\dot{x}_t = f(x_t, u_t). \tag{1}$$
Suppose that each agent is associated with a vertex of a state-dependent graph $G_t = (V_t, E_t)$, where the vertex set is $V_t = \{x_{1t}, \cdots, x_{Nt}\}$. An edge $e_{ij} \in E_t$ ($i \neq j$) of the graph depends on a certain proximity relation between agents $i$ and $j$, which in practice may be related to sensing or communication. Let $w : \mathbb{R}_{\geq 0} \mapsto \mathbb{R}_{\geq 0}$ be a piecewise differentiable function, where $\mathbb{R}_{\geq 0} \triangleq \{r \in \mathbb{R}, r \geq 0\}$. A weight $a_{ij}$ is assigned to each $e_{ij} \in E_t$ as
$$a_{ij} = w(\|x_i - x_j\|), \tag{2}$$
where $\|\cdot\|$ is the standard Euclidean norm. A generalized adjacency matrix $A(x)$ of $G_t$ can be defined elementwise as $[A(x)]_{ij} = a_{ij}$. Then, the Laplacian matrix $L_G(x)$ of $G_t$ is
$$L_G(x) = \Delta(x) - A(x), \tag{3}$$
where $\Delta(x)$ is a diagonal matrix with $[\Delta(x)]_{ii} = \sum_{j, j \neq i} a_{ij}$.
The graph Laplacian $L_G(x)$ is symmetric and positive semidefinite [6]. Given $y = [y_1, \cdots, y_N]^T \in \mathbb{R}^N$, it satisfies
$$y^T L_G(x) y = \sum_{e_{ij} \in E,\, i<j} a_{ij} (y_i - y_j)^2. \tag{4}$$
Suppose that the spectrum of $L_G(x)$ is arranged in order as
$$0 = \lambda_1 \leq \lambda_2 \leq \cdots \leq \lambda_N.$$
Note that $L_G(x)$ is singular, and by (4), the vector $e = [1, \cdots, 1]^T$ belongs to the null space of $L_G(x)$. The second smallest eigenvalue $\lambda_2(L_G)$ of $L_G(x)$ is referred to as the algebraic connectivity [6], which represents a certain aspect of the connections among the vertices in $G$. In general, the bigger $\lambda_2(L_G)$, the better the agents are connected.
B. On Maximization of Spectrum of $L_G(x)$
In the literature, $\lambda_2(L_G)$ is used as a guide for controlling mobile agents to maximize the connectivity among them. In [5], the optimization of $\lambda_2$ is performed by an iterative method based on SDP. An alternative iterative algorithm based on the supergradient method has been considered for decentralization in [4]. The convergence of both methods is local in nature, and their formulations dictate that they are only applicable to the optimization of $\lambda_2(L_G)$. Under the same framework, we generalize the optimization of $\lambda_2$ to an arbitrary nonzero eigenvalue of the graph Laplacian by a gradient method with the aid of a sensitivity analysis. We first derive the gradient of a nonzero eigenvalue as follows.
Theorem 1: Let $\lambda_k(x)$ and $v_k$ be the $k$th ($2 \leq k \leq N$) eigenvalue and the corresponding normalized eigenvector of $L_G(x)$ in (3). Suppose that $x_i \neq x_j$ for $i \neq j$. The gradient $\nabla_{x_i} \lambda_k(x)$ with respect to $x_i$ is
$$\nabla_{x_i} \lambda_k(x) = \sum_{j, j \neq i} v_{ki} \left( v_{ki} - 2 v_{kj} \right) w'_{\|x_i - x_j\|} \frac{x_i - x_j}{\|x_i - x_j\|}. \tag{5}$$
Proof: We use a standard sensitivity analysis. Let $dx_i$ be a small perturbation of $x_i$. By the definition of $L_G$ in (3), the perturbation in $L_G(x)$ induced by $dx_i$ is
$$\Delta L_G = \begin{bmatrix} 0 & \cdots & dL_{1i} & \cdots & 0 \\ \vdots & & \vdots & & \vdots \\ dL_{i1} & \cdots & dL_{ii} & \cdots & dL_{iN} \\ \vdots & & \vdots & & \vdots \\ 0 & \cdots & dL_{Ni} & \cdots & 0 \end{bmatrix}. \tag{6}$$
Here, due to the definitions in (2) and (3), the only nonzero elements in $\Delta L_G(x)$ are in the $i$th row or the $i$th column, which satisfy
$$dL_{ij} = dL_{ji} = -w'_{\|x_i - x_j\|} \frac{(x_i - x_j)^T}{\|x_i - x_j\|} dx_i \ \text{ for } i \neq j; \qquad dL_{ii} = \sum_{j, j \neq i} w'_{\|x_i - x_j\|} \frac{(x_i - x_j)^T}{\|x_i - x_j\|} dx_i. \tag{7}$$
Here, $w'_{\|x_i - x_j\|}$ is the derivative of $w$ evaluated at $\|x_i - x_j\|$. Denote by $\lambda_k(x) + d\lambda_k$ and $v_k + dv_k$ the $k$th eigenvalue and eigenvector of $L_G(x + dx)$, respectively; namely,
$$(L_G(x) + \Delta L_G)(v_k + dv_k) = (\lambda_k(x) + d\lambda_k)(v_k + dv_k).$$
Note that $\lambda_k(x)$ and $v_k$ are associated with $L_G(x)$. Then,
$$L_G(x) v_k + L_G(x) dv_k + \Delta L_G v_k + \Delta L_G dv_k = \lambda_k(x) v_k + \lambda_k(x) dv_k + d\lambda_k v_k + d\lambda_k dv_k.$$
Note that $L_G(x) v_k = \lambda_k(x) v_k$. Multiplying both sides from the left by $v_k^T$, we obtain
$$v_k^T L_G(x) dv_k + v_k^T \Delta L_G v_k + v_k^T \Delta L_G dv_k = v_k^T \lambda_k(x) dv_k + v_k^T d\lambda_k v_k + v_k^T d\lambda_k dv_k.$$
Since $v_k^T L_G(x) = v_k^T \lambda_k(x)$,
$$d\lambda_k = \frac{v_k^T \Delta L_G v_k + o(dx_i)}{v_k^T v_k}, \tag{8}$$
where $o(dx_i)$ includes those terms that satisfy $\lim_{\|dx_i\| \to 0} o(dx_i) = 0$. By inspection of (6) and (7),
$$v_k^T \Delta L_G v_k = dL_{ii} v_{ki}^2 + 2 \sum_{j, j \neq i} dL_{ij} v_{ki} v_{kj} = \sum_{j, j \neq i} v_{ki} \left( v_{ki} - 2 v_{kj} \right) w'_{\|x_i - x_j\|} \frac{(x_i - x_j)^T}{\|x_i - x_j\|} dx_i.$$
Note that $v_k$ is normalized, i.e., $v_k^T v_k = 1$. Let $\|dx_i\| \to 0$, and by (8), we obtain
$$\nabla_{x_i} \lambda_k(x) = \sum_{j, j \neq i} v_{ki} \left( v_{ki} - 2 v_{kj} \right) w'_{\|x_i - x_j\|} \frac{x_i - x_j}{\|x_i - x_j\|}.$$
Remark 1: It turns out that the supergradient of λ2 derived
in [4] is essentially the gradient indicated in (5).
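The sensitivity relation (8) can be checked numerically. The following is a minimal sketch, assuming a smooth weight $w(d) = e^{-d}$ chosen only for illustration (it is not the weight used in the simulations of this paper) and hypothetical helper names `laplacian` and `first_order_change`. It builds $L_G(x)$ from (2) and (3), perturbs one agent, and compares the actual change of $\lambda_k$ with the first-order prediction $v_k^T \Delta L_G v_k$, where $\Delta L_G$ is formed directly as $L_G(x + dx) - L_G(x)$ instead of from closed-form entries.

```python
import numpy as np

def laplacian(x, w):
    """State-dependent Laplacian L_G(x) = Delta(x) - A(x), following (2)-(3).

    x : (N, n) array of agent positions; w : weight as a function of distance.
    """
    dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    A = w(dist)
    np.fill_diagonal(A, 0.0)            # no self-loops
    return np.diag(A.sum(axis=1)) - A

def first_order_change(x, dx, k, w):
    """Return (actual, predicted) change of the k-th eigenvalue (k counted from 1)."""
    L0 = laplacian(x, w)
    L1 = laplacian(x + dx, w)
    lam0, V = np.linalg.eigh(L0)        # ascending eigenvalues, orthonormal columns
    lam1 = np.linalg.eigvalsh(L1)
    v = V[:, k - 1]
    predicted = v @ (L1 - L0) @ v       # v_k^T (Delta L_G) v_k with v_k^T v_k = 1
    return lam1[k - 1] - lam0[k - 1], predicted

# Illustrative smooth weight (an assumption, not the paper's choice).
w_smooth = lambda d: np.exp(-d)

# Five agents in the plane at hand-picked generic positions.
x = np.array([[0.0, 0.0], [1.0, 0.2], [2.1, 1.0], [0.4, 1.7], [1.5, 2.6]])
dx = np.zeros_like(x)
dx[0] = 1e-5 * np.array([1.0, -0.5])    # perturb agent 1 only

for k in (2, 3, 5):
    actual, predicted = first_order_change(x, dx, k, w_smooth)
    print(k, actual, predicted)
```

The two printed values for each $k$ should agree up to terms of second order in the perturbation, provided $\lambda_k$ is a simple eigenvalue at the chosen configuration.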
We design a gradient-based cooperative control law for the agents. Define $\nabla_x \lambda_k(x) = [\nabla_{x_1} \lambda_k^T(x), \cdots, \nabla_{x_N} \lambda_k^T(x)]^T$. Then, $\dot{\lambda}_k = \nabla_x \lambda_k \cdot f(x, u)$. Clearly, the (cooperative) control $u(x)$ that drives the agents to a formation with a locally maximal $\lambda_k(x)$ can be determined as
$$u(x) = \arg\max_{u} \left\{ \nabla_x \lambda_k^T(x) f(x, u) \right\}. \tag{9}$$
C. Examples of Maximization of Spectrum of $L_G(x)$
We verify the control $u(x)$ in (9) by simulating a team of agents in $\mathbb{R}^2$. We first consider an example similar to the one adopted in [5]. The dynamics of agent $i$ ($1 \leq i \leq N$) are
$$\dot{x}_i = u_i,$$
where $x_i \in \mathbb{R}^2$, $u_i \in \mathbb{R}^2$ with $\|u_i\| = 1$. Consider the function $w$ as
$$w(x) = \begin{cases} 0, & x > R \\ \epsilon^{\frac{x - r}{R - r}}, & r < x \leq R \\ 1, & x \leq r \end{cases}$$
with $\epsilon = 0.1$, $r = 1.5$ and $R = 3$. According to (9), the optimal control becomes $u(x) = \nabla_x \lambda_k^T(x) / \|\nabla_x \lambda_k^T(x)\|$.
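A minimal sketch of the control in this example, assuming single-integrator agents and the parameters quoted above. The gradient is approximated by central finite differences rather than by the closed form (5), the function names `w_piecewise`, `laplacian`, `lam_k` and `control` are hypothetical, and each agent's velocity is normalized separately so that $\|u_i\| = 1$, which is one possible reading of the speed constraint.

```python
import numpy as np

EPS, r, R = 0.1, 1.5, 3.0   # epsilon, r and R as quoted in the text

def w_piecewise(d):
    """Weight: 1 for d <= r, EPS**((d - r)/(R - r)) for r < d <= R, 0 beyond R."""
    d = np.asarray(d, dtype=float)
    mid = EPS ** ((np.clip(d, r, R) - r) / (R - r))
    return np.where(d <= r, 1.0, np.where(d <= R, mid, 0.0))

def laplacian(x):
    dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    A = w_piecewise(dist)
    np.fill_diagonal(A, 0.0)
    return np.diag(A.sum(axis=1)) - A

def lam_k(x, k):
    """k-th smallest eigenvalue of L_G(x), with k counted from 1."""
    return np.linalg.eigvalsh(laplacian(x))[k - 1]

def control(x, k, h=1e-6):
    """Unit-speed velocities along a finite-difference gradient of lambda_k."""
    grad = np.zeros_like(x)
    for i in range(x.shape[0]):
        for a in range(x.shape[1]):
            e = np.zeros_like(x)
            e[i, a] = h
            grad[i, a] = (lam_k(x + e, k) - lam_k(x - e, k)) / (2 * h)
    norms = np.linalg.norm(grad, axis=1, keepdims=True)
    # Agents with a vanishing gradient stay put instead of dividing by zero.
    return np.divide(grad, norms, out=np.zeros_like(grad), where=norms > 1e-12)

# Four agents in general position (illustrative coordinates).
x = np.array([[0.0, 0.0], [2.0, 0.3], [3.9, -0.2], [1.8, 2.2]])
u = control(x, k=2)
x_next = x + 0.01 * u       # one Euler step of x_dot = u
print(lam_k(x, 2), lam_k(x_next, 2))
```

Because $w$ is only piecewise differentiable, the finite-difference gradient is meaningful only away from the breakpoints $r$ and $R$; the illustrative coordinates were chosen with that in mind.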
As in [5], collisions between any two agents should be avoided. We assume that the distance between any two agents must be greater than $r_s = 1$. To enforce this constraint, we design the following safety strategy. For each agent $i$, define the set $S_i = \{j : j \neq i \text{ and } \|x_i - x_j\| \leq r_s\}$. If $S_i \neq \emptyset$, the control of agent $i$ is chosen as
$$u_i = \sum_{j \in S_i} (x_i - x_j) \Big/ \Big\| \sum_{j \in S_i} (x_i - x_j) \Big\|.$$
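The safety strategy can be sketched as follows. This is a minimal illustration with a hypothetical function name `safety_control`; returning `None` when $S_i$ is empty, and when the summed offsets cancel exactly, is a convention assumed here and not specified in the text.

```python
import numpy as np

def safety_control(x, i, r_s=1.0):
    """Unit repulsion velocity for agent i, or None if no override applies.

    S_i = { j != i : ||x_i - x_j|| <= r_s }, as defined in the text.
    """
    diff = x[i] - x                     # row j holds x_i - x_j
    dist = np.linalg.norm(diff, axis=1)
    in_S = dist <= r_s
    in_S[i] = False                     # exclude j = i
    if not in_S.any():
        return None                     # S_i empty: keep the gradient control (9)
    s = diff[in_S].sum(axis=0)
    norm = np.linalg.norm(s)
    if norm == 0.0:
        return None                     # offsets cancel; handling left to the caller
    return s / norm

x = np.array([[0.0, 0.0], [0.6, 0.0], [0.0, 0.8], [3.0, 3.0]])
print(safety_control(x, 0))   # points away from agents 1 and 2
print(safety_control(x, 3))   # None: agent 3 has no neighbor within r_s
```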
Furthermore, if the positions of the agents are degenerate, i.e., $\mathrm{Rank}([x_{10}, \cdots, x_{N0}]) < 2$, the agents will remain in the subspace spanned by $[x_1, \cdots, x_N]$ under $u(x)$ in (9). This is certainly not desirable when maximum connectivity of the agents is of interest. To avoid this situation, we apply a random control to the agents whenever the positions are degenerate.
Denote by $u_k(x)$ the control $u(x)$ designed above associated with $\lambda_k$. Fig. 1 depicts the behavior of six mobile agents under $u_2(x)$, in which the agents start from an initial configuration on a straight line with $\lambda_2 = 0.7977$ and reach a stable formation with $\lambda_2 = 3.9582$ in the end. Similar results have been obtained in [5]. Fig. 2 illustrates the trajectories of the agents under $u_6(x)$, where $\lambda_6$ changes from 4.16 to 4.66.
Comparing Fig. 2 with Fig. 1, the final configuration reached by the agents under $u_2$ is clearly better connected than that under $u_6$. In the final configurations, all the eigenvalues of $L_G$ under $u_2$ are greater than those under $u_6$. Equation (4) implies that bigger eigenvalues imply bigger entries $a_{ij}$ in the adjacency matrix, and in turn better connectivity among the agents.
The reason that $u_2$ outperforms $u_6$ (in fact $u_k$ for $k > 2$) in improving connectivity is that $u_2$ aims to improve the “weakest” links among the agents. By (4),
$$\lambda_2(x) = \min_{y \in \mathbb{R}^N} \Big\{ \sum_{e_{ij} \in E,\, i<j} a_{ij} (y_i - y_j)^2 \Big\}, \tag{10}$$
subject to $e^T y = 0$ and $y^T y = 1$. Equation (10) attains its minimum at the eigenvector $v_2$, in which $|v_{2i} - v_{2j}|$ tends to be small when $a_{ij}$ is large, and large when $a_{ij}$ is small. Moreover, $u_2$ is designed to (locally) maximize $\lambda_2$, i.e.,
$$d\lambda_2(x) = \max_{u \in \mathbb{R}^{2N}} \Big\{ \sum_{e_{ij} \in E,\, i<j} da_{ij} (v_{2i} - v_{2j})^2 \Big\}, \tag{11}$$
subject to $da_{ij} = dw(\|x_i - x_j\|)$ and $dx = u\,dt$. By (10) and (11), the agents move cooperatively such that the “weakest” links are strengthened under $u_2$. On the contrary, the control $u_6$ focuses on the edges between agents that are already closely connected. The agents may be trapped more easily in a formation associated with a local maximum when a safety distance is required. In addition, an increase in $\lambda_2$ is relevant to an increase in the other $\lambda_k$ for $k > 2$, while optimization of $\lambda_n$ for large $n$ tends to be myopic on stronger links and may not be relevant to $\lambda_k$ for $k < n$.
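The role of $v_2$ in (10) can be seen on a small example. The sketch below uses a hand-picked weighted graph, which is an illustrative assumption and not one of the simulated formations: two tightly connected triangles joined by a single weak edge. It evaluates the quadratic form (4) at $v_2$ and compares the difference of the entries of $v_2$ across the weak edge with the difference inside a triangle.

```python
import numpy as np

# Two triangles {0,1,2} and {3,4,5} with strong internal weights,
# joined by a single weak edge between vertices 2 and 3.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    A[i, j] = A[j, i] = 1.0
A[2, 3] = A[3, 2] = 0.05

L = np.diag(A.sum(axis=1)) - A
lam, V = np.linalg.eigh(L)
v2 = V[:, 1]                            # eigenvector attaining the minimum in (10)

def quad_form(y):
    """Right-hand side of (4): sum over edges of a_ij (y_i - y_j)^2."""
    i, j = np.triu_indices(6, k=1)
    return np.sum(A[i, j] * (y[i] - y[j]) ** 2)

print("lambda_2              :", lam[1])
print("quadratic form at v2  :", quad_form(v2))
print("gap across weak edge  :", abs(v2[2] - v2[3]))
print("gap inside a triangle :", abs(v2[0] - v2[1]))
```

The entries of $v_2$ take opposite signs on the two triangles, so the quadratic form is dominated by the weak edge, which is the link that a control maximizing $\lambda_2$ would strengthen.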
According to (4), each nonzero eigenvalue $\lambda_k$ ($k = 2, \cdots, N$) of $L_G$ measures a certain aspect of the connectivity of graph $G$. We call any nonzero eigenvalue of the graph Laplacian a “generalized connectivity”. In general, bigger $a_{ij}$'s lead to larger $\lambda_k$; and
$$\sum_{k=2}^{N} \lambda_k = 2 \sum_{e_{ij} \in E,\, i<j} a_{ij}.$$
Under $u_k$ ($k > 2$), the agents may evolve into a formation with several well-connected parts (subgraphs), as shown in Fig. 2.
Fig. 1. Trajectories of the Agents Generated by $u_2(x)$ (axes: X, Y).
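The identity relating the sum of the nonzero eigenvalues to the edge weights follows from the trace of $L_G(x)$. A short derivation, using only (3), the symmetry $a_{ij} = a_{ji}$ and $\lambda_1 = 0$:

```latex
\sum_{k=2}^{N} \lambda_k
  \;=\; \sum_{k=1}^{N} \lambda_k
  \;=\; \operatorname{tr} L_G(x)
  \;=\; \sum_{i=1}^{N} [\Delta(x)]_{ii}
  \;=\; \sum_{i=1}^{N} \sum_{j,\, j \neq i} a_{ij}
  \;=\; 2 \sum_{e_{ij} \in E,\, i<j} a_{ij}.
```

The third equality holds because $A(x)$ has a zero diagonal, and the last one because each edge is counted once from each endpoint.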
In the following, we demonstrate that $\lambda_k$ for $k > 2$ can be used to accelerate the convergence of the agents to a well-connected formation. Consider the following objective
$$J(x) = \ln \Big( \prod_{k=2}^{N} \lambda_k(x) \Big).$$
The gradient of $J$ is
$$\nabla_x J(x) = \sum_{k=2}^{N} \frac{1}{\lambda_k(x)} \nabla_x \lambda_k(x). \tag{12}$$
Fig. 2. Trajectories of the Agents Generated by $u_6(x)$ (axes: X, Y).
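Relation (12) is the chain rule applied to $\ln \prod_k \lambda_k = \sum_k \ln \lambda_k$, and it can be checked numerically along a direction of motion. The sketch below is a minimal illustration under assumptions: the smooth weight $w(d) = e^{-d}$ and the agent positions are chosen for illustration only, the helper names are hypothetical, and both sides are evaluated by central finite differences.

```python
import numpy as np

def laplacian(x, w):
    dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    A = w(dist)
    np.fill_diagonal(A, 0.0)
    return np.diag(A.sum(axis=1)) - A

def nonzero_spectrum(x, w):
    """lambda_2, ..., lambda_N of L_G(x) in ascending order."""
    return np.linalg.eigvalsh(laplacian(x, w))[1:]

def J(x, w):
    """J(x) = ln(prod_{k>=2} lambda_k), computed as a sum of logarithms."""
    return np.sum(np.log(nonzero_spectrum(x, w)))

w_smooth = lambda d: np.exp(-d)         # illustrative weight (an assumption)
x = np.array([[0.0, 0.0], [1.0, 0.2], [2.1, 1.0], [0.4, 1.7], [1.5, 2.6]])
d = np.zeros_like(x)
d[2] = [0.6, -0.8]                      # move agent 3 along a unit direction
h = 1e-5

# Directional derivative of J by central differences ...
dJ = (J(x + h * d, w_smooth) - J(x - h * d, w_smooth)) / (2 * h)
# ... and the right-hand side of (12) contracted with the same direction.
lam = nonzero_spectrum(x, w_smooth)
dlam = (nonzero_spectrum(x + h * d, w_smooth)
        - nonzero_spectrum(x - h * d, w_smooth)) / (2 * h)
rhs = np.sum(dlam / lam)
print(dJ, rhs)
```

Writing $J$ as a sum of logarithms also makes the weighting in (12) explicit: each $\nabla_x \lambda_k$ is scaled by $1/\lambda_k$, so the smaller eigenvalues contribute more to the combined control.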
Denote by $u_{2-6}(x)$ the control associated with $J$. Fig. 3 illustrates the evolution of all the eigenvalues under $u_{2-6}$ and $u_2$ starting from the same initial configuration as above.¹ From top to bottom, the curves represent the evolution of $\lambda_6$ to $\lambda_2$, respectively. The final formation under $u_{2-6}$ is slightly different from that under $u_2$, but the transient time under $u_{2-6}$ is about half of that under $u_2$. This indicates that $u_k$ ($k > 2$) can speed up the convergence process.
Fig. 3. Evolution of All the Eigenvalues under $u_{2-6}(x)$ and $u_2(x)$ (one panel per control, each showing $\lambda_2$ to $\lambda_6$; horizontal axis: Time (s)).
III. APPLICATION TO MULTI-PLAYER PE GAMES
A. Motivation and Strategy Design
In what follows, we examine a potential use of the gradient-based control law designed based on the “generalized connectivity” in multi-player PE games. The PE problem is to study how a team of pursuers tracks or catches a group of evaders. It is usually modeled as a zero-sum game, to which Dynamic Programming (DP) methods are usually applied. Unfortunately, DP methods suffer from the “curse of dimensionality”, and thus, in practice, suboptimal solution techniques are often used [11], [14]. The purpose of introducing the graph-based control law to PE problems is to take advantage of the computational ease that follows from the physical interpretation of connectivity.
In a multi-player PE problem, we denote by $P$ and $E$ the sets of pursuers and evaders respectively, i.e., $P = \{1, \cdots, N\}$ and $E = \{1, \cdots, M\}$. Consider a PE problem in $\mathbb{R}^n$, and let $x_p^i \in \mathbb{R}^n$ (or $x_e^j \in \mathbb{R}^n$) represent the state of pursuer $i \in P$ (or evader $j \in E$). The dynamics of pursuer $i$ and evader $j$ are given as
$$\dot{x}_p^i = f_p^i(x_p^i, u_i) \ \text{ and } \ \dot{x}_e^j = f_e^j(x_e^j, v_j).$$
The goal of the pursuers is to catch² all the evaders, while the evaders want to escape. Define the aggregate states as $x_p = [x_p^{1T}, \cdots, x_p^{NT}]^T$ and $x_e = [x_e^{1T}, \cdots, x_e^{MT}]^T$.
Instead of formulating a multi-player PE problem under the framework of differential games, we view the relationship between the pursuers and the evaders from a different perspective. Here, the locational proximity between the players plays an important role. Inspired by the relevance to connectivity, a dynamic graph is constructed. Let $K_{N,M}$ be a bipartite graph based on the vertex sets $P$ and $E$, where the first $N$ vertices are from $P$ and the remaining $M$ vertices are from $E$. Edges in $K_{N,M}$ only exist between $P$ and $E$. An example is illustrated in Fig. 4. Let $L_K$ be the Laplacian of $K_{N,M}$.
Fig. 4. The Bipartite Graph Based on Pursuers and Evaders (graph model with pursuers P1, P2, P3 and evaders E1, E2, E3).
The generalized connectivity of $K_{N,M}$ may provide a measure of the “distance” between the pursuers and the evaders. This is clear in a two-player PE problem, where $L_K$ is
$$L_{pe} = \begin{bmatrix} w(\|x_p - x_e\|) & -w(\|x_p - x_e\|) \\ -w(\|x_p - x_e\|) & w(\|x_p - x_e\|) \end{bmatrix} \tag{13}$$
with $x_p \in \mathbb{R}^n$ and $x_e \in \mathbb{R}^n$. Obviously, the algebraic connectivity $\lambda_2(L_{pe}) = 2w(\|x_p - x_e\|)$ is a deterministic function of the distance between the two players. In a (bipartite) graph model of a multi-player problem, this relation is less clear since all possible links between the pursuers and the evaders are taken into account. This is very important in a cooperative pursuit problem [12], [13]. In what follows, we study the feasibility of designing control laws based on the generalized connectivity and the emerging cooperative behavior of the players.
Before introducing a potential objective function, we look at a special case of the graph model $K_{N,M}$. Consider the hierarchical decomposition suboptimal method introduced in [11], where a multi-player problem is decomposed into distributed problems, each with a single pursuer and a single evader. Suppose that $N = M$ and each pursuer $i \in P$ can only be engaged with a distinct evader $j_i \in E$. This engagement rule imposes an additional structure on the original problem, and only the edges between pursuer $i$ and its engaged evader $j_i$ are considered (cf. Fig. 4). Denote by $L_K^E$ the graph Laplacian under an engagement $E$. Clearly, $L_K^E$ is a block diagonal matrix $L^E = \mathrm{diag}\{L_{1j_1}, \cdots, L_{Nj_N}\}$, where each block $L_{ij_i}$ is defined as in (13) between pursuer $i$ and evader $j_i$. The eigenvalues of $L^E$ are the eigenvalues of the $L_{ij_i}$'s, i.e., $\{\{0, 2a_{1j_1}\}, \cdots, \{0, 2a_{Nj_N}\}\}$, where $a_{ij_i} = w(\|x_p^i - x_e^{j_i}\|)$. The largest $N$ eigenvalues of $L^E$ correspond to the distances between the pursuers and their engaged evaders.
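The spectrum of the engagement Laplacian can be confirmed on a small numerical instance. The sketch below is illustrative only: the weights $a_{ij_i}$ are made-up numbers, the function names are hypothetical, and the vertices are ordered pursuer 1, evader $j_1$, pursuer 2, evader $j_2$, and so on, an ordering chosen here so that the blocks of (13) sit on the diagonal.

```python
import numpy as np

def pair_laplacian(a):
    """Two-player Laplacian (13) for a weight a = w(||x_p - x_e||)."""
    return np.array([[a, -a], [-a, a]])

def engagement_laplacian(weights):
    """Block-diagonal Laplacian diag{L_{1 j_1}, ..., L_{N j_N}} for one engagement."""
    n = len(weights)
    L = np.zeros((2 * n, 2 * n))
    for i, a in enumerate(weights):
        L[2 * i:2 * i + 2, 2 * i:2 * i + 2] = pair_laplacian(a)
    return L

a = [0.7, 0.2, 0.45]                    # illustrative weights a_{i j_i}
eig = np.linalg.eigvalsh(engagement_laplacian(a))
print(eig)                              # N zeros, then the values 2 * a_{i j_i} sorted
```

Since $w$ in the examples of this paper decreases with distance, a larger eigenvalue $2a_{ij_i}$ corresponds to a closer pursuer-evader pair.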
Instead of decomposing a multi-player game, we design the players' controls based on the generalized connectivity. Here, there is no pre-assigned evader for each pursuer. We consider an objective $J_{PE}$ as follows.
$$J_{PE} = \ln \Big( \prod_{k=N+M-K+1}^{N+M} \lambda_k^{L_K} \Big) \tag{14}$$
Here, $2 \leq K \leq N + M$. We select $K = \min\{M, N\}$, i.e., the links associated with the $M$ evaders (or $N$ pursuers) are considered. Here, the control based on $\lambda_2$ is no longer feasible because it only focuses on the “weakest” links. Similar to (9) and (12), the pursuers' and the evaders' controls are given as
$$u_p = \arg\max \{ \nabla_{x_p} J_{PE} \cdot f_p(x_p, u_p) \}, \qquad u_e = \arg\min \{ \nabla_{x_e} J_{PE} \cdot f_e(x_e, u_e) \}. \tag{15}$$
In a PE problem, the pursuers may simply track the evaders or destroy them. In the former case, the formulation above is suitable; the latter case is more complicated, since the number of evaders $M$ decreases as the game proceeds. Henceforth, we focus on the latter case. Suppose that evader $j \in E$ is considered captured if there exists a pursuer $i \in P$ such that $\|x_p^i(t) - x_e^j(t)\| \leq \varepsilon$ for some $t > 0$ and $\varepsilon > 0$. Once it is captured, it is removed from the set $E$, i.e., the set $E$ is updated as $E = E - \{j\}$, and the pursuers' attention turns to the “alive” evaders. The underlying graph $K_{N,M}$ is updated accordingly, and so is the objective (14). We emphasize that in a PE problem, optimization of the connectivity only provides a local movement guidance for the players.
¹ The same random strategy of the agents is used at the initial step to avoid degeneracy.
² “Catch” may mean “track” or “destroy” depending on the application.
B. Simulation Results
We show by simulation the feasibility of the gradient-based strategies. Consider PE games in $\mathbb{R}^2$. The dynamics of pursuer $i \in P$ and evader $j \in E$ are
$$\dot{x}_p^i = u_i \ \text{ and } \ \dot{x}_e^j = v_j$$
with $\|u_i\| = \bar{u}$, $\|v_j\| = \bar{v}$. Here, for simplicity, the pursuers and the evaders are assumed to have uniform speeds $\bar{u}$ and $\bar{v}$, respectively. In the following examples, we also assume that $\bar{u} > \bar{v}$. Although the graph-based control is still applicable (in tracking situations) without this assumption, capturability of the evaders is then not guaranteed, which is generally a hard problem [12]. We consider a weight function $w$ as
$$w(x) = \exp\Big( -\frac{x - \varepsilon}{R - \varepsilon} \Big) \tag{16}$$
for some $R > 0$. Here, $R$ is an important parameter, which indicates how pursuer $i$ (or evader $j$) rates the importance of the evaders (or pursuers) at different distances from it.
First of all, consider a game with two pursuers and two evaders with $R = 10$ and $\varepsilon = 0.5$. The initial positions and the speeds of the players are given in Table I. Note that this example is similar to the one adopted in [12], where the performance enhancement by limited look-ahead based on a hierarchical decomposition approach was illustrated.
TABLE I
SIMULATION PARAMETERS
                   Pursuer 1   Pursuer 2   Evader 1   Evader 2
Initial Position   (0,0)       (14,3)      (6,0)      (8,3)
Speed              2           2           1          1
Fig. 5 illustrates the players' trajectories under the controls in (15), where circles are drawn to indicate the capture range of the pursuers. This result is similar to that in [12]. For comparison, we simulate the game with different combinations of the look-ahead strategy [12] and the graph-based strategy. The sum of the capture times of both evaders is listed in Table II. Clearly, the performance of the graph-based strategy is comparable to that of the limited look-ahead strategy. However, the computation time of the graph-based strategy is one third of that of the look-ahead strategy.
Fig. 5. Cooperative Pursuit Trajectories under the Graph-Based Strategies (axes: X, Y; players P1, P2, E1, E2).
TABLE II
SUM OF THE CAPTURE TIMES UNDER DIFFERENT STRATEGIES
Pursuers' Strategy: Lookahead, Graph Based
Evaders' Strategy: Lookahead, Graph Based
Capture-time sums: 10.8 (s), 11 (s), 10.6 (s), 10.4 (s)
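The objective (14) and the controls (15) can be sketched for the two-on-two example of Table I. This is a minimal illustration under stated assumptions, not the simulation code behind the reported results: the gradient of $J_{PE}$ is approximated by central finite differences, each player's velocity is the normalized gradient scaled by its speed (one way to solve (15) for $\dot{x} = u$ with fixed speeds), and all function names are hypothetical.

```python
import numpy as np

R, EPS_CAP = 10.0, 0.5                  # R and capture radius of the first example

def w(d):
    """Weight function (16)."""
    return np.exp(-(d - EPS_CAP) / (R - EPS_CAP))

def bipartite_laplacian(xp, xe):
    """Laplacian L_K of K_{N,M}: edges only between pursuers and evaders."""
    N, M = len(xp), len(xe)
    B = w(np.linalg.norm(xp[:, None, :] - xe[None, :, :], axis=-1))  # N x M weights
    A = np.zeros((N + M, N + M))
    A[:N, N:] = B
    A[N:, :N] = B.T
    return np.diag(A.sum(axis=1)) - A

def J_PE(xp, xe):
    """Objective (14): log of the product of the K = min(N, M) largest eigenvalues."""
    K = min(len(xp), len(xe))
    lam = np.linalg.eigvalsh(bipartite_laplacian(xp, xe))
    return np.sum(np.log(lam[-K:]))

def num_grad(f, z, h=1e-6):
    """Central-difference gradient of a scalar function f with respect to array z."""
    g = np.zeros_like(z)
    for idx in np.ndindex(*z.shape):
        e = np.zeros_like(z)
        e[idx] = h
        g[idx] = (f(z + e) - f(z - e)) / (2 * h)
    return g

def unit_rows(g):
    n = np.linalg.norm(g, axis=1, keepdims=True)
    return np.divide(g, n, out=np.zeros_like(g), where=n > 1e-12)

def controls(xp, xe, u_bar, v_bar):
    """Velocities in the spirit of (15): pursuers ascend J_PE, evaders descend it."""
    gp = num_grad(lambda z: J_PE(z, xe), xp)
    ge = num_grad(lambda z: J_PE(xp, z), xe)
    return u_bar * unit_rows(gp), -v_bar * unit_rows(ge)

# Initial positions and speeds from Table I.
xp = np.array([[0.0, 0.0], [14.0, 3.0]])
xe = np.array([[6.0, 0.0], [8.0, 3.0]])
u, v = controls(xp, xe, u_bar=2.0, v_bar=1.0)
print("pursuer velocities:\n", u)
print("evader velocities:\n", v)
```

A full simulation would integrate these velocities over time and, as described in the text, remove an evader from $E$ once a pursuer comes within $\varepsilon$ of it, rebuilding the graph and the objective afterwards.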
In the second example, we simulate a PE game scenario with 2 pursuers and 3 evaders. The initial positions of the players are specified in Table III, and the speeds of the players are the same as in the previous example. Fig. 6 illustrates the cooperative pursuit trajectories at various stages under the graph-based strategies with $R = 10$. For comparison, the simulation result with $R = 1$ is illustrated in Fig. 7.
TABLE III
SIMULATION PARAMETERS
Initial Position   Pursuer   Evader
1                  (0,0)     (-1,3)
2                  (0,3)     (1,5)
3                            (3,3)
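The effect of $R$ in (16) for the two settings compared here ($R = 10$ and $R = 1$) can be illustrated by how strongly a nearby player is weighted relative to a distant one. The distances in this sketch are illustrative assumptions, and $\varepsilon = 0.5$ is taken from the first example.

```python
import numpy as np

def w(d, R, eps=0.5):
    """Weight function (16) with capture radius eps."""
    return np.exp(-(d - eps) / (R - eps))

d_near, d_far = 2.0, 6.0                # illustrative distances
for R in (10.0, 1.0):
    ratio = w(d_near, R) / w(d_far, R)
    print(f"R = {R:4.1f}: w(near)/w(far) = {ratio:.3g}")
```

With $R = 10$ the nearer player is weighted only about 1.5 times more than the farther one, whereas with $R = 1$ the ratio is roughly 3000, so the objective is dominated by the closest opponent.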
Clearly, by Fig. 6, the two pursuers successfully capture all three evaders. Due to the nature of a bipartite graph as in Fig. 4, the pursuers do not seem to have specific targeted evaders to go after at the beginning stage. This is because the pursuers want to maximize the “connections” to all the evaders within a certain distance. In fact, this behavior can deceive the evaders, and it is hard for them to infer the pursuers' strategies and take direct adversarial actions. On the contrary, with a smaller $R$, the pursuers become engaged with the evaders from the beginning (cf. Fig. 7). By (16), it is expected that a smaller $R$ implies more myopic behaviors of the pursuers (evaders).
We have simulated a number of PE problems with different numbers of pursuers and evaders and various initial configurations. In most of the cases, the graph-based strategies perform very well and lead to desirable cooperative behaviors for both the pursuers and the evaders. However, special attention should be paid to the case where the players are in configurations with certain symmetry, e.g., two evaders are on different sides of a pursuer at an equal distance. The performance of the graph-based controls can be degraded because the method is local in nature.
Fig. 6. Cooperative Pursuit Trajectories with $R = 10$ (snapshots at 2, 4.2, 6.5 and 10.4 seconds; axes: X, Y).
Fig. 7. Cooperative Pursuit Trajectories with $R = 1$ (axes: X, Y).
C. Discussion
The feasibility of the graph-based strategy in PE problems stems from the relevance of the connectivity of the graph built on the players to the distances between them. In some scenarios, the simple graph-based strategy performs very well and is comparable to the previously designed suboptimal strategy (cf. Fig. 5). The most appealing feature of this gradient-based approach is that it is easy to compute. However, players may be trapped in local optima. Thus, additional cooperative control laws should be designed to avoid local optima, especially when the players are in configurations with certain symmetry. Generally speaking, this is a preliminary study of the usefulness of the generalized connectivity in multi-player PE problems. Cooperative behaviors of the players under graph-based strategies need to be further studied.
IV. CONCLUSIONS AND FUTURE WORK
In this paper, we have derived the gradient of an arbitrary nonzero eigenvalue of the Laplacian of a state-dependent graph through a standard sensitivity analysis. A gradient-based control of multiple agents has been designed to maximize an arbitrary nonzero eigenvalue instead of the second smallest eigenvalue. Furthermore, connections have been drawn between the connectivity control and multi-player PE problems. Preliminary gradient-based strategies have been designed and verified by simulations. The appealing features of this approach include the pursuers' emerging deceptive behaviors and the computational ease. This research presents a preliminary graph-theoretic approach to multi-player PE problems.
In future work, a more sophisticated algorithm may be designed to avoid local optima in the underlying graph model. In addition, the effect of the weight function can be further studied.
REFERENCES
[1] A. Jadbabaie, J. Lin, and A.S. Morse, “Coordination of groups of mobile autonomous agents using nearest neighbor rules,” IEEE Transactions on Automatic Control, vol. 48, no. 6, pp. 988–1001, 2003.
[2] R. Olfati-Saber, “Flocking in multiagent dynamic systems: Algorithms and theory,” IEEE Transactions on Automatic Control, vol. 51, no. 3, pp. 401–420, 2006.
[3] E. Justh and P. Krishnaprasad, “Equilibria and steering laws for planar formations,” Systems & Control Letters, vol. 52, no. 1, pp. 25–38, 2004.
[4] M.C. De Gennaro and A. Jadbabaie, “Decentralized control of connectivity for multi-agent systems,” in Proceedings of the 45th IEEE Conference on Decision and Control, (San Diego, CA), pp. 3628–3633, December 2006.
[5] Y. Kim and M. Mesbahi, “On maximizing the second smallest eigenvalue of a state-dependent graph Laplacian,” IEEE Transactions on Automatic Control, vol. 51, no. 1, pp. 116–120, 2006.
[6] C. Godsil and G. Royle, Algebraic Graph Theory. New York: Springer, 2001.
[7] J.A. Fax and R.M. Murray, “Information flow and cooperative control of vehicle formation,” IEEE Transactions on Automatic Control, vol. 49, no. 9, pp. 1465–1476, 2004.
[8] R. Olfati-Saber and R.M. Murray, “Consensus problems in networks of agents with switching topology and time-delays,” IEEE Transactions on Automatic Control, vol. 49, no. 9, pp. 1520–1533, 2004.
[9] J. Hespanha, M. Prandini, and S. Sastry, “Probabilistic pursuit-evasion games: A one-step Nash approach,” in Proceedings of the 39th IEEE Conference on Decision and Control, (Sydney, Australia), pp. 2272–2277, 2000.
[10] L. Schenato, S. Oh, and S. Sastry, “Swarm coordination for pursuit evasion games using sensor networks,” in Proceedings of the International Conference on Robotics and Automation, (Barcelona, Spain), pp. 2493–2498, 2005.
[11] D. Li, J.B. Cruz, Jr., G. Chen, C. Kwan, and M. Chang, “A hierarchical approach to multi-player pursuit-evasion differential games,” in Proceedings of the 44th Joint Conference of CDC-ECC05, (Seville, Spain), pp. 5674–5679, December 2005.
[12] D. Li and J.B. Cruz, Jr., “Better cooperative control with limited look-ahead,” in Proceedings of American Control Conference, (Minneapolis, MN), pp. 4914–4919, June 2006.
[13] D. Li and J.B. Cruz, Jr., “Improvement with look-ahead on cooperative pursuit games,” in Proceedings of the 45th IEEE Conference on Decision and Control, (San Diego, CA), December 2006.
[14] A. Antoniades, H. Kim, and S. Sastry, “Pursuit-evasion strategies for teams of multiple agents with incomplete information,” in Proceedings of the 42nd IEEE Conference on Decision and Control, (Maui, HI), pp. 756–761, 2003.