Coordinated Control and Estimation for Multi-agent Systems:
Theory and Practice
Daniel J. Klein
A dissertation submitted in partial fulfillment
of the requirements for the degree of
Doctor of Philosophy
University of Washington
2008
Program Authorized to Offer Degree: Aeronautics & Astronautics
University of Washington
Graduate School
This is to certify that I have examined this copy of a doctoral dissertation by
Daniel J. Klein
and have found that it is complete and satisfactory in all respects,
and that any and all revisions required by the final
examining committee have been made.
Chair of the Supervisory Committee:
Kristi A. Morgansen
Reading Committee:
Kristi A. Morgansen
Mehran Mesbahi
Dieter Fox
Date:
In presenting this dissertation in partial fulfillment of the requirements for the doctoral degree at the University of Washington, I agree that the Library shall make
its copies freely available for inspection. I further agree that extensive copying of
this dissertation is allowable only for scholarly purposes, consistent with “fair use” as
prescribed in the U.S. Copyright Law. Requests for copying or reproduction of this
dissertation may be referred to Proquest Information and Learning, 300 North Zeeb
Road, Ann Arbor, MI 48106-1346, 1-800-521-0600, to whom the author has granted
“the right to reproduce and sell (a) copies of the manuscript in microform and/or (b)
printed copies of the manuscript made from microform.”
Signature
Date
University of Washington
Abstract
Coordinated Control and Estimation for Multi-agent Systems: Theory and Practice
Daniel J. Klein
Chair of the Supervisory Committee:
Professor Kristi A. Morgansen
Aeronautics & Astronautics
The problems studied in this thesis center on coordinated control and estimation tasks
for multi-agent systems. The work begins technically, with a Kuramoto-inspired study
of coordinated phasor centroid control for a group of N homogeneous phase integrators.
A main result is the development of a controller which guarantees asymptotic stability
of the phasor centroid to a known reference vector. An extension of the splay state to
non-balanced group states is then explored. The work then moves on to a group of N
heterogeneous agents wherein some agents act as leaders to the others, which obey a
standard sinusoidal protocol. One interesting point here is that unlike in the related
controlled linear agreement problem, symmetry of the communication topology does
not determine uncontrollability. While these results are derived in continuous time,
the next contribution shows that many of the desirable coordination properties can be
preserved in discrete time.
The work then moves to a more applied focus. A simulation study of coordinated
target tracking is conducted, considering questions of control, estimation, and communication for distributed systems. Finally, the design and development of a computer
vision system which enables a demonstration of some of the theory-oriented work on
the University of Washington’s Underwater Fin-actuated Autonomous Vehicle testbed
is discussed.
TABLE OF CONTENTS
List of Figures
Chapter 1: Introduction and Motivation
1.1 Multi-agent Coordinated Control
1.2 Communication Limitations
1.3 University of Washington Fin-Actuated Vehicles
1.4 Contributions and Organization
Chapter 2: Kuramoto-Inspired Coordinated Control of Phase Integrators
2.1 System and Problem Descriptions
2.2 Coordinated Phasor Centroid Control
2.3 Controlled Oscillation in the Matched Manifold
2.4 Equivariance and Splay Orbits
2.5 Closed Kinematic Chain Analogy
2.6 Final Remarks
Chapter 3: Controlled Sinusoidal Coupling: Heterogeneity through Leadership
3.1 Preliminaries
3.2 System Dynamics
3.3 Aligned Set Reachability
3.4 Examples
3.5 Concluding Remarks
Chapter 4: Sinusoidal Phase Coupling in Discrete Time
4.1 The Discrete-Time Phase Coupling Model
4.2 All-to-All Aligned Set Stability
4.3 Random One-to-All Aligned Set Stability
4.4 All-to-All Balanced Set Stability
4.5 Reference Set Stability
4.6 Network Routing Optimization
4.7 Performance vs. K∆T
4.8 Final Remarks
Chapter 5: Coordinated Target Tracking in a Cluttered Environment
5.1 System Description
5.2 Coordinated Target Estimation
5.3 Coordinated Target Pursuit
5.4 Results
5.5 Final Remarks
Chapter 6: Tracking Multiple Fish Robots using Underwater Cameras
6.1 Problem Description and Experimental Apparatus
6.2 Particle Filter Tracking of a Single Fish Robot
6.3 Tracking a Robot in an Outdoor Environment
6.4 Multi-Target Tracking
6.5 Final Remarks
Chapter 7: Implementation with the UW Fin-Actuated Underwater Vehicles
7.1 Implementation
7.2 Coordinated Heading Alignment
7.3 Coordinated Heading Anti-Alignment
7.4 Coordinated Reference Matching
7.5 Coordinated Target Tracking using Phase Coupling
7.6 Final Remarks
Chapter 8: Conclusion
8.1 Summary
8.2 Questions for Future Research
8.3 Final Remarks
Bibliography
LIST OF FIGURES
1.1 A comparison of classical linear and modern nonlinear multi-agent control.
1.2 The three fin-actuated underwater vehicles of the UW-FAV testbed.
2.1 In both of the above phasor plots, the phasors corresponding to N = 4 agents are plotted on the unit circle, and the phasor centroid, x̄, and reference vector, x_ref, are drawn and labeled. The first coordinated control objective of this work is to drive the state from an initial condition (left) to a state in which the phasor centroid is collocated with the reference vector (right). In this example, x_ref = 0.5∠−0.2.
2.2 Example aligned, balanced, and reference matched group states are shown on a phasor plot. Each gray circle represents the heading of one agent. The phasor centroid, x̄, is marked with a gold star, and the reference vector, x_ref, shown only in (c), is denoted with a green circle.
2.3 A balanced splay state is shown for N = 4 agents.
2.4 (a) No centroid-locked control is possible for N = 2 agents when 0 < ρ̄ < 1 because the matched manifold consists only of two points as shown above. (b) When the state is splay (ρ̄ = 0), uniform rotation is possible, as shown by the red arrows.
2.5 A simulation of θ̇ = u + v with u from (2.8) and v from (2.72). Matching control causes the initial transient in which the phasor centroid is brought to the reference. Then the controlled oscillation results in either a circling mode, when ρ_ref < 1/3 (a), or a back-and-forth mode, when ρ_ref > 1/3 (b). The oscillations are equally spaced in time, and the phasor centroid is at the reference for all points, so these oscillations are splay orbits.
2.6 A non-splay initial balanced state (a) results in a non-splay orbit (b), but a splay initial balanced state (c) does result in a splay orbit (d). This initial condition sensitivity exists for non-balanced initial states as well, but deciding which initial states will yield splay orbits is an open problem.
2.7 Mode stabilization using control (2.87) for x̄ = 0.4∠0 is enabled at Time = 20 s. The initial state was selected randomly and produces a non-splay oscillation. Splay oscillations of either the back-and-forth mode (a) or the circling mode (b) are stabilized with I* = 0 or I* = π, respectively.
2.8 One full period of the chain links for a back-and-forth splay oscillation. The beginning of the chain is located at the origin whereas the end of the chain is driven to N x̄ = [1.6, 0] by the matching control (2.8).
3.1 This figure shows the (a) star, (b) complete, and (c) chain example graphs considered in this section. Edges from the leader are directed, indicating the follower is coupled sinusoidally to the leader, but not vice versa.
3.2 The controlled state trajectory (left) and leader control (right) are shown for a simulation of the three node star graph with the leader at the center. Notice how the leader control must oscillate to keep the follower state near the goal. The state space, T^2, is represented as the shaded area between the two identical aligned sets, the initial state is φ(0) = [0, −1], and the goal state, denoted by the red dot, is [6, 1].
3.3 The controllable set on T^2 is shown for the complete graph. When the angle between the followers is less than 120°, alignment cannot be prevented.
3.4 The sets for the chain graph are shown. All states are reachable from the controllable set (light gray), however the positively invariant reachable set (dark gray) cannot be avoided after entering the drift dominated set (medium gray).
4.1 This figure shows the general setup of the proofs in this chapter. The unit circle is shown in light green (gray). The phasor centroid is drawn at x̄ = [0.5, 0] and is marked with a red “o”, the dot is x_i, and the diamond is located at x_i − x̄. The square denoted x* is the farthest point x_i could move around the unit circle before leaving the shaded ball, B_ρ̄(x_i − x̄).
4.2 A logical one-to-all graph (a) can be realized via simple wireless broadcasting (b) or gossip schemes (c).
4.3 The general trend of number of iterations to convergence of headings to an ε-ball versus K∆T for both an expected broadcast (i.e. all-to-all) and an actual one-to-all broadcast sequence (4.15) is shown. The same randomly selected broadcast sequence was used for each value of K∆T.
4.4 Total communication (normalized by noise) energy versus time discretization step, ∆T, for various levels of quantization, M, for a linear network. The tradeoff between single hop and multi-hop is most apparent for the seven bit case, M = 7.
5.1 Example sensor reliability dependence on vehicle distance to target. The circle denotes the end of the optimal sensor range, while the x denotes the absolute sensor range.
5.2 An example in which two vehicles communicate their most recent measurement is depicted. Here, black dots represent successful measurements and arrows point from the time at which the data was communicated to the time the communication arrives at the other vehicle.
5.3 Estimator performance vs. sensor success rate is measured by (a) mean log likelihood and (b) integrated position error as a function of sensor reliability. All pursuit agents have the same sensor precision.
5.4 Effects of changes in relative sensor precision of one agent on the estimator performance of the other two agents. This figure shows the estimate results for one of the fixed measurement covariance agents, which have σx = σy = 9. Sensor precision on the lower axis indicates the factor by which sensor precision was increased. Sensor reliability is fixed at 70%.
5.5 Estimator performance dependence on communication period for sensor reliabilities of (a) 70% and (b) 30%. The sensor measurement period is fixed at Tb = 1. The No comm case and the Centralized estimate case are unaffected by communication. Drop-off in mean likelihood at low communication periods for the Fus B case indicates effects of the neglected cross-covariance.
5.6 Target tracking performance for the four communication cases with variable obstacle separation distance.
5.7 Pursuit vehicle trajectories under the controller without coordination. Here, shaded circles around an agent indicate loss of sensor view of the target and lines trailing the vehicles indicate trajectory history.
5.8 Pursuit vehicle trajectories when the vehicles employ coordinated behavior. Compared to the uncoordinated case, the vehicles here have a much better chance of maintaining a good estimate of the target through the clutter.
6.1 The user interface for the fish tracking software showing the four camera views (left) and a 3D view of the pool (right). Some of the projected particles are shown over the fish robot in the camera views as red dots.
6.2 The fish tracking vision system was used extensively at the 2008 University of Washington Engineering Open House. This series of images shows (a) the outdoor pool, where black markers were used for camera calibration, (b) the base station running the tracking software, (c) the underwater view showing the fish in the top two images, and (d) the 3D view showing the origin in the center of the pool, the four cameras around the edge of the pool, and the expected fish state.
7.1 Coordinated controller program flow block diagram.
7.2 A comparison of simulated and experimental heading alignment with random one-to-all broadcasts, as viewed from above. The data in (b) was collected from an early version of the computer vision system from Chapter 6.
7.3 Heading alignment data, viewed as compass heading vs. time for the three robots.
7.4 A comparison of simulated and experimental heading anti-alignment with random one-to-all broadcasts, as viewed from above. The data in (b) was collected from an early version of the computer vision system from Chapter 6.
7.5 Heading anti-alignment data, viewed as compass heading vs. time for the three robots.
7.6 Heading data (a) collected for an experiment in which three vehicles are performing reference matching. The phasor centroid (b) stabilized to the (constant) reference vector, x_ref = [0, 0.75], in approximately 30 s.
7.7 Demonstration of coordinated unicycle control. In (a), a group of five unicycles steers to match the velocity of their centroid to a constant reference vector. In (b), the orientation of the reference vector changes in time.
7.8 A demonstration of the outerloop controller, which is used here to bring the group centroid to the target vehicle, shown as a blue circle. To produce this simulation, K = 1, N = 12, T = 45, and α = −0.05.
7.9 A simulation of a swirling motion produced by (7.7). Here, N = 11, K = 0.1, ρ0 = 7, and speed of the target, shown as the tank, was half that of the pursuit vehicles. If desired, the outerloop can be used here to bring the centroid to the target.
7.10 A simulation of a back-and-forth motion produced by the general N centroid-locked control (2.4.4), with N = 5 vehicles and coupling strength K = 0.1. A specific initial condition was selected to produce this splay orbit.
ACKNOWLEDGMENTS
This thesis was conducted in the Nonlinear Dynamics and Control lab of Prof. Kristi
Morgansen at the University of Washington. I first wish to sincerely thank Prof. Morgansen for her persistent encouragement and patience throughout the entire Ph.D.
process. I would also like to thank everyone in the lab who has helped with the robotic
fish project. The list of students is long, but I wish to name a few. Dr. Benjamin
Triplett designed and maintained the robotic fish. In addition, he helped to develop
many of the ideas presented in this thesis. I would also like to thank Emmett Lalish,
Skander Mzali, Octavian Blaga, Brad Basler, Patrick Bettale, and Nathan Powel for
their hard work in getting the fish up and running. The implementation portion of this
thesis would not have been possible without their hard work.
I would also like to thank the members of my committee for their guidance. Without
their feedback at the General Exam, I would have moved on to other research problems
before completing the in-depth study of phase-integrator models found in this thesis.
I owe many thanks to Profs. Morgansen, Mesbahi, and Fox for reading and providing
feedback on this thesis.
Finally, I cannot say enough for my friends and family. Chris Lum and I entered
graduate school together and took many of the same classes. He has always been supportive and willing to listen to every practice talk (sometimes even twice). When times
were difficult, my parents, Sandy and Jan, always encouraged me to “power through”
and get things done. It was their encouragement that kept me on track. My brother,
Linden, helped me to get away from my research to enjoy the great outdoor opportunities in the Seattle area. And finally, I cannot possibly say enough about my wife, Judy.
Her love through all the ups and downs, conferences, exams, and papers really made
this thesis possible.
Chapter 1
INTRODUCTION AND MOTIVATION
“More than 70 percent of the Earth’s surface is covered by ocean, but to date, we’ve
explored less than 5 percent of it. Finding new living marine resources and
understanding how they fit into the larger ecosystem is critical to our future.”
–Scott Gudes, acting administrator of NOAA, 2001
While we humans know relatively little about our oceans, we have come to realize
that they play an influential role in the stability of our ecosystem. From absorbing
greenhouse gases to creating winds and rain, the oceans of the earth drive and stabilize our ever-changing climate. With worries about global warming, understanding the
earth as a dynamical system and how our actions affect this system is more important
now than ever before. The engineering challenges in undersea exploration, however,
are tantamount to those of interstellar exploration. The ocean is too large to be explored by
a single submarine vehicle, so a multi-vehicle approach is almost certainly necessary.
Some multi-vehicle undersea exploratory systems, like the ARGO [60], SeaGlider [83],
and Neptune [61] projects, already exist. However, coordinating the actions of the autonomous agents involved in these projects, to maximize effectiveness and lifespan,
is complicated by another challenge of the undersea environment: long-range high-frequency communication, including GPS and most radio-frequency communication,
requires a tremendous amount of power due to the conductivity of seawater. Coordinated control algorithms designed for underwater applications must instead work
with limited information exchanges. A final challenge of the undersea environment
is that it is rapidly evolving. Undersea volcanic events, for example, often do not last
long enough for a ship and submarine to be deployed. To address these needs, we envision
a future in which a coordinated school of robotic vehicles can be rapidly deployed from
a helicopter.
The work in this thesis addresses a series of control-theoretic problems derived
from applications in which a group of underwater fin-actuated robotic fish is assigned tasks requiring inter-agent coordination. In particular, emphasis is placed on
theory-oriented coordinated heading control and application-driven coordinated target
pursuit and estimation. This emphasis leads to several challenges not encountered
in classical control problems. The tasks considered in this thesis are formulated so
that the overall group performance is improved when the robots act in a coordinated
manner. A related challenge is that the underwater environment severely limits the
connectivity and bandwidth of the inter-robot communication. Thus, the problems we
consider in this thesis are derived from coordinated control with limited communication.
Multi-agent systems have received an increasing amount of academic, military, and
commercial attention over the past several years. This interest is due in large part to
a number of benefits offered by these systems. In particular, a multi-agent system is
capable of simultaneously working on multiple tasks or subtasks, thus offering a huge
potential for performance increases. Tasks that easily break into component subtasks,
including manufacturing using an assembly line [26], searching a large region [75, 4],
and moving many or heavy objects from one place to another [52, 72], are ideal here.
A second main benefit offered by multi-agent systems is that one vehicle can take
the place of another in case of a vehicle failure. This increase in task robustness is best
leveraged when vehicles have overlapping capabilities.
1.1 Multi-agent Coordinated Control
Some tasks benefit from, or even require, individual group members to work together
to achieve a group objective. Such tasks are here defined as coordinated control tasks.
One well-studied problem in coordinated control is the average consensus problem in
which each agent is required to determine a common value, often the average value of
the initial group state, using only state measurements relative to the local state. This
task is a coordinated control task because individuals are required to work together.
Were any one agent not to participate, the overall task would be a failure because
consensus would not be achieved. Some of the work in this thesis addresses a nonlinear
consensus problem in which the state of each agent is heading.
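The average consensus task described above can be illustrated with a minimal sketch. It assumes a standard linear protocol in which each agent moves toward its neighbors using only relative state measurements; the ring topology, step size, and all function names are illustrative assumptions and are not taken from this thesis.

```python
def consensus_step(x, neighbors, step):
    """One synchronous update using only relative measurements x[j] - x[i]."""
    return [
        xi + step * sum(x[j] - xi for j in neighbors[i])
        for i, xi in enumerate(x)
    ]


def run_consensus(x0, neighbors, step=0.1, iterations=200):
    """Iterate the protocol; with symmetric links the group average is preserved."""
    x = list(x0)
    for _ in range(iterations):
        x = consensus_step(x, neighbors, step)
    return x


# Hypothetical example: four agents on a ring, each hearing two neighbors.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
x0 = [1.0, 3.0, -2.0, 6.0]
final = run_consensus(x0, ring)
average = sum(x0) / len(x0)
```

In this sketch every entry of `final` approaches `average`. If any one agent stopped updating, the remaining agents would be pulled toward that agent's fixed value rather than the group average, which is one way to see why the task requires every agent to participate.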
If the cooperative task requires the agents to compute a nonlinear function of the
state, the problem at best reduces to a centralized nonlinear control problem when
inter-vehicle communication is assumed to be perfect. Relatively little is known about
control of complicated nonlinear systems compared to their linear counterparts. The
early parts of this thesis focus on a centralized nonlinear phase-coupled oscillator
problem derived from a particular coordinated control task.
1.2 Communication Limitations
The benefits of coordinated, yet distributed, multi-robot systems have thus far been
difficult to realize. One main challenge is due to the increase in complexity involved
with these systems compared to classical control problems. In a classical linear control application, the objective is to design a single controller that takes the observation
vector as input and produces a control signal (see the block diagram in Figure 1.1). In
networked multi-agent systems, each vehicle can receive a limited amount of information via communication from other agents in addition to making local observations. For
much of the work in this thesis, communication is assumed to take place over wireless
links. These communication limitations drive the additional complexity of multi-agent
systems. Without limitations, each agent can relay all local information to all other
agents, thereby enabling each agent to operate on a common world view.
Communication limitations can take many forms. Communication range limitations prevent agents from giving (receiving) information to (from) distant agents. Bandwidth restrictions mean that agents can only transmit (receive) a specific number of
data bits per second. Delay inherent in communication is more pronounced than in
centralized control. These limitations together mean that no two agents can share
a common world view. Without a common world view, control, and particularly coordinated control, becomes much more difficult. The control designer must not only
deal with this information discrepancy, but also create a communication protocol to
distribute information to the agents. The net result of this added complexity is that
classical techniques of linear theory and separation between control and estimation
fail to transfer directly to communication-limited multi-agent systems.
[Figure 1.1, panel (a) Classical Control: ẋ = Ax + Bu, y = Cx, u = −κ(y). Panel (b) Multi-Agent Coordinated Control: agents ẋ_i = f_i(x, u_i), i = 1, ..., N.]
Figure 1.1: Classical linear control block diagram (a) and nonlinear multi-agent system
diagram (b) in which lines represent inter-agent communication.
1.3 University of Washington Fin-Actuated Vehicles
The University of Washington’s Fin-Actuated underwater multi-Vehicle (UW-FAV) system has many of the features common to any autonomous underwater multi-vehicle
system and is capable of demonstrating the theory-based work developed in this thesis. This system consists of three non-holonomic submarine vehicles which are actuated using fins rather than propellers and an instrumented above-ground pool. A
non-holonomic system has, by definition, a velocity constraint that, for example, prevents lateral motion. Limited communication from robot to robot or from robot to base
is achieved using either low-frequency radio communication modules, for short ranges
or cluttered environments, or acoustic modems, for long ranges in open environments.
To avoid loss due to data collision, both of these communication systems must exhibit
a broadcast topology in which only one vehicle transmits at a time. The communication rate is on the order of 80 bits per second. This section describes the vehicles, the
instrumented testbed, and the communication devices.
1.3.1 Autonomous Underwater Vehicles
Each of the three UW-FAVs is completely self-contained and untethered with a servo-actuated two-link tail providing forward motion control and two independently servo-actuated “pectoral fin” bowplanes. Using both the tailfin and the pectoral fins, the
vehicles have complete motion in 3D. Further, the vehicles can be fully operated in 3D
using the pectoral fins alone. In this mode, the vehicles can propel themselves both forward and
backward at slow speeds. Measuring approximately 0.5 m in length and weighing approximately 3 kg, the robots
are each equipped with a microprocessor for collecting data and computing control
commands, a pressure sensor for depth, a 3D compass, a radio-frequency transceiver,
and NiMH rechargeable batteries. Full details of the construction and modeling of the
individual vehicles can be found in [56]. See Figure 1.2 for a picture of the vehicles.
Figure 1.2: The three fin-actuated underwater vehicles of the UW-FAV testbed.
1.3.2 Instrumented Testbed
The operation environment for the fin-actuated robots is a freshwater tank of dimensions 2.4 m deep, 2.4 m wide, and 6 m long. Because the robots have insufficient sensing
capabilities to self-localize, the tank has been equipped with an external real-time vision system consisting of four identical underwater cameras connected to a central
computer. One of the contributions presented in this thesis is the development of estimation software capable of tracking one or more fin-actuated vehicles in this instrumented testbed.
1.3.3 Short Range RF Underwater Communication
Underwater communication is made possible by a custom transceiver based on 315 MHz
radio modules from Linx Technologies and designed to minimize the attenuating effects of the underwater medium. Communication to and from the modules is done over
a 2400 baud serial link. The robots have externally mounted antennas to mitigate loss
incurred at the air-water interface. Implementation details about these radios can be
found in [6].
These low-frequency RF transceivers allow a single vehicle to communicate with
one or more other vehicles and the base station during each transmission session, depending on the separation distance. Thus, inter-vehicle communication can be modeled
as a sequence of one-to-some or one-to-all logical broadcast graphs. These logical communication graphs can be realized either as a single hop from the transmitter to the
receivers or as a sequence of multiple hops. The energy efficiency of single vs. multiple
hopping is considered in this thesis.
1.4 Contributions and Organization
The main contributions and organization of this thesis build up from theoretical to
practical. In the following chapter, a system of N phase integrators (θ̇ = u, θ ∈ T^N)
is considered. The coordinated control objective is to drive the phasor centroid to a
reference vector. Previous work only allows stabilization of the phasor centroid to the
origin or to a point on the unit circle. Then coordinated motion within the set of states
for which the phasor centroid matches the reference vector is investigated, and an
extension of the splay state is presented.
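As a small illustration of the quantities named above, the following sketch computes a phasor centroid and checks it against a reference vector. It takes the centroid to be the arithmetic mean of the unit phasors, which is consistent with the relation between the chain endpoint N x̄ and x̄ quoted in the caption of Figure 2.8. The function names, the tolerance, and the example headings are assumptions made for illustration only.

```python
import cmath
import math


def phasor_centroid(thetas):
    """Mean of the unit phasors exp(i*theta_k), returned as a complex number."""
    return sum(cmath.exp(1j * t) for t in thetas) / len(thetas)


def matches_reference(thetas, reference, tol=1e-9):
    """True when the phasor centroid coincides with the reference vector."""
    return abs(phasor_centroid(thetas) - reference) < tol


# Aligned headings put the centroid on the unit circle.
aligned = [0.3, 0.3, 0.3, 0.3]

# Equally spaced (splay) headings put the centroid at the origin.
splay = [2 * math.pi * k / 4 for k in range(4)]

# Hypothetical headings whose centroid is 0.4 at angle 0.
a = math.acos(0.4)
matched = [a, -a, a, -a]
reference = cmath.rect(0.4, 0.0)
```

Here `abs(phasor_centroid(aligned))` is 1 and `abs(phasor_centroid(splay))` is 0 up to rounding, which are the two cases the text attributes to previous work, while `matched` is one of many states whose centroid sits at an interior reference point.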
Then, in Chapter 3, a coordinated phase integrator problem is considered in which
some of the agents act as leaders for the others. Interesting results here contrast with
related results from the controlled linear agreement problem. Three specific example
topologies are selected and explored to highlight reachability and controllability of this
nonlinear heterogeneous model.
A time-discretized phase coupling model is presented in Chapter 4. This model
preserves many of the nice properties of the continuous-time model, but requires much
less communication. Additionally, we study broadcast communication topologies and
consider energy efficient routing schemes.
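To give a concrete picture of what a time-discretized phase coupling update can look like, the sketch below applies a forward-Euler step to all-to-all sinusoidal coupling. This is an assumed form chosen for illustration and is not necessarily the model developed in Chapter 4; the parameter `gain_dt` stands in for the product K∆T, and the function names and initial headings are hypothetical.

```python
import math


def coupling_step(thetas, gain_dt):
    """One discrete-time step of all-to-all sinusoidal coupling.

    gain_dt plays the role of K*dT and is assumed positive (alignment).
    """
    n = len(thetas)
    return [
        ti + (gain_dt / n) * sum(math.sin(tj - ti) for tj in thetas)
        for ti in thetas
    ]


def simulate(thetas, gain_dt, steps):
    """Apply the update repeatedly; each step needs one exchange of headings."""
    for _ in range(steps):
        thetas = coupling_step(thetas, gain_dt)
    return thetas


# Hypothetical initial headings, all within a half circle.
initial = [0.0, 0.5, 1.0, -0.7]
final = simulate(initial, gain_dt=0.5, steps=200)
spread = max(final) - min(final)
```

For this initial condition and a moderate value of `gain_dt`, the headings in `final` agree to within rounding error. Because agents exchange headings only once per step rather than continuously, a sketch of this kind also suggests why a discrete-time model can reduce the communication load.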
Coordinated control theory is considered in a more realistic setting in Chapter 5.
Here a coordinated target tracking problem is considered from the perspective of control, communication, and state estimation. The results are simulation-based due to
the real-world nature of the problem setting, yet show a trend that quantifies the performance benefits of coordination.
In Chapter 6, a computer program is developed to visually track one or more fin-actuated robots for the UW-FAV testbed. The system takes in video from four underwater cameras, estimates the 3D position, 3D orientation, and forward speed of each
robot, and broadcasts the estimate to the robots using the radio-frequency communication modules. This system has been used extensively in data collection operations of
the UW-FAV testbed and at the 2008 UW Engineering Open House.
A demonstration of the collective work in this thesis is presented in Chapter 7.
Here, some of the control algorithms from the second and fourth chapters are applied
to the UW-FAV testbed. The vision system developed in the sixth chapter is used to
track the robots as they perform the coordinated control.
Concluding remarks are available in Chapter 8.
Chapter 2
KURAMOTO-INSPIRED COORDINATED CONTROL OF PHASE INTEGRATORS
One of the main challenges of multi-agent coordinated control is complexity. Modeling the system at a sufficiently low level to capture the nature of the problem without
the need for excessive detail is one of the keys to gaining traction into these problems.
Here, we consider coordinated control at the level of phase, which could represent a
number of quantities ranging from the heading of a non-holonomic vehicle to the angle
of a leg on a walking robot. By abstracting the core of the problem away from any one
particular application, the material here stands on its own as a mathematical tool that
can be used for a variety of control and biological modeling applications.
Specifically, here we consider a system consisting of N identical phase integrators
whose collective state is an element of the N-torus, T^N. The coordinated control objectives require: 1) stabilization of the phasor centroid, a vector function of the entire
state, to a reference vector, and 2) oscillation that preserves the phasor centroid. The
approach we take was inspired by the Kuramoto model of phase-coupled oscillators
[47, 78].
2.0.1 Preliminaries and Related Work
Related work on phase coordination is extensive due in part to broad interest in phase-coupled oscillator models. These models were developed to reproduce frequency synchronization, a coordinated phenomenon, observed in chemical and biological systems
of phase integrators, $\dot{\theta}_k = u_k$, $k = 1, \dots, N$, having heterogeneous natural frequencies,
$\omega_k$. For example, in the Kuramoto model of N phase-coupled oscillators,
$$\dot{\theta}_k = u_k = \omega_k + \frac{K}{N} \sum_{j=1}^{N} \sin(\theta_j - \theta_k), \quad \theta_k \in \mathbb{T}, \quad k = 1, \dots, N, \tag{2.1}$$
a bifurcation to partial or complete frequency synchronization (θ̇k = ω̄, ∀k) occurs
when the coupling strength K becomes sufficiently large. While the majority of the
research has been focused on understanding this bifurcation, the homogeneous natural
frequency case, ωk = ω, k = 1, . . . N , is as interesting for coordinated control. Here, any
positive coupling gain will result in phase and frequency alignment (θk = θ̄, θ̇k = ω̄, ∀k)
whereas a negative gain will result in phase anti-alignment and frequency alignment
[87]. Any constant natural frequency can be taken as zero without loss of generality
by rewriting (2.1) in a rotating coordinate frame. While all-to-all communication is
implicit in (2.1), incomplete communication topologies have been examined in [28, 54].
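The homogeneous case can be illustrated numerically. The following is a minimal sketch, assuming forward-Euler integration of (2.1) with all natural frequencies set to zero (that is, in a rotating frame); the initial phases, step size, horizon, and helper names are illustrative assumptions rather than values from this work. Under these assumptions, a positive gain should drive the magnitude of the mean phasor toward one (alignment) and a negative gain should drive it toward zero (anti-alignment).

```python
import math

def kuramoto_rates(theta, K, omega=0.0):
    """Right-hand side of (2.1) with a common natural frequency omega."""
    n = len(theta)
    return [omega + (K / n) * sum(math.sin(tj - tk) for tj in theta)
            for tk in theta]

def mean_phasor_magnitude(theta):
    """Magnitude of the average of the unit phasors exp(i*theta_k)."""
    n = len(theta)
    cx = sum(math.cos(t) for t in theta) / n
    cy = sum(math.sin(t) for t in theta) / n
    return math.hypot(cx, cy)

def simulate(theta0, K, dt=0.02, steps=5000):
    """Forward-Euler integration (an illustrative choice of integrator)."""
    theta = list(theta0)
    for _ in range(steps):
        rates = kuramoto_rates(theta, K)
        theta = [t + dt * r for t, r in zip(theta, rates)]
    return theta

# Assumed initial phases in radians (hypothetical example values).
theta0 = [0.0, 0.5, 1.0, 1.5, 2.0]
rho_positive_gain = mean_phasor_magnitude(simulate(theta0, K=1.0))
rho_negative_gain = mean_phasor_magnitude(simulate(theta0, K=-1.0))
print(rho_positive_gain, rho_negative_gain)
```

The all-to-all sum is evaluated directly for clarity; no attempt is made here to model incomplete communication topologies.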
Another body of work on coordinated phase control stems from recent work with
coordinated control of non-holonomic vehicles. In this line of research, individual vehicles are modeled abstractly as constant-speed kinematic unicycles in the plane:
$$\dot{r}_k = s \begin{bmatrix} \cos(\theta_k) & \sin(\theta_k) \end{bmatrix}^T, \qquad \dot{\theta}_k = s u_k. \tag{2.2}$$
Each vehicle has fixed speed s, state (rk , θk ) ∈ SE(2), and a single control input, uk ,
that determines the curvature of the path taken by the vehicle. This model is a commonly used abstraction of more complicated and realistic UGV and UAV dynamics.
The connection to coordinated phase control comes from temporarily ignoring position, in which case the remaining “heading-only” system is a phase integrator. Then,
phase coupled oscillator models can be utilized for control by choosing the “Kuramoto
controller” in which uk is set to the right hand side of (2.1). Naturally, a control designer may choose to modify the controller in which case the controller can be seen
as Kuramoto-inspired. Early work along these lines was done by Justh and Krishnaprasad [34, 33], who developed controllers to stabilize relative equilibria, including
aligned and circling states. The research of Leonard, Sepulchre, Paley, and coauthors
has continued work with constant-speed unicycles for purposes of coordinating undersea gliders for science missions. Their extensive studies of collective motion stabilization have addressed splay state and symmetric pattern generation [50] and communication
limitations [29, 74], and include a demonstration in Monterey Bay [20].
Coordinated heading control has a symbiotic relationship with modeling of biological aggregations. In particular, ideas from the modeling of schooling fish have been
applied to coordinated control, and coordinated control ideas have been similarly used
to improve aggregation models. A good summary of biological modeling of aggregations can be found in [21, 68]. More along the lines of coordinated control of phase
integrators is the work of Birnir [7], which employs many of the mathematical tools
used in the above control-theoretic studies. Also, Leonard et al. have some work in
this direction, for example [66].
One feature common throughout the collective control literature mentioned thus far
is that the physical centroid of the group is stabilized either to a maximal velocity, corresponding to aligned headings, or to a minimal velocity, corresponding to anti-aligned
headings. Stabilization of headings corresponding to intermediate group motions, between fully aligned and fully anti-aligned, is in fact a focus of the present chapter,
which summarizes and extends our previous work on the topic [43, 44]. Perhaps the
work most closely related to the content of the present chapter is that of Kingston and
Beard, who considered a UAV coordination problem in wind [37]. Again, individual
vehicles were modeled as constant-speed unicycles, however, a constant wind added to
the model causes vehicles to drift. The proposed controller stabilizes circular motion
relative to the wind, and a generalization of the classical splay state is proposed in
which vehicles are equally spaced in time along the circular relative orbit.
2.0.2 Contributions
The material in this chapter is the culmination of a master's thesis [38], several peer-reviewed conference papers [43, 44], and one journal publication [42] (with a second in
preparation [46]) by the author. The contributions are as follows. First, we design a
coordinated controller inspired by the Kuramoto model to stabilize the phasor centroid
of an N -phase integrator system to a reference vector. When the resulting controller is
used to steer unit-speed unicycles, the result is that the velocity of the group centroid
is stabilized to the reference vector. The main difference between this result and the
work of Kingston and Beard is that in the work here, the result holds exactly rather
than on average. The second contribution of this work focuses on oscillatory motion on
the submanifold of states for which the phasor centroid matches the reference vector.
Again, the splay state extension of Kingston and Beard is similar in spirit, but differs
because the phasor centroid does not instantaneously match the reference vector. The
work in this thesis is more general as no particular oscillation (e.g. relative circling)
is required. In fact, we demonstrate other modes of oscillation, such as back-and-forth
motions, that overcome limitations of relative circling and provide our own generalization of the splay state that is more in line with the classical phase-coupled oscillator
literature from which the term originated.
2.0.3 Organization
The material in this chapter is organized into two main sections corresponding to the
primary contributions. The phase-integrator system and coordinated control problem
are described in Section 2.1. The first contribution is presented in Section 2.2, wherein
we propose a controller to regulate error between the phasor centroid and a static or dynamic reference vector. The second contribution follows in Section 2.3, wherein phasor-centroid-preserving oscillations are explored. An extension of the balanced splay state
to non-balanced states is presented in Section 2.4. A connection between the coordinated controller and wave-like motion on closed kinematic chains is presented in
Section 2.5. Some final remarks are presented in Section 2.6.
2.1 System and Problem Descriptions
The system considered here is a group of N identical agents whose collective state is
represented by a point in the N-torus, θ ∈ T^N, and whose dynamics allow direct control
of the state,
$$\dot{\theta}_k = u_k, \quad k = 1, \dots, N. \tag{2.3}$$
Because the state lives in a torus, we refer to these dynamics as a system of phase
integrators and to the control as phase control.
The coordinated control objectives are derived from the phasor centroid,
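\begin{align}
\bar{x} &= \frac{1}{N} \sum_{k=1}^{N} x_k \tag{2.4} \\
&= \bar{\rho} \angle \bar{\theta}, \tag{2.5}
\end{align}
where
$$x_k = \begin{bmatrix} \cos(\theta_k) & \sin(\theta_k) \end{bmatrix}^T \tag{2.6}$$
is the unit phasor associated with the k-th oscillator. The magnitude of the phasor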
centroid, denoted ρ̄, is called the order parameter. The order parameter has played a
significant role in the history of the Kuramoto model and will continue to be influential
throughout this thesis. The two main control objectives are as follows.
Objective I:
The first phase control objective is to stabilize the phasor centroid to a reference vector,
$$x_{\mathrm{ref}} = \rho_{\mathrm{ref}} \angle \theta_{\mathrm{ref}}, \tag{2.7}$$
located within the closed unit ball. This objective requires coordination amongst individual agents because the phasor centroid is a function of the full state, θ.
Objective II:
The second phase control objective is to design oscillatory motion within the set of
states for which the phasor centroid is collocated with the reference vector. Again, this
objective requires coordination to prevent the phasor centroid from diverging from the
reference.
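Both objectives are stated in terms of the phasor centroid, so a short computational sketch may help fix ideas. The code below assumes phases are given in radians and evaluates (2.4) through (2.7) directly; the function names and the two test configurations are hypothetical and chosen only for illustration.

```python
import math

def phasor_centroid(theta):
    """Return the (x, y) components of the phasor centroid (2.4)."""
    n = len(theta)
    return (sum(math.cos(t) for t in theta) / n,
            sum(math.sin(t) for t in theta) / n)

def to_polar(v):
    """Return (magnitude, angle) of a planar vector, as in (2.5)."""
    return math.hypot(v[0], v[1]), math.atan2(v[1], v[0])

def centroid_error(theta, rho_ref, theta_ref):
    """Distance between the phasor centroid and the reference vector (2.7)."""
    cx, cy = phasor_centroid(theta)
    return math.hypot(cx - rho_ref * math.cos(theta_ref),
                      cy - rho_ref * math.sin(theta_ref))

# Two illustrative configurations of N = 4 agents (assumed values).
aligned = [0.7, 0.7, 0.7, 0.7]
evenly_spaced = [2.0 * math.pi * k / 4 for k in range(4)]

rho_aligned, angle_aligned = to_polar(phasor_centroid(aligned))
rho_even, _ = to_polar(phasor_centroid(evenly_spaced))
print(rho_aligned, angle_aligned, rho_even)
```

Objective I corresponds to driving `centroid_error` to zero, and Objective II to moving the phases while keeping it at zero.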
2.2 Coordinated Phasor Centroid Control
We first turn attention to the design of a Kuramoto-inspired phase controller that
stabilizes the phasor centroid to a reference vector. Initially, we assume the reference
vector to be constant but later relax this assumption.
Figure 2.1: In both of the above phasor plots, the phasors corresponding to N = 4
agents are plotted on the unit circle, and the phasor centroid, x̄, and reference vector,
xref , are drawn and labeled. The first coordinated control objective of this work is to
drive the state from an initial condition (left) to a state in which the phasor centroid is
collocated with the reference vector (right). In this example, xref = 0.5∠ − 0.2.
2.2.1 Constant Reference Vector Matching
The following theorem establishes asymptotic stability of the phasor centroid to a
known reference vector.
Theorem 2.2.1 (Matching a Constant Reference Vector). Consider a system consisting
of N > 1 phase integrators (2.3) and a constant reference vector, xref = ρref ∠θref , in the
closed unit ball (ρref ≤ 1). The phase control law,
$$u_k = \frac{K}{N} \sum_{j=1}^{N} \sin(\theta_j - \theta_k) - K \rho_{\mathrm{ref}} \sin(\theta_{\mathrm{ref}} - \theta_k), \quad K < 0, \tag{2.8}$$
will asymptotically stabilize the phasor centroid (2.4) to the reference vector provided
the initial condition satisfies either: (a) the matrix
$$P = \begin{bmatrix} x_1 & x_2 & \cdots & x_N \end{bmatrix} - \mathbf{1}_N \otimes x_{\mathrm{ref}} \tag{2.9}$$
has full row rank or (b) x̄ = xref. Here, ⊗ denotes the Kronecker product.
Proof. To begin, the vector of error between the phasor centroid and the reference
vector,
x̃ = x̄ − xref ,
(2.10)
can be used to define the phasor centroid error potential,
$$\tilde{U}(\theta) = \frac{N}{2} \|\tilde{x}\|^2. \tag{2.11}$$
This potential is everywhere positive except at points where the phasor centroid matches
the reference vector, where the potential is zero. Because the phasor centroid is not
uniquely determined by the group state, θ, the proof will employ LaSalle’s Invariance
Principle to show stability of θ to the set of states for which x̄ = xref . Given the nonnegativity of the phasor centroid error potential (2.11), we choose it as a candidate
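Lyapunov function:
\begin{align}
V(\theta) &= \tilde{U}(\theta) \tag{2.12} \\
&= \frac{N}{2} \left\langle \bar{x} - x_{\mathrm{ref}}, \bar{x} - x_{\mathrm{ref}} \right\rangle. \tag{2.13}
\end{align}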
The controller can be written using inner product notation as
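$$u_k = \frac{K}{N} \sum_{j=1}^{N} \left\langle x_j - x_{\mathrm{ref}}, x_k^{\perp} \right\rangle, \tag{2.14}$$
where $x_k^{\perp} = [-\sin(\theta_k), \cos(\theta_k)]^T$ is a unit vector perpendicular to $x_k$. Then, using the
fact that the reference vector is constant, the time derivative of the Lyapunov candidate along trajectories of the system is
$$\dot{V}(\theta) = N \left\langle \bar{x}(\theta) - x_{\mathrm{ref}}, \dot{\bar{x}}(\theta) \right\rangle, \tag{2.15}$$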
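where the velocity of the phasor centroid can be computed as
\begin{align}
\dot{\bar{x}} &= \frac{1}{N} \sum_{k=1}^{N} \dot{x}_k \tag{2.16} \\
&= \frac{1}{N} \sum_{k=1}^{N} x_k^{\perp} u_k \tag{2.17} \\
&= \frac{1}{N} \sum_{k=1}^{N} x_k^{\perp} \left( \frac{K}{N} \sum_{j=1}^{N} \left\langle x_j - x_{\mathrm{ref}}, x_k^{\perp} \right\rangle \right) \tag{2.18} \\
&= \frac{K}{N} \sum_{k=1}^{N} x_k^{\perp} \left\langle \bar{x} - x_{\mathrm{ref}}, x_k^{\perp} \right\rangle. \tag{2.19}
\end{align}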
Substituting this result for x̄˙ into (2.15) and using linearity properties of the inner
product,
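\begin{align}
\dot{V}(\theta) &= N \left\langle \bar{x} - x_{\mathrm{ref}}, \frac{K}{N} \sum_{k=1}^{N} x_k^{\perp} \left\langle \bar{x} - x_{\mathrm{ref}}, x_k^{\perp} \right\rangle \right\rangle \tag{2.20} \\
&= K \sum_{k=1}^{N} \left\langle \bar{x} - x_{\mathrm{ref}}, x_k^{\perp} \right\rangle^2, \tag{2.21}
\end{align}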
which is negative semi-definite for K < 0.
Because TN is compact, LaSalle’s Invariance Principle [35] can be used to conclude
that all solutions converge to the largest invariant set contained in
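$$E = \left\{ \theta \in \mathbb{T}^N \;\middle|\; \left\langle \bar{x}(\theta) - x_{\mathrm{ref}}, x_k^{\perp} \right\rangle = 0, \; \forall k = 1, \dots, N \right\}. \tag{2.22}$$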
Every point in E is invariant because θ̇ = 0 for all θ ∈ E.
The set E contains both stable and unstable equilibria. The unstable equilibria are
characterized by $\langle \bar{x} - x_{\mathrm{ref}}, x_k^{\perp} \rangle = 0$ for all k and x̄ ≠ xref. At these points, the phasor centroid error
is parallel to every heading, so the Lyapunov function (2.13) is non-decreasing. However, these points correspond to local maxima of the phasor centroid error potential
and any small perturbation will allow the control to descend the Lyapunov function.
This point can be seen mathematically by rotating the phasor centroid to the origin
and linearizing, which is trivial because θk ∈ {0, π}. At all other points in E, x̄ = xref
as desired. The restriction on the initial state given in the statement of the theorem
prevents the system from ever reaching an undesirable, locally unstable, equilibrium
point. Thus, the state necessarily flows to a point in the matched submanifold wherein
x̄ = xref .
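The convergence claimed in Theorem 2.2.1 can be explored numerically. The following is a minimal sketch, assuming forward-Euler integration of the phase integrators (2.3) under the control law (2.8); the number of agents (N = 5), initial phases, gain, step size, and horizon are assumptions made for illustration, and the function names are hypothetical. It is a simulation under stated assumptions, not a substitute for the proof.

```python
import math

def control(theta, K, rho_ref, theta_ref):
    """Phase control law (2.8), evaluated for every agent."""
    n = len(theta)
    return [(K / n) * sum(math.sin(tj - tk) for tj in theta)
            - K * rho_ref * math.sin(theta_ref - tk)
            for tk in theta]

def centroid_error(theta, rho_ref, theta_ref):
    """Norm of the phasor centroid error (2.10)."""
    n = len(theta)
    ex = sum(math.cos(t) for t in theta) / n - rho_ref * math.cos(theta_ref)
    ey = sum(math.sin(t) for t in theta) / n - rho_ref * math.sin(theta_ref)
    return math.hypot(ex, ey)

def simulate(theta0, K, rho_ref, theta_ref, dt=0.02, steps=20000):
    """Forward-Euler integration of theta_dot = u (illustrative integrator)."""
    theta = list(theta0)
    for _ in range(steps):
        u = control(theta, K, rho_ref, theta_ref)
        theta = [t + dt * uk for t, uk in zip(theta, u)]
    return theta

# Assumed example: N = 5 agents, reference vector 0.5 at angle -0.2, K = -1.
K, rho_ref, theta_ref = -1.0, 0.5, -0.2
theta0 = [0.3, 1.4, 3.0, 4.1, 5.2]
err_initial = centroid_error(theta0, rho_ref, theta_ref)
err_final = centroid_error(simulate(theta0, K, rho_ref, theta_ref),
                           rho_ref, theta_ref)
print(err_initial, err_final)
```

Because the matched submanifold is a set rather than a point, the final phases depend on the initial condition; only the centroid error is expected to vanish.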
Remark 2.2.2. The above proof works in part because the control (2.8) is a gradient
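controller of the phasor centroid error potential:
\begin{align}
\dot{\theta}_k &= K \frac{\partial \tilde{U}}{\partial \theta_k} \tag{2.23} \\
&= K N \left\langle \bar{x} - x_{\mathrm{ref}}, \frac{\partial \bar{x}}{\partial \theta_k} \right\rangle \tag{2.24} \\
&= K \left\langle \bar{x} - x_{\mathrm{ref}}, x_k^{\perp} \right\rangle \tag{2.25} \\
&= \frac{K}{N} \sum_{j=1}^{N} \left\langle x_j - x_{\mathrm{ref}}, x_k^{\perp} \right\rangle \tag{2.26} \\
&= \frac{K}{N} \sum_{j=1}^{N} \sin(\theta_j - \theta_k) - K \rho_{\mathrm{ref}} \sin(\theta_{\mathrm{ref}} - \theta_k). \tag{2.27}
\end{align}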
When the coupling gain K is negative, the state descends the phasor error potential to
zero, at which point x̄ = xref as shown in the proof. Previous work has shown that
the Kuramoto model, with natural frequencies set to zero, can also be derived from a
gradient of a potential [73, 74, 87]. Here, the potential is simply the phasor centroid
potential,
$$U(\theta) = \frac{N}{2} \|\bar{x}\|^2. \tag{2.28}$$
When the reference vector is at the origin, the two potentials are equivalent. Thus, the
Kuramoto-inspired nature of the control (2.8) is apparent, and further, Theorem 2.2.1
can be seen as a generalization of this previous work.
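The gradient structure in Remark 2.2.2 lends itself to a quick numerical check. The sketch below compares the control law (2.8) against K times a central finite-difference gradient of the error potential (2.11). The test state, reference vector, and finite-difference step are assumed values, and the function names are hypothetical.

```python
import math

def error_potential(theta, rho_ref, theta_ref):
    """Phasor centroid error potential (2.11)."""
    n = len(theta)
    ex = sum(math.cos(t) for t in theta) / n - rho_ref * math.cos(theta_ref)
    ey = sum(math.sin(t) for t in theta) / n - rho_ref * math.sin(theta_ref)
    return 0.5 * n * (ex * ex + ey * ey)

def control(theta, K, rho_ref, theta_ref):
    """Phase control law (2.8)."""
    n = len(theta)
    return [(K / n) * sum(math.sin(tj - tk) for tj in theta)
            - K * rho_ref * math.sin(theta_ref - tk)
            for tk in theta]

def numerical_gradient(theta, rho_ref, theta_ref, eps=1e-6):
    """Central finite-difference approximation of the gradient of (2.11)."""
    grad = []
    for k in range(len(theta)):
        up = list(theta)
        dn = list(theta)
        up[k] += eps
        dn[k] -= eps
        grad.append((error_potential(up, rho_ref, theta_ref)
                     - error_potential(dn, rho_ref, theta_ref)) / (2.0 * eps))
    return grad

# Assumed test point and reference vector.
theta = [0.3, 1.4, 3.0, 4.1]
K, rho_ref, theta_ref = -1.0, 0.5, -0.2
u = control(theta, K, rho_ref, theta_ref)
g = numerical_gradient(theta, rho_ref, theta_ref)
max_gap = max(abs(uk - K * gk) for uk, gk in zip(u, g))
print(max_gap)
```

A small value of `max_gap` is consistent with (2.23) through (2.27); it does not by itself establish stability.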
Remark 2.2.3. The initial condition requirement in the theorem has a physical interpretation. If the phasor centroid and reference vectors are not initially collocated,
x̄ ≠ xref, at least one agent must be non-parallel with the vector from the phasor centroid to the reference vector. This condition is not difficult to satisfy as the set of points
where P loses rank has measure zero.
The control in Theorem 2.2.1 was inspired by the Kuramoto model (2.1) of phase
coupled oscillators. As mentioned in Remark 2.2.2, the control (2.8) is equivalent to
the Kuramoto model when all natural frequencies are zero. In this case, the theorem
is consistent with previous results which show that a negative control gain will result
in an anti-aligned, or equivalently balanced, state in which, by definition, the phasor
centroid is located at the origin.
The set of states θ for which the group state is aligned or balanced (equivalently anti-aligned) can be respectively defined as follows,
Aligned Submanifold:
$$A = \{\theta \in \mathbb{T}^N \mid \bar{\rho} = 1\} \tag{2.29}$$
Balanced Submanifold:
$$B = \{\theta \in \mathbb{T}^N \mid \bar{\rho} = 0\}. \tag{2.30}$$
Similarly, we define the set of all states θ ∈ TN for which the phasor centroid matches
the reference vector as the matched submanifold,
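$$\mathcal{M}(x_{\mathrm{ref}}) \equiv \left\{ \theta \in \mathbb{T}^N \;\middle|\; \frac{1}{N} \sum_{k=1}^{N} x_k = x_{\mathrm{ref}} \right\}. \tag{2.31}$$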
The control (2.8) asymptotically stabilizes the phasor centroid to the reference vector
and in doing so, brings the state to a point in the matched submanifold. Please refer
to Figure 2.2.
In relation to previous work, Theorem 2.2.1 provides a means of stabilizing the
phasor centroid to an arbitrary reference vector whereas other results only permit
stabilization of the phasor centroid magnitude to either one or zero, corresponding to
aligned and anti-aligned group states.
2.2.2 Dynamic Reference Vector Matching
We now relax the constant reference vector assumption and extend the previous result
to allow for a dynamic reference vector.
Theorem 2.2.4 (Matching a Dynamic Reference Vector). Consider a system consisting of N > 1 phase integrators (2.3) and a known dynamic reference vector, xref(t) = ρref(t)∠θref(t), in the closed unit ball (ρref(t) ≤ 1, ∀t > 0). The phase control law
$$u_k = \frac{K}{N} \sum_{j=1}^{N} \sin(\theta_j - \theta_k) - K \rho_{\mathrm{ref}} \sin(\theta_{\mathrm{ref}} - \theta_k) + h_k(t), \quad K < 0, \tag{2.32}$$
will asymptotically stabilize the phasor centroid (2.4) to the known dynamic reference
vector if h(t) ∈ R^N exists satisfying
$$\frac{1}{N} \sum_{k=1}^{N} \begin{bmatrix} -\sin(\theta_k) \\ \cos(\theta_k) \end{bmatrix} h_k(t) = \dot{x}_{\mathrm{ref}}(t) \tag{2.33}$$
at each point in time and the initial condition satisfies either (a) P from (2.9) has full
row rank or (b) x̄ = xref .
Proof. The proof follows that of Theorem 2.2.1 closely until the calculation of the Lyapunov function derivative (2.15). Because xref (t) is no longer constant, the derivative
has an additional term,
$$\dot{V}(\theta) = N \left\langle \bar{x} - x_{\mathrm{ref}}, \dot{\bar{x}} - \dot{x}_{\mathrm{ref}}(t) \right\rangle. \tag{2.34}$$
The controller, with the additional feedforward term, can now be written as,
$$u_k = \frac{K}{N} \sum_{j=1}^{N} \left\langle x_j - x_{\mathrm{ref}}, x_k^{\perp} \right\rangle + h_k(t), \tag{2.35}$$
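so that the velocity of the phasor centroid becomes
\begin{align}
\dot{\bar{x}} &= \frac{1}{N} \sum_{k=1}^{N} x_k^{\perp} \left( \frac{K}{N} \sum_{j=1}^{N} \left\langle x_j - x_{\mathrm{ref}}, x_k^{\perp} \right\rangle + h_k(t) \right) \tag{2.36} \\
&= \frac{K}{N} \sum_{k=1}^{N} x_k^{\perp} \left\langle \bar{x} - x_{\mathrm{ref}}, x_k^{\perp} \right\rangle + \frac{1}{N} \sum_{k=1}^{N} x_k^{\perp} h_k(t) \tag{2.37} \\
&= \frac{K}{N} \sum_{k=1}^{N} x_k^{\perp} \left\langle \bar{x} - x_{\mathrm{ref}}, x_k^{\perp} \right\rangle + \dot{x}_{\mathrm{ref}}(t). \tag{2.38}
\end{align}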
When substituted back into (2.34), the motion of the reference vector drops out of the
equation, so we are left with
$$\dot{V}(\theta) = K \sum_{k=1}^{N} \left\langle \bar{x}(\theta) - x_{\mathrm{ref}}, x_k^{\perp} \right\rangle^2, \tag{2.39}$$
as in Theorem 2.2.1, and the result follows.
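Remark 2.2.5. Note that (2.33) can be rewritten as the linear system
$$\frac{1}{N} J h(t) = \dot{x}_{\mathrm{ref}}(t), \tag{2.40}$$
where
$$J = \begin{bmatrix} -\sin(\theta_1) & -\sin(\theta_2) & \cdots & -\sin(\theta_N) \\ \cos(\theta_1) & \cos(\theta_2) & \cdots & \cos(\theta_N) \end{bmatrix}. \tag{2.41}$$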
A solution for h(t) exists whenever ẋref (t) is in the range space of J. A sufficient condition here is that all agents be non-parallel, which results in a full row-rank matrix
J guaranteeing a solution for h(t). Situations in which vehicles end up parallel occur
infrequently and can even be avoided in most cases by choosing h(t) to keep headings
separated.
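One way to obtain a feedforward term when J has full row rank is the minimum-norm solution of (2.33). This particular choice is an assumption made here for illustration; the text only requires that some solution exist, and other choices (for example, ones that keep headings separated) are equally admissible. The sketch below solves (2.33) directly, including its 1/N factor, using the 2-by-2 Gram matrix of J; the function name and test values are hypothetical.

```python
import math

def feedforward(theta, xref_dot):
    """Minimum-norm h solving (1/N) * sum_k x_k_perp * h_k = xref_dot."""
    n = len(theta)
    s = [-math.sin(t) for t in theta]   # first row of J in (2.41)
    c = [math.cos(t) for t in theta]    # second row of J in (2.41)
    # Gram matrix G = J J^T = [[a, b], [b, d]]
    a = sum(v * v for v in s)
    b = sum(p * q for p, q in zip(s, c))
    d = sum(v * v for v in c)
    det = a * d - b * b
    if abs(det) < 1e-12:
        raise ValueError("J has lost row rank: all headings are parallel")
    # Solve G y = N * xref_dot, then set h = J^T y.
    rx, ry = n * xref_dot[0], n * xref_dot[1]
    y0 = (d * rx - b * ry) / det
    y1 = (-b * rx + a * ry) / det
    return [sk * y0 + ck * y1 for sk, ck in zip(s, c)]

# Assumed headings and reference-vector velocity.
theta = [0.3, 1.4, 3.0, 4.1]
xref_dot = (0.05, -0.02)
h = feedforward(theta, xref_dot)

# Residual of (2.33): both components should be close to zero.
n = len(theta)
res_x = sum(-math.sin(t) * hk for t, hk in zip(theta, h)) / n - xref_dot[0]
res_y = sum(math.cos(t) * hk for t, hk in zip(theta, h)) / n - xref_dot[1]
print(res_x, res_y)
```

The explicit rank test mirrors the sufficient condition in the remark: when all headings are parallel the Gram matrix is singular and the sketch refuses to return a value.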
An implicit requirement of Theorem 2.2.4 is that the dynamics of the reference
vector be known precisely. If tracking a planned trajectory, the prescribed reference
vector dynamics can be known a priori by each agent. Otherwise, in situations like
target tracking and UAV operation in wind, the reference vector dynamics may only
be known approximately through an estimation process. Informally, an input-to-state
stability property, in which a better approximation of the reference vector dynamics
results in a smaller steady state error between the phasor centroid and the reference
vector, appears here.
2.2.3 Extension to Higher Dimensions
The ideas presented in this section extend naturally from θk ∈ T^1 to higher-dimensional
orientation spaces. Let m denote the dimension of the state of each agent. In the
material presented thus far, the state of each agent is a single point in T, so m = 1. This
state could, for example, represent the yaw angle (only) of a UW-FAV. To represent and
apply the coordinated control to higher-dimensional spaces, for example to include yaw and
pitch of each UW-FAV (m = 2), a change in notation is required. The change amounts
to representing orientation as an element of a special orthogonal group, SO(·), instead
of a point in a torus. For the 1D case,
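$$\theta_k \in \mathbb{T} \;\Rightarrow\; \begin{bmatrix} t_k & n_k \end{bmatrix} \in SO(2), \tag{2.42}$$
where $t_k = [\cos\theta_k \;\; \sin\theta_k]^T$ and $n_k = [-\sin\theta_k \;\; \cos\theta_k]^T$ are unit tangent and normal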
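basis vectors. In these coordinates, the phase integrator dynamics can be written as
$$\begin{bmatrix} \dot{t}_k \\ \dot{n}_k \end{bmatrix} = \begin{bmatrix} 0 & u_k \\ -u_k & 0 \end{bmatrix} \begin{bmatrix} t_k \\ n_k \end{bmatrix}. \tag{2.43}$$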
As for the control law, recall that the controller can be derived as the gradient of the
phasor centroid error potential (2.11). In the new coordinates, this potential becomes
$$\tilde{U}(g) = \frac{N}{2} \|\bar{t}(g) - t_{\mathrm{ref}}\|^2, \tag{2.44}$$
where g ∈ SO(2)N is the collective group state, tref is a unit vector in the reference
direction (tref = xref for m = 1), and
$$\bar{t}(g) = \frac{1}{N} \sum_{k=1}^{N} t_k \tag{2.45}$$
is the familiar phasor centroid. Taking the gradient of (2.44) with respect to $t_k$ reveals
$$\dot{t}_k = u_k n_k = K \frac{\partial \tilde{U}}{\partial t_k} = K (\bar{t} - t_{\mathrm{ref}}), \tag{2.46}$$
analogous to (2.23). The scalar input, $u_k$, can be isolated by pre-multiplying both sides
of (2.46) by $n_k^T$, resulting in
$$u_k = K n_k^T (\bar{t} - t_{\mathrm{ref}}). \tag{2.47}$$
When the inner product is expanded using the definition of t̄, the resulting control is
equivalent to (2.8).
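The stated equivalence between the frame-based control and (2.8) can be checked numerically. The sketch below takes tref = xref, as the text does for m = 1, and compares the two expressions agent by agent; the test state, gain, and reference vector are assumed values and the function names are hypothetical.

```python
import math

def control_frame(theta, K, rho_ref, theta_ref):
    """u_k = K * n_k^T (t_bar - t_ref), with n_k = [-sin(theta_k), cos(theta_k)]."""
    n = len(theta)
    ex = sum(math.cos(t) for t in theta) / n - rho_ref * math.cos(theta_ref)
    ey = sum(math.sin(t) for t in theta) / n - rho_ref * math.sin(theta_ref)
    return [K * (-math.sin(tk) * ex + math.cos(tk) * ey) for tk in theta]

def control_torus(theta, K, rho_ref, theta_ref):
    """Phase control law (2.8) written with sines of phase differences."""
    n = len(theta)
    return [(K / n) * sum(math.sin(tj - tk) for tj in theta)
            - K * rho_ref * math.sin(theta_ref - tk)
            for tk in theta]

# Assumed test point.
theta = [0.2, 1.1, 2.5, 4.0, 5.3]
K, rho_ref, theta_ref = -1.5, 0.5, -0.2
u_frame = control_frame(theta, K, rho_ref, theta_ref)
u_torus = control_torus(theta, K, rho_ref, theta_ref)
max_gap = max(abs(a - b) for a, b in zip(u_frame, u_torus))
print(max_gap)
```

The frame-based form needs only one pass to form the centroid, whereas the direct form of (2.8) evaluates all pairwise phase differences.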
This new notation and control derivation extends easily to higher-dimensions. For
example, with m = 2, the state of each agent is
$$\begin{bmatrix} t_k & n_k & b_k \end{bmatrix} \in SO(3), \tag{2.48}$$
where bk is a unit binormal vector, bk = tk × nk . The phase integrator dynamics (2.3)
now have two inputs,
\begin{align}
\dot{t}_k &= u_k n_k + v_k b_k \tag{2.49} \\
&= K \frac{\partial U}{\partial t_k} \tag{2.50} \\
&= K \bar{t}. \tag{2.51}
\end{align}
Because nk and bk are orthonormal, uk and vk can be isolated by pre-multiplying by
nTk and bTk , respectively, to obtain
uk = −KnTk t̄,
(2.52)
vk = −KbTk t̄.
(2.53)
This control will drive the phasor centroid to the reference vector. The theorem and
proof are similar to those of Theorem 2.2.1, but in the modified notation. They can be
found in our conference paper [43], which also contains a theorem and proof to allow
dynamic reference vectors, similar to Theorem 2.2.4. Extension to general dimension
m follows directly from these results.
2.3 Controlled Oscillation in the Matched Manifold
The results of the previous section guarantee stability of the state, θ ∈ TN , to a point
in the matched submanifold (2.31) under the Kuramoto-inspired phase integrator control. In this section, we explore an additive control term to move the state within
the matched submanifold. In particular, we are interested in creating symmetric and
periodic oscillations (orbits of theta) that extend the classical definition of the splay
state.
Motion within the matched submanifold, specifically periodic motion, is desirable
for a number of reasons. For example, in the application of coordinated target tracking
by constant-speed unicycles, motion within the matched submanifold is needed to prevent vehicles from going straight once the velocity of the centroid matches the desired
vector. Also, in biological modeling applications, the agents often continue heading
control even when the group is moving at a desirable velocity.
The additive term considered in this section will be denoted v, and thus the resulting composite phase integrator control will have the form:
θ̇ = u + v,
(2.54)
where u is one of the previously derived controllers like (2.8) or (2.32).
2.3.1 The Splay State and Splay Orbits
In the terminology of classical phase-coupling, the splay state is a special balanced
state in which angle differences between adjacent nodes are equal to 2π/N , see Figure
2.3. Here, we define the submanifold of all such balanced splay states as
$$S = \left\{ \theta \in \mathbb{T}^N \;\middle|\; \theta_{\gamma(k+1)} - \theta_{\gamma(k)} = 2\pi/N, \; k = 1, \dots, N \right\}, \tag{2.55}$$
where γ is a permutation of 1, 2, . . . , N . When γ is the identity permutation, we will
refer to the balanced splay state as being ordered.
The tangent bundle to the splay manifold (i.e. the motion that keeps the state splay) is
spanned by the vector of ones. Thus a constant, homogeneous rotation of a splay state,
$$\dot{\theta} = \omega \mathbf{1}, \quad \theta(0) \in S, \tag{2.56}$$
produces a periodic balanced splay orbit during which the phasor centroid is located
at the origin at every time instant. Here, we extend this notion of a splay orbit to
situations in which the phasor centroid is located away from the origin. In doing so,
we will focus only on the design of centroid-locked control, denoted v, and later add in
u to create a composite phase-integrator control law as in (2.54).
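The balanced splay orbit generated by (2.56) is easy to check numerically. The sketch below builds an ordered splay state, rotates it homogeneously, and evaluates the magnitude of the phasor centroid at several times. The number of agents, offset, rotation rate, and sample times are assumed values, and the function names are hypothetical.

```python
import math

TWO_PI = 2.0 * math.pi

def splay_state(n, offset=0.0):
    """Ordered splay state: adjacent phases separated by 2*pi/n."""
    return [(offset + TWO_PI * k / n) % TWO_PI for k in range(n)]

def rotate(theta, omega, t):
    """Closed-form solution of theta_dot = omega * 1 after time t."""
    return [(th + omega * t) % TWO_PI for th in theta]

def order_parameter(theta):
    """Magnitude of the phasor centroid (2.4)."""
    n = len(theta)
    cx = sum(math.cos(t) for t in theta) / n
    cy = sum(math.sin(t) for t in theta) / n
    return math.hypot(cx, cy)

# Assumed example: N = 5 agents rotating at omega = 0.8 rad/s.
theta0 = splay_state(5, offset=0.3)
sample_times = [0.0, 0.7, 1.9, 4.2, 10.0]
max_rho = max(order_parameter(rotate(theta0, 0.8, t)) for t in sample_times)
print(max_rho)
```

The same helper functions can be reused to confirm that a non-splay state with a nonzero centroid does not keep its centroid fixed under homogeneous rotation, which is the limitation the following definitions address.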
The main feature that characterizes the balanced splay state is that agents are
equally spaced in phase, Figure 2.3. This characterization, however, does not extend to
non-balanced states because equal angle separation between agents necessarily places
the phasor centroid at the origin. A concept which does extend is that along the balanced splay orbit, the agent headings are equally spaced in time. With this concept in
mind, we provide the following definitions which generalize and extend the splay-state
concept:
Definition 2.3.1 (Centroid-Locked Orbit). A centroid-locked orbit is a periodic trajectory of θ in which all points in the trajectory have the same phasor centroid.
Definition 2.3.2 (Splay Orbit). A splay orbit is a specific centroid-locked orbit in which
the phases are identically equally spaced (splay) in time.
The concept of orbits in which the phases are equally spaced in time has been explored extensively in the literature. For example in [53], the existence of such orbits is
proved for Josephson junction phase systems, which have dynamics parameterized by
constants $c_1$, $c_2$, and $c_3$:
$$\dot{\theta}_k = c_1 + c_2 \sin(\theta_k) + c_3 \sum_{j=1}^{N} \sin(\theta_j). \tag{2.57}$$
Other authors have referred to orbits in which the phases are equally spaced in time as
“wagon wheel” or “ponies on a merry-go-round” solutions [1, 2]. Our definition differs
in the requirement that all points along the trajectory must be in the equivalence class
of states for which the phasor centroid is collocated with the reference vector. The
related work by Kingston and Beard [37] also develops splay orbits, but again they
lack the centroid-locked requirement. In fact, the relative-circling control proposed in
[37] causes motion of the phasor centroid whenever the state is not balanced.
2.3.2 Equivariant Dynamical Systems
One concept prevalent in related work is the notion of equivariance. For the system
θ̇ = F (θ), Γ-equivariance implies
σ ◦ F (θ) = F (σ ◦ θ),
(2.58)
where σ ∈ Γ is a group action, and Γ is a group. For the work in this chapter, F is the
centroid-locked control, v, and the group action of interest is the ring permutation,
σ : 1, 2, . . . , N → 2, 3, . . . , N, 1.
(2.59)
Equivariant dynamical systems [86] have been studied since the 1970s and recently
caught the attention of mathematicians interested in studying phase-coupled oscillator
systems, see [12] for example. Early work on equivariant differential topology focused
on the geometry and symmetry present in these systems [19, 25], and has many parallels to the work in this section. Throughout much of this literature, splay-phase orbits
are largely suspected to be neutrally stable [59], at least for the Kuramoto model (2.1).
Our ultimate goal here is to stabilize splay orbits, as we shall soon see. In fact, using
properties of equivariance, Mirollo was able to prove the existence of splay orbits in
the Kuramoto model [53] (however, not the centroid-locked variety in which we are
interested) using a theorem along the lines of the following lemma.
Lemma 2.3.3 (Equivariant Centroid-Locked Control). An equivariant centroid-locked
control v that takes the state of the N-phase integrator system (2.3) from
$$\theta(t) = [\theta_1(t), \theta_2(t), \dots, \theta_N(t)] \tag{2.60}$$
to
$$\theta(t + T/N) = \sigma\theta(t) = [\theta_2(t), \theta_3(t), \dots, \theta_N(t), \theta_1(t)] \tag{2.61}$$
in finite time T/N ensures that the ensuing state trajectory will be a periodic splay orbit
with period T.
Proof. This lemma comes from the symmetry inherent in ring-permutation-equivariant
control. All agents apply the same control, but splay in time. Thus, at time t +
2T
N ,
the state will be θ(t + 2T /N ) = σ 2 θ(t) = [θ3 (t), θ4 (t), . . . , θ1 (t), θ2 (t)] and so forth until
θ(t + T ) = σ N θ(t) = θ(t).
Example 2.3.4. When θ ∈ S, the homogeneous centroid-locked control, v = ω 1,
satisfies the precondition of Lemma 2.3.3. The resulting balanced splay orbit has period
T = 2π/ω.
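The precondition of Lemma 2.3.3 can be verified directly for this example. The sketch below assumes an ordered splay state and checks that advancing the homogeneous rotation by T/N reproduces the ring-permuted state of (2.61), with phases compared modulo 2π. The number of agents, offset, and rotation rate are assumed values, and the function names are hypothetical.

```python
import math

TWO_PI = 2.0 * math.pi

def ring_shift(theta):
    """Ring permutation (2.59): [th_1, ..., th_N] -> [th_2, ..., th_N, th_1]."""
    return theta[1:] + theta[:1]

def circular_distance(a, b):
    """Distance between two angles measured on the circle."""
    d = (a - b) % TWO_PI
    return min(d, TWO_PI - d)

# Assumed example: N = 6 agents, omega = 0.5 rad/s, so T = 2*pi/omega.
n, omega = 6, 0.5
period = TWO_PI / omega
theta = [0.4 + TWO_PI * k / n for k in range(n)]      # ordered splay state

advanced = [th + omega * (period / n) for th in theta]  # theta(t + T/N)
permuted = ring_shift(theta)                            # sigma applied to theta(t)
max_gap = max(circular_distance(a, p) for a, p in zip(advanced, permuted))
print(max_gap)
```

Applying `ring_shift` N times returns the original list, matching the closing step of the proof.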
2.3.3 Constants of Motion
Perhaps the most illuminating work on phase coupled systems, including equivariant
ones, is the work of Watanabe and Strogatz [87, 88] from 1994. In this work, they
studied phase coupled systems of the form
$$\dot{\theta}_k = f + g \cos(\theta_k) + h \sin(\theta_k), \quad k = 1, \dots, N, \tag{2.62}$$
where f, g, and h are 2π periodic functions. Importantly, the Josephson junction phase
system (2.57) can be written in the above form. While the system (2.62) appears to
have N degrees of freedom, [87] presents a nonlinear coordinate transformation in
which only three degrees of freedom are required to drive the entire system. The other
N − 3 degrees are constants of motion. We shall see many parallels between [87] and
the development of motion on the matched manifold in the following section.
2.3.4 Motion on the Matched Manifold
The following theorem establishes the space of allowable additive controls, v.
Theorem 2.3.5 (Centroid-Locked Control). Consider a system consisting of N > 1
phase integrators (2.3) and assume the state is initially in the matched submanifold,
θ ∈ M(xref ), for a given constant reference vector, xref = ρref ∠θref , ρref ≤ 1. The
state will remain in the matched submanifold under any phase control law (θ̇k = vk)
satisfying
$$v \in \ker J \tag{2.63}$$
when J has full row rank and
$$v = 0 \tag{2.64}$$
otherwise. Here, J was previously defined in (2.41).
The proof of this theorem relies upon formal definitions of tangent spaces to a manifold and in particular, the submersion theorem. More information about these topics
can be found in [51]. We first review these concepts and then present proof of the
theorem.
Definition 2.3.6 (Tangent Space [51]). Given a manifold M and element m ∈ M , the
tangent space to M at m is the set of equivalence classes of curves at m:
Tm M = {[c]m |c is a curve at m}.
For a subset A ⊂ M , let
T M |A = ∪m∈A Tm M (disjoint union).
We call T M = T M |M the tangent bundle of M . The mapping τM : T M → M defined by
τM ([c]m ) = m is the tangent bundle projection of M .
Definition 2.3.7 (Tangent to a Map). If f : M → P is of class C 1 , we define T f : T M →
T P by
T f ([c]m ) = [f ◦ c]f (m) .
We call T f the tangent of f .
Definition 2.3.8 ([51], Ch. 3). Suppose M and P are manifolds with f : M → P of
class C^r, r ≥ 1. A point p ∈ P is called a regular value of f if for each m ∈ f^{-1}({p}),
T_m f is surjective with split kernel. Let R_f denote the set of regular values of f : M → P;
note P \ f(M) ⊂ R_f ⊂ P. If, for each m in a set S, T_m f is surjective with split kernel, we
say f is a submersion on S. Thus p ∈ R_f iff f is a submersion on f^{-1}({p}). If T_m f is not
surjective, m ∈ M is called a singular point and p = f(m) ∈ P a singular value of f.
Theorem 2.3.9 (Submersion Theorem). Let f : M → P be of class C^∞ and p ∈ R_f. Then
the level set
f^{-1}(p) = {m | m ∈ M, f(m) = p}
is a closed submanifold of M with tangent space given by T_m f^{-1}(p) = ker T_m f.
With these ideas in mind, we are now ready to prove Theorem 2.3.5.
Proof of Theorem 2.3.5. The tangent space to the matched manifold is Zariski in that
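the tangent space has variable dimension. Here, we take
\[
f(\theta, x_{ref}) = \frac{1}{N} \sum_{k=1}^{N} \begin{bmatrix} \cos(\theta_k) \\ \sin(\theta_k) \end{bmatrix} - \rho_{ref} \begin{bmatrix} \cos(\theta_{ref}) \\ \sin(\theta_{ref}) \end{bmatrix}.
\]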
For this system Tθ f = J, and the tangent space of the matched submanifold is given by
ker Tθ f at regular points, by the submersion theorem. These regular points correspond
to points θ where J has rank 2, so that Tθ f is surjective. Points where J
loses rank are singular because Tθ f is not surjective at these
points. These points correspond to situations in which all phasors are parallel.
Moving in a direction tangent to a manifold keeps the state in the manifold, by
definition. For the regular points, this condition corresponds to choosing v ∈ ker J. At
singular points, v ∈ ker J is not sufficient to keep the state within the manifold. Here,
we require v = 0, which certainly does keep the state within the matched submanifold.
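The rank condition used in the proof can be checked numerically. The following is a minimal sketch, assuming NumPy and assuming that J from (2.41) is the 2 × N matrix whose k-th column is x_k^⊥/N, which is consistent with (2.70). The function names are illustrative and not part of the original work.

```python
import numpy as np

def centroid_jacobian(theta):
    """Assumed form of J: a 2 x N matrix with columns x_k^perp / N."""
    theta = np.asarray(theta, dtype=float)
    return np.vstack((-np.sin(theta), np.cos(theta))) / theta.size

def kernel_basis(theta, tol=1e-12):
    """Orthonormal basis of ker J obtained from the SVD (columns span the kernel)."""
    J = centroid_jacobian(theta)
    _, s, vt = np.linalg.svd(J)
    rank = int(np.sum(s > tol))
    return vt[rank:].T

regular = np.array([0.0, 1.0, 2.5])      # non-parallel headings
singular = np.array([0.0, 0.0, np.pi])   # all headings parallel

rank_regular = np.linalg.matrix_rank(centroid_jacobian(regular), tol=1e-12)
rank_singular = np.linalg.matrix_rank(centroid_jacobian(singular), tol=1e-12)
dim_ker_regular = kernel_basis(regular).shape[1]
dim_ker_singular = kernel_basis(singular).shape[1]
```

For the non-parallel headings the kernel is one-dimensional, whereas for the parallel configuration J drops to rank one and the kernel gains a dimension, which matches the distinction between regular and singular points drawn above.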
In the following exposition, we will focus on the design of v satisfying (2.63) that
produces oscillatory state trajectories within the matched submanifold. To distinguish
between the allowable control inputs, we have the following definition.
Definition 2.3.10 (Centroid-Locked Control). A centroid-locked control, henceforth denoted by v, is a phase control input that satisfies (2.63). Theorem 2.3.5 guarantees that
the phasor centroid will not move (\dot{\bar{x}}(t) = 0) for any state trajectory generated by a
centroid-locked control.
Centroid-locked control moves the state in such a way so as to preserve the phasor
centroid, independent of whether or not the phasor centroid matches the reference
vector. This concept is formalized in the following theorem.
Theorem 2.3.11 (Combined Control). Consider a system consisting of N > 1 phase
integrators (2.3) and a constant reference vector, xref = ρref ∠θref , in the closed unit
ball (ρref ≤ 1). Any phase control law of the form
\[
u_k = \frac{K}{N} \sum_{j=1}^{N} \sin(\theta_j - \theta_k) - K \rho_{ref} \sin(\theta_{ref} - \theta_k) + v_k, \quad K < 0 \tag{2.65}
\]
where v = [v1 , v2 , . . . , vN ]T is a centroid-locked control, will asymptotically stabilize the
phasor centroid (2.4) to the reference vector provided the initial condition is such that
either P from (2.9) has full rank or x̄ = xref .
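Proof. We return here to the proof of Theorem 2.2.1, but now with the composite control (2.65). Rewriting the composite control using the inner product,
\[
u_k = \frac{K}{N} \sum_{j=1}^{N} \langle x_j - x_{ref}, x_k^\perp \rangle + v_k, \tag{2.66}
\]
the velocity of the phasor centroid,
\begin{align}
\dot{\bar{x}} &= \frac{1}{N} \sum_{k=1}^{N} \dot{x}_k \tag{2.67} \\
&= \frac{1}{N} \sum_{k=1}^{N} x_k^\perp \left( \frac{K}{N} \sum_{j=1}^{N} \langle x_j - x_{ref}, x_k^\perp \rangle + v_k \right) \tag{2.68} \\
&= \frac{K}{N} \sum_{k=1}^{N} x_k^\perp \langle \bar{x} - x_{ref}, x_k^\perp \rangle, \tag{2.69}
\end{align}
remains as it was in Theorem 2.2.1 because either v = 0 (when θ is a singular point)
or
\[
\frac{1}{N} \sum_{k=1}^{N} x_k^\perp v_k = Jv = 0. \tag{2.70}
\]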
The remainder of the proof follows the previous result.
Remark 2.3.12. The control in Theorem 2.3.11 is designed for a constant reference
vector. However, the result can be extended to a dynamic reference vector by adding
a centroid-locked control to (2.32) from Theorem 2.2.4. This extension again works
because of the orthogonality of the centroid-locked control.
2.4 Equivariance and Splay Orbits
This section is devoted to development of an equivariant centroid-locked control that
produces splay orbits for non-balanced group states. The exposition will begin with
two agents and build up to general N . Some open problems remaining here will be
pointed out along the way.
2.4.1 Two Agents
With only two agents, J from (2.41) is full rank whenever the order parameter, ρ̄, is
between zero and one, 0 < ρ̄ < 1. For any of these points, however, no centroid-locked
control is possible because x̄ = xref for only two points in T2 , see Figure 2.4(a). When
the order parameter is zero, J is rank one, and the null space corresponds to the usual
homogeneous rotation, see Figure 2.4(b). When the order parameter is one, the state
is singular because both agents are aligned. The null space of J is spanned by [1, −1]^T. Note that
motion along this direction would take the state off the matched submanifold, and the
theorem appropriately requires v = 0 here.
2.4.2 Three Agents
For the case of three agents,
\[
J = \frac{1}{3} \begin{bmatrix} -\sin(\theta_1) & -\sin(\theta_2) & -\sin(\theta_3) \\ \cos(\theta_1) & \cos(\theta_2) & \cos(\theta_3) \end{bmatrix} \tag{2.71}
\]
has full row rank whenever the agents are non-parallel, in which case the null space
of J can be computed analytically. For example, the cross product of the two rows of J
gives
\[
v = \begin{bmatrix} \sin(\theta_3 - \theta_2) \\ \sin(\theta_1 - \theta_3) \\ \sin(\theta_2 - \theta_1) \end{bmatrix}, \tag{2.72}
\]
which is both equivariant and in the kernel of J. Further, (2.72) is a valid centroid-locked control because it is in the kernel of J when the state is non-singular and is zero
whenever J loses rank due to parallel agents. Choosing this v as the centroid-locked
control makes clear the fact that the centroid-locked control is generally non-constant.
Instead, it has a natural all-to-all nonlinear feedback form. The information pattern
is unusual in that information used by each agent is the difference between the other
two agents.
For three agents, the balanced and splay manifolds are equivalent. When the state
is balanced/splay, all angle differences are equal, and the resulting centroid-locked
control, v from (2.72), reduces to a constant vector in the span of 1. Thus the equivariant centroid-locked control presented above can be seen as an extension of the
homogeneous natural frequency often selected for Kuramoto-inspired heading control.
When the state is aligned (a singular point), the control (2.72) reduces to zero, as it
must because the matched submanifold reduces to a single point whenever the order
parameter is one.
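The properties claimed for (2.72) are easy to confirm numerically. The sketch below assumes NumPy and uses illustrative function names (`v3`, `jacobian3`) that do not appear in the original text. It checks that v lies in the kernel of J at a random state, that it reduces to a multiple of the vector of ones at the balanced/splay state, and that it vanishes at an aligned state.

```python
import numpy as np

def v3(theta):
    """Equivariant centroid-locked control (2.72) for three agents."""
    t1, t2, t3 = theta
    return np.array([np.sin(t3 - t2), np.sin(t1 - t3), np.sin(t2 - t1)])

def jacobian3(theta):
    """J from (2.71): rows are -sin and cos of the headings, scaled by 1/3."""
    theta = np.asarray(theta, dtype=float)
    return np.vstack((-np.sin(theta), np.cos(theta))) / 3.0

rng = np.random.default_rng(0)
theta = rng.uniform(-np.pi, np.pi, 3)
residual = np.linalg.norm(jacobian3(theta) @ v3(theta))   # should vanish

splay = np.array([0.0, 2.0 * np.pi / 3.0, 4.0 * np.pi / 3.0])
v_splay = v3(splay)       # every entry equals sin(2*pi/3)

aligned = np.array([0.4, 0.4, 0.4])
v_aligned = v3(aligned)   # zero at the aligned (singular) state
```

At the splay state every entry equals sin(2π/3) = √3/2, so the control acts as a common natural frequency, as described above.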
The null space of J gains a dimension when all headings are parallel (a singular
point). To explore these degenerate cases, assume without loss of generality that the
phasor centroid is located at [ρ̄, 0]. The degenerate cases are when one, two, or three
headings are at θ = 0, and the other headings are at θ = π. With the first two headings at zero and the third at π, the null space of J is spanned by the following two linearly
independent vectors
\[
\ker J = \mathrm{span}\left\{ \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} \right\}. \tag{2.73}
\]
The case of two headings at π is equivalent after a rotation of θ by π. In either case,
motion along either of these directions is sustainable although linear combinations are
not. The centroid-locked control v from (2.72) is zero when all vehicles are parallel (as
required by the definition of centroid-locked control). When all three headings are at
zero, the state is aligned (again singular), and the null space is perpendicular to the
vector of ones. Any non-zero rotation is not possible as these rotations will break the
aligned state. Thus no rotation is possible, and v = 0 accordingly, for an aligned state.
An interesting bifurcation occurs when the order parameter is 1/3. Above 1/3, no
heading can point opposite of the phasor centroid because even if the other two headings were aligned with the phasor centroid, the order parameter could only be as large
as 1/3. The centroid-locked control (2.72) automatically results in a back-and-forth
motion with winding number of zero. This sort of back-and-forth motion was not possible using the controller proposed in [37]. When the order parameter is less than
1/3, one heading can point opposite of the phasor centroid. The centroid-locked control
(2.72) results in a clockwise or counter clockwise circling motion with winding number
of plus or minus one. This circling motion preserves the phasor centroid, unlike the
relative-circling control proposed in [37]. When the order parameter is exactly 1/3,
(2.72) reaches an equilibrium point when all oscillators are parallel.
When the order parameter is not 1/3 (or 1), the oscillations produced by (2.72) are
splay orbits, see Figure 2.5. Proving this result, however, is complicated by the fact
that the equivariant centroid-locked control law (2.72) is not analytically integrable.
We hope in the future to be able to derive constants of motion as done by Watanabe
and Strogatz [87], but the system (2.72) does not fit into the required structure (2.62).
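Although (2.72) is not integrated analytically here, its behavior is easy to explore numerically. The following sketch assumes NumPy, a classical fourth-order Runge-Kutta integrator, and an arbitrarily chosen initial state with order parameter above 1/3; none of these choices come from the original work. It integrates θ̇ = v, measures how far the phasor centroid drifts, and records the largest excursion of any heading from the centroid direction.

```python
import numpy as np

def v3(theta):
    """Equivariant centroid-locked control (2.72) for three agents."""
    t1, t2, t3 = theta
    return np.array([np.sin(t3 - t2), np.sin(t1 - t3), np.sin(t2 - t1)])

def centroid(theta):
    """Phasor centroid of the headings as a vector in the plane."""
    return np.array([np.mean(np.cos(theta)), np.mean(np.sin(theta))])

def rk4_step(f, theta, dt):
    """One classical Runge-Kutta step for theta_dot = f(theta)."""
    k1 = f(theta)
    k2 = f(theta + 0.5 * dt * k1)
    k3 = f(theta + 0.5 * dt * k2)
    k4 = f(theta + dt * k3)
    return theta + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

theta0 = np.array([-0.8, 0.1, 0.9])   # illustrative initial state
dt, steps = 0.01, 2000                # 20 s of simulated time
traj = np.empty((steps + 1, 3))
traj[0] = theta0
for n in range(steps):
    traj[n + 1] = rk4_step(v3, traj[n], dt)

x0 = centroid(theta0)
rho0 = np.linalg.norm(x0)                                    # order parameter
drift = max(np.linalg.norm(centroid(th) - x0) for th in traj)
centroid_angle = np.arctan2(x0[1], x0[0])
max_excursion = np.max(np.abs(traj - centroid_angle))
```

Because the order parameter of this initial state exceeds 1/3, no heading can reach the direction opposite the centroid, so every heading stays within π of the centroid angle. This is the back-and-forth regime described above. Whether a given trajectory is exactly a splay orbit is not tested by this sketch.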
2.4.3 Four Agents
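In the case of four agents, the matrix J is
\[
v \in \ker J, \quad J = \frac{1}{4} \begin{bmatrix} -\sin(\theta_1) & -\sin(\theta_2) & -\sin(\theta_3) & -\sin(\theta_4) \\ \cos(\theta_1) & \cos(\theta_2) & \cos(\theta_3) & \cos(\theta_4) \end{bmatrix}. \tag{2.74}
\]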
As with the previous examples, J is rank two whenever the agents are non-parallel,
and thus v ∈ ker J is sufficient to keep θ ∈ M(xref ) for these non-singular points.
When all agents are parallel, the kernel of J gains a dimension, so these points are
singular.
Finding an equivariant centroid-locked control analogous to (2.72) is more difficult
with four agents. Nonetheless, an equivariant control like (2.72) can be found by summing the cross products generated by taking all four combinations of three agents at
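a time. Denote by b_{ijk} the cross product obtained by considering agents i, j, and k and
zeroing the remaining agent,
\begin{align}
b_{123} &= \begin{bmatrix} \sin(\theta_3 - \theta_2) & \sin(\theta_1 - \theta_3) & \sin(\theta_2 - \theta_1) & 0 \end{bmatrix}^T \tag{2.75} \\
b_{124} &= \begin{bmatrix} \sin(\theta_4 - \theta_2) & \sin(\theta_1 - \theta_4) & 0 & \sin(\theta_2 - \theta_1) \end{bmatrix}^T \tag{2.76} \\
b_{134} &= \begin{bmatrix} \sin(\theta_4 - \theta_3) & 0 & \sin(\theta_1 - \theta_4) & \sin(\theta_3 - \theta_1) \end{bmatrix}^T \tag{2.77} \\
b_{234} &= \begin{bmatrix} 0 & \sin(\theta_4 - \theta_3) & \sin(\theta_2 - \theta_4) & \sin(\theta_3 - \theta_2) \end{bmatrix}^T. \tag{2.78}
\end{align}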
Because each of these four vectors is in the null of J at non-singular points and is zero
at singular points, their sum is an admissible centroid-locked control. Further, their
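sum is equivariant, so we choose it as the centroid-locked control,
\[
v = \begin{bmatrix}
\sin(\theta_3 - \theta_2) + \sin(\theta_4 - \theta_2) + \sin(\theta_4 - \theta_3) \\
\sin(\theta_4 - \theta_3) + \sin(\theta_1 - \theta_3) + \sin(\theta_1 - \theta_4) \\
\sin(\theta_1 - \theta_4) + \sin(\theta_2 - \theta_4) + \sin(\theta_2 - \theta_1) \\
\sin(\theta_2 - \theta_1) + \sin(\theta_3 - \theta_1) + \sin(\theta_3 - \theta_2)
\end{bmatrix}. \tag{2.79}
\]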
As with two and three agents, when the state is balanced, this control vector is in the
span of the vector of ones. Like the three-oscillator case, circling is not possible above
a certain order parameter, which is 1/2 here. Finally, this equivariant control again
has an anti-relative-feedback structure in that control for each agent is based on phase
differences of the other agents.
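The construction of (2.79) from the vectors (2.75)-(2.78) can be verified directly. The sketch below assumes NumPy and zero-based agent indices; the helper names are illustrative only. It builds each triple vector, sums them, and compares the result with (2.79). It also evaluates the control at a balanced but non-splay state.

```python
import numpy as np
from itertools import combinations

def jacobian(theta):
    """J as in (2.74): rows are -sin and cos of the headings, scaled by 1/N."""
    theta = np.asarray(theta, dtype=float)
    return np.vstack((-np.sin(theta), np.cos(theta))) / theta.size

def triple_vector(theta, triple):
    """Cross-product vector b_ijk of (2.75)-(2.78), zero outside the triple."""
    i, j, k = triple
    b = np.zeros(len(theta))
    b[i] = np.sin(theta[k] - theta[j])
    b[j] = np.sin(theta[i] - theta[k])
    b[k] = np.sin(theta[j] - theta[i])
    return b

def v4(theta):
    """Equivariant centroid-locked control (2.79) for four agents."""
    t1, t2, t3, t4 = theta
    s = np.sin
    return np.array([
        s(t3 - t2) + s(t4 - t2) + s(t4 - t3),
        s(t4 - t3) + s(t1 - t3) + s(t1 - t4),
        s(t1 - t4) + s(t2 - t4) + s(t2 - t1),
        s(t2 - t1) + s(t3 - t1) + s(t3 - t2),
    ])

rng = np.random.default_rng(1)
theta = rng.uniform(-np.pi, np.pi, 4)
triples = list(combinations(range(4), 3))
b_sum = sum(triple_vector(theta, t) for t in triples)
kernel_residuals = [np.linalg.norm(jacobian(theta) @ triple_vector(theta, t))
                    for t in triples]

balanced = np.array([0.0, 0.7, np.pi, np.pi + 0.7])   # balanced, not splay
v_balanced = v4(balanced)
```

Each triple vector is annihilated by J on its own, so their sum is as well. At the balanced state every entry of v equals 2 sin(0.7), a constant vector in the span of the vector of ones.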
While the equivariant centroid-locked control v does not move the phasor centroid
from its initial condition, specific initial conditions are required to produce a splay
orbit. To see this point clearly, consider an N = 4 non-splay balanced initial condition,
as in Figure 2.6(a), for which v reduces to a constant vector in the span of 1. Thus
for any balanced state, the centroid-locked control can be integrated to see that the
resulting orbit is not splay, Figure 2.6(b). If, however, the initial condition is splay
(θ ∈ S) as in Figure 2.6(c), it will remain splay and result in a splay orbit, Figure
2.6(d). When the initial state is not balanced, the structure of v precludes an analytical
solution to the differential equation. We have determined specific initial conditions
which result in circling or back-and-forth splay orbits as a function of the initial phasor
centroid; however, determining whether a particular state lies on a splay orbit remains an
open problem, even for N = 4.
For N = 4, we have recently found a way to drive the state from an arbitrary
initial condition to one on a splay orbit, be it the circling mode or the back-and-forth
oscillation mode. The key result here comes from the fact that under control (2.79), the
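scalar I = θ1 − θ2 + θ3 − θ4 is invariant. To verify this invariance, note that
\begin{align}
\dot{I} &= \dot{\theta}_1 - \dot{\theta}_2 + \dot{\theta}_3 - \dot{\theta}_4 \tag{2.80} \\
&= \sin(\theta_3 - \theta_2) + \sin(\theta_4 - \theta_2) + \sin(\theta_4 - \theta_3) \tag{2.81} \\
&\quad - \sin(\theta_4 - \theta_3) - \sin(\theta_1 - \theta_3) - \sin(\theta_1 - \theta_4) \tag{2.82} \\
&\quad + \sin(\theta_1 - \theta_4) + \sin(\theta_2 - \theta_4) + \sin(\theta_2 - \theta_1) \tag{2.83} \\
&\quad - \sin(\theta_2 - \theta_1) - \sin(\theta_3 - \theta_1) - \sin(\theta_3 - \theta_2) \tag{2.84} \\
&= 0. \tag{2.85}
\end{align}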
When the state is perfectly in the circling mode, I = 2kπ whereas when in the back-and-forth mode, I = 2kπ + π, where k ∈ Z is an integer. Most importantly, I can be
controlled through an anti-equivariant,
\[ \sigma \circ F(\theta) = -F(\sigma \circ \theta), \tag{2.86} \]
form of the vector orthogonal to (2.79) in the nullspace of J. This vector, denoted
v⊥, is unfortunately too long to be given here (it would take an entire page to express).
Nonetheless, we find numerically that I˙ is proportional to v⊥ so that a particular value
of I can be stabilized using a linear feedback term,
\[ \tilde{v} = v - \kappa (I - I^*) v_\perp, \quad \kappa > 0, \tag{2.87} \]
where I ∗ is the desired value of I. Because v⊥ ∈ ker J at non-singular points, this
control will not disturb the previously developed matching control (2.8). Thus, we can
control the oscillation mode for N = 4. See the demonstration in Figure 2.7.
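The cancellation in (2.80)-(2.85) can also be confirmed numerically at many states at once. This is a minimal sketch assuming NumPy; the sample count and random seed are arbitrary choices. It evaluates the alternating sum of the entries of (2.79) at random points on the torus.

```python
import numpy as np

def v4(theta):
    """Equivariant centroid-locked control (2.79) for four agents."""
    t1, t2, t3, t4 = theta
    s = np.sin
    return np.array([
        s(t3 - t2) + s(t4 - t2) + s(t4 - t3),
        s(t4 - t3) + s(t1 - t3) + s(t1 - t4),
        s(t1 - t4) + s(t2 - t4) + s(t2 - t1),
        s(t2 - t1) + s(t3 - t1) + s(t3 - t2),
    ])

# I = theta_1 - theta_2 + theta_3 - theta_4, so dI/dt = signs . v
signs = np.array([1.0, -1.0, 1.0, -1.0])

rng = np.random.default_rng(3)
samples = rng.uniform(-np.pi, np.pi, size=(1000, 4))
max_rate = max(abs(signs @ v4(th)) for th in samples)
```

The largest value of |dI/dt| over the samples is at the level of floating-point round-off, consistent with I being a constant of motion under (2.79).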
2.4.4 General N
The equivariant centroid-locked controls for N = 3 and N = 4 derived in the previous
sections suggest that the centroid-locked control for a general number of oscillators
should have the form of a sum of sines. Indeed, extending the pattern to general N , we
arrive at the following equivariant control,
\[
v_k = \sum_{j=k+1}^{k+N-1} \; \sum_{i=j+1}^{k+N-1} \sin(\theta_i - \theta_j), \tag{2.88}
\]
(wrapping at N is implicit). This control reduces to the previously derived control for
N = 3 and N = 4. In addition, when the order parameter is zero, this control collapses
to a constant vector in the span of 1, and when the order parameter is one, the control
goes to zero. Indeed, this control is in the nullspace of J for any N and non-aligned
state corresponding to non-singular θ. As with N = 4, the idea here is that the control
can be written as the sum of N 3−2 basis vectors, as in (2.75)–(2.78).
As with N = 4, the result cannot be integrated analytically, and most initial conditions do not result in splay orbits. Choosing good initial conditions becomes more
challenging with increasing N because each agent adds one degree of freedom to the
system. An open problem here is stabilizing splay orbits from arbitrary initial conditions in TN , as was just demonstrated for the N = 4 case. When the phasor centroid
is balanced, the splay state (and therefore splay orbits) can be stabilized using a gradient controller of a potential that stabilizes higher-order moments, see [12, 66]. In
these works, if the information exchange between agents is circulant, then symmetric patterns, like the balanced splay state, appear as columns of the discrete Fourier
transform matrix of the graph Laplacian. However, this methodology does not extend
to situations in which the phasor centroid is not located at the origin.
Many parallels can be found between (2.88) and the equivariant control presented
at the end of Watanabe and Strogatz, [87]. However, (2.88) does not fit into the standard form (2.62) required in [87]. We have been unable to find constants of motion for
(2.88), but suspect they exist due to the highly periodic nature of the solution trajectories. We also suspect that these constants could be used to stabilize particular modes,
as in (2.87), for N = 4. Work in this direction is ongoing.
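A direct implementation of (2.88) makes the stated properties straightforward to test for several values of N. The sketch below assumes NumPy and zero-based indexing with the wrap-around handled by a modulo operation; the function names are illustrative. It compares the general formula with (2.72) and (2.79) and evaluates the kernel residual for larger N.

```python
import numpy as np

def v_general(theta):
    """Equivariant control (2.88); indices wrap modulo N."""
    theta = np.asarray(theta, dtype=float)
    n = theta.size
    v = np.zeros(n)
    for k in range(n):
        # the other agents in cyclic order: k+1, ..., k+N-1 (mod N)
        order = [(k + m) % n for m in range(1, n)]
        for a, j in enumerate(order):
            for i in order[a + 1:]:
                v[k] += np.sin(theta[i] - theta[j])
    return v

def jacobian(theta):
    """Assumed form of J: rows are -sin and cos of the headings, scaled by 1/N."""
    theta = np.asarray(theta, dtype=float)
    return np.vstack((-np.sin(theta), np.cos(theta))) / theta.size

rng = np.random.default_rng(4)

t = rng.uniform(-np.pi, np.pi, 3)
v3_explicit = np.array([np.sin(t[2] - t[1]), np.sin(t[0] - t[2]),
                        np.sin(t[1] - t[0])])            # (2.72)
v3_general = v_general(t)

residuals = {}
for n in (4, 5, 6, 7):
    th = rng.uniform(-np.pi, np.pi, n)
    residuals[n] = np.linalg.norm(jacobian(th) @ v_general(th))

v_aligned = v_general(np.full(5, 0.3))                   # order parameter one
v_splay = v_general(2.0 * np.pi * np.arange(5) / 5)      # balanced splay, N = 5
```

The residuals stay at round-off level for every N tried, the control vanishes at the aligned state, and it is constant across agents at the balanced splay state.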
2.5 Closed Kinematic Chain Analogy
The reference matching and centroid-locked controls presented here have a visual interpretation in the form of a closed kinematic chain. A closed kinematic chain is a
series of links pinned, but free to rotate, at either end. Here, we consider a chain with
N unit-length links where the start of the chain is pinned to the origin. The end of the
chain is located at a point N times the phasor centroid of the link angles away from
the origin. Thus, for a desired reference vector, the state is in the matched submanifold
only when the end of the chain is located at a point that is N times the reference vector
away from the start. The effect of the matching controller (2.8) is to bring the end of
the chain to this point.
The effect of the centroid-locked control is to move the chain around in such a way
so that neither the starting nor ending points of the chain move. Splay orbits of the
state correspond to wave-like motion that propagates down the chain. See Figure 2.8
for an example.
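The chain picture follows from summing unit phasors head to tail. As a minimal sketch, assuming NumPy and an arbitrary set of four link angles chosen for illustration, the joint positions can be computed with a cumulative sum, and the end of the chain compared with N times the phasor centroid.

```python
import numpy as np

def chain_joints(theta):
    """Joint positions of a chain of unit links with absolute angles theta,
    with the first joint pinned at the origin."""
    theta = np.asarray(theta, dtype=float)
    links = np.column_stack((np.cos(theta), np.sin(theta)))
    return np.vstack((np.zeros(2), np.cumsum(links, axis=0)))

theta = np.array([0.3, -1.2, 2.0, 0.7])   # illustrative link angles
joints = chain_joints(theta)
chain_end = joints[-1]

phasor_centroid = np.array([np.mean(np.cos(theta)), np.mean(np.sin(theta))])
link_lengths = np.linalg.norm(np.diff(joints, axis=0), axis=1)
```

Here `chain_end` equals `theta.size * phasor_centroid`, so holding the centroid fixed is the same as pinning the far end of the chain.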
2.6 Final Remarks
The work in this chapter presented an extension to the homogeneous Kuramoto phase
coupled oscillator models in which the phasor centroid is stabilized to an arbitrary
reference vector in the closed unit ball. Also considered was the problem of generating
oscillation that maintains the phasor centroid constraint, and specific attention was
given to the problem of generating splay orbits. In doing so, a connection was made
between equivariance of the control and an extension of the balanced splay state. The
main ideas developed here are consistent with existing results in the literature when
the phasor centroid is at the origin or is on the unit circle.
There are many directions in which to take this work. In particular, the required
delay-free all-to-all coupling may be overburdening to distributed systems connected
by wireless communication. Future work could explore an extension of the excellent
work by Jadbabaie on the Kuramoto model with incomplete interaction topology [28].
Another possible direction here is to add a linear consensus term to locally estimate the
phasor centroid on each agent, along the lines of the work by Paley and Leonard [66].
This extension should immediately allow the matching portion of the research presented here to work with limited interaction. However, generating oscillations within
the matched submanifold using limited information only could prove more challenging.
Another direction of future work is to further explore the spatial aspects of the
coordinated heading control for use in conjunction with unicycle models. Some work in
this direction will be discussed in the later chapters of this thesis. Also presented later
is a coordinated control demonstration on the UW Fin-Actuated Underwater Vehicle
testbed, based on the theory developed in this chapter.
[Figure 2.2 panels: (a) Aligned; (b) Balanced (Anti-aligned); (c) Reference Matched]
Figure 2.2: Example aligned, balanced, and reference matched group states are shown
on a phasor plot. Each gray circle represents the heading of one agent. The phasor
centroid, x̄, is marked with a gold star, and the reference vector, xref , shown only in
(c), is denoted with a green circle.
Figure 2.3: A balanced splay state is shown for N = 4 agents.
[Figure 2.4 panels: (a) N = 2 with 0 < ρ̄ < 1; (b) N = 2 with ρ̄ = 0 (splay)]
Figure 2.4: (a) No centroid-locked control is possible for N = 2 agents when 0 < ρ̄ < 1
because the matched manifold consists only of two points as shown above. (b) When
the state is splay (ρ̄ = 0), uniform rotation is possible, as shown by the red arrows.
[Figure 2.5: two panels plotting Theta (rad) against Time (s). (a) ρref = 0.3, θref = 0; (b) ρref = 0.6, θref = 0]
Figure 2.5: A simulation of θ̇ = u + v with u from (2.8) and v from (2.72). Matching
control causes the initial transient in which the phasor centroid is brought to the reference. Then the controlled oscillation results in either a circling mode, when ρref < 1/3
(a), or a back-and-forth mode, when ρref > 1/3 (b). The oscillations are equally spaced
in time, and the phasor centroid is at the reference for all points, so these oscillations
are splay orbits.
[Figure 2.6: four panels. (a) Balanced (Non-Splay) Initial State; (b) Non-Splay State Trajectory, Theta (rad) against Time (s); (c) Balanced (Splay) Initial State; (d) Splay State Trajectory, Theta (rad) against Time (s)]
Figure 2.6: A non-splay initial balanced state (a) results in a non-splay orbit (b), but
a splay initial balanced state (c) does result in a splay orbit (d). This initial condition sensitivity exists for non-balanced initial states as well, but deciding which initial
states will yield splay orbits is an open problem.
[Figure 2.7: two panels plotting Theta (rad) against Time (s). (a) Stabilization of back-and-forth mode; (b) Stabilization of circling mode]
Figure 2.7: Mode stabilization using control (2.87) for x̄ = 0.4∠0 is enabled at
Time=20s. The initial state was selected randomly and produces a non-splay oscillation. Splay oscillations of either the back-and-forth mode (a) or the circling mode (b)
are stabilized with I ∗ = 0 or I ∗ = π, respectively.
[Figure 2.8: five panels showing the chain configuration at (a) t = 5.5, (b) t = 7.0, (c) t = 8.4, (d) t = 9.9, and (e) t = 11.4]
Figure 2.8: One full period of the chain links for a back-and-forth splay oscillation. The
beginning of the chain is located at the origin whereas the end of the chain is driven
to N x̄ = [1.6, 0] by the matching control (2.8).
Chapter 3
CONTROLLED SINUSOIDAL COUPLING: HETEROGENEITY THROUGH LEADERSHIP
In the previous chapter, the system was homogeneous in that each agent was equivalent to every other agent, and every agent implemented the same control law. In this
chapter, we consider a heterogeneous coordinated phase control problem in which most
agents follow the homogeneous natural frequency Kuramoto (sinusoidal phase) model,
but a select few agents instead act as leaders. These leaders have the ability to report
some value other than their current heading to their neighbors. This heterogeneous
model is motivated by systems in biology and in engineering. In biological contexts,
observations have been made that groups of trained animals can bias the behavior
of much larger groups of untrained animals [67]. Translating these results to engineered contexts is of interest, for example, as a way to reduce the number of human operators
necessary to control a fleet of many autonomous agents. The thesis of this chapter is
that a select few (leader) agents can, under some topologies, control all other (follower)
agents, effectively reducing complexity through controller heterogeneity. The contributions of this chapter come from [40] and include a general reachability result and
a nonlinear control analysis of three interesting example problems. Analytical results
are supported by design and simulation of a controller for the leader agents.
3.0.1 Related Work
Unlike classical leader/follower control design, the work presented in this chapter requires that the followers do not know which agents are leaders, and thus treat all
neighbors equivalently. The leaders have the potential to report some value other
than their current state to their neighbors, but each neighbor receives the same value
from a particular leader. A similar heterogeneous multi-agent systems model has been
studied in the context of the controlled linear agreement problem [8, 30, 31, 71, 70], in
which follower agents apply a linear consensus protocol. A key result is that the ability
of a single leader node to control the other nodes is dependent on the topology of the
communication network. In particular, if the topology is symmetric about the leader
node, the states of the other nodes are not controllable in the typical linear systems
sense. Related work using a connection graph method to prove stability of the aligned
state is presented in [5].
A parallel research theme can be found in biology, where researchers are working
to understand how heterogeneity and topology influence the behavior of large aggregations. Ongoing research with heterogeneous aggregations of Giant Danio (Devario
aequipinnatus) suggests that as few as three trained (i.e. leader) fish are required to
make twelve untrained (i.e. follower) fish behave as if they were trained [67].
3.0.2 Overview
In this chapter, the theme of heterogeneity through leadership is extended from previous linear consensus studies to situations in which the follower agents obey a nonlinear
(sinusoidal) phase coupling protocol. The overall controlled sinusoidal coupling problem, consisting of leader and follower nodes, can be seen as a sort of heterogeneous
coordinated control system. Understanding this heterogeneous coordination is important not only for engineered systems, but also for improved modeling of heterogeneity
in biological aggregations.
3.0.3 Contribution
The contributions in this chapter are as follows. The dynamics of the controlled sinusoidal coupling problem are first written for an arbitrary interconnection topology
and leadership assignment. A basic result showing that the aligned set is reachable
from initial conditions on a hemisphere is established in the first theorem. Then, the
dynamics are rewritten in several ways to permit a complete but informal analysis of
three specific example problems. These examples are intended to highlight main differences between controlled linear and controlled sinusoidal protocols. An interesting
conclusion is that symmetry about the leaders does not imply uncontrollability of the
follower agents.
3.0.4 Organization
The material presented here is organized as follows. In the next section, mathematical
preliminaries used throughout the chapter are presented. The system dynamics are
described in Section 3.2. Then, in Section 3.3, a basic reachability analysis is presented
for the controlled sinusoidal coupling problem. Analysis and simulation of specific
example topologies are presented in Section 3.4. Finally, concluding remarks are given
in Section 3.5.
3.1 Preliminaries
The notation, conventions and assumptions used throughout this chapter are described
in this section. Readers already familiar with graph theory, nonlinear control theory,
and graph-theoretic representations of sinusoidally phase-coupled oscillator models
may wish to skip ahead to the next section.
3.1.1 Graph Theory
A graph G = (V, E) is a set of nodes, V , and edges, E ⊆ V × V , in which each edge
connects one node to one other node, and no two nodes are joined by more than one
edge. The cardinalities of the node and edge sets are denoted |V | and |E|, respectively.
A graph is said to be connected if there is a path from every node to every other node.
Throughout this chapter, all graphs are assumed to be undirected and connected, unless otherwise noted.
Associating an arbitrary direction with each edge, the directed incidence matrix, B,
of a graph G is a |V | × |E| matrix defined as
\[
B(i, j) = \begin{cases} 1 & \text{if edge } j \text{ leaves node } i \\ -1 & \text{if edge } j \text{ enters node } i \\ 0 & \text{otherwise.} \end{cases} \tag{3.1}
\]
The incidence matrix has rank |V | − 1 whenever the graph is connected [18]. The
neighbors of node i, Ni , are the set of nodes that are adjacent to vi in G. In a complete
graph, every node is adjacent to every other node. The Laplacian matrix associated
with an undirected graph can be computed as L = BB T . When the underlying graph
is connected, L has one eigenvalue at zero, and all other eigenvalues lie in the right
half-plane. The eigenvector associated with the zero eigenvalue is the vector of ones,
and thus for any connected graph, the linear consensus protocol,
ẋ = −Lx,
(3.2)
is globally asymptotically stable to the span of 1, called the agreement subspace.
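These graph-theoretic facts can be illustrated on a small example. The sketch below assumes NumPy and a four-node path graph chosen purely for illustration. It builds the directed incidence matrix according to (3.1), forms L = BB^T, and integrates the consensus protocol (3.2) with a forward-Euler step, which is a simplification of the continuous-time dynamics.

```python
import numpy as np

def incidence(n_nodes, edges):
    """Directed incidence matrix (3.1): +1 where an edge leaves a node,
    -1 where it enters."""
    B = np.zeros((n_nodes, len(edges)))
    for e, (tail, head) in enumerate(edges):
        B[tail, e] = 1.0
        B[head, e] = -1.0
    return B

edges = [(0, 1), (1, 2), (2, 3)]    # path graph; edge directions are arbitrary
B = incidence(4, edges)
L = B @ B.T                         # graph Laplacian
eigvals = np.linalg.eigvalsh(L)     # returned in ascending order

# forward-Euler integration of the linear consensus protocol x_dot = -L x
x0 = np.array([1.0, -2.0, 0.5, 3.0])
x = x0.copy()
dt = 0.01
for _ in range(4000):
    x = x - dt * (L @ x)
```

The state converges to the average of the initial values, a point in the agreement subspace, and the single zero eigenvalue of L corresponds to the vector of ones.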
A subgraph G′ = (V′, E′) of G = (V, E) is a graph with nodes V′ ⊆ V and edges
E′ ⊆ E. An induced subgraph G′(V′) of graph G is formed by keeping only edges of G
that connect nodes in V′ to other nodes in V′.
An interacting group of N agents can be modeled as a graph G = (V, E) in which
nodes represent agents, and edges represent inter-vehicle communication. For the
work in this chapter, two types of nodes are considered: leader nodes and follower
nodes. Thus, the node set can be partitioned into leader and follower node sets, VL and
VF , respectively. Decomposing G into corresponding subgraphs will prove useful later
in this chapter.
Definition 3.1.1 (Follower Subgraph). Define the follower subgraph GF to be the subgraph induced by the follower nodes, VF .
Definition 3.1.2 (Leader Subgraph j). Let vj ∈ VL be a leader. The subgraph corresponding to this leader node is defined as
\[ G_j = (V, E_j) \tag{3.3} \]
\[ E_j = \{ (v_i, v_j) \in E \mid v_i \in V_F \}. \tag{3.4} \]
In other words, Gj consists of all nodes, but only retains edges from E that connect
leader node j to its neighboring nodes that are followers.
Note that edges connecting one leader to another do not appear in either leader or
follower subgraphs. Denote by BF and Bj any directed incidence matrix associated
with the follower and j th leader subgraphs.
3.1.2 Nonlinear Control Theory
Some basic definitions and tools from the theory of nonlinear control will prove useful
in the later parts of this chapter. In particular, the dynamics considered here can be
written in standard control-affine form,
\[
\dot{x} = f_0(x) + \sum_{i=1}^{m} f_i(x) u_i. \tag{3.5}
\]
The first vector field, f0 , is called the drift of the system. The other m vector fields are
control vector fields, and ui is the ith control input. Unlike the drift vector field, the
control vector fields can be reversed or nulled entirely through the choice of inputs.
The set of all states attainable in exactly time T from a point x0 by any control input
is the T -reachable set from that point, denoted R(x0 , T ). The set of all states that can
be reached from x0 is called the reachable set, R(x0 ). A system is said to be controllable
from x0 if every other point in the domain is reachable from x0 (i.e. x ∈ R(x0 ) ∀x ∈ D).
Finally, a system is small time locally controllable at a point x0 if there is a T > 0 such
that x0 lies in the interior of R(x0 , t) for each t ∈ (0, T ] [13].
3.1.3 A Graph-Theoretic Representation of Sinusoidal Phase Coupling
Sinusoidally-coupled phase models have been introduced previously in this thesis (see
Section 2.0.1). Graph-theoretic notation can be used to rewrite the sinusoidal phase
coupling model (2.1) as
\[ \dot{\theta} = -B \sin(B^T \theta). \tag{3.6} \]
The similarity between (3.6) and the linear consensus protocol from (3.2) is apparent
when the sin function in (3.6) is replaced with the identity function, which gives θ̇ = −BB^T θ = −Lθ. While much is
known about the linear consensus protocol, the sinusoidal protocol is more difficult to
analyze due to the nonlinearities.
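The incidence-matrix form (3.6) can be compared with an explicit sum over neighbors. The following sketch assumes NumPy, unit coupling gain, and a small example graph chosen for illustration. It also checks that, for small phase differences, (3.6) agrees with the linear consensus protocol (3.2).

```python
import numpy as np

def incidence(n_nodes, edges):
    """Directed incidence matrix: +1 where an edge leaves, -1 where it enters."""
    B = np.zeros((n_nodes, len(edges)))
    for e, (tail, head) in enumerate(edges):
        B[tail, e] = 1.0
        B[head, e] = -1.0
    return B

edges = [(0, 1), (1, 2), (2, 0), (2, 3)]   # illustrative connected graph
B = incidence(4, edges)

def sinusoidal_rate(theta):
    """Right-hand side of (3.6)."""
    return -B @ np.sin(B.T @ theta)

def neighbor_sum_rate(theta):
    """Sum of sin(theta_j - theta_i) over the neighbors j of each node i."""
    rate = np.zeros_like(theta)
    for a, b in edges:
        rate[a] += np.sin(theta[b] - theta[a])
        rate[b] += np.sin(theta[a] - theta[b])
    return rate

rng = np.random.default_rng(2)
theta = rng.uniform(-np.pi, np.pi, 4)
small = 1e-5 * rng.standard_normal(4)      # nearly aligned phases
linear_rate = -(B @ B.T) @ small           # consensus protocol (3.2)
```

The two rate functions agree for arbitrary phases, and near the aligned set the sinusoidal rate is indistinguishable from −Lθ to within the cubic remainder of the sine.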
3.2 System Dynamics
In controlled phase coupling, a subset of the agents (called leader agents or nodes)
have the ability to transmit a control signal to their neighbors. The follower (i.e. nonleader) agents/nodes do not know of the leaders’ existence, and thus process received
information according to the originally prescribed model (3.6). All neighbors of a particular leader receive the same value from that leader. We begin by writing the system
dynamics.
Let φ ∈ T|VF | be the state of the follower agents (the leader agent states are unimportant, and therefore excluded). As in previous work with controlled linear consensus,
each leader agent has access to all follower agent states, φ. The rate of the ith follower
agent can then be written as
\[
\dot{\phi}_i = \sum_{j \in N_i} \begin{cases} \sin(u_j - \phi_i), & \text{if } j \text{ is a leader} \\ \sin(\phi_j - \phi_i), & \text{otherwise.} \end{cases} \tag{3.7}
\]
Here, uj is the control signal sent out by leader j. Using the subgraphs defined above,
the follower node dynamics can alternatively be written as
\[
\dot{\phi} = -B_F \sin(B_F^T \phi) - \sum_{i=1}^{|V_L|} P^T B_i \sin(B_i^T P (\phi - u_i \mathbf{1})). \tag{3.8}
\]
The matrix P is formed by selecting columns corresponding to follower agent indices
from an N × N identity matrix.
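The equivalence of (3.7) and (3.8) can be checked on a small example. The sketch below assumes NumPy, a four-node graph with node 0 as the only leader, and a dictionary `u` mapping each leader to its reported value; all of these are illustrative choices rather than part of the original formulation.

```python
import numpy as np

def incidence(n_nodes, edges):
    """Directed incidence matrix: +1 where an edge leaves, -1 where it enters."""
    B = np.zeros((n_nodes, len(edges)))
    for e, (tail, head) in enumerate(edges):
        B[tail, e] = 1.0
        B[head, e] = -1.0
    return B

n = 4
leaders = [0]
followers = [1, 2, 3]
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
local = {v: a for a, v in enumerate(followers)}   # follower node -> index in phi

# follower subgraph: edges joining two followers, in local indices
follower_edges = [(local[a], local[b]) for a, b in edges
                  if a in local and b in local]
B_F = incidence(len(followers), follower_edges)

# leader subgraph j: all N nodes, only edges joining leader j to a follower
B_L = {l: incidence(n, [(a, b) for a, b in edges
                        if (a == l and b in local) or (b == l and a in local)])
       for l in leaders}
P = np.eye(n)[:, followers]   # columns of the identity at follower indices

def rate_matrix_form(phi, u):
    """Follower dynamics written as in (3.8)."""
    rate = -B_F @ np.sin(B_F.T @ phi)
    for l in leaders:
        B = B_L[l]
        rate -= P.T @ B @ np.sin(B.T @ P @ (phi - u[l]))
    return rate

def rate_neighbor_form(phi, u):
    """Follower dynamics written as in (3.7)."""
    rate = np.zeros(len(followers))
    for a, b in edges:
        for i, j in ((a, b), (b, a)):
            if i in local:
                target = u[j] if j in leaders else phi[local[j]]
                rate[local[i]] += np.sin(target - phi[local[i]])
    return rate

rng = np.random.default_rng(5)
phi = rng.uniform(-np.pi, np.pi, len(followers))
u = {0: 0.3}
```

Both functions return the same follower rates, which confirms that the subgraph decomposition reproduces the neighbor-wise description for this example.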
3.3 Aligned Set Reachability
The work in this section develops reachability of the aligned set and is built upon the
following lemma.
Lemma 3.3.1. Consider a connected graph G = (V, E) on N nodes and select any one
node as a leader. The (entire) aligned set is reachable from any point in the aligned set.
Proof. Assume without loss of generality that the first node is selected as the leader
and that the followers have initial state φ(0) = α1 ∈ A, α ∈ T. To show that the
aligned set is reachable, select any β ∈ T as a goal point. If there exists a leader
controller taking the follower state from φ(0) = α1 to limt→∞ φ(t) = β 1, the aligned set
is reachable.
Consider a constant leader controller, u(t) = β, ∀t ≥ 0. This choice permits the
overall heterogeneous system dynamics, including both leader and follower nodes, to be
viewed as a certain homogeneous system. In particular, the equivalent homogeneous
system consists of N nodes, each of which applies sinusoidal coupling to all incident
edges as in (3.6). Edges in the follower subgraph are undirected, as usual, but edges
in the leader subgraph are directed, going from the leader to neighbors. The state of
the leader node in the equivalent homogeneous system never changes because it has
no incident edges whereas the follower nodes will behave as they would in the original
heterogeneous system.
Then, a recent result by Moreau [55] can be leveraged to conclude that the state of
the equivalent homogeneous system will asymptotically approach β 1, and hence the
state of follower nodes in the original heterogeneous system must also approach β 1.
The main idea of Moreau’s proof is that the convex hull of the state decreases to a
singleton, under some connectivity assumptions. The equivalent homogeneous system
with directed topology meets these connectivity assumptions because a directed path
exists from the leader node to each follower node, by construction.
Moreau’s proof is designed for Euclidean spaces, but Example 2 of [55] shows how
the result can be applied to systems with state in $\mathbb{T}^N$ provided all phases are within a
common semicircle. Here, the followers all start at α, so the result can be used directly
provided β ≠ α + π. To show that β = α + π is also reachable, the leader can temporarily
report u = α + π/2 and later change to u = β once all agents have left α.
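The constant-control argument in the proof can be illustrated numerically. The sketch below holds the leader signal at u = β and integrates the follower dynamics (3.7) from the aligned state α1. The chain of three followers attached to one leader, the forward-Euler integration, and the particular values of α, β, step size, and horizon are all assumptions chosen for illustration.

```python
import math


def simulate_constant_leader(alpha, beta, n_followers=3, dt=0.01, steps=20000):
    """Integrate (3.7) on a chain: leader -> follower 0 - follower 1 - ...

    All followers start at alpha and the leader reports the constant beta.
    Returns the follower phases after `steps` forward-Euler steps.
    """
    phi = [alpha] * n_followers
    for _ in range(steps):
        rates = []
        for i in range(n_followers):
            rate = 0.0
            if i == 0:
                # Only the first follower hears the leader.
                rate += math.sin(beta - phi[i])
            if i > 0:
                rate += math.sin(phi[i - 1] - phi[i])
            if i < n_followers - 1:
                rate += math.sin(phi[i + 1] - phi[i])
            rates.append(rate)
        phi = [p + dt * r for p, r in zip(phi, rates)]
    return phi


final_phases = simulate_constant_leader(alpha=0.0, beta=2.0)
max_error = max(abs(p - 2.0) for p in final_phases)
```

With β − α = 2 rad all phases stay inside a common semicircle, which is the case covered directly by the lemma; the antipodal case β = α + π would need the two-stage leader signal described in the proof.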
Theorem 3.3.2. Consider a connected graph G on N nodes, and select any one node
as a leader. Then, for initial follower heading phasors in a hemisphere (i.e. $\phi - \phi_0 \in (-\pi/2, \pi/2)^{|V_F|}$ for some φ0 ∈ T), the aligned set is always reachable.
Proof. Proof is by construction of a controller. Without loss of generality, assume the
initial state of the followers is $\phi \in (-\pi/2, \pi/2)^{|V_F|}$. In leaderless sinusoidal phase
coupling, the aligned set is globally asymptotically stable over any compact subset of
$(-\pi/2, \pi/2)^{|V_F|}$ for arbitrary connected graphs [28]. Thus, let the leader obey the usual
phase coupled oscillator model with initial heading ξ ∈ (−π/2, π/2). Then, a point in
the aligned set will be approached. Finally, the result of Lemma 3.3.1 can be used to
show reachability of the aligned set. Due to the first-order approximation made in the
lemma, this theorem holds only approximately.
Corollary 3.3.3 (Additional Leaders). Adding more leaders does not decrease the size
of the reachable set, because the extra leaders can always implement non-leader-like
behavior.
3.4 Examples
Some specific examples, shown in Figure 3.1, of leader-controlled phase coupling are
presented in this section to gain insight into this challenging problem and to highlight
the main differences between controlled sinusoidal and controlled linear protocols.
3.4.1 Three Node Star Graph with Leader at Center
The first topology studied here is the star graph, with a single leader at the center (see
Figure 3.1(a)). This topology is of interest because it is completely symmetric about the
leader node. Results from controlled linear consensus suggest that the leader might
not be able to control the followers. Interestingly, this uncontrollability result does not
hold with sinusoidal coupling, even for the particular three-node star studied here.

Figure 3.1: This figure shows the (a) star, (b) complete, and (c) chain example graphs
considered in this section. Edges from the leader are directed, indicating the follower
is coupled sinusoidally to the leader, but not vice versa.

With this topology, the dynamics of each follower (3.7) reduce to
$$\dot{\phi}_i = \sin(u - \phi_i) = \sin u \cos \phi_i - \cos u \sin \phi_i, \quad i = 1, 2, \tag{3.9}$$
where u ∈ T is the value reported by the leader. Then, the system dynamics can be
rewritten as
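$$\dot{\phi} = A(\phi) q(u), \tag{3.10}$$
with
$$A(\phi) = \begin{bmatrix} -\sin \phi_1 & \cos \phi_1 \\ -\sin \phi_2 & \cos \phi_2 \end{bmatrix} \quad \text{and} \quad q(u) = \begin{bmatrix} \cos u \\ \sin u \end{bmatrix}. \tag{3.11}$$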
The leader can choose u to make q(u) point instantaneously in any direction. Thus,
the leader can drive the state in any direction in the state space ($\mathbb{T}^2$) provided A(φ) is
full rank. For this system, A(φ) loses rank only when the state is aligned or is balanced
(φ ∈ A ∪ B).
When the state is aligned, the range of A(φ) is spanned by $[1, 1]^T$, which is also a
basis vector for the aligned set. Thus, no control signal from the leader can eject the
state from the aligned set. On the other hand, when the state is balanced, the range of
A(φ) is spanned by $[1, -1]^T$. Control can drive the state out of the balanced set, but no
motion is possible directly along the balanced set.
Put together, these results imply that the system is controllable from $\mathbb{T}^2 \setminus \mathcal{A}$. Because of the structure of $\mathbb{T}^2$, the aligned set A does not form a barrier as it would in $\mathbb{R}^2$.
Instead, to get from one state to another, a controller can always choose a path that
does not pass through the aligned set.
Knowing that this system is controllable outside the aligned set, a simple controller
can be constructed to drive the state from an initial position, $\phi(0) \in \mathbb{T}^2 \setminus \mathcal{A}$, to a goal
state, $\phi^* \in \mathbb{T}^2$. The basic idea is to choose the control direction that yields the quickest
reduction in distance between the current state and the goal,
$$u = \operatorname*{argmax}_{\tilde{u}} \; (\phi^* - \phi)^T A(\phi) q(\tilde{u}). \tag{3.12}$$
This optimization problem can be solved by examining the first-order necessary conditions,



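$$0 = \frac{\partial}{\partial u} \left[ (\phi^* - \phi)^T A(\phi) \begin{bmatrix} \cos u \\ \sin u \end{bmatrix} \right] = (\phi^* - \phi)^T A(\phi) \begin{bmatrix} -\sin u \\ \cos u \end{bmatrix}, \tag{3.13}$$
which are satisfied when
$$q(u) = \pm A(\phi)^T (\phi^* - \phi). \tag{3.14}$$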
The positive sign is chosen so that the control takes the state in the correct direction:
$$(\phi^* - \phi)^T \dot{\phi} = (\phi^* - \phi)^T A(\phi) q(u) = q(u)^T q(u) \ge 0. \tag{3.15}$$
Then, the leader control can be calculated from q(u) as
$$u = \operatorname{atan}(q(u)), \tag{3.16}$$
where atan is the four quadrant arctangent function.
Because A(φ) is full rank outside A ∪ B, the distance between the current and goal
states decreases along controlled trajectories on this subset. The state does not start
aligned, by assumption, and the controller will not drive the state into the aligned
set, by construction, so the only possible trouble spot is the balanced set. Indeed, this
particular controller is imperfect in that it is unable to drive the state to a goal in the
balanced set. Because the system is controllable, a different controller should be used
when the goal state is balanced.
To demonstrate the controller in simulation, the initial and final states were chosen
as φ(0) = [0, −1] and φ(t_f) = [6, 1]. Results are shown in Figure 3.2. The goal state is
reached in about 7 s. Note that outside the aligned and balanced sets, no leader
control input can zero the state derivative. Instead, once the goal point is reached, the
controller naturally oscillates back and forth to keep the state arbitrarily close to the
goal.
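The controller (3.12)-(3.16) is short enough to sketch in code. The snippet below forms q = A(φ)^T(φ* − φ) from (3.14) with the positive sign, recovers u with the four quadrant arctangent as in (3.16), and integrates (3.9) with forward Euler. The function names, the step size, and the treatment of φ* − φ as a plain difference of unwrapped coordinates are assumptions made for illustration; only the direction of q is used, so q is not normalized.

```python
import math


def greedy_leader_control(phi, goal):
    """Leader signal u from (3.14) and (3.16) for the three-node star."""
    e1, e2 = goal[0] - phi[0], goal[1] - phi[1]
    # q = A(phi)^T (goal - phi), where row i of A(phi) is [-sin phi_i, cos phi_i].
    q_cos = -math.sin(phi[0]) * e1 - math.sin(phi[1]) * e2
    q_sin = math.cos(phi[0]) * e1 + math.cos(phi[1]) * e2
    # q(u) = [cos u, sin u]^T, so u is the four quadrant arctangent of q.
    return math.atan2(q_sin, q_cos)


def simulate_star(phi0, goal, dt=1e-3, t_final=10.0):
    """Forward-Euler integration of (3.9) under the greedy leader control."""
    phi = list(phi0)
    for _ in range(int(round(t_final / dt))):
        u = greedy_leader_control(phi, goal)
        phi = [p + dt * math.sin(u - p) for p in phi]
    return phi


goal_state = [6.0, 1.0]
final_state = simulate_star([0.0, -1.0], goal_state)
distance_to_goal = math.hypot(goal_state[0] - final_state[0],
                              goal_state[1] - final_state[1])
```

Because no leader input can hold the state still away from the aligned and balanced sets, the discretized controller is expected to chatter around the goal at a scale set by the step size rather than settle exactly.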
Figure 3.2: The controlled state trajectory (left, panel (a) Controlled State Trajectory, axes φ1 and φ2 in rad) and leader control (right, panel (b) Leader Control, u in rad versus time in s) are shown
for a simulation of the three node star graph with the leader at the center. Notice how
the leader control must oscillate to keep the follower state near the goal. The state
space, $\mathbb{T}^2$, is represented as the shaded area between the two identical aligned sets,
the initial state is φ(0) = [0, −1], and the goal state, denoted by the red dot, is [6, 1].
3.4.2 Three Node All-to-All with One Leader
The second example considered here is a complete graph on three nodes with a single
leader (see Figure 3.1(b)). As with the star topology, this structure is symmetric, but
has a non-empty follower subgraph that creates a non-zero drift vector field. Following
the analysis technique of the previous example, the dynamics of each follower are

 

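$$\frac{d}{dt} \begin{bmatrix} \phi_1 \\ \phi_2 \end{bmatrix} = \begin{bmatrix} \sin(\phi_2 - \phi_1) + \sin(u - \phi_1) \\ \sin(\phi_1 - \phi_2) + \sin(u - \phi_2) \end{bmatrix}. \tag{3.17}$$
Equivalently,
$$\dot{\phi} = A(\phi) q(u) + D(\phi), \tag{3.18}$$
with
$$A(\phi) = \begin{bmatrix} -\sin \phi_1 & \cos \phi_1 \\ -\sin \phi_2 & \cos \phi_2 \end{bmatrix}, \quad D(\phi) = \begin{bmatrix} \sin(\phi_2 - \phi_1) \\ -\sin(\phi_2 - \phi_1) \end{bmatrix}. \tag{3.19}$$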
Just as with the star graph, A(φ) is full rank everywhere except on the aligned and
balanced sets. Thus, outside A ∪ B, the leader control u can push the state in any
direction in the state space. However, the magnitude of this control is restricted by
the fact that q(u) is unit norm. The drift D pushes the system toward alignment (as
expected, because all-to-all coupling is almost globally stable to alignment). Thus, the
system is only controllable when the magnitude of A(φ)q(u) is large enough to overcome
the drift, D(φ). Once the state is sufficiently close to the aligned set, alignment cannot
be prevented by any control.
To explicitly determine the subset on which this system is controllable, the control
design technique from the star graph example can be employed. Instead of calculating
which input vector q(u) yields the greatest velocity towards the goal (3.14), determine
which input vector results in the greatest velocity against the drift,
$$q(u) = A(\phi)^T D(\phi) = \sin(\phi_2 - \phi_1) \begin{bmatrix} \sin \phi_2 - \sin \phi_1 \\ \cos \phi_1 - \cos \phi_2 \end{bmatrix}, \tag{3.20}$$
so that u = atan (q(u)). Physically, this choice of input corresponds to the leader reporting that it is located across the phasor circle from the phasor centroid of the followers.
With this choice of input, the drift will overwhelm the control when cos(φ2 − φ1) > −1/2.
In other words, the leader is only effective at overcoming the drift when the followers
are separated by at least 120°. Thus, this system is only controllable inside a band
around the balanced set, as shown in Figure 3.3.
Figure 3.3: The controllable set on $\mathbb{T}^2$ is shown for the complete graph (axes φ1 and φ2 in rad; legend: Aligned Set, Balanced Set, Controllable Set). When the angle
between the followers is less than 120°, alignment cannot be prevented.
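The 120° threshold can be checked numerically from (3.17). The sketch below places the two followers symmetrically at ∓δ/2 (an assumption made without loss of generality, by rotational symmetry), evaluates the rate of change of the separation φ2 − φ1, and searches a grid of leader signals u for the one that best resists alignment. The function names and the grid resolution are illustrative assumptions.

```python
import math


def separation_rate(delta, u):
    """d(phi2 - phi1)/dt from (3.17) with phi1 = -delta/2 and phi2 = +delta/2."""
    phi1, phi2 = -delta / 2.0, delta / 2.0
    dphi1 = math.sin(phi2 - phi1) + math.sin(u - phi1)
    dphi2 = math.sin(phi1 - phi2) + math.sin(u - phi2)
    return dphi2 - dphi1


def best_separation_rate(delta, samples=3600):
    """Largest separation rate achievable over a uniform grid of leader signals."""
    return max(separation_rate(delta, 2.0 * math.pi * k / samples)
               for k in range(samples))


rate_at_100_deg = best_separation_rate(math.radians(100.0))
rate_at_140_deg = best_separation_rate(math.radians(140.0))
```

In this symmetric configuration the follower centroid points along angle 0, so the maximizing signal sits at u = π, that is, across the phasor circle from the centroid, in agreement with the physical description above.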
3.4.3 Three Node Chain with Leader at End
The final example topology considered here is a three-node chain with the leader at one
end (see Figure 3.1(c)). The main difference between this example and the previous
ones is the lack of symmetry about the leader. Again, the edge between the followers
results in a non-zero drift vector field. The dynamics in this case reduce to




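$$\frac{d}{dt} \begin{bmatrix} \phi_1 \\ \phi_2 \end{bmatrix} = \begin{bmatrix} \sin(\phi_2 - \phi_1) + \sin(u - \phi_1) \\ \sin(\phi_1 - \phi_2) \end{bmatrix}. \tag{3.21}$$
For this topology, rewriting the dynamics in terms of A(φ) and q(u) does not help because A(φ) is never full rank. Instead, consider the coordinate transformation φ̄ =
φ1 + φ2 and ϕ = φ1 − φ2 for which
$$\frac{d}{dt} \begin{bmatrix} \bar{\phi} \\ \varphi \end{bmatrix} = \begin{bmatrix} u \\ -2 \sin(\varphi) + u \end{bmatrix}, \tag{3.22}$$
and the control is u ∈ [−1, 1], where u now stands for the bounded leader term sin(u − φ1) of (3.21). Thus, the drift dominates the control, dragging the state
towards the aligned set, when 30° < |ϕ| < 150°.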
With this result in mind, the state space can be partitioned into three sets,
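$$\text{Controllable:} \quad S^C = \{\phi \mid 150^\circ < |\varphi|\} \tag{3.23}$$
$$\text{Drift Dominated:} \quad S^D = \{\phi \mid 30^\circ < |\varphi| \le 150^\circ\} \tag{3.24}$$
$$\text{Reachable:} \quad S^R = \{\phi \mid |\varphi| \le 30^\circ\}. \tag{3.25}$$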
For points in the controllable set (S C ), every other point in T2 is reachable in finite
time, and thus the system is controllable from this set. Once the state enters the drift
dominated set (S D ), the above analysis shows that entering the reachable set (S R ) is
unavoidable. Every point in the reachable set is reachable from every other point in the
state space. However, the reachable set is positively invariant, meaning it cannot be
escaped. The controllable, drift dominated, and reachable sets are depicted in Figure
3.4 as light, medium, and dark gray shaded regions, respectively.
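The partition (3.23)-(3.25) and the drift-dominance argument behind it can be written as a short sketch. The function names, the string labels, and the wrapping of ϕ = φ1 − φ2 to (−π, π] are illustrative assumptions; `varphi_rate` is the second row of (3.22) with the bounded control written as v ∈ [−1, 1].

```python
import math


def wrap_to_pi(angle):
    """Wrap an angle in radians to the interval (-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def classify_chain_state(phi1, phi2):
    """Assign a follower state to one of the sets (3.23)-(3.25)."""
    varphi_deg = abs(math.degrees(wrap_to_pi(phi1 - phi2)))
    if varphi_deg > 150.0:
        return "controllable"
    if varphi_deg > 30.0:
        return "drift dominated"
    return "reachable"


def varphi_rate(varphi, v):
    """Rate of varphi from (3.22) for a bounded control v in [-1, 1]."""
    return -2.0 * math.sin(varphi) + v


labels = [
    classify_chain_state(0.0, math.radians(170.0)),
    classify_chain_state(math.radians(90.0), 0.0),
    classify_chain_state(0.1, 0.0),
]
```

Inside the drift dominated set the sign of the rate is fixed by the drift term alone: even the extreme controls v = ±1 cannot stop |ϕ| from shrinking.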
Figure 3.4: The sets for the chain graph are shown (axes φ1 and φ2 in rad; legend: Aligned Set, Balanced Set, Controllable, Drift Dominated, Reachable). All states are reachable from the
controllable set (light gray); however, the positively invariant reachable set (dark gray)
cannot be avoided after entering the drift dominated set (medium gray).
3.5 Concluding Remarks
The work in this chapter has extended previous research on linear controlled agreement to sinusoidal coupling on the N-torus. While much remains to be learned about
this fascinating system, results presented here show that, for any connected topology
with one or more leaders, the aligned set is reachable from any initial state in a hemisphere. Specific examples were then presented and analyzed to demonstrate the effect
of topology and the significant differences between linear and sinusoidal couplings.
One observation is that leader symmetry does not imply uncontrollability, as it did in
the linear case.
Future work will build upon the foundation presented here to answer more general questions of reachability and controllability. For instance, if a single leader can
be placed anywhere in a large network, where should it be put in order to be most
effective? Also, how few leader nodes are required to make the heterogeneous system
controllable, if possible? To answer these questions and make additional progress on
this problem, future work will connect this problem to existing theory and develop new
tools as necessary. Control of nonlinear systems that contain drift is an active area of
research. Finally, opportunities remain for closing the loop with biologists to see if
controlled phase coupling is a good model of heterogeneity in natural aggregations.
Chapter 4
SINUSOIDAL PHASE COUPLING IN DISCRETE TIME
In this chapter, we extend the continuous-time coordinated heading controller from
Chapter 2 to a discrete-time formulation. Instead of continually exchanging information as before, here individual agents only communicate every ∆T seconds. The
need for analyzing a discrete-time version of the control stems from the fact that most
real-world communication systems operate on packetized data, which is fundamentally discrete-time. In particular, both underwater and traditional wired and wireless
communications operate on packets of data that arrive at discrete time instants.
Another important reason for studying the discretized problem is to study how little communication is required to maintain stability. Inter-vehicle communication requires power, so less communication means longer life for battery-powered systems.
The tradeoff, naturally, is that overall task performance can be decreased. This tradeoff will become apparent in the work presented in the later parts of this chapter.
Finally, to further extend the work in Chapter 2 toward application on a physical system like the UW-FAV, communication topology limitations must be taken into
account. In our radio-frequency and acoustic modem underwater communication systems, only one agent can “talk” at a time to prevent signal interference. Thus, the
communication topology has a natural broadcast structure.
4.0.1 Contribution
The phase-coupled oscillator model is converted to discrete time using a zero-order
hold. Stability of this discrete-time model to aligned, balanced, and reference-matched
sets is shown to be dependent on the product of the coupling strength, K, and the discretization interval ∆T. While initial results require all-to-all communication, later
results permit an extension to a specific time-varying broadcast topology in which each
agent has an equal probability of broadcasting at each time interval. Finally, energy-efficient
routings are studied, with the conclusion being that a single-hop broadcast
is more efficient than any multi-hop gossiping scheme when the time deadline, ∆T, is
sufficiently small.
4.0.2 Organization
The material in this chapter was taken from peer-reviewed papers by the author and
co-authors including a book chapter [79], several conference papers [41, 45], and one
journal publication [42]. Organization is as follows. We begin in Section 4.1 by applying a zero-order hold to the homogeneous Kuramoto model. The stability of this
model to the aligned set under all-to-all and random one-to-all communication topologies is studied in Sections 4.2 and 4.3, respectively. Then, stability to the balanced
set is analyzed under all-to-all communication in Section 4.4. A reference vector is
added to the discrete-time model and stability to the matched submanifold is studied
in Section 4.5. Results on network routing optimization are presented in Section 4.6.
Performance of the discrete-time control system is studied in simulation in Section 4.7.
Some concluding remarks are made in Section 4.8.
4.1 The Discrete-Time Phase Coupling Model
Recall the continuous-time all-to-all homogeneous Kuramoto model from (2.1),
$$\dot{\theta}_i(t) = \omega_i(t) + \frac{K}{N} \sum_{j=1}^{N} \sin(\theta_j(t) - \theta_i(t)). \tag{4.1}$$
To arrive at the discrete time phase-coupled agent model first presented in [79], set
ωi = 0 for all individuals (to avoid undesirable incoherent states), and apply a zero
order hold:
$$\theta_i(h + 1) = \theta_i(h) + \frac{K \Delta T}{N} \sum_{j=1}^{N} \sin(\theta_j(h) - \theta_i(h)). \tag{4.2}$$
Here, h is a time index, and ∆T is the discretization period. The model is thus written
in synchronous discrete time.
As in the previous chapters, the phasor centroid,
$$\bar{x}(h) = \frac{1}{N} \sum_{i=1}^{N} x_i(h) = \bar{\rho}(h) \angle \bar{\theta}(h), \tag{4.3}$$
$$x_i(h) = \begin{bmatrix} \cos \theta_i(h) \\ \sin \theta_i(h) \end{bmatrix}, \tag{4.4}$$
will again prove useful. We will continue to refer to the magnitude of the phasor
centroid, $\bar{\rho}(h) = \|\bar{x}(h)\|$, as the order parameter. Kuramoto showed [47] that the phasor
centroid can be used to express (4.2) in mean field coupling form,
$$\theta_i(h + 1) = \theta_i(h) + K \Delta T \, \bar{\rho}(h) \sin(\bar{\theta}(h) - \theta_i(h)). \tag{4.5}$$
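The equivalence of the pairwise update (4.2) and the mean field form (4.5) is easy to confirm numerically. The following sketch implements both updates and the phasor centroid (4.3) using complex exponentials; the function names and the sample state are illustrative assumptions.

```python
import cmath
import math


def phasor_centroid(theta):
    """Return (rho_bar, theta_bar), the magnitude and phase of the centroid (4.3)."""
    z = sum(cmath.exp(1j * t) for t in theta) / len(theta)
    return abs(z), cmath.phase(z)


def step_pairwise(theta, k_dt):
    """One step of the all-to-all discrete-time model (4.2); k_dt is K * Delta T."""
    n = len(theta)
    return [ti + (k_dt / n) * sum(math.sin(tj - ti) for tj in theta)
            for ti in theta]


def step_mean_field(theta, k_dt):
    """One step of the mean field form (4.5); k_dt is K * Delta T."""
    rho_bar, theta_bar = phasor_centroid(theta)
    return [ti + k_dt * rho_bar * math.sin(theta_bar - ti) for ti in theta]


sample_state = [0.1, 1.3, -2.0, 2.9]
pairwise_next = step_pairwise(sample_state, 0.7)
mean_field_next = step_mean_field(sample_state, 0.7)
```

The mean field form needs only the centroid, so each agent's update costs O(1) once the centroid is known, compared with O(N) for the explicit sum.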
The equilibria of (4.2) and (4.5), for non-zero K, occur whenever
$$\frac{1}{N} \sum_{j=1}^{N} \sin(\theta_j - \theta_i) = 0. \tag{4.6}$$
States, θ, for which (4.6) holds can be partitioned into aligned (2.29), balanced (2.30),
and unstable sets. In the unstable set,
$$\mathcal{U} = \{\theta \in \mathbb{T}^N \setminus (\mathcal{A} \cup \mathcal{B}) \mid \sin(\theta_j - \theta_i) = 0 \;\; \forall i, j\}, \tag{4.7}$$
all headings are parallel, but the state is neither aligned nor balanced.
4.2 All-to-All Aligned Set Stability
The stability of the discrete time phase-coupled agent model with all-to-all communication to either aligned or balanced sets is related to K∆T . The following theorem
shows that the aligned set is stable for 0 < K∆T < 2.
Theorem 4.2.1 (All-to-All Aligned Set Stability in Discrete Time). For a system of N
all-to-all coupled oscillators in discrete time (4.5), an aligned state will be approached
if and only if 0 < K∆T < 2, provided θ(0) ∉ U.
Proof of Sufficient Condition: The proof makes use of LaSalle’s Invariance Principle for discrete time systems [35]. Take as a Lyapunov candidate V(h) = 1 − ρ̄(h). This
function achieves a minimum value of zero only when all vehicles are aligned and is
otherwise positive. An equivalent expression for the Lyapunov candidate at time step
h is
$$V(h) = 1 - \bar{x}(h)^T e_{\bar{\theta}(h)}, \tag{4.8}$$
where $e_{\bar{\theta}(h)} \equiv [\cos \bar{\theta}(h) \;\; \sin \bar{\theta}(h)]^T$ is a unit vector in the direction of θ̄(h), and for
brevity x̄(h) = x̄(θ(h)). The difference in V between two consecutive time steps can be
bounded above:
$$\Delta V(h) = \bar{x}(h)^T e_{\bar{\theta}(h)} - \bar{x}(h + 1)^T e_{\bar{\theta}(h+1)} \le \bar{x}(h)^T e_{\bar{\theta}(h)} - \bar{x}(h + 1)^T e_{\bar{\theta}(h)}. \tag{4.9}$$
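Thus, the result will follow from examining $-\Delta \bar{x}(h)^T e_{\bar{\theta}(h)}$, where ∆x̄(h) = x̄(h + 1) −
x̄(h). Starting from (4.9) and defining $\delta\bar{\theta}_i(h) \equiv \bar{\theta}(h) - \theta_i(h)$,
\begin{align*}
\Delta V(h) &\le -\Delta \bar{x}(h)^T \begin{bmatrix} \cos \bar{\theta}(h) \\ \sin \bar{\theta}(h) \end{bmatrix} \\
&= \frac{1}{N} \sum_{i=1}^{N} \left( \begin{bmatrix} \cos \theta_i(h) \\ \sin \theta_i(h) \end{bmatrix} - \begin{bmatrix} \cos \theta_i(h + 1) \\ \sin \theta_i(h + 1) \end{bmatrix} \right)^T \begin{bmatrix} \cos \bar{\theta}(h) \\ \sin \bar{\theta}(h) \end{bmatrix} \\
&= \frac{1}{N} \sum_{i=1}^{N} \left[ \cos(\delta\bar{\theta}_i(h)) - \cos(\bar{\theta}(h) - \theta_i(h + 1)) \right] \\
&= \frac{1}{N} \sum_{i=1}^{N} \left[ \cos(\delta\bar{\theta}_i(h)) - \cos\left(\delta\bar{\theta}_i(h) - K \Delta T \, \bar{\rho}(h) \sin(\delta\bar{\theta}_i(h))\right) \right].
\end{align*}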
Provided ρ̄(h) ≠ 0 and assuming $0 \le \delta\bar{\theta}_i(h) < \pi$, without loss of generality due to
symmetry, each term of the above sum is negative when $\sin(\delta\bar{\theta}_i(h)) \ne 0$ and
$$0 < K \Delta T \, \bar{\rho} \sin(\delta\bar{\theta}_i(h)) < 2 \, \delta\bar{\theta}_i(h).$$
A sufficient condition is then 0 < K∆T < 2, because ρ̄ ≤ 1 and sin(x) ≤ x for x ≥ 0 give
$$0 < K \Delta T < 2 \le \frac{2 \, \delta\bar{\theta}_i(h)}{\bar{\rho} \sin(\delta\bar{\theta}_i(h))}.$$
Thus, V(h) is a valid Lyapunov function for this system and the stated range of K∆T.
LaSalle’s Invariance Principle states that the solution of a dynamical system will
approach the largest positively invariant set contained in $E = \{\theta \in \mathbb{T}^N \mid \Delta V = 0\} \cap \bar{D}$,
where D̄ is the closure of the domain. From (4.10), ∆V is zero when ρ̄ = 0 and/or
when sin(δ θ̄i (h)) = 0, ∀i ∈ {1, 2, . . . , N }. The heading rate (4.5) is zero at each of these
points, thus rendering them positively invariant. However, not all of these equilibria
are stable. When ρ̄ = 0, the system is in an unstable balanced state (V (h) from (4.8)
attains its maximum value at these points). A small perturbation of θ resulting in
ρ̄ > 0 will allow the system to move towards alignment. When sin(δ θ̄i (h)) = 0, ∀i ∈
{1, 2, . . . , N }, all vehicle headings are parallel, but do not necessarily point in the same
direction. However, these states are also unstable because any small perturbation will
make ∆V < 0 for 0 < K∆T < 2. Thus, the only stable equilibria are those in which all
vehicle headings are aligned.
Proof of Necessary Condition: A necessary condition for an equilibrium point to be
stable is that all eigenvalues of the linearized system matrix lie within the unit circle.
Due to the rotational symmetry of the aligned set, the linearization can be done about
the origin. In other words, the aligned set is stable if and only if the origin is stable.
Denote by $B_c$ the incidence matrix associated with a complete graph on N nodes.
The discrete-time Kuramoto model (4.2) can then be written as
$$\theta(h + 1) = \theta(h) - \frac{K \Delta T}{N} B_c \sin(B_c^T \theta(h)). \tag{4.10}$$
Linearizing (4.10) about the origin yields
$$\theta(h + 1) = A \theta(h), \tag{4.11}$$
$$A = I - \frac{K \Delta T}{N} L_c, \tag{4.12}$$
where $L_c = B_c B_c^T$ is the graph Laplacian associated with the complete graph. The
eigenvalues of A lie at
$$\lambda(A) = 1 - \frac{K \Delta T}{N} \lambda(L_c). \tag{4.13}$$
The eigenvalues of the graph Laplacian of an undirected complete graph are
$$\lambda(L_c) = \begin{cases} 0 & \text{with multiplicity one} \\ N & \text{with multiplicity } N - 1. \end{cases} \tag{4.14}$$
All eigenvalues of A must lie within the unit circle for stability of the aligned set.
Note that one eigenvalue will always be of value 1 (corresponding to $\lambda(L_c) = 0$) with
eigenvector of all ones, $\mathbf{1}$. However, this eigenvalue/eigenvector pair corresponds to
moving from point to point in the alignment subspace and has no effect on stability
because the control is zero at all points on the aligned set. For the remainder of the
system, orthogonal to the vector of ones, the dynamics are projected to a subspace
orthogonal to 1. In this subspace, 0 < K∆T < 2 is both necessary and sufficient for
stabilization of the aligned set when connectivity is all-to-all.
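Combining (4.13) and (4.14), the N − 1 eigenvalues transverse to the aligned set all equal 1 − K∆T, which lies inside the unit circle exactly when 0 < K∆T < 2. The sketch below records this eigenvalue and compares it with a direct simulation of (4.2) started from a small perturbation of an aligned state. The function names, the three-agent initial state, and the step counts are illustrative assumptions.

```python
import cmath
import math


def order_parameter(theta):
    """Magnitude of the phasor centroid, rho_bar."""
    return abs(sum(cmath.exp(1j * t) for t in theta) / len(theta))


def step(theta, k_dt):
    """One step of the all-to-all discrete-time model (4.2); k_dt is K * Delta T."""
    n = len(theta)
    return [ti + (k_dt / n) * sum(math.sin(tj - ti) for tj in theta)
            for ti in theta]


def transverse_eigenvalue(k_dt):
    """Eigenvalue of A in (4.12) associated with lambda(L_c) = N."""
    return 1.0 - k_dt


def run(theta0, k_dt, steps):
    """Iterate (4.2) for a fixed number of steps."""
    theta = list(theta0)
    for _ in range(steps):
        theta = step(theta, k_dt)
    return theta


perturbed = [0.01, -0.01, 0.0]
rho_stable = order_parameter(run(perturbed, 1.5, 50))
rho_unstable = order_parameter(run(perturbed, 2.5, 5))
```

With K∆T = 1.5 the perturbation contracts by a factor of about 0.5 per step, whereas with K∆T = 2.5 it initially grows by a factor of about 1.5 per step, so the order parameter moves away from one.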
4.3 Random One-to-All Aligned Set Stability
The aligned set stability proof in the previous section requires all-to-all communication. For many realistic communication systems, including the UW-FAV testbed, only
one vehicle can “talk” at a time. In this section, we show that the previous stability
theorem holds with slight modification if each agent has an equal probability of being
selected as the broadcaster on each time step.
A random broadcast network is defined to be one in which at each time step h, one
agent is chosen at random from a uniform distribution to broadcast its state to all
other agents. To implement this controller on a real system, the sequence of random
broadcasters can be selected ahead of time. The random broadcast network simplifies
the discrete-time Kuramoto model as the summation in (4.2) reduces to a single term:
$$\theta_i(h + 1) = \theta_i(h) + \tilde{K} \Delta T \sin(\theta_{b(h)}(h) - \theta_i(h)), \quad i \in \{1, 2, \ldots, N\}. \tag{4.15}$$
Here, b(h) ∈ {1, 2, . . . , N} denotes the index of the randomly selected broadcasting
agent at time step h and $\tilde{K} \equiv K/2$.
Theorem 4.3.1 (Random One-to-All Aligned Set Stability in Discrete Time). For a
system of N oscillators coupled by random one-to-all broadcasts in discrete time (4.15),
an aligned state will be approached in probability for $0 < \tilde{K} \Delta T < 2$, from all initial
conditions not in the unstable set, U.
Proof. Because each agent has an equal probability of being selected as the broadcaster, the expected value of the next heading of each agent i ∈ {1, 2, . . . , N },
$$E\{\theta_i(h + 1)\} = \theta_i(h) + \frac{\tilde{K} \Delta T}{N} \sum_{j=1}^{N} \sin(\theta_j(h) - \theta_i(h)),$$
is exactly the update given by all-to-all communication. Therefore, the reasoning in
Theorem 4.2.1 can be used to conclude that (4.8) is a supermartingale [48], and thus
the state will approach the aligned set in probability, from almost all initial conditions.
This approach works in part because the aligned set is positively invariant with respect
to the broadcast update (4.15).
On each broadcast, the state is expected to become more aligned than it was previously. Although a sequence of broadcasters for which alignment is never reached can
be constructed, the probability of such a sequence being randomly selected approaches
zero as the length of the sequence goes to infinity.
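A minimal sketch of the random broadcast update (4.15) follows. It draws the broadcaster uniformly at each step and also computes the expected next state by averaging over all possible broadcasters, which reproduces the all-to-all form used in the proof. The function names, the seeded generator, the gain K̃∆T = 0.5, and the initial state (chosen inside a common semicircle) are assumptions made for illustration.

```python
import math
import random


def broadcast_step(theta, b, k_tilde_dt):
    """One step of (4.15): every agent couples to broadcaster b only."""
    return [ti + k_tilde_dt * math.sin(theta[b] - ti) for ti in theta]


def expected_next(theta, k_tilde_dt):
    """Average of (4.15) over a uniformly chosen broadcaster."""
    n = len(theta)
    outcomes = [broadcast_step(theta, b, k_tilde_dt) for b in range(n)]
    return [sum(outcome[i] for outcome in outcomes) / n for i in range(n)]


def simulate_random_broadcast(theta0, k_tilde_dt, steps, seed=0):
    """Iterate (4.15) with a uniformly random broadcaster at each step."""
    rng = random.Random(seed)
    theta = list(theta0)
    for _ in range(steps):
        b = rng.randrange(len(theta))
        theta = broadcast_step(theta, b, k_tilde_dt)
    return theta


start = [0.0, 0.5, 1.2, 2.0]
final = simulate_random_broadcast(start, 0.5, 200, seed=1)
spread = max(final) - min(final)
```

As noted above, the sequence of broadcasters can be fixed ahead of time on a real system; seeding the generator plays that role here.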
4.4 All-to-All Balanced Set Stability
We now turn attention to balanced set stability. In the continuous time phase-coupled
oscillator model, stabilizing the state to the balanced set instead of the aligned set
required only changing the sign of the coupling gain. In discrete time, a similar phenomenon occurs. In particular, we have the following conjecture regarding almost
global asymptotic stability of the balanced set.
Conjecture 4.4.1. For a system of N all-to-all coupled oscillators in discrete time (4.5),
a balanced state will be approached for almost all initial conditions if −2 < K∆T < 0.
The main difference between Conjecture 4.4.1 and Theorem 4.2.1 is the sign of the
coupling gain. However, the proof method used in Theorem 4.2.1 does not apply here
because the projection technique employed there is not sufficient here.
Nonetheless, we are able to prove almost global asymptotic stability of N phase-coupled agents to the balanced set for −1 < K∆T < 0. The main theorem here is built
upon two lemmas which will be presented before the main theorem. Throughout this
section, Figure 4.1 may be used to give some graphical significance to the approach.
Also, the following assumption will hold for the remainder of this section.
Assumption 4.4.2. Assume without loss of generality that the state θ(h) has been rotated so that θ̄(h) = 0. Note then that
$$\bar{x}(h) = [\bar{\rho}(h), 0]^T, \tag{4.16}$$
and $\theta(h) \in \mathcal{A} \cup \mathcal{U} \implies \theta_i \in \{0, \pi\}$, i = 1, . . . , N. Also, note that x̄(h + 1) will not typically
lie on the x-axis.
Lemma 4.4.3. Denote by $B_r(c)$ the open ball of radius r centered at the point c. If there
exists a nonempty $\mathcal{I} \subseteq \{1, \ldots, N\}$ such that
$$x_i(h + 1) \in B_{\bar{\rho}(h)}(x_i(h) - \bar{x}(h)) \quad \text{for } i \in \mathcal{I}, \tag{4.17}$$
and
$$x_i(h + 1) = x_i(h) \quad \text{otherwise}, \tag{4.18}$$
then
$$\bar{x}(h + 1) \in B_{\bar{\rho}(h)}(0). \tag{4.19}$$
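Proof. To begin, add and subtract x̄(h) from the definition of x̄(h + 1):
\begin{align}
\bar{x}(h + 1) &= \bar{x}(h) + \frac{1}{N} \sum_{i=1}^{N} \left(x_i(h + 1) - x_i(h)\right) \tag{4.20} \\
&= \bar{x}(h) + \frac{1}{N} \sum_{i \in \mathcal{I}} \left(x_i(h + 1) - x_i(h)\right). \tag{4.21}
\end{align}
Then, from (4.17),
$$x_i(h + 1) - x_i(h) \in B_{\bar{\rho}}(-\bar{x}(h)) \quad \text{for } i \in \mathcal{I}. \tag{4.22}$$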
Figure 4.1: This figure shows the general setup of the proofs in this chapter. The unit
circle is shown in light green (gray). The phasor centroid is drawn at x̄ = [0.5, 0] and is
marked with a red “o”, the dot is $x_i$, and the diamond is located at $x_i - \bar{x}$. The square
denoted $x^*$ is the farthest point $x_i$ could move around the unit circle before leaving the
shaded ball, $B_{\bar{\rho}}(x_i - \bar{x})$. (Labeled quantities: $x_i(h)$, $x_i^*(h)$, $x_i(h) - \bar{x}(h)$, $\bar{x}(h)$, $\bar{\rho}(h)$, $\theta_i(h)$, $\psi_i^*(h)$.)
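Using this result, (4.21) can be rewritten as
$$\bar{x}(h + 1) = \bar{x}(h) + \frac{1}{N} \sum_{i \in \mathcal{I}} p_i, \tag{4.23}$$
where $p_i \in B_{\bar{\rho}}(-\bar{x}(h))$. Now, note that
$$\frac{1}{N} \sum_{i \in \mathcal{I}} p_i = \frac{\gamma}{|\mathcal{I}|} \sum_{i \in \mathcal{I}} p_i = \gamma \bar{p} \tag{4.24}$$
with $\gamma = |\mathcal{I}|/N \in (0, 1]$. Further, $\gamma p_i \in B_{\bar{\rho}(h)}(-\bar{x}(h))$, which guarantees
$$\gamma \bar{p} \in B_{\bar{\rho}(h)}(-\bar{x}(h)), \tag{4.25}$$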
because the mean of a set of points must lie within the (open) convex hull of those
points. Returning now to (4.23) and using (4.24),
$$\bar{x}(h + 1) = \bar{x}(h) + \gamma \bar{p}, \tag{4.26}$$
which, when combined with (4.25), reveals
$$\bar{x}(h + 1) \in B_{\bar{\rho}(h)}(\bar{x}(h) - \bar{x}(h)) = B_{\bar{\rho}(h)}(0). \tag{4.27}$$
Lemma 4.4.3 will be used to show that the phasor centroid at h + 1 will be closer to
the origin than at h (4.19), provided that individual state changes satisfy (4.17).
Lemma 4.4.4. Starting from any point $x_i(h) = [\cos(\theta_i(h)), \sin(\theta_i(h))]^T$ on the unit circle, the heading update (4.2), (4.5) results in a new point,
$$x_i(h + 1) = [\cos(\theta_i(h + 1)), \sin(\theta_i(h + 1))]^T, \tag{4.28}$$
on the unit circle that satisfies
$$x_i(h + 1) \in B_{\bar{\rho}(h)}(x_i(h) - \bar{x}(h)), \tag{4.29}$$
provided −1 < K∆T < 0, ρ̄(h) > 0, and $\theta_i(h) \ne k\pi$, k ∈ Z.
Proof. Assume by symmetry that θi (h) ∈ (0, π) and note that θi (h + 1) ∈ (θi (h), π). The
magnitude of the phasor centroid, ρ̄(h), must lie in the open interval between zero and
one because ρ̄(h) > 0 from the lemma statement, and ρ̄(h) = 1 implies θi (h) = 2kπ,
which contradicts the lemma statement.
A circle of radius ρ̄(h) ∈ (0, 1) centered at $x_i(h) - \bar{x}(h)$ intersects the unit circle
exactly twice, for $\theta_i(h) \ne 0$. One of these points must lie at $x_i(h)$ and the other at
a point $x_i^*(h) = 1 \angle \theta_i^*(h)$ such that $\theta_i^*(h) \in (\theta_i(h), \pi)$, see Figure 4.1. Define $\psi_i^*(h) = \theta_i^*(h) - \theta_i(h) \in (0, \pi - \theta_i(h))$ and note that
$$\begin{bmatrix} \cos(\theta_i(h) + \alpha) \\ \sin(\theta_i(h) + \alpha) \end{bmatrix} \in B_{\bar{\rho}(h)}(x_i(h) - \bar{x}(h)) \tag{4.30}$$
holds for all $\alpha \in (0, \psi_i^*(h))$. In what follows, we show that $\psi_i(h) = \theta_i(h + 1) - \theta_i(h) < \psi_i^*(h)$, thereby guaranteeing that $x_i(h + 1) \in B_{\bar{\rho}(h)}(x_i(h) - \bar{x}(h))$. In other words, $x_i(h + 1)$
will lie on the bold portion of the green unit circle shown in Figure 4.1.
Applying the law of sines to the triangle highlighted in Figure 4.1,
$$\bar{\rho}(h) \sin(\theta_i(h)) = \|x_i(h) - \bar{x}(h)\| \sin(\psi_i^*(h)/2). \tag{4.31}$$
Recalling (4.5) with θ̄ = 0 from Assumption 4.4.2,
\begin{align}
\psi_i(h) &= -K \Delta T \, \bar{\rho}(h) \sin(\theta_i(h)) \tag{4.32} \\
&< \bar{\rho}(h) \sin(\theta_i(h)), \tag{4.33}
\end{align}
for −1 < K∆T < 0. Using (4.31) with (4.33),
\begin{align}
\psi_i(h) &< \|x_i(h) - \bar{x}(h)\| \sin(\psi_i^*(h)/2) \tag{4.34} \\
&\le 2 \sin(\psi_i^*(h)/2) \tag{4.35} \\
&< \psi_i^*(h), \tag{4.36}
\end{align}
because $\|x_i(h) - \bar{x}(h)\| \le 2$ by the triangle inequality and because sin(x/2) ≤ x/2 for
x > 0.
Referring to Figure 4.1, Lemma 4.4.4 establishes that phase coupling will move an
agent from xi (h) to some point on the bold portion of the green unit circle.
Theorem 4.4.5 (All-to-All Balanced Set Stability in Discrete Time). For N > 1, the
discrete time system (4.2), (4.5) is asymptotically stable to the balanced set for −1 <
K∆T < 0 from all initial conditions $\theta(0) \in \mathbb{T}^N \setminus (\mathcal{U} \cup \mathcal{A})$.
Proof. Take as a candidate Lyapunov function
$$V(h) = \bar{\rho}(h), \tag{4.37}$$
which is non-negative and equals zero for all θ ∈ B. We will show that for θ(h) ∉ B,
V(h) is monotonically decreasing in h by showing that $\bar{x}(h + 1) \in B_{\bar{\rho}}(0)$.
First, assume that the state θ(h) ∉ U ∪ A, and define J as
$$\mathcal{J} = \{j \in 1, \ldots, N \mid \theta_j(h) \ne k\pi, \; k \in \mathbb{Z}\}. \tag{4.38}$$
Either the state is balanced, θ(h) ∈ B, or the index set is nonempty, |J| > 0. If i ∈ J,
$x_i(h + 1) \in B_{\bar{\rho}(h)}(x_i(h) - \bar{x}(h))$, from Lemma 4.4.4, and otherwise $x_i(h + 1) = x_i(h)$
because sin(θ̄(h) − θi(h)) = 0 in (4.5). Lemma 4.4.3 can then be used to conclude that
$\bar{x}(h + 1) \in B_{\bar{\rho}(h)}(0)$, so V(h + 1) < V(h).
Now, assume that the state θ(h) ∈ U ∪ A. These points are invariant, but we
will show by linearization that they are unstable. Without loss of generality, from
Assumption 4.4.2 and the definitions of U and A, reorder the state so that
$$\theta_i = \begin{cases} 0 & \text{for } i = 1, \ldots, \eta \\ \pi & \text{for } i = \eta + 1, \ldots, N, \end{cases} \tag{4.39}$$
and also note that η > N/2 > 0 because ρ̄(h) lies on the positive x-axis (θ(h) ∉ B).
The linearized state transition matrix has η eigenvalues at 1 − K∆T ρ̄(h)/N, which are
unstable for ρ̄(h) > 0 and −1 < K∆T < 0, as is the case here. Thus, because η > 0,
θ is unstable for all θ ∈ U ∪ A. Note, therefore, that the state will never reach U ∪ A
because all points in this union are unstable and θ(0) ∉ U ∪ A.
Thus, V(h + 1) < V(h) unless V(h) = 0, in which case the state is already balanced. LaSalle’s Invariance Principle for discrete time systems can be used to conclude
that the state will converge asymptotically to the largest invariant set, which here is
the balanced set alone.
Note that Theorem 4.4.5 establishes a sufficient condition only. If K∆T is less
than −1, Lemma 4.4.4 fails to hold. From our simulation study in [79], the Lyapunov
function used here appears to work for |K∆T| up to two, as stated in Conjecture 4.4.1. Ongoing work is aimed at
formalizing this result.
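The monotone decrease of V(h) = ρ̄(h) claimed by Theorem 4.4.5 can be observed in a short simulation of (4.2) with a negative gain. The sketch below records the order parameter at every step; the function names, the three-agent initial state, the gain K∆T = −0.5, and the number of steps are illustrative assumptions.

```python
import cmath
import math


def order_parameter(theta):
    """Magnitude of the phasor centroid, rho_bar, used as V(h) in (4.37)."""
    return abs(sum(cmath.exp(1j * t) for t in theta) / len(theta))


def step(theta, k_dt):
    """One step of the all-to-all discrete-time model (4.2); k_dt is K * Delta T."""
    n = len(theta)
    return [ti + (k_dt / n) * sum(math.sin(tj - ti) for tj in theta)
            for ti in theta]


def simulate_balancing(theta0, k_dt, steps):
    """Iterate (4.2) and return the final state and the history of rho_bar."""
    theta = list(theta0)
    history = [order_parameter(theta)]
    for _ in range(steps):
        theta = step(theta, k_dt)
        history.append(order_parameter(theta))
    return theta, history


final_theta, rho_history = simulate_balancing([0.3, 1.0, 2.5], -0.5, 200)
```

For three agents the balanced set consists of headings spaced 120° apart, so the final state can also be inspected directly for that spacing.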
4.5 Reference Set Stability
In this section, we modify the phase-coupled agent model to include a reference vector. The objective is to drive the phasor centroid to this reference vector, as in Chap-
69
ter 2. The main contribution presented in this section is a proof that the referenceaugmented system is asymptotically stable in discrete time, for a range of K∆T .
Let the reference vector be xref = ρref∠θref ∈ B1(0), and consider the reference-augmented phase-coupled agent system,
\[ \theta_i(h+1) = \theta_i(h) + \frac{K\Delta T}{N} \sum_{j=1}^{N} \sin(\theta_j(h) - \theta_i(h)) - K\Delta T\, \rho_{ref} \sin(\theta_{ref} - \theta_i(h)). \tag{4.40} \]
As with the standard phase-coupled oscillator model, the reference-augmented model
can be written in mean field coupling form,
θi (h + 1) = θi (h) + K∆T ρ̃(h) sin(θ̃(h) − θi (h)),
(4.41)
where ρ̃(h) and θ̃(h) are the magnitude and phase of centroid error, x̃(h) = x̄(h) − xref .
One main difference between (4.41) and (4.5) is that ρ̃ ≤ 1 + ρref ≤ 2, whereas ρ̄ ≤ 1.
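The equivalence of the pairwise form (4.40) and the mean field form (4.41) can be checked numerically. The following is a minimal sketch, assuming NumPy; the function names, the seed, and the parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def step_pairwise(theta, k_dt, rho_ref, theta_ref):
    """One step of (4.40): pairwise sine coupling plus the reference term."""
    diffs = theta[None, :] - theta[:, None]   # diffs[i, j] = theta_j - theta_i
    coupling = np.sin(diffs).mean(axis=1)     # (1/N) * sum_j sin(theta_j - theta_i)
    return theta + k_dt * coupling - k_dt * rho_ref * np.sin(theta_ref - theta)

def step_mean_field(theta, k_dt, rho_ref, theta_ref):
    """One step of (4.41): coupling through the centroid error x_tilde."""
    x_bar = np.exp(1j * theta).mean()                    # phasor centroid
    x_tilde = x_bar - rho_ref * np.exp(1j * theta_ref)   # centroid error
    return theta + k_dt * np.abs(x_tilde) * np.sin(np.angle(x_tilde) - theta)

rng = np.random.default_rng(0)                # arbitrary seed, illustration only
theta0 = rng.uniform(-np.pi, np.pi, size=6)
pairwise = step_pairwise(theta0, -0.5, 0.4, 0.3)
mean_field = step_mean_field(theta0, -0.5, 0.4, 0.3)
print("largest difference between the two forms:",
      np.max(np.abs(pairwise - mean_field)))
```

The two updates agree to rounding error because both the coupling term and the reference term are imaginary parts of a phasor rotated by −θi, so their difference is the rotated centroid error.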
The following assumption will be made throughout the remainder of this section.
Assumption 4.5.1. Assume without loss of generality that the state, θ(h), has been
rotated so that θ̃(h) = 0. Note θ̃(h + 1) ≠ 0 in general.
Recall the definition of the matched submanifold from (2.31). Analogous to the unstable set from earlier in this chapter (4.7), the reference-augmented unstable set, Ũ, can be defined as
\[ \tilde{\mathcal{U}} = \{ \theta \in \mathbb{T}^N \mid \sin(\tilde{\theta} - \theta_i) = 0 \ \forall i,\ \bar{x} \neq x_{ref} \}. \tag{4.42} \]
The following theorem establishes asymptotic stability to a state in which the phasor
centroid matches the reference vector.
Theorem 4.5.2 (Reference Set Stability in Discrete Time). The discrete time reference-augmented phase-coupled system (4.40), (4.41) is asymptotically stable to the reference set for −2/(2 + ρref) < K∆T < 0, provided θ(0) ∉ Ũ and N > 1.
Proof. The proof is similar to the one presented for Theorem 4.4.5 with the main difference being that ρ̃(h) replaces ρ̄(h) and θ̃(h) replaces θ̄(h). To avoid repetition, only
differences from the proof of Theorem 4.4.5 will be described here.
Lemma 4.4.3 remains unchanged other than the notation change. Lemma 4.4.4
requires a slight modification to account for the new range of K∆T; however, the result
will remain unchanged. Specifically, for −2/(2 + ρref) < K∆T < 0, (4.33) becomes
\[ \psi_i(h) < \frac{2\tilde{\rho}(h)}{2 + \rho_{ref}} \sin(\theta_i(h)). \tag{4.43} \]
Using the triangle inequality, ρ̃(h) = ‖x̃(h)‖ ≤ 2 + ρref, so
ψi (h) < 2 sin(ψi∗ /2),
(4.44)
as in (4.35), and the conclusion remains ψi (h) < ψi∗ .
The only change required to the text of Theorem 4.4.5 is that Ũ replaces A ∪ U. All
states θ ∈ Ũ are unstable. Then, as before, LaSalle’s Invariance Principle can be used
to conclude asymptotic stability to the matched submanifold (2.31).
Note that if the reference vector is unknown, −2/3 < K∆T < 0 is always sufficient.
Also, if the reference vector is at the origin, the result of Theorem 4.4.5 is recovered.
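The gain range of Theorem 4.5.2 and the two remarks above can be illustrated with a short sketch. It assumes NumPy and the mean field form (4.41); the reference value, seed, group size, and step count are hypothetical choices, and the simulation is only an illustration, not a substitute for the proof.

```python
import numpy as np

def gain_lower_bound(rho_ref):
    """Lower end of the sufficient gain range in Theorem 4.5.2."""
    if not 0.0 <= rho_ref <= 1.0:
        raise ValueError("the reference must lie in the unit disk")
    return -2.0 / (2.0 + rho_ref)

def centroid_error_history(theta, k_dt, x_ref, steps):
    """Iterate (4.41) and record the centroid error magnitude at each step."""
    theta = np.asarray(theta, dtype=float).copy()
    errors = []
    for _ in range(steps):
        x_tilde = np.exp(1j * theta).mean() - x_ref
        errors.append(abs(x_tilde))
        theta = theta + k_dt * abs(x_tilde) * np.sin(np.angle(x_tilde) - theta)
    return np.array(errors)

x_ref = 0.5 * np.exp(1j * 0.8)               # hypothetical reference vector
k_dt = 0.5 * gain_lower_bound(abs(x_ref))    # halfway into the sufficient range
rng = np.random.default_rng(1)               # arbitrary seed
history = centroid_error_history(rng.uniform(-np.pi, np.pi, 8), k_dt, x_ref, 200)
print("bound for rho_ref = 0:", gain_lower_bound(0.0))
print("bound for rho_ref = 1:", gain_lower_bound(1.0))
print("first and last centroid error:", history[0], history[-1])
```

Since ρref ≤ 1, the bound −2/(2 + ρref) is never above −2/3, which is why −2/3 < K∆T < 0 suffices when the reference is unknown; ρref = 0 gives the bound −1 of Theorem 4.4.5.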
4.6 Network Routing Optimization
In this section, energy efficient communication schemes¹ are addressed for realizing
the sequence of logical one-to-all communications needed by the discrete-time coordinated controller presented in the previous section of this chapter. The results were
first published in [41], and a more complete version with additional material was later
published in [42].
From a wireless networking perspective, one can obtain results on the energy optimal realization of each broadcast tree which transfers M bits of information (representing the heading θb (h)) from the broadcaster to all others in no more than ∆T seconds. This result is a generalization of energy-efficient multicast trees as it includes
a hard deadline of ∆T for a multicast session. For notational simplicity, the effective
transmission rate will be denoted by R̄ = M/∆T .
¹ This work was done in collaboration with Prof. Tara Javidi, a communication theorist, and her graduate student, Phillip Lee, from the University of California at San Diego.
Note that the simplest routing/relaying strategy is a single-hop wireless broadcast,
while other options include multi-hop routing and relaying (also known as gossiping)
(see Figure 4.2). As will be shown below, the delay constraint imposed on the broadcast
transmission has a significant impact on the solution to the minimum energy routing
problem. The tradeoff between single hop broadcasting and multi-hop transmission is,
in principle, the tradeoff between energy saving in transmission rate and transmission
distance. When the message is broadcast to all nodes in the network in a single hop,
the source node can use the entire time interval ∆T and transmit at a lower rate; but
the transmission has to reach the farthest node in the network. On the other hand,
when the message is relayed via intermediate nodes, the distance of each hop is smaller
but the effective transmission rate for each node is larger than the single hop case.
To better describe this tradeoff, define f(Ri(t)) as the expected SNR required for
reliable communication at rate Ri(t), t ∈ [h∆T, (h+1)∆T]. Suppose f(Ri(t)) grows faster
than polynomial order with an increase in transmission rate. Taking into account
that the typical power loss along a transmission path is only polynomial order, this
trend intuitively suggests that savings in transmission rate would be more crucial as
transmissions are constrained by a tight deadline. This result is formalized in this
section: when the effective information rate (R̄ = M/∆T ) is above some threshold Rc ,
single-hop broadcasting (a star topology) minimizes energy while meeting the strict
delay deadline ∆T. Note that this result is a direct multi-hop counterpart of the lazy scheduling
results in [84] and [93]. The result will first be proved for the case of a linear network (i.e.,
nodes located along a line) with three nodes and a path loss exponent of four; it is
then generalized to three-node networks of general topology and, from there,
to networks of more than three users.
4.6.1 Energy Efficient Routing for Broadcast Networks
[Figure 4.2: panels (a) Logical Graph, (b) Long Range Broadcast, (c) Multi-Hop Gossip; annotations: Deadline = ∆T, transmission times ∆T and ∆T/3.]
Figure 4.2: A logical one-to-all graph (a) can be realized via simple wireless broadcasting (b) or gossip schemes (c).
Consider a network with N nodes where the path loss exponent is assumed to be four². The problem of topology choice reduces to finding the optimal transmission SNR Pi(t), i = 1, 2, ..., N, at time t ∈ [h∆T, (h + 1)∆T] (without loss of generality, the noise power is normalized at the receiver to value 1). The following assumptions are made regarding the operation of a network in service of coordinated control.
² This parameter can be readily extended to any arbitrary value.
Technical Assumptions
A1. Each node can transmit with high enough power for a wireless broadcast transmission to reach the farthest node in a single hop;
A2. The interference model is the protocol model in [23], where the guard zone is as
large as the diameter of the network³;
A3. The received signal power at a distance d from the transmitting node varies as
d^(−a), where the path loss exponent a depends on the characteristics of the transmitting medium (the path loss exponent is generally assumed to be between 2 and 4 for
wireless settings);
A4. Cooperative relaying, or network coding, is not considered, hence, the work follows a simple packet switching model with separation of layer functionalities;
and
A5. Each update message has the same format and the same size of M bits.
³ Meaning only one link can be active during each transmission session.
Mathematically, these assumptions reduce to the following optimization problem:
\[ \min_{\{P_i(\cdot)\}_{i=1}^{N} \in \mathcal{P}} \ \sum_{i=1}^{N} \int_0^{\Delta T} P_i(t)\, dt \tag{4.45} \]
subject to
\[ P_i(t) \geq 0, \quad i = 1, 2, \dots, N, \tag{4.46} \]
\[ P_i(t) P_j(t) = 0 \quad \text{for all } i \neq j, \tag{4.47} \]
\[ P_i(t) = f(R_i(t))\, d_{i j_t(i)}^{4}, \tag{4.48} \]
\[ M = \int_0^{\Delta T} R_i(t)\, dt \quad \forall i, \tag{4.49} \]
where P is the class of collective power allocation policies {Pi(·)}, i = 1, ..., N, satisfying the constraints (4.46)-(4.49), and jt(i) is the farthest node to whom, at time t, node i attempts to deliver
its M bits reliably. In other words, one can interpret f (R̄) to be the transmit power
required to transmit 1 bit of information with rate R̄ to a node unit distance away. In
this chapter, a case is considered where transmissions over links follow the Shannon
capacity formula for an AWGN channel, i.e. the expected transmit SNR required for
reliable communication at rate R(t) to a node at distance d is given by
\[ P_i(t)/\sigma = d^{a} \left( 2^{2R(t)} - 1 \right), \tag{4.50} \]
where σ is the noise level at the receiver; therefore f(R̄) is written as
\[ f(\bar{R}) = 2^{2\bar{R}} - 1. \tag{4.51} \]
Extensions to a broader class of functions f (R̄) can be found in [42].
The main result of this section is the following theorem:
Theorem 4.6.1. If M and ∆T are chosen such that
\[ \frac{M}{\Delta T} = \bar{R} \geq 4, \]
a single hop broadcasting scheme (see Figure 4.2) is more energy efficient than any
multi-hop relaying scheme.
To prove this theorem, first the optimal transmission policy within each transmission session and over every single hop is identified. In particular, a fixed rate transmission session is shown to be optimal in terms of minimizing total energy consumption.
In [93], Zafer and Modiano have used the concept of a minimum departure curve and
proposed constructive algorithms to build the optimal departure curve for given delay
constraints with single hop point-to-point communications. Graphically, their results
suggest that the optimal departure curve is the one that has the shortest length among
all feasible curves constrained between the minimum departure and arrival curves.
This result is readily applicable to the setting here when considering a single transmission session over a single-hop for any transmitting node (one contiguous interval
during which Pi (t) > 0) . The following lemma establishes this fact.
Lemma 4.6.2. A necessary condition for optimality is that each node has to transmit
at a fixed rate within each transmission session.
Proof. Assume that the optimal transmission policy Fjk (t) for node j does not transmit
at a fixed rate in its kth transmission session, and denote the corresponding energy
consumption to be Wk . Based on the result in [93], another feasible policy can always
be found that is more energy efficient than Fjk (t). The existence of this policy can be
simply shown by using the same argument in [93]: for any feasible departure curve,
replace a small portion of it by a straight line (corresponding to a fixed rate policy);
the new transmission policy is still feasible, but the total transmission energy corresponding to the new policy would be smaller than the original one. Therefore, another
feasible fixed rate policy can always be found with energy consumption W′k such that
W′k < Wk, which contradicts the optimality assumption of the original policy. This argument shows that fixed rate transmission within each session is a necessary condition
for any active node’s optimal transmission policy.
To prove the main result, first consider a linear network with three nodes. The
energy required for relaying R̄∆T bits of information in ∆T seconds is given by
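\[ E_m(\alpha, \beta, \bar{R}) = \Delta T\, D^4 \left( \beta^4 \alpha \left( 2^{2\bar{R}/\alpha} - 1 \right) + (1-\beta)^4 (1-\alpha) \left( 2^{2\bar{R}/(1-\alpha)} - 1 \right) \right), \tag{4.52} \]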
where α∆T denotes the time fraction spent on transmission between the source node
and the intermediate node, and βD denotes the distance fraction between the source
node and the intermediate node. By construction, α, β ∈ (0, 1). On the other hand, the
energy required for single-hop broadcasting is written as
\[ E_s(\bar{R}) = \Delta T\, D^4 \left( 2^{2\bar{R}} - 1 \right). \tag{4.53} \]
Remark 4.6.3. Without loss of generality, and for simplicity of notation, ∆T D^4 is normalized to one.
Lemma 4.6.4 below establishes the optimality of the single-hop broadcasting scheme
for a three node linear network.
Lemma 4.6.4. Consider a linear network of 3 nodes. If M = R̄∆T and ∆T are such that
\[ \frac{M}{\Delta T} = \bar{R} \geq 4, \tag{4.54} \]
the energy of broadcasting M = R̄∆T bits in ∆T seconds in a single hop fashion is less than that of any relaying scheme, independent of the ratio of the node distances, β ∈ (0, 1), and the fraction of time allocated to each hop, α ∈ (0, 1).
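The claim of Lemma 4.6.4 can be explored numerically by evaluating (4.52) and (4.53) on a grid of α and β. The sketch below assumes NumPy and the normalization ∆T D^4 = 1 of Remark 4.6.3; the function names, the grid, and the list of rates are hypothetical choices. A grid search only illustrates the lemma and does not replace its proof.

```python
import numpy as np

def energy_single_hop(rate):
    """E_s from (4.53), normalized so that dT * D**4 = 1."""
    return 2.0 ** (2.0 * rate) - 1.0

def energy_two_hop(alpha, beta, rate):
    """E_m from (4.52), same normalization; alpha, beta lie in (0, 1)."""
    first = beta ** 4 * alpha * (2.0 ** (2.0 * rate / alpha) - 1.0)
    second = ((1.0 - beta) ** 4 * (1.0 - alpha)
              * (2.0 ** (2.0 * rate / (1.0 - alpha)) - 1.0))
    return first + second

# Grid kept away from 0 and 1 so the exponentials stay within float range.
grid = np.linspace(0.05, 0.95, 37)
alpha, beta = np.meshgrid(grid, grid)
for rate in (0.5, 1.0, 2.0, 4.0):
    ratio = energy_two_hop(alpha, beta, rate).min() / energy_single_hop(rate)
    print(f"rate = {rate}: min E_m / E_s on the grid = {ratio:.3g}")
```

A ratio below one means that some relaying scheme on the grid uses less energy than the single-hop broadcast; a ratio above one means the single hop wins for every grid point at that rate.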
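Using a Taylor series expansion in R̄, Em(α, β, R̄) and Es(R̄) can then be rewritten
as follows:
\[ E_m(\alpha, \beta, \bar{R}) = \sum_{n=1}^{\infty} \frac{(2\bar{R} \ln 2)^n}{n!} \left( \frac{\beta^4}{\alpha^{n-1}} + \frac{(1-\beta)^4}{(1-\alpha)^{n-1}} \right) = \sum_{n=1}^{\infty} a_n(\bar{R})\, b_n(\alpha, \beta), \tag{4.55} \]
\[ E_s(\bar{R}) = \sum_{n=1}^{\infty} \frac{(2\bar{R} \ln 2)^n}{n!} = \sum_{n=1}^{\infty} a_n(\bar{R}), \tag{4.56} \]
where
\[ a_n(\bar{R}) \equiv \frac{(2\bar{R} \ln 2)^n}{n!}, \qquad b_n(\alpha, \beta) \equiv \frac{\beta^4}{\alpha^{n-1}} + \frac{(1-\beta)^4}{(1-\alpha)^{n-1}}. \]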
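For completeness, the expansion step can be written out as follows. This is a short derivation sketch under the normalization ∆T D^4 = 1 of Remark 4.6.3, using only the exponential series; it is not taken from [42].

```latex
\begin{align*}
2^{2\bar{R}/\alpha} &= e^{(2\bar{R}\ln 2)/\alpha}
  = \sum_{n=0}^{\infty} \frac{(2\bar{R}\ln 2)^n}{n!\,\alpha^{n}}, \\
\alpha\left(2^{2\bar{R}/\alpha} - 1\right)
  &= \sum_{n=1}^{\infty} \frac{(2\bar{R}\ln 2)^n}{n!}\,\frac{1}{\alpha^{n-1}}, \\
(1-\alpha)\left(2^{2\bar{R}/(1-\alpha)} - 1\right)
  &= \sum_{n=1}^{\infty} \frac{(2\bar{R}\ln 2)^n}{n!}\,\frac{1}{(1-\alpha)^{n-1}}.
\end{align*}
```

Weighting the second and third lines by β^4 and (1 − β)^4 and adding them gives the series for Em; the series for Es is the same expansion applied to 2^{2R̄} − 1.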
Before proceeding with the proof of the above lemma, the properties of the sequences an (R̄), bn (α, β) must be established.
Lemma 4.6.5. Properties of bn(α, β).
• For a given α, β ∈ (0, 1), bn(α, β) is a monotonically increasing function in n.
• For all α, β ∈ (0, 1), and ∀n ≥ 4, bn(α, β) ≥ 1.
• For all α, β ∈ (0, 1), and ∀n ≥ 7, bn(α, β) ≥ 1/b1(α, β).
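The three properties of Lemma 4.6.5 can be spot-checked on a grid. The sketch below assumes NumPy; the grid resolution, the range of n, and the small tolerance (needed because the second property holds with equality when α = β) are hypothetical choices. A grid check is an illustration only; the proofs are in [42].

```python
import numpy as np

def b(n, alpha, beta):
    """b_n(alpha, beta) as defined after (4.56)."""
    return beta ** 4 / alpha ** (n - 1) + (1.0 - beta) ** 4 / (1.0 - alpha) ** (n - 1)

grid = np.linspace(0.01, 0.99, 99)
alpha, beta = np.meshgrid(grid, grid)
tol = 1e-9  # absorbs rounding where a bound is attained with equality

# Property 1: b_n increases with n (checked for n = 1, ..., 9).
monotone = all(np.all(b(n + 1, alpha, beta) > b(n, alpha, beta)) for n in range(1, 10))
# Property 2: b_4 >= 1; by monotonicity this extends to all n >= 4.
b4_ok = bool(np.all(b(4, alpha, beta) >= 1.0 - tol))
# Property 3: b_7 >= 1 / b_1, written as b_7 * b_1 >= 1.
b7_ok = bool(np.all(b(7, alpha, beta) * b(1, alpha, beta) >= 1.0 - tol))

print("monotone in n:", monotone)
print("b_4 >= 1:", b4_ok)
print("b_7 >= 1 / b_1:", b7_ok)
```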
Lemma 4.6.6. If there exists Rc such that Em (α, β, Rc ) ≥ Es (Rc ) for some α and β, then
Em (α, β, R̄) ≥ Es (R̄) is also true for any R̄ ≥ Rc .
The proof of this lemma as well as other proofs below are omitted here for brevity,
but can be found in [42]. The next lemma provides the last step in proving Lemma 4.6.4.
Lemma 4.6.7. At R̄ = 4, Em (α, β, R̄) > Es (R̄) ∀ α, β ∈ (0, 1).
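Proof. Define dn(α, β, R̄) as
\[ d_n(\alpha, \beta, \bar{R}) = \frac{(2\bar{R} \ln 2)^n}{n!} \left( \frac{\beta^4}{\alpha^{n-1}} + \frac{(1-\beta)^4}{(1-\alpha)^{n-1}} - 1 \right). \]
In other words,
\[ E_m(\alpha, \beta, \bar{R}) - E_s(\bar{R}) = \sum_{n=1}^{\infty} d_n(\alpha, \beta, \bar{R}). \]
Then it must be shown that
\[ \sum_{n=1}^{\infty} d_n(\alpha, \beta, 4) > 0 \]
∀ α, β ∈ (0, 1). For simplicity, write dn(α, β, 4) as dn.
\[ \begin{aligned} \sum_{n=1}^{\infty} d_n &> \sum_{n=1}^{3} d_n + \sum_{n=7}^{\infty} d_n \\ &> (b_1 - 1) \sum_{n=1}^{3} a_n + \left( \frac{1}{b_1} - 1 \right) \sum_{n=7}^{\infty} a_n \\ &= (1 - b_1) \left( \frac{1}{b_1} \sum_{n=7}^{\infty} a_n - \sum_{n=1}^{3} a_n \right), \end{aligned} \]
where the first inequality is from the fact dn ≥ 0 for n ≥ 4; the second inequality is
from Lemma 4.6.5. Because b1 < 1, the last term is positive if and only if
\[ b_1 < \frac{\sum_{n=7}^{\infty} a_n}{\sum_{n=1}^{3} a_n}. \]
The validity of this result is verified numerically. At R̄ = 4, \( \sum_{n=7}^{\infty} a_n > 80 \), and \( \sum_{n=1}^{3} a_n < 51 \). Therefore
\[ b_1 = \beta^4 + (1-\beta)^4 < 1 < 1.5 < \frac{\sum_{n=7}^{\infty} a_n}{\sum_{n=1}^{3} a_n}, \]
and at R̄ = 4, \( \sum_{n=1}^{\infty} d_n > 0 \) for all α, β ∈ (0, 1).
Finally from Lemma 4.6.6, Em(α, β, R̄) > Es(R̄) is proven for all α, β ∈ (0, 1) when
R̄ ≥ 4.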
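The two partial sums used in the numerical verification step are easy to reproduce. The sketch below uses only the Python standard library; the function name is hypothetical. The tail is obtained from the closed form of the full series, which sums to 2^{2R̄} − 1.

```python
import math

def a(n, rate):
    """a_n(R) = (2 R ln 2)^n / n! from the series for E_s."""
    return (2.0 * rate * math.log(2.0)) ** n / math.factorial(n)

rate = 4.0
head = sum(a(n, rate) for n in range(1, 4))            # n = 1, 2, 3
total = 2.0 ** (2.0 * rate) - 1.0                       # sum over all n >= 1
tail = total - sum(a(n, rate) for n in range(1, 7))     # n >= 7

print("sum of a_1..a_3:", head)
print("sum of a_n for n >= 7:", tail)
print("ratio tail / head:", tail / head)
```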
The above work can be extended to the case of the general network with multiple
users in the following steps.
Corollary 4.6.8. Single hop broadcasting is more energy efficient than multi-hopping
in any network with three users when R̄ ≥ 4.
Proof. Consider a network with three users where S denotes the source node, Unear
the closest node to the source node, and Ufar the farthest node from the source node.
Let D be the distance between the source node and the farthest node. By the triangle
inequality,
\[ D = d(S, U_{far}) \leq d(S, U_{near}) + d(U_{near}, U_{far}). \tag{4.57} \]
Defining Dnew = d(S, Unear) + d(Unear, Ufar), the energy required for multi-hopping is
\[ E_m(\alpha, \beta, \bar{R}, D_{new}) = \Delta T\, D_{new}^4 \left( \beta^4 \alpha \left( 2^{2\bar{R}/\alpha} - 1 \right) + (1-\beta)^4 (1-\alpha) \left( 2^{2\bar{R}/(1-\alpha)} - 1 \right) \right). \tag{4.58} \]
However, from Lemma 4.6.4, when R̄ ≥ 4, the above function is always greater than
Es(R̄, Dnew) = ∆T D_new^4 f(R̄). Because Dnew ≥ D from the triangle inequality, the following inequality is true for all α, β ∈ (0, 1):
\[ E_m(\alpha, \beta, \bar{R}, D_{new}) > E_s(\bar{R}, D_{new}) \geq E_s(\bar{R}, D). \tag{4.59} \]
Therefore, in a network with three users, single hop broadcasting is always more energy efficient than multi-hopping when R̄ ≥ 4.
Now consider a network with N users, where links are activated one at a time to
avoid interference, i.e. the interference model is the protocol model in [23]. In this
setting, the main result follows inductively as a corollary to Corollary 4.6.8.
Theorem 4.6.1: Single hop broadcasting is more energy efficient than multi-hopping
in any network when R̄ ≥ 4.
4.7 Performance vs. K∆T
This section contains simulation results to support the theoretical findings in this
chapter and to specifically highlight performance aspects of the discrete-time phase-coupled agent model, which are difficult to determine analytically.
As the product K∆T is varied within [0, 2], the time required for the state to converge within an ε-ball of alignment (the settling time) will vary. The general trend of this
variation is depicted in Figure 4.3, where the number of iterations needed for the state
to converge within an ε-ball is plotted against K∆T for the case of all-to-all communication (indicating the result in (4.15)) and a particular case of one-to-all communication.
Here ε = 10^−7. Convergence rate is similar for all-to-all and one-to-all broadcast, but
keep in mind the factor of two gain difference. The best performance is obtained with
K∆T near one because for lesser values of K∆T the system lacks control authority,
and for greater values of K∆T ringing occurs. To determine actual time to alignment,
the steps to alignment must be multiplied by ∆T, so for a fixed product K∆T a smaller ∆T means faster convergence.
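A settling-time count of this kind can be sketched for the all-to-all case as follows. The sketch assumes NumPy and assumes that (4.5) has the mean field form θi(h + 1) = θi(h) + K∆T ρ̄(h) sin(θ̄(h) − θi(h)), with positive K∆T for alignment as in this section. The function name, the distance measure (largest wrapped deviation from the centroid phase), and the iteration cap are hypothetical and may differ from what was used to produce Figure 4.3.

```python
import numpy as np

def steps_to_alignment(theta, k_dt, eps=1e-7, max_steps=10_000):
    """Count mean field iterations until every heading is within eps of the
    centroid phase. Returns None if max_steps is reached first."""
    theta = np.asarray(theta, dtype=float).copy()
    for step in range(max_steps + 1):
        x_bar = np.exp(1j * theta).mean()
        # Wrap each deviation from the centroid phase into (-pi, pi].
        deviation = np.angle(np.exp(1j * (theta - np.angle(x_bar))))
        if np.max(np.abs(deviation)) < eps:
            return step
        theta = theta + k_dt * np.abs(x_bar) * np.sin(np.angle(x_bar) - theta)
    return None

rng = np.random.default_rng(2)                  # arbitrary seed
theta0 = rng.uniform(-1.0, 1.0, size=10)        # hypothetical initial headings
for k_dt in (0.25, 0.5, 1.0, 1.5):
    print(f"K dT = {k_dt}: steps = {steps_to_alignment(theta0, k_dt)}")
```

Multiplying the returned step count by ∆T gives the time to alignment discussed above.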
Contrast this result with Figure 4.4 which shows the impact of the choice of the
time interval, ∆T , on the communication energy of realizing each broadcast of state
update via a single-hop network versus a multi-hop scenario. One clear trend is that
decreasing ∆T increases energy consumption. This result is the tradeoff between desired control performance and network realization energy.
4.8 Final Remarks
The material presented in this chapter relaxes the continuous-time communication
assumption of Chapter 2. Also, the need for all-to-all communication was slightly relaxed with the aligned set random one-to-all broadcast stability theorem. In each of
the theorems in this chapter, stability was shown to be dependent on the product of
the coupling strength, K, and the synchronous time step, ∆T.
[Figure 4.3: plot of Steps to Alignment (0 to 300) versus K∆T (0 to 2); series: Expected (all-to-all), Random (one-to-all).]
Figure 4.3: The general trend of number of iterations to convergence of headings to
an ε-ball versus K∆T for both an expected broadcast (i.e. all-to-all) and an actual one-to-all broadcast sequence (4.15) is shown. The same randomly selected broadcast sequence was used for each value of K∆T.
Future work on this topic could branch out in several directions. First, the balanced
set stability proof provides only a sufficient condition and is almost certainly conservative. Conjecture
4.4.1 suggests that the balanced set is stable for −2 < K∆T < 0 whereas the proof
only holds for −1 < K∆T < 0. Extending the current proof methodology to address the
conjecture appears to be non-trivial.
Another direction for future work is to examine more general, but still connected,
communication topologies. In typical distributed systems, each agent can only communicate with a subset of the group. Work has been done here with the continuous
time model [28], and it will be interesting to see how these results carry over to the
discretized model.
Finally, time delay is a consideration in many engineered distributed systems. Time
delay has been studied in the context of the continuous time model [36, 92] and preliminary simulation results [79] indicate that the discrete time phase-coupled agent
model is robust to delay, but this result needs to be formalized.
[Figure 4.4: plot of Normalized Energy (log scale, 10^−2 to 10^8) versus Time Discretization Step ∆T (ms, 0 to 10000); series: M = 7, M = 20, and M = 50, each for single hop and multi-hop.]
Figure 4.4: Total communication energy (normalized by noise) versus time discretization step, ∆T, for various levels of quantization, M, for a linear network. The tradeoff
between single hop and multi-hop is most apparent for the seven bit case, M = 7.
Chapter 5
COORDINATED TARGET TRACKING IN A CLUTTERED ENVIRONMENT
The previous chapters have focused on low-level coordinated control theory that
can be applied to a variety of potential applications. In this chapter, we shift focus to a
more realistic setting to consider a group of N UAV-like vehicles tasked with tracking
a target vehicle in a cluttered environment. The main questions asked here surround
coordinated target tracking at a rather high level. For example, what data should
each agent transmit and when should the communication take place? What benefit
does communication provide? The purpose of this chapter is to shed light on these and
other questions within the framework of a coordinated target tracking problem.
Specific emphasis is placed on a complete solution, including state estimation, communication, and control problems. Due to the realistic setting of the problems considered in this chapter, analytical results are intractable. Instead, we design controllers
based on nearby analytical solutions and rely upon Monte Carlo simulation methods
to evaluate and compare various control, communication, and estimation strategies.
As in the later parts of the previous chapter, a sequence of discrete-time broadcasts
is used for communications, here between constant-speed and non-holonomic pursuit
vehicles. To explore the question of what data each vehicle should transmit, three
possible communication protocols are considered: transmit nothing; transmit one, two,
or three recent measurements; and transmit a state estimate. Communicated data
arrives delayed, and sensors fail both stochastically and when the line of sight to the
target is occluded by an obstacle. Fusion of local measurements and received communications is achieved by a per-vehicle Unscented Kalman Filter (UKF). A behavior-based
controller, designed by Dr. Benjamin Triplett, is used on each vehicle to disperse the
vehicles about the target while avoiding collisions with other vehicles and obstacles.
5.0.1 Related Work
Perhaps the work most closely related to this investigation is the square root sigma
point information filter developed by Campbell and Whitacre [14, 89]. In Campbell’s
work, each pursuit vehicle shares information with every other pursuit vehicle after
each observation. By sharing information so frequently, each agent is effectively able to
recover the centralized target state estimate. Occasional communication dropouts and
delay were addressed with a queuing system. This approach is ideal when the number
of agents is sufficiently low and the available bandwidth is sufficiently high. The control objective was to maintain a fixed clock angle between vehicles, and obstacles were
not considered.
Olfati-Saber and others created a series of Distributed Kalman Filtering (DKF)
algorithms for sensor networks [62, 63, 64, 65, 77]. The objective of the DKF is to drive
the target state estimate on each node towards a common state estimate. As compared
to local (non-consensus) Kalman filtering, the DKF results in lower estimate error and
reduced estimate disagreement between nodes. However, the algorithm was designed
for linear target dynamics and communication delay was not considered.
The results presented in this chapter differ from these previous works in that allowable communication bandwidth per vehicle is assumed to be sufficiently low that
recovering the centralized state estimate is not possible. Another difference is the explicit inclusion of sensor-occluding obstacles. The behavior-based control is computed
in a distributed manner.
5.0.2 Organization
The work in this chapter¹ was originally presented in a conference paper [80] and
will soon be available in a journal article [81]. The organization is as follows. In
the next section, the vehicle dynamics, sensor model, and communication network are
introduced. Coordinated target estimation and pursuit are described in Sections 5.2
and 5.3, respectively. Simulation results and concluding remarks follow in Sections 5.4
and 5.5.
¹ Done in collaboration with Dr. Benjamin Triplett.
5.1 System Description
The system considered here consists of N pursuit vehicles and a single target vehicle.
The pursuit vehicles are equipped with camera-like sensors to measure the planar
position of the target. These sensors fail stochastically, and are less likely to make a
successful measurement when far from the target. The pursuit vehicles are capable
of exchanging information over a communication network. The environment contains
obstacles which occlude the sensor and impede the path of the vehicles.
5.1.1 Vehicle Dynamics
In line with the work presented in previous chapters, the individual pursuit vehicles
are modeled as constant speed unicycles,
\[ \frac{d}{dt} \begin{bmatrix} x^i \\ y^i \\ \theta^i \end{bmatrix} = \begin{bmatrix} v^i \cos(\theta^i) \\ v^i \sin(\theta^i) \\ u^i_{steer} \end{bmatrix}, \quad i = 1 \dots N. \tag{5.1} \]
Here r^i = [x^i, y^i] is the Cartesian position of the ith vehicle, θ^i is its heading, v^i is its
speed, and u^i_steer is the steering control input. These dynamics govern the movement of
the pursuit vehicles as they move in the plane. The target vehicle plant model assumed
for the estimation process is a second-order unicycle,
\[ \frac{d}{dt} \begin{bmatrix} x^t \\ y^t \\ \theta^t \\ v^t \\ \phi^t \end{bmatrix} = \begin{bmatrix} v^t \cos(\theta^t) \\ v^t \sin(\theta^t) \\ \phi^t \\ w_v \\ w_\phi \end{bmatrix}. \tag{5.2} \]
Here rt = [xt , y t ] is the Cartesian coordinate of the target, θt is its heading, v t is its
speed, and φt is its turn rate. wv and wφ are zero-mean Gaussian white noise terms
that allow the estimator to compensate for the fact that true target heading and speed
commands are unknown to the observing agents.
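The two models can be stepped forward in discrete time for simulation. The sketch below assumes NumPy and a forward Euler discretization, which is an illustrative choice and not necessarily the integrator used for the results in this chapter. The function names and the noise parameters sigma_v and sigma_phi are hypothetical; the noise increments are scaled by the square root of the step, as in an Euler-Maruyama scheme.

```python
import numpy as np

def pursuer_step(state, speed, u_steer, dt):
    """Forward Euler step of the constant speed unicycle (5.1).
    state = [x, y, theta]."""
    x, y, theta = state
    return np.array([x + dt * speed * np.cos(theta),
                     y + dt * speed * np.sin(theta),
                     theta + dt * u_steer])

def target_step(state, dt, rng=None, sigma_v=0.0, sigma_phi=0.0):
    """Forward Euler step of the second-order unicycle (5.2).
    state = [x, y, theta, v, phi]; noise is applied only when rng is given."""
    x, y, theta, v, phi = state
    w_v = w_phi = 0.0
    if rng is not None:
        w_v = sigma_v * np.sqrt(dt) * rng.standard_normal()
        w_phi = sigma_phi * np.sqrt(dt) * rng.standard_normal()
    return np.array([x + dt * v * np.cos(theta),
                     y + dt * v * np.sin(theta),
                     theta + dt * phi,
                     v + w_v,
                     phi + w_phi])

pursuer = pursuer_step(np.array([0.0, 0.0, 0.0]), speed=2.0, u_steer=0.0, dt=0.5)
target = target_step(np.array([0.0, 0.0, 0.0, 1.0, 0.2]), dt=0.1)
print("pursuer after one step:", pursuer)
print("target after one noise-free step:", target)
```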
5.1.2 Sensor Reliability Model
The sensors onboard the pursuit vehicles measure the planar position of the target
at regular discrete-time intervals of length Ts . However, measurements fail stochastically, with a rate depending on the distance between the vehicle and the target, see
Figure 5.1. Denoting by ρ^i = ‖r^i − r^t‖ the distance from the ith pursuit vehicle to the
target and by p^i : R → R the sensor reliability curve of agent i, the observation model
can be written as
\[ y^i(h) = \begin{cases} H x^t(h) + w^i_y & \text{with probability } p^i(\rho^i) \\ \emptyset & \text{otherwise,} \end{cases} \tag{5.3} \]
where H selects the target position only and wyi is Gaussian white noise.
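The observation model (5.3) can be sketched as a function that returns either a noisy position or nothing. The sketch assumes NumPy and the five-element target state of (5.2). The piecewise linear reliability curve and its two range parameters are hypothetical stand-ins; the actual curve used in this chapter is the one shown in Figure 5.1.

```python
import numpy as np

# H selects the planar position from the target state [x, y, theta, v, phi].
H = np.array([[1.0, 0.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0, 0.0]])

def reliability(distance, optimal_range=400.0, max_range=800.0):
    """Hypothetical curve: 1 inside the optimal range, falling linearly
    to 0 at the absolute sensor range."""
    return float(np.clip((max_range - distance) / (max_range - optimal_range), 0.0, 1.0))

def observe(target_state, pursuer_pos, noise_std, rng):
    """Return a noisy target position with probability p(rho), else None."""
    rho = np.linalg.norm(pursuer_pos - target_state[:2])
    if rng.random() >= reliability(rho):
        return None                      # measurement failed
    return H @ target_state + noise_std * rng.standard_normal(2)

rng = np.random.default_rng(3)           # arbitrary seed
target_state = np.array([10.0, 20.0, 0.0, 5.0, 0.0])
far = observe(target_state, np.array([5000.0, 0.0]), 1.0, rng)
near = observe(target_state, np.array([10.0, 20.0]), 0.0, rng)
print("far measurement:", far)
print("near measurement:", near)
```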
[Figure 5.1: plot of Sensor reliability (0 to 1) versus Distance to target (0 to 1000).]
Figure 5.1: Example sensor reliability dependence on vehicle distance to target. The
circle denotes the end of the optimal sensor range, while the x denotes the absolute
sensor range.
5.1.3 Communication Network
The pursuit vehicles communicate via a sequence of broadcast transmissions. On each
broadcast, one vehicle can communicate a limited amount of information to the other
agents. Only one vehicle can communicate during each broadcast session, and sessions occur
every Tb = nb Ts seconds, nb ∈ N. When the broadcasting agent is chosen sequentially, {1, 2, . . . , N, 1, 2, . . .}, the time between data transmissions for a given agent will
be N Tb . The pursuit vehicles are assumed to have synchronized clocks to support this
communication model.
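The sequential broadcast schedule can be sketched as a function of the sensor step index. The sketch uses only the Python standard library; the function name and the convention that the first broadcast happens at sensor step zero are assumptions made for illustration.

```python
def broadcaster_at(sensor_step, n_agents, n_b):
    """Return the 1-based index of the agent that broadcasts at this sensor
    step, or None when no broadcast session falls on the step."""
    if sensor_step % n_b != 0:
        return None
    return (sensor_step // n_b) % n_agents + 1

# Three agents, one broadcast session every two sensor periods (Tb = 2 Ts).
schedule = [broadcaster_at(step, n_agents=3, n_b=2) for step in range(8)]
print(schedule)
```

Each agent broadcasts once every N nb sensor periods, which matches the N Tb spacing stated above.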
The data communicated by the broadcasting agent always includes its current position, needed to support the coordinated controller developed later. Additional communicated information includes one of the following three options:
1. Transmit nothing
2. Transmit recent sensor measurements
3. Transmit the current state estimate
Communications arrive delayed by one sensor period, Ts. An example relationship
between communication and sensing events is depicted in Figure 5.2.
5.1.4
Clutter Model
Clutter in the environment is modeled generically as circular obstacles of radius ro .
These obstacles obstruct the sensor view can block the vehicles’ motion. The location
of these obstacles is not known a priori, an no model of their location is created here.
Regular spacing between obstacles will be varied in experiments to evaluate performance.
5.2 Coordinated Target Estimation
Each robot uses an Unscented Kalman Filter (UKF) to estimate the state of the target vehicle from local measurements and received communications. Because received
communications arrive delayed, the delayed data must be slotted into the correct
time slot, as in [58]. Detailed information about the UKF can be found in [22, 85],
although the specific version of the UKF used here is from [76].
[Figure 5.2: timelines for Vehicle 1 and Vehicle 2, each with Com and Sense events plotted against time; tick marks at multiples of Ts and a marked Delay.]
Figure 5.2: An example in which two vehicles communicate their most recent measurement is depicted. Here, black dots represent successful measurements and arrows
point from the time at which the data was communicated to the time the communication arrives at the other vehicle.
5.2.1 Unscented Kalman Filtering for Target State Estimation
The UKF maintains a Gaussian probability distribution over the state. Here, a separate UKF is run on each pursuit agent. The estimate on agent k can be represented by
a mean state x̂k ∈ Rn and an error covariance matrix Pk ∈ Rn×n . Unlike the extended
Kalman filter, the UKF does not require explicit linearization and instead makes use
of the Unscented Transform [32] in which discrete samples, called sigma points, are
passed directly through the nonlinear function representing the motion or observation
model. The sigma points are collected after the transformation and approximated by a
Gaussian distribution.
The target state estimate is initialized on each vehicle with an assumed estimate
mean, x̂_0^i, and error covariance, P_0^i. The sigma points on agent k are denoted x_k^{(i)} and can be computed as
\[ x_k^{(i)} = \hat{x}_k + \tilde{x}_k^{(i)}, \quad i = 1, \dots, 2n \tag{5.4} \]
\[ \tilde{x}_k^{(i)} = \left( \sqrt{n P_k} \right)_i^T, \quad i = 1, \dots, n \tag{5.5} \]
\[ \tilde{x}_k^{(i)} = -\left( \sqrt{n P_k} \right)_{i-n}^T, \quad i = n + 1, \dots, 2n, \tag{5.6} \]
where \( \sqrt{n P_k} \) is the matrix square root of nPk, \( (\sqrt{n P_k})_i \) is its ith row, and n is the length
of the state. Denote the sigma points propagated through (5.2) by x_k^{(i)−}. The predicted
estimate mean is then recovered by computing the average of the propagated sigma
points,
\[ \hat{x}_k^- = \frac{1}{2n} \sum_{i=1}^{2n} x_k^{(i)-}. \tag{5.7} \]
The predicted estimate error covariance is computed from the propagated sigma points
as
\[ P_k^- = \frac{1}{2n} \sum_{i=1}^{2n} \left( x_k^{(i)-} - \hat{x}_k^- \right) \left( x_k^{(i)-} - \hat{x}_k^- \right)^T + Q_k, \tag{5.8} \]
where Qk is added to the estimate error covariance to take the process noise into account. Incorporating the process noise in this way assumes that the process noise adds
linearly to the system dynamics.
When data arrives from the sensors, new sigma points, x_k^{(i)+}, are calculated using
equations (5.4)-(5.6) to capture the statistical moments of the current state estimate.
Then, a predicted measurement is computed:
\[ \hat{y}_k = \frac{1}{2n} \sum_{i=1}^{2n} y_k^{(i)}. \tag{5.9} \]
Here y_k^{(i)} = H x_k^{(i)+}, where H is from (5.3). The covariance of the predicted measurement is then
recovered from the sigma points:
\[ P_k^{yy} = \frac{1}{2n} \sum_{i=1}^{2n} \left( y_k^{(i)} - \hat{y}_k \right) \left( y_k^{(i)} - \hat{y}_k \right)^T + R_k, \tag{5.10} \]
where Rk is added to include the effect of the measurement noise. Next, the cross
covariance between the predicted state estimate and the predicted measurement
is determined,
\[ P_k^{xy} = \frac{1}{2n} \sum_{i=1}^{2n} \left( x_k^{(i)+} - \hat{x}_k^- \right) \left( y_k^{(i)} - \hat{y}_k \right)^T. \tag{5.11} \]
The filter gain can now be computed simply as
\[ K_k = P_k^{xy} \left( P_k^{yy} \right)^{-1}. \tag{5.12} \]
Then the estimate mean is updated in the usual Kalman filtering way:
\[ \hat{x}_k^+ = \hat{x}_k^- + K_k (y_k - \hat{y}_k), \tag{5.13} \]
and the estimate error covariance is updated by
\[ P_k^+ = P_k^- - K_k P_k^{yy} K_k^T, \tag{5.14} \]
which gives the posterior target state estimate at time step k.
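The prediction and update steps (5.4)-(5.14) can be sketched compactly. The sketch assumes NumPy, a linear measurement matrix H as in (5.3), and a Cholesky factor as the matrix square root; its columns play the role of the rows of the square root in (5.5)-(5.6). Function names and the small linear test model are hypothetical. This is an illustration of the equations above, not the implementation from [76].

```python
import numpy as np

def sigma_points(mean, cov):
    """Return the 2n sigma points of (5.4)-(5.6) as rows of an array."""
    n = mean.size
    root = np.linalg.cholesky(n * cov)       # root @ root.T == n * cov
    offsets = np.hstack([root, -root]).T     # one offset per row, 2n rows
    return mean + offsets

def ukf_predict(mean, cov, f, Q):
    """Propagate sigma points through f and recover (5.7)-(5.8)."""
    pts = np.array([f(p) for p in sigma_points(mean, cov)])
    mean_pred = pts.mean(axis=0)
    dev = pts - mean_pred
    return mean_pred, dev.T @ dev / pts.shape[0] + Q

def ukf_update(mean_pred, cov_pred, y, H, R):
    """Measurement update (5.9)-(5.14) with a linear observation y = H x."""
    pts = sigma_points(mean_pred, cov_pred)
    ys = pts @ H.T
    y_hat = ys.mean(axis=0)
    dy, dx = ys - y_hat, pts - mean_pred
    P_yy = dy.T @ dy / pts.shape[0] + R
    P_xy = dx.T @ dy / pts.shape[0]
    K = P_xy @ np.linalg.inv(P_yy)
    return mean_pred + K @ (y - y_hat), cov_pred - K @ P_yy @ K.T

# Hypothetical linear model, used only to compare against a standard Kalman filter.
F = np.array([[1.0, 0.1], [0.0, 1.0]])
Q = 0.01 * np.eye(2)
H = np.array([[1.0, 0.0]])
R = np.array([[0.5]])
m0, P0 = np.array([1.0, -2.0]), np.array([[2.0, 0.3], [0.3, 1.0]])
m_pred, P_pred = ukf_predict(m0, P0, lambda x: F @ x, Q)
m_post, P_post = ukf_update(m_pred, P_pred, np.array([0.7]), H, R)
print(m_post, P_post)
```

For linear dynamics and measurements the sigma point averages reproduce the standard Kalman filter moments, which gives a convenient consistency check.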
The above equations can be used without modification for the situation in which vehicles do not communicate any information about the target. If, however, measurements or estimates are received via communication, the above filter equations must be
modified. To facilitate handling of delayed data, we assume that all transmitted data
is tagged with a timestamp, t_c^j, indicating the time step, c, at which the data was valid.
5.2.2 Shared Measurement Data Fusion
Data fusion with shared measurements is not difficult. Here, data received from other
agents is independent from local measurements according to the zero-mean white
Gaussian observation noise model. Thus, received measurements can be slotted into
the UKF at the appropriate time stamp, after which point the UKF is re-run forward
to the current time.
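The slot-in and re-run procedure can be sketched with a small bookkeeping class. The class and method names are hypothetical, and a scalar random-walk Kalman filter with made-up noise values stands in for the UKF purely to keep the example short; any predict and update pair with the same signatures could be substituted.

```python
class ReplayFilter:
    """Stores per-step data so that late measurements can be slotted in."""

    def __init__(self, initial, predict, update):
        self.predict, self.update = predict, update
        self.posteriors = [initial]   # posteriors[h]: estimate after step h
        self.data = [[]]              # data[h]: measurements valid at step h

    def _rerun_from(self, start):
        for h in range(start, len(self.data)):
            estimate = self.predict(self.posteriors[h - 1])
            for y in self.data[h]:
                estimate = self.update(estimate, y)
            self.posteriors[h] = estimate

    def step(self, local=None):
        """Advance one sensor period with an optional local measurement."""
        self.data.append([] if local is None else [local])
        self.posteriors.append(None)
        self._rerun_from(len(self.data) - 1)
        return self.posteriors[-1]

    def insert_delayed(self, stamp, y):
        """Slot a received measurement in at its time stamp, then re-run."""
        if not 1 <= stamp < len(self.data):
            raise ValueError("time stamp outside the stored history")
        self.data[stamp].append(y)
        self._rerun_from(stamp)
        return self.posteriors[-1]

Q, R = 0.1, 0.5   # hypothetical process and measurement noise variances

def predict(est):
    mean, var = est
    return mean, var + Q

def update(est, y):
    mean, var = est
    gain = var / (var + R)
    return mean + gain * (y - mean), (1.0 - gain) * var

late = ReplayFilter((0.0, 1.0), predict, update)
late.step(1.0)
late.step(None)          # failed local measurement at step 2
late.step(1.4)
fused = late.insert_delayed(1, 0.8)   # received data valid at step 1
print(fused)
```

The stored history grows with time in this sketch; a practical version would presumably discard steps older than the longest expected delay.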
5.2.3 Shared Estimate Data Fusion
When state estimates are shared, a complication occurs because the state estimate on
one vehicle is not independent from that on any other due to the common process noise
of the target vehicle [3]. The cross correlation between the two estimates ideally
should be used in computing the true posterior; however, the cross correlation computation is complicated [16]. Methods for determining the cross covariance between
estimates for a linear system with two independent sensors exist, but require the sharing of all estimate error covariances for all time between the two sensor estimates [3].
Specifically, the estimate cross covariance for fusion of estimates from two independent
sensors observing a linear process is given by
\[ P_k^{ij} \triangleq E\left[ \tilde{x}_k^i \tilde{x}_k^{jT} \right] = \left( I - K_k^i H \right) \left( F_{k-1} P_{k-1}^{ij} F_{k-1}^T + Q_{k-1} \right) \left( I - K_k^j H \right)^T, \tag{5.15} \]
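where $Q_{k-1}$ is the discrete time target process noise covariance. Then the fused estimate is computed as
\begin{align}
\hat{x}_k^f &= \hat{x}_k^i - P_k^\beta (\hat{x}_k^i - \hat{x}_k^j) \tag{5.16} \\
P_k^f &= P_k^i - P_k^\beta (P_k^i - P_k^{ij})^T, \tag{5.17}
\end{align}
where $P_k^\beta$ is given by
\[
P_k^\beta = (P_k^i - P_k^{ij})(P_k^i + P_k^j - P_k^{ij} - P_k^{ji})^{-1}. \tag{5.18}
\]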
This formulation gives the maximum likelihood fused estimate. An alternative formula
gives the minimum mean squared error fused estimate [15].
For the nonlinear system of N agents considered in this chapter, the above equations
do not say how the fused estimate can be computed. Further, it is assumed that there
is not sufficient bandwidth available to share all available data at every sensor step.
To make headway on this problem, we make the assumption that the cross covariance
is zero. Because the cross correlation comes from the process noise, we expect greater
estimate degradation when the process noise is large [49]. With this assumption in place, setting $P_k^{ij} = P_k^{ji} = 0$ in (5.16)-(5.18), maximum likelihood and minimum mean squared error estimates
can be computed as
\begin{align}
\hat{x}_k^f &= \hat{x}_k^i - P_k^i (P_k^i + P_k^j)^{-1} (\hat{x}_k^i - \hat{x}_k^j), \tag{5.19} \\
P_k^f &= P_k^i \left[ I - (P_k^i + P_k^j)^{-1} P_k^i \right]. \tag{5.20}
\end{align}
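A scalar example makes the effect of the zero cross covariance assumption concrete. The sketch below implements (5.16)-(5.18) for scalar estimates, where $P^{ij} = P^{ji}$; the function name is hypothetical and the numbers are chosen only for illustration.

```python
def fuse(x_i, P_i, x_j, P_j, P_ij=0.0):
    """Scalar form of (5.16)-(5.18); P_ij = 0 is the zero cross covariance case."""
    P_beta = (P_i - P_ij) / (P_i + P_j - 2.0 * P_ij)  # (5.18)
    x_f = x_i - P_beta * (x_i - x_j)                   # (5.16)
    P_f = P_i - P_beta * (P_i - P_ij)                  # (5.17)
    return x_f, P_f


# Two estimates with unit variance, fused with and without cross covariance.
x_zero, P_zero = fuse(0.0, 1.0, 2.0, 1.0)            # assumes independence
x_corr, P_corr = fuse(0.0, 1.0, 2.0, 1.0, P_ij=0.5)  # accounts for correlation
```

In this example both calls give the same fused mean, but the fused variance is 0.5 under the independence assumption and 0.75 when a cross covariance of 0.5 is accounted for. Neglecting a positive cross covariance therefore makes the fused estimate overconfident, which is the degradation mechanism discussed above.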
Upon receiving an estimate from another vehicle, the fused estimate can be computed in one of two ways. The fusion can be computed using either (1) an estimate
computed using local measurements only or (2) the previously computed fused estimate. Here, we call an estimate based on local information only a clean estimate, and one
based on the previously computed fused estimate a combined estimate. In
either case, the information transmitted by an agent will be the clean estimate.
5.3
Coordinated Target Pursuit
In this section, we propose a controller to keep the pursuit vehicles near the target. Because of the realistic nature of this problem, we design a behavior-based controller in
which behaviors can be added or removed as necessary. This flexible structure permits
us to conduct a simulation study of the benefit of coordinated behaviors. The motivation behind the controller is to keep the vehicles spaced out about the target while
maintaining inter-vehicle separation. The controller is structured as
\[
u^i_{steer} = (1 - w)(u^i_{ct} + u^i_{space}) + u^i_{avoid}, \tag{5.21}
\]
where $u^i_{ct}$ is the centroid to target control, $u^i_{space}$ affects inter-vehicle spacing, and $u^i_{avoid}$
is the obstacle avoidance term. The weighting $w$ ensures that obstacle avoidance takes
priority in close proximity to obstacles. The control is called uncoordinated when only
using the obstacle avoidance behavior and coordinated when using all three behaviors.
5.3.1
Centroid to Target Control
The centroid to target term attempts to bring the centroid of the group to the target.
The idea is that the vehicles are spread out about the target when their centroid
is collocated with the target. The control behavior is computed as
\[
u^i_{ct} = k_{ct} \sin(\theta_{ct} - \theta^i), \tag{5.22}
\]
where $\theta_{ct}$ is the angle of the vector from the group centroid to the target, and $k_{ct}$ is a
gain. The angle $\theta_{ct}$ is computed using the latest information available.
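As a concrete reading of (5.22), the following sketch computes the centroid to target term for planar positions. The function name, argument layout, and default gain are assumptions made for illustration.

```python
import math


def centroid_to_target_control(positions, target, heading, k_ct=1.0):
    """Evaluate (5.22) for one vehicle with the given heading (radians).

    positions : list of (x, y) positions of all pursuit vehicles
    target    : (x, y) position (or latest estimate) of the target
    """
    cx = sum(p[0] for p in positions) / len(positions)
    cy = sum(p[1] for p in positions) / len(positions)
    # Angle of the vector from the group centroid to the target
    theta_ct = math.atan2(target[1] - cy, target[0] - cx)
    return k_ct * math.sin(theta_ct - heading)
```

The term vanishes when the vehicle heading is parallel to the centroid-to-target direction and is largest in magnitude when the heading is perpendicular to it.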
5.3.2
Inter-Vehicle Spacing
Inter-vehicle spacing control is based on the work in [34] and is defined as
\[
u^i_s = \sum_{j \neq i}^{N} \left( 1 - \left( \frac{r_{s0}}{d^{ij}} \right)^2 \right) \sin(\theta^{ij} - \theta^i). \tag{5.23}
\]
Here, $\theta^{ij} \in [-\pi, \pi)$ is the angle of the vector from agent i to agent j, $d^{ij} \in \mathbb{R}_+$ is the inter-vehicle distance, and $r_{s0}$ is the nominal desired inter-vehicle spacing. The sinusoidal
term determines the direction in which the vehicle will steer.
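The spacing term can be sketched as below, assuming (5.23) has the form of a sum over the other vehicles of $(1 - (r_{s0}/d^{ij})^2)\sin(\theta^{ij} - \theta^i)$. The function name and data layout are hypothetical.

```python
import math


def spacing_control(i, positions, headings, r_s0):
    """Evaluate the inter-vehicle spacing term for vehicle i.

    positions : list of (x, y) vehicle positions
    headings  : list of vehicle headings in radians
    r_s0      : nominal desired inter-vehicle spacing
    """
    xi, yi = positions[i]
    u = 0.0
    for j, (xj, yj) in enumerate(positions):
        if j == i:
            continue
        d_ij = math.hypot(xj - xi, yj - yi)      # inter-vehicle distance
        theta_ij = math.atan2(yj - yi, xj - xi)  # bearing from i to j
        # The bracketed factor changes sign at d_ij = r_s0
        u += (1.0 - (r_s0 / d_ij) ** 2) * math.sin(theta_ij - headings[i])
    return u
```

Under this reading, the contribution of a neighbor changes sign at the nominal spacing: the sinusoid sets the turn direction relative to the neighbor, and the bracketed factor flips it when the neighbor is closer than $r_{s0}$.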
5.3.3
Obstacle Avoidance
The obstacle avoidance term is intended to maintain separation between the vehicle
and the obstacles. Here, the control is designed to allow the agents to move freely until
the distance to the nearest obstacle falls below a threshold, $r_{oT} > r_o$. The obstacle
avoidance control is computed as
\[
u^i_{oa} =
\begin{cases}
0, & \text{if } d^i_{oj} > r_{oT} \\
f_{oa}(\Delta\theta^i_{oj}, d^i_{oj}), & \text{otherwise.}
\end{cases}
\tag{5.24}
\]
Superscript i refers to the ith vehicle, oa implies “obstacle avoidance” and the index
j refers to the jth obstacle. The obstacle avoidance function in equation (5.24) is given
by
\[
f_{oa}(\Delta\theta^i_{oj}, d^i_{oj}) = \sum_{j \in N_o} \cos(\Delta\theta^i_{oj}) \, \mathrm{sgn}(\Delta\theta^i_{oj}) \, f_d(d^i_{oj}), \tag{5.25}
\]
where $\Delta\theta^i_{oj} = \theta^i - \theta^i_{oj}$, and $\theta^i_{oj}$ is the angle of the vector from vehicle i to the center
of obstacle j. The obstacle distance function, which determines the magnitude of the
obstacle avoidance control in equation (5.25), is given by
\[
f_d(d^i_{oj}) = -2 \, \frac{(r_{oT}^2 - r_o^2)^2}{\left(r_o^2 - (d^i_{oj})^2\right)^2}. \tag{5.26}
\]
The obstacle avoidance behavior causes a vehicle to steer until the current heading
is perpendicular to the obstacle. The vehicle can then circle around the obstacle, if
commanded by the other behaviors.
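The obstacle avoidance term can be sketched as follows. Several assumptions are made here: the distance function is read as $f_d(d) = -2 (r_{oT}^2 - r_o^2)^2 / (r_o^2 - d^2)^2$, the set $N_o$ is taken to be the obstacles within the threshold distance, and the angle difference is wrapped to $[-\pi, \pi)$. The function names are hypothetical, and the sign convention of the steering input is set by the vehicle model defined elsewhere in the thesis.

```python
import math


def wrap(angle):
    """Wrap an angle to the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def f_d(d, r_o, r_oT):
    """Obstacle distance function (5.26); assumes d > r_o (outside the obstacle)."""
    return -2.0 * (r_oT ** 2 - r_o ** 2) ** 2 / (r_o ** 2 - d ** 2) ** 2


def obstacle_avoidance(pos, heading, obstacles, r_o, r_oT):
    """Evaluate (5.24)-(5.25) for a vehicle at pos with the given heading."""
    u = 0.0
    for ox, oy in obstacles:
        d = math.hypot(ox - pos[0], oy - pos[1])
        if d > r_oT:
            continue  # beyond the threshold: this obstacle contributes nothing
        theta_oj = math.atan2(oy - pos[1], ox - pos[0])
        dtheta = wrap(heading - theta_oj)
        sgn = (dtheta > 0) - (dtheta < 0)
        u += math.cos(dtheta) * sgn * f_d(d, r_o, r_oT)
    return u
```

The cosine factor makes the term vanish when the heading is perpendicular to the obstacle bearing, consistent with the description above, and the magnitude grows without bound as the vehicle approaches the obstacle radius $r_o$.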
5.4
Results
A variety of simulation experiments were conducted to evaluate the performance of the
state estimator, controller, and information sharing. Here, performance is measured
by the average (across the N agents) mean log likelihood as well as average integrated
position error.
In the first round of experiments, called isolated estimator experiments, the focus is
on evaluation of the state estimator under the three differing communication schemes.
Thus, we use a fixed sensor reliability, effectively neglecting the position of each agent
and any obstacles that may be present in the environment. In the second set of experiments, called pursuit vehicle coordination, the focus is on evaluation of the entire
system, including the behavior-based controller. Simulations are conducted for N = 3
agents.
In each simulation, the target vehicle runs with zero control on heading and speed,
but with zero-mean white process noise, with standard deviations $\sigma_v = 0.05$ and $\sigma_\phi = 0.1$,
acting as inputs. While the resulting heading and heading rate of the target are unconstrained, the speed of the target was constrained to lie between 50% and 150% of
its initial value, which was half that of the pursuit vehicles. In the figure legends,
No Comm refers to data that was generated without communication of target data between agents. Meas i, where i = 1, 2, 3, 4, 5, refers to the maximum number of measurements that could be shared between pursuit vehicles. Fus A and Fus B refer to data
that was computed using clean estimates or combined estimates, respectively. Centralized refers to fully centralized estimate data in which estimates were computed using
all measurement information available to the pursuit agents without delays.
5.4.1
Isolated Estimator Experiments
Sensor Reliability
In the first set of isolated estimator experiments, estimator performance is computed
as a function of the sensor reliability, which is fixed during each simulation here. The
following data transmissions were considered: 1) no communication, 2) one to five of
the most recent measurements broadcast, 3) the most recent estimate broadcast and
fused with clean estimates at the receiving agent, 4) most recent estimate broadcast
and fused with the combined estimates at the receiving agent, and 5) fully centralized
estimates. The fully centralized estimate uses all measurements without delay and is
primarily intended to bound the other results.
The results, displayed in Figure 5.3, indicate that estimator performance improves
with increasing sensor reliability, and with an increasing amount of communication
data. Using more measurements is better, but the return is diminishing, likely because
of the added delay in older measurements. Estimate fusion outperforms measurement
fusion, but requires more shared data. The performance of the estimate fusion method
may have been helped somewhat by the constraint on target speed, so that its motion
is more self-correlated than the estimate models presume.
[Figure 5.3 plots: (a) Estimator Likelihood, mean log likelihood vs. sensor success rate (%); (b) Estimator Integrated Error, integrated position estimate error vs. sensor success rate (%). Legend: No comm, Meas 1 to Meas 5, Fus A, Fus B, Centralized.]
Figure 5.3: Estimator performance, measured by (a) mean log likelihood and
(b) integrated position error, as a function of sensor reliability. All
pursuit agents have the same sensor precision.
Sensor Precision
In this simulation, the sensor precision on only one of the pursuit vehicles is varied.
The precision is varied by dividing its measurement covariance matrix, $R^i$, by a constant factor. As opposed to improving the sensor on all vehicles, this may be a cost-effective
way to get improved results across the group. Experiments were conducted
for fixed sensor reliability of 70% and the results are shown in Figure 5.4. The general
trend in the figure is that increasing the measurement precision of a single agent in
the group improves the overall estimator performance.
[Figure 5.4 plot: mean log likelihood vs. Vehicle 2 sensor precision multiplier. Legend: No comm, Meas 1 to Meas 5, Fus A, Fus B.]
Figure 5.4: Effects of changes in the relative sensor precision of one agent on the estimator performance of the other two agents. This figure shows the estimate results for one
of the fixed measurement covariance agents, which have $\sigma_x = \sigma_y = 9$. Sensor precision
on the lower axis indicates the factor by which sensor precision was increased. Sensor
reliability is fixed at 70%.
Communication Period
One of the main questions being studied here is how frequently communication events should take place. If communication events occur in rapid succession, there
is a high communication cost and relatively little new information is expected in each
packet. If the communication rate is too slow, then the benefits of cooperation are lost. To
investigate this tradeoff, the communication period was varied with the sensor period
fixed at one second.
The result is shown in Figure 5.5. As expected, more frequent communication
results in improved estimates of the target state. The only case in which a decrease in
communication period did not improve the target estimate results was the case of
combined estimate fusion (Fus B in Figure 5.5). This was true only at the highest
communication rates, at which it appears that the lack of accounting for the estimate
cross covariance has become detrimental to the performance of the estimate fusion.
This was not true for the clean estimate fusion (Fus A in Figure 5.5).
[Figure 5.5 plots: mean log likelihood vs. communication period (s) for (a) Sensor Reliability: 70% and (b) Sensor Reliability: 30%. Legend: No comm, Meas 1 to Meas 5, Fus A, Fus B, Centralized.]
Figure 5.5: Estimator performance dependence on communication period for sensor
reliabilities of (a) 70% and (b) 30%. The sensor measurement period is fixed at $T_b = 1$.
The No comm case and the Centralized estimate case are unaffected by communication. Drop-off in mean likelihood at low communication periods for the Fus B case
indicates effects of the neglected cross-covariance.
5.4.2
Pursuit Vehicle Coordination
We now add in vehicle position and variable sensor reliability (Figure 5.1), which depends on the distance between the vehicle and the target. Obstacles are also added
at this point, and the first experiment shows how the clutter density affects the mean
integrated position error (see Figure 5.6). As expected, denser obstacles generally
reduce the performance of the system because the incoming flow of measurements is
slower. Communication helps substantially here; in fact, the target is lost entirely in the
No Comm case.
[Figure 5.6 plot: integrated position estimate error vs. obstacle separation distance. Legend: No Comm, No Coord; Meas 3, No Coord; No Comm, Coord; Meas 3, Coord.]
Figure 5.6: Target tracking performance for the four communication cases with variable obstacle separation distance.
The coordinated control motions did not serve to improve the state estimates as
much as we would have liked. However, the coordination can be used to good effect
in the right situation, such as when it might be advantageous to spread the pursuit
vehicles out around the target rather than to allow them to travel directly over it.
Uncoordinated Control
Without centroid-to-target and inter-vehicle spacing behaviors, the vehicles can bunch
up behind an obstacle, as shown in the example in Figure 5.7. This sort of behavior can cause
the target to be lost if the environment is sufficiently cluttered. Further, a diverse view
of the target cannot be obtained when all vehicles view it from the same vantage point.
Figure 5.7: Pursuit vehicle trajectories under the controller without coordination.
Here, shaded circles around an agent indicate loss of sensor view of the target and
lines trailing the vehicles indicate trajectory history.
Coordinated Control
Now we enable the coordinated centroid to target and inter-vehicle spacing behaviors,
which cause the pursuit vehicles to spread out around the target. An example is shown in
Figure 5.8. With this coordinated control, each vehicle tends to stay far from the others.
Also, the agents have a diverse view of the target permitting improved estimates when
the obstacles form long barricades, as shown in the figure.
Figure 5.8: Pursuit vehicle trajectories when the vehicles employ coordinated behavior.
Compared to the uncoordinated case, the vehicles here have a much better chance of
maintaining a good estimate of the target through the clutter.
5.5
Final Remarks
The work in this chapter has taken a more realistic, but less analytical, look at a
coordinated control problem. Results obtained via Monte Carlo simulations quantify
the benefit offered by sharing information and coordinating control for the particular
task of tracking a target vehicle moving in a cluttered environment.
Many aspects of this study could be explored in future work. Approximation of the cross-covariance in the estimate fusion, asynchronous communication and
sensing, other clutter/obstacle models, and targets operating under detection avoidance behaviors are particularly relevant and interesting. Lastly, a comparison study
of different coordinated target pursuit algorithms, and of how their performance varies with
the number of pursuit vehicles, would aid in gaining insight into the sort of
behaviors that are beneficial to a coordinated target tracking system.
Chapter 6
TRACKING MULTIPLE FISH ROBOTS USING UNDERWATER
CAMERAS
The UW Fin-Actuated Underwater Vehicles are capable of sensing orientation and
depth only. To apply the coordinated control algorithms of the previous chapters, in this
chapter we develop a computer vision system capable of simultaneously tracking one
or more 3D submarine robots in real-time. While the previous chapters have tended
to be theoretical in nature, the content of this chapter focuses on the development of a
software tool for localizing the fish robots. There are, in fact, many applications besides
tracking of our robots which could benefit from the software developed here. In particular, the research lab of Prof. Julia Parrish, from the University of Washington Dept.
of Aquatic & Fishery Sciences, is interested in using the vision system for tracking
individual members in a school of live fish, Giant Danio (Devario aequipinnatus).
6.0.1
Related Work
The problem of generating 3D trajectory data has a rich history and has been approached by researchers from different perspectives depending on the specific geometry of their problem. In [9], a set of methods is presented for multi-view image tracking
using a set of calibrated cameras and a Kalman filter to track each target in 3D world
coordinates and 2D image coordinates. Their methods are robust to occlusions but the
topology is such that the problem is essentially planar. Solving a similar multi-target
problem, work in [91] uses two Kalman filters to fuse data from multiple cameras.
Other methods that attempt to solve the multi-target problem include [17] in which a
multiple hypothesis tracking (MHT) algorithm is implemented to track multiple targets using a simple Kalman filter, and [24] in which a method is implemented that
involves working directly with the image as a measurement.
The problem approached here is similar to the work in [27], where a particle filter
is used to track multiple targets. The strategy considered here uses the geometry of
experiments to simplify the data association problem as well as to provide a practical
implementation. The work presented in [11] also has direct parallels to the work here.
In [11], a particle filter and blob tracker are used to track identical deformable targets
through severe occlusions. This method, however, is restricted to planar data using
a single camera whereas the work here addresses the tracking ability in 3D using
multiple cameras.
6.0.2
Organization
The work in this chapter has not been previously published. Organization is as follows.
Section 6.1 describes the problem setting, including a description of the hardware associated with the experimental testbed. The particle-filter-based tracking method used
to track a single fish in a static environment is discussed in Section 6.2. This method is
extended to outdoor environments in Section 6.3. Then, extensions to allow tracking
of multiple fish are made in Section 6.4. Concluding remarks are made in Section 6.5.
6.1
Problem Description and Experimental Apparatus
The objective of the tracking software developed here is to provide the fish robots
with a sense of location. This locational awareness will act in support of single and
coordinated control missions, including those described in previous chapters of this
thesis. The individual fish robots can move freely in 3D, as previously described in
Section 1.3, and the working environment is a large above-ground swimming pool with
dimensions 2.4m wide, 2.4m deep and 6m long.
6.1.1
Computer Vision System
The approach taken here is an underwater computer vision system in which four cameras are connected to a central processing computer, which then relays state estimates
to the robots using a wireless connection. This approach was selected only
after considering other localization methods, including onboard vision-based simultaneous localization and mapping (SLAM) and sonar echo-location. The following hardware
devices are currently used in the vision system.
• Four CVC-320WP cameras from CSI-Speeco
• Intel Core2 Quad 2.4GHz workstation computer running Linux
• Bluecherry PV-149 120Hz capture card
• Custom 315MHz radio communication modules
The cameras are configured to produce 640x480 grayscale images at an NTSC standard
rate of 29.97Hz. Custom mounting brackets were designed to hang one camera in each
upper corner of the pool.
6.1.2
Camera Calibration
A variety of methods have been attempted for camera calibration. The purpose of calibration here is to establish a mathematical relationship between points in the world
and their corresponding location in the image. This relationship can be inverted to
determine a correspondence between a pixel in the image and a ray through 3D world
space. The extrinsic camera parameters measure the location of the camera whereas
the intrinsic parameters measure features specific to the internal workings of the camera.
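The two directions of this relationship can be sketched with an ideal pinhole model. The code below is an illustration only: it ignores lens distortion, which a full calibration would also model, and the function names and parameter layout (rotation R and translation t as extrinsics, focal lengths and principal point as intrinsics) are assumptions.

```python
def project(point_w, R, t, fx, fy, cx, cy):
    """Project a 3D world point to pixel coordinates (ideal pinhole)."""
    # Extrinsics: world frame -> camera frame, p_c = R * p_w + t
    pc = [sum(R[r][c] * point_w[c] for c in range(3)) + t[r] for r in range(3)]
    if pc[2] <= 0.0:
        raise ValueError("point is behind the camera")
    # Intrinsics: perspective division, then focal length and principal point
    return (fx * pc[0] / pc[2] + cx, fy * pc[1] / pc[2] + cy)


def pixel_ray(u, v, R, t, fx, fy, cx, cy):
    """Return (origin, direction) of the world-frame ray through pixel (u, v)."""
    d_c = [(u - cx) / fx, (v - cy) / fy, 1.0]  # ray direction in the camera frame
    # R is orthonormal, so its inverse is its transpose
    direction = [sum(R[r][c] * d_c[r] for r in range(3)) for c in range(3)]
    origin = [-sum(R[r][c] * t[r] for r in range(3)) for c in range(3)]
    return origin, direction
```

A point projected with `project` lies on the ray returned by `pixel_ray` for the resulting pixel, at the depth it had in the camera frame.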
The method used for initial tests is based on Tsai’s camera calibration technique
[82] using the C++ package available online from Reg Wilson [90]. This package
estimates the intrinsic and extrinsic camera parameters simultaneously from non-coplanar image-to-world point correspondences. Data is collected from multiple images
of a planar grid of golf balls hung from a supporting structure over the pool. Image-to-world point correspondences were made with assistance from a custom application
created by the author.
Later versions of the camera calibration system use a method in the open source
computer vision library (OpenCV) [10] in place of Tsai’s method. Here, a highly accurate camera calibration grid is first used to determine the intrinsic camera parameters.
Then the OpenCV routine determines only the extrinsic parameters. The advantage of
this two-step method is that it avoids the numerical issues of the one-step approach. In practice,
the two-step OpenCV approach yields a mild improvement in the calibration accuracy.
6.2
Particle Filter Tracking of a Single Fish Robot
A particle filter (PF) is used to track the robot fish as it moves around in the underwater environment. This filtering method was chosen over other estimation techniques,
including extended and unscented Kalman filtering, because it allows for more flexibility. In particular, Kalman-filter-based techniques maintain a Gaussian probability
density, in which simple physical constraints, like the fact that the robot cannot leave the pool,
are impossible to maintain.
Propagation
As with any Bayesian estimation technique, the PF has propagation and correction
steps. Upon each propagation, the estimate probability is advanced in time using a
discrete-time model of the system dynamics. For the robot fish, we employ an approximation of the Frenet-Serret equations of motion in which the vehicle advances in the
direction it is pointed and then rotates,
\[
\begin{bmatrix} x \\ y \\ z \\ \phi \\ \theta \\ \psi \\ s \end{bmatrix}_{k+1}
=
\begin{bmatrix}
x + \delta s \cos(\psi) \cos(\theta) \\
y + \delta s \left( \sin(\psi) \cos(\phi) + \cos(\psi) \sin(\theta) \sin(\phi) \right) \\
z + \delta s \left( \sin(\psi) \sin(\phi) - \cos(\psi) \sin(\theta) \cos(\phi) \right) \\
\phi + \delta u_\phi \\
0.9\,\theta + \delta u_\theta \\
\psi + \delta u_\psi \\
\mathrm{clamp}(s + \delta u_{speed}, s_{min}, s_{max})
\end{bmatrix}_k .
\tag{6.1}
\]
In these equations, $[x, y, z]$ is position, $[\phi, \theta, \psi]$ is roll, pitch, and yaw, $s$ is forward speed,
and $k$ is the timestep of length $\delta$. Most of the control inputs, denoted by $u_x$, are chosen
from a Gaussian distribution centered about zero. Because the fish is stable in roll,
we choose $u_\phi = -0.1\delta\phi$. Finally, the clamp(x, a, b) function returns the value in the
interval [a, b] that is closest to x and is used above to keep the speed of the vehicle
within a reasonable range.
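A direct transcription of (6.1) for a single particle might look as follows. This is a sketch: the noise standard deviations and the speed limits are placeholder values, since they are not given in the text, and the function names are hypothetical.

```python
import math
import random


def clamp(x, a, b):
    """Return the value in [a, b] that is closest to x."""
    return max(a, min(b, x))


def propagate(state, dt, rng, sigma_u=(0.5, 0.5, 0.2), s_min=0.0, s_max=1.0):
    """Advance one particle by one step of (6.1).

    state   : (x, y, z, phi, theta, psi, s)
    dt      : timestep length (delta in the text)
    rng     : a random.Random instance
    sigma_u : assumed standard deviations of (u_theta, u_psi, u_speed)
    """
    x, y, z, phi, theta, psi, s = state
    u_phi = -0.1 * dt * phi  # roll input chosen as in the text
    u_theta = rng.gauss(0.0, sigma_u[0])
    u_psi = rng.gauss(0.0, sigma_u[1])
    u_speed = rng.gauss(0.0, sigma_u[2])
    return (
        x + dt * s * math.cos(psi) * math.cos(theta),
        y + dt * s * (math.sin(psi) * math.cos(phi)
                      + math.cos(psi) * math.sin(theta) * math.sin(phi)),
        z + dt * s * (math.sin(psi) * math.sin(phi)
                      - math.cos(psi) * math.sin(theta) * math.cos(phi)),
        phi + dt * u_phi,
        0.9 * theta + dt * u_theta,
        psi + dt * u_psi,
        clamp(s + dt * u_speed, s_min, s_max),
    )
```

Calling `propagate` once per particle with independent random inputs draws a sample from the state transition density used in the prediction step of the filter.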
Image processing
The correction step uses measurements to update the estimation probability density.
Obtaining features from the underwater camera images is achieved using the following
image-processing procedure:
At startup:
• Grab a static background image
Online:
• Grab the current frame of image data
• Subtract the current image from the background
• Clamp the difference image to 0-255
• Apply a threshold to make a binary image
• Run a blob detector to find contiguous regions
• Compute statistics on each blob
However, the blob statistics are not used directly in the estimation algorithm. Instead,
a blob manager tracks individual blobs in each camera view. The centroids of these
tracked blobs are used as measurements. If a new blob is “sufficiently fishlike” (as
determined by a boolean classification function) and not close statistically (another
boolean classification function) to any existing tracked blobs, it is added to the blob
tracker. Otherwise, if a good correspondence is detected between a blob in the image
and an existing tracked blob, the tracked blob is updated to reflect the new information.
Tracked blobs that have not been updated for 5 frames are removed. This process of
tracking blobs adds memory to the measurement process.
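The online image-processing steps listed above can be sketched on small grayscale images stored as nested lists. This is an illustration rather than the thesis software: the threshold value, the 4-connected flood fill used as the blob detector, and the function name are assumptions, and only area and centroid are computed as blob statistics.

```python
def detect_blobs(background, frame, threshold=40):
    """Background subtraction, clamping, thresholding, and blob statistics."""
    rows, cols = len(frame), len(frame[0])
    # Subtract the current image from the background, clamp to 0-255,
    # and threshold to obtain a binary image.
    mask = [
        [max(0, min(255, background[r][c] - frame[r][c])) > threshold
         for c in range(cols)]
        for r in range(rows)
    ]
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r0 in range(rows):
        for c0 in range(cols):
            if not mask[r0][c0] or seen[r0][c0]:
                continue
            # Flood fill one contiguous (4-connected) region
            stack, pixels = [(r0, c0)], []
            seen[r0][c0] = True
            while stack:
                r, c = stack.pop()
                pixels.append((r, c))
                for rr, cc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                    if (0 <= rr < rows and 0 <= cc < cols
                            and mask[rr][cc] and not seen[rr][cc]):
                        seen[rr][cc] = True
                        stack.append((rr, cc))
            area = len(pixels)
            centroid = (sum(p[0] for p in pixels) / area,
                        sum(p[1] for p in pixels) / area)
            blobs.append({"area": area, "centroid": centroid})
    return blobs
```

The returned centroids are in (row, column) pixel coordinates. In the system described above, these per-frame blobs would then be passed to the blob manager rather than used directly as measurements.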
Correction
Blob centroids from the blob tracking are used as measurements for the particle filter
correction. The first step here is to compute the expected measurement corresponding
to the fish robot. This measurement can be computed only because the cameras are
calibrated, therefore allowing points in the world to be projected into the image. Only
the measurement closest to the predicted observation is used. The measurement model
encodes the probability of an observation given the state,
\[
p(y|x) = \omega \mathcal{N}(x; \rho, \sigma_{wide}^2) + \mathcal{N}(x; \rho, \sigma_{narrow}^2) + p_0. \tag{6.2}
\]
Here y and x are the collected measurements and the state, respectively, $\mathcal{N}$ is a normal distribution, $\omega = 50$ is a weighting, $\rho$ is the distance from the expected measurement to
the closest tracked blob, $\sigma_{wide} = 400$ and $\sigma_{narrow} = 10$ are standard deviations, and $p_0$
is the probability that the fish could be anywhere.
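One way to evaluate the measurement model numerically is sketched below. It interprets each normal term in (6.2) as a zero-mean Gaussian density evaluated at the distance $\rho$, which is an assumption about the notation, and the value of $p_0$ is a placeholder because it is not stated in the text.

```python
import math


def normal_pdf(value, mean, sigma):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (value - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def measurement_likelihood(rho, omega=50.0, sigma_wide=400.0,
                           sigma_narrow=10.0, p0=1e-6):
    """Unnormalized p(y|x) as a function of the pixel distance rho.

    rho is the distance from the expected measurement to the closest
    tracked blob; p0 is a hypothetical floor value.
    """
    return (omega * normal_pdf(rho, 0.0, sigma_wide)
            + normal_pdf(rho, 0.0, sigma_narrow)
            + p0)
```

The narrow term rewards particles whose predicted observation falls very close to a tracked blob, the wide term keeps moderately distant particles alive, and the constant floor prevents any particle from receiving zero weight.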
Particle Filter Algorithm
The particle filter approximates the probability density function over the state space
with discrete samples, called particles. Each particle represents a hypothetical fish
robot. These particles have an associated weight, which quantifies the likelihood of
the particle given the time evolution and state measurements. This weighted particle
representation becomes exact as the number of particles goes to infinity. The problem-specific information required by the particle filter consists of probabilistic dynamics and
observation models. The particle filter uses the propagation (6.1) and correction (6.2)
probability functions to update the density function over the state. On each prediction,
the particles are moved according to the state transition function,
\[
x^i_{k+1} \sim p(x^i_{k+1} | x^i_k), \quad i = 1, \dots, N. \tag{6.3}
\]
After each measurement is made, the weights are updated as
\[
w^i_{k+1} = w^i_k \, p(y_k | x^i_k), \quad i = 1, \dots, N. \tag{6.4}
\]
Note that these equations are simplified by choosing the state transition probability
as the proposal distribution.
The particles are initialized to incorporate any prior knowledge about the state. For
the problem here, the fish robots could be anywhere in the pool, so we choose a uniform
distribution. After several propagation and correction steps, a number of particles will
have very low weight. These particles are not being used effectively because they are
in low-probability areas of the state space. To use the particles more effectively, a
resampling step using Sampling Importance Resampling (SIR) can be used. The idea
here is to resample the particles with probability proportional to their weight. In the
resampled particle set, each particle has the same weight, and the particles are likely
concentrated near high probability areas.
For the fish tracking, we use N = 5000 particles per robot and resample only when
the covariance of the particle weights drops below a threshold.
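The propagate, weight, and resample cycle can be illustrated on a one-dimensional toy problem. The sketch below is not the fish tracker: it uses a scalar random-walk state with a Gaussian measurement model, and it triggers resampling with the common effective-sample-size test, which is swapped in here as an assumption in place of the weight-based criterion described above.

```python
import math
import random


def sir_step(particles, weights, y, rng, q_sigma=0.1, r_sigma=0.5, resample_frac=0.5):
    """One propagate / weight / resample cycle for a scalar random walk."""
    n = len(particles)
    # (6.3): sample each particle from the state transition density
    particles = [x + rng.gauss(0.0, q_sigma) for x in particles]
    # (6.4): multiply each weight by the measurement likelihood
    weights = [w * math.exp(-0.5 * ((y - x) / r_sigma) ** 2)
               for w, x in zip(weights, particles)]
    total = sum(weights)
    if total == 0.0:
        weights = [1.0 / n] * n  # all weights underflowed: fall back to uniform
    else:
        weights = [w / total for w in weights]
    # Resample when the effective number of particles becomes small
    n_eff = 1.0 / sum(w * w for w in weights)
    if n_eff < resample_frac * n:
        particles = rng.choices(particles, weights=weights, k=n)
        weights = [1.0 / n] * n
    return particles, weights


rng = random.Random(0)
particles = [rng.uniform(-5.0, 5.0) for _ in range(500)]  # uniform prior
weights = [1.0 / 500] * 500
for _ in range(20):
    particles, weights = sir_step(particles, weights, 2.0, rng)
estimate = sum(w * x for w, x in zip(weights, particles))
```

After repeated measurements near 2.0, the weighted mean of the particles settles close to that value, and after each resampling step all particles again carry equal weight.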
Implementation
The software to track the robot was written in C++ and is capable of tracking the fish
in real time at rates of 10-20 Hz (depending on conditions). Some of the image
processing routines make use of the OpenCV library, but others like the blob detection
and particle filtering were written from scratch. The user interface was written in
FLTK and allows tracking of live or pre-recorded video streams. Video can be saved
directly from the tracking software. The state estimate from the tracker is relayed to
the underwater fish robots using a radio designed by Patrick Bettale [6]. A picture of
the user interface is available in Figure 6.1.
6.3
Tracking a Robot in an Outdoor Environment
Figure 6.1: The user interface for the fish tracking software showing the four camera
views (left) and a 3D view of the pool (right). Some of the projected particles are shown
over the fish robot in the camera views as red dots.
The image processing routine for detecting the robots in the camera images relies upon
the fact that the background is static. For tracking in an outdoor environment, this
assumption fails to hold. To allow more dynamic scenery, a low pass filter was applied
to the video stream to eliminate objects that are moving slower than the fish. This
modification permitted robust tracking of the fish during the 2008 Engineering Open
House demonstration, in which the fish swam autonomously for up to 20 minutes at a
time under guidance of the vision system and a simple onboard waypoint controller.
Again, the radio frequency modems were used to tell the fish its current location. Some
pictures from this demonstration are presented in Figure 6.2.
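One common way to realize such a low pass filter is a first-order running average of the video stream that serves as a slowly adapting background image. The sketch below shows this idea; the filter form and the value of the smoothing factor are assumptions, since the text does not specify the filter that was used.

```python
def update_background(background, frame, alpha=0.05):
    """First-order low pass filter: B <- (1 - alpha) * B + alpha * frame."""
    return [
        [(1.0 - alpha) * b + alpha * f for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


def foreground(background, frame):
    """Absolute difference between the current frame and the filtered background."""
    return [
        [abs(f - b) for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]
```

With a small smoothing factor, slowly changing scene content is absorbed into the background and disappears from the difference image, while a faster-moving object such as the fish remains visible.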
6.4
Multi-Target Tracking
The main added challenges of tracking multiple fish simultaneously are the added image
processing complexity and the data association problem. The problem of data association
is to determine which measurements correspond to which robots. This problem has
been investigated by many authors including [11] and [69]. One added difficulty of
the present problem is that one robot cannot be distinguished from another in the
grayscale video stream.
Figure 6.2: The fish tracking vision system was used extensively at the 2008 University
of Washington Engineering Open House. This series of images shows (a) the outdoor
pool, where black markers were used for camera calibration, (b) the base station running the
tracking software, (c) the underwater view showing the fish in the top two images, and
(d) the 3D view showing the origin in the center of the pool, the four cameras around
the edge of the pool, and the expected fish state.
To deal with the correspondence problem, the geometry of the problem proves advantageous
in that each robot is typically visible in more than one camera view.
Thus, even if two fish overlap in one image, the other images will distinguish these fish
clearly (in almost all cases). Thus, due to the multi-view geometry of our problem, multi-fish
tracking can be achieved simply by ignoring measurements from tracked blobs that
come close together or merge. Multi-robot tracking is achieved using N particle filters
in parallel. This solution is clearly suboptimal, but seems to provide reasonably good
results due to the specific geometry of the problem.
6.5
Final Remarks
The work in this chapter has addressed the problem of tracking one or more fish robots
in support of the coordinated control algorithms developed in the previous chapters.
The approach taken was a computer vision system using four underwater cameras
connected to a central computer. Software on the base computer collects video frames
and uses one or more particle filters to estimate the state of the fish including position,
orientation, and speed. The vision system has been used extensively in our indoor pool,
and was demonstrated for the general public at the 2008 Engineering Open House.
There are many directions in which this work could evolve. For tracking of a single
fish, better image processing could allow a specific point on the fish to be tracked as
opposed to tracking the centroid of the image blob. Also, a more realistic likelihood
function could better deal with the robot entering and leaving the field of view.
Multi-robot visual tracking remains a difficult problem. The multi-robot extension
presented here is admittedly minimal, yet produces decent results. Perhaps a more
advanced filter, such as a Rao-Blackwellized particle filter [22], could better deal with
the image correspondences. In combination with the image processing and likelihood
improvements mentioned in the previous paragraph, this system stands a good chance of
being highly successful at multi-robot tracking.
Finally, it would be prudent to consider what modifications of the tracking algorithm would be necessary to track the Giant Danio in the live fish experiment of Prof.
Julia Parrish. These real fish do not move as predictably as the robot, so the dynamics
model would need to be modified. Because the cameras are located above the water
surface, the problem of camera calibration is more complicated for this problem, although some work on a dual-medium (air and water) camera calibration has been
done already in this direction [57].
Chapter 7
IMPLEMENTATION WITH THE UW FIN-ACTUATED UNDERWATER
VEHICLES
In this chapter, we demonstrate some of the algorithms developed in the previous chapters on the UW-FAV multi-robot system and in simulation. The coordinated
phase controllers developed in the previous chapters of this thesis are applicable to
this robotic system after first augmenting the state with the Cartesian position of each
robot, rk ∈ ℝ². Constant-speed unicycle kinematics then dictate that each vehicle advances at constant (e.g., unit) speed in direction θk, as in (2.2). While the UW-FAVs are
capable of variable forward speed, we here set the tail frequencies to a constant, and
thus the vehicles move with near-constant speed. Much of the work in this chapter was
presented at a recent IFAC workshop [39].
The chapter opens with an implementation overview describing how the fish robots
were programmed and the code that was developed. The first three demonstrations
show heading alignment, heading anti-alignment, and reference matching, from Sections 4.4, 4.3, and 4.5, respectively, using the fin-actuated robots and gains selected
using the theory developed in Chapter 4. Then, coordinated target tracking is demonstrated, in simulation, using ideas from Chapter 2. This final demonstration is available in simulation only due to ongoing upgrades to the electronics of the robots. These
tasks will be demonstrated on the UW-FAV testbed as soon as it is again operational.
Concluding remarks are made in Section 7.6.
7.1 Implementation
The controller for each robot was written in C++ using Metrowerks Codewarrior and
cross-compiled for the robot’s MPC555 processor. For convenience, the three robots will
be referred to as red, blue, and green in correspondence to markings on the individual
robots. The heading controllers set the desired heading of each robot. Then, a high-frequency inner-loop heading controller uses data from the internal compass to steer
the vehicle to this heading set point.
To protect the robots against collisions, the individual robots were given depth set
points of 0.3m, 0.6m, and 1.4m. Because the longitudinal and lateral dynamics are only
loosely coupled, this restriction has little impact on task performance. The limited size
of the test facility also caused issues and ultimately resulted in limiting the speed of
the vehicles and the duration of the demonstrations.
7.1.1 Communication Challenges
One main challenge of implementing the control algorithms of the previous chapters
on the UW-FAV testbed is limited communication. As previously discussed,
communication necessarily takes place at discrete time instants only. The discrete-time coordinated controllers of Chapter 4 were designed with this discretization in
mind, whereas their continuous-time counterparts from Chapter 2 and related work
hold only approximately, even when the time step is small.
Another main challenge stemming from communication is the topology. Only one
robot can broadcast at a time. If two or more robots attempt to broadcast simultaneously, a collision will occur and the information will be lost. For the demonstrations
of this chapter, many of the algorithms require all-to-all communication. Here, we
achieve all-to-all communication with an ordered sequence of N one-to-all broadcasts.
To realize the random one-to-all broadcast sequence needed to support the heading
alignment controller from Section 4.3, a pseudo-random number generator, seeded
with zero, is employed. Communication collisions are avoided using a time-division
protocol, which requires clock synchronization (discussed below).
A third communication-related issue is packet drop. To address packet drop, the
heading set point on each vehicle is updated only upon receipt of a valid transmission:


\[
\theta_k(h+1) =
\begin{cases}
\theta_k(h) & \text{if robot } k \text{ did not receive a new signal,} \\
u_k(h) & \text{otherwise.}
\end{cases}
\tag{7.1}
\]
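The seeded broadcast schedule and the hold rule (7.1) can be sketched as follows. This is a minimal Python illustration and not the C++ code that ran on the robots; the function names, the use of Python's `random.Random` as the pseudo-random generator, and the convention that `None` marks a step without a valid reception are all assumptions made for illustration.

```python
import random


def broadcast_schedule(num_robots, num_steps, seed=0):
    """Return the pseudo-random one-to-all broadcaster index for each step.

    Every robot that runs this with the same seed obtains the same sequence,
    so all robots agree on whose turn it is without extra communication.
    """
    rng = random.Random(seed)
    return [rng.randrange(num_robots) for _ in range(num_steps)]


def update_setpoint(theta_prev, u_new):
    """Hold rule (7.1): keep the old set point unless a valid packet arrived.

    u_new is the freshly computed control value, or None when robot k did not
    receive a new signal during this step (hypothetical convention).
    """
    return theta_prev if u_new is None else u_new
```

Because the generator is seeded identically on every vehicle, the schedule itself never has to be transmitted; only the headings are.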
7.1.2 Communication Protocol
When a robot is selected as the broadcaster, it transmits its heading using the radio-frequency communication devices. A simple encoding and decoding scheme is used
to slightly compress the heading, which is a floating point value in degrees. The encoder maps the interval [0,360) to the closest character in [0,255], using a linear
transformation. The decoder similarly maps the received character in [0,255] back to
[0,360). To maximize the probability of a successful transmission, the encoded character is transmitted four times in quick succession. The transmission takes approximately 50ms to send, and the receiving robots get it approximately 10ms
thereafter. Thus, the discrete time step in the controller, ∆T, has a firm lower bound of roughly 60ms per broadcast.
The overall flow of the controller running on each robot is shown in Figure 7.1.
[Figure 7.1 flowchart blocks: Time Sync, Inner Loop, Valid Rx? (Yes/No), My Tx Turn? (Yes/No), Control, Tx.]
Figure 7.1: Coordinated controller program flow block diagram.
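The one-byte heading encoding described in this section can be sketched as below. The exact scaling used on the robots is not stated, so this sketch assumes 256 evenly spaced levels with wraparound at 360°; the function names are hypothetical.

```python
def encode_heading(heading_deg):
    """Map a heading in degrees to one byte (0..255) with a linear scaling.

    Assumption: 256 evenly spaced levels, with values that round up to 360
    degrees wrapping back to code 0.
    """
    return round((heading_deg % 360.0) * 256.0 / 360.0) % 256


def decode_heading(code):
    """Map a received byte (0..255) back to a heading in [0, 360) degrees."""
    return code * 360.0 / 256.0
```

Under this assumed scaling the quantization error is at most half a level, about 0.7°, which is small compared with the compass disturbances discussed later in the chapter.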
7.1.3 Clock Synchronization
Synchronized clocks allow the experiments to match the theory as closely as possible.
To achieve clock synchronization, before the robots are placed in the water, a base
station sends out a sequence of 20 unique characters at a rate of one character per
half-second. Upon reception of a character, the robot will pause until the time at which
the last character in the sequence would be received. After pausing, all vehicles have
synchronized clocks.
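The pause computation implied by this procedure can be sketched as follows. The particular 20-character alphabet and the function name are assumptions; only the sequence length and the half-second spacing come from the description above.

```python
SYNC_CHARS = "ABCDEFGHIJKLMNOPQRST"  # 20 unique characters (hypothetical alphabet)
CHAR_PERIOD_S = 0.5                  # one character per half-second


def pause_after_reception(received_char):
    """Seconds to wait so the pause ends when the last character would arrive."""
    index = SYNC_CHARS.index(received_char)
    return (len(SYNC_CHARS) - 1 - index) * CHAR_PERIOD_S
```

Since each character identifies its own position in the sequence, a robot that misses the early characters can still synchronize from any later one.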
7.2 Coordinated Heading Alignment
For the heading alignment demonstration, the robots started at one end of the tank,
with the red robot pointed down the length of the tank and the green and blue robots
splayed approximately 65◦ to either side. The run time was set to 30s to prevent the
robots from running into the far wall. The control gain was selected to be K = 1, and
the time step was selected as ∆T = 0.5s.
[Figure 7.2 panels: (a) Simulation: Overhead View; (b) Experiment: Overhead View. Axes: X (m) versus Y (m).]
Figure 7.2: A comparison of simulated and experimental heading alignment with random one-to-all broadcasts, as viewed from above. The data in (b) was collected from an
early version of the computer vision system from Chapter 6.
In addition to the physical experiment, a computer simulation was run using Matlab. The actual pseudo-random broadcast sequence used in the physical experiment
was also used in the simulation; however, packet drops were random. The main difference between the experiment and this simulation is that the simulation assumes a discretized
planar kinematic unicycle model, whereas the experiment is subject to the full dynamics of the underwater vehicles. The results from the simulation and experiment are
compared in Figures 7.2 and 7.3.
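A simulation of this kind can be sketched in a few lines. The sketch below is in Python rather than Matlab and is not the code used for Figure 7.2. In particular, the listener update θk ← θk + K∆T sin(θb − θk), where b is the current broadcaster, is an assumed Kuramoto-type rule standing in for the controller of Chapter 4, and the packet-drop probability is a free parameter.

```python
import cmath
import math
import random


def simulate_alignment(theta0_deg, gain=1.0, dt=0.5, run_time=30.0,
                       drop_prob=0.0, seed=0):
    """Discretized unit-speed planar unicycles with random one-to-all broadcasts.

    Assumed listener update (not taken from the thesis):
        theta_k <- theta_k + gain * dt * sin(theta_b - theta_k)
    where b is the broadcaster for the current step.
    """
    rng = random.Random(seed)
    theta = [math.radians(t) for t in theta0_deg]
    pos = [(0.0, 0.0) for _ in theta]
    n = len(theta)
    for _ in range(int(round(run_time / dt))):
        b = rng.randrange(n)   # pseudo-random broadcaster for this step
        heard = theta[b]
        for k in range(n):
            # A listener updates only if its packet was not dropped.
            if k != b and rng.random() >= drop_prob:
                theta[k] += gain * dt * math.sin(heard - theta[k])
        # Advance each unicycle at unit speed along its heading.
        pos = [(x + dt * math.cos(t), y + dt * math.sin(t))
               for (x, y), t in zip(pos, theta)]
    return theta, pos


def order_parameter(theta):
    """Magnitude of the phasor centroid; 1 means perfectly aligned headings."""
    return abs(sum(cmath.exp(1j * t) for t in theta) / len(theta))
```

Calling `simulate_alignment([0, 65, -65])` reproduces the initial splay of the experiment with K = 1 and ∆T = 0.5s; raising `drop_prob` gives a rough feel for how packet loss slows alignment under the assumed update.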
[Figure 7.3 panels: (a) Simulation: Headings; (b) Experiment: Headings.]
Figure 7.3: Heading alignment data, viewed as compass heading vs. time for the three
robots.
A key point concerning the experimental data is that the camera tracking system
on the tank was not able to collect data for the first meter or so of the vehicle trajectories. Note that the initial segments of the green and blue robot trajectories shown in the figure
do not correspond to movement from the initial heading of ±65° but from configurations that are close to alignment. These results demonstrate good heading alignment
control considering experimental conditions that hinder the vehicle coordination. For
example, magnetic fields from the building and the tank itself affect the readings of the
compasses on board the robots. This magnetic field influence varies across the tank,
and so affects the robots differently depending on their locations in the pool. This effect
is clearly visible between the simulation and experimental results where the onboard
experimental measurements indicate alignment, but the external tracking system indicates that the compass measurements have drifted.
7.3 Coordinated Heading Anti-Alignment
For the anti-alignment heading experiment, the initial position of the vehicles was at
the side of the tank, halfway down its length. The initial heading of the red robot
was parallel to the width of the tank, while the blue and green robots’ initial headings
were splayed ±10◦ , respectively, from that of the red robot. The runtime was set at
20s, and the time step was again selected as ∆T = 0.5s. To make the group move
toward the balanced state, the sign of the control gain was switched from the aligned
heading experiment. Further, to keep the vehicles as close as possible to an anti-aligned group state with broadcast communication, the gain was selected as K = −0.5.
Note that the previous theoretical developments were for all-to-all communication only, and thus there
is no guarantee that an anti-aligned state will be approached.
[Figure 7.4 panels: (a) Simulation: Overhead View; (b) Experiment: Overhead View. Axes: X (m) versus Y (m).]
Figure 7.4: A comparison of simulated and experimental heading anti-alignment with
random one-to-all broadcasts, as viewed from above. The data in (b) was collected from
an early version of the computer vision system from Chapter 6.
[Figure 7.5 panels: (a) Simulation: Headings; (b) Experiment: Headings. Axes: Headings (deg) versus Time (s). Legend entries: Setpoint and Heading traces for the Red, Green, and Blue robots.]
Figure 7.5: Heading anti-alignment data, viewed as compass heading vs. time for the
three robots.
The results, displayed in Figures 7.4 and 7.5, suffered from the same challenges as the
aligned heading experiment: environmental magnetic fields, communication reliability, and the size of the tank. However, the last two of these impediments to coordinated heading control were more problematic for this experiment than they were for
the alignment control. The anti-alignment heading control naturally drives the robots
apart from one another, leading to greater separation between the radios and a
commensurate decrease in communication reliability. Also, the size of the pool is more
constraining in this experiment. Both the red and blue robots encountered the far wall
of the pool and were affected by it. While they were still able to control their headings
when they encountered the pool wall, the heading control effectiveness of the robot
tail mechanism is diminished by wall contact. Finally, note that in both simulation
and experiment, the heading of the red vehicle crosses over that of the green vehicle.
This crossing happened because red received from blue when green did not, and would
not have occurred with all-to-all communication. Regardless, a near-balanced state is
approached.
7.4 Coordinated Reference Matching
For the reference matching experiment, the reference vector was selected as [0, 0.75],
which corresponds to desired group motion along the positive y-axis of the pool. The
UW-FAVs were started in one corner of the pool, with red and blue near 100◦ and green
near 200◦ . The runtime was set at 52s, and the time step was selected as ∆T = 2s.
To be consistent with the theory generated in Chapter 4, all-to-all communication is
required. This communication topology was achieved using three sequential broadcasts per ∆T interval. In other words, a fixed transmission sequence was used for
this experiment, and each UW-FAV updated its reference heading only after receiving
heading information from the other two vehicles. The gain was selected as K = −0.5.
The experimental data is shown in Fig. 7.6.
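For readers who want to check a set of headings against the reference vector, the phasor centroid can be computed as sketched below. The function name is hypothetical, and headings are assumed to be mathematical angles in radians measured from the positive x-axis, which may differ from the compass convention used on the robots.

```python
import math


def phasor_centroid(headings_rad):
    """Average of the unit phasors (cos(theta_k), sin(theta_k))."""
    n = len(headings_rad)
    return (sum(math.cos(t) for t in headings_rad) / n,
            sum(math.sin(t) for t in headings_rad) / n)


# Illustrative configuration: one heading along +y and two splayed
# symmetrically about it by alpha, with cos(alpha) = 0.625, so that the
# centroid is (0, (1 + 2 * 0.625) / 3) = (0, 0.75).
alpha = math.acos(0.625)
example = phasor_centroid([math.pi / 2 - alpha, math.pi / 2, math.pi / 2 + alpha])
```

The example shows that a centroid of length 0.75 does not require aligned headings; the three vehicles remain splayed by roughly 51° about the direction of travel.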
7.5 Coordinated Target Tracking using Phase Coupling
In this section, we present a simulation of coordinated target tracking using the phase coupling of Chapter 2. The objective here is to stabilize the collective centroid to the target
and simultaneously keep individual vehicles nearby. Given the good correspondence
between the unicycle model and the actual dynamics of the UW-FAVs demonstrated
in the previous section, approximating the fish robots as constant-speed unicycles is
well justified here. Results will be presented in simulation using the continuous-time
all-to-all controllers developed earlier. Also, we scale the problem so that the forward
speed of each unicycle is one, so each vehicle has dynamics (2.2) with s = 1.
[Figure 7.6 panels: (a) Headings, plotted as Heading (deg) versus Time (s) with Reference and Compass traces for the Red, Green, and Blue robots; (b) Phasor Centroid.]
Figure 7.6: Heading data (a) collected for an experiment in which three vehicles are
performing reference matching. The phasor centroid (b) stabilized to the (constant)
reference vector, xref = [0, 0.75], in approximately 30s.
7.5.1 Matching the Centroid Velocity to a Reference Velocity
We begin with the simplified task of matching the centroid velocity to a reference velocity. With the unit-speed unicycle model, the phasor centroid and the velocity of the
group centroid are equivalent. Stabilizing the velocity of the group centroid to a constant reference velocity can be achieved directly using control (2.8) from Section
2.2. Figure 7.7(a) shows a simulation of N = 5 unicycles with K = 0.2 steering for 30sec
to bring the centroid velocity to a reference velocity, ẋt = [0.75, 0]. Note that this result
is similar to the UW-FAV demonstration above, but uses more vehicles. Similarly, stabilization of the centroid velocity to a dynamic reference velocity can be achieved by directly
applying (2.32). Figure 7.7(b) shows a simulation using N = 8 unicycles with coupling
strength K = 0.5, simulation time of 30sec, and a reference vector with dynamics:
\[
\dot{x}_{\mathrm{ref}} = 0.75\,[\cos(\theta_{\mathrm{ref}}),\ \sin(\theta_{\mathrm{ref}})]
\tag{7.2}
\]
\[
\dot{\theta}_{\mathrm{ref}} = 0.1\cos(t/2\pi).
\tag{7.3}
\]
In both simulations, the vehicles steer to match the velocity of the collective centroid
to that of the reference vector.
[Figure 7.7 panels: (a) Constant Reference; (b) Dynamic Reference.]
Figure 7.7: Demonstration of coordinated unicycle control. In (a), a group of five unicycles steers to match the velocity of their centroid to a constant reference vector. In
(b), the orientation of the reference vector changes in time.
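The reference dynamics (7.2) and (7.3) are simple enough to integrate directly. The sketch below uses forward Euler and reads the argument of the cosine in (7.3) as t/(2π), which is an assumption about the intended grouping; the function names and step size are likewise illustrative.

```python
import math


def integrate_reference(theta0=0.0, run_time=30.0, dt=0.001):
    """Forward-Euler integration of (7.3); returns theta_ref at run_time."""
    theta = theta0
    for i in range(int(round(run_time / dt))):
        t = i * dt
        theta += dt * 0.1 * math.cos(t / (2.0 * math.pi))
    return theta


def reference_velocity(theta_ref):
    """Right-hand side of (7.2): a vector of constant length 0.75."""
    return (0.75 * math.cos(theta_ref), 0.75 * math.sin(theta_ref))
```

Under this reading, (7.3) integrates in closed form to θref(t) = θref(0) + 0.2π sin(t/(2π)), so the reference direction swings slowly back and forth by at most about 36° while its length stays fixed at 0.75.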
7.5.2 Tracking a Target Vehicle
The reference vector here will be derived from a planar target vehicle with position
and velocity denoted rt and ṙt, respectively. In order to stabilize the group centroid,
\[
\bar{r} = \frac{1}{N}\sum_{k=1}^{N} r_k,
\tag{7.4}
\]
to the target vehicle, (2.32) can be used in conjunction with an outerloop controller. In
our conference paper (Theorem 4 in [44]), we use the outerloop,
\[
\dot{x}_{\mathrm{ref}} = (1-\omega(\rho))\,\dot{r}_t + \omega(\rho)\,(r_t - \bar{r})/\rho,
\tag{7.5}
\]
where ρ = ‖r̄ − rt‖ is the distance between the group centroid and the target and ω is
a smooth function satisfying limρ→0 ω(ρ)/ρ = 0. Figure 7.8 shows a
simulation of N = 12 unicycles with
\[
\omega(\rho) = 1 - e^{-\alpha\rho}, \quad \alpha > 0.
\tag{7.6}
\]
The use of (2.32) also allows the target velocity to be dynamic.
Figure 7.8: A demonstration of the outerloop controller, which is used here to bring the
group centroid to the target vehicle, shown as a blue circle. To produce this simulation,
K = 1, N = 12, T = 45, and α = −.05.
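The outerloop (7.5) with the blending function (7.6) can be evaluated as sketched below. The function and argument names are hypothetical, positions and velocities are assumed to be plain (x, y) tuples, and the small-ρ guard is an implementation choice made here to avoid dividing by zero.

```python
import math


def outer_loop_reference(r_target, v_target, r_centroid, alpha):
    """Evaluate (7.5) with omega(rho) = 1 - exp(-alpha * rho) from (7.6)."""
    dx = r_target[0] - r_centroid[0]
    dy = r_target[1] - r_centroid[1]
    rho = math.hypot(dx, dy)
    if rho < 1e-12:
        # Centroid on the target: omega(0) = 0, so only the target velocity remains.
        return v_target
    omega = 1.0 - math.exp(-alpha * rho)
    return ((1.0 - omega) * v_target[0] + omega * dx / rho,
            (1.0 - omega) * v_target[1] + omega * dy / rho)
```

The two limits show the intent of the blend: far from the target the reference is essentially a unit vector pointing at the target, and on the target it reduces to the target velocity.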
While the centroid does track the target in the above simulation, the individual vehicles stray far from the target. Several options have been considered to keep vehicles
close while the centroid remains on the target. We now consider two options: potential
function spacing and centroid-locked spacing.
Potential Function Spacing
To keep individuals near the centroid, one option is to derive a control input from a
potential function, projected into the null space of the matched manifold (2.31) so as to not
disturb the centroid velocity. The potential function steers vehicles to maintain a fixed
distance from the target while the projection ensures that the additional steering term
does not adversely affect the centroid target tracking. For example, in our work for
The Boeing Company, we augment the steering controller to

+ *

+

*
ρ − sin θk 
dV
ρ cos θk  
(ρ, ρ0 )
,
,
ũk = uk + P −
−
dr
kρk
kρk sin θ
cos θ
k
(7.7)
k
where the potential function V is defined as
\[
V(\rho,\rho_0) = \rho + \frac{\rho_0^2}{\rho} - 2\rho_0.
\tag{7.8}
\]
Here, ρ0 is the equilibrium distance and ρ is the distance from the vehicle to the
centroid. The gradient of the potential function term in the controller acts to steer
vehicles towards the target when they are too far and away from the target when they
are too close. The second term creates clockwise rotation about the target. The matrix
P is an orthogonal projection into the null space of J from (2.41). An example with
N = 11 vehicles is shown in Figure 7.9.
Figure 7.9: A simulation of a swirling motion produced by (7.7). Here, N = 11, K = 0.1,
ρ0 = 7, and the speed of the target, shown as the tank, was half that of the pursuit vehicles.
If desired, the outerloop can be used here to bring the centroid to the target.
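The shape of the spacing potential (7.8) is easy to check numerically. The sketch below evaluates V and its derivative with respect to ρ; the function names are hypothetical, and the derivative 1 − (ρ0/ρ)² follows from differentiating (7.8) directly.

```python
def potential(rho, rho0):
    """Spacing potential (7.8): zero at rho == rho0, positive elsewhere for rho > 0."""
    return rho + rho0 ** 2 / rho - 2.0 * rho0


def potential_slope(rho, rho0):
    """Derivative dV/drho = 1 - (rho0 / rho)**2."""
    return 1.0 - (rho0 / rho) ** 2
```

Since V can be rewritten as (ρ − ρ0)²/ρ, its only minimum for ρ > 0 is at the equilibrium distance. The slope is positive when a vehicle is too far and negative when it is too close, which is what lets the gradient term act as a restoring force.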
Centroid-Locked Spacing Control
A second approach that can be used to keep vehicles near the target is to employ the
centroid-locked steering control, specifically splay orbits, from Chapter 2. Recall that
in a splay orbit, each vehicle performs the same steering control, but splayed in time.
When coupled with unicycle kinematics, splay orbits result in each vehicle having the
same average velocity. Because the average velocity of the centroid matches that of
the target, the average velocity of each individual must also match that of the target.
This means the average distance between each individual and the target will not increase provided the target velocity changes slowly compared to the oscillation period.
A centroid-locked spacing control is shown in Figure 7.10.
Figure 7.10: A simulation of a back-and-forth motion produced by the general N
centroid-locked control (2.4.4), with N = 5 vehicles and coupling strength K = 0.1.
A specific initial condition was selected to produce this splay orbit.
7.6 Final Remarks
The work in this chapter shows how the theoretical developments of this thesis can be
applied to the UW-FAV testbed. Initial experiments demonstrated successful heading
alignment and anti-alignment, despite severe communication limitations. In fact, a
bug in the radio communication software was later found to have caused a huge packet
loss, on the order of 50%, during the tests. The radios have since been upgraded and
now have a drop rate near 1%. Another positive result of this demonstration was
that the visual tracking system was able to simultaneously track three robots as they
performed the coordinated maneuvers.
The simulation examples presented in the second part of this chapter show how the
theoretical developments of the early chapters of this thesis can be applied to the task
of tracking a target vehicle. Of particular importance here is the ability to drive the
phasor centroid, which here corresponds to the velocity of the spatial centroid, to a
reference vector. We hope to demonstrate coordinated target tracking on the UW-FAV
testbed in the near future.
Chapter 8
CONCLUSION
The work in this thesis has presented a broad variety of material ranging from technical work on collective phase coupling to a hands-on demonstration of coordinated target tracking on the University of Washington Fin-Actuated underwater multi-Vehicle
testbed.
8.1 Summary
The early chapters of the thesis focused specifically on low level Kuramoto-inspired
coordination to allow stabilization of the phasor centroid to a static or dynamic reference vector. Oscillations preserving this centroid matching were then explored and
an extension of the splay state was derived. Then, a novel extension to a situation in
which some of the vehicles act as leaders to the others was presented and analyzed
for a small system to highlight the differences between sinusoidal and linear protocols.
The following chapter discussed an extension of phase coupled oscillator models to
discrete time in order to make the control compatible with typical digital communication networks and also to reduce the overall communication overhead. Also considered
at this point was a routing optimization showing that a high-power long-range broadcast is optimal when the time deadline is tight.
Following these theory-oriented chapters was a discussion of a complete coordinated target tracking system. The work here was aimed at answering broad questions
regarding this application. The results here indicate that communication always helps,
but serve moreover to point out that many open questions remain for future work on
multi-agent systems.
Finally, a select few of the coordinated controllers were demonstrated on the nonholonomic UW-FAV multi-robot testbed. The first demonstrations showed heading
alignment and anti-alignment on the robotic system. Then, simulation results were
presented to demonstrate coordinated target tracking using either potential function
or centroid-locked spacing controls.
8.2 Questions for Future Research
While some questions have been answered by the work in this thesis, many questions
remain unanswered. On the scope of the thesis, general tools for analyzing coordinated
control systems are lacking entirely. Even for linear systems, determining what to
communicate to neighboring vehicles remains a largely open challenge. With respect
to the non-linear coordinated systems considered in this thesis, open questions remain
at all levels from centralized coordinated control to distributed estimation.
For coordinated phase-coupled oscillator control, stabilizing the phasor centroid to
a time-varying reference vector should be guaranteed. As for the material presented
on splay orbits, deriving a feedback term that takes the state from an arbitrary initial
condition to a point on a splay oscillation is of critical importance. The material on
phase-coupled oscillator models with leader agents should be formalized and extended.
With respect to the work on discrete-time phase-coupled oscillator models, more restrictive communication topologies should be permitted in the analysis. For the demonstration towards the end of the thesis, it appears as though reference matching could
work with more limited topologies. If the extension is not direct, a state estimator can
be added to approximate the location of the current phasor centroid. The performance
of the discrete-time protocol should be studied in detail.
The chapter on coordinated target tracking addressed fundamental open questions,
and made progress towards answering these questions by considering a specific problem. Future work here should focus on extending the material of this chapter to more
general nonlinear control systems. Also, issues surrounding coordinated nonlinear
state estimation require much attention.
The demonstration section served to bring a number of practical issues to light. For
example, synchronized clocks were required to make the discrete-time protocol work
without communication collisions. Despite the numerous packet dropouts experienced
during the experiment, the coordinated control protocols seemed to demonstrate a certain robustness, which should be formalized. The work on Kuramoto-inspired coordinated target tracking using oscillator models will be demonstrated on the UW-FAV
testbed as soon as it is again operational. Future work here is needed to derive a
feedback term to keep vehicles within a specified distance of the collective centroid.
8.3 Final Remarks
What is needed for the future of coordinated control is sound theory quantifying and capitalizing on the benefits offered by multi-agent systems. Specifically, the
supposed advantages of multi-agent systems include performance increases and task-robustness benefits. However, a large amount of recent research addresses aspects of
coordinated control that do not have immediate return on either of these benefits. The
challenge here is that linear systems theory cannot address all aspects of coordinated
control. Answering broad questions, such as what information should be communicated
and when each communication should take place, will require new collaborative
research spanning information theory, communications theory, and computer science
in addition to control theory.
BIBLIOGRAPHY
[1] D. G. Aronson, M. Golubitsky, and M. Krupa. Coupled arrays of Josephson junctions and bifurcation of maps with Sn symmetry. Nonlinearity, 4(3):861–902,
1991.
[2] D. G. Aronson, M. Golubitsky, and J. Mallet-Paret. Ponies on a merry-go-round in
large arrays of Josephson junctions. Nonlinearity, 4:903–910, 1991.
[3] Y. Bar-Shalom and X. R. Li. Multitarget-multisensor Tracking: Principles and
Techniques. Yaakov Bar-Shalom, 1995.
[4] R. W. Beard and T. W. McLain. Multiple UAV cooperative search under collision
avoidance and limited range communication constraints. Decision and Control,
2003. Proceedings. 42nd IEEE Conference on, 2003.
[5] V. N. Belykh, I. V. Belykh, and M. Hasler. Connection graph stability method for
synchronized coupled chaotic systems. I. General approach. preprint, 2003.
[6] P. K. Bettale. Design of a reliable embedded radio transceiver module with applications to autonomous underwater vehicle systems. Master’s thesis, University of
Washington, 2008.
[7] B. Birnir. An ODE Model of the Motion of Pelagic Fish. Journal of Statistical
Physics, 128(1):535–568, 2007.
[8] S. Bjorkenstam, M. Ji, M. Egerstedt, and C. Martin. Leader-Based Multi-Agent
Coordination Through Hybrid Optimal Control. Allerton Conference on Communication, Control, and Computing, 2006.
[9] J. Black, T. Ellis, and P. Rosin. Multi view image surveillance and tracking. Proceedings of the Workshop on Motion and Vision Computing, pages 169–174, 2002.
[10] G. Bradski. The OpenCV Library. Dr. Dobb’s Journal November 2000, Computer
Security, 2000.
[11] K. Branson and S. Belongie. Tracking multiple mouse contours (without too many
samples). Computer Vision and Pattern Recognition (CVPR), IEEE Computer Society Conference, 1:1039–1046, 2005.
[12] E. Brown, P. Holmes, and J. Moehlis. Globally coupled oscillator networks. Problems and perspectives in nonlinear science, A celebratory volume in honor of
Lawrence Sirovich, pages 183–215, 2003.
[13] F. Bullo and A. D. Lewis. Geometric Control of Mechanical Systems: Modeling,
Analysis, and Design for Simple Mechanical Control Systems. Springer, 2005.
[14] M. E. Campbell and W. W. Whitacre. Cooperative tracking using vision measurements on SeaScan UAVs. IEEE Transactions on Control Systems Technology,
Accepted for Publication, 2007.
[15] K. C. Chang, R. K. Saha, and Y. Bar-Shalom. On optimal track-to-track fusion.
Aerospace and Electronic Systems, IEEE Transactions on, 33(4):1271–1276, 1997.
[16] H. Chen and Y. Bar-Shalom. Track Fusion with Legacy Track Sources. Information Fusion, 2006. ICIF’06. 9th International Conference on, pages 1–8, 2006.
[17] I. J. Cox and S. L. Hingorani. An efficient implementation of Reid’s multiple hypothesis tracking algorithm and its evaluation for the purpose of visual tracking.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 1996.
[18] R. Diestel. Graph Theory. Springer, 2005.
[19] M. J. Field. Equivariant dynamical systems. Trans. Amer. Math. Soc, 259(1):185–
205, 1980.
[20] E. Fiorelli, N. E. Leonard, P. Bhatta, D. Paley, R. Bachmayer, and D. M. Fratantoni. Multi-AUV control and adaptive sampling in Monterey Bay. Autonomous
Underwater Vehicles, 2004 IEEE/OES, pages 134–147, 2004.
[21] G. Flierl, D. Grünbaum, S. Levins, and D. Olson. From Individuals to Aggregations: the Interplay between Behavior and Physics. Journal of Theoretical Biology, 196(4):397–454, 1999.
[22] D. Fox, W. Burgard, and S. Thrun. Probabilistic Robotics. Intelligent Robotics and
Autonomous Agents. The MIT Press, 2005.
[23] P. Gupta and P. R. Kumar. The capacity of wireless networks. IEEE Transactions
on Information Theory, IT-46:388–404, March 2000.
[24] M. Han, W. Xu, H. Tao, and Y. Gong. An algorithm for multiple object trajectory tracking. In Proceedings of the 2004 IEEE Computer Society Conference on
Computer Vision and Pattern Recognition, volume 1, pages 864–871, 2004.
[25] N. Hingston. Equivariant Morse theory and closed geodesics. J. Differential Geom,
19:85–116, 1984.
[26] T. C. Hu. Parallel sequencing and assembly line problems. Operations Research,
9(6):841–848, 1961.
[27] C. Hue, J. L. Cadre, and P. Perez. Tracking multiple objects with particle filtering.
IEEE Transactions on Aerospace and Electronic Systems, 38:791–812, 2002.
[28] A. Jadbabaie, N. Motee, and M. Barahona. On the stability of the Kuramoto
model of coupled nonlinear oscillators. In Proceedings of the American Control
Conference, pages 988–1001, 2004.
[29] J. Jeanne, N. E. Leonard, and D. Paley. Collective Motion of Ring-Coupled Planar
Particles. Decision and Control, 2005 and 2005 European Control Conference.
CDC-ECC’05. 44th IEEE Conference on, pages 3929–3934, 2005.
[30] M. Ji and M. Egerstedt. A Graph-Theoretic Characterization of Controllability for
Multi-agent Systems. American Control Conference, 2007. ACC’07, pages 4588–
4593, 2007.
[31] M. Ji, A. Muhammad, and M. Egerstedt. Leader-based multi-agent coordination:
Controllability and optimal control. Proceedings of the American Control Conference, pages 1358–1363, 2006.
[32] S. J. Julier. The scaled unscented transformation. American Control Conference,
2002. Proceedings of the 2002, 6, 2002.
[33] E. W. Justh and P. S. Krishnaprasad. Steering laws and continuum models for
planar formations. In IEEE 42nd Conf. on Decision and Control, Hawaii, USA,
2003.
[34] E. W. Justh and P. S. Krishnaprasad. Equilibria and steering laws for planar
formations. Systems and Control Letters, 52:25–38, 2004.
[35] H. K. Khalil. Nonlinear Systems. Prentice Hall, third edition, 2002. pages 168–174.
[36] S. Kim, S. H. Park, and C. S. Ryu. Multistability in coupled oscillator systems
with time delay. Phys. Rev. Lett., 79(15):2911–2914, 1997.
[37] D. Kingston and R. Beard. UAV Splay State Configuration for Moving Targets in
Wind. Control of Redundant Robot Manipulators: Theory and Experiments, 2005.
[38] D. J. Klein. Coordinated collective motion for multivehicle trajectory tracking.
Master’s thesis, University of Washington, 2005.
[39] D. J. Klein, P. K. Bettale, B. I. Triplett, and K. A. Morgansen. Autonomous underwater multivehicle control with limited communication: Theory and experiment.
In Proceedings of the Second IFAC Workshop on Navigation, Guidance and Control of Underwater Vehicles, Killaloe, Ireland, April 2008.
[40] D. J. Klein, E. Lalish, and K. A. Morgansen. Controlled sinusoidal coupling: Heterogeneity through leadership. In the American Control Conference (submitted),
St. Louis, MO, USA, June 2009.
[41] D. J. Klein, P. Lee, K. A. Morgansen, and T. Javidi. Integration of communication
and control using discrete time Kuramoto models for multivehicle coordination
over broadcast networks. Decision and Control, 2007 46th IEEE Conference on,
pages 13–19, 2007.
[42] D. J. Klein, P. Lee, K. A. Morgansen, and T. Javidi. Integration of communication
and control using discrete time Kuramoto models for multivehicle coordination
over broadcast networks. IEEE Journal on Selected Areas in Communications,
26(4):695–705, May 2008.
[43] D. J. Klein, C. Matlack, and K. A. Morgansen. Cooperative target tracking using
oscillator models in three dimensions. In Proceedings of the American Control
Conference, New York, NY, June 2007.
[44] D. J. Klein and K. A. Morgansen. Controlled collective motion for trajectory tracking. In Proceedings of the American Control Conference, pages 5269–5275, Minneapolis, MN, June 2006.
[45] D. J. Klein and K. A. Morgansen. Set Stability of Phase-Coupled Agents in Discrete Time. Proceedings of the American Control Conference, 2008.
[46] D. J. Klein and K. A. Morgansen. A generalization of phase coupled oscillator
models. In preparation for IEEE Transactions on Automatic Control, 2009.
[47] Y. Kuramoto. Chemical Oscillations, Waves, and Turbulence. Springer-Verlag,
1984.
[48] H. J. Kushner. Stochastic Stability and Control. Academic Press, New York, 1967.
[49] B. F. La Scala and A. Farina. Choosing a track association method. Information
Fusion, 3(2):119–133, 2002.
[50] N. E. Leonard, D. Paley, and R. Sepulchre. Oscillator models and collective motion:
Splay state stabilization of self-propelled particles. In Proceedings of the IEEE
Conference on Decision and Control, Seville, Spain, December 2005.
[51] J. E. Marsden, T. Ratiu, and R. Abraham. Manifolds, Tensor Analysis, and Applications. Springer-Verlag, 3rd edition, 2001.
[52] M. J. Mataric, M. Nilsson, and K. Simsarian. Cooperative multi-robot box-pushing. Proceedings of the 1995 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 556–561, 1995.
[53] R. E. Mirollo. Splay-phase orbits for equivariant flows on tori. SIAM Journal on
Mathematical Analysis, 25(4):1176–1180, 1994.
[54] P. Monzon and F. Paganini. Global considerations on the Kuramoto model of sinusoidally coupled oscillators. Decision and Control, 2005 and 2005 European
Control Conference. CDC-ECC’05. 44th IEEE Conference on, pages 3923–3928,
2005.
[55] L. Moreau. Stability of multiagent systems with time-dependent communication
links. Automatic Control, IEEE Transactions on, 50(2):169–182, 2005.
[56] K. A. Morgansen, B. I. Triplett, and D. J. Klein. Geometric Methods for Modeling and Control of Free-Swimming Fin-Actuated Underwater Vehicles. Robotics,
IEEE Transactions on, 23(6):1184–1199, 2007.
[57] K. Mushambi. Application of modern filtering techniques for 3D localisation in
biological and robotic systems. Master’s thesis, University of Washington, 2006.
[58] E. W. Nettleton and H. Durrant-Whyte. Delayed and asequent data in decentralised sensing networks. Proc. SPIE Conf, 4571:255–266, 2001.
[59] S. Nichols and K. Wiesenfeld. Ubiquitous neutral stability of splay-phase states.
Physical Review A, 45(12):8430–8435, 1992.
[60] University of California San Diego. The Argo project: Global ocean observation for
understanding and prediction of climate variability. http://www.jimo.ucsd.edu/research/projects/current/A_theme/argo.htm, July 2008.
[61] University of Washington. The Neptune project. http://www.neptune.washington.edu, 2008.
[62] R. Olfati-Saber. Distributed Kalman Filter with Embedded Consensus Filters.
Decision and Control, 2005 and 2005 European Control Conference. CDC-ECC’05.
44th IEEE Conference on, pages 8179–8184, 2005.
[63] R. Olfati-Saber. Distributed Kalman filtering for sensor networks. In Proc. of the
46th Conference on Decision and Control, December 2007.
[64] R. Olfati-Saber, J. A. Fax, and R. M. Murray. Consensus and cooperation in networked multi-agent systems. Proceedings of the IEEE, 95(1):215–233, 2007.
[65] R. Olfati-Saber and J. S. Shamma. Consensus Filters for Sensor Networks and
Distributed Sensor Fusion. Decision and Control, 2005 and 2005 European Control Conference. CDC-ECC’05. 44th IEEE Conference on, pages 6698–6703, 2005.
[66] D. A. Paley, N. E. Leonard, R. Sepulchre, D. Grünbaum, and J. K. Parrish. Oscillator models and collective motion: Spatial patterns in the dynamics of engineered
and biological networks. IEEE Control Systems, 27(4):89, 2007.
[67] J. Parrish. Personal correspondence, 2007.
[68] J. K. Parrish and L. Edelstein-Keshet. From individuals to emergent properties:
complexity, pattern, evolutionary trade-offs in animal aggregation. Science, pages
99–101, 1999.
[69] H. Pasula, S. J. Russell, M. Ostland, and Y. Ritov. Tracking many objects with
many sensors. In IJCAI ’99: Proceedings of the Sixteenth International Joint
Conference on Artificial Intelligence, pages 1160–1171, San Francisco, CA, USA,
1999. Morgan Kaufmann Publishers Inc.
[70] A. Rahmani, M. Ji, M. Mesbahi, and M. Egerstedt. Controllability of multi-agent
systems from a graph-theoretic perspective. SIAM Journal on Control and Optimization, 2008.
[71] A. Rahmani and M. Mesbahi. On the controlled agreement problem. American
Control Conference, 2006, 2006.
[72] D. Rus, B. Donald, and J. Jennings. Moving furniture with teams of autonomous
robots. Proceedings of IEEE/RSJ International Conference on Intelligent Robots
and Systems, pages 235–242, 1995.
[73] R. Sepulchre, D. Paley, and N. E. Leonard. Collective motion and oscillator synchronization. In V. Kumar, N. Leonard, and A. Morse, editors, Cooperative Control, volume 309, pages 189–205. Springer-Verlag, 2005.
[74] R. Sepulchre, D. A. Paley, and N. E. Leonard. Stabilization of planar collective
motion with limited communication. Automatic Control, IEEE Transactions on,
53(3):706–719, 2008.
[75] R. Simmons, D. Apfelbaum, W. Burgard, D. Fox, M. Moors, S. Thrun, and
H. Younes. Coordination for multi-robot exploration and mapping. Proceedings of
the AAAI National Conference on Artificial Intelligence, 2000.
[76] D. Simon. Optimal State Estimation: Kalman, H Infinity, and Nonlinear Approaches. Wiley-Interscience, 2006.
[77] D. P. Spanos, R. Olfati-Saber, and R. M. Murray. Approximate distributed Kalman
filtering in sensor networks with quantifiable performance. Information Processing in Sensor Networks, 2005. IPSN 2005. Fourth International Symposium on,
pages 133–139, 2005.
[78] S. Strogatz. Sync: The Emerging Science of Spontaneous Order. Hyperion, 2003.
[79] B. I. Triplett, D. J. Klein, and K. A. Morgansen. Discrete time Kuramoto models
with delay. In Panos J. Antsaklis and Paulo Tabuada, editors, Proceedings of the
Networked Embedded Sensing and Control Workshop, Lecture Notes in Control
and Information Sciences, pages 9–23, University of Notre Dame, USA, October
2005. Springer.
[80] B. I. Triplett, D. J. Klein, and K. A. Morgansen. Distributed estimation for coordinated target tracking in a cluttered environment. In Proc. of Robocomm, Athens,
Greece, October 2007.
[81] B. I. Triplett, D. J. Klein, and K. A. Morgansen. Cooperative estimation for coordinated target tracking in a cluttered environment. Accepted for publication in
ACM/Springer Mobile Networks and Applications (MONET), 2008.
[82] R. Y. Tsai. An efficient and accurate camera calibration technique for 3D machine vision. Proceedings of IEEE Conference on Computer Vision and Pattern
Recognition, pages 364–374, 1986.
[83] Applied Physics Lab University of Washington. Seaglider. http://www.apl.washington.edu/projects/seaglider/summary.html, July 2008.
[84] E. Uysal-Biyikoglu and A. El Gamal. On adaptive transmission for energy-efficiency in wireless data networks. IEEE Transactions on Information Theory,
December 2004.
[85] E. A. Wan and R. Van Der Merwe. The unscented Kalman filter for nonlinear estimation. Adaptive Systems for Signal Processing, Communications, and Control
Symposium 2000. AS-SPCC. The IEEE 2000, pages 153–158, 2000.
[86] A. G. Wasserman. Equivariant differential topology. Topology, 8:127–150,
1969.
[87] S. Watanabe and S. Strogatz. Constants of the motion of superconducting Josephson arrays. Physica D, 74:195–253, 1994.
[88] S. Watanabe and S. H. Strogatz. Integrability of a globally coupled oscillator
array. Physical Review Letters, 70(16):2391–2394, 1993.
[89] M. Wheeler, B. Schrick, W. Whitacre, M. Campbell, R. Rysdyk, and R. Wise. Cooperative tracking of moving targets by a team of autonomous UAVs. In 25th Digital
Avionics Systems Conference, 2006 IEEE/AIAA, pages 1–9, Oct. 2006.
[90] R. Willson. Tsai camera calibration software, 1995. Available at http://www.cs.cmu.edu/~rgw/TsaiCode.html.
[91] G. Wu, Y. Wu, L. Jiao, Y. F. Wang, and E. Chang. Multi-camera spatio-temporal
fusion and biased sequence-data learning for security surveillance. Proceedings
of the ACM Intl. Conf. on Multimedia, pages 528–538, 2003.
[92] M. K. S. Yeung and S. H. Strogatz. Time delay in the Kuramoto model of coupled
oscillators. Physical Review Letters, 82(3):648–651, January 1999.
[93] M. A. Zafer and E. Modiano. A calculus approach to minimum energy transmission policies with quality of service guarantees. INFOCOM 2005, 1:548–559, Mar.
2005.
VITA
Daniel J. Klein was born in Madison, Wisconsin in 1981. After graduating from
James Madison Memorial High School in 1999, he entered the Mechanical Engineering
program at the University of Wisconsin, Madison. While a student at UW-Madison,
Daniel worked in the research lab of Prof. Nicola Ferrier. His research focused on
robotics and medical image processing. Daniel graduated in 2003 with a B.S. degree in
Mechanical Engineering, with Honors in Research.
The positive research experiences Daniel had while working with Prof. Ferrier led
him to enter the Ph.D. program in Aeronautics and Astronautics at the University
of Washington in 2003. Daniel was attracted to this specific program because of the
novelty of the robotic fish project. He was granted a scholarship from the Achievement
Rewards for College Scientists (ARCS) Foundation. In 2008, Daniel earned a Doctor of
Philosophy at the University of Washington in Aeronautics and Astronautics.
During the summer of 2007, Daniel took an internship with Intel Research, Seattle. The focus of this research lab is on human and computer interaction, and one of
the active research projects involves a micro-computer with onboard gyroscope, magnetometer, and accelerometer triads. Daniel wrote software to estimate the orientation
of the computer using real-time data from these sensors.
Daniel currently lives in Kirkland, WA with his wife, Judy, and golden retriever,
Finny. However, he is soon moving to Santa Barbara to take a postdoc position at
the University of California, Santa Barbara. Daniel will continue research on control
of multi-agent systems under the supervision of João Hespanha in the department of
Electrical and Computer Engineering.
