
Search and rescue
using mixed swarms of heterogeneous agents:
modeling, simulation, and planning
Eduardo Feo, Luca Gambardella, Gianni A. Di Caro
Technical Report No. IDSIA-05-12
April 2012
IDSIA / USI-SUPSI
Istituto Dalle Molle di studi sull’intelligenza artificiale
Galleria 2, 6928 Manno, Switzerland
IDSIA is a joint institute of both University of Lugano (USI) and University of Applied Sciences of Southern Switzerland (SUPSI),
and was founded in 1988 by the “Dalle Molle” Foundation which promoted quality of life.
This research has been partially funded by the Swiss National Science Foundation (SNSF) Sinergia project SWARMIX, project
number CRSI22 133059.
Search and rescue using mixed swarms of heterogeneous
agents: modeling, simulation, and planning
Eduardo Feo, Luca Gambardella, Gianni A. Di Caro
Dalle Molle Institute for Artificial Intelligence (IDSIA)
Lugano - Switzerland
{eduardo,luca,gianni}@idsia.ch
Abstract
Coordinating and managing emergency response scenarios in wilderness areas is a difficult
task, especially when the searching team is distinguished by a wide diversity of sensory-motor
and cognitive skills (e.g., human rescuers, dogs, and autonomous robots). Moreover, local
terrain characteristics and environmental conditions have a strong influence on the performance
of the exploration tasks executed by the agents in the rescue team. To cope with these issues,
we present a mission support tool, integrated with geographic information systems (GIS),
to assist and perform automatic monitoring and decision-making on mission plans. The
proposed framework introduces novel strategies to estimate search efficacy according to agent
and environment characteristics, an optimization-based mission planning component which
assists the automatic allocation and scheduling of searching tasks, and a simulation environment
which enables the user to ascertain the outcome of the defined plans. We show the application
of the proposed tools in a simulated scenario, using real geographical data. We also discuss
future and ongoing work.
1 INTRODUCTION
Current search and rescue (SAR) missions feature the use of technologies (e.g., robots) and of human
and animal agents (e.g., dogs) providing different capabilities and expertise. The use of such
heterogeneous teams must be accompanied with innovative mission support and planning tools
in order to exploit and integrate the capabilities of each agent and make an optimal use of all
the available assets. In this work we address the challenges arising in the design of a mission
support system (MSS) for wilderness search and rescue operations (WiSAR in short) using a team
of heterogeneous agents. In the scenarios that we consider the location of a stationary non-evasive
target (e.g., an injured hiker) must be quickly identified. We assume the presence of an underlying
communication infrastructure enabling the command center and the SAR agents to continuously
exchange information.
In order to plan an effective deployment of the available agents over the search area, it is
customary to rapidly gather as much information as possible about the target and the area. In
this respect, the access to geographical data for the area becomes particularly useful, especially in
WiSAR scenarios, where search is done over vast and irregular terrains. This irregularity translates into
uncertainty on how agents’ performance and behavior (e.g., search efficacy, controllability) vary in
different parts of the area. This argument, together with the widespread availability of geographic
datasets and tools, suggests the use of geographic information systems (GIS) to model the tight
relationship between the environment and the performance of the executed search activities through
search efficacy models which take into account characteristics and conditions of the surrounding
environment.
In addition, search and rescue missions are inherently dynamic. Several factors are
continually changing, such as the agents’ estimated search performance, which can be affected by
weather conditions, the discovery of new hints about the target location during the mission, or changes
in the number of agents participating in the SAR team. This dynamic nature of the problem
encourages the use of a receding horizon approach, where mission plans are iteratively defined in
stages. New information acquired during the search and/or changes in the problem or in the team
are included in the planning of the following mission stages.
In this work we describe an MSS aimed at helping the user (i.e., the SAR mission commander)
to assess the area to search and the resources available in the SAR team, and to use this information to
define joint search plans. A plan consists of a sequence of actions to be performed by an agent for a
certain time duration at defined environment locations. The MSS also includes the functionality for
the continuous monitoring of the activities of the agents and the assessment of search progress over
the different portions of the environment. Due to its GIS integration, the system that we present
can be categorized as a spatial decision support system.
When the SAR team includes a large number of heterogeneous searchers, it is
inherently difficult to manually allocate predefined sectors to each agent while at the same time best exploiting their capabilities and mutual synergies. Therefore, our MSS can be used by the
mission commander to automatically compute mission plans (e.g., initial plans, or modifications to
existing ones) seeking for an optimal joint use of the available resources, thus potentially increasing
the efficiency of the mission. The graphical visualization and simulation of the computed plans,
both at the trajectory and exploration gain level, enables the commander to revise and subsequently
dispatch the instructions to the searchers using the communication network. The proposed MSS
provides optimal global search and rescue mission plans considering all available information. Mission planning is modeled as a mixed integer linear optimization problem (MILP) in which the model
simultaneously allocates predefined sectors to be explored and specifies the schedule of the actions
that each agent should follow. The resulting plans guarantee a maximum reward for the search
activities. A number of constraints are included to model cooperation, proximity, and connectivity
relationships among agents’ team, as well as uncertainties and energy issues. The activation of
these constraints allows the user to select different search strategies when and where needed (e.g.,
at the start of the search the agents can be quickly spread over a large area, while in later stages
they can be focused on restricted areas working in tight cooperation).
The MSS also includes a simulation environment, using real-world geographical data for the
scenario, and realistic mobility and sensing models for the agents. The simulator makes it possible to perform
general studies of the planning model and to make an online prediction of the results of a defined
plan.
We show the use of the MSS in a simulated scenario, where a team composed of human searchers,
air-scent dogs, and autonomous aerial robotic platforms work together to perform the search of a
missing person in a mountain area.
2 RELATED WORK
The automation of search and rescue missions has been investigated in an extensive body of research.
The work on search theory, introduced by Koopman [17], Stone [26] and others [28], has been at
the core of many of the proposed approaches developed inside a probabilistic context. In these
cases, the analysis is based on the notion of likelihood of the target’s presence in the search space.
The proposed mission assignment algorithms attempt to optimize the search by minimizing the
expected time for target detection [18], or by maximizing detection probability [3, 21] or number of
detections [7]. Although we do not explicitly formulate the problem in probabilistic terms (avoiding
non-trivial normalization issues), we use the related notion of exploration demands: bounded-range
variables used to prioritize parts of the area according to the amount of exploration they require.
In WiSAR missions, terrain features and environment conditions have a significant impact on
agents’ mission performance. Lin et al. [20] pointed out that the behavior of a (missing) person is
correlated with environmental features, such as terrain characteristics or weather conditions, so that
the likelihood of finding him/her in a specific place should show the same correlation. Therefore,
the use of GIS is proposed to extract terrain features and environmental information to support the
elaboration of probability maps. Similarly, the use of GIS tools is also encouraged in the context of
maritime search [13]. However, so far, the use of GIS has focused on the effect of the environment
on the target, but not on its influence on the searchers. In fact, it is common practice to
assume that agents’ search efficacy is uniform throughout the area [3, 21, 24]. In this work, given
the specific characteristic of a search agent, we exploit the use of GIS data to estimate its expected
search efficacy and mobility within each different portion of the search area.
Many search planning algorithms are based on a cellular partitioning of the environment [7, 30], where each cell can conveniently include a description of its characteristics and search status. For instance, information about where the target might be, feedback from the exploration already done, and search performance estimates of each agent in particular regions of the map,
among others, can be easily mapped onto parts of the terrain. Continuous space approaches [15]
are less flexible since they are adequate in the case of uniform areas (e.g., maritime search using
UAVs), but not when terrain characteristics are irregular and not precisely known (e.g., land search).
In fact, in these cases, it becomes very hard to formulate plans that let the agents move through
precisely defined locations. Although widely used, cellular decomposition of the search space also
poses problems in the case of heterogeneous SAR teams and/or agents with multi-scale sensing.
Waharte et al. [30] introduced the use of multiple grid cells and the observation of partial areas to
provide realistic sensing models for aerial platforms flying at variable altitudes. A multiscale representation of the environment was also proposed in [8]. Since we deal with heterogeneous agents,
we use a multiscale cellular decomposition by assigning to each agent search sectors composed of
subsets of cells according to its sensing and mobility.
Our work considers a number of issues related to SAR missions which have been covered individually by research in different domains. Path planning taking into account distance constraints has
been presented in [1], assuming that the agents already have a trajectory to follow. Constraints are
defined between pairs of agents and are modeled as time-parametric functions of their maximum
distance. In this work we introduce several types of proximity constraints that might be useful in
the context of SAR missions. Moreover, the computed plans guarantee an optimal mission assignment respecting these constraints. In [27, 29], robot coalition formation is considered to accomplish
tasks that would not be possible to realize with a single robot. In our work, we do not enforce
coalitions in order to accomplish certain tasks; however, we do consider that tasks might be done
more efficiently when several agents work physically close to each other and cooperate. This
assumption is quite intuitive in search and rescue activities given that the quality of the search is
expected to improve when there is cooperation in the field. For instance, agents can perceive each
other’s activities to avoid overlapping of work. Moreover, by having agents working nearby, the
safety of the mission increases, in the sense that contingency plans would be deployed faster in case
of unexpected failures (e.g., accidents for the rescuers).
The use of UAVs for SAR has recently attracted a lot of interest [21, 31]. Other research on the
use of autonomous robotics for SAR has considered the automation of search and terrain exploration
using teams of robots [24, 14, 11]. However, most of these works are bounded to particular agent
and sensing profiles (e.g., UAVs) while others are mostly oriented to study distributed coordination
schemes.
A significant research effort has been devoted to the simulation of SAR missions. However, in
most cases the focus is on rescue teams acting in large urban disasters [16]. The simulation
tools cover the occurrence of building damage and collapse, road blockage, damage to electricity,
water supply, gas, and other infrastructure, movements of refugees, and hospital operations, among
others. Urban SAR is significantly different from WiSAR, where the terrain
characteristics and its effect over the searchers and target play a crucial role. Other works have
proposed the use of simulators in the field of Maritime SAR [4], in which the ocean drift is one of
the key aspects.
Mixed-integer linear programming has been successfully used in path planning problems [10, 12],
mission assignment [7] and cooperative control [19]. One of the advantages of MILP formulations is
that, given requirements and available knowledge, the optimal solution is provided. Compared to
these works, we put an emphasis on agent heterogeneity, include spatiotemporal constraints, and
solve both task allocation and scheduling problems.
3 SYSTEM MODEL
The task of searching a large area is closely related to the area coverage problem. In the context
of SAR missions, searchers can be conveniently modeled as mobile sensors with a predefined speed
and sensor range. In a simple scenario (i.e., no effect of the environment on the performance of
the mobile sensors, and target location completely unknown), a naive but effective solution is the
coordination of sensors’ paths to spread in the field and cover as much area as possible within the
minimum time. However, in real-world scenarios, terrain is usually irregular and cluttered, such that
its effect on both mobility and sensing is not uniform throughout the field. Therefore, the percentage
of surface effectively covered by the sensors (i.e., SAR agents) strictly depends on the allocation
of field regions to team members. This issue is particularly relevant in the case of heterogeneous
teams. For instance, consider the scenario where an area consisting of two delimited regions (regions
A and B) must be explored by a team consisting of two searcher agents: one UAV and one air-scent
dog team. The UAV is equipped with a high-definition camera and object detection software on board. Region A is characterized by being mostly flat and having dense vegetation, while the other
by having highly irregular terrain with low vegetation. The dense vegetation of region A will
probably obstruct the field of view of the UAV’s camera; therefore, we consider that its sensing
range is reduced in that region. On the other hand, the irregularity of region B will greatly
hamper the movements of the air-scent dog team; therefore, we consider its mobility decreased.
[Diagram: the USER interacts with the MISSION SUPPORT SYSTEM (agent performance models, coverage map, integrated planning framework, simulated plans, optional model features, GIS data), which sends mission commands and plan updates to, and receives activity status and GPS positions from, the SEARCHER AGENTS operating in the ENVIRONMENT, through the COMMUNICATION NETWORK.]
Figure 1: Illustration of the components of the mission support system and their relationships.
3.1 Key concepts
Adopting the coverage view of SAR, in the following we introduce the concepts at the core of the
proposed MSS.
Area decomposition
Our model considers a cellular decomposition of the search area. The cells represent the smallest
and indivisible spatial elements. Each cell can be associated to a set of local properties. For a given
area decomposition (as selected by the user), the set of cells is indicated by C = {c1, c2, ..., cn}.
Without losing generality, in the following we assume a grid-based decomposition.
Coverage Map
To evaluate the status of the mission in terms of coverage, we define a coverage map. It relates cells
to numerical values representing the amount of coverage required, or, in other words, the residual
need of exploration of each cell. A coverage map is denoted by Cm : C → [0, 1]. For instance, for
c ∈ C, a value Cm (c) = 1 indicates that the cell still requires a full exploration (i.e., full coverage).
On the other hand, Cm (c) = 0 indicates no interest in exploring the cell, either because it has been
already explored or because the user is certain that the target is not located there (e.g., because of
prior knowledge). The mission commander can adjust these values at any point of time, in order to
express the desire to explore some regions more in depth than others, or to completely exclude some
parts of the area. By tracking movements and activities of the searchers in the search field, the
MSS continuously updates Cm values, providing awareness on how the search effort is distributed
in the area and assisting the overall decision-making process.
Agent Profile and Sectors
Searcher agents are mobile entities whose activities in the field can modify the coverage map. When
deployed in the field, they are connected to the communication network and able to inform the MSS
about the status of their activities. In our model, the group of agents participating in the mission is
denoted by the set A. To each agent k ∈ A, corresponds a search profile Pk = (sk , rk ) characterized
by: its optimal search speed (sk ), which is the maximum speed (m/s) at which the agent moves
while performing search activities, and its optimal sensor range (rk ), which is its nominal target
detection range (m). These properties are termed optimal because their values correspond to the
agent performance under ideal conditions. For instance, for an unmanned ground vehicle equipped
with high-resolution cameras, the setting of these values corresponds to the best-case environment,
that is, flat and smooth terrain with very low vegetation density and under good illumination
conditions.
Sectors correspond to spatial elements which can be assigned to the agents for exploration. To
account for the different sensing and mobility characteristics of the agents, a sector is defined as a
portion of the search area that includes one or more cells, hence it is represented as a subset of C.
The set of sectors which can be assigned to a specific agent k ∈ A is denoted with Γk ⊆ P (C).
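As an illustration, the profile and sector concepts above can be encoded directly; the sketch below is ours, and the example names and values are not part of the report:

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

# A cell in a grid-based decomposition, indexed by (row, col); an assumption
# made only for this illustration.
Cell = Tuple[int, int]

@dataclass(frozen=True)
class AgentProfile:
    """Search profile P_k = (s_k, r_k) under ideal conditions."""
    optimal_speed: float   # s_k: maximum search speed, in m/s
    optimal_range: float   # r_k: nominal target detection range, in m

# A sector is an indivisible unit of assignment: a subset of cells.
Sector = FrozenSet[Cell]

# Hypothetical example values for a UAV profile and a 2x2-cell sector.
uav = AgentProfile(optimal_speed=10.0, optimal_range=50.0)
sector = frozenset({(0, 0), (0, 1), (1, 0), (1, 1)})
```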
Time representation
The whole mission time is discretized into mission intervals. Any task assigned to an agent must
start and finish at the boundaries of these intervals. This will allow us to flexibly express time-dependent constraints (see Section 4). We also denote the length of a mission interval (in seconds)
as ∆t, which also becomes the time unit of all properties representing points in the mission timeline
or durations of activities.
3.2 GIS-based search performance models
Although most of the information currently used for decision-making in SAR missions is carefully
defined by a team of experts (e.g., by the mission command center), we introduce a GIS-based
framework to assist (or to automatically generate) the definition of the properties of the environment
and of the related search and mobility models of the agents.
3.2.1 Terrain elevation
One of the basic topographic characteristics of land surfaces is terrain elevation. This important
feature has a significant effect on the activities performed both on and above the terrain surface.
Digital Elevation Maps (DEMs) contain a rich set of information for elevation. From a raw DEM
(i.e., only elevation measures), we can derive additional terrain features which are useful for the definition of search efficacy and mobility models. Among these derived features we consider: (a) slope,
computed as a function of the maximum rate of change between neighboring points; (b) hillshade,
obtained by computing, from the estimated position of the sun, a depiction of the terrain showing
the amount of light reflecting off the surface; (c) the ruggedness index [25], commonly used as a measure
of terrain irregularity. Figure 2 shows a DEM and its derived data.
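Two of these derived features are straightforward to compute from a raster DEM. The sketch below is our own (a simple finite-difference slope and a Riley-style ruggedness index; hillshade is omitted), not the report's implementation:

```python
import numpy as np

def slope_degrees(dem, cell_size):
    """Slope in degrees from finite differences of the elevation raster
    (a simple variant of the maximum-rate-of-change definition)."""
    dz_dy, dz_dx = np.gradient(dem, cell_size)
    return np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))

def ruggedness_index(dem):
    """Terrain Ruggedness Index: for each interior cell, the square root of
    the summed squared elevation differences with its 8 neighbors
    (after Riley et al.); border cells are left at zero."""
    n, m = dem.shape
    sq = np.zeros_like(dem, dtype=float)
    center = dem[1:-1, 1:-1]
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if di == 0 and dj == 0:
                continue
            neighbor = dem[1 + di:n - 1 + di, 1 + dj:m - 1 + dj]
            sq[1:-1, 1:-1] += (neighbor - center) ** 2
    return np.sqrt(sq)
```

On a plane rising 1 m per 1 m cell, the slope evaluates to 45 degrees everywhere, which is a quick sanity check for the finite-difference variant.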
3.2.2 Land cover
Information also plays an important role in defining agents’ performance in the field. For instance, exploration activities inside areas with high vegetation density can reduce the average speed of a searcher, and reduce the amount of area surveyed and the quality of the search. One source of data to estimate the vegetation density of an area is a land cover map. It describes the observed
Figure 2: Example of features derived from elevation data: (a) DEM; (b) slope; (c) hillshade for a spring day at 12h; (d) hillshade for a spring day at 18h.
Figure 3: Land cover: (a) aerial image of sample area; (b) Corine Land Cover map.
Figure 4: Vegetation density derived from CLC: (a) Corine Land Cover categories; (b) vegetation density.
Category Id         Density value
311-313             3
322-324             2
211-243, 331-333    1
other               0

Table 1: Mapping CLC categories to vegetation density
physical covering of the terrain, including natural or planted vegetation and human constructions
(e.g., buildings, roads). Some land cover maps such as the Corine Land Cover (CLC) [2] include
several categories for forest and natural vegetation areas (see Figure 4a). We use these categories
to define the vegetation density values of the area under consideration. Here, we propose a simple
mapping from some relevant categories of the CLC to a vegetation density index value. This relation
is depicted in Table 1. Figures 3 and 4 show the derivation of vegetation density from a CLC
map.
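The mapping of Table 1 is simple enough to state as a lookup; the following sketch encodes it directly (the function name is ours):

```python
def clc_to_density(category_id: int) -> int:
    """Vegetation density index from a CLC level-3 category id (Table 1)."""
    if 311 <= category_id <= 313:                        # forest classes
        return 3
    if 322 <= category_id <= 324:                        # shrub / transitional vegetation
        return 2
    if 211 <= category_id <= 243 or 331 <= category_id <= 333:
        return 1                                         # agricultural / open spaces
    return 0                                             # other (e.g., artificial surfaces)
```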
Given a set of cells C and the corresponding geographical data (i.e., the DEM and CLC map),
we assign to each cell c ∈ C a collection of properties p∗(c) = {ph, ps, pl, pr, pv}, where: ph is the
elevation (in meters), ps is the slope (in degrees), pl represents the natural light as the non-shadowed
percentage (derived from hillshade), pr is the ruggedness index, and pv is the vegetation density index
(derived from CLC).
3.2.3 Agent search performance models
In our MSS, we propose an analytical formulation of the agent search performance, ϕ, in terms
of p∗ (c). The formulation is composed of two elements: mobility efficiency, we , and sensor range
efficacy, re , with 0 ≤ we , re ≤ 1. For an agent k ∈ A with profile Pk = (sk , rk ), we define the
optimal area coverage done by k in a time interval of t seconds as:
Ck (t) = sk rk t.
(1)
Using the above definition, we formulate the amount of exploration performed during an interval
of t seconds by an agent k inside a cell c, as:
ϕ̂k(c, t) = wek(c) rek(c) Ck(t) / Ac ,   (2)
where Ac is the cell’s area size (m2 ), and wek (c) and rek (c) are agent’s mobility efficiency and sensor
range efficacy inside cell c. By tracking the movements of all agents and summing up the amount
of time spent inside each cell, the MSS uses the ϕ̂ values to continuously update the coverage map
Cm .
Since exploration granularity is expressed in terms of sectors, if L ∈ Γk is the sector searched by
agent k for one mission interval, the update of the coverage map is done according to the following
expression that generalizes the definition of ϕ for sectors:
ϕk(L, c) = wek(c) rek(c) Ck(∆t/|L|) / Ac   ∀c ∈ L.   (3)
In the case of Eq. (3) we have assumed that the time duration of the activities inside L is uniformly
distributed among all cells composing the sector. If this condition does not hold, we talk of irregular
coverage models, and a different weighting needs to be performed for the different cells c ∈ L.
The effect of the terrain and the environment on the agents’ search performance is modeled in
the formulation of wek(c) and rek(c) as functions of p∗(c). Section 6.2 shows an example of the
derivation of search performance models based on geographical terrain properties.
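Equations (1)-(3) can be sketched directly in code; the function names below are ours, and the local efficiency values are assumed to be already known for each cell:

```python
def optimal_coverage(s_k, r_k, t):
    """Eq. (1): area optimally covered in t seconds, C_k(t) = s_k * r_k * t (m^2)."""
    return s_k * r_k * t

def exploration_in_cell(we, re, s_k, r_k, t, cell_area):
    """Eq. (2): fraction of a cell explored in t seconds, scaled by the local
    mobility efficiency we and sensor range efficacy re (both in [0, 1])."""
    return we * re * optimal_coverage(s_k, r_k, t) / cell_area

def sector_exploration(efficiencies, s_k, r_k, dt, cell_area):
    """Eq. (3): the mission interval dt is split uniformly over the |L| cells
    of the sector; `efficiencies` maps each cell to its (we, re) pair."""
    n = len(efficiencies)
    return {c: exploration_in_cell(we, re, s_k, r_k, dt / n, cell_area)
            for c, (we, re) in efficiencies.items()}
```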
3.3 Coverage Map update
Let Cm^t be the coverage map corresponding to the state of the mission at time t (in units of ∆t). The initial coverage map values (Cm^0) are a user-defined input to the MSS. During the course of the mission, agents keep track of their trajectories and record the amount of time spent inside each cell along their trails. These exploration logs are simply a collection of tuples (c, t), where c ∈ C and t is a real number representing the actual amount of time that the agent spent inside that cell. Let Lk[ti : tj] be the entries of the log kept by agent k corresponding to its activities carried out during the interval [ti∆t, tj∆t). We abbreviate Lk[0 : tj] as simply Lk[tj]. We also denote the global exploration log as L[ti : tj] = ∪k∈A Lk[ti : tj].
Assuming that L[t] is fully known, the demand values for all cells can be recursively updated
as follows:

Cm^t(c) = max{Cm^(t−1)(c) − ϕ̄(c, L[t − 1 : t]), 0},   (4)

where ϕ̄ is the exploration update function, which simply accounts for the estimated exploration contribution of all agents, considering the amount of time each of them spent inside the cell:

ϕ̄(c, L[ti : tj]) = Σ_{k∈A} ϕ̂k(c, tc^k),   (5)

where

tc^k = Σ_{(c,t)∈Lk[ti:tj]} t.   (6)
In this way, the coverage map is updated considering the execution of mission plans. For
convenience, we use Cm (without time index) to refer to the current values of the coverage map.
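One update step of this recursion can be sketched as follows; the data layout and names are our assumptions, and `phi_hat` stands for the per-agent efficacy of Eq. (2) evaluated at a given cell:

```python
from collections import defaultdict

def update_coverage_map(cm, log_entries, phi_hat):
    """One step of Eq. (4): subtract each agent's estimated exploration
    contribution from the residual demand, clamping at zero.
    `cm` maps cells to residual demand; `log_entries` is a list of
    (agent, cell, seconds) tuples covering the last mission interval;
    `phi_hat(agent, cell, seconds)` evaluates Eq. (2) for that agent."""
    time_in_cell = defaultdict(float)        # Eq. (6): total time per (agent, cell)
    for agent, cell, seconds in log_entries:
        time_in_cell[(agent, cell)] += seconds
    contribution = defaultdict(float)        # Eq. (5): sum contributions over agents
    for (agent, cell), seconds in time_in_cell.items():
        contribution[cell] += phi_hat(agent, cell, seconds)
    return {c: max(v - contribution[c], 0.0) for c, v in cm.items()}
```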
4 SAR PLANNING AS OPTIMIZATION PROBLEM
We formulate mission planning as an MILP optimization problem. The objective is to jointly
define, for all agents in the SAR team, search trajectories and activity schedules that maximize
the collection of cells’ utilities (i.e., that minimize Σ_{c∈C} Cm(c)) in the given time horizon. By
exploiting agents’ expected search performance in the individual cells as defined in the previous
section, the joint solution approach makes it possible to synergistically optimize team-level performance.
We formulate SAR planning as a variant of the Team Orienteering Problem (TOP) [5]. In the
TOP, a set of vertices is provided, each of them associated with a score, together with a graph that defines the
adjacency relations between the vertices. The goal is to determine a set of paths that maximize the
total collected score. According to this model, cells correspond to vertices and the score is the reward
of the exploration activities at the cells, that is, the decrease in Cm. Several aspects distinguish our
problem from typical TOPs in the literature. First, the total collected score (i.e., achieved coverage,
in our case) depends upon the amount of time spent servicing each vertex, which implies that the
optimization model must decide not only which vertices are visited, but also their service times.
Moreover, the maximum collected score which can be obtained from a vertex is limited (i.e., the
maximum score associated to c ∈ C is Cm(c)). Second, given that cells correspond to vertices, but
agent paths in the solutions consist of sequences of sectors (i.e., sequences of collections of cells),
the basic TOP model needs to be adapted to consider visiting several vertices at the same time.
4.1 Basic MILP Formulation
Time flows in discrete steps, represented by numerical values in N which correspond to multiples of
a fixed time unit (i.e., the mission interval). Each time unit an agent k visits a sector L ∈ Γk, the
collected reward associated to c ∈ L is defined by the agent’s search efficacy ϕk(c, L). dc = Cm(c)
is the maximum reward associated to a cell c ∈ C. As defined by Eq. (3), ϕk is a linear function
of time. If sectors L1 and L2 are assigned to agent k for an amount of time of t1 and t2 units
respectively, the utility of agent k’s plan is computed as Σ_{c∈L1} t1 ϕk(c, L1) + Σ_{c∈L2} t2 ϕk(c, L2).
To represent the start and end of the path of an agent k we make use of artificial sectors, denoted
respectively as Sk, Sk′. For each agent k ∈ A, we define a graph Gk = (Γk ∪ {Sk, Sk′}, Ek), where Ek
contains an edge (i, j) if both sectors are adjacent, implying that when agent k is located at sector
i at time step t, it is able to reach sector j by time step (t + 1). There exist edges from all sectors
to the dummy sector Sk′, and from Sk to all sectors reachable from the agent’s starting position. We
denote the set of agent sectors that include c ∈ C as θ(c) ⊆ ∪k Γk. For each agent k, the model
computes a plan Sk from time t = 1 to a maximum time horizon T, that is, Sk : {1, . . . , T} → Γk,
such that (Sk(t), Sk(t + 1)) ∈ Ek or Sk(t) = Sk(t + 1) ∀t.
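The construction of the per-agent graph Gk can be sketched as below; this is a simplified illustration with our own names, where the sector adjacency test and the set of initially reachable sectors are supplied by the caller:

```python
def build_agent_graph(sectors, adjacent, start="S", end="S_prime",
                      reachable_from_start=None):
    """Sketch of E_k for G_k = (Γ_k ∪ {S_k, S_k'}, E_k): an edge (i, j) when
    sector j is reachable from sector i within one mission interval; every
    sector connects to the dummy end sector, and the start sector connects
    to the sectors reachable from the agent's starting position."""
    if reachable_from_start is None:
        reachable_from_start = sectors     # assume all sectors reachable at t = 1
    edges = {(i, j) for i in sectors for j in sectors if i != j and adjacent(i, j)}
    edges |= {(i, end) for i in sectors}                 # every sector -> S_k'
    edges |= {(start, j) for j in reachable_from_start}  # S_k -> initial sectors
    return edges
```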
The mixed-integer linear program formulation for SAR planning is:

min Σ_{c∈C} (dc − Φc) + ςc   (8)

subject to

Σ_{(Sk,j)∈Ek} X_{Sk jk} = 1   ∀k ∈ A   (10)

Σ_{(i,Sk′)∈Ek} X_{iSk′k} = 1   ∀k ∈ A   (11)

Σ_{(i,j)∈Ek} X_{ijk} = Σ_{(j,i)∈Ek} X_{jik} = Y_{ik}   ∀k ∈ A, i ∈ Γk   (12)

Y_{Sk k} = Y_{Sk′ k} = 1   ∀k ∈ A   (13)

T_{ik} + W_{ik} − T_{jk} ≤ (1 − X_{ijk}) M   ∀k ∈ A, (i, j) ∈ Ek   (14)

T_{Sk k} + Σ_{i∈Γk} W_{ik} = T_{Sk′ k}   ∀k ∈ A   (15)

Y_{ik} ≤ T_{ik},  W_{ik} ≤ T Y_{ik}   ∀k ∈ A, i ∈ Γk   (16)

Φc = Σ_{k∈A} Σ_{i∈θ(c)∩Γk} ϕk(c, i) W_{ik}   ∀c ∈ C   (17)

Φc ≤ dc + ςc   ∀c ∈ C   (18)

0 ≤ Φc,  0 ≤ ςc   ∀c ∈ C   (19)

T_{ik}, W_{ik} ∈ N   ∀k ∈ A, i ∈ Γk   (20)

X_{ijk}, Y_{ik} ∈ {0, 1}   ∀k ∈ A, i, j ∈ Γk ∪ {Sk, Sk′}   (21)
Constraints (10-13) are directly derived from standard TOP formulations. Precedence and time
definition constraints (14-16) define variables T and W , where M is a large constant. Auxiliary
variables Φ and ς are defined by constraints (17-18). A cell’s excess of reward, ς, is added up as a
penalty in the objective function.
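To make the objective (8) and the reward cap of constraints (18)-(19) concrete, the following toy evaluator scores a candidate plan. It is a stand-in for intuition only, not the MILP itself: the routing constraints (10)-(16) are reduced to a simple horizon check, and all names are ours.

```python
def plan_objective(d, plans, phi, T):
    """Evaluate objective (8) for fixed candidate plans.
    d: cell -> maximum reward d_c; phi(agent, cell, sector): per-interval
    efficacy, as in Eq. (3); plans: agent -> list of (sector, W) pairs, where
    sector is a frozenset of cells and W its service time in intervals."""
    for k, legs in plans.items():
        # Stand-in for constraints (14)-(16): total service time fits in T.
        assert sum(w for _, w in legs) <= T, f"plan of {k} exceeds horizon T"
    phi_total = {c: 0.0 for c in d}          # Phi_c, as in constraint (17)
    for k, legs in plans.items():
        for sector, w in legs:
            for c in sector:
                if c in phi_total:
                    phi_total[c] += phi(k, c, sector) * w
    # Optimal slack sigma_c under (18)-(19): the reward collected beyond d_c.
    excess = {c: max(phi_total[c] - d[c], 0.0) for c in d}
    return sum((d[c] - phi_total[c]) + excess[c] for c in d)
```

Note that at optimality the excess terms cancel any over-collection, so the objective equals the total residual demand left uncovered.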
4.2 Cooperation, proximity constraints, and energy consumption
The above formulation represents the basic model which we use to maximize the utility of joint agent
plans (i.e., minimize residual reward). Additionally, to include direct dependencies in space and
time among the plans of individual agents, we define the following time-indexed binary variables:
Y^T_{ikt} equals 1 if agent k is inside sector i at time t;

Y^T_{ckt} equals 1 if agent k is visiting cell c at time t;

Y_{ikjlt} equals 1 if agents k, l are exploring sectors i ∈ Γk and j ∈ Γl at time t.

The additional constraints which define the previous set of variables are:

Y^T_{ikt} = 1 ⇔ T_{ik} ≤ t ≤ T_{ik} + W_{ik}   ∀k ∈ A, i ∈ Γk, 1 ≤ t ≤ T   (22)

Y^T_{ckt} = Σ_{i∈θ(c)∩Γk} Y^T_{ikt}   ∀k ∈ A, c ∈ C, 1 ≤ t ≤ T   (23)

Y_{ikjlt} = 1 ⇔ Y^T_{ikt} = Y^T_{jlt} = 1   ∀k, l ∈ A, i ∈ Γk, j ∈ Γl, 1 ≤ t ≤ T   (24)
We consider the following aspects, to give the user the possibility of affecting the way plans are
computed so as to meet specific requirements:
4.2.1 Collaboration rewards
One extension to consider in the previous problem is the situation in which the collaboration between
agents improves their exploration capabilities. The formulation of the extension is as follows: given
a cell c, whenever the number of agents simultaneously performing exploration activities inside
sectors which involve c is greater than or equal to Ā, their exploration efficacies are increased by a
linear factor τ, which we refer to as the collaboration reward.
This simple extension of the SAR planning is included in the MILP model by means of additional
variables and constraints. Let us define the following variables:
Y^τ_{cikt} equals 1 if agent k is exploring cell c by visiting sector i and the number of agents exploring c is greater than or equal to Ā; 0 otherwise.
These variables are defined by the following constraints:

Y^τ_{cikt} = 1 ⇔ Σ_{k′∈A} Y^T_{ck′t} ≥ Ā and Y^T_{ikt} = 1   ∀c ∈ i, k ∈ A, 1 ≤ t ≤ T   (25)
The reward is included in the objective function by replacing (17) by:

Φc = Σ_{k∈A} Σ_{i∈θ(c)∩Γk} Σ_t Y^T_{ikt} (Y^τ_{cikt} τ ϕk(c, i) + ϕk(c, i))   ∀c ∈ C   (26)

Note that the model no longer utilizes the variables W to compute the rewards. Instead, the time-indexed variables Y^T are used, given that by (22) the following condition holds:

Σ_t Y^T_{ikt} = W_{ik}   (27)
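Indicator conditions such as (25) are not directly linear. One standard way to linearize them (a sketch in our notation, not necessarily the authors' exact encoding) writes, with z = Y^τ_{cikt}, y = Y^T_{ikt}, s = Σ_{k′∈A} Y^T_{ck′t}, and N = |A|:

```latex
z \le y, \qquad
\bar{A}\, z \le s, \qquad
z \ge y + \frac{s - \bar{A} + 1}{N} - 1, \qquad z, y \in \{0, 1\}.
```

The first two constraints force z = 0 unless the agent is present and at least Ā agents cover the cell; the third forces z = 1 whenever both conditions hold.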
4.2.2 Relations between groups of agents
Consider the scenario where a group of mobile base stations is deployed to relay data obtained by the agents to the mission command center, and unmanned aerial vehicles (UAVs) are used to collect aerial imagery, which is analyzed in real time at the command center to spot any signs of a lost target. To be able to transfer the bulk data from the UAVs to the base stations, the mission commander requests the planner to provide plans in which every UAV is within communication range of one of the mobile base stations. Another situation of interest involves the use of scent dogs. While a scent dog performs its activities, the presence of other agents (e.g., humans or other dogs) might interfere with the search. Hence, mission plans that keep each dog physically separated by a minimum distance from other agents are desirable. As noted, both situations represent examples of proximity constraints (i.e., minimum and maximum separation distances) between groups of agents. To enable the model to establish relations between groups of agents, we introduce the concept of agent groups, denoted as Ω ⊆ P(A). We formulate two types of proximity constraints. Let G1, G2 ∈ Ω: a minimum distance proximity constraint requires that one agent in G2 be located within a certain distance (Υmin(G1, G2)) from all agents in G1 during all mission steps. Conversely, a maximum distance proximity constraint requires that one agent in G2 be located at a distance greater than or equal to Υmax(G1, G2) from all agents in G1. This approach can be used to model different kinds of situations, given appropriate definitions of agent groups. For instance, minimum distance constraints are used to enforce network connectivity or to promote local cooperation (e.g., between humans and robots), while maximum distance constraints are used when physical proximity between some agents might have negative effects (e.g., task interference).
We introduce the following auxiliary variables:

ΘkGt is the minimum distance between agent k and all agents l ∈ G at time t.

Θ̂klGt is the difference between the distance from agent k to agent l ∈ G and the minimum distance from k to any agent in G.

ΨkGt is the maximum distance between agent k and all agents l ∈ G at time t.

Ψ̂klGt is the difference between the distance from agent k to agent l ∈ G and the maximum distance from k to any agent in G.
which are defined by the following constraints:

ΘkGt + Θ̂klGt = ψij Yikjlt    ∀l ∈ G, i ∈ Γk, j ∈ Γl, t    (28)

Θ̂klGt ≥ 0    ∀l ∈ G, i ∈ Γk, j ∈ Γl, t    (29)

ΨkGt − Ψ̂klGt = ψij Yikjlt    ∀l ∈ G, i ∈ Γk, j ∈ Γl, t    (30)

Ψ̂klGt ≥ 0    ∀l ∈ G, i ∈ Γk, j ∈ Γl, t    (31)

To complete the definition of ΘkGt and ΨkGt we must impose an additional constraint:

∃l ∈ G | Θ̂klGt = 0 and ∃l ∈ G | Ψ̂klGt = 0    (32)
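The quantities above can be computed directly from the pairwise distances, which makes their meaning (and the role of constraint (32)) easy to check. The following sketch is ours, with invented distances:

```python
# Sketch of the auxiliary quantities of Eqs. (28)-(32): Theta is the
# minimum distance from agent k to group G, Psi the maximum; the hatted
# residuals are per-agent slacks that are >= 0 and vanish for the closest
# (resp. farthest) member of G, as constraint (32) requires.

def group_distances(dist_to_group):
    """dist_to_group: dict agent l in G -> psi distance from k to l."""
    theta = min(dist_to_group.values())
    psi = max(dist_to_group.values())
    theta_hat = {l: d - theta for l, d in dist_to_group.items()}
    psi_hat = {l: psi - d for l, d in dist_to_group.items()}
    return theta, psi, theta_hat, psi_hat

theta, psi, th, ph = group_distances({"l1": 40.0, "l2": 90.0})
print(theta, psi)          # 40.0 90.0
print(th["l1"], ph["l2"])  # 0.0 0.0  (slacks vanish at the extremes)
```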
Combining constraints (24) with (28–32) we can easily model the group proximity constraints. Let Υmin(G1, G2) and Υmax(G1, G2) be the minimum and maximum separation distances required between agents belonging to group G1 and at least one agent of group G2. Figure (5) shows a representation of variables Θ and Ψ given the positions of two agent groups G1, G2 (depicted as circles and squares) at a given time step t.

ΨkG2t ≤ Υmin(G1, G2)    ∀k ∈ G1, 1 ≤ t ≤ T    (33)

Υmax(G1, G2) ≤ ΘkG2t    ∀k ∈ G1, 1 ≤ t ≤ T    (34)

4.2.3 Energy consumption by mobility and search
So far, we have not considered the energy aspects of the search and exploration activities. When agents are energy limited, mission planners might seek solutions which guarantee excellent exploration performance while taking into account the energy budget. The flexibility of our formulation allows us to introduce two types of energy consumption models. First, energy costs can be assigned to the movements of the agents by associating each edge (i, j) ∈ Ek with a cost. This factor can be used to represent the existence of paths between adjacent sectors that might ease the movement of an agent (i.e., the energy cost for moving between sectors a and b can be set high if there is no known path in the area between those sectors). Secondly, we can also associate costs with the visit of an agent to a sector, which can model the mobility efforts implied by the search inside a cell.

Figure 5: Representation of variables Θ and Ψ.

Let λk(i, j), for agent k ∈ A and edge (i, j) ∈ Ek, be the cost of moving from sector i to sector j. Let Λk be the energy budget of agent k ∈ A. Let λk(i) be the cost of agent k's exploration in sector i ∈ Γk. The total cost of a mission plan for an agent, in terms of energy consumption, is characterized by:
λ̄k = Σ_{(i,j) ∈ Ek} Xijk λk(i, j) + Σ_{i ∈ Γk} Wik λk(i)    ∀k ∈ A    (36)
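A small sketch (with invented edge and visit costs) of how this total evaluates the energy use of one agent's plan, to be checked against the agent's budget:

```python
# Sketch of Eq. (36): travel costs over traversed edges plus search costs
# weighted by the time W_ik spent in each visited sector. All values are
# illustrative, not taken from the report.

def plan_energy(moves, visits, move_cost, visit_cost):
    """moves: list of traversed edges (i, j); visits: dict sector -> W_ik."""
    travel = sum(move_cost[e] for e in moves)
    search = sum(w * visit_cost[i] for i, w in visits.items())
    return travel + search

move_cost = {("s0", "s1"): 5.0, ("s1", "s2"): 8.0}
visit_cost = {"s1": 1.5, "s2": 2.0}
used = plan_energy([("s0", "s1"), ("s1", "s2")],
                   {"s1": 2, "s2": 3}, move_cost, visit_cost)
print(used)          # 5 + 8 + 2*1.5 + 3*2.0 = 22.0
print(used <= 25.0)  # feasible under a hypothetical budget Lambda_k = 25
```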
λ̄k ≤ Λk    ∀k ∈ A    (37)

4.2.4 Search spreading
The use of heterogeneous teams does not exclude scenarios where a large portion of the agents share similar search characteristics. Under these circumstances, the basic formulation would lead similar agents to explore similar areas, resulting in trajectories close to each other, or even overlapping. While this might be desirable to intensify the search during later search stages, it might be an unwanted behavior at earlier stages, when a more widespread exploration of the whole area might be preferable. To give the user the ability to control this behavior, we include a linear spreading reward component, weighted by ws ∈ [0, 1], which can be set in proportion to the desired degree of spreading of the trajectories.
Π = min{ΠEXPL + ws ΠSPREAD}    (38)

where

ΠEXPL = ( Σ_{c ∈ C} (dc − Φc + ςc) ) / ( Σ_{c ∈ C} dc )    (40)

ΠSPREAD = ( |C| − Σ_{c ∈ C} Yc ) / |C|    (41)
Equation (40) corresponds to the normalized residual demand. Equation (41) represents the number of unexplored cells, normalized by the total number of cells. The parameter ws is the spreading weight. Higher values of ws lead to greater spreading of the agents, while smaller values favor the concentration of exploration in smaller areas (i.e., cells for which agents have higher exploration efficacy). Equation (38) becomes the new objective function of the model, replacing (8).
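The two objective terms are cheap to evaluate for any candidate solution. The following sketch does so for a hypothetical four-cell instance, assuming the residual-demand reading dc − Φc + ςc of Eq. (40); all numbers are illustrative:

```python
# Sketch of the objective of Eqs. (38), (40), (41): normalized residual
# demand plus a weighted fraction of unexplored cells.

def objective(d, phi, sigma, explored, w_s):
    """d, phi, sigma: per-cell demand, consumed exploration, and surplus;
    explored: per-cell 0/1 flags (Y_c); w_s: spreading weight."""
    pi_expl = sum(dc - pc + sc for dc, pc, sc in zip(d, phi, sigma)) / sum(d)
    pi_spread = (len(d) - sum(explored)) / len(d)
    return pi_expl + w_s * pi_spread

# Four cells of unit demand, two of them fully explored.
print(objective(d=[1, 1, 1, 1], phi=[1, 1, 0, 0], sigma=[0, 0, 0, 0],
                explored=[1, 1, 0, 0], w_s=0.5))  # 0.5 + 0.5*0.5 = 0.75
```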
4.3 Modeling approach for uncertain routes
So far, we have assumed that after computing a mission plan, agents strictly follow the issued commands, without any deviation from the assigned trajectories. However, this might not hold for agents with partially controllable mobility, or for agents able to make autonomous decisions that override their computed mission plan. We propose a simple and effective approach to handle these scenarios and compute robust solutions that guarantee a good use of all the mission assets even when the executed plans deviate due to the agents' partial controllability.
We denote by γk(i, j) ⊆ Γk, for (i, j) ∈ Ek, the subset of agent sectors to which the agent might deviate when moving from i to j. We require j ∈ γk(i, j) to cover the case where no deviation occurs. A new problem is formulated on top of the previous one by defining new sets of agent sectors Γγk = {γk(i, j) | (i, j) ∈ Ek}. New adjacency graphs are also defined as Gk = (Γγk, γ(Ek)), where:

γ(Ek) = {(γk(h, i), γk(i, j)) | (h, i), (i, j) ∈ Ek}    (43)

For h ∈ γk(i, j), let pkhij be the probability of deviating to sector h ∈ Γk when agent k departs from sector i towards sector j. Moreover, Σ_{h ∈ γk(i,j)} pkhij = 1.
The search efficacy ϕγk is redefined as follows:

ϕγk(c, l) = max_{h ∈ l} ϕk(c, h) pkhij    ∀l ∈ Γγk, c ∈ l    (44)
This new problem, defined on top of the previous one, accounts for the uncertainty in the agents' mobility. Figure (6) illustrates the proposed method: agents commanded to move from a to c are prone to mistake the order and arrive in sectors b or d with probability 0.25 each. In the new problem, sectors {b, c, d} and {e, f, g} are compacted into two sectors, γ(a, c) and γ(c, f) respectively.
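Using the Figure 6 deviation probabilities, the redefined efficacy of Eq. (44) can be sketched as below. The per-sector cell efficacies are invented for illustration, and we interpret the maximization as ranging over the candidate sectors h ∈ l:

```python
# Sketch of Eq. (44) on the Figure 6 data: the efficacy of the compacted
# sector gamma(a, c) = {b, c, d} is the best cell efficacy achievable over
# the sectors the agent may actually end up in, weighted by the deviation
# probabilities p_khij.

def uncertain_efficacy(cell_eff, deviation_prob):
    """cell_eff: sector h -> phi_k(cell, h); deviation_prob: h -> p_khij."""
    return max(cell_eff[h] * deviation_prob[h] for h in deviation_prob)

p = {"b": 0.25, "c": 0.50, "d": 0.25}   # moving from a towards c (Fig. 6)
eff = {"b": 0.9, "c": 0.8, "d": 0.4}    # hypothetical phi_k(cell, h)
print(uncertain_efficacy(eff, p))        # max(0.225, 0.4, 0.1) = 0.4
```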
We also propose another method to tackle uncertainty, which simply enlarges the agents' predefined sectors. Using an irregular coverage model, the system can set the search efficacy in the following way: for the set of cells which belong to the original sector, we assume that the agent will perform its activities with some (larger) probability. The same applies to the set of extended cells, but with a reduced probability. Given an assigned time, these probabilities translate into an expected amount of time that the agent will spend inside each of the cells. For the original cells, the search efficacy will be decreased, while for the extended cells, the system will consider a poor (but not zero) search efficacy. Figure (7) shows how the model works. In this example, the original sector consisted of four cells, for which the performance model defined full search efficacy (i.e., 1.0). After applying the model, the sector was extended to 16 cells. Here, the system considered that the agent would visit the original cells with probability 0.8, and the extended cells with probability 0.2. The computation of the performance model using the expected amount of time leads to the indicated search efficacies. We note that the new set of sectors is only used for planning purposes. For mission execution, agents are still commanded to perform their activities in the original sectors.

Although these approaches are heuristic strategies, they can provide robustness to the computed solutions. A detailed study of them remains for future work.
5 SIMULATION OF MISSION PLANS

In order to study the effectiveness of the mission plans computed according to the model of the previous section, or issued by hand by the user, we developed a discrete-time simulator for SAR scenarios. Each agent is modeled as a single sensor moving through the environment according to the given
Figure 6: Modelling uncertain routes: γ(a, c) = {b, c, d}, γ(c, f) = {e, f, g}, with pbac = pdac = 0.25, pcac = 0.50, pecf = pgcf = 0.25, pfcf = 0.50.
Figure 7: Tackling uncertainty by enlarging sectors.
Var        Description                                                                Type
Xijk       Whether agent k traverses edge (i, j) ∈ Ek                                 Binary
Yik        Whether agent k visits location i ∈ Γk                                     Binary
Φc         Total amount of exploration consumed at cell c ∈ C                         Real
ςc         Surplus of exploration consumed at cell c ∈ C                              Real
Tik        Start time of the exploration activity of agent k at location i ∈ Γk      Integer
Wik        Duration of the exploration activity done by agent k at location i ∈ Γk   Integer
Y^T_ikt    Whether agent k is exploring location i at time t                          Binary
Y^T_ckt    Whether agent k is exploring cell c at time t                              Binary
Y^τ_cikt   Whether agent k is exploring cell c by visiting location i and the
           number of agents exploring c at time t is greater or equal to Ā            Binary
Yikjlt     Whether agents k, l are exploring locations i ∈ Γk and j ∈ Γl at time t    Binary
ΘkGt       Minimum distance between agent k and agents in G at time t                 Real
Θ̂klGt      Difference between the distance from agent k to agent l ∈ G and the
           minimum distance from k to any agent in G                                  Real
ΨkGt       Maximum distance between agent k and agents in G at time t                 Real
Ψ̂klGt      Difference between the distance from agent k to agent l ∈ G and the
           maximum distance from k to any agent in G                                  Real
λ̄k         Energy cost associated to the schedule of agent k ∈ A                      Real

Figure 8: MILP model variables
Term             Description                                                  Type
C                Area cells                                                   Set
A                Agents                                                       Set
Γk               Sectors for agent k ∈ A                                      Set
Γ                Agent sectors                                                Set
θ(c)             Agent sectors that include cell c ∈ C                        ⊆ Γ
Gk               Adjacency graph for agent k                                  Graph
Ek               Edge set of adjacency graph Gk                               ⊆ (Γk)²
dc               Exploration demand for cell c ∈ C                            Real
ϕk(c, L)         Search efficacy for agent k ∈ A, L ∈ Γk, and c ∈ C           Real
Sk               Initial location of agent k                                  ∈ Γk
ψij              Distance between sectors i, j ∈ Γ                            Real
T                Maximum time                                                 Integer
Ω                Agent groups                                                 ⊆ P(A)
Υmin(G1, G2)     Minimum distance required between agent groups G1, G2 ∈ Ω    Real
Υmax(G1, G2)     Maximum distance allowable between agent groups G1, G2 ∈ Ω   Real
λk(i, j)         Energy cost of movement of agent k from sector i to j        Real
λk(i)            Energy cost of visit of agent k in sector i ∈ Γk             Real
Λk               Energy budget of agent k ∈ A                                 Real
ws               Spreading weight in objective function                       Real

Figure 9: Model parameters
plans. Its sensor detection range and mobility pattern can change over time and space, depending on the current location in the search area and the related GIS data. The effect of agents' cooperation is also simulated by increasing their detection capabilities when cooperation occurs (e.g., when group behavior is enforced by the plan).

An important aspect of our scenarios is the movement of the agents in the search area. In fact, depending on their cognitive and motor characteristics, the agents are not expected to precisely follow the given plans in space and time (e.g., a ground robot might encounter unexpected obstacles, or be attracted by some detected cues). Therefore, the simulator includes search mobility models for the different types of simulated agents. Actual agent trajectories result from the combination of the given plans, the characteristics of the area the agent is moving through, and its inherent controllability (e.g., a flying robot is expected to follow the given plan more faithfully than a scent dog directed by its master). A general description of the mechanisms used to simulate agent mobility is provided below.
5.1 Mobility models
The model employed to simulate agent mobility patterns is based on the stochastic vector mobility model presented in [6]. Differently from standard stochastic mobility models, the trajectories performed by an agent during the search have directionality, given by the mission waypoints, and purpose, determined by area coverage objectives and cue detection. The proposed model encompasses both these aspects: decisions about where to move are locally affected by mission waypoints and by randomly defined intermediate points of interest, which simulate wandering around the assigned areas looking for hints, or being detoured by obstacles or other events (e.g., the presence of strong wind in the case of aerial robots). These two mobility biases are combined as vectors of different intensity and direction to produce the resulting local move. Their relative impact is weighted according to the characteristics of the agent and of the environment (e.g., for scent dogs, we also simulate the presence of wind, which strongly biases dog behavior by carrying target information).
5.1.1 Destination Targeted Vector Mobility Model
In [6], the authors presented a stochastic vector mobility model, named the Destination Targeted Vector Mobility Model (DTVMM), in which the resulting trajectories have navigational characteristics. Given a vector constructed from an origin and a destination point, the main procedure of the DTVMM consists of determining a sequence of mobility vectors such that the summation of all these vectors is close to the destination vector. This guarantees convergence towards the destination point.

The standard DTVMM is formulated as follows. Let θD be the angle of the destination vector defined from the origin to the destination point. We assume a discretization of time consisting of intervals of equal length ∆t. The mobility vector at the ith discrete interval is denoted vi and determined by a size ri and an angle θi:

vi = (vix, viy) = (ri cos(θi), ri sin(θi))    (45)

The value of ri = vi ∆t at the ith time interval determines the distance of the movement and clearly depends on the agent's speed at that moment (vi). We make the following assumptions
about the agent’s velocity:
(a) its value increases or decreases gradually,
(b) it is independent at each discrete time interval, and
(c) ∆t is small enough that the velocity decreases or increases only by a small variation.
The variation of vi is modeled as a discrete Markov chain, where each state corresponds to a possible agent speed: V = {V0, V1, . . . , VM = V̂}, with V0 = 0 and V̂ the agent's maximum speed.
The Markov chain transitions are described as follows:

P(Vi, Vi) = Psame / (Pinc + Psame)    if i = 0
            Psame / (Pdec + Psame)    if i = M
            Psame                     otherwise

P(Vi, Vi+1) = Pinc / (Pinc + Psame)   if i = 0
              Pinc                    otherwise

P(Vi, Vi−1) = Pdec / (Pdec + Psame)   if i = M
              Pdec                    otherwise

where Pdec is the probability of decreasing the speed, Pinc the probability of increasing it, and Psame the probability of no variation of the speed.
The second element of each mobility vector is its angle θi, obtained using the following formula:

θi = θD + φi    (46)

where φi is a normal random variable N(0, σφ).

The DTVMM terminates when the projected summation of the generated mobility vectors equals or exceeds the length of the destination vector.
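The procedure above can be sketched as follows. This is a simplified reading of the DTVMM, not the reference implementation: the boundary renormalization of the speed chain is approximated by rejecting impossible transitions, the intermediate speed level is invented, and the remaining parameter values follow the Figure 10 example:

```python
# Minimal DTVMM sketch: speed evolves on a Markov chain over
# V = {V_0, ..., V_M}, the heading is theta_D plus Gaussian noise
# (Eq. 46), and the walk stops once the displacement projected on the
# destination vector reaches the destination distance.

import math
import random

def dtvmm(origin, dest, speeds, p_inc, p_dec, sigma_phi,
          dt=1.0, max_steps=10000):
    ox, oy = origin
    dx, dy = dest[0] - ox, dest[1] - oy
    dist = math.hypot(dx, dy)
    theta_d = math.atan2(dy, dx)
    path, state = [(ox, oy)], 0
    x, y = ox, oy
    for _ in range(max_steps):
        u = random.random()           # speed Markov chain update
        if u < p_inc and state < len(speeds) - 1:
            state += 1
        elif u < p_inc + p_dec and state > 0:
            state -= 1
        r = speeds[state] * dt
        theta = theta_d + random.gauss(0.0, sigma_phi)   # Eq. (46)
        x, y = x + r * math.cos(theta), y + r * math.sin(theta)
        path.append((x, y))
        # Termination: projection on the destination vector reached.
        if ((x - ox) * dx + (y - oy) * dy) / dist >= dist:
            break
    return path

random.seed(1)
track = dtvmm((0, 0), (50, 0), speeds=[0.0, 0.6, 1.2],
              p_inc=0.4, p_dec=0.4, sigma_phi=math.radians(25))
print(track[-1][0] >= 45.0)  # the walk converged near x = 50
```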
DTVMM with distraction points An important extension we propose to the basic DTVMM, meant to reproduce the trajectories performed by rescuers during SAR missions, is the inclusion of distraction points. A distraction point is simply a location in the vicinity of the agent that temporarily draws away its attention, thereby becoming the new destination point of the DTVMM and replacing the previous one. Distraction points may appear, with different probabilities, either before or after the agent enters the planned location.
Effect of local terrain characteristics on mobility Possible effects of the terrain on agent movements are reflected in the mobility model through simple rules that govern the speed and the degree of coverage of the DTVMM. We simply take the definition of the walking efficiency presented in Section 3.2.3 to regulate the agent speed. The probability of increasing the speed while moving inside a particular cell is set equal to the agent's walking efficiency in that cell. Hence, agents move at their maximum search speed (i.e., the optimal search speed) whenever the local area conditions are good. When the terrain conditions are not suitable for an agent, it moves at the lowest speed. A random noise is added to the walking efficiency to account for inaccuracies of the agent profile models.
Summarizing the above, the procedure of the ground search mobility model is as follows. Let P = {(L0, t0), . . . , (LN, tN)} be the mission plan, suggesting that the agent perform search activities at location Li during the time interval (starti, endi), with starti = Σ_{j<i} tj and endi = starti + ti. Let WP = {WP(L0), . . . , WP(LN)} be the set of location waypoints for the corresponding plan. In the simulation, each waypoint WP(Lj) becomes the agent's destination point at time startj. At that point in time, the agent might not yet be located at the corresponding location Lj; in such a case, the mobility model directs the agent towards Lj.
At any time, distractions may appear, replacing the current destination point and restarting the DTVMM (i.e., resetting the destination vector). The positions of the distractions are determined through a random point selection inside a disc of radius rd centered at the agent's current position, modeling situations where the agent is attracted to a nearby place because it has perceived something. At this point, many different strategies can be implemented, for instance, to simulate agent coordination in the field, as proposed in Section (5.4). The DTVMM maintains the distraction point as the current destination until the agent has arrived (determined by the standard DTVMM termination procedure), or until the time assigned to explore the current sector expires. In both cases, the new destination point is set to the waypoint of the current sector.
Figure (10) shows the agent trajectories obtained using the proposed exploration mobility model for a mission plan consisting of 5 100m×100m square sectors. In this case, the waypoints are simply the centers of the areas. The mission plan assigned 5 minutes to explore each sector, with the exception of L3, which was assigned 10 minutes. The parameters of the mobility model were set to σφ = 25◦, V̂ = 1.2m/s, Pinc = Pdec = 0.4 and Psame = 0.2. In the figure we can appreciate how the movement of the agent is generally directed by the waypoints of the mission plan (circles), and, at the same time, by the intermediate distraction points (depicted as squares). The figure on the right shows the trajectory using a larger angle variation, σφ = 40◦. Given the higher variability of the direction angle, the trajectory is more erratic and, in this scenario, the agent failed to reach the last assigned sector.
Figure 10: Simulated trajectories for a plan through five sectors.
5.2 Scent attraction
In the simulation process, we also model the effect of the target's scent on the behavior of the dog agents. We use a software system [23] that computes a surface wind field using computational fluid dynamics, taking into account the topographical characteristics. The software imports digital elevation model (DEM) files and solves the Navier-Stokes equations to determine the flow speed and direction everywhere within the domain. The results of this set of calculations
Figure 11: Selection of position of odor particles based on direction angle. Each interval corresponds to one neighbor position.
are in the form of a discrete vector field, which is given as input to a simple method that estimates the probability of scent detection within the area. This estimation method consists of modeling the scent propagation from the target's location. The output is a mapping from each point in the area to a probability of detecting the target's scent. This approach is applied offline, prior to the SAR simulation, and its output is used by the mobility models to simulate the influence of the scent on the behavior of the dog agents.
Previous works on odor source localization using multi-robot systems have proposed simulation approaches for odor propagation that take the wind flow into account. Our odor propagation model is inspired by the plume-based model proposed by Lochmatter et al. [22]. We substitute the molecular dispersion and filament concentration models with a simple estimate of the probability of odor detection based entirely on the distance from the source.
The method is described as follows. Let P be a set of points in the area determined by a uniform grid. We select the point ps ∈ P closest to the source location. From this point, we emit a large number of particles which disperse throughout the area. Each particle starts with a probability of detection (POD) equal to one. This POD decreases linearly with the distance traveled by the particle, until reaching the maximum propagation distance dmax.
The dispersion is modeled stepwise: at every step t, the model determines the next position pi,t+1 ∈ P of a particle based on the wind direction at the current position. The propagation angle αi,t is computed as follows:

αi,t = w(pi,t) + vp    (47)

where the stochastic component vp is a Gaussian random variable N(0, σ), with σ = 20◦.

Using αi,t, we select the next position from the set of neighbors, using angle intervals, as shown in Figure 11. Together with this, the POD of the particle is decreased:

POD(i, t) = POD(i, t − 1) − |pi,t − pi,t−1| / dmax    (48)
After performing the procedure for a large number of particles, we compute, for each position in P, the mean POD over all particles that visited that location. These mean values are the output of the scent propagation method and represent the estimated probabilities of scent detection by the dog agents. Figure 12 shows the results of applying the procedure to the DEM in Figure 2a.
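The particle procedure can be sketched as below, under a uniform wind field (the report uses the CFD-derived field instead); the angle-interval neighbor selection of Figure 11 is approximated here by rounding the direction components, and all parameter values are illustrative:

```python
# Sketch of the scent propagation of Eqs. (47)-(48): each particle starts
# with POD = 1, is advected one grid cell per step along the local wind
# direction plus Gaussian noise, and loses POD linearly with travelled
# distance. The per-cell output is the mean POD of visiting particles.

import math
import random

def propagate(source, wind_angle, d_max, n_particles=2000,
              sigma=math.radians(20)):
    pods = {}   # grid point -> PODs of particles that visited it
    for _ in range(n_particles):
        x, y, pod = source[0], source[1], 1.0
        while pod > 0.0:
            alpha = wind_angle + random.gauss(0.0, sigma)   # Eq. (47)
            nx = x + round(math.cos(alpha))   # 8-neighbour selection,
            ny = y + round(math.sin(alpha))   # approximating Figure 11
            pod -= math.hypot(nx - x, ny - y) / d_max       # Eq. (48)
            x, y = nx, ny
            if pod > 0.0:
                pods.setdefault((x, y), []).append(pod)
    return {p: sum(v) / len(v) for p, v in pods.items()}

random.seed(0)
field = propagate(source=(0, 0), wind_angle=0.0, d_max=10.0)
downwind = field.get((3, 0), 0.0)
upwind = field.get((-3, 0), 0.0)
print(downwind > upwind)  # scent concentrates downwind of the source
```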
Integrating scent attraction into the mobility model Scent attraction is fitted directly into the exploration mobility model, in the procedure that generates the locations of the distraction points.

Figure 12: Scent propagation procedure.

As mentioned before, distraction points are randomly placed around the agent within a certain distance. Under the effect of scent attraction, distraction points occur in the direction opposite to the wind, directing the agent towards the source of the odor.
Figure 13 shows how agent mobility is affected by the activation of the scent attraction in the simulator.
(a) without scent attraction
(b) with scent attraction
Figure 13: Effect of scent attraction on mobility
5.3 Target detection models
In the proposed simulation environment, target detection is modeled considering the local terrain characteristics around the target location, the agent's sensor range, and the exploration time spent in the vicinity of the target. Our approach attempts to accommodate the continuous nature of space and time within the discrete simulation procedure. To achieve this, we first define, for each agent, a circular region around the target. The sizes of these areas depend on the local terrain characteristics and the agents' sensor ranges. To account for the time factor in target detection, every time the agent exits its associated area, a Bernoulli trial determines whether the target is detected. The probability of the Bernoulli random variable is given by a function fk : N → [0, 1] of the amount of time agent k has remained inside the detection area.
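A sketch of this mechanism follows. The saturating form chosen for fk is our assumption, since the text only requires fk : N → [0, 1]; the saturation time t_sat is likewise invented:

```python
# Sketch of the exit-triggered Bernoulli detection trial of Section 5.3.

import random

def detection_prob(time_inside, t_sat=60):
    """f_k: detection probability after 'time_inside' steps in the disc
    (assumed to grow linearly up to saturation)."""
    return min(1.0, time_inside / t_sat)

def detected_on_exit(time_inside, rng):
    """One Bernoulli trial, fired when the agent leaves the disc."""
    return rng.random() < detection_prob(time_inside)

rng = random.Random(42)
hits = sum(detected_on_exit(30, rng) for _ in range(10000))
print(abs(hits / 10000 - 0.5) < 0.05)  # empirical rate near f_k(30) = 0.5
```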
5.4 Cooperation models
Cooperation between ground searchers occurs when two or more agents are assigned to nearby areas, and it is reflected through their mobility models. Every time a group of agents is located within a certain distance of each other, the simulator enables a cooperative behavior which encourages them to search without overlapping efforts, hence improving the area coverage. This is achieved by enforcing the locations of the distraction points to be separated by a certain distance. In the case of aerial agents, this could also be achieved by generating complementary aerial search patterns.
6 EXAMPLE APPLICATION
In this section we present an application of the MSS to a mountain search and rescue mission. In particular, we consider an emergency response scenario where a team of human rescuers, air-scent dogs, and autonomous aerial robotic agents work together with the goal of finding a missing person within a given area. The illustrative example considers an area of 1km × 1km in a mountain region in the Swiss Canton of Ticino (coordinates WGS 84: N 46◦ 26.984 E 9◦ 01.592). We identify several aspects of the problem and address them with simple modeling approaches using the proposed framework. In the simulation of the computed mission plans, we do not consider target detection; therefore, we do not make use of the scent propagation, cooperation, and detection models described in Section (5). The mobility of human and air-scent dog agents is simulated using the model described in Section 5.1. In the case of UAV agents, we decided to simulate a sweeping-pattern trajectory inside the assigned sectors. Example simulated trajectories for each agent, corresponding to provided mission plans, are shown in Figures (14), (15), and (16).
(a) Provided mission plan
(b) Simulated trajectory
Figure 14: Mobility model of air-scent dog agents
(a) Provided mission plan
(b) Simulated trajectory
Figure 15: Mobility model of human agents
6.1 Model parameters
The modeling starts with the definition of the search area partition, which determines the set of cells C. For simplicity, we adopt a uniform grid partitioning of square cells of ∆c × ∆c, where ∆c = 50m. Therefore, the uniform partition consists of |C| = 400 cells. The mission time step interval is set to ∆t = 300 seconds (5 minutes). For simplicity, in this example we do not account for energy costs related to the activities or to the movement across sectors.
(a) DEM
(b) Vegetation density
(c) Satellite image of area and surroundings
Figure 17: GIS data for example scenario
6.2 Agent performance models and sectors

6.2.1 Human rescuers
We define the sectors for human rescuer agents using a uniform grid of sectors of 100m×100m. We consider a human rescuer as an agent with uniform sensor coverage, and use Eq. (3) to define the search efficacy. The parameters for the uniform sensor coverage are:

rHuman = 50.0 m    (49)

sHuman = 1.0 m/s    (50)
The definition of the walking efficiency (we) function is based on [9]:

we(c) = f^h_H,w(ph) f^v_H,w(pv) f^s_H,w(ps) f^r_H,w(pr) f^l_H,w(pl)    (51)

for ph, pv, ps, pr, pl ∈ p∗(c)    (52)

where:

f^h_H,w(ph) =
    1.0      if ph < 1000m
    0.975    if 1000m ≤ ph < 2000m
    0.95     if 2000m ≤ ph < 3000m
    0.90     if 3000m ≤ ph < 4000m
    0.85     if 4000m ≤ ph < 5000m
    0.75     if 5000m ≤ ph < 6000m
    0.65     if 6000m ≤ ph < 7000m
    0.4      otherwise    (53)

f^v_H,w(pv) = 0.85 + (3 − pv) ∗ 0.05    (54)

f^s_H,w(ps) =
    1.0    if ps < 5◦
    0.8    if 5◦ ≤ ps < 10◦
    0.6    if 10◦ ≤ ps < 15◦
    0.4    if 15◦ ≤ ps < 25◦
    0.2    if 25◦ ≤ ps < 40◦
    0.1    if 40◦ ≤ ps    (55)

f^r_H,w(pr) =
    1.0     if pr < 161
    0.9     if 161 ≤ pr < 497
    0.85    if 497 ≤ pr    (56)

f^l_H,w(pl) = 0.8 + pl ∗ 0.2    (57)
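The factor tables above can be implemented directly. The following sketch transcribes them; the attribute encodings (the 0–3 vegetation class and the [0, 1] light intensity) are inferred from the linear factor formulas:

```python
# Sketch of the human walking-efficiency model: a product of per-attribute
# factors. Attributes: ph altitude (m), pv vegetation density class (0-3),
# ps slope (degrees), pr terrain roughness, pl light intensity in [0, 1].

def step_table(value, breakpoints, scores):
    """Return scores[i] for the first breakpoint exceeding value."""
    for b, s in zip(breakpoints, scores):
        if value < b:
            return s
    return scores[-1]

def f_altitude(ph):
    return step_table(ph, [1000, 2000, 3000, 4000, 5000, 6000, 7000],
                      [1.0, 0.975, 0.95, 0.90, 0.85, 0.75, 0.65, 0.4])

def f_slope(ps):
    return step_table(ps, [5, 10, 15, 25, 40],
                      [1.0, 0.8, 0.6, 0.4, 0.2, 0.1])

def f_roughness(pr):
    return step_table(pr, [161, 497], [1.0, 0.9, 0.85])

def walking_efficiency(ph, pv, ps, pr, pl):
    f_veg = 0.85 + (3 - pv) * 0.05     # vegetation factor
    f_light = 0.8 + pl * 0.2           # light factor, Eq. (57)
    return (f_altitude(ph) * f_veg * f_slope(ps)
            * f_roughness(pr) * f_light)

# Low altitude, no vegetation, flat, smooth, full daylight: near-optimal.
print(walking_efficiency(500, 0, 2, 100, 1.0))  # ≈ 1.0
```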
The sensor efficacy is intuitively defined as a function of the natural light intensity and the vegetation density:

re(L) = f^l_H,r(pl) f^v_H,r(pv)    for pv, pl ∈ p∗(c)    (58)

f^l_H,r(pl) = 0.5 + pl ∗ 0.5    (59)

f^v_H,r(pv) = 0.4 + (3 − pv) ∗ 0.2    (60)

6.2.2 Unmanned aerial vehicles
We assume that the UAVs are equipped with high-resolution cameras in order to obtain high-quality images of the field, which can be used to determine the presence of the target. The sectors of the UAVs are determined using a uniform grid of square areas of 250m×250m. We also assume that the UAVs fly at a constant altitude. However, different flight altitudes can easily be considered by defining sectors of different sizes. Moreover, energy costs between sectors can be used to represent the cost of changing altitude.

We consider the UAVs as agents with uniform sensor coverage. However, we do not directly apply the definition of optimal coverage in terms of optimal speed and sensor range, due to the fact that aerial platforms might not be continuously observing the environment with their sensors (e.g., aerial photographs are taken periodically at a certain rate). Another issue might be the overlapping of images, required by some algorithms in order to reconstruct geo-referenced orthomosaics. Therefore, we assume that the coverage rate of the UAV can be estimated by a value that simply represents the amount of area that can be effectively covered over time (in m²/s). This value multiplied by time gives an estimate of the surface covered by the UAV. In our scenario, we define the optimal area coverage for a UAV agent as:

CUAV(t) = 150 m²/s ∗ t    (61)

This value corresponds to UAVs able to obtain high-resolution imagery data of an area of approximately 500m × 500m during a mission interval (300 seconds).
Although the walking efficiency of the UAVs is optimal in every part of the field, because it is not affected by the characteristics of the terrain (i.e., we(c) = 1.0 ∀c ∈ C), their sensor efficacy critically depends on the quality of the images, which in turn depends on the visibility of the terrain, mainly determined by the vegetation and the light intensity. We use a simple formulation for the sensor efficacy, defined in terms of the natural light intensity and the vegetation density:

re(c) = f^l_U,r(pl) f^v_U,r(pv)    (62)

for pl, pv ∈ p∗(c)    (63)

f^l_U,r(pl) = 0.25 + pl ∗ 0.75    (64)

f^v_U,r(pv) = 0.25 + (3 − pv) ∗ 0.25    (65)

6.2.3 Air-scent search dogs
Due to their olfactory sensory capabilities, air-scent search dogs should be considered searcher
agents with irregular sensor coverage. Using the proposed formulation, their particular search
characteristics can be taken into consideration by determining which cells are exposed to the dog’s
olfactory senses. We propose a simple method to determine this set of cells assuming the wind
direction can be estimated a priori.
First, let us define the set of agent sectors Γk in the same way as for the human searchers. For each
sector, we define two related sets of cells. The first set corresponds to the cells lying inside the
square area (i.e., the sector). These cells have uniform exploration efficacy because we consider that
the agent moves through them, and they can therefore be considered under both visual and olfactory
inspection. A second set of cells is also considered affected by the agent's activities, although
these cells are not located inside the sector. These cells are explored only by the olfactory sense,
and their exploration level decays with the distance from the sector center point. This indirectly
explored set of cells is determined by the definition of a triangular sector with one vertex at the sector
center point, as shown in Figure (18). The model parameters (i.e., the angle α and the distance r) are given by
the user as input, and may be defined using expert knowledge or field experiments.
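One possible way to classify the indirectly explored cells is sketched below: a cell belongs to the olfactory sector if its center lies within distance r of the sector center and within ±α/2 of the wind direction. The cone-aligned-with-the-wind geometry is our reading of Figure (18); the function name and the wind-direction convention are illustrative only.

```python
# Sketch: classify a cell as indirectly explored (inside the triangular
# olfactory sector of Figure 18). Geometry and naming are assumptions.

import math


def in_scent_sector(center, cell, wind_angle, alpha, r):
    """True if `cell` is indirectly explored from sector `center`.

    center, cell : (x, y) coordinates of the sector and cell centers
    wind_angle   : wind direction in radians
    alpha        : full opening angle of the triangular sector (radians)
    r            : maximum scent range (meters)
    """
    dx, dy = cell[0] - center[0], cell[1] - center[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0 or dist > r:
        return False  # the sector center itself counts as directly explored
    # Smallest signed deviation between the cell bearing and the wind
    deviation = (math.atan2(dy, dx) - wind_angle + math.pi) % (2 * math.pi) - math.pi
    return abs(deviation) <= alpha / 2.0
```

For instance, with the wind blowing along the +x axis and α = 45°, a cell 10 m downwind is classified as indirectly explored, while a cell 10 m off to the side is not.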
We consider two definitions of the exploration efficacy of a location L and cell c ∈ L, depending
on whether the cell is directly or indirectly explored. Let L_D ⊆ L and L_I ⊆ L be the sets of directly
and indirectly explored cells, respectively. The former are considered as uniform coverage, so we
use a formulation similar to that of the human agents, based on the optimal coverage with the following
parameters:

s_dog = 1.5 m/s        (66)
r_dog = 75.0 m         (67)
Figure 18: Scent model, parameterized by the angle α and the distance r. Four cells are directly explored while 17 are indirectly explored by the olfactory sense.
The walking efficiency of the dogs is considered in the same manner as for the human searcher. The sensor
efficacy is similar, giving more weight to the vegetation density and less to the light factor:
r_e(L) = f^l_{D,r}(p_l) · f^v_{D,r}(p_v)        (68)
f^l_{D,r}(p_l) = 0.7 + p_l · 0.3                 (69)
f^v_{D,r}(p_v) = 0.1 + (3 − p_v) · 0.3           (70)

In the second case, we use the following formula:

φ_k(L, c) = w_e(L) · r_e(L) · r / (|L − c| + r),    c ∈ L_I        (71)
to model the scent propagation using the inverse of the distance, where |L − c| is the distance
between the center points of sector L and cell c.
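The decay of Eq. (71) is straightforward to evaluate; a minimal sketch, where the walking efficiency w_e(L) and sensor efficacy r_e(L) would come from the terrain model (Eqs. 68–70) but are passed in here as plain numbers:

```python
# Sketch of the indirect exploration efficacy of Eq. (71). The default
# range r = 75 m matches r_dog from Eq. (67).

def phi_indirect(w_e: float, r_e: float, dist: float, r: float = 75.0) -> float:
    """phi_k(L, c) = w_e(L) * r_e(L) * r / (|L - c| + r)."""
    return w_e * r_e * r / (dist + r)


# The decay halves the efficacy one range away and tends to 0 with distance
print(phi_indirect(w_e=1.0, r_e=1.0, dist=0.0))   # 1.0 at the sector center
print(phi_indirect(w_e=1.0, r_e=1.0, dist=75.0))  # 0.5 at dist == r
```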
Figure 19: Search performance models for the agent types considered: (a) human, (b) air-scent dog (without scent model), (c) UAV.
6.3   Mission planning
To demonstrate the functionality of the planning framework, we computed mission plans for a
mixed team consisting of one, two, and three agents of each type (i.e., three, six, and nine searchers,
respectively). The duration of the plans is 20 mission intervals, which corresponds to 20 × ∆t = 6000
seconds, or 100 minutes. To solve the MILP we used the CPLEX® solver. All agents were initially
located at the bottom left corner of the search area. The initial values of the coverage map were
set to Cm(c) = 1 ∀c ∈ C.
6.3.1   Spreading component
Figures (20), (21), and (22) show the resulting coverage maps after simulation of the computed
mission plans. To display the Cm values, we also use a cyan-red color ramp, where cyan corresponds
to fully explored cells (i.e., Cm(c) = 0) and red to unexplored cells (Cm(c) = 1.0). We use
the standard deviation of the coverage map values, σ, as a measure of the uniformity/spreading of the
exploration (higher σ values mean lower spread). The total residual coverage after the
execution of the mission, |Cm| = Σ_{c∈C} Cm(c), is used to evaluate the overall plan performance. As
expected, higher spreading results in a greater number of explored cells and a more uniform search,
although this exploration strategy does not necessarily produce better results. In fact, we can
observe that enforcing even a slightly higher spreading generally decreases the overall performance.
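Both measures are simple functions of the coverage map; a minimal sketch, treating the map as a flat list of per-cell values (whether the report uses the population or the sample standard deviation is not stated, so the population form below is an assumption):

```python
# Sketch of the two plan-performance measures: the spreading indicator
# sigma (standard deviation of the coverage map values) and the total
# residual coverage |Cm| (sum over all cells).

import statistics


def plan_metrics(coverage_map):
    """Return (sigma, residual) for a list of Cm(c) values in [0, 1]."""
    sigma = statistics.pstdev(coverage_map)  # population std. deviation
    residual = sum(coverage_map)             # |Cm| = sum_{c in C} Cm(c)
    return sigma, residual


# Tiny example: two fully explored cells, two untouched ones
sigma, residual = plan_metrics([0.0, 0.0, 1.0, 1.0])
print(sigma, residual)  # 0.5 2.0
```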
Figure 20: Mission performance: effect of spreading (3 agents). (a) No spread: σ = 0.40, |Cm| = 272.35; (b) Low spread: σ = 0.38, |Cm| = 280.90; (c) High spread: σ = 0.36, |Cm| = 286.15.
Figure 21: Mission performance: effect of spreading (6 agents). (a) No spread: σ = 0.42, |Cm| = 229.05; (b) Low spread: σ = 0.40, |Cm| = 236.40; (c) High spread: σ = 0.38, |Cm| = 239.81.
7   CONCLUSIONS AND ONGOING WORK
We presented an MSS for wilderness search and rescue with heterogeneous teams. The system
features the combination of GIS-based environment modeling, the automatic definition of the agents'
search efficacy, an optimization-based mission planner, and a mission simulator. Altogether, these
characteristics make the proposed MSS a powerful integrated tool for managing emergency response
scenarios. Ongoing and future work includes the proposal of different centralized and decentralized
mission planning strategies, an in-depth study of the model features of the air-scenting agents,
and the handling of uncertainty in plan computations.
Figure 22: Mission performance: effect of spreading (9 agents). (a) No spread: σ = 0.40, |Cm| = 199.5; (b) Low spread: σ = 0.39, |Cm| = 199.7; (c) High spread: σ = 0.38, |Cm| = 207.2.
7.1   Planning strategies
We are studying strategies for both centralized and decentralized planning. All methods make use
of the MILP optimization model presented in Section (4) to compute mission plans, but they differ in
the way they manage the planning process.
7.2   One-shot planning
Assuming that the mission has a maximum allowed duration Tmax, one-shot planning simply computes
a single plan for each agent spanning the entire duration of the mission. However, this simple
approach has several drawbacks in dynamic situations, since it cannot accommodate variations in
the execution of the plans. Moreover, given the complexity of the optimization problem, it also
suffers from computational problems. This method corresponds to the one used to compute the plans
in the example application of Section (6.3), in which Tmax was defined to be 20 mission intervals.
7.3   Constant Time Window Planning
Constant Time Window Planning (CTW-P) consists of sequentially computing mission plans for
fixed-length short time windows. In this way, the MSS can include newly gathered updates to the
coverage map in the subsequent planning stages, which is a clear advantage with respect to one-shot
planning. However, the short length of the time window used by the planner might induce a myopic
behavior in the planning process, leading to poor mission performance. The study of this trade-off
is part of ongoing and future work.
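The CTW-P loop can be sketched as follows. The planner and executor are passed in as callables: `plan_window` stands in for the MILP planner of Section (4) restricted to a short horizon, and `execute_window` for one window of mission execution plus the newly gathered coverage-map updates. Both are hypothetical placeholders, not part of the actual MSS interface.

```python
# Sketch of Constant Time Window Planning: plan, execute, and re-plan on
# fixed-length windows until the mission deadline t_max is reached.

def ctw_planning(coverage_map, plan_window, execute_window, t_max, window):
    """Sequentially compute and execute plans for fixed-length windows."""
    t = 0
    while t < t_max:
        horizon = min(window, t_max - t)                  # last window may be shorter
        plan = plan_window(coverage_map, horizon)         # plan next window only
        coverage_map = execute_window(plan, coverage_map) # fold in fresh updates
        t += horizon
    return coverage_map


# Dummy planner/executor: each executed window halves every cell's residual
dummy_plan = lambda cm, h: h
dummy_exec = lambda plan, cm: [c / 2.0 for c in cm]
print(ctw_planning([1.0, 1.0], dummy_plan, dummy_exec, t_max=20, window=5))
```

Because each window sees the coverage map as updated by the previous one, the loop naturally absorbs deviations from the expected plan execution, which is exactly the advantage over one-shot planning noted above.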
7.4   Iterative Adaptive Planning
Iterative adaptive planning differs from CTW-P in that the time windows do not have a fixed
length. Instead, the MSS triggers new planning stages according to the current mission performance.
This strategy can be implemented, for instance, by defining thresholds on the difference between the
observed performance and the performance expected under the current mission plans. In addition, if
the computational power is available, an efficient implementation of this strategy may allow the MSS
to recompute plans without user supervision, anticipating the re-planning so that new mission plans
are always readily available.
7.5   Decentralized planning
With this method, we propose the implementation of a decentralized storage of the coverage map
and the design of distributed communication protocols to exchange its updates. Each agent possesses
its own coverage map, which might differ from the others given possible delays in the updates or data
loss. This allows mission planning to be performed independently by each agent, using any of the
strategies mentioned above. To use the MILP model, each agent includes the currently known positions
of its neighbor agents (i.e., those agents located inside reachable cells involved in the planning).
Subsequently, the optimization model is used to compute a solution which includes a schedule for
the planner agent (i.e., the agent executing the planning) and for each of its neighbors. The planner
agent keeps the mission plan computed for itself, and discards the mission plans corresponding to
the neighbor agents (i.e., they are only needed for computation purposes, but are not sent out, in
order to reduce communication overhead). The implementation and study of this strategy includes:
the simulation of the communication network to account for the effect of data losses, a comparison
against the fully centralized strategies to assess its scalability, and the computation of an initial
centralized plan using the proximity constraints, in order to implicitly shape agent coalitions.
This centralized global plan could act as a guide and provide better communication quality for the
subsequent decentralized operation.
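One possible reconciliation rule for the per-agent coverage maps is sketched below: since a lower Cm(c) means a better explored cell, an agent can merge a neighbor's (possibly delayed or partial) map by keeping, per cell, the most explored value seen so far. This min-merge rule is our assumption; the report leaves the concrete protocol to future work.

```python
# Sketch of a per-cell min-merge of two decentralized coverage maps,
# represented as dicts mapping cell id -> Cm value (1.0 = unexplored).

def merge_coverage_maps(local, received):
    """Return a new map with, per cell, the minimum (most explored) value."""
    merged = dict(local)
    for cell, value in received.items():
        merged[cell] = min(merged.get(cell, 1.0), value)
    return merged


a = {"c1": 0.2, "c2": 0.9}
b = {"c2": 0.4, "c3": 0.7}
print(merge_coverage_maps(a, b))  # {'c1': 0.2, 'c2': 0.4, 'c3': 0.7}
```

A min-merge is commutative, associative, and idempotent, so the agents converge to the same map regardless of the order in which delayed updates arrive, which is a useful property given the data losses mentioned above.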