Artificial Intelligence for Transportation: Advice, Interactivity and Actor Modeling: Papers from the 2015 AAAI Workshop

Optimal Planning Strategy for Ambush Avoidance

Emmanuel Boidot, Aude Marzuoli, Eric Feron
[email protected], [email protected], [email protected]
Guggenheim School of Aerospace Engineering
Georgia Institute of Technology
Atlanta, Georgia 30332

Copyright © 2015, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Abstract

Operating vehicles in adversarial environments between a recurring origin-destination pair requires new planning techniques. Such a technique, presented in this paper, is a game inspired by Ruckle's original contribution. The goal of the first player is to minimize the expected casualties undergone by a moving agent. The goal of the second player is to maximize this damage. The outcome of the game is obtained via a linear program that solves the corresponding minmax optimization problem over this outcome. The formulation originally proposed by Feron and Joseph is extended to different environment models in order to compute routing strategies over unstructured environments. To compare these methods for increasingly accurate representations of the environment, a grid-based model is chosen to represent the environment and the existence of a sufficient network size is highlighted. A global framework for the generation of realistic routing strategies between any two points is described. Finally, the practicality of the proposed framework is illustrated on real-world environments.

Introduction

We seek to develop routing strategies for dynamical systems operating in adversarial environments. An agent tries to move a vehicle from a given origin to a desired target set while avoiding undesired sets of states (ambushes); another agent sets up ambushes at the locations of his choice. The vehicle is penalized if it gets caught in an ambush, while the ambusher is then rewarded. Applications of this research include the protection of automated vehicles, such as delivery vehicles.

The idea behind this paper was first introduced by Ruckle (1976), who extended Isaacs's (1954) classical battleship-versus-bomber duel by interpreting a two-dimensional environment as a rectangular array of lattice points. Ruckle stated that such geometric games played on a finite lattice always exhibit a solution because of the minimax theorem of Von Neumann and Morgenstern (1944). He identified necessary and sufficient conditions for the mixed strategy of a player to be optimal. Ruckle's idea was extended by Joseph and Feron (2005a, 2005b), who advanced the idea of a variable game outcome at each node of the network. Their formulation was later used in applications for piracy prevention by Vanek et al. (2011, 2013) and money-transit protection by Salani et al. (2010).

This problem of ambush avoidance can be described as a network interdiction problem where interdictions are performed on nodes. In the class of network interdiction problems, several special cases have been studied:
1. Shortest-path interdiction, where arcs of the network can be removed (Israeli and Wood 2002).
2. Network interdiction when the network structure is set and the concern is to monitor the network: the objective of the evader is to maximize the probability of being undetected by choosing the most reliable path (Pan, Charlton, and Morton 2003). The invader can be informed or uninformed.
3. Maximum-flow interdiction, where the goal of the interdictor is to prevent the flow of some unwanted items (Wood 1993); this has been extended to a stochastic formulation that minimizes the expected maximum flow (Janjarassuk and Linderoth 2008).

Each of these models is a Stackelberg game: a two-step, two-player, zero-sum sequential game. Washburn (1995) identifies optimality properties for some network interdiction subclasses. Considering ambush games as network interdiction problems allows us to better understand the optimal positions of the ambushes, so that we can predict the optimal outcome of the game. A well-known framework for path planning in adversarial environments is differential games, pioneered by Isaacs (1954). For a complete and recent set of references, see (Friedman 2013).

Unlike pursuit-evasion games, we are interested in the case where one agent is a vehicle and the other agent makes a unique decision, that of placing one or more ambushes. Our approach naturally leads to the design of random distributions of trajectories. For the same scenario, the returned strategy is unique, but two vehicles following this strategy might use different paths. For clarity, a probabilistic strategy is referred to as a "route" and a deterministic realization of this strategy as a "path".

While the work cited so far proposes solutions of the game on transportation networks, there has been little study of the set of optimal solutions. Furthermore, the extension to continuous environments remains to be explored. With this perspective, a method to create a network on any environment on which the game is to be run is first developed. Second, the outcome of an ambush is no longer constant but depends on its position; the factors influencing this outcome are identified. This makes the computation of the optimal routing strategy possible on any environment without manual input of a case-specific game matrix. These two aspects constitute the main contribution of this paper to the current state of the art in the field of ambush games.

The outline of this paper is as follows. First, the game and its mathematical formulation are described. Different network construction methods and ambush types considered for the adaptation of the game to unstructured environments are then detailed.
Finally, two applications are described to illustrate this work.

Figure 1: Decision flow. The network representation of the environment (left) is considered in order to create a strategy for BLUE (middle): p = (p_12, p_13, p_14, p_25, p_26, p_35, p_36, p_45, p_46, p_57, p_67) = (1/3, 1/3, 1/3, 1/6, 1/6, 1/6, 1/6, 1/6, 1/6, 1/2, 1/2). Then a strategy for RED (right) is computed: q = (q_1, ..., q_7) = (0, 0, 0, 0, 1/2, 1/2, 0). The outcome of the game for these strategies is V = 1/2. The probability associated with each edge is displayed as the width of the edge.

Approach

Game Description

The problem of interest is to plan the path of an agent (evolving in the air, on the ground, or at sea) that needs to move from an origin, or source, s to a destination, or sink, t in an environment where hostile forces might try to ambush it. It is modeled as a two-player, non-cooperative, zero-sum game, meaning that a player's sole goal is to optimize his personal gain (non-cooperative) and the sum of the outcomes over all players is zero. Player 1, denoted BLUE, chooses a route from origin to destination. Player 2, denoted RED, selects a number of locations (ambush areas or ambush sites) where he can set up one or several ambushes. An example of such a decision flow is represented in Figure 1. The number of ambushes depends on RED's resources; in the remainder of this paper we assume that RED sets only one ambush. It is common for this type of game-theoretic problem to assume that ambushes take place at nodes or on edges of the network, which is the approach adopted by Boidot and Feron (2012). However, to extend the approach to continuous state spaces, it makes more practical sense for RED's strategies to relate to areas of the environment (continuous sets) rather than to the topology of the network used to define BLUE's paths. This way, RED's strategy translates into where it would position its ambushes given a specified action range.
The physical meaning of an ambush site corresponds to the area RED can impact if he sets his ambush at a given location. Setting an ambush on this area means that BLUE will be ambushed if his path goes through nodes contained in the ambush area. An ambush is an ambush site that RED has chosen to populate. A representation of this concept is displayed in Figure 2. This idea is further developed in the next section. If BLUE's path passes through an ambush site, then RED wins. If BLUE's path avoids all ambushes, then BLUE wins. The outcome of an ambush corresponds to the casualties that the agent would experience in this specific area, should an ambush effectively be present at that location. It depends on the characteristics of the local environment. The outcome of a game is the sum of this local outcome over all the ambushes that BLUE's path has gone through. We thus assume ambushes are not lethal, although high costs may be incurred. The outcome α of an ambush at any point is assumed to be known by both players. Its computation is discussed in the Applications section. The environment is represented by a network (N, E) and a local outcome map. Each ambush area i is associated with a local outcome α_i. The set α is the discretized local outcome map. In this paper, only single-stage ambush games are considered, meaning that both players must decide their strategy at the beginning of the game and cannot change it during the execution of the route. The environment and its discrete representation are the only information RED and BLUE share: the network, the local outcome map, BLUE's origin and destination points and the positions of the ambush areas are known to both players, but not which areas RED has chosen. A possible strategy for BLUE is represented by a probability vector p, where p_ij is the probability that the agent uses edge e_ij between nodes n_i and n_j.
Similarly, a strategy for RED is represented by a probability vector q that contains the probability q_k that RED sets an ambush at the ambush area a_k. Given the network and the local outcome map, the goal of the game is to find the optimal strategy p* for BLUE, assuming that RED follows its optimal strategy q*. Consider the case of a recurring transition from state s to state t, where there exist two different paths from s to t. A deterministic approach will return only one of the two paths, whereas a mixed strategy p will return a probability p_1 of BLUE using the first path and a probability p_2 of BLUE using the second path. Incorporating mixed-strategy solutions means that, for a given environment, RED can at best (after a large number of runs following this strategy) figure out the strategy p, but will never gain any certainty about the agent's exact path.

Mathematical Formulation

Let (N, E) be a network. Consider first the case where ambushes can be set at nodes; the ambush areas {a_i}_i and the nodes N = {n_i}_i then represent the same set. A strategy for player BLUE is a mapping p from E to [0, 1] such that 0 ≤ p(i, j) ≤ 1 and the flow constraints are satisfied, where i and j are node indices. A strategy for RED is a mapping q from N to [0, 1] such that 0 ≤ q(j) ≤ 1 and ∑_j q_j = 1. BLUE's (resp. RED's) strategy space P (resp. Q) is the set of all such mappings.

From here, p will be the vector representation of the image of E by the mapping p, and q will be the image of N by the mapping q. p_ij is the probability that BLUE uses edge e_ij; q_j is the probability that RED sets up an ambush at node n_j. Assume that the two players' strategies are independent. At each node n_j, the probability that BLUE is ambushed is equal to the probability that BLUE's path goes through n_j times the probability that RED sets an ambush at this node. The gain for RED at this node being α_j, the expected outcome of the game relative to this node is ∑_{i|(i,j)∈E} p_ij q_j α_j.
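The per-node expected outcome can be checked numerically. The sketch below (illustrative values and naming of our own, not the paper's code) evaluates ∑ p_ij q_j α_j on a made-up two-edge line network s → n2 → t:

```python
# Numerical check of the per-node expected outcome sum_i p_ij * q_j * alpha_j
# on a tiny, made-up line network s -> n2 -> t (all values illustrative).
p_in  = {"n2": 1.0, "t": 1.0}   # probability that BLUE's path enters each node
q     = {"n2": 1.0, "t": 0.0}   # RED always ambushes n2
alpha = {"n2": 0.7, "t": 0.0}   # discretized local outcome map

V = sum(p_in[j] * q[j] * alpha[j] for j in p_in)
print(V)  # 0.7: BLUE always traverses n2, where RED always waits
```

Because BLUE traverses n2 with probability 1 and RED always ambushes there, the expected outcome is exactly the local outcome α at n2.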
A strategy p for BLUE must satisfy the flow constraints:

  ∑_{i|(i,j)∈E} p_ij = ∑_{k|(j,k)∈E} p_jk,  ∀j ∈ N \ {s, t}
  ∑_{j|(s,j)∈E} p_sj = 1
  ∑_{j|(j,t)∈E} p_jt = 1        (1)

These constraints enforce the conservation of the flow of probabilities through the network: the probability that the agent arrives at node n_j is equal to the probability that the agent leaves the same node, and the probabilities of the agent being at the origin and destination nodes are equal to 1.

The objective of the approach is to find a strategy for BLUE that minimizes the largest possible outcome for RED. Provided that q_j ≤ 1 for all j, RED can always maximize V by choosing the node n_j for which the probability of BLUE passing through that node, weighted by the value α_j, is maximal. Therefore BLUE's optimal solution is to minimize this product across all nodes:

  p* = argmin_p max_{j∈N} ∑_{i|(i,j)∈E} p_ij α_j.        (2)

Therefore the strategic outcome of the game is:

  V = ∑_{j∈N} ∑_{i|(i,j)∈E} p_ij q_j α_j = qᵀ D p,        (3)

with D_jk = α_j if the k-th line of p represents the probability that BLUE uses an edge e_ij directed towards n_j, and D_jk = 0 otherwise.

This problem is solved as a linear programming problem by introducing a slack variable z constrained as follows:

  z ≥ ∑_{i|(i,j)∈E} p_ij α_j,  ∀j ∈ N.        (4)

Rewriting Equations (2) and (4), the problem can be posed as a linear program:

  minimize z
  subject to Dp − 1z ≤ 0, Ap = b, p ≥ 0,        (5)

where A and b represent the flow conservation constraints. The probability of each edge being used is computed so as to minimize the expected losses.

A simple network example is displayed in Figure 1. The local outcome α is assumed to be equal to one at each internal node of the network and zero at the departure and arrival nodes. The result of the optimization problem presented above is displayed in Figure 1(b). Note that this optimization problem was solved using the interior-point algorithm implemented in the function linprog in MATLAB (Mehrotra 1992). Using a symmetric network (with respect to the s−t axis) and a uniform local outcome (constant across the environment), the following observation is made: the probability of the vehicle passing by is spread across the network and preserves the initial symmetry. This property is of particular interest when it comes to avoiding ambushes. It seems to indicate that optimal solutions to the problem are also the most deceptive ones, because most paths have similar likelihoods.

Figure 2: Simple network example with ambushes by area; panels (a) area size = 6 and (b) area size = 10. The probability associated with each edge is displayed as the width of the edge. The orange rectangles represent the area impacted by placing an ambush at the center of any rectangle. Thus, the larger the rectangle, the larger the impact of the ambush. The brighter the rectangle, the smaller the penalty for being ambushed there. The origin and destination areas are supposed to be out of the game boundaries (they are green zones). Grey edges are unused or used with very low probability. Red circles represent the strategy chosen by RED, i.e., areas i where RED might set his ambush with probability q_i > 0. This example illustrates the importance of associating an area with an ambush. Increasing the reach of RED (the width of the impact area) between (a) and (b) leads to a drastically different strategy: the reach of RED has very important consequences on BLUE's strategy.

Framework for unstructured environments

The approach described in the previous section, defined by Joseph (2005a, 2005b), requires the existence of a network to optimize the routing, for example a city street map. Unstructured environments do not have a "natural" framework over which the finite-dimensional linear program (5) may be posed and solved. This section proposes three methods to build a network to support the formulation of Problem (5), compares the corresponding outcomes, and identifies key features.

Network Construction
Taking full advantage of unstructured environments may be beneficial for vehicles with off-road capabilities. Instead of assuming the existence of a road network that limits the vehicle's routing options, the network now only represents a discretization of the continuous environment, and it is capable of accounting for the vehicle's physical properties and the environment's characteristics. Several methods are tested in order to discretize the environment with networks allowing reasonably short computation times, as shown in Table 1.

Table 1: Different network construction methods.

  #         Sampling   Connectivity
  1 (rdm)   Random     Delaunay triangulation
  2 (uni8)  Uniform    8-connected grid
  3 (uniD)  Uniform    Delaunay triangulation

Consider Method 1: while randomly sampled nodes connected through a Delaunay triangulation result in a relatively small and computationally efficient representation of the environment, rdm might not be representative enough of the details of the environment. The sampling is limited to a small number of nodes to keep the time to compute the solution of (5) relatively small. uni8 is a structured method that creates a grid on the environment. It is fully connected, but requires 16 directed edges per node, making the linear program computationally intensive. It might be preferable to choose a less precise technique that still offers enough granularity in the environment description. Method 3, uniD, reduces the connectivity of the network by using a Delaunay triangulation instead of constructing a fully connected grid.

Figure 3: Results of the linear program (6) for different network geometries; panels (a) rdm, λ = 0; (b) rdm, λ ≠ 0; (c) uni8, λ = 0; (d) uni8, λ ≠ 0; (e) uniD, λ = 0; (f) uniD, λ ≠ 0.
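The sampling and connectivity choices of Table 1 can be sketched with standard tools. The snippet below (our own sketch, not the authors' code) samples a unit-square environment randomly or uniformly and derives edges with scipy.spatial.Delaunay:

```python
# Sketch of the network construction methods of Table 1 on a unit square:
# random or uniform sampling, connected through a Delaunay triangulation
# (function names are ours; scipy.spatial.Delaunay does the triangulation).
import numpy as np
from scipy.spatial import Delaunay

def delaunay_edges(points):
    """Undirected edge set induced by a Delaunay triangulation."""
    tri = Delaunay(points)
    edges = set()
    for simplex in tri.simplices:
        for a in range(3):
            i, j = sorted((int(simplex[a]), int(simplex[(a + 1) % 3])))
            edges.add((i, j))
    return edges

rng = np.random.default_rng(0)
pts_rdm = rng.random((50, 2))                        # method 1: random samples
xs, ys = np.meshgrid(np.linspace(0, 1, 7), np.linspace(0, 1, 7))
pts_uni = np.column_stack([xs.ravel(), ys.ravel()]) # uniform grid samples
edges_rdm  = delaunay_edges(pts_rdm)                 # method 1: rdm
edges_uniD = delaunay_edges(pts_uni)                 # method 3: uniD
# Method 2 (uni8) would instead connect each grid node to its 8 neighbours.
print(len(pts_uni), len(edges_uniD))
```

Method 2 trades this sparser connectivity for a denser 8-neighbour grid, which is what makes its linear program more expensive to solve.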
In the left column, cycles can be identified in the departure and arrival areas, whereas there are only outgoing edges in the right-column examples. Notice the one-to-one assignment of route segments to areas for routes returned with energy optimization (as seen in Figures 3(d) and 3(f)) along the median between departure and arrival.

Ambush types

A different model is now considered, where the state space for RED is modified. Ambushes are paired with an area in the environment instead of a node on the network. This corresponds to discretizing the ambushes by tiling the space with areas of impact. Here, only rectangular tiles over a rough grid are considered; the ability to choose the ambush locations over a finer grid than that given by a tile could also be investigated. The reach of RED is a measure of the area of impact of RED; the size of an ambush area is proportional to the square of the reach. If RED decides to allocate a resource to an area and BLUE's path goes through one of its nodes, then the ambush is successful and RED's outcome is the value α_i corresponding to this area. The local outcome corresponds to the casualties that would be incurred by the agent. Figure 2 displays the optimal solution for BLUE given two values of the reach. Setting ambushes by area also decouples the space dependency of BLUE's strategy from RED's strategy: we can study the convergence of BLUE's strategy with respect to the precision of the environment model when the set of ambushes for RED is fixed. This feature could support the use of the discrete routing strategies as approximations of a continuous strategy.

The mathematical formulation of these new concepts is now considered. A transposition matrix S is created: if node j belongs to ambush area i, then the j-th column of S is zero except on its i-th line. Reformulating the linear problem similarly to the initial formulation, BLUE's optimal solution is now:

  p* = argmin_p max_q qᵀ S D p.
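A minimal sketch of the transposition matrix S, assuming square tiles whose side equals RED's reach (the tiling scheme and the name build_S are our illustration, not the paper's implementation):

```python
# Hypothetical sketch of the transposition matrix S: S[i, j] = 1 when node j
# falls inside rectangular ambush tile i (tile side = RED's reach; the
# tiling scheme and the name build_S are illustrative assumptions).
import numpy as np

def build_S(node_xy, reach, n_tiles_x, n_tiles_y):
    S = np.zeros((n_tiles_x * n_tiles_y, len(node_xy)))
    for j, (x, y) in enumerate(node_xy):
        tx = min(int(x // reach), n_tiles_x - 1)
        ty = min(int(y // reach), n_tiles_y - 1)
        S[ty * n_tiles_x + tx, j] = 1.0   # node j belongs to exactly one tile
    return S

nodes = [(0.1, 0.1), (0.4, 0.1), (0.9, 0.9)]
S = build_S(nodes, reach=0.5, n_tiles_x=2, n_tiles_y=2)
print(S.sum(axis=0))  # every column sums to 1: one tile per node
```

With such an S, the node-level product Dp is aggregated per ambush area, which is exactly what lets RED's strategy q range over areas rather than nodes.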
As seen in Figure 2, the reach of RED influences BLUE's optimal strategy, hence our choice of the reach as a parameter of the environment model representing RED's capabilities. Note that the strategy q* for RED is computed assuming it has perfect knowledge of BLUE's strategy: RED knows the network and the probability that BLUE uses any edge of the network. Once p* is computed for BLUE, q* is computed as q* = argmax_q qᵀ S D p*. This is a very strong assumed advantage for RED, with respect to which the routing method is quite robust.

Energy Optimization

A consequence of taking into account the reach of RED is the creation of cycles, most particularly inside each ambush area. The probabilistic routing could lead to a path where the vehicle stays inside a closed subset for a very long time. This situation is more likely in ambush areas where the local outcome is close to zero. In his work, Joseph (2005) introduced a penalty on the path length to avoid cycles. Here, we use a similar "energy" metric E associated with a weight (or energy factor) λ > 0, such that

  E = ∑_{i,j|(i,j)∈E} p_ij ‖e_ij‖².

A comparison of the results with and without this term is done in the Performance subsection. The new linear program is:

  minimize (1 − λ)z + λE
  subject to SDp − 1z ≤ 0, Ap = b, p ≥ 0.        (6)

Large values of λ lead to a few paths close to the shortest path while increasing the overall risk for the system. Low values of λ return routing strategies close to the safest ones, since the objective function barely penalizes long but safe paths. Note that, the set of optimal strategies being fairly large, the routing strategies returned with low values of λ might also be risk-optimal. An analysis of this parameter led to the empirical result that λ ≈ 10⁻³ allows for risk-optimal results while removing cycles. Examples of routing strategies with and without energy optimization for the different networks described in Table 1 are compared in Figure 3.
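As a concrete sketch, program (5) (i.e., program (6) with λ = 0 and ambushes at nodes, so S is the identity) can be solved on the toy network of Figure 1 with an off-the-shelf solver. The code below uses scipy.optimize.linprog, our stand-in for the MATLAB linprog mentioned earlier, and recovers the outcome V = 1/2 quoted in Figure 1:

```python
# Linear program (5) on the Figure 1 network: minimize z subject to
# Dp - 1z <= 0, Ap = b, p >= 0, with alpha = 1 on internal nodes
# and alpha = 0 at the source s = 1 and sink t = 7.
import numpy as np
from scipy.optimize import linprog

edges = [(1,2),(1,3),(1,4),(2,5),(2,6),(3,5),(3,6),(4,5),(4,6),(5,7),(6,7)]
alpha = {j: (1.0 if 2 <= j <= 6 else 0.0) for j in range(1, 8)}
n_e = len(edges)                          # variables: one p_e per edge, plus z

c = np.zeros(n_e + 1); c[-1] = 1.0        # minimize the slack z

A_ub, b_ub = [], []                       # inequality (4): inflow_j*alpha_j <= z
for j in range(2, 7):
    row = np.zeros(n_e + 1)
    for k, (_, head) in enumerate(edges):
        if head == j:
            row[k] = alpha[j]
    row[-1] = -1.0
    A_ub.append(row); b_ub.append(0.0)

A_eq, b_eq = [], []                       # flow conservation constraints (1)
for j in range(2, 7):                     # inflow = outflow at internal nodes
    row = np.zeros(n_e + 1)
    for k, (tail, head) in enumerate(edges):
        if head == j: row[k] += 1.0
        if tail == j: row[k] -= 1.0
    A_eq.append(row); b_eq.append(0.0)
row = np.zeros(n_e + 1)                   # unit flow out of the source
for k, (tail, _) in enumerate(edges):
    if tail == 1: row[k] = 1.0
A_eq.append(row); b_eq.append(1.0)

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, 1)] * n_e + [(0, None)])
print(round(res.fun, 3))                  # 0.5, matching V = 1/2 in Figure 1
```

The bottleneck is the cut {n5, n6}: unit flow must cross two nodes of outcome 1, so no strategy can push the maximum weighted inflow below 1/2.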
The routes with and without energy optimization result in similar game outcomes (V = 1/3). Hence cycles are suppressed without significant change to the routing strategy with respect to the outcome of the game. Interestingly, it seems that the route includes a fixed number of distinct paths depending on the reach of RED. In Figure 3, this distance is such that the median between departure and arrival is divided into three segments. This configuration leads to four distinct paths in Figures 3(d) and 3(f). Yet this is optimally equivalent to having only one path along the set of areas aligned between departure and arrival: any route with intersection-free paths such that the flow is uniformly distributed along the ambush areas would be optimally equivalent with respect to the strategic outcome. This comes back to the properties of the optimal solution proven by Ruckle (1976, 1979), who stipulates that the optimal outcome of the game is 1/m, where m is the number of distinct paths from departure to arrival. Work in progress aims at proving that the outcome is proportional to the min-cut capacity of the network.

The network construction method has an important impact on RED's strategy. Comparing Figures 3(a) and 3(b) with Figures 3(c)-3(f), RED has a strategy with a larger spread on grid-based networks. This is an interesting feature because, if RED has limited resources, a spread strategy corresponds to a higher probability of BLUE not being ambushed.

Performance

In order to assess the performance of the procedure, several metrics are used. In his study of the one-dimensional problem, Ruckle identified the uniform distribution of paths along a line as the optimal strategy. This strategy corresponds to maximizing the entropy of the probability distribution. The first metric is therefore entropy: it is high for deceptive routes and close to zero for non-deceptive ones. It is computed as the entropy of the probability distribution over the s−t cut equidistant from s and t. The spreading of the route measures the portion of the environment covered by the route as a fraction of the total surface of the environment. The energy metric, which we have already discussed, represents the deviation of the random strategy from the least-energy (and shortest) path between origin and destination.

The metrics defined above were computed for each network construction method, using different optimization algorithms, with and without energy optimization. While the study was carried out over 100 environments with 5 to 20 randomly generated obstacles, we start by looking at the obstacle-free environment in Figure 4 for comprehension purposes. As soon as there exists a path in each of the seven distinct sections of the environment described above, the solutions are close to optimality. On average, more spreading is present when constructing a random network. When optimizing the energy, the regular networks exhibit less spreading. Also, the simplex optimization method, when it is used, tends to produce solutions with lower entropy and a narrower probability base when no energy penalty term is present in the optimization. The spreading and energy metrics tend to favor opposite solutions. Comparing the network construction methods, uni8 and uniD offer equivalent results, but random sampling is more energy-consuming and results in lower entropies of the distribution. The maximum theoretical entropy of log(7) ≈ 1.94 (because there are 7 different paths) is reached by uni8 and uniD.

Figure 4: Comparison of the results obtained for different network densities with the second network construction method; panels (a) 900 vertices and (b) 3600 vertices. The departure-arrival median is divided into seven areas, which leads to seven different paths. The routes in (a) and (b) are similar, illustrating the convergence of the solution as the size of the network increases.

The metrics and methods used converge after a given density of nodes in the network, for a fixed size of the environment. This corresponds to the presence of an edge between each pair of nodes where ambush areas are located. Figure 4 illustrates one of the main results of this paper: for each method, BLUE's strategy converges to a distribution that depends solely on RED's reach and the local outcome. Evidence suggests that there is a sufficient network density for optimality. In Figure 5, we can see that the outcome of the game converges with and without energy optimization, for any network construction method. The rate of convergence varies depending on the method used. The results with and without energy optimization are comparable. The other metrics were studied and the convergence behavior observed was similar. This confirms that the secondary objective can be added without any loss on the strategic outcome.

Figure 5: Game outcome convergence. The outcome is plotted as a function of √|N| for the three methods (rdm, uni8, uniD), each solved with the interior-point and simplex algorithms, in panels (a) λ ≠ 0 and (b) λ = 0. The results are averaged over 100 environments.

Overall, better results are obtained using the interior-point algorithm on a uniformly sampled, fully connected network with non-zero λ. However, all techniques offer very similar results on average. The interior-point solution is preferred because it returns equally optimal results with a much higher spreading compared to simplex algorithm solutions. This illustrates the high degree of degeneracy of this problem. For the empty environment in Figure 4, the route is optimal if there are as many distinct paths as there are areas in a vertical section of the map (i.e., 7 for this example).

A parallel can be drawn between ambush games and network interdiction. Washburn et al. (1995) proved that the optimal value of a network interdiction game is obtained when the interdictor closes nodes or edges on the minimal-capacity s−t cut of the network. Let the reduced network of our game be the network that consists of the ambush areas as nodes, with edges if there are possible paths between them. For non-empty environments with uniform local outcome, a similar result is obtained: the optimal strategy for RED is uniform (q_j = 1/c) on the minimal node-cut of the reduced network, where c is the capacity of this cut. The corresponding outcome is 1/c. Ruckle's optimality proof becomes a special case of this property, applied to a trivial network (two nodes, c distinct paths from s to t).

The convergence observed previously suggests that there exists an analog of this property in continuous space. Recall that two paths σ_1, σ_2 belong to the same homotopy class if there exists a continuous transformation T : [0, 1] → Σ such that T(0) = σ_1 and T(1) = σ_2. We define a strategic homotopy class as a homotopy class where all cycle-free s−t paths are spaced by at most the reach of RED. Then we can define a continuous ambush game as:

  min_p max_q ∑_i ∑_j p_i σ_ij q_j
  subject to ∑_i p_i = 1, ∑_j q_j = 1,
  σ_ij = 1 if H_i ∩ a_j ≠ ∅, σ_i ∈ H_i,

where {H_1, ..., H_m} is the set of all strategic homotopy classes. We argue that the optimal strategy for BLUE in this game is p_i = 1/m, ∀i = 1, ..., m, and that the optimal outcome is 1/m. This will be the subject of further study.

Applications

General framework description

The environment is analyzed for three different purposes. The first objective consists in completing the existing road network with an off-road network to cover both structured and unstructured parts of the environment. The second is to identify relevant geographical areas for ambushes. The third is to compute the local outcome map: a correlation is established at each location between the possible outcome of an ambush and the strategic characteristics of the local environment around this ambush.

To achieve these objectives, various data sources are used. The topological information about the environment is collected from Shuttle Radar Topography Mission (SRTM) data. While SRTM data is not precise enough for real-time high-precision path planning, which is not our present concern, it gives sufficient information for route-planning purposes on scales of approximately one to tens of kilometers. The second type of data used in this framework is Open Street Map (OSM) data (Hacklay et al., 2008). Open Street Map is a global, collaborative effort to create an open map of the world. The data aggregates semantic and usage information about the environment's infrastructure, such as the type of roads (interstate, primary, secondary, etc.), the authorized direction (one-way or two-way), the maximum speed, street names, or information about the buildings. The different sources of data help create the analytic environment: the local outcome map, the off-road environment and the road information merge to develop the network supporting the optimization. The methodology proposed offers an easy, automated way to find routing strategies without the need for manual operations and fine tuning.

Examples

The remainder of the paper focuses on examples of applications. The local outcome α is now defined as a parameter depending only on the maximum speed, inversely proportional to it and bounded between 0 and 1:

  0 ≤ α(v) = (90 − v)/60 ≤ 1.
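This bounded speed-to-outcome map can be written directly as a one-line sketch, with the clipping to [0, 1] implementing the stated bounds (speeds in km/h):

```python
# Local outcome alpha(v) = (90 - v)/60 clipped to [0, 1], speeds in km/h:
# below 30 km/h the outcome saturates at 1; at 90 km/h or more it is 0.
def local_outcome(v_kmh):
    return min(1.0, max(0.0, (90.0 - v_kmh) / 60.0))

print(local_outcome(90), local_outcome(60), local_outcome(5))  # 0.0 0.5 1.0
```

A walking-speed agent (a few km/h) thus gets α = 1 everywhere, while a vehicle at highway speed gets α = 0, consistent with the pedestrian and car maps of Figure 7.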
A slower speed implies a higher local outcome, and conversely. Indeed, interviews of former Army personnel identified the speed of the vehicle as one of the most influential factors in the local outcome. The environmental data is used to find the maximum speed at a given position.

Road network

Consider the city of Monaco, where the portion of unstructured environment is negligible. The model of ambushes at nodes is thus used because of the structured network of the environment. On road networks, the maximum speed is set to be the maximum allowed speed. The speed profile of Monaco is presented in Figure 6(a), and the corresponding optimal solution is displayed in Figure 6(b). The departure and arrival nodes are located on opposite sides of the city. As one might expect, the optimal strategy returned by the framework uses the portions of the network where the speed limit is higher with much higher probability. However, the entire network is used by the routing strategy.

[Figure 6: Example of solution for the road network of the city of Monaco imported from OSM data. (a) Speed limit in the city of Monaco (15 to 90 km/h). (b) Optimal strategy.]

Unstructured Environment

On the unstructured part of the environment, the maximum speed is a combined function of the different topological factors and information about BLUE's resources, such as its means of locomotion. The resulting optimal strategies for either a pedestrian or a car are displayed in Figures 7(a) and 7(b). The variations of topology in the environment lead to local outcome maps that depend on the transportation mode. The pedestrian being much slower than the car in the lowlands, its local outcome there is larger. On the contrary, it is much less penalized by the hills than the car is. Therefore it is more advantageous for the pedestrian to travel through the hills than it is for the car.

[Figure 7: Optimal routing strategy for a pedestrian (a) and a car (b) near Fort Irwin, CA. The pedestrian is slow, hence its local outcome is high everywhere. The car's outcome is low in the lowlands (clear color) because it can go much faster.]

The fact that some of BLUE's resources are better suited to a specific environment could translate into multimodal routing strategies. Given a set of transportation options, a player could change part of its dynamics during its travel to reduce the optimal outcome. Overall, the framework developed is efficient. Since it requires few inputs from the user, it can easily be used to compare routing strategies on various environments. The large amount of SRTM and OSM data available makes it applicable almost anywhere in the world.

Conclusion

The work developed addresses ambush games in unstructured environments. The opponent's reach is a key parameter of the environment representation. The possible losses for the system in case of ambush are computed through the identification of several factors, due to the environment and the agents' resources, influencing this outcome. By comparing different network construction methods and different linear optimization algorithms, efficient techniques are identified to elaborate ambush avoidance strategies. The results highlight the existence of a sufficient network density to represent such environments. The correlation between the reach of the ambushing player and the optimal outcome of the game is pointed out, and a "representative distribution" emerges.

Finally, a comprehensive framework is elaborated with topological data, a pair of points and a vehicle type as inputs. It provides an optimal stochastic routing strategy between these two points. Examples of application are tested for structured and unstructured environments.

From a theoretical standpoint, several aspects could be further investigated. While we have established a parallel with network interdiction problems, the optimality property developed for these games might not be immediately applicable when the local outcome is not constant. The authors would also like to propose a more complete continuous model of the game at hand. This would make it possible to identify the limit distribution at once and perhaps to draw a parallel between the distinct paths within routes and homotopy classes.

A number of applications have been envisioned. The only factor currently influencing the local outcome is the maximum speed at a given location. The fidelity of the model could be improved by taking more risk factors into account. The purpose of this framework is to provide a hands-on tool for route prediction. However, the merging of structured and unstructured environments still requires manual tuning. This merging process will be automated more thoroughly in upcoming work. Finally, it might be interesting to increase the precision of the environment description by one or two orders of magnitude. Doing so requires determining how to accurately forecast the maximum safe speed at a given location; control theory hence becomes part of the optimization problem.

Acknowledgement

This work was supported by the Army Research Office under MURI Award W911NF-11-1-0046. The authors would like to thank the reviewers for their valuable input regarding the communication network and operations research literature.

References

Wood, R. K. 1993. Deterministic network interdiction. Mathematical and Computer Modelling 17(2):1–18.
Boidot, E., and Feron, E. 2012.
Planning random path distributions for ambush games in unstructured environments. In IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR).
Friedman, A. 2013. Differential Games. Courier Dover Publications.
Haklay, M. M., and Weber, P. 2008. OpenStreetMap: User-generated street maps. IEEE Pervasive Computing 7(4):12–18.
Isaacs, R. 1954. Differential games I.
Israeli, E., and Wood, R. K. 2002. Shortest-path network interdiction. Networks 40(2):97–111.
Jakob, M.; Vanek, O.; and Pechoucek, M. 2011. Using agents to improve international maritime transport security. IEEE Intelligent Systems 26(1):90–96.
Janjarassuk, U., and Linderoth, J. 2008. Reformulation and sampling to solve a stochastic network interdiction problem. Networks 52(3):120–132.
Joseph, F., and Feron, E. 2005. Computing the optimal mixed strategy for various ambush games.
Joseph, F. A. 2005. Path-planning strategies for ambush avoidance. Master's thesis, MIT.
Mehrotra, S. 1992. On the implementation of a primal-dual interior point method. SIAM Journal on Optimization 2(4):575–601.
von Neumann, J., and Morgenstern, O. 1944. Theory of Games and Economic Behavior. Princeton University Press.
Pan, F.; Charlton, W. S.; and Morton, D. P. 2003. A stochastic program for interdicting smuggled nuclear material. In Network Interdiction and Stochastic Integer Programming. Springer. 1–19.
Ruckle, W.; Fennell, R.; Holmes, P. T.; and Fennemore, C. 1976. Ambushing random walks I: Finite models. Operations Research 24(2):314–324.
Ruckle, W. H. 1979. Geometric games of search and ambush. Mathematics Magazine 52(4):195–206.
Salani, M.; Duyckaerts, G.; and Swartz, P. G. 2010. Ambush avoidance in vehicle routing for valuable delivery. Journal of Transportation Security 3(1):41–55.
Vanek, O.; Jakob, M.; Hrstka, O.; and Pechoucek, M. 2013. Agent-based model of maritime traffic in piracy-affected waters. Transportation Research Part C: Emerging Technologies. doi:10.1016/j.trc.2013.08.009.
Washburn, A., and Wood, K. 1995. Two-person zero-sum games for network interdiction. Operations Research 43(2):243–251.