Practical Distance Sensitivity Oracles for Directed Graphs

Jong-Ryul Lee
Dept. of Computer Science, KAIST
291 Daehak-ro, Yuseong-gu, Daejeon, Korea
[email protected]

Chin-Wan Chung
Dept. of Computer Science, KAIST
291 Daehak-ro, Yuseong-gu, Daejeon, Korea
chung [email protected]

ABSTRACT

A distance sensitivity oracle is a data structure answering queries that ask the shortest distance from one node to another in a directed graph in the presence of node/edge failures. It has been mainly studied in the theory literature, but all the existing oracles for an n-node m-edge directed graph with real-valued edge weights suffer from ω(n^2) cost for space and ω(mn) cost for preprocessing time. Due to these expensive costs, they cannot be used even on graphs with thousands of nodes. Motivated by this, we develop two practical distance sensitivity oracles for directed graphs. The first is an exact oracle. Its core structure is a distance graph, a small graph summarizing useful path information about the entire graph; the exact oracle efficiently computes the answer of a query using the distance graph. The second is an approximate oracle, derived by sparsifying the distance graph of the exact oracle while controlling the loss of accuracy. Compared with the existing distance sensitivity oracles, our oracles are more efficient in terms of preprocessing time and space. We conduct extensive experiments to evaluate our oracles. Because none of the existing distance sensitivity oracles for directed graphs is applicable to real-life datasets, we use Dijkstra's algorithm and a recent fully dynamic distance oracle as competitors. The experiments demonstrate that our oracles outperform the competitors in terms of query time. Compared with the dynamic distance oracle, our exact oracle is more efficient in terms of preprocessing time and space. Our approximate oracle is more efficient than the dynamic distance oracle in terms of space, with comparable preprocessing time. Along with such efficiency, our approximate oracle has accuracy similar to that of the dynamic distance oracle. To the best of our knowledge, this is the first work that handles practical graphs with millions of nodes.

1. INTRODUCTION

Computing the shortest distance from one node to another is one of the fundamental problems in graph theory. In this paper, we deal with an interesting variation of shortest distance computation in a graph, called the distance sensitivity problem. Given a graph G = (V, E), where V is the set of nodes and E is the set of edges, the distance sensitivity problem is to answer queries that ask for the distance of the shortest path from one node to another avoiding the failed part of the network. A distance sensitivity oracle is a data structure designed to answer such queries.

A distance sensitivity oracle is particularly useful for a network in which a small number of recoverable failures (of nodes or edges) can occur simultaneously. For example, in road networks, a road construction, a demonstration, or a car accident can make a road (edge) or a road junction (node) temporarily unavailable. In general, such node or edge failures are recovered after the construction is finished, the demonstration is over, or the scene of the accident is cleared. In addition to such failures, a user may directly want to know the distance of the shortest path from one point to another avoiding user-specified roads such as frequently clogged roads. In computer networks, network devices (nodes) and the links (edges) between them can break down for various reasons such as a cut network cable. Such broken devices or links can be recovered by replacing the broken equipment.

The main motivation for a distance sensitivity oracle is that it enables us to avoid stalling queries. Suppose that we have an algorithm answering shortest distance queries with an updatable data structure, which is called a fully dynamic distance oracle in the theory community. Because node/edge failures change the network structure, when they occur, we have to update the distance oracle to answer queries correctly. After the failures are recovered, we need to update it again for the same reason. Note that while a distance oracle is being updated, no query can be processed. That is, we must stall queries that arrive during the update. In addition, the cost of an update can be very large. For example, when an edge is deleted, the worst-case update time of a fully dynamic exact distance oracle with O(1) query time is trivially Ω(n^2), because the distance information of O(n^2) node pairs may have to be updated and any distance oracle with O(1) query time requires at least Ω(n^2) space for a directed graph [16]. In order to avoid such stalling during an expensive update operation, a distance sensitivity oracle is essential.

Existing oracles. Most existing works on distance sensitivity oracles deal with only a constant number of failures in a directed graph or a variable number of failures in an undirected graph [8, 2, 3, 6, 9, 1, 13]. Weimann and Yuster propose the only oracle that can answer queries containing a variable number of failures in a directed graph [19]. Their oracle is constructed with a randomized Monte-Carlo algorithm, and they provide a theoretical guarantee that it finds the exact answer of a given query with high probability. Their oracle has Õ(n^(4−α))¹ preprocessing time, Õ(n^(3−α)) space, and Õ(n^(2−2(1−α)/f)) query time, where f is a parameter for the maximum number of failures, f = o(log n / log log n), and α is a parameter in (0, 1) specifying the trade-off among preprocessing time, space, and query time. However, since the costs for space and preprocessing time are too expensive, their oracle is not applicable to real-life datasets. In fact, all existing distance sensitivity oracles for a directed graph with real-valued edge weights suffer from ω(n^2) cost for space and ω(mn) cost for preprocessing time. Due to these costs, the existing distance sensitivity oracles cannot be applied to a network with even several thousand nodes.

Contributions. The main contribution of this paper is to propose two practical distance sensitivity oracles for a variable number of edge failures: an exact distance sensitivity oracle and an approximate distance sensitivity oracle.

Our exact oracle is based on two ingredients: a distance graph and a k-path cover. A distance graph is a small graph summarizing some useful path information about the entire graph. A k-path cover is a set of nodes such that every path of k consecutive nodes in the entire graph includes at least one node in this set. These ingredients were already applied to shortest path computation in [10], but not to the distance sensitivity problem. The core structure of our exact oracle is a distance graph constructed based on a k-path cover. While the idea of using a k-path cover to construct a distance graph appears in [10], the density of the distance graph is not considered there. We propose an effective method to construct a sparser distance graph by discovering a relationship between a k-path cover and an independent set in graph theory. Note that this relationship provides the theoretical foundation for using the distance graph based on an independent set in our exact oracle. Based on the distance graph, we design an efficient query algorithm for our exact oracle. This query algorithm is a Dijkstra-like algorithm, and it finds the correct answer for any query on the distance graph instead of G.

Even though our exact oracle is faster than Dijkstra's algorithm, its distance graph is usually very dense. Thus, we devise an effective way of sparsifying the distance graph with a theoretical guarantee for accuracy. Using this sparsified distance graph, given two parameters α ≥ 1 and fS ≥ 0, we derive our approximate oracle, which guarantees an α-approximation ratio for any query having at most fS failed edges. This oracle is more efficient than our exact oracle while achieving a bounded approximation ratio for such queries. For any query having more failed edges than fS, it still shows good accuracy.

We conduct extensive experiments to evaluate our distance sensitivity oracles. In the experiments, we use two competitors: Dijkstra's algorithm and a recent fully dynamic approximate distance oracle [17]. Because none of the existing distance sensitivity oracles is applicable to real-life datasets, we do not use any of them as a competitor. We demonstrate that our oracles are much faster than the competitors in terms of query time in most cases. Along with the efficient query time, the preprocessing time and the space of our exact oracle are better than those of the fully dynamic distance oracle. The space of our approximate oracle is much smaller than that of our exact oracle and the fully dynamic distance oracle, while its preprocessing time is somewhat behind that of the fully dynamic distance oracle. The accuracy of our approximate oracle is similar to that of the fully dynamic distance oracle. Meanwhile, our approximate oracle is faster than our exact oracle in terms of query time, but it requires more expensive preprocessing. This paper presents the first distance sensitivity oracle that handles practical graphs with millions of nodes.

Outline. The rest of this paper is organized as follows. In Section 2, we review existing and applicable oracles for the distance sensitivity problem. The distance sensitivity problem is formulated in Section 3. We propose the exact distance sensitivity oracle in Section 4 and the approximate distance sensitivity oracle in Section 5. Then, we evaluate the efficiency and the accuracy of our oracles on various real-life datasets in Section 6 and conclude the paper in Section 7.

2. RELATED WORK

The distance sensitivity problem has been mainly studied in the theory literature. Demetrescu et al. [8] proposed two oracles for handling a single network failure. Their first oracle has O(1) query time, Õ(n^2) space, and Õ(mn^2 + n^3) preprocessing time. Their second oracle has O(1) query time, Õ(n^2.5) space, and Õ(mn^1.5 + n^2.5) preprocessing time. Bernstein and Karger [2] improved the preprocessing time to Õ(√m · n^2) using random sampling, with O(1) query time and Õ(n^2) space. They improved the preprocessing time again to Õ(mn) with a deterministic construction algorithm [3], keeping the same query time and space. For integer edge weights in the interval [−M, M], Weimann and Yuster [18] proposed an oracle, constructed by a randomized algorithm, that has Õ(n^(1+α)) query time, Õ(n^(3−α)) space, and Õ(M n^(ω+1−α)) preprocessing time, where ω < 2.373 is the matrix multiplication exponent and α is a parameter in (0, 1). Improving on the oracle in [18], Grandoni and Williams [11] proposed an oracle that has Õ(M n^(ω+1/2) + M n^(ω+α(4−ω))) preprocessing time, Õ(n^(1−α)) query time, and Õ(n^2.5) space. For dual network failures, Duan and Pettie [9] proposed an oracle that requires Õ(1) query time, Õ(n^2) space, and polynomial preprocessing time that is ω(mn). For a variable number of failures, Weimann and Yuster proposed an oracle constructed by a randomized algorithm in [18, 19]; this oracle was introduced first in [18] and completed in [19]. As reviewed above, all the existing distance sensitivity oracles for a directed graph with real-valued edge weights suffer from ω(n^2) space and ω(mn) preprocessing time.
There is only one study of the distance sensitivity problem that includes experiments. Qin et al. [13] proposed a distance sensitivity oracle that works only for undirected unweighted graphs with a single edge failure. This work applies only to a limited class of graphs, and the datasets used in its experiments are rather small.

One non-trivial way of handling the distance sensitivity problem is to use an existing fully dynamic distance oracle [12, 7, 15, 14, 17]. There is a recent fully dynamic approximate distance oracle, named LCA, based on landmarks, without any theoretical guarantee for accuracy [17]. The query time of this oracle is O(lD), where l is the number of landmarks and D is the diameter of the graph, while the update time is O(l(m + n log n)). We will compare our distance sensitivity oracles with this fully dynamic distance oracle in the experiments to show that a fully dynamic distance oracle is inappropriate for solving the distance sensitivity problem.

¹ f(n) = Õ(g(n)) if f(n) = O(g(n) log^k n) for some constant k > 0.

3. PROBLEM DEFINITION

To represent a general network, we consider a directed graph G = (V, E), where V is the set of nodes and E is the set of edges. This is because in many cases an edge failure can be unidirectional. For example, road networks can be represented as undirected graphs, but edge failures (road construction, a demonstration, a car accident, etc.) can be unidirectional. Given a graph G, for any node v in G, the set of inbound neighbors of v is denoted as n_G^in(v) and the set of outbound neighbors of v is denoted as n_G^out(v). In addition, we denote the empty set as ∅.

We define a path P to be a sequence of edges in E. Even though P is a sequence, when we focus on its elements rather than their order, we handle it as a set of edges with set operations such as intersection. Every edge (u, v) ∈ E is associated with a weight, denoted as w(u, v), which is a non-negative real value. Given any path P, the distance of P is defined to be d(P) = Σ_{(u,v)∈P} w(u, v). In addition, the length of P, denoted as |P|, is defined to be the number of hops of P. For any two nodes u, v ∈ V and a set of edges E* ⊆ E, the shortest (distance) path from u to v in the graph (V, E \ E*) is denoted as P(u, v, E*) and its distance is denoted as d(u, v, E*).

Definition 3.1 (Distance Sensitivity Problem). Given a directed graph G = (V, E), the distance sensitivity problem is to answer any query (s, t, F), where s is the start node, t is the destination node, and F is a set of at most f failed edges, by computing d(s, t, F).

One trivial solution for the distance sensitivity problem is Dijkstra's algorithm. It does not utilize any preprocessed structure, and its running time is O(m + n log n) with a Fibonacci heap. Another trivial solution is storing the answer of every possible query, which takes O(n^(2+2f)) space, since O(n^2) comes from all pairs of nodes in V and O(n^(2f)) comes from all possible combinations of at most f edges among the O(n^2) possible edges. Thus, a non-trivial distance sensitivity oracle should be faster than Dijkstra's algorithm with a data structure of size lower than O(n^(2+2f)).

4. EXACT ORACLE

This section describes our exact oracle, called the DIStance graph-based Oracle (DISO), for the distance sensitivity problem. DISO is based on a compact data structure called a distance graph D. At a high level, given a query (s, t, F), DISO uses D to compute d(s, t, F) with a Dijkstra-like algorithm. In the following subsections, we first formally define the distance graph and explain how to use it for answering queries. Then, we show an effective strategy to choose the node set of a distance graph based on the concept of the k-path cover. Note that for all theorems and lemmas in this section and the next section, the proofs are given in the Appendix. Table 1 summarizes the notations frequently used in Section 4 and Section 5.

Table 1: Frequently used notations
  G            : the input graph
  D            : a distance graph
  D̃            : a sparsified distance graph
  n_G^out(v)   : the outbound neighbors of node v in graph G
  n_G^in(v)    : the inbound neighbors of node v in graph G
  P(s, t, F)   : the shortest path from s to t in the graph (V, E \ F)
  d(s, t, F)   : the distance of P(s, t, F)
  d(s, t)      : the distance of P(s, t, ∅) (= d(s, t, ∅))
  d(P)         : the distance of path P
  d_D(u, v)    : the shortest distance from u to v in D

4.1 Distance Graph-based Query Processing

4.1.1 Motivation and Definition

Consider a set of nodes C ⊆ V, a graph D having C as its node set, and a query (s, t, F). Suppose that for some 1 ≤ L ≤ |P(s, t, F)| − 1, there are nodes u_1, u_2, ..., u_L ∈ C included in P(s, t, F). Then, we can divide P(s, t, F) into L + 1 sub-paths: P(s, u_1, F), P(u_1, u_2, F), ..., P(u_L, t, F). Suppose that d(s, u_1, F) and d(u_L, t, F) are known and that for each sub-path P(u_i, u_j, F) with 1 ≤ i < j ≤ L, the edge (u_i, u_j) exists in D. Then, by letting the weight of every edge (x, y) in D be d(x, y, F), we can compute d(s, t, F) on D instead of G using a Dijkstra-like algorithm. If we can efficiently compute d(s, u_1, F), d(u_L, t, F), and D, this entire procedure can be much faster than computing d(s, t, F) via Dijkstra's algorithm on G.

Figure 1: Examples of an input graph and a distance graph: (a) an input graph; (b) the distance graph of the input graph with {2, 4, 5, 7, 9} as the node set.

In order to make this idea concrete, we formally define the distance graph, the core structure of DISO, as follows:

Definition 4.1 (Distance Graph). Given any set of nodes C ⊆ V, a distance graph is defined to be a graph D = (V_D, E_D) where V_D = C and E_D ⊆ C × C. For any node pair (u, v) ∈ C × C, (u, v) is included in E_D if there exists at least one path from u to v in G which does not pass through any other node in C.
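To make the edge set of Definition 4.1 concrete, the topological structure of a distance graph can be computed by a traversal from each node of C that stops whenever it reaches another node of C. The following is a minimal Python sketch of this idea; it is our own illustration under assumed data layouts (`adj` as an adjacency-list dict, `C` as the chosen node set), not the authors' implementation:

```python
def distance_graph_edges(adj, C):
    """Compute E_D per Definition 4.1: (u, v) is an edge iff some path
    from u to v in G passes through no other node of C (illustrative sketch)."""
    C = set(C)
    edges = set()
    for u in C:
        # DFS from u that treats every other C-node as a sink.
        stack, seen = [u], {u}
        while stack:
            x = stack.pop()
            for y in adj.get(x, ()):
                if y in seen:
                    continue
                seen.add(y)
                if y in C:
                    edges.add((u, y))   # reached another C-node: record edge
                else:
                    stack.append(y)     # keep traversing through non-C nodes
    return edges
```

Because each of the |C| traversals stops at C-nodes, this matches the paper's remark that the topology can be built with one bounded traversal per node of C in preprocessing.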
Each edge (u, v) ∈ E_D is associated with a weight denoted as w_D(u, v). For any two nodes u, v ∈ V_D, the shortest distance from u to v in D is denoted as d_D(u, v). For the rest of this section, consider that we have a set of nodes C ⊆ V and the distance graph D having C as its node set. Figure 1 shows an example input graph and the distance graph derived from a node set. Intuitively, any path from u to v in G that does not pass through any other node in C is mapped to the edge (u, v) in the distance graph D. In Figure 1, three paths in the input graph are mapped to the edge (4, 7) in the distance graph: ⟨(4, 7)⟩, ⟨(4, 8), (8, 7)⟩, and ⟨(4, 8), (8, 10), (10, 7)⟩.

Given a query (s, t, F), let us define the out-access nodes of s as the nodes v ∈ C such that v ≠ s and P(s, v, F) does not contain any node in C as an intermediate node. The set of such nodes is denoted as A^out(s). A property of the out-access nodes is that, by definition, every shortest path in the graph (V, E \ F) from s to any node in C must pass through at least one out-access node of s. Similarly, we define the in-access nodes of t as the nodes v ∈ C such that v ≠ t and P(v, t, F) does not contain any node in C as an intermediate node. The set of such nodes is denoted as A^in(t). Every shortest path from any node in C to t in the graph (V, E \ F) must pass through at least one in-access node of t. For any two nodes u ∈ A^out(s) and v ∈ A^in(t), we define

  d_D^{u,v}(s, t, F) = d(s, u, F) + d_D(u, v) + d(v, t, F).    (1)

Then, we define

  d_D(s, t, F) = min_{(u,v) ∈ A(s,t)} d_D^{u,v}(s, t, F)    (2)

where A(s, t) is the set of node pairs (u, v) such that u ∈ A^out(s) and v ∈ A^in(t).

4.1.2 Construction

Let us discuss the construction of the distance graph D given a node set C. The topological structure of the distance graph can be computed by executing a graph traversal algorithm from each node of C in preprocessing. Next, we discuss how to assign a weight w_D(u, v) to each edge (u, v) ∈ E_D. Given any two nodes x ∈ V and y ∈ C, for any edge set E* ⊆ E, let us denote the shortest path from x to y in the graph (V, E \ E*) that does not pass through any node in C \ {x, y} as P̂(x, y, E*) and its distance as d̂(x, y, E*). When E* = ∅, i.e., for the graph (V, E), P̂(x, y, E*) becomes P̂(x, y, ∅). Note that P̂(x, y, E*) may differ from P(x, y, E*). For example, in Figure 1, suppose that w(1, 3) > 1 and all other edge weights in the input graph are equal to 1. Then P̂(1, 5, {(1, 5)}) is ⟨(1, 3), (3, 5)⟩, whereas P(1, 5, {(1, 5)}) is ⟨(1, 2), (2, 5)⟩. On the other hand, it is noteworthy that if y ∈ A^out(x), then P̂(x, y, E*) equals P(x, y, E*) (i.e., d̂(x, y, E*) = d(x, y, E*)). Based on this, for any edge (u, v) ∈ E_D and any failed edge set F ⊆ E, we set w_D(u, v) to d̂(u, v, F). This weighting scheme gives the following lemma.

Lemma 4.1. For any two nodes u, v ∈ C and any failed edge set F ⊆ E, the edge weighting scheme guarantees that d_D(u, v) = d(u, v, F).

4.1.3 A Core Tool: Bounded Dijkstra's Algorithm

The procedure of Dijkstra's algorithm for computing shortest distances with a priority queue is called the Dijkstra process. In the Dijkstra process, a node popped from the priority queue is called a settled node. Because the failed edge set is given only at query time, we sometimes need to locally compute single-source shortest distances in our query algorithm. For this, we simply modify Dijkstra's algorithm so that it does not traverse beyond a node x ∈ C once x is settled, unless x is the source. This modified version of Dijkstra's algorithm is called the bounded Dijkstra's algorithm. For any node v ∈ V, let us denote by Â^out(v) the set of nodes other than v which are included in C and settled by the bounded Dijkstra's algorithm from v in the outbound direction. Similarly, let us denote by Â^in(v) the set of nodes other than v which are included in C and settled by the bounded Dijkstra's algorithm from v in the inbound direction. For example, when {(9, 10)} is given as the failed edge set for the input graph of Figure 1, Â^out(1) = {2, 5} and Â^in(10) = {4, 7}. It is easy to see that the distance from a node u to a node v computed by the bounded Dijkstra's algorithm on the graph (V, E \ F) is actually d̂(u, v, F). Given a query (s, t, F), we will use the bounded Dijkstra's algorithm to maintain the edge weighting scheme for the distance graph and to obtain supersets of A^out(s) and A^in(t) based on the following lemma.

Lemma 4.2. For any node v ∈ V, A^out(v) ⊆ Â^out(v) and A^in(v) ⊆ Â^in(v).

4.1.4 The Query Algorithm

Consider that a query (s, t, F) is given. For ease of understanding, we first provide the outline of the query algorithm of DISO.

1. Access nodes computation: This step obtains Â^out(s) and Â^in(t) using the bounded Dijkstra's algorithm on the graph (V, E \ F).

2. Distance computation: Using Â^out(s) and Â^in(t), this step computes d_D(s, t, F). It includes maintaining the edge weighting scheme for D; we explain below in detail how to maintain the scheme efficiently.

Computing access nodes. Given a query (s, t, F), it is not trivial to compute A^out(s) and A^in(t) exactly, unless we run Dijkstra's algorithm from s and from t in G until all nodes in C are settled. Instead, we use Â^out(s) and Â^in(t), which can be efficiently computed by running the bounded Dijkstra's algorithm from s in the outbound direction and from t in the inbound direction, respectively. Let us define Â(s, t) as the set of node pairs (u, v) such that u ∈ Â^out(s) and v ∈ Â^in(t). For any pair (u, v) ∈ Â(s, t), d̂(s, u, F) and d̂(v, t, F) are obtained as by-products of computing Â^out(s) and Â^in(t), respectively. Because A(s, t) ⊆ Â(s, t) by Lemma 4.2, we substitute A(s, t), d(s, u, F), and d(v, t, F) in (1) and (2) with Â(s, t), d̂(s, u, F), and d̂(v, t, F) to compute d_D(s, t, F). Note that this substitution does not harm the correctness of the query algorithm, as will be proved.

Maintaining the edge weighting scheme. Because it can be expensive to compute d̂(u, v, F) for every edge (u, v) ∈ E_D given a query (s, t, F), we propose an efficient maintenance technique for the edge weights in D called lazy recomputation. For each edge (u, v) in E_D, w_D(u, v) is set to d̂(u, v, ∅) in preprocessing. w_D(u, v) is recomputed to d̂(u, v, F) at query time only if P̂(u, v, ∅) contains some failed edge and w_D(u, v) is needed to compute the query answer. Otherwise, we can use d̂(u, v, ∅) as the weight of (u, v) in D, because d̂(u, v, F) = d̂(u, v, ∅) when P̂(u, v, ∅) does not contain any failed edge. For such recomputations, we use the bounded Dijkstra's algorithm on the graph (V, E \ F). Because the bounded Dijkstra's algorithm acts as a single-source shortest path algorithm, we actually consider recomputations over nodes, not edges. In what follows, we explain how to implement lazy recomputation efficiently. For any node u ∈ C, if u has an edge (u, v) in D such that P̂(u, v, ∅) contains at least one of the failed edges in F, u is called an affected node. The query algorithm of DISO is a Dijkstra-like algorithm, which means it includes the Dijkstra process. For any settled node u ∈ C in the query algorithm, if u is an affected node, the weights of all edges from u in D are recomputed by the bounded Dijkstra's algorithm from u in the outbound direction. That is, for every node v ∈ n_D^out(u), w_D(u, v) is set to d̂(u, v, F). Lazy recomputation must be faster than recomputing the weights of the edges from all affected nodes. However, because finding the affected nodes itself can be expensive, we need a more advanced technique to implement lazy recomputation efficiently.
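The bounded Dijkstra's algorithm, which underlies both access-node computation and lazy recomputation, can be sketched as follows. This is a hedged illustration under assumed data layouts (`adj` as an adjacency-list dict, `w` as an edge-weight dict, `F` as a set of failed edges), not the authors' code; it returns the settled C-nodes together with their d̂-distances, i.e., Â^out of the source:

```python
import heapq

def bounded_dijkstra(adj, w, source, C, F=frozenset()):
    """Dijkstra on (V, E \\ F) that never relaxes edges out of a settled
    node x in C (unless x is the source). Returns {x: d-hat(source, x, F)}."""
    C = set(C)
    dist = {source: 0.0}
    settled_C = {}
    pq = [(0.0, source)]
    done = set()
    while pq:
        d, x = heapq.heappop(pq)
        if x in done:
            continue                     # stale priority-queue entry
        done.add(x)
        if x in C and x != source:
            settled_C[x] = d             # x is settled: record and stop here
            continue                     # do not traverse beyond a C-node
        for y in adj.get(x, ()):
            if (x, y) in F:
                continue                 # skip failed edges
            nd = d + w[(x, y)]
            if nd < dist.get(y, float("inf")):
                dist[y] = nd
                heapq.heappush(pq, (nd, y))
    return settled_C
```

Running the same procedure on the reversed graph gives Â^in of the source; this is what Lines 2 and 19 of Algorithm 2 rely on.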
Motivated by this, we propose an efficient way of computing a node set containing affected nodes. d(s, t, F ) is presented in Algorithm 2. For any node v ∈ C ∪{t}, dD (v) keeps the minimum among observed distances from s to v. In Line 2, Âout (s) and Âin (t) are computed by the bounded Dijkstra’s algorithm. After that, the suspicious nodes are computed in Line 3. Lines 4-22 describe the procedure to compute dD (s, t, F ) like the Dijkstra’s algorithm. This procedure can be efficiently implemented with a priority queue. While the Dijkstra’s algorithm starts with a single node, this procedure starts with nodes in Âout (s). In Lines 18-19, if v is a suspicious node, the edge weights associated with v are recomputed by the bounded Dijkstra’s algorithm. The other part in Lines 4-22 is identical to the Dijkstra’s algorithm. After t is settled, the algorithm reˆ t, F ). Note that turns the minimum between dD (t) and d(s, ˆ d(s, t, F ) can be computed together when Âout (s) and Âin (t) are computed. For example, suppose that all edge weights of the graph in Figure 1 are equal to 1 and a query (1, 10, {(5, 7)}) is given. Then, the query algorithm computes Âout (1) = {2, 5}, Âin (10) = {4, 7, 9}, and S = {2, 4, 5}. After that, it finds P (1, 10, {(5, 7)}) = h(1, 5), (5, 4), (4, 7), (7, 10)i. It should be noted that ∞ is assigned to wD (5, 7) right after 5 is settled in the query algorithm. Algorithm 2: Distance Graph-based Query Algorithm input output Lemma 4.3. Given a failed edge set F ⊆ E, for any node v ∈ C, v is not an affected node, if there is no edge (x, y) ∈ F such that there exists some path from v to x in G which does not contain any node in C \ {v, x}. 1 2 3 4 5 From Lemma 4.3, we define that a node v ∈ C is a suspicious node, if there is some edge (x, y) ∈ F such that there exists some path from v to x in G which does not contain any node in C \ {v, x}. 
By definition, all affected nodes are also suspicious nodes, so we can consider to use the suspicious nodes instead of the affected nodes in the query algorithm. Lemma 4.3 implies that we can efficiently compute all suspicious nodes by conducting multiple times a graph traversal method such as the depth-first search. Based on this, the algorithm to compute suspicious nodes is described in Algorithm 1. Let us denote the set of all paths from a node u to a node v in G as π(u, v). In Algorithm 1, Sx is the set of nodes each of which is a suspicious node due to (x, y) ∈ F and it can be efficiently computed by using the depth-first search from x in the inbound direction once. Note that in the depth-first search, if it visits some node v in C, it does not traverse beyond v, because we do not have to consider paths containing any node in C as an intermediate node. 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 Algorithm 1: Find Suspicious(C, F ) input 1 2 3 4 5 6 : s: the start node, t: the destination node, F : the failed edge set : d(s, t, F ): The shortest distance from s to t in the graph (V, E \ F ) begin Compute Âout (s) and Âin (t); S :=Find Suspicious(C, F ); for v ∈ C ∪ {t} do dD (v) := ∞; for v ∈ Âout (s) do ˆ v, F ); dD (v) := d(s, dD (s) := 0; Initialize a priority queue Q with the nodes in Âout (s); while Q is not empty do v := arg minu∈Q dD (u); Pop v from Q; if v = t then break; if v ∈ Âin (t) then Push t into Q if dD (t) = ∞; ˆ t, F )}; dD (t) := min{dD (t), dD (v) + d(v, if v ∈ S then ˆ x, F ) for any (v, x) ∈ ED ; wD (v, x) := d(v, for n ∈ nout D (v) do Push n into Q if dD (n) = ∞; dD (n) := min{dD (n), dD (v) + wD (v, n)}; ˆ t, F )}; return min{dD (t), d(s, For simplicity, Algorithm 2 is constructed as a variation of the classic (not bidirectional) Dijkstra’s algorithm. If we construct this query algorithm based on a more efficient online shortest path algorithm like the bidirectional Dijkstra’s algorithm, the query algorithm will run faster. 
The suspicious node computation used by the query algorithm is as follows.

input : C: the node set of a distance graph, F: the set of failed edges
output : S: the set of suspicious nodes
begin
  S := ∅;
  for (x, y) ∈ F do
    S_x := {v ∈ C | ∃P ∈ π(v, x) s.t. P ∩ (C \ {v, x}) = ∅};
    S := S ∪ S_x;
  return S;

Putting it together. Based on the techniques we introduced, we obtain the entire query algorithm of DISO.

Correctness. We now prove the correctness of this query algorithm.

Lemma 4.4. For any query (s, t, F), if P(s, t, F) contains some node in C, then

    d̂(s, t, F) = d_D(s, t, F) = d(s, t, F).    (3)

By definition, for any query (s, t, F), if P(s, t, F) does not contain any node in C, then d̂(s, t, F) = d(s, t, F). This proposition and the previous lemma directly imply the correctness of the query algorithm.

Theorem 4.1. For any query (s, t, F), the query algorithm of DISO correctly finds d(s, t, F).

Cost analysis. We denote the average cost of the bounded Dijkstra's algorithm by c_B; c_B depends on the structure of G and C. It is easy to see that Lines 2-9 require O(c_B + |C| log |C|) time, where O(|C| log |C|) comes from building the priority queue Q. The procedure described in Lines 10-22 corresponds to the Dijkstra process and uses O(|E_D| + |C| log |C| + |S|_avg · c_B) time, where |S|_avg is the average of |S|. Thus, the total query time is O(|E_D| + |C| log |C| + |S|_avg · c_B). Meanwhile, it is straightforward that the preprocessing time to construct a distance graph is O(n|C|) when C is given. The total space required by this query algorithm is O(|E_D|).

4.2 Toward a Good Distance Graph

A good distance graph for DISO has a small number of nodes and edges. This subsection presents an effective method for constructing such a distance graph for DISO.

4.2.1 A k-Path Cover and its Effect

Because a set of nodes identifies a unique distance graph, we focus on choosing a node set for a good distance graph. For this purpose, we first define a k-path cover.

Definition 4.2 (k-Path Cover). For some integer k > 0, a k-path cover C is a set of nodes such that every simple path consisting of k nodes in G passes through at least one of the nodes in C.

For example, the node set of the distance graph in Figure 1 is a 3-path cover in the input graph. It is easy to see that all paths consisting of 3 nodes in the input graph pass through at least one of 2, 4, 5, 7, and 9. There can be many k-path covers in G, but we focus on a minimal k-path cover.

Our strategy for obtaining a good distance graph starts with selecting a minimal k-path cover as the node set of a distance graph. Let us examine how a minimal k-path cover C affects DISO when the distance graph derived from C is used for DISO. The most important property of C is that the length of the longest path between any two nodes of C in G is bounded by k. It guarantees that the bounded Dijkstra's algorithm from a node v searches at most k hops from v. When k is much smaller than the diameter of G, c_B, the average cost of the bounded Dijkstra's algorithm, must be smaller than the cost of the plain Dijkstra's algorithm. Without such a guarantee, c_B is asymptotically the same as the cost of the Dijkstra's algorithm. Meanwhile, it is easy to see that when k is large, the size of C becomes small. Thus, there is a trade-off in the query time depending on k, so we find an appropriate value for k by increasing k until the query time converges to its minimum in experiments.

4.2.2 Selecting a Better k-Path Cover based on an Independent Set

Computing the minimum k-path cover is NP-hard [4]. Funke et al. [10] introduced an efficient and effective heuristic method to compute a minimal k-path cover. However, they do not consider the sparsity of the distance graph (the overlay multigraph in [10]) derived from the k-path cover. Thus, the distance graph derived from their k-path cover may be very dense, which harms the efficiency of DISO. Motivated by this, we propose an effective method to construct a distance graph considering its density, based on the independent set, a well-known concept in graph theory.

An independent set I is a set of nodes in V such that no two nodes in I are adjacent. Let us explain how a distance graph for G is derived from I. Initially, we have a graph G_I = (V_I, E_I) where V_I = V and E_I = E. For each node v ∈ I, we remove v from G_I together with the edges incident to v. Then, we add every edge (x, y) ∈ n^in_G(v) × n^out_G(v) into G_I if (x, y) ∉ E_I. After this process is finished, because I is an independent set, it is easy to see that V_I = V \ I, V_I is a minimal 2-path cover in G, and G_I is a distance graph.

Given a maximal independent set computation method GetIS(·), we construct a distance graph in a more elaborate way as follows. Consider a distance graph D_i = (V_i, E_i) for 0 ≤ i ≤ t − 1, where t is some parameter larger than 0. Initially, D_0 = G. We iteratively compute an independent set IS_i by calling GetIS(D_i) and construct a new distance graph D_{i+1} for G with V_{i+1} = V_i \ IS_i. This procedure is described in Algorithm 3 and gives the following lemma.

Algorithm 3: IS-based Path Cover
input : G: the input graph, t: the parameter for the number of iterative steps
output : D_t: the distance graph
1 begin
2   Copy G to D_0;
3   for i = 0 to t − 1 do
4     IS_i := GetIS(D_i);
5     D_{i+1} := a distance graph with V_i \ IS_i;
6   return D_t;

Lemma 4.5. For every integer parameter t ≥ 1, V_t is a minimal 2^t-path cover.

Independent Set Selection. The remaining part of the distance graph construction is the computation of the independent set in Algorithm 3. For each step 0 ≤ i ≤ t − 1, Algorithm 3 computes an independent set for D_i that is used to derive D_{i+1}. Thus, if we can find an independent set such that the numbers of nodes and edges in D_{i+1} are minimized, the output distance graph eventually has a small number of nodes and edges. However, finding such an independent set is a hard problem: even finding the maximum independent set, which only minimizes the number of nodes in D_{i+1}, is NP-hard.
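The node-removal step described above and the iteration of Algorithm 3 can be sketched as follows. This is a minimal sketch, not the paper's implementation: `greedy_independent_set` is a naive placeholder for GetIS (the paper's Algorithm 4 uses a σ-based greedy instead), and graphs are plain dicts `{u: {v: weight}}`.

```python
from itertools import product

def contract_independent_set(graph, ind_set):
    """Remove an independent set and add shortcut edges so that the
    remaining nodes (a 2-path cover of the old node set) keep a distance
    graph over them.  graph: {u: {v: weight}} directed adjacency."""
    # in-neighbours are needed for the shortcut step
    preds = {u: {} for u in graph}
    for u, nbrs in graph.items():
        for v, w in nbrs.items():
            preds[v][u] = w
    for v in ind_set:
        # add (x, y) for every in-neighbour x and out-neighbour y of v
        for (x, wx), (y, wy) in product(preds[v].items(), graph[v].items()):
            if x == y:
                continue
            w = wx + wy
            if y not in graph[x] or graph[x][y] > w:  # keep the cheapest shortcut
                graph[x][y] = w
                preds[y][x] = w
        # delete v together with its incident edges
        for x in preds[v]:
            graph[x].pop(v, None)
        for y in graph[v]:
            preds[y].pop(v, None)
        del graph[v], preds[v]
    return graph

def greedy_independent_set(graph):
    """A simple maximal independent set (placeholder for GetIS)."""
    adj = {u: set() for u in graph}
    for u, nbrs in graph.items():
        for v in nbrs:
            adj[u].add(v)
            adj[v].add(u)
    chosen, blocked = set(), set()
    for u in graph:
        if u not in blocked:
            chosen.add(u)
            blocked |= adj[u] | {u}
    return chosen

def is_based_path_cover(graph, t):
    """Algorithm 3 sketch: after t rounds the surviving nodes
    form a 2^t-path cover of the original graph (Lemma 4.5)."""
    for _ in range(t):
        contract_independent_set(graph, greedy_independent_set(graph))
    return graph
```

On the path 1→2→3→4 with unit weights, one round removes the independent set {1, 3} and leaves the distance graph {2→4} with the shortcut weight 2, preserving the 2–4 distance.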
Based on this background, we propose a greedy method to find a maximal independent set I such that the number of edges in the distance graph derived from I is effectively reduced. Our greedy method incrementally constructs an independent set and its derived distance graph as follows. Given an input distance graph D_i = (V_i, E_i), the greedy method starts by initializing I as the empty set and constructing a graph D_I = (V_I, E_I) where V_I = V_i and E_I = E_i. Then, it iteratively picks a node v from D_I such that v is not adjacent to any node in I on D_i and v minimizes the following:

    σ(v) = |{(x, y) ∈ NPair(v) \ E_I}| − (n^in_{D_i}(v) + n^out_{D_i}(v)),

where NPair(v) denotes n^in_{D_i}(v) × n^out_{D_i}(v). It is easy to see that σ(v) is the net contribution of v to the number of edges in the distance graph derived from the resulting independent set. After removing v, the new edges (x, y) ∈ NPair(v) \ E_I are added into D_I. This is repeated until all nodes in V_I are adjacent to some node in I on D_i (i.e., I is maximal). This method is described in Algorithm 4. In Algorithm 4, V*_I denotes the set of nodes in V_I that are not adjacent to any node in I on D_i.

Algorithm 4: GetIS(D_i)
input : D_i = (V_i, E_i): the distance graph
output : I: the independent set
1 begin
2   I := ∅;
3   Construct D_I = (V_I, E_I) where V_I = V_i and E_I = E_i;
4   while I is not maximal do
5     v := arg min_{v ∈ V*_I} σ(v);
6     Remove v from D_I and add it to I;
7     Add the new edges (x, y) ∈ NPair(v) \ E_I into D_I;
8   return I;

The idea behind this method is that it removes a node which minimizes the net contribution of the node to the number of edges in the distance graph derived from the resulting independent set. Thus, the distance graph computed by Algorithm 3 has a relatively smaller number of edges than the distance graph derived from a k-path cover in [10].

Complexity. Let us analyze the time complexity of Algorithm 3. The cost for GetIS(·) (Algorithm 4) is O(n(log n + deg²_avg)), where deg_avg is the average degree in the observed distance graphs. In Line 5 of Algorithm 3, we can get D_{i+1} without any additional cost, because D_I in Algorithm 4 becomes D_{i+1} after I is computed. Therefore, the total time complexity to construct a distance graph for DISO is O(tn(log n + deg²_avg)).

5. APPROXIMATE ORACLE

Given two parameters α ≥ 1 and f_S ≥ 0, we propose an α-approximate distance sensitivity oracle for any query having at most f_S failed edges, named Sparsification-based approximate DISO (SDISO). DISO and SDISO use the same query algorithm.

5.1 Background

Consider a distance graph D = (V_D, E_D) where V_D is a k-path cover C. Let us define a sparsified distance graph D̃ = (V_D̃, E_D̃) as a subgraph of D such that V_D̃ = V_D and E_D̃ ⊆ E_D. All notations and functions for a distance graph are defined for a sparsified distance graph in the same way. Given a path P, V(P) denotes the set of nodes in P.

For any query (s, t, F) such that V(P(s, t, F)) ∩ C = ∅, because d̂(s, t, F) equals d(s, t, F), the query algorithm finds the correct answer with any distance graph, including D̃. Thus, we want to find a sparsified distance graph D̃ satisfying that for any two parameters α ≥ 1 and f_S ≥ 0, and any query (s, t, F) such that V(P(s, t, F)) ∩ C ≠ ∅ and |F| ≤ f_S, d(s, t, F) ≤ d_D̃(s, t, F) ≤ αd(s, t, F). We call such a sparsified distance graph an α-distance graph. For simplicity, whenever we mention the α-approximation, it means the α-approximation for any query having at most f_S failed edges. If we use D̃ for the query algorithm instead of D, then it guarantees the α-approximation. However, finding the α-distance graph with a minimum number of edges is equivalent to the minimum t-spanner problem, which is NP-hard [5]. Thus, we propose a heuristic sparsification algorithm to find an α-distance graph with a small number of edges.

Suppose that a sparsified distance graph D̃ satisfies the following: for any edge (x, y) ∈ E_D and any failed edge set F such that |F| ≤ f_S, d(x, y, F) ≤ d_D̃(x, y) ≤ αd(x, y, F). If we use D̃ for SDISO, it is easy to prove that SDISO guarantees the α-approximation. Based on this observation, we consider a process to find an α-distance graph as follows. Consider a graph D̃ = (V_D̃, E_D̃) where V_D̃ = V_D and E_D̃ = E_D. For each edge (u, v) ∈ E_D, we test whether, if (u, v) is removed from D̃, then for any edge (x, y) ∈ E_D, d(x, y, F) ≤ d_D̃(x, y) ≤ αd(x, y, F). If it is true, then we remove (u, v) from D̃ and go on to check the next edge. After this process is finished, D̃ becomes an α-distance graph with a small number of edges. Our sparsification algorithm is an implementation of this process. In the subsequent subsections, we discuss how to efficiently determine whether an edge can be deleted from D̃ during this process.

5.2 Disjoint Path-based Upper Bound

Consider an intermediate sparsified distance graph D̃ during the sparsification process. We want to determine whether removing an edge from D̃ violates the α-approximation, but it is hard to do so during preprocessing, because a failed edge set is only known after a query is given. To handle this issue, we use the concepts of an upper bound and a lower bound. For any edge (u, v) ∈ E_D and any failed edge set F such that |F| ≤ f_S, we propose an upper bound for d_D̃(u, v) and a lower bound for d(u, v, F). For d(u, v, F), we use d(u, v, ∅), abbreviated as d(u, v), as a lower bound, because it is trivial that d(u, v) ≤ d(u, v, F). We now construct an upper bound for d_D̃(u, v).

For any path P_G in G and any path P_D̃ in D̃ such that they have the same start node and the same destination node, we say that P_G corresponds to P_D̃ if all nodes in P_D̃ are included in P_G and P_G does not include any other node in V_D̃. This means that if P_G does not contain any failed edge, then it can be used as P_D̃ in the query algorithm. Note that any path in G corresponds to at most one path in D̃.

Figure 2: A sparsified distance graph

Figure 2 presents a sparsified distance graph of the distance graph in Figure 1. For example, the path ⟨(5, 4), (4, 8), (8, 10), (10, 7)⟩ in the input graph of Figure 1 corresponds to the path ⟨(5, 4), (4, 7)⟩ in the sparsified distance graph in Figure 2. Then, we define the following concept:

Definition 5.1 (Detour). Given an edge (u, v) ∈ E_D, the detours of (u, v) are defined as the paths from u to v in G each of which corresponds to a path in D̃ that is not the path consisting of the single edge (u, v).

Figure 3: Example detours

Figure 3 presents some detours of edge (5, 7) of the sparsified distance graph in Figure 2. All the detours in Figure 3 pass through node 4, which is included in the node set of the sparsified distance graph. Note that the path ⟨(5, 6), (6, 9), (9, 10), (10, 7)⟩ in the input graph of Figure 1 cannot be a detour of (5, 7), because there is no path in the sparsified distance graph that it corresponds to.

Consider an edge (u, v) ∈ E_D̃ to be tested during the sparsification process. One noteworthy property of the detours of (u, v) is that even if (u, v) is removed from D̃, the paths in D̃ corresponding to the detours can still be used in the query algorithm. Based on this, we consider a set of f_S + 1 edge-disjoint detours π_U(u, v) from u to v in G. We denote the set of paths in D̃ each of which a path in π_U(u, v) corresponds to as π*_U(u, v). Let us denote the maximum distance among the paths in π_U(u, v) as d_U(u, v). Then, d_U(u, v) is an upper bound for d_D̃(u, v), because at least one path in π_U(u, v) does not contain any failed edge. d_U(u, v) is called a disjoint path-based upper bound for d_D̃(u, v). We will discuss how to select an edge-disjoint detour set in Section 5.5.

Note that for any edge (u, v) ∈ E_D, d_U(u, v), π_U(u, v), and π*_U(u, v) are computed when we test whether (u, v) can be removed in the sparsification process. This setting guarantees that when we test it, all edges on the paths in π*_U(u, v) are included in E_D̃. After removing (u, v), E_D̃ is changed, and for the next edge (u', v') to be tested, we compute d_U(u', v'), π_U(u', v'), and π*_U(u', v').

Let us explain how to use the disjoint path-based upper bound d_U(u, v) and the lower bound d(u, v) for an edge (u, v) ∈ D̃. Suppose that d_U(u, v) ≤ αd(u, v). Then, we can guarantee that even if we remove (u, v) from D̃, d(u, v, F) ≤ d_D̃(u, v) ≤ αd(u, v, F). In this way, we can preserve the α-approximation when we remove the first single edge from D̃ = D. However, removing multiple edges by just applying this test repeatedly may violate the α-approximation. In the next subsection, we explain the reason and propose an elegant way to remove multiple edges while keeping the α-approximation.

5.3 Permitted Approximation Ratio

Suppose that there is an edge (x, y) already removed from D̃ during the sparsification process such that d_U(x, y) ≤ αd(x, y), and suppose that there is an edge (u, v) ∈ E_D̃ such that some path P_D̃ ∈ π*_U(x, y) includes (u, v) and we now want to test whether (u, v) can be removed. In this case, if (u, v) is removed from D̃, then P_D̃ cannot be used in the query algorithm, because P_D̃ contains (u, v). Consider the path, denoted as P_G, in π_U(x, y) corresponding to P_D̃. Then, we need to find another path P'_G instead of P_G for π_U(x, y) and update d_U(x, y) if d(P'_G) > d_U(x, y). However, computing such a path directly may cause an expensive cost. For this reason, we propose an efficient way of handling this without explicitly computing such a path and updating d_U(x, y).

The key idea is to give a different approximation ratio to each edge in D̃. Consider the alternative paths from x to y in G, each of which consists of the sub-path of P_G from x to u, a path in π_U(u, v), and the sub-path of P_G from v to y. For any failed edge set F such that |F| ≤ f_S, it is easy to see that there is at least one path, among such alternative paths and the paths in π_U(x, y) \ {P_G}, which does not contain any failed edge. In addition, each alternative path P'_G satisfies the following:

    d(P'_G) ≤ d(P_G) − d(u, v) + d_U(u, v).

Suppose that d_U(u, v) ≤ (αd(x, y)/d(P_G)) · d(u, v). Then,

    d(P_G) − d(u, v) + d_U(u, v) ≤ d(P_G) − d(u, v) + (αd(x, y)/d(P_G)) · d(u, v)
                                 = d(P_G) + ((αd(x, y) − d(P_G))/d(P_G)) · d(u, v)
                                 ≤ d(P_G) + ((αd(x, y) − d(P_G))/d(P_G)) · d(P_G)
                                 = αd(x, y),

where the last inequality uses d(u, v) ≤ d(P_G) and d(P_G) ≤ d_U(x, y) ≤ αd(x, y). Therefore, we can guarantee that d(P'_G) ≤ d(P_G) − d(u, v) + d_U(u, v) ≤ αd(x, y). With this guarantee, we can avoid explicitly computing some alternative path for P_G. We generalize this idea using the following concept, called the permitted approximation ratio.

Definition 5.2 (Permitted Approximation Ratio). For any edge (u, v) ∈ E_D, let us define the dependent set λ(u, v) as the set of edges (x, y) ∈ E_D \ E_D̃ (i.e., already removed edges) such that a path in π*_U(x, y) contains (u, v). For any edge (u, v) ∈ E_D, the permitted approximation ratio θ(u, v) is defined as follows:

    θ(u, v) = min_{(x, y) ∈ λ(u, v)} θ_{x,y}(u, v)   if |λ(u, v)| > 0,
    θ(u, v) = α                                      otherwise,

where θ_{x,y}(u, v) = θ(x, y)d(x, y)/d_{u,v}(x, y) and d_{u,v}(x, y) is the distance of the path in π_U(x, y) corresponding to the path in π*_U(x, y) containing (u, v).

In the sparsification process, θ(u, v) is initially set to α for every edge (u, v) ∈ E_D, because D̃ is identical to D at first. As the sparsification process goes along, θ(u, v) is updated for the remaining edges (u, v) in D̃. Note that once an edge (u, v) is removed from D̃, θ(u, v) is never changed again. Based on this, we derive the following theorem.

Theorem 5.1. After the sparsification process is finished, the α-approximation holds for any query having at most f_S failed edges, if for any edge (u, v) ∈ E_D \ E_D̃,

    d_U(u, v) ≤ θ(u, v)d(u, v).
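The bookkeeping behind Theorem 5.1 can be sketched as follows. This is a simplified sketch with hypothetical inputs: `upper_bound` stands in for the detour-selection routine and must return a disjoint path-based upper bound together with the π*_U paths (as edge lists), and a path's length is approximated here by summing the lower bounds of its edges.

```python
def sparsify(edges, d, alpha, upper_bound):
    """Sketch of the permitted-approximation-ratio bookkeeping.
    edges: candidate edges of the distance graph D (as (u, v) tuples);
    d: dict of lower bounds d(u, v);
    upper_bound(e): hypothetical helper returning (d_U(e), star_paths),
    where each path in star_paths is a list of edges still in D-tilde."""
    theta = {e: alpha for e in edges}        # permitted ratio, initially alpha
    removed = set()
    for e in edges:
        dU, star_paths = upper_bound(e)
        if dU <= theta[e] * d[e]:            # the test of Theorem 5.1
            removed.add(e)
            for path in star_paths:
                d_path = sum(d[g] for g in path)   # approximate d(P)
                for g in path:
                    if g not in removed:     # theta is frozen once removed
                        theta[g] = min(theta[g], theta[e] * d[e] / d_path)
    return [e for e in edges if e not in removed]
```

For instance, with α = 4, removing an edge whose detour is only 1.2× longer than the edge itself tightens the permitted ratio of every surviving edge on that detour to 4/1.2 ≈ 3.33, so later removals cannot silently compound the error past α.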
5.4 Sparsification Algorithm

Based on the above techniques, the sparsification algorithm is presented in Algorithm 5. Because D̃ is identical to D at first, θ(x, y) is set to α for all (x, y) ∈ E_D in Line 3. In Lines 4-10, the sparsification process is executed over the edges (x, y) ∈ E_D. In Line 5, the algorithm computes a disjoint path-based upper bound d_U(x, y) for π_U(x, y) and the path set π*_U(x, y) that π_U(x, y) corresponds to. According to Theorem 5.1, the algorithm tests whether d_U(x, y) ≤ θ(x, y)d(x, y). If it is true, (x, y) is removed from D̃. In addition, for each edge (u, v) on the paths in π*_U(x, y), if (u, v) is not yet removed from D̃, we update θ(u, v).

Algorithm 5: Sparsify(G, f_S, α)
input : G: the input graph, D: the distance graph, f_S: the maximum number of failures, α: the approximation ratio we want to achieve
output : D̃: the sparsified distance graph
1 begin
2   Copy D to D̃;
3   θ(x, y) := α for all (x, y) ∈ E_D;
4   for (x, y) ∈ E_D do
5     Compute d_U(x, y) and π*_U(x, y);
6     if d_U(x, y) ≤ θ(x, y)d(x, y) then
7       Remove (x, y) from D̃;
8       for P ∈ π*_U(x, y) do
9         for (u, v) ∈ P ∩ E_D̃ do
10          θ(u, v) := min{θ(u, v), θ(x, y)d(x, y)/d(P)};
11  return D̃;

This sparsification algorithm finds an α-distance graph with a minimal number of edges in terms of the permitted approximation ratio. Regarding efficiency, this algorithm does not include optimization. Nevertheless, it removes a sufficient number of edges that are unnecessary while keeping the α-approximation.

Let us analyze the time complexity of Algorithm 5. Let us denote the cost for computing a disjoint path-based upper bound as c_U. The cost for removing (x, y) in Line 7 is negligible compared with that of the update of θ(u, v) in Lines 8-10. Thus, the total time complexity is O(m(c_U + f_S|P|_avg)), where |P|_avg is the average size of P in Line 8.

5.5 Efficient Detour Selection

Let us explain how to efficiently select edge-disjoint detours for any edge (u, v) ∈ E_D. In order to eventually remove many edges from D̃, we could look for the f_S + 1 edge-disjoint detours of (u, v) such that the maximum distance among them is minimized or the sum of their distances is minimized. However, for any integer p > 1, finding p edge-disjoint paths from one node to another such that the maximum distance among them is minimized or the sum of their distances is minimized is NP-hard [20]. Because we can only consider detours, not general paths, finding such f_S + 1 edge-disjoint detours would be even more complex. In addition, there are usually many edges in E_D and we must compute an edge-disjoint detour set for each of them, so we cannot spend much time on this computation per edge.

Against this background, we propose an efficient heuristic algorithm that iteratively selects f_S + 1 edge-disjoint detours of the input edge (u, v) ∈ E_D. Assume that for any edge (x, y) ∈ E_D̃, w_D̃(x, y) is set to d̂(x, y, ∅) (i.e., (x, y) in D̃ corresponds to P̂(x, y, ∅)). The idea of the algorithm is that, for efficiency, we compute the edge-disjoint detours using D̃, not G. The algorithm is described in Algorithm 6. Note that this algorithm only returns d_U(u, v) and π*_U(u, v), because the actual detours are not used in the sparsification algorithm.

Algorithm 6: DisjointDetour(G, D̃, (u, v), f_S)
input : G: the input graph, D̃: the intermediate sparsified distance graph, (u, v): the edge in E_D, f_S: the maximum number of failures
output : d_U(u, v): the disjoint path-based upper bound, π*_U(u, v): the path set that π_U(u, v) corresponds to
1 begin
2   π*_U(u, v) := ∅, ∆ := {(u, v)};
3   while |π*_U(u, v)| < f_S + 1 do
4     Find the shortest path P_D̃ from u to v in D̃ avoiding the edges in ∆;
5     π*_U(u, v) := π*_U(u, v) ∪ {P_D̃};
6     ∆ := ∆ ∪ Ẽ(P_D̃);  /* Ẽ(P_D̃): edges in D̃ mapped from paths which overlap the detour corresponding to P_D̃ */
7   return max_{P ∈ π*_U(u, v)} d(P) as d_U(u, v), and π*_U(u, v);

∆ is the set of edges in D̃ that are avoided when the shortest path is computed in Line 4. In Line 4, the shortest path P_D̃ from u to v in D̃ avoiding the edges in ∆ is computed; then there exists the detour corresponding to P_D̃. Ẽ(P_D̃) denotes the set of edges (x, y) in E_D̃ such that P̂(x, y, ∅) overlaps this detour. For example, for the graph in Figure 1 and the sparsified distance graph in Figure 2, if P_D̃ = ⟨(5, 4), (4, 7)⟩, P̂(9, 7, ∅) = ⟨(9, 10), (10, 7)⟩, and the detour corresponding to P_D̃ is ⟨(5, 4), (4, 8), (8, 10), (10, 7)⟩, then Ẽ(P_D̃) = {(5, 4), (4, 7), (9, 7)}. By adding the edges in Ẽ(P_D̃) into ∆, we guarantee the edge-disjointness between the detours of (u, v) corresponding to the paths in π*_U(u, v). Once |π*_U(u, v)| = f_S + 1, we obtain d_U(u, v) as the maximum distance among the paths in π*_U(u, v).

Line 4 can be efficiently implemented by the Dijkstra's algorithm on D̃, modified to skip the edges marked as included in ∆. Thus, computing the shortest path P_D̃ in Line 4 requires O(|C| log |C| + |E_D̃|) time. Therefore, the total time complexity is c_U = O(f_S(|C| log |C| + |E_D| + |Ẽ(P_D̃)|_avg)), where |Ẽ(P_D̃)|_avg is the average size of Ẽ(P_D̃) in Line 6.

More efficient computation. D is sometimes extremely dense when G comes from a scale-free network. This may strongly affect the efficiency of Algorithm 6, especially the computation of the shortest path in Line 4. Thus, we propose a more efficient technique for Line 4 as follows. Let us denote the shortest path from a node u to a node v in D̃ as P_D̃(u, v). Given the input edge (u, v) in Algorithm 6, for any parameter β > 0, we define the β-length shortest path tree T_v of v as the shortest path tree of v in the inbound direction consisting of the nodes w such that |P_D̃(w, v)| ≤ β.

Consider that we want to find a path P'_D̃ from u to v in D̃ avoiding the edges in ∆, instead of P_D̃ in Line 4 of Algorithm 6. There are at most |n^out_D̃(u)| paths from the outbound neighbors of u to v in T_v. Based on them, we can utilize at most |n^out_D̃(u)| paths from u to v via an outbound neighbor of u in D̃ which do not contain any edge in ∆. We use the shortest path among them as P'_D̃. Because the number of edges in T_v is O(|C|), finding P'_D̃ requires O(|C|) time. This technique is called the shortest path tree-based detour computation.

Let us explain how to use the β-length shortest path tree efficiently. Suppose that in the sparsification algorithm, the edges of E_D are tested in the order of the ID value of their target node. Then, for any node v ∈ V_D, the inbound edges of v in D are tested consecutively. Thus, if we construct T_v once when the first inbound edge of v is tested, we can reuse T_v when the subsequent inbound edges of v are tested. In this way, we can efficiently use T_v to find edge-disjoint detours.

With this technique, the time complexity of Algorithm 6 changes to O(f_S(|C| + |Ẽ(P'_D̃)|_avg)), and the time complexity of Algorithm 5 gains an additional term |C|c_β, where c_β is the cost of computing a β-length shortest path tree. The benefit of the reduction of c_U outweighs the cost of the new term |C|c_β.

6. EXPERIMENTS

Our distance sensitivity oracles are evaluated with extensive experiments.

In several datasets, there exist multiple edges defined over the same node pair; among those edges, we only take the minimum-weight edge, which of course does not affect correctness. In addition, DBLP and YOU are undirected graphs; we make them directed by adding an edge (v, u) for each edge (u, v) ∈ E. Table 2 shows the detailed statistics of all the datasets.

Edge weights. For the road network datasets, we use the travel time as the weight of each edge. Since the social network datasets do not provide any information about edge weights, we set the weight of each edge to a real value that is sampled uniformly at random from 0 to 1.
6.1 Experimental Environment

We implement the algorithms in C++ and run the experiments on an Intel(R) i7-3970X 3.50 GHz CPU machine with 64GB RAM.

Comparison methods. Recall that none of the existing distance sensitivity oracles for directed graphs can be used on practical datasets. We compare four methods. DISO is our exact oracle. SDISO is our approximate oracle. DI is the classic (not bidirectional) Dijkstra's algorithm with a binary heap. FDDO is a fully dynamic distance oracle without any theoretical guarantee on accuracy. This oracle is one of the four algorithms proposed in [17], named LCA (for Lowest Common Ancestor), which has efficient query time with good accuracy. For this algorithm, we use 20 landmarks that are selected by a method proposed in [17], named the best coverage technique. In addition, we revise their update algorithm to make it work for a weighted directed graph as described in [17].

Query generation. Because there is no well-known public dataset including both a graph structure and failures, we generate queries synthetically in the following way. Consider a query (s, t, F) and an edge e ∈ F. If F is made by simply selecting f edges uniformly at random, there can be edges e such that P(s, t, F) = P(s, t, F \ {e}). We say that such edges are not essential to constructing P(s, t, F). In order to guarantee that F contains only essential edges for P(s, t, F), we generate queries (s, t, F) by the following procedure. First, we select two nodes uniformly at random from V and assign them to s and t, respectively. Next, after initializing F = ∅, we iteratively pick an edge x uniformly at random from P(s, t, F), add x into F, and recompute P(s, t, F), until |F| = f. Because we pick an edge in P(s, t, F) at each iteration, every edge in F contributes to the change of the shortest path from s to t. Thus, this method makes P(s, t, F) quite different from the shortest path from s to t in G.
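The query-generation procedure can be sketched as follows. This is a sketch under the stated assumptions: `dijkstra_path` is a plain Dijkstra that returns the shortest path as an edge list while avoiding a banned edge set, and endpoints are re-drawn if the failures disconnect them.

```python
import heapq
import random

def dijkstra_path(graph, s, t, banned):
    """Shortest s -> t path avoiding edges in `banned`; edge list or None.
    graph: {u: {v: weight}} directed adjacency."""
    dist, prev, pq, seen = {s: 0.0}, {}, [(0.0, s)], set()
    while pq:
        du, u = heapq.heappop(pq)
        if u in seen:
            continue
        seen.add(u)
        if u == t:
            break
        for v, w in graph.get(u, {}).items():
            if (u, v) in banned:
                continue
            nd = du + w
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(pq, (nd, v))
    if t not in prev and s != t:
        return None
    path, v = [], t
    while v != s:
        path.append((prev[v], v))
        v = prev[v]
    return path[::-1]

def gen_query(graph, f, rng=random):
    """Draw (s, t, F): repeatedly fail a random edge on the *current*
    shortest path, so every failed edge is essential."""
    nodes = list(graph)
    while True:
        s, t = rng.sample(nodes, 2)
        F, feasible = set(), True
        while len(F) < f:
            path = dijkstra_path(graph, s, t, F)
            if not path:          # s and t got disconnected; redraw endpoints
                feasible = False
                break
            F.add(rng.choice(path))
        if feasible and dijkstra_path(graph, s, t, F):
            return s, t, F
```

Each failed edge lay on the shortest path at the moment it was chosen, so none of them is redundant for P(s, t, F).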
In addition to these methods, we compare the IS-based path cover algorithm (Algorithm 3), denoted as IS-Cover, with the k-path cover algorithm in [10], named Pruning. Pruning finds a minimal k-path cover considering the number of nodes in the resulting cover. In Pruning, we set the nodes in V to be visited in increasing order of the sum of the in-degree and the out-degree of a node. This setting is easy to implement and shows good effectiveness in [10].

Datasets. In the experiments, we use six network datasets: Epinions, NY, DBLP, Youtube, CAL, and USA. Epinions, DBLP, and Youtube were published by Jure Leskovec (http://snap.stanford.edu/data/). NY, CAL, and USA were published in the 9th DIMACS Implementation Challenge (http://www.dis.uniroma1.it/challenge9/). More specifically, Epinions is an online social network representing who-trusts-whom, abbreviated as EPI. DBLP is a co-authorship network where two authors are directly connected if they have co-authored at least one paper. Youtube is the social network of a video-sharing web site where users can befriend each other, abbreviated as YOU. NY is a road network of New York City, CAL is a road network of California and Nevada, and USA is the entire road network of the USA.

Table 2: Statistics of the real-life datasets

Dataset  Type    |V|    |E|    Avg. deg.  Max deg.
EPI      Social  76K    509K   6.7        3.0K
NY       Road    264K   734K   2.76       8
DBLP     Social  317K   1M     6.6        0.7K
YOU      Social  1.1M   3.0M   5.3        57.7K
CAL      Road    1.9M   4.7M   2.45       7
USA      Road    24.0M  58.3M  2.41       9

6.2 Results

In the experiments, unless |F| is explicitly specified, we set |F| = 5 when generating queries. We generate 100 synthetic queries for each dataset, and the results are averaged over the queries. Due to space limitations, we sometimes present results only for some of the datasets; in such cases, the results for the other datasets show a tendency similar to that of the presented results.

Table 3: Results for varying t (DISO)

                 IS-Cover                    Pruning
Dataset  t   Q(ms)  |C|      |ED|       Q(ms)  |C|      |ED|
NY       1   38.84  142.24K  578.57K    46.26  140.83K  579.45K
         2   24.95  90.10K   508.00K    30.73  89.05K   546.31K
         3   18.56  62.97K   479.05K    23.38  59.67K   593.58K
         4   15.15  47.20K   472.53K    21.68  44.74K   740.94K
         5   12.97  37.25K   476.80K    23.82  37.93K   1,013.56K
         6   11.93  30.63K   483.32K    -      -        -
         7   11.50  25.89K   484.07K    -      -        -
         8   11.52  22.43K   489.14K    -      -        -
EPI      1   17.0   24.6K    470.3K     18.27  22.43K   488.98K
         2   13.6   11.4K    570.1K     14.88  9.66K    709.71K
         3   14.3   7.7K     744.9K     23.05  6.16K    1,465.7K

Figure 4: Results over various datasets (|F| = 5): (a) query time, (b) approximation error

The effect of t (comparing with [10]). Recall that t is the parameter determining k when we select a k-path cover, with k = 2^t for some t > 0. The experimental results for the effect of t are described in Table 3, which also includes the comparison between IS-Cover and Pruning. Let Q denote the query time. From the results, we find that the distance graph from IS-Cover is sparser than that from Pruning, so the query time of DISO with IS-Cover is smaller than that with Pruning. As discussed earlier, manipulating k (i.e., t) affects the query time of DISO and SDISO. We find an appropriate value for t by increasing t from 1 until the query time reaches its minimum; after the query time hits the minimum, it increases as t increases, and we stop increasing t when the query time increases for the first time. In the rest of the experiments, t is set to 7, 10, 15, 2, 5, and 3 for NY, CAL, USA, EPI, DBLP, and YOU, respectively.
Overall performance and accuracy. We compare the four methods in terms of query time and accuracy over all the datasets. Because the query time of FDDO is too large for USA, we omit the result of FDDO for USA. The query time of each method is presented in Figure 4(a). For all the datasets, the query times of DISO and SDISO are much smaller than those of FDDO and DI. SDISO is as much as 2.2 times faster than DISO. Note that FDDO is slower than DI on many datasets, because the cost of the update operation of FDDO is comparable to the cost of running the Dijkstra's algorithm multiple times. This result shows that FDDO is inappropriate for the distance sensitivity problem. The approximation errors of SDISO and FDDO are shown in Figure 4(b). For all the datasets, the approximation error of SDISO is lower than 7.1%, which is much lower than the value of α we set. Based on these results, we can see that the graph sparsification algorithm for SDISO is very effective.

Parameter testing for SDISO. We test the three parameters f_S, α, and β for SDISO. The results are presented in Table 4. The default setting is f_S = 1 for both NY and EPI, α = 4 for NY, α = 8 for EPI, and β = 4 for EPI. From this setting, we vary each parameter. The approximation error is defined as (APP − OPT)/OPT, where OPT is the optimum and APP is an approximation. In the table, the approximation error is denoted as Error, and the ratio between |E_D̃| and |E_D| is denoted as R. Note that the shortest path tree-based detour computation is applied only for the scale-free network datasets (i.e., the social networks), because for a bounded-degree network (i.e., a road network), the original detour set selection is sufficiently efficient.

Table 4: The results of parameter testing for SDISO

              NY                         EPI
      Q(ms)  Error(%)  R(%)      Q(ms)  Error(%)  R(%)
f_S
 0    5.14   36.25     13.95     4.74   5.91      8.30
 1    6.64   6.63      27.56     5.39   2.09      13.09
 2    8.06   1.34      47.02     6.48   1.03      18.39
 3    10.36  0.14      78.59     7.13   0.52      25.08
 4    11.43  0.00      99.96     8.22   0.20      32.93
 5    11.40  0.00      100.00    9.08   0.18      41.19
α
 2    7.76   0.98      39.24     5.78   0.31      15.31
 4    6.64   6.63      27.56     5.47   1.57      13.54
 6    6.54   12.02     24.72     5.39   2.09      13.09
 8    6.21   14.02     23.39     5.37   2.87      12.88
β
 2    -      -         -         6.62   0.63      23.67
 4    -      -         -         5.39   2.09      13.09
 6    -      -         -         5.48   1.29      14.00
 8    -      -         -         6.02   0.76      18.49

The results show that when f_S ≥ 3, the number of edges removed in the sparsification process for NY is very small. This is because the number of edge-disjoint paths from a node u to a node v in G is bounded by the minimum of the out-degree of u and the in-degree of v. Thus, in NY, the number of edge-disjoint paths between two nodes that can be used for the sparsification is very small, because NY is a bounded-degree graph. Nevertheless, even when f_S = 0, the error is only 36.25% for NY and 5.91% for EPI. Next, because α is the parameter for the approximation ratio, as it increases, the query time decreases and the error increases. Finally, when β = 4, the query time is the smallest with reasonable accuracy.

For explanation, consider a β-length shortest path tree used to compute an edge-disjoint detour set π_U(u, v). It is easy to see that if β is sufficiently large, the shortest paths from the outbound neighbors of u to v in T_v may overlap each other; otherwise, such shortest paths are likely to be edge-disjoint of each other. For example, when β = 1, for all outbound neighbors of u that are included in T_v, the shortest paths from the neighbors to v are edge-disjoint of each other. However, if β is too small, it is likely that T_v does not include many of the outbound neighbors of u. Thus, β is determined experimentally for efficiency and effectiveness.

For the rest of the experiments, we determine the parameters experimentally. As a result, f_S is set to 1 for all datasets, and α is set to 4 for the road network datasets and to 8 for the social network datasets. β is set to 4, 6, and 4 for EPI, DBLP, and YOU, respectively.
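The β-length shortest path tree discussed above can be sketched as a hop-bounded reverse Dijkstra. This is a sketch only: the paper builds T_v on the sparsified distance graph D̃, and a node whose shortest path to v uses more than β edges is simply excluded here.

```python
import heapq

def beta_shortest_path_tree(graph, v, beta):
    """Parents and distances of the nodes w whose shortest w -> v path
    uses at most `beta` edges (sketch of the beta-length SPT of v).
    graph: {u: {x: weight}} directed adjacency."""
    # reverse adjacency: rev[x] = {w: weight of edge (w, x)}
    rev = {u: {} for u in graph}
    for w, nbrs in graph.items():
        for x, wt in nbrs.items():
            rev[x][w] = wt
    dist, hops, parent = {v: 0.0}, {v: 0}, {}
    pq = [(0.0, v)]
    while pq:
        dx, x = heapq.heappop(pq)
        if dx > dist.get(x, float("inf")):
            continue                 # stale queue entry
        if hops[x] == beta:
            continue                 # do not grow the tree past beta edges
        for w, wt in rev[x].items():
            nd = dx + wt
            if nd < dist.get(w, float("inf")):
                dist[w], hops[w], parent[w] = nd, hops[x] + 1, x
                heapq.heappush(pq, (nd, w))
    return parent, dist
```

On the unit-weight path 1→2→3→4 with β = 2, the tree of node 4 contains nodes 3 and 2 but excludes node 1, whose shortest path needs 3 edges.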
Table 5: The results for varying |F| (NY)

           Query time (ms)                    Error (%)
|F|   DISO   SDISO  DI      FDDO        SDISO  FDDO
1     10.47  5.35   106.89  262.37      6.69   7.13
3     10.57  5.73   102.36  685.49      6.75   6.01
5     11.50  6.64   103.10  947.29      6.63   2.47
7     13.12  7.65   118.09  1289.57     7.12   7.54
9     14.66  9.20   125.80  1405.29     6.46   1.07

We also conduct experiments with different values of |F|. As presented in Table 5, the query time of FDDO is the most strongly affected by |F|. One interesting observation is that while the approximation error of SDISO is very stable, that of FDDO fluctuates somewhat. Because FDDO is based on landmark heuristics, its accuracy can be more sensitive to the queries. Overall, our distance sensitivity oracles outperform the competitors in terms of query time. SDISO is the most efficient method in terms of query time while achieving accuracy similar to that of FDDO.

Table 6: The preprocessing time and the index size of DISO, SDISO, and FDDO

         Index size (MB)               Preprocessing time (s)
Data   DISO   SDISO  FDDO        DISO    SDISO    FDDO
EPI    12.6   1.7    47.7        4.5     43.5     92.1
NY     10.6   3.0    232.6       6.0     23.4     122.5
DBLP   49.4   7.8    279.0       29.0    1552.2   308.6
YOU    279.4  19.0   998.7       1470.8  10917.3  1633.7
CAL    34.9   8.9    1,663.9     60.0    184.9    1091.1
USA    455.7  77.4   -           1785.5  24506.7  -

With such efficiency, as described in Table 6, the index sizes of DISO and SDISO are more compact than that of FDDO. Because SDISO is based on graph sparsification, the index size of SDISO must be smaller than that of DISO. In terms of preprocessing time, DISO outperforms SDISO and FDDO, and the preprocessing time of SDISO is comparable to that of FDDO.

7. CONCLUSIONS

The distance sensitivity problem is an important variation of the point-to-point shortest distance problem that can be used in many applications. It has been mainly studied in the theory literature, but all the existing works on this problem for directed graphs suffer from ω(n²) space and ω(mn) preprocessing time. This paper presents two practical distance sensitivity oracles that work efficiently for directed graphs. The first oracle is the exact oracle, named DISO, based on the distance graph. In order to compute a good distance graph for the exact oracle, we utilize the relationship between the k-path cover and the independent set. Next, we propose an efficient approximate distance oracle, named SDISO, with reasonable accuracy. This oracle is derived by applying a sparsification algorithm to the distance graph of the exact oracle while controlling the loss of accuracy. We conduct extensive experiments to evaluate our distance sensitivity oracles. The experimental results demonstrate that our distance sensitivity oracles outperform the competitors in all cases in terms of query time. Compared with the recent fully dynamic distance oracle in [17], DISO is more efficient in terms of preprocessing time and space. SDISO is more efficient than the dynamic distance oracle in terms of space, while it has comparable preprocessing time; in addition, SDISO has accuracy similar to that of the dynamic oracle. Compared with DISO, SDISO is more efficient in terms of query time and space, but it requires more expensive preprocessing and incurs a small amount of error. This paper is the first work that handles practical graphs with million-level nodes.

8. REFERENCES

[1] S. Baswana, U. Lath, and A. S. Mehta. Single source distance oracle for planar digraphs avoiding a failed node or link. In SODA, pages 223–232, 2012.
[2] A. Bernstein and D. Karger. Improved distance sensitivity oracles via random sampling. In SODA, pages 34–43, 2008.
[3] A. Bernstein and D. Karger. A nearly optimal oracle for avoiding failed vertices and edges. In STOC, pages 101–110, 2009.
[4] B. Brešar, F. Kardoš, J.
[5] L. Cai and D. Corneil. Tree spanners. SIAM J. Discrete Math., 8:359–387, 1995.
[6] S. Chechik, M. Langberg, D. Peleg, and L. Roditty. f-sensitivity distance oracles and routing schemes. In ESA, pages 84–96, 2010.
[7] C. Demetrescu and G. F. Italiano. A new approach to dynamic all pairs shortest paths. J. ACM, 51(6):968–992, Nov. 2004.
[8] C. Demetrescu, M. Thorup, R. A. Chowdhury, and V. Ramachandran. Oracles for distances avoiding a failed node or link. SIAM J. Comput., 37(5):1299–1318, Jan. 2008.
[9] R. Duan and S. Pettie. Dual-failure distance and connectivity oracles. In SODA, pages 506–515, 2009.
[10] S. Funke, A. Nusser, and S. Storandt. On k-path covers and their applications. Proc. VLDB Endow., 7(10):893–902, June 2014.
[11] F. Grandoni and V. Williams. Improved distance sensitivity oracles via fast single-source replacement paths. In FOCS, pages 748–757, 2012.
[12] V. King. Fully dynamic algorithms for maintaining all-pairs shortest paths and transitive closure in digraphs. In FOCS, pages 81–89, 1999.
[13] Y. Qin, Q. Z. Sheng, and W. E. Zhang. SIEF: Efficiently answering distance queries for failure prone graphs. In EDBT, pages 145–156, 2015.
[14] L. Roditty and U. Zwick. Dynamic approximate all-pairs shortest paths in undirected graphs. In FOCS, pages 499–508, 2004.
[15] M. Thorup. Fully-dynamic all-pairs shortest paths: Faster and allowing negative cycles. In SWAT, pages 384–396, 2004.
[16] M. Thorup and U. Zwick. Approximate distance oracles. In STOC, pages 183–192, 2001.
[17] K. Tretyakov, A. Armas-Cervantes, L. García-Bañuelos, J. Vilo, and M. Dumas. Fast fully dynamic landmark-based estimation of shortest path distances in very large graphs. In CIKM, pages 1785–1794, 2011.
[18] O. Weimann and R. Yuster. Replacement paths via fast matrix multiplication. In FOCS, pages 655–662, Oct. 2010.
[19] O. Weimann and R. Yuster. Replacement paths and distance sensitivity oracles via fast matrix multiplication. ACM Trans. Algorithms, 9(2):14:1–14:13, Mar. 2013.
[20] P. Zhang and W. Zhao. On the complexity and approximation of the min-sum and min-max disjoint paths problems. In Combinatorics, Algorithms, Probabilistic and Experimental Methodologies, volume 4614, pages 70–81, 2007.
Katrenič, and G. Semanišin. Minimum k-path vertex cover. Discrete Appl. Math., 159(12):1189–1195, July 2011. APPENDIX Proof of Lemma 4.1 If P (u, v, F ) contains any nodes in C except u and v, P (u, v, F ) can be divided into multiple sub-paths each of which does not contain any node in C as an intermediate node. For those paths P (x, y, F ), there exist corresponding edges (x, y) ˆ y, F ) = d(x, y, F ). Then, in D such that wD (x, y) = d(x, 12 dD (u, v) is the sum of the weights of such edges, and it is the same as d(u, v, F ). Otherwise, there exists an edge (u, v) ∈ ˆ v, F ) = d(u, v, F ). ED by definition and wD (u, v) = d(u, Therefore, this lemma is proved. and it does not contain any other node in Vi . Note that the length of the longest path (in terms of length) in G between two nodes in a x-path cover is x. In addition, suppose that there also exists a path Pv,w from v to some node w ∈ Vi in G such that its length is x and it does not contain any other node in Vi . Then, there exist edges (u, v) and (v, w) in Di . Thus, if v is included in ISi , u and w cannot be included in ISi . v is not included in Vi+1 either. Note that the path consisting of Pu,v and Pv,w does not contain any other node in Vi+1 except u and w and its length is 2x. This implies that this new path has the longest length among paths between nodes in Vi+1 . In addition, because ISi is maximal, if Vi is minimal, for any node v ∈ Vi+1 , there exists a path P from or to v such that |P | = 2x and it does not contain any node in Vi+1 as an intermediate node. This means that Vi+1 is also minimal. Therefore, Vi+1 is a minimal 2x-path cover. Because the initial node set, V0 , is a minimal 1-path cover, the result node set is a minimal 2t -path cover. 
Proof of Lemma 4.2
Because the bounded Dijkstra's algorithm only avoids traversing beyond nodes in C, for any two nodes v ∈ V and w ∈ C, if P(v, w, F) does not contain any node in C as an intermediate node, P(v, w, F) must be found by the bounded Dijkstra's algorithm. This means that w must be included in Âout(v). In the same way, we can prove that Ain(v) ⊆ Âin(v).

Proof of Lemma 4.3
The sufficient condition means that for any failed edge (x, y) ∈ F, all paths from v to x must pass through some node in C \ {v, x}. Thus, if the sufficient condition is true, for any edge (v, w) from v in D, all paths P from v to w which do not pass through any node in C as an intermediate node do not contain any failed edge in F. Therefore, v is not an affected node.

Proof of Lemma 4.4
We first prove dD(t) = dD(s, t, F). Consider a query (s, t, F) and the case that P(s, t, F) contains at least one node in C. The query algorithm starts with the nodes in Âout(s). Thus, for any v ∈ C, if v is popped from Q, then

    dD(v) = min_{u ∈ Âout(s)} { d̂(s, u, F) + dD(u, v) }.

In Lines 18-19, the algorithm properly updates the weights of edges (v, x) ∈ ED if v is suspicious. Thus, by Lemma 4.3, dD(v) is also properly computed. In addition, dD(t) keeps min_{v ∈ POP(t)} { dD(v) + d̂(v, t, F) } where POP(t) is the set of nodes popped from Q among the in-access nodes of t. Therefore, by Lemma 4.2, after all,

    dD(t) = min_{(u,v) ∈ Â(s,t)} { d̂(s, u, F) + dD(u, v) + d̂(v, t, F) } = dD(s, t, F).

Next, we prove dD(s, t, F) = d(s, t, F). P(s, t, F) can be divided into P(s, u1, F), P(u1, u2, F), ..., P(ul, t, F) where ui is a node included in C for any 1 ≤ i ≤ l and l is the number of nodes in P(s, t, F) ∩ C except s and t. It is easy to see that u1 and ul are included in Aout(s) and Ain(t), respectively, because P(s, u1, F) and P(ul, t, F) do not contain any other node in C. Thus, (u1, ul) is included in A(s, t). By Lemma 4.1, dD(u1, ul) = d(u1, ul, F). Therefore, we can guarantee

    dD(s, t, F) = min_{(u,v) ∈ A(s,t)} dD^{u,v}(s, t, F) = dD^{u1,ul}(s, t, F) = d(s, t, F).

Proof of Theorem 5.1
Recall that for any edge (x, y) ∈ ED, π*U(x, y) is computed when (x, y) is tested to be removed. Thus, for any edge (u, v) ∈ ED \ ED̃ (i.e., a removed edge), we can consider the sequence <(u1, v1), (u2, v2), ..., (ui = u, vi = v)> such that λ(u, v) = ∅ and, for any 1 ≤ j < i,

    (uj, vj) = arg min_{(x,y) ∈ λ(uj+1, vj+1)} θx,y(uj+1, vj+1).

Because there is no case that (u1, v1) = (ui, vi) (i.e., the sequence is not cyclic), we can suppose that all paths in π*U(ui, vi) do not contain any edge in ED \ ED̃. In order to prove the α-approximation for any query having at most fS failed edges, we claim that for any edge (uj, vj) in the sequence and any failed edge set F such that |F| ≤ fS, there exists a path Pj from uj to vj in G and a path Pj* in D̃ such that Pj corresponds to Pj*, Pj ∩ F = ∅, and d(Pj) ≤ θ(uj, vj)d(uj, vj). This guarantees that the query algorithm will consider some distance no greater than θ(uj, vj)d(uj, vj) as the distance between uj and vj.
Consider the last edge (ui, vi) in the sequence. Because all the paths in π*U(ui, vi) do not contain any edge in ED \ ED̃, we can guarantee that there is a path Pi from ui to vi in πU(ui, vi) such that Pi ∩ F = ∅ and d(Pi) ≤ θ(ui, vi)d(ui, vi). Then, consider an intermediate edge (uj, vj) in the sequence, and suppose that there is a path Pj from uj to vj in the graph (V, E \ F) such that d(Pj) ≤ θ(uj, vj)d(uj, vj). Next, consider the previous edge (uj−1, vj−1). It is easy to see that there exists a path Pj−1 from uj−1 to vj−1 such that Pj−1 includes Pj, Pj−1 ∩ F = ∅, and

    d(Pj−1) ≤ ψ − d(uj, vj) + θ(uj, vj)d(uj, vj)
            ≤ ψ + ((θ(uj−1, vj−1)d(uj−1, vj−1) − ψ) / ψ) · d(uj, vj)
            ≤ θ(uj−1, vj−1)d(uj−1, vj−1),

where ψ = d^{uj,vj}(uj−1, vj−1). Thus, the claim is proved. Because we picked (u, v) arbitrarily, the claim can be applied to every removed edge. This implies that for any edge (u, v) removed from D̃, some path P from u to v in the graph (V, E \ F) such that d(P) ≤ θ(u, v)d(u, v) ≤ αd(u, v) can be considered in the query algorithm. Therefore, it guarantees that the answer of the query algorithm is at most αd(s, t, F).

Proof of Lemma 4.5
Consider an intermediate iteration i in Lines 4-5 of Algorithm 3. Suppose that Vi is a minimal x-path cover. Then, Vi+1 is a minimal 2x-path cover. In order to prove this, consider a node u in Di. Suppose that there exists a path Pu,v from u to some node v ∈ Vi in G such that its length is x and it does not contain any other node in Vi. Note that the length of the longest path (in terms of length) in G between two nodes in an x-path cover is x. In addition, suppose that there also exists a path Pv,w from v to some node w ∈ Vi in G such that its length is x and it does not contain any other node in Vi. Then, there exist edges (u, v) and (v, w) in Di. Thus, if v is included in ISi, u and w cannot be included in ISi. v is not included in Vi+1 either. Note that the path consisting of Pu,v and Pv,w does not contain any other node in Vi+1 except u and w, and its length is 2x. This implies that this new path has the longest length among paths between nodes in Vi+1. In addition, because ISi is maximal, if Vi is minimal, for any node v ∈ Vi+1, there exists a path P from or to v such that |P| = 2x and it does not contain any node in Vi+1 as an intermediate node. This means that Vi+1 is also minimal. Therefore, Vi+1 is a minimal 2x-path cover. Because the initial node set, V0, is a minimal 1-path cover, the result node set is a minimal 2^t-path cover.
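The doubling step analyzed in this proof can be illustrated with a short sketch. This is a hedged toy implementation for an unweighted, undirected graph, not the paper's Algorithm 3; the function names (`halve_cover`, `cover_free_reach`) and the greedy maximal-independent-set rule are our own assumptions:

```python
from collections import deque

def cover_free_reach(adj, src, cover, x):
    """Cover nodes reachable from `src` by a path of at most x edges whose
    intermediate nodes all lie outside `cover` (a bounded BFS sketch)."""
    reached = set()
    queue = deque([(src, 0)])
    seen = {src}
    while queue:
        node, dist = queue.popleft()
        if dist == x:
            continue  # do not walk more than x hops
        for nxt in adj[node]:
            if nxt in cover:
                reached.add(nxt)  # cover nodes terminate the path
            elif nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return reached - {src}

def halve_cover(adj, cover, x):
    """One doubling iteration from the proof of Lemma 4.5: build the auxiliary
    graph D_i joining cover nodes linked by a cover-free path of <= x hops,
    take a greedy maximal independent set IS_i of D_i, and return the
    2x-path cover V_{i+1} = V_i \\ IS_i."""
    d_neighbors = {v: cover_free_reach(adj, v, cover, x) for v in sorted(cover)}
    independent, blocked = set(), set()
    for v in sorted(cover):  # deterministic greedy maximal independent set
        if v not in blocked:
            independent.add(v)
            blocked.add(v)
            blocked |= d_neighbors[v]  # D_i-neighbors cannot join IS_i
    return set(cover) - independent
```

On a 9-node path graph, starting from the trivial 1-path cover consisting of all nodes, one step keeps exactly the odd-numbered nodes, and every path with two edges still contains a cover node.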
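The bounded Dijkstra's algorithm underlying Lemma 4.2 can be sketched as follows. This is only a hedged illustration of the idea (search from a node, skip failed edges, and never expand beyond a cover node); the function name `bounded_dijkstra`, the adjacency-map representation, and the returned mapping are our assumptions, not the paper's interface:

```python
import heapq

def bounded_dijkstra(adj, src, cover, failed):
    """Dijkstra from `src` that records cover nodes but never expands them,
    and skips failed edges. Returns a map from each reached cover node to the
    length of a shortest path avoiding F with no cover node as an intermediate
    node, i.e., an estimate of the out-access nodes of `src`.
    `adj` maps u -> {v: weight}; `failed` is a set of (u, v) edges."""
    dist = {src: 0.0}
    access = {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        if u in cover and u != src:
            access[u] = d  # reached a cover node: record, do not expand
            continue
        for v, w in adj.get(u, {}).items():
            if (u, v) in failed:
                continue  # avoid failed edges in F
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return access
```

Deleting an edge only changes the result when the edge lies on every cover-free shortest path, which mirrors how the oracle restricts the effect of failures to local recomputation.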