Week 5 Depth-first search General remarks Recall: BFS Analysis of

CS 270
Algorithms
Week 5
General remarks
Oliver
Kullmann
Oliver
Kullmann
Analysing
BFS
Depth-first search
Depth-first
search
Analysing
DFS
Analysing BFS
Dags and
topological
sorting
2
Depth-first search
Detecting
cycles
3
Analysing DFS
1
CS 270
Algorithms
Analysing
BFS
We finish BFS, by analysing it.
Then we consider the second main graph-search algorithm,
depth-first search (DFS).
And we consider one application of DFS, topological
sorting of graphs.
Depth-first
search
Analysing
DFS
Dags and
topological
sorting
Detecting
cycles
Reading from CLRS for week 5
Chapter 22, Sections 22.2, 22.3, 22.4.
4
Dags and topological sorting
5
Detecting cycles
Recall: BFS
CS 270
Algorithms
Analysis of BFS
Oliver
Kullmann
BFS(G , s)
1
2
3
4
5
6
7
8
9
10
11
12
for each u ∈ V (G )
d[u] = ∞
π[s] = nil
d[s] = 0
Q = (s)
while Q 6= ()
u = Dequeue[Q]
for each v ∈ Adj[u]
if d[v ] = ∞
d[v ] = d[u] + 1
π[v ] = u
Enqueue(Q, v )
Analysing
BFS
Depth-first
search
Analysing
DFS
Dags and
topological
sorting
Detecting
cycles
CS 270
Algorithms
Oliver
Kullmann
Correctness Analysis: At termination of BFS(G , s), for every
vertex v reachable from s:
v has been encountered;
d[v ] holds the length of the shortest path from s to v ;
π[v ] represents an edge on a shortest path from v to s.
Time Analysis:
The initialisation takes time Θ(V ).
Each vertex is Enqueued once and Dequeued once;
these queueing operations each take constant time, so
the queue manipulation takes time Θ(V ) (altogether).
The Adjacency list of each vertex is scanned only when
the vertex is Dequeued, so scanning adjacency lists
takes time Θ(E ) (altogether).
The overall time of BFS is thus Θ(V + E ).
Analysing
BFS
Depth-first
search
Analysing
DFS
Dags and
topological
sorting
Detecting
cycles
Why do we get shortest paths?
Is it really true that we get always shortest paths (that is, using
the minimum number of edges)? Let’s assume that to some
vertex v there exists a shorter path P in G from s to v than
found by BFS. Let this length be d ′ < d[v ].
1
v 6= s, since the distance from s to s is zero (using the path
without an edge), and this is correctly computed by BFS.
2
Consider the predecessor u on that shorter path P.
3
If also d[u] would be wrong (that is, too big), than we
could use u instead of v . Thus w.l.o.g. d[u] is correct.
4
Now when exploring the neighbours of u, in case v is still
unexplored, it would get the correct distance d ′ = d[u] + 1.
5
So v must have been explored already earlier (than u).
6
So at the time of determining the distance d[u] ≤ d[v ] − 2,
the distance d[v ] must have been already set.
7
However the distances d[w ] set by BFS are non-decreasing!
The role of the queue (cont.)
CS 270
Algorithms
Oliver
Kullmann
Analysing
BFS
The vertices are taken off the queue (“dequeued”) from the
left, and are added (“enqueued”) to the right (“first in, first
out”).
What is added has a distance one more than what was
taken away.
We start with 0, and add some 1’s.
Then we take those 1’s, and add 2’s.
Once the 1’s are finished, we take the 2’s, and add 3’s.
And so on — that why the sequence of distances is
non-decreasing.
A crucial property of BFS is that the distances set in step 10 of
the algorithm are non-decreasing, that is,
Depth-first
search
if we imagine a watch-point set at step 10,
and monitor the stream of values d[v ], then we will see
0, 1, . . . , 1, 2, . . . , 2, 3, . . . , 3, . . . .
Analysing
DFS
Dags and
topological
sorting
Detecting
cycles
Why is this the case? This must be due to the queue used —
whose special properties we haven’t yet exploited!
CS 270
Algorithms
Oliver
Kullmann
Analysing
BFS
Depth-first
search
Analysing
DFS
Dags and
topological
sorting
Detecting
cycles
Since in step 12, directly after setting the distance, we put the
vertex into the queue, the above assertion is equivalent to the
statement, that the distances of the vertices put into the queue
are non-decreasing (in the order they enter the queue).
Now why is this the case?
CS 270
Algorithms
Oliver
Kullmann
First we need to notice that once a distance d[v ] is set, it is
never changed again.
The role of the queue
Analysing
BFS
Running BFS on directed graphs
We can run BFS also on a digraph G , with start-vertex s:
1
2
Depth-first
search
Analysing
DFS
Dags and
topological
sorting
Detecting
cycles
3
For digraphs we have directed spanning trees/forests.
Only the vertices reachable from s following the directions
of the edges are in that directed tree (directed from s
towards the leaves).
Still the paths in the directed tree, from s to any other
vertex, are shortest possible (given that we obey the given
directions of the edges).
For a graph (i.e., undirected graph), we need to restart BFS (to
obtain a spanning forest, not just a spanning tree) only if the
graph is disconnected.
However for a digraph there is a much higher chance, that we
need to restart in order to cover all vertices.
We thus more often need a directed spanning forest, not just a
directed spanning tree.
CS 270
Algorithms
Oliver
Kullmann
Analysing
BFS
Depth-first
search
Analysing
DFS
Dags and
topological
sorting
Detecting
cycles
Restart typically needed for digraphs to cover all
vertices
Consider the simple digraph
CS 270
Algorithms
Oliver
Kullmann
Analysing
BFS
1O o
2
Depth-first
search
Analysing
DFS
3
In order to cover all vertices, one needs to run BFS at least two
times (and if the first time you start it with s = 1, then you
need to run it three times).
CS 270
Algorithms
Arcs of different lengths
Dags and
topological
sorting
Detecting
cycles
Oliver
Kullmann
The edges of (di-)graphs have (implicitly) a length of one unit.
If arbitrary non-negative lengths are allowed, then we have
to generalise BFS to Dijkstra’s algorithm.
This generalisation must keep the essential properties, that
the distances encountered are the final ones and are
non-decreasing.
Analysing
BFS
Depth-first
search
Analysing
DFS
Dags and
topological
sorting
Detecting
cycles
But now not all edges have unit-length, and thus instead of
a simple queue we need to employ a priority queue.
So even for the above apparently very simple graph we need
a directed spanning forest.
A priority queue returns the vertex in it with the smallest
value (distance).
Note that the obtained directed trees (given by π) overlap.
So in such a directed forest there is typically a certain
overlap between the directed trees in it.
Depth-first search
Depth-first search (DFS) is another simple but very important
technique for searching a graph.
Such a search constructs a spanning forest for the graph, called
the depth-first forest, composed of several depth-first trees,
which are rooted spanning trees of the connected components.
DFS recursively visits the next unvisited vertex, thus extending
the current path as far as possible; when the search gets stuck in
a “corner” it backtracks up along the path until a new avenue
presents itself.
DFS computes the parent π[u] of each vertex u in the
depth-first tree (with the parent of initial vertices being nil), as
well as its discovery time d[u] (when the vertex is first
encountered, initialised to ∞) and its finishing time f [u] (when
the search has finished visiting its adjacent vertices).
CS 270
Algorithms
CS 270
Algorithms
The algorithm
Oliver
Kullmann
Oliver
Kullmann
Analysing
BFS
Depth-first
search
Analysing
DFS
Dags and
topological
sorting
Detecting
cycles
Analysing
BFS
DFS(G )
DFS-Visit(u)
1 for each u ∈ V (G )
2
d[u] = ∞
3 time = 0
4 for each u ∈ V (G )
5
if d[u] = ∞
6
π[u] = nil
7
DFS-Visit(u)
1
2
3
4
5
6
7
8
time = time + 1
d[u] = time
for each v ∈ Adj[u]
if d[v ] = ∞
π[v ] = u
DFS-Visit(v )
time = time + 1
f [u] = time
Analysis: DFS-Visit(u) is invoked exactly once for each
vertex, during which we scan its adjacency list once. Hence
DFS, like BFS, runs in time Θ(V + E ).
Depth-first
search
Analysing
DFS
Dags and
topological
sorting
Detecting
cycles
CS 270
Algorithms
DFS illustrated
1 /nil ∞ /− ∞ /− ∞ /−
−
−
−
−
1
2
4
6
d[u]
/π[u]
f [u]
1 /nil 2 /1 ∞ /− ∞ /−
−
−
−
−
1
2
4
6
u
3
5
7
∞ /− ∞ /− ∞ /−
−
−
−
Stack = ()
u=1
(labelling)
1 /nil 2 /1 ∞ /− ∞ /−
−
−
−
−
1
2
4
6
3
5
7
3 /2 ∞ /− ∞ /−
−
−
−
Stack = (2, 1)
1 /nil 2 /1
−
−
1
2
3
3 /2
−
u=3
4 /3 ∞ /−
−
−
4
6
5
7
5 /4 ∞ /−
−
−
Stack = (4, 3, 2, 1)
1 /nil 2 /1
−
−
1
2
3
3 /2
−
4 /3
−
4
5
5 /4
8
Stack = (4, 3, 2, 1)
u=5
9 /4
−
6
7
6 /5
7
u=6
3
5
7
∞ /− ∞ /− ∞ /−
−
−
−
Stack = (1)
u=2
1 /nil 2 /1
−
−
1
2
4 /3 ∞ /−
−
−
3
4
6
5
7
3 /2 ∞ /− ∞ /−
−
−
−
Stack = (3, 2, 1)
1 /nil 2 /1
−
−
1
2
3
u=4
4 /3 ∞ /−
−
−
4
5
Oliver
Kullmann
Running DFS on directed graphs
Again (as with BFS), we can run DFS on a digraph G .
Analysing
BFS
Again, no longer do we obtain spanning trees of the
connected components of the start vertex.
Depth-first
search
But we obtain a directed spanning tree with exactly all
vertices reachable from the root (when following the
directions of the edges).
Analysing
DFS
Dags and
topological
sorting
Detecting
cycles
6
Different from BFS, the root (or start vertex) normally does not
play a prominent role for DFS:
1
Thus we did not provide the form of DFS with a start
vertex as input.
9 /4
10
2
But we provided the forest-version, which tries all vertices
(in the given order) as start vertices.
7
3
This will always cover all vertices, via a directed spanning
forest.
7
3 /2
−
5 /4
−
6 /5
−
1 /nil 2 /1
14
13
4 /3
11
Stack = (5, 4, 3, 2, 1) u = 7
1
2
3
3 /2
12
Stack = ()
4
5
5 /4
8
6
6 /5
7
What are the times good for?
CS 270
Algorithms
Oliver
Kullmann
Analysing
BFS
(Directed) DFS trees do not contain shortest paths — to that
end their way of exploring a graph is too “adventuresomely”
(while BFS is very “cautious”).
Nevertheless, the information gained through the computation
of discovery and finish times is very valuable for many tasks. We
consider the example of “scheduling” later.
In order to understand this example, we have first to gain a
better understanding of the meaning of discovery and finishing
time.
Depth-first
search
Analysing
DFS
Dags and
topological
sorting
Detecting
cycles
Existence of a path versus finishing times
Lemma 1
Consider a digraph G and nodes u, v ∈ V (G ) with d[u] < d[v ].
Then there is a path from u to v in G if and only if f [u] > f [v ]
holds.
In words: If node u is discovered earlier than node v , then we
can reach v from u iff v finishes earlier than u.
Proof.
If there is a path from u to v , then, since v was discovered later
than u, the recursive call of DFS-Visit(v ) must happen inside
the recursive call of DFS-Visit(u) (by construction of the
discovery-loop), and thus v must finish before u finishes.
In the other direction, if v finishes earlier than u, then the
recursive call of DFS-Visit(v ) must happen inside the
recursive call of DFS-Visit(u) (since the recursion for u must
still be running). Thus, when discovering v , we must be on a
path from u to v .
CS 270
Algorithms
Oliver
Kullmann
Analysing
BFS
Depth-first
search
Analysing
DFS
Dags and
topological
sorting
Detecting
cycles
CS 270
Algorithms
Oliver
Kullmann
Analysing
BFS
Depth-first
search
Analysing
DFS
Dags and
topological
sorting
Detecting
cycles
CS 270
Algorithms
Directed acyclic graphs
An important applications of digraphs G is with scheduling:
The vertices are the jobs (actions) to be scheduled.
A directed edge from vertex u to vertex v means a
dependency, that is, action u must be performed before
action v .
Now consider the situation where we have three jobs a, b, c and
the following dependency digraph:
G = a _❃
❃❃
/b
⑧
❃❃
⑧⑧
❃❃
⑧⑧
⑧
❃ ⑧
⑧
Oliver
Kullmann
Analysing
BFS
Depth-first
search
Analysing
DFS
Dags and
topological
sorting
Detecting
cycles
CS 270
Algorithms
Topological sorting
Given a dag G modelling a scheduling task, a basic task is to
find a linear ordering of the vertices (“actions”) such that all
dependencies are respected.
This is modelled by the notion of “topological sorting”.
A topological sort of a dag is an ordering of its vertices
such that for every edge (u, v ), u appears before v in the
ordering.
Oliver
Kullmann
Analysing
BFS
Depth-first
search
Analysing
DFS
Dags and
topological
sorting
Detecting
cycles
For example consider
G= a
c
Clearly this can not be scheduled!
c
In general we require G to by acyclic, that is, G must not
contain a directed cycle.
/b
⑧?
⑧
⑧⑧
⑧⑧
⑧
⑧
The two possible topological sortings of G are a, c, b and c, a, b.
A directed acyclic graph is also called a dag.
Finishing times in DAGs
Lemma 2
After calling DFS on a dag, for every edge (u, v ) we have
f [u] > f [v ].
Proof.
There are two cases regarding the discovery times of u and v :
1
If d[u] < d[v ], then by Lemma 1 we have f [u] > f [v ] (since
there is a path from u to v (of length 1)).
2
Now assume d[u] > d[v ]. If f [u] > f [v ], then we are done,
and so assume that f [u] < f [v ]. But then by Lemma 1
there is a path from v to u in G , and this together with the
edge from u to v establishes a cycle in G , contradicting
that G is a dag.
CS 270
Algorithms
Topological sorting via DFS
Oliver
Kullmann
CS 270
Algorithms
Oliver
Kullmann
Analysing
BFS
Analysing
BFS
Depth-first
search
Depth-first
search
Analysing
DFS
Dags and
topological
sorting
Detecting
cycles
Corollary 3
To topologically sort a dag G , we run DFS on G and print the
vertices in reverse order of finishing times. (We can put each
vertex on the front of a list as they are finished.)
Analysing
DFS
Dags and
topological
sorting
Detecting
cycles
CS 270
Algorithms
Topological sorting illustrated
Consider the result of running the DFS algorithm on the
following dag.
Oliver
Kullmann
Analysing
BFS
1/20
21/26
m
22/25
n
6/19
q
r
2/5
3/4
v
7/8
11/12
Analysing
DFS
Dags and
topological
sorting
23/2413/16
w
10/17
y
x
Depth-first
search
p
s
u
t
27/28
o
Detecting
cycles
z
9/18
14/15
(labelling: d[u]/f [u])
Listing the vertices in reverse order of finishing time gives the
following topological sorting of the vertices:
u:
p
n
o
s
m r
y
v
w
z
x
u
q
t
f [u]:
28
26
25
24
20
18
17
16
15
12
8
5
4
19
Deciding acyclicity
Revisiting the lemma for topological sorting
In Lemma 2 we said:
If G has no cycles, then along every edge
the finishing times strictly decrease.
Oliver
Kullmann
Analysing
BFS
Depth-first
search
Analysing
DFS
So if we discover in digraph G an edge (u, v ) with f [u] < f [v ],
then we know there is a cycle in G .
Dags and
topological
sorting
Is this criterion also sufficient?
Detecting
cycles
YES: just consider a cycle, and what must happen with the
finishing times on it.
Lemma 4
Consider a digraph G . Then G has a cycle if and only if every
run of DFS on G yields some edge (u, v ) ∈ E (G ) with
f [u] < f [v ].
Oliver
Kullmann
For a graph G (i.e., undirected) detecting the existence of a
cycle is simple, via BFS or DFS:
G has a cycle (i.e., is not a forest) if and only if
BFS resp. DFS will discover a vertex twice.
One has to be a bit more careful here, since the parent vertex
will always be discovered twice, and thus has to be excluded, but
that’s it.
However for digraphs it is not that simple — do you see why?
CS 270
Algorithms
CS 270
Algorithms
Analysing
BFS
Depth-first
search
Analysing
DFS
Dags and
topological
sorting
Detecting
cycles