Graphs – Breadth First Search

Graphs – Breadth First Search
ORD
SFO
LAX
DFW
Outline
 BFS Algorithm
 BFS Application: Shortest Path on an unweighted graph
 Unweighted Shortest Path: Proof of Correctness
Outline
 BFS Algorithm
 BFS Application: Shortest Path on an unweighted graph
 Unweighted Shortest Path: Proof of Correctness
Breadth-First Search
 Breadth-first search (BFS) is a general technique for traversing a graph
 A BFS traversal of a graph G
 Visits all the vertices and edges of G
 Determines whether G is connected
 Computes the connected components of G
 Computes a spanning forest of G
 BFS on a graph with |V| vertices and |E| edges takes O(|V|+|E|) time
 BFS can be further extended to solve other graph problems
 Cycle detection
 Find and report a path with the minimum number of edges between two
given vertices
BFS Algorithm Pattern
BFS(G,s)
Precondition: G is a graph, s is a vertex in G
Postcondition: all vertices in G reachable from s have been visited
for each vertex u ÎV [G]
color[u] ¬ BLACK //initialize vertex
colour[s] ¬ RED
Q.enqueue(s)
while Q ¹ Æ
u ¬ Q.dequeue()
for each v ÎAdj[u] //explore edge (u,v)
if color[v] = BLACK
colour[v] ¬ RED
Q.enqueue(v)
colour[u] ¬ GRAY
BFS is a Level-Order Traversal
 Notice that in BFS exploration takes place on a
wavefront consisting of nodes that are all the same
distance from the source s.
 We can label these successive wavefronts by their
distance: L0, L1, …
BFS Example
A
undiscovered
A
discovered (on Queue)
A
finished
A
L1
B
C
D
unexplored edge
discovery edge
E
F
cross edge
L0
L1
L0
A
B
C
E
D
F
L1
A
B
C
E
D
F
BFS Example (cont.)
L0
L1
L0
A
B
C
E
L0
L1
F
L0
C
E
B
L2
A
B
L2
D
L1
D
F
L1
A
C
E
F
A
B
L2
D
C
E
D
F
BFS Example (cont.)
L0
L1
L0
L1
A
B
L2
C
E
D
F
C
E
D
F
L1
A
B
L2
A
B
L2
L0
C
E
D
F
Properties
Notation
Gs: connected component of s
A
Property 1
BFS(G, s) visits all the vertices and
edges of Gs
B
C
D
Property 2
E
The discovery edges labeled by
BFS(G, s) form a spanning tree Ts of
Gs
L0
Property 3
For each vertex v in Li
 The path of Ts from s to v has i
edges
 Every path from s to v in Gs has at
least i edges
L1
A
B
L2
F
C
E
D
F
Analysis
 Setting/getting a vertex/edge label takes O(1) time
 Each vertex is labeled three times
 once as BLACK (undiscovered)
 once as RED (discovered, on queue)
 once as GRAY (finished)
 Each edge is considered twice (for an undirected graph)
 Each vertex is placed on the queue once
 Thus BFS runs in O(|V|+|E|) time provided the graph is
represented by an adjacency list structure
Applications
 BFS traversal can be specialized to solve the
following problems in O(|V|+|E|) time:
Compute the connected components of G
Compute a spanning forest of G
Find a simple cycle in G, or report that G is a forest
Given two vertices of G, find a path in G between
them with the minimum number of edges, or report
that no such path exists
Outline
 BFS Algorithm
 BFS Application: Shortest Path on an unweighted
graph
 Unweighted Shortest Path: Proof of Correctness
Application: Shortest Paths on an Unweighted Graph
 Goal: To recover the shortest paths from a source node
s to all other reachable nodes v in a graph.
 The length of each path and the paths themselves are returned.
 Notes:
 There are an exponential number of possible paths
 Analogous to level order traversal for trees
 This problem is harder for general graphs than trees because of
cycles!
?
s
Breadth-First Search
Input: Graph G  (V , E ) (directed or undirected) and source vertex s V .
Output:
d [v ]  shortest path distance  (s ,v ) from s to v , v V .
 [v ]  u such that (u ,v ) is last edge on a shortest path from s to v .
 Idea: send out search ‘wave’ from s.
 Keep track of progress by colouring vertices:
 Undiscovered vertices are coloured black
 Just discovered vertices (on the wavefront) are coloured red.
 Previously discovered vertices (behind wavefront) are coloured grey.
BFS Algorithm with Distances and Predecessors
BFS(G,s)
Precondition: G is a graph, s is a vertex in G
Postcondition: d[u] = shortest distance d [u] and
p [u] = predecessor of u on shortest path from s to each vertex u in G
for each vertex u ÎV [G]
d[u] ¬ ¥
p [u] ¬ null
color[u] = BLACK //initialize vertex
colour[s] ¬ RED
d[s] ¬ 0
Q.enqueue(s)
while Q ¹ Æ
u ¬ Q.dequeue()
for each v ÎAdj[u] //explore edge (u,v)
if color[v ] = BLACK
colour[v] ¬ RED
d[v ] ¬ d[u] + 1
p [v ] ¬ u
Q.enqueue(v)
colour[u] ¬ GRAY
BFS
Found
Not Handled
Queue
First-In First-Out (FIFO) queue
stores ‘just discovered’ vertices
s
b
a
e
d
g
c
f
j
i
h
m
k
l
BFS
Found
Not Handled
Queue
s d=0
b
a
s d=0
e
d
g
c
f
j
i
h
m
k
l
BFS
s d=0
Found
Not Handled
Queue
d=1
b
a
e
d
g
c
f
j
i
h
m
k
l
d=0
a d=1
d
g
b
BFS
s d=0
Found
Not Handled
Queue
d=1
b
a
a d=1
d
g
b
e
d
g
c
f
j
i
h
m
k
l
BFS
s d=0
Found
Not Handled
Queue
d=1
b
a
d=1
e
d
g
c
f
i
h
m
k
l
d
g
b
c d=2
j
d=2 f
BFS
s d=0
Found
Not Handled
Queue
d=1
b
a
d=1
e
d
g
c
f
i
h
m
k
l
g
b
c d=2
j
f
d=2 m
e
BFS
s d=0
Found
Not Handled
Queue
d=1
b
a
d=1
e
d
g
c
b
c d=2
j
f
d=2 m
e
j
f
i
h
m
k
l
BFS
s d=0
Found
Not Handled
Queue
d=1
b
a
d=1
e
d
g
c
f
c d=2
j
f
d=2 m
e
j
i
h
m
k
l
BFS
s d=0
Found
Not Handled
Queue
d=1
b
a
c d=2
f
m
e
j
e
d
g
c
f
j
i
h
m
k
l
d=2
BFS
s d=0
Found
Not Handled
Queue
d=1
b
a
e
d
g
c
f
i
h
m
k
l
d=3
d=2
f
m
e
j
j
d=2 h d=3
i
BFS
s d=0
Found
Not Handled
Queue
d=1
b
a
e
d
m
e
j
j
d=2 h d=3
i
g
c
f
i
h
m
k
l
d=3
d=2
BFS
s d=0
Found
Not Handled
Queue
d=1
b
a
e
d
g
c
f
i
h
m
k
l
d=3
d=2
e
j
j
d=2 h d=3
i
l
BFS
s d=0
Found
Not Handled
Queue
d=1
b
a
e
d
d=2
g
c
f
i
h
m
k
l
d=3
j
j
d=2 h d=3
i
l
BFS
s d=0
Found
Not Handled
Queue
d=1
b
a
d=2
e
d
g
c
f
j
i
h
m
k
l
d=3
d=2 h d=3
i
l
BFS
s d=0
Found
Not Handled
Queue
d=1
b
a
h d=3
i
l
e
d
g
c
f
j
i
h
m
k
l
d=3
d=2
BFS
s d=0
Found
Not Handled
Queue
d=1
b
a
d=3
i
l
k d=4
e
d
g
c
f
j
i
h
m
k
d=4
l
d=3
d=2
BFS
s d=0
Found
Not Handled
Queue
d=1
b
a
d=3
e
d
l
k d=4
g
c
f
j
i
h
m
k
d=4
l
d=3
d=2
BFS
s d=0
Found
Not Handled
Queue
d=1
b
a
d=3
e
d
g
c
k d=4
f
j
i
h
m
k
d=4
l
d=3
d=2
BFS
s d=0
Found
Not Handled
Queue
d=1
b
a
k d=4
e
d
g
c
f
j
i
h
m
k
d=4
l
d=3
d=2
BFS
s d=0
Found
Not Handled
Queue
d=1
b
a
d=4
d=5
e
d
g
c
f
j
i
h
m
k
d=4
l
d=3
d=2
Breadth-First Search Algorithm: Properties
BFS(G,s)
Precondition: G is a graph, s is a vertex in G
Postcondition: d[u] = shortest distance d [u] and
p [u] = predecessor of u on shortest paths from s to each vertex u in G
for each vertex u ÎV [G]
d[u] ¬ ¥
p [u] ¬ null
color[u] = BLACK //initialize vertex
colour[s] ¬ RED
d[s] ¬ 0
 Q is a FIFO queue.
 Each vertex assigned finite d
value at most once.
 Q contains vertices with d
values {i, …, i, i+1, …, i+1}
Q.enqueue(s)
while Q ¹ Æ
u ¬ Q.dequeue()
for each v ÎAdj[u] //explore edge (u,v)
if color[v] = BLACK
colour[v] ¬ RED
d[v ] ¬ d[u] + 1
p [v ] ¬ u
Q.enqueue(v)
colour[u] ¬ GRAY
 d values assigned are
monotonically increasing over
time.
Breadth-First-Search is Greedy
 Vertices are handled (and finished):
 in order of their discovery (FIFO queue)
 Smallest d values first
Outline
 BFS Algorithm
 BFS Application: Shortest Path on an unweighted graph
 Unweighted Shortest Path: Proof of Correctness
Correctness
Basic Steps:
s
d
The shortest path to u
has length d
u
v
& there is an edge
from u to v
There is a path to v with length d+1.
Correctness: Basic Intuition
 When we discover v, how do we know there is not a
shorter path to v?
 Because if there was, we would already have discovered it!
s
d
u
v
Correctness: More Complete Explanation
 Vertices are discovered in order of their distance from
the source vertex s.
 Suppose that at time t1 we have discovered the set Vd of
all vertices that are a distance of d from s.
 Each vertex in the set Vd+1 of all vertices a distance of
d+1 from s must be adjacent to a vertex in Vd
 Thus we can correctly label these vertices by visiting all
vertices in the adjacency lists of vertices in Vd.
s
d
u
v
Correctness: Formal Proof
Input: Graph G  (V , E ) (directed or undirected) and source vertex s V .
Output:
d[v] = distance d (v) from s to v, "v ÎV .
p [v ] = u such that (u,v ) is last edge on shortest path from s to v.
Two-step proof:
On exit:
1. d [v ]   (s,v )v  V
2. d [v ]   (s,v )v  V
Claim 1. d is never too small: d [v ]   ( s, v )v  V
Proof: There exists a path from s to v of length £ d[v].
By Induction:
Suppose it is true for all vertices thus far discovered (red and grey).
v is discovered from some adjacent vertex u being handled.
 d [v ]  d [u ]  1
  (s, u )  1
s
  (s,v )
u
since each vertex v is assigned a d value exactly once,
it follows that on exit, d [v ]   (s,v )v V .
v
Claim 1. d is never too small: d [v ]   ( s, v )v  V
Proof: There exists a path from s to v of length £ d[v].
BFS(G,s)
Precondition: G is a graph, s is a vertex in G
Postcondition: d[u] = shortest distance d [u] and
p [u] = predecessor of u on shortest paths from s to each vertex u in G
for each vertex u ÎV [G]
d[u] ¬ ¥
s
p [u] ¬ null
color[u] = BLACK //initialize vertex
u
colour[s] ¬ RED
d[s] ¬ 0
v
Q.enqueue(s)
while Q ¹ Æ
 <LI>: d [v ]   (s,v ) 'discovered' (red or grey) v  V
u ¬ Q.dequeue()
for each v ÎAdj[u] //explore edge (u,v)
if color[v] = BLACK
colour[v] ¬ RED
d[v ] ¬ d[u] + 1
p [v ] ¬ u
Q.enqueue(v)
colour[u] ¬ GRAY
  (s, u )  1   (s,v )
Claim 2. d is never too big: d [v ]   (s, v )v  V
Proof by contradiction:
Suppose one or more vertices receive a d value greater than  .
Let v be the vertex with minimum  ( s, v ) that receives such a d value.
Suppose that v is discovered and assigned this d value when vertex x is dequeued.
Let u be v's predecessor on a shortest path from s to v .
d [ x ]  d [v ]  1
Then
 (s,v )  d [v ]
  (s,v )  1  d [v ]  1
s
 d [u ]  d [ x ]
x
u
d [u ]   (s,v )  1
Recall: vertices are dequeued in increasing order of d value.
 u was dequeued before x.
 d [v ]  d [u ]  1   (s,v )
Contradiction!
v
Correctness
Claim 1. d is never too small: d [v ]   ( s, v )v  V
Claim 2. d is never too big: d [v ]   (s, v )v  V
 d is just right: d [v ]   (s,v )v  V
Progress?
 On every iteration one vertex is processed (turns gray).
BFS(G,s)
Precondition: G is a graph, s is a vertex in G
Postcondition: d[u] = shortest distance d [u] and
p [u] = predecessor of u on shortest paths from s to each vertex u in G
for each vertex u ÎV [G]
d[u] ¬ ¥
p [u] ¬ null
color[u] = BLACK //initialize vertex
colour[s] ¬ RED
d[s] ¬ 0
Q.enqueue(s)
while Q ¹ Æ
u ¬ Q.dequeue()
for each v ÎAdj[u] //explore edge (u,v)
if color[v] = BLACK
colour[v] ¬ RED
d[v ] ¬ d[u] + 1
p [v ] ¬ u
Q.enqueue(v)
colour[u] ¬ GRAY
Optimal Substructure Property
 The shortest path problem has the optimal substructure property:
 Every subpath of a shortest path is a shortest path.
shortest path
How would we
prove this?
v
u
s
shortest path
shortest path
 The optimal substructure property
 is a hallmark of both greedy and dynamic programming algorithms.
 allows us to compute both shortest path distance and the shortest paths
themselves by storing only one d value and one predecessor value per
vertex.
Recovering the Shortest Path
For each node v, store predecessor of v in (v).
s
u
v
(v)
Predecessor of v is (v) = u.
Recovering the Shortest Path
PRINT-PATH(G , s , v )
Precondition: s and v are vertices of graph G
Postcondition: the vertices on the shortest path from s to v have been printed in order
if v  s then
print s
else if  [v ]  NIL then
print "no path from" s "to" v "exists"
else
PRINT-PATH(G , s ,  [v ])
print v
BFS Algorithm without Colours
BFS(G,s)
Precondition: G is a graph, s is a vertex in G
Postcondition: predecessors p [u] and shortest
distance d[u] from s to each vertex u in G has been computed
for each vertex u ÎV [G]
d[u] ¬ ¥
p [u] ¬ null
d[s] ¬ 0
Q.enqueue(s)
while Q ¹ Æ
u ¬ Q.dequeue()
for each v ÎAdj[u] //explore edge (u,v)
if d[v] = ¥
d[v] ¬ d[u] + 1
p [v] ¬ u
Q.enqueue(v)
Outline
 BFS Algorithm
 BFS Application: Shortest Path on an unweighted graph
 Unweighted Shortest Path: Proof of Correctness