Graph Algorithms

Like trees, graphs are a fundamental data structure in computer science. We often hear
cyberspace described as a new frontier for humanity, and if we look at its structure,
we see that it is a graph; in other words, it consists of places (nodes) and connections
between those places. Some applications of graphs include
• representing electronic circuits
• modeling object interactions (e.g. used in the Unified Modeling Language)
• showing ordering relationships between computer programs
• modeling networks and network traffic
• generalizing trees: a tree is a special case of a graph, namely one that is acyclic and connected,
and trees are used in many fundamental data structures
An undirected graph G = (V, E) is a pair of sets V , E, where
• V is a set of vertices, also called nodes.
• E is a set of unordered pairs of vertices called edges, each of the form (u, v) with
u, v ∈ V . Another name for edge is link. Note that if E is a multiset, then G is called a
multigraph, and if there is an edge of the form (u, u) (which represents a loop), then G is
called a general graph.
• if e = (u, v) is an edge, then we say that u is adjacent to v, and that e is incident with u
and v.
• In practice, |V| = n is finite and n is called the order of G.
• |E| = m is called the size of G.
• A path P of length k in a graph is a sequence of vertices v0 , v1 , . . . , vk , such that (vi , vi+1 ) ∈ E
for every 0 ≤ i ≤ k − 1.
– a path is called simple if the vertices v0 , v1 , . . . , vk are all distinct.
– a path is called a cycle if the start and end vertices are the same: i.e. v0 = vk .
• a geometrical representation of a graph is obtained by treating the vertices as points on a plane,
and edges e = (u, v) as smooth arcs which terminate at both u and v.
• the degree of a vertex v, denoted as deg(v), equals the number of edges that are incident with
v.
• Handshaking Theorem. $\sum_{v \in V} \deg(v) = 2|E|$.
Example 1. Let G = (V, E), where
V = {SD, SB, SF, LA, SJ, OAK}
are cities in California, and
E = {(SD, LA), (SD, SF ), (LA, SB), (LA, SF ), (LA, SJ), (LA, OAK), (SB, SJ)}
are edges which represent flights between two cities. Provide the following for the graph:
• the order and size
• a geometrical representation
• a path of length 8
• the longest simple path
• the largest cycle
• the degrees of each vertex
Data Structures for Undirected Graphs
UndirectedNode
{
List(UndirectedNode) neighbors
Integer mark //used in some algorithms for marking the node
}
UndirectedGraph
{
List(UndirectedNode) nodes
Integer size //number of edges for this graph
}
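The two structures above might be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the method names add_node, add_edge, and degree are hypothetical, and nodes are kept in a dict keyed by label (rather than a plain list) purely for convenient lookup. The small graph built at the end is made up for the sketch, and is used to check the Handshaking Theorem.

```python
class UndirectedNode:
    def __init__(self, label):
        self.label = label
        self.neighbors = []   # adjacent UndirectedNode objects
        self.mark = 0         # scratch field used by some algorithms


class UndirectedGraph:
    def __init__(self):
        self.nodes = {}       # label -> UndirectedNode (assumption: dict, not list)
        self.size = 0         # number of edges of this graph

    def add_node(self, label):
        # Create the node on first use, then return it.
        if label not in self.nodes:
            self.nodes[label] = UndirectedNode(label)
        return self.nodes[label]

    def add_edge(self, u, v):
        # An undirected edge is recorded in both adjacency lists.
        a, b = self.add_node(u), self.add_node(v)
        a.neighbors.append(b)
        b.neighbors.append(a)
        self.size += 1

    def degree(self, label):
        return len(self.nodes[label].neighbors)


# A triangle a-b-c with one extra edge (c, d).
g = UndirectedGraph()
for u, v in [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]:
    g.add_edge(u, v)

# Handshaking Theorem: the degrees sum to twice the number of edges.
degree_sum = sum(g.degree(x) for x in g.nodes)
```

Here the order of g is len(g.nodes) and the size is g.size, matching the definitions given earlier.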
A directed graph is a graph G = (V, E) whose edges have direction. In this case, given (u, v) ∈ E,
u is called the start vertex and v is called the end vertex. Moreover, the in-degree of vertex
v, denoted deg + (v), is the number of edges for which v is the end vertex. And the out-degree of
vertex v, denoted deg − (v), is the number of edges for which v is the start vertex.
Similar to the handshaking theorem,
$\sum_{v \in V} \deg^{+}(v) = \sum_{v \in V} \deg^{-}(v) = |E|$.
The DirectedGraph and DirectedNode data structures are defined similarly to their undirected counterparts, except now each DirectedNode n has both parents and children lists, where the parents
list contains the start vertices of all edges that end at n, and the children list contains the end
vertices of all edges that start at n (note that repeats occur in the case of more than one edge between
two nodes).
An undirected graph can be made into a directed graph by orienting each edge of the graph; that
is, by assigning a direction to each edge. Conversely, each directed graph has associated with it an
underlying undirected graph which is obtained by removing the orientation of each edge.
Example 2. Redraw the graph from Example 1, but now with oriented edges.
A network is a directed graph whose edges are labeled with real numbers (we tend to use the term
weighted graph for undirected graphs). In other words, there is a function c : E → R, where R is
the set of real numbers. Networks are often written as an ordered triple: G = (V, E, c).
Similar to DirectedNodes, the NetworkNode data structure has both a parents and children list, but
now these lists hold Edge data structures, since the edge “weight” or “cost” must now be recorded.
One such Edge structure is shown below.
NetworkEdge
{
NetworkNode from //start vertex
NetworkNode to //end vertex
Real weight //weight (cost) of this edge
};
Example 3. Let G = (V, E, c) be a directed graph, where
V = {1, 2, 3, 4, 5}
and the edges-costs are given by
E = {(1, 2, 1), (1, 3, 3), (2, 3, 1), (2, 4, 1), (2, 5, 4), (3, 4, 1), (4, 5, 1), (5, 1, 1)}.
Draw the graph.
Graph Connectivity
Recall that a path in a graph G = (V, E) (either directed or undirected) from vertex u to vertex v
is a sequence of vertices u = v0 , v1 , . . . , vn = v for which (vi , vi+1 ) ∈ E for all i = 0, 1, . . . , n − 1.
We then say that G is connected provided there is a path from every vertex u ∈ V to every other
vertex v ∈ V . In what follows we present algorithms that pertain to the degree of connectivity of a
graph. To begin, we consider two different ways of traversing a graph; i.e. visiting each node of the
graph. The first algorithm makes use of a FIFO queue data structure, and is called a breadth-first
traversal, while the second uses a stack data structure, and is called a depth-first traversal.
Breadth-First Graph Traversal Algorithm. Let G = (V, E) be a graph (either directed or
undirected).
Initialize FIFO queue Q as being empty.
Initialize each vertex of G as being unmarked.
While there exist unmarked vertices:
    Let u be one such unmarked vertex.
    Mark u and place it in Q.
    While Q is nonempty:
        Remove node u from Q.
        For every v that is a neighbor/child of u:
            If v is unmarked, then mark v and place it in Q.
In addition to visiting each node, this procedure implicitly yields a spanning forest of trees (or a
spanning tree if the forest has only one tree) whose edges are comprised of those of the form (u, v)
where u was the node that was removed from Q and led to v being marked. For undirected graphs,
a breadth-first traversal partitions the edges of G into those that are used in the forest, and those
that are not used. The latter edges are called cross edges, since they always connect nodes that are
on different branches of one of the (spanning) trees of the spanning forest.
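The breadth-first procedure above can be sketched in Python as follows. The sketch assumes the graph is given as a dict that maps every vertex to the list of its neighbors/children; the function name bfs_forest and the sample graph are illustrative only. The returned parent map encodes the spanning forest: each root maps to None, and every other vertex maps to the vertex that caused it to be marked.

```python
from collections import deque


def bfs_forest(adj):
    """Breadth-first traversal of {vertex: [neighbors/children]}."""
    parent = {}                      # doubles as the set of marked vertices
    for root in adj:                 # outer loop: pick an unmarked vertex
        if root in parent:
            continue
        parent[root] = None          # mark the root of a new tree
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent:  # v is unmarked
                    parent[v] = u    # (u, v) becomes a forest edge
                    queue.append(v)
    return parent


# A triangle 1-2-3 together with a separate edge 4-5.
adj = {1: [2, 3], 2: [1, 3], 3: [1, 2], 4: [5], 5: [4]}
parent = bfs_forest(adj)
roots = [v for v, p in parent.items() if p is None]
```

In this sample, the edge between 2 and 3 is not used in the forest, and it joins two different branches of the tree rooted at 1.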
Depth-First Graph Traversal Algorithm. Let G = (V, E) be a graph (either directed or undirected).
Initialize stack S as being empty.
Initialize each vertex of G as being unmarked.
While there exist unmarked vertices:
    Let u be one such unmarked vertex.
    Mark u and push it on to S.
    While S is nonempty:
        Let u be at the top of S.
        Let v be the first unmarked neighbor/child of u.
        If v does not exist:
            Pop u from S.
        Otherwise:
            Mark v and push v on to S.
In addition to visiting each node, this procedure also implicitly yields a spanning forest of trees (or a
spanning tree if the forest has only one tree) whose edges are comprised of those of the form (u, v)
where u was the node at the top of S that reached v and caused it to be marked. For undirected
graphs, a depth-first traversal partitions the edges of G into those that are used in the forest, and
those that are not used. The latter edges are called backward edges, since they always connect
nodes that are on the same tree branch (i.e., the edge connects a descendant to an ancestor).
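The depth-first procedure above can be sketched in Python with an explicit stack. As before, the dict-of-lists representation, the function name dfs_forest, and the sample graph are assumptions made for the sketch. A per-vertex index (next_index, a hypothetical helper) remembers how far each adjacency list has been scanned, so that finding the first unmarked neighbor does not rescan the list from the start.

```python
def dfs_forest(adj):
    """Depth-first traversal of {vertex: [neighbors/children]}."""
    parent = {}                            # doubles as the set of marked vertices
    next_index = {u: 0 for u in adj}       # scan position in each adjacency list
    for root in adj:                       # outer loop: pick an unmarked vertex
        if root in parent:
            continue
        parent[root] = None
        stack = [root]
        while stack:
            u = stack[-1]                  # u is at the top of the stack
            # Skip over neighbors of u that are already marked.
            while next_index[u] < len(adj[u]) and adj[u][next_index[u]] in parent:
                next_index[u] += 1
            if next_index[u] == len(adj[u]):
                stack.pop()                # no unmarked neighbor exists
            else:
                v = adj[u][next_index[u]]
                parent[v] = u              # (u, v) becomes a forest edge
                stack.append(v)
    return parent


# A triangle 1-2-3 together with a separate edge 4-5.
adj = {1: [2, 3], 2: [1, 3], 3: [1, 2], 4: [5], 5: [4]}
parent = dfs_forest(adj)
```

In this sample the tree rooted at 1 is the single branch 1, 2, 3, and the unused edge between 3 and 1 connects a descendant to an ancestor.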
Example 4. For the graph G = (V, E), where
V = {a, b, c, d, e, f, g, h, i, j, k}
and the edges are given by
E = {(a, b), (a, c), (b, c), (b, d), (b, e), (b, g), (c, g), (c, f ),
(d, f ), (f, g), (f, h), (g, h), (i, j), (i, k), (j, k)}.
Show the forest that results for both a depth-first and breadth-first traversal of the graph. Assume
that all adjacency lists follow an alphabetical ordering.
Proposition 1. Undirected graph G = (V, E) is connected iff a breadth-first traversal of G yields a
forest with exactly one tree.
Proof of Proposition 1. Suppose a breadth-first traversal of G yields a forest with exactly one
tree T . Then G is connected since T is connected (by definition a tree is connected and acyclic), and
is a subgraph of G.
Now suppose G is connected. Suppose u is the root of the first tree. Let v be any other vertex of G.
Then there is a path P : u = v0 , v1 , . . . , vn−1 = v from u to v. Notice that, in the first iteration of
the outer while loop, v1 has the opportunity to be marked when v0 is removed from the queue. And
inductively, assuming that vi−1 will be entered in the queue during the first iteration, vi will then
have the opportunity to be marked. Therefore, by induction, v will be marked in the first iteration
of the outer while loop. Since v was arbitrary, it follows that all vertices of G belong to the first and
only tree (each iteration of the outer loop corresponds with a new tree).
Corollary 1. The number of trees in the forest generated by a breadth-first traversal of an undirected graph equals
the number of connected components of that graph. Moreover, each tree in the forest represents a
spanning tree for one of the connected components; in other words, a tree that contains each vertex
of a component.
A Connectivity Algorithm for Directed Graphs. For directed graphs, first note that there are
two kinds of connectivity to consider. The first is called strong connectivity, and requires that all
paths follow the orientation of each directed edge. The other kind is called weak connectivity and
allows for paths that ignore edge orientation. Of course, weak connectivity can be tested using the
breadth-first traversal algorithm described above. On the other hand, the (strong) connectivity of
a directed graph G = (V, E) can be determined by making two depth-first traversals of the graph.
In performing the first traversal, we compute the post order of each vertex, which records the order in
which vertices are popped from the stack. The first vertex to be popped, namely the first vertex that is
marked but has no unmarked children, has post order 1, the next vertex to be popped has post order 2,
and so on. Since a vertex is popped only after every vertex that was pushed above it has been popped,
the post order of a vertex exceeds that of each of its descendants in the depth-first forest.
Henceforth we let post(v) denote the post order of v.
The second depth-first traversal is performed with what is called the reversal of G. The reversal Gr
of a directed graph G = (V, E) is defined as Gr = (V, E r ), where E r is the set of edges obtained by
reversing the orientation of each edge in E. Moreover, during this second depth-first search, when
selecting a vertex to mark and enter into the stack during execution of the outer while loop, the
vertex of highest post order (obtained from the first depth-first traversal) is always chosen.
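The two-traversal procedure just described can be sketched in Python as follows. This is an illustration under stated assumptions rather than a definitive implementation: the graph is a dict mapping every vertex to its list of children, vertex labels are never None, and the helper names dfs_order and strong_components are hypothetical. Visiting candidate roots in decreasing post order, and skipping those already marked, has the same effect as always choosing the unmarked vertex of highest post order.

```python
def dfs_order(adj, roots):
    """Depth-first traversal that tries roots in the given order.

    Returns (trees, post): the vertex list of each tree, and all
    vertices listed in the order in which they were popped."""
    marked, trees, post = set(), [], []
    for root in roots:
        if root in marked:
            continue
        marked.add(root)
        tree, stack = [root], [(root, iter(adj[root]))]
        while stack:
            u, children = stack[-1]
            # First unmarked child of u, or None if there is none left.
            v = next((w for w in children if w not in marked), None)
            if v is None:
                stack.pop()
                post.append(u)           # u receives the next post order
            else:
                marked.add(v)
                tree.append(v)
                stack.append((v, iter(adj[v])))
        trees.append(tree)
    return trees, post


def strong_components(adj):
    # First traversal: compute post orders on G.
    _, post = dfs_order(adj, list(adj))
    # Build the reversal of G.
    reverse = {u: [] for u in adj}
    for u in adj:
        for v in adj[u]:
            reverse[v].append(u)
    # Second traversal: on the reversal, highest post order first.
    trees, _ = dfs_order(reverse, reversed(post))
    return trees


# A directed cycle 1 -> 2 -> 3 -> 1, an edge 3 -> 4, and a cycle 4 -> 5 -> 4.
adj = {1: [2], 2: [3], 3: [1, 4], 4: [5], 5: [4]}
components = strong_components(adj)
```

For this sample graph the second traversal produces two trees, one on the vertices 1, 2, 3 and one on the vertices 4, 5.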
Proposition 2. In the second depth-first traversal of directed graph G = (V, E), the resulting forest
represents the set of strongly connected components of G.
Proof of Proposition 2. Let T1 , T2 , . . . , Tm , be the trees of the forest obtained from the second
traversal, and in the order in which they were constructed. First note that, if u and v are in different
trees, then u and v must be in different strongly-connected components. For example, suppose u is
in Ti and v is in Tj , with i < j, then there is no path in Gr from the root of Ti to v, which implies no
path from v to the root of Ti in G. Thus, being in the same tree is a necessary condition for being in
the same strongly-connected component. We now show that this condition is also sufficient, which
will complete the proof.
Now let u and v be in the same tree Tj , for some j = 1, . . . , m. Let r be the root of this tree. Then
there are paths from r to u and from r to v in Gr , which implies paths from u to r and from v to r
in G. We now show that there is also a path from r to u in G (and similarly from r to v). This will
imply a path from u to v (and similarly a path from v to u) in G by combining the paths from u to
r, and then from r to v.
To this end, since r is the root of a second-round depth-first tree, we know that r must have higher
post order than u, as computed during the first round. In the first round, r must be in the same tree
as u, since there is a path from u to r and r has a higher post order than u. But then, since u has
post order less than that of r, u must be a descendant of r, in which case there is a path from r to u.
Example 5. Use the algorithm described above to determine the strongly-connected components of
the following directed graph. G = (V, E), where
V = {a, b, c, d, e, f, g, h, i, j}
and the edges are given by
E = {(a, c), (b, a), (c, b), (c, f ), (d, a), (d, c), (e, c),
(e, d), (f, b), (f, g), (f, h), (h, g), (h, i), (i, j), (j, h)}.
Biconnectivity
An undirected graph is called biconnected if there are no vertices, called articulation points, or
edges, called bridges, whose removal disconnects the graph.
Example 6. For the graph G = (V, E), where V = {a, b, c, d, e, f, g} and
E = {(a, c), (a, d), (b, c), (b, e), (c, d), (c, e), (d, f ), (e, g)},
find all articulation points and bridges.
We now show how to determine all articulation points and bridges using a single depth-first traversal
for a connected undirected graph.
Let D denote the depth-first spanning tree that is constructed for G = (V, E). For all v ∈ V , let
num(v) denote the order in which v is added to D (i.e. the order in which it is marked and placed
in the stack), and low(v) be recursively defined as the minimum of the following:
1. num(v),
2. the lowest num(w) for any back edge (v, w), and
3. the lowest low(w) for any child w of v in D.
Lemma 1. Let D denote the depth-first spanning tree that is constructed for G = (V, E), and v be
any vertex of D. Suppose T1 and T2 are two distinct subtrees rooted at v. Then there are no edges
in G that connect a vertex in T1 with a vertex in T2 .
Proof of Lemma 1. Assume that T1 is generated first in the depth-first traversal that constructs
D. In other words, num(u) < num(w) for every u ∈ T1 and w ∈ T2 . Thus, there cannot be an edge,
say (u, w) connecting T1 and T2 , since otherwise, during the depth-first traversal, edge (u, w) would
have been added to D, and one would have w ∈ T1 , a contradiction.
Proposition 3. Let G = (V, E) be an undirected and connected simple graph. Let D be a depth-first spanning tree for G. Then v ∈ V is an articulation point iff either v is the root of D and two or
more subtrees are rooted at v, or v is not the root, but has a child w for which low(w) ≥ num(v).
Proof of Proposition 3. If v is the root and two or more subtrees are rooted at v, then the
above lemma implies that v’s removal from G will disconnect all of those subtrees. Therefore v is
an articulation point. Moreover, if there is only one subtree rooted at v, then removing v does not
disconnect D, and hence v is not an articulation point.
Now suppose v is not the root, and there is a child w of v in D for which low(w) ≥ num(v). Then
the only path from w to an ancestor of v must pass through v, which implies v is an articulation
point, since its removal will disconnect w from the ancestor.
Finally, assume v is a non-root articulation point. First note that v cannot be a leaf of D, since
the removal of v would still leave D connected, and thus all other nodes remain interconnected via
D − {v}. Let D̂ denote the part of D that remains after v and its subtrees are removed from D.
Then it must be the case that there is at least one subtree rooted at v that has no back edges that
connect to D̂. If this were not the case, then v would not be an articulation point, since all of its
descendants would remain connected via D̂. Hence, letting w denote the root of this subtree, no
vertex of the subtree has a back edge to a proper ancestor of v, and so low(w) ≥ num(v); and the proof is complete.
Corollary 1. Let G = (V, E) be an undirected and connected simple graph. Let D be a depth-first
spanning tree for G. Then e ∈ E is a bridge iff e = (v, w) is an edge of D, with v the parent of w, and low(w) > num(v).
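The num and low values, together with the tests for articulation points and bridges, can be computed in one depth-first traversal. The Python sketch below is illustrative only: it uses recursion instead of an explicit stack, assumes a connected simple graph given as a dict of adjacency lists, and the function name is hypothetical. For bridges it uses the strict test low(w) > num(v); when low(w) = num(v), some back edge from the subtree of w reaches v, so the tree edge (v, w) lies on a cycle and its removal cannot disconnect the graph.

```python
def articulation_points_and_bridges(adj, root):
    num, low, parent = {}, {}, {root: None}
    points, bridges = set(), []

    def visit(v):
        num[v] = low[v] = len(num) + 1       # order in which v is marked
        children = 0                         # number of subtrees rooted at v
        for w in adj[v]:
            if w not in num:                 # (v, w) is an edge of D
                parent[w] = v
                children += 1
                visit(w)
                low[v] = min(low[v], low[w])
                if parent[v] is not None and low[w] >= num[v]:
                    points.add(v)            # non-root test of Proposition 3
                if low[w] > num[v]:
                    bridges.append((v, w))
            elif w != parent[v]:             # (v, w) is a non-tree edge
                low[v] = min(low[v], num[w])
        if parent[v] is None and children >= 2:
            points.add(v)                    # root test of Proposition 3

    visit(root)
    return points, bridges


# A triangle 1-2-3 with a pendant vertex 4 attached to 3.
adj = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3]}
points, bridges = articulation_points_and_bridges(adj, 1)
```

In this sample, vertex 3 is the only articulation point and (3, 4) is the only bridge; none of the triangle's edges is a bridge, even though low(2) = num(1) for the tree edge (1, 2).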
Example 7. Using the graph from Example 6, show the depth-first spanning tree, along with the
low and num values of each vertex. Verify Proposition 3 for this example.
Computing The Distances Between Two Vertices of a Graph.
For non-network graphs, the distance from vertex u to vertex v, denoted d(u, v) is defined as the
shortest length of a path from u to v. It is not hard to prove using induction that the distance from
u to v can be readily computed by performing a breadth-first traversal of G, starting at u which is
marked with the number 0, and then marking a node with the value of one more than the marked
value of its parent.
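A short Python sketch of this marking scheme follows, assuming the graph is a dict of adjacency lists; the function name bfs_distances and the sample graph are illustrative. Only vertices reachable from the source receive a mark.

```python
from collections import deque


def bfs_distances(adj, source):
    """Return {v: d(source, v)} for every v reachable from source."""
    dist = {source: 0}                   # the source is marked with 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:            # v is unmarked
                dist[v] = dist[u] + 1    # one more than its parent's mark
                queue.append(v)
    return dist


adj = {
    "s": ["a", "b"],
    "a": ["s", "c"],
    "b": ["s", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}
dist = bfs_distances(adj, "s")
```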
Example 8. Repeat Example 4 for the case of breadth-first traversal, and marking each vertex as
described above. Verify that the marks correspond with the distance from vertex a.
Finding Distances in Networks Using Dijkstra’s Algorithm
In the case of networks, the distance between u and v is now defined as the minimum cost of a path
from u to v, where the cost is defined as the sum of all weights along the edges of the path.
Assuming all edge weights are nonnegative, Dijkstra’s algorithm computes, for each vertex u, the
distance d(s, u) from a source vertex s to u.
Dijkstra’s algorithm for a distance traversal of a Graph from source vertex s. Let G =
(V, E, c) be a network with nonnegative edges, and s ∈ V a vertex from which the algorithm begins.
Add the elements of V to an initially empty min Heap H.
Give s priority p(s) = 0, and all other vertices infinite priority.
Set parent(s) = NULL.
While H is nonempty:
    Pop node u from H.
    Set d(s,u) = p(u).
    If p(u) < INFINITY:
        For every v that is a child of u:
            If d(s,u) + c(u,v) < p(v):
                Set p(v) = p(u) + c(u,v), and adjust H.
                Set parent(v) = u.
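A Python sketch of the algorithm above is given next, with one substitution that should be noted plainly: Python's heapq module has no operation for adjusting the priority of an entry already in the heap, so instead of placing every vertex in the heap and adjusting it, the sketch pushes a fresh entry whenever a priority drops and skips stale entries when they are popped. The representation (a dict mapping each vertex to a list of (child, cost) pairs), the function name, and the sample network are assumptions made for illustration.

```python
import heapq


def dijkstra(adj, s):
    """adj maps each vertex to a list of (child, cost) pairs, cost >= 0.

    Returns (p, parent): final priorities and the traversal tree."""
    p = {v: float("inf") for v in adj}   # all vertices start at infinity
    p[s] = 0
    parent = {s: None}
    heap = [(0, s)]
    done = set()                         # vertices already popped
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue                     # stale entry: u was popped earlier
        done.add(u)                      # here d equals d(s, u)
        for v, cost in adj[u]:
            if d + cost < p[v]:
                p[v] = d + cost          # reduce the priority of v
                parent[v] = u
                heapq.heappush(heap, (p[v], v))
    return p, parent


adj = {
    "s": [("a", 4), ("b", 1)],
    "a": [("t", 1)],
    "b": [("a", 2), ("t", 5)],
    "t": [],
    "x": [("s", 1)],                     # x is not reachable from s
}
p, parent = dijkstra(adj, "s")
```

In this sample the cheapest route to t goes through b and then a, and the unreachable vertex x keeps infinite priority.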
Proposition 4. In the above described distance traversal of G, the final priority of a vertex represents
its distance from source s. Furthermore, the traversal tree formed yields the minimum-cost path from
the (root) s to every other node that is reachable from s (of course, if a node is not reachable from
s, then its priority is never reduced, and hence it finishes with infinite priority).
Proof of Proposition 4. Since the parent of vertex v is updated every time that v’s priority is
reduced, it follows that, if the final priority indeed represents d(s, v), then the path from s to v in
the resulting tree will represent the min-cost path from s to v. Hence, it suffices to prove that the
final priority does in fact represent d(s, v).
To show this, let s = v1, v2, . . . , vn denote the order in which vertices are popped from the heap H,
and let vk, 2 ≤ k ≤ n, be the first vertex whose final priority does not equal its distance from s.
Since the parent vj of vk has an index j < k (why?), its priority does in fact equal its distance from
s. Hence, the final priority of vk is d(s, vj) + c(vj, vk) and so represents the cost C of a path from s
to vk. Thus, we must have C > d(s, vk). Now let P be a minimum-cost path from s to vk and let w
be the first vertex of P that is not a member of {v1, . . . , vk−1}. Certainly, w ≠ vk, since otherwise the
priority of vk would have been correctly set to vk’s distance from s, upon the popping of node pw
from H, where pw ∈ {v1, . . . , vk−1} is the predecessor of w in path P. But then certainly, once pw
is popped from H, the priority of w will be reduced to a value that is no greater than d(s, vk) < C.
But this implies that w should be popped before vk, contradicting the fact that w ∉ {v1, . . . , vk−1}.
Hence, it must be the case that C = d(s, vk), and the proposition holds.
Example 9. Let G = (V, E, c), where
V = {a, b, c, d, e, f }
and the edges-costs are given by
E = {(a, b, 1), (a, c, 3), (b, c, 1), (b, d, 1), (b, e, 4), (c, d, 1), (c, f, 2),
(d, e, 1), (d, f, 3), (e, f, 2)}.
Draw the graph and construct the distance traversal tree rooted at a. Show the priorities of each
node in the original graph as the algorithm progresses.
Directed Acyclic Graphs (DAGs)
A directed acyclic graph (DAG) is simply a directed graph that has no cycles (i.e. paths of length
greater than zero that begin and end at the same vertex). DAGs have several practical applications.
One such example is to let T be a set of tasks that must be completed in order to complete a large
project, then one can form a graph G = (T, E), where (t1 , t2 ) ∈ E iff t1 must be completed before t2
can be started. Such a graph can be used to form a schedule for when each task should be completed,
and hence provide an estimate for when the project should be completed.
The following proposition suggests a way to efficiently check if a directed graph is a DAG.
Proposition 5. If directed graph G = (V, E) is a DAG, then it must have at least one vertex with out-degree
equal to zero.
Proof of Proposition 5. If DAG G did not have such a vertex, then one could construct a path
having arbitrary length. For example, if P is some path that ends at v ∈ V, then P can be extended
by adding to it vertex w, where (v, w) ∈ E. We know that w exists, since deg − (v) > 0. Thus, in
constructing a path of length greater than |V|, it follows that at least one vertex in V must occur
more than once. Letting P′ denote the subpath that begins and ends at this vertex, we see that P′
is a cycle, which contradicts the fact that G is a DAG.
The following algorithm makes use of Proposition 5.
Algorithm for Deciding if a Directed Graph is Acyclic. Let G = (V, E) be a directed graph.
Initialize queue Q with all vertices having out degree 0.
Initialize function d:V->N so that d(v) is the out degree of v.
While Q is nonempty:
    Exit node v from Q.
    For every u that is a parent of v:
        Decrement d(u) by the number of connections from u to v.
        If d(u) == 0:
            Enter u in Q.
If d(u) == 0 for all u in V:
    Return true.
Otherwise:
    Return false.
Note that a similar algorithm can be used to determine a topological sort of a DAG. Given DAG
G = (V, E), of order n, an ordering v1, . . . , vn of V is said to be topologically sorted iff, for all
edges (u, v) ∈ E, u comes before v in the ordering. The only difference with the topological sorting
algorithm is that we use in-degrees instead of out-degrees (and hence children in place of parents). In this manner, the order in which nodes
are removed from the queue is a topologically sorted order.
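The in-degree variant described above can be sketched in Python as follows. The sketch assumes the graph is a dict mapping every vertex to the list of its children (a repeated child stands for a repeated edge); the function name and the two sample graphs are illustrative. It returns a topologically sorted order when one exists, and None when some vertex never reaches in-degree zero, which happens exactly when the graph has a cycle.

```python
from collections import deque


def topological_sort(adj):
    """Return a topologically sorted list of vertices, or None if cyclic."""
    indeg = {u: 0 for u in adj}
    for u in adj:
        for v in adj[u]:
            indeg[v] += 1
    # Start with every vertex of in-degree 0.
    queue = deque(u for u in adj if indeg[u] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in adj[u]:
            indeg[v] -= 1                # edge (u, v) is accounted for
            if indeg[v] == 0:
                queue.append(v)
    # Every vertex was removed from the queue iff the graph is acyclic.
    return order if len(order) == len(adj) else None


dag = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
cyclic = {"a": ["b"], "b": ["a"]}
order = topological_sort(dag)
```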
Example 10. Given the DAG G = ({a, b, c, d, e, f }, E), with
E = {(a, b), (b, c), (b, d), (c, d), (e, f ), (e, b), (f, a), (f, d)},
use the above algorithm to verify that G is a DAG, and provide a topological sort for V .
Minimum Spanning Trees
Consider a problem in which roads are to be built that connect all four cities a, b, c, and d to one
another. In other words, after the roads are built, it will be possible to drive from any one city to
another. The costs of building a road between any two cities are given in the following table.
cities   a    b    c    d
a        -    30   20   50
b             -    50   10
c                  -    75
d                       -
Using this table, find a set of roads of minimum cost that will connect the cities.
Weighted Graphs. These are graphs of the form G = (V, E, c), where c : E → R+ represents a cost
function that maps edges to nonnegative real numbers. Weighted graphs are similar to networks,
except we assume that they are undirected.
Recall that a tree is a connected, acyclic (undirected) graph. Given an undirected graph G = (V, E)
of order n, a spanning tree of G is a tree T of order n that is a subgraph of G, meaning that T
contains every vertex of G and the edges of T all belong to G.
Proposition 6. The following statements are true about trees in general.
1. every tree with at least two vertices has at least one vertex of degree 1
2. every tree with n vertices has n − 1 edges
Proof of Proposition 6. Exercise.
Minimum Spanning Tree (MST). Let G = (V, E, c) be a weighted graph of order n. A minimum
spanning tree for G is a tree T having the following properties.
1. Spanning. T is a spanning tree of G.
2. Cost Minimality. $\sum_{e \in E(T)} c(e)$ is minimum with respect to all spanning trees of G.
Minimum Spanning Tree Algorithms
Prim’s Algorithm builds a minimum spanning tree (mst) in stages, where one edge is added to the
current tree at each stage.
Build a min heap H with data equal to vertex set V.
Select u0 in V to have priority p(u0)=0.
Set parent(u0) = NULL.
Assign all other vertices infinite priority.
Initialize all vertices of G as being unmarked.
While H is nonempty:
    Pop vertex u from H.
    Mark u.
    If parent(u) != NULL:
        Add edge (parent(u),u) to mst.
    For every unmarked v that is adjacent to u:
        If c(u,v) < p(v):
            Set parent(v) = u.
            Set p(v) to c(u,v) and adjust H.
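The following Python sketch illustrates Prim's Algorithm with one substitution: since Python's heapq module cannot adjust the priority of an entry in place, the heap here holds candidate edges (cost, tree vertex, outside vertex) rather than vertices, and an entry whose far endpoint is already marked is skipped when popped. The selected edges are the same kind of greedy choice as in the pseudocode above. The function name, the dict-of-lists representation, and the sample graph are assumptions, and the graph is assumed to be connected.

```python
import heapq


def prim(adj, start):
    """adj maps each vertex to a list of (neighbor, cost) pairs, with
    every undirected edge listed from both of its endpoints."""
    marked = {start}
    mst = []
    heap = [(c, start, v) for v, c in adj[start]]
    heapq.heapify(heap)
    while heap:
        c, u, v = heapq.heappop(heap)    # cheapest candidate edge (u, v)
        if v in marked:
            continue                     # both ends already in the tree
        marked.add(v)
        mst.append((u, v, c))
        for w, cw in adj[v]:
            if w not in marked:
                heapq.heappush(heap, (cw, v, w))
    return mst


edges = [("a", "b", 1), ("b", "c", 2), ("c", "d", 1), ("a", "d", 4), ("a", "c", 3)]
adj = {}
for u, v, c in edges:
    adj.setdefault(u, []).append((v, c))
    adj.setdefault(v, []).append((u, c))
mst = prim(adj, "a")
total_cost = sum(c for _, _, c in mst)
```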
Example 11. Use Prim’s algorithm to find an mst for the following weighted graph G = (V, E, c),
where the edges-costs are given by
E = {(a, b, 1), (a, c, 3), (b, c, 3), (c, d, 6), (b, e, 4), (c, e, 5), (d, f, 4), (d, g, 4),
(e, g, 5), (f, g, 2), (f, h, 1), (g, h, 2)}.
Proposition 7. Prim’s Algorithm returns a minimum spanning tree for input G = (V, E, c).
Proof of Proposition 7. Let T be the tree returned by Prim’s Algorithm on input G = (V, E, c)
and assume that e1, e2, . . . , en−1 is such that ei is added at Stage i of Prim’s Algorithm, for i =
1, 2, . . . , n − 1. Let T̂ be an MST for G which contains edges e1, . . . , ek−1, but does not contain ek,
for some 1 ≤ k ≤ n − 1. We show how to transform T̂ into an MST T′ that contains e1, . . . , ek.
Let Tk−1 denote the tree consisting of edges e1, . . . , ek−1; in other words, the tree that has been
constructed at the beginning of Stage k of Prim’s Algorithm. Consider the result of adding ek to T̂
to yield the new graph Tc. Then, since Tc is connected and has n edges, Tc has a cycle C containing
ek. Now since ek is selected at Stage k of Prim’s Algorithm, ek must be incident with exactly one
vertex of Tk−1. Hence, cycle C must enter Tk−1 via ek, and exit Tk−1 via some other edge e that is
not in Tk−1, but is incident with exactly one vertex of Tk−1. Thus, e was a candidate to be chosen at
Stage k, but was passed over in favor of ek. Hence, c(ek) ≤ c(e).
Now define T′ to be the tree obtained from T̂ by adding ek and removing e. Then T′ has n − 1
edges and remains connected, since any path in T̂ that must cross e can detour by traversing the
path C − {e} which is contained in T′. Thus, T′ is a tree and it is an MST since e was replaced with
ek which does not exceed e in cost. Notice that T′ agrees with T in the first k edges selected for T
in Prim’s Algorithm, whereas T̂ only agreed with T up to the first k − 1 selections. Therefore, by
repeating the above transformation a finite number of times, we will eventually construct an MST
that is identical with T. QED
Kruskal’s Algorithm also builds a minimum spanning tree in stages, except now each stage chooses
the edge of least cost which does not form a cycle in the current set of edges which forms a forest;
i.e. a collection of trees.
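Kruskal's Algorithm can be sketched in Python as follows. The notes do not say how to test whether an edge forms a cycle; this sketch assumes a union-find (disjoint-set) structure for that purpose, in which two vertices lie in the same tree of the current forest exactly when they have the same leader. The names kruskal, leader, and find, and the sample graph, are illustrative.

```python
def kruskal(vertices, edges):
    """edges is a list of (u, v, cost) triples of an undirected graph."""
    leader = {v: v for v in vertices}    # each vertex starts as its own tree

    def find(v):
        # Follow leader links to the representative of v's tree,
        # shortening the chain along the way (path halving).
        while leader[v] != v:
            leader[v] = leader[leader[v]]
            v = leader[v]
        return v

    mst = []
    for u, v, c in sorted(edges, key=lambda e: e[2]):   # least cost first
        ru, rv = find(u), find(v)
        if ru != rv:                     # (u, v) joins two different trees
            leader[ru] = rv              # merge the two trees
            mst.append((u, v, c))
    return mst


edges = [("a", "b", 1), ("b", "c", 2), ("c", "d", 1), ("a", "d", 4), ("a", "c", 3)]
mst = kruskal("abcd", edges)
total_cost = sum(c for _, _, c in mst)
```

In this sample the edges (a, c) and (a, d) are rejected because each would close a cycle in the forest built so far.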
Example 12. Use Kruskal’s Algorithm to find an mst for the graph of Example 11.
Kruskal’s, Prim’s, and Dijkstra’s algorithms are called greedy algorithms, since they each involve the
construction of a solution in stages, where in each stage a selection (i.e. greedy choice) is made
based on some criterion. In Dijkstra’s algorithm, each stage selects the vertex with minimum distance
from the source. In Prim’s algorithm, each stage selects the vertex that has the least connection cost
to the tree that is currently under construction. And in the case of Kruskal’s algorithm, each stage
selects the edge of minimum cost that does not create a cycle in the forest that is currently under
construction. Greedy algorithms represent one of the most common heuristics for solving problems.
In many cases, however, a greedy algorithm does not yield the best solution.
Network Flows
Some important practical problems involve finding the best way of sending resources through
a network. As an example, consider the problem of finding a route through a street network which
minimizes the driving time of a commuter. In the previous lecture we witnessed how to solve this
problem using Dijkstra’s Algorithm, assuming that we can model the street network as a weighted
graph where the weight of each edge represents the estimated driving time needed to traverse that
edge (street). As a second example, suppose a computer has a packet of information that it must
send through a network. There may exist many ways to route the information through the network.
It may even be possible to distribute the information in parallel, using all the channels at once, and
then reassembling the packet at the final destination.
Network Flow. Let G = (V, E, c, s, t) be a directed network, where c : E → R+ determines the
capacity of each edge, s ∈ V is the designated source vertex and t ∈ V is the designated sink
vertex. A flow for the network is a function f : E → R+ with the following properties:
1. For every e ∈ E, f (e) ≤ c(e). In other words, the flow through an edge should not exceed the
edge’s capacity.
2. For every vertex v, let E + (v) equal the set of edges that end at v, and E − (v) the set of edges
that start at v. Then for every intermediate vertex v ∈ V − {s, t}, we have
$\sum_{e \in E^{-}(v)} f(e) = \sum_{e \in E^{+}(v)} f(e)$.
In other words, the total flow going into a vertex must equal the total flow going out. Note
that it is assumed that E + (s) = E − (t) = ∅.
Example 13. For the directed network below, give an example of a flow for this network.
Network Flow Problem. Let G = (V, E, c, s, t) be a directed network. Find a flow f : E → R+
which maximizes
$\sum_{e \in E^{-}(s)} f(e)$;
in other words a flow which represents a maximal amount of resources leaving from source s.
To solve the network flow problem for network G = (V, E, c, s, t), we adopt the strategy of defining an
initial flow f, and then trying to increase f in stages. To do this, we need to define the following structure.
The augmented network N(G, f) with respect to G and f is the network N(G, f) = (V, E′, c′, s, t),
where V, s, and t are the same as for G, and E′, c′ are defined as follows:
1. E′ = E ∪ Ef, where (u, v) ∈ Ef if and only if e = (v, u) ∈ E and f(e) > 0.
2. for e ∈ E, c′(e) = c(e) − f(e), and for e = (u, v) ∈ Ef, c′(e) = f((v, u)).
In passing, notice that, since f(e) ≤ c(e), the capacity function c′ is properly defined, since all
capacities are nonnegative.
Example 14. For the network G and flow f of Example 13, draw the augmented network N (G, f ).
Proposition 8. Let G = (V, E, c, s, t) be a network, f a flow for the network, and N (G, f ) the
augmented network. If ∆f is a flow for N (G, f ), then f + ∆f is a flow for G, where (f + ∆f )(e) is
defined as follows:
• If e = (u, v) ∈ E, let er = (v, u) be the reversal of e. Then
(f + ∆f )(e) = f (e) + ∆f (e) − ∆f (er ).
Proof of Proposition 8. We leave this as an exercise. There are two items to establish. The first
is that (f + ∆f )(e) is nonnegative and within the capacity of e. Secondly, we must establish that
flow is still conserved at each intermediate vertex: any increase (decrease) in the flow entering a vertex
must equal the increase (decrease) in the flow leaving that vertex.
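The above proposition provides for an algorithm for finding the maximum flow for a network G:
Max-Flow Algorithm
1. Stage 0: begin with the 0-flow f0 ; i.e. f0 (e) = 0, for every e ∈ E.
2. Stage k ≥ 1 (repeat until augmented network N (G, fk−1 ) does not possess such a path):
Form the augmented network N (G, fk−1 ) and find a path p from s to t in which every edge has
positive capacity. Let ∆f be the flow for N (G, fk−1 ) that sends b units along each edge of p and 0 units
along every other edge, where b is the least capacity in N (G, fk−1 ) of an edge of p. Let fk = fk−1 + ∆f.
3. return fk if N (G, fk ) does not possess a path from s to t whose edges all have positive capacity.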
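The stages above can be sketched in Python as follows. Several choices here are assumptions not fixed by the notes: the path in the augmented network is found by breadth-first search (the notes allow any path), the network is given as a dict mapping each edge (u, v) to its capacity, and it is assumed that the network never contains both (u, v) and (v, u). The dict named residual stores the capacities of the augmented network for the current flow, and each stage pushes the least capacity found along the path.

```python
from collections import deque


def max_flow(capacity, s, t):
    """capacity maps each directed edge (u, v) to c(u, v) >= 0.
    Assumes no pair of opposite edges (u, v) and (v, u)."""
    residual = dict(capacity)            # augmented capacities for the 0-flow
    for (u, v) in capacity:
        residual.setdefault((v, u), 0)   # reversed edges start with capacity 0
    adj = {}
    for (u, v) in residual:
        adj.setdefault(u, []).append(v)

    value = 0
    while True:
        # Search for a path from s to t along edges of positive capacity.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in adj.get(u, []):
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            break                        # no path remains: the flow is maximum
        # Recover the path's edges and its least capacity b.
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        b = min(residual[e] for e in path)
        for (u, v) in path:
            residual[(u, v)] -= b        # less room left on the edge used
            residual[(v, u)] += b        # more flow that can be undone
        value += b
    flow = {e: capacity[e] - residual[e] for e in capacity}
    return value, flow


capacity = {("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1, ("a", "t"): 2, ("b", "t"): 3}
value, flow = max_flow(capacity, "s", "t")
```

For this sample network both edges leaving s end up saturated, so the maximum flow has value 5.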
Example 15. Use the maximum flow algorithm to find a maximum flow for the network in Example
13.
Example 16. Use the maximum flow algorithm to find a maximum flow for the following network.
Assume the source is x0 and the sink is z.
Exercises.
1. The cubic graph Qn , n ≥ 1, has vertex set equal to the set of all binary strings of length n.
Moreover, two vertices are adjacent iff they differ in exactly one bit place. For example, in Q3 ,
000 is adjacent to 010, but not to 011. Draw Q1 , Q2 , and Q3 . Show that Q3 has a Hamilton
cycle, i.e. a cycle that visits every vertex exactly once.
2. Provide formulas for both the order and size of Qn . Explain.
3. Prove that for any simple graph having n vertices, there exist ⌈n/2⌉ vertices such that at
least 3/4 of all edges are incident with at least one of them.
4. Starting at vertex 000, perform a breadth-first traversal of Q3 . Assume all adjacency lists are
in numerical order. For example, (000, 001) occurs before (000, 010). Repeat using a depth-first
traversal. In both cases show the resulting spanning trees.
5. Show the resulting spanning forests for both a breadth-first and depth-first traversal of the
directed graph having vertex set a-k and edges
{(j, a), (j, g), (a, b), (a, e), (b, c), (c, k), (d, e), (e, c), (e, f ), (e, i), (f, k),
(g, d), (g, e), (g, h), (h, e), (h, i), (i, f ), (i, k)}.
6. Give a linear-time algorithm that determines if a simple graph has any odd cycles. Hint:
perform a breadth-first traversal and mark the visited nodes with one of two colors (either red
or blue). Whenever a (parent) node that is removed from the queue reaches an unvisited node,
mark that node and give it the opposite color of its parent. What happens when the child node
is already visited/marked with the same color? with the opposite color?
7. Perform the acyclic-topological sort algorithm on the directed graph having vertex set a-k and
edges
{(j, a), (j, g), (a, b), (a, e), (b, c), (c, k), (d, e), (e, c), (e, f ), (e, i), (f, k),
(g, d), (g, e), (g, h), (h, e), (h, i), (i, f ), (i, k)}.
Show the state of the queue each time it changes state (by either inserting or removing a vertex).
8. For the directed graph whose vertices are a-g, and whose edges are given by
{(a, b), (b, g), (g, e), (b, e), (b, c), (a, c), (c, e), (c, d), (e, d),
(e, f ), (d, f ), (d, a), (g, f ), (f, d), (d, b)}.
perform two depth-first traversals (one on G, the other on Gr ) to obtain the strongly connected
components of G.
9. Draw the simple graph whose vertices are a-k, and whose edges are given by
{(a, c), (a, d), (c, b), (c, d), (c, f ), (b, e), (e, f ), (e, i), (e, h), (f, g), (h, j),
(i, k), (j, k)}.
Determine the articulation points of the graph by performing a depth-first traversal starting at
vertex a, and computing num(v) and low(v) for each node.
10. Draw the weighted graph whose vertices are a-j, and whose edge weights are given by
{(a, b, 3), (a, d, 4), (a, e, 4), (b, c, 10), (b, e, 2), (b, f, 3), (c, f, 6), (c, g, 1), (d, e, 5),
(d, h, 6), (e, f, 11), (e, h, 2), (e, i, 1), (f, g, 2), (f, i, 1), (f, j, 11), (h, i, 4), (i, j, 7)}.
Perform Kruskal’s algorithm to obtain a minimum spanning tree for G. Repeat the exercise
using Prim’s algorithm.
11. Do Prim’s and Kruskal’s algorithms work if negative weights are allowed? Explain.
12. Explain how Prim’s and/or Kruskal’s algorithm can be used to find a maximum spanning tree.
Hint: neither algorithm needs modification, if the input graph is suitably modified.
13. Draw the weighted directed graph whose vertices are a-g, and whose edge weights are given
by
{(a, b, 2), (b, g, 1), (g, e, 1), (b, e, 3), (b, c, 2), (a, c, 5), (c, e, 2), (c, d, 7), (e, d, 3),
(e, f, 8), (d, f, 1), (d, a, 2)}.
Perform Dijkstra’s algorithm to determine the distances and shortest paths from a to every
other node.
14. Let G be a graph with vertices 0, 1, . . . , n−1, and let parent be an array, where parent[i] denotes
the parent of i for some shortest path from vertex 0 to vertex i. Assume parent[0] = −1;
meaning that 0 has no parent. Provide a recursive implementation of the function
void print_optimal_path(int i, int[] parent)
which prints from left to right the optimal path from vertex 0 to vertex i. You may assume
access to a print() function that is able to print strings, integers, characters, etc. For example,
print i
print "Hello"
print ’,’
are all legal uses of print.
15. What is the worst-case running time of Dijkstra’s algorithm if nodes are stored in a d-heap?
Assume that it takes O(1) steps to locate a node in the heap.
16. Let G be a simple undirected graph that has at least one bridge. Argue that there is no way
to orient G’s edges (i.e. turn it into a directed graph) so that the resulting directed graph will
be strongly connected. Is this also true for graphs that have at least one articulation point?
Either argue as such, or give a counterexample.
17. Draw the directed network whose vertices are a-j, and whose edge capacities are given
by
{(a, b, 6), (a, d, 8), (a, e, 8), (b, c, 2), (b, e, 2), (b, f, 3), (c, f, 6), (c, g, 1), (d, e, 5),
(d, h, 6), (e, f, 11), (e, h, 2), (e, i, 1), (f, g, 2), (f, i, 1), (f, j, 11), (h, i, 4), (i, j, 7)}.
Considering vertex a as the source, and vertex j as the sink, use the max-flow algorithm to
determine the maximum flow that can leave a and enter j. Start with a 0-flow, and successively
list the augmenting paths (and their flow contributions) that increase the flow at each stage of
the algorithm.
18. Draw the undirected graph whose vertices are a-k and whose edges are given by
{(j, a), (j, g), (a, b), (a, e), (b, c), (c, k), (d, e), (e, c), (e, f ), (e, i), (f, k),
(g, d), (g, e), (g, h), (h, e), (h, i), (i, f ), (i, k)}.
Assuming each edge has unit capacity, use the max-flow algorithm to determine the maximum
flow that can leave a and enter k. Shade all the edges that are used in the flow.
19. At a school ice-cream party there are five dixie cups of ice cream that remain to be served.
Each cup has a different flavor: vanilla, chocolate, cherry, rocky road, and mint and chip.
There are five children who have yet to be served: Abe, Ben, Cris, Dan, and Eva. The
ice-cream preferences of these children are shown below.
Child
Abe
Ben
Cris
Dan
Eva
Vanilla
X
Chocolate
Cherry
X
Rocky Road
X
X
X
Mint & Chip
X
X
X
X
X
In a rush to get their ice cream, Abe grabbed the cherry, Cris the chocolate, Dan the mint and
chip, and Eva the rocky road. This left Ben with a (vanilla) flavor that he does not like, and
which he refused to eat. Show how the max-flow/matching algorithm can be used to increase
the current matching (of four children to four ice creams that they prefer) to a matching of size
five, in which each child will be assigned an ice cream that he or she prefers.