2. Network flows

A network consists of a collection of locations along with connections between them. The locations, called the nodes of the network, can correspond
to places of various kinds such as factories, warehouses, stores or customers.
The connections, known as arcs, tie together certain pairs of nodes; they may
designate roads, cables or flight corridors. A network model is concerned with a
flow of something — packages, for example, or messages or electricity — from
node to node along the arcs.
To be specific, for this case study we consider a network in which nodes are
cities, arcs are highways, and flows are truckloads of manufactured goods.
It is convenient to represent a network by a diagram in which the nodes are
drawn as circles and the arcs as arrows connecting circles. Thus our cities are
circles and our highways are arrows, as in this simple example:
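[Figure: network diagram. Nodes: 1 Los Angeles, 2 Reno, 3 Phoenix, 4 Denver, 5 Houston, 6 Dallas. Dashed arrows into nodes: 10 at Los Angeles, 5 at Reno. Dashed arrows out of nodes: 3 at Houston, 12 at Dallas. Solid arrows: 1 → 2, 1 → 3, 2 → 3, 2 → 4, 2 → 5, 3 → 5, 4 → 5, 4 → 6, 5 → 6.]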
The dashed arrows pointing into cities represent amounts produced there and
available for shipment out: 10 truckloads at Los Angeles and 5 at Reno. The
dashed arrows pointing out of cities represent amounts sold there and hence
needed as shipments in: 3 truckloads at Houston and 12 at Dallas. Solid arrows
represent possible directions of shipments between cities. The network flow
problem seeks to ship all of the supply from Los Angeles and Reno so as
to meet all of the demand at Houston and Dallas. A solution to the problem
tells, for each city pair represented by an arrow, how many truckloads to ship
between the cities in the direction of the arrow.
Flow balance equations. The shipment amounts in a solution must satisfy
certain linear equations to ensure that all truckloads are accounted for. In general terms, these equations say that, at each city, the amount coming in must
equal the amount going out. In our particular case, goods can come into a city
by being produced there or by being shipped in from other cities; goods can go
out from a city by being sold there or by being shipped out to other cities. Thus,
in terms of the network, the equations can be stated as follows:
Balance of flow: At each node, demand plus total shipments out must
equal supply plus total shipments in.
In our example, supply is 10 at Los Angeles and 5 at Reno, while demand is 3 at
Houston and 12 at Dallas. All other supply and demand amounts are 0.
To turn this word description into equations that we can solve, we’ll write
xij to represent the flow from node i to node j, where the numbering of nodes
is as given in the diagram. Then at node 2, for example, shipments in are given
by x12 and shipments out by x23 + x24 + x25 . Together with the supply of 5 and
demand of zero, this gives us the balance equation 0 + x23 + x24 + x25 = 5 + x12 .
At node 3, the flow in is x13 + x23 and the flow out is x35 , while both supply and
demand are zero, leading us to the balance equation 0 + x35 = 0 + x13 + x23 .
Proceeding in this way, and dropping the zeroes, we arrive at the following
balance equations for the six nodes:
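\[
\begin{aligned}
x_{12} + x_{13} &= 10 \\
x_{23} + x_{24} + x_{25} &= 5 + x_{12} \\
x_{35} &= x_{13} + x_{23} \\
x_{45} + x_{46} &= x_{24} \\
3 + x_{56} &= x_{25} + x_{35} + x_{45} \\
12 &= x_{46} + x_{56}
\end{aligned}
\]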
The flows through the network are described by the values of nine variables
(corresponding to the nine arcs) that must satisfy six linear equations (corresponding to the six nodes).
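The balance rule can be checked mechanically. The following is a minimal Python sketch, assuming the arc list implied by the six equations above; the names `ARCS`, `SUPPLY`, `DEMAND` and `is_balanced` are illustrative and not part of the case study.

```python
# Arcs of the example network as (from_node, to_node) pairs.
ARCS = [(1, 2), (1, 3), (2, 3), (2, 4), (2, 5), (3, 5), (4, 5), (4, 6), (5, 6)]
SUPPLY = {1: 10, 2: 5}   # Los Angeles, Reno
DEMAND = {5: 3, 6: 12}   # Houston, Dallas


def is_balanced(flow):
    """True if demand + shipments out == supply + shipments in at every node."""
    for node in range(1, 7):
        out_total = sum(flow.get((i, j), 0) for (i, j) in ARCS if i == node)
        in_total = sum(flow.get((i, j), 0) for (i, j) in ARCS if j == node)
        if DEMAND.get(node, 0) + out_total != SUPPLY.get(node, 0) + in_total:
            return False
    return True


# A sample flow, in truckloads; arcs that are not listed carry zero flow.
sample_flow = {(1, 2): 10, (2, 4): 12, (2, 5): 3, (4, 6): 12}
print(is_balanced(sample_flow))      # True
print(is_balanced({(1, 2): 10}))     # False: node 2 receives 10 but ships nothing
```

Any assignment of truckloads to arcs can be tested this way before it is treated as a solution.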
In order to characterize the possible solutions, we first put these equations
into matrix form. We move variables to the left of the = sign and constants
to the right, and give a coefficient of 0 to each variable that does not appear
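explicitly. Then the balance of flow can be regarded as a 6 × 9 equation system:
\[
\begin{array}{c}
\left[\begin{array}{rrrrrrrrr}
 1 &  1 &  0 &  0 &  0 &  0 &  0 &  0 &  0 \\
-1 &  0 &  1 &  1 &  1 &  0 &  0 &  0 &  0 \\
 0 & -1 & -1 &  0 &  0 &  1 &  0 &  0 &  0 \\
 0 &  0 &  0 & -1 &  0 &  0 &  1 &  1 &  0 \\
 0 &  0 &  0 &  0 & -1 & -1 & -1 &  0 &  1 \\
 0 &  0 &  0 &  0 &  0 &  0 &  0 & -1 & -1
\end{array}\right] \\[4pt]
\begin{array}{ccccccccc}
x_{12} & x_{13} & x_{23} & x_{24} & x_{25} & x_{35} & x_{45} & x_{46} & x_{56}
\end{array}
\end{array}
\; x =
\left[\begin{array}{r} 10 \\ 5 \\ 0 \\ 0 \\ -3 \\ -12 \end{array}\right],
\]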
where x = (x12 , x13 , x23 , x24 , x25 , x35 , x45 , x46 , x56 ). We have placed the name
of each variable under its column of coefficients in the matrix; this list of names
has no mathematical significance, but it makes the equations easier to interpret.
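The coefficient matrix follows a simple pattern: the column for arc i → j has +1 in row i and −1 in row j. The sketch below builds the matrix from the arc list under that observation; `incidence_matrix` and the variable names are illustrative choices, not notation from the text.

```python
ARCS = [(1, 2), (1, 3), (2, 3), (2, 4), (2, 5), (3, 5), (4, 5), (4, 6), (5, 6)]
# Right-hand side: supply is positive, demand is negative, other nodes are 0.
NET_SUPPLY = {1: 10, 2: 5, 5: -3, 6: -12}


def incidence_matrix(arcs, num_nodes):
    """One row per node, one column per arc: +1 where the arc leaves, -1 where it enters."""
    rows = [[0] * len(arcs) for _ in range(num_nodes)]
    for col, (i, j) in enumerate(arcs):
        rows[i - 1][col] = 1
        rows[j - 1][col] = -1
    return rows


A = incidence_matrix(ARCS, 6)
b = [NET_SUPPLY.get(node, 0) for node in range(1, 7)]
for row, rhs in zip(A, b):
    print(row, "|", rhs)
```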
When Gaussian elimination is applied to these equations, the result is the
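following equivalent system in echelon form:
\[
\begin{array}{c}
\left[\begin{array}{rrrrrrrrr}
1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{array}\right] \\[4pt]
\begin{array}{ccccccccc}
x_{12} & x_{13} & x_{23} & x_{24} & x_{25} & x_{35} & x_{45} & x_{46} & x_{56}
\end{array}
\end{array}
\; x =
\left[\begin{array}{r} 10 \\ 15 \\ 15 \\ 15 \\ 12 \\ 0 \end{array}\right].
\]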
The pivot (or “basic”) variables, corresponding to the columns in which the echelon form “steps” down, are x12 , x13 , x24 , x25 and x46 . The free variables are
the other four: x23 , x35 , x45 and x56 .
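The elimination can be reproduced in a few lines. This is a sketch of plain forward elimination using exact fractions, not necessarily the exact sequence of row operations used to produce the system above; `augmented_matrix` and `echelon_form` are hypothetical helper names.

```python
from fractions import Fraction

ARCS = [(1, 2), (1, 3), (2, 3), (2, 4), (2, 5), (3, 5), (4, 5), (4, 6), (5, 6)]
B = [10, 5, 0, 0, -3, -12]


def augmented_matrix(arcs, b):
    """Node-arc coefficient rows with the right-hand side appended as a last column."""
    rows = [[0] * len(arcs) + [rhs] for rhs in b]
    for col, (i, j) in enumerate(arcs):
        rows[i - 1][col] = 1
        rows[j - 1][col] = -1
    return rows


def echelon_form(aug, num_vars):
    """Forward elimination; returns the echelon rows and the pivot column indices."""
    rows = [[Fraction(v) for v in row] for row in aug]
    pivots = []
    r = 0
    for col in range(num_vars):
        if r == len(rows):
            break
        # Look for a row at or below r with a nonzero entry in this column.
        pivot_row = next((k for k in range(r, len(rows)) if rows[k][col] != 0), None)
        if pivot_row is None:
            continue  # no pivot here, so this variable is free
        rows[r], rows[pivot_row] = rows[pivot_row], rows[r]
        for k in range(r + 1, len(rows)):
            factor = rows[k][col] / rows[r][col]
            if factor != 0:
                rows[k] = [a - factor * p for a, p in zip(rows[k], rows[r])]
        pivots.append(col)
        r += 1
    return rows, pivots


rows, pivots = echelon_form(augmented_matrix(ARCS, B), len(ARCS))
print([ARCS[c] for c in pivots])   # arcs of the pivot variables
print([int(row[-1]) for row in rows])  # right-hand sides after elimination
```

For this system the pivots land on the arcs 1 → 2, 1 → 3, 2 → 4, 2 → 5 and 4 → 6, and the last row reduces to all zeroes.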
There are only 5 pivot variables for these 6 equations, because the equation at
the bottom turns out to be all zeroes. Since that equation reads 0 = 0, the system
is consistent, and since four of its variables are free, there must be
infinitely many solutions. To get a general form for the solutions, we can substitute x23 = c23 , x35 = c35 , x45 = c45 , and x56 = c56 for the free variables;
then the first five equations become an upper-triangular system in the pivot
variables:
\[
\begin{aligned}
x_{12} + x_{13} &= 10 \\
x_{13} + x_{24} + x_{25} &= 15 - c_{23} \\
x_{24} + x_{25} &= 15 - c_{35} \\
x_{25} + x_{46} &= 15 - c_{35} - c_{45} \\
x_{46} &= 12 - c_{56}
\end{aligned}
\]
These equations are easily solved to give
\[
\begin{aligned}
x_{46} &= 12 - c_{56} \\
x_{25} &= 3 - c_{35} - c_{45} + c_{56} \\
x_{24} &= 12 + c_{45} - c_{56} \\
x_{13} &= 0 - c_{23} + c_{35} \\
x_{12} &= 10 + c_{23} - c_{35}
\end{aligned}
\]
which together with
\[
\begin{aligned}
x_{23} &= c_{23} \\
x_{35} &= c_{35} \\
x_{45} &= c_{45} \\
x_{56} &= c_{56}
\end{aligned}
\]
gives a general solution to the network flow equation system. This solution may
be rewritten in vector notation as
\[
\left[\begin{array}{c} x_{12}\\ x_{13}\\ x_{23}\\ x_{24}\\ x_{25}\\ x_{35}\\ x_{45}\\ x_{46}\\ x_{56} \end{array}\right]
=
\left[\begin{array}{r} 10\\ 0\\ 0\\ 12\\ 3\\ 0\\ 0\\ 12\\ 0 \end{array}\right]
+ \left[\begin{array}{r} 1\\ -1\\ 1\\ 0\\ 0\\ 0\\ 0\\ 0\\ 0 \end{array}\right] c_{23}
+ \left[\begin{array}{r} -1\\ 1\\ 0\\ 0\\ -1\\ 1\\ 0\\ 0\\ 0 \end{array}\right] c_{35}
+ \left[\begin{array}{r} 0\\ 0\\ 0\\ 1\\ -1\\ 0\\ 1\\ 0\\ 0 \end{array}\right] c_{45}
+ \left[\begin{array}{r} 0\\ 0\\ 0\\ -1\\ 1\\ 0\\ 0\\ -1\\ 1 \end{array}\right] c_{56}.
\]
Several interesting properties of the network can be deduced from this representation.
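Before turning to those properties, the general solution can be spot-checked numerically. The sketch below encodes the formulas for the nine flows and confirms that the balance equations hold for arbitrary parameter values; `general_solution` and `satisfies_balance` are illustrative names.

```python
ARCS = [(1, 2), (1, 3), (2, 3), (2, 4), (2, 5), (3, 5), (4, 5), (4, 6), (5, 6)]
B = {1: 10, 2: 5, 3: 0, 4: 0, 5: -3, 6: -12}  # supply positive, demand negative


def general_solution(c23, c35, c45, c56):
    """Flows on all nine arcs for given values of the four free parameters."""
    return {
        (1, 2): 10 + c23 - c35,
        (1, 3): -c23 + c35,
        (2, 3): c23,
        (2, 4): 12 + c45 - c56,
        (2, 5): 3 - c35 - c45 + c56,
        (3, 5): c35,
        (4, 5): c45,
        (4, 6): 12 - c56,
        (5, 6): c56,
    }


def satisfies_balance(flow):
    """True if shipments out minus shipments in equals b_i at every node i."""
    for node, rhs in B.items():
        out_total = sum(x for (i, j), x in flow.items() if i == node)
        in_total = sum(x for (i, j), x in flow.items() if j == node)
        if out_total - in_total != rhs:
            return False
    return True


print(satisfies_balance(general_solution(0, 0, 0, 0)))    # True
print(satisfies_balance(general_solution(2, 3, 1, 4)))    # True
print(satisfies_balance(general_solution(-1, 7, 0, 2)))   # True, though some flows are negative
```

The last call is a reminder that the equations alone do not force the flows to be nonnegative.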
Cycles of flow. Consider first the solution obtained by setting c23 = c35 =
c45 = c56 = 0. In this case the solution has positive flows only along the five
arcs that correspond to the basic (pivot) variables. To depict this solution in
a network diagram, we can thicken the 5 “basic arcs” and show the amount of
flow next to each one:
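[Figure: the network diagram with the five basic arcs thickened and labeled with their flows: 10 on 1 → 2, 0 on 1 → 3, 12 on 2 → 4, 3 on 2 → 5, and 12 on 4 → 6.]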
The basic arcs form a structure in the network known as a spanning tree: they
reach every node (hence are “spanning”) yet form no loops (hence are a “tree”).
It turns out that, for any network flow equations, every set of basic variables
corresponds to some spanning tree. Conversely, every spanning tree gives rise
to a triangular system of equations, which has a unique solution when the flows
on the other arcs are fixed.
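Whether a set of arcs is a spanning tree can be tested directly. The following sketch uses a small union-find structure, one standard way to detect loops; it is offered as an illustration and is not part of the elimination procedure described here.

```python
def is_spanning_tree(arcs, num_nodes):
    """True if the arcs, ignoring direction, reach every node and form no loop."""
    if len(arcs) != num_nodes - 1:
        return False  # a tree on n nodes has exactly n - 1 arcs
    parent = list(range(num_nodes + 1))  # index 0 is unused

    def find(v):
        # Follow parent links to the representative of v's component.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for i, j in arcs:
        root_i, root_j = find(i), find(j)
        if root_i == root_j:
            return False  # i and j are already connected, so this arc closes a loop
        parent[root_i] = root_j
    return True  # n - 1 arcs and no loop means every node is reached


basic_arcs = [(1, 2), (1, 3), (2, 4), (2, 5), (4, 6)]
print(is_spanning_tree(basic_arcs, 6))                                    # True
print(is_spanning_tree([(1, 2), (1, 3), (2, 3), (2, 4), (4, 6)], 6))      # False
```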
Now consider the term that contains c23 in the general-form solution above.
It adds +c23 to x12 , −c23 to x13 , and +c23 to x23 . Looking at the network
diagram, you can see that x12 , x13 and x23 form a loop or cycle. In fact, exactly
this one cycle is formed when nonbasic arc x23 is added to the spanning tree. If
you go around this cycle forward through arc 2 → 3, you will then go backward
3 ← 1 and forward 1 → 2; this observation corresponds to the fact that you add
c23 to x23 , subtract it from x13 , and add it to x12 .
In summary, you can think of the c23 term in the general solution as representing an adjustment of flow around a cycle. This adjustment leaves the
equations satisfied at all nodes, but changes the solution on the cycle’s arcs.
The other three terms in the general solution correspond, in an analogous way,
to cycles created when the other nonbasic arcs are added to the spanning tree.
The addition of arc 3 → 5, for example, gives a cycle that goes forward through 3 → 5, backward 5 ← 2 and 2 ← 1, and forward 1 → 3.
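The cycle created by a nonbasic arc can be found by walking through the tree from the head of the new arc back to its tail. The sketch below does this with a depth-first search and records +1 for arcs traversed forward and −1 for arcs traversed backward; `cycle_signs` is a hypothetical helper name.

```python
def cycle_signs(tree_arcs, new_arc):
    """Map each arc on the cycle to +1 (traversed forward) or -1 (backward)."""
    start, goal = new_arc[1], new_arc[0]  # after i -> j, return from j to i
    neighbors = {}
    for (i, j) in tree_arcs:
        neighbors.setdefault(i, []).append((j, (i, j), +1))  # along the arrow
        neighbors.setdefault(j, []).append((i, (i, j), -1))  # against the arrow
    stack = [(start, {new_arc: +1})]
    seen = {start}
    while stack:
        node, signs = stack.pop()
        if node == goal:
            return signs
        for nxt, arc, sign in neighbors.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append((nxt, {**signs, arc: sign}))
    raise ValueError("the tree does not connect the ends of the new arc")


tree = [(1, 2), (1, 3), (2, 4), (2, 5), (4, 6)]
print(cycle_signs(tree, (2, 3)))
print(cycle_signs(tree, (3, 5)))
```

The signs returned for each nonbasic arc agree with the nonzero entries of the corresponding vector in the general solution.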
Suppose you start with the solution given by c23 = c35 = c45 = c56 = 0, but
increase c35 to a positive value t. Then according to the general-form solution,
the flow increases on x13 and x35 , while it decreases on x12 and x25 :
[Figure: the network diagram with flows 10 − t on 1 → 2, 0 + t on 1 → 3, 12 on 2 → 4, 3 − t on 2 → 5, t on 3 → 5, and 12 on 4 → 6.]
Clearly all the flows remain nonnegative for 0 ≤ t ≤ 3. At t = 3, the flow on
2 → 5 falls to zero, and the remaining nonzero flows are x12 = 7, x13 = 3, x24 =
12, x35 = 3, and x46 = 12. These five variables make up another basic solution
(which we might have arrived at had we ordered the variables differently when
performing Gaussian elimination). You can also check that the arcs corresponding to these variables make up another, different spanning tree in the network.
Thus, given a nonnegative basic solution to the network flow equations, we
have a simple way of moving to another such solution. There are also an infinity
of nonnegative solutions between these two, given by setting t to values between
0 and 3.
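The largest usable value of t is the smallest current flow among the arcs whose flow decreases around the cycle. A minimal sketch of that calculation follows, assuming cycle adjustments of ±1 per unit of t as in this example; `max_step` is an illustrative name.

```python
def max_step(flow, direction):
    """Largest t keeping flow + t * direction nonnegative, and the arc that reaches zero."""
    return min((flow[arc], arc) for arc, sign in direction.items() if sign < 0)


# Starting basic solution, in truckloads.
flow = {(1, 2): 10, (1, 3): 0, (2, 4): 12, (2, 5): 3, (3, 5): 0, (4, 6): 12}
# Change per unit of t when flow is pushed around the cycle through 3 -> 5.
direction = {(1, 2): -1, (1, 3): +1, (2, 5): -1, (3, 5): +1}

t_max, leaving_arc = max_step(flow, direction)
new_flow = {arc: x + t_max * direction.get(arc, 0) for arc, x in flow.items()}
print(t_max, leaving_arc)   # 3 (2, 5)
print(new_flow)
```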
Minimum-cost solutions. Among all the solutions to the flow balance equations, which should be preferred? There’s no good way to tell, from only the
information given. Suppose however that we also know a cost per truckload for
each arc, expressed as follows in hundreds of dollars:
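[Figure: the network diagram with a cost per truckload marked on each arc: $2 on 1 → 2, $2 on 1 → 3, $6 on 2 → 4, $9 on 2 → 5, $7 on 3 → 5, $3 on 4 → 6, and the values $1, $4 and $2 on the remaining arcs 2 → 3, 4 → 5 and 5 → 6.]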
Then the cost of all shipments on a particular arc is given by the cost per truckload times the number of truckloads: 2x12 for shipments on 1 → 2, 9x25 for
shipments on 2 → 5, and so forth. Ignoring the variables at zero, the total cost
for all shipments in our original spanning-tree solution is
$200x12 + $200x13 + $600x24 + $900x25 + $300x46
= $200 · 10 + $200 · 0 + $600 · 12 + $900 · 3 + $300 · 12 = $15500,
while the total cost for the new spanning-tree solution that we derived by increasing the flow on x35 to 3 is
$200x12 + $200x13 + $600x24 + $700x35 + $300x46
= $200 · 7 + $200 · 3 + $600 · 12 + $700 · 3 + $300 · 12 = $14900.
The second solution turns out to be the better one.
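The two totals can be reproduced with a short calculation. This sketch uses only the per-truckload costs quoted in the text, converted to dollars; `total_cost` is an illustrative name.

```python
# Dollars per truckload on the arcs that carry flow in either solution.
COST = {(1, 2): 200, (1, 3): 200, (2, 4): 600, (2, 5): 900, (3, 5): 700, (4, 6): 300}

first_solution = {(1, 2): 10, (1, 3): 0, (2, 4): 12, (2, 5): 3, (4, 6): 12}
second_solution = {(1, 2): 7, (1, 3): 3, (2, 4): 12, (3, 5): 3, (4, 6): 12}


def total_cost(flow):
    """Sum of cost per truckload times truckloads over the arcs in the flow."""
    return sum(COST[arc] * amount for arc, amount in flow.items())


print(total_cost(first_solution))    # 15500
print(total_cost(second_solution))   # 14900
```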
We could have checked whether the second solution would be preferable,
before going to the trouble of computing it. Recall that for each truckload added
on 3 → 5 (that is, for each unit of increase in our parameter t) we must
remove a truckload from 2 → 5, remove a truckload from 1 → 2, and add a
truckload on 1 → 3. Thus each increase of t by one truckload incurs a cost of
$700 more on 3 → 5, $900 less on 2 → 5, $200 less on 1 → 2, and $200 more on
1 → 3 — for a total of $200 less overall. It follows that, the higher you make t,
the more you save. The best you can do is to increase t to 3, which is the highest
it can go without forcing some flow to a negative value. At that point you have
reached a different basic solution, whose total cost is 3 × $200 = $600 less.
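The same saving can be computed from the cycle alone, without recomputing the whole solution. In this sketch `CYCLE` holds the change in flow on each arc per unit of t, and the variable names are illustrative.

```python
# Dollars per truckload on the four arcs of the cycle created by arc 3 -> 5.
COST = {(3, 5): 700, (2, 5): 900, (1, 2): 200, (1, 3): 200}
# Change in flow per unit increase of t on each of those arcs.
CYCLE = {(3, 5): +1, (2, 5): -1, (1, 2): -1, (1, 3): +1}

cost_per_unit = sum(sign * COST[arc] for arc, sign in CYCLE.items())
t_max = 3  # the flow on 2 -> 5 reaches zero at this point

print(cost_per_unit)           # -200
print(cost_per_unit * t_max)   # -600
```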
These observations suggest a way of finding the lowest possible shipping
cost to meet the demands. You start with any basic solution that has nonnegative flows. Then you pick any one nonbasic variable such that the cost per
unit of putting flow on the resulting cycle is negative. Finally you increase the
flow around this cycle until some basic variable falls to zero, at which point you
have found a lower-cost basic solution. You keep repeating this procedure until, eventually, you arrive at some basic solution where none of the associated
cycles afford any reduction in cost. At that point, you can stop and declare that
you have found the lowest-cost flow that satisfies the balance equations.
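The procedure just described can be sketched in Python as follows. This is a bare illustration rather than a full network simplex implementation: it has no safeguard against cycling on degenerate steps, and the costs on arcs 2 → 3, 4 → 5 and 5 → 6 are assumed values, since the text does not say which of those arcs carries which cost. The function names are hypothetical.

```python
ARCS = [(1, 2), (1, 3), (2, 3), (2, 4), (2, 5), (3, 5), (4, 5), (4, 6), (5, 6)]
# Dollars per truckload. The first six values are quoted in the text.
COST = {(1, 2): 200, (1, 3): 200, (2, 4): 600, (2, 5): 900, (3, 5): 700, (4, 6): 300,
        (2, 3): 100, (4, 5): 400, (5, 6): 200}  # last three: ASSUMED for illustration


def cycle_signs(tree_arcs, new_arc):
    """Map each arc on the cycle made by new_arc to +1 (forward) or -1 (backward)."""
    start, goal = new_arc[1], new_arc[0]
    neighbors = {}
    for (i, j) in tree_arcs:
        neighbors.setdefault(i, []).append((j, (i, j), +1))
        neighbors.setdefault(j, []).append((i, (i, j), -1))
    stack, seen = [(start, {new_arc: +1})], {start}
    while stack:
        node, signs = stack.pop()
        if node == goal:
            return signs
        for nxt, arc, sign in neighbors.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append((nxt, {**signs, arc: sign}))
    raise ValueError("the tree does not connect the ends of the new arc")


def improve_until_stuck(tree, flow):
    """Repeat cost-reducing cycle adjustments until no nonbasic arc offers one."""
    tree, flow = list(tree), dict(flow)
    while True:
        for arc in ARCS:
            if arc in tree:
                continue
            signs = cycle_signs(tree, arc)
            if sum(s * COST[a] for a, s in signs.items()) < 0:
                # Increase flow around the cycle until a basic arc drops to zero.
                t, leaving = min((flow[a], a) for a, s in signs.items() if s < 0)
                for a, s in signs.items():
                    flow[a] += s * t
                tree[tree.index(leaving)] = arc
                break
        else:
            return tree, flow  # no cycle reduces the cost


start_tree = [(1, 2), (1, 3), (2, 4), (2, 5), (4, 6)]
start_flow = {arc: 0 for arc in ARCS}
start_flow.update({(1, 2): 10, (2, 4): 12, (2, 5): 3, (4, 6): 12})

final_tree, final_flow = improve_until_stuck(start_tree, start_flow)
print(final_tree)
print(sum(COST[a] * x for a, x in final_flow.items()))
```

Under the assumed costs, the sketch makes one adjustment, bringing in arc 3 → 5 in place of 2 → 5, and then stops. A different assignment of the three assumed costs could lead to a different result.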
This approach, with a few refinements, is known as the network simplex
method. It really does find a minimum-cost flow for any network. The details of
why it works, and how it can be made to run efficiently on large networks, are
however beyond the scope of a linear algebra course.
You may have noticed that we have not said anything about how a nonnegative basic solution can be found in the first place. When we applied Gaussian
elimination, we happened to arrive at a solution that had all variables ≥ 0, but
if the variables had been ordered differently we could well have determined a
different solution in which some variables had values < 0. Gaussian elimination has no easy way of forcing the solution to be nonnegative, which is why
nonnegativity is ignored in the development of elimination methods. In fact
finding a nonnegative solution to an equation system is as hard as finding a
minimum-cost solution, and requires the same advanced methods.
Existence of solutions. To conclude, we consider the following question:
Which other combinations of supplies and demands would allow the network
equations to have a solution? Suppose that we represent our original equations,




1
1
0
0
0
0
0
0
0
10




0
1
1
1
0
0
0
0 
5 
 −1





 0 −1 −1

0
0
1
0
0
0 
0 

x = 
,
 0

0
0 −1
0
0
1
1
0 
0 








0
0
0 −1 −1 −1
0
1 
 0
 −3 
0
0
0
0
0
0
0 −1 −1
−12
x12 x13 x23 x24 x25 x35 x45 x46 x56
as Ax = b, so that A stands for the matrix on the left, and b for the vector on
the right. You can see that any supply at network node i appears as a positive
value of bi , while any demand appears as a negative value. At a node i where
there is no supply or demand, bi = 0.
Our question thus amounts to determining the conditions on the vector b
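such that a solution exists. One way to do this is to apply elimination to
\[
\left[\begin{array}{rrrrrrrrr}
 1 &  1 &  0 &  0 &  0 &  0 &  0 &  0 &  0 \\
-1 &  0 &  1 &  1 &  1 &  0 &  0 &  0 &  0 \\
 0 & -1 & -1 &  0 &  0 &  1 &  0 &  0 &  0 \\
 0 &  0 &  0 & -1 &  0 &  0 &  1 &  1 &  0 \\
 0 &  0 &  0 &  0 & -1 & -1 & -1 &  0 &  1 \\
 0 &  0 &  0 &  0 &  0 &  0 &  0 & -1 & -1
\end{array}\right] x =
\left[\begin{array}{c} b_1 \\ b_2 \\ b_3 \\ b_4 \\ b_5 \\ b_6 \end{array}\right],
\]
with the result being
\[
\left[\begin{array}{rrrrrrrrr}
1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{array}\right] x =
\left[\begin{array}{l}
b_1 \\
b_2 + b_1 \\
b_3 + b_2 + b_1 \\
b_4 + b_3 + b_2 + b_1 \\
b_5 + b_4 + b_3 + b_2 + b_1 \\
b_6 + b_5 + b_4 + b_3 + b_2 + b_1
\end{array}\right].
\]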
Clearly a solution will exist if and only if the constant term for the last equation
becomes zero; that is, if and only if
\[
\sum_{i=1}^{6} b_i = 0.
\]
Since the supplies are positive bi values, and the demands are negative bi values,
this just says that the total supply minus total demand must equal zero. In other
words, when total supply equals total demand then there is always a solution
to the network equations, while when supply and demand are unequal there is
never a solution. This confirms exactly what you would expect.
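The right-hand side of the eliminated system is just a running total of the entries of b, which makes the condition easy to test. A minimal sketch, specific to this six-node example, with illustrative function names:

```python
from itertools import accumulate


def echelon_rhs(b):
    """Right-hand side after elimination: b1, b1 + b2, b1 + b2 + b3, and so on."""
    return list(accumulate(b))


def has_solution(b):
    """For this network, solvable exactly when the last running total is zero."""
    return echelon_rhs(b)[-1] == 0


print(echelon_rhs([10, 5, 0, 0, -3, -12]))    # [10, 15, 15, 15, 12, 0]
print(has_solution([10, 5, 0, 0, -3, -12]))   # True
print(has_solution([10, 5, 0, 0, -3, -10]))   # False: supply exceeds demand by 2
```

The first line of output matches the constants found earlier when elimination was applied to the original supplies and demands.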
Another way to approach this same result is to observe that every column of
A has two nonzeroes, one +1 and one −1. Thus the sum over each column of A
is zero, or equivalently eA = 0 where e = [1 1 1 1 1 1] is a row vector of all ones.
We can use this fact to reason that
\[
Ax = b \;\Rightarrow\; eAx = eb \;\Rightarrow\; 0 = \sum_{i=1}^{6} b_i .
\]
That is, if a solution exists, then supplies must equal demands in b, by the same
reasoning as before. This approach can also be used to establish the converse,
that if supplies minus demands in b are zero then a solution must exist, but
somewhat more advanced reasoning is required in that direction.
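The identity eA = 0 is easy to confirm for the example. The sketch below rebuilds A from the arc list, on the assumption that each arc contributes +1 at its tail and −1 at its head, and sums each column; the names are illustrative.

```python
ARCS = [(1, 2), (1, 3), (2, 3), (2, 4), (2, 5), (3, 5), (4, 5), (4, 6), (5, 6)]
b = [10, 5, 0, 0, -3, -12]

# Build A: one row per node, one column per arc.
A = [[0] * len(ARCS) for _ in range(6)]
for col, (i, j) in enumerate(ARCS):
    A[i - 1][col] = 1    # arc leaves node i
    A[j - 1][col] = -1   # arc enters node j

# Multiplying by the all-ones row vector e sums each column of A.
eA = [sum(row[col] for row in A) for col in range(len(ARCS))]
eb = sum(b)

print(eA)   # [0, 0, 0, 0, 0, 0, 0, 0, 0]
print(eb)   # 0
```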