Wormhole Message - Dr. Mark G. Karpovsky

UNICAST WORMHOLE MESSAGE ROUTING IN IRREGULAR
COMPUTER NETWORKS
SHARAD JAISWAL1, LEV ZAKREVSKI2, MEHMET MUSTAFA1, MARK KARPOVSKY1
1
2
({sjaiswal, mmustafa, markkar}@bu.edu) ECE Dept., Boston University,8 St. Mary’s Street, Boston, MA 02215
([email protected]) ECE Dept., New Jersey Institute of Technology, University Heights, Newark, NJ 07102
ABSTRACT
In this paper we consider the problem of deadlockfree unicast wormhole routing in computer networks with
irregular topologies, such as Networks of Workstations
(NOWs). The proposed algorithm consists of two stages.
At the first stage, we minimize the set of turns in the
network graph which should be prohibited for deadlock
prevention by breaking all cycles in the channel
dependency graph. Proposed approach guarantees that the
constructed set of prohibited turns is irreducible and the
fraction of the prohibited turns does not exceed 1/3 for
any network topology. At the second stage, routing tables
are constructed based on the set of prohibited turns that
minimize average message path lengths and delivery
times in a decentralized manner. Given N, the number of
nodes, and d the maximal number of ports in routers, the
complexity of the proposed algorithm does not exceed
O(N2d). This algorithm is invoked when a change in the
topology of the network is detected, which we assume is
infrequent. We also present results of simulation
experiments of average latency for uniform, transpose and
local traffic patterns, saturation points and scalability.
These results illustrate the performance and advantages of
the proposed approach as compared with the existing
up/down techniques [1,2].
KEYWORDS: routing algorithms, wormhole routing,
turn model, deadlock prevention
1. INTRODUCTION AND
FORMULATION OF THE PROBLEM
Recently, Networks of Workstations (NOWs) have
emerged as an inexpensive alternative to massively
parallel multiprocessors [1,3,4]. NOWs are comprised of
a collection of routers or switches, communication links
and workstations interconnected in an irregular topology.
In order to minimize network latency and achieve high
bandwidth communications, recent experimental and
commercial switches for NOWs implement wormhole
routing [2,4,5]. However, wormhole routing is susceptible
to deadlocks [6-13] because packets are allowed to hold
many resources while requesting others. Design of
efficient deadlock-free routing algorithms in irregular
topologies introduces new challenges, which we address
in this paper.
Existing routing strategies can be divided into
adaptive [7,9,12-14], that consider existing queue sizes
and non-adaptive techniques [10,13,15,16]. In this paper,
we will consider non-adaptive methods, which can
however be easily extended for adaptive routing. Several
routing methods currently exist for regular topologies,
such as 2-D meshes, tori and hypercubes
[7,8,10,12,15,17,18]. The more difficult problem of
routing in the presence of faults [2,3,9,15,17-20] has also
been studied. If the number of faults is large then the
fault-tolerant routing problem for networks with regular
topologies becomes almost equivalent to routing in an
arbitrary topology [19-21].
For an irregular topology most of the existing routing
strategies are based on spanning trees and (up/down)
routing [1,2]. According to this strategy, once a spanning
tree is constructed, any two nodes can communicate with
each other along the tree without any deadlocks. The
drawbacks of this approach are the long message paths
and high load on the edges near the root node [1]. This
method can be improved by allowing shortcuts using
edges not belonging to the spanning tree [1], but this
could result in deadlocks due to the formation of cycles in
the channel dependency graph.
To measure the efficiency of a routing strategy, the
average message delivery time can be used [10,13,14,16]
as a parameter for comparison. The average message
delivery time is a function of a message generation rate.
At a message generation rate known as saturation point,
delivery time increases exponentially. Any good routing
strategy aims to increase the maximal sustainable
throughput and decrease the delivery time for generation
rates near the saturation point.
We assume that the given network consists of N
nodes connected by E edges constituting a connected
network graph G. In general, the network graph can be
considered to be a multigraph with several edges between
the same two nodes [1,8,10]. In particular, if k virtual
channels are used where each physical channel is split
into k logical channels using time multiplexing techniques
[10,11], every pair of nodes is connected either by 0 or k
virtual edges.
Every routing algorithm prohibits some of the turns
in network graph G. A turn (a,b,c) is a three-tuple of
nodes such that (a,b) and (b,c) are edges in the network
graph G. In order to correctly model existing switchbased networks such as Myrinet [4], we assume that G is
symmetric, i.e. if (a,b) is an edge in G , then (b,a) is also
an edge. These channels can be used simultaneously
without contention thus if (a,b,c) is prohibited, then
(c,b,a) is also prohibited, and we will consider these two
turns as one. The total number of turns is T = idi(di-1)/2,
where di is the degree of node i. For the up/down routing
[1,2,20] first a spanning tree for G is constructed and
nodes are labeled preserving the partial order defined by
the tree where the root has label 1 and a turn (a,b,c) is
prohibited if b>a and b>c.
prohibited turns is discussed minimizing average message
path lengths and thus average delivery time.
The complexities of the algorithms described in
Sections 2 and 3 do not exceed O(N2d), where N is the
number of nodes, and d is the maximal number of ports in
routers. These algorithms are invoked only when there is
a change in the topology of the original network. Results
of simulation experiments on average latency for uniform,
local and transpose traffic patterns, saturation points and
scalability for the TPBR-algorithm are presented in
Section 4. These results clearly show the advantages and
benefits of the proposed approach as compared with the
up/down approach.
2. TPBR-ALGORITHM FOR
CONSTRUCTING A SET OF
PROHIBITED TURNS FOR DEADLOCK
PREVENTION
It was shown in [12] for meshes and tori and in [3,20]
for irregular topologies that reduction in the number of
prohibited turns results in a decrease of average path
lengths of messages and in a reduction of average
delivery time accompanied by an increase in throughput.
In Section 4 we present simulation results for random
topologies with N=256 nodes. Fig.3 is showing a strong
correlation between fraction of prohibited turns and
average delivery time. Reduction of the fraction of
prohibited turns from 30% to 20% results in about 100%
increase in maximal sustainable throughput.
In this section we describe the TPBR-algorithm for
creating set of prohibited turns Z(G) for a given graph G
with N(G) nodes. The TPBR-algorithm is a recursive
algorithm, in which at each step one node is selected and
all turns through the selected node are either permitted or
prohibited. For example, if after deleting a node a with
degree da and all edges incident on it, the remaining graph
G-a is still connected, then we prohibit all da(da-1)/2 turns
(c,a,b) and permit all turns (a,b,c). Algorithm is invoked
recursively as long as there are some edges in the
remaining graph.
For the up/down approach [1,2,20] the fraction of
turns which are prohibited depends on the selection of the
spanning tree for a given network topology and could be
close to 1. The problem of construction of an optimal
spanning tree is NP-hard. In [3] a method for minimizing
the fraction z, of turns to be prohibited was presented.
With this method the fraction of prohibited turns does not
exceed 1/3 for any topology but the approach does not
guarantee an irreducible set of prohibited turns. The set of
prohibited turns is irreducible if deletion of any turn from
the set results in cycles in the channel dependency graph
and deadlocks in the system. In the next section, we
describe the TPBR (Turn Prohibition Based Routing)
algorithm for deadlock prevention, which results in
irreducible set of prohibited turns maintaining the upper
bound for the fraction of prohibited turns at 1/3 for any
network topology. In Section 2 we also present
experimental results for random topologies with 256
nodes showing that the proposed TPBR-algorithm results
in a considerable reduction in a number of prohibited
turns as compared to the up/down approach. We note that
a set of prohibited turns for deadlock prevention does not
completely specify the routing strategy, i.e. several
routing strategies can satisfy the same set of restrictions
on turns in the network graph. In Section 3, decentralized
construction of routing tables based on selected set of
Following properties can be proven for the set Z(G)
of prohibited turns generated by the TPBR-algorithm:
1. Any cycle in G contains at least one turn included
in Z(G)
2. |Z(G)|  1/3 |T(G)|, where T(G) is the set of all
turns in graph G;
3. The
TPBR-algorithm
maintains
graph's
connectivity. For any two connected nodes a and b
in the original graph, there exists at least one path
between a and b, without any turns from Z(G)
4. Set Z(G) is irreducible. Deletion of any turn from
Z(G) creates a cycle in G containing no turns from
Z(G).
We note that the TPBR -algorithm has a complexity
of the order of O(N2d), where N is the number of nodes in
G, and d is the maximal degree of the node. The complete
information about Z(G) can be represented in 3 arrays of
length N. First array shows the order in which nodes are
selected, the second one shows the order in which
components of connectivity are indexed, and the third one
stores the information about special edges.
Description of TPBR(G):
0) Initialize: Z(G) := , all nodes and edges are marked as
non-special, HALF_LOOP: = 0.
1) if N(G) < 2, no turns are prohibited, then return
2) a non-special node a is selected in G, such that (d2a2)/i (di-1) is minimal where the summation is over nodes i
sharing an edge with node a.
3) Components of connectivity G1,…,Gk are constructed in
graph G - a such that any edge between nodes in different
components should include node a. Following rules apply:
- if a special node exist in G then it should be in G1 ,
the first connectivity component.
- else, component Gi, connected to a in G with fewer
number of edges, should have a larger index i.
4) for i=2,3,…,k, one edge connecting component Gi to a is
marked as a special edge.
5) all turns (b,a,c) are included in Z(G), except turns, for
which (a,b) is a special edge, bGi, cGj and i>j an all
turns (a,b,c) are permitted.
6) TPBR (G1)
7) for i=2,3,…,k {
if (HALF_LOOP = 0) AND (in G exists a sequence of
nodes a, x1,…, xj, a, such that x1,…xj  Gi-1),
then HALF_LOOP := 1.
if (HALF_LOOP = 1)
then node in Gi, connected to a in G with special
edge, is declared and marked as special.
TPBR (Gi)
}
8) return
To illustrate the efficiency of the TPBR-algorithm
compared with the up/down approach, we show in Fig.1
percentages of prohibited turns generated using these two
algorithms for random networks with 256 nodes, as
function of average node degree. One can see from Fig.1
that moving from the up/down curve to the TPBRalgorithm curve results in a 15% to 50% reduction of the
fraction of prohibited turns.
In Fig.2 , TPBR-algorithm is illustrated for a graph
with N = 14 nodes. Nodes are labeled in the order that
they are selected by the TPBR-algorithm. Prohibited
turns, e.g. (3,2,12) and special nodes, e.g. 13 and 14, and
special edges e.g. (13,1) are shown in dark, heavy lines.
For this example, |Z(G)|=12 out of |T(G)|=50 turns are
prohibited. We note, that if node 3 or 12 were selected by
the TPBR-algorithm instead of node 2, the total number of
prohibited turns would have gone down to 11. This
example suggests that although the TPBR-algorithm
constructs an irreducible set of prohibited turns this set is
not necessarily minimal.
Fig.1. Fraction of prohibited turns, z(G)=|Z(G)| / |T(G)|
as a function of average node degrees for the TPBRalgorithm and the up/down approaches for randomly
configured irregular topologies with 256 nodes.
10
11
4
7
8
9
14
5
6
13
3
1
2
12
Fig. 2. Example illustrating the construction of
prohibited turns by the TPBR-algorithm. Special
nodes and edges are shown in black, nodes are
labeled in order according to their selection by the
TPBR-algorithm
3. CONSTRUCTION
TABLES
OF
ROUTING
In this section we describe a decentralized algorithm
for the construction of local routing tables for a given set
of prohibited turns Z(G), minimizing average path length
and average latency. Our goal is: for any source s and
destination d select the shortest routing path a1,…,am
(a1=s, am=d) among all paths, satisfying the routing
restrictions imposed by the set Z(G). Here we reiterate
that these computations are infrequent as they are
performed only when network topology changes are
detected which we assume do not occur often.
For any intermediate node i, the algorithm estimates
the length of the shortest permitted (under the restriction
on turns generated by the TPBR-algorithm) path between
adjacent nodes of i and the destination, and routes the
message to neighbor j with the shortest estimated path
length provided that the corresponding turns at nodes i
and j are permitted. This algorithm can be implemented in
both hardware and software depending on speed/memory
parameters of routers. We assume that every node has up
to d neighbors. The term "node" here is the router
component of the processor-router pair. Hence, it would
have up to d+1 channel buffers, including the buffers for
the consumption channel to the processor.
compare the efficiency of the up/down approach with the
TPBR-algorithm.
4. SIMULATION EXPERIMENTS
Initially, every router knows its set of permitted and
prohibited turns. This can be represented by (d+1)(d+1)
matrix P, such that P(i,j)=1 if the turn from input buffer i
to the output buffer j is permitted and i,j{0,1,…,d}
where i, j = 0 correspond to the consumption channel. It
follows from the TPBR-algorithm that P is symmetric and
that P(0,i) = 1.
For every node, two matrices R(i,k) and D(i,k) are
constructed, where i{0,1,…,d}, k{1,2,…,N}, N is the
number of nodes. If a message coming in from input port i
to be forwarded to destination node k, should be routed to
output port R(i,k). Elements of R take values from 0 to d.
The distance matrix D(i,k) is the length, in hops of the
path from input buffer i to destination node k. Distance
matrix D is used at the first routing stage only, while the
routing matrix R is used for on-line routing during the
second stage. The total memory required to store these
matrices is of the order of 2(d+1)N. For d=4, N=1,000 it is
around 10K words, and a hardware implementation for
this algorithm is feasible.
Now, we describe a decentralized procedure for
constructing the matrices R and D at every node as
follows:
 Ra and Da for node a are initialized as follows:
Ra(i,j):=X, Ra(i,a):=0, Da(i,j):=X, Da(i,a):=0 where X is
interpreted as undetermined.
 At each step, elements of Ra and Da are recalculated,
using matrices R1,…,Rd, D1,… Dd of neighbors 1,…,d of
node a. After t steps all paths of length up to t hops will be
determined.

The rule for step t (initially, t=1) is the following:
if (Ra (i,j) = X) then for all m, neighbor of a that
P(i,m)=1
if ( Dm (y,i) = t-1) then
{ Ra (i,j):= m ; Da (i,j):=t }
where y is the input port for node m incident on the
neighboring node a.
For the hardware realization, at each step t every
node should transmit to its neighbors, messages with
numbers i, such that Dm (y,i) = t-1. During the entire prerouting procedure, up to N such numbers can be sent by
every link. The algorithm is terminated after L steps,
where L is the maximal possible length of the minimal
permitted path between two nodes, or if at some step t no
changes have been made in any of the matrices. We note,
that the proposed algorithm can be used to construct a set
of shortest paths for any given set of prohibited turns.
This property has been used in our simulations to
We have implemented an event-driven simulator to
evaluate the performance of the TPBR-algorithm for
wormhole-routed networks. In all experiments the TPBRalgorithm and the up/down routing algorithms are
compared. Our experiments were performed on randomly
generated connected graphs ranging in size from 32 to
256 nodes. The experiments were also performed for
different node degrees. A typical simulation would be
averaged over a 100 random graphs in each of which
10,000 messages were exchanged.
The following assumptions for our experiments are
similar to those used by [16,18]. All network channels we
studied for the TPBR-algorithm are local, transpose and
are bi-directional and symmetric. The message length was
constant (200 flits) and the input/output buffers in the
wormhole routers were 1-flit deep. The message queues at
each node are of infinite length. Output channel/buffer
contention is resolved using the FIFO queuing policy,
with each incoming flit being time stamped on its arrival
at the router input buffer. In our simulations, we used
mostly uniform traffic pattern where each node can send a
message to any other node with equal probability.
Communications arising from nodes are independent and
identically distributed by the Poisson process with the
generation rate equal to p (messages/cycle/node, the
probability of message generation for any cycle, at any
node). Also a separate experiment has been conducted
that investigated the impact of different traffic patterns on
the latency and on saturation point. Generally
performances of routing algorithms are measured in terms
of the average message latencies and saturation point
(throughput) which is defined as the highest sustainable
message generation rates.
First set of experiments provide additional evidence
that supports choosing the fraction of the prohibited turns
as the criterion for high-efficiency routing. This can be
shown by comparing the message generation rates at
which the network reaches saturation for different
numbers z(G) of prohibited turns. In Fig.3, it can be seen
that the networks with larger fraction z(G)=|Z(G)| / |T(G)|
reached saturation at lower generation rates. For random
graphs with N=256 nodes with degree d=4, the reduction
in z(G) from 30% to 20% results in 100% increase for the
saturation point.
In the next set of simulation experiments
performances of the TPBR-algorithm and the UP/DOWN
are studied. In Fig.4 average message latency in
simulation cycles versus message generation probability
for both algorithms are shown. At the saturation point the
average latency graph has an almost infinite slope. It is
observed that on the average TPBR-algorithm provides
performance improvements about 15% with respect to
maximum message generation rates over the up/down
approach.
mean path length. For the uniform traffic pattern there is
no such restriction With the local traffic pattern the
saturation point is the largest and with the transpose
traffic it is the smallest. This is a further experimental
verification that messages that in general travel farther,
i.e. have longer path lengths as in the case of transpose
traffic, have longer average latencies.
Fig. 3 Average saturation point versus fraction of
prohibited turns for random graphs with N=256 nodes of
degree = 4
Fig. 5 Scalability of the TPBR-algorithm compared with
the Up/Down routing algorithm.
Fig. 4 Average message latency versus generation rate for
the up/down and TPBR-algorithms.
Third set of experiments deals with scalability issues
for the TPBR-algorithm. We measure maximal
sustainable throughput for networks of different sizes. It
is observed that the algorithm scales well, offering better
performance for larger graphs. Persistent superiority of
the TPBR-algorithm over the up/down approach is clearly
visible in Fig.5.
Lastly in Fig.6 we see the general behavior for the
three traffic patterns. Three traffic patterns that we studied
for the TPBR-algorithm are local, transpose and uniform.
In transpose traffic pattern, source and destination pairs
are chosen so that path length is greater than the mean
path length. In the local traffic path length is less than the
Fig. 6 Performance of the TPBR-algorithm under different
traffic patterns.
5. CONCLUSIONS
In this paper we proposed a new algorithm for
wormhole routing in irregular topologies. We have
shown, that the fraction of turns, which should be
prohibited to break all cycles in a channel dependency
graph can be used as the efficient criterion for
performance evaluation of routing strategies. evaluation
of routing strategies. We developed an algorithm which
generates irreducible set of prohibited turns containing
not more than 1/3 of total number of turns for any
topology. As far as we know this is the first published
non-trivial upper bound on the fraction of turns to be
prohibited to prevent deadlocks in networks. We also
developed a decentralized algorithm for construction of
routing tables, based on selected sets of prohibited turns
and minimizing average delivery time. The complexity of
the proposed algorithms does not exceed O(N2d), where N
is the number of nodes, and d is the maximal number of
ports in routers. The results of computer simulations
illustrate advantages of the proposed approach as
compared with the existing up/down approach. The
methods can be easily extended to the cases of several
virtual networks, adaptive routing and multicasting. (For
the latter case additional channel dependencies, due to
consumption channels, have to be taken into account.)
[8] W. Dally and C. Seitz, L. "Deadlock-Free Message
Routing
in
Multiprocessor
Interconnection
Networks," IEEE Trans. on Comput. vol. 36, pp. 547553, 1987.
6. ACKNOWLEDGMENTS
[12] C. Glass and L. Ni "The Turn Model for Adaptive
Routing," Journal of ACM, vol. 5, pp. 874-902, 1994.
The authors would like to thank Prof. L. B. Levitin
from Boston University for many useful suggestions. This
work was supported by the NSF under Grant MIP
9630096.
[13] L. Ni, M. and P. McKinley, K. "A Survey of
Wormhole Routing Techniques in Directed
Networks," Computer, vol. 26, pp. 62-76, 1993.
REFERENCES
[1] R.Liebeskind-Hadas, D. Mazzoni and R. Rajagopalan
"Tree-Base Multicast Routing in the Mesh with No
Virtual Channels," Proc. of the First Merged Int.
Parallel Processing Symp. and Symp. on Parallel and
Distributed Processing, pp.244-249, 1998.
[2] M. Schroeder and et al. "Autonet: A High-Speed self
configuring Local Area Network Using Point-topoint Links," (Technical Report 59, DEC SRC, April
1990).
[3] L. Zakrevski, S. Jaiswal, L. Levitin and M.
Karpovsky "A New Method for Deadlock
Elimination in Computer Networks With Irregular
Topologies," Proc. of the IASTED Conf. PDCS-99,
1999
[4] N. Boden and e. al. "Myrinet: A Gigabit per second
Local Area Network," IEEE Micro, pp. 29-35, 1995
[5] R. W. Horst. "ServerNet™ Deadlock Avoidance and
Fractahedral Topologies," Proc. of IEEE Int. Parallel
Processing Symp., pp.274-280, 1996
[6] L. Zakrevski, A. "Fault-Tolerant Wormhole message
Routing in Computer Communication Networks,"
Ph.D. Thesis, Boston University, College of
Engineering, pp.55-102, 2000.
[7] W. Dally and H. Aoki "Deadlock-Free Adaptive
Routing in Multiprocessor Networks Using Virtual
Channels," IEEE Trans. on Parallel and Distributed
Systems vol. 8, pp. 466-475, 1997.
[9] J. Duato "A New Theory of Deadlock-Free Adaptive
Routing in Wormhole Networks," IEEE Trans. on
Parallel and Distributed Systems vol. 4, pp.13201331, 1993.
[10] J. Duato, S. Yalamanchili and L. Ni, M.
Interconnection
Networks:
An
Engineering
Approach, Los Alamitos, IEEE CS Press, 1997.
[11] E. Fleury and P. Fraigniaud, "A General Theory for
Deadlock
Avoidance
in
Wormhole-Routed
Networks," IEEE Trans. on Parallel and Distributed
Systems, vol. 9, pp. 626-638, 1998.
[14] R. Boppana and S. Chalasani "A Comparison of
Adaptive Wormhole routing Algorithms," Computer
Architecture News, vol. 21, no. 2, pp. 351-360, 1993.
[15] R. Boppana, V. and S. Chalasani "Fault-Tolerant
Wormhole Routing Algorithms in Mesh Networks,"
IEEE Trans. on Comput. vol. 44, pp. 848-864, 1995.
[16] B. Ciciani and M. Colajanni, Paolucci, C.
"Performance Evaluation of Deterministic Routing in
k-ary n-cubes," Parallel Computing, no. 24, pp.
2053-2075, 1998.
[17] Y. Boura, M. and C. Das, R. "Fault-Tolerant Routing
in Mesh Networks," Proc. of Int. Conf. on Parallel
Processing vol. O, pp. 106-109, 1995.
[18] C. Glass and Ni,L. "Fault-Tolerant Wormhole
Routing in Meshes," Proc. of Int. Symp. on FaultTolerant Computing, 1993.
[19] L. Zakrevski and M. Karpovsky, G. "Fault-Tolerant
Message Routing for Multiprocessors," Parallel and
Distributed Processing, pp.714-731, 1998
[20] L. Zakrevski and M. Karpovsky, G. "Fault-Tolerant
Message Routing in Computer Networks," Proc. of
Int. Conf. on PDPA-99, pp.2279-2287, 1999.
[21] L. Zakrevski and M. Karpovsky "Unicast Message
Routing in Communication Networks with Irregular
Topologies," Proc. of CAD-99, 1999.