IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. SAC-1, NO. 6, DECEMBER 1983
Performance Analysis of a Packet Switch Based on Single-Buffered Banyan Network
YIH-CHYUN JENQ, MEMBER, IEEE
Abstract - Banyan networks are being proposed for interconnecting memory and processor modules in multiprocessor systems as well as for packet switching in communication networks. This paper describes an analysis of the performance of a packet switch based on a single-buffered Banyan network. A model of a single-buffered Banyan network provides results on the throughput, delay, and internal blocking. Results of this model are combined with models of the buffer controller (finite and infinite buffers). It is shown that for balanced loads, the switching delay is low for loads below maximum throughput (about 45 percent per input link) and the blocking at the input buffer controller is low for reasonable buffer sizes.
I. INTRODUCTION
RECENTLY, there has been considerable interest in
Banyan networks among computer and communication technology researchers. The Banyan network has been
considered as a candidate for the interconnecting network
linking a large number of memory and processor modules
together in a multiprocessor system and as a switching
fabric for packet-switched interconnection of computers
(see, e.g., [1]-[3]).
In this paper, we present analytical results for the performance of a packet switch based on a single-buffered Banyan
network. Results show reasonably low blocking probabilities and low delays for balanced internal loads. In Section
II we describe the switch and its operation. In Section III
we present a simple model of the switch and obtain the
throughput and the internal delay of the switch. In Section
IV we refine the model by removing one independence
assumption used in the simple model of Section III and
show that the results from the two models are very close. In
Section V we add a queue with unlimited waiting room as
a front-end buffer to the switch and obtain the switching
delay. The case of a front-end buffer with finite waiting room
is analyzed in Section VI. Conclusions are given in Section
VII.
II. THE PACKET SWITCH
The packet switch consists of k input buffer controllers, k output buffer controllers, and a k × k square switch. A Banyan network is considered for the switching fabric because of its economy, potential VLSI implementation, and ease of routing. A 4-stage (16 × 16) Banyan network is shown in Fig. 1. Each switching element (box) is a 2 × 2 crossbar switch with one buffer on each of its two input links. The packet routing is done by the hardware as follows. Each packet has an n-bit header in an n-stage switch. The switching element (a 2 × 2 crossbar) at stage 1 routes the packet up or down according to the first bit of the header ("zero" or "one" indicates up or down routing, respectively) and then removes the first bit from the header. The succeeding switching elements perform the same routing function, each removing one bit from the header and routing the packet to the next stage, until the packet reaches its destination output link. It is easy to see that the header is simply the binary address of the output link at the last stage and is independent of the input link at the first stage.
Manuscript received February 4, 1983; revised June 22, 1983.
The author is with Bell Laboratories, Holmdel, NJ 07733.
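
The header-driven routing described above can be sketched in a few lines. This is only an illustration, assuming the header is the n-bit binary address of the output link with the most significant bit read first; the function name `route_decisions` is hypothetical and does not come from the paper.

```python
def route_decisions(dest: int, n_stages: int) -> list[str]:
    """Return the up/down decision taken at each stage for one packet."""
    if not 0 <= dest < 2 ** n_stages:
        raise ValueError("destination must fit in n_stages bits")
    # Header = binary address of the output link (assumed MSB first).
    header = format(dest, f"0{n_stages}b")
    decisions = []
    while header:
        # Each stage reads the first header bit and strips it.
        bit, header = header[0], header[1:]
        decisions.append("up" if bit == "0" else "down")
    return decisions


# Output link 5 is 0101 in a 4-stage (16 x 16) switch.
print(route_decisions(5, 4))
```

Because the decisions depend only on the destination address, the same header works from any input link, which is the independence property noted in the text.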
If both buffers at a switching element have a packet and
both packets are going to the same output link of the
switching element, then a conflict occurs. In this case, we
assume that one of the packets will be chosen, randomly, to
go to the next stage and the other one will stay in the
buffer. Of course, in order for a packet to be able to move
forward, either the buffer at the next stage is empty or
there is a packet in the buffer and that packet is able to
move forward. Thus, the ability for a packet to move
forward depends on the state of the entire portion of the
network succeeding the current stage.
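
The contention rule can be made concrete by enumerating the four equally likely destination pairs for two packets sharing one switching element. The sketch below assumes each packet is equally likely to want either output (the balanced-load assumption used later in the analysis) and ignores blocking by the next stage; the helper name is illustrative.

```python
from fractions import Fraction
from itertools import product


def contention_probabilities():
    """Exact probabilities for a 2x2 element with both buffers occupied."""
    quarter = Fraction(1, 4)  # each (dest_a, dest_b) pair is equally likely
    packet_a_moves = Fraction(0)
    output0_used = Fraction(0)
    for dest_a, dest_b in product((0, 1), repeat=2):
        if dest_a == dest_b:
            # Conflict: one packet is picked at random, the other waits.
            packet_a_moves += quarter * Fraction(1, 2)
        else:
            packet_a_moves += quarter
        # Output 0 carries a packet whenever at least one packet wants it.
        if 0 in (dest_a, dest_b):
            output0_used += quarter
    return packet_a_moves, output0_used


print(contention_probabilities())
```

Both probabilities come out as 3/4, which is where the coefficient 0.75 in the model equations of Section III comes from.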
For simplicity, we can consider the switch operating in a
clocked format. In the first portion $\tau_1$ of a clock period
$\tau = \tau_1 + \tau_2$, control signals are passed across the network
from the last stage toward the first stage, so that every
packet knows whether it should move forward one stage or
stay in the same buffer. Then, in the second portion $\tau_2$ of a
clock period $\tau$, packets move in accordance with control
signals and the clock period ends. The whole process
repeats in every clock period, and packets continue to be
shifted in and out of the switch.
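
The two-phase clock can be illustrated on a single path through the switch. The sketch below is a simplification: it assumes one buffer per stage along the path, ignores contention inside switching elements, and assumes the output link always accepts a packet. It only shows how permission to move propagates from the last stage back to the first; `clock_tick` is an illustrative name.

```python
def clock_tick(occupied: list[bool]) -> list[bool]:
    """Advance one clock period along a single path of buffers.

    occupied[k] is True if the buffer at stage k+1 holds a packet.
    """
    n = len(occupied)
    # Phase 1: control signals travel from the last stage to the first.
    can_move = [False] * n
    for k in range(n - 1, -1, -1):
        if not occupied[k]:
            continue
        if k == n - 1:
            can_move[k] = True  # assumption: the output link always accepts
        else:
            can_move[k] = (not occupied[k + 1]) or can_move[k + 1]
    # Phase 2: every packet that was granted permission shifts one stage.
    new_state = [False] * n
    for k in range(n):
        if occupied[k] and not can_move[k]:
            new_state[k] = True  # packet stays in its buffer
        elif occupied[k] and k + 1 < n:
            new_state[k + 1] = True  # packet advances one stage
    return new_state


print(clock_tick([True, True, False, True]))
```

A packet in a full chain can still advance when the packet ahead of it advances in the same period, which is why the decision at one stage depends on everything downstream.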
Fig. 1. A 4-stage Banyan network.
III. MODEL 1
We consider a Banyan switch with n stages. Considering
the clocked operation described in the previous section, it
is clear that the whole switch can be modeled as a Markov
chain. However, the number of the states in the chain
grows exponentially with the number of stages n; the
number of states in an n-stage network is $k^{n 2^{n-1}}$, where k is
the number of states in a node. With such a rapidly
growing number, it is almost impossible to even simulate it
for large n (say n > 10). Therefore, some simplifying assumptions will be used.
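
To get a feel for this growth, the sketch below counts the decimal digits in the size of the state space. It assumes a network with $2^n$ input links built from $n$ stages of $2^{n-1}$ switching elements, so that the count is $k^{n 2^{n-1}}$, and it takes $k = 4$ states per element (two one-packet buffers, each empty or full). The value of $k$ and the function name are assumptions made for illustration.

```python
import math


def state_space_digits(n_stages: int, states_per_node: int = 4) -> int:
    """Decimal digits in states_per_node ** (n_stages * 2**(n_stages - 1))."""
    nodes = n_stages * 2 ** (n_stages - 1)  # switching elements in the network
    return math.floor(nodes * math.log10(states_per_node)) + 1


for n in (2, 4, 6, 10):
    print(n, state_space_digits(n))
```

Even under this coarse count, the 10-stage case has a state space whose size runs to thousands of digits, which motivates the simplifying assumptions that follow.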
We assume that packets arriving at each input link at
stage 1 are destined uniformly (or randomly) for all output
links at stage n. We also assume a uniform load for each
input link at stage 1. Thus, we assume that the loads are
balanced in the whole switching network. Under these
assumptions, the state of a switching element (2 × 2 crossbar) at stage k is statistically indistinguishable from that
of any other switching element of the same stage. Hence, the
state of a "stage" can be characterized by that of a
"switching element." If we further assume that the two
buffers in the same switching element are statistically independent, then the state of a "stage" can be reduced further
to that of a single buffer. This last independence assumption is motivated by the fact that input packets arriving at
each of the two input links of a switching element are from
disjoint sets of input links at stage 1 and, hence, are
independent. However, packets in the two buffers of a
switching element do interfere with each other, and thus it
is not clear that this independence assumption is reasonable. In this section we will assume independence to obtain
a very simple model. In the next section we will remove
this independence assumption to check if the removal of
the independence assumption makes any noticeable difference.
We first introduce the following notation.
$n$ = number of switching stages.
$p_0(k,t)$ = probability that a buffer of a switching element at stage $k$ is empty at the beginning of the $t$th clock period.
$p_1(k,t)$ = $1 - p_0(k,t)$.
$q(k,t)$ = probability that a packet is ready to come to a buffer of a switching element at stage $k$ during the $t$th clock period.
$r(k,t)$ = probability that a packet in a buffer of a switching element at stage $k$ is able to move forward during the $t$th clock period, given that there is a packet in that buffer.
With these definitions, we can state the equations governing the relationships among these variables:
$$q(k,t) = 0.75\,p_1(k-1,t)\,p_1(k-1,t) + 0.5\,p_0(k-1,t)\,p_1(k-1,t) + 0.5\,p_1(k-1,t)\,p_0(k-1,t), \quad k = 2,3,\ldots,n \tag{1}$$
$$r(k,t) = [\,p_0(k,t) + 0.75\,p_1(k,t)\,]\,[\,p_0(k+1,t) + p_1(k+1,t)\,r(k+1,t)\,], \quad k = 1,2,3,\ldots,n-1 \tag{2}$$
$$r(n,t) = p_0(n,t) + 0.75\,p_1(n,t) \tag{3}$$
$$p_0(k,t+1) = [\,1 - q(k,t)\,]\,[\,p_0(k,t) + p_1(k,t)\,r(k,t)\,] \tag{4}$$
$$p_1(k,t+1) = 1 - p_0(k,t+1).$$
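
The recursion (1)-(4) can be iterated numerically. The following is a minimal sketch, not the author's program: it assumes the switch starts empty, treats $q(1)$ as a fixed input, and stops when $p_0$ changes by less than a tolerance. The function names, the tolerance, and the iteration cap are all illustrative choices. The throughput and delay are evaluated with the steady-state expressions $S = p_1(k)\,r(k)$ and $d = \frac{1}{n}\sum_k 1/r(k)$ that follow in the text.

```python
def model1_steady_state(n, q1, tol=1e-12, max_iter=100_000):
    """Iterate (1)-(4) from an empty switch until p0 stops changing.

    Returns (p0, r) as lists indexed 1..n (other indices are padding).
    """
    p0 = [1.0] * (n + 2)  # start with every buffer empty
    r = [1.0] * (n + 2)
    for _ in range(max_iter):
        p1 = [1.0 - x for x in p0]
        q = [0.0] * (n + 2)
        q[1] = q1  # offered load at the first stage
        for k in range(2, n + 1):  # equation (1)
            q[k] = 0.75 * p1[k - 1] ** 2 + p0[k - 1] * p1[k - 1]
        r[n] = p0[n] + 0.75 * p1[n]  # equation (3)
        for k in range(n - 1, 0, -1):  # equation (2)
            r[k] = (p0[k] + 0.75 * p1[k]) * (p0[k + 1] + p1[k + 1] * r[k + 1])
        new_p0 = p0[:]
        for k in range(1, n + 1):  # equation (4)
            new_p0[k] = (1.0 - q[k]) * (p0[k] + p1[k] * r[k])
        change = max(abs(a - b) for a, b in zip(new_p0, p0))
        p0 = new_p0
        if change < tol:
            break
    return p0, r


def throughput_and_delay(n, q1):
    """Normalized throughput S and normalized delay d at steady state."""
    p0, r = model1_steady_state(n, q1)
    s = (1.0 - p0[1]) * r[1]  # S = p1(k) r(k), evaluated at k = 1
    d = sum(1.0 / r[k] for k in range(1, n + 1)) / n
    return s, d


for stages in (1, 2, 10):
    print(stages, throughput_and_delay(stages, 1.0))
```

For a single stage at full load the result can be checked by hand: the buffer is always full, so $r(1) = 0.75$ and $S = 0.75$.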
Equations (1)-(4) describe the dynamics of the state transition of the system. If this system has a steady state, then these quantities should converge to time-independent quantities $q(k)$, $r(k)$, $p_0(k)$, and $p_1(k)$. Then the normalized throughput $S$ (the number of output packets per output link per clock period) and the normalized mean delay $d$ are given by
$$S = p_1(k)\,r(k) \quad \text{for any } k \tag{5}$$
and
$$d = \frac{1}{n}\sum_{k=1}^{n}\frac{1}{r(k)}. \tag{6}$$
The quantity $q(1)$ is determined by the load applied to the switch. In this section it is considered as an independent input parameter. The determination of $q(1)$ for a given input load is deferred to Section V.
In Fig. 2 we plot the normalized throughput $S$ as a function of $q(1)$ for different numbers of stages $n$. This was obtained by iteratively solving (1)-(5) for the steady state. There are two interesting observations. First, the throughput grows almost linearly with respect to the offered load in the range where the offered load is less than 0.4. This agrees with the intuition that under low load, all input traffic will pass through the switch with little interference. Second, with the maximum load at 1, the normalized throughput seems to converge (monotonically decreasing) to a quantity of approximately 0.45 as the total number of stages $n$ in a switch increases to 10. The normalized delay $d$ of (6) is displayed in Fig. 3. It is interesting to see that the normalized delay also seems to converge to a quantity of approximately 1.55 as $n$ increases. A computer run was performed for $n = 20$ to check this convergence, and the result shows that the normalized delay stands at 1.55. Thus, for an $n$-stage switch the average delay is in the range between $n\tau$ and $1.55n\tau$, depending on the load and the number of stages $n$.
Fig. 2. Normalized throughput of Banyan networks. (Curves labeled by number of stages n.)
Fig. 3. Normalized internal delay of Banyan networks. (Curves labeled by number of stages n.)
IV. MODEL 2
In the previous section we assumed that the state of a buffer in a switching element is independent of that of the other buffer in the same switching element. In this section we present a refined model by removing this independence assumption. Numerical results show little difference from the previous model. Therefore, we conclude that the above independence assumption is a good approximation.
This model uses the following notation.
$n$ = number of switching stages.
$p_{ij}(k,t)$ = probability that a switching element at stage $k$ has $i$ packets in the upper buffer and $j$ packets in the lower buffer at the beginning of the $t$th clock period ($i = 0,1$; $j = 0,1$).
$q(k,t)$ = probability that a packet is ready to come to a buffer of a switching element at stage $k$ during the $t$th clock period. It is noted that "arrivals" coming to each of the two buffers in a switching element are independent of each other.
$r_{01}(k,t)$ = probability that the packet in the lower buffer of a switching element at stage $k$ is able to move forward during the $t$th clock period, given that the upper buffer is empty and the lower buffer has a packet.
$r_{10}(k,t)$ = probability that the packet in the upper buffer of a switching element at stage $k$ is able to move forward during the $t$th clock period, given that the lower buffer is empty and the upper buffer has a packet.
$r_{11}^{ij}(k,t)$ = probability that $i$ packets in the upper buffer and $j$ packets in the lower buffer of a switching element at stage $k$ are able to move forward during the $t$th clock period, given that both buffers have packets ($i = 0,1$; $j = 0,1$).
The equations governing the relationships among these variables are as follows:
$$q(k,t) = 0.5\,p_{01}(k-1,t) + 0.5\,p_{10}(k-1,t) + 0.75\,p_{11}(k-1,t), \quad k = 2,3,4,\ldots,n \tag{7}$$
$$\begin{aligned} r_{01}(k,t) = {} & p_{00}(k+1,t) + 0.5\,p_{01}(k+1,t)\,[1 + r_{01}(k+1,t)] + 0.5\,p_{10}(k+1,t)\,[1 + r_{10}(k+1,t)] \\ & + p_{11}(k+1,t)\,[\,r_{11}^{11}(k+1,t) + 0.5\,r_{11}^{01}(k+1,t) + 0.5\,r_{11}^{10}(k+1,t)\,], \quad k = 1,2,3,\ldots,n-1 \\ r_{01}(n,t) = {} & 1.0 \end{aligned} \tag{8}$$
$$r_{10}(k,t) = r_{01}(k,t), \quad k = 1,2,\ldots,n \tag{9}$$
$$r_{11}^{11}(k,t) = 0.25\left([r_{01}(k+1,t)]^2 + [r_{10}(k+1,t)]^2\right) = 0.5\,[r_{01}(k+1,t)]^2 \tag{10}$$
$$r_{11}^{00}(k,t) = 0.5\,[1 - r_{01}(k+1,t)] + 0.5\,[1 - r_{01}(k+1,t)]^2 \tag{11}$$
$$r_{11}^{01}(k,t) = r_{11}^{10}(k,t) = 0.5\,[\,1 - r_{11}^{11}(k,t) - r_{11}^{00}(k,t)\,]. \tag{12}$$
Equation (9) results from the fact that the load is assumed to be balanced and the switch is symmetric. Equation (10) results from the fact that both output links of a switching element in a Banyan network are connected either to two upper input links or to two lower input links of two different switching elements of the next stage. (See Fig. 1.) Equation (11) follows from the same reasoning. Equation (12) results from the symmetry property of the switch and the conservation law of probability.
Equations governing the state transitions are given by
[Equation (13): a matrix equation giving $p_{ij}(k,t+1)$ in terms of $p_{ij}(k,t)$; the coefficient matrix is not recoverable from the source.]
where the arguments $(k,t)$ of the variables in the coefficient matrix are left out for simplicity of presentation. In (13) we also use the shorthand notations $\bar{q}$, $\bar{r}_{01}$, and $\bar{r}_{10}$ to denote $(1 - q)$, $(1 - r_{01})$, and $(1 - r_{10})$, respectively.
Equations (7)-(13) govern the dynamics of the state transition of the system. If the system has a steady state, then all variables should converge to some corresponding time-independent quantities (after some iteration). The normalized throughput $S$ and the normalized delay $d$ are given by
$$S = 0.5\,\{\,p_{01}(k)\,r_{01}(k) + p_{10}(k)\,r_{10}(k) + p_{11}(k)\,[\,r_{11}^{10}(k) + r_{11}^{01}(k) + 2\,r_{11}^{11}(k)\,]\,\} \tag{14}$$
and
$$d = \frac{1}{n}\sum_{k=1}^{n}\frac{1}{\rho(k)} \tag{15}$$
where
$$\rho(k) = \frac{p_{10}(k)\,r_{10}(k) + p_{11}(k)\,[\,r_{11}^{11}(k) + r_{11}^{10}(k)\,]}{p_{10}(k) + p_{11}(k)}. \tag{16}$$
It is noted that $\rho(k)$ is the probability that a packet in the upper buffer of a switching element is able to move forward, given that there is a packet in that buffer. Because of the symmetry of the problem, $\rho(k)$ is also applicable to the lower buffer. Therefore, (15) gives the correct expression for normalized delay $d$.
We have used (7)-(16) to compute the normalized throughput and average delay for varying loads and switching network sizes, and have found that the results of this refined model differ very little from those obtained in the previous model (differences are less than 0.1 percent). Thus, we conclude that the independence assumption is reasonable, and Model 1 of Section III gives good approximate results with greatly reduced computation effort.
Fig. 4. Packet switch with infinite waiting rooms at its input buffer controllers.
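
As a small check on the conditional forwarding probabilities of Model 2, the sketch below evaluates the forms of (10)-(12) for an element whose two buffers are both full. It treats the bracketed forwarding probability in (10) and (11) as a single number `a`, a simplification introduced here, and the function name is illustrative. The four outcomes should sum to one, and the expected number of packets that move should equal $1.5a$, that is $0.75a$ per buffer, matching the factor 0.75 used in Model 1.

```python
def both_full_outcomes(a: float) -> dict[str, float]:
    """Outcome probabilities for an element whose two buffers are both full.

    `a` is the (assumed common) probability that a next-stage buffer can
    accept a packet; keys give (upper moves, lower moves) as '11', '00', ...
    """
    r11 = 0.5 * a ** 2  # form of (10): no conflict, both accepted
    r00 = 0.5 * (1.0 - a) + 0.5 * (1.0 - a) ** 2  # form of (11)
    r01 = r10 = 0.5 * (1.0 - r11 - r00)  # form of (12)
    return {"11": r11, "00": r00, "01": r01, "10": r10}


out = both_full_outcomes(0.8)
expected_moves = 2 * out["11"] + out["01"] + out["10"]
print(out, expected_moves)
```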
V. SWITCHING DELAY-IBC WITH INFINITE WAITING ROOM
In this section we study a system with input buffer controllers (IBC's) added to the front end of a Banyan switch. The model is shown in Fig. 4. Packets arrive at an input buffer controller at a rate of $\lambda$ packets per clock period $\tau$. At clock ticks, if the corresponding buffer of the switching element at stage 1 of the switching network is able to accept a packet from the input buffer controller, the IBC will place a packet into that buffer (if there is any packet in the IBC queue). Arriving packets are placed in the IBC queue according to the order of their arrivals, and they are delivered to the switching network in first-in-first-out (FIFO) fashion. We are interested in quantifying the switching delay, which is the time a packet spends in the IBC queue and in the switch before it leaves the switch. The size of the IBC waiting room is assumed to be infinite. In the next section we will address the problem with a finite IBC waiting room.
Let us consider a discrete-time queueing model for automatic repeat request (ARQ) used in data communications as shown in Fig. 5 and previously analyzed in [4]. The time axis is slotted into uniformly spaced time epochs. The random variable $X_i$ is the number of arriving packets during the $i$th slot, $D_i$ is the number of packets removed from the queue during the $i$th slot, and $L_i$ is the queue length (number of waiting packets) at the beginning of the $i$th slot. Usually, a slot time is the time required to transmit a packet. If the transmission is successful, $D_i$ is 1; otherwise, $D_i$ is zero. The probability that a transmission is successful is $p$. This model can be characterized by the following two equations:
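$$L_{i+1} = L_i + X_i - D_i \tag{17}$$
and
$$D_i = \begin{cases} 0, & \text{if } L_i = 0 \\ 0 \text{ with probability } 1 - p, & \text{if } L_i > 0 \\ 1 \text{ with probability } p, & \text{if } L_i > 0. \end{cases} \tag{18}$$
If $X_i$ is independent of $i$ (i.e., stationary), then the steady state exists and we have $E(X) = E(D) = p \cdot \mathrm{prob}\{L \neq 0\}$ from (17) and (18).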
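
The balance relation $E(X) = E(D) = p \cdot \mathrm{prob}\{L \neq 0\}$ can be checked with a short simulation of (17) and (18). The sketch below assumes Bernoulli arrivals (at most one packet per slot), which is only one admissible choice of $X_i$; the function name, seed, and run length are illustrative.

```python
import random


def simulate_queue(lam: float, p: float, slots: int = 200_000, seed: int = 1):
    """Simulate L_{i+1} = L_i + X_i - D_i with Bernoulli(lam) arrivals.

    Returns (mean arrivals per slot, mean departures per slot,
             fraction of slots that start with a nonempty queue).
    """
    rng = random.Random(seed)
    length = 0
    arrivals = departures = nonempty = 0
    for _ in range(slots):
        x = 1 if rng.random() < lam else 0  # X_i
        d = 0
        if length > 0:  # (18): only a waiting packet can leave
            nonempty += 1
            d = 1 if rng.random() < p else 0  # D_i
        length += x - d  # (17)
        arrivals += x
        departures += d
    return arrivals / slots, departures / slots, nonempty / slots


ex, ed, busy = simulate_queue(lam=0.3, p=0.6)
print(ex, ed, 0.6 * busy)
```

With a stable queue (arrival rate below $p$), the three printed numbers should be close to one another, up to simulation noise.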
If we compare this model to our switching model of Fig. 4, we see that the two models are mathematically equivalent and the corresponding variables and parameters are
$$E(X) = \lambda \tag{19}$$
$$p = p_0(1) + p_1(1)\,r(1) \tag{20}$$
and
$$\mathrm{prob}\{L \neq 0\} = q(1) \tag{21}$$
where $p_0(1)$, $p_1(1)$, $r(1)$, and $q(1)$ are steady-state variables of our Model 1 of Section III.
Now we are ready to describe an algorithmic solution to
our switching delay problem.
Fig. 6. Switching delay of a packet switch with 10-stage Banyan network. (Horizontal axis: packet arrival rate, in packets per clock period, from 0 to 0.5, with 0.4520 marked.)
Fig. 7. Packet switch with finite waiting rooms at its input buffer controllers.
Step 1) Compute $p$:
a) Guess an initial $q(1)$.
b) Perform the Banyan network computation using Model 1 to obtain $p_0(1)$, $p_1(1)$, and $r(1)$.
c) Let $p = p_0(1) + p_1(1)\,r(1)$.
d) Let $q(1) = \lambda / p$.
e) Go to b) and iterate.
After the procedure converges, we obtain $p$.
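
Step 1 can be sketched as a fixed-point iteration wrapped around Model 1. The code below is an illustration under assumptions, not the author's program: the initial guess $q(1) = \lambda$, the cap of $q(1)$ at 1, the tolerances, and the function names are all choices made here. Model 1 is restated compactly so that the block runs on its own.

```python
def model1(n, q1, tol=1e-12, max_iter=100_000):
    """Steady state of Model 1 (equations (1)-(4)); lists indexed 1..n."""
    p0, r = [1.0] * (n + 2), [1.0] * (n + 2)
    for _ in range(max_iter):
        p1 = [1.0 - x for x in p0]
        q = [0.0, q1] + [0.75 * p1[k - 1] ** 2 + p0[k - 1] * p1[k - 1]
                         for k in range(2, n + 1)]
        r[n] = p0[n] + 0.75 * p1[n]
        for k in range(n - 1, 0, -1):
            r[k] = (p0[k] + 0.75 * p1[k]) * (p0[k + 1] + p1[k + 1] * r[k + 1])
        new = [1.0] + [(1.0 - q[k]) * (p0[k] + p1[k] * r[k])
                       for k in range(1, n + 1)] + [1.0]
        done = max(abs(a - b) for a, b in zip(new, p0)) < tol
        p0 = new
        if done:
            break
    return p0, r


def acceptance_probability(n, lam, tol=1e-10, max_iter=10_000):
    """Step 1: find p and q(1) such that q(1) = lam / p."""
    q1 = lam  # a) initial guess (p <= 1 implies q(1) >= lam)
    for _ in range(max_iter):
        p0, r = model1(n, q1)  # b) Banyan network computation
        p = p0[1] + (1.0 - p0[1]) * r[1]  # c) p = p0(1) + p1(1) r(1)
        new_q1 = min(1.0, lam / p)  # d) q(1) = lam / p, capped at 1 (assumption)
        if abs(new_q1 - q1) < tol:
            return p, new_q1
        q1 = new_q1  # e) iterate
    raise RuntimeError("fixed-point iteration did not converge")


p, q1 = acceptance_probability(n=4, lam=0.3)
print(p, q1, p * q1)
```

When the offered load is below the maximum throughput, the product $p\,q(1)$ at convergence reproduces $\lambda$, which is the balance condition of the queueing model. If the cap at 1 is reached, the product falls short of $\lambda$, signalling a load the switch cannot carry.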
Step 2) Solve the discrete-time queuing model.
With the $p$ obtained in Step 1), we then solve the well-known discrete-time queuing model. This model has previously been solved [4], and the mean switching delay is given by
[Equation (22) is not recoverable from the source.]
where $\sigma_x^2$ is the variance of the arrival $X$.
A plot of mean switching delay for a 10-stage Banyan switch with front-end IBC queue is shown in Fig. 6. It is seen that this delay is relatively small for loads ($q(1)$) less than 0.4, and increases rapidly to infinity at about 0.45 load. Notice that 0.45 is the maximum achievable throughput for a 10-stage Banyan network as shown in Fig. 2.
VI. BLOCKING PROBABILITY-IBC WITH FINITE WAITING ROOM
In this section we study the case where the IBC waiting room is of finite size $K$ and the quantity of interest is the blocking probability $P_b$. The blocking probability is the probability that an arriving packet will find the IBC waiting room full. Blocked packets are dropped and considered to be lost.
The model is shown in Fig. 7. This model is similar to the one shown in Fig. 4, except that the IBC has only $K$ waiting rooms. Packets arrive at the buffer controller at a rate of $\lambda$ packets per clock period. Let $\lambda'$ be the "carried load" of the system, that is, the average number of packets accepted by the switch per clock period; then the blocking probability $P_b$ is given by
$$P_b = (\lambda - \lambda')/\lambda. \tag{23}$$
Therefore, if for every arrival rate $\lambda$ we can compute $\lambda'$, then we can obtain $P_b$ readily by (23).
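
As a worked instance of (23), the sketch below evaluates the blocking probability for a hypothetical offered load of 0.9 and carried load of 0.45 packets per clock period. The numbers are chosen only for illustration and the function name is not from the paper.

```python
def blocking_probability(offered: float, carried: float) -> float:
    """Equation (23): fraction of arriving packets that are dropped."""
    if offered <= 0:
        raise ValueError("offered load must be positive")
    if not 0 <= carried <= offered:
        raise ValueError("carried load must lie between 0 and the offered load")
    return (offered - carried) / offered


print(blocking_probability(0.9, 0.45))
```

With these inputs about half of the arriving packets are lost, since the switch carries only half of what is offered.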
Fig. 8. Blocking probability of a packet switch with a 10-stage Banyan network. (Horizontal axis: buffer size K.)
Recall that $\lambda'$ is simply the normalized throughput of the Banyan switch described in Section III. To obtain $\lambda'$, we need the quantity $q(1)$. If we consider the Banyan switch as a server for the finite queue, then the service time for a customer (a packet) is $n\tau$ (recall that $\tau$ is the clock period; here $n = 1, 2, \ldots$ counts clock periods rather than stages) with probability $p(1 - p)^{n-1}$, where $p$ is the probability that an input link at the first stage of the Banyan network is able to accept a new packet at clock ticks. This variable $p$ can be expressed in terms of the parameters of the Banyan network:
$$p = p_0(1) + p_1(1)\,r(1). \tag{24}$$
Because the switch operates in clocked mode, the server will not return until the next clock tick if it finds an empty queue at the clock tick. Hence, assuming Poisson arrivals, the system can be modeled as an M/G/1/K queue with vacation [5] (the vacation time is the clock period $\tau$). The quantity $q(1)$ is simply the probability that the server will find a nonempty queue at clock ticks. It is given by $B/(V + B)$, where $B$ and $V$ are, respectively, the mean busy and idle periods of an M/G/1/K queue. Now we are ready to describe the algorithm for computing the blocking probability as follows.
Step 1) Guess an initial $q(1)$.
Step 2) Perform the Banyan network computation using Model 1 of Section III.
Step 3) $p = p_0(1) + p_1(1)\,r(1)$.
Step 4) Do the M/G/1/K computation.
Step 5) $q(1) = B/(B + V)$.
Step 6) Go to Step 2).
After the procedure converges, the carried load $\lambda'$ is given by the normalized throughput of the switch, and the blocking probability is given by (23).
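
The analytic M/G/1/K computation with vacation relies on [5] and is not reproduced here. As a stand-in, the sketch below estimates the same two quantities by simulation for a fixed acceptance probability $p$: the blocking probability and the fraction of clock ticks at which the queue is nonempty, which plays the role of $q(1)$. This swaps a Monte Carlo estimate for the analytic step and does not close the loop with Model 1. The function names, the seed, the run length, and the use of Knuth's method for Poisson sampling are choices made for illustration.

```python
import math
import random


def poisson(rng: random.Random, mean: float) -> int:
    """Draw a Poisson variate by multiplying uniforms (Knuth's method)."""
    limit, count, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count


def simulate_ibc(lam, p, K, ticks=200_000, seed=1):
    """Return (blocking probability, fraction of ticks with a nonempty queue)."""
    rng = random.Random(seed)
    queue = offered = blocked = nonempty = 0
    for _ in range(ticks):
        if queue > 0:  # the server inspects the queue at the clock tick
            nonempty += 1
            if rng.random() < p:  # first-stage buffer accepts the head packet
                queue -= 1
        for _arrival in range(poisson(rng, lam)):  # arrivals during the period
            offered += 1
            if queue < K:
                queue += 1
            else:
                blocked += 1  # waiting room full: the packet is lost
    return blocked / offered, nonempty / ticks


pb, busy = simulate_ibc(lam=0.9, p=0.45, K=5)
print(pb, busy)
```

In a full implementation of Steps 1) to 6), the value of $p$ passed in would come from Model 1 and the returned nonempty fraction would be fed back as the next $q(1)$.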
In Fig. 8, we plot the blocking probability $P_b$ against the buffer size $K$ with load $\lambda$ as a parameter for the 10-stage network. It is noted that, for $\lambda < 0.45$, $P_b$ decreases sharply as $K$ increases, while for $\lambda > 0.45$, the $P_b$ curves behave dramatically differently. This is due to the fact that a 10-stage Banyan switch has the maximum achievable normalized throughput 0.45. If the input load $\lambda$ is greater than 0.5, the blocking probability $P_b$ is about $(\lambda - 0.45)/\lambda$ for $K > 3$. In Fig. 9, we plot the blocking probability $P_b$ against the load $\lambda$ with the buffer size $K$ as a parameter. Again, we see that the breakdown of the smoothness of the curves happens near $\lambda = 0.45$.
Fig. 9. Blocking probability of a packet switch with a 10-stage Banyan network. (Horizontal axis: packet arrival rate, in packets per clock period; curves labeled by buffer size K.)
VII. CONCLUSIONS
In this paper, we analyzed the performance of the packet switch based on the single-buffered Banyan network. First, we derived a model for computing the normalized throughput and the average internal delay of the Banyan switch. Then we used a discrete-time queuing model to compute the switch delay of the switch with infinite waiting rooms at its input buffer controllers. It was found that the switch delay is not significant until the input load approaches the maximum achievable throughput of the Banyan switch. An M/G/1/K queue with vacation was used to compute the blocking probability of the packet switch with finite waiting rooms at its input buffer controllers.
REFERENCES
[1] "Interconnection Networks," Special Issue of IEEE Computer, Dec. 1981.
[2] D. M. Dias and R. Jump, "Analysis and simulation of buffered delta network," IEEE Trans. Comput., vol. C-30, Apr. 1981.
[3] S. Cheemalavagu and M. Malek, "Analysis and simulation of Banyan interconnection networks with 2 × 2, 4 × 4, and 8 × 8 switching elements," in Proc. 1982 Real-Time Syst. Symp., Los Angeles, CA, Dec. 1982.
[4] Y. C. Jenq, "On calculations of transient statistics of a discrete queuing system with independent general arrivals and geometric departures," IEEE Trans. Commun., vol. COM-28, pp. 908-911, June 1980.
[5] T. T. Lee, "M/G/1/N queue with vacation time and exhaustive service discipline," J. Oper. Res., submitted for publication.
Yih-Chyun Jenq (S'74-M'77) was born in Taipei, Taiwan, on October 26, 1949. He received the B.S.E. degree from the National Taiwan University, Taipei, Taiwan, in 1971, and the M.S.E., M.A., and Ph.D. degrees in electrical engineering from Princeton University, Princeton, NJ, in 1974, 1975, and 1976, respectively.
He worked for COMSAT Laboratories, Clarksburg, MD, in the summer of 1974 and the Datalog Division of Litton Industries, Melville, NY, in the summer of 1979. From 1976 to 1980 he was Assistant Professor of Electrical Engineering at the State University of New York at Stony Brook, Stony Brook, NY. Since June 1980 he has been with Bell Laboratories, Holmdel, NJ. His current research interests are in the areas of data and computer communications, queueing analysis, and digital signal processing.
Dr. Jenq is a member of Eta Kappa Nu and Sigma Xi.