On the stability of queueing networks and fluid models

Dieter Armbruster∗, †,
Erjen Lefeber∗ and J.E. Rooda∗
January 11, 2010
Abstract
The stability of the Lu-Kumar queueing network is re-analyzed. It
is shown that the associated fluid network is a hybrid dynamical system
that has a succession of invariant subspaces leading to global stability. It
is explained why large enough stochastic perturbations of the production
rates lead to an unstable queuing network while smaller perturbations do
not change the stability. The two reasons for the instability are the breaking of the invariance of the subspaces and a positive Lyapunov exponent.
A service rule that stabilizes the system is proposed.
1 Introduction
Queueing networks are dynamical networks of production stations and queues,
where customers or lots arrive into a system at random time intervals and get
service of random length at various stations until they eventually depart from
the system. Typical examples are production systems, communication networks
or logistic systems. The networks are characterized by the topology of the flows
through the network, the stochastic arrival processes, the stochastic service
process that a job receives at a particular node and in particular through the
policies that govern the choices that can be made in the network. We are
concerned with choices arising from a station that services more than one class
of jobs, the so-called service rules or disciplines for multiclass networks. We
consider fixed deterministic routing representing a flow through the network that
follows a fixed route, disregarding networks that present routing choices which
may depend on the state of the network as well as on stochastic parameters.
One of the first important questions for a queueing network is whether it is
stable or not, relative to a set of service rules, and given arrival and production
processes. Stability in this context intuitively means that the length of the
queues remains bounded for all time and for all initial conditions whereas a
queueing network is called unstable if, for some initial state, the number of jobs
in the network will, with positive probability, go to infinity as t → ∞ [1].
∗ Department of Mechanical Engineering, Eindhoven University of Technology, P.O.
Box 513, NL-5600 MB, Eindhoven, The Netherlands, email: [email protected],
[email protected]
† School of Mathematical and Statistical Sciences, Arizona State University, Tempe, AZ
85287-1804, USA, email: [email protected]
Queuing networks are analyzed using network equations tying together the
random variables describing the state of the network. Since they are equations
for random variables further analysis is very complicated. Fluid model equations
are the deterministic equations replacing the random variables with their means.
Fluid models conceptually arise from treating the jobs in a queueing network
as a continuous fluid that flows, via inflow and outflow rates, through a finite
number of buckets (the queues). The resulting models are hybrid dynamical
systems: sets of ordinary differential equations for the time evolution of the
queue lengths. The systems are hybrid because, depending on the state of the
system, a different service rule may apply, which changes the flow of jobs
through the network. Fluid models are deterministic dynamical
systems and hence easier to analyze than queueing networks. A fluid model is
called stable if there exists a time t0 depending on the initial condition such
that for t > t0 all buckets are empty, i.e. zero is a stable fixed point of the
hybrid dynamical system that is reached from every initial condition in finite
time. The preferred method of showing that a fluid model is stable is to derive
a global Lyapunov function with zero as a stable fixed point.
The appeal of the fluid models is that they are deterministic, they fall into
the category of dynamical systems which have been studied extensively and
hence are well understood and they represent an intuitive analogy between the
flow in the queueing network and fluid flow.
However, since these systems are hybrid they pose some unique challenges to
dynamical systems theory. Specifically, there has been a long-standing discussion
about the stability of a queuing network and the stability of the associated fluid
network. Dai [2] proved that a sufficient condition for stability of a queuing
network is the fact that the associated fluid network is stable. Subsequent work
showed that the converse is not true (see e.g. [4] and [1]) and [3] confused the
issue significantly. Specifically, [3] analyze a queueing network that is stable
if the arrival and processing rates are deterministic and is unstable if they are
exponentially distributed. Since the associated fluid model reflects the behavior
of the mean quantities of the queueing network it is the same for both cases,
casting doubt on Dai’s [2] theorem and the relationship between stability of the
fluid model and the queueing system in general.
Our paper resolves the issue raised in [3]. We consider a specific uniquely
defined fluid model forming a hybrid dynamical system for the example in [3].
We will analyze the flow of this fluid model and show that the chosen service
discipline is in effect a clearing policy that leads to a hybrid dynamical system
for which all trajectories eventually enter invariant subspaces with successively
smaller dimensions until the final subspace has dimension zero, i.e. the trajectories arrive at the origin, which is a globally stable fixed point. We will show
that large enough fluctuations of the production rates will kick the system out
of the invariant subspaces generating a flow that has positive Lyapunov exponents leading to instability. This observation explains the different behavior of
the queueing system under different types of stochastic arrival and production
rates. In addition, it suggests a revised service rule that prevents the fluid model
from leaving the invariant subspaces and hence stabilizes the queuing system with
exponential arrival and production rates.
2 A queueing network whose stability depends on the distributions of the stochastic processes
Dai et al. [3] study the two-station, five-step queueing system shown in Figure 1.
The mean arrival rate into the system is λ1 = 0.1, and the mean processing
times are (m1, ..., m5) = (4, 1, 4, 1, 4). The priority rule for Machine A is: serve
jobs of type 1 as long as buffer 1 is nonempty, then serve jobs of type 3, and
serve jobs of type 4 only if both buffers 1 and 3 are empty. Similarly, the rule
for Machine B is to serve jobs of type 5 before serving jobs of type 2.
Figure 1: A push start Lu-Kumar network (diagram labels: Machine A, Machine B, buffers x1 to x5, arrival rate λ1 = 1/10, processing times m1 = 4, m2 = 1, m3 = 4, m4 = 1, m5 = 4)
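As a quick check on these parameters, the nominal load of each machine (arrival rate times the total mean processing time a job needs at that machine) can be computed directly. The following is a minimal sketch; the names `arrival_rate`, `mean_times`, `machine_steps` and `nominal_load` are illustrative and do not come from the paper.

```python
from fractions import Fraction

# Parameters stated in the text: arrival rate and mean processing times m1..m5.
arrival_rate = Fraction(1, 10)
mean_times = {1: 4, 2: 1, 3: 4, 4: 1, 5: 4}

# Job types served by each machine, listed in priority order as in the text.
machine_steps = {"A": (1, 3, 4), "B": (5, 2)}


def nominal_load(machine):
    """Long-run fraction of time the machine is busy if every step sees rate lambda."""
    return arrival_rate * sum(mean_times[i] for i in machine_steps[machine])


loads = {name: nominal_load(name) for name in machine_steps}
for name, load in loads.items():
    print(f"Machine {name}: load {float(load):.2f}, slack {float(1 - load):.2f}")
```

Both loads are below one, so each machine has spare capacity under nominal rates.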
For this system, Dai et al. show that
• The queueing system is stable if arrival rates and all the processing rates
are deterministic. Figure 2 shows a typical simulation.
Figure 2: Total WIP as a function of time for deterministic rates (plot titled "Deterministic service times": buffer contents x1 to x5 versus time, time axis in units of 10^4)
• If the arrival and the processing rates of the queueing system are exponentially distributed then, starting from any initial state,
$$\lim_{t\to\infty} |X(t)| = \infty \quad \text{with probability 1,}$$
i.e., the queueing system is unstable. Figure 3 shows a typical simulation.
Additional results show that, for uniformly distributed processing rates, small
perturbation levels lead to stable networks, whereas large perturbations lead to unstable ones.
Figure 3: Total WIP as a function of time for exponentially distributed rates (plot titled "Exponential service times": buffer contents x1 to x5 versus time, time axis in units of 10^4)
3 The fluid model
It has long been known that the integral or algebraic form of the fluid equations
has non-unique solutions. The issue arises if a buffer with high priority is empty
but has an arrival rate that is below the machine capacity. As long as the high
priority buffer is empty, the machine can serve another queue. For any fixed
time interval, there are infinitely many ways to distribute the machine
effort over the two processes that leave the high priority buffer empty at the
end of the time interval. In order to arrive at a unique fluid model in differential
form, we model such a case in the following way: an empty high priority buffer
with an arrival rate λ and a processing time m receives continuous service at its
arrival rate by the server, with the remaining service time simultaneously and
continuously allocated to the low priority buffers.
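The allocation rule just described can be illustrated for a single server. This is a minimal sketch, assuming the inflow into each buffer is already known; the function name `allocate` and its tuple layout are hypothetical and not taken from the paper.

```python
from fractions import Fraction


def allocate(buffers):
    """Split one server's effort over its buffers, highest priority first.

    `buffers` is a list of (level, inflow, mean_time) tuples in priority order.
    Returns the service rate (jobs per unit time) assigned to each buffer.
    """
    capacity = Fraction(1)  # fraction of server effort not yet assigned
    rates = []
    for level, inflow, mean_time in buffers:
        max_rate = capacity / mean_time
        if level > 0:
            rate = max_rate               # nonempty buffer takes all remaining effort
        else:
            rate = min(inflow, max_rate)  # empty buffer is served at its arrival rate
        capacity -= rate * mean_time
        rates.append(rate)
    return rates


lam = Fraction(1, 10)
# Machine A with buffer 1 empty (inflow lam) and buffers 3 and 4 nonempty.
print(allocate([(0, lam, 4), (2, 0, 4), (1, 0, 1)]))
```

In this example the empty buffer 1 is served at rate 1/10, which uses 4/10 of the server, and the remaining 6/10 serves buffer 3 at rate 3/20, leaving nothing for buffer 4.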
Define xi to be the buffer level in queue i. Then there are ten different
regions in phase space called stages in which the fluid system has a different
flow. The constraints for the stages are shown in Table 1, the associated vector
fields are shown in Table 2.
Stage I:     x1 > 0, x5 > 0
Stage II:    x1 > 0, x5 = 0, x2 > 0
Stage III:   x1 > 0, x5 = 0, x2 = 0
Stage IV:    x1 = 0, x3 > 0, x5 > 0, x2 ≥ 0, x4 ≥ 0
Stage V:     x1 = 0, x3 = 0, x5 > 0, x2 ≥ 0, x4 > 0
Stage VI:    x1 = 0, x3 = 0, x5 > 0, x2 ≥ 0, x4 = 0
Stage VII:   x1 = 0, x3 ≥ 0, x5 = 0, x2 > 0, x4 ≥ 0
Stage VIII:  x1 = 0, x3 > 0, x5 = 0, x2 = 0, x4 ≥ 0
Stage IX:    x1 = 0, x3 = 0, x5 = 0, x2 = 0, x4 > 0
Stage X:     x1 = 0, x3 = 0, x5 = 0, x2 = 0, x4 = 0
Table 1: The different regions in phase space with different flows
Stage    x1′     x2′     x3′     x4′     x5′
I       −0.15    0.25    0       0      −0.25
II      −0.15   −0.75    1       0       0
III     −0.15    0       0.25    0       0
IV       0       0.1    −0.15    0.15   −0.25
V        0       0.1     0      −0.6     0.35
VI       0       0.1     0       0      −0.25
VII      0      −0.9     0.85    0.15    0
VIII     0       0      −0.05    0.15    0
IX       0       0       0      −0.1     0
X        0       0       0       0       0
Table 2: The vector fields in the different regions
With these vector fields, trajectories go through the following sequence of stages:
$$\begin{array}{lcl}
\text{I, II, III} & \to & \text{IV} \\
\text{IVa } (x_3 \ge x_5) & \to & \text{VII} \\
\text{IVb } (x_3 < x_5) & \to & \text{V} \\
\text{V} & \to & \text{VI} \to \text{VII} \to \text{VIII} \to \text{IX} \to \text{X}.
\end{array} \tag{1}$$
This implies that the fluid model has zero as a globally asymptotically stable fixed point. In addition, the stages VII - X are invariant subspaces with
successively smaller dimensions:
$$\dim \text{VII} = 3, \quad \dim \text{VIII} = 2, \quad \dim \text{IX} = 1, \quad \dim \text{X} = 0. \tag{2}$$
As a result, the chosen priority policy acts like a clearing policy that successively reduces queues to zero.
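The passage through the stages can be traced numerically by classifying the state with the constraints of Table 1 and following the constant vector field of Table 2 until a draining buffer empties. The following is a minimal sketch using exact rational arithmetic; the names `TABLE_2`, `FIELD`, `stage` and `simulate` are illustrative, and the initial condition is an arbitrary choice.

```python
from fractions import Fraction as F

# Vector fields (x1', ..., x5') for each stage, copied from Table 2.
TABLE_2 = {
    "I":    "-0.15  0.25  0     0     -0.25",
    "II":   "-0.15 -0.75  1     0      0",
    "III":  "-0.15  0     0.25  0      0",
    "IV":   " 0     0.1  -0.15  0.15  -0.25",
    "V":    " 0     0.1   0    -0.6    0.35",
    "VI":   " 0     0.1   0     0     -0.25",
    "VII":  " 0    -0.9   0.85  0.15   0",
    "VIII": " 0     0    -0.05  0.15   0",
    "IX":   " 0     0     0    -0.1    0",
    "X":    " 0     0     0     0      0",
}
FIELD = {name: tuple(F(s) for s in row.split()) for name, row in TABLE_2.items()}


def stage(x):
    """Classify a state into one of the ten stages using the constraints of Table 1."""
    x1, x2, x3, x4, x5 = x
    if x1 > 0:
        return "I" if x5 > 0 else ("II" if x2 > 0 else "III")
    if x5 > 0:
        return "IV" if x3 > 0 else ("V" if x4 > 0 else "VI")
    if x2 > 0:
        return "VII"
    if x3 > 0:
        return "VIII"
    return "IX" if x4 > 0 else "X"


def simulate(x0, max_events=1000):
    """Follow the piecewise-linear flow from x0 until Stage X (the origin)."""
    x = [F(v) for v in x0]
    t = F(0)
    visited = [stage(x)]
    while visited[-1] != "X" and len(visited) <= max_events:
        v = FIELD[visited[-1]]
        # Time until the first draining buffer hits zero in the current stage.
        dt = min(xi / -vi for xi, vi in zip(x, v) if vi < 0)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
        t += dt
        visited.append(stage(x))
    return visited, t, x


visited, t, x = simulate((0, 0, 2, 0, 1))  # starts in Stage IV with x3 >= x5
print(" -> ".join(visited), "| emptied at t =", t)
```

For this initial condition the sketch visits IV, VII, VIII, IX and X in that order, in line with the IVa branch of (1), and all buffers are empty after a finite time.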
3.1 Resolving the instability mystery
The fact that the subspaces (2) are invariant is one key observation for understanding the sensitivity of the system to stochastic perturbations: a perturbation that reduces the production rate of a machine
may lead to the growth of a buffer that has already been reduced to zero. As
a result, the trajectory is thrown back to an earlier stage in the succession of
stages shown in (1). This also explains the sensitivity to stochastic production
times (rates) that are generated either from wide uniform distributions or
from exponential distributions, and the insensitivity to narrow uniform distributions: all production machines have some slack capacity. If the perturbation of
the production rate only affects slack capacity, queues that have been cleared
before may still stay zero, although the production rate has been reduced. In
that case, the trajectory will continue to flow towards the origin.
Violating the invariance of the subspaces is however not sufficient to generate instability. The second ingredient is that the flow becomes recurrent with
a positive Lyapunov exponent. In this specific example this happens for a perturbation at Stage IX: Assume for instance that Machine B has a significantly
lower production rate, leading to the growth of buffer 5 and buffer 2. As a
result, we are back in Stage V. The map from stages V → IX is given by:
$$(0, x_2, 0, x_4, x_5) \to \left(0, 0, 0, 3x_2 + \tfrac{6}{5}x_4 + \tfrac{6}{5}x_5, 0\right). \tag{3}$$
Hence a perturbation in Stage IX to an initial value of (0, ε1, 0, x4, ε2) by a
momentary reduction of the production rate for step 5 creates a Poincaré map
of the form
$$\text{Stage IX} \to \text{Stage IX}: \tag{4}$$
$$(0, 0, 0, x_4, 0) \to \left(0, 0, 0, 3\epsilon_1 + \tfrac{6}{5}x_4 + \tfrac{6}{5}\epsilon_2, 0\right). \tag{5}$$
The important point in this map is not the influence of the perturbations εi
but the Lyapunov exponent 6/5, which is bigger than one. Hence, any time the
perturbation is big enough to move the trajectory back into Stage V, the size
of buffer 4 grows.
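The map from Stage V back to Stage IX can be checked by composing the four linear flows of stages V to VIII from Table 2. The sketch below assumes the trajectory passes through V, VI, VII and VIII in that order, as in (1), and that kicks out of Stage IX follow one another with no draining time in between; the names `LEGS` and `return_to_stage_ix` and the perturbation sizes are hypothetical.

```python
from fractions import Fraction as F

# Rows V to VIII of Table 2 as (x1', ..., x5'), each paired with the
# 0-based index of the buffer whose emptying ends that stage.
LEGS = [
    ((F(0), F("0.1"), F(0), F("-0.6"), F("0.35")), 3),    # Stage V: x4 drains
    ((F(0), F("0.1"), F(0), F(0), F("-0.25")), 4),        # Stage VI: x5 drains
    ((F(0), F("-0.9"), F("0.85"), F("0.15"), F(0)), 1),   # Stage VII: x2 drains
    ((F(0), F(0), F("-0.05"), F("0.15"), F(0)), 2),       # Stage VIII: x3 drains
]


def return_to_stage_ix(x2, x4, x5):
    """Map (0, x2, 0, x4, x5) in Stage V to the state on re-entering Stage IX."""
    x = [F(0), F(x2), F(0), F(x4), F(x5)]
    for field, draining in LEGS:
        dt = x[draining] / -field[draining]  # time for the draining buffer to empty
        x = [xi + dt * vi for xi, vi in zip(x, field)]
    return x


eps1, eps2 = F(1, 100), F(1, 100)  # hypothetical perturbation sizes
x4 = F(10)
levels = [x4]
for _ in range(3):                 # three kicks out of Stage IX in a row
    x4 = return_to_stage_ix(eps1, x4, eps2)[3]
    levels.append(x4)
print([float(v) for v in levels])
```

Each pass multiplies the level of buffer 4 by 6/5 and adds the small perturbation terms, so the printed levels increase from one kick to the next.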
Notice also that, if the system stays long enough in Stage IX, buffer 4 will
drain to zero. Hence, not only the size of the perturbations that triggers a transverse instability of the invariant subspace is important but also its frequency.
Low frequency events will not lead to instability (a bound for the frequency can
obviously be determined from the draining rate in Stage IX and the Lyapunov
exponent).
3.2 Stabilizing the queuing system
Instability is generated by the fact that an outflux is slower or an influx is higher
than nominal and hence a queue that should be zero becomes nonzero. Hence
a stabilizing policy will have to:
• control the influx into the queues that are initially zero to keep them at
the zero level.
• execute upstream from the machine that has a nonzero buffer.
• have this control action only at the stage that has a positive transversal
Lyapunov exponent.
Figure 4 shows a simulation for the Lu-Kumar network using a modified policy
where from the moment we reach Stage IX Machine B starts a job of type two
whenever both x3 = 0 and x2 > 0. The modified policy now stabilizes the
system which generated the instability shown in Figure 3.
4 Conclusion
We have shown for the example of a re-entrant queueing model discussed by [3]
why the fluid model is stable and its associated queueing model may be stable
or unstable depending on the distributions used for the stochastic behavior.
In addition, we have demonstrated that the unstable queueing system can be
stabilized by a small change in the service policy.
An interesting issue is the relationship of this example to the theorem by Dai
[2] that states that a stable fluid limit model implies a stable queueing system.
Figure 4: The stochastic Lu-Kumar network with a stabilizing policy (plot titled "Exponential service times": buffer contents x1 to x5 versus time, time axis in units of 10^4)
The resolution of this apparent contradiction lies in the relationship between
the fluid limit model (see e.g. [1]) and the fluid model as used e.g. in our
Table 2. The fluid limit model is an integral version of the fluid model which
is a hybrid dynamical system. As a result the fluid limit model allows a set of
solutions to an initial value problem only some of which are solutions to the
hybrid dynamical system. The theorem by Dai states that if all solutions to the
fluid limit model are stable then the queueing system is stable. On the other
hand, there are cases (and our current example is one of those) where some
solutions of the fluid limit model are stable and others are unstable and hence
no statement on stability of the queueing system can be made.
The modification of the service policy that leads to the stable system in
Figure 4 can then be understood as a small change in policy that does not affect
the general policy (i.e. the policy in the interior of the phase space) but only
the policy on some subset (i.e. an invariant subspace, a set of measure zero).
As a result the fluid limit model has an additional constraint that removes all
unstable solutions from the set of possible solutions of the fluid limit model.
Notice that Kopzon et al. [5] came up with a very similar change in policy on
a set of measure zero to stabilize the Kumar-Seidman-Rybko-Stolyar network.
In future work we are planning to generalize our result to the following:
Conjecture: If a deterministic queueing network is stable for a certain
service policy, then there exists a closely related service policy such that the
exponential queueing network is also stable and both are equivalent to the same
stable fluid system.
Acknowledgments
Fruitful discussions with Yoni Nazarathy and Gideon Weiss are gratefully acknowledged. E.L. was supported by the Netherlands Organization for Scientific
Research (NWO-VIDI grant 639.072.072). D.A. was supported by a grant from
the Stiftung Volkswagenwerk under the program on Complex Networks and by
NSF grant DMS-0604986.
References
[1] M. Bramson, Stability of Queueing Networks, Lecture Notes in Mathematics, 1950, Springer Verlag, Berlin, 2008
[2] J. G. Dai, On Positive Harris Recurrence of Multiclass Queueing Networks:
A Unified Approach Via Fluid Limit Models, The Annals of Applied Probability, 5(1) pp. 49-77, (1995)
[3] J. G. Dai, John J. Hasenbein, John H. Vande Vate, Stability and Instability
of a Two-Station Queueing Network, The Annals of Applied Probability,
14(1) pp. 326-377, (2004)
[4] S.H. Lu and P.R. Kumar, Distributed scheduling based on due dates and
buffer priorities. IEEE Trans. Automat. Control 36 1406 - 1416, (1991)
[5] Anat Kopzon, Yoni Nazarathy, Gideon Weiss, A Push-Pull network with
infinite supply of work. Queueing Syst. 62 75-111, (2009)