Derivation of equilibrium and time-dependent solutions to M/M/∞//N and M/M/∞ queueing systems using entropy maximization*

by JOHN E. SHORE
Naval Research Laboratory
Washington, DC

* The results in this paper were presented orally at the 1976 IEEE International Symposium on Information Theory.
INTRODUCTION
Queueing theory has provided the basis for remarkable successes in the performance modeling and analysis of computer systems.6,19,21 Because it is clear that computer systems do not satisfy assumptions made by the stochastic process models that are used, this success has been somewhat puzzling; it appears that queueing theory equations have wider applicability than is suggested by their classical derivations. Buzen has offered one possible explanation in terms of "operational analysis."4 In this paper we point out that certain results in elementary queueing theory can be derived simply and with relatively few assumptions by means of entropy maximization. This entropy maximization viewpoint provides the basis of another possible explanation for the widespread applicability of queueing theory formulas.

Analysis by means of entropy maximization has been of interest since Shannon24 showed, for discrete noiseless systems, that the best encoding of an information source, in the sense of enabling the highest information rate over a fixed capacity channel, is the one that maximizes the source entropy. In addition to continuing applications in communication theory, there has been a growing interest in the use of entropy maximization techniques for probabilistic analysis and problem solving in other fields. Much of this work has been stimulated by that of E. T. Jaynes. Detailed discussions concerning the motivation, justification, and validity of various entropy maximization techniques are available elsewhere.3,9,12-15,17,26,27 There have been applications in statistical mechanics,10,11,16 traffic networks,3,8 reliability estimation,27 production line decision making,13,29 system simulation,5 statistics,9,20,22 spectral analysis,1 image reconstruction,30 and general probabilistic problem solving.12-15,17,28

After summarizing the entropy maximization technique in the second section, we obtain in the third section the maximum entropy solution for the state probabilities of an abstract, general system. All of our results are obtained by specializing this general solution. We make some general remarks about applications in the fourth section, and in the fifth section we apply the model to M/M/∞//N and M/M/∞ queueing systems and show how the classical equilibrium distributions result. In the sixth section we discuss how maximum entropy techniques can be used in deriving time-dependent results. After a preliminary application in which we consider a pure birth process, we derive time-dependent distributions for the M/M/∞//N and M/M/∞ queues. Discussion follows in the last section.
ENTROPY MAXIMIZATION
Given a set of independent propositions {S_i} that enumerate all of the possibilities in some situation or problem, and given relevant information that is not expressed directly as probabilities, one frequently wishes to convert this information into suitable probability assignments P(S_i) for each of the propositions. Entropy maximization is a means for doing so. In the usual case, known information is expressed as expectations of suitable functions f_l(S_i) defined on the set of propositions. The entropy

H = -\sum_i P(S_i) \log P(S_i)   (1)

is then maximized subject to the constraints

\sum_i P(S_i) = 1   (2)

\sum_i f_l(S_i) P(S_i) = \langle f_l \rangle \quad (l = 1, 2, \ldots).   (3)

The well-known solution is

P(S_i) = \exp\Big[-\beta_0 - \sum_l f_l(S_i)\,\beta_l\Big],   (4)

where β₀ is a Lagrangian multiplier determined by the normalization constraint (2), and where each β_l is a Lagrangian multiplier determined by a known expectation ⟨f_l⟩. If the "partition function"

Z = \exp(\beta_0) = \sum_i \exp\Big[-\sum_l f_l(S_i)\,\beta_l\Big]   (5)

can be evaluated, then the relationship between each Lagrangian multiplier β_l and its associated constraint ⟨f_l⟩ is determined by the equation

\langle f_l \rangle = -\frac{\partial \beta_0}{\partial \beta_l} = -\frac{1}{Z}\,\frac{\partial Z}{\partial \beta_l}.   (6)
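The relationship (6) rarely needs to be inverted analytically; with a single constraint, a one-dimensional root search suffices. The following Python sketch is a minimal illustration, not part of the original analysis: the four-proposition system and the target expectation are assumed values chosen only for demonstration.

```python
# Minimal sketch (not from the paper): given one constraint function f and a
# known expectation <f>, find the Lagrangian multiplier beta_1 numerically by
# solving equation (6), <f> = -(1/Z) dZ/dbeta_1, with a root search.
import numpy as np
from scipy.optimize import brentq

f = np.array([0.0, 1.0, 2.0, 3.0])    # f(S_i) on four hypothetical propositions
target = 1.2                          # the assumed known expectation <f>

def expectation(beta1):
    w = np.exp(-beta1 * f)            # unnormalized weights exp[-f(S_i) beta_1]
    return (f * w).sum() / w.sum()    # <f> under the maximum entropy distribution

beta1 = brentq(lambda b: expectation(b) - target, -50.0, 50.0)
p = np.exp(-beta1 * f)
p /= p.sum()                          # maximum entropy solution, equation (4)
print(beta1, p, (f * p).sum())        # the last value reproduces the target 1.2
```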
For continuous functions, the appropriate generalization appears to be the maximization of "cross-entropy"

H = -\int dx\, p(x) \log\big[p(x)/q(x)\big],   (7)

where q(x) is some "prior" distribution. Like (1), (7) originated with Shannon25 (Shannon's results for continuous channels and for source rates relative to fidelity evaluations are based on expectations of (7)). Maximization of (7) and related functions is discussed by Good,9 Kullback,20 and Kashyap.17 The term cross-entropy is due to Good.9 Equation (7) has a discrete analog in cases where a prior distribution is known in addition to the expectations (3) (i.e., given no information about the prior, (1) assumes a uniform prior).
As an example, consider the following problem: It is known that a series of events occurs and that the time between successive events can range between 0 and ∞. It is also known that the average time between successive events is ⟨t⟩, i.e., that events occur at the average rate λ = 1/⟨t⟩. A function p(t) giving the probability density of the time between events is needed. The result of maximizing entropy subject to the constraint ∫ t p(t) dt = ⟨t⟩ is p(t) = λe^{−λt}.5,28 Thus, when a finite mean is specified, the distribution of inter-arrival times that maximizes entropy is exponential. From the maximum-entropy viewpoint, the term "exponential arrivals" describes processes about which one knows only the average arrival rate. If it is further required that the inter-arrival times are independent, then we are assured of a Poisson process. Derivation by maximum entropy of the full exponential family is discussed by Noonan, et al.22 and, in a somewhat different form, by Kullback.20
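A small numerical check of this example, not part of the original derivation: on a discretized time axis, the exponential density attains higher entropy than a rival density constructed to have the same mean. The grid, the mean ⟨t⟩ = 2, and the uniform rival below are illustrative choices.

```python
# Minimal sketch (not from the paper): among same-mean densities on [0, inf),
# the exponential has the largest entropy.  We compare it against a uniform
# density with the same mean on a discrete grid.
import numpy as np

dt = 0.01
t = np.arange(0.0, 40.0, dt)
mean = 2.0                                   # the assumed known average <t>

def entropy(p):                              # Riemann-sum differential entropy
    m = p > 0
    return -(p[m] * np.log(p[m])).sum() * dt

p_exp = (1.0 / mean) * np.exp(-t / mean)     # p(t) = lam e^{-lam t}, lam = 1/<t>
p_uni = np.where(t <= 2 * mean, 1.0 / (2 * mean), 0.0)   # rival: uniform on [0, 2<t>]

for p in (p_exp, p_uni):
    print((t * p).sum() * dt, entropy(p))    # same mean; the exponential wins on entropy
```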
MAXIMUM ENTROPY SOLUTION FOR A MODEL SYSTEM
In this section, we will obtain the maximum entropy solution for the state probabilities of an abstract system. All of our later results will be obtained by specializing this general solution. The results presented in this section are similar to those in Reference 14. We consider a system composed of N components, each of which can be in one of two "microstates." A system state S_i is defined as an enumeration of which components are in which microstates. Hence, there are 2^N such system states S_i. If one of the two microstates is specified as the "microstate of interest" (it doesn't matter which one), then we may define the function n(S_i) as

n(S_i) = the number of components that are in the microstate of interest, given that the system is in the system state S_i (0 ≤ n(S_i) ≤ N).

Let P(S_i) be the probability that the system is in the state S_i. We shall now assume that the expectation of n(S_i) is known, and proceed to maximize the entropy of the distribution P(S_i) given this constraint. Specifically, we shall maximize

H = -\sum_{i=1}^{2^N} P(S_i) \log P(S_i)

subject to the constraints

\sum_{i=1}^{2^N} P(S_i) = 1

and

\langle n \rangle = \sum_{i=1}^{2^N} n(S_i) P(S_i).   (8)
In this case, the partition function (5) can be evaluated exactly as follows:

Z = \sum_{i=1}^{2^N} e^{-\beta_1 n(S_i)} = \sum_{k=0}^{N} g(k)\, e^{-\beta_1 k} = \sum_{k=0}^{N} \binom{N}{k} e^{-\beta_1 k} = (1 + e^{-\beta_1})^N.   (9)
In (9), β₁ is a Lagrangian multiplier determined by the constraint (8), and g(k) is the number of system states S_i such that n(S_i) = k. The relationship between the multiplier β₁ and the constraint ⟨n⟩ can be expressed explicitly by applying (6) to (9). The result is

e^{-\beta_1} = \frac{\langle n \rangle / N}{1 - \langle n \rangle / N}.   (10)
Using (9) and (10), one can express the maximum entropy solution (4) for P(S_i) as follows:

P(S_i) = (1 + e^{-\beta_1})^{-N}\, e^{-\beta_1 n(S_i)} = \left(\frac{\langle n \rangle}{N}\right)^{n(S_i)} \left(1 - \frac{\langle n \rangle}{N}\right)^{N - n(S_i)}.   (11)

In deriving (11), we began by maximizing entropy subject to the constraint that the expectation of a function n(S_i) is known. Equation (11) expresses the result explicitly in terms of the function n(S_i) and its known expectation ⟨n⟩.
In applying the result (11), we shall be particularly interested in the quantity

p(k) = probability that the system is in any state S_i such that n(S_i) = k,

which is obtained easily from (11):

p(k) = g(k)(1 + e^{-\beta_1})^{-N}\, e^{-\beta_1 k} = \binom{N}{k} (1 + e^{-\beta_1})^{-N}\, e^{-\beta_1 k}.   (12)
Thus, given the model system and its constraints, the maximum entropy distribution for p(k) is a binomial. This derivation of the binomial distribution was first obtained, in a slightly different form, by Jaynes.14

In certain applications, one is interested in the limit N→∞ while ⟨n⟩ remains finite. The result for p(k) in this case is obtained by using standard methods (e.g., see Reference 7, p. 142) to derive an approximation to (12) given ⟨n⟩ ≪ N, yielding

p(k) = e^{-\langle n \rangle}\, \frac{\langle n \rangle^k}{k!}.   (13)
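For small N, the model system can be checked by brute force. The following sketch, with illustrative values N = 5 and ⟨n⟩ = 2 (not taken from the paper), enumerates all 2^N system states, assigns each the probability (11), and confirms that aggregating over states with n(S_i) = k reproduces the binomial (12).

```python
# Minimal sketch (not from the paper): enumerate the 2^N system states,
# assign each the maximum entropy probability (11), and check that the
# aggregate p(k) matches the binomial form (12).
from itertools import product
from math import comb

N, n_avg = 5, 2.0                          # N components, assumed expectation <n>
q = n_avg / N                              # <n>/N

pk = [0.0] * (N + 1)
for state in product((0, 1), repeat=N):    # a state S_i lists every microstate
    n = sum(state)                         # n(S_i): components in the microstate of interest
    pk[n] += q**n * (1 - q)**(N - n)       # P(S_i) from equation (11)

for k in range(N + 1):
    print(k, pk[k], comb(N, k) * q**k * (1 - q)**(N - k))   # the columns agree
```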
Another probability of interest is

P₁ = probability that any given system component is in the "microstate of interest."

Since there are 2^{N−1} system states that meet the requirement that the given component be in the "microstate of interest," P₁ is the sum of P(S_i) over these 2^{N−1} states. The function n(S_i) is restricted to the values n(S_i) = 1, 2, …, N. For a particular value n(S_i) = k, there are \binom{N-1}{k-1} system states that contribute to the sum of P(S_i). Thus, using (11), we obtain

P_1 = \sum_{k=1}^{N} \binom{N-1}{k-1} \left(\frac{\langle n \rangle}{N}\right)^k \left(1 - \frac{\langle n \rangle}{N}\right)^{N-k} = \frac{\langle n \rangle}{N} \sum_{l=0}^{N-1} \binom{N-1}{l} \left(\frac{\langle n \rangle}{N}\right)^l \left(1 - \frac{\langle n \rangle}{N}\right)^{N-1-l} = \frac{\langle n \rangle}{N}.   (14)

An equivalent form, in terms of the Lagrangian multiplier β₁, follows from (10): P₁ = e^{−β₁}/(1 + e^{−β₁}).
GENERAL REMARKS ON APPLICATIONS
In general, application of the foregoing results to a given problem requires two steps: First, one must verify that the abstract system on which our results were based is an accurate representation of what is known about the given problem. Specifically, one must make sure that the set {S_i} describes completely the set of possibilities implied by the given problem, and that the extent of additional knowledge is limited to the expectation ⟨n⟩ of the function n(S_i). If more is known, then our solutions will be incorrect in the sense that additional information is available but not considered, although the solutions may be useful approximations.14 The second step is to identify the relationship between ⟨n⟩ (or β₁) and parameters contained in the problem statement so that our results for P(S_i) or p(k) can be expressed in terms of these relevant parameters.

As an example, consider the problem of computing the probability of obtaining k heads in N flips of a coin whose single-toss probability of obtaining a head is P_H. In applying our model system, we identify the jth system component as being in the microstate (heads or tails) that records the outcome of the jth coin toss (1 ≤ j ≤ N). Thus, a system state represents a record of the results of each toss, and the 2^N-element set {S_i} is a complete enumeration of the results obtainable by tossing the coin N times. Since P_H is the same as P₁, the combination of (12) and (14) yields the standard result

p(k) = \binom{N}{k} P_H^{\,k} (1 - P_H)^{N-k}.
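The coin example can also be followed through the multiplier route. The sketch below, with illustrative values N = 10 and P_H = 0.3 (assumptions, not from the paper), computes β₁ from (10) with ⟨n⟩/N = P_H and confirms that (12) reproduces the direct binomial probabilities.

```python
# Minimal sketch (not from the paper): the coin example via beta_1.
# From (10) with <n>/N = P_H, beta_1 = -log(P_H / (1 - P_H)); substituting
# into (12) recovers the usual binomial probabilities.
from math import comb, exp, log

N, ph = 10, 0.3
beta1 = -log(ph / (1 - ph))                     # from equation (10)
for k in range(N + 1):
    via_12 = comb(N, k) * (1 + exp(-beta1))**(-N) * exp(-beta1 * k)
    direct = comb(N, k) * ph**k * (1 - ph)**(N - k)
    print(k, via_12, direct)                    # the two columns agree
```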
nite" number of servers (effectively equal to N). Both the
interarrival times for each customer and the service times
for each server are exponentially distributed. Customers
arrive for service at an average rate A per customer and are
served at the average rate /L per server (1//L is the average
time it takes to complete the service of a customer, after
which the customer returns to the state "not being served;"
1/A is the average time it then takes for that customer to
return to the state "being served"). A server is always
available to begin service immediately on any customer who
needs it. Recall that the specification of only A and /L is
equivalent, from the maximum entropy viewpoint, to the
statement that the interarrival and service time distributions
are exponential.
In order to apply our model system, we let each system
component represent one of the N customers, each of which
can be in the microstate of interest "being served" or in the
other microstate "not being served." Thus, the 2N element
set {St} represents all possible states of the MJMJooIIN
queueing system. The function neSt) is then the number of
customers being served given that the system is in state Si
and the expectation (n) may be interpreted as a time average. Conventionally (Reference 18, p. 90), p(k) is defined
to be the t~oo equilibrium limit of a time dependent probability P (k, t). In this context, the probability PI is the fraction
of time that a given customer spends "being served" in a
system that has reached equilibrium. The relationship between <n) and the parameters of interest (A,/L,N) is obtained
by noting that the following equality must hold at equilibrium:
(15)
PI/L= (1- PI)A.
Hence, using (14) and (10), we have
(n)/N
1-(n)/N
-~1
e
(16)
Substitution of (16) into (12) yields the following result for
the probability that k customers are being served:
P(k)=(
~) ( 1+ ~) -N(~) k.
(17)
This is the classical answer (Reference 18, pp. 107-108).
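A quick numerical check of (17), with illustrative parameter values that are not from the paper: (17) is a binomial with per-customer probability (λ/μ)/(1 + λ/μ), so it should sum to one and have mean N(λ/μ)/(1 + λ/μ), the equilibrium ⟨n⟩ implied by (15).

```python
# Minimal sketch (not from the paper): sanity checks on equation (17).
from math import comb

N, lam, mu = 20, 0.5, 2.0                        # illustrative values
r = lam / mu
pk = [comb(N, k) * (1 + r)**(-N) * r**k for k in range(N + 1)]
print(sum(pk))                                   # -> 1.0 (normalization)
print(sum(k * p for k, p in enumerate(pk)))      # mean number being served
print(N * r / (1 + r))                           # equilibrium <n> from (15)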
For the M/M/∞ system, the situation is the same except that the limit N→∞ is taken and the arrival rate λ is a total arrival rate (as opposed to an arrival rate per customer) independent of the number of customers in service. In this case, the equilibrium condition is

\mu \langle n \rangle = \lambda,   (18)

which, when combined with (13), yields the classical M/M/∞ result (Reference 18, p. 101):

p(k) = e^{-\lambda/\mu}\, \frac{(\lambda/\mu)^k}{k!}.   (19)

Based on work by Benes,3 a maximum entropy analysis somewhat similar to the foregoing was made by Ferdinand8 for the machine servicing problem (M/M/1//N) at equilibrium. In that derivation, however, the relationship between the Lagrangian multiplier and the parameters of interest was obtained by assumption and analogy, rather than by derivation. Specifically, for a system in a state S_i with n(S_i) = k, Ferdinand assumed a state "energy" E(k) = -k log(λ/μ) and was thereby able to eliminate the Lagrangian multiplier by analogy with statistical mechanics. In obtaining (17) and (19), on the other hand, we eliminated the Lagrangian multiplier by means of (10) and the equilibrium conditions (15) and (18).
TIME DEPENDENT SOLUTIONS
The ⟨n⟩-constraint equation (8) can represent knowledge of ⟨n⟩ at any particular time; the maximum entropy solutions P(S_i) or p(k) then apply to that time. It was only in the last section that we chose a time (t→∞) that corresponded, in the queueing systems, to equilibrium values of ⟨n⟩. Time-dependent knowledge ⟨n;t⟩ of the constraint (8) can therefore be substituted meaningfully into (12) or (13) in order to obtain time-dependent probabilities p(k,t). For example, consider a pure birth process in which customers arrive at a constant, total rate λ from an infinite (N→∞) customer population. Once having arrived, a customer stays forever. The expected number of customers having arrived therefore grows constantly at the rate λ:

\frac{d\langle n \rangle}{dt} = \lambda.   (20)

By substituting the solution of (20) into Eq. (13), and assuming that no customers have yet arrived at t = 0, we obtain

p(k,t) = e^{-\lambda t}\, \frac{(\lambda t)^k}{k!},

which is the classical result (Reference 18, p. 60).
In the last section, we transformed the general solution (12) for p(k) into specific solutions by relating ⟨n⟩ or the Lagrangian multiplier β₁ to the physical constraints λ, μ, and N. We did so by writing down the equilibrium conditions (15) and (18), which we repeat here in more explicit form:

\frac{d\langle n \rangle}{dt} = -\mu \langle n \rangle + \lambda (N - \langle n \rangle) = 0 \quad (M/M/\infty//N)   (21)

\frac{d\langle n \rangle}{dt} = -\mu \langle n \rangle + \lambda = 0 \quad (M/M/\infty)   (22)

Solving these equations for d⟨n⟩/dt ≠ 0 would yield time-dependent functions for the expectation ⟨n;t⟩. Both (21) and (22) have the form

\frac{d\langle n \rangle}{dt} = a - \varphi \langle n \rangle,

which has the general solution

\langle n; t \rangle = (a/\varphi)(1 - e^{-\varphi t}) + \langle n; 0 \rangle\, e^{-\varphi t}.

For the M/M/∞//N case (21), one has a = λN and φ = μ + λ, so that

\langle n; t \rangle = \frac{\lambda N}{\lambda + \mu}\big(1 - e^{-(\lambda+\mu)t}\big) + \langle n; 0 \rangle\, e^{-(\lambda+\mu)t}.   (23)

By assuming that ⟨n;0⟩ = 0 and substituting (23) into (12), we obtain the following time-dependent solution for the M/M/∞//N queue:

p(k,t) = \binom{N}{k} \left(\frac{\lambda}{\lambda+\mu}\right)^k \big[1 - e^{-(\lambda+\mu)t}\big]^k \left[1 - \frac{\lambda}{\lambda+\mu}\big(1 - e^{-(\lambda+\mu)t}\big)\right]^{N-k}.

This result appears not to be well-known among queueing theorists, although it can be derived from a well-known result in reliability theory (Reference 2, p. 78).
For the M/M/∞ case (22), one has a = λ and φ = μ, so that

\langle n; t \rangle = (\lambda/\mu)(1 - e^{-\mu t}) + \langle n; 0 \rangle\, e^{-\mu t}.   (24)

By assuming that ⟨n;0⟩ = 0 and substituting Eq. (24) into (13), we obtain the following time-dependent solution for the M/M/∞ queue:

p(k,t) = \frac{1}{k!} \left(\frac{\lambda}{\mu}\right)^k \big(1 - e^{-\mu t}\big)^k \exp\left[-\frac{\lambda}{\mu}\big(1 - e^{-\mu t}\big)\right].

This is the classical solution, which can be derived from a slightly more general result given by Saaty (Reference 23, pp. 99-100).
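The transient M/M/∞ solution lends itself to a Monte Carlo check. The sketch below, with illustrative parameters that are not from the paper, draws Poisson arrivals on [0, t] with exponential service times, counts the customers still in service at time t, and compares the empirical distribution against p(k,t) computed from (24) and (13).

```python
# Minimal sketch (not from the paper): Monte Carlo check of the transient
# M/M/infinity result, starting from an empty system.
import numpy as np
from math import factorial

rng = np.random.default_rng(0)
lam, mu, t, runs = 3.0, 1.0, 0.7, 100_000     # illustrative parameters

counts = np.zeros(runs, dtype=int)
for i in range(runs):
    n_arr = rng.poisson(lam * t)              # number of arrivals in [0, t]
    arr = rng.uniform(0.0, t, n_arr)          # arrival epochs, uniform given the count
    svc = rng.exponential(1.0 / mu, n_arr)    # exponential service durations
    counts[i] = np.sum(arr + svc > t)         # customers still in service at time t

m = (lam / mu) * (1.0 - np.exp(-mu * t))      # <n;t> from (24) with <n;0> = 0
for k in range(6):
    analytic = np.exp(-m) * m**k / factorial(k)
    print(k, (counts == k).mean(), analytic)  # empirical vs. p(k,t)
```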
DISCUSSION
In what sense does the maximum entropy distribution for our model system relate to observations made on corresponding real systems? Jaynes14 has shown that the maximum entropy distribution is related to observed frequencies in the following sense: Given the imposed constraints, the maximum entropy distribution can be realized experimentally in overwhelmingly more ways than any other distribution. Thus, if the analysis includes all of the experimentally operative constraints, then the maximum entropy distribution is overwhelmingly the most likely distribution to be observed experimentally.

In general, one can consider experimental confirmation of a maximum entropy distribution as evidence that the operative physical constraints have been accounted for properly in the maximum entropy analysis. Conversely, major discrepancies between an experimentally observed distribution and a corresponding maximum entropy distribution are evidence that important physical constraints have been overlooked.14 When a maximum entropy analysis does not include all of the operative physical constraints, the maximum entropy distribution may still be a useful approximation. The accuracy of the approximation will depend on the importance of the ignored constraints.

In the case of the queueing formulas derived in this paper, we should expect them to apply when a real system's states correspond to those of the model ({S_i}) and when the system's dynamics are most strongly dependent on the constrained mean value ⟨n⟩ determined by the stated equilibrium conditions. If these criteria are satisfied, then the system need not satisfy other assumptions that may be made in classical derivations of the queueing formulas. The maximum entropy viewpoint may therefore explain the surprisingly widespread applicability of queueing theory results.
ACKNOWLEDGMENTS
I thank W. S. Ament, E. Freeman, and D. Kaplan for helpful
discussions.
REFERENCES
1. Ables, J. G., "Maximum Entropy Spectral Analysis," Astron. Astrophys. Suppl., 15, 1974, pp. 383-393.
2. Barlow, R. E. and F. Proschan, Mathematical Theory of Reliability, John Wiley, New York, 1975.
3. Benes, V. E., Mathematical Theory of Connecting Networks and Telephone Traffic, Academic Press, New York, 1965.
4. Buzen, J. P., "Operational Analysis: The Key to the New Generation of Performance Prediction Tools," Proceedings COMPCON 76, IEEE Computer Society, Washington, D.C., Sept. 1976, pp. 166-171.
5. Chan, M., "System Simulation and Maximum Entropy," Operations Research, 19, 1971, pp. 1751-1753.
6. Chen, P. P., "Queueing Network Model of Interactive Computing Systems," Proc. IEEE, 63, June 1975, pp. 954-957.
7. Feller, W., An Introduction to Probability Theory and Its Applications, Vol. I, Wiley, New York, 1957.
8. Ferdinand, A. E., "A Statistical Mechanics Approach to Systems Analysis," IBM J. Res. Develop., 1970, pp. 539-547.
9. Good, I. J., "Maximum Entropy for Hypothesis Formulation, Especially for Multidimensional Contingency Tables," Annals Math. Stat., 34, 1963, pp. 911-934.
10. Jaynes, E. T., "Information Theory and Statistical Mechanics I," Phys. Rev., 106, 1957, pp. 620-630.
11. Jaynes, E. T., "Information Theory and Statistical Mechanics II," Phys. Rev., 108, 1957, pp. 171-190.
12. Jaynes, E. T., "Probability Theory in Science and Engineering," Field Research Laboratory, Socony Mobil Oil Company, Inc., Colloquium Lectures in Pure and Applied Science No. 4, 1958.
13. Jaynes, E. T., "New Engineering Applications of Information Theory," Proceedings of the First Symposium on Engineering Applications of Random Function Theory and Probability, (Bogdanoff, J. L., Ed.), John Wiley, New York, 1963, pp. 163-203.
14. Jaynes, E. T., "Prior Probabilities," IEEE Trans. on Systems Science and Cybernetics, SSC-4, 1968, pp. 227-241.
15. Jaynes, E. T., Probability Theory in Science and Engineering, unpublished lecture notes (available from Physics Department, Washington Univ., St. Louis, Mo.), 1972.
16. Katz, A., Principles of Statistical Mechanics-The Information Theory Approach, W. H. Freeman Company, New York, 1967.
17. Kashyap, R. L., "Prior Probability and Uncertainty," IEEE Trans. Infor. Theory, IT-17, pp. 641-650.
18. Kleinrock, L., Queueing Systems Vol. I: Theory, John Wiley, New York, 1975.
19. Kleinrock, L., Queueing Systems Vol. II: Computer Applications, John Wiley, New York, 1976.
20. Kullback, S., Information Theory and Statistics, Dover, New York, 1968.
21. Muntz, R. R., "Analytic Modeling of Interactive Systems," Proc. IEEE, 63, June 1975, pp. 946-953.
22. Noonan, J. P., N. S. Tzannes, and T. Costello, "On the Inverse Problem of Entropy Maximizations," IEEE Trans. Inform. Theory, IT-22, 1976, pp. 120-123.
23. Saaty, T. L., Elements of Queueing Theory, McGraw-Hill, New York, 1961.
24. Shannon, C. E., "A Mathematical Theory of Communication," Bell System Tech. Jour., 27, 1948, pp. 379-423.
25. Shannon, C. E., "A Mathematical Theory of Communication," Bell System Tech. Jour., 27, 1948, pp. 623-656.
26. Shore, J. E., Derivation of Equilibrium and Time-Dependent Solutions to M/M/∞//N and M/M/∞ Queueing Systems Using Entropy Maximization, Naval Research Laboratory Memorandum Report 3233, 1976.
27. Tribus, M., "The Use of the Maximum Entropy Estimate in the Estimation of Reliability," Recent Developments in Information and Decision Processes, (Machol, R. E., and Gray, D., Eds.), Macmillan, New York, 1962, pp. 102-139.
28. Tribus, M., Rational Descriptions, Decisions and Designs, Pergamon Press, New York, 1969.
29. Tribus, M. and G. Fitts, "The Widget Problem Revisited," IEEE Trans. on Systems Science and Cybernetics, SSC-4, 1968, pp. 241-248.
30. Wernecke, S. J., and L. R. D'Addario, "Maximum Entropy Image Reconstruction," IEEE Trans. on Computers, C-26, 1977, pp. 351-364.