Multiuser Throughput Dynamics under Proportional Fair and other

Multiuser Throughput Dynamics
under Proportional Fair and other
Gradient Based Scheduling Algorithms
Alexander L. Stolyar
Bell Labs, Lucent Technologies
Murray Hill, NJ 07974
[email protected]
Abstract
We consider the model where N queues (users) are served in discrete time by a
The switch state is random, and it determines the set of possible
service rate choices (scheduling decisions) in each time slot. This model is primarily
motivated by the problem of scheduling transmissions of N data users over a shared
time-varying wireless environment, but also includes other applications such as inputqueued cross-bar switches and parallel exible server systems.
The objective is to nd a scheduling strategy maximizing a concave utility function
H (u1 ; : : : ; uN ), where un 's are long-term average service rates (data throughputs) of
the users, assuming users always have data to be served.
We prove asymptotic optimality of the Gradient scheduling algorithm (which generalizes the well known Proportional Fair algorithm) for our model which, in particular,
allows for simultaneous service of multiple users and for discrete sets of scheduling decisions. Analysis of the transient dynamics of user throughputs is the key part of this
work.
generalized switch.
Scheduling, Optimal Throughput Allocation, Proportional Fair,
Gradient Algorithm, Wireless, Generalized Switch
Key words and phrases:
1
Introduction
We consider the model where N queues (users) are served in discrete time by a generalized
switch. The switch state is a random ergodic process, and it determines the set of possible
service rate choices (scheduling decisions) in each time slot. This model is primarily motivated by the problem of scheduling transmissions of N data users over a shared time-varying
1
wireless environment, which naturally arises, for example, in modern wireless data technologies, such as HDR (3G1X EV-DO) [2, 7]. It also includes as special cases other models and
applications such as input-queued cross-bar switches [11, 12], a discrete time version of the
parallel server system [6, 19], and a parallel-processing system [5]. (The specic model we
consider is identical to that of [14]. However, models of this type have a long history, going
back to the work of Tassiulas and Ephremides [15, 16]. See [14] for a further review of the
model history and applications.)
In this paper we consider the \saturated system" where each user has innite amount of data
to be served (transmitted), and we are concerned with optimizing the vector of average service
rates (throughputs) u = (u1; : : : ; uN ), so as to maximize some concave utility function H (u).
In the wireless data scheduling context, P
this problem was rst considered by Tse in [17], where
the Proportional Fair utility function log(un), and the corresponding Proportional Fair
scheduling algorithm (dened below) were introduced for the model where (as in HDR) one
user can be served (transmit) at a time.
The Gradient Algorithm (dened below) is a natural generalization of Proportional Fair
algorithm in that it applies to any concave utility function, and to the systems where multiple
users can be served at a time. The goal of this paper is to formally prove (asymptotic)
optimality of the Gradient Algorithm for our rather general model, which in particular
allows for
(a) simultaneous service of multiple users
and
(b) the set of scheduling decisions to be discrete (as is the case in HDR and other wireless
data technologies).
To describe our main results and connections to previous work, let us rst describe the
model in more detail. (The formal description will be given in Section 3.) Queues (users)
n = 1; : : : ; N , are served by a generalized switch in discrete time t = 0; 1; 2; : : : . Switch
state m = (m(t); t = 0; 1; 2; : : : ) is a random ergodic process. In each state m, the switch
can choose a scheduling decision k from a set K (m). Each decision k has the associated
service rate vector m(k) = (m1 (k); : : : ; mN(k)). This vector represents the amount of data
of each user which will be served (transmitted) in one \time slot" if decision k is chosen.
(We emphasize the fact that our model allows for features (a) and (b) specied earlier.)
As mentioned earlier, we consider the situation when each user always has data to be served,
and we are interested in optimizing the average service rates u = (u1; : : : ; uN ). Namely, we
seek to nd a scheduling algorithm which produces u solving the problem
max H (u)
(1)
subject to
u2V ;
(2)
where H (u) is a strictly concave utility function, and set V above is the system rate region,
i.e. the set of all feasible long-term service rate vectors. It is easily observed that (under some
2
natural non-restrictive assumptions) V is a convex closed bounded set and, consequently,
the solution u to the problem (1)-(2) exists and is unique.
The Gradient Algorithm is dened as follows. If at time t switch is in state m, the algorithm
chooses a (possibly non-unique) decision
k(t) 2 arg max rH (X (t)) m (k)
k
maximizing the dot-product with the gradient of H (X (t)), where the vector X (t) is updated
as follows:
X (t + 1) = (1 )X (t) + m (k) ;
with arbitrary initial value X (0) and with > 0 being a xed (small) parameter. (We
see from this denition that, roughly, vector X (t) represents exponentially smoothed average
service rates.)
The
Fair (PF) algorithm is the Gradient Algorithm with utility function H (u) =
P log(Proportional
un). Also, very often PF algorithm is considered for the \one-at-a-time" case, when
service rate vectors m(k) can have no more than one non-zero component. In other words,
in each time slot only one user can be picked for service, and if user n is picked, it is served at
the rate mn , depending on the switch state m. In this case the PF algorithm is particularly
simple [17, 7]: in each time slot t it serves the user n for which mn =Xn(t) is maximal.
In this paper we prove the following property for our model.
(P1) Asymptotic Optimality of Gradient Algorithm. Let u denote the vector of
expected average service rates in a system with xed parameter > 0. Then, as ! 0,
u ! u :
(The rates u are dened in terms of averages over nite time intervals. Formal denitions
and corresponding results, Theorem 1 and Corollary 1, are presented in Section 7. The
results show that the convergence in (P1) holds in a quite strong sense.)
Establishing property (P1) naturally involves the analysis of the (limiting, continuous time)
trajectories of the process X (t= ), with very small xed > 0. Such trajectories x =
(x(t); t 0), which we call uid sample paths (FSP), satisfy, in particular, the dierential
inclusion
x0 (t) = v (t) x(t); v (t) 2 arg max[rH (x(t))] v :
(3)
v2V
As the key element of proving (P1), we prove the following attraction property of FSPs.
(P2) Attraction Property. The convergence
x(t) ! u ; t ! 1 ;
holds for any FSP x, with any initial state x(0). (In fact we prove uniform convergence on
the initial states x(0) from a bounded set. See Theorem 3 for precise formulation.)
3
The problem of asymptotic optimality of the Gradient Algorithm was studied in the recent
papers [8] and [1]. In particular, in those papers the dierential inclusion (3) was derived
and the attraction property (P2) was proved, under certain constraints which we now discuss
in more detail.
In both [8] and [1] it is required that the arg max in (3) is a single point v, depending on
x(t) continuously. This makes (3) a dierential equation of the form
x0 (t) = f (x(t))
(4)
with continuous f (). In essence, this is the condition that rate region V is strictly convex.
Such condition does not hold, for example, if the set of all possible scheduling decisions is
nite - a situation typical in many applications, including HDR [2, 7].
The proof of convergence (P2) in [8] relies in essential way on a further assumption that the
dynamics described by (4) is cooperative. This roughly means that the vector-function f (u)
is such that if we decrease all components of vector u except one, say ui, then fi(u) will also
decrease. This is another condition on the shape of V . It holds in the \one-at-a-time" case
(when the service rate vectors m(k) can have at most one non-zero component). It however
typically does not hold in important cases when multiple users can be served simultaneously
in one time slot, as in [18] (or in cross-bar switch models).
In [1], property (P2) is proved under the additional assumption that the initial state x(0) 2
V . (In this case H (x(t)) is a natural Lyapunov function, which is no longer the case if x(0)
is outside of V .) This assumption is restrictive for the following two (related) reasons. First,
such form of property (P2) does not imply the asymptotic optimality (P1). Secondly, in
applications, and especially wireless applications, the stationarity of the switch (channel)
state process is only an approximation - the \stationary" distribution of this process is
subject to both slow and sudden changes (for example, due to arrival and departure of
users). Therefore, the ability of an algorithm to bring system to the optimal state from any
initial state is in fact required for algorithm robustness in real applications.
Our proof of property (P2) follows a \geometric" approach (inherited from [14]), which does
not rely on the cooperative dynamics assumption, or on the condition that rate region is
strictly convex, or on the utility function H as the sole Lyapunov function. As a result, we
are able to establish property (P2), and consequently the asymptotic optimality (P1), in the
desired generality, which allows in particular for features (a) and (b) of the model.
We
also note that the proofs of (P2) in [8] and [1] exclude the utility functions of the type
P log(
un), which take value 1 when some un = 0. Due to the wide use of such (somewhat
less \benign") utility functions in applications, we include them in our results. To do this, for
such utility functions, we prove and use additional properties of uid sample paths, beyond
the basic dierential inclusion (3).
As we comment later in the paper, the assumption of strict concavity of utility function H
is not essential. If H is just concave, all our results and proofs hold if we replace the single
optimal solution u of the problem (1)-(2) by the set of optimal solutions (which necessarily
will be convex and compact).
4
To summarize, the main contributions of this paper are as follows:
- The key attraction property (P2) of uid sample paths under Gradient Algorithm is proved
for arbitrary initial state, and moreover, uniformly on the initial states from a bounded
subset;
- Property (P2) is proved under very general (but common in applications) assumptions on
the model and utility function;
- The asymptotic optimality (P1) of the Gradient Algorithm is proved, based on the property
(P2).
Finally, we note that papers [8] and [1] (as well as earlier papers [9, 10]) also consider versions
of the Gradient Algorithm, such that the parameter is not xed, but rather is a decreasing
vanishing function of time t. Such versions of the algorithm are asymptotically optimal in
the sense that long-term average throughputs converge to the exact optimal values u, as
time t ! 1. However, the smaller the initial value (0), the greater the convergence time.
The algorithm with xed (which we study in this paper) is dierent in that it brings user
throughputs into a (small) neighborhood of u. However, as we show, the convergence is
within the time of the order of O(1= ), uniformly on essentially any initial state. (This is
because, in applications, usually there exists some a priori known upper bound a on the
maximum possible service rate of any user at any time. As a result, as long as all Xn(0) a,
the components of X (t) will always stay bounded by a.) This means that the algorithm
is able to adjust to the \environment" changes without resetting the algorithm parameters.
Such robustness of the algorithm with xed is very desirable in applications (as already
mentioned earlier).
The rest of the paper is organized as follows. In Section 2 we introduce basic notations
and conventions used in the paper. We introduce the formal model in Section 3, and in the
following Section 4 dene the system rate region V . The optimization problem for the average
user rates is formulated in Section 5, and the Gradient Scheduling Algorithm is dened in
Section 6. The asymptotic optimality results (Theorem 1, Corollary 1, and Theorem 2),
which formalize and strengthen property (P1), are presented in Section 7. Section 8 contains
the main result on uniform convergence of uid sample paths (Theorem 3), which formalizes
property (P2), and the process level convergence result (Theorem 4). In Section 9 we prove
Theorem 3. Section 10 contains formal denition of uid samlpe paths, and proofs of their
properties. Finally, in Section 11 we prove Theorems 1, 2, and 4. Section 12 contains
concluding remarks.
2
Notations
We will use standard notations R and R+ for the sets of real and real non-negative numbers,
respectively. Corresponding N -times product spaces are denoted RN and R+N . The space
RN is viewed as a standard vector-space, with elements x 2 RN being row-vectors x =
5
(x1 ; : : : ; xN ). The dot-product (scalar product) of x; y 2 RN , is
x y =:
and the norm of x is
Xx y
N
n=1
n n
;
p
kxk =: x x :
Sometimes we use notation
(x; y ) =: kx y k
for the distance between vectors x and y, and notation
(x; V ) =: inf (x; y )
y2V
for the distance between vector x and a set V RN .
Suppose x is an N -dimensional vector with non-negative components xn, and V is a nite
closed subset of RN . Then,
arg max x v
v2V
denotes the subset of vectors v 2 V with the maximum value of x v. We do not exclude
the case when xn = +1 for some n. For this case we use the convention that the product
(+1) c is equal to 1, 0, and +1, if c is negative, zero, and positive, respectively. Thus,
if xn = +1 and V 2 R+N , then any u 2 V with un > 0 belongs to arg maxv2V x v.
For a scalar function h(t) of a real variable t, we use the following notations for the lower
and upper right derivative number:
h(t + dt) h(t)
d+
h(t) = lim inf
;
dt#0
dt
dt
d+
h(t + dt) h(t)
h(t) = lim sup
:
dt
dt
dt#0
The abbreviation u.o.c. means uniform on compact sets convergence of functions.
3
Generalized Switch Model
Generalized switch [14] is the following queueing system. There is a nite set N = f1; 2; : : : ; N g
of queues (and corresponding \customer types") served by a switch. (We will use the same
symbol N for both the set and its cardinality.) In this paper we assume that queues always
have suÆcient \supply" of customers to serve.
The system operates in discrete time t = 0; 1; 2; : : : . By convention, we will identify an
(integer) time t with the unit time interval [t; t + 1), which will sometimes be referred to as
the time slot t.
6
The switch has a nite set of switch states M . In each time slot, the switch is in one of the
states m 2 M ; and the sequence of states m(t); t = 0; 1; 2; : : : , forms an irreducible (nite)
Markov chain with stationary distribution fm; m 2 M g,
X m = 1 :
m > 0; 8m 2 M;
m2M
When switch is in state m 2 M , a nite number of scheduling decisions can be made, which
form a nite set K (m); if a decision k 2 K (m) is chosen at time t, then the \amount"
mn (k) 0 of \customers" of type n 2 N are served and depart the system at time t + 1.
We will denote m(k) =: (m1 (k); : : : ; mN(k)) 2 R+N the corresponding vector of service rates,
and make a non-degeneracy assumption that for any type n 2 N , there exist at least one
state m 2 M and a decision k 2 K (m) such that mn(k) > 0. (The values of mn(k) are real
numbers, not necessarily integers. For exposition purposes, we sometimes refer to mn(k) as
\number of customers.")
We will denote by the maximum of the values of mn(k) over all possible triples (n; m; k),
and by > 0 the smallest strictly positive value of mn(k) over all (n; m; k).
Remark. Formally, our model does not include those in [8, 1], primarily because the set
of switch states M and the sets K (m) of scheduling decisions are nite. However, rst, the
key result of this paper on the uniform convergence of uid sample paths (Theorem 3) is
strictly more general than the corresponding results in [8, 1]. Secondly, the extension of our
results, for example, to the case of ergodic m with arbitrary state space M and compact
(uniformly bounded) sets K (m) (as in [1]) is straightforward. This would require only minor
adjustments in formulations and proofs. We concentrate on a \nite" model to simplify the
exposition. Note also that this nite model is common in applications and is a model not
covered by the results of [8, 1]. (See also remark in the next section.)
4
System Rate Region
Suppose, for each of its states m 2 M , a probability measure
m = (mk ; k 2 K (m)), is
P
xed, which means that mk 0 for all k 2 K (m), and k mk = 1.
Consider
a Static Service Split (SSS) scheduling rule, parameterized by the set of measures
:
= (m ; m 2 M ). When the switch is in state m, the SSS rule chooses one of the scheduling
decisions k 2 K (m) randomly with probability mk . Then, clearly, the long-term service rate
allocated to queue n 2 N is equal to
X m X m(k) :
vn =
mk n
m2M
Thus,
k2K (m)
v =: (v1 ; : : : ; vN ) = v () ;
7
where v() is the function of an SSS rule , dened by the above expressions. (Here and
below, a measure dening an SSS rule is itself called an SSS rule.)
The system rate region V is dened as the set of all service rate vectors v() corresponding
to all possible SSS rules . In other words, V is the set of all feasible long-term service rate
vectors. Obviously, V is a closed bounded convex set (and, moreover, a polyhedron) in R+N ,
as a linear image of the convex polyhedron of possible values of . Rate region V may turn
out to be degenerate (i.e., have dimension less than N ).
Remark. The fact that V is a polyhedron is a consequence of the fact that sets of switch
states M and scheduling decisions K (m) in each state m are nite. However, our analysis
only requires that V is convex - the polyhedral shape of V is not used. Moreover, the fact
that V is allowed to be non-strictly convex is a generality not covered in the previous work
([8, 1]), because it causes uid sample paths to be generally non-smooth.
5
Optimization Problem for the Rate Allocation
Consider the following optimization problem:
max H (u)
(5)
subject to
u2V ;
(6)
where H (u) is a strictly concave utility function. We will consider two types of utility
function, dened as follows.
Type (I) Utility Function. H (u) is continuous stictly concave function on R+N . Moreover,
H (u) is continuously dierentiable, i.e. the gradient rH is nite and continuous everywhere
in R+N .
P
Type (II) Utility Function. H (u) = n Hn (un ) where each Hn (un ) is strictly concave
continuously dierentiable function, dened for all un > 0, and such that Hn(un) # 1 as
un # 0. (The denition implies Hn0 (un ) " +1 as un # 0. We adopt the conventions that
Hn (0) = 1 and Hn0 (0) = +1.)
Remark 1. Type (I) utility function is more \benign." The system dynamics under the
Gradient algorithm with such utility function is rather simple. Type (II) function has additive form, but the components are not bounded and have innite derivative
at 0, which
P
somewhat complicates analysis. The Proportional Fair utility function n log(un), widely
used in applications, is of type (II). This is one of the reasons why we do not want to exclude such functions from our analysis. We also note that all our results hold as is for more
general type (II) functions, where some of the components Hn are allowed to be (more \benign") nite and continuously dierentiable everywhere on R+, including 0. Extensions of
the analysis to this case will be transparent - we do not do it here to simplify the exposition.
8
Proposition 1 Suppose the utility function H is of either type (I) or type (II). Then, solution u to the problem (5)-(6) exists and is unique.
Proof is trivial for type (I) utility function H , since V is a bounded convex compact set and
H is strictly concave. For the type (II) function H , its domain needs to be rst restricted to
a compact set V \ [; 1)N , with > 0 small enough so that the supremum of H is certainly
attained within this set.
Our denition of the utility function does not require that H is non-decreasing
on each argument. So, the optimal point u is not necessarily a point on the outer (\northeast") boundary of V . Although, in many, if not most, applications of interest this would
be the case.
Remark 3. The assumption of strict concavity of utility function H is not essential for the
results of this paper. If H is just concave, the set of optimal solutions to the problem (5)-(6)
is a convex compact set u. Our key results and proofs in Sections 8 and 9 hold virtually
verbatim, with u replaced by u. And so do the asymptotic optimality results in Section 7
(with the exception of the proof of Theorem 1 which would have to be slightly adjusted).
We make the strict concavity assumption to simplify the exposition.
Remark 2.
6
Gradient Scheduling Algorithm
Now, consider the random process describing behavior of the Generalized Switch under the
following scheduling algorithm which is called Gradient Algorithm (GA).
Gradient Algorithm: If at time t switch is in state m, choose a scheduling decision
k(t) 2 arg
max
[
r
H (X (t))] m (k) ;
k2K (m)
where X (t) = (X1 (t); : : : ; XN (t)) is the vector of the average service rate estimates, which
is updated as follows:
X (t + 1) = (1 )X (t) + m(t) (k(t)) ;
with > 0 being a (small) parameter. The initial vector X (0) 2 R+N is arbitrary.
Let us denote
S (t) =: (X (t); m(t)) :
Our assumptions imply that S = fS (t); t = 0; 1; : : : g is a discrete time Markov chain with
the state space R+N M .
9
7
Asymptotic Optimality of Gradient Algorithm
This section presents the results showing that, as # 0, the average user service rates
converge to the optimal solution u of problem (5)-(6). Moreover, this convergence is uniform
(up to a time scaling by the factor of 1= ).
First, we need to dene the asymptotic regime. From this point on in the paper, we consider
a sequence of processes S , indexed by the value of parameter , with # 0 along a sequence
B = fj ; j = 1; 2; : : : g such that j > 0 for all j . The initial state S (0) = (X (0); m (0))
is xed for each 2 B. (Here and below, the processes and variables pertaining to a xed
parameter will be supplied the upper index . Expression # 0 means that converges
to 0 along the sequence B, unless otherwise specied.)
The probability law of the Markov chain m () describing the switch state process is the
same for each .
Let us denote by U (l1; l2) the vector of average service rates in the interval [l1 ; l2]. Namely,
for a pair of integers 1 l1 l2,
l
X
1
:
D (j ) ;
U (l1 ; l2 ) =
l2 l1 + 1 j =l
where D (l) = m(l 1) (k(l 1)) the vector of numbers of customers departed at time l (with
k(l 1) 2 K (m(l 1)) being the scheduling decision chosen in slot l 1).
2
1
Theorem 1 Let A be a bounded subset of R+N . Then, for any > 0 there exists T > 0 and
T > 0 (both depending on and A), such that
lim
sup
kEU (l1 ; l2) uk < :
#0 X (0)2A; l1 >T =; l2 l1 >T =
Corollary 1 Suppose for each we consider a stationary version of the process S , in which
case ED (1) = EU (l1 ; l2 ) (for any integer 1 l1 l2 ) is simply the average service rate
vector. Then,
:
lim
ED
(1)
!
u
#0
of Theorem 1 will be presented in Section 11. It is obtained using the following
uniform convergence in probability result for X .
Proof
Theorem 2 Let A be a bounded subset of R+N . Then, for any >
(depending on and A) such that
lim sup P fkX (t) uk > g = 0 :
#0 X (0)2A; t>T=
Proof.
Presented in Section 11.
10
0 there exists T
>
0
8
System Dynamics when
is Small
We consider the sequence of systems introduced in the previous section, with parameter
# 0. Let us extend the denition of X (t) to the continuous time t 2 R+ , by adopting a
convention that X (t) is constant within each time slot [l; l + 1). (We use this convention
throughout the paper, for all other discrete time processes, which are originally dened on
the integers.) Then, for each we can consider the scaled process
x (t) =: X (t= ); t 2 R+ :
In this section we formulate our main results, describing asymptotic behavior of the sequence
of processes fx g.
To this end, rst consider the family of uid sample paths (FSP) x = (x(t); t 2 R+ ), which
arise as possible limits of sample paths of the processes x as # 0 (given that the sample
paths of the switch state process m satisfy the law of large numbers condition). The formal
construction of the family of FSPs is presented in Section 10, and it will be proved (in
Lemmas 2 and 3) that this family satises the following properties (i)-(iii).
(i) \Basic dynamics." Each FSP x is a Lipschitz continuous function in [0; 1), with Lipschitz
constant C + kx(0)k (where C > 0 is a xed constant depending only on system parameters),
and satisfying the following basic dierential inclusion for almost all t 2 R+:
x0 (t) = v (t) x(t); v (t) 2 arg max[rH (x(t))] v :
(7)
v2V
(Points t for which (7) holds will be called regular.)
(ii) \Boundary condition." Suppose the utility function H is of type (II). Suppose xi (t) = 0
for i 2 B N and xi(t) > 0 for i 2 N n B . Then
X
d+
x (t) c > 0 ;
dt i2B i
where c > 0 is a xed constant depending only on the system parameters.
(iii) \Compactness." If a sequence of FSP x(j) ! x u.o.c. as j ! 1, then this x is also an
FSP.
The following uniform convergence theorem is the key result of this paper.
Theorem 3 Suppose H (x) is a utility function of either type (I) or type (II). Then, uid
sample paths x have the following convergence property. For any bounded subset A R+N ,
x(t) ! u ; t ! 1;
uniformly on x(0) 2 A.
11
Proof is presented in the next section.
It is easy to observe that Theorem 3 is a \deterministic analog" of Theorem 2 of the previous
section. In fact the proof of Theorem 2 follows from Theorem 3 and the following process
convergence result.
Theorem 4 Consider the sequence of processes fx g with # 0, such that x (0) ! x(0),
where x(0) 2 R+N is a xed vector. Then, the sequence fx g is relatively compact and any
weak limit of this sequence (i.e., a process obtained as a weak limit of a subsequence of fx g)
is a process with sample paths being FSPs with probability 1.
The processes x are processes in the Skorohod space DRN [0; 1) of rightcontinuous left-limits (RCLL) functions, taking values in the space RN (with standard topology and Borel -algebra on RN ). (See [4] for the denition.)
Remark 2. Theorem 4 is analogous to the corresponding process convergence results in
[8, 1]. Note however that Theorem 4 is stronger is the sense that it claims that the limiting
process is concentrated on the family of FSPs, which is \narrower" than simply the family
of solutions of the dierential inclusion (7). This subtlety is important because when we
use this theorem, we need all properties of FSPs, not just property (i). This also eects the
proof.
Proof is in Section 11.
Remark 1.
9
Proof of Theorem 3
In this proof, without loss of generality, we assume that the set A is closed, and therefore
compact.
9.1
Uniform Convergence to the Set
V
The following lemma formalizes a simple, but nevertheless important, general observation
that if velocity x0 (t) is always a vector from x(t) to an arbitrary point within a convex
set V , then the distance from x(t) to V can only decrease. Note, that in the dierential
inclusion (8) in Lemma 1 it is only required that v(t) 2 V . (There is no condition v(t) 2
arg max[rH (x(t))] v.)
Lemma 1 Suppose V is a convex bounded closed subset of RN . Suppose a vector-function
(x(t); t 0) is Lipschitz continuous, satisfying the following dierential inclusion for almost
all t 0:
x0 (t) = v (t) x(t);
v (t) 2 V :
(8)
12
Then, (x(t); V ) is a Lipschitz continuous non-increasing function, and moreover, for almost
all t 0,
d
(x(t); V ) (x(t); V ) ;
dt
which implies that
(9)
(x(t); V ) (x(0); V )e t :
As we see, under the conditions of this lemma, functions satisfying (8) converge to set V
uniformly on the initial states from a bounded set. For ease of reference we record this fact
as a Corollary 2 below.
For 0, let us denote by V, the -thickening of a subset V RN , namely
V =: fy 2 RN j (y; V ) g
Corollary 2 Let arbitrary bounded set A RN and arbitrary > 0 be xed. Then there
exists T1 = T1 (; A) such that for any (x(t); t 0) satisfying conditions of Lemma 1 and
x(0) 2 A,
x(t) 2 V; 8t T1 :
Lipschitz continuity of (x(t); V ) follows from the fact that x(t) is
Lipschitz continuous. To prove the rest of the statement, it suÆces to show that (9) holds
for t = , where > 0 is a regular point such that (x(); V ) > 0. Consider the point which is the (unique) point of V closest to x(), i.e., (x(); ) = (x(); V ). (See Figure 1.)
Consider vectors
1 = x(); 2 = v () ;
and
= v () x() = 1 + 2 :
Consider auxiliary linear function
y (t) = + 2 (t ); t ;
and note that y(t) 2 V for all suÆciently small increments (t ) 0. Then, we can write
Proof of Lemma 1.
d
d+
(x(t); V ) (x(t); y (t)) =
dt
dt
k1k = (x(); V ) :
13
v2
v ()
2
y (t)
V
x()
1
v1
Figure 1: Proof of Lemma 1.
9.2
Proof of Theorem 3 for Type (I) Utility Function
This proof uses only the basic dynamics property (i) of the FSPs.
Let us x arbitrary Æ > 0. Then x > 0: small enough so that h < H (u), where h is the
maximum of H (x) over the subset UÆ; = V n OÆ (u). Using concavity and smoothness of
H (x), it is easy to verify that at any point x 2 UÆ; , the directional derivative of H (x) in the
direction of point u is at least (H (u) h )=Æ, namely
(10)
[rH (x)] (u x) H (u Æ) h ku xk H (u) h :
By Corollary 2 (from Lemma 1), there exists T1 such that x(t) 2 V for all t T1 . Then, for
any regular point t T1 such that x 2 UÆ; we have:
d
H (x(t)) = [rH (x(t))] (v (t) x) [rH (x(t))] (u x) H (u ) h :
dt
Since H (u) h > 0, we see that there exists a constant T2 T1 such that (uniformly on
x(0) 2 A),
t2 = minft T1 j x(t) 2 O Æ (u )g T2 ;
i.e. x(t) must \hit" the closed neighborhood OÆ (u) no later than at time T2 .
For all t t2, we must have H (x(t)) h, where h is the minimum of H (x) over x 2
V \ O Æ (u ). (This is because, for t T1 , H (x(t)) can not decrease outside of O Æ (u ).) Since
our Æ > 0 could be chosen arbitrarily small, we observe that (uniformly on x(0) 2 A)
lim
inf H (x(t)) = H (u) :
(11)
t!1
Finally, there exists T3 T2 such that (uniformly on x(0) 2 A) for all t T3
x(t) 2 V \ O Æ (u ) ;
14
because otherwise (11) would not hold.
9.3
Proof of Theorem 3 for Type (II) Utility Function
This proof uses all three properties (i)-(iii) of the FSPs.
As in the proof for type (I) utility function, let us x arbitrary Æ > 0, and then x > :0
small enough so that h < H (u), where h is the maximum of H (x) over the subset UÆ; =
V n OÆ (u ). By Corollary 2 (from Lemma 1), there exist T10 such that x(t) 2 V for all t T10 .
(The above statement, and other statements in this proof, hold uniformly on x(0) 2 A,
unless specied otherwise.)
The rest of the proof consists of a sequence of propositions.
Proposition 2 At any regular point t 0, we have xi (t) > 0 for all i.
Suppose not. Consider the subset B N of types i such that xi(t) = 0. Since t is
a regular point, we must have x0i (P
t) = 0. (Otherwise xi () would be negative just before or
just after time t.) Consequently, i2B x0i(t) = 0. This however contradicts to the property
(ii) of FSPs (proved later in Lemma 3 (ii)).
Proof.
Proposition 3 For all t > T10 , we have xi (t) > 0 for all i.
We can choose a regular point t0 > T10 , arbitrarily close to T10 . We have H (x(t0)) >
1, because all xi (t01) > 0. Similarly to the way it is established in the proof for type
(I) utility function, we see that H (x(t)) has strictly positive derivative at any regular point
t 2 [T10 ; 1) such that H (x(t)) > 1 and x(t) 2 UÆ;. This implies that H (x(t)) > 1 for
all t t01.
Proof.
Proposition 4 For any T1 > T10 , there exists 1 > 0 such that we have xi (T1 ) 1 for all i.
Suppose not. Then there exists a sequence of FSPs (with initial states within the
compact set A) converging u.o.c. to a function x = (x(t); t 0) such that for at least one
i, xi (T1 ) = 0. By property (iii) of FSPs (proved later in Lemma 3 (iii)), this function x is
also an FSP with x(0) 2 A. This contradicts to Proposition 3.
Proof.
Proposition 5 For any T1 > T10 , there exists > 0 such that we have xi (t) for all i
and all t T1 .
15
Let us x arbitrary T1 > T10 and choose a small 1 > 0 such that the statement of
Proposition 4 holds. Without loss of generality we can assume that 1 is small enough so
that
H (1 ; : : : ; 1 ) < min
H (x) :
x2O Æ
Then, again using the fact that H (x(t)) has strictly positive derivative at any regular point
t 2 [T10 ; 1) such that H (x(t)) > 1 and x(t) 2 UÆ;, it is easy to see that
H (x(t)) H (1 ; : : : ; 1 ); 8t T1 :
This easily implies that all xi(t) must stay separated from 0 in [T1 ; 1).
Proof.
Thus we have proved that there exists T1 > 0 such that for all t T1 , x(t) 2 V; =: V \fxi ; 8ig. Note that H (x) is bounded on V; . Given this fact, the rest of the proof is virtually
identical to the proof for the type (I) utility function. (Namely, the choice of constants T2
and T3, and the proofs of the corresponding properties repeat those in the proof for type (I)
utility function verbatim.)
Proof of Theorem 3 for type (II) function H is complete.
10
Fluid Sample Paths under the Gradient Algorithm
Let us dene some additional random functions associated with the system for each value
of the scaling parameter . These functions will be dened for a continuous time t. (Recall
our convention that functions originally dened on the integers l are assumed to be constant
within corresponding time slots [l; l + 1).)
Let
btc
^Fn (t) =: X Dn (l)
l=1
denote the number of type n customers that were served by time t 0 (i.e. in the interval
[0; t]), where Dn (l) = mn (l 1) (k(l 1)) is the number of type n customers served in slot l 1
(with k(l 1) 2 K (m(l 1)) being the scheduling decision chosen in slot l 1). Denote by
Gm (t) the total number of time slots by (and including) time t 1, when the server was in
state m; and by G^ mk (t) the number of time slots by (and including) time t 1 when the
server state was m and the scheduling decision k 2 K (m) was chosen.
Obviously, for any n 2 N , m 2 M , and k 2 K (m), we have:
F^n (0) = 0; Gm (0) = 0; G^ mk (0) = 0 ;
and we have the following relations:
X G^ (t); t 0;
Gm (t) =
mk
k2K (m)
16
F^n (t) =
X X
m2M k2K (m)
mn (k)G^ mk (t); t 0:
Let Xn (t) 0 is the value of variable Xn at time t. Recall that for all integer l 1,
Xn (l) = Dn (l) + (1 )Xn (l 1) :
(12)
Also, recall that the probability law of the Markov chain m () describing the switch state
process is the same for each .
We dene the process Z , describing system evolution:
Z = (X ; F^ ; G ; G^ ) ;
where
X = (X (t) = (X1 (t); : : : ; XN (t)); t 0);
F^ = (F^ (t) = (F^1 (t); : : : ; F^N (t)); t 0);
G = ((Gm (t); m 2 M ); t 0) ;
G^ = ((G^ mk (t); m 2 M; k 2 K (m)); t 0):
In this section we study the sequence of processes Z under uid limit scaling. To be
precise, we will only need to consider sample paths of the processes under this scaling, and
their limits.
For each consider the scaled process
z = (x ; f^ ; g ; g^ ) ;
where x (t) =: X (t= ) is obtained: from X by time scaling only, and the other components
by time and space scaling, f^ (t) = F^ (t= ) and similarly for g and g^ .
Thus, the component functions of z are piece-wise constant, with a \time slot" of the length
1= .
^ g; g^) we will call a uid sample path (FSP) if
Denition. A xed set of functions z = (x; f;
there exists a sequence B0 of positive values of , such that # 0, and a sequence of sample
paths (of the corresponding processes) fz g such that ( as # 0 along sequence B0 )
z ! z; u:o:c: ;
and in addition
kx(0)k < 1 ;
(gm (t); t 0) ! (mt; t 0)
17
u:o:c: ; 8m 2 M:
(13)
It follows directly from the above denition that any FSP satises the following properties:
gm (t) = m t ; t 0; m 2 M ;
X g^ (t); t 0; m 2 M ;
gm (t) =
mk
2
X
X
f^(t) =
k K (m)
m2M k2K (m)
m (k)^gmk (t); t 0:
A sequence B0 whose existence is required in the above denition may be completely unrelated to the sequence B we introduced earlier.
The following two lemmas establish some properties of uid sample paths. We note that
these properties may not characterize the family of FSPs completely, but they will suÆce to
establish the desired convergence properties.
Remark.
Lemma 2 For any uid sample path z , all its component functions are Lipschitz continuous
in [0; 1), with the Lipschitz constant upper bounded by C + kx(0)k, where C > 0 is a xed
constant depending only on the system parameters.
Consider a xed FSP z and a sequence of paths z \dening it." For each , Z is the \unscaled" path from which z is obtained. To prove Lipschitz property of z, let us
recall the Xn () update equation (12):
Xn (l) = Dn (l) + (1 )Xn (l 1) :
(14)
Applying this equation iteratively for l = 1; 2; : : : , we obtain:
Xn (l) = X~ n (l) + (1 )l Xn (0) ;
(15)
where
Proof.
X (1
X~ n (l) =:
l 1
)j Dn (l
j =0
We see that for any l 1
X~ n (l) = max mn (k) ;
n;m;k
j) :
(16)
(17)
because X~n (l) is simply the weighted sum of the values of Dn (0); : : : ; Dn (l) (with positive
weights summing up to at most 1), and each of those values is of course bounded above by
. We can notice (for future reference) that expressions (15)-(17) imply that
Xn (l) maxf; Xn (0)g; 8l 0 :
(18)
18
Let us rewrite (14) as follows:
Xn (l) Xn (l 1) = Dn (l) Xn (l 1) :
(19)
We see that
jXn (l) Xn (l 1)j (Dn (l) + X~n (l 1) + Xn (0)) (2 + Xn (0))) : (20)
All other components of the process Z are non-decreasing and we trivially have for all
integer l 1 (and any n, m, k):
F^n (l) F^n (l 1) ;
Gm (l) Gm (l 1) 1 ;
G^ mk (l) G^ mk (l 1) 1 :
These relations along with (20) imply that all components of an FSP z are Lipschitz in [0; 1)
with the specied bound on the Lipschitz constant.
Since all component functions of an FSP are Lipschitz, they are absolutely continuous, and
therefore almost all points t 2 R+ (with respect to Lebesgue measure) are such that all
component functions of z have derivatives. We will call such points regular. In the rest of
the paper, when we write an expression containing derivatives of FSP components at time
t, we always mean that it holds under the additional condition that t is regular, even if we
do not state this condition explicitly.
Lemma 3 The family of FSPs z satises the following additional properties:
(i) For almost all t 0 (with respect to Lebesgue measure) we have:
x0 (t) = v (t) x(t);
where
v (t) =: (d=dt)f^(t) =
X X
m2M k2K (m)
0 (t)m (k)
g^mk
(21)
(22)
satises the condition
v (t) 2 arg max(rH (x(t))) v :
v 2V
(23)
(ii) Suppose the utility function H is of type (II). Suppose, for some subset B N , xi (t) = 0
if i 2 B and xi (t) > 0 if i 2 N n B . Then
X
d+
x (t) c > 0 ;
dt i2B i
where c > 0 is a xed constant depending only on the system parameters.
(iii) If a sequence of FSPs z (j ) ! z u.o.c. as j ! 1, then this z is also an FSP.
19
As in the proof of Lemma 3, consider a xed FSP z and a sequence of paths z
\dening it," along with their \unscaled" versions Z .
To prove properties (i) and (ii) let us rst derive the basic integral equation for x(). (Equation (26) below.) The derivation of this equation is same as in [8, 1] - we present it for
completeness. Let us sum up equations (19) for l = 1; : : : ; j . We obtain
Proof.
Xn
(j )
Xn
X
X X (l
(0) = D (l)
j
j
n
l=1
l=1
n
1) :
(24)
Switching to scaled processes gives, for all integer j 0,
xn
(j )
xn
(0) = f^n (j )
Z
l
0
xn ( )d :
(25)
Using the fact that all limiting functions xn and f^n are Lipschitz continuous and we have
u.o.c. convergence z ! z, we nally obtain the desired integral equation
xn (t) xn (0) = f^n (t)
Z
0
t
xn ( )d ; t 2 R+ :
(26)
Since both x() and f^() are Lipschitz continuous, the equation (21) (for every regular point
t 0) follows.
Now, let us prove property (ii). (After that we will prove (23), i.e., the rest of property (i)).
Suppose B N . (The case B = N is simpler. The proof for this case is a simplied version
of the proof for B N .) We can make the following observation, which follows from the
continuity of x() and properties of the component utility functions Hi().
There exist suÆciently small xed > 0 and > 1 (both depending on the values xn (t) for
n 2 N n B ) such that for any 2 [t; t + ], any i 2 B and any n 2 N n B ,
Hi0 (xi ( )) > 0
and
Hi0 (xi ( ))
N :
jHn0 (xn( ))j
(We remind that denotes the smallest positive value of mn (k) over all n; m; k.)
This means that if we consider the \unscaled" paths Z , then for all suÆciently small and any (positive integer) time slot l within interval [t=; t= + = ] we have the following
property:
if m = m(l 1) is such that m
i (k ) > 0 for at least one pair of k 2 K (m) and i 2 B , then
X D(l) :
i
i2B
20
Let us pick a state m such that mi(k) > 0 for at least one i 2 B . (Such a state exists.) By
the denition of a uid sample path, the fraction of time in the interval [t=; t= += ] the
switch state is m converges to its stationary probability m. This easily implies
d+ X
d X^
xi (t) = +
fi (t) m > 0 :
dt
dt
i2B
i2B
Since there is only a nite number of subsets B N , the property (ii) has been proved.
To prove (23), we rst notice that in the case of type (II) utility function, at any regular
point t, we must have xn (t) > 0 for all n 2 N . Otherwise (since t is regular) we would
have x0n(t) = 0 for any n with xn(t) = 0, which is impossible due to property (ii). Thus, if
t is regular, then (for any utility function type), the gradient rH (x()) is continuous in a
neighborhood of point t. Given this, the proof of (23) is analogous to the proof of Lemma
4(ii) in [14]. Namely, due to the continuity of the gradient rH (x()) at time t, the following
observation is true.
For any > 0, there exists a suÆsiently small > 0, such that for all \unscaled" paths Z ,
with suÆciently small , we have the following property. For any (positive integer) time slot
l within interval [t= =; t= + = ],
krH (X (l 1)) D (l) k2max
rH (x(t)) m(l 1) (k)k :
m(l 1)
This easily implies that
krH (x(t)) v(t) max
rH (x(t)) vk ;
v 2V
and since can be arbitrarily small, rH (x(t)) v(t) = maxv2V rH (x(t)) v, which completes
the proof of (23) and with it the proof of property (i).
The compactness property (iii) of the family of FSPs is analogous to the compactness properties described in Lemmas 5.1-5.3 in [13]. It follows directly from the construction of an
FSP as a limit. Namely, for each FSP z(j) consider a \dening" sequence z(j); ; ! 0; such
that z(j); ! z(j) u.o.c. For each j , consider z(j);j which is within distance j > 0 from z(j)
in the uniform metric in the interval [0; j ], where j # 0. Then, it is easy to see that the
sequence z(j);j denes FSP z.
11
11.1
Proofs of Theorems 4, 2, and 1
Proof of Theorem 4
It suÆces to prove a more general fact that any weak limit of the sequence of processes z
is a process z concentrated on FSPs with probability 1. (In this case all processes are in the
21
Skorohod space DRL [0; 1), where L is the total number of scalar component functions of
z .)
Each sequence fgm g satises the law of large numbers, namely
gm (t2 ) gm (t1 ) ! m (t2 t1 ) in probability 8 0 t1 t2 ;
(27)
and the sequence fz g is \asymptotically Lipschitz", namely,
P fkz (t2 ) z (t1 )k C (t2 t1 ) + g ! 1; 8 0 t1 t2 ; > 0 ;
for some xed C > 0.
Using the above two facts, the proof is completely analogous to the proof of Theorem 7.1 in
[13].
We note that, alternatively, the proof can also be obtained more directly using Skorohod representation (see [4]). Namely, asymptotic Lipschitz property implies that the sequence fz g
is relatively compact. Then, any converging subsequence can be constructed on a probability
space such that the convergence is with probability 1. Then, since property (27) holds, the
property (13) holds with probability 1. This, by the denition of an FSP, implies that z
converges to an FSP with probability 1.
11.2
Proof of Theorem 2
First of all, let us choose a > 0 to be xed and large enough so that a > , V [0; a]N , and
A [0; a]N . Then, without loss of generality, it will suÆce to prove the theorem for the set
A rechosen to be A = [0; a]N .
For a given > 0 and Æ > 0, by Theorem 3, we can choose T > 0 (depending on A and ),
so that for any FSP with x(0) 2 A, we have
sup kx(T ) uk =2 :
t2[T;T +Æ]
Using this, Theorem 4, and the continuous mapping theorem (see [3]), it is easy to obtain
lim
sup P f sup kx (t) uk > g = 0 :
(28)
#0 x (0)2A
t2[T;T +Æ]
Note, however that (by (18)), xn(t) remains bounded by the maximum of and xn (0) for
all t. This means that x (t) 2 A for all t and all . Applying (28) to the processes restarted
at times 0, we can rewrite (28) in a stronger form
lim
sup sup
P f sup
kx (t) uk > g = 0 ;
#0 0
x (0)2A
which implies the desired result.
t2[ +T; +T +Æ]
22
11.3
Proof of Theorem 1
Both U (l1 ; l2) and X (l2) are dierently computed averages of the values of D (l): the
former one is the average over l1 l l2 with equal weights 1=(l2 l1 + 1), and the latter
one is (roughly) the average of D (l2 j ) with \exponential" weights (1 )j . We know
from Theorem 2 that EX (l) is close to u for large l, and therefore, when l1 is large,
X EX (l)
1
(29)
l2 l1 + 1 l ll
is close to u, uniformly on l2 l1 . In this proof, we show that, when both l1 and l2 l1 are
suÆciently large, the expected value EU (l1 ; l2) is close to (29).
Let us choose a > 0 the same way as in the proof of Theorem 2. So, without loss of generality
we can assume that A = [0; a]N . Then, again as observed in: the
p proof of Theorem 2, for all
l 0 and all , X (l) 2 A, and consequently kX (l)k b = Na.
Let us x (small) > 0. Theorem 2 along with the fact that the values of X (l) are uniformly
bounded implies that we can choose T > 0 such that for all suÆciently small , uniformly
on l T= , we have
kEX (l) uk < :
(30)
Without loss of generality, we can always choose T to be large enough so that e T is arbitrarily small. We note that
X (1 )j = e T + O() ;
1
2
j T =
where (here and below) O( ) denotes a term with absolute value bounded above by C as
! 0, with C > 0 being a xed constant.
In what follows, l1 , l2 are functions of , such
that l1 T= , l2 l1 T 0= , and T 0 > T .
:
Also, we will use a simplied notation d(p) = ED (p). Then, we can write
X EX (l) =
1
Q =:
(31)
l2 l1 + 1 l ll
1
1
X X (1
l2
T 0 =
l 1
l=l1 j =0
2
)j d(l
j ) + O( ) =
(32)
Q1 + Q2 + Q3 + O( ) ;
where Q1 , Q1 , and Q3 , break down the summation in (32), into the summations over
X X
l2
l=l1
0j<l l1
;
X
X
l1 l<l1 +T=
l l1 j l 1
23
;
X
X
l1 +T= ll2
l l1 j l 1
;
respectively. Since we have kd(p)k b, we obtain the following estimates
kQ2k bT=T 0 + O( ) ; kQ3k be T (T 0 T )=T 0 + O( ) :
For Q1, by changing summation indexes from (l; j ) to (p = l
l lXp
X
1
Q1 = 0
(1
T =
2
2
p=l1 j =0
1
1
XX
(1
l2
where
)j d(p) =
)j d(p) Q4
T 0 = p=l1 j =0
EU (l1 ; l2 ) + O( ) Q4
1
X
1
X
Q4 = 0
T = l1 p<l2
Q5 = 0
T = l2
T=
T= pl2
X (1
j>l2 p
X (1
j>l2 p
j; j ), we can write:
Q5 =
Q5 ;
)j d(p) ;
)j d(p) ;
kQ4 k be T (T 0 T )=T 0 + O( ) ; kQ5k bT=T 0 + O( ) :
We see that
kQ EU (l1; l2)k kQ2k + kQ3 k + kQ4k + kQ5 k + O( ) 2be T (T 0 T )=T 0 +2bT=T 0 + O( ) :
Thus, if we x T large enough so that e T is small, and then choose T such that T=T is
small, then we have kQ EU (l1; l2 )k for all suÆciently small , uniformly on T 0 > T .
Since we also know that kQ uk , the result follows.
12
Conclusions
In this paper we consider the problem of optimal throughput allocation to multiple users
in a generalized switch model, with the objective of maximizing a concave utility function.
We prove asymptotic optimality of the Gradient scheduling algorithm. Our model is general
enough to cover a wide class of applications of interest, including those not covered in the
previous work. Most notably, the model allows multiple users to be served simultaneously
and allows discrete sets of scheduling decisions. We also allow very general utility functions.
The analysis of transient dynamics of user throughputs under Gradient algorithm is the
key problem and the main part of this work. Our \geometric" approach to this problem is
inherited from [14], and may be of independent interest.
24
References
[1] R. Agrawal, V. Subramanian. Optimality of Certain Channel Aware Scheduling Policies. In Proc. of the 40th Annual Allerton Conference on Communication, Control, and
Computing. Monticello, Illinois, USA, October 2002.
[2] P.Bender, P.Black, M.Grob, R.Padovani, N.Sindhushayana, A.Viterbi. CDMA/HDR:
A Bandwidth EÆcient High Speed Wireless Data Service for Nomadic Users. IEEE
Communications Magazine, July 2000.
[3] P.Billingsley. Convergence of Probability Measures. John Wiley and Sons, New York,
1968.
[4] S. N. Ethier and T. G. Kurtz. Markov Process: Characterization and Convergence. John
Wiley and Sons, New York, 1986.
[5] N. Gans and G. van Ryzin. Optimal Dynamic Scheduling of a General Class of ParallelProcessing Queueing Systems. Adv. Appl. Prob., Vol. 30, (1998), pp. 1130-1156.
[6] J. M. Harrison. Heavy TraÆc Analysis of a System with Parallel Servers: Asymptotic
Optimality of Discrete Review Policies. Annals of Applied Probability, Vol. 8, (1998), pp.
822-848.
[7] A.Jalali, R.Padovani, and R.Pankaj. Data Throughput of CDMA-HDR, a High EÆciency - High Data Rate Personal Communication Wireless System. In Proc. of the
IEEE Semiannual Vehicular Technology Conference, VTC2000-Spring, Tokyo, Japan,
May 2000.
[8] H. Kushner, P. Whiting. Asymptotic Properties of Proportional Fair Sharing Algorithms.
In Proc. of the 40th Annual Allerton Conference on Communication, Control, and Computing. Monticello, Illinois, USA, October 2002.
[9] X.Liu, E.K.P.Chong, N.B.Shro. Opportunistic Transmission Scheduling with ResourceSharing Constraints in Wireless Networks. IEEE Journal on Selected Areas in Communications, 19(10), 2001.
[10] X.Liu, E.K.P.Chong, N.B.Shro. A Framework for Opportunistic Scheduling in Wireless
Networks. Computer Networks. (To appear.)
[11] N.McKeown, V.Anantharam, and J.Walrand. Achieving 100% Throughput in an InputQueued Switch. Proceedings of the INFOCOM'96, 1996, pp. 296-302.
[12] A.Mekkittikul and N.McKeown. A Starvation Free Algorithm for Achieving 100%
Throughput in an Input-Queued Switch. Proceedings of the ICCCN'96, 1996, pp. 226-231.
[13] A.L. Stolyar. On the Stability of Multiclass Queueing Networks: A Relaxed SuÆcient
Condition via Limiting Fluid Processes. Markov Processes and Related Fields, 1(4), 1995,
pp.491-512.
25
[14] A. L. Stolyar. MaxWeight Scheduling in a Generalized Switch: State Space Collapse and
Equivalent Workload Minimization under Complete Resource Pooling. 2001. (Submitted)
[15] L.Tassiulas, A.Ephremides. Stability Properties of Constrained Queueing Systems and
Scheduling Policies for Maximum Throughput in Multihop Radio Networks. IEEE Transactions on Automatic Control, Vol. 37, (1992), pp. 1936-1948.
[16] L.Tassiulas, A.Ephremides. Dynamic Server Allocation to Parallel Queues with Randomly Varying Connectivity. IEEE Transactions on Information Theory, Vol. 39, (1993),
pp. 466-478.
[17] D. Tse. Multiuser Diversity and Proportionally Fair Scheduling. In preparation.
[18] H. Viswanathan, S. Venkatesan, and H. C. Huang. Downlink Capacity Evaluation
of Cellular Networks with Known Interference Cancellation. 2002. To appear in IEEE
Journal on Selected Areas on Communications.
[19] R.J.Williams. On Dynamic Scheduling of a Parallel Server System with Complete
Resource Pooling. Fields Institute Communications, Vol. 28, American Mathematical
Society, Providence, RI, 2000, pp. 49-71.
26