Towards Feasible Region Calculus: An End-to-end Schedulability Analysis of
Real-Time Multistage Execution
William Hawkins and Tarek Abdelzaher
Department of Computer Science
University of Illinois at Urbana-Champaign
Champaign, IL 61801
e-mail: {whawkin2, zaher}@cs.uiuc.edu ∗
∗ The work reported in this paper was supported in part by the National Science Foundation under grants CCR-0093144, ANI-0105873, and EHS-0208769, and MURI N00014-01-1-0576.
Abstract
Efficient schedulability analysis of aperiodic distributed
task systems has received much less attention in real-time
computing literature than its periodic counterpart. As systems become larger and more complex and as workloads
become less regular, simple aperiodic task analysis techniques are needed to accommodate unpredictability and
scale, while erring on the safe side. This paper presents
a simple analytic framework for computing the end-to-end
feasibility regions of distributed aperiodic task systems under a category of fixed-priority scheduling. It is based on
a simple primitive called the generalized stage delay theorem that expresses the maximum fraction of the end-to-end
deadline that a task can spend at a resource as a function of
the (instantaneous or synthetic) utilization of that resource.
For the task to meet its end-to-end deadline, the sum of such
fractions must be less than 1. This constraint identifies a
volume in a multidimensional space in which each dimension is the utilization of one resource. This volume is a generalization of the notion of utilization bounds for schedulability in single-resource systems. It extends the bound (a
uni-dimensional schedulable region) to a multi-dimensional
representation for distributed-resource systems. Prior work
identified this volume for the special case of an infinite number of concurrent infinitesimal tasks. This paper generalizes
the result to arbitrary sets of finite tasks, making it applicable to realistic workloads. We evaluate the performance of
admission control based on feasible regions using simulation, showing that it is successful in eliminating deadline
misses.
1. Introduction

Understanding the end-to-end temporal behavior of distributed real-time systems is a fundamental concern of real-time computing. Examples of distributed real-time systems
include performance-sensitive server farms, radar data processing back-ends, shipboard computing clusters, and sensor networks. In such systems, different classes of traffic or
computation (e.g., Web requests) typically traverse several
stages of distributed processing and must exit the system
within specified per-class end-to-end latency bounds. An
important goal is to understand the ability of such a system to meet end-to-end latency constraints as a function of
resource utilization and traffic flow patterns.
To accomplish this goal, this paper extends the notion
of utilization bounds for schedulability to the case of distributed resource systems. A utilization bound Ubound of a
resource (such as a CPU) with a utilization factor U is essentially a point that defines a boundary between a schedulable region (U ≤ Ubound ) and a potentially unschedulable region (U > Ubound ). The generalization of such
a bound to a multi-stage processing system, in which resource stage i has the utilization Ui , is to determine a function f (U1 , ..., Un ) of individual resource utilization values
where f (U1 , ..., Un ) ≤ Cbound implies that the end-to-end latency requirements of all tasks traversing processing stages 1, ..., n are met. In the above expression,
Cbound quantifies the system’s capacity to meet deadlines.
In essence, the boundary f (U1 , ..., Un ) = Cbound is a surface that separates a schedulable region in the resource utilization space, f (U1 , ..., Un ) ≤ Cbound , from a potentially
unschedulable region, f (U1 , ..., Un ) > Cbound . This surface is the multi-dimensional extension of a single-resource
utilization bound. This paper develops the theory for constructing such surfaces for arbitrary-topology resource systems under a category of fixed-priority scheduling policies.
It generalizes a similar previous result [3] that derives the
multidimensional schedulable region in the special case of
the liquid task model. The liquid model assumes that an infinite number of concurrent infinitesimal tasks are present in
the system. In this paper, the liquid assumption is dropped.
Results are derived for arbitrary sets of finite tasks. The
generalized model is therefore applicable to a much larger
variety of realistic systems.
Utilization-based schedulability analysis is attractive in
that it offers very simple schedulability tests that rely only
on high-level aggregate metrics (namely, resource utilization). With the decreasing cost of hardware and the increasing complexity and scale of distributed computing systems, it can be argued that simplicity of analysis will gradually outweigh optimality. Designers of real-time systems
may be more concerned with their ability to determine simple sufficient conditions that, if met, ensure temporal correctness, rather than complex optimal conditions that maximally utilize available resources. Simplicity is favored because it leads to algorithms and implementations that are
less error-prone, more scalable, and generally easier to understand and apply.
The goal of feasible region calculus is to provide simple
schedulability analysis techniques that allow designers to
reason conveniently about the ability of a distributed real-time system to satisfy end-to-end timing constraints. The
schedulability conditions derived herein are sufficient (but
not necessary). Hence, they err on the safe side. In other
words, when these conditions are enforced, temporal correctness is guaranteed. When these conditions are violated,
the system may or may not miss deadlines. The need for
fast dynamic admission control in many distributed real-time systems such as military shipboard computing clusters
is an important driving motivation for this approach.
Feasible region calculus relates end-to-end schedulability to the utilizations of individual resources. The definition
of utilization that is pertinent to schedulability analysis of
distributed aperiodic tasks deserves further attention. Traditionally, the concept of utilization factor has been used for
periodic tasks [13]. The concept of instantaneous utilization was introduced later in the context of EDF scheduling
of aperiodic tasks [14]. In this paper, we make a distinction
between instantaneous utilization and synthetic utilization.
We also make the distinction between acyclic resource systems and non-acyclic resource systems. Informally, acyclic
systems are those that do not exhibit feedback cycles in their
overall task flow graph (i.e., the graph that superimposes all
individual task flow graphs in the system is acyclic). Non-acyclic systems can exhibit such cycles. We show that feasible regions of acyclic systems are a function of synthetic
utilization whereas feasible regions of non-acyclic systems
are generally a function of instantaneous utilization. We
also show that the theory derived in this paper predicts a
much larger schedulable region for acyclic systems. All results are verified using extensive simulations.
The remainder of this paper is organized as follows. Section 2 describes the system model. Section 3 details the
derivation of the generalized stage delay theorem. Section 4
describes the usage of the generalized stage delay theorem
for deriving feasible regions of distributed systems. Simulation results are presented in Section 5. Related work is
summarized in Section 6. The paper concludes with Section 7.
2. System Model
Consider a distributed system such as a multi-tier server
cluster or a multi-stage data processing pipeline. Tasks arrive to this system and require execution on a subset of resources (such as processors, each performing one stage of task execution). While we equate a resource to a processor, the same discussion applies to other resources, such as network links and disks, as long as they are scheduled in priority order; a resource pipeline can thus contain heterogeneous stages spanning processing, communication, and disk I/O. Consider a task Ti that arrives to the distributed system and requires processing on N stages. For
simplicity, we re-number these stages from 1 to N in the
order visited by Ti . Let Aij be the arrival time of Ti at stage
j where 1 ≤ j ≤ N . The arrival time of the task to the
entire system, called Ai , is the same as its arrival to the first
stage, Ai = Ai1 . Let Di be the end-to-end deadline of Ti .
The end-to-end deadline denotes the maximum allowable
latency for task Ti to complete its computation in the system. Hence, the task must exit the system by time Ai + Di .
The computation time of Ti at stage j is denoted Cij , for
1 ≤ j ≤ N . It is desirable to determine if task Ti can meet
its end-to-end deadline.
Before proceeding to the feasible region derivation, a
precise definition of utilization must be given. Given a time
t, we define the set of current tasks V (t), where V (t) =
{Ti |Ai ≤ t < Ai + Di }. In other words, the set V (t) consists of all tasks that have arrived at the system but whose
deadlines have not already been reached. Let Vj (t) be the
subset of tasks in V (t) that require processing on resource
j. It is now possible to precisely define the instantaneous
utilization Uj of stage j as
$$ U_j = \sum_{T_i \in V_j(t)} \frac{C_{ij}}{D_i}. \qquad (1) $$
Observe that, in acyclic systems, tasks that depart a stage
prior to a processor idle time have no effect on arrival
times and execution intervals that come after this idle time.
Hence, the contribution of such tasks to a resource’s utilization can be discarded. We define a set Sj (t) to be the subset
of Vj (t) that excludes tasks that departed stage j prior to
the start of the latest idle time on that stage. The synthetic
utilization Uj of stage j is given by:
$$ U_j = \sum_{T_i \in S_j(t)} \frac{C_{ij}}{D_i}. \qquad (2) $$
The objective of this paper is to determine task schedulability as a function of the utilization of system resources.
In the following derivation, it is assumed that a timespace-independent scheduling policy is employed. By definition,
a scheduling policy is timespace-independent if the priority
of a task (i) is not dependent on absolute time (except that
tasks with the same priority are scheduled FIFO), and (ii) is
not dependent on spatial properties such as the path taken
by a task through the system. In the following, we call such
policies independent for brevity.
In order to represent fixed-priority independent scheduling policies, two additional system parameters are defined.
First, we introduce the urgency inversion factor αj for
stage j. An urgency inversion occurs when a less urgent task is assigned a greater priority than a more urgent task on some processor. Mathematically, $\alpha_j = \min(D_{lo}/D_{hi})$ over all task pairs Thi and Tlo executing at stage j with priority(Thi) > priority(Tlo), where priority(T) is a function that returns the priority of task T under the resource's scheduling policy. Observe that for deadline-monotonic scheduling, αj = 1 since Dlo ≥ Dhi.
Second, we introduce the blocking factor βij , defined as
the maximum amount of time a task i can be blocked at
stage j due to a lower priority task holding a needed critical
resource. It is useful to normalize βij by the task's end-to-end deadline, Di. Given a priority ceiling protocol, this
normalized quantity is bounded by γj = max(βij /Di ). We
call γj the maximum normalized blocking factor.
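As an illustration only (the helpers below are our own hedged reading of these definitions; the task records, with priority, deadline, and blocking fields, are hypothetical), both parameters can be computed directly from a stage's task set:

```python
def urgency_inversion_factor(tasks):
    """alpha_j: min(D_lo / D_hi) over task pairs at this stage where
    T_hi has higher priority than T_lo. Starts at 1.0 so that a stage
    with no urgency inversion (e.g., deadline monotonic) yields 1."""
    alpha = 1.0
    for hi in tasks:
        for lo in tasks:
            if hi.priority > lo.priority:
                alpha = min(alpha, lo.deadline / hi.deadline)
    return alpha

def max_normalized_blocking_factor(tasks):
    """gamma_j: max(beta_ij / D_i), where each task carries its
    worst-case blocking time beta_ij at this stage."""
    return max((tk.blocking / tk.deadline for tk in tasks), default=0.0)
```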
The mathematical framework introduced in this paper
for computing utilization-based schedulable regions of distributed systems is based on a single primitive called the
generalized stage delay theorem. It states that, given a resource j with nj concurrent tasks at time t (i.e., |Vj(t)| = nj), which collectively add up to an instantaneous utilization
Uj , under an independent fixed-priority scheduling policy
with an urgency inversion factor αj , and given a maximum
normalized blocking factor γj , the fraction of its end-to-end
deadline a task spends at stage j, denoted Fj (hereafter referred to as the stage delay), is given by:
$$ F_j = \frac{\frac{U_j}{\alpha_j}\left(1 + U_j\left(\frac{n_j - 2}{2(n_j - 1)} - 1\right)\right) + \gamma_j}{1 - U_j}. \qquad (3) $$
Moreover, if the system is acyclic, synthetic utilization
can be substituted for instantaneous utilization in the above
equation leading to a better bound. The end-to-end schedulability condition for a task traversing stages 1, ..., N is then
simply $\sum_j F_j \leq 1$, indicating that the sum of the stage delays incurred does not exceed the task's end-to-end deadline. Given the complete system model described above, it is now possible to begin describing the derivation of our generalized stage delay theorem.
3. Derivation of Feasible Regions
In this section, a derivation of the generalized stage delay
theorem is presented. We describe the instantaneous utilization for a stage j as a function of several parameters, including the stage delay, then solve for the stage delay. This will
give the desired result.
Generalized Stage Delay Theorem 1 Given a processing
stage j with nj current tasks, an instantaneous utilization
Uj , an independent fixed-priority scheduling policy with
an urgency inversion factor αj and a maximum normalized
blocking factor γj, the maximum fraction, Fj, of its end-to-end deadline that a task will spend at stage j is given by:

$$ F_j = \frac{\frac{U_j}{\alpha_j}\left(1 + U_j\left(\frac{n_j - 2}{2(n_j - 1)} - 1\right)\right) + \gamma_j}{1 - U_j}. \qquad (4) $$
3.1. Proof
Consider a task pattern in which task Tm experiences a
delay Qmj on stage j. Let us focus on the analysis of a
single resource j visited by Tm . Assume that stage j is processing n concurrent tasks. The single machine analysis has
already been presented in [1] to derive a uniprocessor utilization bound for aperiodic tasks. We follow the steps of
that analysis with the minor change that Tm does not spend
all its time on one stage. Rather, it spends only a fraction
Fj of its deadline on stage j. The idea is to calculate the
maximum possible Fj given a bound on the instantaneous
utilization Uj at this stage. Similar to the derivation in [1],
the instantaneous utilization at stage j is given by the following summation over all current tasks:
$$ U_j = \sum_{priority(T_i) > priority(T_m)} \frac{C_{ij}}{D_i} + \frac{C_{mj}}{D_m} + \sum_{priority(T_k) < priority(T_m)} \frac{C_{kj}}{D_k}. \qquad (5) $$
To obtain a lower bound on utilization, we ignore the contributions of lower priority tasks, which leads to:
$$ U_j = \frac{C_{mj}}{D_m} + \sum_{priority(T_i) > priority(T_m)} \frac{C_{ij}}{D_i}. \qquad (6) $$
From this point forward, the task under consideration will
be referred to as Tn .
Figure 1. The busy period under consideration.
Let B be the end of the last processor execution gap on stage j prior to the arrival of task Tn to that stage, as shown in Figure 1. (A processor gap is a period of time during which the CPU is not utilized.) Let tf be the time task Tn departs the stage. Let L = An − B be the offset of the arrival of Tn relative to the end of the last processor gap. Assume that there are nj tasks that contribute to the instantaneous utilization of stage j at time tf (i.e., |Vj(tf)| = nj, where Vj(t) was defined in Section 2). We call them tasks Ti, for 1 ≤ i ≤ n. (For readability, the resource subscript j is omitted from nj for the remainder of the proof; note, however, that nj does not necessarily equal ni for two different resources j and i.) It was shown in [1] that in the worst case arrival scenario on a single stage, the instantaneous utilization is constant and L = 0 (i.e., no tasks precede the critically schedulable task in the busy section). It is also proven in [1] that the maximum amount of time Cpri that the critical task is preempted by tasks with absolute deadlines prior to tf is given by:
$$ C_{pri} = \sum_{i=1}^{n-1} \frac{(A_{ij} - A_{nj})\, C_{ij}}{D_i}. \qquad (7) $$

From Figure 1 (with L = 0), the length of the busy period containing task Tn on stage j is Qnj, which is also the delay of task n at stage j. (This is different from the analysis in [1], where the busy period is of length Dn, since there the task spends all its time on one processing stage.) The busy period is composed of the execution time of task Tn, the execution times of tasks in set Vj(tf), since they must have preempted Tn, the execution times of tasks that have deadlines prior to tf, given by Equation (7), and blocking (if any) over a critical section of some lower-priority task. In other words:

$$ Q_{nj} = C_{nj} + \sum_{i=1}^{n-1} C_{ij} + \sum_{i=1}^{n-1} \frac{(A_{ij} - A_{nj})\, C_{ij}}{D_i} + \beta_{nj}, \qquad (8) $$

where βnj represents the worst case blocking time by lower-priority tasks. Rewriting this equation yields:

$$ C_{nj} = Q_{nj} - \sum_{i=1}^{n-1} C_{ij} - \sum_{i=1}^{n-1} \frac{(A_{ij} - A_{nj})\, C_{ij}}{D_i} - \beta_{nj}. \qquad (9) $$

Given this expression for Cnj and the fact that $U_j = \sum_{i=1}^{n} C_{ij}/D_i$, the instantaneous utilization can be written:

$$ U_j = \sum_{i=1}^{n-1} \frac{C_{ij}}{D_i} + \frac{Q_{nj} - \sum_{i=1}^{n-1} C_{ij} - \sum_{i=1}^{n-1} (A_{ij} - A_{nj})\, C_{ij}/D_i - \beta_{nj}}{D_n}. \qquad (10) $$

Next, it is desired to minimize the above utilization with respect to the task arrival pattern. It was shown in [1] that the worst case behavior is exhibited when each of the tasks arrives immediately upon the completion of the previous task. A schematic diagram of this arrival pattern can be found in Figure 2.

Figure 2. The tasks and their arrivals and deadlines.

Define T = A1j − Anj and thus $A_{ij} - A_{nj} = T + \sum_{h=1}^{i-1} C_{hj}$. The expression for Cnj can now be rewritten as follows:

$$ C_{nj} = Q_{nj} - \sum_{i=1}^{n-1} C_{ij} - \sum_{i=1}^{n-1} \Big(T + \sum_{h=1}^{i-1} C_{hj}\Big) \frac{C_{ij}}{D_i} - \beta_{nj}. \qquad (11) $$

Manipulating the definitions previously presented (in the worst case pattern, the last of the n − 1 preempting tasks completes at the end of the busy period, so $T = Q_{nj} - \sum_{i=1}^{n-1} C_{ij}$), the above expression can be rewritten as:

$$ C_{nj} = Q_{nj}\Big(1 - \sum_{i=1}^{n-1} \frac{C_{ij}}{D_i}\Big) + \sum_{i=1}^{n-1} C_{ij}\Big(\sum_{i=1}^{n-1} \frac{C_{ij}}{D_i} - 1\Big) - \sum_{i=1}^{n-1} \Big(\frac{C_{ij}}{D_i} \sum_{h=1}^{i-1} C_{hj}\Big) - \beta_{nj}. \qquad (12) $$

Substituting this expression into the definition of Uj yields:

$$ U_j = \frac{Q_{nj}\big(1 - \sum_{i=1}^{n-1} \frac{C_{ij}}{D_i}\big) + \sum_{i=1}^{n-1} C_{ij}\big(\sum_{i=1}^{n-1} \frac{C_{ij}}{D_i} - 1\big) - \sum_{i=1}^{n-1} \big(\frac{C_{ij}}{D_i} \sum_{h=1}^{i-1} C_{hj}\big) - \beta_{nj}}{D_n} + \sum_{i=1}^{n-1} \frac{C_{ij}}{D_i}. \qquad (13) $$
From this, it is easy to see that Uj decreases as Di increases. However, Di is bounded by Dn/αj (by definition of αj). Thus, in the worst case:
$$ U_j = \frac{Q_{nj}\Big(1 - \frac{\alpha_j}{D_n}\sum_{i=1}^{n-1} C_{ij}\Big) + \sum_{i=1}^{n-1} C_{ij}\Big(\frac{\alpha_j}{D_n}\sum_{i=1}^{n-1} C_{ij} - 1\Big) - \frac{\alpha_j}{D_n}\sum_{i=1}^{n-1}\Big(C_{ij}\sum_{h=1}^{i-1} C_{hj}\Big) - \beta_{nj}}{D_n} + \frac{\alpha_j}{D_n}\sum_{i=1}^{n-1} C_{ij}. \qquad (14) $$

In the above expression, Uj depends upon the distribution of individual computation times in the nested summations. In order to minimize the nested summations and, subsequently, Uj, it can be shown that one must have Cij = C for 1 ≤ i ≤ n − 1. Given this condition, the above expression of Uj can be rewritten as:

$$ U_j = \frac{Q_{nj}\Big(1 - \frac{(n-1)\,\alpha_j C}{D_n}\Big) + (n-1)\,C\Big(\frac{(n-1)\,\alpha_j C}{D_n} - 1\Big) - \frac{(n-1)(n-2)}{2}\,\frac{\alpha_j C^2}{D_n} - \beta_{nj}}{D_n} + \frac{(n-1)\,\alpha_j C}{D_n}. \qquad (15) $$

To complete the proof, it is necessary to rearrange the above equation to solve for the stage delay, Qnj:

$$ \frac{Q_{nj}\,(1 - U^{n-1})}{D_n} = U_j - U^{n-1} - \frac{(n-1)\,C\,(U^{n-1} - 1)}{D_n} + \frac{\frac{(n-1)(n-2)}{2}\,\frac{\alpha_j C^2}{D_n} + \beta_{nj}}{D_n}, \qquad (16) $$

where $U^{n-1} = \sum_{i=1}^{n-1} \frac{C}{D_n/\alpha_j} = \frac{(n-1)\,\alpha_j C}{D_n}$ is the worst-case utilization contribution of the n − 1 preempting tasks. This implies that:

$$ Q_{nj} = \frac{D_n\,(U_j - U^{n-1}) + (n-1)\,C\,(1 - U^{n-1}) + \frac{(n-1)(n-2)}{2}\,\frac{\alpha_j C^2}{D_n} + \beta_{nj}}{1 - U^{n-1}}. \qquad (17) $$

The following facts (from the definitions) can be used to simplify the expression for Qnj:

1. $U_j - U^{n-1} = \frac{C_{nj}}{D_n}$;
2. $U^{n-1} = U_j - \frac{C_{nj}}{D_n}$;
3. $C = \frac{D_n\,U^{n-1}}{\alpha_j\,(n-1)}$.

Substituting these facts into Equation (17) yields:

$$ Q_{nj} = \frac{C_{nj} + \frac{D_n}{\alpha_j}\Big(U_j - \frac{C_{nj}}{D_n}\Big)\Big(1 - U_j + \frac{C_{nj}}{D_n}\Big) + \frac{(n-2)}{2(n-1)}\,\frac{D_n}{\alpha_j}\Big(U_j - \frac{C_{nj}}{D_n}\Big)^2 + \beta_{nj}}{1 - \Big(U_j - \frac{C_{nj}}{D_n}\Big)}. \qquad (18) $$

Taking the derivative of this equation with respect to Cnj shows that (when n > 2) Qnj is maximized when Cnj is negative. Since Cnj is in units of time, it cannot be negative. For this reason, let Cnj = 0. Therefore,

$$ Q_{nj} = \frac{(n-1)\,C\,(1 - U_j) + \frac{(n-1)(n-2)}{2}\,\frac{\alpha_j C^2}{D_n} + \beta_{nj}}{1 - U_j}, \qquad (19) $$

where n > 2. Substituting for C (with Cnj = 0, fact 3 gives $(n-1)\,C = \frac{D_n}{\alpha_j}U_j$) and simplifying, we ultimately get:

$$ Q_{nj} = \frac{U_j\,\frac{D_n}{\alpha_j}\Big(1 + U_j\Big(\frac{n-2}{2(n-1)} - 1\Big)\Big) + \beta_{nj}}{1 - U_j}. \qquad (20) $$

To turn the delay Qnj into the fraction of its deadline that task n spends at stage j, we must divide both sides by Dn:

$$ F_{nj} = \frac{\frac{U_j}{\alpha_j}\Big(1 + U_j\Big(\frac{n-2}{2(n-1)} - 1\Big)\Big) + \frac{\beta_{nj}}{D_n}}{1 - U_j}. \qquad (21) $$

To make Fnj a worst case bound, βnj/Dn must be maximized. As previously defined, γj is such an upper bound:

$$ F_{nj} = \frac{\frac{U_j}{\alpha_j}\Big(1 + U_j\Big(\frac{n-2}{2(n-1)} - 1\Big)\Big) + \gamma_j}{1 - U_j}. \qquad (22) $$

Arriving at Equation (22) completes the proof. As noted above, the theorem allows one to calculate the stage delay of a task given αj, γj, Uj and n > 2.
As mentioned earlier, Equation (22) will hold when the
instantaneous utilization is exchanged for the synthetic utilization in acyclic distributed real time systems. Remember
that instantaneous utilization considers all tasks that have
entered the system but that have not yet reached their deadline. Synthetic utilization, however, considers only those
tasks that have arrived since the last processor gap. In an
acyclic system any task that departed prior to the most recent processor gap does not affect the future schedulability
of that stage. Once a task departs a stage and a processor
gap occurs it no longer influences the schedule. As such, in
acyclic real time systems synthetic utilization can be substituted for instantaneous utilization and Equation (22) holds.
Generalized Stage Delay Theorem Corollary 1 In an acyclic task flow graph, given an end-to-end deadline Dn, synthetic utilization Uj, urgency inversion factor αj and a maximum normalized blocking factor γj, the fraction Fj of its end-to-end deadline that task n spends at stage j under a fixed-priority scheduling policy is:

$$ F_j = \frac{\frac{U_j}{\alpha_j}\left(1 + U_j\left(\frac{n_j - 2}{2(n_j - 1)} - 1\right)\right) + \gamma_j}{1 - U_j}. \qquad (23) $$

We will demonstrate the tremendous benefit of such a substitution in Section 5.
3.2. Applicability of Results
In this section, we show that the generalized stage delay theorem derived above reproduces previously published
aperiodic task scheduling bounds as special cases. Hence, it
subsumes the previous results and presents a more general
schedulability condition.
3.2.1 Reduction to Resource Pipeline with Infinite
Tasks
In [3], a feasible region was derived for meeting aperiodic
end-to-end deadlines in resource pipelines. The resource
pipelines modeled in [3] are identical to those modeled
herein, with three exceptions. First, the feasible region in
[3] is derived for resource pipelines with an infinite number
of tasks. Therefore, let n → ∞. Second, the feasible region in [3] is derived for a resource pipeline with a deadline
monotonic scheduling policy. Therefore, α = 1. Finally,
the feasible region in [3] is derived for a resource pipeline
whose tasks contain no critical sections. Therefore, γj = 0.
Substituting appropriately for the above assumptions, Equation (22) becomes
$$ F_j = \frac{U_j\left(1 - \frac{U_j}{2}\right)}{1 - U_j}. \qquad (24) $$

Hence, for schedulability of a pipeline of N stages:

$$ \sum_{j=1}^{N} \frac{U_j\left(1 - \frac{U_j}{2}\right)}{1 - U_j} \leq 1. \qquad (25) $$

Equations (24) and (25) above are identical to the feasible region derived in [3] and the corresponding schedulability condition. The reduction is thus complete.

3.2.2 Reduction to Single Stage Real-time Systems

In [2] and [1], synthetic utilization bounds were derived for single-stage systems subject to aperiodic tasks. Consider the result contained in [2] first.

In [2], the system model differs from the one described in Section 2 in that (i) only one stage exists and (ii) the tasks are infinitesimal.

In order to modify our result to represent a single stage real-time system, it is enough to say that all of the end-to-end deadline is spent in that one stage. Mathematically, this is equivalent to saying Fnj = 1. Substituting in Equation (22), dropping the stage index, and rearranging, we get:

$$ 1 - U = \frac{U}{\alpha}\left(1 - \frac{U}{2}\right) + \gamma. \qquad (26) $$

Using the quadratic formula and letting n → ∞,

$$ U = 1 + \alpha - \sqrt{1 + 2\alpha\gamma + \alpha^2}. \qquad (27) $$
Equation (27) is equivalent to the result in [2] and the reduction is complete.
Finally, we reduce the generalized stage delay theorem to
the result in [1]. The system model used in [1] differs from
the system model used in our result in the following ways.
First, only one stage exists. Second, a deadline monotonic
scheduling policy is assumed (i.e., αj = 1). Finally, the
tasks contain no critical sections (i.e., γj = 0). Substituting
for αj and γj , Equation (20) becomes:
$$ 1 - U_j = U_j\left(1 + U_j\left(\frac{n-2}{2(n-1)} - 1\right)\right), $$

which rearranges to

$$ 0 = U_j^2\left(\frac{n-2}{2(n-1)} - 1\right) + 2\,U_j - 1. \qquad (28) $$
Using the quadratic formula,
$$ U_j = \frac{-1 + \sqrt{\frac{n-2}{2(n-1)}}}{\frac{n-2}{2(n-1)} - 1} \qquad (29) $$

$$ = \frac{1}{1 + \sqrt{\frac{1}{2}\left(1 - \frac{1}{n-1}\right)}}. \qquad (30) $$
This result is equivalent to the result in [1]. For very large n, this result can be further simplified to $1/(1 + \sqrt{1/2})$, which implies that maintaining a synthetic utilization of less than 58.6% will ensure that infinitesimal tasks meet their deadlines, as proven in [1].
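A quick numeric check (ours) shows that the two single-stage reductions coincide for deadline-monotonic scheduling without critical sections (α = 1, γ = 0):

```python
import math

u27 = 1 + 1 - math.sqrt(1 + 2*1*0 + 1**2)  # Equation (27): 2 - sqrt(2)
u30 = 1 / (1 + math.sqrt(0.5))             # Equation (30) as n -> infinity

print(u27, u30)  # both ~0.5858: the 58.6% bound of [1]
```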
Having performed three separate reductions, it is possible to conclude that our results are in agreement with previously derived special case models. Our results are thus
a proper generalization of previously reported bounds for
aperiodic task schedulability analysis.
4. Usage of Feasible Regions
To apply the results derived in this paper, a practitioner
should first decide what the stages of computation are in
a given system. While the terminology used in the paper
suggests that each resource stage is a CPU, the results apply to any resource that processes tasks serially using some
resource-specific independent scheduling policy. Communication links and disks are thus possible resource stages. In
deciding which stages to include in the system model, in
practice only potential bottlenecks merit consideration. The
scheduling policy for each such resource must be known.
The values of αj and γj should then be computed. For
example, as mentioned previously, if each resource uses a
deadline monotonic scheduling policy, then αj = 1. Observe that in systems where a limited number of deadline
misses is acceptable, it may be possible to use approximate
default values for the above parameters. For example, in
CPU-intensive soft-real-time systems, one can ignore γj altogether.
Second, the system designer must choose between building an admission controller based upon the generalized
stage delay theorem (i.e., instantaneous utilization) or the
generalized stage delay theorem corollary (i.e., synthetic
utilization). If the task flow in the system is acyclic then
synthetic utilization is used. If it is non-acyclic, instantaneous utilization should be used for admission control. If
the system cannot benefit from the generalized stage delay
theorem corollary, but a very small percentage of deadline
misses is acceptable, then using the generalized stage delay theorem corollary as a heuristic for task schedulability
may be an option. Doing so will provide great improvement
in system utilization. Once the appropriate utilization definition is chosen, the system should be instrumented to keep
track of the corresponding utilization value Uj for each resource stage, j.
The final step to using feasible region calculus in a real
time system is to build the admission controller. For each
task that arrives at the system, the controller adds the contribution of this task to the utilization Uj of each stage to
be traversed by the task. It then checks that the fractional
delays at these stages (as predicted by the generalized stage
delay theorem or its corollary) do not add up to more than
1. Otherwise, the task must be rejected and the utilization
modifications made on its behalf should be reversed. If the
task is admitted, its contribution to utilization remains until its deadline, at which point it is decremented. In principle, the admission controller checks that the system operates within its feasible region. Observe that different tasks
may take different paths through the system. The feasible
region for a task depends on the path taken by this task.
Conceptually, the feasible region for the entire system is the
intersection of the feasible regions corresponding to all the
paths taken by current tasks. Observe how the generalized
stage delay theorem or its corollary is the only primitive
needed to compute this region for the system.
Since the generalized stage delay theorem computes the
fractional stage delay in constant time, the complexity of
the schedulability check performed upon arrival of a new
task is linear in the number of stages traversed by the task.
When using the generalized stage delay theorem as the basis of the admission controller, modifications to a stage’s
instantaneous utilization should be made only when a task
arrives or reaches its deadline. When using the generalized
stage delay theorem corollary as the basis of the admission
controller, the synthetic utilization at a stage must also be
adjusted when the resource is idle. The onset of idle intervals can be communicated by the idle resource, which
incurs an overhead only when the resource in question is
not used.
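The following sketch shows one possible shape for such an admission controller (our illustration under the assumptions stated in the comments; it reuses stage_delay_fraction from the sketch in Section 2, and all names are hypothetical):

```python
class FeasibleRegionAdmissionController:
    """Admits a task only if the predicted stage delay fractions along
    its path sum to at most 1, i.e., the system stays in its feasible
    region."""

    def __init__(self, stages, alpha=1.0, gamma=0.0):
        # Assumes deadline-monotonic scheduling (alpha = 1) and no
        # critical sections (gamma = 0) unless told otherwise.
        self.u = {j: 0.0 for j in stages}   # per-stage utilization U_j
        self.n = {j: 0 for j in stages}     # per-stage task count n_j
        self.alpha, self.gamma = alpha, gamma

    def try_admit(self, path, c, deadline):
        """path: stage ids the task traverses; c[j]: computation at j."""
        self._add(path, c, deadline)        # tentatively add contribution
        if any(self.u[j] >= 1.0 for j in path):
            self._remove(path, c, deadline)
            return False
        total = sum(stage_delay_fraction(self.u[j], max(self.n[j], 3),
                                         self.alpha, self.gamma)
                    for j in path)          # n clamped to satisfy n > 2
        if total <= 1.0:
            return True                     # admitted
        self._remove(path, c, deadline)     # rejected: roll back
        return False

    def _add(self, path, c, deadline):
        for j in path:
            self.u[j] += c[j] / deadline
            self.n[j] += 1

    def _remove(self, path, c, deadline):
        """Also called when an admitted task reaches its deadline, or on
        an idle-time reset when synthetic utilization is tracked."""
        for j in path:
            self.u[j] -= c[j] / deadline
            self.n[j] -= 1

# Example: a 3-stage pipeline and a task needing 10 time units per stage
# with an end-to-end deadline of 300 time units.
ac = FeasibleRegionAdmissionController([1, 2, 3])
print(ac.try_admit([1, 2, 3], {1: 10.0, 2: 10.0, 3: 10.0}, 300.0))  # True
```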
In the following, we evaluate the performance of admission controllers based on feasible region calculus.
5. Simulation Results
In order to evaluate the merit of the generalized stage
delay theorem, we have constructed a simulator. The simulator models a distributed real time system with arbitrary
configuration and tasks.
In order to maintain real time guarantees within the simulated system, an admission controller is used. This admission controller is based on either the generalized stage delay
theorem or its corollary. The admission controller behaves
as follows:
• When a task arrives at the system, its utilization is tentatively added to every stage j it will traverse during
computation.
• The generalized stage delay theorem, or its corollary if
applicable (as discussed in Section 4), is used to check whether $\sum_j F_j \leq 1$ over the stages to be traversed by
the incoming task. If so, the task is admitted. If not,
the task is rejected and its utilization is removed from
further consideration.
Each of the experiments below is configured in terms of task granularity and load. Task granularity is defined as the ratio of the total computation time of a task to its deadline. Load is defined as the sum of the computation times of all tasks that arrive during the simulation divided by the duration of the experiment.

Figure 3. The effect of a generalized stage delay theorem corollary based admission controller on the utilization of a pipeline distributed real time system.

Figure 4. The effect of a generalized stage delay theorem corollary based admission controller on the acceptance rate of a pipeline distributed real time system.
For completeness, each point on the figures below represents average values obtained from 100 executions of the
simulator. At the start of each of the simulations, a pool of
task classes is created. Unless otherwise specified, the pool
contains 5 task classes whose configuration is determined
by the specification of the experiments. Deadlines of the
classes within the pool are separated by a minimum value.
This separation value ensures that each class represents a
different priority. The task computation times are computed
based upon these deadlines, the granularity under consideration and a normally distributed random variable. During
the simulations, tasks are chosen from this pool and repeatedly invoked to generate a specific load. Unless noted otherwise, the task arrival times during an experiment follow
the Poisson distribution.
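For concreteness, the workload described above might be generated as follows (our reconstruction; the simulator's exact parameters are not published, so the constants here are illustrative):

```python
import random

def make_task_classes(k=5, base_deadline=1000.0, separation=200.0,
                      granularity=0.01, jitter=0.1):
    """k task classes whose deadlines are separated by a minimum value;
    per-class computation time is granularity * deadline with normally
    distributed jitter."""
    classes = []
    for i in range(k):
        d = base_deadline + i * separation
        c = max(1e-9, random.gauss(granularity * d, jitter * granularity * d))
        classes.append((d, c))
    return classes

def poisson_arrivals(classes, load, horizon):
    """Poisson arrivals whose rate makes the total offered computation
    over the horizon approximate `load` (e.g., 0.6 to 2.0)."""
    mean_c = sum(c for _, c in classes) / len(classes)
    rate, t, tasks = load / mean_c, 0.0, []
    while t < horizon:
        t += random.expovariate(rate)
        d, c = random.choice(classes)
        tasks.append((t, d, c))   # (arrival, deadline, computation)
    return tasks
```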
5.1. Acyclic Task Systems

Assuming that the resources in a system are given a unique numeric id, an acyclic task system accommodates tasks that require processing from resources in a strictly increasing order. In general, this describes systems whose tasks never require processing from a stage i, where 0 ≤ i ≤ x, after leaving stage x.

Pipelines represent the basic building block of acyclic task systems. For this reason, we simulate a pipeline to quantify the performance of the generalized stage delay theorem in acyclic task systems. We vary the number of stages in the pipeline from 1 to 5. In this experiment, each task must be processed by each stage. For instance, a task that executes on a pipeline with 5 stages must be processed by stages 1, 2, 3, 4 and then 5, in that order.

Task deadlines are drawn from a uniform distribution. Tasks have a granularity of 1/100. This is consistent with high performance distributed real time systems where many tasks execute concurrently, each relatively quickly. Load is approximately equal across different stages. The load is varied from 60% to 200%. As the load increases, the number of concurrent tasks also increases.

Since the experiment involves an acyclic system, the corollary to the generalized stage delay theorem can be used to define this system's feasible region. The results are shown in Figures 3 and 4. None of the accepted tasks missed their deadlines in this experiment.

Two observations can be drawn from these graphs. First, the utilization of the system is generally high for all offered loads since the generalized stage delay theorem corollary was used in the admission controller. This is important to show that admission control is not overly pessimistic. Second, the number of stages does not affect the utilization of the system. This is important for system designers who may need to use the generalized stage delay theorem in the presence of long pipelines.
5.2. Non-Acyclic Task Graphs
In this experiment, we show the utility of the generalized stage delay theorem for non-acyclic real time distributed systems. Recall that tasks within a non-acyclic real time distributed system may receive computation from the same resource more than once. A system like this may occur in a real world web server farm, for example, when a particular stage is responsible for all database queries and a task requires multiple queries to complete its processing. This results in cycles in the task graph as the same stage is visited repeatedly. To illustrate the basic implications of non-acyclic behavior without getting into detailed workload characterization of specific applications, the task traversal graphs for our tasks were computed at random. There is no guarantee that a task requires processing from each resource. Additionally, there is no guarantee that a stage will be used by any task for the duration of a particular execution of the simulator.

Figure 5. The effect of a generalized stage delay theorem based admission controller on the utilization of a non-acyclic distributed real time system.

Figure 6. The effect of a generalized stage delay theorem based admission controller on the task acceptance rate of a non-acyclic distributed real time system.
Specifically, each task in this experiment traverses 1
more stage than is available in the pipeline. For example,
a task in a system with 4 resources will be processed by 5
stages. This ensures that each task contains a cycle. Each
task has a granularity of 1/100 and its computation times
are approximately equal across the stages it traverses. Arrival rates are varied to generate variable load between 60%
and 200%. The results are shown in Figures 5 and 6. No
deadline misses were observed.
It is important to note that this system cannot employ the
generalized stage delay theorem corollary. As described extensively above, the generalized stage delay theorem corollary relies on the fact that a task that has departed stage x will
no longer affect the schedulability of stage x. In the case of
cyclic real time distributed systems, a task may depart from
stage x and return later to stage x for additional processing.
Therefore, using the corollary in this type of system may
lead to missed deadlines.
From these graphs, observe that the utilization of a system with one stage is significantly higher than that of systems with two, three, four or five stages. Intuitively, this
is because the lowest priority task suffers roughly the same
delay whether the higher-priority tasks that preempt it were
all on the same machine or distributed across several stages.
Hence, in an n stage pipeline, in the worst case, the task
can miss its deadline even if the load on each stage is equivalent to only 1/n of a full single-machine load. Without
making additional assumptions, it is thus impossible to derive schedulability conditions that do not become more pessimistic as the number of stages grows. Indeed, the study
of better schedulability conditions for non-acyclic systems
is an important research topic suggested by the results reported in this paper.
Given the improvements in system utilization made possible by the stage delay corollary, it would be desirable for non-acyclic distributed systems to benefit as well. While, as described above, the corollary cannot guarantee that all deadlines will be met in such systems, it may still be usable if the number of missed deadlines is small compared to the number of met deadlines. In order to determine the deadline miss ratio in a non-acyclic task system that uses
the corollary, the above experiment was repeated with an
admission controller based on synthetic utilization (instead
of instantaneous utilization). The random seeds from the previous experiment were reused to ensure that tasks and their arrivals were identical. The results of the new experiment are shown in Figures 7 and 8.
Surprisingly, the miss ratio is very small. In our experiments, just 27 of 829,903 tasks missed their deadlines.
Comparing Figures 5 and 6 with Figures 7 and 8, the benefits of the corollary are clear. The task acceptance rate and system utilization increase significantly when it is used. Assuming that a very small number of task deadline misses is acceptable, the generalized stage delay corollary is a good heuristic to use for admission control in non-acyclic systems.

Figure 7. The effect of a generalized stage delay theorem corollary based admission controller on the utilization of a non-acyclic distributed real time system.

Figure 8. The effect of a generalized stage delay theorem corollary based admission controller on the task acceptance rate of a non-acyclic distributed real time system.
5.3. Finite Tasks with Varying Granularity
In this experiment we compare the generalized stage delay theorem with the result derived in [3] when used as
an admission controller for a two-stage pipeline distributed
real time system. Both of these theorems calculate a feasible region for scheduling real time tasks. However, the
generalized stage delay theorem derived in this paper differs from the stage delay theorem derived in [3] in the parameters used to generate the feasible region. The stage
delay theorem considers a system where there are a very
large number of concurrent tasks and bases its calculation
only on the utilization of the stages. The generalized stage
delay theorem calculates the feasible region on the basis of
the number of concurrent tasks as well as the utilization of
a stage. This experiment will show the advantages of using
the generalized stage delay theorem by simulating a system
with a small number of concurrent tasks.
To compare the two theorems, multiple runs of the simulator were performed. The first run defines the feasible
region using the generalized stage delay theorem. This feasible region is used in the admission controller to make the
accept/reject determination. All seed values generated to
support the first run of the simulator are saved. The second
run defines the feasible region using the stage delay theorem. As in the first experiment, this feasible region is used
by the admission controller.
In this experiment, we consider a pool of 6 task classes
as defined in Section 2. Each task’s arrival time is determined by a uniformly distributed random variable. Because
the simulated system is a two-stage pipeline distributed real
time system, the tasks receive computation at stage 1 and
then stage 2. Task computation times at each stage are
roughly equivalent and governed by the granularity under
consideration. Task deadlines are chosen with a uniform
distribution.
Because the generalized stage delay theorem depends on
the number of tasks in the system as well as the utilization of
each stage, both must be varied to obtain meaningful results.
It is easy to vary the number of tasks in the system: modify a
parameter of the simulator. However, to vary the utilization
of the stages is more difficult. To do so, the granularity
must be varied. Figures 9 and 10 show the results of the simulation.

Figure 9. Comparing the effects of a stage delay theorem based admission controller and a generalized stage delay theorem based admission controller on the utilization of a two-stage pipeline distributed real time system.

Figure 10. Comparing the effects of a stage delay theorem based admission controller and a generalized stage delay theorem based admission controller on the task acceptance rate of a two-stage pipeline distributed real time system.
As the figures show, for very small granularities and a moderate number of tasks (six, in this case), the
generalized stage delay theorem provides substantial improvements in both actual utilization and the ratio of accepted tasks. This is important since it verifies that when
limiting or recording the number of tasks in the system, the
generalized stage delay theorem provides much greater actual system utilization. As the number of tasks in the system increases, the two stage delay theorems converge as expected.
6. Related Work
In 1973, the first study of feasible regions in real-time
systems was conducted by Liu and Layland [13]. They studied a set of real-time systems with specific restrictions. As
research progressed in this field, bounds have been derived
that removed those constraints. In [10], Kuo and Mok were
the first to generalize the Liu and Layland bound by considering K harmonic task chains created by the values of the
task periods. They proved that if K < n, then the feasible utilization region is $K(2^{1/K} - 1)$. In [16], Park, Natarajan and Kanevsky presented a utilization bound for systems whose tasks had non-static computation times. Moreover, the authors presented rigorous design practices for
constructing real-time systems using utilization bounds. In
[5], Bini and Buttazzo presented a hyperbolic utilization bound for single-stage real-time systems that is less pessimistic than that of Liu and Layland. In
[15], Lopez, Garcia and Diaz followed this work by presenting a feasible region for multiprocessor real-time systems
using the rate monotonic scheduling policy. Researchers
have commonly proposed considering aperiodic tasks as an
exception to periodic tasks. For example, the slack server
[18], the sporadic server [7] and the deferrable server [17]
all use this abstraction. In our result, just the opposite is
the case: periodic tasks are considered to be a special case
of aperiodic tasks. The priority inversion problem was described by Lampson and Redell in [11]. As a solution to this
problem, Goodenough and Sha presented the priority ceiling protocol [8]. In [6], Caccamo, Lipari and Buttazzo have made advances in aperiodic real-time feasible regions.
Recently, researchers have devoted efforts to investigating the aperiodic schedulability problem. In [4], Andersson proposed an exact admission controller algorithm for
EDF systems subjected to aperiodic tasks. This admission
controller is implemented using an AVL tree which gives
a run time of O(log n). Each node in this tree represents
a task and stores information about that task such as the
task deadlines and computation times. Storing information about each task in a system where large numbers of
tasks are present could be a bottleneck. Through simulation, Andersson claims that the overhead of the exact admission controller is negligible when the scheduled tasks
have long computation time compared to the running time
of the admission controller. This exact admission controller
is currently being implemented to test its practical applicability.
In [9], Wu et al. have proposed a framework for deriving feasible regions for arbitrary real time systems. Their
framework is derived using network calculus and depends
upon the workload rate. This rate is used to provide a measurement of the resource demand within a period of time
that is proportional to the deadline of the task. This framework is independent of the scheduling policy employed
and does consider critical sections and blocking problems.
Their framework provides bounds only for single-resource
systems subject to an infinite number of aperiodic tasks. In
the future, the authors hope to extend their bounds to a system subject to a finite number of tasks and to multi-stage
systems.
In [12], Lundberg et al. propose a very simple feasible
region for multiprocessor real time systems. In their work,
the authors show that if the actual utilization of a system remains below 35%, then all incoming tasks will meet their
deadlines. We believe pipeline bounds deserve as much
consideration as multiprocessor bounds, since the two architectures present alternative ways to multiply single processor capacity. Results from this paper suggest that execution pipelines might offer less pessimistic bounds.
7. Conclusions
In the past, most real-time scheduling research on utilization bounds has focused on evolutionary improvements to
the original Liu and Layland result for the schedulability of
periodic tasks in a single-resource system. As systems become larger and more complex and serve increasingly irregular workloads, simple aperiodic task analysis techniques
become necessary for distributed systems. This paper presented a simple analytic framework for computing the endto-end feasibility regions of distributed aperiodic task systems under independent fixed-priority scheduling. It extends the previous derivations of uni-dimensional schedulability regions for single processors.
Prior work by the authors considered the special case
of distributed systems with an infinite number of concurrent liquid tasks. This paper generalized that result to arbitrary sets of finite tasks, making it applicable to more realistic acyclic and non-acyclic workloads. Several extensions remain for future work. For example, other categories
of scheduling policies (such as EDF) must be considered.
Relaxed schedulability conditions can be derived for systems that accept some percentage of deadline misses. Finally, cases must be addressed where tasks need multiple
resources simultaneously. These are left as topics for future
work.
References
[1] T. Abdelzaher and C. Lu. A utilization bound for aperiodic
tasks and priority driven scheduling. IEEE Transactions on
Computers, 53(3), March 2004.
[2] T. Abdelzaher and V. Sharma. A synthetic utilization bound
for aperiodic tasks with resource requirements. In 15th Euromicro Conference on Real-Time Systems, Porto, Portugal,
July 2003.
[3] T. Abdelzaher, G. Thaker, and P. Lardieri. A feasible region for meeting aperiodic end-to-end deadlines in resource
pipelines. In International Conference on Distributed Computing Systems, Tokyo, Japan, March 2004.
[4] B. Andersson and C. Ekelin. Exact admission-control for integrated aperiodic and periodic tasks. In Real-Time and Embedded Technology and Applications Symposium, San Francisco, California, March 2005.
[5] E. Bini, G. Buttazzo, and G. Buttazzo. A hyperbolic bound
for the rate monotonic algorithm. In 13th Euromicro Conference on Real-Time Systems, Delft, Netherlands, June 2001.
[6] M. Caccamo, G. Lipari, and G. Buttazzo. Sharing resources
among periodic and aperiodic tasks with dynamic deadlines.
In IEEE Real-Time Systems Symposium, December 1999.
[7] M. Caccamo and L. Sha. Aperiodic servers with resource
constraints. In IEEE Real-Time Systems Symposium, London, England, December 2001.
[8] J. B. Goodenough and L. Sha. The priority ceiling protocol:
A method for minimizing the blocking of high priority Ada
tasks. Ada Letters, 7(8):20–31, August 1988.
[9] J. Wu, J.-C. Liu, and W. Zhao. On schedulability bounds
of static priority schedulers. In Real-Time and Embedded
Technology and Applications Symposium, San Francisco,
California, March 2005.
[10] T. W. Kuo and A. K. Mok. Load adjustment in adaptive
real-time systems. In IEEE Real-Time Systems Symposium,
December 1991.
[11] B. W. Lampson and D. D. Redell. Experiences with processes and monitors in mesa. Communications of the ACM,
February 1980.
[12] L. Lundberg. Global multiprocessor scheduling of aperiodic tasks using time-independent priorities. In IEEE Real-Time Technology and Applications Symposium, Washington, DC, May 2003.
[13] C. L. Liu and J. W. Layland. Scheduling algorithms for multiprogramming in a hard-real-time environment. Journal of the ACM,
20(1):46–61, 1973.
[14] J. Liu. Real-Time Systems. Prentice Hall, 1st edition, 2000.
[15] J. M. Lopez, J. L. Diaz, and D. F. Garcia. Minimum and
maximum utilization bounds for multiprocessor rate monotonic scheduling. In 13th Euromicro Conference on RealTime Systems, Delft, Netherlands, June 2001.
[16] D.-W. Park, S. Natarajan, and A. Kanevsky. Fixed-priority
scheduling of real-time systems using utilization bounds.
Journal of Systems and Software, 33(1):57–63, April 1996.
[17] J. K. Strosnider, J. P. Lehoczky, and L. Sha. The deferrable
server algorithm for enhanced aperiodic responsiveness in
hard real-time environments. IEEE Transactions on Computers, 44(1):73–91, January 1995.
[18] S. R. Thuel and J. P. Lehoczky. Algorithms for scheduling hard aperiodic tasks in fixed-priority systems using slack
stealing. In Real-Time Systems Symposium, pages 22–33,
San Juan, Puerto Rico, December 1994.