
Dynamic workload

Several applications (e.g., multimedia systems) are characterized by highly variable computational requirements.

[Figure: task load fluctuating over time around its average value.]
Example: phone call

It consists of at least 2 periodic tasks, executed every 20 ms:

- RX task: receives the audio signal from the modem, decodes it, and transfers packets to the speaker buffer for reproduction.
  Ci-min = 1 ms (silence), Ci-max = 3 ms
- TX task: voice sampling, data encoding, speech enhancement, and packet transmission through the modem.
  Ci-min = 5 ms (silence), Ci-max = 10 ms

Overall CPU bandwidth: from 1/20 + 5/20 = 30% up to 3/20 + 10/20 = 65%.

[Figure: RX and TX tasks between the modem and the processor.]

Example: video playing

The MPEG standard encodes frames in "Groups of Pictures" (GOP), consisting of 3 types of frames:

- I frames (Intra-coded) include the entire content of a picture;
- P frames (Predictive) encode only the differences with respect to the preceding I or P frame;
- B frames (Bidirectionally predictive) are encoded with respect to both the preceding and the following I/P frames.

As a result, decoding times are highly variable. For the PAL standard:
- frame rate = 25 fps
- decoding time: 2-20 ms
- I-frames can take 10 times more than B-frames

[Figure: probability density of the MPEG frame decoding times (in s) for "Star Wars".]

Other causes of overloads

- Optimistic system design (based on average rather than worst-case behavior)
- Malfunctioning of input devices (sensors may send sequences of interrupts in bursts)
- Variations in the environment
- Simultaneous arrivals of events
- Exceptions raised by the kernel
Load definitions

- Hard periodic tasks:  ρ = U = Σ_{i=1}^{n} Ci/Ti
- Soft aperiodic tasks:  ρ = λ C  (average arrival rate λ times average computation time C)
- Generic real-time application: if g(t1, t2) is the processor demand in [t1, t2], then
    ρ = max_{t1,t2} g(t1, t2) / (t2 − t1)

Instantaneous load ρ(t)

It does not consider past or future activations, but only the current jobs: the instantaneous load is the maximum processor demand among the intervals from the current time to the deadlines of all active tasks.

Since we can have at most one deadline dk for each active task τk, we have:

  ρk(t) = ( Σ_{di ≤ dk} ci(t) ) / (dk − t)

  ρ(t) = max_k ρk(t)

where ci(t) is the remaining computation time of the current job of τi at time t.

[Figure: active jobs and their deadlines on a timeline in [0, 14].]
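As a concrete illustration, here is a minimal Python sketch that computes ρ(t) directly from the definition above, given the remaining computation times and absolute deadlines of the currently active jobs. The job parameters used at the end are hypothetical, chosen to be consistent with the example that follows.

```python
# Minimal sketch: instantaneous load rho(t) from the definition above.
# 'jobs' is a list of (remaining_computation, absolute_deadline) pairs
# for the jobs active at time t (hypothetical values, for illustration).

def instantaneous_load(t, jobs):
    """rho(t) = max_k  sum_{d_i <= d_k} c_i(t) / (d_k - t)."""
    rho = 0.0
    for (_, d_k) in jobs:
        demand = sum(c for (c, d_i) in jobs if d_i <= d_k)
        rho = max(rho, demand / (d_k - t))
    return rho

# Example: three active jobs at t = 4 with c = (2, 3, 2) and d = (8, 10, 13)
# give rho_1(4) = 2/4, rho_2(4) = 5/6, rho_3(4) = 7/9, hence rho(4) = 0.83.
jobs = [(2, 8), (3, 10), (2, 13)]
print(round(instantaneous_load(4, jobs), 2))   # -> 0.83
```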
Example

Consider three tasks active at time t = 4 (see the figure):

  ρ1(4) = 2/4 = 0.5
  ρ2(4) = 5/6 = 0.83
  ρ3(4) = 7/9 = 0.78

  ρ(4) = 0.83

[Figure: timeline in [0, 14] showing the three active jobs and their deadlines; below, the resulting instantaneous load ρ(t).]

Load and design assumptions

[Figure: load as a function of time for a system designed under average-case assumptions and for one designed under worst-case assumptions.]
Predictability vs. efficiency

It is a matter of cost:

- High predictability with low efficiency means wasting resources, hence high cost; it can be justified only for very critical systems.
- High efficiency in resource usage requires the system to:
  – handle and tolerate overloads;
  – adapt for graceful degradation;
  – plan for exception handling mechanisms.

[Figure: allocated resources vs. predictability/efficiency; a pessimistic design is SAFE, an optimistic design is UNSAFE.]
Definitions

Overrun: situation in which a task exceeds its expected utilization.

Overload: situation in which ρ(t) > 1.
- Transient overload: ρmax > 1 but ρavg ≤ 1
- Permanent overload: ρavg > 1

Activation overrun

The next task activation occurs before its expected time:

[Figure: load profile of a task whose job arrives before the expected interarrival time.]

Execution overrun

The task computation time exceeds its expected value:

[Figure: load profile of a task whose job executes longer than expected.]

Consequences of overruns

A task overrun may not cause an overload:

[Figure: schedule in which the overrun is absorbed without deadline misses.]

But in general it may delay the execution of other tasks, causing a deadline miss:

[Figure: schedule in which the overrun causes a deadline miss.]

Consider the following application with U < 1:

  τ1 = (2/4), τ2 = (2/6), τ3 = (1/12),  U = 2/4 + 2/6 + 1/12 = 11/12 < 1

Although U < 1, sporadic overruns can prevent τ3 from running:

[Figure: schedule in [0, 24] in which an early completion followed by an execution overrun prevents τ3 from executing in time.]

Example of overload

In engine control, some tasks are activated at specific angles of the crankshaft, so their activation rate is proportional to the angular velocity ω of the engine. The utilization of such a task with computation time C is

  u(ω) = C ω / (2π)

and the task alone saturates the processor when ω reaches

  ω* = 2π / C
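A short numeric sketch of the relation above; the computation time C is a made-up value, used only to show how the overload speed ω* is obtained.

```python
# Sketch: load of an angular task as a function of engine speed.
# u(omega) = C * omega / (2*pi); overload when omega > omega* = 2*pi / C.
# C is a hypothetical computation time, chosen only for illustration.
import math

C = 0.0005                     # computation time per activation: 0.5 ms
omega_star = 2 * math.pi / C   # overload threshold [rad/s]

def u(omega):
    return C * omega / (2 * math.pi)

print(omega_star)         # ~12566 rad/s (about 120000 rpm)
print(u(omega_star / 2))  # ~0.5: at half the critical speed the task uses ~50% CPU
```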
Load control methods

- Overrun handling: Resource Reservation
- Overload handling: reactive vs. proactive approaches, local vs. global approaches

Resource Reservation

Transient overloads can be handled through Resource Reservation: a mechanism to bound the resource consumption of a task set, limiting the interference caused to other tasks.

To implement Resource Reservation we have to:
- reserve a fraction of the processor bandwidth for a set of tasks;
- prevent the task set from using more than the reserved fraction.

[Figure: example of resource partition: τ1 25%, τ2 20%, τ3 45%, τ4 10%.]

Providing temporal isolation

Each task τi receives a bandwidth αi and behaves as if it were executing alone on a slower processor of speed αi.

Resource enforcement

A mechanism that prevents a task from consuming more than its reserved amount: if a task executes more, it is delayed, preserving the resource for the other tasks.
Priorities vs. Reservations

[Figure: with prioritized access, tasks τ1, τ2, τ3 are assigned priorities P1, P2, P3 and share a single ready queue; with resource reservation, each task is assigned a bandwidth (α1 = 0.5, α2 = 0.3, α3 = 0.2, i.e. 50%, 30%, 20% of the processor).]

Benefits of Resource Reservation

1. Resource allocation is easier than priority mapping; this is important for modularity and scalability.
2. It provides temporal isolation: overruns occurring in a reservation do not affect other tasks.
3. Simpler schedulability analysis: response times depend only on the application demand and on the amount of reserved resource.
4. Easier probabilistic approach.

Implementing RR

Each task (or group of tasks) is handled by a dedicated server, and the servers feed the ready queue managed by the CPU scheduler.

[Figure: servers serving τ1, τ2, τ3 in front of the ready queue and the CPU scheduler.]

- Fixed priorities: global scheduler RM/DM, reservations implemented by a Sporadic Server.
- Dynamic priorities: global scheduler EDF, reservations implemented by a CBS (Constant Bandwidth Server).
Types of reservations

- HARD: when the budget is exhausted, the server is blocked until the next budget replenishment. A HARD reservation (Qs, Ps) guarantees at most a budget Qs every period Ps.
- SOFT: when the budget is exhausted, the server remains active but at a lower priority level. A SOFT reservation (Qs, Ps) guarantees at least a budget Qs every period Ps.

Note that

If a processor is partitioned into n reservations, we must have:

  Σ_{i=1}^{n} αi ≤ U_lub^A

where A is the adopted scheduling algorithm.
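A minimal admission-test sketch for the condition above. The utilization bounds used (1 for EDF, n(2^{1/n} − 1) for RM) are the classical ones and are an assumption about which global scheduler is adopted.

```python
# Sketch: admission test for n reservations on one processor.
# The bandwidths alpha_i are accepted only if their sum stays below
# U_lub of the adopted global scheduling algorithm A.

def u_lub(algorithm, n):
    if algorithm == "EDF":
        return 1.0
    if algorithm == "RM":                  # Liu & Layland bound
        return n * (2 ** (1.0 / n) - 1.0)
    raise ValueError("unknown algorithm")

def admit(alphas, algorithm="EDF"):
    # small tolerance to avoid spurious rejections due to floating point
    return sum(alphas) <= u_lub(algorithm, len(alphas)) + 1e-9

print(admit([0.25, 0.20, 0.45, 0.10], "EDF"))  # True  (sum = 1.00)
print(admit([0.25, 0.20, 0.45, 0.10], "RM"))   # False (sum > 4*(2^(1/4)-1) ~ 0.757)
```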
Soft CBS

The original CBS formulation implements a SOFT reservation: in fact, more than Qs units can be executed in an interval Ps if the processor is available.

[Figure: CBS with Qs = 3 and Ps = 9; when the processor would otherwise be idle, the served task executes for more than Qs units per period.]

Problem with soft reservations

The use of idle times by a server can cause irregular job executions and long delays:

[Figure: server with Qs = 1 and Ps = 3; after exploiting idle time, a subsequent job suffers a long delay.]

Example with Hard CBS

The deadline aging problem can be avoided by suspending the server when qs = 0 and recharging the budget at ds:

[Figure: hard CBS with Qs = 1 and Ts = 4; evolution of the budget qs and of the server deadlines d0, d1, ..., d6 over the interval [0, 28].]
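A simplified simulation sketch of the hard-reservation behavior described above: the budget qs is consumed while the served work executes, the server is suspended when qs = 0, and the budget is recharged to Qs at the replenishment time. This is a didactic sketch of the suspend/recharge rule only, not a full CBS implementation (deadline management and admission rules are omitted).

```python
# Didactic sketch: hard reservation (Qs, Ps) enforced in discrete time steps.
# The server executes pending work only while budget remains; when the budget
# is exhausted it suspends until the next replenishment, where qs is reset.
# Not a full CBS: deadline postponement and job arrival rules are omitted.

def hard_reservation_trace(Qs, Ps, workload, horizon):
    """workload[t] = units of work arriving at time t; returns execution times."""
    qs, next_recharge = Qs, Ps
    backlog, trace = 0, []
    for t in range(horizon):
        if t == next_recharge:              # replenishment: recharge budget
            qs, next_recharge = Qs, next_recharge + Ps
        backlog += workload.get(t, 0)
        if backlog > 0 and qs > 0:          # run one unit within the budget
            backlog -= 1
            qs -= 1
            trace.append(t)
    return trace

# A long burst arriving at t = 0 is spread out: at most Qs units per period Ps.
print(hard_reservation_trace(Qs=1, Ps=3, workload={0: 4}, horizon=12))
# -> [0, 3, 6, 9]
```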
Hierarchical scheduling

Resource reservation can be used to develop hierarchical systems, where each component is implemented within a reservation:

[Figure: Applications 1..N, each scheduled by a Local Scheduler inside its Component, on top of a Global Scheduler running on the Computing Platform.]

In general, a component can also be divided into other sub-components.

- Global scheduler: the one managing the system ready queue.
- Local schedulers: the ones managing the server queues.

[Figure: servers S1..S4, each with a local scheduler and its own task queue; the global scheduler dispatches the servers to the CPU.]
Hierarchical Analysis

- Global analysis: the servers must be schedulable by the global scheduler running on the physical platform.
- Local analysis: the applications must be schedulable by the local schedulers running inside the components.

[Figure: Applications 1..N inside Components 1..N with their local schedulers, on top of the global scheduler and the computing platform.]

Analysis under RR

- Under EDF, the analysis of an application within a reservation is done through the Processor Demand Criterion:

    ∀t > 0:  dbf(t) ≤ t

  (dbf(t) captures the timing behavior of the application, while t is the processing time available in [0, t] on a dedicated processor).

- Under Fixed Priority Systems (FPS), the analysis is done through the Workload Analysis:

    ∀i = 1, ..., n  ∃t ∈ (0, Di]:  Wi(t) ≤ t

- The difference under a reservation is that, in an interval of length t, the processor is only partially available.

To describe the time available in a reservation we need to identify, for any interval [0, t], the minimum time allocated in the worst-case situation.

Supply bound function sbf(t): minimum amount of time available in reservation Rk in every time interval of length t.
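Before turning to the supply side, here is a small sketch of the demand side: the demand bound function dbf(t) used by the Processor Demand Criterion above, for a set of periodic tasks with relative deadlines. The task set in the usage line is hypothetical.

```python
# Sketch: demand bound function dbf(t) for periodic tasks (C, T, D),
#   dbf(t) = sum_i max(0, floor((t - D_i)/T_i) + 1) * C_i,
# i.e. the total computation of jobs with both arrival and deadline in [0, t].
import math

def dbf(t, tasks):
    return sum(max(0, math.floor((t - D) / T) + 1) * C
               for (C, T, D) in tasks)

# Hypothetical task set: (C, T, D)
tasks = [(1, 4, 4), (2, 6, 5)]
print([dbf(t, tasks) for t in range(0, 13)])
# -> [0, 0, 0, 0, 1, 3, 3, 3, 4, 4, 4, 6, 7]
```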
Example: Static time partition

Example of a reservation providing 4 units of time every 10 (bandwidth α = 0.4):

[Figure: static time partition over [0, 24] and the corresponding supply bound function sbf(t).]

Analysis under RR

Hence the Processor Demand Criterion can be reformulated as follows:

  ∀t > 0:  dbf(t) ≤ sbf(t)

[Figure: dbf(t) compared against sbf(t).]
Supply lower-bound function

A supply bound function can be lower-bounded by a function of the form:

  slbf(t) = max{0, α (t − Δ)}

where α is the bandwidth and Δ is the service delay.

[Figure: sbf(t) and its linear lower bound slbf(t), with slope α and offset Δ on the time axis.]

Analysis under RR

A simpler, sufficient test can be obtained by replacing sbf(t) with the lower bound slbf(t):

  ∀t > 0:  dbf(t) ≤ slbf(t)

[Figure: dbf(t) compared against slbf(t).]
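Putting the two pieces together, here is a sketch of the sufficient test dbf(t) ≤ slbf(t). The dbf() from the earlier sketch is repeated so the snippet stays self-contained; checking only at the absolute deadlines up to the hyperperiod is a simplification adequate for this illustration, and the reservation parameters and task set are illustrative assumptions.

```python
# Sketch: sufficient schedulability test of a task set inside a reservation
# abstracted by (alpha, Delta): for all deadlines t in (0, H], dbf(t) <= slbf(t).
import math
from math import lcm

def slbf(t, alpha, delta):
    return max(0.0, alpha * (t - delta))

def dbf(t, tasks):
    return sum(max(0, math.floor((t - D) / T) + 1) * C for (C, T, D) in tasks)

def schedulable_in_reservation(tasks, alpha, delta):
    H = lcm(*(T for (_, T, D) in tasks))           # simplified test horizon
    checkpoints = sorted({k * T + D for (_, T, D) in tasks
                          for k in range(H // T) if k * T + D <= H})
    return all(dbf(t, tasks) <= slbf(t, alpha, delta) for t in checkpoints)

# Illustrative reservation: periodic server Q = 4, P = 10 at arbitrary priority,
# hence alpha = 0.4 and Delta = 2*(P - Q) = 12.
tasks = [(1, 20, 20), (2, 40, 40)]          # hypothetical task set, U = 0.10
print(schedulable_in_reservation(tasks, alpha=0.4, delta=12.0))   # True
```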
Deriving α and Δ

Given a generic supply function sbf(t), the bandwidth α is the equivalent slope computed for long intervals:

  α = lim_{t→∞} sbf(t) / t

while the delay Δ is the highest intersection with the time axis of the line of slope α touching sbf(t):

  Δ = sup_{t ≥ 0} ( t − sbf(t)/α )

[Figure: sbf(t) with the bounding line of slope α intersecting the time axis at Δ.]

Example: Periodic Server

For a periodic server with budget Qs and period Ps running at the highest priority, we have:

  α = Qs / Ps,   Δ = Ps − Qs

For a periodic server with budget Qs and period Ps running at an unknown (arbitrary) priority, we have:

  α = Qs / Ps,   Δ = 2 (Ps − Qs)

[Figure: worst-case sbf(t) in the two cases.]
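A tiny helper capturing the two cases above; it is purely a restatement of the formulas as code, with the (4, 10) server used only as an example.

```python
# Sketch: (alpha, Delta) abstraction of a periodic server (Qs, Ps).
#   highest_priority=True  -> Delta = Ps - Qs
#   highest_priority=False -> Delta = 2*(Ps - Qs)   (arbitrary/unknown priority)

def periodic_server_interface(Qs, Ps, highest_priority=False):
    alpha = Qs / Ps
    delta = (Ps - Qs) if highest_priority else 2 * (Ps - Qs)
    return alpha, delta

print(periodic_server_interface(4, 10, highest_priority=True))   # (0.4, 6)
print(periodic_server_interface(4, 10))                          # (0.4, 12)
```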
Observation

It is worth comparing the following supply functions:

- the supply function of a full processor;
- the supply function of a fluid partition with bandwidth α;
- the supply function of a real partition with bandwidth α provided by a periodic server (Q, P).

[Figure: the three supply functions; the real partition lies below the fluid one, shifted by the delay Δ.]

Observation

In a periodic server with bandwidth α we have Q = αP, so the delay Δ is proportional to the server period P:

  Δ = 2 (P − Q) = 2 P (1 − α)
Observation

Note that, for a given bandwidth α, reducing P reduces the delay Δ and improves schedulability, tending to a fluid reservation; however, smaller periods generate a higher runtime overhead.

Taking overhead into account

If σ is the context switch overhead, we have:

- Allocated bandwidth:  α = Q / P
- Effective bandwidth:  α_eff = (Q − σ) / P = α − σ / P
- Delay:  Δ = 2 P (1 − α)

Since α_eff ≥ 0 requires P ≥ P_min = σ / α, the delay cannot be reduced below

  Δ_min = 2 P_min (1 − α) = 2 σ (1 − α) / α
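A short numeric sketch of this trade-off: for a fixed bandwidth α and a context-switch overhead σ, smaller periods reduce the delay Δ = 2P(1 − α) but also reduce the effective bandwidth α_eff = α − σ/P. The numeric values of α and σ are illustrative assumptions, not taken from the slides.

```python
# Sketch: trade-off between delay and effective bandwidth for a periodic
# reservation of bandwidth alpha with context-switch overhead sigma.
#   Delta     = 2 * P * (1 - alpha)
#   alpha_eff = alpha - sigma / P     (vanishes for P -> sigma / alpha)

def reservation_tradeoff(alpha, sigma, periods):
    rows = []
    for P in periods:
        delta = 2 * P * (1 - alpha)
        alpha_eff = alpha - sigma / P
        rows.append((P, delta, alpha_eff))
    return rows

# Illustrative values: alpha = 0.4, sigma = 0.1 ms, periods in ms.
for P, delta, a_eff in reservation_tradeoff(0.4, 0.1, [1, 2, 5, 10, 20]):
    print(f"P = {P:5.1f}  Delta = {delta:5.1f}  alpha_eff = {a_eff:.3f}")
# Below P_min = sigma / alpha = 0.25 ms no useful bandwidth would remain.
```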
Effective bandwidth

Hence reducing the server period P reduces the delay Δ, but also reduces the effective bandwidth α_eff that can be exploited:

  α_eff = α − σ / P

[Figure: α_eff as a function of P, approaching α for large P and vanishing for P = σ/α.]

Reservation interface

Note that the (α, Δ) parameters offer an alternative interface, which is independent of the implementation mechanism (static partitions or a Q-P server):

[Figure: each task τi, with its quality-of-service requirement QoSi, is described by a pair (αi, Δi).]
Example of sporadic waste

[Figure: the demanded bandwidth fluctuates over time around its average, while the allocated bandwidth is constant; the reservation is alternately under utilized, correctly utilized, and over utilized.]

Bandwidth allocation error

[Figure: demanded vs. allocated bandwidth over time; their difference is the bandwidth allocation error. When the allocation is too small, the resource is insufficient and the application is delayed; when the application demands less, the resource is wasted.]
Resource reclaiming

Budget reclaiming: unused bandwidth in a reservation can be used to satisfy extra occasional needs in another reservation.

[Figure: reservations R1 and R2; the budget left unused by R1 is exploited to serve an extra demand in R2.]

CASH: Capacity Sharing algorithm

- When a job finishes and qs > 0, the residual budget is put in a global queue of spare capacities (the CASH queue), with a deadline equal to the server deadline.
- A server first uses the capacity in the CASH queue with the earliest deadline dq ≤ ds; otherwise its own budget qs is used.
- Idle times consume the capacity in the CASH queue with the earliest deadline.

[Figure: example schedule under the Capacity Sharing algorithm, showing the evolution of the CASH queue.]
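A deliberately minimal data-structure sketch of the three CASH rules listed above (a heap of spare capacities ordered by absolute deadline). It omits the underlying CBS server logic and is only meant to make the reclaiming rules concrete; the numeric values are hypothetical.

```python
# Didactic sketch of the CASH queue rules (not a full scheduler):
# 1) on job completion with residual budget, push (server_deadline, residual);
# 2) a server consumes spare capacity with the earliest deadline dq <= ds
#    before using its own budget qs;
# 3) idle time also consumes the earliest-deadline spare capacity.
import heapq

cash_queue = []   # min-heap of (deadline, residual_budget)

def job_finished(residual_budget, server_deadline):
    if residual_budget > 0:
        heapq.heappush(cash_queue, (server_deadline, residual_budget))

def consume(amount, server_deadline=float("inf")):
    """Take 'amount' from spare capacities with deadline <= server_deadline;
    return how much could NOT be reclaimed (to be charged to qs)."""
    while amount > 0 and cash_queue and cash_queue[0][0] <= server_deadline:
        dq, cq = heapq.heappop(cash_queue)
        used = min(amount, cq)
        amount -= used
        if cq - used > 0:
            heapq.heappush(cash_queue, (dq, cq - used))
    return amount

job_finished(residual_budget=2, server_deadline=9)   # rule 1
print(consume(3, server_deadline=12))   # rule 2: 2 units reclaimed -> prints 1
print(consume(1))                       # rule 3 (idle): queue empty -> prints 1
```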
Handling wrong allocations

There are situations in which the reservation error is caused by a wrong bandwidth allocation, so that no fixed reservation is appropriate:

[Figure: resource needs varying over time; a large reservation is safe but not efficient, while a small one is efficient but not enough.]
Need for adaptivity

In these cases, the only solution is to design the system to be adaptive, so that reservations can be changed based on runtime requirements.

[Figure: resource demand varying over time.]

Adaptive QoS Management

Reservation parameters can be changed at run time by a Reservation Manager:

[Figure: the Reservation Manager maps each task τi, with its quality-of-service requirement QoSi, to a reservation (αi, Δi) that is updated at run time.]

Local adaptation

A local adaptation approach is also possible, allowing a task to comply with the assigned reservation:

[Figure: task τi (Ci, Ti) adjusts its behavior through a Local Policy, driven by probes from the real-time system, so as to fit within the assigned reservation (α, Δ).]
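A minimal feedback sketch of what a Reservation Manager could do: periodically compare the bandwidth actually consumed by each reservation with its allocation and move the budgets toward the measured demand, re-normalizing if the total exceeds the available bandwidth. Everything here (update rule, gain, names, numbers) is an illustrative assumption, not a prescribed algorithm.

```python
# Illustrative sketch of run-time reservation adaptation (not a standard API):
# move each allocated bandwidth a fraction 'gain' toward the measured demand,
# then rescale so that the total never exceeds the available bandwidth.

def adapt_reservations(allocated, measured_demand, gain=0.5, total=1.0):
    new_alloc = [a + gain * (d - a) for a, d in zip(allocated, measured_demand)]
    s = sum(new_alloc)
    if s > total:                       # compress proportionally if over-committed
        new_alloc = [a * total / s for a in new_alloc]
    return new_alloc

allocated = [0.30, 0.30, 0.20]
measured  = [0.45, 0.20, 0.20]          # hypothetical observed demands
print([round(a, 3) for a in adapt_reservations(allocated, measured)])
# -> [0.375, 0.25, 0.2]
```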