Multi-Resource Round Robin:
A Low Complexity Packet Scheduler with
Dominant Resource Fairness
Wei Wang, Baochun Li, Ben Liang
Department of Electrical and Computer Engineering
University of Toronto
October 10, 2013
Background
Middleboxes (MBs) are ubiquitous in today’s networks
Their sheer number is on par with that of the L2/L3 infrastructure [Sherry12]
Perform a wide range of critical network functionalities
E.g., WAN optimization, intrusion detection and prevention, etc.
[Figure: middleboxes (e.g., packet filter, NAT) sit on the path between users in the public network and servers in the private network]
Background
MBs perform deep packet processing based on packet
contents
Require multiple MB resources, e.g., CPU, link bandwidth
Flows may have heterogenous resource demands
Basic Forwarding: Bandwidth intensive
IP Security Encryption: CPU intensive
[Figure 1 (Ghodsi et al., SIGCOMM'12): normalized resource usage of four middlebox functions implemented in Click: basic forwarding, flow monitoring, redundancy elimination, and IPSec encryption]
How to let flows fairly share multiple resources for packet processing?
Desired Fair Queueing Algorithm
Fairness
Each flow should receive service (i.e., throughput) at least at the level achieved when every resource is allocated equally
Low Complexity
To schedule packets at high speeds, the scheduling decision must be made with low time complexity
Implementation
The scheduling algorithm should be simple enough to be easily implemented in practice
The Status Quo
Traditional fair queueing algorithms have only a single
resource to schedule, i.e., output bandwidth
Switches simply forward the packet to the next hop
WFQ, WF2Q, SFQ, DRR, etc.
Simply extending single-resource fair queueing fails to achieve
fairness in the multi-resource setting [Ghodsi12]
Per-resource fairness
Bottleneck fairness
Dominant Resource Fair Queueing (DRFQ) [Ghodsi12]
Implements near-perfect Dominant Resource Fairness (DRF) in the
time domain
However...
DRFQ is expensive to implement at high speeds
Requires O(log n) time complexity per packet
n could be very large, given the ever-growing line rates and the increasing volume of traffic passing through MBs
Recent software-defined MB innovations further aggravate
this scalability problem
More software-defined MBs are consolidated onto the same
commodity servers
They will see an increasing amount of traffic passing through them
Our Contributions
A new multi-resource fair queueing algorithm
Multi-Resource Round Robin (MR3)
Near-perfect fairness across flows
O(1) time complexity per packet
Very easy to implement in practice
MR3 is the first multi-resource fair queueing algorithm that
achieves nearly perfect fairness with O(1) time complexity
Preliminaries
Dominant Resource Fairness (DRF)
Dominant Resource Fairness (DRF) [12] serves as a promising notion of fairness for multi-resource scheduling
Informally, with DRF, any two flows receive the same processing time on their dominant resources in all backlogged periods
The dominant resource is the one that requires the most packet processing time: for a packet p, its dominant resource, denoted d(p), is defined as

d(p) = arg max_r { τ_r(p) }    (1)

Example: flow 1 sends packets P1, P2, ..., while flow 2 sends packets Q1, Q2, ...
Packet P1 requires 1 time unit for CPU processing and 3 time units for link transmission, so its processing time is ⟨1, 3⟩
All other packets require processing time ⟨3, 3⟩ on both CPU and link bandwidth
The dominant resource of P1 is the link bandwidth, while the dominant resource of Q1, P2, Q2, P3, ... is the CPU (or bandwidth)
DRF implements max-min fairness on flows' dominant resources: flows receive the same processing time on their dominant resources
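As a minimal illustration (not from the deck; names are hypothetical), the dominant resource of Eq. (1) can be computed directly from a packet's per-resource processing times:

```python
# Sketch: pick a packet's dominant resource, i.e., the resource with the
# largest processing time, per Eq. (1). Names here are hypothetical.

def dominant_resource(proc_time: dict) -> str:
    """proc_time maps a resource name to this packet's processing time on it."""
    return max(proc_time, key=proc_time.get)

# Slide example: P1 needs 1 unit of CPU time and 3 units of link time.
assert dominant_resource({"cpu": 1, "link": 3}) == "link"
# All other packets need <3, 3>, a tie; either resource may be dominant.
```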
Design Objective
We use the Relative Fairness Bound (RFB) to measure the fairness of a scheduling algorithm [12], [20]
Definition 1: For any packet arrivals, let T_i(t1, t2) be the packet processing time flow i receives on its dominant resource in the time interval (t1, t2); T_i(t1, t2) is referred to as the dominant service flow i receives in (t1, t2). Let B(t1, t2) be the set of flows that are backlogged in (t1, t2). The Relative Fairness Bound (RFB) is defined as

RFB = sup_{t1, t2; i, j ∈ B(t1, t2)} |T_i(t1, t2) − T_j(t1, t2)|    (2)

We require a scheduling scheme to have a small RFB, such that the difference between the dominant services received by any two flows i and j, over any backlogged period (t1, t2), is bounded by a small constant
Objective
RFB as a small constant
O(1) scheduling complexity per packet
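Eq. (2) can be estimated from a simulation trace. Under the simplifying assumption that both flows stay backlogged over the whole trace, the sup over intervals reduces to the spread of the running service gap; this estimator is a sketch, not from the paper:

```python
# Sketch (assumes both flows are continuously backlogged over the trace).
# With cumulative dominant services T_i(t) and T_j(t), the gap over any
# interval (t1, t2) equals D(t2) - D(t1) for D(t) = T_i(t) - T_j(t), so the
# sup in Eq. (2) is max(D) - min(D) over the sampled times.

def empirical_rfb(serv_i: list, serv_j: list) -> float:
    """serv_i[k], serv_j[k]: cumulative dominant service at sample k."""
    gaps = [a - b for a, b in zip(serv_i, serv_j)]
    return max(gaps) - min(gaps)

# Example: flow j falls at most 2 time units behind, then catches up.
print(empirical_rfb([0, 3, 6, 9], [0, 1, 5, 9]))  # -> 2
```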
When there is a single resource to schedule...
Fair Queueing Based on Round Robin
Round-robin (RR) scheme
Flows are served in rounds
In each round, each flow transmits roughly the same amount of bits
A credit system is maintained to track the amount of bits transmitted
RR is an ideal single-resource packet scheduler
Nearly perfect fairness with O(1) RFB
O(1) time complexity
Simple, and widely implemented in high-speed routers
E.g., Cisco GSR
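As a reference point, here is a minimal single-resource DRR sketch along these lines (illustrative only, not any router's actual implementation; quantum, queues, and flow names are made up):

```python
from collections import deque

# Minimal single-resource Deficit Round Robin sketch (hypothetical names).
# Each backlogged flow earns `quantum` credits per round; a packet is sent
# only while the flow's deficit counter can cover its size.

def drr(flows: dict, quantum: int):
    deficit = {f: 0 for f in flows}
    while any(flows.values()):               # while some flow is backlogged
        for f, q in flows.items():
            if not q:
                deficit[f] = 0               # idle flows keep no credit
                continue
            deficit[f] += quantum            # new round: add credit
            while q and q[0] <= deficit[f]:  # head packet fits in the credit
                deficit[f] -= q.popleft()    # "transmit" it, spend credit
                yield f

# Example: two flows with packet sizes in bytes.
print(list(drr({"f1": deque([300, 300]), "f2": deque([600])}, quantum=600)))
# -> ['f1', 'f1', 'f2']
```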
Will the attractiveness of RR extend to the multi-resource setting?
The First Try
Intuition
DRF implements max-min fairness on flows’ dominant resources
Simply applying RR to flows’ dominant resources
Approach
Maintain a credit system to track the dominant service a flow has
received
Ensure that flows receive roughly the same processing time on their
dominant resources in each round
Credit System
Active flows are served in rounds
Each flow i maintains a credit account
Balance: B_i
The dominant service flow i is allowed to consume in one round
Whenever a packet p is processed, B_i = B_i − domProcTime(p)
Flow i is allowed to process a packet as long as B_i ≥ 0
A new round begins when all active flows have been served in the previous round
All flows receive a credit, i.e., B_i = B_i + c, after which B_i ≥ 0 for all i
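A minimal sketch of this first try, assuming each served packet consumes its dominant processing time (domProcTime above) from the balance; all names are hypothetical:

```python
from collections import deque

# Sketch of the dominant-resource credit system (hypothetical names).
# B[f] is flow f's balance: the dominant service it may still consume in
# the current round. A flow may start a packet only while B[f] >= 0.

def credit_round_robin(flows: dict, credit: float):
    """flows: flow id -> queue of per-packet dominant processing times."""
    B = {f: 0.0 for f in flows}
    while any(flows.values()):
        for f in flows:                      # one round over active flows
            if not flows[f]:
                continue
            B[f] += credit                   # B_i = B_i + c at round start
            while flows[f] and B[f] >= 0:    # allowed while balance >= 0
                B[f] -= flows[f].popleft()   # B_i = B_i - domProcTime(p)
                yield f

# Example: flow 1's packets dominate on CPU (3 units each), flow 2's on
# the link (7 units each); both get roughly 7 units per round.
print(list(credit_round_robin({"f1": deque([3, 3, 3]), "f2": deque([7])}, 7.0)))
# -> ['f1', 'f1', 'f1', 'f2']
```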
However...
Such a simple extension may lead to arbitrary unfairness!
[Fig. 2: Illustration of a direct DRR extension. Each packet of flow 1 has processing time ⟨7, 6.9⟩, while each packet of flow 2 has processing time ⟨1, 7⟩. (a) Direct application of DRR to schedule multiple resources. (b) The dominant services received by the two flows (flow 1: CPU processing time; flow 2: transmission time) diverge over time.]
Heterogeneous resource demand
Inconsistent work progress on different resources
The Second Try
Root cause
The credit system gives the right order, but the wrong timing
Enforce consistent work progress across resources
Allow only one packet to be processed at one time
[Fig. 3: Naive fix of the DRR extension shown in Fig. 2a, withholding the scheduling opportunity of every packet until its previous packet is completely processed on all resources. The resulting schedule is fair, but leaves resources idle.]
Significantly high delay with low resource utilization
The Right Timing for Scheduling
Enforce a roughly consistent work progress without sacrificing delay
Progress control mechanism
Bound the progress gap between any two resources by 1 round
When a packet p of flow i is ready to be processed in round k, check the work progress on the last resource
Process p immediately if flow i is a new arrival, or if flow i has already received service on the last resource in the previous round k−1
Otherwise, withhold p until the condition above is satisfied (see the sketch below)
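A sketch of that readiness check, with hypothetical field names for the per-flow state:

```python
# Sketch of the progress-control test (hypothetical fields). A packet of
# flow i may start in round k only if the flow is a new arrival, or if the
# flow already received service on the last resource in round k - 1; this
# bounds the progress gap between any two resources by one round.

class FlowState:
    def __init__(self):
        self.is_new = True              # newly arrived flow
        self.last_resource_round = -1   # latest round with service on the
                                        # last resource (-1: never served)

def ready(flow: FlowState, k: int) -> bool:
    """May this flow's head packet start processing in round k?"""
    return flow.is_new or flow.last_resource_round >= k - 1

# Example: a flow served on the last resource in round 2 is ready in
# round 3, but is withheld in round 4 until that service catches up.
f = FlowState()
f.is_new = False
f.last_resource_round = 2
print(ready(f, 3), ready(f, 4))  # -> True False
```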
An example
[Fig. 4: Illustration of a schedule by MR3. (a) The schedule on CPU and link. (b) The dominant services received by the two flows (flow 1: CPU processing time; flow 2: transmission time) now closely track each other.]
MR3 Recap
Flows are served in rounds
Credit system
Applied to flows’ dominant resources
Track the dominant services a flow has received
Decide the scheduling order
Similar to the single-resource scenario
Progress control mechanism
Ensure a roughly consistent work progress across resources
Decide the right timing for scheduling
Unique to the multi-resource scenario
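Putting the recap together, here is a compact single-loop sketch of the round structure (hypothetical names). A real MR3 scheduler pipelines packets across resources, which this collapsed model does not, so the withholding branch rarely fires here; it only shows how the credit system and progress control fit together:

```python
from collections import deque

# Collapsed MR3 sketch (hypothetical names): a credit system on dominant
# processing times plus the progress-control readiness test.

def mr3(flows: dict, credit: float):
    """flows: flow id -> queue of (cpu_time, link_time) packets."""
    B = {f: 0.0 for f in flows}              # credit balances
    last_served = {f: None for f in flows}   # round of last-resource service
    k = 0
    while any(flows.values()):
        k += 1                               # a new round begins
        for f in flows:
            if not flows[f]:
                continue
            B[f] += credit                   # top up the credit account
            while flows[f] and B[f] >= 0:
                # progress control: new flow, or already served on the
                # last resource in the previous round k - 1
                if last_served[f] is not None and last_served[f] < k - 1:
                    break                    # withhold the packet for now
                p = flows[f].popleft()
                B[f] -= max(p)               # spend dominant processing time
                last_served[f] = k           # finished on the last resource
                yield k, f, p

# Example: flow 1 is CPU-bound, flow 2 is link-bound.
for step in mr3({"f1": deque([(3, 1)] * 3), "f2": deque([(1, 3)] * 3)}, 3.0):
    print(step)
```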
A simple idea, easy to implement, yet sufficient to achieve near-perfect fairness
Analytical Results
Properties of MR3
O(1) time complexity
Nearly perfect fairness with O(1) RFB
Slight delay increase as compared with DRFQ
Easy to implement
No a priori information about packet processing is required
TABLE I
Performance comparison between MR3 and DRFQ, where L is the maximum packet processing time, m is the number of resources, and n is the number of backlogged flows.

Performance           MR3                 DRFQ [11]
Complexity            O(1)                O(log n)
Fairness (RFB)        4L                  2L
Startup Latency       2(m + n − 1)L       nL
Single Packet Delay   (4m + 4n − 2)L      Unknown
Simulation Results
General Setup
3 MB modules
Basic forwarding (bandwidth intensive)
Statistical monitoring
IPSec encryption (CPU intensive)
[Figure 1 (Ghodsi et al., SIGCOMM'12): normalized resource usage of four middlebox functions implemented in Click: basic forwarding, flow monitoring, redundancy elimination, and IPSec encryption]
Fairness (Service Isolation)
30 UDP flows, with 3 rogue flows
Flows 1 to 10 require basic forwarding
Flows 11 to 20 require statistical monitoring
Flows 21 to 30 require IPSec encryption

TABLE II
Linear model for CPU processing time in 3 middlebox modules. Model parameters are based on the measurement results reported in [11].

Module                   CPU processing time (µs)
Basic Forwarding         0.00286 × PacketSizeInBytes + 6.2
Statistical Monitoring   0.0008 × PacketSizeInBytes + 12.1
IPSec Encryption         0.015 × PacketSizeInBytes + 84.5

[Fig. 6: Dominant services and packet throughput received by different flows under FCFS and MR3. Flows 1, 11 and 21 are ill-behaving. (a) Dominant service received. (b) Packet throughput of flows.]
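Table II's linear cost model is straightforward to apply; a small sketch (parameters taken directly from the table, function and dictionary names hypothetical):

```python
# Table II's linear model for per-packet CPU processing time, in µs.
# Parameters are the measured values reported in [11].

CPU_MODEL_US = {  # module -> (slope per byte, intercept)
    "basic_forwarding": (0.00286, 6.2),
    "statistical_monitoring": (0.0008, 12.1),
    "ipsec_encryption": (0.015, 84.5),
}

def cpu_time_us(module: str, packet_size_bytes: int) -> float:
    """CPU processing time in microseconds for one packet."""
    slope, intercept = CPU_MODEL_US[module]
    return slope * packet_size_bytes + intercept

# Example: a 1000 B packet costs ~9.1 µs to forward but 99.5 µs to
# encrypt, which is why IPSec flows are CPU-bound.
print(cpu_time_us("basic_forwarding", 1000))  # 9.06
print(cpu_time_us("ipsec_encryption", 1000))  # 99.5
```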
Latency
150 UDP flows
Slight latency increase as compared with DRFQ
[Fig. 7: Latency comparison between DRFQ and MR3. (a) Startup latency of flows. (b) CDF of the single packet delay.]
Stability
Different traffic arrival patterns
Different packet size distributions
Different network functionalities applied to flows
[Figure: fairness and delay sensitivity of MR3 in response to mixed packet sizes (large, small, bimodal, random) and arrival distributions (constant, exponential); average latency shown for (b) basic forwarding, (c) statistical monitoring, and (d) IPSec encryption]
Conclusions
We propose MR3 and evaluate its performance both
analytically and experimentally
The first multi-resource fair queueing algorithm that achieves nearly perfect fairness with O(1) time complexity
Slight increase of packet latency as compared with DRFQ
MR3 can be easily extended to other multi-resource scheduling contexts
E.g., VM scheduling inside a hypervisor
[email protected]
http://iqua.ece.toronto.edu/~weiwang/