Packet classification for global network view of software

TCS -TR-A-14-74
TCS Technical Report
Packet Classification for Global Network View of
Software-Defined Networking
by
Takeru Inoue, Toru Mano, Kimihiro Mizutani,
Shin-ichi Minato, and Osamu Akashi
Division of Computer Science
Report Series A
July 16, 2014
Hokkaido University
Graduate School of
Information Science and Technology
Email:
[email protected]
Phone:
+81-011-706-7682
Fax:
+81-011-706-7682
Packet Classification for Global Network View of
Software-Defined Networking
Takeru Inoue∗†
Toru Mano†
Shin-ichi Minato‡
Kimihiro Mizutani†
Osamu Akashi†
July 16, 2014
Abstract
In software-defined networking, applications are allowed to access a global
view of the network so as to provide sophisticated functionalities, such as
quality-oriented service delivery, automatic fault localization, and network
verification. All of these functionalities commonly rely on a well-studied technology, packet classification. Unlike the conventional classification problem
to search for the action taken at a single switch, the global network view requires to identify the network-wide behavior of the packet, which is defined
as a combination of switch actions. Conventional classification methods, however, fail to well support network-wide behaviors, since the search space is
complicatedly partitioned due to the combinations.
This paper proposes a novel packet classification method that efficiently
supports network-wide packet behaviors. Our method utilizes a compressed
data structure named the multi-valued decision diagram, allowing it to manipulate the complex search space with several algorithms. Through detailed
analysis, we optimize the classification performance as well as the construction of decision diagrams. Experiments with real network datasets show that
our method identifies the packet behavior at 20.1 Mpps on a single CPU core
with only 8.4 MB memory; by contrast, conventional methods failed to work
even with 16 GB memory. We believe that our method is essential for realizing
advanced applications that can fully leverage the potential of software-defined
networking.
1
Introduction
Software-Defined Networking (SDN) [1] is a new paradigm that separates the control and forwarding planes in a network. The control plane, which is operated
∗
[email protected]
NTT Network Innovation Laboratories, Yokosuka, Japan.
‡
Graduate School of Information Science and Technology, Hokkaido University, Sapporo,
Japan.
†
1
2
T. Inoue, T. Mano, K. Mizutani, S. Minato, and O. Akashi
by a logically centralized controller, provides a global view of the distributed network state. This global view allows SDN applications running on the control plane
to readily identify network-wide packet behaviors, e.g., which path the packet traverses, whether the packet is discarded in the network, and what actions the packet
is subjected to at middle boxes. Many applications have been developed that utilize
the network-wide packet behavior to offer sophisticated functionalities.
• In quality-oriented services [2] and an upcoming infrastructural paradigm
called “network fabric” (separation of intelligence from the network core) [3,
4], the network edge is assumed able to determine the network-wide behavior
for every incoming packet.
• Reference [5] proposed an automatic fault localization technique that exploits
the differences between actual packet behavior monitored in the network and
the theoretical behavior estimated from the global view.
• References [6–11] estimate the network-wide packet behavior from the network status, in order to verify conformity with a network policy.
An method that can efficiently determine the network-wide behavior of a packet
is essential to successfully implementing these advanced applications; they cannot
work if only the action to be taken at a single switch can be identified.
1.1
Summary and Limitations of Prior Art
Packet classification [12–20], a functionality that determines the action taken on
a packet based on multiple header fields, has been a key technology in modern
networks to provide services beyond basic packet forwarding, such as access control, quality of services, and traffic monitoring and analysis. Packet classification has been extensively studied in the past fifteen years, and the state-of-the-art
method [16] looks up large classifiers with tens of thousands of rules very efficiently.
These well-studied methods are, however, easily overwhelmed by the complexity
of handling network-wide behaviors. This is because, as shown in Fig. 1, networkwide behavior is defined as the combinations of switch actions. Our preliminary
experiments on the Stanford backbone network [10], which is a medium-scale network with 16 switches, 757,170 forwarding rules, and 1,584 ACL rules, revealed
that the classification method of [16] fails to construct a classifier of the networkwide behaviors, exhausting the available computer memory of 16 GB. The main
cause of this failure is that the search space defined by the packet header was partitioned into a huge number of blocks; 652 million rules were required to express
the complicated space, but the conventional methods baulk at this number.
In order to efficiently handle such a complicated partitioned space, some network verifiers [6–8] employ Binary Decision Diagrams (BDDs) [21], which are a
compressed data structure designed to represent Boolean functions. Due to the
great space efficiency of BDDs, only 6.6 MB memory was required to represent
Packet Classification for Global Network View of Software-Defined Networking
Example network
A
A1
A3
3
C1
A2
B
B1
B2
C2
B3
C
C3
Actions at switches
Field1 Field2 Action
Field1 Field2 Action
Field1 Field2 Action
*
[3,4]
Drop
[0,3]
[2,5]
B2
[2,2]
[1,3]
C2
[4,5]
*
A1
[4,5]
*
B1
[4,5]
*
C1
[6,7]
*
A3
[7,7]
[1,3]
B3
[6,7]
*
C3
[0,3]
*
A2
*
*
Drop
*
*
Drop
0
0
Field1
B2
B3
0
0
7
Field1
Network-wide packet behaviors
Field2
7 I
VII IX
II
III IV VIIIXXI
0
0
V
VI
Field1
B1
XII
7
7
Drop
0
0
C2
Drop
7
Field2
7 Drop
A1 A3
Field2
A2
Field2
7
C1 C3
Field1
7
I A B C
Drop
II A B C
III A B
:
C
XII A B
C
Figure 1: Search space defined by packet header and network-wide packet behaviors.
The example network at the top includes three switches (A, B, and C), each of which has
three ports. Each switch has a table that maintains several rules, and each rule associates
two header fields (field1 and field2 of 3-bit) with actions. The two-field header space (a
square) is divided into only seven blocks at switch A, but it is partitioned into 17 blocks
for the twelve network-wide behaviors as shown in the bottom.
the Stanford backbone in our experiments. However, a behavior must be looked
up by linear search which is impractically slow; considering the several Boolean
functions represented in BDDs, each of which maps the header space exclusively
to a network-wide packet behavior, these functions must be tested individually to
determine the packet behavior (e.g., in Fig. 1, twelve Boolean functions have to be
tested one by one). In our experiments on the Stanford backbone, which has 1,093
network-wide behaviors, only 5.78 Kpps was achieved with BDDs. Classification
throughput should exceed 10 Mpps in order to examine packets at line rate on 10
Gbps links.
As discussed so far, no method has been introduced that can identify networkwide packet behavior with the necessary time and space efficiency. This deficiency
imposes the following limitations on the SDN applications discussed earlier on.
4
T. Inoue, T. Mano, K. Mizutani, S. Minato, and O. Akashi
• Quality-oriented services and network fabric cannot be realized in a complex
network like the Stanford backbone, whose complicated network-wide packet
behaviors cannot be maintained by network edge switches using conventional
classification methods.
• Reference [5] only employs predefined test packets for failure detection, since
it is hard for conventional methods to identify the theoretical network-wide
behavior for arbitrary packets; this restriction might hinder the resolution of
faulty behaviors of production traffic.
• References [6–11] can test only a few properties every few msec, which may
cause serious policy violations to be overlooked if a lot of properties have to
be checked in real-time.
Similar limitations are faced by other applications. In addition, the limitations will
hinder the advent of more advanced SDN applications. To remove the limitations
and leverage the full potential of SDN, it is imperative to develop a new classification method satisfying the following requirements; the complicated header space
of network-wide packet behaviors should be represented compactly enough to fit
within a fast cache memory, and packets at the rate of beyond 10 Mpps should be
classified based on their network-wide behavior. Rapid construction and update
of classifiers are also appreciated.
1.2
Our Contributions
This paper proposes a novel packet classification method that meets the above
requirements. Our idea is simple but very effective. Boolean functions, each of
which is associated with a single behavior, are unified into a single multi-valued
function representing a whole classifier by itself. This multi-valued function is represented by a data structure named the multi-valued decision diagram (MDD) [22],
a variant of BDD. Since this unification removes the slow linear search from the
classification process, the throughput can be greatly improved while maintaining
the space-efficiency of BDDs.
Our method is two-fold.
• An efficient algorithm that constructs the MDD from a set of BDDs representing Boolean functions, is presented. The construction process is analyzed
and optimized by reducing it to Huffman coding. Upon receipt of a new
packet behavior, the MDD can be incrementally updated in a short time.
• A very simple and fast algorithm examines the MDD to classify packets based
on their behaviors. Another algorithm further accelerates the classification
process by regarding a bunch of header bits as a single variable. The timespace tradeoff entailed in the bit aggregation is analyzed.
Packet Classification for Global Network View of Software-Defined Networking
5
Our method is evaluated with three real network configurations: Internet21 ,
Stanford backbone network [10], and Purdue campus network [23]. In the case of
Stanford, the MDD only requires 8.4 MB of memory, which fits within fast CPU
cache. The classification throughput is 20.1 Mpps on a single CPU core, nearly
3,500 times faster than conventional BDD methods. The MDD is constructed in
0.93 sec, and a new behavior can be added just in 29 msec.
Our method does not require any special hardware support like TCAMs, and
can be applied to any use case without significant difficulties. In this paper, our
method is implemented as a fully software-based classifier assuming its use in
software-based SDN applications, but it is so simple that it can be realized as a
hardware-based system.
The rest of this paper is organized as follows. After defining our problem in
Section 2, Section 3 reveals deficits of conventional methods. Section 4 designs our
method, and Section 5 elaborates algorithms used in the method. Section 6 reports
the experiments and their results, and Section 7 discusses application aspects.
Section 8 summarizes related work, and Section 9 concludes this paper.
2
Problem Statement
The logical search space defined by the L-bit packet header is denoted by X =
{0, 1}L , and is called the header space [10] (e.g., a square in Fig. 1 represents a
header space of L = 3 + 3). Packet header x corresponds to a point in the header
space, x ∈ X . The protocol-specific semantics of the packet header are ignored in
our method, and a packet header is considered as a flat sequence of bits.
The header space is partitioned into equivalent subspaces, each of which corresponds exclusively to a single behavior (e.g., the bottom square of Fig. 1 has twelve
equivalent subspaces). An equivalent subspace is not necessarily continuous, and
can consist of several blocks, where block is used to specify a continuous subspace.
Let Xi be an equivalent subspace that maps to the i-th behavior, the subspace is
defined as,
Xi = {x ∈ X : fi (x) = >},
where fi (x) is a Boolean function indicating whether a packet with header x follows
the i-th behavior, fi : X → {⊥, >}, and ⊥ is false and > is true. Since equivalent
subspaces are mutually exclusive, Xi ∩ Xj = ∅ (i 6= j), and collectively exhaustive,
S
i Xi = X , they form a partition of header space X , P , as follows,
P = {XI , XII , · · · , X|P | }.
The equivalent subspaces, Xi ’s, and corresponding Boolean functions, fi ’s, are
obtained by utilizing switch actions as follows [8]. Given P1 and P2 as arbitrary
partitions of X , we define operation ⊗ as,
P1 ⊗ P2 = {X ∩ Y : X ∈ P1 , Y ∈ P2 , X ∩ Y 6= ∅}.
1
http://vn.grnoc.iu.edu/Internet2/fib/
6
T. Inoue, T. Mano, K. Mizutani, S. Minato, and O. Akashi
Using this operation, the whole header space is then partitioned into equivalent
subspaces as follows,
O
P =
Pj ,
(1)
j
where Pj is a partition defined by the actions of the j-th switch (e.g., a square in
the middle of Fig. 1)2 . This operation divides the header space into several blocks,
which become combinatorially finer (e.g., the bottom square is much finer than
middle squares in Fig. 1). In other words, intersecting rules of different switches
spawn new blocks to distinguish network-wide behaviors, while rules on a single
switch are simply masked by higher-priority rules if intersected.
Multi-valued function F is defined by,
F : X → {nil, I, II, · · · , |P |},
where F (x) = i if and only if header x is in the i-th equivalent subspace, x ∈ Xi .
A multi-valued function that cannot be “nil” is called a complete multi-valued
function, and can be used as a packet classifier.
Our goal in this paper is to construct an efficient data structure representing a
complete multi-valued function, F , given a set of Boolean functions, fi ’s.
3
Inefficiencies of Conventional Packet Classification
Methods
This section identifies why conventional packet classification methods fail to support network-wide packet behaviors. There are two major conventional approaches;
algorithmic methods [12–16] usually construct a decision tree to represent the
header space, while architectural methods [17–20] often try to reduce the number
of rules to fit them in space-limited TCAMs. These conventional methods commonly depend on the following assumption; the header space is a multi-dimensional
space defined by each header field (not each bit like our definition), and it is covered
by a set of hypercubes, where a hypercube is a convex block defined by a range or
prefix in each field. To resolve possible conflicts among hypercubes, they are given
an overall total order. Normally, hypercubes are specified by a list of “5-tuple”
rules (i.e., source/destination IP, source/destination port, and protocol).
The computational complexity of both conventional approaches depends on the
number of hypercubes, as follows.
• In a decision tree, the root node represents the whole header space, which
is recursively divided at every intermediate node until a few rules are left
2
Note that these partitions, Pj ’s, might be given for each port as well as each switch, depending
on network configuration.
Packet Classification for Global Network View of Software-Defined Networking
7
at a leaf node. The space complexity is known as O(N D ) for O(log N )
classification time [24], where N is the number of hypercubes and D is the
number of header fields. The required space in practice could be smaller than
this worst-case bound, but it still depends on the number of hypercube rules
because the decision tree must be large enough to divide the rule set into
several subsets with a few rules.
• Rule reduction methods reduce the number of hypercube rules by merging
those of the same action into a single large hypercube; in the following example, the left set of hypercube rules is reduced to the right set.
IP destination
0.0.0.0/3
32.0.0.0/3
64.0.0.0/2
128.0.0.0/1
Action
Drop
A1
Drop
Drop
IP destination
32.0.0.0/3
0.0.0.0/0
Action
A1
Drop
The time complexity of TCAM razor, the most-cited rule reduction method,
is roughly considered to be O(DN AL2 ), where A is the number of actions/behaviors, since it performs dynamic programming of O(N AL2 ) [25]
for each header field.
The number of hypercubes, N , is examined for the three real networks (the
statistics are given in Section 6). To our knowledge, there is no technique that can
calculate a set of hypercubes to represent a header space of network-wide behavior,
and so hypercubes are extracted from BDDs of fi ’s.
# of hypercubes
Internet2
35,700
Stanford
652,115,821
Purdue
834,648,394
The numbers for Stanford and Purdue networks are extremely large. This is because Stanford and Purdue include multi-field rules, which incurs the curse of
dimensionality [26]. Internet2 dataset includes single-field rules only, and so it has
a moderate number of hypercubes.
Against these networks, we evaluate three conventional methods: two decision
tree methods, HyperCuts [12] and HybridCuts [16], and one rule reduction method,
TCAM razor [18]. Open-source implementations were available for HyperCuts3
and HybridCuts4 , while our own implementation was developed for TCAM razor.
As shown in the following table, the single success was achieved by HybridCuts for
Internet2; otherwise, the conventional methods exhausted 16 GB memory or did
not finish in a half hour.
HyperCuts
HybridCuts
TCAM razor
3
4
Internet2
fail (memory)
success
fail (timeout)
Stanford
fail (memory)
fail (memory)
fail (timeout)
Purdue
fail (memory)
fail (memory)
fail (timeout)
http://hypercuts.masicek.net/, by Charles University.
http://github.com/lwj4333765/HybridCuts, by the authors of HybridCuts.
8
T. Inoue, T. Mano, K. Mizutani, S. Minato, and O. Akashi
To recap, although the header space of network-wide behavior requires a huge
number of hypercube rules to represent it, the computational complexity of conventional methods is directly dependent on the number. In order to remove this
dependency, our method should be constructed from a compressed representation
like BDDs where individual rules do not need to be examined.
4
Abstract Procedure of Proposed Method
This section describes the abstract procedure of our method; algorithms to implement the procedure are given in Section 5.
Our method, first, builds incomplete multi-valued function Fi from the Boolean
function of the i-th behavior, fi . Function Fi maps the header space to just the
i-th behavior, as follows,
Fi : X → {nil, i},
where Fi (x) = i if fi (x) = >, or Fi (x) = nil otherwise.
Since two equivalent subspaces, Xi and Xj (i 6= j), are mutually exclusive, two
incomplete multi-valued functions, Fi and Fj , never define behaviors for the same
packet header; i.e., ∀x ∈ X , Fi (x) = nil ∨ Fj (x) = nil. These two incomplete multivalued functions can be unified into a single one without conflict, by introducing
the following unification operation,


if Fj (x) = nil,
Fi (x)
Fi (x) ] Fj (x) = Fj (x)
(2)
if Fi (x) = nil,


undefined otherwise.
This operation is clearly associative and commutative.
Performing this operation over multi-valued functions of all the behaviors yields
the multi-valued function of classifier, F (x), as follows,
F (x) =
|P |
]
Fi (x).
(3)
i=I
Since the equivalent subspaces are collectively exhaustive, F (x) is complete, that
is, ∀x ∈ X , F (x) 6= nil.
To incrementally update classifier F , we consider that rules of another multivalued function, F 0 , would be inserted above the original rules. More formally,
F (x) is overwritten by F 0 (x) just in the subspace X 0 in which F 0 (x) is defined,
∀x ∈ X 0 , F 0 (x) 6= nil. This update operation is defined as,
(
F 0 (x) if F 0 (x) 6= nil,
F 0 (x) F (x) =
(4)
F (x) otherwise.
Our method, first, builds an incomplete multi-valued funcFigure 2 presents an example
B. Procedure
B. Procedure
oftion,
Proposed
B. Procedure
Proposed
Method
Proposed
Method
A. Decision
A. Decision
Diagrams
A.
Decision
Diagrams
Fiof
(x),
fromofthe
BooleanMethod
function of i-th behavior,
fi (x).
A BDD
[20]Diagrams
is an acyclic dire
The
multi-valued
function,
F
(x),
maps
the
header
space
only
node
and
two
terminal
nodes,witfa
i
Our method,
Our method,
first,Our
builds
method,
first,anbuilds
incomplete
first,anbuilds
incomplete
multi-valued
an incomplete
multi-valued
func-multi-valued
funcA BDDfunc[16]is
A BDD
anA
[16]is
acyclic
BDDan[16]is
directed
acyclic
angraph
directed
acyclic
dg
i-th
as the
follows,
terminal
node
is
labeled
as
b-th
tion, Fi (x),
tion,from
Fito
(x),
tion,
the
from
Boolean
Fbehavior,
the from
Boolean
function
Boolean
function
of i-th behavior,
function
of i-th behavior,
fof
i-th behavior,
fnode
(x).
i (x),
i (x).
i (x).andf
i
node
two and
terminal
node
two and
terminal
nodes,
twofalse
terminal
nodes,
? and
false
nodes,
true
?
it hasisby
two
0-ch
The multi-valued
The multi-valued
The
function,
function,
Fi (x),
maps
function,
F
themaps
header
Fi (x),
thespace
maps
header
only
thespace
header
only
spaceterminal
only isand
terminal
node
terminal
labled
node
labled
node
a labeled
bitisby
number
labled
aarcs,
bit by
in
numb
[1,
a
Fmulti-valued
{nil,
i},i (x),
i :X !
indicates
the
b-th
bit
is
0
or
1;
to i-th behavior,
to i-th behavior,
as
to follows,
i-th behavior,
as follows,
as follows,
two labeled
twoarcs,
labeled
0-child
twoarcs,
labeled
and
0-child
1-child,
arcs,and
0-child
each
1-child,
and
of wh
e1
where Fi (x) = i if fi (x) = >, or Fi (x) = nil otherwise.
be
both
0is
and
11;is
(i.e.,
abit1;
“don’t
the
bit
number
the
bit
is
number
the
0
or
bit
1;
number
if
0
a
or
bit
is
if
0
skipped,
a
or
is
if
sk
ita
Fi : X !F{undef,
!
F
i},
{undef,
:
X
!
i},
{undef,
i},
i :X
i
Since two different subspaces, Xi and Xj (i 6= j), are root to a terminal corresponds
1 (i.e., a 1“don’t
(i.e., acare”
1“don’t
(i.e.,
bit).
acare”
“don’t
A path
bit).
care”
from
A path
bit).
the from
roo
A
exclusive,
multi-valued
functions,
at
the to
end
of the
path
indi
where Fiwhere
(x) = F
imutually
if(x)
fi (x)
= Fi i=
if(x)
>,
fi =
(x)
ori =
F
ifitwo
(x)
>,
fi (x)
or
=incomplete
=
F
undef
(x)or
=otherwise.
Fundef
otherwise.
undef
otherwise.
iwhere
i >,
i (x) =
corresponds
corresponds
to terminal
packet
corresponds
toheader
packet
x,
header
packet
and
x,header
termi
and
F
(x)
and
F
(x),
never
define
behaviors
for
the
same
Packet Classification for Global
Network
View
of
Software-Defined
Networking
height
(the
longest
i
j
Since two
Since
different
two
Since
different
subspaces,
two different
subspaces,
Xi and
subspaces,
Xij and
(i 6=Xij),
(iare
6=Xjj),
6=packet
j),
j and
of(iare
path
indicates
of are
path9maximum
indicates
of
thepath
value
indicates
the
of value
f (x).
the
of
A
value
fBDD
(x). pa
of
A
is
header;
8x
X,F
= multi-valued
nil _functions,
Fj (x)
=functions,
nil. These
two for a function and bit order. The
i (x)incomplete
mutuallymutually
exclusive,
mutually
exclusive,
twoi.e.,
incomplete
exclusive,
two2 incomplete
multi-valued
two
multi-valued
functions,
a function
a function
and bita order.
function
and bit order.
and bit order.
incomplete
multi-valued
functions
canthe
be same
unified
into
a packet
one number of non-terminal nodes
Fi (x) and
FiF
(x)
and
Fnever
(x)
and
define
never
Fj (x),
behaviors
define
neverbehaviors
for
define
the behaviors
same
for
packet
for the
packet
same
j (x),
iF
j (x),
A MDD
A[17],
MDD
[18]is
A[17],
MDD
also
[18]is
an
[17],
acyclic
also
[18]is
andirected
acyclic
also a
confliction,
by
introducing
the
following
unification
|fcan
(x)|
if
BDD
represents
header; i.e.,
header;
8xwithout
i.e.,
header;
2
X
8x
,
F
i.e.,
(x)
2
X
8x
=
,
F
undef
2
(x)
X
=
,
F
_
undef
(x)
F
(x)
=
_
=
undef
F
undef.
(x)
_
=
F
undef.
(x)
=
undef.
i
i
i j
j
j single root,
single
butby
root,
itsingle
buthave
root,
it the
can
more
but
have
it than
can
more
two
have
tha
te
Functions
Representations
operation,
A
[22]
is
also
These two
These
incomplete
two
These
incomplete
multi-valued
two incomplete
multi-valued
functions
multi-valued
functions
can be functions
melded
can be melded
can
be I,
melded
undef,
undef,
II, · · · ,I,|P
undef,
II,|. ·MDD
Each
· · ,I,|P
II,
non-terminal
|.[21],
·Each
· · , |P
non-termina
|. Each
node
nois
8
ait single
but
itcan
can
into a one
intowithout
a one
intoconfliction,
without
a one confliction,
without
by introducing
confliction,
by introducing
the
byfollowing
following
the following
aggregated
aggregated
bits,with
and
aggregated
bits,
canand
have
bits,
itroot,
can
more
and
have
itthan
more
two
hah
ifintroducing
Fthe
j (x) = nil,
<Fi (x) BDDs
Construction fI
fII
f|P| >
...
nodes,
II,
·
·
·
,
|P
|.
Each
K
K nil, I,K
operation,operation,
operation,
0, 1, · · · ,0,
2 1, · · 1.
· ,0,2Other
1, · · ·1.
properties
, 2Other 1.
properties
are
Other
same
prope
are
wi
Fi8
(x) ] Fj8
(x) = 8
Fj (x)
if Fi (x) = nil,
by aggregated bits, and it can
>
:>
In
our method,
In our method,
aIn BDD
our method,
ais BDD
used ais
to BDD
used
represe
is
t
>
>
F
(x)
F
(x)
if
F
F
(x)
(x)
if
=
F
undef,
(x)
if
=
F
undef,
(x)
=
undef,
K
i
i
i
j
j
j
undefined
otherwise.
<
<
<
arcs, 0, 1, · · · , 2
1. The max
function function
of a header
function
of aspace
header
of
mapped
aspace
header
to
mapped
aspace
single
to
map
pa
a
FiF
(x)
FjiF(x)
]operation
Fji...
(x)
]
= FisF
= ifFFj (x)
F
F|P|j(x)
=F
undef,
= Fundef,
j (x)
properties are same with the BD
j (x)
i (x) if
i (x) if
i (x) = undef,
I ] F
II =
This
>
>
>
while is
a MDD
used
whiletois
a express
MDD
used toisaexpress
used
multi-valued
to aexpr
mu
:
: associative
: and commutative. while a MDD
In our method, a BDD is used
not
available
not
available
otherwise.
not
available
otherwise.
otherwise.
Performing
this operation over all the incomplete
headermultispace
header
of space
packet
header
ofclassifier.
space
packetofclassifier.
Figure
packet 2classifi
Figu
show
Unification
tion that maps the header space
functions,
Fassociative
the and
multi-valued
function
i (x)’s,
MDDofofclasMDD
Fig. 1.ofMDD
Fig. 1.of Fig. 1.
This operation
This operation
isvalued
This
associative
operation
is associative
andiscommutative.
and commutative.
commutative.
while a MDD is used to expre
sifier,
Fthis
(x), operation
isover
obtained
as follows,
F
Performing
Performing
this
operation
Performing
thisalloperation
over
the
incomplete
all MDDs
over
the incomplete
all multithe incomplete
multi- multipacket
classifier.
Multi-Valued
Multi-Valued
Decision
B. Multi-Valued
Decision
DiagramDecision
Diagram
Constructio
Dia
Co
|Pthe
| multi-valued
valued functions,
valued functions,
valued
Fi (x)’s,
functions,
F
the
multi-valued
Fthe
multi-valued
function
function
of clas- function
ofB.clasofB.clasi (x)’s,
i (x)’s,
Bit
aggregation
]
sifier, F (x),
sifier,is Fobtained
(x),
sifier,is Fobtained
as(x),
follows,
is obtained
asFfollows,
(x) =as follows,
Fi (x).
(2)The
Construction
The BDD
of B.
fBDD
The
is
of easily
fBDD
of
is and
easily
fi (x)Updates
converte
istoeasily
the oM
i (x)
i (x) converted
F(K)
i=I
by replacing
by replacing
?-grams
and
by replacing
>-terminals
?- and >-terminals
?- with
and >-termin
undefwitha
|P |
|P |
|P |
]
]
]
respectively.
respectively.
The
respectively.
of
BDD
size
of
fBDD
(x)
is
of obviou
fBDD
Since
the =subspaces
collectively
exhaustive,
F (x)
F (x)
Fi (x).
= are
F (x)
F
=
Fi (x).
(1) F (x)
(1)is com-(1) size
isize
i (x)
TheThe
BDD
of The
fi (x)
is easily
co
i (x).
that of MDD
that ofofby
MDD
F
that
(x)ofofinMDD
Fterms
ofin
of
Fterms
the
number
in
of term
the
(new
rule)
plete,
that
is,
8x
2
X
,
F
(x)
=
6
nil.
i=I
i=I
i=I
ireplacing
i (x)
i (x)
?and
>-termina
Update
f ’
BDDs
diagram,
diagram,
that is,respectively.
diagram,
|fthat
is,=|fthat
|F
(x)|
is,time
=|fwhere
|F
=|A(
|F
w
In
order
toare
incrementally
update
weisconi (x)|
iThe
i (x)|,
i (x)|
i (x)|,
complex
Since theSince
subspaces
theSince
subspaces
are
the
collectively
subspaces
collectively
exhaustive,
are collectively
exhaustive,
F (x)classifier
exhaustive,
is comF (x)Fis(x),
comF (x)
comdiagram
of diagram
representing
of diagram
representing
function
representing
A(x).
The
funct
A(x)
spa
sider
F (x)
overwritten
byundef.
an incomplete of
multi-valued
The
MDD
size
isfunction
obviously
sa
plete, thatplete,
is, 8x
that
2
plete,
is,
X that
, 8x
Fthat
(x)
2
is,
X
6=, 8x
undef.
Fis(x)
2X
6=, undef.
F (x) 6=
0
this
step
ofconthis
is O(|f
step
of
this
is O(|f
while
(x)|),
is O(|f
thewhile
time
the
compl
whi
tim
function,
F 0 (x),
if classifier
itupdate
is defined,
(x)
6= wenil.
Roughly
i (x)|),
i (x)|),
|F
=step
|f
i (x)|
ii(x)|.
In order In
to order
incrementally
In
to order
incrementally
to
update
incrementally
classifier
F
update
(x),Fwe
classifier
F (x),
conFof(x),
conwe
0
F
’
constant,
O(1).
constant,
O(1). O(1).
speaking,
rules
of
F
(x)
would
be
inserted
above
thoseconstant,
of
Reference
[21] defines an effi
sider thatsider
F (x)that
issider
overwritten
F (x)that
is overwritten
F (x)
by isanoverwritten
incomplete
by an incomplete
by
multi-valued
an incomplete
multi-valued
multi-valued
0
0 MDDs
0 by,
Reference
Reference
[17]
Reference
[17]an
defines
efficient
[17]an
defines
efficient
algorithm
an ea
(x).
This
operation
defined
thatdefines
performs
arbitrary
operation
Bitfunction,
function,
Faggregation
(x),Ffunction,
if
F it0 (x),
is defined,
ifFupdate
it0 (x),
is defined,
F
if 0 (x)
it is6=defined,
Fis
undef.
(x)
6=FRoughly
undef.
(x) 6= Roughly
undef.
Roughly
0(
that of
performs
thatof
performs
arbitrary
that performs
arbitrary
operations
operations
overmethod
two
operati
over
MDD
algorithm
allowsarbitrary
our
to
speaking,speaking,
rules ofspeaking,
rules
F 0 (x)ofwould
rules
F 0 (x)of
bewould
Finserted
(x)Fbe0would
inserted
above
be
those
above
of
those
above
those
0inserted
(x) if F (x) 6= nil,algorithmalgorithm
0 (K)
(K)
allows
our
allows
method
our
allows
to
method
apply
our to
operatio
method
apply
F’
F(x)
to algorithm
multi-valued
functions,
so as
FThis
(x)operation
=is by,
(3)
F (x). This
F (x).
update
This
F operation
(x).
update
update
is F
defined
operation
definedis by,
defined by,
multi-valued
functions,
multi-valued
functions,
so as
tofunctions,
calculate
sosize
as tocan
calcula
so
(1)be
asantq
(
(
(F (x) otherwise. multi-valued
worst-case
MDD
0
0 0
0
0
F
(x)
if
F
F
(x)
(x)
if6=
F 0Fundef,
(x)
(x) if6= Fundef,
(x) 6= undef,
MDD of
canMDD
be quite
of
canMDD
large
be quite
can
withlarge
beoperations;
quite
withlarge
operat
i.e.,
wi
update
operation
is in
equivalent
toofunification
Figure 2: Flowchart ofF 0classifier
and
our(2)method.
(x) F 0Note
(x)
(x)construction
= that
F 0(x)
(x)
= F (x)
= update
(2)
(2) |A(x)3B(x)|  |A(x)||B(x)|,
shows
|A(x)3B(x)|
shows
|A(x)3B(x)|
shows

|A(x)||B(x)|,
|A(x)3B(x)|

|A(x)||B(x)|,

where
|A(x)
A
F
(x)
otherwise.
F
(x)
otherwise.
F
(x)
otherwise.
multi-valued
functions
and
3
is
operation ] when the latter is defined, and so update operation
arefunctions.
multi-valued
are multi-valued
functions
are the
multi-valued
functions
and
3 is
functions
and
arbitrary
3 isanao
ever,
size
is
usually
conside
can
be
applied
to
unify
incomplete
multi-valued
Note thatNote
operation
thatNote
operation
that
is equivalent
operation
is equivalent
to operation
is equivalent
to operation
] when
to operation
]
thewhen ]
thewhen the
ever,
the ever,
size
is
the
usually
ever,
size
is
the
considerably
usually
size is considerably
usually
consiths
case
upper
bound,
withsmaller
somethi
For sothe
lookup
acceleration,
multi-valued
function
inthe
latter isdefined,
latter
is defined,
and
latter
isoperation
defined,
and
so unification
operation
and
can
so operation
be applied
canaoperation
be
to
applied
can
meld
be to
applied
meld
to meld
Note that update operation
is equivalent
to
]
casewhen
upper
case
bound,
upper
case
with
bound,
upper
something
with
bound,
something
like
with|A(x)|
some
like
[41];
in
our
experiments,
the
siz
which
consecutive
K
bits
are
aggregated
into
a
single
variable
incomplete
incomplete
multi-valued
incomplete
multi-valued
functions.
multi-valued
functions.functions.
in our experiments,
in our total
experiments,
in our
the experiments,
size
was
the size
never
the
was
greater
size
neve
w
latter operation is defined,
but
this
paper
distinguishes
them
for
clarity.
size
of
operands.
These
ob
is
defined
as
follows,
For the For
lookup
the For
acceleration,
lookup
the acceleration,
lookup
a multi-valued
acceleration,
a multi-valued
function
a multi-valued
function
in
function
in
in
size of operands.
size oftheorem.
operands.
size
Our of
method
operands.
Our ismethod
established
Our ismethod
estab
on
L
(K)
Kinto
whichpacket
consecutive
which classification
consecutive
which
KFbits
consecutive
are
aggregated
bits
are
single
into
variable
a single
single
variablein
K
! {nil,
I,into
II, variable
·a·theorem.
· function
, |P |}.
:K
{0,
1,process,
· are
· ·K, aggregated
2bits
1}
In order to accelerate the
aa aggregated
multi-valued
theorem.theorem.
is definedis as
defined
follows,
is as
defined
follows,
as follows,
Theorem 1. Assume that the s
(K)
(K)
which continuous K bits are aggregated
into
a
single
variable
is
defined
as
follows,
Necessarily,
F LK (x) =LK
F (x), 8x
uses FTheorem
L 2 X . This paper
Theorem
1. Assume
Theorem
1. that
Assume
the
1. that
size
Assume
of
theMDD
that
size the
ofaf
K
K
KK
K
K
K
! I,{undef,
· · ,I,
{undef,
|PII,|}.
· · · and
,I,|P
II,|}.
· · · ,simply
|P |}. an operation is usually less than
F : {0,F1,only
· ·:·{0,
, when
2F1, · ·:it
1}
·{0,
,has
2 1,!
· · {undef,
·1}
, 2distinguished
1}II, · !
to
be
by
K,
uses
an
operation
an
operation
is
usually
an
operation
is
less
usually
than
is
less
usually
or
equal
than
less
or
to
eq
th
operand
MDDs,
L
KF otherwise.
Let1,X· ·2Let
2the
Let
X1}
input
X
beK2the
of
X input
this
be
the
aggregated
ofI,
input
this
variable
repre- variable
repreF (K) : {0,
· X, 2Xbe
−
→
{nil,
II,aggregated
·of·variable
·this
, |Paggregated
|}.
operand repreoperand
MDDs, operand
MDDs, MDDs,
|A(x)3B(x)|  |A(
sentation,sentation,
we have
sentation,
we
F Khave
(X)
F
=AKLGORITHMS
Fhave
(X)
(x) F
=
ifKX
F(X)
(x)
isINequivalent
=
ifP ROPOSED
F
X(x)
is equivalent
if to
XMx.isETHOD
equivalent
to x. to x.
IV.we
1
1
1
|A(x)3B(x)|
|A(x)3B(x)|

|A(x)|
|A(x)3B(x)|
+|A(x)|
|B(x)|,
+|A(x)|
|B(x)|,+
(K)
(K)
Obviously,
F∈ ⌘
F. .FThis
⌘ F .paper
F ⌘ F . uses F
Necessarily, F (x) = FObviously,
(x), ∀x
XObviously,
onlyBoolean
whenfunctions,
it has tothe time complexity of the opera
Given a set
of BDDs representing
given
by,
IV. A
IV.
A LGORITHMS
IV.OFAPLGORITHMS
ROPOSED
OF P ROPOSED
Min
OF
ETHOD
P ROPOSED
M
ETHODconstruct
M ETHOD
fiotherwise.
(x)’s,
algorithms
proposed
this
section
MDDs
the time
the
andtime
space
the
and
complexity
time
space
and
complexity
space
of the complexity
operation
of the o
be specified by K, or uses simply
FLGORITHMS
(K)
of
F
(x)
and
F
(X).
After
reviewing
decision
diagrams
in
K
Given a Given
set ofaGiven
BDDs
set ofarepresenting
BDDs
set of representing
BDDs
Boolean
representing
Boolean
functions,
Boolean
functions,functions,
K
K
K
O(2
(|A(x)|
+
(|A(x)|
O(2 +(|A(x)|
|B(x)|)).
O(2 +(|A(x)|
|B(x)|)).
+ |B(x)|))
Last, the constructionfi (x)’s,
and algorithms
procedures
method
areconstruct
summarized
Section
IV-A,
Section
IV-B
constructs
the
MDD
from O(2
a set
of
fupdate
fi (x)’s,
proposed
algorithms
proposed
in
this of
section
proposed
in our
this
construct
section
in this construct
section
MDDs
MDDs
MDDs
i (x)’s, algorithms
K Section
K IV-C accelerates lookup operations on the
BDDs,
The time complexity of CAS
FK
of
(X).
andF (x)
Fand
After
(X).
andreviewing
FAfter
(X).
reviewing
decision
After reviewing
decision
diagramsdecision
diagrams
in
diagrams
in complexity
in complexity
as a flowchart in Fig. 2. of F (x) ofandF (x)
The
The
isThe
defined
complexity
isby
defined
the number
is by
defined
the of
nub
MDD.
number of nodes and that of chi
Section IV-A,
Section
Section
IV-A,
Section
IV-B
Section
IV-A,
constructs
IV-B
Section
constructs
aIV-B
MDDconstructs
from
a MDD
a set
afrom
MDD
of a of
set
from
of
a
set
of
children
of children
per node.
of children
per node.per node.
BDDs, and
BDDs,
Section
and
BDDs,
IV-C
Section
and
accelerates
IV-C
Section
accelerates
lookup
IV-C accelerates
operations
lookup operations
lookup
on the operations
on the on
the Theorem
Theorem
From
the following
Theorem
1, the following
lemma
1, the follow
islem
gi
• In classifier construction,
Boolean functions mapped to behaviors,From
fi ’s,
are 1,From
MDD. MDD. MDD.
converted to multi-valued functions, Fi ’s, which are unified into the multivalued function of classifier, F , and further converted to F (K) by bit aggregation.
• To incrementally update classifier F (K) , new rule f 0 is converted into F 0(K)
through bit aggregation, and then inserted to the classifier.
5
Algorithms in Proposed Method
Given a set of BDDs representing Boolean functions, fi ’s, which are obtained using existing techniques like [8], the algorithms proposed in this section construct
10
T. Inoue, T. Mano, K. Mizutani, S. Minato, and O. Akashi
MDD of F(2)
BDD of fI
1
0
0
1,2
0,1
1
21
0
1
2
3
32
3,4
2,3
43
43
5
4
5
4
3,4
2,3
3 2 0
5
4
1
5,6
4,5
2
1
3
5,6
4,5
3,4
2,3
0
0
5,6
4,5
5,6
4,5
3,4
2,3
2
1
3
0
5,6
4,5
1 3
5,6
4,5
2
5,6
4,5
5,6
4,5
6
5
3
1
0
2
3
III
0
1
0 1
2
II
3 2
1I
0
4
IV
3
2
5
V
1
1
6
VI
2 0
3
7
VII
1 2 3
0
8
VIII
3
1
10
X
2
0
0
23
9
IX
1
0
3
11
XI
1 2
12
XII
Figure 3: BDD (left) and MDD of K = 2 (right). The BDD represents the Boolean
function mapped to behavior “I” of Fig. 1, while the MDD represents the multi-valued
function mapped to all twelve behaviors of Fig. 1. Since consecutive two-bits are aggregated
in the MDD, non-terminal nodes are labeled by the two-bits while arcs are labeled by 0- to
3-child. A packet header of x = 010111 is indicated by the red paths in the both diagrams.
the MDD of F (K) . After reviewing decision diagrams in Section 5.1, Section 5.2
analyzes the unification and update operations performed on MDDs. Section 5.3
introduces the bit aggregation algorithm, and Section 5.4 proposes a search algorithm for the packet classification.
5.1
Decision Diagrams
As shown in Fig. 3 (left), a BDD [21] is an acyclic directed graph with a single root
node and two terminal nodes, ⊥ and >. Each non-terminal node is labeled as b-th
bit in the header, b ∈ [0, L − 1], and it has two labeled arcs, 0-child and 1-child,
each of which indicates the b-th bit is 0 or 1. A path from the root to a terminal
corresponds to a packet header, x, or a set of headers (a hypercube) when some
bits are skipped (e.g., the red BDD path in Fig. 3, on which the fifth bit is skipped,
expresses a hypercube of 01011∗). The terminal node at the end of path indicates
the value of f (x). The maximum height, or the longest path, is L. Common prefix
and suffix are shared among paths for compression (e.g., two paths, 00 ∗ 11∗ and
01111∗, share their prefix 0 and suffix 11∗ in Fig. 3); it is believed that BDDs are
well compressed for most practical functions [27]. The size of BDD is defined as
the number of non-terminal nodes in it, and the size is denoted by ||f || if the BDD
represents function f .
An MDD [22], which is shown in Fig. 3 (right), is also an acyclic directed graph
with a single root, but it can have more than two terminal nodes, nil, I, II, · · · , |P |.
Each non-terminal node is labeled by aggregated bits, and it can have more than
two children arcs, 0, 1, · · · , 2K − 1. The maximum height is L/K. Other properties
are the same as those of BDD.
Packet Classification for Global Network View of Software-Defined Networking
11
In our method, a BDD is used to represent a Boolean function that maps the
header space to a single packet behavior, while an MDD is used to express a multivalued function. BDD of fi (x) is easily converted to MDD of Fi (x), by replacing
⊥- and >-terminals with nil- and i-terminals, respectively. The time complexity of
this operation is clearly constant, O(1). MDD size is obviously the same as BDD
size, i.e., ||Fi || = ||fi ||.
5.2
5.2.1
Unification and Update Operations
CASE Algorithm
Reference [22] defines an efficient algorithm named CASE that performs arbitrary
operations over two operand MDDs and constructs the resulting MDD. This CASE
algorithm allows our method to apply operations ] and to multi-valued functions
so as to calculate (2) and (4).
The worst-case MDD size can be quite large, ||A3B|| ≤ ||A|| · ||B||, where A
and B are multi-valued functions and 3 is an arbitrary operator. However, the
size is usually considerably smaller than this worst-case upper bound, closer to
||A|| + ||B|| [28]; in our experiments, the size was never greater than the total
size of operands, though impractical counter examples can be considered. This
observation yields the following assumption.
Assumption 1. The size of MDD performing a single operation is less than or
equal to the total size of operand MDDs,
||A3B|| ≤ ||A|| + ||B||.
(5)
The time complexity of CASE algorithm is defined by the number of nodes and
that of children per node [22], and so it is fixed with Assumption 1 as follows,
O(2K (||A|| + ||B||)),
(6)
where the algorithm involves ||A|| + ||B|| node creations, at each of which 2K
children are set. Note that A and B must share a common K to apply CASE
algorithm.
5.2.2
Unification
The MDD size and time complexity of (3) is analyzed. Using (5), the MDD size is
bounded as follows,
||F || ≤
|P |
X
i=I
||Fi ||.
(7)
tion,
(x),
thefollows,
Boolean
function
of i-th behavior, fi (x). two
nodelabeled
and
terminal
nodes,
false
?each
and
>.
to
i-thFito
behavior,
i-thfrom
behavior,
to as
i-th
behavior,
as follows,
as follows,
twotwo
labeled
arcs,
two
0-child
arcs,
labeled
0-child
and
arcs,
1-child,
and
0-child
1-child,
andoftrue
1-child,
each
which
of Each
repre
each
whic
The multi-valued function, Fi (x), maps the header space only the
terminal
node
isislabled
by
aabit
number
inaskipped,
[1,
L],
and
bit
the
number
bit
number
the
0
bit
or
number
is
1;
0
if
or
1;
is
bit
0
if
is
or
a
skipped,
bit
1;
if
is
bit
it
can
is
skipp
be
it c0i
Fi behavior,
: X F!i :{undef,
Xas!
Ffollows,
: i},
X ! {undef,
i},
i},
i {undef,
to i-th
two
labeled
arcs,
0-child
and
1-child,
each
of
which
repre
1 (i.e., 1a (i.e.,
“don’t
a1 “don’t
care”
(i.e., abit).
care”
“don’t
A bit).
path
care”
Afrom
path
bit).the
from
A root
path
theto
from
root
a term
tht
the
bit corresponds
number
is 0toorpacket
1; iftoaheader
bit and
is skipped,
can
bethe0
whereFiwhere
F:i (x)
=
Fwhere
i(x)
if f=iF(x)
ii i},
(x)
if=f=
(x)
i or
if=fF>,
(x)
or==
F>,
(x)or=Fundef
otherwise.
undef corresponds
otherwise.
i{undef,
i>,
ii(x)
iundef
i (x) =otherwise.
tocorresponds
packet
header
packet
x,
x,
header
theand
terminal
the
x,it and
termina
atthe
X !
1are
aindicates
“don’t
care”
bit).
path
from
theofA
root
to aAisterm
Since two
Sincedifferent
two
Since
different
subspaces,
two different
subspaces,
Xi subspaces,
andXiXjand(iXX
6=
j),
(i 6=
are
Xj j),(iof
6=(i.e.,
j),of
are
i j and
path
path of
indicates
path
the value
indicates
theAof
value
f (x).
theof
value
Af (x).
BDD
fisBDD
(x).
canonica
BD
c
where mutually
Fi (x)
=mutually
i exclusive,
if ftwo
=
>, orincomplete
Ftwo
= undef
otherwise.
i (x)exclusive,
i (x)
toabit
packet
header
x, order.
and the terminal at the
mutually
exclusive,
incomplete
two
multi-valued
incomplete
multi-valued
functions,
multi-valued
functions,
functions,
acorresponds
function
a function
and
function
and
order.
bit
and
order.
bit
Since
two
different
Xi behaviors
andforXthe
6=thej),
j (i
of Apath
indicates
the
value
ofanalso
f[18]is
(x).anAalso
BDD
canonic
Fi (x)
and
Fi (x)
Fjand
(x),
Fi (x)
Fnever
andsubspaces,
define
never
Fj (x),define
behaviors
never
define
behaviors
for
same
packet
same
forare
thepacket
same
packet
j (x),
MDD
A [17],
MDDA
[18]is
[17],
MDD
also
[18]is
[17],
acyclic
acyclic
directed
an isacyclic
directed
graph diw
g
mutually
exclusive,
two
multi-valued
functions,
asingle
function
and
bit
order.
header;
header;
i.e.,
8x
header;
i.e.,2 8x
X
,i.e.,
F2iincomplete
(x)
X
8x, F=2
undef
X ,=
Fi (x)
undef
_ Fj=(x)
_undef
F=
undef.
_=
Fj (x)
undef.
=
undef.
i (x)
j (x)
single
root, but
root,
single
it can
butroot,
ithave
can
butmore
have
it can
than
more
have
two
than
more
terminal
twothan
term
nt
Fi (x) and
Fincomplete
(x),
never
define
behaviors
forfunctions
the
packet
j two
MDD
[18]is
also
acyclic
graph
wb
These
two
These
These
incomplete
two
multi-valued
incomplete
multi-valued
functions
multi-valued
cansame
be
functions
can
melded
be melded
canundef,
beA melded
undef,
I, Akashi
II, [17],
· · I,
·undef,
,II,
|P
·|.· ·Each
I,, |P
II,
|.·non-terminal
·Each
·an, |P
non-terminal
|. Eachdirected
node
non-terminal
isnode
labeled
is nla
12
T.
Inoue,
T.
Mano,
K.
Mizutani,
S.=Minato,
and O.
header;
i.e.,
8x
2
X
,
F
(x)
=
undef
_
F
(x)
undef.
i
j
single
root,bits,
but
itbits,
can
have
two
terminal
n
into a into
one without
a one
intowithout
aconfliction,
one without
confliction,
byconfliction,
introducing
by introducing
bytheintroducing
following
the following
theaggregated
following
aggregated
aggregated
and
it can
and
bits,
have
it more
can
and
more
have
itthan
can
than
more
have
two
than
more
children
twothach
These operation,
two incomplete
K |. Each
K non-terminal node is labeled
undef,
I,, 1,
II,
,21,
|P
operation,
operation,multi-valued functions can be melded 0,
1, · · ·0,
2K· ·· ·· ·0,
,1.
Other
· · · 1., 2properties
Other
1.properties
Other
are same
properties
arewith
sameare
the
with
sam
BD
8
8 by introducing
8
into a one without confliction,
the following aggregated bits, and it can have more than two children
In our Inmethod,
our In
method,
aour
BDD
method,
a is
BDD
used
a isBDD
toused
represent
istoused
represent
atoBore
>
Fj (x)
if =
(x)if=Fundef,
i (x)
j (x) = undef, K
<Fi (x)>
<Fi (x)>
<Fif
operation,
F(x) Fjundef,
0, 1, · · function
· ,of
2 a header
1.
area single
same
BD
function
function
of Other
a header
space
ofproperties
amapped
space
headermapped
to
space
mapped
to awith
packet
single
tothe
apack
beha
sin
8
(F
F
)
(F
F
)
Fi (x)
FiF(x)
]F=
Fij(x)
(x)
FXI
= Fif
(x)
Fj (x)
Fi (x)
if =
Fiundef,
(x)if=Fundef,
IV ]
V
VIF]
j (x)
j (x)
j=
j (x)
i (x) = undef,
In
our
method,
a
BDD
is
used
to
represent
a
Bo
>
>
>
while
a
while
MDD
a
while
is
MDD
used
a
is
to
MDD
used
express
is
to
used
express
a
multi-valued
to
express
a
multi-valued
a
function
multi-v
f
F
(x)
if
F
(x)
=
undef,
i :
j
:
:
<not
available
not available
not
otherwise.
available
otherwise.
otherwise.
function
of
a
header
space
mapped
to
a
single
packet
beh
F
header
header
space
of
space
header
packet
of
space
packet
classifier.
of
classifier.
packet
Figure
classifier.
Figure
2
shows
2
Figure
shows
a
BDD
2
F
Fi (x)
Fj (x)
if Fi (x) = undef,
IV ] FjV(x) =
while aof
MDD
isFig.
used1.
express
:associative
MDD
MDD
Fig.of1.
MDD
oftoFig.
1. a multi-valued function
This operation
This operation
This
is associative
operation
is>
and
is
associative
commutative.
and
commutative.
and
commutative.
not available otherwise.
headermultispace of packet classifier. Figure 2 shows a BDD
Performing
Performing
this
Performing
operation
this operation
this
overoperation
allover
the all
incomplete
over
the all
incomplete
themultiincomplete
multiMDD
of
Fig.
1.
This
operation
is
associative
and
commutative.
B.
Multi-Valued
B.
Multi-Valued
B. Decision
Multi-Valued
Decision
Diagram
Decision
Diagram
Construction
Diagram
Construction
Const
F
F
F
F
F
F
F
F
F
F
F
F
valued valued
functions,
functions,
valued
FVi (x)’s,
functions,
Fithe
(x)’s,
Fthe
(x)’s,
multi-valued
thefunction
multi-valued
ofII clasfunction
iVII
IV
VI
XI multi-valued
VIII
XII
I function
IIIof clasX ofIXclasPerforming
this
operation
over
all as
thefollows,
incomplete multisifier,
Fsifier,
(x), is
Fsifier,
(x),
obtained
isFobtained
(x),
as is
follows,
obtained
as follows,
BDD
TheofBDD
fiThe
(x)ofis
BDD
fieasily
(x)Diagram
ofis fconverted
easily
converted
easily
to theconverted
MDD
to the of
MD
toF
i (x) is
||Ffunctions,
6 6
6 6
7 7 7function
9 9
9 9B. The
10
Multi-Valued
Decision
Construction
valued
Fi (x)’s,
the multi-valued
of clasi|| = by
replacing
by
replacing
?by
and
replacing
?>-terminals
and
?>-terminals
and
with
>-terminals
undefwith
undefand
with
i-term
and
und
|P |
|P |
|P |
]
]
]
sifier, F (x), is obtained as follows,
The respectively.
BDD
fisize
(x)The
is
theofisMDD
ofis Fo
The
respectively.
ofeasily
size
BDD
The
ofconverted
ofBDD
size
fi (x)
ofofis
BDD
ftoiobviously
(x)
fobviousl
same
F (x) =F (x)F=i (x).
F (x)Fi=
(x). Fi (x).
(1) respectively.
(1)
(1) of
i (x)
Figure 4: Huffman tree representing
the
optimal
calculation orderthat
ofreplacing
(3)
for
Fig.
1.of
Leaf
by
?and
>-terminals
with
undefand
i-term
|P |
of
that
MDD
of
of
MDD
that
F
of
(x)
MDD
in
F
(x)
terms
of
in
F
of
terms
(x)
the
in
of
number
terms
the
number
of
of
the
nodes
nu
o
]
i=I
i=I
i=I
i
i
i
nodes are MDDs of Fi ’s, while
nodes represent ] operations.
The
MDD
sizes,
respectively.
The
size
of that
BDD
of
fii(x)|
(x)
is=obviously
same
F (x)internal
=
Fi (x).
(1) diagram,
diagram,
that
is,
diagram,
that
|f
(x)|
is,
|f
=
(x)|
|F
is,
(x)|,
=
|f
|F
where
(x)|,
|F
where
|A(x)|
(x)|,
|A(x)
is
where
the
i
i
i
i
i
Since the
Since
Since
subspaces
are
the collectively
subspaces
are collectively
are
exhaustive,
collectively
exhaustive,
F (x)
exhaustive,
isFcom(x) is Fcom(x) is com||Fi ||’s, are shown
at subspaces
thethe
bottom.
thatdiagram
of ofMDD
of
termsfunction
of
thefunction
number
of compl
nodes
i=I
i (x) in
of
diagram
representing
of F
diagram
representing
function
representing
A(x).
The
A(x).
space
The
A(x).
space
Th
plete, that
plete,
is,that
8x
plete,
2
is,X8x
that
, F2(x)
is,X 8x
,6=Fundef.
(x)
2 X6=, Fundef.
(x) 6= undef.
diagram,
that
is,
|f
(x)|
=
|F
(x)|,
where
|A(x)|
isisthe
i
i
of this
step
this
is step
O(|f
of this
is
O(|f
step iwhile
is
(x)|),
O(|f
the
while
timethe
while
complexity
timethe
complexi
time
clc
Since
theInsubspaces
are
collectively
exhaustive,
(x)
is(x),
comi (x)|),
i (x)|),
In order
to
order
incrementally
Intoorder
incrementally
to update
incrementally
update
classifier
classifier
update
FF
(x),
classifier
F
we
conwe
F (x),
conwe of
conof diagram
representing
function A(x). The space comp
constant,
constant,
O(1).
constant,
O(1). O(1).
plete, that
that
8x
Xthat
,F
(x)
undef.
sider
sider
Fis,(x)
that
sider
isF2overwritten
(x)
isF
overwritten
(x)6= is
by
overwritten
an incomplete
by an incomplete
by an
multi-valued
incomplete
multi-valued
multi-valued
of Reference
this step
O(|f
while
the
time
complexity
is na
cl
Since operation
]0 (x),
is
associative
and
commutative,
operations
inis[17]
(3)
can
be
i (x)|),
0 update
Reference
Reference
defines
[17] defines
an
[17]
efficient
an
defines
efficient
algorithm
an
efficient
algorithm
named
algo
C
In order
incrementally
F (x),
we 6=confunction,
function,
F to
function,
Fif0 (x),
it isif
Fdefined,
(x),
it is ifdefined,
F
it 0classifier
(x)
is defined,
F6=0 (x)
undef.
6=
F 0 (x)
undef.
Roughly
undef.
Roughly
Roughly
constant,
O(1).
0
0 by
0 incomplete
that
performs
that
performs
arbitrary
that
performs
arbitrary
operations
arbitrary
operations
over
operations
two
over
MDDs.
two
over
MDDs.
This
two
C
sider
F
(x)
is
overwritten
an
multi-valued
performed inspeaking,
an that
arbitrary
order.
In
addition,
the
time
complexity
of
each
operation
speaking,
rulesspeaking,
ofrules
F (x)
ofrules
F
would
(x)
of would
be
F (x)
inserted
be
would
inserted
above
be inserted
above
those of
those
aboveofthose of
0
Reference
[17]
defines
an method
efficient
named
C
algorithm
algorithm
allows
algorithm
allows
our
method
our
allows
to our
apply
method
toalgorithm
operations
applytooperations
apply
] and
ope
F update
(x),
ifupdate
itThis
is size
defined,
Fis0 (x)
(x). This
F (x).
This
Fon
(x).
operation
operation
update
isofdefined
operation
defined
by,6=isundef.
defined
by, Roughly
by,We can,
in (3) variesFfunction,
depending
the
operand
MDDs.
therefore,
optimize
0(
that
performs
arbitrary
operations
over
two
MDDs.
This
C
multi-valued
multi-valued
functions,
multi-valued
functions,
so
as
functions,
to
so
calculate
as
to
so
calculate
as
(1)
to
and
calculate
(1)
(2).
and
The
(
(
(
speaking, rules of F (x) would be inserted above those of
0
0 0
0
the calculation
order.update operation
algorithm
allows
ourbelarge
method
to
apply
operations
]
and
Fdefined
if0 (x)
F 0 (x)
Fifby,
(x)
6=
F undef,
(x)if6=Fundef,
(x) 6= undef,
of
MDD
of can
MDD
beofcan
quite
MDD
quite
canwith
be
large
quite
operations;
with
large
operations;
with
i.e.,operations
reference
i.e.,
re
F (x).FThis
is
0
0
0 F (x)
(x) F F(x)
(x) F
=F(x)
(x) =F (x) =
(2) multi-valued
(2)
(2) functions, so as to calculate (1) and (2). The
(
shows shows
|A(x)3B(x)|
|A(x)3B(x)|
shows 
|A(x)3B(x)|
|A(x)||B(x)|,
 |A(x)||B(x)|,
 |A(x)||B(x)|,
where where
A(x) and
A(x
wh
F (x)
Fotherwise.
(x) 0 Fotherwise.
(x) otherwise.
0
Theorem 1. The0 time complexity
is
given
F (x) ofif (3)
F (x)optimized
6= undef, with Huffman
of MDD
can
beare
quite
large
with
operations;
i.e.,
reference
are
multi-valued
arecoding
multi-valued
functions
multi-valued
functions
and 3
functions
and
is arbitrary
3 isand
arbitrary
3
operator.
is arbit
opeH
F
(x)
F
(x)
=
(2)
Note that
Note
operation
that
Note
operation
thatisF
operation
equivalent
tois operation
equivalent
to operation
]towhen
operation
] the
when]
the
when|A(x)3B(x)|
the
shows
|A(x)||B(x)|,
where
A(x)this
and
(x) is equivalent
otherwise.
by,
ever,
the
ever,
sizethe
isever,
size
usually
the
isusually
size
considerably
is usually
considerably
smaller
considerably
smaller
than
sma
than
w
latter islatter
defined,
is latter
defined,
andissodefined,
and
operation
so operation
and socan
operation
be can
applied
be applied
can
to meld
be applied
to meld
to
meld
are multi-valued
functions
and something
3with
islike
arbitrary
operator.
case
upper
case
bound,
upper
case
bound,
with
upper
something
with
bound,
something
|A(x)|
like
|A(x)|
+
|B(x)|
like
+|A
Note
that
operation
is
equivalent
to
operation
]
when
the
!
incomplete
incomplete
multi-valued
incomplete
multi-valued
functions.
multi-valued
functions.
|Pfunctions.
|
ever,
size
isin usually
considerably
smaller
than
this
w
in
ourthe
in
experiments,
our experiments,
ourthe
experiments,
size
thewas
sizenever
the
was
size
greater
never
was
greater
than
never
the
gr
th
X
latter
defined,
and
so operation
cana be
applied
to meld
For isthe
Forlookup
theFor
lookup
acceleration,
the
lookup
acceleration,
aacceleration,
multi-valued
multi-valued
a function
multi-valued
function
in case
function
in upperinbound, with something like |A(x)| + |B(x)|
K
size of size
operands.
of operands.
sizeOur
of operands.
method
Our method
isOur
established
method
is established
on
is establish
the on
follo
th
2arebits
, single
(8)
Obitsfunctions.
`(F
i )||F
ia||single
incomplete
which
consecutive
whichmulti-valued
consecutive
which
K consecutive
K
aggregated
are
K bits
aggregated
are
intoaggregated
into a variable
into avariable
single
variable
in our
experiments,
the size was never greater than the
theorem.
theorem.
theorem.
For
the
lookup
acceleration,
a
multi-valued
function
in
i=I
is defined
is defined
as follows,
is defined
as follows,
as follows,
size of operands. Our method is established on the follo
which consecutive
K Kbits
are Kaggregated
intoLa single variable Theorem
L
L
Theorem
1. Assume
Theorem
1. Assume
that1.the
Assume
that
sizetheofthat
size
MDD
the
of after
MDD
size of
perfor
after
MD
K
K
K
K
theorem.
K· ·!
K
K II,
{undef,
!
{undef,
I,
!
·
·
{undef,
I,
·
,
II,
|P
·
|}.
·
·
I,
,
|P
II,
|}.
·
·Huffman
· , |P |}.
F
:
{0,
F
1,
:
·
{0,
·
·
F
,
1,
2
·
:
·
{0,
·
,
1}
2
1,
·
1}
,
2
1}
where `(Fi ) isisdefined
the path
length from the root to leaf Fi on the an
tree.
as follows,
operation
an operation
isanusually
operation
is usually
less isthan
usually
lessorthan
equal
less
or to
than
equal
theortotal
toequa
thesi
Theorem
1. Assume
the size of MDD after perfor
K2 X
Let X
X be
2
Let·the
X· ·X,be
input
X of
input
beKLthis
the
input
this of
aggregated
variable
reprerepreoperand
operand
MDDs,
operand
MDDs,that
MDDs,
!ofaggregated
{undef,
I,this
II,
·aggregated
· · ,variable
|Prepre|}. variable
F Let
: {0,
1,
2K2Kthe
1}
sentation,
sentation,
we have
sentation,
weFhave
(X)
we
FK
=
have
(X)
F (x)
F=Kif(X)
FX
(x)=
isifequivalent
FX(x)is ifequivalent
X tois x.
equivalent
toanx.operation
to x. is usually less than or equal to the total s
1 only
|A(x)3B(x)|
|A(x)3B(x)|
|A(x)|
 |A(x)|
+ |B(x)|,
+|A(x)|
|B(x)|,
+ |B(x)|,
Let
X Obviously,
2operation
XF 1be
variable at
repreoperand
MDDs,
Proof. SinceObviously,
each
takes
two operands
a time
in|A(x)3B(x)|
the
CASE
algoObviously,
⌘ the
FF.1 input
⌘
FF. of
⌘this
F . aggregated
sentation, we have F K (X) = F (x) if X is equivalent to x.
rithm, the calculation
can
be
a ETHOD
binary
tree,
like
Fig.
4.time
Every
IV.F A
LGORITHMS
IV.F .A LGORITHMS
IV. A
OF
LGORITHMS
Pillustrated
ROPOSED
OF P ROPOSED
OF
M ETHOD
Pas
ROPOSED
M
M ETHOD
1order
the time
theand
time
space
the
and
complexity
space
andcomplexity
space
the
complexity
operation
of the operation
of isthegiven
oper
is
|A(x)3B(x)|

|A(x)|
+ of
|B(x)|,
Obviously,
⌘
function Fi onGiven
a leaf
atofa each
nodeBoolean
up
to
the root,
and
so
the
time
Given
a is
set used
aof
Given
set
BDDs
set
BDDs
representing
ofinternal
representing
BDDs Boolean
representing
functions,
Boolean
functions,
functions,
K
K
O(2K
(|A(x)|
O(2
(|A(x)|
+
O(2
|B(x)|)).
(|A(x)|
+ |B(x)|)).
IV. A LGORITHMS OF P ROPOSED M ETHOD
the time
and
space
complexity
of +the|B(x)|)).
operation is given
fi (x)’s,
algorithms
(x)’s,
falgorithms
proposed
algorithms
in this
proposed
in
section
this
inconstruct
this section
construct
MDDs
construct
MDDs MDDs
complexity of
(3)fiis
bounded
byproposed
(8)
using
(6).section
i (x)’s,
Kof BDDs
K
K
Given
a
set
representing
Boolean
functions,
K
of F (x)
of and
F (x)Fofand
F
(X).
(x)
F After
and
(X).Freviewing
After
(X).reviewing
After
decision
reviewing
decision
diagrams
decision
diagrams
in diagrams
in
in(|A(x)|
TheO(2
complexity
The
complexity
The
is
complexity
is defined
by the
is defined
by
number
the number
by
of the
nodes
numb
of and
no
+ defined
|B(x)|)).
The optimal
order
is this
given
as
aa from
binary
tree
minimizing
(8);
this
fi (x)’s,calculation
algorithms
in
section
construct
MDDs
Section
Section
IV-A,
Section
IV-A,proposed
Section
IV-A,
IV-B constructs
Section
IV-B
constructs
IV-B
a MDD
constructs
MDD
afrom
set
MDD
ofa set
from
ofchildren
a set
of
of
of
children
perofnode.
children
per node.
per node.
K
of can
F (x)
andSection
F
(X).
After
reviewing
decision
diagrams
in
optimizationBDDs,
be
regarded
as
Huffman
coding
[29],
asonleaf
MDDs
and
their
BDDs,
and
and
BDDs,
Section
IV-C
and
accelerates
Section
IV-C
accelerates
IV-C
lookup
accelerates
lookup
operations
operations
lookup
on the
operations
the
the
Theon
complexity
defined
byfollowing
the
number
nodes
and
From
Theorem
From Theorem
From
1,is the
Theorem
following
1, the
1, lemma
the
following
lemma
is of
given
is
lemma
for
given
(1
SectionMDD.
IV-A, MDD.
Section IV-B constructs a MDD from a set of of children per node.
MDD.
sizes are replaced with symbols and their weights (frequencies), respectively. This
BDDs, and Section IV-C accelerates lookup operations on the
Theorem 1,
the following lemma is given for (1
binary tree is
also known as a Huffman tree. Note that the timeFrom
complexity
ignores
MDD.
Huffman tree construction, since it is negligible compared with MDD operations.
5.3
Bit Aggregation
This subsection defines the bit aggregation algorithm that accelerates packet classification. Algorithm 1 constructs the MDD of F (K) from that of F , by aggregating
continuous K bits into a single variable. Note that in Algorithm 1, F or F (K) refers
to the root node of MDD representing function F or F (K) , not to the function itself. This algorithm recursively creates MDD nodes by finding their 2K children
at K bits ahead from the node; x0 means K continuous bits from F.b, where F.b
Packet Classification for Global Network View of Software-Defined Networking
13
Algorithm 1: Aggregate
Input: F
Output: F (K)
if F is non-terminal or F not found in cache then
create F (K)
for x0 ← 0 to 2K − 1 do
F 0 ← descendant reached by x0 from F
F (K) .child[x0 ] ← Aggregate(F 0 )
// x0 is K-bit from F.b
// set x0 -th child
cache[F ] ← F (K)
return cache[F ]
is the smallest bit number labeling node F (e.g., F.b = 0 at the root of MDD in
Fig. 3). To avoid repeatedly visiting the same node, visited nodes are cached; this
algorithm assumes that all terminal nodes would have been set to the cache in
advance. The time complexity of this algorithm is O(2K ||F ||).
Since the maximum height of MDD is shrunk to L/K by bit aggregation, the
worst-case search path length is also reduced by a factor of K. This is a great
acceleration, but there is a tradeoff between the search time and memory space,
as follows.
We set the following assumption about MDD size.
Assumption 2. The bit aggregation algorithm (Algorithm 1) shrinks the MDD
size by a factor of K,
||F (K) || ≈
||F (1) ||
,
K
(9)
assuming that MDD nodes are roughly equally distributed over each run of K bits
(i.e., bits between [Ki, K(i + 1) − 1], i ∈ [0, L/K − 1]).
Since the memory requirement of MDD F (K) is given by the product of MDD
size and memory requirement of each node [30], it is derived with Assumption 2
as follows,
||F (1) ||
|b| + 2K |child| ,
K
(10)
where |b| is the width of bit numbers and |child| is that of child IDs. Given
|b| = |child|, the memory requirement is minimized at K = 2, and increases
exponentially for K > 2.
The memory requirement might be reduced by introducing heterogeneous MDDs [30],
in which non-terminal nodes are allowed to maintain a different number of children
and different number of bits. This heterogeneity is also utilized by conventional
decision tree methods. However, it prevents efficient MDD traversal introduced in
the next subsection, and so we only use MDDs of uniform K in our method.
14
T. Inoue, T. Mano, K. Mizutani, S. Minato, and O. Akashi
Algorithm 2: Search
Input: F (K) , pkt
Output: packet behavior ∈ {I, II, · · · , |P |}
while F (K) is non-terminal do
x0 = pkt[F (K) .b/K]
F (K) = F (K) .child[x0 ]
return F (K)
5.4
// pkt is in K-bit array
// get K-bit from F (K) .b
// go to next node
// terminal node of behavior
Packet Classification
Algorithm 2 presents the search algorithm that identifies the network-wide behavior of a given packet. It simply follows a path based on the packet. The time
complexity is determined just by the maximum height of F (K) , that is, O(L/K),
because this algorithm includes no operation other than MDD traversal; in contrast, conventional decision tree methods usually involve linear search at a leaf
node to select a single rule.
This algorithm is also very efficient in terms of implementation. A packet header
is represented as an array of K-bit elements; i.e., i-th K-bit element on the header
can be accessed by index i, like pkt[i]. The array element itself becomes a child
index without fixing numbers of children and bits, which makes this algorithm
very efficient in actual implementation. Our algorithm directly handles the raw
bit sequence of the packet header by a K-bit array, and so the packet does not need
to be pre-processed at all; other implementations might assume that the header
fields of interest would be extracted in advance5 .
6
Experiments
This section evaluates our method in terms of memory usage in Section 6.1, construction and update time in Section 6.2, and classification throughput in Section 6.3.
Our method was implemented in C++. The parameter of bit aggregation, K,
was chosen from 1, 2, 4, and 8, in order to align K-bit array elements with bytes.
The widths of bit numbers and child IDs, |b| and |child|, were defined as 32 bits.
Our method was compared with HybridCuts [16] as well as a classification method
utilizing BDDs [8]. For HybridCuts, parameters “binth” and “spfac” were set to
eight and four, which showed the best performance in terms of memory usage
and throughput. Search process was added to the original implementation by us.
The BDD classification method relies on a set of BDDs of fi ’s to determine the
network-wide behavior. It was also implemented in C++ by us.
5
For instance, http://www.arl.wustl.edu/~hs1/PClassEval.html and http://hypercuts.
masicek.net/
Packet Classification for Global Network View of Software-Defined Networking
15
Table 1: Statistics of Three Real Networks
#
#
#
#
#
#
of
of
of
of
of
of
switches
ports used
rules (FIB)
rules (ACL)
header bits of interest
network-wide packet behaviors
Internet2
9
56
126,017
0
32
86
Stanford
16
58
757,170
1,584
88
1,093
Purdue
1,646
2,736
0
3,605
104
10,353
Configuration datasets of the three real networks, Internet2, Stanford backbone
network [10], and Purdue campus network [23], were employed in the experiments.
The network statistics are shown in Table 16 . Configuration rules of FIB (Forwarding Information Base) specify the destination IP field only, while those of
ACL (Access Control List) are written as 5-tuple7 . Some rules specify the VLAN
field, but it was ignored in our experiments in order to use a packet trace generator
named ClassBench [31], which supports 5-tuple only. This omission, however, has
no significant impact on our results, because the MDD size is only 20 % larger at
most when the VLAN field is considered.
The experiments were conducted using a single core of Xeon 3.5 GHz with 8
MB cache and 16 GB main memory (DELL PowerEdge Server).
6.1
Memory Usage
Figure 5 shows the MDD size (the number of nodes in an MDD). The size was
smaller than the total number of nodes in a set of BDDs, as indicated by (7).
# of BDD nodes
Internet2
37,903
Stanford
547,298
Purdue
561,635
Figure 5 shows a good fit with (9), which validates Assumption 2; the discrepancy
is 9.6 % at maximum (Purdue with K = 8).
The amount of memory used by an MDD is shown in Fig. 6, while the size of
each MDD node, |b| + 2K |child|, is given in the following table (the memory usage
of an MDD is the product of MDD size and node size).
6
The numbers of network-wide packet behaviors in Table 1 are somewhat different from [8].
We are unable to explain this difference, but in the most complicated Purdue network, our number
is greater than that of [8] (3,917) and so this difference does not favor us.
7
Since Purdue network has no FIB rule, network-wide packet behaviors are defined only at the
network edge.
16
T. Inoue, T. Mano, K. Mizutani, S. Minato, and O. Akashi
MDD size [K]
1000
Internet2
Stanford
Purdue
100
10
1
0
2
4
6
8
10
K
Figure 5: MDD size. The dotted lines indicate (9).
Memory usage [MB]
100
Internet2
Stanford
Purdue
10
1
0.1
0
2
4
6
8
10
K
Figure 6: Memory usage of MDD. The dotted lines indicate (10). The horizontal line is
memory usage of HybridCuts for Internet2.
K
1
2
4
8
Node size [B]
12
20
68
1028
The memory usage is minimized at K = 2 and increases exponentially for larger
K, as expected by (10), but it still fits within the CPU cache (8 MB) even for
K = 8 in Internet2 and Stanford, and for K = 4 in Purdue. Although the memory
usage is relatively large, 56 MB, at K = 8 in Purdue, the CPU cache remained
viable for this seven-fold data; this issue is discussed in the throughput evaluation.
BDD memory usage is as follows.
BDD memory usage [MB]
Internet2
0.454
Stanford
6.568
Purdue
6.740
MDD has smaller memory usage than BDD when K ≤ 4 in Internet2 and Stanford
and K ≤ 2 in Purdue.
HybridCuts successfully constructed a classifier just for Internet2 as described in
Section 3, which requires 739 KB of memory. HybridCuts could be comparable to
our method if the header space was represented by a small number of hypercubes,
but it does not scale with the header space complexity of network-wide packet
behavior.
Packet Classification for Global Network View of Software-Defined Networking
0.8
30
Time [sec]
Internet2
25
0.6
2500
Stanford
2000
20
0.4
1000
10
5
0
Optimal
500
0.562
6.35
0
Ascending Random
Purdue
1500
15
0.2
17
0
Optimal
Ascending Random
Optimal
Ascending Random
Figure 7: Computation time of MDD unification. Each point is the average of 10 trials.
Time [sec]
10
Internet2
Stanford
Purdue
1
0.1
0.01
0.001
0
2
4
6
8
10
K
Figure 8: Computation time of bit aggregation. Each point is the average of 10 trials.
6.2
Construction and Update Time
Figure 7 demonstrates the time to unify BDDs of fi ’s into a single MDD of F (1)
by the operations of (3). We evaluated three calculation orders: the optimal order
of Theorem 1, ascending order (a minimum MDD is added to the current unified
MDD at every step), and random order. As shown in Fig. 7, the optimal order
outperforms the others. Even for the complicated Purdue network, MDDs were
unified just in 6.35 sec, which is usually acceptable as an initial construction; note
that it can be incrementally updated on demand, as is detailed later.
MDD of F (1) was converted to that of F (K) by the bit aggregation algorithm
presented in Algorithm 1. The aggregation time is shown in Fig. 8. As expected,
the time grows exponentially with K, but it remained under 2.5 sec which is
considered acceptable.
It is worth noting that the calculation time of BDDs is less than 1 sec according
to [8], which has to be added to the total construction time of our method.
HybridCuts required 2.17 sec to construct a classifier for Internet2. This is
slower than our method with the random calculation order, because HybridCuts
processes each hypercube rule individually while our method handles them collectively in a compressed manner.
For MDD updates, we assume that the packet classifier is updated by a new rule
associated with a new network-wide behavior. First, the new rule was randomly
chosen from the original rules, and then the MDD of classifier was constructed from
18
T. Inoue, T. Mano, K. Mizutani, S. Minato, and O. Akashi
Time [msec]
1000
Internet2
Stanford
Purdue
100
10
1
0
2
4
6
8
10
K
Figure 9: Computation time to update an MDD. Each point is the average of 100 trials.
# of hops
100
Internet2
Stanford
Purdue
10
1
0
2
4
6
8
10
K
Figure 10: Average hop counts on an MDD to classify a packet. The dotted lines guide
∝ 1/K. The horizontal line is average hop counts of HybridCuts for Internet2.
the remaining rules. We finally measured the time to perform update operation over the MDDs of new rule and original classifier. Figure 9 shows the update time.
It is less than 10 msec for Internet2, and less than 30 msec for Stanford. Even
for Purdue with K = 8, the MDD was updated in just 200 msec; these results are
entirely acceptable unless time constraints are extraordinarily severe.
6.3
Classification Throughput
Traditionally, classification throughput of packet classification has been evaluated
by the number of memory accesses per packet. However, modern processors have
complicated architectures with multi-level cache hierarchy, which has a significant
impact on throughput, and so classification throughput was measured by actually
processing packet headers. In addition to Xeon, classification throughput was also
measured using a single core of Core i7 1.7 GHz with 4 MB cache and 8 GB main
memory (Apple Macbook Air), in order to examine the impact of hardware.
In the experiments, the classification throughput was measured as follows. First,
a million packets with IP and TCP/UDP headers were generated by ClassBench
based on the network configurations. The packets were then written into a file
in network-byte order. Finally, each packet was read from the file and classified.
Note that packets should be generated based on the configurations as done by
ClassBench, in order to match all behaviors; randomly generated packets are likely
19
Packet Classification for Global Network View of Software-Defined Networking
Xeon
25
Throughput [Mpps]
50
Internet2
40
20
Average
Worst-case
30
12
Stanford
8
15
20
10
10
5
6
4
Average
Worst-case
0
0
0
2
4
6
8
0
10
2
4
6
8
0
10
Core i7
2
4
6
Throughput [Mpps]
K
8
10
1
Average
Worst-case
0
0
10
2
1
0
8
Purdue
2
Average
Worst-case
6
3
3
2
4
4
4
4
2
Stanford
5
6
0
K
6
Internet2
Average
Worst-case
2
K
K
8
Purdue
10
0
2
4
6
K
8
Average
Worst-case
0
10
0
2
4
6
8
10
K
Figure 11: Classification throughput measured on Xeon 3.5 GHz at the top, and on Core
i7 1.7 GHz at the bottom. Each cross is the average of 30 trials, and each trial processed
a million packets. The worst-case throughput is indicated by dashed lines. The horizontal
lines in Internet2 are throughput of HybridCuts.
to match larger subspaces (probably a “default rule”), but never match smaller
ones.
Figure 10 shows the average number of hops from the root to a terminal on
the MDD of F 0(K) . Considering that the number of worst-case hops at K = 1
is equivalent to the number of bits of interest given in Table 1, the average hop
number was roughly 2/3 of the worst-case. The average hop number follows 1/K
with maximum deviation of 17 % (Stanford with K = 8), and it is less than 10 even
with K = 8 for all networks, which ensures the fast classification of our method.
The hop number is much less than the 288 bits of the TCP/IP header, since header
fields of no interest are simply skipped on MDDs.
The classification throughput measured on the Xeon is demonstrated at the top
of Fig. 11. It exceeded 10 Mpps at K = 8 for all networks; at 10 Mpps, even short
packets of 125 bytes can fill a 10 Gbps link. The throughput of our method scales
linearly against K for all networks, though the MDD is larger than the CPU cache
for Purdue with K = 8. Figure 11 also shows the worst-case throughput estimated
using the worst-case hop number shown in Fig. 10. Even for the worst-case, the
throughput is rather high.
Interestingly, the classification throughput shows different behaviors on Core
i7, as shown at the bottom of Fig. 11. The throughput is quite high, greater than
3 Mpps at K = 8 for all networks, but it does not scale linearly, rather it scales
logarithmically.
20
T. Inoue, T. Mano, K. Mizutani, S. Minato, and O. Akashi
12
10
Time [nsec]
35
Xeon
Core i7
30
25
8
20
6
15
4
10
2
5
0
0
0.1
1
10
Memory usage [MB]
100
0.1
1
10
100
Memory usage [MB]
Figure 12: Computation time required for a single hop versus memory usage, measured
on Xeon 3.5 GHz at the top and on Core i7 1.7 GHz at the bottom. The dotted line shows
the linear regression on (log(x), y).
This difference in scaling is investigated in depth. Figure 12 shows the computation time required for a single hop versus the memory usage of MDD. The hop
time on Xeon stays nearly constant, while that on Core i7 increases logarithmically
with MDD size; this is the reason for the scaling difference. We do not dive into
the details of CPU architecture and cache hierarchy, which is beyond the scope
of this paper, but these results justify our evaluation strategy; the classification
throughput should be evaluated based on actual measurements in order to consider
the hardware impact.
Throughput of HybridCuts was 7.26 Mpps on Xeon for Internet2, as shown in
Fig. 11; it is slower than our method of K ≥ 2, though the average hop count
was quite small, 3.46. This is because each hop took 31.8 nsec, which is nearly
four times of our method. HybridCuts has to examine decision criteria at every
node, while our simple search algorithm finds the next node just by array access.
Moreover, HybridCuts has to find a rule from “binth” number of rules by linear
search at a leaf node.
The BDD classification method was slower by several orders of magnitude due
to the linear search performed over all behaviors, as follows.
Throughput by BDDs [Mpps]
7
Internet2
0.0849
Stanford
0.00578
Purdue
0.00153
Discussion from Application Aspects
This section discusses our method in terms of applications. Section 7.1 chooses the
best K based on the experiments’ results. Section 7.2 discusses the contributions
of our method to SDN applications.
7.1
Choice of Best K
Network operators are needed to choose the best K to deploy their applications.
Our evaluation in Section 6 revealed a tradeoff between the throughput and con-
Packet Classification for Global Network View of Software-Defined Networking
Throughput [Mpps]
25
K=8
20
21
Stanford
15
K=4
10
5
0
0
20
40
60
80 100
Update rate [update/sec]
120
Figure 13: Classification throughput versus update rate. Internet2 and Purdue are
omitted since they showed similar results.
struction/update time; large K improves the throughput, but degrades the construction/update time. This tradeoff is shown in Fig. 13 (the construction time is
not shown to depict the tradeoff on a two-dimensional plane). The figure tells us
that K should be set to 4 or 8, since the points of K = 4 and 8 form a Pareto
frontier, i.e., they cannot be improved without degrading some property. If the
network were much simpler, K = 16 might be included in the Pareto frontier, but
MDDs of K = 16 require 128× memory capacity compared to K = 8, which would
probably make it difficult to fully leverage small CPU caches. In contrast, only
K = 4 would form a Pareto frontier in more complicated networks.
7.2
7.2.1
Contributions to SDN applications
Network Fabric
Network fabric [3,4] is an idea to improve the economic efficiency and manageability
of networks by separating the “intelligence” from the network core. An application
program of network fabric constructs a classifier of network-wide packet behaviors
based on the network policies. The classifier is then set to edge switches, which tag
the header of every incoming packet with some sort of “behavior label”. The labeled packets are forwarded through the network core based on the behavior labels.
This simple network core can consist of low-cost switches and can be managed just
by the labels independently of protocols used in external networks.
To realize the network fabric, the network edge is required to determine the
network-wide behavior for every packet at line rate, but that is virtually impossible
for conventional classification methods, which failed to construct the classifier or
achieved unacceptably low throughput given complex policies like those of the
Stanford backbone. Reference [4] estimated that the classification rate of 6.7 Gbps
on a single CPU core is enough, and the rate could be enhanced using computer
clustering techniques like RouteBricks [32] for a large domain. However, no specific
classification method that can deal with network-wide packet behaviors is listed in
the paper. Our method is the first one that can realize the network fabric even if
the network policy is complex.
22
T. Inoue, T. Mano, K. Mizutani, S. Minato, and O. Akashi
7.2.2
Fault Localization
In network fault localization [5], the application software calculates network-wide
packet behaviors that could be monitored in the network, in advance. A few
probe packets are then selected for each behavior, and the packets are exchanged
periodically between switches. If the actually monitored behavior is not identical
to the pre-calculated one, the application investigates possible causes as in solving
a set cover problem.
Reference [5] assumes that only a small number of predefined packets are employed, since no conventional classification method can classify arbitrary packets.
This limitation, which was recognized as a match fault deficiency in [5], implies
that faulty behaviors of other packets are overlooked; more importantly, failures
occurring on production traffic are not detected. Our classifier allows the application software to examine all packets without ignoring any of them. Our efficient
method, of course, does not need to compromise the sampling rate.
7.2.3
Network Verification
Network verification [6–11] is used to check network properties, such as reachability
(e.g., packets of a given header subspace sent from a client can get to a specified
server), loop-free (no packet in the subspace would traverse any cyclic path), and
waypointing (packets from the outside of network should pass through a firewall).
The property tests rely on packet classification in order to map the header space
to a network-wide packet behavior represented as a directed subgraph.
Assume that in Stanford backbone network, a new rule should be applied immediately to fix a security hole and several properties must be checked with the new
rule before deploying it. Subgraphs representing packet behaviors can be updated
in 26 msec by [8], while the classifier is updated in 29 msec in our method (Fig. 9);
our method is comparably fast. The new behaviors are, then, confirmed with
each packet or each header subspace to be tested; assuming that many properties
equal to the number of rules in the network, e.g., 100,000, have to be checked, our
method only requires 5.0 msec for the lookups while [8] requires 17.3 sec, which
can be critical in a severe security incident.
8
Related Work
This section discusses related work; conventional packet classification methods [12–
20] are not discussed, since they were thoroughly examined in Section 3.
To fully leverage efficient packet classifiers, fast packet I/O mechanisms should
be created to receive packets at line rates. Recent research on system technologies [33] resulted in 14.9 Mpps on a single CPU core, which is roughly equal to
our results. Our classification method with the fast packet I/O can provide SDN
Packet Classification for Global Network View of Software-Defined Networking
23
applications with the quick identification of network-wide packet behavior on fast
links.
OpenFlow8 , one version of SDN, defines the multi-table pipeline; it divides a
large classifier into multiple small ones. If the original complicated header space
is divided into many parts, the complexity of each part might be greatly reduced.
However, the multi-table pipeline focuses on rules specified on a single switch,
not network-wide packet behaviors, and it is supposed to handle each dimension
with a separate table. Therefore, each machine holds only a few tables (e.g., four
tables [34]), and so our problem is not significantly mitigated.
Several papers [35–38], some of which were written in the context of SDN [37,38],
discussed algorithms to distribute classification rules among multiple switches without changing their semantics, in order to store the rules in space-limited TCAMs.
They are, however, not arranged to identify the network-wide packet behavior.
Multiple-bit striding was utilized to accelerate trie traversal [39, 40], and this
technique looks similar with our bit aggregation. Our method is, however, established on MDDs, not simple tries, and so Algorithm 1 is designed to handle path
convergence by caching visited nodes and to care skipped nodes of don’t care bits.
BDDs have often been used to deal with complicated header spaces along with
various network applications such as firewall analysis [41], network state verification [6–8], policy enforcement [42], and traffic analysis [43]. However, work to
date did not focus on classification throughput, and so speeds are low. Firewall
Decision Diagrams (FDDs) [44, 45] can be used to represent the header space in a
compressed manner, but they were designed to be human readable, not to be efficiently manipulated by computers; e.g., the network verifier using FDDs [9] seems
to be at least 1,000 times slower than those with BDDs [6, 8]. Prefix DAG [46]
employs a data structure similar to MDDs, but it focused on a simple classification
problem with a single header field, in which classifiers are readily constructed even
if all rules are extracted.
BDDs and MDDs have been intensively studied in the LSI-CAD community.
Reference [47] minimized the total size of given BDDs by applying different bit
orders to them, while our interest includes the size of MDD. The memory usage
of MDD was optimized in [30], but the MDD structure is made complicated and
degrading the classification throughput.
9
Conclusions
This paper has developed a packet classification method that efficiently represents
the complicated header space defined with network-wide packet behaviors. To fully
leverage the global network view provided by SDN, packet classification methods
are required to well handle network-wide packet behaviors, not actions taken on
8
http://www.openflow.org/documents/openflow-spec-v1.1.0.pdf
24
T. Inoue, T. Mano, K. Mizutani, S. Minato, and O. Akashi
a single switch. Although packet classification has been studied intensely, conventional classification methods are unable to handle network-wide behaviors at all.
Our method is based on compressed decision diagrams and is supported by several new algorithms; it is simple but surprisingly efficient. Numerical experiments
on three actual networks showed that our method can identify the network-wide
packet behavior at 10 Mpps or more for all three networks. Our work is the only
one to solve this hard but important problem in SDN. Moreover, thanks to the introduction of SDN, complex network policies can now be automatically processed
by application programs without being manually written in the inefficient 5-tuple
format, and so our efficient internal representation will get more opportunities
beyond the example applications shown in this paper.
In future work, we will develop SDN applications that utilize our classification
method, and evaluate its feasibility with field experiments. Powerful techniques
studied in computer science, such as compressed self-indexes [46] and Boolean
expression minimization [48], will be applied to our method.
References
[1] Nick McKeown, Tom Anderson, Hari Balakrishnan, Guru Parulkar, Larry
Peterson, Jennifer Rexford, Scott Shenker, and Jonathan Turner. OpenFlow:
enabling innovation in campus networks. SIGCOMM Comput. Commun. Rev.,
38(2):69–74, 2008.
[2] S. Agarwal, M. Kodialam, and T.V. Lakshman. Traffic engineering in software
defined networks. In IEEE INFOCOM, pages 2211–2219, 2013.
[3] Martin Casado, Teemu Koponen, Scott Shenker, and Amin Tootoonchian.
Fabric: A retrospective on evolving sdn. In ACM HotSDN, pages 85–90,
2012.
[4] Barath Raghavan, Martı́n Casado, Teemu Koponen, Sylvia Ratnasamy, Ali
Ghodsi, and Scott Shenker. Software-defined internet architecture: Decoupling architecture from infrastructure. In ACM HotNets, pages 43–48, 2012.
[5] Hongyi Zeng, Peyman Kazemian, George Varghese, and Nick McKeown. Automatic test packet generation. In ACM CoNEXT, pages 241–252, 2012.
[6] E. Al-Shaer, W. Marrero, A. El-Atawy, and K. ElBadawi. Network configuration in a box: towards end-to-end verification of network reachability and
security. In IEEE ICNP, pages 123–132, 2009.
[7] R. McGeer. Verification of switching network properties using satisfiability.
In IEEE ICC, pages 6638–6644, 2012.
[8] Hongkun Yang and S.S. Lam. Real-time verification of network properties
using atomic predicates. In IEEE ICNP, pages 1–11, 2013.
Packet Classification for Global Network View of Software-Defined Networking
25
[9] A.R. Khakpour and A.X. Liu. Quantifying and querying network reachability.
In IEEE ICDCS, pages 817–826, 2010.
[10] Peyman Kazemian, George Varghese, and Nick McKeown. Header space analysis: Static checking for networks. In USENIX NSDI, pages 113–126, 2012.
[11] Ahmed Khurshid, Xuan Zou, Wenxuan Zhou, Matthew Caesar, and
P. Brighten Godfrey. VeriFlow: Verifying network-wide invariants in real
time. In USENIX NSDI, pages 15–28, 2013.
[12] Sumeet Singh, Florin Baboescu, George Varghese, and Jia Wang. Packet
classification using multidimensional cutting. In ACM SIGCOMM, pages 213–
224, 2003.
[13] Hyesook Lim and Ju Hyoung Mun. High-speed packet classification using
binary search on length. In ACM/IEEE ANCS, pages 137–144, 2007.
[14] Yaxuan Qi, Lianghong Xu, Baohua Yang, Yibo Xue, and Jun Li. Packet
classification algorithms: From theory to practice. In IEEE INFOCOM, pages
648–656, 2009.
[15] Balajee Vamanan, Gwendolyn Voskuilen, and T. N. Vijaykumar. EffiCuts:
Optimizing packet classification for memory and throughput. SIGCOMM
Comput. Commun. Rev., 40(4):207–218, 2010.
[16] Wenjun Li and Xianfeng Li. HybridCuts: A scheme combining decomposition
and cutting for packet classification. In IEEE HOTI, pages 41–48, 2013.
[17] Hao Che, Z. Wang, Kai Zheng, and Bin Liu. DRES: Dynamic range encoding
scheme for TCAM coprocessors. IEEE Transactions on Computers, 57(7):902–
915, 2008.
[18] A.X. Liu, C.R. Meiners, and E. Torng. TCAM razor: A systematic approach
towards minimizing packet classifiers in tcams. IEEE/ACM Transactions on
Networking, 18(2):490–500, 2010.
[19] A. Bremler-Barr and D. Hendler. Space-efficient TCAM-based classification
using gray coding. IEEE Transactions on Computers, 61(1):18–30, 2012.
[20] O. Rottenstreich, R. Cohen, D. Raz, and I. Keslassy. Exact worst case TCAM
rule expansion. IEEE Transactions on Computers, 62(6):1127–1140, 2013.
[21] R.E. Bryant. Graph-based algorithms for boolean function manipulation.
IEEE Transactions on Computers, C-35(8):677–691, 1986.
[22] A. Srinivasan, T. Ham, S. Malik, and R.K. Brayton. Algorithms for discrete
function manipulation. In IEEE ICCAD, pages 92–95, 1990.
26
T. Inoue, T. Mano, K. Mizutani, S. Minato, and O. Akashi
[23] Yu-Wei Eric Sung, Sanjay G. Rao, Geoffrey G. Xie, and David A. Maltz.
Towards systematic design of enterprise networks. In ACM CoNEXT, pages
22:1–22:12, 2008.
[24] Mark H. Overmars and Frank A. van der Stappen. Range searching and point
location among fat objects. Journal of Algorithms, 21(3):629–656, 1996.
[25] Subhash Suri, Tuomas Sandholm, and Priyank Warkhede. Compressing twodimensional routing tables. Algorithmica, 35(4):287–300, 2003.
[26] Christian Böhm, Stefan Berchtold, and Daniel A. Keim. Searching in highdimensional spaces: Index structures for improving the performance of multimedia databases. ACM Comput. Surv., 33(3):322–373, 2001.
[27] Ryo Yoshinaka, Jun Kawahara, Shuhei Denzumi, Hiroki Arimura, and
Shin’ichi Minato. Counterexamples to the long-standing conjecture on the
complexity of BDD binary operations. Information Processing Letters,
112(16):636–640, 2012.
[28] Donald E. Knuth. The Art of Computer Programming: Combinatorial Algorithms Part 1, volume 4A. Addison-Wesley, USA, 2011.
[29] David A Huffman. A method for the construction of minimum redundancy
codes. Proceedings of the IRE, 40(9):1098–1101, 1952.
[30] S. Nagayama and T. Sasao. On the optimization of heterogeneous MDDs.
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 24(11):1645–1659, 2005.
[31] D.E. Taylor and J.S. Turner. ClassBench: a packet classification benchmark.
In IEEE INFOCOM, volume 3, pages 2068–2079 vol. 3, 2005.
[32] Mihai Dobrescu, Norbert Egi, Katerina Argyraki, Byung-Gon Chun, Kevin
Fall, Gianluca Iannaccone, Allan Knies, Maziar Manesh, and Sylvia Ratnasamy. RouteBricks: exploiting parallelism to scale software routers. In
ACM SOSP, pages 15–28, 2009.
[33] Luigi Rizzo. netmap: a novel framework for fast packet I/O. In USENIX
ATC, pages 101–112, 2012.
[34] Heng Pan, Hongtao Guan, Junjie Liu, Wanfu Ding, Chengyong Lin, and Gaogang Xie. The FlowAdapter: Enable flexible multi-table processing on legacy
hardware. In ACM HotSDN, pages 85–90, 2013.
[35] Minlan Yu, Jennifer Rexford, Michael J. Freedman, and Jia Wang. Scalable
flow-based networking with DIFANE. In ACM SIGCOMM, pages 351–362,
2010.
Packet Classification for Global Network View of Software-Defined Networking
27
[36] Chad R. Meiners, Alex X. Liu, Eric Torng, and Jignesh Patel. Split: Optimizing space, power, and throughput for TCAM-based classification. In
ACM/IEEE ANCS, pages 200–210, 2011.
[37] Nanxi Kang, Zhenming Liu, Jennifer Rexford, and David Walker. Optimizing the ”one big switch” abstraction in software-defined networks. In ACM
CoNEXT, pages 13–24, 2013.
[38] Y. Kanizo, D. Hay, and I. Keslassy. Palette: Distributing tables in softwaredefined networks. In IEEE INFOCOM, pages 545–549, 2013.
[39] Jahangir Hasan and T. N. Vijaykumar. Dynamic pipelining: Making IPlookup truly scalable. SIGCOMM Comput. Commun. Rev., 35(4):205–216,
2005.
[40] Yi Wang, Yuan Zu, Ting Zhang, Kunyang Peng, Qunfeng Dong, Bin Liu, Wei
Meng, Huichen Dai, Xin Tian, Zhonghu Xu, Hao Wu, and Di Yang. Wire
speed name lookup: A GPU-based approach. In USENIX NSDI, pages 199–
212, 2013.
[41] Lihua Yuan, Hao Chen, Jianning Mai, Chen-Nee Chuah, Zhendong Su, and
P. Mohapatra. Fireman: a toolkit for firewall modeling and analysis. In IEEE
S&P, pages 199–213, 2006.
[42] Yu-Wei Eric Sung, Carsten Lund, Mark Lyn, Sanjay G. Rao, and Subhabrata
Sen. Modeling and understanding end-to-end class of service policies in operational networks. In ACM SIGCOMM, pages 219–230, 2009.
[43] Lihua Yuan, Chen-Nee Chuah, and P. Mohapatra. ProgME: Towards programmable network measurement. IEEE/ACM Transactions on Networking,
19(1):115–128, 2011.
[44] Mohamed G. Gouda and Alex X. Liu. Structured firewall design. Computer
Networks, 51(4):1106–1120, 2007.
[45] A.X. Liu and M.G. Gouda. Diverse firewall design. IEEE Transactions on
Parallel and Distributed Systems, 19(9):1237–1251, 2008.
[46] Gábor Rétvári, János Tapolcai, Attila Kőrösi, András Majdán, and Zalán
Heszberger. Compressing ip forwarding tables: Towards entropy bounds and
beyond. SIGCOMM Comput. Commun. Rev., 43(4):111–122, 2013.
[47] Amit Narayan, Adrian J. Isles, Jawahar Jain, Robert K. Brayton, and Alberto L. Sangiovanni-Vincentelli. Reachability analysis using partitionedROBDDs. In IEEE/ACM ICCAD, pages 388–393, 1997.
[48] Kirill Kogan, Sergey Nikolenko, William Culhane, Patrick Eugster, and
Eddie Ruan. Towards efficient implementation of packet classifiers in
SDN/OpenFlow. In ACM HotSDN, pages 153–154, 2013.