Dynamic Scheme for Packet Classification Using Splay
Trees
Nizar Ben-Neji and Adel Bouhoula
Higher School of Communications of Tunis (Sup’Com)
University November 7th at Carthage
City of Communications Technologies, 2083 El Ghazala, Ariana, Tunisia
[email protected], [email protected]
Abstract. Much research has been devoted to optimizing packet classification and filter matching
schemes in order to increase the performance of network devices such as firewalls and QoS
routers. Most of the proposed algorithms do not process packets dynamically and pay no
particular attention to the skewness of the traffic. In this paper, we design a set of self-adjusting
tree filters by combining the scheme of binary search on prefix length with the splay tree
model. As a result, we need at most 2 hash accesses per filter for consecutive occurrences of the
same value. We then use the splaying technique to optimize the early rejection of unwanted flows,
which is important for many filtering devices such as firewalls. Thus, rejecting a packet takes at
most 2 hash accesses per filter and at least one.
Keywords: Packet Classification, Binary Search on Prefix Length, Splay Tree, Early Rejection.
1 Introduction
In the packet classification problem, we wish to classify incoming packets into
classes based on predefined rules. Classes are defined by rules composed of multiple
header fields, mainly source and destination IP addresses, source and destination port
numbers, and a protocol type. On the one hand, packet classifiers must be constantly optimized
to cope with network traffic demands. On the other hand, few of the proposed
algorithms process packets dynamically, and this lack of dynamic packet filtering
solutions has been the motivation for this research.
Our study shows that a dynamic data structure is an effective way to take the skewness
of the traffic distribution into account. To this end, we adapt the splay tree data
structure to the binary search on prefix length algorithm. We have thus designed a set of
dynamic filters, one for each packet header field, to minimize the average matching time.
Moreover, packets that are processed and then rejected represent the most significant part of
the traffic handled by a firewall. Such packets may cause more harm than others when they
are rejected by the default-deny rule, because they traverse a long matching
path. Therefore, we use the splaying technique to reject the maximum number of
unwanted packets as early as possible.
This paper is organized as follows. In Section 2 we describe previously published related
work. In Section 3 we present the proposed techniques used to perform the binary search on
prefix length algorithm. In Section 4 we give a theoretical analysis of the proposed work.
Finally, in Section 5 we present the conclusion and our plans for future work.
E. Corchado et al. (Eds.): CISIS 2008, ASC 53, pp. 211–218, 2009.
springerlink.com
© Springer-Verlag Berlin Heidelberg 2009
2 Previous Work
Since our proposed work in this paper applies binary search on prefix length with
splay trees, we describe the binary search on prefix length algorithm in detail, then we
present a previous dynamic packet classification technique using splay trees called
Splay Tree based Packet Classification (ST-PC). After that, we present an early rejection technique for maximizing the rejection of unwanted packets.
Table 1. Example Rule Set
Rule no.  Src Prefix  Dst Prefix  Src Port  Dst Port  Proto.
R1        01001*      000111*     *         80        TCP
R2        01001*      00001*      *         80        TCP
R3        010*        000*        *         443       TCP
R4        0001*       0011*       *         443       TCP
R5        1011*       11010*      *         80        UDP
R6        1011*       110000*     *         80        UDP
R7        1010*       110*        *         443       UDP
R8        110*        1010*       *         443       UDP
R9        *           *           *         *         *
2.1 Binary Search on Prefix Length
Waldvogel et al. [1] proposed an IP lookup scheme based on binary search on
prefix length (Fig. 1). Their scheme performs a binary search on hash tables that are
organized by prefix length. Each hash table contains the prefixes of a given
length together with markers for longer prefixes. IP lookup
can then be done with O(log(Ldis)) hash table searches, where Ldis is the number of distinct
prefix lengths and Ldis<W-1, where W is the maximum possible length, in bits, of a
prefix in the filter table. Note that W=32 for IPv4 and W=128 for IPv6.
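The lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation of [1]: prefixes are bit strings taken from the Dst Prefix column of Table 1, each hash table is a plain dict, and the helper names (best_prefix, build_tables, lookup) are hypothetical. Each marker stores the best real prefix that covers it, so a miss at a longer length never requires backtracking.

```python
def best_prefix(bits, prefixes):
    """Longest real prefix that is a prefix of `bits`, or None."""
    covering = [p for p in prefixes if bits.startswith(p)]
    return max(covering, key=len) if covering else None


def build_tables(prefixes):
    """One hash table (dict) per distinct prefix length, plus markers."""
    lengths = sorted({len(p) for p in prefixes})
    tables = {n: {} for n in lengths}
    for p in prefixes:                      # real prefixes first
        tables[len(p)][p] = p
    for p in prefixes:                      # then markers along p's search path
        lo, hi = 0, len(lengths) - 1
        while lo <= hi:
            mid = (lo + hi) // 2
            n = lengths[mid]
            if n < len(p):
                # marker: remembers the best real prefix covering p[:n]
                tables[n].setdefault(p[:n], best_prefix(p[:n], prefixes))
                lo = mid + 1
            elif n > len(p):
                hi = mid - 1
            else:
                break
    return lengths, tables


def lookup(addr, lengths, tables):
    """Return (longest matching prefix or None, number of hash accesses)."""
    lo, hi = 0, len(lengths) - 1
    best, accesses = None, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        n = lengths[mid]
        accesses += 1
        key = addr[:n]
        if key in tables[n]:
            if tables[n][key] is not None:
                best = tables[n][key]
            lo = mid + 1                    # hit: try longer lengths
        else:
            hi = mid - 1                    # miss: try shorter lengths
    return best, accesses


# Destination prefixes of Table 1 (default prefix omitted), 8-bit addresses.
DST = ["000111", "00001", "000", "0011", "11010", "110000", "110", "1010"]
LENGTHS, TABLES = build_tables(DST)
print(lookup("00011101", LENGTHS, TABLES))
print(lookup("11011111", LENGTHS, TABLES))
```

With the four distinct lengths of this example (3, 4, 5 and 6), every lookup needs at most three hash accesses, whatever the number of prefixes.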
Several works have sought to improve this scheme. For instance, Srinivasan and
Varghese [3] and Kim and Sahni [4] proposed ways to improve the performance
of binary search on lengths by using prefix expansion to reduce the value
of Ldis. In addition, an asymmetric binary search tree was proposed to reduce
the average number of hash computations. This tree places values with a higher
occurrence probability (matching frequency) at higher tree levels than values with a
lower probability. Consequently, the search tree has to be rebuilt periodically based on the
traffic characteristics. A rope search algorithm was also proposed to reduce the average
number of hash computations, but it increases the rebuild time of the search tree
because it relies on precomputation. As a result, rebuilding the tree after a rule is
inserted into or deleted from the policy has O(NW) time complexity.
Fig. 1. This shows a binary tree for the destination prefix field of Table 1 and the access order
performing the binary search on prefix lengths proposed by Waldvogel [1]. (M: Marker)
2.2 Splay Tree Packet Classification Technique (ST-PC)
The idea of the Splay Tree Packet Classification technique [2] is to convert the set of
prefixes into integer ranges, as shown in Table 2, and then to put all the lower and upper
values of the ranges into a splay tree data structure. Each node of the data structure
also stores all of its matching rules, as shown in Fig. 2. The
same procedure is repeated for the filters of the other packet header fields. Finally,
all the splay trees are linked to each other to determine the corresponding action
for the incoming packets.
To look up the list of matching rules, we first convert the values of the
packet's header fields to integers, then we find the matching rules for each
field separately, and finally we select a final set of matching rules for the incoming packet.
By the splay tree property, each newly accessed node becomes the root (Fig. 2).
Table 2. Destination Prefix Conversion
Rule no.  Dst Prefix  Lower Bound  Upper Bound  Start  Finish
R1        000111*     00011100     00011111     28     31
R2        00001*      00001000     00001111     8      15
R3        000*        00000000     00011111     0      31
R4        0011*       00110000     00111111     48     63
R5        11010*      11010000     11010111     208    215
R6        110000*     11000000     11000011     192    195
R7        110*        11000000     11011111     192    223
R8        1010*       10100000     10101111     160    175
R9        *           00000000     11111111     0      255
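The conversion used in Table 2 can be sketched as follows. The function name prefix_to_range and the width parameter are illustrative assumptions; the table uses 8-bit values, so width defaults to 8. The lower bound pads the prefix with zeros and the upper bound pads it with ones.

```python
def prefix_to_range(prefix, width=8):
    """Convert a prefix such as '110*' into its (start, finish) integer range."""
    bits = prefix.rstrip("*")
    free = width - len(bits)                 # number of wildcard bits
    low = int(bits, 2) << free if bits else 0
    high = low | ((1 << free) - 1)           # set every wildcard bit to 1
    return low, high


for rule, dst in [("R1", "000111*"), ("R7", "110*"), ("R9", "*")]:
    print(rule, dst, prefix_to_range(dst))
```

The bounds returned for every destination prefix of Table 2 are the values that ST-PC would insert as keys into the splay tree.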
2.3 Early Rejection Rules
The technique of early rejection rules was proposed by El-Atawy et al. [5], using
an approximation algorithm that analyzes the firewall policy in order to construct a set
of early rejection rules. This set can reject the maximum number of unwanted packets
as early as possible. Unfortunately, constructing the optimal set of rejection
rules is an NP-complete problem, and adding these rules may increase the size of the matching
rules list. Early rejection works periodically by building a list of the most frequently hit
rejection rules. Incoming packets are then compared against that list before the
normal packet filters are tried.
Fig. 2. Part (a) shows the destination splay tree constructed from Table 2 with the
corresponding matching rules. The newly accessed node becomes the root, as shown in (b).
3 Proposed Filter
Our basic scheme consists of a set of self-adjusting filters (SA-BSPL: Self-Adjusting
Binary Search on Prefix Length). The proposed filter can be applied to every
packet header field. Accordingly, it can provide exact matching for the protocol
field, prefix matching for IP addresses, and range matching for port numbers.
3.1 Splay Tree Filters
Our work builds on the scheme of Waldvogel et al. [1] by designing splay tree filters
to improve the global average search time. A filter consists of a collection of hash
tables and a splay tree, with no need to represent the default prefix (of zero length).
Prefixes are grouped by length in hash tables (Fig. 3), and each hash table is augmented
with markers for longer prefixes (Fig. 1 shows an example). We still
use binary search on prefix length, but with splaying operations that push every newly
accessed node to the top of the tree. Because the search starts at the root and
ends at the leaves of the search tree, as described in [1], we also have to splay the
successor of the best length to the root.right position (Fig. 4-a). Consequently, the tree is
adjusted so that all repeated values need at most 2 hash accesses.
Fig. 3. This figure shows the collection of hash tables of the destination address filter according
to the example in Table 1. We denote by M the marker and by P the prefix.
Fig. 4. This figure shows the operations of splaying to early accept the repeated packets (a) and
to early reject the unwanted ones (b). Note that x+ is the successor of x, and M is the minimum
item in the tree.
Fig. 5. This figure shows the alternative operations of splaying when x+ appears after x in the
search path (a) or before x (b)
We therefore start the search from the root node. If we get a match, we update
the best length value and the list of matching rules and then move on to higher lengths;
if nothing matches, we move on to lower lengths (Fig. 6). The search stops when a
leaf is reached. After that, the best length value and its successor are splayed to the
top of the tree (Fig. 4-a). The search can be made faster by using the top-down splay tree
model, which combines the searching and splaying steps.
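The search and splaying steps can be sketched as follows. This is a minimal illustration under several assumptions that are not taken from the paper: a simple recursive splay is used instead of the top-down variant, markers are stored at every shorter length so that the search stays correct whatever the current shape of the tree, and all names (Node, splay, build_tables, lookup) are hypothetical. The tree keys are the prefix lengths of the Dst Prefix column of Table 1.

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def rotate_right(n):
    l = n.left
    n.left, l.right = l.right, n
    return l

def rotate_left(n):
    r = n.right
    n.right, r.left = r.left, n
    return r

def splay(root, key):
    """Bring `key` (or the last node on its search path) to the root."""
    if root is None or root.key == key:
        return root
    if key < root.key:
        if root.left is None:
            return root
        if key < root.left.key:                       # zig-zig
            root.left.left = splay(root.left.left, key)
            root = rotate_right(root)
        elif key > root.left.key:                     # zig-zag
            root.left.right = splay(root.left.right, key)
            if root.left.right is not None:
                root.left = rotate_left(root.left)
        return rotate_right(root) if root.left is not None else root
    if root.right is None:
        return root
    if key > root.right.key:                          # zig-zig
        root.right.right = splay(root.right.right, key)
        root = rotate_left(root)
    elif key < root.right.key:                        # zig-zag
        root.right.left = splay(root.right.left, key)
        if root.right.left is not None:
            root.right = rotate_right(root.right)
    return rotate_left(root) if root.right is not None else root

def insert(root, key):
    """Plain binary search tree insertion (no splaying)."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root

def build_tables(prefixes):
    """Hash table per length; entries map to the best covering real prefix."""
    lengths = sorted({len(p) for p in prefixes})
    tables = {n: {} for n in lengths}
    for p in prefixes:
        for n in lengths:
            if n <= len(p):                           # prefix itself or a marker
                covering = [q for q in prefixes if p[:n].startswith(q)]
                tables[n][p[:n]] = max(covering, key=len) if covering else None
    return tables

def lookup(root, tables, addr):
    """Search, then splay the best length to root and its successor to root.right."""
    node, best_len, match, accesses = root, None, None, 0
    while node is not None:
        accesses += 1
        key = addr[:node.key]
        if key in tables[node.key]:
            best_len, match = node.key, tables[node.key][key]
            node = node.right                         # hit: go to higher lengths
        else:
            node = node.left                          # miss: go to lower lengths
    if best_len is not None:
        root = splay(root, best_len)
        if root.right is not None:                    # successor = min of right subtree
            root.right = splay(root.right, best_len)
    return root, match, accesses

DST = ["000111", "00001", "000", "0011", "11010", "110000", "110", "1010"]
tables = build_tables(DST)
root = None
for n in (6, 5, 4, 3):                                # deliberately unbalanced tree
    root = insert(root, n)
root, match, first = lookup(root, tables, "11011111")
root, match, second = lookup(root, tables, "11011111")
print(match, first, second)
```

In this run the first lookup walks three nodes of the unbalanced tree, while the repeated lookup stops after the root and its right child, which illustrates the bound of 2 hash accesses for repeated values.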
We have also designed an alternative implementation of splaying that has a
slightly better amortized time bound. We look for x and x+, then we splay x or x+
until x+ becomes the right child of x. These two nodes are then splayed to
the top of the tree as a single node (Fig. 5). Hence, we obtain a better amortized cost
than with the operations of Fig. 4-a.
3.2 Early Rejection Technique
In this section, we focus on optimizing the matching of traffic discarded by the
default-deny rule, because it has a more profound effect on the performance of firewalls.
In our case, there is no need to represent the default prefix (of zero length),
so a packet that does not match any length is automatically rejected at the
Min-node. In general, rejected packets may traverse a long decision path of rule
matching before they are finally rejected by the default-deny rule. The left child of the
Min-node is Null; hence, if a packet does not match the Min-node, we move to its left child,
which is Null, meaning that this node is the end of the search path.
Fig. 6. The algorithm of binary search on prefix length combined with splaying operations
In each filter, a rejected packet has to traverse the tree down to the node with the
minimum value, so it may follow a long path in the search tree before it is rejected at
the Min-node. Hence, we always have to keep the Min-node in the upper levels of the
self-adjusting tree: we splay the Min-node to the root.left position, as shown in Fig. 4-b.
Either bottom-up or top-down splay trees can be used. However, the top-down splay tree is
much more efficient for the early rejection technique, because it allows us to keep the
Min-node fixed in the desired position while searching for the best matching value.
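The effect of keeping the Min-node at the root.left position can be sketched as follows. This is a simplified illustration, not the authors' procedure: the minimum is raised with plain right rotations rather than zig-zig splay steps, a rejected packet is modeled as one that misses every hash table and therefore always moves to the left child, and the names Node, raise_min and reject_accesses are hypothetical.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def raise_min(sub):
    """Rotate the minimum of a subtree to its top using right rotations."""
    while sub is not None and sub.left is not None:
        l = sub.left
        sub.left, l.right = l.right, sub   # right rotation at the subtree root
        sub = l
    return sub


def reject_accesses(root):
    """Hash accesses for a packet that misses every table (always goes left)."""
    count, node = 0, root
    while node is not None:
        count += 1
        node = node.left
    return count


# Prefix lengths 8, 16, 24, 32 arranged as a left chain below the root.
root = Node(32, left=Node(24, left=Node(16, left=Node(8))))
before = reject_accesses(root)
root.left = raise_min(root.left)           # Min-node (length 8) now at root.left
after = reject_accesses(root)
print(before, after)
```

After the adjustment a rejected packet probes only the root and the Min-node, which matches the bound of at most 2 hash accesses per filter; it would need a single access if the Min-node were itself the root.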
4 Complexity Analysis
In this section, we first give the complexity analysis of the operations used
to adapt the splay tree data structure to the binary search on prefix length algorithm,
and then we give the cost per filter of our early rejection technique.
4.1 Amortized Analysis
On an n-node splay tree, all the standard search tree operations have an amortized
time bound of O(log(n)) per operation. According to the analysis of Sleator and Tarjan [6],
if each item x of the splay tree is given a weight $w_x$, with $W_t$ denoting the sum of
the weights in the tree t, then the amortized cost of accessing an item x has the following
upper bounds (where $x^+$ denotes the item following x in the tree t and $x^-$ the
item preceding x):
$3 \log(W_t / w_x) + O(1)$, if x is in the tree t.    (0)
$3 \log(W_t / \min(w_{x^-}, w_{x^+})) + O(1)$, if x is not in the tree t.    (1)
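As a small numerical illustration of bounds (0) and (1), the sketch below evaluates their logarithmic terms for given weights. The function names and the example weights are assumptions made for illustration only; the additive O(1) term is left out and the logarithm is taken in base 2.

```python
import math


def access_bound(total_weight, w_x):
    """Logarithmic term of bound (0): item x is in the tree."""
    return 3 * math.log2(total_weight / w_x)


def miss_bound(total_weight, w_pred, w_succ):
    """Logarithmic term of bound (1): x is absent, neighbours x- and x+ are used."""
    return 3 * math.log2(total_weight / min(w_pred, w_succ))


# Hypothetical weights, e.g. hit counts per prefix length, summing to 100.
weights = {3: 10, 4: 25, 5: 40, 6: 25}
total = sum(weights.values())
for length, w in weights.items():
    print(length, round(access_bound(total, w), 2))
```

The heavier an item is relative to the total weight, the smaller its bound, which is why frequently matched lengths are cheap to access.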
In our case, we have to rotate the best length value to the root position and its
successor to the root.right position (Fig. 4-a). These two operations have a logarithmic
amortized complexity, and their time cost is calculated using (0) and (1):
$3 \log(W_t / w_x) + 3 \log((W_t - w_x) / w_{x^+}) + O(1)$.    (2)
With the alternative optimized implementation of splaying (Fig. 5), we obtain a slightly
better amortized time bound than the one given in (2):
$3 \log((W_t - w_x) / \min(w_x, w_{x^+})) + O(1)$.    (3)
On the other hand, the cost of the early rejection step is
$3 \log((W_t - w_x) / w_{min}) + O(1)$, where $w_{min}$ is the weight of the Min-node.
With the proposed early rejection technique, rejecting a packet takes at most 2 hash
accesses and at least one. If m values are to be rejected by the default-deny rule, the
search time is at most m+1 hash accesses per filter and at least m hash accesses.
4.2 Number of Nodes
The time complexity depends on the number of nodes. For the ST-PC scheme,
in the worst case, if we assume that all the prefixes are distinct, we have at
most 2^W nodes per tree, where W is the length of the longest prefix. Moreover, if
all bounds are distinct, we have 2r nodes, where r is the number of rules.
So, the actual worst-case number of nodes in ST-PC is the minimum of
these two values. Our self-adjusting tree structure is based on binary search on
hash tables organized by prefix length. Hence, the number of nodes in our case is
equal to the number of hash tables: if W is the length of the longest prefix, we have
at most W nodes. Since the number of nodes in our self-adjusting tree
is smaller than in ST-PC, especially with a large number of rules (Fig. 7),
our scheme is more competitive in terms of time and scalability.
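The two worst-case node counts can be compared with a short sketch. It reads the first ST-PC bound as 2^W, the number of distinct W-bit values, which is an assumption about the intended formula; the function names are hypothetical.

```python
def stpc_max_nodes(width, rules):
    """Worst-case ST-PC nodes: min of distinct W-bit values (assumed 2^W) and 2r bounds."""
    return min(2 ** width, 2 * rules)


def sabspl_max_nodes(width):
    """Worst-case SA-BSPL nodes: one node per prefix length, at most W."""
    return width


for width, rules in [(8, 100), (32, 1000), (32, 100000)]:
    print(width, rules, stpc_max_nodes(width, rules), sabspl_max_nodes(width))
```

For a fixed W, the self-adjusting tree stays at no more than W nodes while the ST-PC bound grows with the number of rules.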
Fig. 7. (a) This figure shows the distribution of the number of nodes with respect to W and r,
where W is the maximum possible length in bits and r the number of rules
5 Conclusion
The packet classification optimization problem has received the attention of the research
community for many years. Nevertheless, there is a clear need for new
directions that enable filtering devices such as firewalls to keep up with
high-speed networking demands. In this work, we have proposed a dynamic scheme
that uses a collection of hash tables and a splay tree to filter each header field.
We have also shown that our scheme is suited to taking advantage of locality in the
incoming requests, where locality is the tendency to look for the same element
multiple times. Finally, we have reached this goal with a logarithmic time cost. In
future work, we wish to optimize data storage and other performance aspects.
References
1. Waldvogel, M., Varghese, G., Turner, J., Plattner, B.: Scalable High-Speed IP Routing Lookups. ACM SIGCOMM Comput. Commun. Rev. 27(4), 25–36 (1997)
2. Srinivasan, T., Nivedita, M., Mahadevan, V.: Efficient Packet Classification Using Splay Tree Models. IJCSNS International Journal of Computer Science and Network Security 6(5), 28–35 (2006)
3. Srinivasan, V., Varghese, G.: Fast Address Lookups using Controlled Prefix Expansion. ACM Trans. Comput. Syst. 17(1), 1–40 (1999)
4. Kim, K., Sahni, S.: IP Lookup by Binary Search on Prefix Length. J. Interconnect. Netw. 3, 105–128 (2002)
5. Hamed, H., El-Atawy, A., Al-Shaer, E.: Adaptive Statistical Optimization Techniques for Firewall Packet Filtering. In: IEEE INFOCOM, Barcelona, Spain, pp. 1–12 (2006)
6. Sleator, D., Tarjan, R.: Self Adjusting Binary Search Trees. Journal of the ACM 32, 652–686 (1985)