A Study of Impacts of Flow Timeouts on Link Provisioning

Jeroen Fokkema
University of Twente
P.O. Box 217, 7500AE Enschede
The Netherlands
ABSTRACT
Link provisioning is used in backbone links to ensure that
quality of service goals are met. This relies on accurate
estimations of the required capacity for a link. Current approaches lack this accuracy, which may result in over- and underdimensioning of links.
Alternative approaches, as found in the literature, often require network traffic measurements at the packet level. These measurements are usually costly and do not scale to high-speed networks.
Therefore, a new method for link dimensioning has been proposed. This method relies on traffic measurements at the flow level and the results are promising. This research further investigates this method by assessing the impact of flow timeouts on the accuracy of the bandwidth provisioning formula.
Our results show that the smaller the timeouts, the higher the costs of bandwidth dimensioning. On the other hand, smaller timeouts do not automatically lead to more accurate results.
Keywords
Link dimensioning, Bandwidth estimation, flows, IPFIX,
NetFlow
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
19th Twente Student Conference on IT, June 24th, 2013, Enschede, The Netherlands.
Copyright 2013, University of Twente, Faculty of Electrical Engineering, Mathematics and Computer Science.
1. INTRODUCTION
Link dimensioning is the process of setting the bandwidth capacity of a link to a suitable value. ISPs use it to manage bandwidth availability, but proper link dimensioning requires information about the usage of a link. Most of the time, this information is obtained by measuring the average traffic on the link. SNMP [14] counters are used for this purpose, with time intervals of five to ten minutes. The collected data can be used to estimate the future usage of a link by taking the average traffic and adding a safety margin to handle short bursts of traffic. The link is then dimensioned using this information.
The major problem with this link dimensioning method is that peaks in the bandwidth usage on short timescales cannot be measured. Adding a safety margin to the dimensioned link can be used as a solution to this problem. However, this margin is chosen by rule of thumb and is therefore not accurate. Another solution is to measure the packets that are sent over the link instead of the averaged traffic, but packet traces are very costly and do not scale to high-speed networks.
Therefore, a new method and formula for bandwidth (link) dimensioning has been proposed, which uses flow-data [12]. This method still uses averages, but the periods over which these averages are calculated are much shorter than those currently used to calculate average bandwidth values.
Several parameters influence the outcome of the bandwidth formula proposed in [12], among them the flow timeouts. This paper investigates the impact of different settings for these timeouts by observing the accuracy of the bandwidth provisioning, and the cost of performing it, for several timeout settings. The outcomes of this research can inform the choice of flow timeouts when using the new bandwidth estimation formula.
To assess the reliability and costs of the bandwidth provisioning method for different flow timeouts, the following main research question needs to be answered: "What are the best timer settings for doing bandwidth estimation using flow-level measurements?" This question has been divided into two subquestions:
1. What are the implications of different timer settings on the costs of doing bandwidth estimations using flow-data?
2. What are the implications of different timer settings on the accuracy of bandwidth estimation?
These two questions are answered by performing experiments and analyzing their results. Section 2 provides the state of the art. Section 3 describes the setup of the experiments. The results of the experiments are described in Section 4. Section 5 analyses the results and answers research questions (1) and (2). Finally, Section 6 provides the conclusions and recommends future work.
2. STATE OF THE ART
2.1 Current Bandwidth Provisioning Methods
Currently, bandwidth provisioning is mostly done using
the following steps. First, the average usage of a link is
measured by using SNMP counters [14]. Then the capacity
of this link will be dimensioned to the value of this average
usage plus a safety margin of about 30 percent to deal
with fluctuations in the usage of the link. This method is
easy to deploy, since all network devices implement SNMP.
However, fluctuations of the traffic are not measured. This
can result in a bad user experience when the traffic on
the link consists of a lot of short bursts [9]. Likewise,
this method can also result in overprovisioning and thus a
waste of resources for the maintainer of the link.
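The rule of thumb described above can be sketched in a few lines of Python. This is only an illustration: the function name, the input layout (bytes counted per polling interval) and the example numbers are assumptions, not part of the original study.

```python
def rule_of_thumb_capacity(counter_deltas_bytes, interval_s, margin=0.30):
    """Average link usage from SNMP-style byte-counter deltas plus a safety margin.

    counter_deltas_bytes: bytes transferred in each polling interval (hypothetical input)
    interval_s: polling interval in seconds (e.g. 300 for five minutes)
    margin: safety margin as a fraction (0.30 means 30 percent)
    Returns the suggested capacity in bit/s.
    """
    if not counter_deltas_bytes:
        raise ValueError("need at least one polling interval")
    total_bits = 8 * sum(counter_deltas_bytes)
    average_bps = total_bits / (interval_s * len(counter_deltas_bytes))
    return average_bps * (1.0 + margin)


# Two five-minute intervals averaging 100 Mbit/s each. Any burst inside an
# interval is invisible, because only the per-interval total is known.
deltas = [3_750_000_000, 3_750_000_000]
print(f"{rule_of_thumb_capacity(deltas, interval_s=300) / 1e6:.1f} Mbit/s")
```

The sketch makes the weakness visible: the result depends only on the interval totals and the fixed margin, so two links with the same average but very different burstiness get the same capacity.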
2.2 Methods for Bandwidth Estimation
To get a better estimation of the bandwidth usage, several methods have been proposed. For example, research has been done using the Gaussian distribution of traffic in fast networks [10, 5, 8, 2]. Research has also been done using the batch Markovian arrival process [7]. This is just a subset of the proposed methods for measuring link usage, and most of these methods provide accurate results. However, all these methods rely on packet traces for finding the right parameters to work with. The costs of acquiring packet traces make daily use of these methods very unattractive. Therefore, bandwidth estimation using flow-level measurements can be considered as an alternative.
2.3 Flow-level Measurements
Some methods have been proposed that use flow-level measurements to do bandwidth provisioning [12, 13]. These methods rely on flow-data from, for example, NetFlow [4] or IPFIX [11]. Because many network devices implement one or more of these protocols, flow-level measurements can be used without great investments. Furthermore, bandwidth estimation using flow-data requires far fewer resources than bandwidth estimation methods using packet traces. Flow-data is acquired by measuring flows, where a flow is a set of packets that share common properties and pass an observation point in the network [11].
The difference between flow-level measurements and the method often used for bandwidth provisioning nowadays is that the timescale over which flow-level measurements take averages is much smaller. This makes it much more likely that small bursts of data are recognized. Furthermore, flow-level measurements measure averages for every connection on a link and not the average over all of the connections on a link. Therefore, flow-level measurements can be expected to be much more accurate than the methods often used nowadays.
2.4 Contribution of the Proposed Research
To use flow-level measurements as proposed in [12], some parameters have to be configured precisely in order to get accurate estimations. Two of these parameters are the active and the inactive timeout of the timers used to create the flows. The active timeout defines the maximum length of a flow: a flow that is still active when the active timeout expires is terminated, and subsequent packets start a new flow. The inactive timeout works alongside it: when a flow has been idle for the time set by the inactive timeout, it is terminated, even if the active timeout has not expired yet. The lower these parameters are set, the closer the bandwidth usage estimation is expected to be to the actual bandwidth usage. However, the costs, in terms of the computing resources needed for the bandwidth estimation, are expected to be higher. This research investigates the actual effects of these parameters and searches for a good balance between the accuracy of the estimation and the costs of doing the measurements.
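The interplay of the two timers can be illustrated with a small Python sketch that cuts the packets of a single connection into flow records. It is a simplification under stated assumptions: expiry is checked only when the next packet arrives (a real exporter uses timers), and the function name and record layout are hypothetical.

```python
def split_into_flow_records(packet_times, active_timeout, inactive_timeout):
    """Cut one connection's packet timestamps (sorted, in seconds) into flow records.

    Returns a list of (start, end, packet_count) tuples.
    A record is closed when the connection was idle for at least
    inactive_timeout, or when the record has lasted at least active_timeout.
    """
    records = []
    start = last = None
    count = 0
    for t in packet_times:
        if start is None:
            start, last, count = t, t, 1
            continue
        idle_expired = (t - last) >= inactive_timeout
        active_expired = (t - start) >= active_timeout
        if idle_expired or active_expired:
            records.append((start, last, count))
            start, last, count = t, t, 1   # this packet opens a new record
        else:
            last, count = t, count + 1
    if start is not None:
        records.append((start, last, count))
    return records


# One packet per second for 20 seconds, then a single late packet at t = 60.
packet_times = list(range(20)) + [60]
for active, inactive in [(300, 120), (120, 30), (30, 10), (5, 2)]:
    n = len(split_into_flow_records(packet_times, active, inactive))
    print(f"{active}-{inactive}: {n} record(s)")
```

For this toy connection, shorter timeouts produce more records for the same traffic, which is the cost effect examined in the rest of the paper.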
3. SETUP OF THE EXPERIMENTS
The traffic used in this work was captured on a backbone link that interconnects the cities of Chicago and Seattle [3]. A total of one hour of measurements was used, divided into four 15-minute trace files, called trace 1, trace 2, trace 3 and trace 4. These traces consist of packet-data, from which flow-data has been generated, so for all of the files packet-data as well as flow-data is available.
To process these files and obtain usable output, the tool YAF [6] has been used. This tool processes the flows from the trace files and makes readable text out of them.
The rest of the processing for the research has been done using self-written AWK [1] scripts, bash scripts and C++ tools. For the plotting, gnuplot (footnote 1) has been used.
3.1 Measuring the Costs
In order to assess the implications of different flow timeouts for the cost of the bandwidth provisioning, the properties of the flow-data have to be acquired. This is done by generating data from the trace files for different timeout settings using YAF. The active timeouts are 300, 120, 30 and 5 seconds and the inactive timeouts are set to 120, 30, 10 and 2 seconds, respectively. For the rest of this paper, these timeout combinations will be presented in the form [active timeout]-[inactive timeout].
The combination of 120 seconds for the active timeout and 30 seconds for the inactive timeout (120-30) is a standard combination used for flow monitoring at the University of Twente. Since it is likely that lower timeouts give better results, the combinations 30s-10s and 5s-2s are used. To have some data to compare with, the higher flow timeout combination 300s-120s is investigated as well.
The costs will be analyzed by measuring the number of flow records that are generated using different flow timeouts. The number of flow records is an indication of the amount of resources needed to do the bandwidth estimation using these timeouts. If the data is to be used for real-time measurements, then every record has to be sent over the network, which can result in large amounts of extra traffic. On the other hand, when the records are not used in real time but have to be stored on the network device, the device requires a large amount of storage. Every flow record is needed for the bandwidth provisioning estimation, so the number of records is a good indication of the costs for different flow timeouts.
3.2 Bandwidth Provisioning
The bandwidth provisioning is accomplished by using the formula proposed in [12]:
$$C(T, \varepsilon) = \rho + \frac{1}{T} \sqrt{-2 \log(\varepsilon) \cdot v(T)}.$$
This formula has several parameters that have to be set. First of all, there is the fault margin, called ε. This is set to 1 percent, as we want to account for most of the fluctuations in the traffic. It means that for 1 percent of the time, the traffic is allowed to require more bandwidth than provisioned by the formula. Furthermore, we have to set T, the maximum amount of delay a user may experience. This parameter is set to 500 ms, 750 ms, 1 second, 2 seconds and 5 seconds, because 1 second has been shown to be the delay a user may experience before labeling the connectivity as a bad user experience [8]. Using values around this 1 second allows us to compare the results that the formula gives. ρ is the mean throughput of the traffic and v(T) the variance of the traffic.
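To get a feeling for how the fault margin ε enters the formula, the factor under the square root can be evaluated for the value used here. The step below assumes that log denotes the natural logarithm, as is usual for formulas of this type that are derived from a Gaussian tail bound:

```latex
-2 \log(0.01) \approx 9.21,
\qquad
\sqrt{-2 \log(0.01)} \approx 3.03,
\qquad\text{so}\qquad
C(T, 0.01) \approx \rho + 3.03 \, \frac{\sqrt{v(T)}}{T}.
```

Under this assumption, a fault margin of 1 percent amounts to provisioning the mean throughput plus roughly three standard deviations of the traffic, scaled by the timescale T.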
Footnote 1: http://www.gnuplot.info/, accessed on June 1, 2013.
Performing the bandwidth provisioning is a matter of applying the proposed formula to all of the generated flow files. This is done by calculating ρ and v from the flow files. ρ is calculated by taking the number of bytes of every flow and dividing the total by the number of flows. v is calculated using the standard formula for variance. This is the part of the bandwidth provisioning where differences in flow timeouts result in different outcomes: since ρ and v are calculated from the flow files, these variables differ for each of the flow timeout settings, while all other variables are not influenced by these settings.
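A minimal Python sketch of the formula is given below. Several things in it are assumptions for illustration only: the function names, the use of the natural logarithm, and the choice to estimate ρ and v(T) from per-interval traffic volumes, which is not necessarily how the scripts used in this research computed them.

```python
import math
from statistics import fmean, pvariance


def required_capacity(rho, variance, T, epsilon):
    """C(T, eps) = rho + (1/T) * sqrt(-2 * ln(eps) * v(T)).

    rho: mean throughput in bit/s
    variance: v(T), variance of the traffic volume (in bits) per interval of length T
    T: interval length in seconds
    epsilon: fault margin as a fraction, strictly between 0 and 1
    """
    if not 0.0 < epsilon < 1.0:
        raise ValueError("epsilon must lie strictly between 0 and 1")
    if T <= 0 or variance < 0:
        raise ValueError("T must be positive and the variance non-negative")
    return rho + math.sqrt(-2.0 * math.log(epsilon) * variance) / T


def capacity_from_volumes(volumes_bits, T, epsilon=0.01):
    """Estimate rho and v(T) from per-interval volumes, then apply the formula."""
    rho = fmean(volumes_bits) / T        # mean rate in bit/s
    variance = pvariance(volumes_bits)   # population variance of the volumes
    return required_capacity(rho, variance, T, epsilon)


# Hypothetical volumes (bits) observed in four one-second intervals.
volumes = [1.50e9, 1.60e9, 1.40e9, 1.55e9]
print(f"{capacity_from_volumes(volumes, T=1.0) / 1e6:.1f} Mbit/s")
```

When the traffic shows no variation at all, the square-root term vanishes and the sketch returns the mean throughput, which is a quick sanity check on any implementation.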
The outcome of the formula is then evaluated by plotting the time series of the packet data. This data may exceed the bandwidth that the provisioning formula suggests for dimensioning the link. But if the data exceeds this limit for more than the fault margin of 1 percent of the time, the outcome of the bandwidth provisioning formula is not applicable for these flow timeout settings, since the outcome is too inaccurate.
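The applicability check described above can be sketched as follows. The function names and the input (a list with the measured packet-level throughput per interval) are assumptions made for this illustration.

```python
def exceedance_percentage(throughput_bps, capacity_bps):
    """Percentage of measurement intervals in which the traffic exceeds the capacity."""
    if not throughput_bps:
        raise ValueError("need at least one interval")
    exceeded = sum(1 for rate in throughput_bps if rate > capacity_bps)
    return 100.0 * exceeded / len(throughput_bps)


def is_applicable(throughput_bps, capacity_bps, fault_margin_pct=1.0):
    """An outcome is usable only if the measured error stays within the fault margin."""
    return exceedance_percentage(throughput_bps, capacity_bps) <= fault_margin_pct


# Hypothetical series: 900 intervals below the provisioned capacity, 3 above it.
rates = [1.4e9] * 900 + [1.7e9] * 3
print(f"{exceedance_percentage(rates, 1.56e9):.4f} % of intervals exceed the capacity")
print("applicable:", is_applicable(rates, 1.56e9))
```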
The last part of the research is to take the outcomes of the bandwidth provisioning formula that are applicable and to see how accurate the results are. In other words: how closely the minimum bandwidth that should be provisioned according to the bandwidth estimation formula approaches the maximum bandwidth that is actually used, when traffic peaks are taken into account. These results show the influence of the different timeout settings on the accuracy of the bandwidth estimation formula.
4. RESULTS
4.1 The Implications of Using Flow-Data instead of Packet-Data
Figure 1 shows the usage of the measured link over a period of 100 seconds; a longer period would produce an unreadable graph. One of the lines represents the packet-data, which is the actual usage of the link. The other four lines show the usage of the link as measured using flow-data, each with a different [active timeout]-[inactive timeout] combination.
It is clear that at the beginning of the measurements, using flow-data with large timeouts results in an inaccurate representation of the real bandwidth usage. The reason for this inaccuracy is that during the initial period that is shorter than the inactive timeout, none of the flows will be terminated. This means that the short bursts of traffic in this time span will not be measured accurately.
At the same time, some heavy fluctuations of bandwidth usage can be observed in the flow-data. This is because long connections are cut into smaller flows. At the beginning of the measurements, this results in the peaks that can be observed. For example, for the timeout combination 5s-2s a drop in traffic is measured every 5 seconds, because no flow will be longer than 5 seconds when these timeouts are used.
This graph shows the implications of using flow-data: there are some irregularities in the representation of the actual traffic, and these irregularities may result in inaccurate bandwidth provisioning.
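One way to turn flow records into a usage time series like the flow-data lines discussed above is to spread the bytes of each flow evenly over its lifetime. The sketch below assumes this uniform spreading, which is a common simplification and not necessarily the exact procedure used for Figure 1; the function name and the record layout are hypothetical.

```python
import math


def flows_to_time_series(flows, bin_s, duration_s):
    """Spread each flow's bytes uniformly over its lifetime and sum them per time bin.

    flows: iterable of (start_s, end_s, n_bytes) tuples (hypothetical record layout)
    Returns a list with the throughput in bit/s for each bin of length bin_s.
    """
    n_bins = int(math.ceil(duration_s / bin_s))
    volume = [0.0] * n_bins                  # bytes per bin
    for start, end, n_bytes in flows:
        if end <= start:
            # Zero-length flow (e.g. a single packet): put everything in one bin.
            idx = min(int(start // bin_s), n_bins - 1)
            volume[idx] += n_bytes
            continue
        rate = n_bytes / (end - start)       # bytes per second, assumed constant
        first = int(start // bin_s)
        last = min(int(end // bin_s), n_bins - 1)
        for idx in range(first, last + 1):
            lo = max(start, idx * bin_s)
            hi = min(end, (idx + 1) * bin_s)
            if hi > lo:
                volume[idx] += rate * (hi - lo)
    return [8.0 * v / bin_s for v in volume]


# A 2-second flow of 200 bytes and a single-packet flow of 50 bytes at t = 1 s.
print(flows_to_time_series([(0.0, 2.0, 200), (1.0, 1.0, 50)], bin_s=1.0, duration_s=3.0))
# prints [800.0, 1200.0, 0.0]
```

Under this assumption a burst inside a long flow is smeared out over the whole record, which is one reason why long timeouts can hide short peaks.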
4.2 Record Measurements
Table 1 shows the number of flow records that are generated using different combinations of flow timeouts for the first of the four trace files. The left column of the table shows the flow timeout combinations, while the right column shows the number of records that were generated using these timeouts. These results are also generated for the other three traces and can be found in Appendix A, Tables 3, 4 and 5, respectively. All of these results show almost the same number of records for each combination of timeouts.
Table 1. number of records for trace 1
timeouts number of records
300-120 16672439
120-30 18495129
30-10 21374046
5-2 29724124
4.3 Bandwidth Provisioning
Table 2 shows the results of the bandwidth provisioning formula for the first of the four trace files. The column timeouts shows the combination of timeout settings that has been used for generating the flows, and the column T shows the value, in milliseconds, that has been set as the maximum amount of delay a user may experience. The column outcome presents the outcome of the bandwidth dimensioning formula in megabits per second. The column ε shows the error of the outcome of the formula, i.e. the percentage of time that the link has a higher bandwidth usage than it should be provisioned for according to the bandwidth dimensioning formula. If ε is larger than 1, the outcome of the formula is too inaccurate to be applicable.
The results of the bandwidth provisioning formula for the other trace files can be found in Appendix A, Tables 6, 7 and 8.
Table 2. Bandwidth provisioning formula outcomes for trace 1
timeouts  T (ms)  outcome (Mbps)  ε (%)
120-30    1000    1560.67         0.332226
120-30    2000    1508.07         3.09051
120-30    5000    1477.27         19.6721
120-30    500     1667.51         0
120-30    750     1596.1          0
300-120   1000    1575.41         0.110742
300-120   2000    1515.86         2.20751
300-120   5000    1480.75         13.6612
300-120   500     1695.83         0
300-120   750     1615.43         0
30-10     1000    1561.7          0.332226
30-10     2000    1508.31         3.09051
30-10     5000    1477.18         19.6721
30-10     500     1670.58         0
30-10     750     1597.76         0
5-2       1000    1565.38         0.332226
5-2       2000    1510.41         3.09051
5-2       5000    1478.96         17.4863
5-2       500     1678.57         0
5-2       750     1602.49         0
5. DISCUSSION
5.1 Cost Analysis
As becomes clear from Tables 1, 3, 4 and 5, the costs of measuring flow-data are lower for larger flow timeouts than for smaller timeouts. The combination of timeouts 300s-120s results in an average of 16,784,725 records; 120s-30s results in an average of 18,641,215 records; 30s-10s in an average of 21,570,438 records; and 5s-2s in an average of 30,099,377 records. Between the largest timeouts of 300s-120s and the smallest of 5s-2s this is a difference of 79%, while the timeouts are 60 times shorter.
Figure 1. Flow series vs. packet series using T = 1000 ms. [Plot not reproduced: link usage in Mbps (y-axis, 1000 to 1600) against time in seconds (x-axis, 0 to 100) for the packet-data and the timeout combinations 120-30, 300-120, 30-10 and 5-2.]
The differences between the three longest timeout combinations are much smaller. The difference between the combinations 300s-120s and 30s-10s is only 29%. So using very small timeouts is very costly, while the differences between the other timeouts are not that large.
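The averages and percentages quoted in this section can be reproduced from the record counts in Tables 1, 3, 4 and 5. The short Python sketch below does so; the variable and function names are chosen for this illustration only.

```python
# Flow records per trace (traces 1 to 4), taken from Tables 1, 3, 4 and 5.
RECORDS = {
    "300-120": [16672439, 16740320, 16806093, 16920046],
    "120-30": [18495129, 18599522, 18675337, 18794870],
    "30-10": [21374046, 21525852, 21625565, 21756287],
    "5-2": [29724124, 30036030, 30234490, 30402864],
}


def mean_records(timeouts):
    """Average number of flow records over the four traces."""
    counts = RECORDS[timeouts]
    return sum(counts) / len(counts)


def relative_increase_pct(baseline, other):
    """Increase in the average record count of `other` relative to `baseline`, in percent."""
    base = mean_records(baseline)
    return 100.0 * (mean_records(other) - base) / base


print(round(relative_increase_pct("300-120", "5-2")))    # 79
print(round(relative_increase_pct("300-120", "30-10")))  # 29
```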
5.2 Accuracy of the Bandwidth Provisioning Formula
First of all, the results with an inaccuracy greater than 1% should be discarded. Tables 2, 6, 7 and 8 show that the inaccuracy is greater than 1% for all the results where T = 5 seconds (5000 ms) and for almost all of the results where T = 2 seconds. Apparently, the bandwidth provisioning formula does not generate accurate results for values of T > 1 second, regardless of the flow timeout settings.
The analysis therefore has to be done using the results where the error is smaller than 1%. Using these results, it becomes clear that the flow timeout combination 30s-10s generates the lowest capacity estimates while still remaining accurate. The results using timeouts of 120s-30s are very close and on average differ by only 8.1 Mbps. The results using timeouts of 5s-2s are on average 62.1 Mbps larger than those using 30s-10s, and the results using timeouts of 300s-120s are on average 73 Mbps larger than those using 30s-10s. A difference of 73 Mbps on a maximum bandwidth of 1588 Mbps, according to the packet-data, is a difference of only 4.6%.
5.3 Balance between costs and accuracy
According to the obtained results, the costs are lower for higher flow timeouts, but the accuracy of the bandwidth estimation formula is not the highest for the lowest flow timeouts. For the flow timeout combination 5s-2s the costs are the highest, but the results are worse than when flow timeouts of 30s-10s or even 120s-30s are chosen. This is the same for all of the values of T. Thus the results do not outweigh the costs for the flow timeouts of 5s-2s.
For the other timeouts the following conclusion applies: the smaller the timeouts, the larger the costs and the better the accuracy of the bandwidth dimensioning formula. The relative difference in accuracy is not significant: 4.6%, or in other words 73 Mbps. The costs, on the other hand, differ by 29%, which is much more. Thus the costs increase a lot more for smaller timeouts than the accuracy does.
6. CONCLUSION AND FUTURE WORK
When doing bandwidth provisioning using the proposed formula, it is clear that a trade-off has to be made between the costs of doing the provisioning and the accuracy that the bandwidth dimensioning formula gives. First of all, the results of this research show that it is not recommended to use a T higher than 1 second with the flow timeouts tested in this paper. The results for T = 2000 ms and T = 5000 ms are mostly inaccurate.
At the same time, very low flow timeouts, like 5s-2s, are much more expensive, while the provisioning is less accurate than with 30s-10s or 120s-30s. This combination of timeouts is therefore discouraged in practice.
The trade-off between the other tested combinations of flow timeout settings may be harder to make. A relatively high timeout seems the best choice, because the costs are reasonably lower, while the results of the bandwidth provisioning with lower timeouts are just a bit better. But when the bandwidth on a link is relatively costly, lower flow timeouts may pay off.
To improve the researched method of bandwidth provisioning, future work can investigate the impact of the variable T of the formula. This research did not intend to study the impact of this variable, but the accuracy of the formula differed greatly between values of T.
Furthermore, this research could be extended by investigating the impact of other flow timeout combinations. For example, larger values could be taken for the inactive timeout. Trying more flow timeout combinations makes more information available to improve bandwidth provisioning models using flow-data.
7. REFERENCES
[1] A. V. Aho, B. W. Kernighan, and P. J. Weinberger. Awk - a pattern scanning and processing language. Software Pract Exper, 9(4):267–279, 1979.
[2] H. v. d. Berg, M. Mandjes, R. v. d. Meent, A. Pras, F. Roijers, and P. Venemans. QoS-aware bandwidth provisioning for IP network links. Computer Networks, 50(5):631–647, 2006.
[3] K. Claffy, D. Andersen, and P. Hick. The CAIDA anonymized 2011 internet traces. http://www.caida.org/data/passive/passive_2011_dataset.xml, Accessed on March 20, 2013.
[4] B. Claise. Cisco Systems NetFlow services export version 9. RFC 3954, IETF, 2004.
[5] C. Fraleigh, F. Tobagi, and C. Diot. Provisioning IP backbone networks to support latency sensitive traffic. In INFOCOM 2003. Twenty-Second Annual Joint Conference of the IEEE Computer and Communications. IEEE Societies, volume 1, pages 375–385, 2003.
[6] C. M. Inacio and B. Trammell. YAF: yet another flowmeter. In Proceedings of the 24th international conference on Large installation system administration, LISA'10, pages 1–16. USENIX Association, 2010.
[7] A. Klemm, C. Lindemann, and M. Lohmann. Modeling IP traffic using the batch Markovian arrival process. Performance Evaluation, 54(2):149–173, 2003.
[8] R. v. d. Meent. Network link dimensioning: a measurement & modeling based approach. PhD thesis, Enschede, March 2006. http://doc.utwente.nl/56434/, Accessed on: June 10, 2013.
[9] R. v. d. Meent, A. Pras, M. Mandjes, H. v. d. Berg, and L. Nieuwenhuis. Traffic measurements for link dimensioning: A case study, volume 2867 of Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2003.
[10] A. Pras, L. Nieuwenhuis, R. v. d. Meent, and M. Mandjes. Dimensioning network links: A new look at equivalent bandwidth. IEEE Network, 23(2):5–10, 2009.
[11] J. Quittek, T. Zseby, B. Claise, and S. Zander. Requirements for IP flow information export (IPFIX). RFC 3917, IETF, 2004.
[12] R. d. O. Schmidt, R. Sadre, A. Sperotto, H. v. d. Berg, and A. Pras. Link dimensioning: A flow-based procedure. waiting for appliance, 2012.
[13] R. d. O. Schmidt, A. Sperotto, R. Sadre, and A. Pras. Towards bandwidth estimation using flow-level measurements. In Dependable Networks and Services, volume 7279 of Lecture Notes in Computer Science, pages 127–138. Springer Berlin / Heidelberg, 2012.
[14] J. Schönwälder. Simple network management protocol (SNMP) context EngineID discovery. RFC 5343, IETF, 2008.
APPENDIX
A. NUMBER OF RECORDS AND BANDWIDTH PROVISIONING RESULTS
Table 3. number of records for trace 2
timeouts number of records
300-120 16740320
120-30 18599522
30-10 21525852
5-2 30036030
Table 4. number of records for trace 3
timeouts number of records
300-120 16806093
120-30 18675337
30-10 21625565
5-2 30234490
Table 5. number of records for trace 4
timeouts number of records
300-120 16920046
120-30 18794870
30-10 21756287
5-2 30402864
Table 6. Bandwidth provisioning formula outcomes for trace 2
timeouts  T (ms)  outcome (Mbps)  percentage
120-30    1000    1591.42         0
120-30    2000    1539.59         2.20751
120-30    5000    1509.21         18.5792
120-30    500     1696.98         0
120-30    750     1626.47         0
300-120   1000    1610.9          0
300-120   2000    1549.82         0.220751
300-120   5000    1513.61         13.6612
300-120   500     1734.46         0
300-120   750     1652            0
30-10     1000    1580.9          0
30-10     2000    1533.88         3.97351
30-10     5000    1506.66         19.6721
30-10     500     1677.68         0
30-10     750     1612.98         0
5-2       1000    1582.55         0
5-2       2000    1534.64         3.97351
5-2       5000    1507.48         19.6721
5-2       500     1682.08         0
5-2       750     1615.08         0
Table 7. Bandwidth provisioning formula outcomes for trace 3
timeouts  T (ms)  outcome (Mbps)  percentage
120-30    1000    1698.72         0
120-30    2000    1630.91         4.85651
120-30    5000    1590.93         18.5792
120-30    500     1835.76         0
120-30    750     1744.2          0
300-120   1000    1722.86         0
300-120   2000    1643.34         3.09051
300-120   5000    1596.12         14.7541
300-120   500     1883.09         0
300-120   750     1776.11         0
30-10     1000    1683.46         0.221484
30-10     2000    1622.69         6.40177
30-10     5000    1587.39         20.765
30-10     500     1806.77         0
30-10     750     1724.35         0
5-2       1000    1682.83         0.221484
5-2       2000    1622.65         6.40177
5-2       5000    1587.96         20.765
5-2       500     1806.34         0
5-2       750     1723            0
Table 8. Bandwidth provisioning formula outcomes for trace 4
timeouts  T (ms)  outcome (Mbps)  percentage
120-30    1000    1735.76         0
120-30    2000    1677.45         2.20751
120-30    5000    1643.45         4.37158
120-30    500     1854.58         0
120-30    750     1775.17         0
300-120   1000    1761.49         0
300-120   2000    1690.86         1.3245
300-120   5000    1649.13         3.82514
300-120   500     1904.41         0
300-120   750     1808.97         0
30-10     1000    1736.26         0
30-10     2000    1677.15         2.20751
30-10     5000    1643.04         4.37158
30-10     500     1857.13         0
30-10     750     1776.17         0
5-2       1000    1805.51         0
5-2       2000    1710.09         0.441501
5-2       5000    1655.75         3.27869
5-2       500     1993.16         0
5-2       750     1866.5          0