Wavelet-Based Network Traffic Modeling

Carey Williamson
University of Calgary
Introduction



Wavelets offer a powerful and flexible technique for mathematically representing network traffic at multiple time scales
Compact and concise representation of a signal using wavelet coefficients
Efficient O(N) technique for synthesizing signals as well, for N data points
Wavelets: Background



Wavelet transformation involves integrating the product of a signal (continuous time or discrete) with each of a set of wavelet functions and scaling functions
Haar wavelet: scaling function $\phi(t)$, wavelet function $\psi(t)$
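For reference, the Haar scaling function and wavelet named above can be written out explicitly. This is the standard Haar definition, under the assumption that the time axis is normalized so that the whole series spans the unit interval [0, 1):

```latex
\phi(t) = \begin{cases} 1, & 0 \le t < 1 \\ 0, & \text{otherwise} \end{cases}
\qquad
\psi(t) = \begin{cases} 1, & 0 \le t < 1/2 \\ -1, & 1/2 \le t < 1 \\ 0, & \text{otherwise} \end{cases}
```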
Wavelets: Background


The top-level wavelet function is called the mother wavelet
The children are defined recursively using the relationship:
– $\phi_{j,k}(t) = 2^{j/2} \phi(2^j t - k)$
– $\psi_{j,k}(t) = 2^{j/2} \psi(2^j t - k)$
where j is the (vertical) scaling level, and k is the (horizontal) translation offset, in a binary tree representation of the signal
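The scaling and translation relationship can be sketched in a few lines of Python. This is a minimal illustration for the Haar case only; the names `haar_phi`, `haar_psi` and `child` are hypothetical helpers, and the unit-interval normalization is an assumption.

```python
def haar_phi(t: float) -> float:
    """Haar scaling function: 1 on [0, 1), 0 elsewhere (assumed normalization)."""
    return 1.0 if 0.0 <= t < 1.0 else 0.0


def haar_psi(t: float) -> float:
    """Haar mother wavelet: +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere."""
    if 0.0 <= t < 0.5:
        return 1.0
    if 0.5 <= t < 1.0:
        return -1.0
    return 0.0


def child(base, j: int, k: int):
    """Return f_{j,k}(t) = 2^(j/2) * f(2^j * t - k) for a base function f."""
    scale = 2.0 ** (j / 2.0)
    return lambda t: scale * base(2.0 ** j * t - k)


# Child wavelet at level j = 2, offset k = 1: supported on [1/4, 1/2)
psi_2_1 = child(haar_psi, 2, 1)
print(psi_2_1(0.25), psi_2_1(0.375), psi_2_1(0.5))
```

The child at level 2 is twice as tall as the mother wavelet and a quarter as wide, which is the "narrower and taller" behaviour of child wavelets.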
Wavelets: Background



Child wavelets are narrower and taller, and cover a specific subportion of the time series
Shifted versions of the wavelet function cover other portions of the time series
Entire time series can be expressed as a sum (or integral) of scaling coefficients $U_{j,k}$ and wavelet coefficients $W_{j,k}$ along with these functions
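For a discrete series, the expansion can be written as follows. This is a sketch under stated assumptions: the Haar basis, a series of N = 2^n points rescaled to the unit interval, and C(t) denoting (in notation introduced here) the piecewise-constant signal that takes the k-th observed value on the k-th subinterval.

```latex
C(t) \;=\; \sum_{k=0}^{2^n-1} U_{n,k}\,\phi_{n,k}(t)
     \;=\; U_{0,0}\,\phi_{0,0}(t) \;+\; \sum_{j=0}^{n-1} \sum_{k=0}^{2^j-1} W_{j,k}\,\psi_{j,k}(t)
```

The first form uses only the finest-level scaling coefficients; the second uses the single top-level scaling coefficient plus the wavelet coefficients at every level.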
Wavelets: Background


The coefficients keep track of information about the time series; in essence, the scaling coefficients track the sums, and the wavelet coefficients the differences, of the scaling coefficients at the next finer-grain time scale (times a scaling factor)
Finest-grain scaling coefficients are derived directly from the empirical time series, using $C(k) = 2^{n/2} U_{n,k}$
Wavelets: Background

Coarser-grained values are computed recursively upwards using:
– $U_{j-1,k} = 2^{-1/2} (U_{j,2k} + U_{j,2k+1})$
– $W_{j-1,k} = 2^{-1/2} (U_{j,2k} - U_{j,2k+1})$


Topmost scaling coefficient represents mean of empirical time series
Wavelet coefficients capture the behavioural properties of the time series
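The upward recursion can be sketched in Python as below. This is an illustrative Haar analysis under the normalization used in these slides; the function name `haar_analysis` and the list-of-lists layout are assumptions, not part of the original material.

```python
import math


def haar_analysis(series):
    """Bottom-up Haar analysis of a series whose length N is a power of two.

    Returns (U, W): U[j][k] holds the scaling coefficients for levels
    j = 0..n and W[j][k] the wavelet coefficients for levels j = 0..n-1,
    where n = log2(N).
    """
    N = len(series)
    if N < 1 or N & (N - 1):
        raise ValueError("series length must be a power of two")
    n = N.bit_length() - 1
    U = [None] * (n + 1)
    W = [None] * n
    # Finest level: U_{n,k} = 2^(-n/2) * C(k)
    U[n] = [2.0 ** (-n / 2.0) * c for c in series]
    r = 1.0 / math.sqrt(2.0)
    for j in range(n, 0, -1):
        half = len(U[j]) // 2
        # Sums give the coarser scaling coefficients, differences the wavelet ones
        U[j - 1] = [r * (U[j][2 * k] + U[j][2 * k + 1]) for k in range(half)]
        W[j - 1] = [r * (U[j][2 * k] - U[j][2 * k + 1]) for k in range(half)]
    return U, W


U, W = haar_analysis([3, 1])
print(U[0], W[0])
```

For the two-point series 3, 1 the root scaling coefficient equals the mean (2) and the single wavelet coefficient is 1, up to floating-point rounding. The total work is N + N/2 + N/4 + ..., which is O(N).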
Wavelets: Background


Empirical time series can be exactly reconstructed using only these values (i.e., the scaling and wavelet coefficients)
Furthermore, these coefficients become decorrelated in the wavelet domain (i.e., can model arbitrary signals)
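Exact reconstruction follows from adding and subtracting the two upward recursions for $U_{j-1,k}$ and $W_{j-1,k}$. As a sketch for the Haar case used in these slides, the downward step is:

```latex
U_{j,2k}   = 2^{-1/2}\,(U_{j-1,k} + W_{j-1,k}), \qquad
U_{j,2k+1} = 2^{-1/2}\,(U_{j-1,k} - W_{j-1,k})
```

Applying this step from the root down to level n, and then using $C(k) = 2^{n/2} U_{n,k}$, returns the original series.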
Wavelets: An Example

Suppose the initial empirical time series of interest has N = 8 observations in it, namely:
– 17 7 12 6 10 15 8 13

(mean = 11.0)
Can construct binary tree representation of the signal and its corresponding scaling and wavelet coefficients
Wavelets: An Example
Binary tree with levels J=0 (root) through J=3 (leaves); the leaves hold the observations, indexed K=0 to K=7:
K:     0   1   2   3   4   5   6   7
C(K):  17  7   12  6   10  15  8   13
Wavelets: An Example
Compute scaling coefficients at bottom level: $U_{n,k} = 2^{-n/2} C(k)$
J=3: 17/2^{3/2}  7/2^{3/2}  12/2^{3/2}  6/2^{3/2}  10/2^{3/2}  15/2^{3/2}  8/2^{3/2}  13/2^{3/2}
Wavelets: An Example
Compute scaling coefficients at next level up: $U_{j-1,k} = 2^{-1/2} (U_{j,2k} + U_{j,2k+1})$
J=2: 6  9/2  25/4  21/4
J=3: 17/2^{3/2}  7/2^{3/2}  12/2^{3/2}  6/2^{3/2}  10/2^{3/2}  15/2^{3/2}  8/2^{3/2}  13/2^{3/2}
Wavelets: An Example
Compute scaling coefficients at next level up
J=1: 21/2^{3/2}  23/2^{3/2}
J=2: 6  9/2  25/4  21/4
J=3: 17/2^{3/2}  7/2^{3/2}  12/2^{3/2}  6/2^{3/2}  10/2^{3/2}  15/2^{3/2}  8/2^{3/2}  13/2^{3/2}
Wavelets: An Example
Compute scaling coefficient at top level
J=0: 11
J=1: 21/2^{3/2}  23/2^{3/2}
J=2: 6  9/2  25/4  21/4
J=3: 17/2^{3/2}  7/2^{3/2}  12/2^{3/2}  6/2^{3/2}  10/2^{3/2}  15/2^{3/2}  8/2^{3/2}  13/2^{3/2}
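As a check on the scaling coefficients in this example, here is one entry per level worked out by hand from $U_{j-1,k} = 2^{-1/2} (U_{j,2k} + U_{j,2k+1})$ with n = 3:

```latex
U_{2,0} = 2^{-1/2}\,\frac{17 + 7}{2^{3/2}} = \frac{24}{4} = 6, \qquad
U_{1,0} = 2^{-1/2}\left(6 + \tfrac{9}{2}\right) = \frac{21}{2^{3/2}}, \qquad
U_{0,0} = 2^{-1/2}\,\frac{21 + 23}{2^{3/2}} = \frac{44}{4} = 11
```

The top value, 11, matches the mean of the eight observations.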
Wavelets: An Example
Now compute wavelet coefficients, bottom up: $W_{j-1,k} = 2^{-1/2} (U_{j,2k} - U_{j,2k+1})$
J=0: U = 11
J=1: U = 21/2^{3/2}  23/2^{3/2}
J=2: U = 6  9/2  25/4  21/4;  W = 5/2  3/2  -5/4  -5/4
J=3: U = 17/2^{3/2}  7/2^{3/2}  12/2^{3/2}  6/2^{3/2}  10/2^{3/2}  15/2^{3/2}  8/2^{3/2}  13/2^{3/2}
Wavelets: An Example
Now compute wavelet coefficients, bottom up
J=0: U = 11
J=1: U = 21/2^{3/2}  23/2^{3/2};  W = 3/2^{3/2}  1/2^{1/2}
J=2: U = 6  9/2  25/4  21/4;  W = 5/2  3/2  -5/4  -5/4
J=3: U = 17/2^{3/2}  7/2^{3/2}  12/2^{3/2}  6/2^{3/2}  10/2^{3/2}  15/2^{3/2}  8/2^{3/2}  13/2^{3/2}
Wavelets: An Example
Now compute wavelet coefficient at top level
J=0: U = 11;  W = -1/2
J=1: U = 21/2^{3/2}  23/2^{3/2};  W = 3/2^{3/2}  1/2^{1/2}
J=2: U = 6  9/2  25/4  21/4;  W = 5/2  3/2  -5/4  -5/4
J=3: U = 17/2^{3/2}  7/2^{3/2}  12/2^{3/2}  6/2^{3/2}  10/2^{3/2}  15/2^{3/2}  8/2^{3/2}  13/2^{3/2}
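As a check on the wavelet coefficients in this example, here is a selection worked out by hand from $W_{j-1,k} = 2^{-1/2} (U_{j,2k} - U_{j,2k+1})$:

```latex
W_{2,0} = 2^{-1/2}\,\frac{17 - 7}{2^{3/2}} = \frac{10}{4} = \frac{5}{2}, \qquad
W_{1,0} = 2^{-1/2}\left(6 - \tfrac{9}{2}\right) = \frac{3}{2^{3/2}}, \qquad
W_{1,1} = 2^{-1/2}\left(\tfrac{25}{4} - \tfrac{21}{4}\right) = \frac{1}{2^{1/2}}, \qquad
W_{0,0} = 2^{-1/2}\,\frac{21 - 23}{2^{3/2}} = -\frac{2}{4} = -\frac{1}{2}
```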
Wavelets: An Example
Can reconstruct signal top-down using only the indicated information (mean and wavelet coefficients)
J=0: mean U = 11;  W = -1/2
J=1: W = 3/2^{3/2}  1/2^{1/2}
J=2: W = 5/2  3/2  -5/4  -5/4
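The top-down reconstruction can be sketched in Python as follows. This is a minimal Haar synthesis under the slide normalization; the function name `haar_synthesis` and the per-level list layout of `W` are illustrative assumptions.

```python
import math


def haar_synthesis(u00, W):
    """Top-down Haar synthesis.

    u00 is the root scaling coefficient (the mean); W[j] lists the 2**j
    wavelet coefficients at level j, for j = 0..n-1.
    Returns the N = 2**n series values C(k).
    """
    r = 1.0 / math.sqrt(2.0)
    U = [u00]
    for level in W:
        nxt = []
        for u, w in zip(U, level):
            nxt.append(r * (u + w))  # left child:  U_{j+1,2k}
            nxt.append(r * (u - w))  # right child: U_{j+1,2k+1}
        U = nxt
    n = len(W)
    return [2.0 ** (n / 2.0) * u for u in U]  # C(k) = 2^(n/2) * U_{n,k}


# Mean and wavelet coefficients from the worked example
s = 2.0 ** 1.5
W = [[-0.5], [3 / s, 1 / math.sqrt(2.0)], [2.5, 1.5, -1.25, -1.25]]
series = haar_synthesis(11.0, W)
print([round(x, 6) for x in series])
```

With the coefficients from the example, this recovers the series 17 7 12 6 10 15 8 13 up to floating-point rounding.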
Wavelet-Based Traffic Models


To reconstruct the time series exactly, you need to use exactly those wavelet coefficients, and the starting mean (i.e., one-to-one mapping between time series values and coefficients in the wavelet domain)
To generate something that looks like the original time series, it suffices to use $W_{j,k}$ values from a similar distribution
WIG Model

The wavelet independent Gaussian (WIG) model chooses the $W_{j,k}$’s at random from a Gaussian distribution, with a specified mean and variance at each level j of the tree (variance of the $W_{j,k}$’s at a particular level of the tree typically increases as you go down the binary tree of wavelet coefficients)
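A WIG-style generator can be sketched as below. This is an illustration only: the function name `wig_synthesize`, the per-level parameter lists, and the numeric values in the usage line are hypothetical, not fitted to any trace.

```python
import math
import random


def wig_synthesize(u00, level_means, level_stds, seed=None):
    """Sketch of WIG synthesis: draw each W_{j,k} independently from a
    Gaussian whose mean and standard deviation depend only on level j,
    then run Haar synthesis from the root down."""
    rng = random.Random(seed)
    r = 1.0 / math.sqrt(2.0)
    U = [u00]
    for mu, sigma in zip(level_means, level_stds):
        nxt = []
        for u in U:
            w = rng.gauss(mu, sigma)  # W_{j,k} ~ N(mu_j, sigma_j^2)
            nxt.extend((r * (u + w), r * (u - w)))
        U = nxt
    n = len(level_stds)
    return [2.0 ** (n / 2.0) * u for u in U]


# Hypothetical parameters: zero mean, standard deviation growing down the tree
trace = wig_synthesize(11.0, [0.0, 0.0, 0.0], [0.5, 1.0, 2.0], seed=1)
print(len(trace), sum(trace) / len(trace))
```

Whatever values are drawn, the sample mean of the generated trace equals the root coefficient `u00`, because each pair of children sums to a fixed multiple of its parent. Nothing in this sketch prevents individual values from going negative.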
Wavelet-Based Traffic Modeling



In network traffic time series, the observed values are all non-negative
In wavelet terms, this constraint means the $W_{j,k}$ are no larger in absolute value than the $U_{j,k}$ (which themselves are always non-negative)
The WIG model does not guarantee this, and can thus generate negative values in the synthetic time series
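The constraint can be read directly off the Haar synthesis step, in which both children of a node must be non-negative. As a sketch, using the same normalization as the analysis recursion:

```latex
U_{j+1,2k} = 2^{-1/2}\,(U_{j,k} + W_{j,k}) \ge 0
\quad\text{and}\quad
U_{j+1,2k+1} = 2^{-1/2}\,(U_{j,k} - W_{j,k}) \ge 0
\;\iff\; |W_{j,k}| \le U_{j,k}
```

A Gaussian draw for $W_{j,k}$ has unbounded support, so it can violate this inequality at any node.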
Multifractal Wavelet Model


The Multifractal Wavelet Model (MWM) proposed by Ribeiro et al. does explicitly consider this constraint, and thus guarantees non-negative values for all observations in the generated series
Can express $W_{j,k} = A_{j,k} U_{j,k}$, where $-1 \le A_{j,k} \le 1$
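The multiplier construction can be sketched in Python as below. This is an illustrative MWM-style generator, not the authors' implementation: the function name `mwm_synthesize`, the choice of a symmetric beta(p, p) rescaled to [-1, 1], and the shape parameters in the usage line are all assumptions.

```python
import math
import random


def mwm_synthesize(u00, betas, seed=None):
    """Sketch of MWM-style synthesis.

    betas[j] is the shape parameter p_j of a symmetric beta(p_j, p_j)
    distribution used at level j. Each multiplier A_{j,k} = 2*B - 1 with
    B ~ beta(p_j, p_j) lies in [-1, 1], and W_{j,k} = A_{j,k} * U_{j,k}.
    """
    rng = random.Random(seed)
    r = 1.0 / math.sqrt(2.0)
    U = [u00]
    for p in betas:
        nxt = []
        for u in U:
            a = 2.0 * rng.betavariate(p, p) - 1.0  # A_{j,k} in [-1, 1]
            w = a * u                               # so |W_{j,k}| <= U_{j,k}
            nxt.extend((r * (u + w), r * (u - w)))
        U = nxt
    n = len(betas)
    return [2.0 ** (n / 2.0) * u for u in U]


# Hypothetical shape parameters; smaller p gives larger multiplier variance
trace = mwm_synthesize(11.0, [5.0, 3.0, 2.0], seed=7)
print(min(trace) >= 0.0, sum(trace) / len(trace))
```

Because every multiplier is bounded by 1 in absolute value, each child coefficient is a non-negative fraction of its parent, so the generated values cannot go negative as long as `u00` is non-negative.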
Other Observations

For typical network traffic time series:
– The mean of the $A_{j,k}$’s is zero at each level j of the binary tree of wavelet coefficients
– The variance of the $A_{j,k}$’s increases as you progress down the levels of the binary tree
– The $A_{j,k}$’s are uncorrelated (whether the original time series was correlated or not)
– Symmetric beta distribution works well for modeling the distribution of $A_{j,k}$’s
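The per-level statistics listed above could be estimated from a trace along these lines. This is a sketch under stated assumptions: Haar analysis, a series length that is a power of two, strictly positive scaling coefficients, and a hypothetical helper name `multipliers_by_level`.

```python
import math
from statistics import mean, pvariance


def multipliers_by_level(series):
    """Return A, where A[j] = [W_{j,k} / U_{j,k}] for levels j = 0..n-1."""
    N = len(series)
    if N < 2 or N & (N - 1):
        raise ValueError("series length must be a power of two, at least 2")
    n = N.bit_length() - 1
    U = [2.0 ** (-n / 2.0) * c for c in series]  # finest scaling coefficients
    r = 1.0 / math.sqrt(2.0)
    A = []
    while len(U) > 1:
        half = len(U) // 2
        coarse_U = [r * (U[2 * k] + U[2 * k + 1]) for k in range(half)]
        coarse_W = [r * (U[2 * k] - U[2 * k + 1]) for k in range(half)]
        # Assumes every U_{j,k} > 0 (no all-zero block in the series)
        A.append([w / u for w, u in zip(coarse_W, coarse_U)])
        U = coarse_U
    A.reverse()  # A[0] is the root level, A[n-1] the level above the leaves
    return A


A = multipliers_by_level([17, 7, 12, 6, 10, 15, 8, 13])
for j, level in enumerate(A):
    print(j, mean(level), pvariance(level))
```

The eight-point series is used only to show the mechanics; a trace that short says nothing statistically meaningful about the mean or variance at each level.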
Wavelet-Based Traffic Modeling



By generating random $A_{j,k}$ values from a specified distribution (e.g., symmetric beta distribution), one can generate synthetic time series with desired variance (and fractal-like structure) across many time scales
Non-Gaussian marginals are no problem
See example plots for LBL-TCP and Bellcore Ethernet LAN traces
Summary


Wavelets offer a flexible and powerful traffic modeling technique that is able to capture short-range and long-range traffic characteristics, including correlations in the time domain
Very efficient O(N) computational procedure for generating a trace of N data points