Time Series from their Observed
Sums: Network Tomography
Edoardo M. Airoldi
School of Computer Science
Carnegie Mellon University
(joint work with Christos Faloutsos)
SIGKDD, Seattle, WA
August 23rd 2004
Acknowledgements
Srinivasan Seshan, CSD, CMU
Russel Yount and Frank Kietzke,
Network Development, CMU
Stephen Fienberg, Statistics, CMU
Jin Cao, Bell Labs
Claudia Tebaldi, NCAR
Yin Zhang, AT&T Labs
Outline
Introduction / Motivation
Survey
Proposed Methods
Results
Conclusions
Application Domains
Communication Networks
goal: Who is sending to whom
refs: Cao et al (2001), Liang & Yu (2003), Zhang et al (2004)
Transportation Networks
goal: Who is going where
Network Probing (Rish et al, IBM)
goal: Which server is down
refs: Rish et al (2002, 2004)
Communication Networks
A large ISP network has 100s of nodes, 1000s of links,
10000s of routes, and over 1 petabyte (10^15 bytes) per day
[Figure: network with nodes A, B, C, showing OD flows and link loads]
Reliability analysis
Predict link loads under unexpected/planned router/link failures
Traffic engineering
Optimize routes to minimize congestion
Capacity planning
Forecast future capacity requirements
Mathematical Formulation
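[Figure: OD flows X1, X2, X3, X4 (X) and link flows Y at a LINK]
One Constraint: Total Σi Yi = 0
Situation at time = t
Link Flows = Routing Matrix A · OD Flows

$$
\begin{pmatrix} Y_1 \\ Y_2 \\ Y_3 \end{pmatrix}
=
\begin{pmatrix} 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 \end{pmatrix}
\begin{pmatrix} X_1 \\ X_2 \\ X_3 \\ X_4 \end{pmatrix}
$$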
Problem Definition
Given: topology, fixed routing scheme A[n×m], traffic
on the links of the network Y(t)=[Y1(t), …, Yn(t)]
over time t = 1, …, T
Find: non-observable traffic between origin-destination (OD) pairs X(t)=[X1(t), …, Xm(t)] over
time t = 1, …, T.
Y(t) = A·X(t)
Under-constrained
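A small numerical sketch of why the problem is under-constrained, using the 3×4 routing matrix from the Mathematical Formulation slide. The OD flow values are made up for illustration only.

```python
import numpy as np

# Routing matrix from the slides: 3 links (rows) by 4 OD flows (columns).
A = np.array([[1, 1, 0, 0],
              [0, 0, 1, 1],
              [1, 0, 1, 0]])

# Hypothetical OD flows at one time point (illustrative values only).
x = np.array([5.0, 2.0, 3.0, 7.0])

y = A @ x                         # link loads Y(t) = A·X(t)
rank = np.linalg.matrix_rank(A)   # number of independent equations

print("link loads:", y)
print("unknowns:", A.shape[1], "independent equations:", rank)
```

With 4 unknowns and only 3 independent equations, the link loads alone cannot pin down the OD flows.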
A Glance at the Data
Find OD Flows X(t): X1(t), …, X4(t) at times t1, …, t4 (unknown: ?)
Measure Link Flows Y(t): Y1(t), Y2(t), Y3(t) at times t1, …, t4
[Figure: traffic (Kb) vs. hour of the day]
Our Problem: No Traffic Matrix
Traffic matrix
Gives traffic volumes between origin and destination
Very difficult to directly measure
Direct measurement [Feldmann et al. 2000]
Semi-standard router feature: NetFlow
Collect flow-level data around the whole edge of the network
Combine with routing data
Cisco, Juniper, etc.
Not always well supported
Potential performance impact on routers
Huge amount of data (500GB/day)
Widely available SNMP data gives only link loads
Even this data is not perfect (glitches, loss, …)
Infinite Exact Solutions
Measurements (Yt) and routing scheme A[3x4]
allow for many feasible OD flows (Xt)
For example:
[Figure: two panels, each mapping OD flows to Links; values shown: 29, 139, 1, 167, 37, 4, 9, 32]
The problem is under-constrained and
we need some assumptions
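A hedged sketch of the set of exact solutions, using the 3×4 routing matrix from the Mathematical Formulation slide. The grouping of the figure's numbers into the two OD vectors below is an inference (it is the only grouping that gives identical link loads under that matrix), not something the slide states.

```python
import numpy as np

A = np.array([[1, 1, 0, 0],
              [0, 0, 1, 1],
              [1, 0, 1, 0]], dtype=float)

# Assumed pairing of the figure's values into two OD vectors.
x_a = np.array([29.0, 139.0, 9.0, 32.0])
x_b = np.array([1.0, 167.0, 37.0, 4.0])

y_a, y_b = A @ x_a, A @ x_b      # both produce the same link loads

# Null-space direction of A: moving along it leaves A·x unchanged.
v = np.array([1.0, -1.0, -1.0, 1.0])

# Every x_a + c·v with non-negative entries is also an exact solution.
c_min = max(-x_a[0], -x_a[3])    # keeps entries 0 and 3 non-negative
c_max = min(x_a[1], x_a[2])      # keeps entries 1 and 2 non-negative

print("link loads:", y_a, y_b)
print("feasible range for c:", c_min, c_max)
```

Here x_b equals x_a + c·v with c = -28, so both vectors sit on the same one-dimensional segment of feasible OD flows.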
Related Work
Solutions in the past
y = Ax
Direct solution: SVD
Scoring criterion: GLS, maximum likelihood,
entropy, Bayesian methods, …
Regularization: assume independent OD flows
Estimate OD flows xt using a window of measurements around time t, { yt-w, …, yt+w }
[Figure: estimated OD flows, Kb vs. hour of the day (two panels)]
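To illustrate the "direct solution" item, here is a minimal sketch of the SVD route using NumPy's pseudo-inverse. It assumes the 3×4 routing matrix from the Mathematical Formulation slide and made-up OD flows; it is not a reproduction of any of the cited methods.

```python
import numpy as np

A = np.array([[1, 1, 0, 0],
              [0, 0, 1, 1],
              [1, 0, 1, 0]], dtype=float)

x_true = np.array([5.0, 2.0, 3.0, 7.0])   # hypothetical "true" OD flows
y = A @ x_true                            # what we would observe on the links

# The pseudo-inverse (computed via SVD) returns the minimum-norm solution.
x_svd = np.linalg.pinv(A) @ y

print("true OD flows:    ", x_true)
print("minimum-norm est.:", x_svd)
print("link loads match: ", np.allclose(A @ x_svd, y))
```

The estimate reproduces the link loads exactly yet differs from the flows that generated them, which is why extra modelling assumptions are needed.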
Pitfalls of Past Approaches
Unrealistic Models:
Gaussian or Poisson OD traffic flows. But we
observe bursty, log-Normal traffic flows.
Time Dependence across Epochs:
Never explicitly addressed; models typically
assume xt independent over time. But we
observe time dependence within single OD flows.
Empirical Laws: log-Normality
Aggregate OD flows look log-log Normal
[Figure: histograms of counts vs. Log Bytes and counts vs. Log-Log Bytes]
[ 12321 OD time series. CMU validation data. ]
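One simple way to check this kind of empirical law is to compare the skewness of the data before and after a log transform. The sketch below uses synthetic log-Normal byte counts as a stand-in for real traces (the CMU data is not reproduced here), and the parameters are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for OD byte counts (assumed log-Normal, arbitrary parameters).
byte_counts = rng.lognormal(mean=10.0, sigma=1.5, size=20000)

def skewness(z):
    """Sample skewness: third central moment over variance^(3/2)."""
    z = np.asarray(z, dtype=float)
    d = z - z.mean()
    return (d ** 3).mean() / (d ** 2).mean() ** 1.5

raw_skew = skewness(byte_counts)          # heavy right tail on the raw scale
log_skew = skewness(np.log(byte_counts))  # close to 0 if the logs are Normal

print("skewness of bytes:    ", raw_skew)
print("skewness of log bytes:", log_skew)
```

On real traces the same comparison could be repeated after a second log transform to probe the log-log Normal shape mentioned on the slide.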
Outline
Introduction / Motivation
Survey
Proposed Method
1st Stage - Linear Dynamical Systems
2nd Stage - Bayesian Dynamical Systems
Results
Conclusions
The Model
A smooth average process { λt : t > 0 }
A possibly bursty process { xt : t > 0 } to model
the OD traffic flows
Parameter Estimation
Estimate parameters underlying the average
process { λt : t > 0 }
Calibrate priors for the parameters driving the
dynamic of the OD flows process { xt : t > 0 }
Estimate the OD flows using a Particle Filter
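The last step can be sketched as a generic bootstrap particle filter. This is an illustration under stated assumptions, not the talk's actual filter: the multiplicative log-Normal random-walk dynamics, the Gaussian tolerance `obs_sd` on the link loads, the initial particle spread, and all parameter values are hypothetical choices.

```python
import numpy as np

A = np.array([[1, 1, 0, 0],
              [0, 0, 1, 1],
              [1, 0, 1, 0]], dtype=float)

def particle_filter(A, Y, n_particles=5000, step_sd=0.15, obs_sd=5.0, seed=0):
    """Bootstrap particle filter for OD flows x_t given link loads y_t."""
    rng = np.random.default_rng(seed)
    T, m = Y.shape[0], A.shape[1]
    # Initial particles: positive guesses spread around half the mean link load.
    particles = rng.lognormal(np.log(Y[0].mean() / 2), 1.0, size=(n_particles, m))
    estimates = np.empty((T, m))
    for t in range(T):
        # Propagate: multiplicative log-Normal jitter keeps every flow positive.
        particles = particles * rng.lognormal(0.0, step_sd, size=particles.shape)
        # Weight: how well does each particle reproduce the observed link loads?
        resid = Y[t] - particles @ A.T
        logw = -0.5 * np.sum((resid / obs_sd) ** 2, axis=1)
        w = np.exp(logw - logw.max())     # subtract the max for numerical stability
        w /= w.sum()
        estimates[t] = w @ particles      # weighted posterior-mean estimate
        # Resample particles in proportion to their weights.
        idx = rng.choice(n_particles, size=n_particles, p=w)
        particles = particles[idx]
    return estimates

# Synthetic example: slowly drifting, made-up OD flows and their link loads.
rng = np.random.default_rng(1)
T = 50
x_true = np.empty((T, 4))
x_true[0] = [30.0, 140.0, 10.0, 30.0]
for t in range(1, T):
    x_true[t] = x_true[t - 1] * rng.lognormal(0.0, 0.05, size=4)
Y = x_true @ A.T

est = particle_filter(A, Y)
print("last estimate:", est[-1])
print("last truth:   ", x_true[-1])
```

Without informative priors this filter only guarantees that the estimates reproduce the link loads; the individual OD flows can still drift along the unidentified direction, which is the gap the calibrated priors are meant to close.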
Introducing Time Dependence
We introduce explicit time dependence:
λ(t) = F[n×n] · λ(t-1) + e(t)
The distinct OD flows, components of λ(t),
are assumed to be independent
Use EM algorithm
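A minimal sketch of the filtering pass that an EM fit of such a linear dynamical system would rely on (a Kalman filter). The EM parameter updates themselves are not shown. The observation noise `R`, the state noise `Q`, the identity transition `F`, the starting values, and the use of one state component per OD flow are all assumptions made for this example.

```python
import numpy as np

A = np.array([[1, 1, 0, 0],
              [0, 0, 1, 1],
              [1, 0, 1, 0]], dtype=float)

def kalman_filter(Y, A, F, Q, R, mean0, cov0):
    """Filtered means of the state, given link loads Y[0..T-1]."""
    T, m = Y.shape[0], F.shape[0]
    means = np.empty((T, m))
    mean, cov = mean0, cov0
    for t in range(T):
        # Predict: push the previous state through the dynamics.
        mean = F @ mean
        cov = F @ cov @ F.T + Q
        # Update: correct the prediction with the observed link loads.
        S = A @ cov @ A.T + R
        K = cov @ A.T @ np.linalg.inv(S)
        mean = mean + K @ (Y[t] - A @ mean)
        cov = cov - K @ A @ cov
        means[t] = mean
    return means

# Simulate a small system with independent components (diagonal F and Q).
rng = np.random.default_rng(0)
m, T = 4, 100
F = np.eye(m)
Q = 4.0 * np.eye(m)
R = 1.0 * np.eye(3)
state = np.array([30.0, 140.0, 10.0, 30.0])   # hypothetical starting averages
states = np.empty((T, m))
for t in range(T):
    state = F @ state + rng.normal(0.0, 2.0, size=m)
    states[t] = state
Y = states @ A.T + rng.normal(0.0, 1.0, size=(T, 3))

means = kalman_filter(Y, A, F, Q, R, mean0=np.full(m, 50.0), cov0=100.0 * np.eye(m))
print("mean absolute link-load residual:", np.abs(means @ A.T - Y).mean())
```

The filtered means track the link loads closely, but the component along the null space of A is driven only by the prior, which is the motivation for the second, Bayesian stage.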
Introducing Time Dependence
Our Linear Dynamical System contains the
models by Cao et al. as a special case
Bayesian Dynamical System
Gamma and log-Normal OD flows (Xt)
Use preliminary estimates of { λt : t > 0 },
the average OD flows, to softly constrain
the dynamical behavior of the OD flows to
identify the correct solution for Xt
Non-Deterministic Dynamics
Introduce explicit non-deterministic
dynamics (F) on the average OD flows:
λ’(t+1) = F’[n×n] · λ’(t)
Diagonal matrix F’[n×n] : F’[i,i] ~ log-Normal
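A short simulation of these dynamics, as a sketch: a diagonal matrix with log-Normal entries multiplies the average flows at each step, which is the same as an additive Normal random walk on the log scale. The starting values, the spread `sigma`, and the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T = 4, 200
sigma = 0.05                                   # assumed spread of log F'[i,i]

lam_prime = np.empty((T, n))
lam_prime[0] = [30.0, 140.0, 10.0, 30.0]       # hypothetical starting averages

for t in range(T - 1):
    f_diag = rng.lognormal(mean=0.0, sigma=sigma, size=n)   # F'[i,i] ~ log-Normal
    lam_prime[t + 1] = f_diag * lam_prime[t]   # same as np.diag(f_diag) @ lam_prime[t]

# On the log scale the multiplicative update becomes additive.
log_steps = np.diff(np.log(lam_prime), axis=0)

print("all averages positive:", bool(np.all(lam_prime > 0)))
print("mean / std of log steps:", log_steps.mean(), log_steps.std())
```

Positivity of the average flows comes for free, since every multiplier is positive.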
Learning Latent Dynamics
We want a preliminary estimate for Ft in:
λt+1 = Ft+1 · λt
[Diagram: given P(λ246|Y246) and P(λ247|Y247), solve for F247]
Outline
Introduction / Motivation
Survey
Proposed Methods
Results
Datasets
Importance of Time Dependence
Importance of non-Gaussianity
Informative Priors for non-Gaussian BDS
Conclusions
Validation Data sets
Consider star network topologies
[ 4 OD flows, 9 OD flows and 16 OD flows ]
Carnegie Mellon [ 12321 time series ]
Lucent Technologies [ 32 time series ]
[Figure: OD flows X1, X2, X3, X4 (X) and link flows Y at a LINK; situation at time = t]
Log-Normal OD Traffic Flows
The validation OD traffic flows are
skewed on both data sets
Reduce Variability
Narrower range of possible values for the OD
traffic flows: those which receive positive
posterior probability
Robust Estimates
Capture sharp changes in the distribution of
the OD traffic flows
Capture Several Bursts
[Figure: traffic (Kb) over time]
Priors and Bayesian inference
Informative priors on { λt : t > 0 } lead to
uni-modal posteriors
[Figure: posteriors with the true values marked]
Speed and Scalability
The computing time is about 3 minutes
[ 4 OD - 3 Links using R on Mac G4 667 ]
Linear in (#OD) for each time point
1 day's worth of data in 45 minutes
Model Comparison
Numerical Comparison (ℓ2 error)
Past Approaches
Unreasonable Models:
Gaussian or Poisson arrivals
Time Dependence:
never explicitly addressed
Conclusions
Log-Normal models account for skewed
and bursty, non-observable OD flows
The novel BDS captures the time dependence of the
data, thus reducing the variability of the
estimates
Informative priors serve as soft constraints
to overcome the under-determinacy of the
problem
Future Work
More tests on bigger networks
from 2-star (4-D) to 4-star (16-D)
Fit non-parametric seasonal components
for the non-observable OD flows
BACK-UP
Network Engineering
State-of-the-Art: guess and tweak
Guess based on experience & intuition
Manually tweak things, and hope for the best
Disadvantages
Manual process: time consuming, error prone
Not very reliable: intuition may be wrong,
unexpected side effects
Suboptimal performance: wastes resource/time
Need to repeat the exercise when traffic pattern
changes
A More Scientific Approach?
Feldmann et al. 2000
Shaikh et al. 2002
Tomography
Fortz et al. 2002
A: "Well, we don't know the topology, we don't know the
traffic matrix, the routers don't automatically adapt the
routes to the traffic, and we don't know how to optimize
the routing configuration. But, other than that, we're all
set!" [Rexford2000, Kurose2003]
Contributions
Realistic Models: Gamma and log-Normal
P( OD Flows(t) | λ(t) )
Explicit Time Dependence:
E( OD Flows(t) | y(t) … y(1) )
Contributions
Informative priors in a Bayesian Dynamical
System for an under-constrained problem
Drive our inferences to the correct solution
Get high quality particles
Easy solution for Sparse Traffic
Exploring the OD space
Gibbs sampler with Metropolis steps is able to
explore P(Xt| Yt)
We prove irreducibility of the chains [ Gamma, log-Normal ]
[Figure: regions of the OD space labeled P(Xt|Yt) > 0 and P(Xt|Yt) = 0]
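To make the idea of exploring P(Xt|Yt) concrete, here is a plain random-walk Metropolis sketch on the 3×4 example. It is a simplified stand-in, not the Gibbs sampler with Metropolis steps from the talk: the link loads, the starting solution, the independent log-Normal prior, and the proposal width are all assumptions.

```python
import numpy as np

A = np.array([[1, 1, 0, 0],
              [0, 0, 1, 1],
              [1, 0, 1, 0]], dtype=float)
y = np.array([168.0, 41.0, 38.0])          # hypothetical link loads
x0 = np.array([29.0, 139.0, 9.0, 32.0])    # one exact solution of A·x = y
v = np.array([1.0, -1.0, -1.0, 1.0])       # null-space direction of A

def log_prior(x, mu=np.log(30.0), sigma=1.0):
    """Independent log-Normal prior on each flow; -inf outside the positive orthant."""
    if np.any(x <= 0):
        return -np.inf
    z = (np.log(x) - mu) / sigma
    return float(np.sum(-0.5 * z ** 2 - np.log(x)))

rng = np.random.default_rng(0)
c = 0.0                                    # position along the feasible segment
lp = log_prior(x0 + c * v)
samples = []
for _ in range(5000):
    c_new = c + rng.normal(0.0, 5.0)       # random-walk proposal
    lp_new = log_prior(x0 + c_new * v)
    # Metropolis accept/reject; infeasible proposals have zero posterior mass.
    if np.log(rng.uniform()) < lp_new - lp:
        c, lp = c_new, lp_new
    samples.append(x0 + c * v)
samples = np.array(samples)

print("posterior mean of OD flows:", samples.mean(axis=0))
```

Every sample satisfies A·x = y exactly, so the chain only ever moves inside the region where the posterior is positive.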
Non-Deterministic Dynamics
Introduce explicit non-deterministic
dynamics (F) on the average OD flows:
λ’(t+1) = F’[n×n] · λ’(t)
Diagonal matrix F’[n×n] : F’[i,i] ~ log-Normal
leads to:
λ’(t+1) = F’·λ’(t)  ⇒  e^λ(t+1) = e^F · e^λ(t)  ⇒  λ(t+1) = F + λ(t)
Better OD Flows in 4 Steps
[Figure: four panels, numbered 1 to 4]
Immanuel Kant + o(1)
In making inferences on non-observable
quantities we find the model we look for!
Assume a model that reasonably
approximates real OD flows, and of
course it does not hurt to have a prior
opinion about it …
Learning OD Flows
Typical solutions are based on:
Generalized Least Squares
Maximum Likelihood
Bayesian methods
Entropy
These methods generate one set of OD flows X from multiple
observations {Y1,..,YT}. In general:
$$
\max_{X} \; p \cdot D_1[X, X_{obs}] + q \cdot D_2[\{Y\}, \{Y_{obs}\}]
\quad \text{s.t.} \quad Y = A X, \;\; X \ge 0 \ \text{(random)}, \;\; p, q \in [0,1] \ \text{fixed}
$$
Intrinsic Dimensionality
The routing matrix A has m rows < n columns, and its
m rows are linearly independent
The space $\mathbb{R}^{n}_{+}$ where the OD flows live can be
decomposed into a sub-space $\mathbb{R}^{n-m}_{+}$ with an open
interior, and a degenerate sub-space $\mathbb{R}^{m}_{+}$
It is possible to rearrange A=[A1,A2], and X=[X1,X2]
accordingly, so that given $X_2 \in \mathbb{R}^{n-m}_{+}$
$$X_1 = A_1^{-1} \cdot (Y - A_2 X_2) \in \mathbb{R}^{m}_{+}$$
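The partition can be tried out numerically on the 3×4 routing matrix from the Mathematical Formulation slide (m = 3 rows, n = 4 columns). The link loads and the chosen value of X2 below are hypothetical; the first three columns are used as A1 because they happen to form an invertible block.

```python
import numpy as np

A = np.array([[1, 1, 0, 0],
              [0, 0, 1, 1],
              [1, 0, 1, 0]], dtype=float)
y = np.array([168.0, 41.0, 38.0])       # hypothetical link loads

# Partition A = [A1, A2]: A1 is the invertible m×m block, A2 holds the rest.
A1, A2 = A[:, :3], A[:, 3:]

x2 = np.array([32.0])                   # free choice for the remaining n-m flows
x1 = np.linalg.solve(A1, y - A2 @ x2)   # X1 = A1^{-1}·(Y - A2·X2)
x = np.concatenate([x1, x2])

print("OD flows:", x)
print("reproduces link loads:", np.allclose(A @ x, y))
```

Only the n-m = 1 free coordinate needs to be explored; the other three flows follow deterministically from the link loads.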
Doubly Stochastic BDS