Medagliani_2016-04-05_FlowSplitting

Security Level: Confidential
Solving bucket-based large flow allocation problems
2016-04-05
www.huawei.com
Paolo Medagliani
[email protected]
HUAWEI TECHNOLOGIES Co., Ltd.
HUAWEI Confidential
Outline

• Introduction to the problem
• Two algorithms
    Linear programming-based
    Greedy
• Results
French Mathematical and Algorithmic Sciences Lab
• Located in Boulogne Billancourt
• Started officially October 2014
• 57 researchers at PhD level + 11 PhDs
• Merouane Debbah (scientific director)
• Mid and Long Term research
• 2 departments:
    Communication Science
    Networking Science
• Focus: 4.5G, 5G, SDN, Big Data for telecom
• An active lab in a HUGE company (170k employees worldwide, +45% growth this year in Western Europe)
Our Vision, Projects and Expertise
Vision
Solve global and online optimization problems in next-generation network controllers
Topics
Routing and admission control
Rate control, network monitoring
Network virtualization (embedding, placement…)
Projects
SDN: Network optimization, Path computation, Monitoring, Learning
Tools and Skills
Optimization theory (combinatorial, stochastic), online and approximation algorithms, game and auction theory, control theory, dynamic programming, machine learning
Why flow splitting?



Flow splitting

Helps to balance traffic or to accept more traffic

Increases scalability when traffic needs to be processed by middleboxes
Use case

Cloud or carrier provider hosting up to thousands of flow aggregates

Multi-path routing requires a switch to split aggregated flows
Main problem

Find a feasible solution that maximizes throughput and minimizes cost.

The problem is nonlinear and NP-hard, and cannot be approximated to within any constant factor unless P = NP.
MPLS-TE use case
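[Figure: flow f is split at source S into f1 = 0.5 and f2 = 0.5, which travel over separate paths through intermediate nodes (1, 2, 3) to destination D]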
Split is considered only at the source S

Simpler to manage

Helps to handle jitter in the flow aggregate

A flow consists of a set of sub-paths from S to D

Intermediate nodes treat f1 and f2 as two different flows

The split can be uneven and can use unequal-cost paths (unlike ECMP)
Hash-based splitting
Extend ECMP by repeating entries
Typical size of TCAM memory is 3k entries [1]
[1] K. Kannan, S. Banerjee, “Compact TCAM: Flow entry compaction in TCAM for power aware SDN”, Distributed Computing and Networking, 2013
Hash-based splitting

The larger the number of buckets, the more accurately the flow distribution approximates the ideal fractional split

The distribution of flow volume among the paths is constrained by the limited number of TCAM entries
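To make the accuracy argument concrete, here is a minimal sketch of how an integer number of hash buckets per path approximates a fractional split. The function names and the largest-remainder rounding rule are assumptions chosen for illustration, not the method used in the talk; the ratios 0.43 / 0.57 are the ones used in the column generation example later in the deck.

```python
def buckets_for_ratios(ratios, num_buckets):
    """Approximate fractional split ratios with integer bucket counts.

    Uses largest-remainder rounding (an illustrative choice): floor every
    share, then hand the leftover buckets to the largest fractional parts.
    """
    total = sum(ratios)
    exact = [r / total * num_buckets for r in ratios]
    counts = [int(e) for e in exact]  # floor, since all shares are >= 0
    leftover = num_buckets - sum(counts)
    by_remainder = sorted(range(len(ratios)),
                          key=lambda i: exact[i] - counts[i],
                          reverse=True)
    for i in by_remainder[:leftover]:
        counts[i] += 1
    return counts


def max_split_error(ratios, num_buckets):
    """Largest per-path gap between the bucket-based and the ideal share."""
    counts = buckets_for_ratios(ratios, num_buckets)
    total = sum(ratios)
    return max(abs(c / num_buckets - r / total)
               for c, r in zip(counts, ratios))


print(buckets_for_ratios([0.43, 0.57], 10))   # [4, 6]
print(buckets_for_ratios([0.43, 0.57], 100))  # [43, 57]
```

With 10 buckets the realized split is 0.4 / 0.6, while 100 buckets reproduce 0.43 / 0.57; the price is ten times as many TCAM entries for the same demand.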
Flow splitting problem

Problem: Find a feasible GLOBAL bucket configuration

Input parameters:


Network topology

Current network state (residual capacity on each link)

Demands to be allocated
Constraints

Max number of buckets at each node (TCAM size)

Max number of paths for each demand

Control: allocation of paths according to bucket structure

Objective: max bandwidth acceptance and cost minimization

[Figure: SDN controller (centralized control and management) above a network with sources s1, s2 and destinations t1, t2]
MINLP formulation
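[Equations not recovered from the slide. Annotations on the formulation: large constant, ratios, number of buckets, memory, MCF (multicommodity flow)]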
x says if an edge is used (binary)
y tells how much bandwidth is allocated to an edge
z gives the number of buckets used (binary)
Results on execution time

ILP with CPLEX after linearization

Network with 6 nodes and 8 bidirectional links

K=1 commodity
Without flow splitting -> 1 s
With flow splitting -> 3 s

K=2 commodities
Without flow splitting -> 1 min 40 s
With flow splitting -> 24 min

NO SCALABILITY
I want speed and optimality… but HOW??
How to solve the flow splitting problem?

2 complementary approaches

A linear programming-based heuristic

Very close to optimality (based on Column Generation)

Requires a subsequent rounding phase

Can be accelerated by parallelizing some steps of the algorithm

A greedy heuristic

Very fast even for a massive number of demands

Based on sequential execution of a shortest path algorithm

Can embed the computation of reliable pairs of paths respecting given weight constraints
LP-based algorithm

Scale edge capacities by (1 - α)
Artificially reduce link capacity to reserve space for rounding. Different values of α have to be used.

Solve the relaxed multicommodity flow (i.e., using column generation)
Remove path number and bucket constraints, relax integer variables and solve.

Allocate a proportional fair amount of bucket budget to each demand at each source node

Randomly round up the fractional solution

Reduce the number of paths if above N

Find a feasible configuration which minimizes the error

Iterate if demands can still be allocated

[Diagram labels: Scale, LP solver, Oracle, Rounding, Iterate]
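The bucket budget step can be sketched as follows. This is one plausible reading of "proportional fair", assumed here for illustration: every demand sourced at a node gets one guaranteed bucket, and the rest of the node's TCAM budget is shared in proportion to demand size. The function name `bucket_budget` and the demand labels are hypothetical.

```python
def bucket_budget(demands, tcam_size):
    """Share a node's bucket budget among the demands it sources.

    demands: dict mapping a demand name to its size (any bandwidth unit).
    Assumption: one guaranteed bucket per demand, remaining buckets shared
    proportionally to demand size with largest-remainder rounding.
    """
    if len(demands) > tcam_size:
        raise ValueError("more demands than buckets at this node")
    total = sum(demands.values())
    spare = tcam_size - len(demands)  # buckets left after the guaranteed ones
    exact = {d: spare * size / total for d, size in demands.items()}
    budget = {d: 1 + int(exact[d]) for d in demands}
    leftover = tcam_size - sum(budget.values())
    by_remainder = sorted(demands,
                          key=lambda d: exact[d] - int(exact[d]),
                          reverse=True)
    for d in by_remainder[:leftover]:
        budget[d] += 1
    return budget


print(bucket_budget({"d1": 600, "d2": 300, "d3": 100}, tcam_size=20))
# {'d1': 11, 'd2': 6, 'd3': 3}
```

The guaranteed bucket keeps small demands routable on at least one path; a real controller might instead weight the budget by priority or by the number of candidate paths.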
CG-based metaheuristic (1/5)
Original network
[Figure: original network with the residual capacity of each link and the demands to allocate (d1)]
CG-based metaheuristic (2/5)
Network capacity scaled down by a factor (1 - α)
Given c(e) the capacity of a link, c’(e) = c(e) * (1 - α)
Solve CG on this network
Scaling the network capacity leaves headroom for finding feasible paths during the rounding phase
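A minimal sketch of the scaling step, assuming link capacities are stored in a dictionary keyed by edge. The edge names and the candidate α values are illustrative, not taken from the talk.

```python
def scale_capacities(capacity, alpha):
    """Return c'(e) = c(e) * (1 - alpha) for every link e."""
    if not 0 <= alpha < 1:
        raise ValueError("alpha must be in [0, 1)")
    return {edge: c * (1 - alpha) for edge, c in capacity.items()}


# Hypothetical two-link network (capacities in arbitrary bandwidth units).
capacity = {("a", "b"): 100.0, ("b", "c"): 40.0}

# Several values of alpha are tried; each scaled network would be handed to
# the column generation solver, which is not shown here.
scaled_networks = {alpha: scale_capacities(capacity, alpha)
                   for alpha in (0.0, 0.05, 0.1, 0.2)}
```

A larger α leaves more room for the rounding phase to shift bandwidth between paths, but makes the relaxed solution more conservative.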
CG-based metaheuristic (3/5)
In red, the link occupation after CG on the scaled network
Solution for d1 as provided by CG: p1 = 0.43, p2 = 0.57
The solution may not be feasible in terms of buckets => ROUNDING
Randomly move bandwidth from one path to another
CG-based metaheuristic (4/5)
ROUNDING carried out on the original network
Randomly assign buckets to paths computed by LP
Priority to paths with higher bandwidth to be allocated
Solution for d1 before rounding: p1 = 0.43, p2 = 0.57; after rounding: p1 = 0.4, p2 = 0.6
No knowledge of the network => test of all bucket profiles
The feasible configuration with the smallest error compared to LP is applied to the network
New attempt on the residual network with the residual demands
[Figure: network with residual capacity, allocated demands and residual demands]
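The "test of all bucket profiles" step can be illustrated with a small exhaustive search. This is a sketch under stated assumptions, not the rounding procedure of the talk (which is randomized): the bucket count of 5, the path capacities and all function names are hypothetical, and the error is measured as the sum of absolute share differences.

```python
def compositions(total, parts):
    """Yield every tuple of `parts` non-negative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def best_bucket_profile(lp_shares, path_capacity, demand, num_buckets):
    """Return the feasible bucket profile closest to the LP shares.

    lp_shares: fractional share of the demand on each path (from the LP).
    path_capacity: bottleneck residual capacity of each path.
    """
    chunk = demand / num_buckets  # bandwidth carried by one bucket
    best, best_err = None, float("inf")
    for profile in compositions(num_buckets, len(lp_shares)):
        # A profile is feasible if every path can carry its buckets.
        if any(k * chunk > cap for k, cap in zip(profile, path_capacity)):
            continue
        err = sum(abs(k / num_buckets - s)
                  for k, s in zip(profile, lp_shares))
        if err < best_err:
            best, best_err = profile, err
    return best, best_err


profile, err = best_bucket_profile([0.43, 0.57], [50, 70],
                                   demand=100, num_buckets=5)
print(profile)  # (2, 3), i.e. shares 0.4 and 0.6
```

Exhaustive enumeration grows quickly with the number of buckets and paths, which is one reason a randomized rounding with a cap on the number of paths is attractive in practice.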
Greedy algorithm

Allocate a proportional fair bucket budget to each demand at each source node

Divide each flow into chunks knowing the bucket number

Allocate as many chunks as possible on the shortest path between source and destination

Once the path is saturated, update network utilization and compute a new shortest path to allocate residual bandwidth

Repeat previous steps until either the whole demand is allocated or the demand is rejected

Repeat path allocation with different chunk sizes in order to find the best bucket configuration in terms of routing cost

Iterate if demands can still be allocated

[Diagram labels: Oracle, Greedy, Best solution, Iterate]
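The chunk allocation loop for a single demand and a single chunk size can be sketched as below. This is an illustrative reading of the steps above, not the authors' implementation: the graph encoding (node -> neighbor -> (cost, residual capacity)), the intermediate node names A, B, C and the link costs are assumptions. The outer loops over demands and over chunk sizes are omitted.

```python
import heapq


def shortest_path(graph, src, dst, min_cap):
    """Dijkstra over links whose residual capacity is at least min_cap."""
    dist, prev, heap = {src: 0.0}, {}, [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist[u]:
            continue  # stale heap entry
        for v, (cost, cap) in graph.get(u, {}).items():
            if cap < min_cap:
                continue  # link cannot carry even one chunk
            if d + cost < dist.get(v, float("inf")):
                dist[v], prev[v] = d + cost, u
                heapq.heappush(heap, (d + cost, v))
    if dst not in dist:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


def greedy_allocate(graph, src, dst, demand, num_buckets):
    """Pack chunks on successive shortest paths.

    Returns (allocation, residual_graph); allocation is a list of
    (path, buckets) pairs, or None if the demand is rejected, in which
    case the original graph is returned unchanged.
    """
    residual = {u: {v: list(e) for v, e in nbrs.items()}
                for u, nbrs in graph.items()}
    chunk = demand / num_buckets
    remaining, allocation = num_buckets, []
    while remaining > 0:
        path = shortest_path(residual, src, dst, chunk)
        if path is None:
            return None, graph  # demand rejected
        links = list(zip(path, path[1:]))
        bottleneck = min(residual[u][v][1] for u, v in links)
        k = min(remaining, int(bottleneck // chunk))  # chunks that fit
        for u, v in links:
            residual[u][v][1] -= k * chunk
        allocation.append((path, k))
        remaining -= k
    return allocation, residual


# Cheap path S-A-B-D with capacities 25, 15, 60; costlier path S-C-D.
graph = {
    "S": {"A": (1, 25), "C": (5, 100)},
    "A": {"B": (1, 15)},
    "B": {"D": (1, 60)},
    "C": {"D": (5, 100)},
}
allocation, residual = greedy_allocate(graph, "S", "D",
                                       demand=100, num_buckets=32)
print(allocation)  # [(['S', 'A', 'B', 'D'], 4), (['S', 'C', 'D'], 28)]
```

Four buckets land on the cheap path before its 15-unit bottleneck saturates; the remaining 28 go to the second shortest path.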
Packing buckets into a path
32 buckets, 100 units of flow
Chunk size = 100/32 = 3.125
Link capacities along the path: 25, 15, 60
Bottleneck is 15…
Q) How many chunks can we fit?
A) 15 / chunk size = 15 / (100/32) = 4.8, so we can fit 4 chunks of flow ----> assign 4 buckets
Packing buckets into a path
32 buckets, 100 units of flow
Chunk size = 100/32 = 3.125
Residual link capacities after assigning 4 buckets:
25 - (4 * chunk size) = 12.5
15 - (4 * chunk size) = 2.5
60 - (4 * chunk size) = 47.5
Chunk size = 3.125 > 2.5… we can’t fit any more chunks in this link, so we may as well delete it
We need to find other paths to service the remaining 32 - 4 = 28 chunks of flow, i.e., assign the remaining 28 buckets
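The arithmetic of this example can be checked with a few lines of code. The helper name `pack_chunks` is hypothetical; the numbers are the ones on the slide.

```python
import math


def pack_chunks(link_capacities, demand, num_buckets, buckets_left):
    """Fit as many chunks as the path bottleneck allows.

    Returns the number of buckets assigned to the path and the residual
    capacity of each link after the assignment.
    """
    chunk = demand / num_buckets
    bottleneck = min(link_capacities)
    fit = min(buckets_left, math.floor(bottleneck / chunk))
    residual = [c - fit * chunk for c in link_capacities]
    return fit, residual


fit, residual = pack_chunks([25, 15, 60], demand=100,
                            num_buckets=32, buckets_left=32)
print(fit)       # 4
print(residual)  # [12.5, 2.5, 47.5]
```

The middle link is left with 2.5 units, less than one chunk of 3.125, so it cannot be used again for this demand and the remaining 28 buckets must go to other paths.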
Simulation setup



Fat Tree topology with 125 nodes

Average node degree of 4

Link capacity in [100M : 1G]

Exponential distribution of edge costs

TCAM Size in {20; 50; 100; 150; 200}
250 demands

Random source destination pairs

Max size in [1G]

Zipf distribution (exponent 0.8)
Matlab implementation
Accepted throughput and running time
CDF of bucket distribution
[Figure: two panels, "TCAM size: 20 buckets" and "TCAM size: 200 buckets"]
Performance comparison – Scalability
• C++ implementation
• 500 nodes
• 4900 links
• 5000 to 13000 demands
• 10% of elephant flows
Due to bottlenecks, performance loss compared to unconstrained LP
Despite worse bandwidth allocation performance, greedy approaches are really fast at computing a solution for large demand sets
Concluding remarks
• The MINLP is NP-hard, even to approximate within any constant factor
• Flow splitting is very hard (if not impossible) to solve optimally
• Two approaches to solve the flow splitting with buckets problem
    LP-based rounding approach, which is close to optimality but may be slow for large network instances
    Greedy approach, which is very fast but inaccurate for large demand sets
• There is still room for improvement through faster LP resolution and a parallel computing implementation
• Online version of the problem is interesting