Slides - University of Alberta

Pricing Cloud Bandwidth Reservations under Demand Uncertainty
Di Niu, Chen Feng, Baochun Li
Department of Electrical and Computer Engineering
University of Toronto
1
Roadmap
Part 1 A cloud bandwidth reservation model
Part 2 Price such reservations
Large-scale distributed optimization
Part 3 Trace-driven simulations
2
Cloud Tenants
WWW
Problem: No bandwidth guarantee
Not good for Video-on-Demand, transaction processing web applications, etc.
3
Amazon Cluster Compute
10 Gbps Dedicated Network
[Figure: bandwidth vs. time (days 0 to 2), showing demand against the dedicated capacity, annotated "over-provision"]
4
Good News:
Bandwidth reservations are becoming feasible between a VM and the Internet
H. Ballani, et al. Towards Predictable Datacenter Networks. ACM SIGCOMM '11
C. Guo, et al. SecondNet: a Data Center Network Virtualization Architecture with Bandwidth Guarantees. ACM CoNEXT '10
5
Dynamic Bandwidth Reservation
Reduces cost due to better utilization
[Figure: bandwidth vs. time (days 0 to 2), showing the reservation curve and the demand curve]
Difficulty: tenants don’t really know their demand!
6
A New Bandwidth Reservation Service
A tenant specifies a percentage of its bandwidth demand to be served with guaranteed performance;
the remaining demand will be served with best effort
[Diagram: the tenant sends its workload history and the guaranteed portion (QoS level, e.g., 95%) to the cloud provider; the provider performs demand prediction and bandwidth reservation; repeated periodically]
7
Tenant Demand Model
Each tenant i has a random demand Di
Assume Di is Gaussian, with
mean μi = E[Di]
variance σi² = var[Di]
covariance matrix Σ = [σij]
Service Level Agreement: Outage w.p.
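To make the Gaussian demand model concrete, here is a minimal sketch of how much bandwidth covers a single tenant's demand for a given outage probability. The function name, the outage probability `eps`, and the numbers are hypothetical; the slide's own SLA expression is not reproduced here.

```python
from statistics import NormalDist

def reservation_for_outage(mu: float, sigma: float, eps: float) -> float:
    """Smallest capacity K with P(D > K) <= eps for D ~ N(mu, sigma^2)."""
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must be in (0, 1)")
    # theta is the (1 - eps) quantile of the standard normal distribution.
    theta = NormalDist().inv_cdf(1.0 - eps)
    return mu + theta * sigma

# Hypothetical tenant: mean 100 Mbps, standard deviation 20 Mbps, 5% outage.
K = reservation_for_outage(100.0, 20.0, 0.05)
print(round(K, 1))
```

The reservation grows linearly in the standard deviation, which is why demand uncertainty, not just mean demand, drives the cost of a guarantee.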
8
Roadmap
Part 1 A cloud bandwidth reservation model
Part 2 Price such reservations
Large-scale distributed optimization
Part 3 Trace-driven simulations
9
Objectives
Objective 1: Pricing the reservations
A reservation fee on top of the usage fee
Objective 2: Resource Allocation
Price affects demand, which affects price in turn
Social Welfare Maximization
10
Tenant Utility (e.g., Netflix)
Tenant i can specify a guaranteed portion wi
Tenant i’s expected utility (revenue)
Concave, twice differentiable, increasing
Utility depends not only on demand, but also on the guaranteed portion!
11
Bandwidth Reservation
Given submitted guaranteed portions
the cloud will guarantee the demands
It needs to reserve a total bandwidth capacity
Non-multiplexing:
Multiplexing:
Service cost
e.g.
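The difference between the two reservation modes can be sketched numerically. The capacity formulas below are an assumption based on the Gaussian demand model (mean plus `theta` standard deviations, per tenant or for the aggregate), not a transcription of the slide; `theta`, the demand figures, and the function names are hypothetical.

```python
from math import sqrt

def capacity_non_multiplexing(w, mu, cov, theta):
    """Reserve for each tenant separately: sum_i w_i * (mu_i + theta * sigma_i)."""
    return sum(w[i] * (mu[i] + theta * sqrt(cov[i][i])) for i in range(len(w)))

def capacity_multiplexing(w, mu, cov, theta):
    """Reserve for the aggregate demand: mu^T w + theta * sqrt(w^T cov w)."""
    n = len(w)
    mean = sum(w[i] * mu[i] for i in range(n))
    var = sum(w[i] * cov[i][j] * w[j] for i in range(n) for j in range(n))
    return mean + theta * sqrt(max(var, 0.0))

# Two hypothetical tenants with negatively correlated demands (Mbps).
mu = [100.0, 80.0]
cov = [[400.0, -150.0], [-150.0, 225.0]]  # sigma = 20 and 15, correlation -0.5
w = [1.0, 1.0]                            # both ask for a full guarantee
theta = 1.645                             # roughly a 5% outage probability

separate = capacity_non_multiplexing(w, mu, cov, theta)
pooled = capacity_multiplexing(w, mu, cov, theta)
print(f"non-multiplexing: {separate:.1f} Mbps, multiplexing: {pooled:.1f} Mbps")
```

Pooling never needs more capacity than separate reservations in this model, and the two coincide only when demands are perfectly positively correlated.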
12
Cloud Objective:
Social Welfare Maximization
[Equation labels: price; surplus of tenant i; profit of the cloud provider; social welfare]
Impossible: the cloud does not know Ui
13
Pricing Function
Pricing function
Price guaranteed portion, not absolute bandwidth!
Example: Linear pricing
Under Pi(⋅), tenant i will choose
Surplus
(Profit)
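A tenant's choice under linear pricing can be illustrated with a toy utility. The quadratic utility U(w) = a*w - 0.5*b*w^2 is a hypothetical stand-in (concave and increasing on [0, 1] when a >= b > 0); the slides do not specify a functional form at this point.

```python
def best_response(a: float, b: float, k: float) -> float:
    """argmax over w in [0, 1] of U(w) - k*w, with U(w) = a*w - 0.5*b*w**2."""
    if b <= 0:
        raise ValueError("b must be positive for a strictly concave utility")
    w = (a - k) / b                # stationary point: U'(w) = a - b*w = k
    return min(1.0, max(0.0, w))   # project onto the feasible interval [0, 1]

def surplus(a: float, b: float, k: float, w: float) -> float:
    """Tenant surplus: utility minus the linear payment k*w."""
    return a * w - 0.5 * b * w * w - k * w

# Hypothetical tenant facing three different unit prices.
for price in (1.0, 6.0, 12.0):
    w_star = best_response(10.0, 8.0, price)
    print(price, w_star, surplus(10.0, 8.0, price, w_star))
```

A higher price lowers the chosen guaranteed portion, which is the feedback loop between price and demand that the pricing policy has to account for.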
14
Pricing as a Distributed Solution
Determine pricing policy
to
Social Welfare
where
Surplus
Challenge:
Cost not decomposable for multiplexing
15
A Simple Case: Non-Multiplexing
Determine pricing policy
to
where
Since
, for Gaussian
Mean
Std
The General Case:
Lagrange Dual Decomposition
M. Chiang, S. Low, A. Calderbank, J. Doyle. Layering as optimization decomposition: A mathematical theory of network architectures. Proceedings of the IEEE, 2007
Original problem
Lagrange dual
Dual problem
17
Lagrange dual
Dual problem
Lagrange multiplier ki as price: Pi (wi) := ki wi
decompose
Subgradient Algorithm:
For dual minimization, update price:
a subgradient of
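The price update can be sketched on a toy problem. Everything numeric here is hypothetical: quadratic tenant utilities, and a separable quadratic provider cost chosen only so that both subproblems have closed forms (the multiplexing cost in the slides is not separable). The sketch assumes the decomposition introduces a provider-side copy v of w, so that the difference v - w is a subgradient of the dual at price k.

```python
def tenant_response(a, b, k):
    """Tenant maximizes U(w) - k*w, U(w) = a*w - 0.5*b*w^2, over w in [0, 1]."""
    return min(1.0, max(0.0, (a - k) / b))

def provider_response(c, d, k):
    """Provider maximizes k*v - C(v), C(v) = c*v + 0.5*d*v^2, over v in [0, 1]."""
    return min(1.0, max(0.0, (k - c) / d))

def subgradient_pricing(a, b, c, d, step, rounds, k0=0.0):
    """Dual minimization by the update k_i := k_i - step * (v_i - w_i)."""
    n = len(a)
    k = [k0] * n
    w = [0.0] * n
    for _ in range(rounds):
        w = [tenant_response(a[i], b[i], k[i]) for i in range(n)]
        v = [provider_response(c[i], d[i], k[i]) for i in range(n)]
        # v - w is a subgradient of the dual function at the current prices:
        # raise the price when tenants ask for more than the provider offers.
        k = [k[i] - step * (v[i] - w[i]) for i in range(n)]
    return k, w

# One hypothetical tenant; the optimum is w = 2/3 at price k = 14/3.
k_ok, w_ok = subgradient_pricing([10.0], [8.0], [2.0], [4.0], step=1.0, rounds=200)
k_bad, _ = subgradient_pricing([10.0], [8.0], [2.0], [4.0], step=6.0, rounds=200)
print(k_ok[0], w_ok[0], k_bad[0])
```

With step 1.0 the price settles at the optimum, while with step 6.0 this toy problem falls into a two-price cycle and never converges, so the outcome hinges on a step size that has to be tuned by hand.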
18
Weakness of the Subgradient Method
[Diagram: iteration between the cloud provider and tenants 1, ..., i, ..., N, with steps numbered 1 to 4; labels: price, surplus, guaranteed portion, and update to increase social welfare (SW)]
Step size is an issue! Convergence is slow.
19
Our Algorithm: Equation Updates
[Diagram: exchange between the cloud provider and each tenant i, with steps numbered 1 to 4; labels: "KKT conditions of", "Set", and "Solve" (equations omitted)]
Linear pricing Pi (wi) = ki wi suffices!
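One plausible reading of the "Set" and "Solve" steps is: the provider sets each price to the marginal cost at the current guaranteed portions, and each tenant then solves its own first-order condition at that price. The sketch below follows that reading, which is an assumption, with a hypothetical multiplexing cost C(w) = beta * (mu^T w + theta * sqrt(w^T cov w)), hypothetical quadratic utilities, and made-up numbers.

```python
from math import sqrt

def cost_gradient(w, mu, cov, beta, theta):
    """Gradient of C(w) = beta * (mu^T w + theta * sqrt(w^T cov w))."""
    n = len(w)
    cov_w = [sum(cov[i][j] * w[j] for j in range(n)) for i in range(n)]
    std = sqrt(sum(w[i] * cov_w[i] for i in range(n)))
    if std == 0.0:
        raise ValueError("gradient is undefined when w^T cov w = 0")
    return [beta * (mu[i] + theta * cov_w[i] / std) for i in range(n)]

def tenant_solve(a_i, b_i, k_i):
    """Solve U_i'(w) = k_i for U_i(w) = a_i*w - 0.5*b_i*w^2, clipped to [0, 1]."""
    return min(1.0, max(0.0, (a_i - k_i) / b_i))

def equation_updates(a, b, mu, cov, beta, theta, w0, tol=1e-9, max_rounds=1000):
    """Alternate 'set price = marginal cost' and 'tenants solve' until w stops moving."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    w = list(w0)
    for rounds in range(1, max_rounds + 1):
        k = cost_gradient(w, mu, cov, beta, theta)   # provider step
        w_new = [tenant_solve(a[i], b[i], k[i]) for i in range(len(w))]  # tenant step
        moved = max(abs(x - y) for x, y in zip(w_new, w))
        w = w_new
        if moved < tol:
            break
    return w, k, rounds

# Two hypothetical tenants; no step size appears anywhere in the iteration.
mu = [100.0, 80.0]
cov = [[400.0, -150.0], [-150.0, 225.0]]
w_eq, k_eq, n_rounds = equation_updates(
    a=[2.0, 1.5], b=[1.0, 1.0], mu=mu, cov=cov,
    beta=0.01, theta=1.645, w0=[1.0, 1.0])
print(w_eq, k_eq, n_rounds)
```

Whether such an iteration converges depends on how the curvature of the utilities compares with that of the cost; the numbers above were picked so that it does.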
20
Theorem 1 (Convergence)
Equation updates converge if for all i
for all
between
and
21
Convergence: A Single Tenant (1-D)
Subgradient method
Equation Updates
Not converging
22
The Case of Multiplexing
Covariance matrix: symmetric, positive semi-definite
is a cone centered at 0
if
is not zero and
is small
Satisfies Theorem 1, algorithm converges.
23
Roadmap
Part 1 A cloud bandwidth reservation model
Part 2 Price such reservations
Large-scale distributed optimization
Part 3 Trace-driven simulations
24
Data Mining: VoD Demand Traces
200+ GB traces (binary) from UUSee Inc.
Reports from online users every 10 minutes
Aggregate into video channels
25
Predict Expected Demand via Seasonal ARIMA
[Figure: bandwidth (Mbps) vs. time periods (1 period = 10 minutes)]
26
Predict Demand Variation via GARCH
[Figure: demand variation (Mbps) vs. time periods (1 period = 10 minutes)]
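As an illustration of how a GARCH model turns past prediction errors into a variance forecast, here is a minimal sketch of the GARCH(1,1) recursion. The order (1,1) and the parameter values are assumptions; the slides do not state which order or estimates were used. The `residuals` would be the errors of the mean forecast in each 10-minute period.

```python
def garch11_variance(residuals, omega, alpha, beta, sigma2_0):
    """Conditional variances under GARCH(1,1).

    sigma2[t] = omega + alpha * residuals[t-1]**2 + beta * sigma2[t-1]

    Returns len(residuals) + 1 values; the last one is the one-step-ahead
    variance forecast for the next period.
    """
    if omega <= 0 or alpha < 0 or beta < 0:
        raise ValueError("need omega > 0 and alpha, beta >= 0")
    sigma2 = [sigma2_0]
    for e in residuals:
        sigma2.append(omega + alpha * e * e + beta * sigma2[-1])
    return sigma2

# Hypothetical residuals (Mbps) and parameters.
forecast = garch11_variance([1.0, -2.0], omega=0.1, alpha=0.2, beta=0.7, sigma2_0=1.0)
print(forecast)
```

A large recent error raises the next period's variance, so the reserved bandwidth can widen its safety margin right after demand becomes volatile.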
27
Prediction Results
Each tenant i has a random demand Di
in each “10 minutes”
Di is Gaussian, with
mean μi = E[Di]
variance σi² = var[Di]
covariance matrix Σ = [σij]
28
Dimension Reduction via PCA
A channel's demand = weighted sum of factors
Find factors using Principal Component Analysis (PCA)
Predict factors first, then each channel
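The factor idea can be sketched with a tiny PCA computed by power iteration. This is a simplified illustration that extracts only the leading component from made-up data, not the pipeline used on the traces; all names and numbers are hypothetical.

```python
from math import sqrt

def covariance_matrix(rows):
    """Sample covariance of the columns; rows are time periods, columns are channels."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def top_component(cov, iters=500):
    """Leading eigenvector and eigenvalue of a symmetric PSD matrix (power iteration)."""
    d = len(cov)
    # Non-uniform start; it must not be orthogonal to the leading eigenvector.
    v = [float(i + 1) for i in range(d)]
    for _ in range(iters):
        u = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sqrt(sum(x * x for x in u))
        if norm == 0.0:
            break
        v = [x / norm for x in u]
    cov_v = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
    eigenvalue = sum(v[i] * cov_v[i] for i in range(d)) / sum(x * x for x in v)
    return v, eigenvalue

def explained_ratio(cov, eigenvalue):
    """Share of total variance captured by one component."""
    return eigenvalue / sum(cov[i][i] for i in range(len(cov)))

# Three hypothetical channels driven by one common factor f with weights 1, 2, 0.5.
factor = [1.0, 2.0, 3.0, 4.0, 5.0]
demand = [[1.0 * f, 2.0 * f, 0.5 * f] for f in factor]
cov = covariance_matrix(demand)
component, lam = top_component(cov)
print(component, explained_ratio(cov, lam))
```

When a few components explain most of the variance, it is enough to forecast those factor series and rebuild each channel from its weights.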
29
3 Biggest Channels of 452 Channels
[Figure: bandwidth (Mbps) vs. time periods (1 period = 10 minutes)]
30
The First 3 Principal Components
[Figure: Mbps vs. time periods (1 period = 10 minutes)]
31
[Figure: data variance explained vs. number of principal components; 8 components explain 98%]
Complexity reduction: 452 channels to 8 components
32
Pricing: Parameter Settings
Usage of tenant i:
w.h.p.
Utility of tenant i (conservative estimate)
[Equation labels: linear revenue; reputation loss for demand not guaranteed]
33
[Figure: CDF of the convergence iteration of the last tenant; one curve is annotated "Mean = 6 rounds", the other "Mean = 158 rounds"]
100 tenants (channels), 81 time periods (81 x 10 minutes)
34
Related Work
Primal/Dual Decomposition [Chiang et al. 07]
Contraction Mapping x := T(x)
D. P. Bertsekas, J. Tsitsiklis, "Parallel and Distributed Computation: Numerical Methods"
Game Theory [Kelly 97]
Each user submits a price (bid), expects a payoff
Equilibrium may or may not be socially optimal
Time Series Prediction
HMM [Silva 12], PCA [Gürsun 11], ARIMA [Niu 11]
35
Conclusions
A cloud bandwidth reservation model based on guaranteed portions
Pricing for social welfare maximization
Future work:
new decomposition and iterative methods for very large-scale distributed optimization
more general convergence conditions
36
Thank you
Di Niu
Department of Electrical and Computer Engineering
University of Toronto
http://iqua.ece.toronto.edu/~dniu
37
38
Root mean squared errors (RMSEs) over 1.25 days
[Figure: RMSE (Mbps, log scale) vs. channel index]
39
Optimal Pricing
when each tenant requires wi ≡ 1
Without multiplexing,
With multiplexing,
[Equation labels: expected demand; demand standard deviation; correlation to the market, in [-1, 1]]
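The labels on this slide are consistent with per-tenant prices of the form mean plus a risk term, where multiplexing scales the risk term by the tenant's correlation to the aggregate ("market") demand. The sketch below works through that reading; the formulas, the unit cost `beta`, `theta`, and the demand figures are all assumptions for illustration.

```python
from math import sqrt

def prices_at_full_guarantee(mu, cov, theta, beta=1.0):
    """Per-tenant (rho, price without multiplexing, price with multiplexing, discount).

    Assumes every tenant requests w_i = 1 and that prices equal marginal costs:
      without multiplexing: beta * (mu_i + theta * sigma_i)
      with multiplexing:    beta * (mu_i + theta * rho_i * sigma_i)
    where rho_i is the correlation between tenant i and the aggregate demand.
    """
    n = len(mu)
    sigma = [sqrt(cov[i][i]) for i in range(n)]
    market_std = sqrt(sum(cov[i][j] for i in range(n) for j in range(n)))
    if market_std == 0.0 or any(s == 0.0 for s in sigma):
        raise ValueError("correlation is undefined for zero-variance demand")
    rows = []
    for i in range(n):
        cov_with_market = sum(cov[i][j] for j in range(n))
        rho = cov_with_market / (sigma[i] * market_std)
        p_separate = beta * (mu[i] + theta * sigma[i])
        p_pooled = beta * (mu[i] + theta * rho * sigma[i])
        rows.append((rho, p_separate, p_pooled, 1.0 - p_pooled / p_separate))
    return rows

# Two hypothetical tenants with negatively correlated demands.
mu = [100.0, 80.0]
cov = [[400.0, -150.0], [-150.0, 225.0]]
table = prices_at_full_guarantee(mu, cov, theta=1.645)
for rho, p_sep, p_pool, discount in table:
    print(f"rho={rho:.3f} separate={p_sep:.2f} pooled={p_pool:.2f} discount={discount:.1%}")
```

Under this reading, a tenant whose demand moves against the market has a small or negative correlation and receives the largest discount, since it offsets the fluctuations of others.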
40
Histogram of Price Discounts due to Multiplexing
[Figure: counts vs. discounts of all tenants in all test periods; annotated regions: "majority" and "risk neutralizers"]
Mean discount 44%; total cost saving 35%
41
Video Channel: F190E
[Figure: aggregate bandwidth (Mbps) vs. time periods (one period = 10 minutes)]
42