(slides)

QoE Driven Server Selection for VoD in the Cloud
Chen Wang (1,2), Hyong Kim (1), Ricardo Morla (2)
(1) Department of ECE, Carnegie Mellon University
(2) Faculdade de Engenharia de Universidade do Porto
IEEE CLOUD 2015, New York, USA
Challenges
• Cloud for large-scale VoD: Elasticity, Scalability, Flexibility
• Performance impact due to VM interference
– The performance of a video server in a VM varies
– The user experience on the video server varies
[Figure: three guests, each running a 64-bit OS, on a shared virtualization layer over the hardware (CPU, disk, memory, network)]
Problem Statement
Which video server in the Cloud can
provide the best Quality-of-experience
(QoE) for a user request?
Our Objectives
• Select a server that provides the best QoE
– What is the criterion for selecting a server?
• Existing systems: the lowest network latency or server load
• The best server performance metric ≠ the best user QoE
– When to select a server?
• Existing systems: before the start of streaming
• The QoE at the start of streaming ≠ the QoE after 10 min
– Who selects the server?
• Existing systems: the local DNS server
• The client itself knows better.
• Neighboring clients might know better.
• Scalability
– Millions of users, thousands of servers.
Our Proposed System
• Best QoE
– What: QoE gives the best perception of server performance
• QoE-based server monitoring and server selection
– When: before the download of each video chunk
• Adaptive server selection per chunk
– Who: clients and their neighbors
• Cooperation among nearby clients on QoE-based server monitoring
• Scalability: agent-based system
– Agents perform distributed control.
– Agents serve user requests locally.
System Design
• Cache Agent
– Discover K candidate servers
– Return K servers to the client
• Client Agent
– Monitor the client's QoE on candidate servers
– Adaptive server selection
– Cooperative clients share QoE of servers
[Figure: production cloud environment with servers S1 to S5 and cooperating clients C1 to C6]
System Operation
1. Location-aware overlay of cache agents
2. Multi-candidate content discovery, CST
3. Connect to the local cache agent.
4. K candidate servers for a video request.
5. QoE-driven adaptive server selection
6. Cooperation among client agents.
[Figure: overlay of cache agents S1 to S5; ★ marks videos; table CST(S5) with columns Srv1 and Srv2]
Multi-Candidate Content Discovery (MCCD)
[Figure: cache agents S2, S3, and S5; ★ marks videos; tables CST(S2), CST(S3), and CST(S5), each with columns Cand1 and Cand2]
QoE Model
• Streaming scheme: DASH
• Factors impacting QoE per video chunk
– Bitrate of chunk: $r$
– Freezing time: $t$
• Existing QoE models
– Logarithm law: $Q_{\text{video\_quality}}(r) = a_1 \ln\left(\frac{a_2 r}{r_{\max}}\right)$
– Logistic model: $Q_{\text{freezing}}(t) = 5 - \frac{c_1}{1 + (c_2 / t)^{c_3}}$ for $t > 0$, and $Q_{\text{freezing}}(t) = 5$ for $t = 0$
– $a_1, a_2, c_1, c_2, c_3$ are positive fitted coefficients.
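The two existing models above can be sketched in a few lines of Python. This is only an illustration: the coefficient values are placeholders chosen so the example runs, not the fitted values, and the function names are hypothetical.

```python
import math

# Placeholder coefficients (assumptions for illustration, NOT the fitted values).
A1, A2 = 1.0, 100.0
C1, C2, C3 = 4.0, 2.0, 1.0


def q_video_quality(r: float, r_max: float) -> float:
    """Logarithm law: a1 * ln(a2 * r / r_max) for a chunk of bitrate r."""
    return A1 * math.log(A2 * r / r_max)


def q_freezing(t: float) -> float:
    """Logistic model: 5 when there is no freezing, lower as t grows."""
    if t == 0:
        return 5.0
    return 5.0 - C1 / (1.0 + (C2 / t) ** C3)
```

With these placeholder values, a longer freeze always scores lower than a shorter one, and a chunk with no freezing scores the maximum of 5.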
Our Chunk based QoE Model
• Chunk-based QoE model
[Figure: annotations "Freezing" and "Decreasing Bitrate"]
QoE driven Server Selection
• What: Criterion of Server Selection
[Figure: Candidate Server 1 and Candidate Server 2, labeled "Low Latency" and "High Interference"; server load, network latency, and other factors feed into QoE; table of the latest QoE for Can1 and Can2]
Adaptive Server Selection
• When: Adaptive Server Selection per Chunk
– Dynamic Interference
[Figure: Candidate Server 1 and Candidate Server 2, labeled "Low Latency" and "Dynamic Interference"; selection between Can1 and Can2 shown for Chunk 1, Chunk 2, and Chunk 3]
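A minimal sketch of per-chunk selection, assuming the client simply picks the candidate with the highest most recently observed chunk QoE. The `download_chunk` callback, the optimistic initial QoE of 5.0, and all names are assumptions made for this example, not details taken from the slides.

```python
from typing import Callable, Dict, List


def select_server(latest_qoe: Dict[str, float]) -> str:
    """Pick the candidate whose most recently observed chunk QoE is highest."""
    return max(latest_qoe, key=latest_qoe.get)


def stream(
    num_chunks: int,
    candidates: List[str],
    download_chunk: Callable[[str, int], float],
    initial_qoe: float = 5.0,
) -> List[str]:
    """Re-select a server before every chunk and record the choices.

    download_chunk(server, index) is a hypothetical callback that fetches
    one chunk and returns the QoE the client observed for it.
    """
    # Optimistic start (assumption): every candidate begins at the maximum QoE.
    latest_qoe = {s: initial_qoe for s in candidates}
    chosen = []
    for i in range(num_chunks):
        server = select_server(latest_qoe)
        latest_qoe[server] = download_chunk(server, i)
        chosen.append(server)
    return chosen
```

Under this simple rule, a server whose QoE dropped is only retried once the other candidate's latest QoE is no better, so a real client would likely need some form of periodic probing as well.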
Cooperative Server Selection
• Who: Neighbors know better
[Figure: Candidate Server 1 (labeled "Low Latency") and Candidate Server 2; two tables of the latest QoE for Can1 and Can2]
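One way cooperation could work is for each client to merge its neighbors' QoE reports into its own table before selecting. The sketch below assumes reports carry a timestamp and that the newest observation per server wins; the data layout and function names are hypothetical.

```python
from typing import Dict, Iterable, Tuple

# server name -> (timestamp of observation, observed chunk QoE)
QoETable = Dict[str, Tuple[float, float]]


def merge_reports(own: QoETable, neighbor_reports: Iterable[QoETable]) -> QoETable:
    """Keep, for every server, the most recent QoE observation seen by anyone."""
    merged = dict(own)
    for report in neighbor_reports:
        for server, (ts, qoe) in report.items():
            if server not in merged or ts > merged[server][0]:
                merged[server] = (ts, qoe)
    return merged


def select_from_table(table: QoETable) -> str:
    """Choose the server with the highest latest QoE in the table."""
    return max(table, key=lambda s: table[s][1])
```

For example, if a neighbor has just seen poor QoE on Can1, the merged table steers the client to Can2 before it downloads a degraded chunk itself.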
Comparison Methods
• Client streaming from 2 candidates
– DASH: streaming from the closest server
• DNS-based server selection + DASH streaming
– QAS-DASH: QoE + Adaptive + DASH
– CQAS-DASH: QoE + Adaptive + Cooperative + DASH
Google Cloud Experiment
[Figure: experiment deployment. Legend: cache agent; client agent; client agent attached to cache agent; location-aware cache agent overlay]
Google Cloud Experiment
Method      QoE < 3.4   80% QoE
DASH        10%         3.5
QAS-DASH    <1%         3.62
CQAS-DASH   0%          3.68
Session QoE: The average chunk QoE in a video session.
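The session QoE definition translates directly into code. The second helper assumes the "QoE < 3.4" figures report the share of sessions whose session QoE falls below that threshold; that reading, and the function names, are assumptions.

```python
from statistics import mean
from typing import List


def session_qoe(chunk_qoes: List[float]) -> float:
    """Session QoE: the average chunk QoE in one video session."""
    return mean(chunk_qoes)


def fraction_below(session_qoes: List[float], threshold: float) -> float:
    """Fraction of sessions whose session QoE is below the threshold."""
    return sum(q < threshold for q in session_qoes) / len(session_qoes)
```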
Simulation
• Simulation in SimGrid
[Figure: simulated topology of video servers and clients]
Simulation Results
Method      QoE < 3   90% QoE
DASH        > 40%     2.9216
QAS-DASH    0%        3.1822
CQAS-DASH   0%        3.5004
CQAS improves 90% QoE by >20%
Conclusion
• QoE + Adaptability
– QoE: a good indicator of server performance.
– Adaptability: improves user experience in the Cloud environment.
– Closest-server DASH → QAS-DASH:
• Google Cloud: ~3.5 → >3.6 (~80th percentile session QoE)
• Simulation: 2.9216 → 3.1822 (8.92% gain in 90th percentile session QoE)
• Cooperation
– Cooperation effectively helps server selection at clients.
– CQAS-DASH (QoE + Adaptability + Cooperation)
• Google Cloud: doubled bitrate for 80% of video sessions
• Simulation: >20% gain in 90th percentile session QoE
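The 8.92% figure quoted above for QAS-DASH follows from the two 90th percentile session QoE values in the simulation. A short check, with a hypothetical helper name:

```python
def relative_improvement(baseline: float, improved: float) -> float:
    """Relative gain of `improved` over `baseline`, in percent."""
    return (improved - baseline) / baseline * 100.0


# 90th percentile session QoE in simulation: closest-server DASH vs. QAS-DASH
gain = relative_improvement(2.9216, 3.1822)
print(f"{gain:.2f}%")  # prints 8.92%
```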
Netflix Titles
• http://netflixcanadavsusa.blogspot.mx/
– Netflix Canada: 4499 movies/shows
– Netflix USA: 8791 movies/shows
Capacity Limit
• Google Cloud
– We throttle the maximum bandwidth of each server to 4 Mbps to emulate the server overloading that would happen in real systems.
• Simulation
– Link to server: 50 Mbps
– Backbone link: 250 Mbps