P2P Streaming Analysis

A simple model for analyzing P2P
streaming protocols.
Seminar on advanced Internet applications and systems
Amit Farkash.
The problem:
The engineering of streaming video from a server to a single
client is well studied and understood.
This, however, is not scalable to serve a large number of
clients simultaneously.
P2P streaming tries to achieve scalability (like P2P file
distribution) while at the same time meeting real-time
playback requirements.
It is a challenging problem that is still not well understood.
The real world:
While the problem may still not be well understood, in
practice P2P streaming seems to be working just fine.
Even in Israel, where upload speeds are relatively low, it
is possible to receive a "watchable" stream.
The real world:
Current popular P2P streaming applications usually stream
sporting events (illegally, due to copyright violations).
P2P streaming vs. P2P file sharing:
 P2P streaming is less demanding – there is no need for the entire file:
1) Peers join the video session from the point of their arrival.
2) Missing "just a few" frames is tolerable.
 P2P streaming is also more demanding – it has real-time
requirements!
Performance metrics:
We will compare downloading strategies based on two
performance metrics:
 Continuity – the probability of continuous playback.
 Startup latency – the expected time until playback starts.
Basic model:
 M – Number of peers in the network.
 A single server streaming chunks of video in playback
order to M peers.
 Each chunk has a sequence number starting from 1.
 Time is slotted. In each time slot t, the server selects a
peer at random and sends it the newest chunk.
Basic model:
 Each peer maintains a buffer B that caches up to n
chunks.
 B(n) is reserved for the chunk to be played back
immediately, while B(1) stores the newest chunk – the one
distributed in the current time slot.
 At time t chunk t-n+1 is removed from the buffer and
played back by the peer while all other chunks are
shifted by 1.
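To make the indexing concrete, here is a tiny Python sketch (my illustration, not from the paper) of the buffer shift, in the ideal case where the peer receives chunk t in every slot t:

    # Sliding buffer sketch: index 0 is B(1), the newest position;
    # index n-1 is B(n), the chunk played back in the current slot.
    n = 5
    buffer = [None] * n                 # B(1) ... B(n), initially empty

    for t in range(1, 9):
        buffer = [t] + buffer[:-1]      # chunk t arrives in B(1); rest shift by 1
        played = buffer[-1]             # B(n) now holds chunk t-n+1 (if filled)
        print(f"t={t}: playing chunk {played}, buffer = {buffer}")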
Basic model:
Goal: Ensure B(n) is filled in as many time slots as possible.
Basic model:
Pk(i)[t] – the probability that B(i) of peer k is filled
(with the correct chunk) at time t.
Assumption: this probability reaches a steady state for
large t, namely Pk(i)[t] = Pk(i).
We call Pk(i) the buffer occupancy probability of peer
k.
Basic model:
Simple case – only the server distributes chunks:
 Pk(1) = P(1) = 1/M   ∀k.
 Pk(i+1) = P(i+1) = P(i)   for i = 1,…,n−1, ∀k.
Hence P(n) = 1/M – obviously very poor playback continuity
for any M > 1.
Improvement:
 Each peer selects another peer in each time slot to try
and download a chunk not already in its buffer.
 If the selected peer has no useful chunk, no chunk is
downloaded in that time slot.
 Assume all peers use the same selection strategy and
therefore have the same distribution P(i).
 Assume peers are selected uniformly at random.
Improvement:
Since peers are selected uniformly at random, the probability
that a peer A is selected by exactly k ≥ 0 other peers is
binomial:
A(k) = C(M−1, k) · (1/(M−1))^k · (1 − 1/(M−1))^(M−1−k)
If M = 100, the probability that A is selected by more than 3
peers is only about 1.8%, so let's assume A's upload capacity
is large enough to serve all requests in each time slot.
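A quick numeric check of this figure (the binomial form is from the model; the code is my own sketch):

    # With M = 100 peers each selecting one partner uniformly at random,
    # how likely is a given peer A to be chosen by more than 3 peers?
    from math import comb

    M = 100
    p = 1.0 / (M - 1)                  # chance that one given peer selects A

    def A(k):                          # probability of being selected exactly k times
        return comb(M - 1, k) * p**k * (1 - p)**(M - 1 - k)

    tail = 1.0 - sum(A(k) for k in range(4))
    print(f"Pr[selected by more than 3 peers] = {tail:.4f}")   # about 0.018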
Chunk selection policy:
First we define the following events:
 WANT(K,i) = B(i) of peer K is unfilled.
 HAVE(H,i) = B(i) of peer H is filled.
 SELECT(H,K,i) = Using a chunk selection strategy, peer K
cannot find a more preferred chunk than the one for B(i) that
satisfies WANT and HAVE.
Chunk selection policy:
For i = 1,…,n−1 we get:
P(i+1) = P(i) + Pr[WANT(K,i) ∩ HAVE(H,i) ∩ SELECT(H,K,i)]
       ≈ P(i) + [1 − Pk(i)] · Ph(i) · SELECT(i)
       = P(i) + [1 − P(i)] · P(i) · SELECT(i)
Note that P(i) is an increasing function, hence
collaborating improves playback performance.
Chunk selection policy:
We used the following assumptions to simplify our equation:
 Assume all peers are alike, so P(i) is the same for each
peer K; therefore: Pr[WANT(K,i)] = 1 − P(i).
 Assume large enough number of peers so that knowing the state
of one does not significantly affect the probability of the state of
another peer, therefore:
Pr[HAVE(H,i)|WANT(K,i)] ≈ Pr[HAVE(H,i)]=P(i).
 Assume chunks are independently distributed in the network, so
the probability distribution for position i is not strongly affected
by the knowledge of the state of other positions, therefore:
Pr[SELECT(H,K,i)|WANT(K,i) ∩ HAVE(H,i)] ≈
Pr[SELECT(H,K,i)] = SELECT(i).
Chunk selection strategies:
 Rarest first: select the chunk that has the fewest copies
in the system.
A well-known strategy; gives good scalability (continuity).
 Greedy: fill the empty buffer locations closest to the
playback time first.
Low startup latency, but fares poorly in continuity.
Greedy:
Peer K will want to select the chunk closest to the
playback deadline that it doesn't already have.
If B(n) is empty, it will select it; if B(n) is filled, it
will select B(n−1), and so on…
This means peer K will select to download a chunk to B(i) if
no chunk can be downloaded to B(j) for every j with n > j > i,
and K was not distributed a chunk by the server.
Greedy:
Pr[chunk isn't downloaded to B(j)]
= 1 − Pr[WANT(K,j) ∩ HAVE(H,j)]
= Pr[K has j] + Pr[K doesn't have j] · Pr[H doesn't have j]
= Pk(j) + (1 − Pk(j))(1 − Ph(j)) = P(j) + (1 − P(j))²
Therefore: SELECTG(i) = (1 − P(1)) · Π j=i+1…n−1 [P(j) + (1 − P(j))²]
Proposition: SELECTG(i) ≈ 1 − (P(n) − P(i+1)) − P(1)
For the proof, see Proposition 1 in the paper.
Rarest first:
For i = 1,…,n−1: P(i+1) ≥ P(i), so the expected number of
copies of the chunk for B(i+1) is greater than or equal to
the expected number of copies of the chunk for B(i).
This means peer K will want to download a chunk to B(1)
unless B(1) is already filled; in that case K will select
B(2), and so on…
Rarest first:
Peer K will select to download a chunk to B(i) if no chunk
can be downloaded to B(j) for every 1 ≤ j < i, and K was not
distributed a chunk by the server.
Therefore: SELECTRF(i) = (1 − P(1)) · Π j=1…i−1 [P(j) + (1 − P(j))²]
Proposition: SELECTRF(i) ≈ 1 − P(i)
For the proof, see Proposition 2 in the paper.
Results:
We get the following difference equations:
 Rarest First:
P(i+1) ≈ P(i) + P(i) [1-P(i)]²
 Greedy:
P(i+1) ≈ P(i) + P(i) [1-P(i)] [1-P(1)-P(n)+P(i+1)]
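As a sanity check, the Rarest First equation can be iterated forward directly; here is a minimal Python sketch (my code, with example parameter values):

    # Iterate the Rarest First difference equation from P(1) = 1/M.
    M, n = 1000, 40                    # example values

    P = [0.0] * (n + 1)                # 1-indexed: P[i] approximates P(i)
    P[1] = 1.0 / M                     # only the server fills B(1)
    for i in range(1, n):
        P[i + 1] = P[i] + P[i] * (1.0 - P[i]) ** 2

    print(f"Rarest First continuity P(n) = {P[n]:.4f}")

The Greedy equation is implicit – P(i+1) and P(n) both appear on the right-hand side – so it needs the numeric procedure of the discrete model described below.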
Evaluating performances:
We already have Continuity: P(n).
Another performance metric is start up latency, the
time a peer should wait before starting playback.
Assuming all other peers have already reached steady
states, we argue that a newly arriving peer should wait
until its buffer has reached steady state as well.
Start up latency
Assumption: when a peer K starts with an empty buffer,
every other peer H it contacts is likely to have a chunk
for it to download, so K fills roughly one buffer position
per time slot.
Therefore, after Σ i=1…n P(i) time slots, K is expected to
have acquired the same number of chunks as H, which is also
Σ i=1…n P(i) (the expected number of filled positions in
steady state). Thus we have:
Start up latency = Σ i=1…n P(i).
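Continuing the Rarest First sketch above (my code; example values), the start-up latency is just the sum of the steady-state occupancies:

    # Start-up latency = expected number of filled buffer positions in
    # steady state = the sum of P(i) over i, measured in time slots.
    M, n = 1000, 40
    P = [0.0] * (n + 1)
    P[1] = 1.0 / M
    for i in range(1, n):
        P[i + 1] = P[i] + P[i] * (1.0 - P[i]) ** 2

    print(f"Start-up latency is about {sum(P[1:]):.1f} time slots")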
Start up latency
Is there a problem here with the Greedy strategy?
Assume the start up latency for the Greedy strategy is 4
time slots, and that in the second time slot the peer
successfully downloads a chunk to position B(n−1).
Since positions shift by 1 each time slot, by the time
playback starts this chunk has already become obsolete.
Start up latency
Suggested solution:
Assume the start up latency for the Greedy strategy is Sg.
Then Greedy should download chunks only to buffer positions
below n − Sg in its first attempt, to positions below
n − (Sg − 1) in its second attempt, and so on…
until playback starts.
Models for comparison:
 Discrete model – the solution for the buffer state
distribution P(i) is derived numerically.
For the Greedy strategy we first give P(n) a fixed value,
substitute n steps inversely from P(n) to P(1), and compare
the resulting P(1) with 1/M.
If P(1) ≈ 1/M then we have the solution; otherwise P(n) is
adjusted accordingly and the inverse substitution process
is repeated.
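A sketch of this inverse-substitution idea in Python (my own illustration; M and n are example values, and the bisection assumes the backward map is monotone in the guessed P(n)):

    import math

    M, n = 1000, 40
    p1 = 1.0 / M                            # P(1) is pinned to 1/M

    def backward_p1(pn):
        """Given a guess for P(n), step the Greedy equation backwards to P(1)."""
        p_next = pn
        for _ in range(n - 1):
            c = max(1.0 - p1 - pn + p_next, 1e-12)   # SELECT_G term, guarded
            # Greedy update rearranged: c*x^2 - (1+c)*x + P(i+1) = 0 for x = P(i);
            # the smaller root is the one lying in [0, P(i+1)].
            disc = (1.0 + c) ** 2 - 4.0 * c * p_next
            p_next = ((1.0 + c) - math.sqrt(disc)) / (2.0 * c)
        return p_next

    # Adjust the guess by bisection (assumption: backward_p1 grows with pn).
    lo, hi = p1, 1.0 - 1e-9
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if backward_p1(mid) > p1:
            hi = mid                        # implied P(1) too large: guess too high
        else:
            lo = mid                        # implied P(1) too small: guess too low
    print(f"Greedy continuity P(n) is about {lo:.4f}")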
Models for comparison:
 Continuous model – generally solved numerically using
MATLAB.
 Simulation model – based on the discrete model.
o One server and M peers.
o In each time slot the server distributes one chunk to a
random peer.
o Each peer selects one random peer to download one chunk
from, and can upload to at most two peers.
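A compact simulation sketch of this model with the Rarest First policy (my own illustration; the upload cap of 2 and random selection follow the slides, while the parameter values and warm-up length are assumptions):

    import random

    M, n = 200, 40                     # peers, buffer size (example values)
    SLOTS, WARMUP = 2000, 500          # simulated slots, slots ignored in stats
    random.seed(1)

    # buf[k][0] is B(1) (newest position); buf[k][n-1] is B(n) (plays now)
    buf = [[False] * n for _ in range(M)]
    hits = total = 0

    for t in range(SLOTS):
        # playback: record whether B(n) is filled, then shift all positions
        for b in buf:
            if t >= WARMUP:
                hits += b[-1]
                total += 1
            b.pop()
            b.insert(0, False)

        have = [b[:] for b in buf]          # slot-start state: no same-slot relay
        buf[random.randrange(M)][0] = True  # server delivers the newest chunk

        # P2P exchange: each peer contacts one random peer; an uploader
        # serves at most two requests per slot
        uploads = [0] * M
        order = list(range(M))
        random.shuffle(order)
        for k in order:
            h = random.randrange(M - 1)
            if h >= k:
                h += 1                      # a random peer other than k
            if uploads[h] >= 2:
                continue                    # uploader already at capacity
            for i in range(n):              # Rarest First: try B(1), B(2), ...
                if not buf[k][i] and have[h][i]:
                    buf[k][i] = True
                    uploads[h] += 1
                    break

    print(f"Simulated continuity is about {hits / total:.3f}")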
Rarest first vs. Greedy:
M = 1,000, n = 40:
[figure omitted]
Mixed strategy:
Trying to achieve Rarest first’s continuity (or even
better) with lower start up latency.
Always try to collaborate with others in order to improve
performance, but if you can't find a "rare" chunk to
download and positions closer to the playback deadline are
still unfilled, why not try to fill them?
Mixed strategy:
 Partition the buffer at point m.
 First, try to download a chunk using Rarest first to
positions 1…m.
 If no chunk can be downloaded using Rarest first, use
Greedy for positions m+1…n.
Mixed strategy:
Difference equations:
 For B(1) to B(m) same as Rarest first:
P(1) = 1/M.
P(i+1) ≈ P(i) + P(i) [1-P(i)]²
 For B(m+1) to B(n), same as Greedy but with P(m) in place
of P(1):
P(i+1) ≈ P(i) + P(i) [1-P(i)] [1-P(m)-P(n)+P(i+1)]
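These equations can be solved the same way as the Greedy ones; here is a sketch (my code; m, n, M are the example values used later in the slides, and the bisection on P(n) is assumed valid):

    M, n, m = 1000, 40, 10

    def mixed_profile(pn_guess):
        P = [0.0] * (n + 1)
        P[1] = 1.0 / M
        for i in range(1, m):                  # B(1)..B(m): Rarest First
            P[i + 1] = P[i] + P[i] * (1.0 - P[i]) ** 2
        for i in range(m, n):                  # B(m+1)..B(n): Greedy part
            a = P[i] * (1.0 - P[i])
            # P(i+1) = P(i) + a*(1 - P(m) - P(n) + P(i+1)), solved for P(i+1)
            P[i + 1] = min((P[i] + a * (1.0 - P[m] - pn_guess)) / (1.0 - a), 1.0)
        return P

    lo, hi = 1.0 / M, 1.0                      # bisect until P(n) is self-consistent
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if mixed_profile(mid)[n] > mid:
            lo = mid
        else:
            hi = mid

    P = mixed_profile(lo)
    print(f"Mixed: continuity {P[n]:.4f}, start-up about {sum(P[1:]):.1f} slots")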
Rarest first vs. Greedy vs. Mixed:
We keep buffer size at n = 40 and we set m = 10.
This means for the mixed strategy that buffer positions 1
to 10 are running Rarest first while buffer positions 11 to
40 are running Greedy.
We’ll present results from the discrete model.
Rarest first vs. Greedy vs. Mixed:
[figures omitted]
Optimizing the Mixed Strategy:
Fix n = 40 and vary m:
[figure omitted]
Optimizing the Mixed Strategy:
Adapting to peer population:
One way is to observe the value of P(m):
 Set a target value, say Pm = 0.3.
 When a peer finds the average of P(m) to be less than Pm,
it increases m; otherwise it decreases m.
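A hedged sketch of this adaptation rule (illustrative only; how a peer estimates the average of P(m) in practice is an assumption here):

    # One adaptation step for the partition point m, following the rule
    # above (the peer averages P(m) over a window of slots, then adjusts).
    TARGET_PM = 0.3        # target occupancy at the partition point

    def adapt_m(m, avg_pm, n):
        if avg_pm < TARGET_PM:
            m += 1         # B(m) rarely filled: enlarge the Rarest First region
        else:
            m -= 1         # B(m) usually filled: shrink it
        return max(1, min(m, n - 1))   # keep the partition inside the buffer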
Optimizing the Mixed Strategy:
Simulation:
 Initial value of m is 10.
 Calculate average of P(m) for 20 time slots and then
set new value of m.
 Initially M = 100 and every 100 time slots add another
100 peers (with empty buffers).
Optimizing the Mixed Strategy:
[figure omitted]
Summary:
 We first saw two different selection strategies:
• Rarest first:
Good continuity but higher start up latency.
• Greedy:
Low start up latency but poor continuity.
 We then introduced a mixed strategy.
• We showed it achieves better continuity than both of the
above and lower start up latency than Rarest first.
• We showed how to optimize the Mixed strategy.
Summary:
Application to real world protocols:
 The Mixed strategy can be viewed as an optimization of
the CoolStreaming protocol. The presented model does not
try to capture all aspects of the implementation of
CoolStreaming; however, the chunk selection strategy can
be incorporated.
 The Mixed strategy is also compatible with BiToS and can
be viewed as an alternative to it.
Think…
In this version of mixed we choose to use Rarest first
initially and use Greedy only if no chunk is found using
the Rarest first strategy for positions 1 to m.
What about the other way around?
Generally collaborate with others in order to improve
performance; however, if for some reason positions closer
to the playback deadline remain unfilled, try to fill them
first.
Paper:
A Simple Model for Analyzing P2P Streaming
Protocols.
 Yipeng Zhou, Dah Ming Chiu
Information Engineering Department
The Chinese University of Hong Kong
 John C.S. Lui
Department of Computer Science & Engineering
The Chinese University of Hong Kong