
Congestion Control and AQM
• Congestion Control, Efficiency and Fairness
• Analysis of TCP Congestion Control
– A simple TCP throughput formula
• RED and Active Queue Management
– How RED works
– Other AQM mechanisms
• TCP and ECN
Readings: please do the required readings, and the optional readings if interested
CSci5221: congestion control and AQM
Why Congestion Control
• Inefficiency and Congestion Collapse
– “self-interest” vs. “social welfare”
• Inefficiency: a simple “artificial” example
[Figure: sources 1 and 2 send through nodes x and y to destinations d1 and d2]
– source 1 (s1) rate λ1 = 100 kb/s; source 2 (s2) rate λ2 = 1000 kb/s
– link capacities: C1 = 100 kb/s, C2 = 1000 kb/s, C3 = 110 kb/s, C4 = 100 kb/s, C5 = 10 kb/s
– source 1 throughput μ1 = 10 kb/s (!); source 2 throughput μ2 = 10 kb/s
Assumption: when total offered traffic exceeds link capacity, all sources see their traffic reduced in proportion to their offered traffic (e.g., when FIFO is used)
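The throughput numbers in this example can be checked with a minimal sketch. It assumes a topology that the labels suggest but the extracted text does not state: each source enters over its own access link (C1, C2), both share link 3 (C3), and they then split onto link 4 (toward d1) and link 5 (toward d2). The function names are illustrative only.

```python
def share_link(offered, capacity):
    """Scale all flows down in proportion to their offered rates
    when their total exceeds the link capacity (the FIFO-like assumption)."""
    total = sum(offered)
    if total <= capacity:
        return list(offered)
    return [rate * capacity / total for rate in offered]


def throughputs(rate1, rate2):
    """Throughput of sources 1 and 2 in kb/s under the ASSUMED topology."""
    c1, c2, c3, c4, c5 = 100, 1000, 110, 100, 10  # link capacities from the example
    # Access links: each source is alone on its own link.
    (r1,) = share_link([rate1], c1)
    (r2,) = share_link([rate2], c2)
    # Link 3 is assumed to carry both flows.
    r1, r2 = share_link([r1, r2], c3)
    # Final links toward d1 and d2.
    (m1,) = share_link([r1], c4)
    (m2,) = share_link([r2], c5)
    return m1, m2


print(throughputs(100, 1000))  # source 2 sends greedily
print(throughputs(100, 10))    # source 2 limits itself to what link 5 can carry
```

Under these assumptions the first call gives 10 kb/s to each source (20 kb/s in total), while the second gives 100 kb/s and 10 kb/s. Source 2 gains nothing by sending faster than 10 kb/s, but doing so costs source 1 most of its throughput.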
Why Congestion Control (cont’d)
• Congestion Collapse: another “artificial” example
– each source i traverses two links: link i and link i+1
– source i input rate: λi
– actual rate on link i:
[Figure: source i, nodes i and i+1, links i-1, i, and i+1]
Fairness
• Consider a simple scenario:
– N users want to transmit data over a link of bandwidth C
– Each user i wants bandwidth ri
– If Σ ri < C, no problem!
– But if Σ ri > C, what should we do?
• Suppose all users are of equal “importance”:
– To be fair, allocate the same share, C/N, to each user
– Ok, if ri > C/N for all i; but what if there exists some ri with ri < C/N?
• i.e., some users want less than their “fair share”
• how do we allocate the “residue” bandwidth of these users?
– the “Fair Queuing” algorithm: WFQ where wi =1 for all i’s
• If not all users are equal: importance denoted by wi
– weighted fair queueing
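The equal-importance case (wi = 1 for all i) on a single link can be sketched as follows: users who want less than the equal split get what they ask for, and their residue is redistributed among the rest. This is a minimal illustration of the allocation only, not of a packet-level Fair Queuing scheduler; the function name and the example demands are made up.

```python
def fair_shares(demands, capacity):
    """Equal-weight fair sharing of one link among users with the given demands."""
    alloc = [0.0] * len(demands)
    remaining = float(capacity)
    # Serve users in order of increasing demand.
    pending = sorted(range(len(demands)), key=lambda i: demands[i])
    while pending:
        share = remaining / len(pending)  # equal split of what is left
        i = pending[0]
        if demands[i] <= share:
            # Smallest remaining demand fits: satisfy it fully, recycle the residue.
            alloc[i] = float(demands[i])
            remaining -= demands[i]
            pending.pop(0)
        else:
            # Everyone left wants more than the equal split: cap them all at it.
            for j in pending:
                alloc[j] = share
            break
    return alloc


print(fair_shares([2, 8, 10], 12))  # → [2.0, 5.0, 5.0]
```

With C = 12 and N = 3 the nominal share is 4; the user asking for 2 leaves a residue of 2, which raises the other two users to 5 each.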
Max-Min Fairness
• Network scenario: A simple line network example
[Figure: line network; link i has bw Ci; user 0 and user i]
• First, maximizing throughput at each router/link (or total network throughput) may lead to unfair bw allocation
• How to allocate bw fairly?
– let xij be a “feasible” bw share of user i at link j
– Bw allocation to user i: xi = minj xij
– Ideally, we want to max xi = max minj xij, for all users
• Max-min fairness:
– Let x = {xi} be a feasible bw allocation vector (bav); it is max-min fair if
• for any other feasible bav y, if yi > xi, then there exists j such that xj <= xi and yj < xj
– Unfortunately, such max-min fair bav may not always exist!
Fairness (cont’d)
• (Abstract) Network model: S sources and L links (link l has capacity cl)
– Al,s (routing matrix): fraction of traffic of source s on link l
– feasible (rate) allocation:
– formal def of “bottleneck” link (with respect to source s)
• Some Facts (Theorems)
– A feasible rate allocation is max-min fair if and only if every source has a bottleneck link
– Under this network model, and assuming a fixed routing matrix, there exists a unique max-min fair allocation!
– Fair Queueing implements max-min fairness
• Other definitions of fairness (optional material!)
– proportional fairness: there exists a unique prop. fair allocation!
– utility approach to fairness:
• “utility-fair”:
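The bottleneck characterization of max-min fairness suggests a constructive procedure, often called progressive filling: raise all rates together, and freeze a source as soon as one of its links saturates (that link is its bottleneck). The sketch below is a minimal illustration under simplifying assumptions that are not stated in the slides: the routing matrix is 0/1 (a source either uses a link fully or not at all), every source uses at least one link, and the names `routes` and `capacities` are hypothetical.

```python
def max_min_allocation(routes, capacities, eps=1e-9):
    """Progressive filling.

    routes[s] is the set of link indices used by source s (0/1 routing matrix);
    capacities[l] is the capacity of link l.
    """
    links = range(len(capacities))
    rates = [0.0] * len(routes)
    residual = [float(c) for c in capacities]
    active = set(range(len(routes)))  # sources whose rate can still grow
    while active:
        # Number of still-active sources crossing each link.
        load = [sum(1 for s in active if l in routes[s]) for l in links]
        # Largest equal increment before some link saturates.
        step = min(residual[l] / load[l] for l in links if load[l] > 0)
        for s in active:
            rates[s] += step
        for l in links:
            residual[l] -= step * load[l]
        # Freeze every source that crosses a newly saturated link.
        saturated = {l for l in links if load[l] > 0 and residual[l] <= eps}
        active = {s for s in active if not (routes[s] & saturated)}
    return rates


# Line network: source 0 crosses both links, sources 1 and 2 cross one link each.
print(max_min_allocation([{0, 1}, {0}, {1}], [10, 4]))  # → [2.0, 8.0, 2.0]
```

In the example, link 1 (capacity 4) saturates first and fixes sources 0 and 2 at rate 2; source 1 then takes the remaining 8 on link 0, so every source ends up with a bottleneck link.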
TCP Congestion Control Behavior
• TCP congestion control runs at end-hosts
– decrease sending rate when loss is detected, increase when there is no loss
• routers
– discard packets (or mark them if ECN is available) when congestion occurs
• interaction between end systems (TCP) and routers?
– want to understand (quantify) this interaction
[Figure: congested router drops packets]
Generic TCP CC Behavior: Additive Increase
Congestion Window Adaptation Algorithm (window W)
– up to W packets in network
– return of ACK allows sender to send another packet
– cumulative ACKs
• increase window by one per RTT
W := W + 1/W per ACK
=> W := W + 1 per RTT
• seeks available network bandwidth
• this ignores the “slow start” phase, during which the window is increased by one per ACK
W := W + 1 per ACK
=> W := 2W per RTT
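The two per-ACK rules and their per-RTT effect can be checked with a small sketch. It assumes an idealized round in which exactly W ACKs arrive per RTT (rounded down) and no packets are lost; the function names are illustrative.

```python
def congestion_avoidance_round(w):
    """One RTT of additive increase: each of the ~W ACKs adds 1/W."""
    for _ in range(int(w)):
        w += 1.0 / w
    return w


def slow_start_round(w):
    """One RTT of slow start: each of the ~W ACKs adds 1."""
    for _ in range(int(w)):
        w += 1.0
    return w


print(congestion_avoidance_round(10.0))  # a little under 11
print(slow_start_round(4.0))             # → 8.0
```

The additive-increase result is slightly below W + 1 because each ACK's increment is computed from the already enlarged window; "W := W + 1 per RTT" is the usual approximation.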
[Figure: sender/receiver timeline showing window W]
Generic TCP CC Behavior: Multiplicative Decrease
Congestion Window Adaptation Algorithm (window W)
• increase window by one per RTT
W := W + 1/W per ACK
• Packet loss: indication of congestion
• Upon receiving triple duplicate ACKs, decrease window by half, W := W/2
[Figure: sender/receiver timeline with a triple-duplicate-ACK (TD) loss indication]
Generic TCP CC Behavior: After Time-Out (TO)
Congestion Window Adaptation Algorithm (window W)
• increase window by one per RTT
W := W + 1/W per ACK
• halve window on detection of loss (triple duplicate ACKs): W := W/2
• timeouts due to lack of ACKs: window reduced to one, W := 1
[Figure: sender/receiver timeline with a timeout (TO)]
Generic TCP Behavior: Summary
Congestion Window Adaptation Algorithm (window W)
• increase window by one per RTT (or one over window per ACK, W := W + 1/W)
• halve window on detection of loss (triple duplicate ACKs), W := W/2
• timeouts due to lack of ACKs, W := 1
• successive timeout intervals grow exponentially (doubling), up to six times
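The summary rules map directly onto a small state machine. The following is a minimal sketch, not a real TCP implementation: the class and method names are invented, the floor of one packet after halving is an added assumption, and "up to six times" is read as "the timeout interval doubles at most six times" (i.e. it is capped at 64 times the base value).

```python
class CongestionWindow:
    """Toy model of the window rules summarized above."""

    def __init__(self, w=1.0, base_timeout=1.0, max_backoffs=6):
        self.w = w
        self.base_timeout = base_timeout
        self.max_backoffs = max_backoffs  # assumed reading of "up to six times"
        self.backoffs = 0                 # consecutive timeouts so far

    def on_ack(self):
        # Additive increase: 1/W per ACK, about +1 per RTT.
        self.w += 1.0 / self.w
        self.backoffs = 0

    def on_triple_dup_ack(self):
        # Multiplicative decrease (floor of 1 packet is an assumption).
        self.w = max(1.0, self.w / 2.0)

    def on_timeout(self):
        # Timeout: window back to one, timer backs off exponentially.
        self.w = 1.0
        self.backoffs = min(self.backoffs + 1, self.max_backoffs)

    def timeout_interval(self):
        return self.base_timeout * 2 ** self.backoffs


cw = CongestionWindow(w=16.0)
cw.on_triple_dup_ack()
print(cw.w)                   # → 8.0
cw.on_timeout()
print(cw.w, cw.timeout_interval())
```

Resetting the backoff counter on the next ACK is a simplification chosen for the sketch.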
Understanding TCP Behavior
• can simulate (ns-2)
+ faithful to operation of TCP
- expensive, time consuming
• deterministic approximations
+ quick
- ignore some TCP details; steady state only
• fluid models
+ transient behavior
- ignore some TCP details
TCP Throughput/Loss Relationship
[Figure: TCP window size vs. time (in RTTs); the window oscillates between W/2 and W, and a loss occurs at each peak]
Idealized model:
• W is the maximum supportable window size (then loss occurs)
• TCP window starts at W/2, grows to W, then halves, then grows to W, then halves…
• one window's worth of packets is sent each RTT
• to find: throughput as a function of loss and RTT
TCP Throughput/Loss Relationship
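# packets sent per “period” =
\[
\begin{aligned}
\frac{W}{2} + \left(\frac{W}{2}+1\right) + \dots + W
  &= \sum_{n=0}^{W/2} \left(\frac{W}{2} + n\right) \\
  &= \left(\frac{W}{2}+1\right)\frac{W}{2} + \sum_{n=0}^{W/2} n \\
  &= \left(\frac{W}{2}+1\right)\frac{W}{2} + \frac{(W/2)\,(W/2+1)}{2} \\
  &= \frac{3}{8}W^2 + \frac{3}{4}W \\
  &\approx \frac{3}{8}W^2
\end{aligned}
\]
[Figure: TCP window size vs. time (in RTTs); one “period” is the ramp from W/2 up to W]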
TCP Throughput/Loss Relationship
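# packets sent per “period”:
\[ \approx \frac{3}{8} W^2 \]
1 packet lost per “period” implies:
\[ p_{loss} \approx \frac{8}{3W^2} \quad \text{or:} \quad W = \sqrt{\frac{8}{3\,p_{loss}}} \]
\[ B = \text{avg. throughput} = \frac{3}{4}\,W \ \frac{\text{packets}}{\text{rtt}} \]
\[ B = \text{avg. throughput} = \frac{1.22}{\sqrt{p_{loss}}} \ \frac{\text{packets}}{\text{rtt}} \]
[Figure: TCP window size vs. time (in RTTs); one “period” is the ramp from W/2 up to W]
The throughput formula for B can be extended to model timeouts and slow start [PFTK’98] (see slide 59 for details)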
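As a numerical check of the derivation, the sketch below computes the exact number of packets in one sawtooth period and evaluates the throughput formula. It assumes W is even and that the RTT is given in seconds, so that B comes out in packets per second; the function names are illustrative.

```python
import math


def packets_per_period(w):
    """Exact packets in one period: windows W/2, W/2 + 1, ..., W (W even)."""
    half = w // 2
    return sum(half + n for n in range(half + 1))


def tcp_throughput(p_loss, rtt):
    """Average throughput in packets per second: sqrt(3/2) / (rtt * sqrt(p_loss))."""
    return math.sqrt(1.5) / (rtt * math.sqrt(p_loss))


w = 8
print(packets_per_period(w), 3 * w**2 / 8 + 3 * w / 4)  # exact sum vs. closed form
print(tcp_throughput(p_loss=0.01, rtt=0.1))             # about 122 packets/s
```

The constant 1.22 is sqrt(3/2), which follows from substituting W = sqrt(8 / (3 p_loss)) into B = (3/4) W / rtt.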
Drawbacks of FIFO with Tail-drop
• The congestion signal sometimes reaches the end system too late
– in particular, when RTT is large
• Buffer lock-out by misbehaving flows
• Synchronizing effect for multiple TCP flows
• Bursts of multiple consecutive packet drops
– Bad for TCP fast recovery
FIFO Router with Two TCP Sessions
Active Queue Management
• Dropping/marking of packets depends on the average queue length x: p = p(x)
• Advantages:
– signals end systems earlier
– absorbs bursts better
– avoids synchronization
• Examples:
– RED
– REM
– …
[Figure: marking probability p (levels 0, pmax, 1) vs. average queue length x (points tmin, tmax, 2tmax)]
RED: Parameters
• min_th – minimum threshold
• max_th – maximum threshold
• avg_len – average queue length
– avg_len = (1-w)*avg_len + w*sample_len
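The averaging rule is an exponentially weighted moving average, which deliberately reacts slowly to short bursts. A minimal sketch follows; the default weight w = 0.002 and the sample values are assumptions chosen for illustration, not values given in the slides.

```python
def update_avg_len(avg_len, sample_len, w=0.002):
    """EWMA of the queue length; w is the weight of the newest sample (assumed value)."""
    return (1 - w) * avg_len + w * sample_len


# A burst of 10 samples at queue length 100, starting from an empty average.
avg = 0.0
for _ in range(10):
    avg = update_avg_len(avg, 100)
print(avg)  # still close to 2, far below the instantaneous length of 100
```

A small w absorbs transient bursts without triggering drops, at the cost of a slower reaction to persistent congestion.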
[Figure: discard probability (0 to 1) vs. average queue length, with min_th, max_th, and queue_len marked on the axis]
RED: Packet Dropping
• If (avg_len < min_th) → enqueue packet
• If (avg_len >= max_th) → drop packet
• If (avg_len >= min_th and avg_len < max_th) → drop packet with probability P, otherwise enqueue it
[Figure: discard probability P (0 to 1) vs. average queue length, with min_th, max_th, and queue_len marked on the axis]
RED: Packet Dropping (cont’d)
• P = max_P*(avg_len - min_th)/(max_th - min_th)
• Improvement to spread the drops: P’ = P/(1 - count*P), where
– count: how many packets were consecutively enqueued since the last drop
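Putting the thresholds and the two formulas together, the per-packet decision can be sketched as below. This is a minimal illustration, not a reference RED implementation: P is treated as the discard probability, an average at or above max_th is treated as a forced drop, the guard for count*P >= 1 is an added assumption to keep P’ well defined, and the function names are invented.

```python
import random


def red_drop_probability(avg_len, min_th, max_th, max_p, count):
    """Discard probability P' for the current average queue length."""
    if avg_len < min_th:
        return 0.0
    if avg_len >= max_th:
        return 1.0
    p = max_p * (avg_len - min_th) / (max_th - min_th)
    if count * p >= 1.0:  # assumed guard: force a drop instead of dividing by <= 0
        return 1.0
    return p / (1.0 - count * p)


def red_should_drop(avg_len, min_th, max_th, max_p, count, rng=random.random):
    """Randomized drop decision; rng is injectable so the sketch can be tested."""
    return rng() < red_drop_probability(avg_len, min_th, max_th, max_p, count)


print(red_drop_probability(10, 5, 15, 0.1, count=0))   # base probability P
print(red_drop_probability(10, 5, 15, 0.1, count=10))  # larger after 10 enqueues
```

The count term raises the drop probability the longer it has been since the last drop, which spaces drops more evenly than independent coin flips would.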
[Figure: discard probability (levels 0, P, max_P, 1) vs. average queue length (points min_th, avg_len, max_th, queue_len)]
RED Router with Two TCP Sessions
Issues with RED
• Parameter sensitivity
– how to set min_th, max_th, and max_P
– Goal: maintain avg. queue size below the midpoint between min_th and max_th
• max_th needs to be significantly smaller than the max. queue size to absorb transient peaks
• max_P determines the drop rate
– In reality, these parameters are hard to set
• RED uses the avg. queue length, which may introduce a large feedback delay and lead to instability
Other AQM Mechanisms
• Adaptive RED (ARED)
• BLUE
• Virtual Queue
• Random Exponential Marking (REM)
• Proportional Integral Controller
• Adaptive Virtual Queue
– Improved AQMs are designed based on control theory to provide better, faster response to congestion and more stable systems
Explicit Congestion Notification (ECN)
• Standard TCP:
– Losses needed to detect congestion
– Wasteful and unnecessary
• ECN (RFC 3168):
– Routers mark packets instead of dropping them
– Receiver returns marks to sender in ACK packets
– Sender adjusts its window accordingly
• Need to make “changes” to both IP and TCP headers
– two bits (ECT and CE) in the IP header (“TOS” field)
– two bits (CWR and ECE) in the TCP header (two unused “reserved” bits in the standard TCP header)
• assumes cooperation of sender/receiver and routers
TCP with ECN: How It Works
• Use of ECT and CE bits (in “TOS” field) in the IP header
ECT  CE
0    0    Not-ECT: not ECN-capable (set by sender)
0    1    ECT(1): endpoints ECN-capable (set by sender)
1    0    ECT(0): endpoints ECN-capable (set by sender)
1    1    CE: set by routers to indicate congestion!
• Use of CWR and ECE in the TCP header
– Receiver: if CE, set ECE (ECN-Echo) bit in ACK
• What about receiver cheating?! See RFC 3540
– Sender: sets CWR in next data packet to ack receipt of ECE!
• why does the sender need to indicate receipt of an ECE-set ACK?
• During TCP set-up, sender and receiver negotiate ECN
– If using ECN, sender sets both CWR and ECE in the SYN packet
– receiver sets ECE (but not CWR) in the SYN-ACK, if also ECN-capable
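The signalling loop (router marks CE, receiver echoes ECE, sender reduces its window and answers with CWR) can be sketched as a toy model. This is an illustration of the message flow only, with several simplifying assumptions: the class and function names are invented, the sender halves its window on the first ECE it sees, and the once-per-RTT reaction rule and the SYN negotiation are not modelled.

```python
# Two-bit codepoints as (ECT bit, CE bit), following the table above.
NOT_ECT, ECT_1, ECT_0, CE = 0b00, 0b01, 0b10, 0b11


def router_forward(codepoint, congested):
    """Return the codepoint to forward, or None if the packet is dropped."""
    if not congested:
        return codepoint
    if codepoint == NOT_ECT:
        return None  # no ECN support: congestion is signalled by a drop
    return CE        # ECN-capable: mark instead of dropping


class Receiver:
    def __init__(self):
        self.echoing = False

    def on_data(self, codepoint, cwr):
        """Process a data packet; return the ECE flag for the ACK."""
        if cwr:
            self.echoing = False  # sender confirmed it reacted
        if codepoint == CE:
            self.echoing = True   # keep echoing until CWR arrives
        return self.echoing


class Sender:
    def __init__(self, w):
        self.w = w
        self.cwr_pending = False

    def on_ack(self, ece):
        if ece and not self.cwr_pending:
            self.w = max(1.0, self.w / 2.0)  # react as if a loss was detected
            self.cwr_pending = True

    def next_data_cwr(self):
        """CWR flag for the next data packet (set once after reacting)."""
        cwr, self.cwr_pending = self.cwr_pending, False
        return cwr
```

The receiver keeps ECE set on every ACK until it sees CWR, so the congestion signal survives the loss of an ACK; this is one answer to why the sender must acknowledge receipt of ECE.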
Generalization of TCP AIMD
• AIMD in Standard TCP:
– If no packet lost during the round, CongWin := CongWin + 1
– If packet losses encountered during the round, CongWin := CongWin/2
• Generalization of AIMD
– If no packet lost during the round, CongWin := CongWin + \alpha
– If packet losses encountered during the round, CongWin := CongWin/\beta
where \alpha >= 1 and \beta > 1
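The generalized rule is a one-line update per round. In the minimal sketch below, standard TCP corresponds to alpha = 1 and beta = 2; the function names and the example loss pattern are made up for illustration.

```python
def aimd_round(cong_win, loss, alpha=1.0, beta=2.0):
    """One round of generalized AIMD; defaults give standard TCP behavior."""
    if loss:
        return cong_win / beta
    return cong_win + alpha


def aimd_trace(cong_win, losses, alpha=1.0, beta=2.0):
    """Window after each round, given a per-round list of loss indications."""
    trace = []
    for loss in losses:
        cong_win = aimd_round(cong_win, loss, alpha, beta)
        trace.append(cong_win)
    return trace


print(aimd_trace(10.0, [False, False, True, False]))               # → [11.0, 12.0, 6.0, 7.0]
print(aimd_trace(10.0, [False, False, True, False], alpha=2.0, beta=4.0))
```

A larger alpha probes for bandwidth more aggressively, while a larger beta backs off more sharply after a loss.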
Other Issues/Extension of TCP
• When most flows are “short” (i.e., consist of one or a few packets), TCP slow start cannot ramp up fast enough to use the available bw
– persistent TCP connections
– TCP splitting via “TCP proxies”, etc.
• TCP and congestion issues in data centers
– e.g., the “TCP incast” problem, which may cause throughput collapse!
• Multipath TCP (MPTCP): esp. useful in data centers
• TCP under large delay-bandwidth product
– e.g., XCP: eXplicit Control Protocol (a “research” protocol; optional material follows!)