Performance of short-lived TCP
flows
Per Hurtig
Karlstad University
Motivation
• TCP seeks to maximize throughput, which is
ideal for bulk transfers
• Short-lived or less data-intensive flows,
however, have other requirements
– mainly low latency
• In this presentation we look at two mechanisms
that work poorly with low-latency requirements
– TCP loss recovery
– TCP metric caching
Loss recovery
• TCP has two mechanisms to detect and
recover from loss
– Fast retransmit (FR)
– Retransmission by Timeout (RTO)
• Fast retransmit is preferred
– Detects loss faster
– Slows the transfer down less
• The RTO mechanism is a last resort
Loss recovery
• In some situations, it’s not possible to use FR
– When data does not arrive at all (severe congestion)
– or when no new data is available (short-lived flows)
– ...
• Short-lived flows might be forced to use RTO
although congestion isn’t severe
– Unfortunately, the RTO management algorithm
delays the timeout even further
RTO management
The following is the RECOMMENDED algorithm for managing the
retransmission timer:
(5.1) Every time a packet containing data is sent (including a retransmission),
if the timer is not running, start it running so that it will expire after RTO
seconds (for the current value of RTO).
(5.2) When all outstanding data has been acknowledged, turn off the
retransmission timer.
(5.3) When an ACK is received that acknowledges new data, restart the
retransmission timer so that it will expire after RTO seconds (for the
current value of RTO).
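Rules (5.1) to (5.3) can be sketched as a small timer model. This is a minimal illustration, not a real TCP stack: the class name, the integer millisecond clock, and the example values (RTO = 1000 ms, RTT = 200 ms) are assumptions chosen for the example.

```python
class RetransmissionTimer:
    """Toy model of rules (5.1)-(5.3); all times are integer milliseconds."""

    def __init__(self, rto_ms):
        self.rto_ms = rto_ms
        self.expires_at = None  # None means the timer is not running

    def on_data_sent(self, now_ms):
        # (5.1) start the timer only if it is not already running
        if self.expires_at is None:
            self.expires_at = now_ms + self.rto_ms

    def on_ack(self, now_ms, all_data_acked):
        if all_data_acked:
            # (5.2) nothing outstanding: turn the timer off
            self.expires_at = None
        else:
            # (5.3) new data acknowledged: restart for a full RTO
            self.expires_at = now_ms + self.rto_ms


# Assumed scenario: two segments sent back to back at t = 0,
# the ACK for the first one arrives one RTT (200 ms) later.
timer = RetransmissionTimer(rto_ms=1000)
timer.on_data_sent(now_ms=0)    # segment 1 starts the timer
timer.on_data_sent(now_ms=0)    # segment 2: timer already running
timer.on_ack(now_ms=200, all_data_acked=False)  # restart per (5.3)
print(timer.expires_at)
```

In this scenario, if the second segment was lost, the timer fires 1200 ms after the segment was sent, that is, after RTO + RTT rather than after RTO.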
Problem
Proposed solution
• The simple solution is to remove the offset
– “When the RTO is restarted, remove the time
elapsed since the earliest outstanding segment
was sent”
• The difficulty with this approach is deciding
when to apply it
– For every restart of the RTO?
– Only when we can’t use FR?
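The offset removal itself can be sketched as a single calculation. This is a minimal sketch under stated assumptions: the function name, the millisecond units, and the choice to clamp the remaining time at zero are illustrative and not taken from the slides.

```python
def restart_timeout(now_ms, rto_ms, earliest_outstanding_sent_ms):
    """Return the absolute expiry time when the timer is restarted
    with the time already spent waiting removed."""
    elapsed = now_ms - earliest_outstanding_sent_ms
    # Assumption: never arm the timer in the past; expire immediately instead.
    return now_ms + max(rto_ms - elapsed, 0)


# Assumed values: RTO = 1000 ms, ACK arrives at t = 200 ms,
# earliest outstanding segment was sent at t = 0.
standard = 200 + 1000  # plain restart: a full RTO from now
adjusted = restart_timeout(now_ms=200, rto_ms=1000,
                           earliest_outstanding_sent_ms=0)
print(standard, adjusted)
```

With these numbers the adjusted timer expires one RTO after the earliest outstanding segment was sent, while the plain restart expires one RTT later.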
Conditions
• Always doing it is both unnecessary and risky
– Unnecessary RTOs (instead of FRs)
– ...
• We only do it if
– the number of outstanding segments is less than
four
– and either there is no unsent data ready for
transmission or the receiver’s advertised window
blocks us
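The two conditions can be written as a small predicate. This is a hedged sketch: the function and parameter names are hypothetical, and it reads the second condition as "no unsent data, or blocked by the receiver's window".

```python
def should_remove_offset(outstanding_segments, unsent_bytes_ready,
                         receiver_window_blocks, threshold=4):
    """Decide whether the timer restart should remove the offset."""
    # The sender cannot put more data on the wire right now.
    data_limited = unsent_bytes_ready == 0 or receiver_window_blocks
    # Fewer than four outstanding segments: too few to expect
    # enough duplicate ACKs for fast retransmit.
    return outstanding_segments < threshold and data_limited


print(should_remove_offset(2, 0, False))     # short flow at its tail
print(should_remove_offset(10, 0, False))    # enough segments for FR
print(should_remove_offset(3, 4000, False))  # more data can still be sent
```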
Results
Conclusions
• The retransmission timer often takes RTO + RTT
to fire
• This prolongs loss recovery for data-limited
flows (which often have latency requirements)
• By taking the earliest outstanding segment into
consideration, it’s possible to retransmit after
one RTO
• We only do this when FR can’t be used
TCP metric caching
• TCP uses a number of variables to describe the
state of a connection
– Round-trip time(s)
– Congestion window
– Slow-start threshold
– ...
• These variables are typically used for normal
TCP operations (loss recovery, congestion
control)
TCP metrics
• Some operating systems also share and/or
save such state information to optimize other
connections
– e.g. “congestion manager”
– state caching
• Linux, for instance, uses state caching to
optimize new connections to the same
destination
– To quickly converge to the “correct” state
ssthresh example
• Whenever a connection is closed, the ssthresh
is cached
• The cached value is typically set to
– max(cwnd/2, ssthresh)
– this assumes that cwnd and ssthresh reflect
congestion losses
• For short flows, this is inappropriate
– No time to grow cwnd, random loss, ...
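The caching rule above can be illustrated with a few lines of code. This is a sketch of the formula on the slide only, not of any kernel implementation; the function name, the use of whole segments as the unit, and the example values are assumptions.

```python
def cached_ssthresh(cwnd, ssthresh):
    """Value stored for the destination when a connection closes
    (both arguments in segments)."""
    return max(cwnd // 2, ssthresh)


# Long flow that probed the path: the cached value reflects real capacity.
print(cached_ssthresh(cwnd=80, ssthresh=40))

# Short flow hit by one random loss while its window was still tiny:
# the cached value is small although the path was never congested.
print(cached_ssthresh(cwnd=2, ssthresh=2))
```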
ssthresh example
• To illustrate the problem, we transmitted 1000
consecutive short flows
– Experiments were conducted using emulation
– 10 Mbps bandwidth, 50 ms delay, 0–5% random packet loss
• The next two slides show an example of what
happens with the cache turned on and off
Caching off
Caching on
What happens!?
• After a loss, the ssthresh becomes very small
– Congestion window is never large for these flows
• New flows, to the same destination, will start
in congestion avoidance
• The flows are too short to leave this state (not
enough time to grow cwnd)
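The effect of starting in congestion avoidance can be sketched with a simplified, loss-free window model. All of it is an assumption made for illustration: the function name, an initial window of 3 segments, doubling per RTT in slow start, and +1 segment per RTT in congestion avoidance.

```python
def rtts_to_send(total_segments, ssthresh, initial_cwnd=3):
    """Count the round trips needed to send total_segments without loss."""
    cwnd = initial_cwnd
    sent = 0
    rtts = 0
    while sent < total_segments:
        sent += cwnd          # send one full window per round trip
        rtts += 1
        if cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)  # slow start
        else:
            cwnd += 1                       # congestion avoidance
    return rtts


print(rtts_to_send(60, ssthresh=64))  # no small cached value: slow start
print(rtts_to_send(60, ssthresh=2))   # small cached value: starts in CA
```

In this model a 60-segment flow needs 5 round trips when it can use slow start and 9 round trips when a cached ssthresh of 2 forces it into congestion avoidance from the first window.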
Difference (on/off)
Conclusions/Future work
• Metric caching does not work well for short-lived
flows
• Long-lived flows could also suffer
– A “late” random loss
• We should find out when and what to cache
• We should also look at other operating
systems
– Do they do something similar?
– How do they do it?