Internet Stream Size Distributions

Nevil Brownlee
CAIDA, SDSC, UC San Diego and The University of Auckland, New Zealand

kc claffy
CAIDA, SDSC, UC San Diego

ABSTRACT

We present and discuss stream size and lifetime distributions for web and non-web TCP traffic on a campus OC12 link at UC San Diego. The distributions are stable over long periods, and show that on this link only 3% of the streams last longer than one minute, and that only about 0.5% of them are bigger than 100 kBytes. Although there are large streams (elephants) on this link, the bulk of its traffic is composed of many small streams (mice).

1. INTRODUCTION

Many studies of network stream size distributions have been published. Downey [1] presented a model that explains the distribution of file sizes found both in computer systems and in the World Wide Web. Zhang, et al [2] observe that “Internet traffic is now dominated by mice, i.e. small objects 10-20 kB in size; the average web document is only around 30 kB,” but in contrast report that “the majority of the packets and bytes belong to elephants.” In this paper we present more detailed measurements of stream size and lifetime distributions, allowing us to comment on the stability of the distributions from minute to minute over periods of an hour or more.

2. MEASURING DISTRIBUTIONS

We observe streams in real time using the methodology described by Brownlee & Murray [3]. We use a NeTraMet meter, modified to observe streams within a flow, to collect data on stream size and lifetime distributions. For this study we use a ruleset (meter configuration file) that produces separate flows for various kinds of traffic, in particular web, non-web TCP and UDP. Each stream within a flow is monitored by the NeTraMet meter. When a stream times out, the meter knows its size in packets, size in bytes, and (active) lifetime in microseconds. Each stream’s data is used to add a point to its flow’s size and lifetime distributions.
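As a rough illustration of the bookkeeping described above, the following Python sketch tracks one stream, decides when it has timed out, and adds its byte size to a log-scale distribution. It is not NeTraMet code: the class and function names, the decade-wide bins, and the handling of single-packet streams are assumptions made for illustration. The timeout test follows one reading of the dynamic timeout rule used in this study (a minimum inactive time of two seconds, after which a stream times out once it has been idle for 10 times its average packet interarrival time).

```python
import math

MIN_INACTIVE_S = 2.0   # minimum inactive time, in seconds
TIMEOUT_FACTOR = 10    # multiplier applied to the average interarrival time


class Stream:
    """Per-stream counters (hypothetical structure, not NeTraMet's)."""

    def __init__(self, now, size):
        self.first_seen = now
        self.last_seen = now
        self.packets = 1
        self.octets = size

    def add_packet(self, now, size):
        self.packets += 1
        self.octets += size
        self.last_seen = now

    def lifetime(self):
        """Active lifetime: time between first and last packet."""
        return self.last_seen - self.first_seen

    def timed_out(self, now):
        idle = now - self.last_seen
        if idle < MIN_INACTIVE_S:
            return False
        if self.packets < 2:
            # Assumption: with no interarrival time to average, a
            # single-packet stream times out after the minimum alone.
            return True
        avg_gap = self.lifetime() / (self.packets - 1)
        return idle >= TIMEOUT_FACTOR * avg_gap


def log_bin(value, lo, hi, n_bins):
    """Index of the log-spaced bin for value; index n_bins is overflow."""
    if value < lo:
        return 0
    if value >= hi:
        return n_bins
    frac = math.log10(value / lo) / math.log10(hi / lo)
    return min(int(frac * n_bins), n_bins - 1)


class SizeDistribution:
    """Stream sizes in bytes on a log scale from 40 bytes to 400 kB."""

    def __init__(self, lo=40.0, hi=400_000.0, n_bins=4):
        self.lo, self.hi, self.n_bins = lo, hi, n_bins
        self.counts = [0] * (n_bins + 1)   # last entry is the overflow bin

    def add(self, octets):
        self.counts[log_bin(octets, self.lo, self.hi, self.n_bins)] += 1

    def percentages(self):
        total = sum(self.counts)
        if total == 0:
            return [0.0] * len(self.counts)
        return [100.0 * c / total for c in self.counts]
```

In use, a meter loop would call `add_packet` for each packet, test `timed_out` periodically, and pass `stream.octets` to `SizeDistribution.add` when the test succeeds. A real meter would use many more bins than the four decades chosen here.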
Support for this work is provided by DARPA NGI Contract N66001-98-2-8922, NSF Award NCR-9711092 ‘CAIDA: Cooperative Association for Internet Data Analysis,’ and The University of Auckland.

[Figure 1: Stream lifetime plots (% of total) for all traffic, covering 40 minutes from when metering began. Plot title: OC12, Mon 11 Mar 02, FlowTime for whole torrent. Axes: stream lifetime (minutes), time (HHMM), % streams.]

NeTraMet’s dynamic timeout algorithm is similar to the one described in [4]. It uses a minimum inactive time of two seconds, after which a flow is timed out if it remains inactive for a period equal to its average packet interarrival time multiplied by a factor of 10. We also tested the meter using a factor of 100, and compared the percentage of streams timing out during each minute after a two-hour run. With the 100 factor, 96.5% of streams lasted one minute or less, compared with 97.9% when using the factor of 10, a difference of 1.4%. For longer streams the effect was smaller, 0.6% for 2-minute streams and 0.2% for 5-minute streams. Overall the slight improvement in measuring stream lifetimes is arguably not worth the additional processing and memory overhead in the meter.

We read flow data (including all the distributions) from our meters at one-minute intervals; the ‘times of day’ when distributions were read from the meter appear on the figures below as 24-hour times.

3. METER LOCATIONS

We used NeTraMet to make measurements on high-speed network links at three experimental sites. At our OC3 and OC48 sites more than a third of the traffic by byte is web streams. The main difference between these two sites is that the OC48 link carries about 800 Mb/s whereas the OC3 link carries only about 8 Mb/s. However, our OC12 link is markedly different: its 120 Mb/s load is dominated by non-web TCP streams.
In this paper we concentrate on our OC12 link at UC San Diego; we will address the differences in traffic among the OC3, OC12 and OC48 links in future work.

4. STREAM LIFETIMES

We begin by examining the stream lifetime distribution for the total traffic, i.e. the torrent, on the OC12 link. Figure 1 shows distributions of stream lifetime for 40 consecutive minutes from 0932, the time when we began metering. The stream lifetime distribution for each minute shows the percentage of streams with lifetimes of 1 to 30 minutes. During the first 30 minutes the maximum observed lifetime increased, reaching the 30th bin after half an hour, i.e. at 1002. After that time the rightmost edge of the plot shows the overflow bin; a few streams continue to time out after 30 minutes.

After the initial 30 minutes, the distributions continue similarly. Each minute, a few streams time out with lifetimes between 10 and 30 minutes. The most surprising feature of Figure 1 is that nearly 97% of all streams on this link last only one minute or less, and that fewer than 1% last more than five minutes.

5. STREAM SIZES IN BYTES

We now explore the stream size distributions for different kinds of traffic. Figures 2 and 3 show per-minute distributions for a typical hour. Each minute’s distribution shows the percentage of streams of various sizes, using a log scale from 40 bytes (0.04 kB) to 400 kilobytes. We observe that most streams on this link are counted in the lowest bin; our z-axis scale runs from 0 to only 5% of streams, in order to reveal detail for streams larger than 40 bytes.

[Figure 2: Stream size plots (% of total) for web traffic. Plot title: OC12, Mon 11 Mar 02, FromFlowOctets, web streams. Axes: stream size (kBytes), time (HHMM), % streams.]

We have examined cumulative distributions to determine what proportion of streams lie in different size ranges. For web streams: 87% are under 1 kB in size; 8% are between 1 and 10 kB; and 4.8% are between 10 and 100 kB. For non-web streams these figures are 89%, 7% and 1.5%, respectively, i.e. non-web TCP traffic has slightly more small streams, but significantly fewer streams between 10 and 100 kB. Figures 2 and 3 show this effect clearly; both figures show a steep fall toward 1 kB, but whereas web traffic drops from 2% to 0.2% as stream size increases above 10 kB, the non-web traffic streams fall from around 0.8% at 1 kB down to near zero for sizes above 10 kB. Another striking feature of these plots is that they change little over time; their basic shapes remain much the same over the whole hour shown on the figures.

[Figure 3: Stream size plots (% of total) for non-web TCP traffic. Plot title: OC12, Mon 11 Mar 02, FromFlowOctets, non-web streams. Axes: stream size (kBytes), time (HHMM), % streams.]

6. CONCLUSION

We have measured stream lifetime and byte size distributions at one-minute intervals for traffic on our OC12 link. On this link 87% of the streams are smaller than 1 kB, and only about 0.5% are bigger than 100 kB. This suggests that although there are large streams (elephants) on this link, the bulk of its traffic is composed of small streams (mice).

On this campus OC12 link, 97% of streams last one minute or less, and only about 1% of them live longer than 5 minutes. We believe that a meter-reading interval of one minute (as used in this study) yields valid and useful data about short-term behaviour of stream distributions. Where more detail of longer-running streams is required, five-minute readings should prove effective.

Finally, note that the data for our distributions is collected in real time. Our NeTraMet meter performs data reduction and produces flow data files directly; our figures are generated by simple perl scripts using that flow data.
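As a rough illustration of this kind of post-processing, the sketch below computes the percentage of streams falling in each size range (under 1 kB, 1 to 10 kB, 10 to 100 kB, above 100 kB) and the corresponding cumulative percentages. It is written in Python rather than perl, the function names are hypothetical, and it takes a plain list of stream sizes as input, which is an assumption; it does not read the NeTraMet flow data file format.

```python
from bisect import bisect_right

# Range boundaries in bytes: 1 kB, 10 kB and 100 kB.
EDGES_BYTES = [1_000, 10_000, 100_000]


def range_percentages(sizes_bytes, edges=EDGES_BYTES):
    """Percentage of streams in each range delimited by ascending edges.

    Entry 0 counts sizes below edges[0]; the last entry counts sizes at
    or above edges[-1].
    """
    if not sizes_bytes:
        raise ValueError("no streams to summarise")
    counts = [0] * (len(edges) + 1)
    for size in sizes_bytes:
        counts[bisect_right(edges, size)] += 1
    total = len(sizes_bytes)
    return [100.0 * c / total for c in counts]


def cumulative(percentages):
    """Running total of a list of percentages."""
    out, running = [], 0.0
    for p in percentages:
        running += p
        out.append(running)
    return out
```

For example, with the synthetic input `[40] * 8 + [5_000, 50_000]` (made-up values, not measurements from this study), `range_percentages` returns `[80.0, 10.0, 10.0, 0.0]`.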
Real-time data reduction in the meter is well suited to ongoing monitoring applications where there is no desire to generate, store or process large amounts of packet trace data.

7. REFERENCES

[1] A. B. Downey, The structural cause of file size distributions, MASCOTS Symposium, 2001, available at http://rocky.wellesley.edu/downey/filesize/

[2] Y. Zhang and L. Qiu, Understanding the End-to-End Performance Impact of RED in a Heterogeneous Environment, Cornell CS Technical Report 2000-1802, July 2000, available at http://www.aciri.org/floyd/red.html

[3] N. Brownlee and M. Murray, Streams, Flows and Torrents, PAM2001, April 2001, available at http://www.caida.org/outreach/papers/2001/StreamsFlowsTorrents/

[4] Bo Ryu, David Cheney and Hans-Werner Braun, Internet Flow Characterization - Adaptive Timeout and Statistical Modeling, PAM2001, April 2001