The Cyber House Rule

Computer Science
Generating Streaming Access Workload
for Performance Evaluation
Shudong Jin
3rd Year Ph.D. Student
(Advisor: Azer Bestavros)
Project Overview

This project aims to develop a Generator of Internet
Streaming Media Object access workloads (GISMO)

Why develop GISMO?
Streaming access in emerging Internet streaming applications
(e.g., video/audio on the Web) has unique characteristics:
- High bandwidth requirement
- Long duration (seconds to hours)
- Variable bit-rate (VBR) burstiness
- Timeliness and user-perceived quality are important
There is no streaming access workload generator
- Workload generation is important for performance evaluation of
Internet streaming content delivery techniques
GISMO: Characteristics
Component                           Model
----------------------------------  ----------------
Popularity                          Zipf-like
Temporal Correlation                Truncated Pareto
Seasonal Patterns                   User-defined
Object Size                         Power Law
Partial Access                      Truncated Pareto
VBR Long-Range Dependence           Self-similarity
VBR Marginal Distribution - Body    Lognormal
VBR Marginal Distribution - Tail    Pareto
GISMO: Modeling

Modeling Request Arrival Process
Popularity distribution
- A Zipf-like distribution models the skewed request frequency of
streaming media objects: $P \sim r^{-\alpha}$, $0 < \alpha < 1$, where P is the
access frequency and r is the rank of an object.
Temporal Correlation of Requests
- Requests to an object tend to arrive non-randomly. A Pareto
distribution models the correlated inter-arrival times.
Seasonal Patterns
- The aggregate request arrival rate can exhibit seasonal patterns
(hourly, daily, weekly, etc.). GISMO users can define such
patterns.
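A minimal sketch of the request arrival model above, not GISMO's actual code. It assumes a Zipf-like popularity with exponent alpha and, as a simplification, applies Pareto-distributed gaps to the whole request stream. The function names and parameters (zipf_like_cdf, sample_rank, pareto_gap, generate_requests, shape, scale) are hypothetical, and seasonal patterns are left out.

```python
import bisect
import itertools
import random


def zipf_like_cdf(num_objects, alpha):
    """Cumulative probabilities for P(r) proportional to r**(-alpha)."""
    weights = [rank ** -alpha for rank in range(1, num_objects + 1)]
    total = sum(weights)
    return [c / total for c in itertools.accumulate(weights)]


def sample_rank(cdf, rng):
    """Draw a 1-based object rank by inverting the cumulative distribution."""
    index = bisect.bisect_left(cdf, rng.random())
    return min(index, len(cdf) - 1) + 1  # guard against rounding at the top


def pareto_gap(shape, scale, rng):
    """Inter-arrival gap from a Pareto distribution (inverse transform)."""
    u = 1.0 - rng.random()  # uniform in (0, 1]
    return scale / u ** (1.0 / shape)


def generate_requests(num_requests, num_objects, alpha, shape, scale, seed=0):
    """Return a list of (arrival_time, object_rank) pairs."""
    rng = random.Random(seed)
    cdf = zipf_like_cdf(num_objects, alpha)
    now = 0.0
    requests = []
    for _ in range(num_requests):
        now += pareto_gap(shape, scale, rng)  # heavy-tailed gaps give bursts
        requests.append((now, sample_rank(cdf, rng)))
    return requests
```

For example, generate_requests(1000, 50, alpha=0.7, shape=1.5, scale=1.0) yields 1000 time-ordered requests in which low ranks (popular objects) dominate.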
GISMO: Modeling

Modeling Individual Requests
Object Size Distribution
- Streaming media objects have a wide range of lengths. We use
a power law to model the length distribution.
Partial Access Patterns
- Streaming access involves user interaction: a user may stop
before the end of an object. We use a Pareto distribution to
model the stop time.
Variable Bit-Rate
- The bit-rate of streaming media objects has high variability.
We use a Pareto distribution to model the tail of the VBR marginal
distribution, and a Lognormal distribution for the body.
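The object length and stop time models above can be sketched as follows. This is an illustration under stated assumptions, not GISMO's implementation: both quantities are drawn from a truncated Pareto (a power law bounded above), the stop time is truncated at the object length, and every name and default value (truncated_pareto, generate_access, min_len, max_len, size_shape, stop_shape, min_stop) is hypothetical.

```python
import random


def truncated_pareto(shape, low, high, rng):
    """Sample from a Pareto distribution restricted to [low, high].

    Uses the inverse of F(x) = (1 - (low/x)**shape) / (1 - (low/high)**shape).
    """
    if not (0.0 < low < high) or shape <= 0.0:
        raise ValueError("need 0 < low < high and shape > 0")
    u = rng.random()  # uniform in [0, 1)
    tail_mass = 1.0 - (low / high) ** shape
    x = low / (1.0 - u * tail_mass) ** (1.0 / shape)
    return min(max(x, low), high)  # clamp away floating-point overshoot


def generate_access(rng, min_len=10.0, max_len=7200.0,
                    size_shape=1.0, stop_shape=1.2, min_stop=1.0):
    """Return (object_length, stop_time) in seconds for one access."""
    # Object length: power law over a wide range (assumed bounds).
    length = truncated_pareto(size_shape, min_len, max_len, rng)
    # Partial access: the user stops somewhere before the object ends.
    stop = truncated_pareto(stop_shape, min_stop, length, rng)
    return length, stop
```

Truncating the stop time at the object length keeps every access within the object, and most sessions end early because the Pareto mass sits near the lower bound.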
GISMO: Modeling

VBR self-similarity
- The bit-rate of streaming media objects (e.g., audio/video)
exhibits long-range dependence.
- The auto-correlation function decays slowly.
- Burstiness persists over long periods, which implies that
buffering is ineffective.

Generating a self-similar process (fractional Gaussian noise, FGN)
- We use a random midpoint displacement algorithm

Transforming VBR marginal distribution
- Gaussian → hybrid Lognormal/Pareto distribution
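The two steps above can be sketched roughly as follows. Random midpoint displacement builds an approximate fractional Brownian motion path whose increments approximate FGN; a quantile mapping then turns the Gaussian marginal into a Lognormal body with a Pareto tail. The parameter names (levels, hurst, mu, sigma, tail_shape, cut_prob) and the way the two pieces are joined at the cut-off are assumptions for illustration, not details taken from GISMO.

```python
import math
import random
from statistics import NormalDist, fmean, pstdev


def fgn_midpoint_displacement(levels, hurst, rng):
    """Approximate FGN of length 2**levels via random midpoint displacement."""
    n = 2 ** levels
    path = [0.0] * (n + 1)
    path[n] = rng.gauss(0.0, 1.0)  # endpoint of the unit-variance path
    step = n
    for level in range(1, levels + 1):
        half = step // 2
        # Displacement variance shrinks with the interval length.
        std = math.sqrt((1.0 - 2.0 ** (2.0 * hurst - 2.0))
                        / 2.0 ** (2.0 * hurst * level))
        for mid in range(half, n, step):
            average = 0.5 * (path[mid - half] + path[mid + half])
            path[mid] = average + rng.gauss(0.0, std)
        step = half
    return [path[i + 1] - path[i] for i in range(n)]  # increments


def to_hybrid_marginal(noise, mu, sigma, tail_shape, cut_prob):
    """Map Gaussian samples to a Lognormal body with a Pareto tail."""
    normal = NormalDist()
    mean, std = fmean(noise), pstdev(noise)
    z_cut = normal.inv_cdf(cut_prob)
    x_cut = math.exp(mu + sigma * z_cut)  # where body and tail meet
    rates = []
    for value in noise:
        z = (value - mean) / std
        if z <= z_cut:
            rates.append(math.exp(mu + sigma * z))  # Lognormal quantile
        else:
            p = min(normal.cdf(z), 1.0 - 1e-12)  # avoid division by zero
            ratio = (1.0 - cut_prob) / (1.0 - p)
            rates.append(x_cut * ratio ** (1.0 / tail_shape))  # Pareto tail
    return rates
```

Because the mapping is monotone, it preserves the ordering of the samples, so the bursts in the Gaussian series stay at the same positions in the transformed bit-rate series. Midpoint displacement is fast but only approximates true FGN.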
GISMO: Functionality

GISMO generates
- A set of bogus streaming media objects, installed on
servers that mimic real servers
- Requests to these objects, initiated by clients that
mimic real users

GISMO can be used for many purposes
Evaluating the performance of streaming media servers, e.g.,
scheduling and I/O
Evaluating network protocols for streaming data transmission
Evaluating streaming data replication techniques (caching,
pre-fetching, multicast merging, etc)
GISMO: Architecture
[Figure: GISMO architecture diagram. Labeled elements: WWW Browser and Media Player clients issuing Requests; Network; Web Server; Streaming Server with Objects; protocols TCP, RTSP, and UDP.]
GISMO: Use Case

We have conducted a performance case study
- Using GISMO to generate workloads
- Evaluating proxy caching and server stream
merging techniques
- Showing how the workload characteristics
impact their effectiveness
GISMO: Use Case
How does popularity impact the effectiveness of
proxy caching (left) and server merging (right)?
Future Directions
More client interactions in request streams, e.g.,
VCR functionality
More correlations in streaming media objects, e.g.,
Group-of-Pictures (GoP) correlation
Using GISMO in evaluating streaming content
delivery techniques
Using GISMO in evaluating network protocols for
streaming data transmission
Related Publications

Shudong Jin and Azer Bestavros. Generating Streaming Access
Workloads for Performance Evaluation and A Case Study. BU CS
Technical Report, April 2001.

Shudong Jin and Azer Bestavros. Temporal Locality in Web
Request Streams: Sources, Characteristics, and Caching
Implications. Short paper appeared in ACM SIGMETRICS’2000;
full paper appeared in MASCOTS’2000.

Paul Barford and Mark Crovella. Generating Representative Web
Workloads for Network and Server Performance Evaluation. ACM
SIGMETRICS’1998.