The Internet Real-Time Laboratory

DotSlash – or how to deal with 15 minutes of fame
Weibin Zhao
Henning Schulzrinne
Columbia University
CATT/WICAT Annual Research Review
November 14, 2003
Web Hotspots

A sudden, dramatic surge in request rate
  Hard to predict, e.g., an external reference to the web site
  Builds up quickly
  Large magnitude
Short-term overload
  Long delay or no service
  One-time event
“15 minutes of fame”
  Small web sites, e.g., on DSL or cable modem
  Thus, traditional solutions (Akamai, etc.) are not economical
11/14/2003
DotSlash
2
Existing Mechanisms

Capacity planning
  High-end servers, LAN clusters, WAN mirrors
  Fixed capacity, cost
CDNs
  Advance set-up, cost
  Only effective for large servers
Caching
  Not fully controlled by the server
Admission control
  No service for some clients
DotSlash Approach

Enable a site to utilize spare capacity at other sites via dynamic collaboration → Rescue
  Extending capacity dynamically as needed
  Usage models: mainly for mutual aid, could be commercial
  Focusing on small sites
Advantages
  Client transparent: no browser changes
  Low cost: based on existing systems, no additional hardware cost
Rescue Example-1: HTTP redirect & cache miss
[Figure: Client1, the origin server www.origin.com (1.2.3.4), the rescue server www.rescue.com (5.6.7.8) with its cache, and the origin.com and rescue.com DNS servers (dynamic DNS, DNS RR).]
Labeled steps in the figure:
  (1) www.origin.com
  (2) 1.2.3.4
  (4) HTTP redirect, alias1.rescue.com
  (5) alias1.rescue.com
  (6) 5.6.7.8
  (8) Reverse proxy
Steps (3), (7), (9), and (10) carry no label.
Rescue Example-2: DNS RR & cache hit
[Figure: Client2, the origin server www.origin.com (1.2.3.4), the rescue server www.rescue.com (5.6.7.8) with its cache, and the origin.com and rescue.com DNS servers (dynamic DNS, DNS RR).]
Labeled steps in the figure:
  (1) www.origin.com
  (2) 5.6.7.8
Steps (3) and (4) carry no label.
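The difference between the two rescue examples (cache miss versus cache hit at the rescue server) can be sketched in a few lines. This is only a toy model under stated assumptions: the class `ReverseCachingProxy`, the function `fake_origin`, and the in-memory dictionary cache are hypothetical names for illustration, not the Apache proxy and caching modules that DotSlash builds on.

```python
from typing import Callable, Dict, List, Tuple


class ReverseCachingProxy:
    """Toy rescue-server cache: fetch from the origin on a miss, serve locally on a hit."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin
        self._cache: Dict[str, bytes] = {}

    def handle(self, path: str) -> Tuple[bytes, str]:
        """Return (body, "hit" or "miss") for one client request."""
        if path in self._cache:
            return self._cache[path], "hit"
        body = self._fetch(path)      # reverse proxy request to the origin server
        self._cache[path] = body      # keep a copy for later clients
        return body, "miss"


origin_requests: List[str] = []


def fake_origin(path: str) -> bytes:
    """Stand-in for the origin server; records every request it receives."""
    origin_requests.append(path)
    return f"content of {path}".encode()


proxy = ReverseCachingProxy(fake_origin)
first = proxy.handle("/index.html")    # as in Example-1: cache miss
second = proxy.handle("/index.html")   # as in Example-2: cache hit
print(first[1], second[1], len(origin_requests))  # miss hit 1
```

Only the first request reaches the origin; later requests for the same path are absorbed by the rescue server.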
DotSlash Functions


Workload monitoring
Rescue server discovery
  Enhanced SLP: mSLP + Select/Sort extensions
Request distribution and redirect
  DNS RR + HTTP redirect
Dynamic replication of content
  Reverse caching proxy
Rescue relationship management
  Allocate/release rescue servers
Basic Architecture

Distributed system + dynamic virtual hosting
  Distributed: multiple servers for one site
  Virtual hosting: one server for multiple sites
Components
  Apache module (Mod_dots), DotSlash daemon (Dotsd), dynamic DNS (BIND), service discovery (enhanced SLP)
[Figure: component diagram. Boxes: Client, Apache with Mod_dots, Dotsd, other Dotsd, BIND, mSLP. Link labels: HTTP, SHM, TCP, DNS, SLP.]
Server Status
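[Figure: server status state diagram]
States
  Normal: regular server
  SOS: origin server, asks help from others
  Rescue: rescue server, provides help to others
Transitions
  Normal to SOS: allocate rescue servers
  SOS to Normal: release rescue servers
  Normal to Rescue: as a rescue server
  Rescue to Normal: not as a rescue server
  Otherwise: remain Normal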
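The server status diagram can be read as a small state machine. The sketch below is one possible encoding, assuming the three states Normal, SOS, and Rescue and the four labeled transitions; the event strings and the function `next_status` are hypothetical names chosen for illustration.

```python
from enum import Enum


class Status(Enum):
    NORMAL = "normal"   # regular server
    SOS = "sos"         # origin server asking others for help
    RESCUE = "rescue"   # rescue server providing help to others


# (current status, event) -> next status. Event names are illustrative only.
TRANSITIONS = {
    (Status.NORMAL, "allocate_rescue_servers"): Status.SOS,
    (Status.SOS, "release_rescue_servers"): Status.NORMAL,
    (Status.NORMAL, "start_serving_as_rescue"): Status.RESCUE,
    (Status.RESCUE, "stop_serving_as_rescue"): Status.NORMAL,
}


def next_status(current: Status, event: str) -> Status:
    """Return the next status; any unlisted (status, event) pair changes nothing."""
    return TRANSITIONS.get((current, event), current)


status = Status.NORMAL
status = next_status(status, "allocate_rescue_servers")   # now SOS
status = next_status(status, "start_serving_as_rescue")   # ignored while in SOS
print(status.name)  # SOS
```

In this encoding there is no direct path between SOS and Rescue, so a server that is asking for help cannot be enlisted as a rescue server until it returns to Normal.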
Workload Monitoring

Bottlenecks
  Network bandwidth
  CPU
  Memory
Metrics
  Output data rate (HTTP)
    KBytes/second
  CPU load average
  Number of concurrent connections
    MaxClients
[Figure: the Net, CPU, and Mem indicators feed an OR gate whose output is Overload and an AND gate whose output is Normal.]
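The OR/AND gate logic of the workload monitor might look like the sketch below: a server is overloaded if any one resource is over its high mark, and normal only if all resources are under their low mark. The `Sample` and `Limits` types, the 0.9 and 0.5 watermarks, and the idea of dividing each metric by a configured limit are assumptions made for illustration; the slides name the metrics but give no thresholds.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Sample:
    out_kbytes_per_s: float   # HTTP output data rate (network)
    cpu_load_avg: float       # CPU load average
    concurrent_conns: int     # number of concurrent connections (memory)


@dataclass
class Limits:
    # Hypothetical per-resource limits and watermarks.
    max_out_kbytes_per_s: float
    max_cpu_load_avg: float
    max_clients: int          # e.g. the Apache MaxClients setting
    high: float = 0.9         # utilization at or above this counts as overloaded
    low: float = 0.5          # utilization at or below this counts as normal


def utilizations(s: Sample, lim: Limits) -> Tuple[float, float, float]:
    """Utilization of Net, CPU, and Mem as fractions of their limits."""
    return (
        s.out_kbytes_per_s / lim.max_out_kbytes_per_s,
        s.cpu_load_avg / lim.max_cpu_load_avg,
        s.concurrent_conns / lim.max_clients,
    )


def is_overloaded(s: Sample, lim: Limits) -> bool:
    """OR gate: one overloaded resource is enough."""
    return any(u >= lim.high for u in utilizations(s, lim))


def is_normal(s: Sample, lim: Limits) -> bool:
    """AND gate: every resource must be lightly loaded."""
    return all(u <= lim.low for u in utilizations(s, lim))


limits = Limits(max_out_kbytes_per_s=1000.0, max_cpu_load_avg=4.0, max_clients=150)
busy = Sample(out_kbytes_per_s=950.0, cpu_load_avg=1.0, concurrent_conns=20)
print(is_overloaded(busy, limits), is_normal(busy, limits))  # True False
```

With two watermarks, a sample can be neither overloaded nor normal, which gives the controller a band in which it neither allocates nor releases rescue servers.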
Origin Server


Workload monitoring
HTTP redirect
  Use rescue server aliases
  Don’t redirect requests from rescue servers
  Redirect policies
    WRR based on rescue server capacity (load)
    URL hashing (URL locality) for large files
DNS round-robin
  Add/remove rescue server IP addresses
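The redirect policies at the origin server can be illustrated as follows. This is a sketch, not the Mod_dots code: it assumes a smooth weighted round-robin variant for WRR and SHA-256 for URL hashing, and the names `WeightedRoundRobin`, `pick_by_url_hash`, `redirect_target`, and the alias `alias2.rescue.com` are hypothetical.

```python
import hashlib
from typing import Dict, List, Optional, Set


class WeightedRoundRobin:
    """Smooth weighted round-robin over rescue server aliases."""

    def __init__(self, weights: Dict[str, int]) -> None:
        if not weights:
            raise ValueError("need at least one rescue server")
        self._weights = dict(weights)                  # alias -> capacity weight
        self._current = {alias: 0 for alias in weights}

    def next(self) -> str:
        total = sum(self._weights.values())
        for alias, weight in self._weights.items():
            self._current[alias] += weight
        chosen = max(self._current, key=self._current.get)
        self._current[chosen] -= total
        return chosen


def pick_by_url_hash(url_path: str, aliases: List[str]) -> str:
    """Map a URL to a fixed alias so a large file is cached on one rescue server."""
    digest = hashlib.sha256(url_path.encode("utf-8")).digest()
    return aliases[int.from_bytes(digest[:4], "big") % len(aliases)]


def redirect_target(client_ip: str, url_path: str, is_large_file: bool,
                    rescue_ips: Set[str], wrr: WeightedRoundRobin,
                    aliases: List[str]) -> Optional[str]:
    """Return the alias to redirect to, or None to serve the request locally."""
    if client_ip in rescue_ips:
        return None   # a rescue server fetching content: never redirect it
    if is_large_file:
        return pick_by_url_hash(url_path, aliases)
    return wrr.next()


aliases = ["alias1.rescue.com", "alias2.rescue.com"]
wrr = WeightedRoundRobin({"alias1.rescue.com": 2, "alias2.rescue.com": 1})
picks = [wrr.next() for _ in range(6)]
print(picks.count("alias1.rescue.com"), picks.count("alias2.rescue.com"))  # 4 2
```

Skipping the redirect for requests that come from rescue servers avoids a loop in which a rescue server's reverse proxy fetch is bounced back to a rescue server.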
Rescue Server


Workload monitoring
Reverse caching proxy
  Memory caching: concurrency control among threads
  Check the Host: header to classify each request:
    A regular request
    A reverse proxy request to origin server (active rescue)
    An HTTP redirect to origin server (expired rescue)
Dynamic DNS update
  Register new aliases
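The three-way Host: header check at the rescue server could be sketched like this. The function `classify_request`, the two lookup tables, and the host names `alias0.rescue.com` and `www.old-origin.example` are hypothetical; the slides state only that the header distinguishes a regular request, an active rescue, and an expired rescue.

```python
from typing import Dict, Tuple


def classify_request(host_header: str,
                     active: Dict[str, str],
                     expired: Dict[str, str]) -> Tuple[str, str]:
    """Classify a request by its Host: header.

    `active` and `expired` map a host name seen in the header (a rescue alias
    or the origin's own name) to the origin server it belongs to.
    Returns (action, target host).
    """
    # Drop an optional ":port" suffix; IPv6 literals are not handled here.
    host = host_header.split(":", 1)[0].strip().lower()
    if host in active:
        return "reverse_proxy", active[host]   # active rescue: fetch and cache
    if host in expired:
        return "redirect", expired[host]       # expired rescue: send client back
    return "regular", host                     # the rescue server's own site


active = {"alias1.rescue.com": "www.origin.com", "www.origin.com": "www.origin.com"}
expired = {"alias0.rescue.com": "www.old-origin.example"}

print(classify_request("alias1.rescue.com:80", active, expired))
# ('reverse_proxy', 'www.origin.com')
```

Listing the origin's own name in `active` covers clients that arrive through DNS round-robin and therefore still send `Host: www.origin.com`.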
Rescue Server Management

Allocation
  Initial allocation: based on origin server load
  Additional allocation: based on rescue server load
  Policy: double current capacity
Release
  When load is consistently low
  Release one rescue server at a time
  Partial release: the min-capacity rescue server
  Final release: the last rescue server
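One way to read the allocation and release policy is sketched below. It assumes that "double current capacity" means requesting extra capacity equal to the current total (origin plus allocated rescue servers) and that "consistently low" means a run of consecutive low-load samples; both interpretations, the function names, and the streak length of 3 are hypothetical.

```python
from typing import Dict, Optional


def extra_capacity_needed(origin_capacity: float,
                          allocated: Dict[str, float]) -> float:
    """Capacity to request so that the total doubles (assumed reading of the policy)."""
    return origin_capacity + sum(allocated.values())


def server_to_release(allocated: Dict[str, float],
                      low_load_streak: int,
                      required_streak: int = 3) -> Optional[str]:
    """Pick at most one rescue server to release.

    Nothing is released until load has been low for `required_streak`
    consecutive samples. The smallest-capacity server goes first, so the
    final release is simply the case where one server is left.
    """
    if not allocated or low_load_streak < required_streak:
        return None
    return min(allocated, key=allocated.get)


allocated = {"rescue-a": 50.0, "rescue-b": 20.0}   # name -> capacity units
print(extra_capacity_needed(100.0, allocated))     # 170.0
print(server_to_release(allocated, low_load_streak=3))  # rescue-b
```

Releasing one server per decision keeps the step-down gradual, so a brief lull does not discard all rescue capacity at once.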
Other Components

Testing tool
  Use httperf from HP Labs, and extend it to handle HTTP redirects automatically
Dynamic DNS
  Use BIND 9.2.2, set up a testing domain: dotslash.cs.columbia.edu, and only allow updates from specific hosts
Service discovery
  Use SLP with mesh-enhancement (mSLP), and Select, Sort and Attribute List extensions
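For the dynamic DNS component, adding or removing a rescue server's address amounts to a standard DNS update against the BIND server. The sketch below only builds the input text for BIND's `nsupdate` tool and does not send it. The helper `nsupdate_script`, the server address `127.0.0.1`, the record name `www.dotslash.cs.columbia.edu`, and the 60-second TTL are assumptions for illustration; only the zone name comes from the slide.

```python
from typing import Iterable


def nsupdate_script(server: str, zone: str, name: str,
                    add_ips: Iterable[str] = (),
                    remove_ips: Iterable[str] = (),
                    ttl: int = 60) -> str:
    """Build nsupdate input that adds/removes A records for DNS round-robin."""
    lines = [f"server {server}", f"zone {zone}"]
    for ip in remove_ips:
        lines.append(f"update delete {name} A {ip}")      # drop a released rescue server
    for ip in add_ips:
        lines.append(f"update add {name} {ttl} A {ip}")   # add a new rescue server
    lines.append("send")
    return "\n".join(lines) + "\n"


script = nsupdate_script(
    server="127.0.0.1",                      # hypothetical BIND server address
    zone="dotslash.cs.columbia.edu",
    name="www.dotslash.cs.columbia.edu",     # hypothetical record name
    add_ips=["5.6.7.8"],
)
print(script)
```

The resulting text could be piped to `nsupdate` on a host that the zone's update policy allows. A short TTL is chosen here so that clients notice address changes quickly.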
Testing Environment

Apache
  2.0.47
  With proxy and caching modules
  Enable: caching and mod_dots
OS
  Red Hat 9, Linux kernel 2.4.20-20.9, hosts from Clic Lab
  Red Hat 7.3, Linux kernel 2.4.196_planetlab, nodes from PlanetLab
Conclusion

Preliminary version implemented
  Basic measurement and control flow
  Working on service location
Ongoing work
  Performance measurement
  Apply to dynamic content
    LAMP: Linux + Apache + MySQL + PHP