Wei-Hsin Lee June 2008

Shared-Dictionary Compression over HTTP (SDCH)
Why do we care?
• Speeding up Google and the Web
– The faster the Web is, the more useful it is.
– The faster Google web search is, the more searches
people do.
– Many users still suffer from slow networks, for
example in developing countries.
Reduce transmission time
• Reducing payload size is the key.
• Gzip works well for compressing each
individual response.
• What about data shared by a group of pages
(inter-response redundancy), or pages that
change frequently but only slightly?
• Only transmit the data that is common to each
response once.
• Thereafter, send only the parts of the response
that differ.
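The idea of sending the common data once and only the differences afterwards can be sketched as follows. This is a minimal illustration, not the SDCH encoder: it uses Python's `difflib` to find the shared spans, and the `COPY`/`ADD` instruction names and the helper functions `encode`, `decode`, and `literal_size` are assumptions made for this example.

```python
from difflib import SequenceMatcher


def encode(dictionary: str, response: str) -> list:
    """Describe `response` as COPY/ADD instructions against `dictionary`."""
    ops = []
    matcher = SequenceMatcher(None, dictionary, response, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Shared span: send only its position and length in the dictionary.
            ops.append(("COPY", i1, i2 - i1))
        elif tag in ("replace", "insert"):
            # Span that differs: the literal text has to be sent.
            ops.append(("ADD", response[j1:j2]))
        # "delete" spans exist only in the dictionary, so nothing is sent.
    return ops


def decode(dictionary: str, ops: list) -> str:
    """Rebuild the response from the dictionary and the instructions."""
    parts = []
    for op in ops:
        if op[0] == "COPY":
            _, start, length = op
            parts.append(dictionary[start:start + length])
        else:
            parts.append(op[1])
    return "".join(parts)


def literal_size(ops: list) -> int:
    """Number of literal characters that actually travel in ADD instructions."""
    return sum(len(op[1]) for op in ops if op[0] == "ADD")
```

For two pages that share most of their markup, `literal_size(encode(dictionary, page))` is much smaller than `len(page)`, which is the saving the bullets above describe. The cost of encoding the instructions themselves is ignored in this sketch.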
Why not RFC 3229?
• RFC 3229, “Delta Encoding in HTTP”
– Good for saving bandwidth
• But
– Too many states for the server to track
• The number of possible states for www.google.com/search is
larger than the number of all possible search results.
– Only applicable to the same URL
• Discourages aggressive caching.
– No benefit for similar pages that don’t share a URL.
Shared-Dictionary Compression over HTTP (SDCH)
• An addition to HTTP
• Small set of states (dictionaries) shared between
client and server.
• Dictionaries are scoped by domain name and
path, just like cookies. This allows a dictionary
to apply to multiple URLs.
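Cookie-style scoping can be sketched as a simple match on host and path prefix. The `Dictionary` class, its fields, and the matching rule in `applies_to` are assumptions for illustration only; the exact rules are defined by the protocol spec and may differ.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class Dictionary:
    domain: str  # hypothetical field, e.g. "www.example.com"
    path: str    # hypothetical field, e.g. "/search"
    name: str    # hypothetical identifier for the dictionary


def applies_to(d: Dictionary, url: str) -> bool:
    """Return True if the dictionary's domain/path scope covers `url`."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    path = parts.path or "/"

    # Domain match: the exact host or any subdomain of it (assumed rule).
    domain = d.domain.lower().lstrip(".")
    domain_ok = host == domain or host.endswith("." + domain)

    # Path match: prefix match on whole path segments (assumed rule).
    prefix = d.path.rstrip("/")
    path_ok = prefix == "" or path == prefix or path.startswith(prefix + "/")

    return domain_ok and path_ok
```

With this rule, one dictionary scoped to `www.example.com` and `/search` would cover every query URL under that path, which is what lets a single dictionary serve many URLs.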
SDCH protocol details
• SDCH defines
– How the client informs the server of its capability and
state.
– How the server should respond when the client is
SDCH capable.
– How dictionaries get loaded into the client.
• Implements the VCDIFF (RFC 3284) differential
compression format, with enhancements
– Interleave instructions with data so that each network
packet can be decoded as it arrives (chunked
encoding).
– Checksum to ensure data integrity
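The two enhancements above, decoding instructions as they arrive and verifying a checksum at the end, can be sketched with a toy decoder. This does not implement the VCDIFF wire format: the tuple-based instructions, the `StreamingDecoder` class, and the choice of Adler-32 (via Python's `zlib.adler32`) are assumptions for illustration.

```python
import zlib


class StreamingDecoder:
    """Toy decoder: each instruction is decoded as soon as it is fed in."""

    def __init__(self, dictionary: bytes, expected_checksum: int):
        self.dictionary = dictionary
        self.expected = expected_checksum
        self.running = 1  # Adler-32 starting value

    def feed(self, instruction: tuple) -> bytes:
        """Decode one instruction and return the output it produces."""
        kind = instruction[0]
        if kind == "COPY":
            _, offset, length = instruction
            if offset < 0 or length < 0 or offset + length > len(self.dictionary):
                raise ValueError("COPY outside dictionary")
            chunk = self.dictionary[offset:offset + length]
        elif kind == "ADD":
            chunk = instruction[1]
        else:
            raise ValueError("unknown instruction: %r" % (kind,))
        # Update the checksum incrementally so no output needs buffering.
        self.running = zlib.adler32(chunk, self.running)
        return chunk

    def finish(self) -> None:
        """Raise if the decoded output does not match the expected checksum."""
        if self.running != self.expected:
            raise ValueError("checksum mismatch")
```

Because each `feed` call returns usable output immediately, a client can start rendering before the whole response has arrived; `finish` then catches a corrupted stream or a mismatched dictionary.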
Example 1
Example 2
Other details
• Complements Gzip or Deflate.
– SDCH should be applied before Gzip.
• Lab result
– About 40 percent more data reduction than Gzip
alone on Google search.
– Faster Google search results, especially under
low-bandwidth, high-latency conditions.
• Working on the best way to get this out to users.
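The ordering noted above (dictionary step first, Gzip second) can be sketched as a two-stage pipeline that the receiver undoes in reverse. The delta step here is a deliberately crude stand-in that only strips a common prefix; the function names and the 4-byte length header are assumptions for this example, not part of SDCH.

```python
import gzip


def toy_delta_encode(dictionary: bytes, body: bytes) -> bytes:
    """Stand-in for the dictionary step: drop the prefix shared with the dictionary."""
    limit = min(len(dictionary), len(body))
    n = 0
    while n < limit and dictionary[n] == body[n]:
        n += 1
    # 4-byte big-endian prefix length, followed by the bytes that differ.
    return n.to_bytes(4, "big") + body[n:]


def toy_delta_decode(dictionary: bytes, payload: bytes) -> bytes:
    """Inverse of toy_delta_encode."""
    n = int.from_bytes(payload[:4], "big")
    return dictionary[:n] + payload[4:]


def send(dictionary: bytes, body: bytes) -> bytes:
    # Dictionary step first, then Gzip over what remains.
    return gzip.compress(toy_delta_encode(dictionary, body))


def receive(dictionary: bytes, wire: bytes) -> bytes:
    # Undo in reverse order: Gzip first, then the dictionary step.
    return toy_delta_decode(dictionary, gzip.decompress(wire))
```

Running Gzip second lets it squeeze the residual data that the dictionary step could not remove. The actual savings depend on the data and on the real encoder.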
Your help counts!
• Please join the group
– http://groups.google.com/group/SDCH
– The protocol spec and the encoder/decoder code will
be there soon.
• Getting your hands dirty is even better!
– Make your web site use SDCH.
– Make Squid or Apache web servers SDCH capable.
Don’t forget to join the group.
http://groups.google.com/group/SDCH