ch7 - Central Washington University

Application Layer
Chapter 7
•
•
•
•
•
DNS – Domain Name System
Electronic Mail
The Web
Streaming Audio and Video
Content Delivery
Revised: August 2011
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
The Application Layer
Uses transport services to build
distributed applications
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Application
Transport
Network
Link
Physical
DNS – Domain Name System
DNS is a world-wide distributed directory service. DNS delegates the
responsibility of assigning domain names and mapping those names
to internet resources by designating authoritative name servers for
each domain (zone). Net admins may delegate authority over subdomains within their allocated name space to other name servers.
This provides a distributed and fault tolerant service that was designed
to avoid a single large central database.
Internet has two primary name-spaces:
1. domain name hierarchy (DNS addresses)
2. Internet Protocol Address spaces (IPv4 and IPv6)
The DNS resolves high-level human readable
names for computers to low-level IP addresses
• DNS name space »
• Domain Resource records »
• Name servers »
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
The DNS Name Space (1)
DNS namespace is hierarchical from the root down
•
•
•
Different parts delegated to different organizations
Can register under multiple top-level domains
Absolute domain names always end in a period, relative domain names do not
−
Relative names have to be interpreted in a context to uniquely determine their meaning
250+ Top
Level
Domains:
1. Generic
2. Countries
Generic Second Level The computer robot.cs.washington.edu
Question: What is a “domain”?
Domains run by Registrars
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
appointed by ICANN
The DNS Name Space (2)
The Generic top-level
domains are controlled
by ICANN who appoints
registrars to run them
ICANN = Internet Corporation for
Assigned Names and Numbers
This one was controversial
To create a new domain, permission is required of the domain in which it will be included.
Naming follows organizational boundaries, not physical networks.
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Domain Resource Records (1)
Every domain (all levels) has its own DNS database that is comprised of Resource
Records.
Each Resource Record is a 5 tuple: Domain_Name, TTL, Class, Type, Value (page 616)
The key resource records in the namespace are IP addresses
(A/AAAA) and name servers (NS), but there are others too (e.g.,
MX)
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Domain Resource Records (2)
Name server
IP addresses
of computers
CNAME and PTR
are aliases; CNAME
allows the same IP
address to have multiple
names; PTR enables
reverse lookups
Mail gateways
A portion of a possible DNS database for cs.vu.nl.
Question: What is cs.vu.nl’s web server’s name and IP address?
Question: What is a reverse lookup?
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Name Servers (1)
DNS name space divided into overlapping zones. Admins determine zone
boundaries. Each zone has one or more name servers.
Name servers contain data for portions of the name
space called zones (circled).
One zone
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Name Servers (2)
Finding the IP address for a given hostname is called
resolution and is done with the DNS protocol.
Resolution:
• Computer requests local name server to resolve
• Local name server asks the root name server
• Root returns the name server for a lower zone
• Continue down zones until a name server can answer
DNS protocol:
• Runs on UDP port 53, retransmits lost messages
• Caches name server answers for better performance
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Name Servers (3)
Example of a computer looking up the IP for a name
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Electronic Mail
•
•
•
•
•
Architecture and services »
The user agent »
Message formats »
Message transfer »
Final delivery »
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Architecture and Services (1)
The key components and steps (numbered) to send email
−
User Agent = Email Reader; a program that supports composing, receiving, and
replying to messages.
− Uses Simple Mail Transfer Protocol (SMTP) over TCP (port 25)
1. Mail submission is method by which user agents send messages into the email
system for delivery
2. Message transfer between MTAs
3. Final delivery to receiving User Agent
Architecture of the email system
Question: Where is the user’s mailbox located?
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Architecture and Services (2)
Envelope
Question: Who
knows why this is
funny for Tanenbaum?
Message
(= header
and body)
Paper mail
Electronic mail
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
The User Agent
What users see – interface elements of a typical user agent
− Email address format: user@dns-address
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Message Formats (1)
Header fields related to message transport; headers are
readable ASCII text
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Message Formats (2)
Other header fields useful for user agents
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Message Formats (3)
Originally only ASCII text could be sent by email. Multipurpose Internet
Mail Extensions (MIME) was developed to send messages with richer
content (non-Latin alphabets, Chinese, Japanese) as well as audio,
images, or binary documents or programs.
•
MIME adds structures to the message body and also defines encoding
rules for the transfer of non-ASCII messages.
MIME header fields used to describe what content is in
the body of the message
Message Headers added by MIME
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Message Formats (4)
Common MIME content types and subtypes
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Message Formats (5)
Putting it all together:
a multipart message
containing HTML and
audio alternatives.
One part
(HTML)
Another
(audio)
Question: What does this email message do?
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Message Transfer (1)
Messages are transferred with SMTP (Simple Mail Transfer
Protocol)
•
•
Readable (4 byte long ASCII) text commands
Submission from user agent to MTA on port 587
− If message cannot be delivered, an error report containing the first part of
the undeliverable msg is returned to the sender
•
One MTA to the next MTA on port 25 usually using TCP
− Recipient’s mailbox is within the destination MTA
− The destination MTA is the @dns-address part of the user’s email address
− user@dns-address is the user’s mailbox within the destination MTA
•
Other protocols for final delivery (IMAP, POP3, webmail)
− IMAP/POP3/webmail get email messages from the mailbox on the user’s
mail server to the user’s email user agent on the user’s machine.
− Webmail systems use web protocols: E.g., Gmail, Hotmail, Yahoo!mail
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Message Transfer (2)
Sending a message:
• From Alice to Bob
• SMTP commands
are marked [pink]
C: = from client (sender)
S: == from server (receiver)
HELO – Hello message
RCPT – ID email recipient(s)
EHLO – Hello msg for clients
wanting to use an
extension (next slide)
. . . (rest of message) . . .
Question: Does the
Sender (client) or the
Receiver (server)
start the dialog?
If the server is first, how
does it know when to
talk?
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Message Transfer (3)
Common SMTP extensions (not in simple example)
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Final Delivery (1)
User agent uses protocol
like IMAP for final delivery
• Has commands to
manipulate folders /
messages [right]
Alternatively, a Web
interface (with proprietary
protocol) might be used
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
The World Wide Web (WWW)
Standards body for the web is the W3C (World Wide Web Consortium;
see http://www.w3.org)
Primary web protocol is HTTP, which is a simple text-based protocol
that runs over TCP in the general case.
•
•
•
•
•
•
Architectural overview »
Static Web pages »
Dynamic pages and Web applications »
HTTP – HyperText Transfer Protocol »
The mobile Web »
Web search »
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Architectural Overview (1)
HTTP transfers pages from servers to browsers
Question: Where is the User in this picture?
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Architectural Overview (2)
Pages are named with URLs (Uniform Resource Locators)
• Example: http://www.phdcomics.com/comics.php
Protocol
Server
Page on server
Our
focus
Common URL protocols
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Architectural Overview (3)
Steps a client (browser) takes to follow a hyperlink:
−
−
−
−
−
−
Determine the protocol (HTTP)
Ask DNS for the IP address of server
Make a TCP connection to server (port 80 or 443)
Send request for the page; server sends it back
Fetch other URLs as needed to display the page
Close idle TCP connections
Steps a server takes to serve pages:
−
−
−
−
−
Accept a TCP connection from client
Get page request and map it to a resource (e.g., file name)
Get the resource (e.g., file from disk)
Send contents of the resource to the client.
Release idle TCP connections
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Architectural Overview (4) -- Clients
When a server returns a page, it also returns some additional info about
that page. This info includes the MIME type of the page that instructs the
client’s browser how to process that info. If the MIME type is not one of the
built-in ones, then the browser consults its table of MIME types which
associates MIME types with Third-Party-provided viewers. Two types of
viewers: 1) plug in or 2) helper app.
Content type is identified by MIME types
• Browser takes the appropriate action to display
• Plug-ins / helper apps extend browser for new types
Installed as
an extension
to the browser
(same process as
the browser)
Complete
program
running as
a separate
process
Question: Where did we see MIME before?
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Architectural Overview (5) -- Servers
Web servers are optimized to be able to quickly service numerous requests
To scale performance, Web servers can use:
• Caching to optimize disk retrievals, multiple threads for
parallel processing, and a front end to orchestrate
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Architectural Overview (6)
Server steps, revisited:
1. Resolve name of Web page requested
− Incoming request may not contain the actual name of file or
program as a literal string
2. Perform access control on the Web page
− Authenticate client to determine if (s)he can access material
3. Check the cache
− Dynamic pages cannot be cached; only current cached content
returned (if so, skip step 4)
4. Fetch requested page from disk or run program
5. Determine the rest of the response that accompany the
contents of the page
6. Return the response to the client
7. Make an entry in the server log
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Architectural Overview (7) -- cookies
Even though HTTP uses TCP, it does NOT have any concept of a login session
so each HTTP request has no memory nor context. Cookies provide a memory
or context. Examples: pay-for-view web sites; e-commerce; customized web
portals like Yahoo! Cookies can be abused to learn a user’s browsing habits.
Cookies support stateful client/server interactions
− Cookies are a small (4KB or less) named string that the server
associates with a browser
− Browsers store the server-provided cookie for a time period within their
browser directory on the client’s disk. Cookies persist across browser
invocations to that server -- unless the user disables cookies.
•
•
•
Server sends cookies (state) with page response
Client stores cookies across page fetches
Client sends cookies back to server with requests
Example of a browser directory within a client’s disk
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Static Web Pages (1)
Static Web pages are simply files
• Have the same contents for each viewing
Can be visually rich and interactive nonetheless:
• HTML that mixes text and images
• Forms that gather user input
• Style sheets that tailor presentation
• Vector graphics, videos, and more (over) . . .
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Static Web Pages (2)
Progression of features through HTML 5.0
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Dynamic Pages & Web Applications (1)
Applications can now run inside the browser or the server with the web providing the
user interface. Users do not need to install separate apps and user data can be
accessed from different computers; this is a form of cloud computing
Dynamic pages are generated by programs running at the
server (with a database) and/or the client
•
E.g., PHP, JSP, and CGI at server; JavaScript at client
−
−
•
Microsoft’s Active Server Pages .NET is a proprietary version of PHP and JSP
CGI = Common Gateway Interface; JSP = JavaServer Pages
Pages vary each time like using an application
Question: What does “a form of cloud computing” mean?
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Dynamic Pages & Web Applications (2)
Web page that gets
form input and calls
a server program
PHP server program
that creates a
custom Web page
PHP calls
Resulting Web page
(for inputs “Barbara”
and “32”)
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Dynamic Pages & Web Applications (3)
Client-side processing occurs inside the user’s browser. Technologies: 1) JavaScript,
2) VBScript, 3) Java applets, and 4) Microsoft ActiveX controls (see pages 677-678)
JavaScript program
produces result
page in the browser
First page with form,
gets input and calls
program above
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Dynamic Pages & Web Applications (4)
Client-side and Server-side scripting look the similar in that they both
embed code in HTML files, they are processed totally differently.
The difference between server and client programs
Server-side scripting with PHP
Client-side scripting with JavaScript
After user clicks on submit button, the
When the user clicks on the submit
browser collects info into a long string
button the browser interprets a JavaScript
and sends it to server as a request for PHP function contained on the page. All of the
Page. The server loads the PHP file and
work is done locally within the browser.
executes the PHP script to produce a new
Question: when is it preferable to use
HTML page that is sent to the browser to
server-side and when preferable to use
display.
client-side scripting?
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Dynamic Pages & Web Applications (5)
AJAX (Asynchronous Javascript and XML) is a set of technologies that
work together that work together to enable Web applications that are as
responsive and powerful as traditional desktop applications.
Web applications use a set of technologies that work
together, e.g. AJAX:
•
•
HTML: present information as pages.
DOM: change parts of pages while they are viewed.
− DOM (Document Object Model) – model of HTML page that is
accessible to programs (e.g., an API to change parts of a page)
•
XML: let programs exchange data with the server.
− XML (eXtensible Markup Language) is a language for specifying
structural content
− W3C created XML to allow web content to be structured for
automated processing
•
Asynchronous way to send and retrieve XML data.
− SOAP (Simple Object access Protocol) – language/systemindependent way to do RPC between programs
•
JavaScript as a language to bind all this together.
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Dynamic Pages & Web Applications (6)
The DOM (Document Object Model) tree represents
Web pages as a structure that programs can alter
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Dynamic Pages & Web Applications (7)
XML captures document structure
HTML is concerned with presentation.
Example of XML:
Note: XHTML is
HTML defined
in terms of XML
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Dynamic Pages & Web Applications (8)
Web application developers have a library of technologies
available to them to create web content:
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
HTTP (1)
HTTP is the primary protocol of the web.
HTTP (HyperText Transfer Protocol) is a requestresponse protocol that runs on top of TCP
•
•
•
•
•
•
Fetches pages from server to client
Server usually runs on port 80
Headers are given in readable ASCII
Content is described with MIME types
Protocol has support for pipelining requests
Protocol has support for caching
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
HTTP (2)
HTTP 1.0 – connection was set up and released for every response
HTTP 1.1 – Enables connection reuse
HTTP uses persistent connections to improve performance
One connection for
each request
HTTP 1.0
Sequential requests
on one connection
HTTP 1.1
Pipelined requests
on one connection
HTTP 1.1
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
HTTP (3)
HTTP has several request methods.
Requests server to Fetch
a MIME-encoded page
Used to send
input data
(forms) to a
server program
Both GET and POST are
used for SOAP web services
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
HTTP (4)
Response codes tell the client how the request fared:
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
HTTP (5)
A request line (e.g., GET, POST) may be optionally followed by additional
lines with more information, called request headers. Responses also
optionally may have response headers.
Many headers carry key information:
Function
Example Headers
Browser capabilities
(client  server)
User-Agent, Accept, Accept-Charset, AcceptEncoding, Accept-Language
Caching related
(mixed directions)
If-Modified-Since, If-None-Match, Date, LastModified, Expires, Cache-Control, ETag
Browser context
(client  server)
Cookie, Referer, Authorization, Host
Content delivery
(server  client)
Content-Encoding, Content-Length, Content-Type,
Content-Language, Content-Range, Set-Cookie
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
HTTP (6)
HTTP has built-in support to help clients identify when they can safely
reuse pages
HTTP caching checks to see if the browser has a known
fresh copy, and if not if the server has updated the page
− HTTP uses two strategies: 1) cache page validation (step 2),
2) conditional GET – asks server if the cached copy is still valid
•
Uses a collection of headers for the checks
•
Can include further levels of caching (e.g., proxy caching)
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
The Mobile Web
Enhancement/extensions made to support web access for small handhelds
and other mobile devices.
Mobiles (phones, tablets) are challenging as clients:
−
−
−
−
−
Relatively small screens
Limited input capabilities, lengthy input.
Network bandwidth is limited
Connectivity may be intermittent.
Computing power is limited
Strategies to handle them:
• Content: servers provide mobile-friendly versions;
transcoding can also be used
• Protocols: no real need for specialized protocols;
HTTP with header compression sufficient
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Web Search
Search has proved hugely popular, in tandem with
advertising that has proved hugely profitable
• A simple interface for users to navigate the Web
Search engine requires:
• Content from all sites, accessed by crawling. Follow
links to new pages, but beware programs.
− Each web search engine obtains its database of pages by doing
web crawling – a systematic traversal of all pages and links
− Deep web: Web content that search engines cannot catalog (e.g.,
dynamic content), which is therefore “hidden”
•
Indexing, which benefits from known and discovered
structure (such as XML) to increase relevance
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Streaming Audio and Video
Real time and streamed multimedia traffic.
• Real time multimedia is usually carried using RTP
• Streamed multimedia may be carried by RTP or HTTP
• Real Time Streaming Protocol – proprietary (RealNetworks) variant of RTCP
• RTCP (real time control protocol) controls remote multimedia playout
Audio and video have become key types of traffic, e.g., voice
over IP, and video streaming.
• Digital audio »
• Digital video »
• Streaming stored media »
• Streaming live media »
• Real-time conferencing »
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Digital Audio (1) – review
ADC (Analog-to-Digital Converter using Nyquist’s Sampling Theorem)
produces digital audio from a microphone
• Telephone: 8000 8-bit samples/second (64 Kbps);
computer audio is usually better quality (e.g., 16 bit)
ADC
Continuous audio
(sine wave)
Digital audio
(sampled, 4-bit quantized)
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Digital Audio (2) -- review
Digital audio is typically compressed before it is sent
• Lossy encoders (like AAC) exploit human perception
• Large compression ratios (can be >10X)
Sensitivity of the ear
varies with frequency
A loud tone can mask
nearby tones
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Digital Video (1)
When the human eyes sees something, the image is retained for some
milliseconds before decaying. If a sequence of images is shown at 50
images/sec, the eye does not notice that it is looking at discrete images. All
video systems exploit this principle to produce moving pictures.
Video is digitized as pixels (sampled, quantized)
•
TV quality: 640x480 pixels, 24-bit per pixel color, 30 frames/sec
−
•
Leveraging interlacing – odd number scan lines and even number
cumulatively sent sequentially at 60 frames/sec to fool the eye
HDTV: 1280X720; Blu-ray: 1080X720
Video is sent compressed due to its large bandwidth
•
Lossy compression exploits human perception
− E.g., JPEG for still images, MPEG, H.264 for video
−
•
Large compression ratios (often 50X for video)
Video is normally > 1 Mbps, versus >10 kbps for speech and
>100 kbps for music
Question: Why is music’s bandwidth higher than speech’s?
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Digital Video (2)
JPEG = Joint Photographic Experts Group
JPEG lossy compression sequence for one image:
Step 1
Step 2
Step 3
Step 5
Subsequent slides show each step in the JPEG compression sequence
Question: What does “lossy” mean?
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Digital Video (3)
Step 1: Pixels are mapped to luminance/chrominance (matrix:
YCbCr) color space and chrominance is sub-sampled
•
•
•
•
Y = 16 + .26R + .5G + .09B (luminance);
Cb = 128 + .15R - .29G - .44B (chrominance)
Cr = 128 + .44R - .37G + .07B (chrominance)
The eye is less sensitive to chrominance
Input 24-bit RGB pixels

8-bit luminance
pixels
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
8-bit chrominances
for every 4 pixels
Digital Video (4)
Step 2: Each component block is transformed to spatial
frequencies with DCT (Discrete Cosine Transformation)
• Captures the key image features
One component block
of the Y matrix
Transformed block
(i.e., the DTC coefficients)
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Digital Video (5)
Step 3: DCT coefficients are quantized by dividing by
thresholds; reduces bits in higher spatial frequencies
• Top left element is differenced over blocks (Step 4)
Input
/
Thresholds
=
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Output
Digital Video (6)
Step 5: The block is run-length encoded in a zig-zag
order. Then it is Huffman coded before sending (Step 6)
Order in which the block coefficients are sent
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Digital Video (7)
MPEG = Motion Picture Experts Group; following describes MPEG-1
MPEG compresses over a sequence of frames, further
using motion tracking to remove temporal redundancy
− I (Intra-coded) frames are self-contained
− P (Predictive) frames use block motion predictions
− B (Bidirectional) frames may base prediction on future frame
Three consecutive frames with stationary and moving components
Who has viewed I-frames in Netflix or other video playout during fast forwards?
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Streaming Stored Media (1)
A simple method to stream stored media, e.g., for video
on demand, is to fetch the video as a file download
• But has large startup delay, except for short files
Video-on-Demand Playout can start after the file has downloaded
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Streaming Stored Media (2)
Effective streaming starts the playout during transport
• Using RTSP (Real-Time Streaming Protocol) or the RTCP
(RTP’s Real-Time Control Protocol)
Question: What protocol is being used in step 5?
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Streaming Stored Media (3)
Key problem: how to handle transmission errors
Strategy
Advantage (PRO)
Disadvantage (CON)
Use reliable
transport (TCP)
Repairs all errors
Increases jitter significantly
Add FEC
Repairs most errors
(e.g., parity packet)
Increases overhead, decoding
complexity and jitter
Interleave media
Slightly increases decoding
complexity and jitter
Masks most errors
Interleave media: mixing up the order of the media before transmission and
unmixing it on reception. In this way if a packet (or a burst of packets) is lost,
the loss will be spread out over time by the unmixing.
Question: Who remembers what FEC is?
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Streaming Stored Media (4)
Parity packet can repair one lost packet in a group of N
• Decoding is delayed for N packets
An approach for FEC using multimedia
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Streaming Stored Media (5)
Interleaving spreads nearby media samples over
different transmissions to reduce the impact of loss
Packet
stream
Media
samples
Loss reduces temporal resolution; doesn’t leave a gap
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Streaming Stored Media (6)
Key problem: media may not arrive in time for playout
due to variable bandwidth and loss/retransmissions
• Client buffers media to absorb jitter; we still need to
pick an achievable media rate
Safety margin,
to avoid a stall
Can pause server (or go
ahead and save to disk)
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Streaming Stored Media (7)
RTSP commands
• Sent from player to server to adjust streaming
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Streaming Live Media (1)
Streaming live media is similar to the stored case plus:
• Can’t stream faster than “live rate” to get ahead
− Usually need larger buffer to absorb jitter
•
Often have many users viewing at the same time
− UDP with multicast greatly improves efficiency. When not
available, many TCP connections are used instead.
− For very many users, content distribution is used [later
section]
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Streaming Live Media (2)
With multicast streaming media, parity is effective
• Clients can each lose a different packet and recover
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Streaming Live Media (2)
Comparatively easy today for anyone to set up an Internet radio station.
In this example, recorded content is either stored on a media server as
podcasts for subsequent streaming or sent to for live streaming.
Production side of a student radio station.
As before
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Real-Time Conferencing (1)
Real-time conferencing has two or more connected live
media streams, e.g., voice over IP, Skype video call
Key issue over live streaming is low (interactive) latency
• Want to reduce delay to near propagation
• Benefits from network support, e.g., QoS
• Or, benefits from ample bandwidth (no congestion)
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Real-Time Conferencing (2)
H.323 was issued in 1996 for real time conferencing. It was a popular
approach until it became supplanted by the IETF’s SIP, which was
VASTLY simpler and cheaper to use and produced equivalent results.
H.323 architecture for Internet telephony supports calls
between Internet computers and PSTN phones.
VoIP
call
Internet/PSTN
Internet
Gatekeeper controls
calls for LAN hosts
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Real-Time Conferencing (3)
H.323 protocol stack:
• Call is digital audio/video over RTP/UDP/IP
• Call setup is handled by other protocols (Q.931 etc.)
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Real-Time Conferencing (4)
Logical channels that make up an H.323 call
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Real-Time Conferencing (5)
SIP (Session Initiation Protocol) is an alternative to
H.323 to set up real-time calls
• Simple, text-based protocol with URLs for addresses
• Data is carried with RTP / RTCP as before
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Real-Time Conferencing (6)
Setting up a call with the SIP three-way handshake
• Proxy server lets a remote callee be connected
• Call data flows directly between caller/callee
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Real-Time Conferencing (7)
YES
Comparison of H.323 and SIP.
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Corrected
by
Instructor
Content Delivery
Majority of Internet bandwidth is currently used to deliver multimedia
content. This resulted in greatly enhanced bandwidth deployments
and Content Delivery Networks (CDN). CDN sets up a distributed
collection of machines at locations inside the Internet and uses them
to serve content to clients. This is used by big players. Alternative is a
peer-to-peer (P2P) network architect, used by little players. This
section also talks about other content delivery types for other data
types.
Delivery of content, especially Web and video, to
users is a major component of Internet traffic
• Content and Internet traffic »
• Server farms and Web proxies »
• Content delivery networks »
• Peer-to-peer networks »
Question: What does “distributed collection of machines” mean?
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Content and Internet Traffic
Internet traffic:
1. Shifts seismically (emailFTPWebP2Pvideo)
− Internet traffic changes quickly, not only in the details but in the
overall makeup
2. Has many small/unpopular and few large/popular
flows – mice and elephants
− Important: note description of “self-similar” traffic on page 737
Zipf popularity distribution, 1/k
Shows up as a line on log-log plot
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Server Farms and Web Proxies (1)
No matter how much bandwidth one machine has, it can only serve so
many web requests before the load is too great. Server farm is a single
logical machine comprised of many physical machines.
Server farms enable large-scale Web servers:
• Front-end load-balances requests over servers
• Servers access the same backend database
Question: What is the difference between logical and physical?
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Server Farms and Web Proxies (2)
Caching improves performance by shortening the response time and reducing
the network load. Web proxy shares the cache among multiple users by
actually fetching web requests on behalf of its users. This also enables an
organization to control the content that is entering its AS.
Proxy caches help organizations to scale the Web
• Caches server content over clients for performance
• Also implements organization policies (e.g., access)
Question: Where is a web proxy usually physically located?
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
CDNs – Content Delivery Networks (1)
Server farms and web proxies help to build large sites and to improve web
performance, but they are not sufficient for truly popular web sites that
must serve content on a global scale. This is the role of a CDN.
CDNs scale Web servers by having clients get content
from a nearby CDN node (cache) Distributes copy of content to other
CDN nodes and is a Name Server to
resolve DNS queries
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Content Delivery Networks (2)
Directing clients to nearby CDN nodes with DNS:
• Client query returns local CDN node as response
• Local CDN node caches content for nearby clients
and reduces load on the origin server
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Content Delivery Networks (3)
Akamai pioneered using DNS for content distribution and became the first
major CDN and industry leader (page 745-6) Question: Where have we seen
Akamai before?
Origin server rewrites pages to serve content via CDN
Traditional Web page on server
Page that distributes content via CDN
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Peer-to-Peer Networks (1)
P2P is a file-sharing network that many computers use to pool their
resources to jointly form a CDN together. The computers may simply be
home computers. There is often no dedicated infrastructure and everyone
participates in the task of distributing content, with no central point of
control.
P2P (Peer-to-Peer) is an alternative CDN architecture
with no dedicated infrastructure (i.e., servers)
• Clients serve content to each other as peers
Challenges when servers are removed:
1. How do peers find each other?
2. How do peers support rapid content downloads?
3. How do peers encourage each other to upload?
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Peer-to-Peer Networks (2)
Two common types of P2P technology:
1. BitTorrent protocol – see pages 750 – 752 – standard: www.bittorent.org
2. Distributed Hash Table (DHT)– see pages 753 - 756 – academic
1) BitTorrent lets peers download torrents, which are a content description
− Peers find each other via Tracker in torrent file
» Tracker is a server with the list of all other peers uploading or downloading content
− Peers swap chunks (parts of content) with partners, preferring those who
send most quickly [2]
− Many peers speed download; preference helps uploads [3]
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Peer-to-Peer Networks (3)
Distributed Hash Tables (DHTs) are a fully distributed
index that scales to very many clients/entries
•
Basic functionality of an index is to map a key to a value, much
like a hash table
−
Chord uses SHA-1 for its hash, just like in BitTorrent (and cryptography). It
takes a variable length string and produces a random 160-bit number. It is
used to:
1.
2.
Convert any IP address to a 160-bit number called a node identifier.
Convert a content name to a key
•
Need to follow O(log N) path for N entries
•
Each node maintains a finger table that has m entries, each pointing to a
different actual node. Each entry has two fields: start and IP address of
successor.
•
Lookup process (next slide): requesting node sends a packet to its
successor containing its IP address and the key it is looking for. The
packet is propagated around the ring until it locates the successor to the
node identifier being sought. That node checks to see if it has any info
matching the key and, if so, returns it directly to the requesting node,
whose IP address it has
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
Peer-to-Peer Networks (3)
A Chord ring of 32 identifiers. Finger tables [at right, and as
arcs] are used to navigate the ring.
• Example: path to look up 16 from 1 is 1  12  15
Identifier
values are
stored at
predecessor
Example: 32 node ID arranged in
circle. Shaded ones are actual machines,
Arcs show fingers and labels on arcs are
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011
table indices.
End
Chapter 7
CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011