Chapter 7: Distributed Systems: Warehouse-Scale Computing
Fall 2013
Jussi Kangasharju

Chapter Outline
n  Warehouse-scale computing overview
n  Workloads and software infrastructure
n  Failures and repairs
n  Note: Term “Warehouse-scale computing” originates
from Google à Examples typically of Google’s services
n  Trend towards WSC is more general
n  This chapter based on book Barroso, Hölzle: “The
Datacenter as a Computer” (see course website)
What is Warehouse-Scale Computing (WSC)?
n  Essentially: Modern Internet services
n  Massive scale of…
n  Software
n  Data
infrastructure
repositories
n  Hardware
platform
n  Program is a service
n  Consists of tens of interacting programs
n  Different
teams, organizations, etc.
WSC vs. Data Centers
n  Both look very similar to the outside
of computers in one building”
n  “Lots
n  Key difference:
n  Data centers host services for multiple providers
n  Little
commonality between hardware and software
n  Third-party
software solutions
n  WSC run by a single organization
n  Homogeneous
n  In-house
hardware and software and management
middleware
Cost Efficiency
n  Cost efficiency extremely important
n  Growth driven by 3 main factors:
n  Popularity
n  Size
increases load
of problem increases (e.g., indexing of Web)
n  Highly
competitive market
n  Need bigger and bigger systems à Cost efficiency!
Future of Distributed Computing?
n  WSC is not just a collection of servers
n  New
n  Too
and rapidly evolving workloads
big to simulate à New design techniques
n  Fault
behavior
n  Energy
n  New
efficiency
programming paradigms
n  Design spectrum:
n  One
computer à Multiple computers à Data center
n  WSC
= Multiple data centers operating together
n  Modern
CDN: “Server” = WSC data center
Architectural Overview
n  Storage
n  Networking
n  Storage hierarchy
n  Latency, bandwidth, capacity
n  Power usage
n  Handling failures
General architecture
n  Servers, e.g., 1-U servers
n  Racks
n  Interconnected racks
Storage
n  Tradeoff: NAS vs. local disks as distributed filesystem?
n  NAS:
n  Easier
to deploy, puts responsibility on vendor
n  Collection of disks:
n  Must
implement own filesystem abstraction (e.g., GFS)
n  Lower
hardware costs (desktop vs. enterprise disks)
n  Reliability
n  More
issues and replication?
network traffic due to writes
Network
n  48-port 1 Gbps Ethernet switches are “cheap”
n  Good bandwidth within one rack
n  Problem: Cluster-level bandwidth?
n  Bigger
and faster switches prohibitively expensive?
n  Hierarchical network organization:
n  Good
n  Less
bandwidth within rack
bandwidth within cluster
n  Programmer
must keep this in mind! (transparency?)
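This bandwidth gap is often described as oversubscription: the rack's uplinks to the cluster fabric carry only a fraction of the servers' aggregate bandwidth. A minimal sketch of the arithmetic (Python; the rack size and link speeds are illustrative assumptions, not figures from the book):

    # Oversubscription: aggregate server bandwidth in a rack divided by
    # the rack switch's uplink bandwidth to the rest of the cluster.
    def oversubscription(servers_per_rack, server_gbps, uplinks, uplink_gbps):
        intra_rack = servers_per_rack * server_gbps   # what the servers could send
        uplink = uplinks * uplink_gbps                # what can leave the rack
        return intra_rack / uplink

    # Illustrative numbers: 40 servers at 1 Gbps, 8 x 1 Gbps uplinks.
    print(oversubscription(40, 1, 8, 1))  # 5.0

With these assumed numbers, cross-rack flows see roughly one fifth of the within-rack bandwidth, which is exactly the non-transparency the programmer must keep in mind.
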
Storage Hierarchy
n  Server:
n  N
processors, X cores/CPU, local cache, DRAM, disks
n  Fast,
but limited capacity
n  Rack:
n  Individual
n  A
servers, combined view
bit slower, but more capacity
n  Cluster:
n  View
over all racks
n  Slower,
but more capacity
n  Tradeoff: Bandwidth, latency, capacity
Power Usage
n  No single culprit on server level
n  CPU
33%
n  DRAM
n  Disk
30%
10%
n  Network
n  Other
5%
22%
n  Further optimization targets on cluster/WSC level
n  Cooling
of data center
Handling Failures
n  At this scale, things will break often
n  Application must handle them
n  More details later
Workloads and Software Infrastructure
n  Different levels of abstraction
n  Platform-level software
n  Firmware,
kernel, individual OS
n  Cluster-level infrastructure software
n  Distributed
n  “OS
software for managing resources and services
for a datacenter”
n  Distributed
FS, RPC, MapReduce, …
n  Application-level software
n  Actual
application, e.g., Gmail, Google Maps
Datacenter vs. Desktop
n  Differences in developing software
n  Datacenter:
n  Parallelism
n  Workload
(both data and requests)
changes
n  Homogeneous
n  Hiding
platform
failures
Basic Techniques
Technique                   Reliability   Availability
Replication                 Yes           Yes
Partitioning                Yes           Yes
Load balancing              -             Yes
Timers                      -             Yes
Integrity checks            Yes           -
App.-specific compression   Yes           -
Eventual consistency        Yes           Yes

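Several of these techniques combine naturally: replicate a service across servers and use a timer so that a dead or slow replica is masked by failing over to the next one. A minimal sketch (Python; the replica addresses, port, and timeout value are hypothetical):

    import socket

    # Hypothetical replica endpoints; in a real WSC these would come
    # from the cluster-level resource manager.
    REPLICAS = [("10.0.0.1", 8080), ("10.0.0.2", 8080), ("10.0.0.3", 8080)]

    def query_with_failover(request, timeout_s=0.5):
        """Try each replica in turn; the timer masks dead or slow nodes."""
        for host, port in REPLICAS:
            try:
                with socket.create_connection((host, port), timeout=timeout_s) as s:
                    s.settimeout(timeout_s)
                    s.sendall(request)
                    return s.recv(4096)   # first healthy replica wins
            except OSError:
                continue                  # timeout or connection failure: next replica
        raise RuntimeError("all replicas failed")
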
Cluster-Level Infrastructure Software
n  Resource management
n  Mapping
of tasks to resources
n  Hardware abstraction and basic services
n  Distributed
storage, message passing, …
n  Deployment and maintenance
n  Software
distribution, configuration, …
n  Programming frameworks
n  Hide
some of the above from programmer
n  Examples:
MapReduce, BigTable, Dynamo
MapReduce
n  Google’s framework for processing large data sets on
clusters
n  Name from map and reduce (functional programming)
n  Not
really much in common with real “map” and “reduce”
n  One master, multiple (levels) of slaves
n  Map:
n  Master
partitions input, distributed to slaves
n  Slaves
may do the same
n  Reduce:
n  Slave
sends its result to its master
n  Eventually
root-master will get result
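A minimal single-process sketch of the programming model for word counting (Python; a real MapReduce run distributes the map and reduce phases across many machines and restarts failed tasks):

    from collections import defaultdict

    # Map phase: each input document yields (key, value) pairs.
    def map_fn(document):
        for word in document.split():
            yield word, 1

    # Reduce phase: all values collected for one key are combined.
    def reduce_fn(word, counts):
        return sum(counts)

    def mapreduce(documents):
        groups = defaultdict(list)          # "shuffle": group values by key
        for doc in documents:
            for key, value in map_fn(doc):
                groups[key].append(value)
        return {key: reduce_fn(key, values) for key, values in groups.items()}

    print(mapreduce(["new york new", "york restaurant"]))
    # {'new': 2, 'york': 2, 'restaurant': 1}
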
Application-Level Software
n  What is the application?
n  First
was search, then many other have appeared
n  Datacenter must support general-purpose computing
n  Too
expensive to tailor datacenters for applications
n  Changing
workloads à Faster to adapt software
n  Two application examples:
n  Search
n  Similar
scientific articles (see book for description)
Search
n  Inverted index
n  Set
of documents matching a keyword
n  Size of index similar to original data
n  Consider query “new york restaurant”
n  Must
search each of three terms
n  Find
documents matching every term
n  Sorting
(PageRank + other criteria) à Result
n  Latency must be low (user waiting)
n  Throughput must be high (many users)
n  Read-only index à Easily parallelizable
Monitoring Infrastructure
n  Service-level dashboards
n  Real-time
n  Can
monitoring of few key indicators (latency, t-put)
extend to some more indicators
n  Performance debugging tools
n  Dashboards
n  No
only show problem, but no answer to “why”
need for real-time (compare CPU profilers)
n  Blackbox
monitoring vs. instrumentation approach
n  Platform-level monitoring
n  Everything
n  Need
above is needed, but not sufficient
a higher-level view (see book for details)
Buy vs. Build?
n  Buy:
n  Typical
solution
n  Build:
n  Google’s
n  Original
n  More
(and others’) approach
reason: No third-party solutions available
software development and maintenance work
n  Improved
n  In-house
flexibility
software can take “shortcuts”
-  Not implement every feature
Failures
n  Traditional software not good with failures
n  Result: Make hardware more reliable
n  WSC is different because of scale
n  Imaginary
n  WSC
30 year MTBF = 10,000 days MTBF
with 10,000 servers = 1 failure per day
n  Software must handle failures
n  Application
or middleware
n  Middleware
makes applications simpler
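The back-of-the-envelope arithmetic behind the one-failure-per-day claim, written out (LaTeX notation):

    \text{MTBF} = 30\ \text{years} \approx 10{,}000\ \text{days}, \qquad
    \text{failures per day} = \frac{N_{\text{servers}}}{\text{MTBF}}
        = \frac{10{,}000}{10{,}000\ \text{days}} = 1
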
Positive Side Effect
n  Failures are a fact of life
n  Can buy cheaper hardware
n  Upgrades are simpler
n  Upgrade,
n  Same
kill, reboot
for hardware upgrades
n  “Failure is an option” J
n  Can
allow servers to fail, makes life simpler
Caveats
n  Cannot ignore reliability completely
n  Hardware must be able to detect errors and failures
n  No
need to recover, but can include
n  Not detecting hardware errors is risky
n  See
book for example
n  Every
piece of software would need to handle everything
Categorizing Faults
n  Corrupted
n  Data
n  Can
lost or corrupted
data be regenerated or not?
n  Unreachable
n  Service
n  User
unreachable by users
network reliability?
n  Degraded
n  Service
n  What
available, but degraded
can be still done?
n  Masked
n  Fault
occurs, but is masked
Sources of Faults
n  Hardware not the common culprit (~10%)
n  Software and configurations are bigger problems
n  Exact
numbers depend on study
n  Hardware problem = single computer
n  Software/configuration problem = many computers
simultaneously
Causes of Crashes
n  Anecdotal evidence points to software
n  Hardware: Memory or disk
n  DRAM errors happen, but can be helped with ECC
n  Some
errors still persist
n  Real crash rate higher than studies predict
n  Again
points to software
n  Predicting problems in WSC not useful
n  Need
to handle failures anyway
n  Could
be useful in other systems
Repairs
n  When something breaks, it must be repaired
n  Two important characteristics of WSC
n  No need to repair immediately
n  Optimize
time of repair technician
n  Collect lot of health data from large number of servers
n  Use
machine learning to optimize actions
Summary: Key Challenges
n  Rapidly changing workloads
n  Building balanced systems from imbalanced components
n  Energy use
n  Amdahl’s Law
Chapter Summary
n  Warehouse-scale computing overview
n  Workloads and software infrastructure
n  Failures and repairs