Distributed Systems (5DV147) Replication What is Replication

Distributed Systems (5DV147)
Replication
Fall 2015
What is Replication?
● Make multiple copies of a data object and ensure that all copies are identical
● Two types of access: reads and writes (updates)
● Reasons (have a backup plan):
– Handle more work (e.g. web servers)
– Keep data safe (fault tolerance)
– Reduce latency (CDNs and caching)
– Keep data available
Motivation
Replication requirements
● Consistency
– How do we ensure correctness?
– Obtain identical results from different copies (is that true?)
● Transparency (illusion of a single copy)
– Clients must be unaware of replication
[Figure: a client accesses one logical object, which is backed by several physical objects]

Problems that you may find
● Multiple clients access replicas
– Concurrent access, rather than exclusive
– Operations are interleaved
● Replica placement
– Placing servers
– Placing content
● Not always identical: some have received updates
● Overhead required to keep replicas up to date
– Global synchronization (atomic operations)
Some definitions
Types of ordering adapted to replication
– FIFO: if a client issues r and then r’, any correct Replica Manager (RM) that handles r’ handles r before it
– Causal: if the issuing of r happened-before the issuing of r’, then any correct RM that handles r’ handles r before it
– Total: if a correct RM handles r before r’, then any correct RM that handles r’ handles r before it

Correctness
● “Basic” correctness property
– An interleaved sequence of operations must meet the specification of a single correct copy of the object(s), i.e., clients cannot tell the difference between a replicated system and a single-copy one.
● Sequential consistency property
– Order of operations is consistent with the program order in which each individual process executed them
● Linearizability property
– Order of operations is consistent with the real times at which the operations occurred during execution
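The Total ordering definition can be checked mechanically against the handling logs of several replica managers. The Python sketch below is only an illustration: the function name `total_order_holds` and the representation of a log as a list of request names are assumptions made for this example, not something taken from the lecture.

```python
def total_order_holds(logs):
    """Check the Total ordering definition on a list of RM logs.

    Each log is the list of requests one correct RM handled, in order.
    If some RM handled r before r2, then every RM that handled r2
    must also have handled r before r2.
    """
    for a in logs:
        for j, later in enumerate(a):
            for b in logs:
                if later in b:
                    pos = b.index(later)
                    # Everything `a` handled before `later` must precede it in `b`.
                    if any(r not in b[:pos] for r in a[:j]):
                        return False
    return True


print(total_order_holds([["r1", "r2", "r3"], ["r1", "r2"]]))  # True
print(total_order_holds([["r1", "r2"], ["r2", "r1"]]))        # False
```

The second call fails because the two RMs disagree on the relative order of r1 and r2.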
Correctness
Example of interleaved operations for 2 clients:
C1: A, B, C
C2: d, e, f
Real order during execution: A, B, d, C, e, f
An interleaving with sequential consistency:
A, B, d, e, f, C
Interleaving with linearizability:
A, B, d, C, e, f
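The difference between the two interleavings in this example can be made concrete with a small check. The sketch below is a simplification that treats every operation as instantaneous, so "consistent with real time" reduces to "equal to the real order"; the helper name `respects_program_order` is an assumption made for this illustration.

```python
def respects_program_order(interleaving, programs):
    """True if each client's operations appear in its own program order."""
    for ops in programs.values():
        seen = [op for op in interleaving if op in ops]
        if seen != list(ops):
            return False
    return True


programs = {"C1": ["A", "B", "C"], "C2": ["d", "e", "f"]}
real_order = ["A", "B", "d", "C", "e", "f"]
sequential = ["A", "B", "d", "e", "f", "C"]

# Sequentially consistent: program order is kept for both clients...
print(respects_program_order(sequential, programs))  # True
# ...but not linearizable here, because it differs from the real order.
print(sequential == real_order)                      # False
```

Under this simplification, the real order itself is the only interleaving in the example that would pass both checks.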
Models of replication
Passive (primary-backup) replication
● One primary replica manager, many backup replicas
● If primary fails, backups can take its place (election!)
● Implements linearizability if:
– A failing primary is replaced by a unique backup
– Backups agree on which operations were performed before primary crashed
● View-synchronous group communication!
[Figure: the passive (primary-backup) model, showing clients (C), front ends (FE), the primary RM and the backup RMs]
Figure adapted from Instructor’s Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design Edn. 5 © Pearson Education 2012 – based on Figure 18.3
Steps of passive replication
1. Request
– Front end issues request with unique ID
2. Coordination
– Primary checks if request has been carried out; if so, returns cached response
3. Execution
– Perform operation, cache results
4. Agreement
– Primary sends updated state to backups, backups reply with Ack.
5. Response
– Primary sends result to front end, which forwards to the client
[Figure: the passive replication diagram annotated with steps 1 to 5]
What happens if the primary RM crashes?
● Before agreement
● After agreement
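The five steps of passive replication can be sketched in a few lines of Python. This is a minimal single-process illustration, not a real implementation: the class names `Primary` and `Backup`, the key/value state and the `"ack"` string are all assumptions made for this example, and crashes, elections and view-synchronous communication are left out.

```python
class Backup:
    def __init__(self):
        self.state = {}
        self.responses = {}

    def apply_update(self, request_id, state, response):
        # Agreement (backup side): install the primary's state and the
        # cached response, then acknowledge.
        self.state = dict(state)
        self.responses[request_id] = response
        return "ack"


class Primary:
    def __init__(self, backups):
        self.state = {}
        self.responses = {}  # request ID -> cached response
        self.backups = backups

    def handle(self, request_id, key, value):
        # 1. Request: the front end has attached a unique request_id.
        # 2. Coordination: a repeated request gets the cached response.
        if request_id in self.responses:
            return self.responses[request_id]
        # 3. Execution: perform the operation and cache the result.
        self.state[key] = value
        response = f"ok:{key}={value}"
        self.responses[request_id] = response
        # 4. Agreement: send the updated state to all backups, wait for acks.
        acks = [b.apply_update(request_id, self.state, response)
                for b in self.backups]
        if any(ack != "ack" for ack in acks):
            raise RuntimeError("a backup did not acknowledge the update")
        # 5. Response: returned to the front end, which forwards it.
        return response


backups = [Backup(), Backup()]
primary = Primary(backups)
print(primary.handle("req-1", "x", 1))   # ok:x=1
print(primary.handle("req-1", "x", 99))  # ok:x=1 (duplicate, served from cache)
```

Because the backups also store the cached response, a backup that takes over as primary could answer a retransmitted request without executing it twice.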
Active replication
● RMs play equivalent roles
● All replica managers carry out all operations
● Front ends multicast one request at a time (FIFO)
● Requests are totally ordered
● Implements sequential consistency
● Tolerate Byzantine failures
[Figure: the active replication model, showing clients (C), front ends (FE) and replica managers (RM), annotated with steps 1 to 5]
Figure adapted from Instructor’s Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design Edn. 5 © Pearson Education 2012 – based on Figure 18.4

Steps of active replication
1. Request
– Front end adds unique identifier to request, multicasts to RMs
2. Coordination
– Totally ordered request delivery to RMs
3. Execution
– Each RM executes request
4. Agreement
– Not needed
5. Response
– All RMs respond to front end; front end interprets the responses and forwards a response to the client

Advantages of active replication
● Simple
– Same code everywhere
● Failure transparent

Comparing active and passive replication
● Both handle crash failures (but differently)
● Only active can handle arbitrary failures
● Passive may suffer from large overheads
Optimizations?
● Send “reads” to backups in passive
– Lose linearizability property!
● Send “reads” to specific RM in active
– Lose fault tolerance
● Exploit commutativity of requests to avoid ordering requests in active
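The active replication steps, and the reason this model can mask an arbitrary failure, can be illustrated with a small Python sketch. Everything here is a simplifying assumption for illustration: the classes `ReplicaManager`, `FaultyReplicaManager` and `FrontEnd` are hypothetical, the replicated object is a single counter, and a plain loop stands in for totally ordered multicast.

```python
from collections import Counter


class ReplicaManager:
    def __init__(self):
        self.counter = 0
        self.seen = set()

    def deliver(self, request_id, amount):
        # 3. Execution: every RM executes every request (once).
        if request_id not in self.seen:
            self.seen.add(request_id)
            self.counter += amount
        return self.counter


class FaultyReplicaManager(ReplicaManager):
    def deliver(self, request_id, amount):
        super().deliver(request_id, amount)
        return -1  # arbitrary (Byzantine) wrong answer


class FrontEnd:
    def __init__(self, rms):
        self.rms = rms
        self.next_id = 0

    def request(self, amount):
        # 1. Request: attach a unique identifier and send to all RMs.
        request_id = self.next_id
        self.next_id += 1
        # 2. Coordination: one front end calling the RMs in a loop gives
        #    every RM the same order (stand-in for total-order multicast).
        replies = [rm.deliver(request_id, amount) for rm in self.rms]
        # 4. Agreement: not needed.
        # 5. Response: interpret the replies by majority vote.
        value, votes = Counter(replies).most_common(1)[0]
        if votes <= len(self.rms) // 2:
            raise RuntimeError("no majority among replica managers")
        return value


fe = FrontEnd([ReplicaManager(), ReplicaManager(), FaultyReplicaManager()])
print(fe.request(5))  # 5
print(fe.request(2))  # 7
```

With three RMs, the two correct replies outvote the single faulty one, so the client never sees the wrong answer.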
Semi Active Replication
● Intermediate solution between active and passive replication
● Main difference with active replication
– Each time replicas have to make a non-deterministic decision, a process, called the leader, makes the choice and sends it to the followers
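A minimal Python sketch of the leader/follower idea follows. It is an illustration under assumptions: the classes `Replica` and `Leader` are hypothetical, and a random number stands in for whatever non-deterministic decision the replicas would otherwise make independently.

```python
import random


class Replica:
    def __init__(self):
        self.log = []

    def apply(self, request, choice):
        # Followers never decide themselves; they apply the leader's choice.
        self.log.append((request, choice))


class Leader(Replica):
    def __init__(self, followers, rng):
        super().__init__()
        self.followers = followers
        self.rng = rng

    def handle(self, request):
        # The non-deterministic decision is made exactly once, by the leader.
        choice = self.rng.randint(0, 999)
        self.apply(request, choice)
        # The choice is then sent to the followers so all replicas agree.
        for follower in self.followers:
            follower.apply(request, choice)
        return choice


followers = [Replica(), Replica()]
leader = Leader(followers, random.Random())
leader.handle("a")
leader.handle("b")
print(all(f.log == leader.log for f in followers))  # True
```

If each replica drew its own random number instead, the logs would diverge, which is exactly the problem semi-active replication avoids.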
Problem
● How do you make replicas
– In P2P systems
– Cloud systems
– RAID 5 and RAID 6
Option 1:
– Make replicas and copy the data :)
Option 2:
– Use coding theory to come up with something intelligent
  ● Network coding
  ● Erasure coding
Replication vs coding
What is erasure coding?
– Divide the file into m pieces
– Run an erasure coding algorithm on the pieces to produce m+n pieces
– You will be able to reconstruct the file if you have any m pieces
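The simplest instance of this idea is a single XOR parity piece, i.e. n = 1: any m of the m+1 pieces are enough to rebuild the file. The Python sketch below shows only that special case; it is not the Reed–Solomon style coding needed for n > 1, and the function names `encode` and `decode` and the zero-byte padding are assumptions made for this illustration.

```python
from functools import reduce


def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data, m):
    """Split data into m equal pieces and append one XOR parity piece."""
    piece_len = -(-len(data) // m)  # ceiling division
    padded = data.ljust(piece_len * m, b"\0")
    pieces = [padded[i * piece_len:(i + 1) * piece_len] for i in range(m)]
    parity = reduce(xor_bytes, pieces)
    return pieces + [parity]


def decode(pieces, original_len):
    """Rebuild the data from m+1 pieces, at most one of which is None."""
    missing = [i for i, p in enumerate(pieces) if p is None]
    if len(missing) > 1:
        raise ValueError("a single parity piece tolerates only one loss")
    pieces = list(pieces)
    if missing:
        present = [p for p in pieces if p is not None]
        # XOR of all surviving pieces equals the lost piece.
        pieces[missing[0]] = reduce(xor_bytes, present)
    return b"".join(pieces[:-1])[:original_len]


data = b"replication vs erasure coding"
pieces = encode(data, 4)  # 4 data pieces + 1 parity piece
pieces[2] = None          # lose any one piece
print(decode(pieces, len(data)) == data)  # True
```

Tolerating more than one lost piece requires a stronger code, which is where Galois field arithmetic comes in.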
Example: Replication vs Erasure coding (1)
● Suppose you have a large file that you want to replicate
● One large file, let us say, of size 1 TB
● One large distributed system with 10000 servers
– If you replicate the file on 3 machines in your network, you require 3 TB to host the file and its replicas
– To have higher redundancy, you need more space
– If the three machines fail, the file is lost
Some probability
For replication
● Let $\varepsilon$ be the maximum probability of unavailability tolerated for an object $o$
● $a$ is the average node availability
– Assuming the $k$ replicas fail independently:
$\varepsilon = P(\text{object } o \text{ is unavailable})$
$= P(\text{all } k \text{ replicas of } o \text{ are unavailable})$
$= P(\text{one replica is unavailable})^k$
$= (1 - a)^k$
– Taking the log of both sides:
$k = \log \varepsilon / \log(1 - a)$
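The formula for k is easy to evaluate numerically. The sketch below is a small illustration: the function name `replicas_needed` is hypothetical, and rounding up to the next whole replica is an addition of this example, since the formula itself generally gives a non-integer value.

```python
import math


def replicas_needed(epsilon, a):
    """Smallest whole number of replicas k with (1 - a)**k <= epsilon.

    epsilon: maximum tolerated probability that the object is unavailable.
    a: average node availability. Replicas are assumed to fail independently.
    """
    if not (0 < epsilon < 1 and 0 < a < 1):
        raise ValueError("epsilon and a must both lie strictly between 0 and 1")
    return math.ceil(math.log(epsilon) / math.log(1 - a))


print(replicas_needed(0.001, 0.8))  # 5
print(replicas_needed(0.01, 0.5))   # 7
```

For example, with nodes that are available 80% of the time, four replicas leave an unavailability of 0.2^4 = 0.0016, which is still above 0.001, so a fifth replica is needed.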
Example: Replication vs Erasure coding (2)
● Take the same file
– But chop it into 10 parts (m) of equal size, i.e., 100 GB each
– Set n in the erasure coding algorithm to 5
– Run the algorithm to produce m+n pieces, all of size 100 GB
– Distribute them on 15 machines out of the 10000 machines, i.e., total disk size used is 1.5 TB
– Now, if up to 5 of the 15 machines fail, you will still be able to reproduce the file
And that is black magic
(Using Galois fields and XORs)
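The storage figures from the two examples can be reproduced with a few lines of arithmetic. The function names below are assumptions made for this illustration, and 1 TB is taken as 1000 GB, as the slide's 100 GB pieces imply.

```python
def replication_storage_gb(file_gb, copies):
    """Total storage when the whole file is stored `copies` times."""
    return file_gb * copies


def erasure_storage_gb(file_gb, m, n):
    """Total storage for m data pieces plus n coded pieces of equal size."""
    if file_gb % m != 0:
        raise ValueError("this sketch assumes the file splits evenly into m pieces")
    piece_gb = file_gb // m
    return piece_gb * (m + n)


FILE_GB = 1000  # the 1 TB file from the example

print(replication_storage_gb(FILE_GB, 3))  # 3000
print(erasure_storage_gb(FILE_GB, 10, 5))  # 1500
```

Three full copies cost 3 TB and survive the loss of 2 machines, while the m = 10, n = 5 code costs 1.5 TB and survives the loss of any 5 of its 15 machines.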
You're just too good to be true
– Sounds like we just solved the problem of data replication
– But have we?
– Can you think of why people are still using good ol' normal replication?

Points against coding
● Complexity added to the system
– More complex systems, more bugs, harder testing, longer implementation times
● Download/read latency
– Now you need to get your data from m machines with variable latency
● What if you just want to read the first 100 lines in a text file?
– Easy with replication
– Not easy with coding
Summary
Required Readings
● Erasure Coding vs. Replication: A Quantitative Comparison
https://docs.switzernet.com/people/emin-gabrielyan/060112-capillaryreferences/ref/Weatherspoon02.pdf
● Understanding Replication in Databases and Distributed Systems (Until page 11, the rest is highly recommended to read, but optional)
http://infoscience.epfl.ch/record/52326/files/IC_TECH_REPORT_199935.pdf

Optional Readings
● Extra reading (some bonus questions will be based on this paper)
– “A Tutorial on Reed–Solomon Coding for Fault-Tolerance in RAID-like Systems” by James S. Plank
– Available: http://web.eecs.utk.edu/~plank/plank/papers/CS-96-332.pdf
– Note, you probably know everything you need as background to understand this. It will take some of you outside your comfort zone (Mathematics, yucky!), but it is worth your effort!
– I will be happy to help anyone after the 2nd of October on this :)

Next Lecture
Consistency