Dr Markus Hagenbuchner
CSCI319 Distributed Systems
Chapter 7 – Consistency & Replication

Consistency And Replication
Lecture notes based on the textbook by Tanenbaum.
Study objectives:
1. Understand the role of replication in distributed systems.
2. Explain replication strategies.
3. Understand and explain the important consistency models, and algorithms that realize a given consistency model.
4. Explain the difference between data centric consistency and client centric consistency.
5. Understand the role of conits, and explain how conits are computed.

Content
• Consistency models
  1. Data centric
  2. Client centric
• Consistency measure
• Replication strategies
  – Protocols
  – Placement

Reasons for Replication
• Replication to increase the reliability of a system.
• Replication for performance
  – Scaling in numbers
  – Scaling in geographical area
Trade-offs:
• Consistency enforcement compromises performance
• Cost of increased bandwidth for maintaining replication

Consistency
With replication comes the question of consistency:
• How are changes on one replica “experienced” by other replicas?
• How are changes “seen” by clients that access different replicas?
Consistency models define the behavior of a system of replicas.
If a system is in an inconsistent state, how can we quantify (measure) the severity of the inconsistency?
• Consistency Units (conits)

Measuring Inconsistency (1)
A (distributed) system is consistent when it adheres to a given consistency model.
Consistency model: if processes agree to obey a given set of rules, the store promises to work correctly under those rules.
– This effectively places restrictions on how read or write operations can be executed.

Measuring Inconsistency (2)
With replication in a distributed system it may not be possible to be consistent at all times.
Measuring inconsistency in a distributed system:
– Why is this useful?
– How can this be done?
One solution: conits

Measuring Inconsistency (3)
An example: Let’s assume that we have two storage containers x and y, and that these are replicated on two disks. If one machine changes the value of x (or y) on one of the replicas, the change may not be propagated immediately to the other replica. Hence, the replicas can be in an inconsistent state.
A conit can measure the degree of an inconsistency with respect to a set of storage containers. This is illustrated by the following example.

Measuring Inconsistency (4)
An example of keeping track of consistency deviations:

Replica A (conit: x=6; y=12)
  Operation           Result
  <5,B>   x:=x+2      [x=2]
  <7,B>   y:=x+1      [y=3]
  <8,A>   x:=x*2      [x=4]
  <10,B>  y:=y*x      [y=12]
  <14,A>  x:=x+2      [x=6]
  Vector clock A = (15,10)
  Order deviation = 3
  Numerical deviation = (0,3)

Replica B (conit: x=2; y=6)
  Operation           Result
  <5,B>   x:=x+2      [x=2]
  <7,B>   y:=x+1      [y=3]
  <10,B>  y:=y*x      [y=6]
  Vector clock B = (0,11)
  Order deviation = 2
  Numerical deviation = (2,12)

Measuring Inconsistency (5)
Shown on the previous slide are:
• Two replicas.
• A number of operations with respect to the conit that are scheduled for execution. For example, the operation <14,A> indicates that at logical time 14 an operation was issued at replica A, and that the associated operation is x := x+2.
• The operations in the gray box are committed operations. Committed operations have been executed locally and cannot be reversed.
Thus, the two replicas are obviously in an inconsistent state. The question we wish to answer is: how can we quantify the degree of inconsistency in this example?

Interactive slide
What is a conit?
• The Consistency Unit specifies a unit over which consistency is to be measured. A conit watches over a set of related data items.
How is the order deviation computed?
• The number of tentative (scheduled) operations at a given replica which have not yet been committed.
How is the numerical deviation computed?
• A vector with two components: the number of operations at other replicas not yet seen at a given replica, and the maximum difference (in value) between the result of the committed operations at that replica and the result of the operations at other replicas.

Interactive slide
How is the order deviation computed in the example of Measuring Inconsistency (4)?
• The number of tentative (not yet committed) operations at replica A is 3; the number of tentative operations at replica B is 2.
How is the numerical deviation computed in the example?
• For Replica A: The number of operations in the system that the replica has not yet seen is 0 (first part of the answer). The result of the committed operations is x=2; y=3, whereas the result of the tentative operations at Replica B is x=2; y=6. The difference in the value of x is 0, the difference in y is 3. The maximum of the two differences is 3. Both answers together give the numerical deviation for A = (0,3).
• For Replica B: The number of operations in the system that the replica has not yet seen is 2 (first part of the answer). The result of the committed operations is x=0; y=0 (no committed operations at B), whereas the result of the tentative operations at Replica A is x=6; y=12. The difference in the value of x is 6, the difference in y is 12. The maximum of the two differences is 12. Both answers together give the numerical deviation for B = (2,12).

Consistency Models
1. Data centric consistency models
   Concern reads and writes on shared data (e.g. shared memory, shared database, distributed file system, etc.)
2. Client centric consistency models
   Concern the consistency experienced by any one client when accessing a distributed data store.

Data-centric Consistency Models
The general organization of a logical data store, physically distributed and/or replicated across multiple processes.

Data centric consistency models
We will address two data centric consistency models:
1. Sequential consistency
2. Causal consistency
There is a third one called “Grouping operations”.
This is a technique which results in consistency between elements in a group. Hence, this is considered a consistency model as well.

Data centric consistency models
The notation used is explained by an example: the behavior of two processes operating on the same data item. The horizontal axis is time.
• Pi refers to the i-th process.
• W(x)a refers to a value ‘a’ written to a data item x.
• R(x)a refers to a value ‘a’ read from data item x.
Note: P1 and P2 may write to different replicas (as in the earlier example with replicas A and B).

Sequential Consistency (1)
Definition: A data store is sequentially consistent when:
The result of any execution is the same as if the (read and write) operations by all processes on the data store …
• were executed in some sequential order and …
• the operations of each individual process appear in this sequence in the order specified by its program.
Note that the term “appear” refers to how a process “sees” or “experiences” the result of a write operation, that is, to the read operations of a process.

Sequential Consistency (2)
Three examples: (a) and (c) are sequentially consistent data stores. (b) is a data store that is not sequentially consistent.

Sequential Consistency (2)
An example which may adopt the sequential consistency model is data replication among true replicas. A true replica may contain any information as long as all replicas contain exactly the same information. Any event that is encountered at any replica must be replicated in exactly the same order to all other replicas.

Sequential Consistency (3)
A more thorough view into the effects of sequential consistency.
Example: Three concurrently-executing processes operating in a distributed memory space. The variables involved are assumed to have been initialized to 0.
Process P1          Process P2          Process P3
x = 1;              y = 1;              z = 1;
print(y, z);        print(x, z);        print(x, y);

Note: there are 90 valid execution sequences in this example, i.e. interleavings that respect the program order of each process, and all of them are allowed under the sequential consistency model. Since the six printed values form a 6-bit output, at most 64 different outputs are conceivable, and fewer than 64 can actually occur. Let’s have a look at four of the sequences:

Sequential Consistency (4)
Four of the possible 90 execution sequences for the processes of the previous slide. The vertical axis is time:

(a)                 (b)                 (c)                 (d)
x = 1               x = 1               y = 1               y = 1
print(y, z)         y = 1               z = 1               x = 1
y = 1               print(x, z)         print(x, y)         z = 1
print(x, z)         print(y, z)         print(x, z)         print(x, z)
z = 1               z = 1               x = 1               print(y, z)
print(x, y)         print(x, y)         print(y, z)         print(x, y)
Output: 001011      Output: 101011      Output: 010111      Output: 111111

Q: Which of these four execution sequences do not violate the sequential consistency model?

Interactive slide
Which of the previous four execution sequences do not violate the sequential consistency model?
Answer: all four of them comply with the sequential consistency model.
Example of an assessment question: Consider the following situation:

Process P1          Process P2          Process P3
x = 1;              y = 1;              z = 1;
print(y, z);        print(x, z);        print(x, y);

Task: Give an execution sequence and an output which would violate the sequential consistency model.

Causal Consistency (1)
Definition: Causal consistency is another data centric consistency model. For a data store to be considered causally consistent, it is necessary that the store obeys the following condition:
Writes that are potentially causally related …
– must be seen by all processes, and
– must be seen in the same order.
Concurrent writes …
– may be seen in a different order
– on different machines.

Causal Consistency (2)
Example: This sequence is allowed with a causally-consistent store (but would violate the sequential consistency model).

Causal Consistency (3)
Example (a): A violation of a causally-consistent store.

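The ordering condition for causally related writes can be checked mechanically. The following is a minimal sketch under assumptions that are not part of the slides: the history format (`causal_pairs`, `observed`) and the function name are hypothetical, and the two small histories are illustrative ones written in the spirit of the W(x)/R(x) notation. In the first, the write of b is assumed to depend on the write of a (its writer read a first); in the second, the two writes are assumed to be concurrent.

```python
def is_causally_consistent(causal_pairs, observed):
    """Check that no process sees a causally later write before its cause.

    causal_pairs: list of (earlier, later) write labels (hypothetical format).
    observed: dict mapping a process name to the list of written values
              in the order in which that process read them.
    """
    for seen in observed.values():
        for earlier, later in causal_pairs:
            if earlier in seen and later in seen:
                if seen.index(earlier) > seen.index(later):
                    return False  # the effect was observed before its cause
    return True


# P3 reads b before a, P4 reads a before b.
reads = {"P3": ["b", "a"], "P4": ["a", "b"]}

# W(x)b causally follows W(x)a: P3 violates causal consistency.
print(is_causally_consistent([("a", "b")], reads))  # False

# W(x)a and W(x)b are concurrent: both read orders are acceptable.
print(is_causally_consistent([], reads))            # True
```

The check only covers the "same order" part of the definition; whether every process eventually sees each write cannot be decided from a finite history like this one.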
Causal Consistency (4)
Example (b): A correct sequence of events in a causally-consistent store.
The causal consistency model is particularly useful for shared distributed databases.

Grouping Operations (1)
Grouping operations:
• Are a more commonly applied synchronization technique where the aim is to keep operations between processes in a group synchronized.
• Support synchronization variables
  – Which define a synchronization point
• Allow non-exclusive access to a resource
  – But do not guarantee that the resource has been synchronized
• Allow implementation of an entry consistency model:

Grouping Operations (2)
Necessary criteria for correct entry consistency synchronization:
• An “acquire access” of a synchronization variable is not allowed to be performed until all updates to the guarded shared data have been performed with respect to the process which acquired the access.
• Before an exclusive access to a synchronization variable is allowed to be performed by a process, no other process may hold the synchronization variable.
• After the exclusive mode access to a synchronization variable has been performed, any other process’ next nonexclusive mode access to that synchronization variable may not be performed until it has performed synchronization with respect to that variable’s owner.

Grouping Operations (3)
Example: A valid event sequence for entry consistency. Acq(Lx) refers to the “acquire access” synchronization operation on variable x.

2. Client-Centric Consistency Models
• Data centric consistency models provide a system-wide consistency model on a shared data structure.
• In contrast, client centric consistency is consistency from a single client’s point of view.
• Realizes eventual consistency.
• Common client centric models:
  1. Monotonic reads
  2. Monotonic writes
  3. Read your writes
  4. Writes follow reads

Eventual Consistency
Client centric consistency: An illustration of the principle of a mobile user accessing different replicas of a distributed database.

Monotonic Reads (1)
Definition: A data store is said to provide monotonic-read consistency if the following condition holds:
If a process reads the value of a data item x, then any successive read operation on x by that process:
– will always return that same value
– or a more recent value.

Monotonic Reads (2)
The read operations performed by a single process P at two different local copies of the same data store.
Example (a): A monotonic-read consistent data store.
Here, the notation is: xi is the version of item x at location i, WS(xi) is the result of a write to xi at a local replica, and WS(xi,xj) is a subsequent write to x at location j based on the result of xi (in this sequence). WS can be interpreted as a “Write executed by the System”.

Monotonic Reads (3)
The read operations performed by a single process P at two different local copies of the same data store.
Example (b): A data store that does not provide monotonic reads. This violates the monotonic-read consistency model since WS(x2) does not guarantee that all changes due to WS(x1) have been performed at L2.

Monotonic Reads (3)
Example: Email system
A client reading Emails by accessing a locally available replica can expect to see the same Emails when accessing another replica at a later time. New Email may arrive (and hence be added to the user’s Email database) in between two reads. In this case, the client can expect to see all the old Emails as well as the new Emails. In other words, the Email client will never get to see an older version of the Email database when accessing Email at different replicas in the system.
Such behavior is guaranteed by the monotonic read consistency model.

Monotonic Writes (1)
Definition: In a monotonic-write consistent store, the following condition holds:
A write operation by a process on a data item x …
– is completed before any successive write operation on x
– by the same process.

Monotonic Writes (2)
The write operations performed by a single process P at two different local copies of the same data store.
Example (a): A monotonic-write consistent data store.

Monotonic Writes (3)
The write operations performed by a single process P at two different local copies of the same data store.
Example (b): A data store that does not provide monotonic-write consistency.
The monotonic writes consistency model is particularly useful for distributed database systems.

Read Your Writes (1)
Definition: A data store is said to provide read-your-writes consistency if the following condition holds:
The effect of a write operation by a process on data item x …
– will always be seen by a successive read operation on x
– by the same process.
This is also known as the UNIX semantics.

Read Your Writes (2)
Example (a): A data store that provides read-your-writes consistency.

Read Your Writes (3)
Example (b): A data store that does not provide read-your-writes consistency.

Writes Follow Reads (1)
Definition: A data store is said to provide writes-follow-reads consistency if the following holds:
A write operation by a process …
– on a data item x following a previous read operation on x by the same process is guaranteed to take place on the same or a more recent value of x than the one that was read.

Writes Follow Reads (2)
Example (a): A writes-follow-reads consistent data store.

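One possible way to make the read-your-writes and writes-follow-reads definitions concrete is to let each client remember the writes it depends on and to refuse an operation at a copy that has not yet applied them. The sketch below is only an illustration under assumptions that do not come from the slides: the class names `Replica` and `Session`, the string write identifiers, and the in-memory replicas are all hypothetical.

```python
class Replica:
    """In-memory stand-in for one local copy of the data store (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.applied = set()   # identifiers of the writes applied at this copy
        self.value = None      # current value of data item x

    def apply(self, write_id, value):
        self.applied.add(write_id)
        self.value = value


class Session:
    """Tracks the writes a single client has issued or has read from."""

    def __init__(self):
        self.write_set = set()  # writes issued by this client
        self.read_set = set()   # writes whose effects this client has read

    def read(self, replica):
        # Read your writes: the copy must already hold all of our own writes.
        if not self.write_set <= replica.applied:
            raise RuntimeError(f"{replica.name} has not seen this client's writes")
        self.read_set |= replica.applied
        return replica.value

    def write(self, replica, write_id, value):
        # Writes follow reads: the copy must hold every write we have read from.
        if not self.read_set <= replica.applied:
            raise RuntimeError(f"{replica.name} is older than what was read")
        replica.apply(write_id, value)
        self.write_set.add(write_id)


l1, l2 = Replica("L1"), Replica("L2")
session = Session()
session.write(l1, "w1", "a")      # the client writes x := a at L1
print(session.read(l1))           # allowed: L1 holds w1
try:
    session.read(l2)              # L2 has not received w1 yet
except RuntimeError as err:
    print("refused:", err)
l2.apply("w1", "a")               # w1 is propagated to L2
print(session.read(l2))           # now allowed
```

Refusing the operation is the simplest reaction; a real store could instead forward the request to an up-to-date copy or wait until the missing writes have arrived.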
Writes Follow Reads (3)
Example (b): A data store that does not provide writes-follow-reads consistency.
Writes-follow-reads consistency is useful, for example, in distributed user forums or newsgroups.

Realizing Data-centric Consistency Models
There are many ways by which each of the consistency models can be realized.
One example: Most strategies for realizing sequential consistency make use of primary-based protocols such as
– Remote-write protocols
– Local-write protocols
These are also called primary-backup protocols. The reason for this nomenclature will become clear in the following:

Remote-Write Protocols
The principle of a primary-backup protocol.

Local-Write Protocols
Primary-backup protocol in which the primary migrates to the process wanting to perform an update.

Realizing Client-centric Consistency Models
Again, there are many ways by which each of the consistency models can be realized.
An example: Monotonic-read consistency
– Each write operation is assigned a unique identifier wid.
– Each server has a globally unique identifier sid.
– Propagation of a write includes passing its sid and wid.
– We can now determine whether a write has taken place at a local copy before a subsequent read of the data item by the same client is performed.

Design issues
• Where to place replicas (server, data)?
• What to propagate?
• Replication strategies
  – Server or client initiated?
  – Pull or push protocol?
  – Remote or local write protocol?
  – How to realize a consistency model?

Replica-Server Placement (1)
Replica-server placement can be non-trivial if a large number of replicas are to be managed. A solution would be to segment replicas by using a regular grid as in the following example.
But then choosing a proper cell size for server placement can be an issue.

Replica-Server Placement (2)
More advanced strategies for replica-server placement include clustering strategies such as LVQ, k-means, average distance analysis, etc. These are well-known and scalable machine learning methods. However, we will not go into the details of the underlying algorithms here; the Data Mining and Machine Learning subjects (postgraduate subjects) cover these.

State versus Operations
Possibilities for what is to be propagated:
1. Propagate only a notification of an update.
2. Transfer data from one copy to another.
3. Propagate the update operation to other copies.

Pull versus Push Protocols
A replica can be maintained either by the server or by the client. Depending on this, we distinguish between push-based protocols (the replica is maintained by a server) and pull-based protocols (the replica is maintained by the client) in the case of multiple-client, single-server systems.
Push and pull protocols can be combined to gain certain advantages (e.g. clients acquire a lease, and the server pushes updates until the lease expires).

Consistency Protocols
A consistency protocol:
• Describes an implementation of a specific consistency model, e.g. one in which a certain level of out-of-datedness is acceptable in certain situations.
• Creates a bound on
  – Numerical deviation
  – Staleness deviation
  – Order deviation
Conits are a way to compute these deviations.

Continuous Consistency
Bounding numerical deviation:
– Allow a numerical upper bound on the deviation in the number of transactions.
Bounding staleness deviation:
– Achieved through vector clocks and clock synchronization.
Bounding order deviation:
– E.g., refuse writes until a sufficient number of tentative writes are committed.

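The order and numerical deviations can be recomputed for the two-replica example from Measuring Inconsistency (4). The following is a minimal sketch, not the textbook’s algorithm: the `Op` tuple, the helper names, and the bound `MAX_TENTATIVE` are hypothetical, and the split into committed and tentative operations is taken from the worked answers in these notes (at replica A the operations <5,B> and <7,B> are committed, at replica B nothing is committed).

```python
from collections import namedtuple

# One write on the conit: logical time, issuing replica, effect on the state.
Op = namedtuple("Op", "time origin apply")

ops = {
    (5, "B"): Op(5, "B", lambda s: s.update(x=s["x"] + 2)),
    (7, "B"): Op(7, "B", lambda s: s.update(y=s["x"] + 1)),
    (8, "A"): Op(8, "A", lambda s: s.update(x=s["x"] * 2)),
    (10, "B"): Op(10, "B", lambda s: s.update(y=s["y"] * s["x"])),
    (14, "A"): Op(14, "A", lambda s: s.update(x=s["x"] + 2)),
}

log_a = [(5, "B"), (7, "B"), (8, "A"), (10, "B"), (14, "A")]
committed_a = log_a[:2]
log_b = [(5, "B"), (7, "B"), (10, "B")]
committed_b = []


def run(keys):
    """Apply the given operations in order, starting from x = y = 0."""
    state = {"x": 0, "y": 0}
    for key in keys:
        ops[key].apply(state)
    return state


def order_deviation(log, committed):
    """Number of tentative (not yet committed) operations at a replica."""
    return len(log) - len(committed)


def numerical_deviation(log, committed, other_log):
    """(operations not yet seen, maximum difference in value)."""
    unseen = len(set(other_log) - set(log))
    mine, theirs = run(committed), run(other_log)
    return unseen, max(abs(mine[v] - theirs[v]) for v in mine)


MAX_TENTATIVE = 3  # hypothetical bound on the order deviation


def accepts_write(log, committed):
    """Refuse new writes while too many tentative writes are pending."""
    return order_deviation(log, committed) < MAX_TENTATIVE


print(order_deviation(log_a, committed_a))             # 3
print(numerical_deviation(log_a, committed_a, log_b))  # (0, 3)
print(numerical_deviation(log_b, committed_b, log_a))  # (2, 12)
print(accepts_write(log_a, committed_a))               # False
```

With the assumed bound of 3, replica A refuses further writes until at least one of its tentative operations has been committed, which is the behavior described for bounding order deviation.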
Summary
• Consistency models
  – Client centric
  – Data centric
• (In-)consistency measures
• Design issues
  – Replication strategies
  – Placement
  – Protocols