CPSC 668 Distributed Algorithms and Systems, Fall 2009, Prof. Jennifer Welch
Set 16: Distributed Shared Memory

Distributed Shared Memory
• A model for inter-process communication
• Provides the illusion of shared variables on top of message passing
• Shared memory is often considered a more convenient programming platform than message passing
• Formally, we give a simulation of the shared memory model on top of the message passing model
• We'll consider the special case of
  – no failures
  – only read/write variables to be simulated

The Simulation
[Figure: layered architecture. The users of read/write shared memory sit on top, issuing read/write invocations to and receiving return/ack responses from the simulation algorithms alg0, ..., algn-1, which together present the shared memory interface and communicate via send/recv on the underlying message passing system.]

Shared Memory Issues
• A process invokes a shared memory operation (read or write) at some time
• The simulation algorithm running on the same node executes some code, possibly involving exchanges of messages
• Eventually the simulation algorithm informs the process of the result of the shared memory operation
• So shared memory operations are not instantaneous!
  – operations (invoked by different processes) can overlap
• What values should be returned by operations that overlap other operations?
  – defined by a memory consistency condition

Sequential Specifications
• Each shared object has a sequential specification: it specifies the behavior of the object in the absence of concurrency.
• The object supports operations
  – invocations
  – matching responses
• The specification is the set of sequences of operations that are legal.

Sequential Spec for R/W Registers
• Each operation has two parts, invocation and response
• A read operation has invocation readi(X) and response returni(X,v)
• A write operation has invocation writei(X,v) and response acki(X)
• A sequence of operations is legal iff each read returns the value of the latest preceding write.
• Ex: write0(X,3) ack0(X) read1(X) return1(X,3)

Memory Consistency Conditions
• Consistency conditions tie together the sequential specification with what happens in the presence of concurrency.
• We will study two well-known conditions:
  – linearizability
  – sequential consistency
• We will only consider read/write registers, in the absence of failures.

Definition of Linearizability
• Suppose σ is a sequence of invocations and responses.
  – an invocation is not necessarily immediately followed by its matching response
• σ is linearizable if there exists a permutation π of all the operations in σ (in which each invocation is immediately followed by its matching response) such that
  – π|X is legal (satisfies the sequential spec) for every object X, and
  – if the response of operation O1 occurs in σ before the invocation of operation O2, then O1 occurs in π before O2 (π respects the real-time order of non-overlapping operations in σ).

Linearizability Examples
Suppose there are two shared variables, X and Y, both initially 0.
[Figure: p0 performs write(X,1) ack(X) and then read(Y) return(Y,1); p1 performs write(Y,1) ack(Y) and then read(X) return(X,1).]
Is this sequence linearizable? Yes: order the operations write(X,1), write(Y,1), read(Y), read(X).
What if p1's read of X returns 0? No: the write of X finishes before the read of X begins, so the read must return 1.
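The two definitions above are concrete enough to check mechanically for small histories. The sketch below is illustrative only (the operation encoding and helper names are my own, not the lecture's): it tests legality of a completed operation sequence for read/write registers, and brute-forces linearizability by trying every permutation.

```python
from itertools import permutations

# An operation is a tuple (proc, kind, var, value, inv_time, resp_time);
# kind is "read" or "write", and the two times bound the operation's interval.

def legal(seq):
    """Sequential spec for read/write registers: every read returns the value
    of the latest preceding write to the same variable (0 if there is none)."""
    current = {}
    for (_, kind, var, val, _, _) in seq:
        if kind == "write":
            current[var] = val
        elif val != current.get(var, 0):
            return False
    return True

def linearizable(history):
    """Brute force: some permutation is legal and respects the real-time
    order of non-overlapping operations (O(n!), tiny histories only)."""
    for perm in permutations(history):
        pos = {op: i for i, op in enumerate(perm)}
        respects_time = all(pos[o1] < pos[o2]
                            for o1 in history for o2 in history
                            if o1[5] < o2[4])   # o1 responds before o2 is invoked
        if respects_time and legal(perm):
            return True
    return False

# A history in the spirit of the first example (both reads return 1): linearizable.
h1 = [("p0", "write", "X", 1, 0, 1), ("p1", "write", "Y", 1, 2, 3),
      ("p0", "read", "Y", 1, 4, 5), ("p1", "read", "X", 1, 6, 7)]
# The variant where the read of X returns 0: not linearizable.
h2 = [("p0", "write", "X", 1, 0, 1), ("p1", "write", "Y", 1, 2, 3),
      ("p0", "read", "Y", 1, 4, 5), ("p1", "read", "X", 0, 6, 7)]
print(linearizable(h1), linearizable(h2))   # True False
```

Restricting the real-time check to pairs of operations at the same process turns the same search into a test for sequential consistency, the condition defined next.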
Definition of Sequential Consistency
• Suppose σ is a sequence of invocations and responses.
• σ is sequentially consistent if there exists a permutation π of all the operations in σ such that
  – π|X is legal (satisfies the sequential spec) for every object X, and
  – if the response of operation O1 occurs in σ before the invocation of operation O2 at the same process, then O1 occurs in π before O2 (π respects the order of operations by the same process in σ).

Sequential Consistency Examples
Suppose there are two shared variables, X and Y, both initially 0.
[Figure: p0 performs write(X,1) ack(X) and then read(Y) return(Y,1); p1 performs write(Y,1) ack(Y) and then read(X) return(X,0).]
Is this sequence sequentially consistent? Yes: order the operations write(Y,1), read(X), write(X,1), read(Y).
What if p0's read of Y also returns 0? No: each read would then have to precede the other process's write in π while following its own process's write, which is impossible.

Specification of Linearizable Shared Memory Comm. System
• Inputs are invocations on the shared objects
• Outputs are responses from the shared objects
• A sequence σ is in the allowable set iff
  – Correct Interaction: each process alternates invocations and matching responses
  – Liveness: each invocation has a matching response
  – Linearizability: σ is linearizable

Specification of Sequentially Consistent Shared Memory
• Inputs are invocations on the shared objects
• Outputs are responses from the shared objects
• A sequence σ is in the allowable set iff
  – Correct Interaction: each process alternates invocations and matching responses
  – Liveness: each invocation has a matching response
  – Sequential Consistency: σ is sequentially consistent

Algorithm to Implement Linearizable Shared Memory
• Uses totally ordered broadcast as the underlying communication system.
• Each process keeps a replica of each shared variable.
• When a read request arrives:
  – send a bcast msg containing the request
  – when own bcast msg arrives, return the value in the local replica
• When a write request arrives:
  – send a bcast msg containing the request
  – upon receipt, each process updates its replica's value
  – when own bcast msg arrives, respond with ack

The Simulation
[Figure: the same layered picture as before, except that the simulation algorithms alg0, ..., algn-1 now use to-bc-send and to-bc-recv on an underlying totally ordered broadcast service instead of point-to-point send/recv.]

Correctness of Linearizability Algorithm
• Consider any admissible execution α of the algorithm:
  – the underlying totally ordered broadcast behaves properly
  – users interact properly
• Show that σ, the restriction of α to the events of the top interface, satisfies Liveness and Linearizability.

Correctness of Linearizability Algorithm
• Liveness (every invocation has a response): by the Liveness property of the underlying totally ordered broadcast.
• Linearizability: define the permutation π of the operations to be the order in which the corresponding broadcasts are received.
  – π is legal: because all the operations are consistently ordered by the totally ordered bcast.
  – π respects the real-time order of operations: if O1 finishes before O2 begins, then O1's bcast is ordered before O2's bcast.
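To make the replica-based algorithm above concrete, here is a toy, single-threaded rendering (my own class and message layout, not the lecture's formal model). The totally ordered broadcast is collapsed into a helper that hands every message to every process in one global order, so "waiting for your own broadcast" happens inline; a real implementation would block until the delivery event.

```python
class TOBroadcast:
    """Toy totally ordered broadcast: every message is handed to every
    process immediately, in one global order (real versions use messages)."""
    def __init__(self, procs):
        self.procs = procs
    def send(self, msg):
        for p in self.procs:
            p.deliver(msg)

class LinearizableDSM:
    """Replica-based simulation of linearizable read/write registers."""
    def __init__(self, pid):
        self.pid = pid
        self.replica = {}        # local copy of every shared variable (initially 0)
        self.response = None     # response produced when our own bcast is delivered
    def attach(self, bcast):
        self.bcast = bcast
    # --- invocations from the user ---
    def read(self, x):
        self.bcast.send(("read", self.pid, x, None))
        return self.response     # in a real system: wait for own delivery, then respond
    def write(self, x, v):
        self.bcast.send(("write", self.pid, x, v))
        return self.response     # "ack"
    # --- delivery of a totally ordered broadcast message ---
    def deliver(self, msg):
        kind, sender, x, v = msg
        if kind == "write":
            self.replica[x] = v  # every process applies every write, in bcast order
        if sender == self.pid:   # our own message: generate the response
            self.response = self.replica.get(x, 0) if kind == "read" else "ack"

procs = [LinearizableDSM(i) for i in range(3)]
bc = TOBroadcast(procs)
for p in procs:
    p.attach(bc)
procs[0].write("X", 1)
print(procs[1].read("X"))   # 1: the write reached every replica before the read's bcast
```

Deleting the broadcast in read and answering directly from the local replica gives the sequential consistency algorithm discussed below; the next scenario shows why that cheaper read is no longer linearizable.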
Why is Read Bcast Needed?
• The bcast done for a read causes no changes to any replicas; it just delays the response to the read.
• Why is it needed?
• Let's see what happens if we remove it.

Why Read Bcast is Needed
[Figure: p1 performs write(1), and its broadcast is still in transit to p2. A read by p0 returns 1 (the update has already been applied to p0's replica), but a later, non-overlapping read by p2 returns 0 (the update has not yet reached p2). The new value followed by the old value is not linearizable.]

Algorithm for Sequential Consistency
• The linearizability algorithm, without doing a bcast for reads:
• Uses totally ordered broadcast as the underlying communication system.
• Each process keeps a replica of each shared variable.
• When a read request arrives:
  – immediately return the value stored in the local replica
• When a write request arrives:
  – send a bcast msg containing the request
  – upon receipt, each process updates its replica's value
  – when own bcast msg arrives, respond with ack

Correctness of SC Algorithm
Lemma (9.3): The local copies at each process take on all the values appearing in write operations, in the same order; this order preserves the order of non-overlapping writes, and in particular the per-process order of writes.
Lemma (9.4): If pi writes Y and later reads X, then pi's update of its local copy of Y (on behalf of that write) precedes its read of its local copy of X (on behalf of that read).

Correctness of the SC Algorithm (Theorem 9.5)
Why does sequential consistency hold?
• Given any admissible execution α, we must come up with a permutation π of the shared memory operations that
  – is legal, and
  – respects the per-process ordering of operations.

The Permutation
• Insert all writes into π in their to-bcast order.
• Consider each read R in σ, in order of invocation:
  – suppose R is a read by pi of X
  – place R in π immediately after the later of
    1. the operation by pi that immediately precedes R in σ, and
    2. the write that R "read from" (the write that caused the latest update of pi's local copy of X preceding the response for R)

Permutation Example
[Figure: p2 performs write(1) and later a read that returns 1; p1 performs write(2); p0 performs a read that returns 2. The to-bcast order puts write(1) before write(2). The permutation produced by the rule above is: write(1), p2's read (returning 1), write(2), p0's read (returning 2).]
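The construction of π is mechanical, so it can be written out directly. The function below is a sketch with my own data representation (operation labels plus, for each read, its predecessor at the same process and the write it read from, both of which are determined by the execution); it is not code from the lecture.

```python
def build_permutation(writes_in_tbcast_order, reads_in_invocation_order):
    """Construct the permutation pi used in the SC correctness proof.

    writes_in_tbcast_order: list of write labels, in to-bcast order.
    reads_in_invocation_order: list of (read_op, prev_op_same_proc, write_read_from),
      where the last two entries may be None (first operation of the process,
      or a read of the initial value)."""
    pi = list(writes_in_tbcast_order)          # writes first, in to-bcast order
    for read_op, prev_op, src_write in reads_in_invocation_order:
        # R goes immediately after the later of its two "anchors" already in pi
        anchor = -1
        for op in (prev_op, src_write):
            if op is not None:
                anchor = max(anchor, pi.index(op))
        pi.insert(anchor + 1, read_op)
    return pi

# Data matching the Permutation Example figure: to-bcast order write(1), write(2);
# p2's read follows p2's write(1) and reads from it; p0's read reads from write(2).
writes = ["w1: p2 writes 1", "w2: p1 writes 2"]
reads = [("r1: p2 reads 1", "w1: p2 writes 1", "w1: p2 writes 1"),
         ("r2: p0 reads 2", None, "w2: p1 writes 2")]
print(build_permutation(writes, reads))
# ['w1: p2 writes 1', 'r1: p2 reads 1', 'w2: p1 writes 2', 'r2: p0 reads 2']
```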
Permutation Respects Per-Proc. Ordering
For a specific process:
• The relative ordering of two writes is preserved, by Lemma 9.3.
• The relative ordering of two reads is preserved by the construction of π.
• If a write W precedes a read R in the execution σ, then W precedes R in π, by construction.
• Suppose a read R precedes a write W in σ. Show the same is true in π.

Permutation Respects Ordering
• Suppose in contradiction that R and W are swapped in π. Then:
  – there is a read R' by pi that equals R or precedes R in σ,
  – there is a write W' that equals W or follows W in the to-bcast order, and
  – R' "reads from" W'.
    σ|pi: ... R' ... R ... W ...
    π:    ... W ... W' ... R' ... R ...
• But:
  – R' finishes before W starts in σ, and
  – updates are applied to the local replicas in to-bcast order (Lemma 9.3), so the update for W' does not precede the update for W,
  – so R' cannot read from W'.

Permutation is Legal
• Consider some read R of X by pi and some write W such that R reads from W in σ.
• Suppose in contradiction that some other write W' to X falls between W and R in π:
    π: ... W ... W' ... R ...
• Why does R follow W' in π?

Permutation is Legal
Case 1: W' is also by pi.
• Then R follows W' in σ as well, since π respects pi's per-process order; so pi's update for W' precedes its local read for R (Lemma 9.4).
• The update for W at pi precedes the update for W' at pi (Lemma 9.3).
• Thus R does not read from W, contradiction.

Permutation is Legal
Case 2: W' is not by pi.
Then R follows W' in π due to some operation O, also by pi, such that
  – O precedes R in σ, and
  – O is placed between W' and R in π:
    π: ... W ... W' ... O ... R ...
Consider the earliest such O.
Case 2.1: O is a write (not necessarily to X).
• The update for W' at pi precedes the update for O at pi (Lemma 9.3).
• The update for O at pi precedes pi's local read for R (Lemma 9.4).
• So R does not read from W, contradiction.

Permutation is Legal
    π: ... W ... W' ... O ... R ...
Case 2.2: O is a read.
• By the construction of π, O must read X and in fact read from W' (otherwise O would not be placed after W').
• The update for W at pi precedes the update for W' at pi (Lemma 9.3).
• The update for W' at pi precedes the local read for O at pi (otherwise O would not read from W').
• Thus R cannot read from W, contradiction.

Performance of SC Algorithm
• Read operations are implemented "locally", without requiring any inter-process communication.
• Thus reads can be viewed as "fast": the time between invocation and response is only the time needed for some local computation.
• The time for a write is the time for delivery of one totally ordered broadcast (which depends on how to-bcast is implemented).

Alternative SC Algorithm
• It is possible to have an algorithm that implements sequentially consistent shared memory on top of totally ordered broadcast with the reverse performance:
  – writes are local/fast (bcasts are still sent, but we don't wait for them to be received)
  – reads can require waiting for some bcasts to be received
• Like the previous SC algorithm, this one does not implement linearizable shared memory.
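The lecture does not give this alternative algorithm, but one standard way to get local writes is sketched below (an illustrative, event-handler style rendering with my own names; the blocking of reads is only indicated by an assertion). Each process counts its own writes that have been broadcast but not yet delivered back; a write acks immediately after its to-bc-send, and a read may answer from the local replica only once that count is zero, so the replica already reflects all of the process's earlier writes.

```python
class LocalWriteSC:
    """Sketch of a DSM process with fast (local) writes and waiting reads.
    Writes ack right after to-bc-send; reads wait until every one of this
    process's own writes has been delivered back and applied locally."""
    def __init__(self, pid, tbc_send):
        self.pid = pid
        self.tbc_send = tbc_send   # callable handing a msg to the totally ordered bcast
        self.replica = {}          # local copy of every shared variable (initially 0)
        self.pending = 0           # own writes broadcast but not yet delivered back

    def write(self, x, v):
        self.pending += 1
        self.tbc_send(("write", self.pid, x, v))
        return "ack"               # respond immediately: the write is "fast"

    def read(self, x):
        # In a real implementation the read would block here until pending == 0.
        assert self.pending == 0, "read must wait for this process's pending writes"
        return self.replica.get(x, 0)   # then the read is purely local

    def on_deliver(self, msg):     # called for every write, in the same order everywhere
        _, sender, x, v = msg
        self.replica[x] = v
        if sender == self.pid:
            self.pending -= 1      # one of our own writes has come back
```

Waiting only for one's own writes preserves each process's program order while still allowing a read to return a value that is stale relative to other processes' writes, which matches the trade-off described above: sequentially consistent, but not linearizable.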
Time Complexity for DSM Algorithms
• One complexity measure of interest for DSM algorithms is how long it takes for operations to complete.
• The linearizability algorithm required D time for both reads and writes, where D is the maximum time for a totally ordered broadcast message to be received.
• The sequential consistency algorithm required D time for writes and C time for reads, where C is the time for doing some local computation.
• Can we do better? To answer this question, we need some kind of timing model.

Timing Model
• Assume the underlying communication system is the point-to-point message passing system (not totally ordered broadcast).
• Assume that every message has delay in the range [d - u, d].
• Claim: totally ordered broadcast can be implemented in this model so that D, the maximum time for delivery, is O(d).

Time and Clocks in Layered Model
• Timed execution: associate an occurrence time with each node input event.
• Times of other events are "inherited" from the time of the triggering node input
  – recall the assumption that local processing time is negligible.
• Model hardware clocks as before: they run at the same rate as real time, but are not synchronized.
• The notions of view, timed view, and shifting are the same:
  – the Shifting Lemma still holds (it relates hardware clocks and message delays between the original and shifted executions).

Lower Bound for SC
Let Tread = the worst-case time for a read to complete.
Let Twrite = the worst-case time for a write to complete.
Theorem (9.7): In any simulation of sequentially consistent shared memory on top of point-to-point message passing, Tread + Twrite ≥ d.

SC Lower Bound Proof
• Consider any SC simulation with Tread + Twrite < d.
• Let X and Y be two shared variables, both initially 0.
• Let α0 be an admissible execution whose top layer behavior is
  write0(X,1) ack0(X) read0(Y) return0(Y,0)
  – the write begins at time 0, and the read ends before time d
  – every msg has delay d
• Why does α0 exist?
  – The algorithm must respond correctly to any sequence of invocations.
  – Suppose the user at p0 wants to do a write, immediately followed by a read.
  – By SC, the read must return 0.
  – By assumption, the total elapsed time is less than d.

SC Lower Bound Proof
• Similarly, let α1 be an admissible execution whose top layer behavior is
  write1(Y,1) ack1(Y) read1(X) return1(X,0)
  – the write begins at time 0, and the read ends before time d
  – every msg has delay d
• α1 exists for a similar reason.
• Now merge p0's timed view in α0 with p1's timed view in α1 to create an admissible execution α'.
  – This works because every message has delay d and both executions finish before time d, so no process has received any message yet and neither p0 nor p1 can tell the difference.
• But α' is not SC (both reads return 0, which we saw earlier is not sequentially consistent), contradiction!

SC Lower Bound Proof
[Figure: timelines from time 0 to d. In α0, p0 performs write(X,1) then read(Y)→0; in α1, p1 performs write(Y,1) then read(X)→0; in the merged execution α', p0 performs write(X,1) read(Y)→0 while p1 performs write(Y,1) read(X)→0.]

Linearizability Write Lower Bound
Theorem (9.8): In any simulation of linearizable shared memory on top of point-to-point message passing, Twrite ≥ u/2.
Proof: Consider any linearizable simulation with Twrite < u/2.
• Let α be an admissible execution whose top layer behavior is:
  p1 writes 1 to X, then p2 writes 2 to X, then p0 reads 2 from X.
• Shift α to create an admissible execution in which p1's and p2's writes are swapped, causing p0's read to violate linearizability.

Linearizability Write Lower Bound
[Figure: in α, p1's write of 1 occupies [0, u/2), p2's write of 2 occupies [u/2, u), and p0's read of 2 begins at time u. Delay pattern in α: d - u/2 on the links between p0 and the other two processes, d from p1 to p2, and d - u from p2 to p1.]

Linearizability Write Lower Bound
[Figure: shift p1 later by u/2 and p2 earlier by u/2. Now p2's write of 2 finishes before p1's write of 1 begins, yet p0 still reads 2, violating linearizability. Every message delay in the shifted execution becomes d, becomes d - u, or stays unchanged, so all delays remain in [d - u, d] and the shifted execution is admissible.]

Linearizability Read Lower Bound
• The approach is similar to the write lower bound.
• Assume in contradiction there is an algorithm with Tread < u/4.
• Identify a particular execution:
  – fix a pattern of read and write invocations, occurring at particular times
  – fix the pattern of message delays
• Shift this execution to get one that is
  – still admissible
  – but not linearizable

Linearizability Read Lower Bound
Original execution:
• p1 reads X and gets 0 (the old value).
• Then p0 starts writing 1 to X.
• When the write is done, p0 reads X and gets 1 (the new value).
• Also, during the write, p1 and p2 alternate reading X.
• At some point, the reads stop getting the old value (0) and start getting the new value (1).

Linearizability Read Lower Bound
• Set all delays in this execution to be d - u/2.
• Now shift p2 earlier by u/2.
• Verify that the result is still admissible (every delay either stays the same or becomes d or d - u).
• But in the shifted execution, the sequence of values read is 0, 0, ..., 0, 1, 0, 1, 1, ..., 1
  – a read of the new value followed by a non-overlapping read of the old value, which is not linearizable.

Linearizability Read Lower Bound
[Figure: timelines for the original and shifted executions. p1 and p2 alternately read X while p0 performs write(1); after p2 is shifted earlier by u/2, one of p2's reads that returns 1 comes before a read by p1 that returns 0.]
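The two linearizability lower bounds rest on the same shifting bookkeeping: moving pi's events later by si turns the delay of a message from pi to pj from δ into δ + sj - si, and the shifted execution is admissible as long as every delay stays within [d - u, d]. The snippet below is only a numerical sanity check of that arithmetic for the write lower bound, with sample values for d and u and the delay pattern assumed from the figure above.

```python
d, u = 10.0, 4.0                                  # sample values with u <= d
shift = {"p0": 0.0, "p1": +u / 2, "p2": -u / 2}   # p1 later by u/2, p2 earlier by u/2

# Assumed delay pattern of the original execution alpha:
# d - u/2 between p0 and the others, d from p1 to p2, d - u from p2 to p1.
delay = {("p0", "p1"): d - u / 2, ("p1", "p0"): d - u / 2,
         ("p0", "p2"): d - u / 2, ("p2", "p0"): d - u / 2,
         ("p1", "p2"): d,         ("p2", "p1"): d - u}

for (src, dst), delta in delay.items():
    shifted = delta + shift[dst] - shift[src]     # new delay after the shift
    assert d - u <= shifted <= d, (src, dst, shifted)
    print(f"{src}->{dst}: {delta} becomes {shifted}")
# Every delay stays in [d - u, d], so the shifted execution is admissible,
# even though p1's and p2's writes now occur in the opposite order.
```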