Chapter 3, Self-Stabilization
Shlomi Dolev, MIT Press, 2000
© Shlomi Dolev, All Rights Reserved

Chapter 3: Motivating Self-Stabilization

Converging to a desired behavior from any initial state enables the algorithm to converge from an arbitrary state caused by faults.
Why should one be interested in self-stabilizing algorithms? Because of their applicability to distributed systems.
Example: recovering from faults in a space shuttle. Faults may cause a malfunction for a while. Using a self-stabilizing algorithm for the shuttle's control results in automatic recovery and enables the shuttle to continue its task.

What is a Self-Stabilizing Algorithm?

This question is answered using the “Stabilizing Orchestra” example.
The problem:
The conductor is unable to participate, so harmony is achieved by players listening to their neighboring players.
It is a windy evening: the wind can turn some pages in the score, and the players may not notice the change.

The “Stabilizing Orchestra” Example

Our goal: to guarantee that harmony is achieved at some point following the last undesired page turn.
Imagine that the drummer notices that the violin player next to him is on a different page. Candidate solutions and their problems:
1. The drummer turns to his neighbor's page. What if the violin player noticed the difference as well?
2. Both the drummer and the violin player start from the beginning. What if the player next to the violin player notices the change only after the other two have synchronized?

The “Stabilizing Orchestra” Example: the Self-Stabilizing Solution

Every player joins the neighboring player who is playing the earliest page (his own page included).
Note that the score has a bounded length. What happens if a player goes to the first page of the score before harmony is achieved? This case is discussed in detail in Chapter 6.
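The joining rule of the orchestra example can be illustrated with a small simulation. This is only a sketch under assumptions that are not in the slides: the players sit in a line, they act in synchronous rounds, every player advances one page per round, and the bounded length of the score is ignored (pages are plain integers that never wrap around). The names `play_round` and `rounds_until_harmony` are hypothetical.

```python
def play_round(pages):
    """One synchronous round (assumed model): each player joins the earliest
    page among itself and its immediate neighbors, then plays one page on."""
    n = len(pages)
    joined = []
    for k in range(n):
        nearby = pages[max(0, k - 1):min(n, k + 2)]  # left neighbor, self, right neighbor
        joined.append(min(nearby))
    return [page + 1 for page in joined]


def rounds_until_harmony(pages, limit=1000):
    """Repeat rounds until all players are on the same page (or limit is hit)."""
    rounds = 0
    while len(set(pages)) > 1 and rounds < limit:
        pages = play_round(pages)
        rounds += 1
    return rounds, pages


if __name__ == "__main__":
    # Pages after the wind has turned some of them arbitrarily.
    print(rounds_until_harmony([7, 3, 9, 5]))
```

In this model the player on the earliest page pulls its neighbors along, so the number of rounds needed is at most the distance from that player to the farthest player in the line.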
In every long enough period in which the wind does not turn a page, the orchestra resumes playing in synchrony.

Chapter 3: roadmap
3.1 Initialization of a Data-Link Algorithm in the Presence of Faults
3.2 Arbitrary Configuration Because of Crashes
3.3 Frequently Asked Questions

The Data-Link Algorithm

Delivering a message is a complex task, during which the message may be corrupted or even lost.
The sender sends sequences of bits to the receiver.
[Figure: the layers involved at the sender and at the receiver (Network Layer, Data-Link Layer, Physical Layer); a packet is carried in a frame that adds a head and a tail.]

The alternating-bit algorithm

The algorithm is used to cope with the possibility of frame corruption or loss. Every message from the sender is repeatedly sent in a frame to the receiver until an acknowledgment arrives.

Sender
    initialization
    begin
        i := 1
        bit_s := 0
        send(bit_s, im_i)    (* im_i is fetched *)
    end    (* end initialization *)
    upon a timeout
        send(bit_s, im_i)
    upon frame arrival
    begin
        receive(FrameBit)
        if FrameBit = bit_s then    (* acknowledgment *)
        begin
            bit_s := (bit_s + 1) mod 2
            i := i + 1
        end
        send(bit_s, im_i)    (* im_i is fetched *)
    end

Receiver
    initialization
    begin
        j := 1
        bit_r := 1
    end    (* end initialization *)
    upon frame arrival
    begin
        receive(FrameBit, msg)
        if FrameBit ≠ bit_r then
        begin
            bit_r := FrameBit
            j := j + 1
            om_j := msg
        end
        send(bit_r)    (* send acknowledgment *)
    end

The alternating-bit algorithm: run sample

[Figure: sample run. Frames <m1,0> and <m2,1> travel from S to R, and acknowledgments <0> and <1> travel from R to S. Annotated events: upon a timeout, R received m1, R received m1 again, S received ack., R received m2.]
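The sender and receiver programs of the alternating-bit algorithm can be exercised with a short simulation. The following is a minimal sketch, not the author's implementation, and it rests on assumptions that the slides do not state: both links are FIFO queues, each frame is lost independently with probability `loss` (drawn from a seeded generator), a timeout fires whenever no acknowledgment is waiting for the sender, messages are indexed from 0, and the run stops once the last message is acknowledged. The function name `alternating_bit` and its parameters are hypothetical.

```python
import random
from collections import deque


def alternating_bit(messages, loss=0.3, seed=1, max_steps=100_000):
    """Simulate the alternating-bit algorithm over lossy FIFO links (assumed model).
    Returns the list of messages delivered by the receiver, in order."""
    rng = random.Random(seed)
    to_receiver, to_sender = deque(), deque()

    def send(link, frame):
        if rng.random() >= loss:  # the frame survives; otherwise it is lost
            link.append(frame)

    delivered = []
    if not messages:
        return delivered

    # Initialization of the sender and of the receiver.
    i, bit_s = 0, 0
    bit_r = 1
    send(to_receiver, (bit_s, messages[i]))

    for _ in range(max_steps):
        # Receiver: upon frame arrival.
        if to_receiver:
            frame_bit, msg = to_receiver.popleft()
            if frame_bit != bit_r:  # a new message, not a retransmission
                bit_r = frame_bit
                delivered.append(msg)
            send(to_sender, bit_r)  # acknowledge the last accepted bit

        # Sender: upon frame arrival (if an acknowledgment is waiting).
        if to_sender:
            ack = to_sender.popleft()
            if ack == bit_s:
                bit_s = (bit_s + 1) % 2
                i += 1
                if i == len(messages):
                    return delivered
        # Sent both after handling an acknowledgment and upon a timeout.
        send(to_receiver, (bit_s, messages[i]))

    raise RuntimeError("no progress within max_steps")


if __name__ == "__main__":
    print(alternating_bit(["m1", "m2", "m3"]))
```

Starting from the initial state, duplicates created by retransmission are filtered out because the receiver delivers a message only when the frame bit differs from `bit_r`.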
Once the sender receives an acknowledgment <1>, no frame with sequence number 0 exists in the system.

There Is No Data-Link Algorithm that Can Tolerate Crashes

It is usually assumed that a crash causes the sender or the receiver to reach an initial state.
No initialization procedure exists that guarantees that every message fetched by the sender following the last crash will arrive at its destination.
The next execution demonstrates this point.
Denote:
CrashR: receiver crash
CrashS: sender crash
CrashX causes X to perform an initialization procedure.

The Pumping Technique

Reference Execution (RE) = CrashS, CrashR, sendS(fs1), receiveR(fs1), sendR(fr1), receiveS(fr1), sendS(fs2), … , receiveS(frk)
The idea: repeatedly crash the sender and the receiver and replay parts of RE in order to construct a new execution E'.
[Figure (animation): construction of E' on the links between S and R. Recoverable captions include: suppose CrashS and CrashR occurred; we let S send fsi and receive fri (i from 1 to k); S sends fs1; R receives fs1 and sends fr1; S crashes; R receives fs1, sends fr1, receives fs2, sends fr2, and R crashes; R receives fsk and sends frk; now S and R crash; if these frames are lost, no information about the message exists in the system; continue with the same technique.]

Conclusion
It is possible to show that there is no guarantee that the k-th message will be received.
We therefore require that eventually every message fetched by the sender reaches the receiver, which calls for a self-stabilizing data-link algorithm.

3.2 Arbitrary Configuration Because of Crashes

A combination of crashes and frame losses can bring the system to arbitrary processor states and, more generally, to an arbitrary configuration.

Any Configuration Can Be Reached by a Sequence of Crashes

The pumping technique is used to reach any arbitrary configuration, starting with the reference execution:
Reference Execution (RE) = CrashS, CrashR, sendS(fs1), receiveR(fs1), sendR(fr1), receiveS(fr1), sendS(fs2), … , receiveS(frk)
The technique is used to accumulate a long sequence of frames.

Reaching an Arbitrary Configuration

Our first goal: creating an execution in which RE appears i times in a row, (RE)^i.
Denote:
FrE (FsE): the sequence of frames sent by the receiver (sender) in RE.
(FrE)^i ((FsE)^i): the sequence FrE FrE … FrE (FsE FsE … FsE), repeated i times.
First we use the pumping technique to reach a configuration in which FsE appears in qs,r. For any finite i, the technique can be extended to accumulate further copies.
[Figure (animation): S and R with the links qs,r and qr,s; frames fs1, fs2, … , fsk (FsE) travel from S to R and frames fr1, fr2, … , frk (FrE) travel from R to S, interleaved with crashes of S and R.]
Continue with the same technique.

Reaching an Arbitrary Configuration

Our second goal: achieving ca (an arbitrary configuration).
Denote:
k1 (k2): the number of frames in qs,r (qr,s) in ca.
i = k1 + k2 + 2
Using the previous technique we accumulate (FsE)^i.
[Figure (animation): S, S', R, R' with the links qs,r and qr,s and the sequences (FsE)^i, (FsE)^(k1+1), and (FrE)^(k2+1). Recoverable captions (order uncertain): S replays RE using the first FrE until it reaches its desired state; R replays RE k2+1 times (losing the frames sent by it and the leftovers of (FrE)^k2 that are not in qr,s); we do the same with R, reaching the arbitrary configuration ca.]

Crash-Resilient Data-Link Algorithm, With a Bound on the Number of Frames in Transit

Crashes are not considered a severe type of fault (Byzantine faults are more severe, see Chapter 6).
The algorithm uses the initialization procedure following the crashes of S and R.
bound: the maximal number of frames that can be in transit.
When the sender receives the first <ackClean,bound+1>, it can be sure that the only label in transit is bound+1, and it can initialize the alternating-bit algorithm (R can initialize similarly).
[Figure (animation): S crashes (CrashS). S, in its after-crash state, invokes a clean procedure: it repeatedly sends <clean,1>; once S has received <ackClean,1>, it repeatedly sends <clean,2> until it receives <ackClean,2>; this continues until <ackClean,bound+1> arrives. Frames shown: <m,0>, <clean,1>, <clean,2>, <clean,bound+1>, <ackClean,bound+1>.]
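The clean procedure can be sketched as a small simulation that checks the claim about the first <ackClean,bound+1>. This is an illustration under assumptions that are not in the slides: both links are FIFO and lossless, `bound` limits the stale frames that are in transit right after the crash (in both directions together), the receiver answers every <clean,l> with <ackClean,l>, and the sender retransmits its current clean frame in every step. The function name `run_clean_procedure` and the tuple encoding of frames are hypothetical.

```python
from collections import deque


def run_clean_procedure(bound, stale_to_receiver, stale_to_sender):
    """Run the sender's clean procedure over FIFO links that start with
    arbitrary stale frames (assumed model). Returns the frames still in
    transit when the first <ackClean, bound+1> reaches the sender."""
    assert len(stale_to_receiver) + len(stale_to_sender) <= bound
    to_receiver = deque(stale_to_receiver)
    to_sender = deque(stale_to_sender)
    label = 1
    while True:
        # Sender: (re)send the clean frame with the current label.
        to_receiver.append(("clean", label))

        # Receiver: acknowledge a clean frame with the same label.
        kind, lab = to_receiver.popleft()
        if kind == "clean":
            to_sender.append(("ackClean", lab))

        # Sender: advance only on an acknowledgment of the current label.
        if to_sender:
            kind, lab = to_sender.popleft()
            if kind == "ackClean" and lab == label:
                if label == bound + 1:
                    return list(to_receiver), list(to_sender)
                label += 1


if __name__ == "__main__":
    # Stale frames left over from before the crash (arbitrary labels).
    in_sr, in_rs = run_clean_procedure(
        3, [("clean", 2), ("clean", 1)], [("ackClean", 1)]
    )
    print(in_sr, in_rs)
```

The intuition behind the bound+1 labels: each stale frame can advance the sender by at most one label, so with at most bound stale frames at least one label is acknowledged by a frame that the sender really sent after the crash, and by then the FIFO links have been flushed of stale frames.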
Crash-Resilient Data-Link Algorithm: R crashes

R receives <msg,FrameBit>, assigns FrameBit to bit_r, and delivers msg to the output queue. Then R crashes.
The problem: a copy of <msg,FrameBit> that arrives after the crash can be delivered again, leaving an extra copy of msg in the output queue.
[Figure: S sends <msg,FrameBit> to R; CrashR; bit_r = FrameBit.]

Crash-Resilient Data-Link Algorithm: R crashes

Can we guarantee at-most-once delivery, and exactly-once delivery after the last crash?
The initialization of bit_r should ensure that a message fetched after the crash will be delivered.
A solution:
S sends each message in a frame with label 0 until an acknowledgment arrives, and then sends the same message with label 1 until an acknowledgment arrives.
R delivers a message only when it arrives with label 1 immediately after label 0.

3.3 Frequently Asked Questions

What is the Rationale behind assuming that the states of the processors can be corrupted while the processors’ programs cannot?

If the program is subject to corruption, any configuration is possible.
The Byzantine model allows fewer than 1/3 of the processors to execute corrupted programs.
The program is stored in a long-term memory device, which makes it possible to:
1. Reload program statements periodically
2. Protect the memory segment using a read-only memory device

Safety Properties

Safety and liveness properties should be satisfied by a distributed algorithm.
Safety ensures avoiding bad configurations.
Liveness ensures achieving the system's goal.
The designer of a self-stabilizing algorithm wants to ensure that even if the safety property is violated, the system execution will reach a suffix in which both properties hold.
What use is an algorithm that doesn’t ensure that a car never crashes?
If the faults are severe enough to make the algorithm reach an arbitrary configuration, the car may crash no matter which algorithm is chosen.

Safety Properties …

A safety property for a car controller might be: never turn into a one-way road.
When no specification exists for this situation, the car can continue driving on this road and crash with other cars.
A self-stabilizing controller will recover from this non-legal initial state (by turning the car).

Processors Can Never Be Sure that a Safe Configuration is Reached

What use is an algorithm in which the processors are never sure about the current global state?
The question confuses the assumptions (the occurrence of transient faults) with the algorithm that is designed to fit these severe assumptions.
A self-stabilizing algorithm can be designed to start in a particular (safe) state.
A self-stabilizing algorithm is at least as good as a non-self-stabilizing one for the same task, and is in fact much better!