Packet Efficient Implementation of the Omega Failure Detector Ω Lyon, France Quentin Bramas Dianne Foreback Mikhail Nesterenko Sébastien Tixeuil November 8, 2016 What is Omega? Oracle • Abstract construct that enables a solution to an otherwise unsolvable problem in a particular distributed system model • Encapsulates the portion of solution that is impossible to implement Consensus requires non-crashed (correct) processes to agree on a value • FLP [11]: Consensus is impossible to solve in asynchronous system model even a if single process may crash: impossible to know if a process crashed or is merely slow requires an oracle Omega - an oracle that eventually outputs the same leader at every correct process • Weakest failure detector to solve consensus in the asynchronous system model [6] • Encapsulates the least amount of synchrony for a solution to Consensus 2 Notation • Message – contents to be distributed to other processes (using packets) • Packet – portion of data transmitted across channel • Fair-lossy channel - if an infinite number of messages are sent along this channel, an infinite number of messages are received implementation efficiency metrics • Message Efficient Implementation: only a single process sends an infinite number of messages and all but finitely many messages are of constant size • Packet Efficient Implementation: if all but finitely many messages are transmitted using O(n) packets. Packets of different messages may potentially use different channels ⇒ the number of used channels is not limited • Super Packet Efficient Implementation: requires only a O(n) channels to transmit packets of an infinite number messages ⇒ the number of used channels is limited 3 Omega Implementations Implementation conditions • Omega is impossible to implement in the asynchronous system model • Must strengthen the system model with added synchrony and reliability assumptions • What is the least restrictive communication model to implement Omega? Aguilera et al. [1] algorithm • Requires at least one process to have eventually timely channels to all correct processes • Requires a (possibly different) single process to have fairlossy channels to/from all others timely channel drops all packets • Message efficient Delporte-Gallot et al. [9] algorithm determines timely channel graph • Can be used to construct Omega • Requires at least one process to have timely paths to all correct processes • Requires all channels to be reliable • Super packet efficient If channel timeliness/reliability is arbitrary, what are the necessary&sufficient conditions for Omega implementation? 4 Single-hop vs. Multi-hop Theorems 1 and 2 (summarized): If timely channel probability is 𝑝 < 1 𝑓𝑜𝑟 𝑒𝑎𝑐ℎ 𝑐ℎ𝑎𝑛𝑛𝑒𝑙 then, as network size n grows, the probability of leader existence using • Single-hop direct communication: approaches zero exponentially fast • Multiple-hop communication: approaches one exponentially fast Theorems 3 and 4 (summarized): The probability of a leader persisting while the timeliness of channels change • Single-hop tends to zero • Multiple-hop tends to infinity Practically speaking, a multi-hop omega implementation is far more likely to succeed in establishing a persistent leader v u w x t z y 5 Our Deterministic Results • Prove necessary conditions for Omega At least one process (leader) needs to have a timely path to all correct processes All processes have to have a fair-lossy path to the leader • Present an algorithm MPO that implements Omega under these conditions ⇒ the conditions are necessary and sufficient • Prove cannot have super packet-efficient algorithm ⇒ Algorithm of Delporte-Gallot et al. channel reliability requirement is necessary 6 Infinite Number of Messages Theorem 5 and Corollary 1 (summarized): In an implementation of Omega, at least one process needs to send infinitely many timely messages. This process must be the leader. Intuition: The leader needs to periodically inform other processes of its correctness or they will not be able to detect its crash x Why?: Cannot distinguish between crashed and slow process thus toggling between leaders may occur y w z v Let process x be the leader Let v, w and z crash If x stops sending timely messages, y thinks x crashed and elects itself as leader Send messages arbitrarily and now x picks y as leader Repeat as above but with y 7 No Channel Reliability/Timeliness Lemma 1 and Corollary 2 (summarized): If no channel reliability/timeliness properties, to timely deliver a message • Each recipient must send it across every outgoing channel (except possibly the channels to the leader and sender) • Ω 𝑛2 packets are needed (not packet efficient) Intuition: Cannot establish properties, have to send packets everywhere otherwise might skip be the channel that contains the only timely path leading to some process x y w z v 8 Fair-Lossy Path Lemma 2: In any message efficient implementation of Omega, each correct process must have a fair-lossy path to the leader Intuition: must be able to provide feedback if the leader is no longer timely. • Let process x be the leader • If y no longer considers x its leader and gets no other messages, y elects itself leader • y does not send fair-lossy messages so others do not know of y’s decision • Thus, no agreement on leader x y w z v 9 Eventually Timely Paths Theorem 6: For a message and packet efficient implementation • At least one process x must have an eventually timely path to all correct processes • every correct process has a fair-lossy path to the leader Intuition: • The leader x must send an infinite number of messages (so others do not think it crashed) • If a proportional number of processes do not have timely incoming channels, then all processes must forward each message • ⇒ (per Lemma 2) number of packets required for each message is in Ω 𝑛2 - x contradiction y w z v 10 Our Deterministic Results • Prove necessary conditions for Omega At least one process (leader) needs to have a timely path to all correct processes All processes have to have a fair-lossy path to the leader • Present an algorithm MPO that implements Omega under these conditions ⇒ the conditions are necessary and sufficient • Prove cannot have super packet-efficient algorithm ⇒ Algorithm of Delporte-Gallot channel reliability requirement is necessary 11 No Super Packet Efficiency Theorem 7: Even if there the leader has an eventually timely path to every correct process and every correct process has a fair-lossy path to the leader, there does not exist a message and super packet efficient implementation of Omega S1 L1 S2 eventually timely channel x y leader leader w z timely component L2 eventually reliable channel timely component Intuition: • Assume there are an equal number of processes in each component • If there is no communication between processes of the two components initially, processes in S1 assume those in S2 are crashed and vice versa • Processes in S1 elect L1 from S1 as their leader and processes in S2 elect L1 from S2 as their leader • The eventuality of the timely and reliable channels occur after the leaders are elected Thus, the conditions of the theorem exist. Yet, disagreement on the leader occurs 12 Our Deterministic Results • Prove necessary conditions for Omega At least one process (leader) needs to have a timely path to all correct processes All processes have to have a fair-lossy path to the leader • Present an algorithm MPO that implements Omega under these conditions ⇒ the conditions are necessary and sufficient • Prove cannot have super packet-efficient algorithm ⇒ Algorithm of Delporte-Gallot channel reliability requirement is necessary 13 MPO Algorithm: Features • Leader candidate estimates reliability of channels by building arboresence of timely channels • Sends packets over these channels • Gets failed messages if untimely • Process with lowest weight arborescence wins • Weight used to evaluate channel reliability • Proceeds in phases to preserve efficiency • Messages sent an infinite number of times are of constant size x y w alive(x, phase#, shoutID) y 4 z v w x startPhase(x, phase#, arb) w X z v 14 x Takes Leadership • x calculated weight of arbs from all processes and it has the lowest arb • x broadcasts startPhase sending the arb to limit the channels being used x startPhase(x, phase#, arb) y w z v 15 Leader x Sends alive Goal: alive timely received by all correct processes according to arborescence per phase#, otherwise, x keeps track of failed to build new arb should it lose leadership w x x alive(x, phase#, shoutID) y 4 w X z y w 5 v failed(x,y,z) z v Timely received via the arborescence: Not timely received by z • w, y timely receive x’s alive from parent per • z broadcasts failed, and increases current phase# (ignore if earlier phase#) y forward alive according to arborescence or shouts (broadcast) w‘s turn to shout alive • v first receives from non-parent w, waits to receive from parent y (or turns on timer if it was off) timer to gauge delay • All other processes broadcast z’s failed once • x receives failed for the first time, increases yz weight—used to calculate arbs for next phase (should it lose leadership) 16 x timeout Expires : Is x Still the Leader? x calculates new arb considering updated edge weights of untimely channels • x still a leader: If arb is smaller than other processes arbs Message efficient: uses old arb and never updates phase # (no need to send a new arb to all) • x loses leadership: Increases its phase# broadcasting stopPhase Message Efficient: stopPhase with newer phase# used by other processes to turn off their timer for x and to quit sending failed x stops sending alive x keeps own timer on to check if regains leadership • • • • Timeout expires Calculate new arb • • Kept leadership Use old arb, send alive x x Lost leadership Broadcast stopPhase x stopPhase(x; phase#) y w y w y w 5 z v z v z v 17 x Regains Leadership x arb is once again smaller than others • Other leader must have sent stopPhase due to non-timely channels x startPhase(x; phases[x]; arbs[x]) y w z x alive(x, phase#, shoutID) y v w z v • x timeout expires (process’ own timer is always on) • x kept collecting failed messages updating its edges—calculates its new arb • x broadcasts startPhase with new arb to all other processes So other processes will await alive messages and send failed if not timely received from parent according to new arb 18 MPO Algorithm Message and Packet Efficient Message Efficient: Only a single process x sends an infinite number of messages and all but finitely many messages are of constant size x w alive(x, phase#, shoutID) y w Non-constant message, startPhase, sent finite times v z Packet Efficient: All but finitely many messages are transmitted using O(n) packets Packets of different messages may potentially use different channels Thus, the number of used channels is not limited w x alive(x, phase#, shoutID) Eventually, only 2 𝑛 − 1 packets sent y w Broadcast precludes super packet efficiency of O(n) channels z v 19 Packet Efficient Implementation of the Omega Failure Detector Thank you! Ω Lyon, France Questions? November 8, 2016
© Copyright 2026 Paperzz