Gossip Algorithms and Implementing a Cluster/Grid Information Service
MsSys Course
Amar Lior and Barak Amnon

Agenda
• A short introduction to gossip algorithms
• Cluster/Grid information services requirements
  – How good is old information?
• The distributed bulletin board model
• Implementation

A Problem
• In an n-node system, assume that every pair of nodes can communicate directly
• Node i wishes to send a message (rumor, color) to all other nodes
• Possible deterministic solutions
  – BROADCAST (only in a broadcast medium)
  – Defining a static tree between the nodes and sending the message along the edges of this tree

A Gossip Style solution
• Starting with the round in which a rumor is generated
  – Each node that holds the rumor selects another node independently and uniformly at random
  – It sends the rumor to this node
• The distribution of the rumor is terminated after some fixed number of O(ln n) rounds
• At this point all players are informed with high probability

Uniform Gossip Example
[Figure sequence: frames 1 to 5, drawn along a time axis t]

Gossip benefits
• Robustness to node failures
  – Messages continue to propagate due to the random selection of destinations
  – The failure of F nodes results in only O(F) uninformed players
• Simplicity
  – All nodes run the same algorithm
• Scalability
  – The number of messages each node sends (and possibly receives) in each round is fixed

Gossip taxonomy
• Other names are
  – Epidemic algorithms (Demers et al.)
  – Randomized communication (Karp et al.)
• Propagation can be done by
  – Push: sending the information from the node to the selected node
  – Pull: the other way around
  – Push&Pull: both ways
• We distinguish between 2 conceptual layers
  – A basic gossip algorithm
    » By which nodes choose other nodes for communication
  – A gossip-based protocol
    » Built on top of a gossip algorithm
    » Determines the content of the messages that are sent
    » Determines the way received messages cause nodes to update their internal state

Rumor spreading bounds
From a single node to all
• Time complexity: O(ln n)
• Message complexity (Karp et al.): lower bound on the number of messages: Ω(n ln ln n)

Spatial Gossip (Kempe et al.)
• New information is most interesting to nodes that are nearby
• Combines the benefits of
  – Uniform gossip
  – Deterministic flooding
• The gossip algorithm chooses the nodes according to
    $p_{x,y} = c_x (d + 1)^{-D\rho}$
  where d is the distance between nodes x and y
• New information is spread to nodes at distance d, with high probability, in $O(\log^{1+\varepsilon} d)$ time

Aggregating values
• Gossip can also be used to aggregate a value over all nodes
  – Average, maximum, minimum, …
• In this case the question is how fast the local value in each node converges to the desired value

Cluster/Grid Information services
• Basic properties of a Grid environment
  – Information sources are distributed
  – Individual sources are subject to failure
  – The total number of information providers is large
  – Both the types of information sources and the ways the information is used can vary
• We cannot in general provide users with accurate information: any information delivered to a user is “old”
  – How useful is old information? (Mitzenmacher)
  – How to build an information service with guaranteed age properties?
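
The uniform push gossip from the "A Gossip Style solution" slide can be tried out with a small simulation. The following is a minimal Python sketch, not the authors' code. The function name `push_gossip_rounds` is hypothetical, node 0 is assumed to be the rumor source, rounds are assumed to be synchronous, and the loop runs until every node is informed instead of stopping after a fixed O(ln n) number of rounds, so that the round count can be compared with ln n.

```python
import math
import random


def push_gossip_rounds(n, seed=None):
    """Simulate uniform push gossip; return the rounds until all n nodes are informed."""
    rng = random.Random(seed)
    informed = {0}  # assumption: node 0 generates the rumor
    rounds = 0
    while len(informed) < n:
        newly_informed = set()
        for node in informed:
            # Select another node independently and uniformly at random.
            target = rng.randrange(n - 1)
            if target >= node:
                target += 1  # shift past the sender so it never picks itself
            newly_informed.add(target)
        informed |= newly_informed
        rounds += 1
    return rounds


if __name__ == "__main__":
    for n in (16, 256, 4096):
        r = push_gossip_rounds(n, seed=1)
        print(f"n={n:5d}  rounds={r:3d}  ln n={math.log(n):.1f}")
```

Because the informed set can at most double in a round, the count is never below log2 n. Running the sketch for growing n gives a rough feel for the logarithmic growth stated in the slides.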
Distributed Bulletin board
• The system
  – Consists of N nodes (or clusters)
  – Is distributed
  – Nodes are subject to failure
• Each node maintains a data structure that holds an entry on selected (or all) nodes in the system
• We refer to this data structure as “the vector”
• Each vector entry holds:
  – The state of the resources (static and dynamic) of the corresponding node
  – The age of the information (tuned to the local clock)
• The vector is a distributed bulletin board that serves information requests locally

Algorithm 1: Information dissemination
• Each time unit
  – Update the local information
  – Find all vector entries which are up to age t
  – Choose a random node
  – Send the above entries to that node
• Upon receiving a message
  – Compute the age of the received entries
  – Update the entries for which the newly received information is fresher
[Figure: example vectors with entries of the form node:age (A:1, B:12, C:2, D:4, E:11, …)]

Algorithm 1: t=2
[Figure sequence: frames 1 to 5, drawn along a time axis t]

Bounds and Approximations
• We want to know “how old” the information in the vector is
• First we find E[X_t] (for the asynchronous case)
  – The expected number of nodes that have information about node i which is up to t time units old
[Equations: E[X_t] for the asynchronous case, followed by the synchronous case]

Bounds and Approximations
• An approximation for the expected age of the vector
[Equation: expected age of the vector in terms of n and E[X_t]]

Real results
[Figure]

Approximating the age distribution
• A_k is a random variable describing the number of nodes which are up to age k
[Equation: E[A_k] in terms of E[X_k]]

Age distribution
[Figure]

Handling inactive nodes
• The presence of inactive nodes causes problems
  – The age quality of the information deteriorates
  – The number of ARP broadcasts increases linearly
• Using a fixed-size window improves the age quality, but the number of ARP broadcasts stays the same

Algorithm 2
• Algorithm 2 solves the above 2 issues
• Works basically the same as Algorithm 1, with the following difference when sending a message
  – Calculate l, the number of active nodes (from the local vector)
  – Generate a random number k between 0 and l
  – If k = 0, send the window to all nodes
  – Else send the window only to the active nodes
• Using Algorithm 2, the maximal expected number of messages to inactive nodes is ≤ 1
  – From all nodes at each round

Algorithm 2: Age performance
[Figure]

Algorithm 2: minimizing messages to inactive nodes
[Figure sequence: frames 1 to 4, drawn along a time axis t]

Supporting Urgent information
• In the previous algorithms, information is propagated from all nodes constantly
• In some cases we wish to send an important message urgently to all
  – Such as the detection of a newly dead node
  – In this case the source node gives the message a high priority of 2*log(n)
• When a node assembles the window it is about to send, it takes the entries with the highest priority first and only then the younger entries
• The priority of an entry is decremented every time unit
• The result is that urgent messages are disseminated in O(log(n)) steps
• And regular information is disseminated a bit slower

Information service clients
• MOSIX
  – Load balancing
    » Fresh information is used by the load balancing algorithm to consider migrating processes
  – mmon, the MOSIX monitoring tool
    » Presents the vector of a specific node
    » mmon -h xil-10
• MPICH
  – Improved assignment of processes to nodes
    » No assignment to “dead” nodes
    » Assignment to the least loaded ones
• Nagios
  – Collecting information about clusters over time (history)
  – Periodically retrieving a vector from a machine and keeping it
• Decision algorithms at the cluster level
  – Leader election (queue fault tolerance)
  – Node reservation

Conclusions
• Constructed a distributed bulletin board
  – Age properties are guaranteed
  – The administrator can configure it to the desired properties
  – No two nodes have the same view of the system
  – Information requests are served locally
  – Noise level (messages to inactive nodes) is constant
  – Urgent messages are propagated quickly

Future Work
• Investigating other gossip models
  – Push and Pull-Push
• Using only a partial view of the system
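
The steps of Algorithm 1 (information dissemination) can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the implementation described in the slides. The names `Node`, `tick`, `window`, `receive` and `run_round` are hypothetical, the resource state is a placeholder string, rounds are synchronous, and messages are assumed to arrive with zero delay, so "computing the received entries' age" reduces to adding an optional `delay` value.

```python
import random


class Node:
    """One node holding "the vector": {node_id: (state, age)}."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.vector = {node_id: ("state-0", 0)}

    def tick(self, new_state):
        """Age every entry by one time unit, then refresh the local entry."""
        self.vector = {k: (s, age + 1) for k, (s, age) in self.vector.items()}
        self.vector[self.node_id] = (new_state, 0)

    def window(self, t):
        """Return all vector entries whose age is at most t."""
        return {k: v for k, v in self.vector.items() if v[1] <= t}

    def receive(self, entries, delay=0):
        """Adopt a received entry only when it is fresher than the local one."""
        for k, (state, age) in entries.items():
            age += delay  # assumption: transit time expressed in time units
            if k not in self.vector or age < self.vector[k][1]:
                self.vector[k] = (state, age)


def run_round(nodes, t, rng, clock):
    """One time unit: update local info, then each node pushes its window."""
    for node in nodes:
        node.tick(f"state-{clock}")
    # Collect all messages first so every send uses the start-of-round state.
    messages = []
    for node in nodes:
        target = rng.choice([m for m in nodes if m is not node])
        messages.append((target, node.window(t)))
    for target, entries in messages:
        target.receive(entries)


if __name__ == "__main__":
    rng = random.Random(0)
    nodes = [Node(i) for i in range(8)]
    for clock in range(1, 31):
        run_round(nodes, t=4, rng=rng, clock=clock)
    # Print the ages node 0 holds for every node it knows about.
    print({k: age for k, (_, age) in sorted(nodes[0].vector.items())})
```

A smaller window age t keeps messages short but lets entries about distant nodes grow older, which is the trade-off the age bounds in the slides quantify.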