A MIDDLEWARE FOR GOSSIP PROTOCOLS
by Michael Chow and Robbert Van Renesse
Cornell University

Subjects
1. Introduction
1.1. The Problem
2. The Middleware
2.1. The Structure
2.1.1. Modules
2.1.2. Layers
2.1.3. Core Architecture
2.2. How does it work
3. Simulation
4. Related Work
5. Conclusion
6. Future Work

1. Introduction
➔ Gossip protocols provide updates in a scalable and reliable way.
  ◆ But this can make gossip applications very hard to manage.
➔ When a gossip protocol goes bad, the system often has to be taken down.
  ◆ Even Amazon did not escape this.
    ● Due to a bit flip, all servers failed and the system had to be shut down. The system was restored after 6 hours.

1.1. The Problem
➔ One of the nodes gossips some corrupted data.
➔ The nodes that receive that data become infected too and gossip it to other nodes (much like a virus).
➔ Eventually, all nodes are infected...
➔ ...and the whole system has to be shut down (including the nodes that were infected).
➔ At some point the bug is fixed and the fix is spread to all the nodes...
➔ ...but one of the nodes is off and does not receive the fix.
➔ Later, that node is turned on, still infected with the old version of the message.
➔ It spreads the bug again, and the whole process starts over.

2. The Middleware
➔ Knowing the problem, the authors' idea was to build a layered middleware capable of rapid code updating.
➔ The code-updating scheme distributes code across the nodes, like Trickle (see Related Work), and is managed by the core. Because the core itself cannot be updated dynamically, many of the design decisions were driven by keeping the core small and simple.
Characteristics:
➔ Java based
➔ Resilient
➔ Dynamic update
Structure:
➔ Core
➔ Layers
➔ Modules

2.1. The Structure
Modules
➔ Where the Java classes are implemented
  ◆ Java classes are immutable
  ◆ One of the classes is an interface for communicating with the core.
➔ ID (tuple)
  ◆ Unique name
  ◆ Deployment number (tuple)
➔ Versions (deployments), each consisting of:
  ◆ module's name
  ◆ deployment number: <time_initiated_deployment; ID_node_that_initiated_the_deployment>
  ◆ code archive

2.1. The Structure
Modules - Deployments
d1 (v1) --update--> d2 (v2) --roll back--> d3 (v1)
These deployments are stored in a map in the core:
(deployment number, module name) -> code archive

2.1. The Structure
Layers
➔ Used for communication between modules
  ◆ Modules can use services from other modules
➔ Avoid duplicated code
➔ Work like an interface
[Diagram: modules 1 to 4 sit on top of layer 1 and layer 2, which sit on top of the core]
When functionality is updated, an upcall is made to all modules that use it.

2.1. The Structure
Core
➔ A module that acts like an HTTP server
➔ Mediates the gossip between modules of the same type on different nodes
➔ Provides few services
  ◆ Small and simple
  ◆ Cannot be updated
➔ Configuration file
➔ List of rendezvous servers and membership hints
Many services → prone to failure. If the core fails, the system has to be shut down.

2.1. The Structure
Core - Configuration file
➔ List of modules
  ◆ current versions
  ◆ deployment number
➔ Determines which version of each module is running
➔ The node periodically gossips this file to other cores to check whether it is up to date

2.2. How does it work
Gossip between modules
[Diagram: node1 and node2, each with an application, modules 1 to 4, layers 1 and 2, a config file, and a core (HTTP server)]
The initiating node sends a request (HTTP GET or POST) to the other node's core:
gossip request(src_deployment_number)
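Since both the deployment map and the gossip request rely on deployment numbers, a small sketch may help make the bookkeeping concrete. The Java below is only an illustration, not the authors' code: the class and method names (`DeploymentSketch`, `DeploymentNumber`, `DeploymentKey`, `store`, `lookup`) are hypothetical, the code archive is modeled as a raw byte array, and ordering deployment numbers by initiation time with the node ID as a tie-breaker is an assumption, since the slides only give the tuple layout.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the core's deployment map; all names are assumptions.
class DeploymentSketch {

    // Deployment number: <time the deployment was initiated; ID of the initiating node>.
    // Assumption: a later time means a more recent deployment, node ID breaks ties.
    record DeploymentNumber(long timeInitiated, String nodeId)
            implements Comparable<DeploymentNumber> {
        @Override
        public int compareTo(DeploymentNumber other) {
            int byTime = Long.compare(timeInitiated, other.timeInitiated);
            return byTime != 0 ? byTime : nodeId.compareTo(other.nodeId);
        }
    }

    // Map key from the slides: (deployment number, module name).
    record DeploymentKey(DeploymentNumber number, String moduleName) {}

    // (deployment number, module name) -> code archive.
    private final Map<DeploymentKey, byte[]> archives = new HashMap<>();

    void store(DeploymentNumber number, String moduleName, byte[] codeArchive) {
        archives.put(new DeploymentKey(number, moduleName), codeArchive);
    }

    // Returns null when no archive is stored for this deployment and module.
    byte[] lookup(DeploymentNumber number, String moduleName) {
        return archives.get(new DeploymentKey(number, moduleName));
    }

    public static void main(String[] args) {
        DeploymentSketch core = new DeploymentSketch();
        DeploymentNumber d1 = new DeploymentNumber(100L, "nodeA");
        DeploymentNumber d2 = new DeploymentNumber(200L, "nodeA");
        DeploymentNumber d3 = new DeploymentNumber(300L, "nodeB");
        byte[] v1 = {1};
        byte[] v2 = {2};

        core.store(d1, "membership", v1); // initial deployment of v1
        core.store(d2, "membership", v2); // update to v2
        core.store(d3, "membership", v1); // roll back: a new deployment carrying the old code

        System.out.println("d3 newer than d2: " + (d3.compareTo(d2) > 0));
        System.out.println("d3 runs v1 again: " + (core.lookup(d3, "membership") == v1));
    }
}
```

Note that in this sketch a roll back is not a deletion: it is a new, more recent deployment (d3) that points at the older code (v1), which matches the d1, d2, d3 sequence on the Deployments slide.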
On receipt, the receiving core:
1) Checks whether the deployment number matches.
If it matches:
2) Demultiplexes the message to the right module.
3) Replies to the requesting node: response()
If it doesn't match:
2) Determines which of the two nodes has the more recent configuration.
3) Replies with the recent config file and the missing classes.

2.2. How does it work
Transferring state
State lets the new version of a module continue from where the old version left off, which preserves performance.
Both the OLD VERSION and the NEW VERSION of module 4 implement the module interface class:
...
public String transferState()
public void acceptState(String state)
...
1) The old version is stopped.
2) Its state is sent to the core.
3) The new version is executed.

2.2. How does it work
Gossip between Cores (membership hints and rendezvous servers)
Problem: Modules may fail
For this, the cores have a gossip protocol that works with:
➔ List of membership hints
➔ List of rendezvous nodes (static)
The list of membership hints is a set of 24 addresses from the network with which communication succeeded.
The list of rendezvous nodes is a set of fixed nodes that normally make the deployments, although any node can do so.

2.2. How does it work
Gossip between Cores (membership hints and rendezvous servers)
Add an address to membership hints
[Diagram: cores 1, 2 and 3 and the rendezvous servers (cores a to d); core 1 holds a list of membership hints and a static list of rendezvous nodes, both containing address.coreA to address.coreD]
After a successful gossip exchange between core 1 and core 2, gossip request(src_deployment_number), core 1 adds address.core2 to its list of membership hints.
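The hint list can be pictured as a small bounded set with a static fallback. The Java sketch below is illustrative only and rests on stated assumptions: the class and method names (`MembershipHints`, `recordSuccess`, `recordFailure`, `pickTarget`) are hypothetical, and evicting the oldest hint when the list already holds 24 addresses is an assumption, because the slides do not say what happens when the list is full. The limit of 24, adding an address after successful communication, choosing a hint at random, dropping a hint whose gossip fails, and falling back to the rendezvous nodes all come from the slides.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

// Hypothetical sketch of a core's membership hints; all names are assumptions.
class MembershipHints {
    static final int MAX_HINTS = 24; // list size given in the slides

    private final Set<String> hints = new LinkedHashSet<>(); // keeps insertion order
    private final List<String> rendezvous;                    // static, never modified
    private final Random random;

    MembershipHints(List<String> rendezvous, Random random) {
        if (rendezvous.isEmpty()) {
            throw new IllegalArgumentException("at least one rendezvous node is required");
        }
        this.rendezvous = List.copyOf(rendezvous);
        this.random = random;
    }

    // Called after communication with `address` succeeded.
    void recordSuccess(String address) {
        if (hints.contains(address)) {
            return;
        }
        if (hints.size() >= MAX_HINTS) {
            // Assumption: make room by evicting the oldest hint.
            String oldest = hints.iterator().next();
            hints.remove(oldest);
        }
        hints.add(address);
    }

    // Called when gossip with `address` failed: the hint is removed from the list.
    void recordFailure(String address) {
        hints.remove(address);
    }

    // Randomly chooses a hint; falls back to the rendezvous nodes when no hint is left.
    String pickTarget() {
        List<String> pool = hints.isEmpty() ? rendezvous : new ArrayList<>(hints);
        return pool.get(random.nextInt(pool.size()));
    }

    int size() {
        return hints.size();
    }
}
```

Keeping the rendezvous list separate and immutable means a run of failed gossip attempts can empty the hint list without ever leaving the core with nobody to contact.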
2.2. How does it work
Gossip between Cores (membership hints and rendezvous servers)
Check an address of membership hints
[Diagram: core 1's list of membership hints now contains address.coreA to address.coreD, address.core2, address.core3, ...; the static list of rendezvous nodes contains address.coreA to address.coreD]
1) Core 1 randomly chooses a hint, let's say core 2.
2) Core 1 gossips with core 2.
3) If the gossip fails, core 2 is removed from the list.
If gossip on every membership hint fails, the rendezvous nodes are still available and the node will keep receiving updates.

3. Simulation
Performance
Objective: test the overhead of automatic code updating
➔ 100 nodes running the middleware
➔ 30 memberships
➔ 10 rendezvous nodes
➔ Application running: a simple membership protocol that gossips membership views.

3. Simulation
Performance
➔ During the first 50 s, the update messages account for a large portion of the traffic.
➔ After 50 s, the application starts to dominate the traffic.
➔ The core no longer performs updates; it only checks the config file.

3. Simulation
Performance
Test: how long it takes each node to receive the code after a new deployment is created.
➔ Slow until time 2, while the rendezvous nodes load the new code separately
➔ Then the code quickly reaches the rest of the participants

4. Related Work
➔ Trickle
  ◆ An algorithm for propagating code updates through wireless sensor networks.
  ◆ Gossips metadata about the running versions (like the configuration file)
➔ Mobile code and mobile agents
  ◆ Avoid moving large amounts of data across the network
➔ Other gossip middleware
  ◆ GossipKit
    ● Provides extensibility
    ● This middleware provides reliable code updating with a layered upcall architecture
  ◆ T-Man
    ● Creates and manages different network overlays
    ● This middleware does the same with the code-updating service

5. Conclusion
This middleware solves the problem from section 1 (the system has to be shut down) with:
➔ Dynamic updates
➔ Updating and rolling back versions

6. Future Work
➔ Handling NATs (Network Address Translation) through a layer service
➔ Security with trusted authorities and cryptography
➔ Updating the core module itself (!!)