ZYZZYVA: SPECULATIVE BYZANTINE FAULT TOLERANCE
R. Kotla, L. Alvisi, M. Dahlin, A. Clement and E. Wong
U. T. Austin
Best Paper Award at SOSP 2007

Motivation
• Why implement Byzantine Fault-Tolerant (BFT) replication?
  – Increasing value of data and decreasing cost of hardware
  – More non-fail-stop behaviors than believed
  – BFT is becoming cheaper
  – Cost of 3-way non-BFT replication close to cost of BFT replication

Zyzzyva (I)
• Uses speculation to reduce the cost of BFT replication
  – Primary replica proposes order of client requests to all secondary replicas (standard)
  – Secondary replicas speculatively execute the request without going through an agreement protocol to validate that order (new idea)

Zyzzyva (II)
• As a result
  – States of correct replicas may diverge
  – Replicas may send diverging replies to client
• Zyzzyva's solution
  – Clients detect inconsistencies
  – Help convergence of correct replicas to a single total ordering of requests
  – Reject inconsistent replies

How?
• Clients observe a replicated state machine
• Replies contain enough information to let clients ascertain if the replies and the history are stable and guaranteed to be eventually committed
• Replicas have checkpoints

Byzantine agreement (I)
• No solution for fewer than four entities

Byzantine agreement (II)
• To achieve agreement in the presence of f failed nodes ("traitors") we need
  – 3f + 1 entities

Practical BFT (I)
• Practical Byzantine Fault-Tolerant protocol (PBFT) [Castro and Liskov 1999]

Practical BFT (II)
• Replicas decide on correct ordering

Practical BFT (III)
1. Client sends signed request to primary replica
2. Primary assigns a sequence number to the request and sends a PRE-PREPARE message to all other replicas
3. Secondary replicas validate the message and send a PREPARE message to all replicas
4. Replicas that can collect 2f PREPARE messages send a COMMIT message to all replicas
5. Replicas that can collect 2f + 1 COMMIT messages send a REPLY to the client

A shortened version
• Faster agreement is achieved thanks to a more complex view change protocol

The explanation (I)
• "No replicated service that uses the traditional view change protocol can be live without an agreement protocol that includes both the prepare and commit full exchanges"
• "The traditional view change protocol lets correct replicas commit to a view change and become silent in a view without any guarantee that their action will lead to the view change."

The explanation (II)
• Zyzzyva
  – Adds an extra phase to its view change protocol
  – Guarantees that a correct replica will not abandon a view unless every other correct replica does so

Zyzzyva Agreement (I)
• Common case: no faulty replicas

Explanations
• Secondary replicas assume that
  – Primary replica gave the right ordering
  – All secondary replicas will participate in the transaction
• Initiate speculative execution
• Client receives 3f + 1 mutually consistent responses

Zyzzyva Agreement (II)
• With a faulty replica

Explanations (I)
• Client receives at most 3f mutually consistent responses
• Gathers at least 2f + 1 mutually consistent responses
• Distributes a commit certificate to the replicas
• Once at least 2f + 1 replicas acknowledge receiving the commit certificate, the client considers the request completed

Explanations (II)
• If enough secondary replicas suspect that the primary replica is faulty, a view change is initiated and a new primary elected

Comparison with traditional solutions

State maintained at each replica

Explanations (I)
• Each replica maintains
  – A history of the requests it has executed
  – A copy of the max commit certificate it has received
• Lets it distinguish between committed history and speculative history

Explanations (II)
• Each replica constructs a checkpoint every CP_INTERVAL requests
• It maintains one stable checkpoint with a corresponding stable application state snapshot
• It might also have up to one speculative checkpoint with its corresponding speculative application state snapshot

Explanations (III)
• Checkpoints and application state become committed through a process similar to that of earlier BFT agreement protocols
  – Replicas send signed checkpoint messages to all replicas when they generate a tentative checkpoint
  – Commit checkpoint after they collect f + 1 signed matching checkpoint messages

View change sub-protocol (I)

Explanations
• Two-phase protocol
• Elects a new primary
• Guarantees that it will not introduce any changes in a history that has already completed at a correct client

Performance: throughput

Comments
• Zyzzyva-5 is a special version of Zyzzyva requiring more replicas but having a lower overhead

Performance: latency

Scalability: peak throughputs

CONCLUSIONS
• Systematically exploiting speculative execution results in a protocol much faster than conventional BFT agreement protocols
• Observe that Zyzzyva is optimized for the most frequent case but provides the correct result in all cases
  – A good rule to follow
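
Sketch: client-side completion rule
The thresholds from the two Zyzzyva Agreement cases (3f + 1 matching responses in the common case, at least 2f + 1 with a faulty replica) can be illustrated with a minimal Python sketch. It is not taken from the Zyzzyva implementation: the function names and outcome labels are hypothetical, each response is modeled as an opaque string standing in for a matching reply and history, and signature checks, timers and view changes are left out. The "incomplete" outcome is a placeholder for the case the slides do not cover.

```python
from collections import Counter
from typing import Sequence


def replicas_needed(f: int) -> int:
    """Replicas required to tolerate f Byzantine faults: 3f + 1."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return 3 * f + 1


def client_decision(f: int, responses: Sequence[str]) -> str:
    """Classify the speculative responses a client has gathered.

    Each entry of `responses` is a hypothetical digest; two replicas
    sent mutually consistent responses when their digests are equal.
    """
    if not responses:
        return "incomplete"
    # Size of the largest group of mutually consistent responses.
    _, matching = Counter(responses).most_common(1)[0]
    if matching >= 3 * f + 1:
        # Common case: every replica agrees, the request is complete.
        return "complete"
    if matching >= 2 * f + 1:
        # Faulty-replica case: distribute a commit certificate and wait
        # for 2f + 1 acknowledgements before treating it as complete.
        return "send_commit_certificate"
    # Assumption: anything weaker is left undecided in this sketch.
    return "incomplete"


if __name__ == "__main__":
    f = 1
    print(replicas_needed(f))
    print(client_decision(f, ["d1", "d1", "d1", "d1"]))
    print(client_decision(f, ["d1", "d1", "d1", "d2"]))
```

With f = 1, four equal digests take the fast path, while three equal digests and one divergent one trigger the commit certificate step.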