The DHCP Failover Protocol: A Formal Perspective

Rui Fan (MIT)
Ralph Droms (Cisco Systems)
Nancy Griffeth (CUNY)
Nancy Lynch (MIT)
Fault Tolerant DHCP




Dynamic Host Configuration Protocol (DHCP) is a widely deployed protocol for assigning IP addresses and other client parameters.
DHCP is also important in wireless and mobile settings.
Current implementations use a single DHCP server and are not fault tolerant.
The main challenge in using multiple servers is maintaining a consistent view of assigned addresses across servers, to avoid double allocation.
  - Standard database techniques are too slow.
The DHCP Failover Protocol (DKS+’03) is a 2-server DHCP algorithm that retains the client interface and performance of DHCP.
Our Contributions


We present an algorithm based on DKS+’03, generalized to an arbitrary number of servers.
We rigorously specify the algorithm and its behavior using TIOA.


We decompose the DHCPF problem into independent
subproblems.




Helps end-users understand and use DHCP.
Subproblems can be solved separately, and their solutions
composed to solve DHCPF.
Helps to understand and prove the correctness of the algorithm.
Helps to analyze the effects of network parameters on algorithm
performance, and to optimize the algorithm.
Demonstrates that a formal, theoretical approach can provide correct, simple and efficient solutions to complex, real-world problems.
Timed I/O Automaton

Formal modeling framework for describing distributed systems.
  - Rigorous and structured.
  - Composition, simulation, other proof / design techniques.
A Timed I/O Automaton (TIOA) [KLSV’05] consists of
  - States, start states
  - Discrete actions: state transitions (state, action, state)
  - Continuous actions (trajectories): a mapping from [0,t] to states
Scheduling of actions is nondeterministic.
An execution is an alternating sequence of trajectories and discrete actions.
Example: a mobile robot.
  - State is its position.
  - Discrete actions are changes in destination.
  - Trajectories are movement towards destination.
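
The mobile robot example can be sketched in Python as follows. This is only an illustration, not TIOA notation: the class name `Robot`, the one-dimensional position, the stored destination, and the constant `speed` are assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass
class Robot:
    """Toy 1-D robot (hypothetical): state is position plus current destination."""
    position: float = 0.0
    destination: float = 0.0
    speed: float = 1.0  # assumed constant speed, distance units per time unit

    def set_destination(self, new_destination: float) -> None:
        """Discrete action: an instantaneous change of destination."""
        self.destination = new_destination

    def trajectory(self, duration: float):
        """Continuous evolution: return a mapping from [0, duration] to positions,
        then advance the state to the end of that trajectory."""
        start, dest, speed = self.position, self.destination, self.speed

        def position_at(t: float) -> float:
            if not 0.0 <= t <= duration:
                raise ValueError("t is outside the trajectory domain")
            step = min(speed * t, abs(dest - start))  # stop on arrival
            return start + step if dest >= start else start - step

        self.position = position_at(duration)
        return position_at


# An execution alternates discrete actions and trajectories.
robot = Robot()
robot.set_destination(5.0)        # discrete action
traj = robot.trajectory(2.0)      # trajectory over [0, 2]
print(traj(1.0), robot.position)  # 1.0 2.0
robot.set_destination(0.0)        # discrete action
robot.trajectory(10.0)            # long enough to arrive
print(robot.position)             # 0.0
```

The nondeterministic scheduling mentioned above is not modeled here; the caller chooses the order of actions explicitly.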
System Assumptions

Ideally, we want DHCPF to satisfy the following.
  - Safety property: No IP address is double allocated.
  - Liveness property: All client commands are quickly executed.
These properties depend on correct behavior of the network and environment.
Clock assumption
  - Clients and servers have bounded skew clocks.
  - Let D be a constant. Then |clock_i(t) – t| ≤ D, for every client or server i, and every time t.
Both safety and liveness depend on the clock assumption.
System Assumptions

Stability
  - Let l be a parameter. A time interval [t, t’] is l-stable if
      - Some server is alive throughout [t-l, t’].
      - No server fails or recovers during [t-l, t’].
Timeliness
  - A time interval [t, t’] is l-timely if any message sent during [t, t’-l] is delivered within l time.
The liveness property depends on having sufficiently long stable and timely time intervals.
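
The two definitions can be checked mechanically on a recorded trace. The sketch below is one possible reading in Python; the trace formats (`events` as `(time, server, kind)` tuples, `messages` as `(sent, delivered)` pairs) and the function names are assumptions made for illustration.

```python
def is_stable(initially_alive, events, t, t_end, l):
    """True if [t, t_end] is l-stable for the given trace.

    events: iterable of (time, server, kind), kind is "fail" or "recover".
    """
    alive = set(initially_alive)
    for when, server, kind in sorted(events):
        if when < t - l:
            # Replay history to learn who is alive at time t - l.
            if kind == "fail":
                alive.discard(server)
            else:
                alive.add(server)
        elif when <= t_end:
            return False  # a failure or recovery inside [t - l, t_end]
    # No change inside the window, so anyone alive at t - l stays alive.
    return bool(alive)


def is_timely(messages, t, t_end, l):
    """True if every message sent during [t, t_end - l] arrives within l.

    messages: iterable of (sent, delivered), delivered is None if lost.
    """
    return all(
        delivered is not None and delivered - sent <= l
        for sent, delivered in messages
        if t <= sent <= t_end - l
    )


events = [(1.0, "s2", "fail"), (20.0, "s2", "recover")]
print(is_stable({"s1", "s2"}, events, t=5.0, t_end=10.0, l=3.0))  # True
print(is_stable({"s1", "s2"}, events, t=3.0, t_end=10.0, l=3.0))  # False

messages = [(5.0, 6.0), (6.0, 9.5), (9.0, None)]
print(is_timely(messages, t=5.0, t_end=10.0, l=3.0))  # False
print(is_timely(messages, t=5.0, t_end=10.0, l=4.0))  # True
```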
System Assumptions

Failure detector U
  - U tells servers which other servers are alive.
  - Can be implemented by heartbeats, network admin, etc.
  - Model by recv_{U,j}(dead, j’) and recv_{U,j}(alive, j’) actions, where j, j’ are servers.
Let n be a parameter. U is n-perfect if it satisfies
  - Accuracy: If recv_{U,*}(dead, j’) occurs at time t, then j’ is dead sometime in [t-n, t]. Likewise for recv_{U,*}(alive, j’).
  - Timeliness: Every j gets a recv_{U,j}(dead, j’) or recv_{U,j}(alive, j’) message every n seconds, for every j’.
Failure detectors are used in many distributed algorithms, and are sometimes provably necessary.
Safety depends on a failure detector U.
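
As a rough illustration of the heartbeat option, the Python sketch below reports each monitored server as alive or dead based on how recently a heartbeat was seen. The class name, the `timeout` parameter and the report format are assumptions; whether such a detector is actually n-perfect depends on message delay and clock skew, which this sketch does not model.

```python
class HeartbeatDetector:
    """Hypothetical heartbeat-based failure detector running at one server."""

    def __init__(self, servers, timeout):
        self.timeout = timeout
        self.last_seen = {s: None for s in servers}  # server -> last heartbeat time

    def heartbeat(self, server, now):
        """Record a heartbeat received from `server` at local time `now`."""
        self.last_seen[server] = now

    def reports(self, now):
        """One (status, server) report per monitored server, playing the role
        of the recv(alive, j') and recv(dead, j') actions."""
        out = []
        for server, seen in sorted(self.last_seen.items()):
            fresh = seen is not None and now - seen <= self.timeout
            out.append(("alive" if fresh else "dead", server))
        return out


detector = HeartbeatDetector(["s1", "s2"], timeout=2.0)
detector.heartbeat("s1", 0.0)
detector.heartbeat("s2", 0.5)
detector.heartbeat("s1", 2.0)
print(detector.reports(3.0))  # [('alive', 's1'), ('dead', 's2')]
```

Calling `reports` on a fixed period would correspond to the timeliness requirement; the accuracy requirement is what constrains the choice of `timeout`.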
A Formal Spec of DHCPF

DHCP client interface and message exchange sequence.
[Figure: message exchange sequence between client and server.]
  - k is an interaction identifier.
  - A client is correct if it executes this message sequence.
Say client i owns an IP address f at time t if send_{*,i}(ack, *, f, τ) occurs before t, and τ ≥ t – D, where τ is the lease time carried by the ack.
  - Takes into account clock skew of client.
  - If i doesn’t own f at t, then i is definitely not using f at t.
      - Assumes correct clients.
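
The ownership definition can be turned into a small trace check. The Python sketch below assumes the condition reads "an ack with lease time τ was sent before t, and τ ≥ t – D"; the tuple layout of `acks` and the function names are hypothetical.

```python
def owns(acks, client, address, t, D):
    """True if `client` owns `address` at time t.

    acks: iterable of (send_time, client, address, lease_end), one per ack sent.
    D: bound on clock skew.
    """
    return any(
        c == client and a == address and sent < t and lease_end >= t - D
        for sent, c, a, lease_end in acks
    )


def owners(acks, address, t, D):
    """All clients that own `address` at time t; safety needs at most one."""
    clients = {c for _, c, a, _ in acks if a == address}
    return {c for c in clients if owns(acks, c, address, t, D)}


acks = [
    (0.0, "c1", "10.0.0.5", 10.0),   # c1 holds a lease until time 10
    (12.0, "c2", "10.0.0.5", 20.0),  # c2 is acked only after c1's lease ends
]
print(owners(acks, "10.0.0.5", t=10.5, D=1.0))  # {'c1'}
print(owners(acks, "10.0.0.5", t=11.5, D=1.0))  # set()
print(owners(acks, "10.0.0.5", t=12.5, D=1.0))  # {'c2'}
```

Note that c1 still counts as owner at t = 10.5, half a time unit after its lease ends, because its clock may lag by up to D.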
A Formal Spec of DHCPF



Assume an n-perfect failure detector and a bound D on clock skew.
Safety: For all IP addresses f and at all times t, at most one client owns f at t.
Request liveness: Suppose time t is (4n+4D)-stable and d-timely, and client i does bcast(discover,k) at time t. Assume client i is correct and does not fail during [t, t+4d]. Then
  - By time t+d, every live server receives i’s message.
  - By time t+2d, either send(offer,k,f) occurs for some f, or for every f, either
      - f was offered to some client but not requested.
      - There is a lease for f which has not expired.
  - If send(offer,k,*) occurs, then send(ack,k,*,*) occurs by time t+4d.
A Formal Spec of DHCPF

Renew liveness: Suppose time t is (4n+4D)-stable and d-timely, and client i has a lease for f for time ≥ t+d+D. Then if i bcasts renew for f at t, i receives an ack for f by time t+2d.
DHCPF Algorithm Overview



We break the DHCPF problem into two independent subproblems, Lease and Elect.
  - Elect
      - For any IP address f, elect a leader server for f.
      - Only the leader can lease f to clients.
      - There is at most one leader for f at any time.
      - The leader can change as servers fail and recover.
  - Lease
      - The leader gives out leases for f.
      - Ensure clients can always request or renew leases for f.
      - Ensure no double allocation even if leader changes.
Lease and Elect run continuously, in parallel.
The DHCPF algorithm is the formal composition Elect || Lease.
The Elect Algorithm

For any IP address f, Elect ensures
  - Safety: There is at most one leader server for f at any time.
  - Liveness: If the execution is currently “nice”, then a leader exists.
Code shown is for server j.
  - clock: The current clock value at j.
  - live: Set of servers j thinks are alive.
  - my-addrs: Set of IP addresses j thinks it is leader for.
  - lead-time[f]: Time when j became leader for f.
  - rec-time: Time when j last recovered.
The Elect Algorithm
[Code figure; callouts: “is min, and enough time passed”, “no longer min”.]
The basic idea is that the min live server should be leader for each f.
  - Actually, can use a different min_f for each f, for load balancing.
If j hears j’ is alive
  - Add j’ to live.
  - For each f, if j is no longer min_f for f, give up leadership of f.
If j hears j’ is dead
  - Remove j’ from live.
  - For each f, if j became min_f for f, and enough time has passed since last recovery, become leader for f.
  - Time to wait depends on quality of failure detector n, and clock skew D.
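
The two handlers above can be sketched in Python as follows. This is a simplified illustration, not the TIOA code from the slides: the class name, the use of one global minimum for every address, and the `wait` parameter (the slides only say the waiting time depends on n and D) are assumptions.

```python
class ElectServer:
    """Hypothetical sketch of the Elect state and handlers at one server."""

    def __init__(self, me, addresses, wait, now=0.0):
        self.me = me
        self.addresses = list(addresses)
        self.wait = wait        # assumed value; depends on n and D
        self.rec_time = now     # time of last recovery
        self.live = {me}        # servers believed alive
        self.my_addrs = set()   # addresses this server leads
        self.lead_time = {}     # address -> time leadership began

    def min_live(self, address):
        # Assumption: the same minimum for every address (no load balancing).
        return min(self.live)

    def on_alive(self, other, now):
        """Failure detector reports `other` alive."""
        self.live.add(other)
        for f in self.addresses:
            if self.min_live(f) != self.me:
                self.my_addrs.discard(f)      # no longer min: give up f
                self.lead_time.pop(f, None)

    def on_dead(self, other, now):
        """Failure detector reports `other` dead."""
        self.live.discard(other)
        waited = now - self.rec_time >= self.wait
        for f in self.addresses:
            if self.min_live(f) == self.me and f not in self.my_addrs and waited:
                self.my_addrs.add(f)          # is min, and enough time passed
                self.lead_time[f] = now


s1 = ElectServer("s1", ["10.0.0.1", "10.0.0.2"], wait=4.0, now=0.0)
s1.on_alive("s2", 1.0)
s1.on_dead("s2", 2.0)      # too soon after recovery: not yet leader
print(sorted(s1.my_addrs))  # []
s1.on_dead("s2", 5.0)      # repeated report after the wait: becomes leader
print(sorted(s1.my_addrs))  # ['10.0.0.1', '10.0.0.2']
s1.on_alive("s0", 6.0)     # a smaller server appears: gives up leadership
print(sorted(s1.my_addrs))  # []
```

Because leadership is only acquired inside `on_dead`, the sketch relies on the failure detector repeating its reports periodically.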
Elect Properties



Assume U is n-perfect, and clock skew is at most D.
Theorem (Safety) At any time, for any address f, there is at most one server j with f ∈ my-addrs_j.
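Proof
[Figure: timelines for servers s1 and s2 with marks at t-2n, t-n and t, two intervals of length n, and the annotations “s1, s2 both leaders for f”, “s1 is alive from this point on”, and “s2 sees s1, won’t become leader”.]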
Theorem (Liveness) If current state is (4n + 4D)-stable, then for every address f, we have f ∈ my-addrs_j for j = min_f(L), where L is the set of current live servers.
The Lease Algorithm
To avoid double allocation, the leader should tell other servers its leases, in case it fails.
Waiting for acks from other servers is too slow.
The leader first gives the client a temporary Maximum Client Lead Time (MCLT) lease.
  - The client gets a shorter lease than it asked for.
While the client is using the MCLT lease, the leader negotiates an acknowledged lease with the other servers.
  - When the client renews, it gets the lease it asked for last time.
In this example, suppose MCLT = 3.
[Figure: timelines for servers s1 and s2 with time marks 1, 2, 3, 4, 5, 10.]
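
The MCLT idea can be sketched in Python as below. This is a heavily simplified single-process illustration: the class and method names are hypothetical, message exchange with the other servers is collapsed into a direct call to `all_acked`, and the numbers in the usage are invented rather than read off the figure.

```python
class LeaseLeader:
    """Hypothetical sketch of how a leader bounds the leases it grants."""

    def __init__(self, mclt):
        self.mclt = mclt
        self.acklease = {}  # address -> lease end acknowledged by other servers
        self.pending = {}   # address -> lease end still being negotiated

    def grant(self, address, now, wanted_end):
        """Handle a request or renew at time `now` for a lease until `wanted_end`."""
        # Go beyond now + MCLT only if the other servers already acknowledged it.
        limit = max(now + self.mclt, self.acklease.get(address, now))
        granted = min(wanted_end, limit)
        self.pending[address] = wanted_end  # negotiate in the background
        return granted

    def all_acked(self, address):
        """Every other server has acknowledged the pending lease."""
        self.acklease[address] = self.pending.pop(address)


leader = LeaseLeader(mclt=3)
print(leader.grant("f", now=0, wanted_end=10))  # 3  (temporary MCLT lease)
leader.all_acked("f")                           # other servers acknowledge 10
print(leader.grant("f", now=2, wanted_end=12))  # 10 (what was asked for last time)
```

The client never waits for the other servers: the first reply is immediate but short, and the longer lease arrives with the renewal.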
The Lease Algorithm
When a new leader takes over, it waits MCLT time, and also until its max acknowledged lease expires.
This upper bounds the maximum potential lease that the previous leader might have given out.
The leader only gives out a new lease for f when all potential leases have expired.
This is the main idea of DKS+’03.
[Figure: timelines for servers s1 and s2 with time marks 1, 2, 3, 4, 5.]
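
The takeover rule amounts to a single maximum. A minimal Python sketch, with a hypothetical function name and without any allowance for clock skew:

```python
def earliest_new_lease_time(takeover_time, mclt, max_acked_lease_end):
    """Earliest time a new leader may lease the address again.

    It must wait MCLT after taking over, and also until the largest lease
    it has acknowledged has expired.
    """
    return max(takeover_time + mclt, max_acked_lease_end)


# With MCLT = 3 and a takeover at time 4 (invented numbers):
print(earliest_new_lease_time(4, 3, 5))   # 7: the MCLT wait dominates
print(earliest_new_lease_time(4, 3, 10))  # 10: the acknowledged lease dominates
```

The MCLT term covers a temporary lease the old leader may have granted without telling anyone; the second term covers leases the new leader already knows about.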
The Lease Algorithm
[Code figure; callouts: “check f is available”, “MCLT lease”, “negotiate acknowledged lease”, “give the ack’ed lease”, “wait for max of MCLT and potlease”, “every server increased potlease, so j can increase acklease”.]
  - potlease[f]: Maximum potential lease given out for f.
  - reserved: Set of addresses offered but not requested.
  - acklease[f]: The lease value that j will give for f.
  - k: An interaction identifier.
  - write-acks[k]: Set of servers acknowledging interaction instance k.
Safety of Elect || Lease


Theorem: Elect || Lease satisfies the safety property of the DHCPF specification.
Proof: A sequence of invariants, proved by induction on the execution.
  - Prove that servers have a good estimate of the max lease given out for f.
  - Lemma: For all j, j’, if j ∈ write-acks[k]_{j’}, then potlease[f_k]_j ≥ t_k.
  - Lemma: For all j, j’, max(potlease[f]_j, clock_j + MCLT + 2D) ≥ acklease[f]_{j’}.
      - Key invariant of [DKS+’03].
      - Only consider actions s which increase acklease[f]_{j’}.
Safety of Elect || Lease

Lemma: Let W be the leader for f. Then potlease[f]_W ≥ acklease[f]_j, for all j.
  - If the inductive step doesn’t change the leader, we show this using the fact that there’s at most one leader for f.
  - If the leader changes, then W sets potlease[f]_W ← max(potlease[f]_j, clock_j + MCLT + 2D).
Since the leader always knows the max lease for f, it avoids double allocation during request or renew.
Liveness of Elect || Lease

Hard to state


Need to identify all situations which prevent progress.
Easy to prove!
When nothing bad happens, something good happens.
Theorem: Elect || Lease satisfies the request and renew liveness properties of the DHCPF specification.
Proof (Request liveness)








Suppose client i bcasts discover at time t. By time t+d, every live server gets i’s message.
Since t is (4n + 4D)-stable and d-timely, every f has a leader.
Server j doesn’t offer i any address only if, for every f that j is leader for, f has been reserved by another client or the lease for f hasn’t expired.
If i is offered some f’s, then no other client is offered those f’s, so within 2d time, i gets an ack for f.
The renew liveness proof is similar.
Conclusions



Formally specified and implemented a fault tolerant
DHCP algorithm using TIOA.
A simple algorithm based on decomposition into
independent subproblems.
Is our decomposition “good”?



  - Does DHCPF need a perfect failure detector?
  - Is the dependence on clock skew and message delay the best possible?
  - Is “goodness” merely a “human” and case-by-case concept, or a more universal one?
      - Perhaps not totally far-fetched? Church-Turing formalized computation, Cook-Levin formalized completeness…
Thank you!