Concurrent Reading and Writing using Mobile Agents

Hwajung Lee
Question 1.
Why is physical clock synchronization important?
Question 2.
With the price of atomic clocks or GPS coming down,
should we care about physical clock synchronization?
Types of Synchronization
Types of clocks



Unbounded 0, 1, 2, 3, . . .
Bounded 0,1, 2, . . . M-1, 0, 1, . . .
External Synchronization
Internal Synchronization
Phase Synchronization
Unbounded clocks are not realistic, but are easier to
deal with in the design of algorithms. Real clocks are
always bounded.
c
l
o
c
k
t
i
m
e
What are these?
Drift rate 
Clock skew 
Resynchronization interval R
Š
clock 1
clock 2
drift rate= 
R
R
Newtonian time
Max drift rate  implies:
(1- ) ≤ dC/dt < (1+ )
Challenges
(Drift is unavoidable)
Accounting for propagation delay
Accounting for processing delay
Faulty clocks
Berkeley Algorithm
A simple averaging algorithm
that guarantees mutual
consistency |c(i) - c(j)| < 
Step 1. Read every clock in the system.
Step 2. Discard outliers and substitute
them by the value of the local clock.
Step 3. Update the clock using the
average of these values.
Lamport and Melliar-Smith’s
averaging algorithm handles
byzantine clocks too
c
c-
i
c+
j
c-2
k
Bad clock
A faulty clocks exhibits 2-faced
or byzantine behavior
Assume
n clocks, at most t are faulty
Step 1. Read every clock in the system.
Step 2. Discard outliers and substitute them by
the value of the local clock.
Step 3. Update the clock using the average of
these values.
Synchronization is maintained if
Why?
n > 3t
Lamport & Melliar-Smith’s
algorithm (continued)
c
c-
i
c+
j
The maximum difference between
the averages computed by two
non-faulty nodes is (3t / n)
To keep the clocks synchronized,
3t/ n < 
c-2
kk
So,
Bad clocks
3t < n
External Synchronization
Time
server
Client pulls data from a time server
every R unit of time, where R <  /
2.
For accuracy, clients must compute
the round trip time (RTT), and
compensate for this delay
while adjusting their own clocks.
(Too large RTT’s are rejected)
Tiered architecture
Level 1
Time
server
Level 0
Level 1
Level 1
Broadcast mode
- least accurate
Procedure call
- medium accuracy
Peer-to-peer mode
- upper level servers use
this for max accuracy
Level 2
Level 2
Level 2
The tree can reconfigure itself if some node fails.
Let Q’s time be ahead of P’s time by .
Then
T2
Q
T2 = T1 + TPQ + 
T4 = T3 + TQP - 
T3
y = TPQ + TQP = T2 +T4 -T1 -T3 (RTT)

P
T1
= (T2 -T4 -T1 +T3) / 2 - (TPQ - TQP) / 2
T4
x
Between y/2 and -y/2
So, x- y/2 ≤  ≤ x+ y/2
Ping several times, and obtain the smallest value of y. Use it to calculate 
1. What problems can occur when a clock value is
Advanced from 171 to 174?
2. What problems can occur when a clock value is
Moved back from 180 to 175?
1.What happened to the instant 172 and 173?
2. The instants 175 -180 appear twice
Mutual Exclusion
p0
CS
p1
CS
p2
CS
p3
CS
Some applications are:
1.
2.
3.
4.
5.
Resource sharing
Avoiding concurrent update on shared data
Controlling the grain of atomicity
Medium Access Control in Ethernet
Collision avoidance in wireless broadcasts
ME1. At most one process in the CS. (Safety property)
ME2. No deadlock. (Safety property)
ME3. Every process trying to enter its CS must eventually succeed.
This is called progress. (Liveness property)
Progress is quantified by the criterion of bounded waiting. It
measures
a form of fairness by answering the question:
Between two consecutive CS trips by one process, how many times
other processes can enter the CS?
There are many solutions, both on the shared memory model and the
message-passing model
Client
do true 
send request;
reply received  enter CS;
send release;
<other work>
server
busy: boolean
od
queue
req
reply
clients
release
Server
do request received and not busy  send reply; busy:= true
request received and busy  enqueue sender
release received and queue is empty  busy:= false
release received and queue not empty  send reply
to the head of the queue
od
- Centralized solution is simple.
- But the server is a single point of failure. This is BAD.
- ME1-ME3 is satisfied, but FIFO fairness is not guaranteed.
Why?
Can we do better? Yes!
{Life of each process}
1. Broadcast a timestamped request to all.
2.
Request received  enqueue sender in local Q;.
Not in CS  send ack
Q0
Q1
0
1
2
3
In CS  postpone sending ack (until exit
from CS).
3. Enter CS, when
(i) You are at the head of your own local Q
(ii) You have received ack from all processes
Q2
Q3
4. To exit from the CS,
(i) Delete the request from Q, and
(ii) Broadcast a timestamped release
5. Release received  remove sender from local Q.
Completely connected topology
Can you show that it satisfies
all the properties (i.e. ME1, ME2,
ME3) of a correct solution?
{Lamport’s algorithm}
1. Broadcast a timestamped request to all.
2. Request received  enqueue it in local Q. Not in
CS  send ack, else postpone sending ack
until exit from CS.
3. Enter CS, when
(i) You are at the head of your Q
(ii) You have received ack from all
4. To exit from the CS,
Q0
(i) Delete the request from your Q, and
(ii) Broadcast a timestamped release
5. When a process receives a release message, it
removes the sender from its Q.
Q2
Q1
0
1
2
3
Q3
Completely connected topology
Can you show that it satisfies all the properties
(i.e. ME1, ME2, ME3) of a correct solution?
Q0
Observation. Processes taking a decision to enter CS
must have identical views of their local queues,
when all acks have been received.
Proof of ME1. At most one process can be in its CS at
any time.
Suppose not, and both j,k enter their CS. This implies


j in CS  Qj.ts.j < Qk.ts.k
k in CS  Qk.ts.k < Qj.ts.j
Impossible.
Q2
Q1
0
1
2
3
Q3
Proof of ME2. (No deadlock)
The waiting chain is acyclic.
i waits for j
 i is behind j in all queues
(or j is in its CS)
 j does not wait for i
Proof of ME3. (progress)
New requests join the end of the
queues, so new requests do not
pass the old ones
Q0
Q2
Q1
0
1
2
3
Q3
Proof of FIFO fairness.
timestamp (j) < timestamp (k)
Req (30)
 j enters its CS before k does so
Suppose not. So, k enters its CS before j. So k
did not receive j’s request. But k received the
ack from j for its own req.
This is impossible if the channels are FIFO
.
Message complexity = 3(N-1)
(N-1 requests + N-1 ack + N-1 release)
j
k
Req
(20)
ack
{Ricart
& Agrawala’s algorithm}
What is new?
1. Broadcast a timestamped request to all.
2. Upon receiving a request, send ack if
-You do not want to enter your CS, or
-You are trying to enter your CS, but your timestamp is
higher than that of the sender.
(If you are already in CS, then buffer the request)
3. Enter CS, when you receive ack from all.
4. Upon exit from CS, send ack to each
pending request before making a new request.
(No release message is necessary)
{Ricart & Agrawala’s algorithm}
ME1. Prove that at most one process can be in CS.
ME2. Prove that deadlock is not possible.
ME3. Prove that FIFO fairness holds even if
channels are not FIFO
Message complexity = 2(N-1)
(N-1 requests + N-1 acks - no release message)
TS(j) < TS(k)
Req(k)
Ack(j)
j
k
Req(j)
Timestamps grow in an unbounded manner.
This makes real implementation impossible.
Can we somehow bounded timestamps?
Think about it.
{Maekawa’s algorithm}
- First solution with a sublinear O(sqrt N) message
complexity.
- “Close to” Ricart-Agrawala’s solution, but each process
is required to obtain permission from only a subset
of peers

With each process i, associate a subset
Si.Divide the set of processes into
subsets that satisfy the following two
conditions:
S1
S0
0,1,2
1,3,5
i  Si
i,j :  i,j  n-1 :: Si Sj ≠ 

Main idea. Each process i is required to
receive permission from Si only.
Correctness requires that multiple
processes will never receive permission
from all members of their respective
subsets.
2,4,5
S2
Example. Let there be seven processes 0, 1, 2, 3, 4, 5, 6
S0
S1
S2
S3
S4
S5
S6
=
=
=
=
=
=
=
{0, 1, 2}
{1, 3, 5}
{2, 4, 5}
{0, 3, 4}
{1, 4, 6}
{0, 5, 6}
{2, 3, 6}
Version 1 {Life of process I}
1. Send timestamped request to each process in
Si.
2. Request received  send ack to process with
the lowest timestamp. Thereafter, "lock" (i.e.
commit) yourself to that process, and keep
others waiting.
3. Enter CS if you receive an ack from each
member in Si.
4. To exit CS, send release to every process in Si.
5. Release received  unlock yourself. Then send
ack to the next process with the lowest
timestamp.
S0 =
{0, 1, 2}
S1 =
{1, 3, 5}
S2 =
{2, 4, 5}
S3 =
{0, 3, 4}
S4 =
{1, 4, 6}
S5 =
{0, 5, 6}
S6 =
{2, 3, 6}
S0 =
{0, 1, 2}
S1 =
{1, 3, 5}
S2 =
{2, 4, 5}
S3 =
{0, 3, 4}
S4 =
{1, 4, 6}
Process k will never send ack to both.
S5 =
{0, 5, 6}
So it will act as the arbitrator and establishes ME1
S6 =
{2, 3, 6}
ME1. At most one process can enter its critical
section at any time.
Let i and j attempt to enter their Critical Sections
Si  Sj ≠ 
there is a process k  Si  Sj
ME2. No deadlock. Unfortunately deadlock
is possible! Assume 0, 1, 2 want to enter
their critical sections.
From S0= {0,1,2}, 0,2 send ack to 0, but 1
sends ack to 1;
From S1= {1,3,5}, 1,3 send ack to 1, but 5
sends ack to 2;
From S2= {2,4,5}, 4,5 send ack to 2, but 2
sends ack to 0;
Now, 0 waits for 1, 1 waits for 2, and 2 waits
for 0. So deadlock is possible!
S0
S1
S2
S3
S4
S5
S6
=
=
=
=
=
=
=
{0,
{1,
{2,
{0,
{1,
{0,
{2,
1,
3,
4,
3,
4,
5,
3,
2}
5}
5}
4}
6}
6}
6}
Avoiding deadlock
If processes receive messages in
increasing order of timestamp, then
deadlock “could be” avoided. But this is
too strong an assumption.
Version 2 uses three additional messages:
- failed
- inquire
- relinquish
S0 =
{0, 1, 2}
S1 =
{1, 3, 5}
S2 =
{2, 4, 5}
S3 =
{0, 3, 4}
S4 =
{1, 4, 6}
S5 =
{0, 5, 6}
S6 =
{2, 3, 6}
New features in version 2
Send ack and set lock as usual.
- If lock is set and a request with larger
timestamp arrives, send failed (you have no
chance). If the incoming request has a lower
timestamp, then send inquire (are you in
CS?) to the locked process.
- Receive inquire and at least one failed
message  send relinquish. The recipient
resets the lock.
-
S0 =
{0, 1, 2}
S1 =
{1, 3, 5}
S2 =
{2, 4, 5}
S3 =
{0, 3, 4}
S4 =
{1, 4, 6}
S5 =
{0, 5, 6}
S6 =
{2, 3, 6}
0
0
6
1
ack
18
req
5
4
12
req
6
1
25
req
2
failed
inquire
5
18
2
12
3
4
3
-
Let K = |Si|. Let each process be a member of D
subsets. When N = 7, K = D = 3. When K=D, N = K(K1)+1. So K is of the order √N
- The message complexity of Version 1 is 3√N.
Maekawa’s analysis of Version 2 reveals a complexity
of 7√N

Sanders identified a bug in version 2 …
Suzuki-Kasami algorithm
The Main idea
Completely connected network of processes
There is one token in the network. The holder
of the token has the permission to enter
CS.
Any other process trying to enter CS must
acquire that token. Thus the token will
move from one process to another based
on demand.
I want to enter CS
I want to enter CS
req
Process i broadcasts (i, num)
req
Sequence number
of the request
queue Q
Each process maintains
-an array req: req[j] denotes the
sequence no of the latest request from
process j
(Some requests will be stale soon)
Additionally, the holder of the token maintains
-an array last: last[j] denotes the
sequence number of the latest visit to CS
from for process j.
- a queue Q of waiting processes
last
req
req
req
req: array[0..n-1] of integer
last: array [0..n-1] of integer
When a process i receives a request (k, num)
from
process k, it sets req[k] to max(req[k], num).
The holder of the token
--Completes its CS
--Sets last[i]:= its own num
--Updates Q by retaining each process k only
if
1+ last[k] = req[k]
(This guarantees the freshness of the request)
--Sends the token to the head of Q, along with
the array last and the tail of Q
In fact, token  (Q, last)
Req: array[0..n-1] of integer
Last: Array [0..n-1] of integer
{Program of process j}
Initially, i: req[i] = last[i] = 0
* Entry protocol *
req[j] := req[j] + 1
Send (j, req[j]) to all
Wait until token (Q, last) arrives
Critical Section
* Exit protocol *
last[j] := req[j]
k ≠ j: k  Q  req[k] = last[k] + 1  append k to Q;
if Q is not empty  send (tail-of-Q, last) to head-of-Q fi
* Upon receiving a request (k, num) *
req[k] := max(req[k], num)
req=[1,0,0,0,0]
req=[1,0,0,0,0]
last=[0,0,0,0,0]
1
0
2
req=[1,0,0,0,0]
4
req=[1,0,0,0,0]
3
req=[1,0,0,0,0]
initial state
req=[1,1,1,0,0]
req=[1,1,1,0,0]
last=[0,0,0,0,0]
1
0
2
req=[1,1,1,0,0]
4
req=[1,1,1,0,0]
3
req=[1,1,1,0,0]
1 & 2 send requests
req=[1,1,1,0,0]
req=[1,1,1,0,0]
last=[1,0,0,0,0]
Q=(1,2)
1
0
2
req=[1,1,1,0,0]
4
req=[1,1,1,0,0]
3
req=[1,1,1,0,0]
0 prepares to exit CS
req=[1,1,1,0,0]
1
0
req=[1,1,1,0,0]
last=[1,0,0,0,0]
Q=(2)
2
req=[1,1,1,0,0]
4
req=[1,1,1,0,0]
3
req=[1,1,1,0,0]
0 passes token (Q and last) to 1
req=[2,1,1,1,0]
1
0
req=[2,1,1,1,0]
last=[1,0,0,0,0]
Q=(2,0,3)
2
req=[2,1,1,1,0]
4
req=[2,1,1,1,0]
3
req=[2,1,1,1,0]
0 and 3 send requests
req=[2,1,1,1,0]
req=[2,1,1,1,0]
1
0
2
4
req=[2,1,1,1,0]
3
req=[2,1,1,1,0]
last=[1,1,0,0,0]
Q=(0,3)
req=[2,1,1,1,0]
1 sends token to 2
3
1
2
1
4
5
4
1
6
1,4
7
4,7
1,4,7 want to enter their CS
3
2
1
1
4
5
4
6 1,4
7
4,7
2 sends the token to 6
3
1
2
6
4
7
4
4
4
5
4,7
6 forwards the token to 1
The message complexity is O(diameter) of the tree. Extensive
empirical measurements show that the average diameter of randomly
chosen trees of size n is O(log n). Therefore, the authors claim that the
average message complexity is O(log n)
--
How many messages are in transit
on the internet?
-- What is the global state of a distributed
system of N processes?
How do we compute these?
2
(2,0)
0
(1,2)
1
(0,1)
Let a $1 coin circulate in a network of a million banks.
How can someone count the total $ in circulation? If
not counted “properly,” then one may think the total $
in circulation to be one million.
Major uses in
- deadlock detection
- termination detection
- rollback recovery
- global predicate
computation
A cut is a set of events.
(a  consistent cut C)  (b happened before a)  b  C
If this is not true, then the cut is inconsistent
P1
b
a
c
m
P2
d
g
e
f
P3
k
Cut 1
(Consistent)
h
i
Cut 2
(Not consistent)
j
The set of states immediately following a consistent


cut forms a consistent snapshot of a distributed
system.
A snapshot that is of practical interest is the most
recent one. Let C1 and C2 be two consistent cuts
and C1  C2. Then C2 is more recent than C1.
Analyze why certain cuts in the one-dollar bank are
inconsistent.
How to record a consistent snapshot? Note that
1. The recording must be non-invasive
2. Recording must be done on-the-fly.
You cannot stop the system.