Logical Clocks

Logical Clocks (2)
Topics
 Logical clocks
 Totally-Ordered Multicasting
 Vector timestamps
Readings
 Van Steen and Tanenbaum: 5.2
 Coulouris: 10.4
 L. Lamport, “Time, Clocks and the Ordering
of Events in Distributed Systems,”
Communications of the ACM, Vol. 21, No. 7,
July 1978, pp. 558-565.
 C.J. Fidge, “Timestamps in Message-Passing
Systems that Preserve the Partial
Ordering”, Proceedings of the 11th
Australian Computer Science Conference,
Brisbane, pp. 56-66, February 1988.
Problems with Lamport Clocks
 Lamport timestamps do not capture
causality.
 With Lamport’s clocks, one cannot directly
compare the timestamps of two events to
determine their precedence relationship.
If C(a) < C(b) is not true then a  b is also not
true.
 Knowing that C(a) < C(b) is true does not allow
us to conclude that a  b is true.
 Example: In the first timing diagram, C(e) = 1
and C(b) = 2; thus C(e) < C(b) but it is not the
case that e  b

Problems with Lamport Clocks
 Why is this important?
 For the banking example, it doesn’t matter what
order operations are executed at each replica.
 It only matters that all replicas execute the
operations in the same order. This is total order.
 Causal order is used when a message received by a
process can potentially affect any subsequent
message sent by that process. Those messages
should be received in that order at all processes.
Unrelated messages may be delivered in any order.
 Do Lamport timestamps support causal order?

NO
Example Application:Bulletin
Board
 The Internet’s electronic bulletin board (or
Google’s Groups or any chat service)
 Users (processes) join specific groups
(discussion groups).
 Postings, whether they are articles or
reactions, are multicast to all group
members.
 Could use a totally-ordered multicasting
scheme.

Should we be doing this?
Display from a Bulletin Board
Program
 Users run bulletin board applications which multicast
messages
 One multicast group per topic (e.g. os.interesting)
 Require reliable multicast - so that all members receive
messages (more on this later)
 Ordering:
Bulletin board: os.interesting
total (makes
the numbers
From
Subject
causal (makes
Item
the same at
replies come after
23
A.Hanlon
Mach
all sites)
original message)
24
G.Joseph
Microkernels
25
A.Hanlon
Re: Microkernels
26
T.L’Heureux
RPC performance
27
M.Walker
Re: Mach
end
FIFO (gives sender order)
Figure 11.13
Colouris
•
Example Application: Bulletin
Board
 A totally-ordered multicasting scheme
does not imply that if message B is
delivered after message A, that B is a
reaction to A.
 The receipt of an article precedes the
posting of a reaction.
 The receipt of the reaction to an article
should always follow the receipt of the
article.
Example Application: Bulletin
Board
 If we look at the bulletin board example, it
is allowed to have items 26 and 27 in
different order at different sites.
 Items 25 and 26 may be in different order
at different sites.
Problem with Lamport Clocks
 The main problem is that a simple integer
clock cannot order both events within a
process and events in different processes.
 C. Fidge developed an algorithm that
overcomes this problem.
 Fidge’s clock is represented as a vector
[v1,v2,…,vn] with an integer clock value for
each process (vi contains the clock value of
process i). This is a vector timestamp.
Vector Timestamps
 Properties of vector timestamps
 vi
[i] is the number of events that have
occurred so far at Pi
 If vi [j] = k then Pi knows that k events have
occurred at Pj
Vector Timestamps

A vector clock is maintained as follows:



Initially all clock values are set to the smallest
value (e.g., 0).
The local clock value is incremented at least
once before each event of interest (in our
case this will be a send event) in a process, q
i.e., vq[q] = vq[q] +1
Let vq be piggybacked on the message sent by
process q to process p; We then have:
•
For i = 1 to n do
vp[i] = max(vp[i], vq [i] );
Vector Timestamp
 For two vector timestamps, va and vb
 va
 vb if there exists an i such that va[i]  vb[i]
 va ≤ vb if for all i va[i] ≤ vb[i]
 va < vb if for all i va[i] ≤ vb[i] AND va is not equal
to vb
 Events a and b are causally related if va < vb
or vb< va .
 Vector timestamps can be used to
guarantee causal message delivery.
Example Application: Bulletin
Board
 Each process Pi has an array Vi where Vi[j]
denotes the number of events that process
Pi knows have taken place for Pj.
In this application “events” refers to the
sending of a message.
 Thus if Vi[j] = 6 then Pi knows that Pj has sent 6
messages.

Example Application: Bulletin
Board
 Let Vi be piggybacked on the message sent
by process Pi to process Pj ; When process
Pj receives the message, then Pj does the
following:
• For k = 1 to n do
Vj[k] = max(Vi[k], Vj [k] );
• Thus if process i knows that process k sent 5
messages (Vi [k] =5 and Vj[k] = 3) then
process j didn’t know about the latest two
messages sent by process k.
Example Application: Bulletin
Board
 When process i sends a message, it does
the following:
Vi[i] = Vi[i] + 1;
 This is basically saying that the number of
messages process i has sent is incremented
by one.
Example Application: Bulletin
Board
 When a process Pi posts (sends) an article,
a, it multicasts that article with timestamp
Vi
 Process Pj posts a reaction. Let’s call this
message r. Assume that the value of the
timestamp is Vj
 Note that Vj > Vi
 Message r may arrive at Pk before message
a.
 Pk will postpone delivery of r to the display
of the bulletin board until all messages
that causally precede r have been received
Example Application: Bulletin
Board
 Message r (from Pj ) is delivered to Pk iff the
following conditions are met:

Vj[j] = Vk[j]+1
• This condition is satisfied if r is the next message that Pk
was expecting from process Pj
• Assume that Vk[j]=5. This means that Pk knows Pj sent 5
messages and that it should be waiting for the 6th message.
• If Vj[j] = 8 then messages 6,7 are lost or not yet arrived.

Vj[i] ≤ Vk[i] for all i not equal to j
• This condition is satisfied if Pk has seen at least as many
messages as seen by Pj when it sent message r.
• Assume for some i that Vj[i] = 2 and Vk[i] =1 and thus Vj[i] >
Vk[i] for some i. This indicates that Pi sent a message that
was received by Pj but not Pk
Example Application: Bulletin
Board
P2 [0,0,0]
P1 [0,0,0]
P3 [0,0,0]
Post a
[1,0,0] a
c
[1,0,0]
e
[1,0,0]
[1,0,1] r: Reply a
b
d
[1,0,1]
g
[1,0,1]
Message a arrives at P2 before the reply r from P3 does
Example Application: Bulletin
Board
P2 [0,0,0]
P1 [0,0,0]
P3 [0,0,0]
Post a
[1,0,0] a
[1,0,0]
g
[1,0,1] r: Reply a
Buffered
b
[1,0,1]
d
c
[1,0,0]
Deliver r
The message a arrives at P2 after the reply from P3; The reply is
not delivered right away.
Summary
 No notion of a globally shared clock.
 Local (physical) clocks must be
synchronized based on algorithms that
take into account network latency.
 Knowing the absolute time is not necessary.
 Logical clocks can be used for ordering
purposes.