Coverage
• Nature of Causality
• Causality: Why is it Important/Useful?
• Causality in Life vs. Causality in Distributed Systems
• Modeling Distributed Events - Defining Causality
• Logical Clocks
• General Implementation of Logical Clocks
• Scalar Logical Time
• Demo: Scalar Logical Time with Asynchronous
Unicast Communication between Multiple Processes
• Conclusions
• Questions / Reference
Nature of Causality
• Consider a distributed computation which is
performed by a set of processes:
– The objective is to have the processes work
towards and achieve a common goal
– Processes do not share global memory
– Communication occurs through message passing
only
Process Actions
• Actions are modeled as three types of events:
– Internal Event: affects only the process which is
executing the event,
– Send Event: a process passes messages to other
processes,
– Receive Event: a process gets messages from
other processes.
[Figure: space-time diagram of processes P1, P2, and P3 with events a through r]
Causal Precedence
• Ordering of events for a single process is simple:
they are ordered by their occurrence.
[Figure: a single process P with events a, b, c, d]
• Send and Receive events signify the flow of
information between processes, and establish causal
precedence between events at the sender and
receiver

[Figure: processes P1 and P2 exchanging messages, with events a, b, c, d]

Distributed Events
• The execution of this distributed computation results
in the generation of a set of distributed events.
• The causal precedence induced by the send and
receive events establishes a partial order of these
distributed events:
– The precedence relation in our case is “Happened
Before”, e.g. for two events a and b, a → b means
“a happened before b”.
[Figure: P1 and P2 with a → b (event a precedes event b)]
Causality:
Why is it important/useful?
• This causality among events (induced by the
“happened before” precedence relation) is
important for several reasons:
– Helps us solve problems in distributed computing:
  • Can ensure liveness and fairness in mutual exclusion
  algorithms,
  • Helps maintain consistency in replicated databases,
  • Facilitates the design of deadlock detection
  algorithms in distributed systems.
Importance of Causality (Continued)
• Debugging of distributed systems: allows the
resumption of execution.
• System failure recovery: allows checkpoints to be
built so that a system can be restarted from a
point other than the beginning.
• Helps a process to measure the progress of other
processes in the system:
– Allows processes to discard obsolete
information,
– Allows processes to detect the termination of
other processes.
Importance of Causality
• Allows distributed systems to optimize the
concurrency of all processes involved:
– Knowing the number of causally dependent events in a
distributed system allows one to measure the
concurrency of a system: all events that are not
causally related can be executed concurrently.
Causality:
Life vs. Distributed Systems
• We use causality in our lives to determine the
feasibility of daily, weekly, and yearly plans,
• We use global time and (loosely) synchronized clocks
(wristwatches, wall clocks, PC clocks, etc.).
Causality (Continued)
• However, events in real life usually do not occur at
the same rate as those in a distributed system:
– Distributed systems’ event occurrence rates are
much higher,
– Event execution times are much smaller.
• Also, distributed systems do not have a “global” clock
that they can refer to.
• There is hope though! We can use “Logical Clocks”
to establish order.
Modeling Distributed Events:
Defining Causality and Order
• We model a distributed program as a set of asynchronous
processes p1, p2, …, pn, which communicate through a
network using message passing only. Process
execution and message transfer are asynchronous.
[Figure: processes P1, P2, P3, and P4]
Modeling Distributed Events
• Notation: given two events e1 and e2,
– e1 → e2 : e2 is dependent on e1
– if e1 ↛ e2 and e2 ↛ e1 then e1 and e2 are concurrent: e1 || e2
[Figure: two diagrams of events e1 and e2 on processes P1 and P2]
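The notation above can be sketched in Java. This is only an illustration under stated assumptions: the class `CausalGraph` and its methods are hypothetical names, events are plain strings, and the direct precedence edges (consecutive events on one process, or a send paired with its receive) are supplied by the caller. "Happened before" is then reachability along those edges, and two events are concurrent when neither reaches the other.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: stores direct precedence edges and answers "happened before" queries. */
final class CausalGraph {
    private final Map<String, Set<String>> successors = new HashMap<>();

    /** Records that event `from` directly precedes event `to`. */
    void addEdge(String from, String to) {
        successors.computeIfAbsent(from, k -> new HashSet<>()).add(to);
    }

    /** True if a non-empty chain of edges leads from e1 to e2, i.e. e1 -> e2. */
    boolean happenedBefore(String e1, String e2) {
        Deque<String> stack = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        stack.push(e1);
        while (!stack.isEmpty()) {
            String current = stack.pop();
            Set<String> next = successors.getOrDefault(current, Collections.<String>emptySet());
            for (String candidate : next) {
                if (candidate.equals(e2)) {
                    return true;
                }
                if (seen.add(candidate)) {
                    stack.push(candidate);
                }
            }
        }
        return false;
    }

    /** True if neither event happened before the other: e1 || e2. */
    boolean concurrent(String e1, String e2) {
        return !e1.equals(e2) && !happenedBefore(e1, e2) && !happenedBefore(e2, e1);
    }
}
```

For example, with edges a → b, b → c and x → c, the sketch reports a → c (by transitivity) and a || x.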
Logical Clocks
• In a system of logical clocks, every process has a
logical clock that is advanced using a set of rules.
[Figure: processes P1, P2, and P3]
Logical Clocks - Timestamps
• Every event is assigned a timestamp (which the
processes use to infer causality between events).
[Figure: P1, P2, and a data message]
Logical Clocks - Timestamps
• The timestamps obey the monotonicity property, e.g.
if an event a causally affects an event b, then the
timestamp for a is smaller than the timestamp for b.
[Figure: P1 and P2 with events a and b; event a’s timestamp is
smaller than event b’s timestamp]
Formal Definition of Logical Clocks
• The definition of a system of logical clocks:
– We have a logical clock C, which is a function that
maps events to timestamps, e.g. for an event e,
C(e) would be its timestamp.
[Figure: P1, P2, event e, a data message, and the timestamp C(e)]
Formal Definition of Logical Clocks
• Call the set of all events e in a distributed system H.
Applying the function C to all events in H
generates a set T:
∀ e ∈ H, C(e) ∈ T
[Figure: P1 and P2 with events a, b, c, d]
H = { a, b, c, d }
T = { C(a), C(b), C(c), C(d) }
Formal Definition of Logical Clocks
• We define the relation for timestamps, “<”, to be our
precedence relation: “happened before”.
• Elements in the set T are partially ordered by this
precedence relation, i.e. the timestamps for each event
in the distributed system are partially ordered by their
time of occurrence.
More formally,
e1 → e2 ⇒ C(e1) < C(e2)
Formal Definition of Logical Clocks
• What we’ve said so far is, “If e2 depends on e1, then
the timestamp of e1 is smaller than the timestamp of e2.”
• This enforces monotonicity for timestamps of events
in the distributed system, and is sometimes called the
clock consistency condition.
General Implementation of
Logical Clocks
• We need to address two issues:
– The data structure to use for representing the
logical clock, and
– The design of a protocol which dictates how the
logical clock data structure is updated.
Logical Clock Implementation:
Clock Structure
• The structure of a logical clock should allow a
process to keep track of its own progress, and the
value of the logical clock. There are three well-known
structures:
– Scalar: a single integer,
– Vector: an n-element vector (n is the number of
processes in the distributed system),
– Matrix: an n×n matrix.
Logical Clock Implementation:
Clock Structure
• Vector: Each process keeps an n-element vector
C1: Process 1’s Logical Time
C2: Process 1’s view of Process 2’s Logical Time
C3: Process 1’s view of Process 3’s Logical Time
• Matrix: Each process keeps an n-by-n matrix
C1 C1´ C1´´
C2 C2´ C2´´
C3 C3´ C3´´
– Process 1’s view of Process 3’s view of
everyone’s Logical Time
– Process 1’s Logical Time and view of
Process 2’s and Process 3’s logical time
...
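To make the size difference between the three structures concrete, here is a minimal Java sketch. The class name `ClockStructures`, the 0-based indices, and the index convention for the matrix are assumptions made for illustration; they are not taken from the slides.

```java
/** Illustrative only: the three clock structures as plain Java fields of one process. */
final class ClockStructures {
    long scalar;            // scalar clock: a single integer
    final long[] vector;    // vector clock: vector[j] is this process's view of pj's logical time
    final long[][] matrix;  // matrix clock: matrix[j][k] is this process's view of pj's view of pk's time

    /** n is the number of processes in the distributed system. */
    ClockStructures(int n) {
        this.vector = new long[n];
        this.matrix = new long[n][n];
    }

    /** Number of integers each structure stores, in the order {scalar, vector, matrix}. */
    int[] storageSizes() {
        return new int[] {1, vector.length, matrix.length * matrix.length};
    }
}
```

With n = 3 the per-process storage is 1, 3, and 9 integers respectively, which is the trade-off between overhead and how much a process knows about the others.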
Logical Clock Implementation:
Clock Update Protocol
• The goal of the update protocol is to ensure that the
logical clock is managed consistently; consequently,
we’ll use two general rules:
– R1: Governs the update of the local clock when an event
occurs (internal, send, receive),
– R2: Governs the update of the global logical clock
(determines how to handle timestamps of messages
received).
• All logical clock systems use some form of these two
rules, even if their implementations differ; clock
monotonicity (consistency) is preserved due to these
rules.
Scalar Logical Time
• Scalar implementation – Lamport, 1978
• Again, the goal is to have some mechanism that
enforces causality between some events, inducing
a partial order of the events in a distributed
system,
• Scalar Logical Time is a way to totally order all
the events in a distributed system,
• As with all logical time methods, we need to define
both a structure and update methods.
Scalar Logical Time: Structure
• Local time and logical time are represented by a
single integer, i.e.:
– Each process pi uses an integer Ci to keep track of
logical time.
[Figure: processes P1, P2, and P3 with clocks C1, C2, and C3]
Scalar Logical Time:
Logical Clock Update Protocol
• Next, we need to define the clock update rules:
– For each process pi:
  • R1: Before executing an event, pi executes the following:
  Ci = Ci + d (d > 0)
  d is a constant, typically the value 1.
  • R2: Each message contains the timestamp of the sending
  process. When pi receives a message with a timestamp
  Cmsg, it executes the following:
  Ci = max(Ci, Cmsg)
  Execute R1
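Rules R1 and R2 translate almost directly into code. The following is a minimal sketch, not the demo's actual source: the class name `ScalarClock` and the method names `tick` and `receive` are hypothetical, and the methods are synchronized on the assumption that a receive thread and a sending thread may share one clock.

```java
/** A minimal sketch of a scalar logical clock for one process pi; names are illustrative. */
final class ScalarClock {
    private final long d;  // increment used by R1, must be > 0
    private long time;     // Ci, initially 0

    ScalarClock(long d) {
        if (d <= 0) {
            throw new IllegalArgumentException("d must be positive");
        }
        this.d = d;
    }

    /** R1: advance the clock before executing an event; returns the event's timestamp. */
    synchronized long tick() {
        time += d;
        return time;
    }

    /** R2: take the maximum of Ci and the message timestamp Cmsg, then execute R1. */
    synchronized long receive(long messageTimestamp) {
        time = Math.max(time, messageTimestamp);
        return tick();
    }

    /** Current value of Ci. */
    synchronized long current() {
        return time;
    }
}
```

With d = 1, a process whose clock reads 3 and which receives a message stamped 6 first moves to max(3, 6) = 6 and then to 7, while a process at 2 receiving a message stamped 1 stays at max(2, 1) = 2 and then moves to 3.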
Scalar Logical Time:
Update Protocol Example
[Figure: update protocol example for P1 and P2 with d = 1]
P1: C1 = 0, C1 = 1 (R1), C1 = 2 (R1), C1 = 3 (R1),
C1 = max(3, 6) (R2), C1 = 7 (R1)
P2: C2 = 0, C2 = 1 (R1), C2 = 2 (R1), C2 = max(2, 1) (R2),
C2 = 3 (R1), C2 = 4 (R1), C2 = 5 (R1), C2 = 6 (R1), C2 = 7 (R1)
Scalar Logical Time: Properties
• Properties of this implementation:
– Maintains monotonicity and consistency
properties,
– Provides a total ordering of events in a distributed
system.
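Two events on different processes can carry the same scalar timestamp, so a total order needs a tie-breaker. A common choice, following Lamport's paper, is to compare the pair (timestamp, process identifier). The sketch below assumes integer process identifiers; the class names `Stamp` and `TotalOrderExample` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical pair (scalar time, process id) used to order events totally. */
final class Stamp implements Comparable<Stamp> {
    final long time;
    final int processId;

    Stamp(long time, int processId) {
        this.time = time;
        this.processId = processId;
    }

    /** Orders by scalar time first; equal times are broken by process id. */
    @Override
    public int compareTo(Stamp other) {
        int byTime = Long.compare(time, other.time);
        return byTime != 0 ? byTime : Integer.compare(processId, other.processId);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Stamp)) {
            return false;
        }
        Stamp s = (Stamp) o;
        return time == s.time && processId == s.processId;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(time) * 31 + processId;
    }

    @Override
    public String toString() {
        return "(" + time + ", P" + processId + ")";
    }
}

final class TotalOrderExample {
    /** Returns a sorted copy of the given stamps, leaving the input untouched. */
    static List<Stamp> sorted(List<Stamp> stamps) {
        List<Stamp> copy = new ArrayList<>(stamps);
        Collections.sort(copy);
        return copy;
    }
}
```

Sorting the stamps (2, P2), (2, P1), (1, P3) this way yields (1, P3), (2, P1), (2, P2): the two events with time 2 are ordered by process id alone, which says nothing about causality between them.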
Scalar Logical Time: Pros and Cons
• Advantages
– We get a total ordering of events in the system. All
the benefits gained from knowing the causality of
events in the system apply,
– Small overhead: one integer per process.
• Disadvantage
– Clocks are not strongly consistent: a timestamp
does not record which events an event depends on,
so C(e1) < C(e2) does not imply e1 → e2. This is
because we are using a single integer to store both
the local and logical time.
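A small worked case shows the lack of strong consistency. The sketch below assumes two processes P1 and P2 that never exchange a message, with d = 1; the class name `WeakConsistencyExample` is hypothetical. Event a on P1 and event b on P2 are concurrent by construction, yet their scalar timestamps are ordered.

```java
/** Illustrative only: scalar timestamps can be ordered even when the events are concurrent. */
final class WeakConsistencyExample {
    /** Returns {C(a), C(b)} for event a on P1 and event b on P2, with no messages exchanged. */
    static long[] timestamps() {
        long c1 = 0;  // scalar clock of P1
        long c2 = 0;  // scalar clock of P2

        c1 += 1;      // R1 on P1: event a
        long a = c1;

        c2 += 1;      // R1 on P2: an earlier internal event
        c2 += 1;      // R1 on P2: event b
        long b = c2;

        return new long[] {a, b};
    }
}
```

Here C(a) = 1 and C(b) = 2, so C(a) < C(b), but a did not happen before b. The timestamps alone cannot tell this case apart from one where a really did precede b.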
Demo - Simple Scalar Logical Time
Application
• Consists of several processes, communicating
asynchronously via Unicast,
• Only Send and Receive events are used; internal
events can be disregarded since they only complicate
the demo (imagine processes which perform no
internal calculations),
• Scalar logical time is used,
• Written in Java.
Demo: Event Sequence
• Start one process (P1)
• P1 uses a receive thread to process incoming
messages asynchronously.
• P1 will sleep for a random number of seconds
• Upon waking, P1 will attempt to send a message to a
random process, emulating asynchronous and
random sending. P1 repeats this process.
• Start process 2 (P2). The design of the application
allows processes to know who is in the system at all
times.
• P2 performs the same steps as P1…
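The event sequence above might look roughly like the following in Java. This is a sketch under assumptions, not the demo's source code: the class `DemoProcess` and its methods are hypothetical, in-memory queues stand in for network unicast, d = 1, the list of peers is passed in by the caller and assumed non-empty, and the random sleep uses milliseconds rather than seconds.

```java
import java.util.List;
import java.util.Random;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical process with a scalar clock, an inbox, and a receive thread. */
final class DemoProcess {
    private final String name;
    private final BlockingQueue<Long> inbox = new LinkedBlockingQueue<>();
    private long clock;  // guarded by `this`

    DemoProcess(String name) {
        this.name = name;
    }

    /** Send event: apply R1, then deliver the timestamp to the target's inbox (unicast). */
    void sendTo(DemoProcess target) {
        long stamp;
        synchronized (this) {
            clock += 1;
            stamp = clock;
        }
        target.inbox.add(stamp);
    }

    /** Receive event: block for the next message, apply R2 then R1; returns the new clock value. */
    long receiveOne() throws InterruptedException {
        long stamp = inbox.take();
        synchronized (this) {
            clock = Math.max(clock, stamp) + 1;
            return clock;
        }
    }

    /** Starts a daemon thread that processes incoming messages asynchronously. */
    Thread startReceiver() {
        Thread receiver = new Thread(() -> {
            try {
                while (true) {
                    receiveOne();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();  // stop quietly when interrupted
            }
        }, name + "-receiver");
        receiver.setDaemon(true);
        receiver.start();
        return receiver;
    }

    /** Sleeps for a random time, then sends to a random peer; repeated `rounds` times. */
    void runSender(List<DemoProcess> peers, Random random, int rounds) throws InterruptedException {
        for (int i = 0; i < rounds; i++) {
            Thread.sleep(random.nextInt(1000));
            sendTo(peers.get(random.nextInt(peers.size())));
        }
    }

    synchronized long clock() {
        return clock;
    }
}
```

Calling `receiveOne` directly, without the receive thread, gives a deterministic walk-through: if P1 sends to P2, P1's clock becomes 1 and P2's becomes 2; if P2 then replies, P2's clock becomes 3 and P1's becomes 4.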
Conclusions
• Logical time is used for establishing an ordering of
events in a distributed system,
• Logical time is useful for several important areas and
problems in Computer Science,
• Implementation of logical time in a distributed system
can range from simple (scalar-based) to complex
(matrix-based), and covers a wide range of
applications,
• Efficient implementations exist for vector and matrix
based clocks, and must be considered for any
large scale distributed system.