Time and Global States

CSS434 Time and Global States
Textbook Ch11
Professor: Munehiro Fukuda
CSS434 Time & Global States
1
Outline

Physical clock synchronization


Logical clock synchronization


Parallel and distributed simulation
Global states and consistent cuts


Applications: make
Distributed garbage collection, deadlock detection,
distributed termination detection, and discrete-event simulation
Distributed debugging

Checking whether a captured snapshot is one of
the transitory states we have considered.
Why Clock Synchronization


Computer clock: a counter decremented by a crystal oscillation.

Single computers: all processes use the same clock. – No problem

Multiple computers: impossible to guarantee that the crystals in
different computers all run at exactly the same frequency.
Network
Synchronization:

Absolute (with real time)
 Necessary for real-time applications such as on-line reservation
systems

Relative (with each other)
 Required for those applications that need a consistent view of
time across all nodes.
Clock Synchronization
Passive Centralized Algorithms – Cristian’s Algorithm
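[Figure: at time T0 the client sends "Time?" to the time server; after its processing time the server replies "Time=T"; the client receives the reply at time T1.]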
Assumption: processing time has been measured or estimated
Message_delay = (T1 – T0 – processing)/2
New client time = T + message_delay
Improvements:


Average multiple measurements
Discard outlying measurements
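The arithmetic above, together with the two improvements, can be sketched in Python as follows. This is a minimal illustration rather than a real time client: the function names, the layout of each sample tuple, and the `max_delay` cut-off used to discard outliers are assumptions made for this example.

```python
from statistics import mean

def cristian_estimate(t0, t1, server_time, processing):
    """One probe: return (new_client_time, message_delay)."""
    message_delay = (t1 - t0 - processing) / 2
    return server_time + message_delay, message_delay

def cristian_correction(samples, max_delay=None):
    """Average the clock correction over several probes.

    Each sample is (t0, t1, server_time, processing). Probes whose estimated
    one-way delay exceeds max_delay (a hypothetical threshold) are discarded.
    """
    corrections = []
    for t0, t1, server_time, processing in samples:
        new_time, delay = cristian_estimate(t0, t1, server_time, processing)
        if max_delay is not None and delay > max_delay:
            continue  # outlying measurement
        corrections.append(new_time - t1)  # how far the client clock is off at T1
    if not corrections:
        raise ValueError("no usable samples")
    return mean(corrections)

# Two ordinary probes and one slow probe that is discarded.
samples = [(0, 10, 105, 2), (20, 30, 125, 2), (40, 90, 150, 2)]
correction = cristian_correction(samples, max_delay=10)
```

The client would add `correction` to its own clock; averaging the correction instead of the absolute time keeps probes taken at different moments comparable.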
Clock Synchronization
Active Centralized Algorithm – Berkeley Algorithm
[Figure: the time server sends "Time?" to Client 1 and Client 2; client i replies with its time (C1_time, C2_time), and the server returns Diff(i) to client i.]
Assumption: processing time has been measured or estimated
Server: diff(i) = server_time – (ci_time + message_delay)
Client: ci_time = ci_time + diff(i)
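The two formulas can be written out as a short Python sketch. The function names and the use of a single `message_delay` for every client are assumptions of this example. In the Berkeley algorithm as usually described, the reference time is an average over the polled clocks; here `server_time` is simply passed in as a parameter.

```python
def berkeley_diffs(server_time, client_times, message_delay):
    """Server side: diff(i) = server_time - (ci_time + message_delay)."""
    return {cid: server_time - (c_time + message_delay)
            for cid, c_time in client_times.items()}

def apply_diff(client_time, diff):
    """Client side: ci_time = ci_time + diff(i)."""
    return client_time + diff

# Hypothetical clock readings for two clients.
diffs = berkeley_diffs(3000, {1: 2990, 2: 3015}, message_delay=5)
```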
Clock Synchronization
Distributed Algorithm – Averaging Algorithm
[Figure: Nodes 1, 2, and 3 broadcast their times (N1_time=31, N2_time=32, N3_time=30) at T0, T0+R, T0+2R, and T0+3R; each node receives the times of the other two.]
Assumption: R is large enough to wait for all broadcast messages
All nodes broadcast their time periodically
Each node computes average.
Improvement:


Discard outlying time messages.
Exchange their time with their local neighbors.
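One round of the averaging step might look like the sketch below. The function name and the `max_skew` threshold for discarding outlying time messages are assumptions of this example; the times 31, 32, and 30 are the ones shown for the three nodes.

```python
from statistics import mean

def averaged_time(own_time, received_times, max_skew=None):
    """Average this node's clock with the times heard during one round.

    Times further than max_skew from own_time are treated as outliers
    and ignored (max_skew is a hypothetical tuning parameter).
    """
    times = [own_time]
    for t in received_times:
        if max_skew is None or abs(t - own_time) <= max_skew:
            times.append(t)
    return mean(times)

node1_new_time = averaged_time(31, [32, 30])
node1_filtered = averaged_time(31, [32, 30, 500], max_skew=10)
```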
Clock Synchronization
Network Time Protocol
[Figure: a hierarchy of NTP servers at strata 1, 2, and 3 that exchange UDP messages; stratum 1 is synchronized to UTC (Coordinated Universal Time).]
Note: Arrows denote synchronization control, numbers denote strata.
Clock Synchronization
Network Time Protocol
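[Figure: Server A sends message m at time T_{i-3}; Server B receives it at T_{i-2} and sends the reply m' at T_{i-1}; Server A receives m' at T_i. The transmission delays of m and m' are t and t'; o is the offset of B's clock relative to A's.]
T_{i-2} = T_{i-3} + t + o
T_i = T_{i-1} + t’ – o
d_i = t + t’ = (T_{i-2} – T_{i-3}) + (T_i – T_{i-1})
2o = (T_{i-2} – T_{i-3} – t) + (T_{i-1} – T_i + t’)
o = (T_{i-2} – T_{i-3} + T_{i-1} – T_i)/2 + (t’ – t)/2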
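Since t and t’ are not known individually, the first term of the last equation serves as the offset estimate, and the round-trip delay bounds its error by half of d_i. The sketch below computes both from the four timestamps; the function and parameter names are assumptions of this example, and the numbers are made up (true offset 5, t = 2, t’ = 4).

```python
def ntp_estimate(t_i3, t_i2, t_i1, t_i):
    """Return (offset_estimate, round_trip_delay) from the four timestamps."""
    a = t_i2 - t_i3        # equals t + o
    b = t_i1 - t_i         # equals o - t'
    offset = (a + b) / 2   # o minus (t' - t)/2, the unknown asymmetry term
    delay = a - b          # d_i = t + t'
    return offset, delay

# A sends at 100 (A's clock); B receives at 107 and replies at 110 (B's clock);
# A receives the reply at 109 (A's clock).
offset, delay = ntp_estimate(100, 107, 110, 109)
true_offset = 5
```

The estimate differs from the true offset by (t’ – t)/2 = 1, which is within delay/2 = 3.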
Event Ordering
Happened-Before Relation
Most applications do not need a clock synchronized with real time.
1. Event e_i^k (written e21 for k = 2, i = 1 in the figures): the k-th event of process i
2. Sequence h_i^k: the history of process i up to and including event e_i^k
3. Cause-and-effect e → e’: e precedes e’.
4. Parallel events e ∥ e’: e and e’ happen in parallel (neither e → e’ nor e’ → e)
5. Happened-Before Relation:
• If e_i^k, e_i^l ∈ h_i and k < l, then e_i^k → e_i^l,
• If e_i = send(m) and e_j = receive(m), then e_i → e_j,
• If e → e’ and e’ → e”, then e → e”
Event Ordering
Logical Clock
LC(ei) := (ei != receive(m)) ? LC + 1 : max(LC, TS(m)) + 1
where TS(m) is the timestamp of message m:
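[Figure: P1 executes e11 (LC=1) and e21 (LC=2) and sends m1 to P2; P2 executes e12 (LC=3, receipt of m1) and e22 (LC=4) and sends a message to P3; P3 executes e13 (LC=1) and e23 (LC=5, receipt of P2's message).]
1. e → e’ ⇒ LC(e) < LC(e’) for all events
2. However, we cannot infer LC(e) < LC(e’) ⇒ e → e’
Example: LC(e21) > LC(e13) but e21 || e13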
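The update rule can be tried out with a small Python sketch. The class and method names are assumptions of this example; the event sequence replays the clock values shown for P1, P2, and P3.

```python
class LamportClock:
    """Logical clock: LC + 1 on a local or send event, max(LC, TS(m)) + 1 on receive."""

    def __init__(self):
        self.lc = 0

    def local_event(self):
        self.lc += 1
        return self.lc

    def receive(self, ts):
        self.lc = max(self.lc, ts) + 1
        return self.lc

p1, p2, p3 = LamportClock(), LamportClock(), LamportClock()
e11 = p1.local_event()
e21 = p1.local_event()   # sends a message stamped with this value
e12 = p2.receive(e21)
e22 = p2.local_event()   # sends a message stamped with this value
e13 = p3.local_event()
e23 = p3.receive(e22)
```

Here e21 gets a larger clock value than e13 even though neither event can have influenced the other, which is why a comparison of logical clock values alone does not reveal causality.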
Event Ordering
Vector Clock
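Vi[i] = Vi[i] + 1 before Pi timestamps each event
Pi includes the value t = Vi in every message it sends
On receiving t, Vi[j] = max(Vi[j], t[j]) for j = 1,2,…,N
[Figure: P1 executes e11 (1,0,0) and e21 (2,0,0) and sends m1 to P2; P2 executes e12 (2,1,0, receipt of m1) and e22 (2,2,0) and sends a message to P3; P3 executes e13 (0,0,1) and e23 (2,2,2, receipt of P2's message).]
1. e → e’ ⇒ V(e) < V(e’)
2. V(e) < V(e’) ⇒ e → e’
Example: neither V(e21) ≤ V(e13) nor V(e21) ≥ V(e13), and thus e21 || e13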
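A minimal Python sketch of these rules follows. The class and function names are assumptions of this example, and process indices are 0-based in the code; the event sequence replays the timestamps shown for P1, P2, and P3.

```python
class VectorClock:
    def __init__(self, n, i):
        self.v = [0] * n   # one entry per process
        self.i = i         # index of the owning process

    def tick(self):
        """Local or send event: increment the owner's entry."""
        self.v[self.i] += 1
        return tuple(self.v)

    def receive(self, t):
        """Merge the piggybacked timestamp t, then count the receive event."""
        self.v = [max(a, b) for a, b in zip(self.v, t)]
        return self.tick()

def happened_before(a, b):
    """V(a) < V(b): every entry <= and the vectors differ."""
    return all(x <= y for x, y in zip(a, b)) and a != b

def concurrent(a, b):
    return not happened_before(a, b) and not happened_before(b, a)

p1, p2, p3 = VectorClock(3, 0), VectorClock(3, 1), VectorClock(3, 2)
e11 = p1.tick()
e21 = p1.tick()          # timestamp carried by the message to P2
e12 = p2.receive(e21)
e22 = p2.tick()          # timestamp carried by the message to P3
e13 = p3.tick()
e23 = p3.receive(e22)
```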
Global State
Applications that need to detect a correct global state:
a. Garbage collection: an object looks like garbage when no process references it, but it is not garbage if an in-transit message points to it.
b. Deadlock: p1 and p2 each wait for the other (a wait-for cycle).
c. Termination: both p1 and p2 are passive and thus seem ready to finish, but an in-transit activate message makes p1 active again.
Global State
Consistent Cut
Finding C such that (e ∈ C) ∧(e’ → e) ⇒ e’ ∈ C
[Figure: events of processes p1 to p4, including a send/receive pair, divided by two cuts, C and C’.]
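The consistency condition (e ∈ C) ∧ (e’ → e) ⇒ e’ ∈ C can be checked mechanically. The sketch below is one way to do it in Python; the function name and the four event names are hypothetical. Only direct happened-before pairs (neighbouring events of one process, and send/receive pairs) are listed, which is sufficient because a cut closed under direct predecessors is also closed under transitive ones.

```python
def is_consistent(cut, happened_before):
    """cut: set of events; happened_before: pairs (e_prime, e) meaning e' -> e."""
    return all(e_prime in cut
               for e_prime, e in happened_before
               if e in cut)

# Hypothetical execution: p1 runs a1 then a2 (send); p2 runs b1 then b2 (receive).
relation = {("a1", "a2"), ("b1", "b2"), ("a2", "b2")}

cut_ok = {"a1", "a2", "b1"}    # the message is sent but not yet received
cut_bad = {"a1", "b1", "b2"}   # the message is received but never sent
```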
Global State
Distributed Snapshot – Chandy/Lamport [1985]
[Figure: P0 sends a snapshot request (s) to P1 and P2; the processes forward snapshot requests to one another, and each records the ordinary messages (m) that arrive while it is recording.]
A process that wants to take a snapshot sends a snapshot request to the others.
Each process records its state upon receiving the first snapshot request.
Each process keeps recording incoming messages until it has received a snapshot
request from each of the other processes, except the one whose request made it
start its snapshot.
Global State
Distributed Snapshot – Chandy/Lamport [1985]
Marker (Snapshot request) receiving rule for process pi
On pi’s receipt of a marker (snapshot request) message over channel c:
if (pi has not yet recorded its state) it
records its process state now;
records the state of c as the empty set;
turns on recording of messages arriving over other incoming channels;
else
pi records the state of c as the set of messages it has received over c
since it saved its state.
end if
Marker (Snapshot request) sending rule for process pi
After pi has recorded its state, for each outgoing channel c:
pi sends one marker message over c
(before it sends any other message over c).
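The two rules can be exercised in a small single-threaded simulation. This is a sketch under stated assumptions, not the authors' implementation: the `Network` and `Process` classes, the FIFO channels modelled as deques, and the money-transfer application are all invented for this example.

```python
from collections import deque

MARKER = "MARKER"

class Process:
    def __init__(self, state):
        self.state = state          # application state (an account balance)
        self.recorded_state = None
        self.recording = {}         # incoming channel -> messages recorded so far
        self.channel_state = {}     # incoming channel -> final recorded state

class Network:
    def __init__(self, states):
        self.procs = {n: Process(s) for n, s in states.items()}
        self.chan = {(a, b): deque() for a in states for b in states if a != b}

    def send(self, src, dst, amount):
        self.procs[src].state -= amount
        self.chan[(src, dst)].append(amount)

    def record_state(self, name):
        """Record the process state, then apply the marker sending rule."""
        p = self.procs[name]
        p.recorded_state = p.state
        p.recording = {c: [] for c in self.chan if c[1] == name}
        for c in self.chan:
            if c[0] == name:
                self.chan[c].append(MARKER)

    def deliver(self, src, dst):
        """Deliver the oldest message on channel (src, dst)."""
        c = (src, dst)
        msg = self.chan[c].popleft()
        p = self.procs[dst]
        if msg == MARKER:           # marker receiving rule
            if p.recorded_state is None:
                self.record_state(dst)
            # Empty for the first marker, otherwise what arrived since recording.
            p.channel_state[c] = p.recording.pop(c)
        else:
            p.state += msg
            if c in p.recording:
                p.recording[c].append(msg)

net = Network({"p1": 100, "p2": 50})
net.send("p1", "p2", 10)      # 10 is in transit
net.record_state("p2")        # p2 initiates the snapshot
net.deliver("p1", "p2")       # the 10 arrives while p2 is recording
net.deliver("p2", "p1")       # p1 receives its first marker
net.deliver("p1", "p2")       # p2 receives p1's marker and stops recording
```

The recorded process states (90 and 50) plus the recorded channel contents (10) add up to the initial total of 150, which is the property a consistent snapshot should preserve in this application.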
A Distributed Snapshot Example
1. Global state S0
2. Global state S1
3. Global state S2
4. Global state S3
[Figure: time line of global states S0 to S3 for processes p1 and p2, connected by channels c1 and c2. p1's state changes from <$1000, 0> to <$900, 0> to <$900, 5>; p2's state changes from <$50, 2000> to <$50, 1995>. Each channel is either empty or carries a message followed by a marker M: (Order 10, $100) on c2 and (five widgets) on c1. p1 starts recording its state, p2 records its state, and p1 stops recording; the recorded states form a consistent cut.]
(M = marker message)
Samadi’s Algorithm [1985]
1. Each process returns an ack whenever it receives a message.
2. Once it has received a snapshot message, each process returns a tag instead of an ack until a new GVT is computed.
3. When receiving a snapshot message, each process returns to P0 the minimum time among:
- the minimum timestamp among events that have not yet been processed.
- the minimum timestamp among messages that have not yet been acknowledged.
- the minimum timestamp among tags it has received.
[Figure: p0 sends "Take snapshot"; the processes send Report 12, Report 20, and Report 15 to p0, which then announces "Done". Messages with timestamps 12, 16, 20, and 15 travel among p1, p2, and p3 and are answered with an ack or, after the snapshot message, with a tag.]
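Step 3 amounts to taking a minimum over three sets of timestamps. The sketch below shows that computation in Python; the function names and the sample timestamps passed to `local_report` are hypothetical, and taking the minimum of all reports as the new GVT at P0 is an assumption of this example. The reports 12, 20, and 15 are the ones shown in the figure.

```python
def local_report(unprocessed_events, unacked_messages, received_tags):
    """Minimum timestamp a process reports to P0 (infinity if it has nothing)."""
    candidates = [*unprocessed_events, *unacked_messages, *received_tags]
    return min(candidates, default=float("inf"))

def compute_gvt(reports):
    """P0 side: assumed here to be the minimum of all reports."""
    return min(reports)

report = local_report(unprocessed_events=[16, 20],
                      unacked_messages=[12],
                      received_tags=[])
gvt = compute_gvt([12, 20, 15])
```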
Mattern’s Algorithm [1993]
1. Process Pi maintains a vector counter: Vi[1..n].
2. Pi writes in Vi[j] the number of messages sent to Pj.
3. Pi subtracts one from Vi[i] when receiving a message from Pj.
4. During the 1st circulation of a ‘take snapshot’ message, Pi performs:
C[1..n] += Vi[1..n]; Vi[1..n] = 0
5. Upon completing the 1st circulation, C[i] represents the number of messages in transit to Pi. During the 2nd circulation, Pi waits until C[i] = 0, i.e., until all messages in transit to Pi have arrived.
[Figure: message exchanges among p1 to p4, with +1 marked at each send and –1 at each receipt, and the counter vectors (for example (0,2,-1,0)) accumulated during the 1st and 2nd snapshot circulations.]
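The counting scheme can be illustrated with the sketch below, which covers only the first circulation and assumes no message is delivered while the 'take snapshot' message circulates. The class and function names are assumptions of this example, indices are 0-based, and the three-process message pattern is hypothetical.

```python
class Counter:
    def __init__(self, n, i):
        self.i = i
        self.v = [0] * n

    def on_send(self, j):
        self.v[j] += 1          # one more message sent to Pj

    def on_receive(self):
        self.v[self.i] -= 1     # one message addressed to this process has arrived

def first_circulation(procs):
    """Accumulate every Vi into C and reset Vi, as the token visits each process."""
    n = len(procs)
    c = [0] * n
    for p in procs:
        for k in range(n):
            c[k] += p.v[k]
        p.v = [0] * n
    return c

procs = [Counter(3, 0), Counter(3, 1), Counter(3, 2)]
procs[0].on_send(1); procs[1].on_receive()   # P0 -> P1, delivered
procs[0].on_send(2)                          # P0 -> P2, still in transit
procs[1].on_send(2); procs[2].on_receive()   # P1 -> P2, delivered
in_transit = first_circulation(procs)
```

After the circulation, entry k of `in_transit` holds the number of messages still on their way to process k.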
An Example: Parallel and
Distributed Simulation
[Figure: a grid of cells partitioned among Process 0, Process 1, Process 2, and Process 3; labels give speeds of 1 cell/time unit, 1 cell/5 time units, and 1 cell/20 time units.]
An Example: Parallel and
Distributed Simulation (Cont’d)
Barrier synchronization at every simulation cycle
[Figure: time lines of Processes 0 to 3 executing events e1 to e12 cycle by cycle, with "attack" messages passing between processes.]
What drawbacks does this method have?
An Example: Parallel and
Distributed Simulation (Cont’d)
Discrete-event simulation with optimistic synchronization
The old event history is kept so that computation can be rolled back.
[Figure: time lines of Processes 0 to 3 executing events e1 to e14; "attack" messages that arrive late force the receiving processes to roll back.]
When can we garbage-collect such history?
Time Warp [Jefferson 1985]
Optimistic Distributed Simulation
• Each process has an input message queue, an output message queue, and an event history queue.
• When a process receives a message whose timestamp is older than its local time:
1. Roll back its local event execution to that old timestamp.
2. Roll back its receipt of input messages whose timestamp is newer than that old timestamp.
3. Send anti-messages to cancel all emitted messages whose timestamp is newer than that old timestamp.
• GVT (Global Virtual Time) is periodically computed to garbage-collect all the executed events whose timestamp is older than GVT.
[Figure: time lines of p1, p2, and p3 with their local virtual times (LVT) and message timestamps between 120 and 163; a message that arrives late triggers a rollback, and an anti-message cancels a message already sent.]
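The rollback steps can be sketched for a single process as follows. This is a simplified illustration, not Jefferson's implementation: events are reduced to bare timestamps, no state is saved or restored, and the class, its method names, and the sample timestamps are assumptions of this example.

```python
class TimeWarpProcess:
    def __init__(self):
        self.lvt = 0          # local virtual time
        self.pending = []     # input messages not yet executed
        self.processed = []   # event history: timestamps already executed
        self.sent = []        # output messages as (timestamp, destination)

    def execute_next(self, outputs=()):
        """Optimistically execute the oldest pending event."""
        ts = min(self.pending)
        self.pending.remove(ts)
        self.processed.append(ts)
        self.lvt = ts
        self.sent.extend(outputs)

    def receive(self, ts):
        """Accept an input message; return the anti-messages to send."""
        anti = []
        if ts < self.lvt:                                   # arrived late
            undone = [t for t in self.processed if t > ts]
            self.processed = [t for t in self.processed if t <= ts]   # step 1
            self.pending.extend(undone)                                # step 2
            anti = [m for m in self.sent if m[0] > ts]                 # step 3
            self.sent = [m for m in self.sent if m[0] <= ts]
            self.lvt = max(self.processed, default=0)
        self.pending.append(ts)
        return anti

    def fossil_collect(self, gvt):
        """Drop history older than GVT; it can never be rolled back."""
        self.processed = [t for t in self.processed if t >= gvt]
        self.sent = [m for m in self.sent if m[0] >= gvt]

p = TimeWarpProcess()
for ts in (120, 141, 142):
    p.receive(ts)
p.execute_next()                          # executes 120
p.execute_next(outputs=[(152, "p3")])     # executes 141
p.execute_next(outputs=[(163, "p1")])     # executes 142
anti = p.receive(135)                     # a message that arrived late
```

After the late message, the process is back at virtual time 120 with 135, 141, and 142 waiting to be executed again, and both output messages are cancelled.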
SPEEDS [Steinman 1992]
Breathing Time Buckets
This is an optimistic distributed simulator, but not as aggressive as Time Warp.
• Each process broadcasts the oldest local event among those it will execute. This is called a Local Event Horizon (LEH).
• A process must suspend its event processing if it has received an older LEH than the one it is currently processing.
• The oldest LEH among all processes becomes the next Global Event Horizon (GEH).
• Each process may send out all messages and process all events before this new GEH.
• Processes which have already processed beyond GEH must roll back their computation to GEH. No anti-messages are sent out.
[Figure: time lines of p1 and p2 with P1's LEH and P2's LEH; the older of the two becomes the next GEH (GVT).]
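The horizon computation described in these bullets is small enough to write down directly. The function names and the sample timestamps below are assumptions of this example.

```python
def global_event_horizon(local_event_horizons):
    """The oldest (smallest) LEH among all processes becomes the next GEH."""
    return min(local_event_horizons.values())

def safe_events(pending, geh):
    """Events before the GEH may be processed and their messages sent out."""
    return sorted(t for t in pending if t < geh)

def must_roll_back(local_time, geh):
    """A process that has already processed beyond the GEH rolls back to it."""
    return local_time > geh

leh = {"p1": 30, "p2": 22}
geh = global_event_horizon(leh)
```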
Vector timestamps and variable
values for the execution
[Figure: along the physical time axis, p1's events have vector timestamps (1,0), (2,0), (3,0), (4,3) and set x1 = 1, 100, 105, 90; p2's events have vector timestamps (2,1), (2,2), (2,3) and set x2 = 100, 95, 90. Two messages, m1 and m2, pass between p1 and p2, and two cuts, C1 and C2, are marked.]
Constraint: |x1 – x2| <= 50. Before this is violated, a process must send its value to its partner.
The lattice of global states for the
execution
Level 0: S00
Level 1: S10
Level 2: S20
Level 3: S30, S21
Level 4: S31, S22
Level 5: S32, S23
Level 6: S33
Level 7: S43
Sij = global state after i events at process 1 and j events at process 2
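The lattice can be derived from the vector timestamps of the execution: Sij is consistent when neither process has seen more events of the other than the state includes. The sketch below is one way to enumerate it in Python; the function names are assumptions of this example, and index 0 of each list stands for the initial state before any event.

```python
def consistent_states(v1, v2):
    """v1[i] / v2[j]: vector timestamp after i events at p1 / j events at p2."""
    states = []
    for i, a in enumerate(v1):
        for j, b in enumerate(v2):
            # a[1]: events of p2 known to p1; b[0]: events of p1 known to p2.
            if a[1] <= j and b[0] <= i:
                states.append((i, j))
    return states

def by_level(states):
    """Group states Sij by level i + j."""
    levels = {}
    for i, j in states:
        levels.setdefault(i + j, []).append(f"S{i}{j}")
    return levels

V1 = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 3)]
V2 = [(0, 0), (2, 1), (2, 2), (2, 3)]
lattice = by_level(consistent_states(V1, V2))
```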
Evaluating possibly and definitely f
F = (f(S) = False); T = (f(S) = True)
[Figure: lattice of global states at levels 0 to 5, each state labelled F or T and one labelled "?".]
Possibly: a state labelled T means f is possibly True. The state labelled "?" must be True, and thus we don’t care.
Definitely: f = T for all linearizations.
[Figure: the execution with vector timestamps shown earlier (p1: (1,0), (2,0), (3,0), (4,3) with x1 = 1, 100, 105, 90; p2: (2,1), (2,2), (2,3) with x2 = 100, 95, 90; messages m1 and m2; cuts C1 and C2), annotated "(4,3) = T and this is definitely" and "All included: (2,2) = True".]
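Possibly f holds if some consistent global state satisfies f, and definitely f holds if every linearization passes through a state that satisfies f. The sketch below evaluates both over the two-process execution with the vector timestamps and variable values shown for p1 and p2. Several things are assumptions of this example: the function names, treating f as False where a variable has no recorded value yet (before a process's first event), and the thresholds and predicates used in the tests.

```python
V1 = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 3)]   # p1 timestamps, index 0 = initial
V2 = [(0, 0), (2, 1), (2, 2), (2, 3)]           # p2 timestamps
X1 = [None, 1, 100, 105, 90]                    # x1 after each p1 event
X2 = [None, 100, 95, 90]                        # x2 after each p2 event

def consistent(i, j):
    return V1[i][1] <= j and V2[j][0] <= i

def successors(i, j):
    """Consistent states one event later."""
    for ni, nj in ((i + 1, j), (i, j + 1)):
        if ni < len(V1) and nj < len(V2) and consistent(ni, nj):
            yield ni, nj

def reachable(allowed):
    """States reachable from S00 by passing only through allowed states."""
    start = (0, 0)
    if not allowed(*start):
        return set()
    seen, stack = {start}, [start]
    while stack:
        for s in successors(*stack.pop()):
            if s not in seen and allowed(*s):
                seen.add(s)
                stack.append(s)
    return seen

def possibly(f):
    return any(f(i, j) for i, j in reachable(lambda i, j: True))

def definitely(f):
    # False exactly when some linearization reaches the final state
    # while visiting only states where f is False.
    final = (len(V1) - 1, len(V2) - 1)
    return final not in reachable(lambda i, j: not f(i, j))

def violates(delta):
    """f(Sij) = |x1 - x2| > delta, False while a value is still unknown."""
    def f(i, j):
        if X1[i] is None or X2[j] is None:
            return False
        return abs(X1[i] - X2[j]) > delta
    return f
```

With delta = 50 no consistent state with known values violates the constraint, so f is neither possibly nor definitely True. With the hypothetical delta = 12 only S33 satisfies f, and since every linearization passes through S33, f is then definitely True.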
Paper Review by Students

Distributed Snapshot




Samadi’s Algorithm
Mattern’s Algorithm
Discussions: What are pros & cons of these
algorithms?
Optimistic Synchronization



SPEEDS
Time Warp
Discussions: What are pros & cons of these
algorithms in terms of performance, process
creation/termination, dynamic memory allocation,
and I/O handling?
Exercises (No turn-in)
1. Textbook p627, Q14.7: An NTP server B receives server A’s message at 16:34:23.480 bearing a timestamp 16:34:13.430 and replies to it. A receives the message at 16:34:15.725, bearing B’s timestamp 16:34:25.7. Estimate the offset between B and A and the accuracy of the estimate.
2. Textbook p628, Q14.14: Two processes P and Q are connected in a ring using two channels, and they constantly rotate a message m. At any one time, there is only one copy of m in the system. Each process’s state consists of the number of times it has received m, and P sends m first. At a certain point, P has the message and its state is 101. Immediately after sending m, P initiates the snapshot algorithm. Explain the operation of the algorithm in this case, giving the possible global state(s) reported by it.
3. Textbook p429, Q14.15: The figure below shows events occurring for each of two processes, p1 and p2. Arrows between processes denote message transmission. Draw and label the lattice of consistent states (p1 state, p2 state), beginning with the initial state (0,0).
[Figure: time lines of p1 and p2 with their events and message arrows.]