Model checking transactional memory with Spin
John O’Leary, Bratin Saha, Mark Tuttle
Intel Corporation
We used the Spin model checker to prove that
Intel’s software transactional memory is correct.
What is transactional memory?
A programming abstraction that makes it easier to write
concurrent programs.
ICDCS 2009
Slide 2 of 29
Concurrent programs are tricky
[Figure: concurrent calls enqueue(a), enqueue(b), and enqueue(c) on a concurrent queue]
enqueue(v) =
if last == max return false;
last := last + 1;
queue[last] := v;
return true;
• How do you synchronize access to tail of the queue?
– What keeps two threads from writing the same queue entry?
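The hazard in the second bullet can be reproduced deterministically by splitting enqueue into its read step and its write steps, then interleaving two callers by hand. This is a minimal sketch in Python; the names `Queue`, `read_last`, and `finish_enqueue` are illustrative and not from the slides, and the `last == max` check is left out for brevity.

```python
class Queue:
    def __init__(self, max_len):
        self.slots = [None] * max_len
        self.last = -1  # index of the most recently written entry

def read_last(q):
    # Step 1 of enqueue: a thread reads the shared tail index.
    return q.last

def finish_enqueue(q, seen_last, v):
    # Steps 2 and 3: the thread advances the tail from the value it saw
    # and writes its item there, unaware of any other thread.
    q.last = seen_last + 1
    q.slots[q.last] = v

q = Queue(4)
seen_a = read_last(q)            # thread A reads last == -1
seen_b = read_last(q)            # thread B reads last == -1 before A writes
finish_enqueue(q, seen_a, "a")
finish_enqueue(q, seen_b, "b")   # same slot as A: "a" is overwritten
```

Both callers believe they succeeded, yet only one item is in the queue.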
Slide 3 of 29
Locks are hard
[Figure: concurrent calls enqueue(a), enqueue(b), and enqueue(c) on a concurrent queue]
enqueue(v) =
    acquire(lock);
    if last == max { release(lock); return false; }
    last := last + 1;
    queue[last] := v;
    release(lock);
    return true;
• Locks used badly can lead to many subtle problems:
– Hot-spots, blocking, dead-lock, priority inversion, preemption …
Slide 4 of 29
Transactional memory is easy
[Figure: concurrent calls enqueue(a), enqueue(b), and enqueue(c) on a concurrent queue]
enqueue(v) =
atomic {
if last == max return false;
last := last + 1;
queue[last] := v;
return true;
}
• A programming abstraction for many-core
– Makes it easy to implement atomic operations without locks
– Makes it possible for average programmers to write correct code
Slide 5 of 29
Programming is easy
[Figure: a concurrent linked list; head points to a chain of nodes, each holding 0]
A takes head off list
B increments list elements
A: atomic {
       result := head;
       if head != null then
           head := head.next;
   }
   use result;
B: atomic {
       node := head;
       while (node != null) {
           node.value++;
           node := node.next;
       }
   }
• Elegant code, properly synchronized, no data races
Slide 6 of 29
Implementation is hard
• Some implementations expose intermediate states
– A and B appear sequential, but run concurrently: data races!
– A and B can exhibit the “privatization” bug:
[Figure: A reads head, then B reads head. Successive snapshots of the list (node values 0 and 1) show A's result as 0, then 1 (!), then 0 again (!!).]
• Proving correctness is not easy
– We think model checking can help
Slide 7 of 29
Our results
• McRT is a software transactional memory from Intel
• Spin is a software model checker from AT&T + NASA
• We use Spin to prove that McRT is correct
– “Every execution of every purely-transactional program with
two transactions doing three reads and writes is serializable”
– We validate an implementation model of an industrial product,
not just an abstract protocol model
• We give a Spin accelerator for shared memory programs
Slide 8 of 29
What is McRT?
Slide 9 of 29
Slide 10 of 29
That’s it
• We modeled this pseudocode exactly
– We even model pointer dereferencing with array indexing
• We do make the usual simplifying assumptions
– No partial writes: modeled only whole-block loads and stores
– No conflict handling: one of two conflicting transactions aborts
• Timestamps are the key to the protocol
Slide 11 of 29
Timestamps are everywhere
• Global timestamp: global.ts
– Advances whenever a transaction tries to commit or abort
– When it changes, memory may have changed, so be careful
• Transaction timestamp: txn.ts
– Transaction start time (and current proposal for commit time)
– Will be read by other transactions when they commit
– Stored in transaction descriptor
• Along with transaction read set, write set, undo log (local data)
• Memory block timestamp: blk.ts
– Commit time of last transaction writing the block
– Stored in transaction record
• Along with a lock needed to write the block
Slide 12 of 29
Design rule 1
• No transaction ever sees inconsistent data
– Not even an aborting transaction!
– Requires frequent checks that the read set is still valid
• Validate() =
– ts := global.ts
– for each blk in my read set
• confirm blk is not locked by another transaction
• confirm blk.ts ≤ my.ts
• abort if either confirmation fails
– my.ts := ts
• After validation conclude
– Read set has not changed since transaction start
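The Validate() steps can be sketched in Python as follows. This is an illustration under assumptions, not McRT code: the classes `Block` and `Txn`, the `owner` field standing in for the block lock, and the `Abort` exception are all hypothetical names, and the timestamp check is taken to be blk.ts ≤ my.ts, which matches the Read rule in the protocol sketch.

```python
from dataclasses import dataclass, field

class Abort(Exception):
    """Validation failed; the transaction must abort."""

@dataclass
class Block:
    ts: int = 0           # blk.ts: commit time of the last writer
    owner: object = None  # transaction holding the write lock, if any

@dataclass
class Txn:
    ts: int = 0           # txn.ts (my.ts in Validate)
    read_set: list = field(default_factory=list)

def validate(txn, global_ts):
    ts = global_ts        # sample global.ts before checking
    for blk in txn.read_set:
        locked_by_other = blk.owner is not None and blk.owner is not txn
        if locked_by_other or blk.ts > txn.ts:
            raise Abort()
    txn.ts = ts           # read set was valid as of the sampled time

blk = Block(ts=3)
txn = Txn(ts=4, read_set=[blk])
validate(txn, global_ts=5)   # passes and advances txn.ts to 5
```

If another transaction later commits a write to `blk` with a timestamp above 5, the next call to `validate` raises `Abort` and leaves `txn.ts` unchanged.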
Slide 13 of 29
Design rule 2
• No transaction commits until conflicting transactions abort
– Wait for conflicting transactions to undo changes upon abort
– Avoids linked list privatization bug illustrated in introduction
• Quiesce(my.ts) =
– for each active transaction txn
• block while txn.ts < my.ts and txn remains active
• After quiescence conclude
– Every conflicting transaction will validate when it commits
– Validation will fail, transaction will abort, and undo its changes
Slide 14 of 29
Protocol sketch
• Commit
– Increment global ts
– Validate read set
– Set write set ts to global ts
• Abort
– Increment global ts
– Undo changes to write set
– Set write set ts to global ts
• Read
– Add block to read set
– … unless
• Block is locked
• blk.ts > txn.ts
• Write
– Add block to write set
– Add block value to undo log
– Update block value
– … unless
• Block is locked
• blk.ts > txn.ts
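The Read, Write, and Abort steps above can be sketched in Python as a single-threaded model. Everything named here is an assumption for illustration: `Block`, `Txn`, `Conflict`, `check`, and `finish` are hypothetical, the write lock is modeled as an `owner` field taken on first write, and Commit's validation step is omitted.

```python
class Conflict(Exception):
    """The access hit a locked or too-new block; the transaction aborts."""

class Block:
    def __init__(self, value):
        self.value, self.ts, self.owner = value, 0, None

class Txn:
    def __init__(self, ts):
        self.ts = ts
        self.read_set, self.write_set, self.undo_log = [], [], []

def check(txn, blk):
    # The shared "unless" clause of Read and Write.
    if (blk.owner is not None and blk.owner is not txn) or blk.ts > txn.ts:
        raise Conflict()

def read(txn, blk):
    check(txn, blk)
    txn.read_set.append(blk)
    return blk.value

def write(txn, blk, value):
    check(txn, blk)
    if blk.owner is not txn:
        blk.owner = txn                    # assumed: take the write lock
        txn.write_set.append(blk)
    txn.undo_log.append((blk, blk.value))  # remember the old value
    blk.value = value                      # update in place

def finish(txn, clock, undo):
    # Shared tail of Commit (undo=False) and Abort (undo=True).
    clock["ts"] += 1
    if undo:
        for blk, old in reversed(txn.undo_log):
            blk.value = old
    for blk in txn.write_set:
        blk.ts, blk.owner = clock["ts"], None

clock = {"ts": 1}
x = Block(10)
t1 = Txn(ts=1)
write(t1, x, 11)
finish(t1, clock, undo=True)   # abort: restore 10, stamp x with ts 2
```

Note that the abort stamps the block with a fresh timestamp even though its value is restored, so a transaction that started earlier cannot read it without hitting the conflict rule.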
Slide 15 of 29
What does our model look like?
Slide 16 of 29
An invocation/response model
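[Figure: each program (pgm) is paired with its own McRT instance (mcrt), and all mcrt instances use one shared memory]
Invocations from pgm to mcrt: StartI, ReadI(x), WriteI(x,v), CommitI
Responses from mcrt to pgm: StartR, ReadR(v), WriteR, CommitR, AbortR
Shared memory holds:
• global timestamp
• transaction timestamps
• program memory
• block timestamps and locks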
Slide 17 of 29
McRT environment
[Figure: the Environment contains pgm1, pgm2, and pgm3. Each pgm k is connected to mcrt k by a pair of channels, send k and recv k. mcrt1, mcrt2, and mcrt3 all access the shared memory.]
Slide 18 of 29
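Environment generates programs on the fly
[Figure: three snapshots of pgm k and mcrt k. The program starts as read(x,_) and the environment sends ReadI(x). mcrt k answers ReadR(v) and the program becomes read(x,v), read(y,_). The environment then sends ReadI(y). Operations not yet generated are drawn as dashes.]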
Slide 19 of 29
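Environment simulates programs including aborts
[Figure: three snapshots of pgm k and mcrt k, with a program counter (pc) marking the current operation. The program read(x,v), read(y,w), write(z,u) sends WriteI(z,u). mcrt k answers AbortR, and the program is reset to read(x,_), read(y,_), write(z,u). The environment then restarts it with StartI.]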
Slide 20 of 29
Environment checks results
pgm 1: read(x,v); read(y,w); write(z,u)
pgm 2: read(w,a); read(y,w); write(w,b)
pgm 3: write(m,l); write(n,p); write(z,v)
[Figure: each program receives a CommitR response; the one shown from mcrt 1 is labeled CommitR(ordering hint)]
• CommitR carries transaction ordering hint
• Environment finds a transaction ordering
consistent with transaction results and
program memory
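The check described above can be sketched in Python as a brute-force search over serial orders. This is an illustration only: it ignores the ordering hint and simply tries every permutation, and the function names and the `(op, addr, val)` program encoding are assumptions, not the Spin model's representation.

```python
from itertools import permutations

def run_serially(order, programs, initial):
    """Run whole transactions one after another; return reads and memory."""
    mem = dict(initial)
    reads = {}
    for name in order:
        seen = []
        for op, addr, val in programs[name]:
            if op == "read":
                seen.append(mem[addr])
            else:
                mem[addr] = val
        reads[name] = seen
    return reads, mem

def find_serial_order(programs, initial, observed_reads, final_mem):
    # Return a serial order that explains the observations, or None.
    for order in permutations(programs):
        if run_serially(order, programs, initial) == (observed_reads, final_mem):
            return order
    return None

initial = {"x": 0, "y": 0}
programs = {
    "T1": [("read", "x", None), ("write", "y", 1)],
    "T2": [("read", "y", None), ("write", "x", 2)],
}
final_mem = {"x": 2, "y": 1}
good = find_serial_order(programs, initial, {"T1": [0], "T2": [1]}, final_mem)
bad = find_serial_order(programs, initial, {"T1": [0], "T2": [0]}, final_mem)
```

Here `good` is the order T1 then T2, while `bad` is `None`: no serial order lets both transactions read the initial values, so that execution would be reported as not serializable.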
Slide 21 of 29
We modeled pseudocode “exactly”
Let’s look at the least “exact” match: Abort
Slide 22 of 29
Pseudocode
STMTxnAbort(TxnDesc* txnDesc, uint32 reason) {
    for ( (addr, val, size) in txnDesc->undoLog ) {
        if (addr is on dead stack frames) continue;
        switch(size) {
            case 4: *(uint32*)addr = val; break;
            ...
        }
    }
    if ((token = txnDesc->token) == 0)
        token = lockedIncrement(globalTimeStamp);
    for ( txnRecPtr in txnDesc->writeSet )
        *txnRecPtr = token;
    /* reset transaction descriptor for restart */
    txnDesc->localTimeStamp = 0;
    backoff();
    abortInternal(txnDesc); /* longjmp */
}
Our model
inline abortTransaction(txnDescPtr, ...) {
    foreach adr in 0..(num_addresses)-1 {
        if
        :: txnDesc(txnDescPtr).undoLog[adr] != null_data ->
            memory[adr] = txnDesc(txnDescPtr).undoLog[adr];
        :: else
        fi
    };
    fetch_and_incr (globalTimeStamp,token,token_new);
    foreach blk in 0..(num_memory_blocks)-1 {
        if
        :: txnDesc(txnDescPtr).writeSet[blk] ->
            txnRecHeap[blk] = token_new;
        :: else
        fi
    };
    initTxnDesc(txnDesc(txnDescPtr),...);
    txnDesc(txnDescPtr).localTimeStamp = 0;
}
Slide 23 of 29
What obstacles did we face?
Slide 24 of 29
Challenges
• Modeling environment, abort, timestamps, …
• Code-level models are hard to model check
– Too much detail, too many interleavings
• SPIN statement-merging is conservative
– Intended to reduce detail by creating larger atomic blocks
– Looping over data structures inhibits statement-merging
• SPIN partial-order reduction is conservative
– Intended to identify and ignore “redundant” interleavings
– Global variables (like shared memory) inhibit partial-order reduction
Slide 25 of 29
A SPIN preprocessor
• Loop unrolling to help statement merging, etc.
– Loop unrolling alone gives 50% speedup
    adr = 0;
    do
    :: adr < num_addresses -> memory[adr] = 0; adr++
    :: else -> break
    od;
becomes
    memory[0] = 0;
    memory[1] = 0;
    memory[2] = 0;
    memory[3] = 0;
• Model rewriting to help partial order reduction (planned)
– Help Spin find fewer, longer atomic blocks to reorder
– Rewrite model as a set of transitions of the form
atomic{ local access; local access; … ; global access}
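The loop-unrolling step in the first bullet amounts to a textual expansion, which can be sketched in a few lines of Python. This is not the preprocessor from the talk: `unroll` and its template syntax with a `{adr}`-style placeholder are hypothetical, and a real tool would also have to parse the loop out of the model.

```python
def unroll(var, count, body):
    """Expand a counted loop into `count` copies of `body`.

    `body` is a statement template in which a `{var}`-style placeholder
    names the loop variable, for example "memory[{adr}] = 0".
    """
    return [body.format(**{var: i}) for i in range(count)]

# Unroll the memory-clearing loop for num_addresses == 4.
lines = unroll("adr", 4, "memory[{adr}] = 0")
```

The result is four straight-line assignments with no loop variable and no guard, the kind of sequence the slides say statement merging handles better than a loop.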
Slide 26 of 29
Related work
A deep result:
Model checking TM often reduces to checking 2 threads
• Deferred update
[Guerraoui, Henzinger, Jobstmann, Singh, PLDI’08]
– Applies to any TM that satisfies four structural properties
– Clean, elegant result, but doesn’t apply to McRT
• Update in place
[Guerraoui, Henzinger, Singh, CAV’09]
– Requires a hand proof that the TM satisfies four generalized properties
– They prove this for an abstract model of McRT
– Proof not clear for our implementation model of McRT
Slide 27 of 29
The abstract model
Our implementation model is 2500+ lines of Spin
Slide 28 of 29
Conclusion
We validated Intel’s implementation of STM.
We optimized SPIN’s performance on
shared memory protocols.
Slide 29 of 29
