fmcad06_intel_talk - Formal Verification at Utah

A Compositional Approach to Verifying Hierarchical Cache Coherence Protocols
Xiaofang Chen (1), Yu Yang (1), Ganesh Gopalakrishnan (1), Ching-Tsun Chou (2)
(1) University of Utah
(2) Intel Corporation
* Supported in part by Intel SRC Customization Award 2005-TJ-1318
Hierarchical Cache Coherence Protocols
[Figure labels: Chip-level protocols; Intra-cluster protocols; Inter-cluster protocols; two dir/mem pairs]
FMCAD 2006
Verification Challenges
No public domain benchmarks
More complicated, with more corner cases and a larger state space
Outline
Two hierarchical protocols
- Inclusive
- Non-inclusive
A compositional approach
- Abstraction
- Counter-example guided refinement
- Soundness
A Multicore Coherence Protocol
[Figure: three clusters (Remote Cluster 1, Home Cluster, Remote Cluster 2), each with two L1 caches, an L2 Cache + Local Dir, and a RAC; plus a Global Dir and Main Memory]
Protocol Features
Both levels use MESI protocols
- Level-1: FLASH
- Level-2: DASH
Silent drop on non-Modified cache lines
Network channels are non-FIFO
Livelock Problem
[Figure: message sequence among Agent1, Dir, and Agent2; cache state labels Invld, Invld, Excl]
1. Req_E
2. Grant_E
3. Silent-drop
4. Req_S
5. Fwd_Req
6. NACK
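The livelock scenario can be replayed with a small Python sketch. This is a toy model, not the protocol from the talk: the class names, the `Shrd` state, and the `DATA` and `Grant_S` replies are assumptions made for illustration. It only shows why a silent drop leaves the directory pointing at an owner that can do nothing but NACK.

```python
class Agent:
    def __init__(self, name):
        self.name = name
        self.state = "Invld"

    def silent_drop(self):
        # Allowed on non-Modified lines; the directory is NOT informed.
        if self.state != "Modified":
            self.state = "Invld"

    def handle_fwd_req(self):
        # An agent that no longer holds the line can only refuse.
        return "DATA" if self.state in ("Excl", "Modified") else "NACK"


class Dir:
    def __init__(self):
        self.state = "Invld"
        self.owner = None

    def req_e(self, agent):
        if self.state == "Invld":
            self.state, self.owner = "Excl", agent
            agent.state = "Excl"
            return "Grant_E"
        return "NACK"

    def req_s(self, requester):
        if self.state == "Excl":
            # The directory believes the owner holds the line: forward.
            if self.owner.handle_fwd_req() == "NACK":
                return "NACK"  # directory state stays Excl
            self.owner.state = "Shrd"  # hypothetical shared state
        self.state = "Shrd"
        requester.state = "Shrd"
        return "Grant_S"


def run_scenario(retries):
    a1, a2, d = Agent("Agent1"), Agent("Agent2"), Dir()
    trace = [d.req_e(a1)]      # steps 1 and 2: Req_E, Grant_E
    a1.silent_drop()           # step 3: Silent-drop
    for _ in range(retries):   # steps 4 to 6, repeated on every retry
        trace.append(d.req_s(a2))
    return trace, d
```

In this sketch every retry of `Req_S` ends in a NACK and the directory never leaves `Excl`, which is the behavior the slide calls livelock.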
Blocking WB + NACK_SD
[Figure: message sequence chart among A1, Dir, and A2. Messages: Req_E, Req_S, Gnt_E, Fwd_S, WB, WB_Ack, NAck_SD, NAck. State labels: (I), (E), (M) after Modify, (I)]
Complexity of the Protocol
Multiplicative effect of four protocols running concurrently
Model checking failed after 161,876,000 states
Outline
Two hierarchical protocols
- Inclusive
- Non-inclusive
A compositional approach
- Abstraction
- Counter-example guided refinement
- Soundness
A Compositional Approach
[Figure: the original protocol and the abstracted protocol, related by abstraction and constraining steps]
Non-Circular Assume/Guarantee
- We cannot directly verify: h ║ r1 ║ r2 ╞ Coh
- Instead, we check (lowercase: original cluster; uppercase: abstracted cluster):
  - Check-1: h ║ R1 ║ R2 ╞ Coh1 ∧ Constrains1
  - Check-2: H ║ r1 ║ R2 ╞ Coh2 ∧ Constrains2
Verification Methodology
Abstraction
- Two abstracted protocols
Fixing real bugs in M
Refinement
Abstracted Protocol #1
[Figure: the Home Cluster keeps its two L1 caches and its L2 Cache + Local Dir; Remote Cluster 1 and Remote Cluster 2 are each reduced to an abstracted L2 Cache + Local Dir'; every cluster has a RAC; plus a Global Dir and Main Memory]
Abstracted Protocol #2
[Figure: Remote Cluster 1 keeps its two L1 caches and its L2 Cache + Local Dir; the Home Cluster and Remote Cluster 2 are each reduced to an abstracted L2 Cache + Local Dir'; every cluster has a RAC; plus a Global Dir and Main Memory]
Abstraction
- States: projection
- Transitions: overapproximation
Abstraction on States
Intra-cluster details
Inter-cluster details
Abstracting Transitions
- Rule-based system: guard → action;
- Relaxing guards
- Relaxing expr values
- Remove stmt
Example (original → abstracted):
Procs[p].WbMsg.Cmd = WB_Wb  →  true
Procs[p].L2.Data := Procs[p].WbMsg.Data;  →  Procs[p].L2.Data := d; …
Procs[p].L2.HeadPtr := L2;
…
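To make the overapproximation concrete, here is a minimal Python sketch of a guard → action rule system explored by breadth-first search. The state encoding, the toy rules, and the data domain `DATA` are assumptions for illustration and are not the Murphi model from the talk. The abstract rule relaxes the guard to `true` and the written value to an arbitrary `d`.

```python
from collections import deque

DATA = (0, 1, 2)  # hypothetical data domain


def reachable(init, rules):
    """Explicit-state BFS over a list of (guard, action) rules."""
    seen, frontier = {init}, deque([init])
    while frontier:
        s = frontier.popleft()
        for guard, action in rules:
            if guard(s):
                for t in action(s):
                    if t not in seen:
                        seen.add(t)
                        frontier.append(t)
    return seen


# Concrete state: (wb_cmd, wb_data, l2_data)
concrete_rules = [
    # Toy L1 cache that only ever writes back the value 1.
    (lambda s: s[0] == "None", lambda s: [("WB_Wb", 1, s[2])]),
    # Original rule: WbMsg.Cmd = WB_Wb  ->  L2.Data := WbMsg.Data
    (lambda s: s[0] == "WB_Wb", lambda s: [("None", 0, s[1])]),
]

# Abstract state: (l2_data,) after projecting away the write-back message.
abstract_rules = [
    # Relaxed guard (true) and relaxed value (any d in DATA).
    (lambda s: True, lambda s: [(d,) for d in DATA]),
]

concrete = reachable(("None", 0, 0), concrete_rules)
projected = {(s[2],) for s in concrete}
abstract = reachable((0,), abstract_rules)
```

Every projected concrete state is also reachable in the abstract system, but the abstract system reaches extra states (here `l2_data = 2`). Those extra states are the source of the bogus errors that refinement later removes.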
Detecting Bugs in M
When a real error is found in Mi
- Fix the bug in M
- Regenerate the Mi’s
- Iterate the process
Refinement
When a bogus error is found in Mi
- Analyze and find the problematic rule g → a
- Locate the original rule G → A in M
- Add a new lemma G => P in one abstracted protocol
- Strengthen the rule into g ∧ P → a
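The refinement step can be sketched in Python on two toy rule systems. Everything here is an assumption made for illustration: the state tuples, the rules, and the property `coh` are not taken from the actual protocol. One system plays the abstracted protocol whose rule g → a is strengthened with P; the other keeps the detail and is where the lemma G => P is checked.

```python
from collections import deque

DATA = (0, 1)  # hypothetical data domain


def reachable(init, rules):
    """Explicit-state BFS over a list of (guard, action) rules."""
    seen, frontier = {init}, deque([init])
    while frontier:
        s = frontier.popleft()
        for guard, action in rules:
            if guard(s):
                for t in action(s):
                    if t not in seen:
                        seen.add(t)
                        frontier.append(t)
    return seen


# Abstracted protocol, state = (l2_state, l2_data).
acquire = (lambda s: s[0] == "Invld", lambda s: [("Excl", s[1])])
write = lambda s: [(s[0], d) for d in DATA]
P = lambda s: s[0] == "Excl"
rule_g_a = (lambda s: True, write)            # g -> a, with g = true
rule_gP_a = (lambda s: True and P(s), write)  # g & P -> a


def coh(s):
    # Toy property: an invalid L2 line never holds modified data.
    return s[0] != "Invld" or s[1] == 0


bogus = [s for s in reachable(("Invld", 0), [acquire, rule_g_a]) if not coh(s)]
fixed = [s for s in reachable(("Invld", 0), [acquire, rule_gP_a]) if not coh(s)]

# Detailed protocol, state = (l1_state, wb_cmd, l2_state).
detailed_rules = [
    # Grant exclusive access to the L1 cache.
    (lambda s: s == ("Invld", "None", "Invld"),
     lambda s: [("Excl", "None", "Excl")]),
    # L1 writes back its exclusive line.
    (lambda s: s[0] == "Excl", lambda s: [("Invld", "WB", s[2])]),
    # Original rule G -> A: the local dir consumes the WB.
    (lambda s: s[1] == "WB", lambda s: [(s[0], "None", s[2])]),
]

# Lemma G => P: whenever a WB is pending, the L2 line is exclusive.
lemma_holds = all(
    s[2] == "Excl"
    for s in reachable(("Invld", "None", "Invld"), detailed_rules)
    if s[1] == "WB"
)
```

In this toy setting the unrefined rule yields the bogus violation `("Invld", 1)`, the strengthened rule yields none, and the lemma that justifies the strengthening holds in the detailed system.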
Details of Refinement (I)
1. False alarm found in M1
- Remote cluster-1 can modify its L2 line arbitrarily: true → …
Details of Refinement (II)
2. Locate the original rule in M before abstraction
Procs[p].WbMsg.Cmd = WB → …
- Guard: when the local dir receives a WB from an L1 cache
Details of Refinement (III)
3. Strengthen the problematic rule found in step 1
true & Procs[p].L2.State = Excl → …
- L2 can modify its line only when the local dir is exclusive
Details of Refinement (IV)
4. Why is strengthening sound?
Details of Refinement (V)
4. We can add a new lemma in M2:
Procs[p].WbMsg.Cmd = WB => Procs[p].L2.State = Excl
One Detail
[Figure: Remote Cluster 1, Home Cluster, and Remote Cluster 2 with cache states Excl and Invld; global directory entry "Excl: 1"]
1. Req_E
2. Req_E
3. Fwd_ReqE
4. Fwd_ReqE
5. Gnt_E
Original Transitions (I)
GUniMsg[src].Cmd = RDX_RAC &
GUniMsg[src].Cluster = r &
Procs[r].L2.Gblock_WB = false &
Procs[r].L2.State = Excl &
Procs[r].L2.HeadPtr != L2
→
…
undefine GUniMsg[src];
GUniMsg[src].Cmd := GUNI_None;
Original Transitions (II)
Procs[r].ShWbMsg.Cmd = SHWB_FAck &
src_node = L2
→ …

true &
ABSProcs[r].L2.State = Excl &
ABSProcs[r].RAC.State = Inval &
ABSProcs[r].L2.Gblock_WB = false &
GUniMsg[src].Cmd = RDX_RAC &
GUniMsg[src].Cluster = p
→ …
Adding A Variable
[Figure: Remote Cluster 1, Home Cluster, and Remote Cluster 2 with cache states Excl and Invld, global directory entry "Excl: 1", and step markers 1 to 5]
New variable: ifKeepMsg: boolean
Soundness of the Approach
Goal
- If M1 and M2 are both model checked correct w.r.t. the coherence property Φ of M, then M must also be correct w.r.t. Φ
Soundness Proof
Temporal induction
- Initial states
  - Each var has the same value in M, M1 and M2
  - Each newly added lemma is checked in M1 and M2
  - Each property is checked
- Suppose soundness holds in state s
Soundness Proof (II)
M:  h1, h2, r11, r12, r21, r22  --(g → a)-->  h1', h2', r11', r12', r21', r22'
M1: h1, h2, r12, r22  --(g1 & p1 → a1)-->  h1', h2', r12', r22'
M2: h1, r11, r12, r22  --(g2 & p2 → a2)-->  h2', r11', r12', r22'
Experiment Results
A real bug found
10 iterations of refinement
- The size of each error trace is < 12
- One person-day of work
Reduction
- 64-bit Murphi
- IA-64 with 20 GB of memory

Protocol | Number of states
M        | > 161,876,000
M1       | 31,919,219
M2       | 78,689,678
Outline
Two hierarchical protocols
- Inclusive
- Non-inclusive
A compositional approach
- Abstraction
- Counter-example guided refinement
- Soundness
Caching Hierarchy
- Inclusive
- Exclusive
- Non-inclusive
A Non-Inclusive Hierarchical Protocol
[Figure: three clusters (Remote Cluster 1, Home Cluster, Remote Cluster 2), each with two L1 caches, an L2 Cache + Local Dir, and a RAC; plus a Global Dir and Main Memory]
Protocol Differences
- Broadcasting channels
[Figure: two L1 caches, SnoopMsg[], L2 Cache + Local Dir, RAC]
Imprecise Local Directory
[Figure: message sequence among L1-1 (S), L1-2 (I), LDir, and GDir. Messages: Req_S, Swap, Broadcast, Fwd_Req, NAck, Gnt_S, Gnt_S. LDir entries shown: "S: L1-1" and "S: L1-2", marked "Imprecision!"]
Verification Difficulty
- Coherence properties
  - Can involve multiple L1 caches
- Refinement
  - Noninterference lemmas cannot infer L2 cache line states from local behaviors
An Example
[Figure: example with L1 cache states Invld and Excl and WB messages]
L2: (Excl, data1) → (Excl, data2)
L2: (Invld, *) → (Excl, data2)
Two Approaches of Refinement
- Inferring “exclusive” from
  - Outside the cluster
  - Inside the cluster
Infer exclusive From Outside
[Figure: Cluster p with L1 cache states Excl and Invld and a WB message; L2: (Invld, *) → (Excl, data2)]
IsExcl(p) ≡
  Dir.State = Excl &
  GUniMsg[p].Cmd != (ACK || IACK || ImACK) &
  GUniMsg[h].Cmd != (ACK || IACK || ImACK) &
  GWbMsg.Cmd = GWB_None &
  ( (GShWbMsg.Cmd = GSHWB_None & Dir.Headptr = p) ||
    (GShWbMsg.Cmd = DXFER & GShWbMsg.Cluster = p) )
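The IsExcl(p) predicate on this slide can be transcribed into Python as a sanity check of its shape. The `GlobalState` container and its field names are hypothetical stand-ins for the Murphi variables named on the slide, and `h` is passed in explicitly as the home cluster.

```python
from dataclasses import dataclass
from typing import Dict

ACKS = {"ACK", "IACK", "ImACK"}


@dataclass
class GlobalState:
    dir_state: str                 # Dir.State
    dir_headptr: str               # Dir.Headptr
    guni_cmd: Dict[str, str]       # GUniMsg[c].Cmd for each cluster c
    gwb_cmd: str = "GWB_None"      # GWbMsg.Cmd
    gshwb_cmd: str = "GSHWB_None"  # GShWbMsg.Cmd
    gshwb_cluster: str = ""        # GShWbMsg.Cluster


def is_excl(g: GlobalState, p: str, h: str) -> bool:
    """Cluster p is exclusive, judged only from state outside the cluster."""
    return (
        g.dir_state == "Excl"
        and g.guni_cmd[p] not in ACKS
        and g.guni_cmd[h] not in ACKS
        and g.gwb_cmd == "GWB_None"
        and (
            (g.gshwb_cmd == "GSHWB_None" and g.dir_headptr == p)
            or (g.gshwb_cmd == "DXFER" and g.gshwb_cluster == p)
        )
    )
```

For example, with the directory in `Excl`, the head pointer at `r1`, and no acknowledgement in flight, `is_excl(g, "r1", "h")` is true. Putting an `IACK` into `guni_cmd["r1"]` makes it false.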
Refinement Example
[Figure: Cluster p with L1 cache states Excl and Invld and a WB message]
p.WbMsg.Cmd = WB => IsExcl(p)
L2 before refinement: (Invld, *) → (Excl, data2)
L2 after refinement: (Invld & IsExcl(p), *) → (Excl, data2)
Infer exclusive From Inside
M1
M2
Definition of IE
IE(p):
  exists i: L1_caches
    (p.L1(i).state = Excl or
     p.SnoopMsg(i).Cmd = (Put or PutX) or
     p.UniMsg(i).Cmd = PutX) or
  p.WbMsg.Cmd = WB or
  p.ShWbMsg.Cmd = ShWb or
  p.ShWbMsg.Cmd = FAck
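As a rough illustration, IE(p) can be written as a Python predicate over a cluster record. The `Cluster` class, its field names, and the `"None"` idle command are hypothetical; only the command and state names `Excl`, `Put`, `PutX`, `WB`, `ShWb`, and `FAck` come from the slide.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cluster:
    l1_state: List[str]     # p.L1(i).state
    snoop_cmd: List[str]    # p.SnoopMsg(i).Cmd
    uni_cmd: List[str]      # p.UniMsg(i).Cmd
    wb_cmd: str = "None"    # p.WbMsg.Cmd
    shwb_cmd: str = "None"  # p.ShWbMsg.Cmd


def ie(p: Cluster) -> bool:
    """Exclusive ownership inferred from inside cluster p."""
    some_l1 = any(
        p.l1_state[i] == "Excl"
        or p.snoop_cmd[i] in ("Put", "PutX")
        or p.uni_cmd[i] == "PutX"
        for i in range(len(p.l1_state))
    )
    return some_l1 or p.wb_cmd == "WB" or p.shwb_cmd in ("ShWb", "FAck")
```

A cluster with all L1 caches invalid and no messages in flight gives `ie(p) == False`; a pending `WB`, or a single L1 cache in `Excl`, makes it true.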
Refinement
[Figure: Cluster p with L1 cache states Excl and Invld and a WB message]
Procs[p].WbMsg.Cmd = WB & Procs[p].L2.State = Invld => IE(p)
L2 before refinement: (Invld, *) → (Excl, data2)
L2 after refinement: (Invld & IE(p), *) → (Excl, data2)
Soundness
- Still holds after adding the extra bits “IE”
Experiment Results
17 iterations of refinement
Size of each error trace is < 8

Protocol | Number of states
M        | > 1,521,900,000
M1       | 234,478,105
M2       | 283,124,383
Outline
Two hierarchical protocols
- Inclusive
- Non-inclusive
A compositional approach
- Abstraction
- Counter-example guided refinement
- Soundness
Conclusion
Developed 2-level hierarchical protocols
Proposed a compositional approach
- Abstraction
- Bug fixing
- Refinement
Proved the soundness
Related Work
- FMCAD’04: Chou et al., A simple method for parameterized verification of cache coherence protocols
- CHARME’99: McMillan, Verification of infinite state systems by compositional model checking
For Details
http://www.cs.utah.edu/formal_verification/
About the Bug
IACK
Another Decomposition Approach
- Split protocols hierarchically
  - Intra-cluster protocol
  - Inter-cluster protocol
Intra-cluster Protocol
[Figure: one cluster with two L1 caches, an L2 Cache + Local Dir, and a RAC, interacting with an Environment]
Inter-cluster Protocol
[Figure: Remote Cluster 1, Home Cluster, and Remote Cluster 2, each reduced to an abstracted L2 Cache + Local Dir' and a RAC; plus a Global Dir and Main Memory]
Verification Difficulty
[Figure: three clusters (Remote Cluster 1, Home Cluster, Remote Cluster 2), each with two L1 caches, an L2 Cache + Local Dir, and a RAC; plus an Environment, a Global Dir, and Main Memory]
An Example Scenario
[Figure: Remote Cluster 1, Home Cluster, and Remote Cluster 2 with cache states Excl and Invld; global directory entry "Excl: 1"]
1. Req_E
2. Req_E
3. Fwd_ReqE
4. Swap
5. Req_E
6. Fwd_ReqE
7. NACK