De-synchronization: from synchronous to asynchronous

Based on the paper: Blunno, Cortadella, Kondratyev, Lavagno, Lwin, Sotiriou, Handshake protocols for de-synchronization, ASYNC 2004.

Outline
• What is de-synchronization?
• Behavioral equivalence
• 4-phase protocols for de-synchronization
• Concurrency
• Correctness
• An example

De-synchronize
[Figure: a synchronous circuit driven by CLK next to an asynchronous circuit without CLK.]

Synchronous circuit
[Figure: synchronous circuit with MS flip-flops drawn as latches (L), all driven by CLK.]

De-synchronization
[Figure: the same latches (L), each driven by a local controller (C).]

De-synchronization
• Distributed controllers substitute the clock network
• The data path remains intact!

Design flow
• Think synchronous
• Design synchronous: one clock and edge-triggered flip-flops
• De-synchronize (automatically)
• Run it asynchronously

Prior work
• Micropipelines (Sutherland, 1989)
• Local generation of clocks
  • Varshavsky et al., 1995
  • Kol and Ginosar, 1996
• Theseus Logic (Ligthart et al., 2000)
  • Commercial HDL synthesis tools
  • Direct translation and special registers
• Phased logic (Linder and Harden, 1996) (Reese, Thornton, Traver, 2003)
  • Conceptually similar
  • Different handshake protocol (2 phase vs. 4 phase)

Automatic de-synchronization
• Devise an automatic method for de-synchronization
• Identify a subclass of synchronous circuits suitable for de-synchronization
• Formally prove correctness

Behavioral equivalence

[Figure: synchronous flow and de-synchronized flow, related by flow equivalence [Le Guernic, Talpin, Le Lann, 2003].]

Flow equivalence
[Figure: value sequences of latches A and B in the synchronous behavior (clocked by CLK) and in the de-synchronized behavior.]
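As a rough illustration of flow equivalence, the sketch below assumes (this is not taken from the slides) that a behavior is recorded as a list of (time, latch, value) events. Two behaviors are treated as flow-equivalent when every latch stores the same sequence of values in both, regardless of when each value is stored. The function names and the sample values are hypothetical.

```python
from collections import defaultdict


def flows(trace):
    """Project a trace of (time, latch, value) events onto per-latch value sequences."""
    per_latch = defaultdict(list)
    # Order events by time; the timestamps themselves are then discarded.
    for _, latch, value in sorted(trace, key=lambda event: event[0]):
        per_latch[latch].append(value)
    return dict(per_latch)


def flow_equivalent(trace_a, trace_b):
    """True if every latch sees the same sequence of values in both traces."""
    return flows(trace_a) == flows(trace_b)


# Hypothetical synchronous behavior: A and B both capture on clock ticks 0, 1, 2.
sync_trace = [(0, "A", 1), (0, "B", 0), (1, "A", 5), (1, "B", 2), (2, "A", 3), (2, "B", 1)]

# Hypothetical de-synchronized behavior: same values per latch, different timing.
desync_trace = [(0.0, "A", 1), (0.4, "B", 0), (0.9, "A", 5), (1.1, "B", 2), (2.0, "B", 1), (2.3, "A", 3)]

# Hypothetical faulty behavior: latch B misses the value 2.
lossy_trace = [(0.0, "A", 1), (0.4, "B", 0), (0.9, "A", 5), (2.0, "B", 1), (2.3, "A", 3)]

print(flow_equivalent(sync_trace, desync_trace))
print(flow_equivalent(sync_trace, lossy_trace))
```

The comparison deliberately ignores the interleaving between different latches, which is what allows the de-synchronized circuit to run without a global clock.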
4-phase protocols for de-synchronization

[Figure: latches (L), each driven by a local controller (C).]

[Figure: four latches A, B, C, D with the events A+ B- C+ D- / A- B+ C- D+, stepped through successive latch states starting from A0 B0 C0 D0.]
• A latch cannot read another data item until the successor has captured the current one
• A latch cannot become opaque before having captured the data item from its predecessor

[Figure: latches A, B, C, D with the events A+ B+ C+ D+ / A- B- C- D-, and a pair of adjacent latches A and B.]

Concurrency

Can we increase concurrency?
[Figure: event graphs over A+, B+, A-, B- that are not flow-equivalent; the failures shown are a data overrun and lost data.]

Can we reduce concurrency? How much?
[Figure: event graphs over A+, B+, A-, B- with 8, 6, 5 and 4 states.]
Protocols shown:
• de-synchronization model
• fully decoupled (Furber & Day)
• GasP, IPCMOS
• semi-decoupled (Furber & Day)
• simple 4-phase
• non-overlapping

[Figure: the same protocols (de-synchronization model, fully decoupled, semi-decoupled, GasP/IPCMOS, simple 4-phase, non-overlapping), each drawn as a graph over A+, B+, A-, B-.]

4-phase latch controllers
[Figure: latch controller with handshake signals Rin, Ain, Rout, Aout and latch signal Lt.]
Furber and Day, IEEE Trans. VLSI, June 1996
Implementation note: Lt=0 (transparent), Lt=1 (opaque)

4-phase latch controllers: simple 4-phase controller
[Figure: signal transition graph over Rin+, Ain+, Rout+, Lt+, Aout+, Rin-, Ain-, Rout-, Lt-, Aout-.]

4-phase latch controllers: semi-decoupled controller
[Figure: signal transition graph over Rin+, A+, Rout+, Ain+, Lt+, Aout+, Rin-, A-, Rout-, Ain-, Lt-, Aout-.]

4-phase latch controllers: fully decoupled controller
[Figure: signal transition graph over Rin+, A+, Rout+, Ain+, Lt+, Aout+, B+, Rin-, A-, Rout-, Ain-, Lt-, Aout-, B-.]

4-phase latch controllers (state graphs)
[Figure: state graphs of the semi-decoupled controller and the fully decoupled controller.]

[Figure: two latches A and B, each with a controller (cntrl), linked by the handshake pairs Ri/Ai, Rx/Ax and Ro/Ao; events Ri+, Ai+, Ri-, Ai-, Rx+, Ax+, Rx-, Ax-, Ro+, Ao+, Ro-, Ao-, A+, A-, B+, B- (semi-decoupled 4-phase protocol).]
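For readers less familiar with 4-phase (return-to-zero) signalling on a single request/acknowledge pair such as Ri/Ai, the following is a minimal sketch of the event order on one channel. The names `req` and `ack` are generic placeholders, and the checker is hypothetical: it does not model the latch signal Lt or the coupling between the input and output channels, which is what distinguishes the simple, semi-decoupled and fully decoupled controllers.

```python
# One full 4-phase handshake on a single channel, starting from the idle state.
FOUR_PHASE_ORDER = ["req+", "ack+", "req-", "ack-"]


def follows_four_phase(events):
    """Check that a list of events repeats the cycle req+, ack+, req-, ack-."""
    for index, event in enumerate(events):
        if event != FOUR_PHASE_ORDER[index % 4]:
            return False
    return True


good = ["req+", "ack+", "req-", "ack-", "req+", "ack+"]
bad = ["req+", "req-", "ack+", "ack-"]  # request withdrawn before it is acknowledged

print(follows_four_phase(good))
print(follows_four_phase(bad))
```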
Correctness

Which protocols are valid for de-synchronization?
[Figure: event graph over A+, B+, A-, B-.]

Theorem: the de-synchronization protocol preserves flow-equivalence
Proof: by induction on the length of the traces
• Induction hypothesis: same latch values at reset
• Induction step: same values at cycle i imply same values at cycle i+1

Theorem: any reduction in concurrency preserves flow-equivalence
[Figure: event graphs over A+, B+, A-, B- with reduced concurrency.]

Any hybrid approach preserves flow-equivalence!
[Figure: a network of latches whose controllers mix semi-decoupled, fully decoupled and non-overlapping protocols.]
[Figure: latches A, B, C, D with the events A+ B+ C+ D+ / A- B- C- D-, combining the semi-decoupled, non-overlapping and fully decoupled protocols.]

Flow-equivalence is preserved, … but …

Liveness
• Preservation of flow-equivalence: all the generated traces are equivalent
• Are all traces generated? (Is the marked graph live?)
• Not always!
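Whether a marked graph is live can be checked mechanically. The following is a minimal sketch, assuming the graph is given as a list of arcs of the form (source event, target event, tokens). A marked graph is live exactly when every directed cycle carries at least one token, which is the same as saying that the arcs with zero tokens contain no cycle. The two four-event graphs are hypothetical toy examples, not the protocols from the slides.

```python
from collections import defaultdict


def is_live(arcs):
    """Return True if no directed cycle consists only of token-free arcs."""
    successors = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set()
    for source, target, tokens in arcs:
        nodes.update((source, target))
        if tokens == 0:  # keep only the token-free arcs
            successors[source].append(target)
            indegree[target] += 1

    # Kahn's algorithm: the token-free subgraph is acyclic
    # if and only if every node can be removed in topological order.
    ready = [node for node in nodes if indegree[node] == 0]
    removed = 0
    while ready:
        node = ready.pop()
        removed += 1
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    return removed == len(nodes)


# Toy ring with one token on the closing arc: the only cycle is marked.
marked_ring = [("A+", "B+", 0), ("B+", "A-", 0), ("A-", "B-", 0), ("B-", "A+", 1)]

# Same ring with the token removed: the cycle is token-free, so nothing can fire.
empty_ring = [("A+", "B+", 0), ("B+", "A-", 0), ("A-", "B-", 0), ("B-", "A+", 0)]

print(is_live(marked_ring))
print(is_live(empty_ring))
```

Working on the token-free subgraph avoids enumerating cycles explicitly, so the check stays linear in the number of arcs.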
[Figure: events A+ B+ C+ D+ / A- B- C- D- under the semi-decoupled 4-phase handshake protocol.]
Liveness: all cycles have at least one token [Commoner 1971]

[Figure: events A+ B+ C+ D+ / A- B- C- D- under the simple 4-phase handshake protocol.]

Results about liveness
• At least three latches in a ring are required with only one data token circulating [Muller 1962]
• Theorem (this paper): any hybrid combination of protocols is live if the simple 4-phase protocol is not used
• Proof: any cycle has at least one token

[Figure: the protocols (de-synchronization model, fully decoupled, semi-decoupled, GasP/IPCMOS, simple 4-phase, non-overlapping), each drawn as a graph over A+, B+, A-, B-.]

An example

Async DLX block diagram
[Figure: block diagram of the asynchronous DLX.]

Synchronous RTL

            Synchronous     Desynchronized
Cycle       4.4 ns          4.45 ns
Power       70.9 mW         71.2 mW
Area        372,656 µm²     378,058 µm²

• All numbers are after placement and routing
• Total of 1500 flip-flops, 3000 latches
• The de-synchronized design includes 5 controllers, each driving 2 clock trees
• Power numbers include the clock tree
• Technology: UMC/Virtual Silicon 0.18 µm

Discussion
The de-synchronization model provides an abstraction of the timing behavior
[Figure: timed graph with nodes A to G and delay intervals [5,7], [2,3], [0,0], [3,5], [3,5], [1,2], [8,9], [2,4].]
• Timing analysis
• Exploration of the design space

Conclusions
• EDA tools require formal support (they must work for all circuits)
• A complete characterization of 4-phase protocols has been presented (partial order based on concurrency)
• Design flow developed at Cadence Berkeley Labs
  • Automated from gate netlist
  • Static timing analysis to derive matched delays
  • Constrained P&R to meet timing constraints
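The post-layout figures reported for the DLX example can be compared directly. The sketch below computes the relative overhead of the desynchronized design over the synchronous one. The dictionary keys are hypothetical names, and the area is treated as a plain number since only the ratio matters.

```python
# Post-layout figures quoted in the example section.
sync = {"cycle_ns": 4.4, "power_mw": 70.9, "area": 372_656}
desync = {"cycle_ns": 4.45, "power_mw": 71.2, "area": 378_058}


def overhead_percent(base, new):
    """Relative increase of each metric in `new` over `base`, in percent."""
    return {key: 100.0 * (new[key] - base[key]) / base[key] for key in base}


overhead = overhead_percent(sync, desync)
for metric, percent in overhead.items():
    print(f"{metric}: +{percent:.2f}%")
```

With these inputs, each of the three overheads comes out below 1.5%.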