Fault Tolerant Four-State Logic by Using Self-Healing Cells

Martin Delvai
Institute of Computer Engineering, Vienna University of Technology, Austria

Thomas Panhofer, Werner Friesenbichler
Austrian Aerospace GmbH
{thomas.panhofer, werner.friesenbichler}@space.at

Abstract: The trend towards higher integration and faster operating speed leads to decreasing feature sizes and lower supply voltages in modern integrated circuits. These properties make the circuits more error-prone, requiring a fault tolerant implementation for applications demanding high reliability, e.g. space missions. In previous work we presented a concept for obtaining fault tolerant digital circuits by using asynchronous Four-State Logic (FSL). This type of logic already exhibits a high degree of fault tolerance, where most faults simply halt the circuit (deadlock). The remaining types of faults are handled by temporal redundancy. Adding a deadlock detection unit and introducing the concept of Self-Healing Cells (SHCs) leads to a highly reliable circuit that is able to tolerate even multiple faults. However, our experiments revealed that some specific fault constellations neither cause a deadlock nor are detected by a redundant calculation. We present two improved ways of error detection, which capture even these types of faults. Further, the size of an SHC is compared against the achieved fault tolerance with respect to multiple faults.

I. INTRODUCTION

During its lifetime, a circuit might be exposed to various types of faults. In particular for applications where repair is not feasible, e.g. space missions, high reliability is an important issue that requires error mitigation and failure handling techniques. A robust architecture with an increased inherent fault tolerance is an advantage. Asynchronous Four-State Logic (FSL) [1] shows this property and additionally provides a more deterministic behavior in the presence of errors than conventional single-rail logic.

The concept we propose tolerates both transient and permanent faults. It is based on the fact that FSL circuits tend to stop (deadlock) in the presence of errors. Combining a deadlock detection unit with Self-Healing Cells (SHCs) allows the circuit to be re-routed after a deadlock. The flexibility of SHCs makes it possible to recover even from multiple faults. Due to the handshaking in FSL, no data is lost in case of a deadlock; thus, after completion of the mitigation process, the operation continues without any additional actions.

Unfortunately, some faults produce a wrong result without leading to a deadlock. These faults are more severe because they cannot be detected that simply. Just adding temporal redundancy is not a cure, due to the vulnerability to token and synchronization errors [2]. A more evolved architectural design of FSL is presented to mitigate these faults.

The paper starts with related work in Section II. It briefly introduces the principle, the robustness aspects as well as the underlying fault hypothesis of FSL in Section III before it moves on to the self-healing approach in Section IV. Transient and permanent fault mitigation are treated in Sections V and VI, respectively. Finally, Section VII draws a conclusion.

II. RELATED WORK

Different methods based on classic QDI logic using a 4-phase protocol have been proposed to mitigate transient faults [3], [4] and to detect permanent faults [5], [6]. Although easier to implement, this protocol requires spacers to separate the particular data words, which reduces the speed of the circuit. Often the investigations end with the enforcement of a deadlock, but there is only little investigation of how to recover from such a state [7].

To recover from permanent errors, runtime reconfiguration can be performed. Genetic algorithms [8] autonomously evolve a working circuit; however, the result is rather unpredictable. Reverse-engineering the bitstream of an FPGA to manipulate the circuit systematically is possible [9], but, as with genetic algorithms, this does not seem acceptable for high-reliability applications. The only known tool for manipulating synthesized circuits provided by FPGA manufacturers is JBits [10] from Xilinx, but it is restricted to the Virtex-II series and Java based, which cannot be used e.g. for space applications. Using different pre-synthesized implementations of the same circuit [11] overcomes these problems, but requires large memories to store the bitstreams. All these methods can only be applied to FPGAs, whereas our approach can also be used in ASICs. Further, the actual configuration of a circuit can be easily determined, leading to a predictable behavior.

978-1-4244-2658-4/08/$25.00 ©2008 IEEE

III. FOUR STATE LOGIC - FSL

A. Concept of FSL

FSL is a member of the Quasi Delay Insensitive (QDI) circuits [12] that uses a two-phase handshake protocol. Consecutive data is separated by using two alternating, diverse code sets, ϕ0 and ϕ1, called phases. Fig. 1 shows the encoding and transition between the boolean values TRUE/FALSE, denoted as 'h'/'l' in phase ϕ0 and 'H'/'L' in ϕ1. Each logic value is encoded by the two signal rails a and b [13]. Combinational FSL logic only calculates a new output if the inputs are consistent, i.e. have the same phase; otherwise the old output value is preserved. Assuming an appropriate design of the circuit [14], it operates glitch-free. This fact is important since, due to the absence of a clock signal, the storage elements in an FSL circuit have to derive the trigger event when to store new data from the data encoding itself. The glitch-free operation ensures that consistent data

[Fig. 1: encoding of the logic states on the rails (a,b). Phase ϕ0: "FALSE" = l (0,0), "TRUE" = h (1,1). Phase ϕ1: "FALSE" = L (0,1), "TRUE" = H (1,0).]

fault has vanished.
However, if a specific amount of time is exceeded, the circuit is considered to be halted forever, which is called a deadlock. A token error describes the corruption of data, where data moving through an asynchronous channel is referred to as a token. Synchronization errors are related to faults in the handshaking process and may lead to an unintended token consumption. Token and synchronization errors do not stop the circuit but generate wrong data [2].

[Fig. 1 caption: FSL encoding and state transitions]

IV. SELF-HEALING APPROACH

[Fig. 2: diagram of pipeline stages built from combinational logic f(x), latches with phase detectors (Φ), control blocks (Ctrl) with en, done and pass signals, and a monitoring unit.]

Primarily our approach aims to mitigate faults by using FSL logic, which already widely tolerates delay faults [14]. Transient and permanent faults require special attention. In QDI circuits permanent faults cause a deadlock, which simplifies their detection but requires a costly repair procedure. We developed Self-Healing Cells (SHCs), which allow recovery from permanent errors by duplicating the internal logical structure and providing flexible routing. Each SHC constitutes a fault containment region. Thus, in contrast to a simple duplication of the entire target device, the finer grained SHC approach is able to tolerate multiple faults.

Transient faults either suspend the circuit's operation or trigger wrong data. In case the transient fault leads to a deadlock, the same recovery mechanism as for permanent faults can be applied. However, if the transient fault generates wrong data, additional measures are required. A first approach to detect faults was presented in [2], which also identified some weaknesses. Section V provides two possible improvements to overcome these shortcomings.

is also valid data, so both combinational logic and registers derive their trigger from a consistency check of the input data. Registers control the data flow.
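The encoding of Fig. 1 and the consistency check can be illustrated with a small behavioural model. This is only a sketch in Python, not the authors' implementation: the function names (`phase`, `value`, `encode`, `common_phase`) and the `FslAnd` class, including its reset state, are assumptions made for illustration. It relies only on the code words given in Fig. 1, from which the phase follows as a XOR b and the logical value as rail a.

```python
# Code words from Fig. 1 as rail pairs (a, b).
CODE = {"l": (0, 0), "h": (1, 1), "L": (0, 1), "H": (1, 0)}


def phase(rails):
    """Return 0 for phase phi0 (rails equal) and 1 for phi1 (rails differ)."""
    a, b = rails
    return a ^ b


def value(rails):
    """Return the logical value; in this encoding it equals rail a."""
    return bool(rails[0])


def encode(val, ph):
    """Build the rail pair for logical value `val` in phase `ph`."""
    a = int(val)
    return (a, a ^ ph)


def common_phase(signals):
    """Return the shared phase of all signals, or None if they disagree."""
    phases = {phase(s) for s in signals}
    return phases.pop() if len(phases) == 1 else None


class FslAnd:
    """Hypothetical behavioural model of a 2-input FSL AND gate."""

    def __init__(self):
        self.out = CODE["l"]  # assumed reset state: FALSE in phi0

    def update(self, a, b):
        ph = common_phase([a, b])
        if ph is not None:  # consistent inputs: compute a new output
            self.out = encode(value(a) and value(b), ph)
        return self.out  # inconsistent inputs: old output is preserved


if __name__ == "__main__":
    gate = FslAnd()
    print(gate.update(CODE["H"], CODE["H"]))  # both TRUE in phi1
    print(gate.update(CODE["l"], CODE["H"]))  # mixed phases, output held
    print(gate.update(CODE["l"], CODE["h"]))  # FALSE AND TRUE in phi0
```

In this model a single corrupted rail changes the phase of one input, so the gate simply holds its previous output until the inputs agree again.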
After the data has been captured, an acknowledge signal is generated, which requests the preceding stage to issue new data. The data path of FSL circuits is modeled similarly to a micropipeline [15] and is shown in Fig. 2. The monitoring and watchdog units are not part of the micropipeline concept but are required for deadlock detection, see Section VI.

[Fig. 2 caption: FSL pipeline structure]

B. Robustness of FSL Circuits

To assess the robustness we consider permanent, transient and delay faults. FSL is largely immune to delay faults, provided the basic FSL gates are internally not affected by delay constraints [14]. Delay faults degrade the performance of the circuit but do not affect correct data processing.

For transient faults, we distinguish a settled state, where all inputs are consistent, and a transient state, where the inputs are inconsistent. In the settled state, a single fault at any input produces an inconsistent input vector, which does not propagate because FSL circuits do not react to inconsistent data. As long as the transient fault persists, the circuit is halted. In the transient state, a fault can cause an error only if it generates a consistent input vector: assuming an n-bit vector, this requires that n − 1 signals have acquired the new value and the fault affects the remaining, slowest signal, in particular the rail that is not going to change anyway. This reduces the probability that a transient fault yields an error by a factor of $\frac{1}{2n}$ compared to conventional single-rail logic.

Permanent faults are crucial for asynchronous logic because they may result in permanently inconsistent signal vectors and thus in a deadlock. A detailed analysis of the robustness of quasi delay-insensitive circuits can be found in [16].

V. TRANSIENT ERROR RECOVERY

A.
Fault injection

Initially, we added temporal redundancy [2] to detect transient faults not mitigated by FSL: each operation is performed twice, once in ϕ0 and once in ϕ1. The results are compared at the logical level and, if they differ, a third calculation is initiated to identify the correct result. Due to the alternating phases, permanent errors are also detected, which are invisible to conventional temporally redundant systems. The main drawback of this method is that it requires two operations, which reduces the operational speed by 50%. However, due to the 2-phase protocol, the throughput is still as high as for classical QDI circuits using a 4-phase protocol.

Fault injection experiments were performed using a pipelined 4-bit ripple carry adder designed in FSL, as shown in Fig. 3. For simplicity, both operands are fetched at the same time using an 8-bit wide register. Transient faults were injected into the combinational logic of the adder as well as the operand and result registers, on all signals and at all time instants of the simulation. The particular components were simulated with arbitrary propagation delays to obtain a realistic behavior. The fault duration was selected longer than the slowest propagation delay in the circuit to avoid the transient being suppressed by the logic function. The fault injection experiments were performed with a simple FSL circuit and again with the same circuit but applying temporal redundancy.

C. Fault Hypothesis in FSL

Permanent, transient and delay faults may trigger delay, token or synchronization errors. Delay errors suspend the execution of a circuit. The circuit may resume when the

[Fig. 3: 4-bit FSL pipeline adder. Blocks: SOURCE, Operand Register, 4-bit Adder, Result Register, SINK; signals: Data (8 bit), Sum (4 bit), Carry, Done, Pass.]

TABLE I
4-BIT PIPELINE ADDER WITH TIME REDUNDANCY

                         Size of redundant data set
                         N = 0            N = 1            N = 4
Faults injected          34230  100.0%    34230  100.0%    34230  100.0%
Mitigated by FSL         31916   93.2%    31916   93.2%    31903   93.2%
Errors                    2314    6.8%     2314    6.8%     2327    6.8%
  Deadlocks                533    1.6%      533    1.6%      613    1.8%
  Wrong data              1781    5.2%     1781    5.2%     1714    5.0%
Mitigated by time red.       0    0.0%     1698    5.0%     1714    5.0%
Undetected Failures       1781    5.2%       83    0.2%        0    0.0%

The results shown in Table I are grouped according to the size of the redundant data set N, see also subsection V-B. For N = 0 no redundant calculation is performed, while N = 1 uses a simple time redundancy scheme where each operation is repeated. In both cases, 2314 (6.8%) of the injected faults lead to an error, of which 533 (1.6%) produced a deadlock. In the pure FSL circuit, 1781 errors (5.2% of the injected faults) generate wrong data, which leads to undetected failures in the absence of any control mechanism. However, even with a redundant calculation, 83 (0.2%) errors remained undetected. A closer examination revealed that these failures were triggered by synchronization errors that consume two consecutive tokens forming one complete redundant data set. Such a behavior is simulated in Fig. 4, showing a 4-bit wide pipeline having three stages. At time t2, the pipeline is filled with the sequence 'HLLH', 'hllh' and 'HHHH', where two consecutive tokens build one redundant data set. A transient fault on done2 (from t3 to t4) consumes the set 'HHHH'/'hhhh' without passing it through the pipeline, so the complete set vanishes from the data sequence.

[Fig. 4: Token vanishing due to a synchronization error. Waveforms of fsdatain, fsdata1, fsdata2, fsdata3 and done1, done2, done3 over the time instants t0 to t5.]

This token vanishing stems from the 2-phase protocol in FSL, which reacts to every transition. A pulse on the acknowledge signal creates two events and thus two synchronization errors. If the data source of the affected register is fast enough to provide a new token for each transition, two consecutive tokens will be consumed. If these tokens form a redundant data set, both errors will remain undetected.

B. Grouped Data Sets

The operations to be performed are grouped in two redundant data sets, each comprising N operations. The sets are processed consecutively and the results are checked for equality. For N = 1, each operation is simply repeated, i.e. the data series is {1, 1, 2, 2, ...}, where the numbers denote the particular tokens in the series. The length of the redundant data set should be kept short for fast error detection. The fault injection experiment was repeated with a redundant set size of N = 4, where the data series is {1, 2, 3, 4, 1, 2, 3, 4, 5, 6, ...}.

As shown in Table I, no undetected failures were observed with N = 4. The number of errors is not identical to the simulation with N = 1, since re-arranging the sequence of the input data changes whether the injected fault flips a signal rail or not. However, the variation is only minor. Although increasing the redundant data set size N allowed all faults to be detected, it has the drawback that the whole set has to be stored during the redundant calculation and that the comparison becomes more complicated. Thus another mechanism to avoid synchronization errors is presented.

C. Acknowledge Synchronization

To harden FSL against synchronization errors, the sensitivity of the handshake to transient faults has to be removed. For this purpose the rail synchronization technique [4] was adapted. The idea is to join two asynchronous channels and to ensure that each channel cannot memorize data without the presence of valid data in the other channel. Due to this interlocking, an error must affect both channels to propagate. A transient fault on one channel is blocked as long as the other channel does not contain valid data. In case the other channel already contains a valid token, the transient fault is memorized but will produce an invalid token. The circuit in [4] has been developed for standard QDI logic, which uses a 4-phase protocol and distinguishes valid and invalid tokens. The rail synchronization also depends on minor timing assumptions, since it assumes that the correct rail of the faulty channel has enough time to propagate to the storage element and to generate the invalid token before the erroneous value is acknowledged. Otherwise, a token error is produced.

FSL uses a 2-phase protocol and does not comprise invalid tokens. Further, an FSL register already comprises a synchronization mechanism at the input. In contrast to the method in [4], all input bits are synchronized and not only two adjacent ones. This property makes an FSL register largely immune against token errors triggered by transient faults at the register inputs, since the fault will only be memorized if all bits agree on the same phase. The wider the data bus, the higher this immunity. If the register comprises only one or two bits, there is no difference to the rail synchronization method.

The original FSL lacks a synchronization mechanism for the acknowledge signal. In our approach the register is split into two parts as depicted in Fig. 5. Each part provides an acknowledge signal that is joined at the preceding pipeline stage. The register will only accept new data if both acknowledge inputs have the same value; thus a transient fault occurring only on one of the two acknowledge lines will be blocked. We modified the registers in Fig. 3 accordingly and repeated the fault injection with the input data from Table I.

[Fig. 5: diagram of pipeline stages whose registers are split into two latch groups, each with its own Ctrl block and acknowledge signal (Ack1, Ack2).]

'R2'. One SHC is able to tolerate at least one fault, either internal or at its external interfaces.
It is possible to implement any circuit of arbitrary complexity as an SHC, ranging from low level gates (e.g. AND, OR, etc.) up to complex circuits (e.g. a complete processor). Designing a low level gate as an SHC provides a high degree of fault tolerance but also increases the relative hardware overhead. Implementing more complex functions as an SHC reduces that overhead but decreases the capability to mitigate faults. Each SHC can recover from at least one fault, but depending on the internal fault distribution there is a high probability that even multiple faults can be recovered.

The watchdog unit used for deadlock detection can only determine the faulty pipeline stage but not the faulty gate itself. Thus the configuration inputs are changed using a "trial-and-error" method. In our prototype circuit, the configuration pattern is generated by a counter. As soon as a configuration is found that resumes the circuit operation, it is assumed that the circuit has been repaired correctly. If the new configuration does not remove the deadlock, the reconfiguration process goes on. Note that for larger circuits a more sophisticated reconfiguration controller is required, as a simple counter would lead to long reconfiguration times.

Currently our approach considers only combinational circuits. However, a permanent fault or a bit-flip in a register behaves like a permanent fault that affects the interconnections between a register and the combinational logic [17], i.e. an external interface of an SHC, and such a fault can be mitigated by appropriate re-routing.

During a deadlock the input values of gates, pipelines, etc. are kept valid since the subsequent circuit stage cannot consume the token. Thus, after a working configuration has been found, the circuit autonomously continues its operation without loss or corruption of data. Notice that this ability is specific to QDI circuits.
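The counter-based trial-and-error search described above can be sketched as a simple loop. The Python function below is an illustration under stated assumptions, not the prototype's controller: `find_working_configuration` and the `is_deadlocked` callback (standing in for the watchdog observing the circuit) are hypothetical names, and the watchdog timing is ignored. The toy fault model in the usage part borrows the numbers of the simulation in Section VI-D (34 reconfiguration inputs, bits 8-11 must be set, start value 0x000000EFE).

```python
def find_working_configuration(is_deadlocked, n_config_bits, start=0):
    """Step a configuration counter until the deadlock disappears.

    is_deadlocked: callable taking a configuration word and returning True
        while the circuit stays halted (hypothetical stand-in for the watchdog).
    Returns (configuration, number_of_reconfiguration_requests), or
    (None, number_of_configurations) if no configuration resumes operation.
    """
    total = 1 << n_config_bits
    for requests in range(total):
        config = (start + requests) % total  # the counter wraps around
        if not is_deadlocked(config):
            return config, requests
    return None, total


if __name__ == "__main__":
    # Toy fault model (assumption): the circuit only runs when
    # configuration bits 8 to 11 are all set.
    def stuck(config):
        return ((config >> 8) & 0xF) != 0xF

    config, requests = find_working_configuration(stuck, 34, start=0xEFE)
    print(hex(config), requests)
```

With 34 configuration bits the worst case is 2^34 requests, each costing one watchdog timeout, which illustrates why a plain counter does not scale to larger circuits.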
[Fig. 5 caption: FSL Register with Acknowledge Synchronization]

TABLE II
4-BIT PIPELINE ADDER WITH TIME REDUNDANCY AND ACKNOWLEDGE SYNCHRONIZATION

                         Redundant calculation with N = 1
                         w/o ack sync     with ack sync
Faults injected          34230  100.0%    38220  100.0%
Mitigated by FSL         31916   93.2%    35554   93.0%
Errors                    2314    6.8%     2666    7.0%
  Deadlocks                533    1.6%     1218    3.2%
  Wrong data              1781    5.2%     1448    3.8%
Mitigated by time red.    1698    5.0%     1448    3.8%
Undetected Failures         83    0.2%        0    0.0%

As Table II shows, no undetected failures occurred, even with the simple time redundancy scheme N = 1. Note that the circuit under test was simulated with arbitrary propagation delays that have been set rather high. In particular, the skew between the individual bits of a register was selected artificially large to provoke token errors. A probability assessment of these types of errors in real circuits is currently ongoing.

B. Comparison of Different SHC Sizes

To compare the resource overhead versus fault tolerance of different SHC sizes/complexities, two implementation extremes of a 1-bit full adder have been designed. Circuit A is built from low level SHCs (Fig. 6a), as shown for a half adder in Fig. 6b. In circuit B (Fig. 6c), two complete full adders were implemented in the SHC, which basically act as hot-redundant circuits. Obviously, the fine-grained version A built from basic gates occupies more resources than the coarse-grained circuit B. In turn, circuit A shows a much higher degree of fault tolerance compared to circuit B, as will be shown later.

To reduce simulation time, the simulation was performed using a 1-bit full adder instead of the 4-bit adder. Since a 4-bit adder consists of cascaded 1-bit adders and faults on the external interfaces have been considered, the results obtained from the 1-bit adder are also valid for the 4-bit adder. The goal of the simulation was to prove that for any injected stuck-at fault a working configuration can be found.
Transient faults are handled by the concept described in Section V and thus have not been considered in this simulation. We randomly injected permanent stuck-at-0 and stuck-at-1 faults on all internal and external signals, including the reconfiguration inputs. Then we applied all valid input stimuli. Since we have three inputs (a, b, cin), this results in 8 combinations for each phase or 16 combinations in total. Each signal was subjected at least once to both stuck-at faults. Due to the storage elements in the FSL gates, the circuit behavior in the presence of errors depends on the history. To consider this dependency we performed five independent simulation runs for each fault configuration and took the mean values. In total we simulated 17600 fault conditions for circuit A and 22720 for circuit B.

The result of the adder (sum and carry) was compared to the expected value. The circuit was defined to be working if at least one of the two redundant outputs showed the correct result. If both outputs were wrong, the reconfiguration inputs were counted up until a correct result was obtained. The development of a decision algorithm to determine the correct output is currently ongoing.

The simulations revealed that for both circuits it is possible to repair each single fault, independent of its location. For multiple faults there is a high probability of repair; however, depending on the fault location in the circuit, this cannot be guaranteed. If, e.g., two faults affect both redundant paths within an SHC, the circuit will fail. The same applies if a signal and its associated reconfiguration input are affected by a permanent fault at the same time.

The summary of the simulations is presented in Table III. The number of signals was extracted from the behavioral design. To obtain the resource occupation of the two circuits, both designs were synthesized into a Xilinx Virtex-4. It can be seen that even with 11 simultaneously injected faults, which means that about 10% of the circuit's signals are defective, still about 54% (circuit A) and 37.5% (circuit B) of the fault constellations could be repaired. The resource overhead is approximately 40% higher with gate-level SHCs compared to the full adder implemented as a single SHC. However, as can be seen in Fig. 7, the gain in fault tolerance of circuit A compared to circuit B is also significant, in particular for a small number of faults, where the probability of multiple faults within the same SHC is low.

TABLE III
COMPARISON OF SHCS WITH DIFFERENT COMPLEXITY
Circuit A: 17600 fault conditions; Circuit B: 22720 fault conditions

                             circuit A    circuit B    A/B
                             gate SHC     FA SHC       comparison
number of signals            142          110          +29.1%
number of reconfig. inputs   10           2            +400.0%
equivalent gate count        580          412          +40.8%
failed with 1 fault          0.0%         0.0%         -0.0%
failed with 2 faults         2.1%         5.0%         -57.2%
failed with 3 faults         5.5%         14.3%        -61.9%
failed with 4 faults         11.0%        22.8%        -51.8%
failed with 5 faults         16.0%        30.5%        -47.4%
failed with 6 faults         22.0%        39.0%        -43.7%
failed with 7 faults         26.8%        45.2%        -40.6%
failed with 8 faults         32.5%        49.9%        -34.9%
failed with 9 faults         38.1%        54.7%        -30.4%
failed with 10 faults        42.3%        59.2%        -28.5%
failed with 11 faults        46.0%        62.5%        -26.5%

[Fig. 6: Self-Healing Cells of Different Complexity. (a) Self-Healing Basic Gate: two FSL AND gates (FSL AND 1, FSL AND 2) with nominal and redundant inputs and outputs and reconfiguration inputs R1, R2. (b) Self-Healing Half-Adder: built from an SH-AND and an SH-XOR cell with reconfiguration inputs R1 to R4. (c) Complex Self-Healing Cell (Full-Adder): two FSL full adders (FSL FA 1, FSL FA 2) with reconfiguration inputs R1, R2.]

[Fig. 7: Fail-Cases and Gain of Fault-Tolerance vs. Number of Faults. x-axis: number of simultaneously injected faults (1 to 11); y-axis: percentage; curves: Failed Conditions Full-Adder SHC, Failed Conditions Gate-Level SHC, Gain of Fault-Tolerance.]

VI. PERMANENT ERROR RECOVERY

For deadlock detection a watchdog unit is added as shown in Fig. 2. It monitors the phase detectors and triggers a reconfiguration if they do not change their state within a defined time interval. Moreover, the states of the phase detectors can be used to gain information about the fault location. The watchdog timing can be chosen several orders of magnitude higher than the circuit processing time. Thus, this timing assumption is no practical limitation to the QDI concept.

A. Self-Healing Cells

To recover from deadlocks we introduced Self-Healing Cells (SHCs). An SHC is an internally redundant circuit, such as the 2-input FSL AND gate in Fig. 6a, which allows defective circuit parts to be bypassed by re-routing the internal signals. The routing is controlled by the reconfiguration inputs 'R1' and

be used in a real application as the expected output is unknown. However, this information is not required within our approach: if a signal in the nominal or redundant data path is affected by a fault, the output generated from this signal may (i) carry the right phase encoding but a faulty logical value ("high" instead of "low", e.g.), or (ii) carry a wrong phase encoding (ϕ1 instead of ϕ0, e.g.). The first case is detected by the redundant calculation process described previously and does not require a reconfiguration. In the second case the circuit automatically stops its execution because a combinational gate, or at least the register at the end of the data path in question, will never get a consistent input. A deadlock occurs and the watchdog unit triggers the reconfiguration process. The latter reconfigures the circuit so that the affected signal is excluded from the data path. Subsequently the circuit will continue its execution. Thus the identification of a working configuration also does not require any additional means: after selecting a correct configuration the circuit starts to work autonomously, otherwise the next configuration is applied.

D.
Simulation of a Deadlock Recovery

A simple implementation of the deadlock recovery is shown in Fig. 8. The watchdog counter is reset by the phase detectors in the FSL registers. If it wraps around, a new configuration is requested by asserting Req. The reconfiguration unit comprises a counter that is incremented with each request. After the new configuration has been applied, the Ack signal is asserted, which resets the deadlock detector. If the configuration was successful, the circuit's operation continues, preventing any further requests. If not, a new setting will be requested after the deadlock timeout has expired, until a working configuration is found.

[Fig. 8: Deadlock Detection and Reconfiguration. Pipeline of registers (Reg) and SHC-based logic f(x) with nominal (nom) and redundant (red) data paths; a Deadlock Detector issues Req to the Reconfiguration Unit, which answers with Ack.]

Permanent faults were applied to the 4-bit ripple carry adder in Fig. 3, but now each basic gate of the adder is constructed as an SHC. The circuit comprises 34 reconfiguration inputs, and initializes all SHCs to their nominal input. Fig. 9 shows the simulation of a permanent fault injected into the carry bit calculation of the LSB.

[Fig. 9: Reconfiguration Simulation. Waveforms of the nominal (NOM) and redundant (RED) operands datax, datay and the result dataout, the rails a1, b1, c1 of gate HA0_AND2_NOM and a2, b2, c2 of gate HA0_AND2_RED, the deadlock detector output, and the reconfiguration word r stepping through 000000EFE, 000000EFF and 000000F00.]

The inputs of gate HA0_AND2_NOM are a1 = 00 and b1 = 11, which nominally results in an output of c1 = 00. Due to the permanent fault, the output is stuck at c1 = 10, which generates a wrong phase and produces a deadlock. Examining the reconfiguration inputs shows that bits 8-11 have to be set to logic 1 to select the redundant carry bit. To save time, the simulation starts with a reconfiguration setting of 0x000000EFE. The circuit is halted due to the permanent fault and the deadlock unit starts generating requests for the reconfiguration unit. When the reconfiguration input is set to 0x000000F00, the redundant carry bit is selected, which holds the correct value c1 = 00, and the circuit resumes its operation, which is indicated by the activity on the data lines.

C. Reconfiguration Control

To evaluate whether the fault can be repaired, we compared the output with the expected value. This method cannot

VII. CONCLUSION

This paper illustrates a self-healing approach for asynchronous circuits based on FSL. It combines hardware and time redundancy and is able to tolerate transient as well as multiple permanent faults. Various fault injection simulations have shown that transient faults are either mitigated by the inherent fault tolerant properties of FSL, end up in a deadlock or can be detected via time redundancy. It has been shown that all permanent single faults can be repaired by designing the basic circuit elements as self-healing cells. Further, this concept provides a high probability to recover even from multiple faults.

REFERENCES

[1] A. J. McAuley, "Four state asynchronous architectures," IEEE Transactions on Computers, vol. 41, no. 2, pp. 129–142, February 1992.
[2] W. Friesenbichler, T. Panhofer, and M. Delvai, "Improving fault tolerance by using reconfigurable asynchronous circuits," in Proceedings of the 11th Workshop of Design and Diagnostics of Electric Circuits and Systems (DDECS'08), March 2008, pp. 267–270.
[3] W. Jang and A. J. Martin, "SEU-tolerant QDI circuits," in Proceedings of the 11th IEEE International Symposium on Asynchronous Systems & Circuits (ASYNC), 2005, pp. 156–165.
[4] Y. Monnet, M. Renaudin, and R. Leveugle, "Hardening techniques against transient faults for asynchronous circuits," in Proceedings of the 11th IEEE International On-Line Testing Symposium (IOLTS'05), 2005, pp. 129–134.
[5] C. LaFrieda and R. Manohar, "Fault detection and isolation techniques for quasi delay-insensitive circuits," in DSN '04: Proceedings of the 2004 International Conference on Dependable Systems and Networks (DSN'04). Washington, DC, USA: IEEE Computer Society, 2004, p. 41.
[6] S. Peng and R. Manohar, "Efficient failure detection in pipelined asynchronous circuits," in Proceedings of the 2005 20th IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems (DFT'05), 2005, pp. 484–493.
[7] S. Peng and R. Manohar, "Fault tolerant asynchronous adder through dynamic self-reconfiguration," in ICCD '05: Proceedings of the 2005 International Conference on Computer Design, 2005, pp. 171–179.
[8] R. S. Oreifej, R. N. Al-Haddad, H. Tan, and R. F. DeMara, "Layered approach to intrinsic evolvable hardware using direct bitstream manipulation of Virtex II pro devices," in Proceedings of the 17th International Conference on Field Programmable Logic and Applications, 2007.
[9] S. Raaijmakers and S. Wong, "Run-time partial reconfiguration for removal, placement and routing on the Virtex-II pro," in Proceedings of the 17th International Conference on Field Programmable Logic and Applications, 2007.
[10] S. A. Guccione, D. Levi, and P. Sundararajan, "JBits: A Java-based interface to FPGA hardware," in Proceedings of the 2nd Annual Military and Aerospace Applications of Programmable Devices and Technologies Conference (MAPLD), 1999.
[11] W.-J. Huang and E. J. McCluskey, "Column-based precompiled configuration techniques for FPGA fault tolerance," in Proceedings of the 9th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM'01), 2001.
[12] J. Sparso and S. Furber, Eds., Principles of Asynchronous Circuit Design - A Systems Perspective. Kluwer Academic Publishers, 2001.
[13] M. Delvai and A. Steininger, "Asynchronous logic design - from concepts to implementation," The 3rd International Conference on Cybernetics and Information Technologies, Systems and Applications - Volume 1, Jul. 2006.
[14] W. Huber, "Design of an asynchronous processor based on code alternation logic - exploration of delay insensitivity," Ph.D. dissertation, Vienna University of Technology, 2005.
[15] I. E. Sutherland, "Micropipelines," Communications of the ACM, vol. 32, no. 6, pp. 720–738, 1989.
[16] S. Piestrak and T. Nanya, "Towards totally self-checking delay-insensitive systems," Fault-Tolerant Computing, 1995. FTCS-25. Digest of Papers., Twenty-Fifth International Symposium on, pp. 228–237, 27-30 Jun 1995.
[17] O. A. Petlin and S. B. Furber, "Built-in self-testing of micropipelines," in Proceedings of the Third International Symposium on Advanced Research in Asynchronous Circuits and Systems (ASYNC '97), April 1997.