Bounding Error Masking in Linear Output Space Compression Schemes

Steffen Tarnick
Max Planck Society, Fault-Tolerant Computing Group at the University of Potsdam, Germany

Abstract

Based on the principle of linear output space compression, we present a design method for concurrent checkers such that the masking probability of errors caused by faults of a given set of circuit faults is below a given bound, while keeping the space compression ratio, defined as the ratio of the number of circuit outputs to the number of outputs of the space compressor, as high as possible. Experiments performed on the ISCAS-85 benchmark circuits show that the compression ratios achieved with compression functions computed with this method can be very high, even for very low bounds for the error masking probability and large fault sets.

1 Introduction

Output space compression (OSC) [3, 4, 6] is a simple method to design self-testing or partially self-checking [7] circuits. The basic principle of OSC is shown in Fig. 1. A space compressor SC maps the $n$-dimensional output space of the circuit under check (CUC) to a $k$-dimensional space, $k < n$. The predictor P is designed to generate the same output responses as the compressor. The two responses of the compressor and the predictor are then compared by a comparator C. If the two responses are different, an error is indicated by the comparator. CUC and predictor together form a new circuit CUC', the outputs of which are now encoded as a systematic code. The compressor and the comparator together form a new concurrent checker CC, which is a conventional systematic code checker.

A special kind of OSC is the linear OSC, where the compressor simply consists of XOR-trees. In a linear OSC scheme the output responses of CUC' are therefore encoded as words of a linear code.

In general, concurrent checking schemes are designed to detect certain types of errors (e.g.
single errors, double errors, unidirectional errors) or all or a high percentage of single stuck-at 0/1 faults. In [6] a linear OSC scheme was presented that is tailored to a predefined set of arbitrary faults. This scheme guarantees that each fault of the given set will be detected, i.e., at least one error caused by each fault will be propagated to the outputs of the compressor. However, this scheme does not allow the quality of the error detection to be evaluated. It is therefore useful to introduce a quality measure for the error detection. As a measure of the error detection quality we consider the minimum error detection probability with respect to a fault set, defined as the minimum, over all faults of the given fault set, of the ratio of the number of errors caused by a fault that are detected by the checker to the total number of errors caused by that fault.

In this paper we derive a method for computing the compression function of the OSC scheme so as to achieve a given minimum error detection probability for a given fault set, while keeping the compression ratio as high as possible, i.e., the number of outputs (or XOR-trees) of the compressor as low as possible. First we show how to compute the masking probability of errors caused by a given fault with respect to a fixed compression function. In the second part of the paper a procedure is derived to compute a compression function with a desired minimum error detection probability for a set of faults. This problem can be mapped to a set covering problem. To solve it we use a simple greedy strategy, combined with a threshold accepting algorithm [2]. The derived algorithm is evaluated on the ISCAS-85 benchmark circuits [1]. Experimental results show the effectiveness of the developed procedure in terms of high output space compression ratios.

[Fig. 1. General OSC Scheme. The CUC (input $x$ of width $m$, output $y$ of width $n$) feeds the space compressor SC (output $z$ of width $k$); the predictor P produces $z'$ of width $k$; the comparator C compares $z$ and $z'$ and provides the error indication. CUC and P form CUC'; SC and C form CC.]

2 Definitions and Problem Statement

Let $F : X \to Y$, $X \subseteq B^m$, $Y \subseteq B^n$, $B = \{0, 1\}$, be a Boolean function realised by the CUC, and $F_i : X \to B$ the function realised at output $i$ of the circuit. Let $\Phi$ denote a set of physical faults in the circuit. For our purposes we consider a functional fault model, i.e., each fault $\varphi \in \Phi$ is represented as a faulty circuit function, denoted by $F(x, \varphi)$. The fault-free circuit function is denoted by $F(x, \emptyset)$.

Definition 1 The space compression function is a Boolean function $h : Y \to Z$ that maps the circuit output space $Y \subseteq B^n$ to the space $Z \subseteq B^k$, $k < n$.

A linear space compression function can be expressed by an $n \times k$ matrix $C = (c_{ij})$, $c_{ij} \in B$, which is a submatrix of the generator matrix of the linear code. Therefore, the space compressor SC is fully defined by the matrix $C$, and we have $z = C^T y$, $z \in Z$, $y \in Y$. The dimension $\dim(Z)$ of the space $Z$ corresponds to the number of outputs of the space compressor SC, and therefore to the number of outputs of the predictor P. Under the hypothesis that the complexity of the predictor function and the area required by the predictor P (and also by the whole scheme) mainly depend on the number of outputs of P, the goal is to minimize the dimension of $Z$.

Problem Statement: Given a combinational circuit CUC and a fault set $\Phi$, find a linear mapping $h : Y \to Z$ with $\dim(Z) = \min$ such that the masking probability of an error caused by a fault $\varphi \in \Phi$ does not exceed a given bound $1 - P_{\min}$, i.e., the error is detected with a probability $P \geq P_{\min}$.

Definition 2 [6] Two circuit outputs $F_i$ and $F_j$ are called weakly independent with respect to a fault $\varphi$ if there is an input $x$ such that
$$F_i(x, \varphi) \oplus F_j(x, \varphi) \neq F_i(x, \emptyset) \oplus F_j(x, \emptyset).$$
If two circuit outputs are not weakly independent with respect to $\varphi$ then they are said to be dependent with respect to $\varphi$. This definition can easily be extended to a set of circuit outputs [6].
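As an illustration of Definitions 1 and 2, the following minimal Python sketch applies a linear compressor $z = C^T y$ over GF(2) and checks weak independence by exhaustive enumeration. The two-input circuit, the stuck-at fault, the matrix `C`, and the helper names `compress` and `weakly_independent` are assumptions made for this example only; they are not taken from the paper.

```python
from itertools import product

def compress(C, y):
    """Return z = C^T y over GF(2); C is an n x k matrix given as a list of 0/1 rows."""
    n, k = len(C), len(C[0])
    return tuple(sum(C[i][j] & y[i] for i in range(n)) % 2 for j in range(k))

def weakly_independent(good, faulty, i, j, m):
    """True if some input x changes the parity F_i xor F_j under the fault."""
    for x in product((0, 1), repeat=m):
        g, f = good(x), faulty(x)
        if (f[i] ^ f[j]) != (g[i] ^ g[j]):
            return True
    return False

# Hypothetical fault-free circuit with inputs (a, b) and three outputs.
def good(x):
    a, b = x
    return (a & b, (1 ^ a) & b, a ^ b)

# Hypothetical fault: input a stuck-at 1.
def faulty(x):
    return good((1, x[1]))

# Two XOR-trees: z1 = y1 xor y2, z2 = y3 (n = 3, k = 2).
C = [[1, 0],
     [1, 0],
     [0, 1]]

print(compress(C, (1, 1, 0)))                   # same result as for (0, 0, 0)
print(weakly_independent(good, faulty, 0, 1, 2))
print(weakly_independent(good, faulty, 0, 2, 2))
```

In this toy circuit the first two outputs are dependent with respect to the assumed fault, so an XOR-tree over exactly these two outputs would never observe it, whereas the first and third outputs are weakly independent.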
If a set of outputs is weakly independent with respect to a fault $\varphi$ then there is at least one input $x \in X$ such that the fault causes an error in the parity of this set of outputs.

Definition 3 Let $F_i$ be a circuit output and $\varphi$ a fault in the circuit. The error function of $F_i$ with respect to $\varphi$ is defined as
$$\varepsilon(F_i, \varphi) = F_i(x, \varphi) \oplus F_i(x, \emptyset).$$

As can easily be seen, a set $H$ of circuit outputs is weakly independent with respect to a fault $\varphi$ if $\bigoplus_{F_k \in H} \varepsilon(F_k, \varphi) \neq 0$. Let $H$ be an additional circuit output defined by XORing the set $H$ of circuit outputs (we will use the notation $H$ for a set of circuit outputs and for the output defined by XORing the circuit outputs $F_k \in H$ interchangeably). The property of a set $H$ of circuit outputs to be weakly independent with respect to a fault $\varphi$ indicates that $\varphi$ can be detected at the output $H$. However, weak independence says nothing about the quality of the fault detection. It is therefore sensible to introduce a measure that quantifies the weak independence of circuit outputs.

Definition 4 The satisfying set $X_i$ of $\varepsilon(F_i, \varphi)$ is the set of all inputs $x$ that can detect the fault $\varphi$ at the output $F_i$:
$$X_i = \{x \mid \varepsilon(F_i, \varphi) = 1\}.$$

Definition 5 The degree of dependence of two circuit outputs $F_i$ and $F_j$ with respect to a fault $\varphi$ is defined as
$$\varrho_\varphi(F_i, F_j) = \frac{|X_i \cap X_j|}{|X_i \cup X_j|} = 1 - \frac{|X_i \,\triangle\, X_j|}{|X_i \cup X_j|},$$
where $\triangle$ denotes the symmetric difference of two sets. We define two circuit outputs at which a fault $\varphi$ cannot be detected to be dependent: $X_i = X_j = \emptyset \Rightarrow \varrho_\varphi(F_i, F_j) := 1$. The degree of dependence of a set $H$ of circuit outputs with respect to $\varphi$ is
$$\varrho_\varphi(H) = 1 - \frac{\left|\mathop{\triangle}\limits_{F_k \in H} X_k\right|}{\left|\bigcup_{F_k \in H} X_k\right|}.$$
The degree of dependence is the probability that an error caused by $\varphi$ will not be detected at the output $H$ under the condition that it causes an error at at least one output $F_i \in H$.

3 Error Masking

If we perform a linear compression of the output space of the functional circuit, we map the $n$-dimensional output space $Y$ of the circuit to a $k$-dimensional space $Z$, $k \leq n$.
For $k < n$ there will be erroneous output vectors that are mapped to the corresponding correct vector in $Z$. The effect that an erroneous circuit response $y' = y \oplus e$ is mapped to the same vector $z \in Z$ as the fault-free response $y$, $h(y') = h(y) = z$, is called masking of the error $e$. The goal of this section is to calculate the masking probability of an error caused by a given fault $\varphi$ for a given linear mapping $h$.

Definition 6 The probability that a fault $\varphi$ can be detected at the output $F_i$ under the condition that $\varphi$ causes an error is denoted
$$P_{\mathrm{detect}}(F_i, \varphi) := \frac{|X_i|}{\left|\bigcup_{k=1}^{n} X_k\right|}.$$

For the computation of the error masking probability for the mapping $h$ we assume that we know the probability $P_{\mathrm{detect}}(F_i, \varphi)$ for each circuit output $F_i$ and the degree of dependence $\varrho_\varphi(F_i, F_j)$ for each pair of outputs. From these values we have to compute the respective probabilities of the outputs in $Z$. We first consider two outputs $F_i$ and $F_j$. The output $F_i \oplus F_j$ has the error detection probability
$$P_{\mathrm{detect}}(F_i \oplus F_j, \varphi) = \frac{|X_i \,\triangle\, X_j|}{\left|\bigcup_{k=1}^{n} X_k\right|}.$$
Using set operations, namely $|X_i| + |X_j| = |X_i \cup X_j| + |X_i \cap X_j|$ and $|X_i \,\triangle\, X_j| = |X_i \cup X_j| - |X_i \cap X_j|$, we obtain
$$P_{\mathrm{detect}}(F_i \oplus F_j, \varphi) = \frac{1 - \varrho_\varphi(F_i, F_j)}{1 + \varrho_\varphi(F_i, F_j)} \left( P_{\mathrm{detect}}(F_i, \varphi) + P_{\mathrm{detect}}(F_j, \varphi) \right).$$
In a similar way we can compute the probability $P_{\mathrm{detect}}(F_i, F_j, \varphi)$ that $\varphi$ is detected at at least one of the outputs $F_i$ and $F_j$:
$$P_{\mathrm{detect}}(F_i, F_j, \varphi) = \frac{1}{1 - \varrho_\varphi(F_i, F_j)} \, P_{\mathrm{detect}}(F_i \oplus F_j, \varphi).$$
In the same way we can compute the probability $P_{\mathrm{detect}}(H_1, \ldots, H_k, \varphi)$, $H_i = F_{i_1} \oplus \cdots \oplus F_{i_j}$, if we know the respective values of $P_{\mathrm{detect}}$ and $\varrho_\varphi$.

Definition 7 The masking probability of an error caused by a fault $\varphi$ for a given compression function $h$ is given by
$$P_{\mathrm{mask}}(H_1, \ldots, H_k, \varphi) = 1 - P_{\mathrm{detect}}(H_1, \ldots, H_k, \varphi),$$
where $h$ is determined by the XOR-trees $H_i$, $i = 1, \ldots, k$.

However, in general it is not possible to compute the values of $P_{\mathrm{detect}}$ for each combination of circuit outputs from the error detection probabilities of single outputs and the dependence degrees of each output pair alone.
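The relations between the satisfying sets, the degree of dependence, and the detection probabilities can be checked numerically. The sketch below is only an illustration: the three satisfying sets are made-up values (inputs are represented by integers), and the helper names `rho` and `p_detect` are hypothetical. Exact fractions are used so that the two identities of this section hold without rounding.

```python
from fractions import Fraction

def rho(Xi, Xj):
    """Degree of dependence of two outputs, given their satisfying sets."""
    union = Xi | Xj
    if not union:                      # fault detectable at neither output
        return Fraction(1)
    return Fraction(len(Xi & Xj), len(union))

def p_detect(Xh, sets):
    """|Xh| divided by the number of inputs that cause an error at any output."""
    total = set().union(*sets)
    return Fraction(len(Xh), len(total))

# Hypothetical satisfying sets X_1, X_2, X_3 of a circuit with n = 3 outputs.
sets = [{0, 1, 2, 3}, {2, 3, 4}, {7}]
X1, X2 = sets[0], sets[1]

r = rho(X1, X2)
p1, p2 = p_detect(X1, sets), p_detect(X2, sets)
p_xor = p_detect(X1 ^ X2, sets)        # detection at the output F1 xor F2
p_any = p_detect(X1 | X2, sets)        # detection at F1 or at F2
p_mask = 1 - p_any                     # masking if the XOR-trees are H1 = F1, H2 = F2

print(r, p1, p2, p_xor, p_any, p_mask)
print(p_xor == (1 - r) / (1 + r) * (p1 + p2))
print(p_any == p_xor / (1 - r))
```

For these sets, an error is masked only for the single input that affects the third output alone, which is the case the two chosen XOR-trees do not observe.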
It can be shown that all outputs form an Abelian group $G$ defined by the fault $\varphi$ (linear combinations of outputs are considered as outputs too). The subset of outputs at which $\varphi$ cannot be detected forms a subgroup $S_0$ of $G$. The elements of each coset of $S_0$ have the same error function and therefore the same satisfying set. Thus, the outputs of the same coset have the same error detection probability, and the degree of dependence of two outputs $F_i$ and $F_j$ only depends on the cosets they belong to. It is therefore sufficient to know the error detection probability of only one element of each coset of $S_0$ and the degree of dependence between two representatives of each pair of cosets of $S_0$.

Example 1 Consider the circuit of Fig. 2. Let $\varphi_1$ be a stuck-at 1 fault at the input $x_3$. The subgroup $S_0$ and the cosets $S_1$, $S_2$, and $S_3$ defined by $\varphi_1$ are given below.
$$S_0 = \{0,\; F_4,\; F_1 \oplus F_2,\; F_1 \oplus F_2 \oplus F_4\}$$
$$S_1 = \{F_1,\; F_2,\; F_1 \oplus F_4,\; F_2 \oplus F_4\}$$
$$S_2 = \{F_3,\; F_3 \oplus F_4,\; F_1 \oplus F_2 \oplus F_3,\; F_1 \oplus F_2 \oplus F_3 \oplus F_4\}$$
$$S_3 = \{F_1 \oplus F_3,\; F_2 \oplus F_3,\; F_2 \oplus F_3 \oplus F_4,\; F_1 \oplus F_3 \oplus F_4\}$$

[Fig. 2. Example Circuit (inputs $x_1, \ldots, x_4$; internal nodes $f_1$, $f_2$, $f_3$; outputs $F_1, \ldots, F_4$).]

It is, however, not always possible to say to which coset an output (or a linear combination of outputs) belongs. In those cases we have to compute the probabilities $P_{\mathrm{detect}}$ in the same way as for a single circuit output, for example through fault simulation. But with each computation of $P_{\mathrm{detect}}$ for an output (or a combination of outputs) we compute this probability for an entire coset of equivalent outputs.

              H1  H2  H3  H4  H5  H6  H7  H8  H9  H10 H11 H12 H13 H14 H15
$\varphi_1$    1   1   0   1   1   1   1   0   1   1   0   1   1   1   1
$\varphi_2$    0   0   0   1   1   1   1   1   1   1   1   1   1   1   1
$\varphi_3$    1   1   1   0   1   1   1   1   1   1   1   1   1   1   1
$\varphi_4$    0   0   0   1   1   1   1   0   0   0   0   1   1   1   1
$\varphi_5$    1   1   1   1   1   1   1   1   1   1   1   1   1   1   1

Table 1. Matrix $M_1$ for the circuit of Fig. 2.
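The coset structure of Example 1 can be reproduced with a few lines of Python. Since the circuit of Fig. 2 is not available here, the error functions below are hypothetical bit masks over the 16 input patterns, chosen only so that $F_1$ and $F_2$ share an error function and $F_4$ is unaffected, as $S_0$ in Example 1 implies. The function name `cosets` and the encoding of a subset $H_j$ by the integer $j$ (bit $k-1$ stands for $F_k$) are assumptions of this sketch.

```python
from collections import defaultdict

# Hypothetical error functions of F1..F4 for one fault, as 16-bit masks:
# bit x is set if the fault changes the output for input pattern x.
err = {
    1: 0b0000111100000000,
    2: 0b0000111100000000,   # same error function as F1
    3: 0b0000000011110000,
    4: 0,                    # fault never visible at F4
}

def cosets(err, n):
    """Group all 2^n output combinations by the error function of their XOR."""
    groups = defaultdict(list)
    for j in range(2 ** n):
        e = 0
        for k in range(1, n + 1):
            if (j >> (k - 1)) & 1:
                e ^= err[k]          # error function of an XOR is the XOR of error functions
        groups[e].append(j)
    return dict(groups)

groups = cosets(err, 4)
for e, members in groups.items():
    print(f"{e:016b}", members)
```

The class with error function zero is the subgroup $S_0$; with these assumed masks it contains the encodings 0, 3, 8, and 11, i.e. $0$, $F_1 \oplus F_2$, $F_4$, and $F_1 \oplus F_2 \oplus F_4$. Evaluating one representative per class is then enough to know $P_{\mathrm{detect}}$ for all of its members.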
              H1    H2    H3    H4    H5    H6    H7    H8    H9    H10   H11   H12   H13   H14   H15
$\varphi_1$   0.66  0.66  0.00  0.66  0.66  0.66  0.66  0.00  0.66  0.66  0.00  0.66  0.66  0.66  0.66
$\varphi_2$   0.00  0.00  0.00  0.80  0.80  0.80  0.80  0.29  0.29  0.29  0.29  0.80  0.80  0.80  0.80
$\varphi_3$   0.25  0.75  0.80  0.00  0.25  0.75  0.80  0.75  0.75  0.25  0.25  0.75  0.75  0.25  0.25
$\varphi_4$   0.00  0.00  0.00  0.80  0.80  0.80  0.80  0.00  0.00  0.00  0.00  0.80  0.80  0.80  0.80
$\varphi_5$   0.14  0.43  0.57  0.43  0.57  0.43  0.57  0.43  0.43  0.80  0.80  0.57  0.57  0.57  0.57
$b$           1.05  1.84  1.37  2.69  3.08  3.44  3.63  1.47  2.13  2.00  1.34  3.58  3.58  3.08  3.08

Table 2. Matrix $M_2$ ($P_{\min} = 0.8$).

4 Algorithm

In this section we show how to compute a linear mapping $h : Y \to Z$ with $\dim(Z) = \min$ such that for each fault $\varphi \in \Phi$ an error caused by $\varphi$ will be detected with a probability $P \geq P_{\min}$. We derive the algorithm step by step with the help of Example 1.

Example 1 (cont.) We consider the following faults: $\varphi_1$ (as above), $\varphi_2$: $x_4$ stuck-at 0, $\varphi_3$: $x_1$ stuck-at 0, $\varphi_4$: $F_3$ stuck-at 1, and the double fault $\varphi_5$: $x_1$ stuck-at 0 / $f_2$ stuck-at 1. Our goal is to find a mapping $h$ such that an error caused by a fault $\varphi \in \Phi = \{\varphi_1, \ldots, \varphi_5\}$ is detected with probability $P \geq P_{\min} = 0.8$.

As a first step we construct an $s \times t$ matrix $M_1$, where $s = |\Phi|$ and $t = 2^n - 1$. The number $i$ of a row of $M_1$ corresponds to the index of $\varphi_i \in \Phi$. The number $j$ of a column of $M_1$ corresponds to the decimal encoded subset $H_j$ of outputs. If, for example, $n = 4$, then $j = 10$ is the decimal equivalent of $(F_4, F_3, F_2, F_1) = (1, 0, 1, 0)$, which indicates that $H_{10} = \{F_2, F_4\}$. The matrix $M_1 = (m_{ij})$ is determined by
$$m_{ij} = \begin{cases} 1, & \text{if } \varepsilon(H_j, \varphi_i) = \bigoplus_{F_k \in H_j} \varepsilon(F_k, \varphi_i) \neq 0 \\ 0, & \text{otherwise.} \end{cases}$$
For the set $\Phi$ the matrix $M_1$ is shown in Table 1. For $P > 0$ a linear mapping $h$ with $\dim(Z) = \min$ is obtained by computing a minimal column cover of $M_1$. A simple greedy strategy is to take at each step the matrix column that covers the most rows not yet covered, until a complete cover is found. In Table 1 we can see that there are 7 single columns that already form a cover.

For our purposes we have to modify this algorithm slightly.
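The plain greedy column cover described above (before the modification introduced next) might be sketched as follows. The matrix is $M_1$ from Table 1; the function name `greedy_cover` and the tie-breaking rule (the lowest column index wins) are assumptions of this sketch, not part of the paper.

```python
# Matrix M1 from Table 1: rows phi_1..phi_5, columns H1..H15.
M1 = [
    [1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1],
    [0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
]

def greedy_cover(M):
    """Pick, at each step, the column covering the most still-uncovered rows."""
    uncovered = set(range(len(M)))
    chosen = []
    while uncovered:
        best = max(range(len(M[0])),
                   key=lambda j: sum(M[i][j] for i in uncovered))
        gained = {i for i in uncovered if M[i][best]}
        if not gained:
            raise ValueError("some rows cannot be covered by any column")
        chosen.append(best + 1)          # report 1-based column numbers H_j
        uncovered -= gained
    return chosen

# Columns that cover every fault on their own.
full = [j + 1 for j in range(15) if all(row[j] for row in M1)]

print(greedy_cover(M1))
print(full)
```

For Table 1 a single column already covers all five faults, so the greedy loop stops after one step; the list `full` contains the seven single-column covers mentioned in the text.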
First we replace each matrix element by
$$m_{ij} = \begin{cases} P_{\mathrm{detect}}(H_j, \varphi_i), & \text{if } P_{\mathrm{detect}}(H_j, \varphi_i) < P_{\min} \\ P_{\min}, & \text{otherwise} \end{cases}$$
and obtain the matrix $M_2$. For each column $m_j$ of $M_2$ we can compute a column weight $b_j = \sum_{i=1}^{|\Phi|} m_{ij}$ that characterizes the fault detection quality of the respective linear combination of outputs. Table 2 shows the matrix $M_2$ together with the column weights $b$. The next step of the algorithm consists of choosing the matrix column with the largest value $b$. In matrix $M_2$ this is column 7 with $b_7 = 3.63$. Therefore the first XOR-tree involves the set $H_7$. Errors caused by the faults $\varphi_2$, $\varphi_3$, and $\varphi_4$ are already detected with the required probability $P_{\min} = 0.8$. The required error detection probability for $\varphi_1$ and $\varphi_5$ is still not achieved by $H_7$. Therefore we have to choose the next column $i$ with the best value $b_i$. Because $H_7$ now belongs to the column cover, we have to recompute the matrix values and column weights. The fault detection probability for $\varphi_k$ with $H_j$ is now determined by $P_{\mathrm{detect}}(H_7, H_j, \varphi_k)$. The new matrix $M_3$ is shown in Table 3. The column weight $b_i = 4.0 = P_{\min} \cdot |\Phi|$ (reached, for example, by column 10) indicates that $C = \{H_7, H_i\}$ is a minimum column cover such that each fault of $\Phi$ is detected with a probability $P \geq P_{\min}$. As a result, the smallest possible dimension of $Z$ such that an error caused by a fault $\varphi \in \Phi$ is detected with probability $P \geq P_{\min} = 0.8$ is $\dim(Z) = 2$. The algorithm for the computation of a mapping $h : Y \to Z$ with $\dim(Z) = \min$ is summarized below.

              H1    H2    H3    H4    H5    H6    H7    H8    H9    H10   H11   H12
$\varphi_1$   0.80  0.80  0.66  0.66  0.80  0.80  0.66  0.66  0.80  0.80  0.66  0.66
$\varphi_2$   0.80  0.80  0.80  0.80  0.80  0.80  0.80  0.80  0.80  0.80  0.80  0.80
$\varphi_3$   0.80  0.80  0.80  0.80  0.80  0.80  0.80  0.80  0.80  0.80  0.80  0.80
$\varphi_4$   0.80  0.80  0.80  0.80  0.80  0.80  0.80  0.80  0.80  0.80  0.80  0.80
$\varphi_5$   0.57  0.79  0.79  0.79  0.79  0.57  0.57  0.79  0.79  0.80  0.80  0.80
$b$           3.77  3.99  3.85  3.85  3.99  3.77  3.63  3.85  3.99  4.00  3.86  3.86

Table 3. Matrix $M_3$ (values $m_{ij}$ for $\{H_7, H_i\}$), columns H1 to H12.
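The first selection step of the example can be retraced with the values of Table 2. The sketch below caps each entry at $P_{\min}$, sums the columns, and picks the heaviest one by exhaustive enumeration, which is only practical for a small number of outputs. The names `column_weight`, `table2`, and `missing` are hypothetical; the entries are the (already capped) values printed in Table 2, so entries equal to 0.80 may stand for larger true probabilities.

```python
P_MIN = 0.8

# Columns H1..H15 of Table 2, each listing the entries for phi_1..phi_5.
table2 = {
    1:  [0.66, 0.00, 0.25, 0.00, 0.14],
    2:  [0.66, 0.00, 0.75, 0.00, 0.43],
    3:  [0.00, 0.00, 0.80, 0.00, 0.57],
    4:  [0.66, 0.80, 0.00, 0.80, 0.43],
    5:  [0.66, 0.80, 0.25, 0.80, 0.57],
    6:  [0.66, 0.80, 0.75, 0.80, 0.43],
    7:  [0.66, 0.80, 0.80, 0.80, 0.57],
    8:  [0.00, 0.29, 0.75, 0.00, 0.43],
    9:  [0.66, 0.29, 0.75, 0.00, 0.43],
    10: [0.66, 0.29, 0.25, 0.00, 0.80],
    11: [0.00, 0.29, 0.25, 0.00, 0.80],
    12: [0.66, 0.80, 0.75, 0.80, 0.57],
    13: [0.66, 0.80, 0.75, 0.80, 0.57],
    14: [0.66, 0.80, 0.25, 0.80, 0.57],
    15: [0.66, 0.80, 0.25, 0.80, 0.57],
}

def column_weight(column, p_min):
    """Column weight b_j: sum of the detection probabilities capped at p_min."""
    return sum(min(p, p_min) for p in column)

weights = {j: column_weight(col, P_MIN) for j, col in table2.items()}
best = max(weights, key=weights.get)

# Faults (1-based indices) that the chosen column does not yet detect well enough.
missing = [i + 1 for i, p in enumerate(table2[best]) if p < P_MIN]

print(best, round(weights[best], 2), missing)   # 7 3.63 [1, 5]
```

Capping at $P_{\min}$ keeps a column from being rewarded for detecting one fault far better than required, so the weight reaches its maximum $P_{\min} \cdot |\Phi|$ exactly when every fault meets the bound.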
Procedure 1 (MIN COVER)
    $C := \emptyset$; $d := 0$;
    repeat
        $d := d + 1$;
        $i :=$ BEST COLUMN($C$);
        $C := C \cup \{H_i\}$;
    until $b_i = P_{\min} \cdot |\Phi|$;

After termination of Procedure 1, the value $d$ is the dimension of $Z$, and $C$ defines a mapping $h : Y \to Z$ with $\dim(Z) = d$. In Procedure 1, BEST COLUMN($C$) computes the matrix column $i$ with the largest value $b_i$ with respect to the partial column cover computed so far. Since the number of columns of a matrix $M$ grows exponentially with the number of primary circuit outputs, an exhaustive enumeration of all matrix columns and the computation of the respective weights $b_i$ is not feasible for circuits with a large number of outputs. The computation of a matrix column $i$ with the largest weight $b_i$ is equivalent to the problem of determining a point $H_i$ with the maximum value $b_i$ of the objective function in a discrete search space with $2^n$ elements. This is a classical combinatorial optimization problem for which efficient algorithms are known. For finding the best matrix column we use an algorithm that is derived from a threshold accepting algorithm [2]. This algorithm is listed below.

Procedure 2 (BEST COLUMN($C$))
    choose a set $H_i$ (initial set);
    choose an initial threshold $T > 0$;
    repeat
        repeat
            choose a neighbor $H_j$ of $H_i$;
            if $b_i - b_j < T$ then $H_i := H_j$;
        until $b_i$ has not increased for a long time or too many iterations;
        $T := T \cdot t$, $(0 < t < 1)$;
    until $b_i$ no longer changes.

              H13   H14   H15
$\varphi_1$   0.80  0.80  0.66
$\varphi_2$   0.80  0.80  0.80
$\varphi_3$   0.80  0.80  0.80
$\varphi_4$   0.80  0.80  0.80
$\varphi_5$   0.80  0.79  0.79
$b$           4.00  3.99  3.85

Table 3 (continued). Columns H13 to H15.

5 Experimental Results

In this section we present the results for the OSC using the algorithm developed in Section 4. Experiments were performed on the ISCAS-85 benchmark circuits [1] to study the effectiveness of the algorithm for varying values of $P_{\min}$. To compute a matrix column corresponding to a currently chosen set of XOR-trees, the circuit with these XOR-trees and the original circuit were fault simulated for a given set of 100 faults and a given number of random patterns.
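To give an idea of how such a simulation-based value could be obtained, the following sketch estimates $P_{\mathrm{detect}}$ for one XOR-tree in the sense of Definition 6: among the patterns for which the fault causes an error at any output, it counts those for which the parity of the tree is wrong. This is not the fault simulator used in the experiments; the three-input circuit, the fault, and the function name `estimate_p_detect` are hypothetical and serve only as an illustration.

```python
import random
from itertools import product

def good(x):
    """Hypothetical fault-free circuit with inputs (a, b, c) and three outputs."""
    a, b, c = x
    return (a & b, b | c, a ^ c)

def faulty(x):
    """Hypothetical fault: input b stuck-at 0."""
    a, _, c = x
    return good((a, 0, c))

def estimate_p_detect(tree, patterns):
    """Fraction of error-causing patterns whose error is visible at the XOR-tree.

    `tree` lists the (0-based) outputs feeding the XOR-tree.
    """
    erroneous = detected = 0
    for x in patterns:
        e = [g ^ f for g, f in zip(good(x), faulty(x))]   # per-output error bits
        if any(e):
            erroneous += 1
            detected += sum(e[i] for i in tree) % 2       # 1 if the tree parity is wrong
    return detected / erroneous if erroneous else 0.0

rng = random.Random(1)
sample = [tuple(rng.randint(0, 1) for _ in range(3)) for _ in range(1000)]
exhaustive = list(product((0, 1), repeat=3))

print(estimate_p_detect([0, 1], sample))       # estimate from random patterns
print(estimate_p_detect([0, 1], exhaustive))   # exact value for this toy circuit
```

For a circuit this small the exhaustive value is available and the random sample merely approximates it; for circuits with many primary inputs only the sampled estimate is practical.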
The simulation was performed with the fault simulator COMSIM [5]. The values of $P_{\mathrm{detect}}$ were obtained by dividing the corresponding fault detection frequencies. Table 4 shows the results of the OSC algorithm for the benchmark circuits for different values of $P_{\min}$. Since for circuits with a large number of primary inputs it is not feasible to simulate the circuit for every input pattern, we have to choose a sample of random patterns. Therefore, the values of $P_{\mathrm{detect}}$ obtained through the simulation are in reality estimates $\hat{P}_{\mathrm{detect}}$. In the experiments, for each circuit we used sets of random patterns of uniform size $10^5$. Fig. 3 illustrates the working mechanism of the proposed algorithm for the circuit c432 and $P_{\min} = 0.95$. The plotted values of $P_{\mathrm{detect}}$ correspond to the temporary values obtained when exiting the inner loop in Procedure 2.

circuit          number of XOR-trees for $P_{\min}$
name     #PO     > 0    0.7    0.8    0.9    0.95   0.99   1.0
c432       7      1      2      3      5      5      6      7
c499      32      2      2      3      3      4      5      6
c880      26      2      2      2      3      4      5      8
c1355     32      1      1      2      2      3      4      5
c1908     25      1      1      2      3      4      3      4
c2670    140      2      2      3      3      4      5      5
c3540     22      1      2      3      4      5      7     14
c5315    123      1      2      2      3      4      5      6
c6288     32      1      2      3      4      5      6      9
c7552    108      2      2      2      3      3      4      5

Table 4. OSC results for ISCAS-85 circuits.

[Fig. 3. Working mechanism of the algorithm. Plot of $P_{\mathrm{detect}}$ and of the threshold value for XOR-trees 1 to 5.]

[Fig. 4. Error detection for all single stuck-at faults. Circuit c1355; number of faults versus error detection probability, with one curve for each of $P_{\min} > 0$, 0.7, 0.8, 0.9, 0.95, 0.99, and 1.00.]

It is interesting to observe that the overall error detection probability grows with increasing values of $P_{\min}$. This effect can be seen in Fig. 4. The diagram shows that the number of faults for which errors are detected with probability $P$ is in general higher for higher values of $P_{\min}$. Large improvements are achieved if the number of XOR-trees increases.
The diagram shows the overall error detection probability for the circuit c1355 for varying values of $P_{\min}$. Another interesting fact is that for $P_{\min} = 1.0$ (5 XOR-trees) there are 1438 faults (of 1566 testable nonequivalent stuck-at faults in c1355) for which each error is detected with the set of $10^5$ random patterns, although the compression function was computed to detect each error caused by a set of only 100 faults. The errors of 98% of all (testable nonequivalent) single stuck-at faults are still detected with probability $P \geq 0.95$. This example shows that high output space compression ratios can be achieved even for very high values of $P_{\min}$ and large fault sets.

6 Conclusions

In this paper we presented a method for designing concurrent checkers based on a linear compression of the output space of the circuit. The checker design is tailored to a given set of target faults. Errors caused by these faults have to be detected with a probability that is equal to or above a given bound. To keep the complexity of the checker low, the main objective of the method is the minimization of the number of outputs of the space compressor. The design method is based on the solution of a set covering problem for which a simple greedy strategy was combined with a threshold accepting algorithm. The effectiveness of the algorithm was studied on the ISCAS-85 benchmark circuits. The results show that high compression ratios can be achieved, even for high minimum fault detection probabilities and large fault sets.

References

[1] F. Brglez and H. Fujiwara: A Neutral Netlist of 10 Combinational Benchmark Circuits and a Target Translator in Fortran, Proc. IEEE Symp. on Circuits and Systems, June 1985, pp. 663-698.
[2] G. Dueck and T. Scheuer: Threshold Accepting: A General Purpose Optimization Algorithm Appearing Superior to Simulated Annealing, J. of Computational Physics, vol. 90, 1990, pp. 161-175.
[3] E. Fujiwara, N. Mutoh, and K. Matsuoka: A Self-Testing Group-Parity Prediction Checker and Its Use for Built-In Testing, IEEE Trans. Comput., vol. C-33, no. 6, June 1984, pp. 578-583.
[4] E. Fujiwara and K. Matsuoka: A Self-Checking Generalized Prediction Checker and Its Use for Built-In Testing, IEEE Trans. Comput., vol. 36, no. 1, Jan. 1987, pp. 86-93.
[5] U. Mahlstedt and J. Alt: Simulation of Non-Classical Faults on the Gate Level - The Fault Simulator COMSIM, Proc. IEEE Int. Test Conf., Baltimore, MD, Oct. 1993, pp. 883-892.
[6] E.S. Sogomonyan and M. Goessel: Design of Self-Testing and On-Line Fault Detection Combinational Circuits with Weakly Independent Outputs, J. of Electronic Testing, vol. 4, 1993, pp. 267-281.
[7] J.F. Wakerly: Partially Self-Checking Circuits and Their Use in Performing Logical Operations, IEEE Trans. Comput., vol. C-23, no. 7, July 1974, pp. 658-666.