Efficient On-Line Testing of FPGAs with
Provable Diagnosabilities
Vinay Verma (Xilinx Inc.)
Shantanu Dutt (Univ. of Illinois at Chicago)
Vishal Suthar (Univ. of Illinois at Chicago)
Outline
Previous on-line testing methods
Roving Tester (ROTE) & Built-in Self-Tester (BISTer)
Concepts
Two new BISTer architectures
– 1-diagnosable BISTer-1
– 2-diagnosable BISTer-2
New fast functional testing and diagnosis: Fast-TAD
Simulation results (fault coverage and fault latency)
Conclusions
Previous On-Line Testing Methods
On-line testing: Testing a (small) part of the FPGA while a
circuit is executing on another part – increases system
availability
Fault scanning technique of [Shnidman et al., IEEE Tr. VLSI’98]
that is applicable to bus-based FPGAs
STAR technique of [Abramovici, et al., ITC’99] that uses a
roving tester that tests part of the FPGA while the rest
executes the application circuit.
– Their group has presented several Built-in-Self-Testers (BISTers)
with different diagnosabilities and complex adaptive diagnosis;
e.g., [Abramovici, et al., ITW’00] – will be discussed later
– Have also presented on-line BIST for interconnects [Stroud et al.,
ITW’01]
Roving Tester (ROTE) with Built-in-Self-Testers (BISTers)
• Two columns left spare for the ROTE; one for fault reconfig.
• ROTE roves across the FPGA
• ROTE concept similar to STAR at a high level
• Differentiation: BIST designs, fault reconfig. & incr. re-routing techniques
[Figure: FPGA with CIRCUIT columns, a SPARE COLUMN and the BISTer ROTE; a BISTer block with a TPG, two CUTs and an ORA, producing the Syndrome.]
TPG - Test Pattern Generator
CUT - Cells Under Test
ORA - Output Response Analyser
Definitions
k-diagnosability:
A testing technique is said to be k-diagnosable if in the
presence of any m ≤ k faulty components it can correctly
identify all m faulty components among the n ≥ k
components that it tests.
Detailed syndrome:
The detailed syndrome for a session is the 0/1 bit pattern
observed at the ORA output (0 => match, 1 => mismatch)
over all the test vectors of the TPG.
Gross syndrome:
A gross syndrome of a session is the overall pass/fail (indicated as X/√)
observation over all modes of operation for that session. In other words,
the gross syndrome of a session is a X (fail) if the ORA output is 1 for
any input test vector and is a √ (pass), otherwise.
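The two definitions can be related with a minimal sketch: the Python below collapses the detailed syndromes of one session into its gross syndrome. The function name `gross_syndrome`, the list-of-lists input layout and the ASCII marks "X" (fail) and "P" (pass, standing in for √) are illustrative assumptions, not part of the original slides.

```python
def gross_syndrome(detailed_syndromes):
    """Collapse the detailed syndromes of one session into a gross syndrome.

    detailed_syndromes: one 0/1 list per mode of operation (assumed layout);
    each bit is the ORA output for one TPG test vector (0 = match, 1 = mismatch).
    Returns "X" (fail) if any bit is 1, otherwise "P" (pass).
    """
    for bits in detailed_syndromes:
        if any(bit == 1 for bit in bits):
            return "X"
    return "P"


# Two modes of operation, one mismatch in the second mode: the session fails.
print(gross_syndrome([[0, 0, 0, 0], [0, 1, 0, 0]]))
```

Only the gross syndrome is needed for the BISTer-1 diagnosis; the detailed bit patterns matter again for the dist. 3 pairs of BISTer-2.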
BISTer-0 [M. Abramovici et al., ITC ’99]
[Figure: BISTer-0 tile of four PLBs A, B, C and D, shown in sessions S1 to S4; in each session one PLB is the TPG, one is the ORA and two are CUTs.]
TPG - Test Pattern Generator
CUT - Cells Under Test
ORA - Output Response Analyser
• Exhaustive testing of CUTs
• S1, S2, S3, S4 are four sessions of testing in a BISTer tile
BISTer-0 [M. Abramovici et al., ITC ’99]
Theorem: BISTer-0 is zero-diagnosable.
Proof:
The same pair of PLBs are configured as CUTs in two different sessions:
PLBs A and C in S2 and S4
PLBs B and D in S1 and S3.
When either PLB fails, the gross syndrome will be identical in these sessions.
E.g. if A fails as a CUT only, then its gross syndrome is identical to the
gross syn. of C failing as a CUT only. Hence we cannot distinguish between
faulty PLBs A and C.
Thus BISTer-0 has a complex adaptive diagnosis phase.

PLB configuration per session:
PLB   S1    S2    S3    S4
A     TPG   CUT   ORA   CUT
B     CUT   TPG   CUT   ORA
C     ORA   CUT   TPG   CUT
D     CUT   ORA   CUT   TPG

Gross syndrome per faulty PLB:
Faulty PLB   S1    S2    S3    S4
A            √     X     √/X   X
C            √/X   X     √     X
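The proof can be checked mechanically. The sketch below encodes the BISTer-0 configuration table and computes the gross syndrome of each PLB failing as a CUT only; the names `ROLES_BISTER0`, `cut_only_syndrome` and `indistinguishable_pairs`, and the marks "X" (fail) and "P" (pass, for √), are assumptions made for illustration.

```python
from itertools import combinations

# Role of each PLB in sessions S1..S4, as in the BISTer-0 configuration table.
ROLES_BISTER0 = {
    "A": ("TPG", "CUT", "ORA", "CUT"),
    "B": ("CUT", "TPG", "CUT", "ORA"),
    "C": ("ORA", "CUT", "TPG", "CUT"),
    "D": ("CUT", "ORA", "CUT", "TPG"),
}


def cut_only_syndrome(roles, faulty):
    """Gross syndrome over S1..S4 when `faulty` misbehaves only as a CUT."""
    return "".join("X" if role == "CUT" else "P" for role in roles[faulty])


def indistinguishable_pairs(roles):
    """Pairs of PLBs whose CUT-only gross syndromes coincide."""
    return [
        (p, q)
        for p, q in combinations(sorted(roles), 2)
        if cut_only_syndrome(roles, p) == cut_only_syndrome(roles, q)
    ]


print(indistinguishable_pairs(ROLES_BISTER0))
```

Under these assumptions the pairs (A, C) and (B, D) collide, which is the ambiguity the theorem describes.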
Our BISTer-1 Architecture
[Figure: 2 x 2 BISTer-1 tile with PLBs A, B, C and D; in S1, A is the TPG, B and C are CUTs and D is the ORA.]

PLB configuration per session:
PLB   S1    S2    S3    S4
A     TPG   ORA   CUT   CUT
B     CUT   TPG   ORA   CUT
C     CUT   CUT   TPG   ORA
D     ORA   CUT   CUT   TPG

Gross syndrome and inference:
S1   S2   S3   S4   Inference
√    √    √    √    No faulty PLB
X    √    √    √    Fault not in PLB
√    X    √    √    Fault not in PLB
√    √    X    √    Fault not in PLB
√    √    √    X    Fault not in PLB
X    X    √    √    Faulty C (CUT)
√    X    X    √    Faulty D (CUT)
√    √    X    X    Faulty A (CUT)
X    √    √    X    Faulty B (CUT)
X    √    X    √    Fault not in PLB
√    X    √    X    Fault not in PLB
X    X    X    √    Faulty D
√    X    X    X    Faulty A
X    X    √    X    Faulty C
X    √    X    X    Faulty B
X    X    X    X    Fault not in PLB
Our BISTer-1 Architecture
Theorem: BISTer-1 is 1-diagnosable
Each PLB is a CUT in 2 unique sessions and a TPG in another unique session;
this serves to uniquely identify the faulty PLB, which will have a X X √ in
these sessions.
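The identification rule can be sketched as a lookup over the BISTer-1 configuration table: a PLB is blamed when both of its CUT sessions fail and its TPG session passes, with its ORA session treated as a don't-care. The names `ROLES_BISTER1`, `matches` and `diagnose`, and the ASCII marks "X" (fail) and "P" (pass, for √), are assumptions for illustration rather than the authors' implementation.

```python
# Role of each PLB in sessions S1..S4, as in the BISTer-1 configuration table.
ROLES_BISTER1 = {
    "A": ("TPG", "ORA", "CUT", "CUT"),
    "B": ("CUT", "TPG", "ORA", "CUT"),
    "C": ("CUT", "CUT", "TPG", "ORA"),
    "D": ("ORA", "CUT", "CUT", "TPG"),
}


def matches(plb, gs):
    """True if gross syndrome `gs` shows X in both CUT sessions of `plb`
    and P in its TPG session (the ORA session is not constrained)."""
    for role, mark in zip(ROLES_BISTER1[plb], gs):
        if role == "CUT" and mark != "X":
            return False
        if role == "TPG" and mark != "P":
            return False
    return True


def diagnose(gs):
    """Map a 4-session gross syndrome such as "XXPP" to an inference."""
    if "X" not in gs:
        return "No faulty PLB"
    candidates = [plb for plb in sorted(ROLES_BISTER1) if matches(plb, gs)]
    if len(candidates) == 1:
        return "Faulty " + candidates[0]
    return "Fault not in PLB"


for gs in ("XXPP", "PXXX", "XPXP"):
    print(gs, "->", diagnose(gs))
```

No two PLBs can match the same syndrome, because the TPG session of each PLB is a CUT session of every PLB that could otherwise compete with it.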
BISTer-2 Architecture
[Figure: BISTer-2 tile of six PLBs A to F in session S1: A and E are TPGs, B and D are CUTs, C is ORA 1 and F is ORA 2; a dist. 1 pair and a dist. 3 pair are marked.]
Y1 – output of the ORA comparing CUTs
Y2 – output of the ORA comparing TPGs
6 rotations => 6 sessions
Theorem: BISTer-2 is 1-diagnosable
Proof:
Gross syndrome corresponding to Y1 for each faulty PLB is unique.
E.g. Y1 is pass in session 2 only for faulty PLB A and no other PLB.

Gross syndrome corresponding to Y1:
Faulty PLB   S1   S2   S3   S4   S5   S6
A            X    √    X    X    X    X
B            X    X    √    X    X    X
C            X    X    X    √    X    X
D            X    X    X    X    √    X
E            X    X    X    X    X    √
F            √    X    X    X    X    X
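The Y1 table can be regenerated from the rotation of roles around the six-PLB tile. The sketch below assumes the S1 configuration A=TPG, B=CUT, C=OR1, D=CUT, E=TPG, F=OR2 with every role moving on by one PLB per session, and it assumes (as the Y1 table shows) that a single faulty PLB lets Y1 pass only in the session where it acts as OR2. The names `role` and `y1_gross_syndrome` and the marks "X"/"P" are illustrative.

```python
PLBS = "ABCDEF"
# Roles of A..F in session S1; OR1 produces Y1 and OR2 produces Y2.
S1_ROLES = ("TPG", "CUT", "OR1", "CUT", "TPG", "OR2")


def role(plb, session):
    """Role of `plb` in session 1..6 when roles rotate by one PLB per session."""
    return S1_ROLES[(PLBS.index(plb) - (session - 1)) % 6]


def y1_gross_syndrome(faulty):
    """Y1 gross syndrome over S1..S6 for a single faulty PLB.

    Assumption: Y1 passes ("P") only when the faulty PLB is OR2,
    and fails ("X") in every other role.
    """
    return "".join(
        "P" if role(faulty, s) == "OR2" else "X" for s in range(1, 7)
    )


for plb in PLBS:
    print(plb, y1_gross_syndrome(plb))
```

Each PLB is OR2 in exactly one session, so the six Y1 syndromes are pairwise distinct, which is what 1-diagnosability requires.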
BISTer-2 Architecture (cont.)
Theorem: BISTer-2 is 2-diagnosable under the assumptions:
1. No fault masking for all detailed syndromes
2. Faulty PLBs either uniformly all fail or all pass as TPG/ORA
Proof:
• For the case faulty PLBs fail as TPG/ORA also, possible gross syndromes (GS) are: Y1Y2 = X √ and XX
• Class 1: faulty pairs corresponding to GS = X √.
• 3 Class 1 pairs: (CUT,CUT) at dist. 2, (CUT,OR1) at dist. 1 and (OR1,CUT) at dist. 1
• Class 2 includes remaining faulty pairs (GS = XX).
• For session S1, Class 1 includes BD (dist. 2), BC (dist. 1) and CD (dist. 1)
• In S6, BC is the only Class 1 pair from S1; in S2, CD is the only Class 1 pair from S1
S1: GS = X √ => BC/CD/BD
S2: GS = X √ => CD
S6: GS = X √ => BC
S1: GS = X X => Class 2 pairs
S2: GS = X X => BC/BD
S6: GS = X X => BD
In S1-S6 all the faulty pairs at dist. 1 & 2 will be in Class 1 and hence will be diag.
=> GS's are distinct for all dist. 1 & 2 faulty pairs
[Figure: BISTer-2 tile configurations in sessions S1, S2 and S6, with the dist. 1 and dist. 2 faulty pairs marked.]
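The claim that gross syndromes are distinct for all dist. 1 and dist. 2 pairs can be checked by enumerating, for each faulty pair, the sessions in which it falls in Class 1 (both PLBs in a CUT or OR1 role, giving GS = X √). The sketch assumes the S1 configuration A=TPG, B=CUT, C=OR1, D=CUT, E=TPG, F=OR2 rotating by one PLB per session, and that every non-Class-1 session gives GS = XX; the names `role`, `class1_sessions` and `ring_distance` are illustrative.

```python
from itertools import combinations

PLBS = "ABCDEF"
S1_ROLES = ("TPG", "CUT", "OR1", "CUT", "TPG", "OR2")


def role(plb, session):
    """Role of `plb` in session 1..6 when roles rotate by one PLB per session."""
    return S1_ROLES[(PLBS.index(plb) - (session - 1)) % 6]


def class1_sessions(pair):
    """Sessions in which both PLBs of `pair` are CUT or OR1 (GS = X, pass)."""
    return frozenset(
        s for s in range(1, 7)
        if all(role(p, s) in ("CUT", "OR1") for p in pair)
    )


def ring_distance(p, q):
    """Distance between two PLBs around the six-PLB ring."""
    d = abs(PLBS.index(p) - PLBS.index(q))
    return min(d, 6 - d)


pairs = [p + q for p, q in combinations(PLBS, 2)]
near = [pr for pr in pairs if ring_distance(pr[0], pr[1]) <= 2]
far = [pr for pr in pairs if ring_distance(pr[0], pr[1]) == 3]

for pr in near:
    print(pr, sorted(class1_sessions(pr)))
```

The twelve dist. 1 and dist. 2 pairs each get a different set of Class 1 sessions, while the three dist. 3 pairs never enter Class 1 and so need the detailed syndromes.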
BISTer-2 Architecture (cont.)
Three dist. 3 pairs
[Figure: BISTer-2 tile configurations in sessions S1 and S3.]
For faulty pairs at dist. 3, i.e., pairs AD, BE and CF, G.S. of Y1Y2 = XX in
all sessions. Hence they don’t fall in Class 1 and hence are not
distinguishable among themselves.
To distinguish these dist. 3 pairs we compare their detailed syndromes:
AD: dS1 = dS3 (T-C in both sess’s), dS4 = dS6 (C-T in both)
Similarly,
BE: dS1 = dS5, dS2 = dS4
CF: dS2 = dS6, dS3 = dS5
Thus all faulty pairs are diagnosable with high probability.
These pairs are uniquely diag. except for the case when
dS1 = dS3 = dS5 and dS2 = dS4 = dS6,
which is a very low probability event, e.g. it requires 4
v. low prob. events of the type ds(CUT, TPG) = ds(TPG, CUT)
The detailed syndrome for a session is
the 0/1 bit pattern observed at the ORA
output (0 => match, 1 => mismatch) over
all the test vectors of the TPG.
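The session equalities for the dist. 3 pairs follow from the role rotation: a pair shows the same detailed syndrome in two sessions when its two PLBs hold the same (role, role) combination in both. The sketch below lists those sessions, assuming the S1 configuration A=TPG, B=CUT, C=OR1, D=CUT, E=TPG, F=OR2 rotating by one PLB per session; `role` and `sessions_with_roles` are hypothetical helper names.

```python
PLBS = "ABCDEF"
S1_ROLES = ("TPG", "CUT", "OR1", "CUT", "TPG", "OR2")


def role(plb, session):
    """Role of `plb` in session 1..6 when roles rotate by one PLB per session."""
    return S1_ROLES[(PLBS.index(plb) - (session - 1)) % 6]


def sessions_with_roles(pair, roles):
    """Sessions in which pair[0] and pair[1] take roles[0] and roles[1]."""
    return [
        s for s in range(1, 7)
        if (role(pair[0], s), role(pair[1], s)) == roles
    ]


for pair in ("AD", "BE", "CF"):
    print(
        pair,
        "T-C in", sessions_with_roles(pair, ("TPG", "CUT")),
        "C-T in", sessions_with_roles(pair, ("CUT", "TPG")),
    )
```

For AD this gives T-C in S1 and S3 and C-T in S4 and S6, matching dS1 = dS3 and dS4 = dS6; BE and CF give the session pairs listed on the slide.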
Fast-TAD: A Fast Functional Testing and Diagnosis
• In this methodology a PLB is tested only for the specific functions (called
operational functions) it will assume as the ROTE moves across the FPGA.
• A PLB X is functionally-faulty (f-faulty) if faults in X produce incorrect
outputs when X implements any of its operational functions.
• Property: While roving the ROTE in an FPGA either without f-faults or with
reconfigured f-faults, a PLB X needs to implement at most 2 functions: its
original function (when ROTE is in its initial position) and the fn. of the PLB
two f-fault-free PLBs to its right.
Advantages:
• Faster T&D
• >> yield
• >> availab.
[Figure: Operational functions of c3. Columns c1 to c7 holding functions fx1 to fx4, with the ROTE shown at successive positions.]
PLB in column c3 implements functions fx1 and fx3 as the ROTE moves across the FPGA.
Diagnosis in Fast-TAD (overlaid on BISTer-1)
Theorem: Fast-TAD using BISTer-1 is 1-diagnosable
• Each PLB is tested in its two operational fn.
• An f-faulty PLB Q config. as a TPG will have a GS of √, while Q configured
as a CUT & performing its oper. functions will have GS of X. In all other
cases GS is either a √ or a X
• In some cases, faults in A and C (or B and D) may not be distinguishable;
a 2nd test is reqd.
• Requires 10·t1 time versus 16·t1 if both CUTs in a session are config. in
both their oper. fns.
[Figure: the four BISTer-1 sessions S1 to S4 for PLBs A, B, C and D, with the CUTs annotated by the function labels a1,a2, b1,b2, c1,c2 and d1,d2.]

Gross syndrome per f-faulty PLB:
Faulty PLB   S1    S2    S3    S4
A            √     X/√   X/√   X
B            X     √     X/√   X/√
C            X/√   X     √     X/√
D            X/√   X/√   X     √

[Table: sessions S1 (C/A) and S2 (B/D) vs. faulty PLB.]
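The remark that faults in A and C (or B and D) may be indistinguishable can be illustrated by enumerating the gross syndromes each f-faulty PLB can produce. The per-PLB patterns below are an assumption: they encode the slide's gross-syndrome table as best it can be recovered, with "X" for fail, "P" for pass (√) and "?" for an entry that may be either. `possible_syndromes` and `ambiguous_pairs` are hypothetical helper names.

```python
from itertools import combinations, product

# Assumed Fast-TAD gross-syndrome patterns over S1..S4 for each f-faulty PLB:
# "P" in the PLB's TPG session, "X" where it is a CUT running its own
# operational functions, "?" (either outcome) in the remaining sessions.
FAST_TAD_PATTERNS = {
    "A": "P??X",
    "B": "XP??",
    "C": "?XP?",
    "D": "??XP",
}


def possible_syndromes(pattern):
    """All concrete X/P syndromes consistent with a pattern containing '?'."""
    options = [("X", "P") if mark == "?" else (mark,) for mark in pattern]
    return {"".join(choice) for choice in product(*options)}


def ambiguous_pairs(patterns):
    """PLB pairs that share at least one possible gross syndrome."""
    result = {}
    for p, q in combinations(sorted(patterns), 2):
        shared = possible_syndromes(patterns[p]) & possible_syndromes(patterns[q])
        if shared:
            result[(p, q)] = shared
    return result


print(ambiguous_pairs(FAST_TAD_PATTERNS))
```

Under these assumed patterns only the pairs (A, C) and (B, D) overlap, each in a single syndrome, which is the case where a second test would be needed.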
Simulation Environment
• A 32 x 32 FPGA was simulated with 3-input 1-output PLBs.
• Fast-TAD with BISTer-1 and STAR BISTer (enhancement of BISTer-0 with
1-diagnosability) techniques were implemented on this FPGA.
• The adaptive diagnosis phase of the STAR BISTer is very complex; we
have simulated only the fault detection and direct diagnosis phase of the
STAR BISTer (BISTer-1 has no adaptive diagnosis phase)
• Two types of faults (with internal fault density up to 25%) were inserted:
1. Randomly distributed faults with external fault density up to 40%
2. Clustered faults with cluster density up to 3%
Prob. of a fault around a “center” fault = k/d (k = const, d = distance)
[Figure: a center faulty PLB surrounded by PLBs at distances 1 and 2. Legend: center faulty PLB, correlated faulty PLB, non-faulty PLB.]
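A minimal sketch of how clustered faults with probability k/d could be generated is given below. The slides do not say how distance is measured or how far a cluster extends, so the ring (Chebyshev) distance, the `radius` cutoff, the seeding and all function names are assumptions for illustration only.

```python
import random


def neighbour_fault_probability(k, d):
    """Probability k/d that a PLB at distance d >= 1 from a center fault is faulty."""
    return min(1.0, k / d)


def clustered_faults(rows, cols, centers, k, radius, rng):
    """Return the set of faulty (row, col) PLBs around the given center faults.

    Assumptions: distance is the ring (Chebyshev) distance, and only PLBs
    within `radius` of a center are considered.
    """
    faulty = set(centers)
    for r0, c0 in centers:
        for r in range(max(0, r0 - radius), min(rows, r0 + radius + 1)):
            for c in range(max(0, c0 - radius), min(cols, c0 + radius + 1)):
                d = max(abs(r - r0), abs(c - c0))
                if d >= 1 and rng.random() < neighbour_fault_probability(k, d):
                    faulty.add((r, c))
    return faulty


# Example on a 32 x 32 array with k = 0.5, the value used for the clustered-fault results.
rng = random.Random(0)
faults = clustered_faults(32, 32, [(5, 5), (20, 12)], k=0.5, radius=2, rng=rng)
print(len(faults), "faulty PLBs")
```

With k = 0.5 a PLB adjacent to a center fault is faulty with probability 0.5 and one at distance 2 with probability 0.25.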
Simulation of 3 x 2 STAR BISTer [M. Abramovici et al., ITW ’00]
[Figure: two adjacent 3 x 2 STAR BISTer tiles; each tile has rows "T C", "T O" and "T C".]
T – TPG, O – ORA, C – CUT
• 1-diagnosable; it can diagnose 1 fault in a 3 x 2 BISTer area (1 / 6).
• Each BISTer consists of 3 TPGs, 2 CUTs and 1 ORA; 6 sessions reqd.
• STAR moves by 2 cols
• Very complex adaptive diagnosis phase
Version of our 2 x 2 BISTer-1 w/ a 3-PLB TPG
• # of TPG PLBs = ratio of inps/outps in PLB
=> 3 TPGs for testing 3-inp 1-outp PLBs
• 2x3 BISTer-1: 3 TPGs, 2 CUTs & 1 ORA
• Basically two partially overlapped basic 2x2 BISTer-1’s; 8 sessions reqd.
• ROTE moves by 2 cols
• Result: Can diagnose up to 1 fault in every alt. col of a 2-row FPGA
subarray; diagnosability is thus 1 / 4, approaching that of ideal BISTer-1’s
[Figure: session configurations of the 2x3 BISTer-1; cells are labelled T, A, B, C and D.]
Results: Fault Coverage v/s Fault Density
[Plot: Randomly distributed faults. Fault coverage (%) vs. fault density (%) from 1 to 40, for BISTer-1 and STAR BISTer.]
[Plot: Clustered faults with k = 0.5 in p = k/d. Fault coverage (%) vs. fault density (%) at 8.8, 16.9 and 26.6, for BISTer-1 and STAR BISTer.]
The three values of fault density in the clustered-fault plot correspond to
cluster densities of 1%, 2% and 3% respectively.
Results: Fault Latency v/s Fault Density
[Plot: Fault latency (x t_1) vs. fault density (%) from 1 to 40, for BISTer-1 and STAR BISTer.]
Conclusions
• Developed a 1-diag. (1 of 4) BISTer
• Developed (for the 1st time) a 2-diag. (2 of 6, w/ high prob.) BISTer
• Developed (for the 1st time) functional T&D: tests PLBs in only the 2 funcs
that they will perform; prev. methods performed exhaustive testing
• Fast-TAD w/ BISTer-1 has the same diagnosability (1 of 4) for f-faults
• Our methods do not require adaptive diagnosis; previous techniques
have complex adaptive diag. mechanisms
• Simulation results for Fast-TAD w/ BISTer-1:
fault coverages of 96% & 92% at fault densities of 10% & 20% resp.
The previous best, the STAR 3x2 BISTer (non-adaptive version):
coverages of 74% & 46% at these densities
• Much lower fault latency of Fast-TAD w/ BISTer-1 compared to that of the
STAR 3x2 BISTer
• Its high fault coverage at high flt. densities and low fault latency should
prove useful for testing and diagnosing emerging tech. FPGAs (<= 90 nm,
nanotechnology) that are expected to have high fault densities
BISTer-2 architecture
[Figure: BISTer-2 tile of six PLBs A to F in session S1: A and E are TPGs, B and D are CUTs, C is ORA 1 and F is ORA 2.]
Y1 – output of the ORA comparing CUTs
Y2 – output of the ORA comparing TPGs
OR1 => ORA 1 (Y1)
OR2 => ORA 2 (Y2)
Theorem: BISTer-2 is 1-diagnosable
Proof:
Gross syndrome corresponding to Y1 for each faulty PLB is unique.
E.g. Y1 is pass in session 2 only for faulty PLB A and no other PLB.

PLB configuration per session:
PLB   S1    S2    S3    S4    S5    S6
A     TPG   OR2   TPG   CUT   OR1   CUT
B     CUT   TPG   OR2   TPG   CUT   OR1
C     OR1   CUT   TPG   OR2   TPG   CUT
D     CUT   OR1   CUT   TPG   OR2   TPG
E     TPG   CUT   OR1   CUT   TPG   OR2
F     OR2   TPG   CUT   OR1   CUT   TPG