COTS for the LHC radiation environment: the rules of the game
Federico Faccio
CERN
Federico Faccio - CERN
Outline
• Introduction
• Summary of radiation effects
• Risk management
• Dealing with the radiation hazard: fundamental steps
• Conclusion
Why talk about COTS?
• COTS = Commercial Off The Shelf
• No effort made to improve, assure or even test the radiation tolerance
• Poor or no traceability of origin (what is REALLY inside the package?)
• Cheaper and better performing; sometimes there is no alternative to their use
• The cost of using COTS is higher than the bare part cost: testing and logistics are expensive!
Summary of radiation effects
Cumulative effects
• Total Ionizing Dose (TID)
– Potentially all components
• Displacement damage
– Bipolar technologies
– Optocouplers
– Optical sources
– Optical detectors (photodiodes)
Single Event Effects (SEE)
• Permanent SEEs
– SEL: CMOS technologies
– SEB: power MOSFETs, BJTs and diodes
– SEGR: power MOSFETs
• Static SEEs
– SEU, SEFI: digital ICs
• Transient SEEs
– Combinational logic
– Operational amplifiers
Risk Management
Use of COTS => risk avoidance is not possible
Mission: LHC and experiments running
• Which failures are tolerable?
• How often?
• Where?
Tolerable failures = f(system)
=> Risk management at system level (top-down)
Do we have enough experience and competence in the same organizational unit? (The learning process consumes time and resources!)
Dealing with the radiation hazard
• Get a good knowledge of the environment
• Understand the effects
• Define the requirements for the components
• Identify the candidate components
• Test the candidate components
• Engineer the system
The radiation environment
Knowledge in “meaningful” terms:
• TID: total dose [krad, Gy]
• Displacement damage: 1 MeV equivalent neutron fluence (n/cm²)
• SEEs: fluence and energy distribution of the main particles (p/cm²), or at least the hadron spectrum integrated above 20 MeV:
$\int_{20\,\mathrm{MeV}} \Phi(E, \text{all hadrons})\, dE$
Safety factor = cost
• Get the most precise estimate of the environment
• Tailor the safety factor
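The integral of the hadron spectrum above 20 MeV can be estimated numerically from a tabulated spectrum. The following is a minimal Python sketch, assuming the environment simulation delivers the differential spectrum as two lists of equal length; the function name `fluence_above` and the sample numbers are hypothetical and are not taken from the presentation.

```python
def fluence_above(energies_mev, spectrum, threshold_mev=20.0):
    """Trapezoidal integral of a tabulated spectrum above a threshold energy.

    energies_mev: increasing energies in MeV
    spectrum: differential fluence at those energies (per cm^2 per MeV)
    The integral stops at the last tabulated energy.
    """
    total = 0.0
    for i in range(len(energies_mev) - 1):
        e_lo, e_hi = energies_mev[i], energies_mev[i + 1]
        f_lo, f_hi = spectrum[i], spectrum[i + 1]
        if e_hi <= threshold_mev:
            continue  # bin lies entirely below the threshold
        if e_lo < threshold_mev:
            # clip the bin at the threshold using linear interpolation
            f_lo = f_lo + (f_hi - f_lo) * (threshold_mev - e_lo) / (e_hi - e_lo)
            e_lo = threshold_mev
        total += 0.5 * (f_lo + f_hi) * (e_hi - e_lo)
    return total


# Hypothetical flat spectrum, for illustration only
example = fluence_above([10.0, 30.0, 50.0], [4.0, 4.0, 4.0])
```

A real spectrum spans many decades in energy, so a logarithmic grid and a log-log interpolation may be more appropriate than the linear trapezoids used here.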
Effects of the environment
• CMOS technologies
– Memories (SRAM, DRAM, Flash, EEPROM)
– FPGAs
– Microprocessors and DSPs
• Bipolar technologies
• Power devices
• Optocouplers
CMOS technologies (1)
• Displacement damage
• TID
– Sensitive, with dose rate effects
– Variable failure levels
• SEL
– Not very likely in LHC
– A few known sensitive components: K5 µP from AMD, SRAMs, ADCs, DSPs, FPGAs
• SEGR, SEB
– Very unlikely
CMOS technologies (2)
SEU: memories
• SRAMs
– Sensitive, with low threshold
– Sometimes MBU
– Stuck bits only with high LET
• DRAMs
– Sensitive, with low threshold
– Situation improved with decreased cell area and better signal-to-noise ratio
– σ_p comparable to SRAMs
• Flash memories
– SEFI possible (low σ)
– Errors in the complex control circuitry, with different consequences
– Higher threshold than SRAMs and DRAMs
– Much lower σ_p (100-1000 times)
• EEPROMs
– Write mode more sensitive than read
– Higher threshold than SRAMs and DRAMs
– SEFI possible
CMOS technologies (3)
SEU: FPGAs
• SRAM-based
– Loss of configuration: consequences?
– Low threshold: likely in LHC
– Requires reprogramming
• Antifuse-based
– ONO antifuses sensitive to destructive events, with high threshold
– A-Si antifuses more robust
• FF and combinatorial logic gates:
– Sensitive in both technologies (FF implementations with different sensitivity)
– TMR can be integrated in antifuse-based
– In new Virtex series, TMR can be safely integrated
• SEFI:
– Can happen in both technologies (SEU in JTAG circuitry) with low σ
– Solutions proposed by both Actel and Xilinx
• Radiation tolerant products available (on epi substrate)
• Variability in radiation performance (esp. TID and SEL)
• Documented mitigation techniques exist for both Actel and Xilinx
CMOS technologies (4)
SEU: microprocessors and DSPs
• SEU effects strongly application-dependent
• Testing has to be performed running a
representative program
• SEU consequences: very variable (no effect,
calculation error, code stopped, …)
• Most devices are sensitive in a proton
environment, hence in LHC
Bipolar technologies
Simultaneous effects: they add up
• TID
– Leakage paths and β degradation
– Sensitive, with dose rate effects (ELDR)
– Variable failure levels
• Displacement damage
– β degradation
– PNP are affected from 3×10¹¹ p/cm² (50 MeV)
– NPN are affected from 3×10¹² p/cm² (50 MeV)
– Voltage regulators, comparators, op amps
• SEL
• SET
– At the output of comparators
– Rail-to-rail signal
Power devices
• Sensitive to TID and displacement damage
• Power MOSFETs, bipolar devices and diodes: SEB
– Sensitive in a hadron environment (also 14 MeV n)
– De-rating often required (of variable %)
– P-channel MOSFETs are much less sensitive
• Power MOSFETs and IGBTs: SEGR
– Very rare in a hadron environment
– Dependent on Vgs (sensitive for Vgs < -20 V)
– Dependent on gate oxide thickness
• Most data refer to heavy ions (HI): de-rate as indicated for experiments run with an LET of 26 MeV·cm²/mg
Optocouplers
• Sensitive to TID and displacement damage
– CTR decreases after 1-5×10¹⁰ p/cm² (4N49 from Micropac and Optek, P2824 from Hamamatsu)
– Degradation of LED and photodetector
– Other devices, with a different LED and LED/phototransistor coupling, have good resistance (6N140, 6N134, 6N139 from HP)
• Sensitive to SET
– Sensitivity increases with speed
– Sensitive to direct ionization from protons (angular effect)
– Might induce transient output dropout in DC-DC converters
The radiation requirements (theory)
Know the system where the component operates (top-down)
• Cumulative effects: estimated level × SF, where the safety factor (SF) covers
– Simulation
– Test procedures
– COTS variability
• Destructive SEEs: no destructive SEE
• Transients and SEU: acceptable rate for the system
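The cumulative-effect requirement (estimated level multiplied by a safety factor) can be written down in a few lines. This is only a sketch: treating the safety factor as a product of separate factors for simulation, test procedures and COTS variability is an assumption made here for illustration, and the function names and numbers are hypothetical.

```python
import math


def required_level(estimated_level, safety_factors):
    """Requirement = estimated level times every individual safety factor.

    Multiplying separate factors is an assumption of this sketch; the
    presentation only states 'estimated level x SF'.
    """
    return estimated_level * math.prod(safety_factors)


def meets_requirement(tested_level, estimated_level, safety_factors):
    """True if the level reached in test covers the requirement."""
    return tested_level >= required_level(estimated_level, safety_factors)


# Hypothetical numbers: 10 krad estimated, factors for simulation,
# test procedure and COTS variability
requirement_krad = required_level(10.0, [2.0, 1.5, 2.0])
```

With these illustrative factors a component would have to survive 60 krad in test to cover a 10 krad estimate, which shows how quickly stacked safety factors translate into cost.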
The radiation requirements (headaches)
• Cumulative effects: which SF?
– The SF covers simulation, test procedures and COTS variability
– Tailor the SF: accurate, correct, systematic
• Destructive SEEs:
– Which limit on cross-section?
– Which limit on HI LETth?
– Example: environment = 10¹¹ h/cm², 1000 components: σ = 10⁻¹¹ cm²?
• Transients and SEU:
– Estimate the error rate in the real environment
– Evaluate the system-level impact of each error
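One possible reading of the destructive-SEE example is that the expected number of events is the hadron fluence times the cross-section times the number of components. The sketch below works through that arithmetic, assuming identical and independent components; the function names are hypothetical, and the choice of one allowed event in the whole system is an illustrative assumption, not a figure from the presentation.

```python
def expected_events(fluence_cm2, sigma_cm2, n_components):
    """Expected number of destructive events in the whole system."""
    return fluence_cm2 * sigma_cm2 * n_components


def max_sigma(fluence_cm2, n_components, allowed_events):
    """Largest cross-section compatible with a given number of events."""
    return allowed_events / (fluence_cm2 * n_components)


# Numbers from the example: 1e11 h/cm2, 1000 components, sigma = 1e-11 cm2
events = expected_events(1e11, 1e-11, 1000)

# Illustrative assumption: at most one destructive event in the system
sigma_limit = max_sigma(1e11, 1000, 1.0)
```

Under these assumptions a cross-section of 10⁻¹¹ cm² corresponds to about one event per component, while tolerating a single event in the whole system would require a limit of about 10⁻¹⁴ cm², which is far harder to demonstrate in a beam test.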
The candidate components
• Search for radiation data
– Databases on the web (often obsolete): JPL compendia, GSFC, DTRA, SPUR, …
– NSREC “Workshop records”
– December issue of IEEE Transactions on Nuclear Science
– ESA/ESTEC final presentation day (soon a database?)
– For FPGAs, look on the manufacturer’s home page for fresh data
• How to interpret SEE data?
– Rough guidelines based on “Computational method to estimate SEU rates in an accelerator environment” (NIM, August 2000)
How to interpret SEE data (1)
You have data for mono-energetic p or n beams (60-200 MeV)!
SEE rate = σ_p/n × flux (all hadrons above 20 MeV)
Example
• Xilinx XC4010XL: σ (100 MeV n) = 4.4×10⁻¹⁵ cm²/bit
• Estimated flux = 2×10³ cm⁻² s⁻¹ (= 10¹¹ cm⁻² in total)
=> SEE rate = 8.8×10⁻¹² errors/(bit s)
• Each chip contains about 283k configuration bits
=> SEE rate per chip = 2.5×10⁻⁶ s⁻¹
For every 110 FPGAs, one loses its configuration each hour!
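The arithmetic of this example can be reproduced with a short Python sketch. The input numbers are those quoted on the slide; the variable and function names are hypothetical.

```python
SIGMA_PER_BIT_CM2 = 4.4e-15   # 100 MeV neutron cross-section per bit
HADRON_FLUX = 2e3             # hadrons above 20 MeV, per cm^2 per second
CONFIG_BITS = 283_000         # configuration bits per chip (approximate)


def see_rate(sigma_cm2, flux_cm2_s):
    """SEE rate = cross-section x flux of hadrons above 20 MeV."""
    return sigma_cm2 * flux_cm2_s


rate_per_bit = see_rate(SIGMA_PER_BIT_CM2, HADRON_FLUX)   # errors/(bit s)
rate_per_chip = rate_per_bit * CONFIG_BITS                # errors/(chip s)
upsets_per_chip_hour = rate_per_chip * 3600.0
chips_per_hourly_upset = 1.0 / upsets_per_chip_hour
```

The last value comes out slightly above 110, consistent with the rounded statement that one FPGA in about 110 loses its configuration every hour.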
How to interpret SEE data (2)
You only have Heavy Ion data...
… but you have the Weibull fit parameters!
[Figure: Weibull curve, cross section (cm²) versus deposited energy, shown together with the probability curves from the simulation of the environment]
How to interpret SEE data (3)
You only have Heavy Ion data...
… and you do not have the Weibull fit parameters...
You can just have a feeling:
• LETth < 5 MeV·cm²/mg => quite sensitive
• LETth > 15 MeV·cm²/mg => not sensitive
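This rule of thumb can be expressed as a small Python helper. The two thresholds are those given on the slide; the function name and the label returned for values between 5 and 15 MeV·cm²/mg are assumptions of this sketch, since the slide gives no guidance for that range.

```python
def let_threshold_feeling(let_th):
    """Rough sensitivity label from a heavy-ion LET threshold in MeV cm2/mg."""
    if let_th < 5.0:
        return "quite sensitive"
    if let_th > 15.0:
        return "not sensitive"
    # Not covered by the rule of thumb; label chosen for this sketch only
    return "undetermined"
```

For example, `let_threshold_feeling(3.0)` returns `"quite sensitive"`, while a part with a threshold of 10 MeV·cm²/mg falls in the range where only a dedicated test can tell.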
Testing the candidate parts
• Never use data from a database as a source for
qualification, only to identify candidate parts!
– Radiation source
– Irradiation procedure
– Board-level testing and hybrid devices
Radiation source
• TID: ⁶⁰Co
• Displacement damage: low-energy neutrons (nuclear reactor)
• SEEs: mono-energetic hadron beams (60-200 MeV p)
• Global test plan (CMS: HCAL, Muons, Cavern)
• With 60 MeV:
– SEU rate under-estimated in rare cases
– Is the energy enough for SEB/SEGR?
• What about thermal neutrons? (They have not been taken into account for the experiments.)
Preferential access conditions for high-E proton beams
Preferential agreements with 2 facilities have been in place for several years through the RD49/COTS project:
• CRC (Cyclotron Research Centre) at UCL, Louvain-la-Neuve (BE): protons (60 MeV), heavy ions, neutrons (low intensity)
• PSI (Paul Scherrer Institute) in Villigen (CH): protons (250 MeV)
Irradiation procedure (1)
TID
• CMOS
– Prompt + latent charge buildup => irradiation + annealing
– Test methods give a worst-case picture
• Bipolar
– ELDR effect
– JPL advice for TIDspec < 30 krad: test at 50 and 0.005 rad/s at room T and compare; if failure in any condition (at TID < 1.5 TIDspec) => do not use!
– JPL advice for TIDspec > 30 krad: test up to 30 krad in 3 conditions (50 and 0.005 rad/s at room T, 1 rad/s at 90 °C) and compare; if comparable => use the 90 °C test, BUT take an additional SF = 2 on TIDspec
Irradiation procedure (2)
• Displacement damage
– Room T, all grounded
• SEU, SET
– Measurement of σ
– Representative conditions
– Needs a dedicated setup
– Careful with SEFI
– With hadron beams => in air and packaged
• SEL, SEB, SEGR
– Measurement of σ
– Protect the component!
– Needs a dedicated setup
– For SEB & SEGR, look for derating conditions
[Figures: σ versus E (MeV); σ_SEB versus Vds, with the rated Vds marked]
Board-level testing & hybrids
• Board-level testing
– Less information on actual safety margins
– It can be difficult to trace back the origin of problems
– Use for go/no-go tests only!
– Can give useful information on system response (esp. SEU)
• Hybrid devices
– Difficult to know what is in the hybrid (proprietary designs, no information from the manufacturer)
– Examples on DC-DC power converters (JPL, NASA)
Engineer the system
• Test the candidate components
• Is the tolerance sufficient?
– Yes: qualify the components to be used
– No: is there an alternative component?
• Is there an alternative component?
– Yes: test it as a new candidate
– No: reduce requirements
• Reduce requirements:
– refine the environment knowledge
– use mitigation techniques (for SEU)
– foresee replacement if possible
– modify the system
• Qualification OK?
– Yes: use the components
– No: look for an alternative component
Summary
• Radiation effects
• Risk management
– risk avoidance impossible with COTS!
– more efficiently applied at system level!
• Steps to deal with the radiation hazard
– know the environment
– understand the effects
– define the requirements
– identify the candidate components
– test
– engineer the system
Conclusion
Main rule of the game: to merge knowledge on
• System
• Environment
• Radiation hazard
Big challenge for all LHC teams!
Reference material
• This presentation, made at the 6th Workshop on Electronics for the LHC Experiments (Cracow, September 2000), has been followed by a full paper with an extensive set of references (79 papers). The paper can be found as:
– F. Faccio, “COTS for the LHC radiation environment: the rules of the game”, Proceedings of the 6th Workshop on Electronics for the LHC Experiments, CERN 2000-010, CERN/LHCC/2000-041, 25 October 2000, page 50