Slide 1 - LPC Clermont

Preshower Front-End Boards News
LHCb group / LPC Clermont
1. News
2. TRIG-PGA Bit Flip Behaviour
3. Bit Flip Simulations
4. Conclusion
1. PS FEB Validation : Data Path
• First 2 Final Prototype FE Boards received from Hitachi
• Fully Tested @ LPC Test Bench = FUNCTIONAL
• Production test procedures ready
FEPGAs
• data processing path
• offsets
Analogue and DAQ parts tested by analogue injection with an AWG and DAQ readout through the SEQPGA
Power-up failure of the SEQ ? Occurs ‘randomly’; problematic with multiple boards
Stable Offsets over 55 h
‘sine’ injection with AWG
1. PS FEB Validation : Trigger Path
•Functioning of the delay chips & Synchronisation
~OK, ongoing studies
•Connectivity on the trigger path: OK
FEPGA-TRIGPGA
inputs/outputs with memory boards
see also Combined Tests in Building 156
•Trigger algorithm: OK
Manufacturer JTAG Tests for Production
•Basic tests on wires between PGAs, spot soldering defects
•Improved JTAG tests including most of the board I/Os with dedicated back-plane
data conversion, access to individual bits @ 40 MHz, TTL levels
internal buses, PGA pins + link drivers and connectors
Limitation : no JTAG access to analogue parts and ADCs
1. TRIG-PGA ISSUE
An issue was spotted with the APA450 TRIG-PGA
Some input patterns lead to erroneous output
This failure correlates with the input Bit Flip Rate ( BFR )
Failure signature : output values missing, then repeated
TRIG-PGA enters a ‘Blocked’ status
We are in touch with ACTEL France
No obvious problem from the PGA code or its implementation
Internal to the APA450 ? Noise handling issues suggested
•Meanwhile …
… TRIG behaviour extensively studied @ LPC, Clermont Test Bench …
2. Bit Flip Generation
Monte-Carlo strategy to study BFR impact
              Total Available Bits   Used with RAMs   Used with Memory Boards
PS            64                     64               0
SPD           64                     64               64
ECAL          26                     10*              10*
Neighbours    34                     34               0
Total         188 ( 100 % )          178 ( 95 % )     74 ( 39 % )
Table 1: Bit usage for the TRIG bit flip tests. The last row also indicates the fraction of the total 188 bits varied during the tests. (*) The BCID counters of the ECAL bits have a particular status.
Setup           Pattern                       Bit Flip Rate ( % )   Maximal Run Duration ( cycles )
RAMs            PS & SPD random               13-47                 2.5·10⁶
Memory Boards   SPD random                    4-21                  1.6·10⁸
Memory Boards   SPD alternated & full flip    21-39                 4.5·10⁵
Table 2: Bit Flip Rate generation strategies. For the RAMs setup, all data except the BCIDs were drawn randomly at each cycle. In the MB runs, in order to finely investigate the problematic area of high BFR, we used a combination of some bits being fully flipped while others were alternated with random patterns.
8 words of 8 bits
Mask n/8
2 Internal BCID counters [ 0; 255 ]
Average BFR = 2 %
BFR = flips / 188
40 s LHCb @ 40 MHz
= 2 days of injection
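The definition BFR = flips / 188 can be illustrated with a minimal sketch. It assumes each cycle's 188 input bits are packed into one integer and that a flip is any bit that differs from the previous cycle; the function names and the packing are illustrative choices, not the test-bench code.

```python
import random

N_BITS = 188  # total TRIG-PGA input bits (Table 1)

def bit_flip_rate(prev_word, curr_word, n_bits=N_BITS):
    """Percentage of the n_bits inputs that changed between two consecutive cycles."""
    mask = (1 << n_bits) - 1
    flips = bin((prev_word ^ curr_word) & mask).count("1")
    return 100.0 * flips / n_bits

def mean_bfr(words):
    """Average BFR over a sequence of consecutive input words."""
    rates = [bit_flip_rate(a, b) for a, b in zip(words, words[1:])]
    return sum(rates) / len(rates)

# Fully random inputs: each bit flips with probability 1/2 at each cycle,
# so the mean BFR should come out close to 50 %.
rng = random.Random(1)
words = [rng.getrandbits(N_BITS) for _ in range(1000)]
print(f"mean BFR for fully random inputs: {mean_bfr(words):.1f} %")
```

Holding a subset of the bits fixed, as with the BCID counters in the RAM runs, lowers the mean BFR below this fully random value.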
2. TRIG-PGA Failure Distribution
On average, failure occurs once a certain
number of bit flips has been accumulated
Board P#8-01
Board P#8-02
Statistical spread:
•Fluctuations in the BFR over time
due to BCIDs, pattern
randomization protocol
•Inherent: even for the same inputs,
a spread of a few %
Figure 1: TRIG PGA failure in the RAM configuration for a mean BFR of ~47 %. The figure shows the distribution of cumulative flips in PS and SPD bits before failure occurs. The open circles stand for board P#8-01 and the filled dots for board P#8-02. Error bars are statistical.
2. TRIG-PGA Failure vs BFR
•There is a threshold BFR, fr,0, below which no failure occurs.
•Failure is not driven by a dynamical process.
•Below the threshold, the TRIG-PGA can recover.
fr,0
Relax BFR = 5 %
Relax BFR = 16 %
Additional cycles
before failure
Relax
Blocked
Postponed failure sequence
2. TRIG-PGA Failure vs Temperature, Clock Frequency
Failure depends on temperature and clock frequency
No failure @ -55 °C
No failure below 25 MHz
Figure 4: TRIG PGA failure in the MB configuration for various temperatures and operating clock frequencies. The left figure shows the failure cycle number for different temperatures; the usual lab conditions were 30 °C. The right figure shows the failure cycle number for various operating frequencies. The nominal frequency is 40 MHz. Below 25 MHz no failure occurs up to a maximal 39 % BFR.
2. TRIG-PGA Recovery
•Once blocked, the TRIG-PGA can recover if the BFR is below threshold
•Full recovery is long … ~ 10³ cycles
‘Exhausted’ plateau
Full recovery
1st load
2nd load
Relax
Blocked
Blocked
Recovery Sequence
Recovery depends little on the load and relax BFR values ( apart from the threshold )
→ Suggests a time-driven mechanism
2. TRIG-PGA Failure Cost Model
•The failure cycle follows a simple ‘Cost’ model as a function of BFR
•Assume cost is stationary
Cost modeled from data
Cross-check model
Cost / cycle ( % )
Fully random BFR in perfect
agreement with data
~ 20 % bias when BFR varies
in time ; BOOT ?
Blocked
Cross-check sequence
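A cost model of this kind can be sketched as follows. The sketch assumes that each cycle adds a cost depending only on that cycle's BFR (the stationarity assumption), that the cost accumulates from cycle to cycle, and that the TRIG-PGA blocks once the accumulated cost reaches a failure level. The threshold, the linear shape above threshold, the relaxation rate and the failure level are hypothetical placeholders, not the values modeled from the data.

```python
def cost_per_cycle(bfr, threshold=10.0, slope=0.01, relax=0.05):
    """Hypothetical cost (in %) added in one cycle at a given BFR (in %).

    Above the threshold the cost grows linearly with the excess BFR;
    below it the accumulated cost relaxes at a constant rate.
    All parameter values are placeholders.
    """
    if bfr > threshold:
        return slope * (bfr - threshold)
    return -relax

def failure_cycle(bfr_sequence, fail_at=100.0, **params):
    """Index of the first cycle at which the accumulated cost reaches fail_at, or None."""
    cost = 0.0
    for cycle, bfr in enumerate(bfr_sequence):
        # The accumulated cost cannot drop below zero (fully recovered state).
        cost = max(0.0, cost + cost_per_cycle(bfr, **params))
        if cost >= fail_at:
            return cycle
    return None

# Constant load, then the same load interrupted by a relax period below threshold.
steady = failure_cycle([40.0] * 1000)
postponed = failure_cycle([40.0] * 200 + [5.0] * 90 + [40.0] * 1000)
print(steady, postponed)
```

With these placeholder parameters the interrupted sequence fails later than the steady one, which is the qualitative behaviour of a postponed failure.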
3. TRIG-PGA Bit Flip Expected in LHCb
•Bit Flip simulated from DC 06 minimum bias pp events
(10³ events / cell)
•Set SPD threshold to 0.3 MIP = very low
L = 5·10³² cm⁻²·s⁻¹
•Select the 2 ‘hottest’ boards
(close to the beam line)
BCID Counters
Average BFR well below
failure Threshold
But …
Distribution extends well
above ( ~ 3 % above )
Threshold
                  Mean Bit Flip Rate ( % )   Contribution to BFR ( % )
BCIDs             2.1                        31
PS bits           0.9                        14
SPD bits          2.0                        29
ECAL addresses    1.7                        25
Total             6.7                        100
Table 3: Mean Bit Flip Rate for cells ( 82, 91 ) at a luminosity of 5·10³² cm⁻²·s⁻¹. The fractions at which the various bits contribute to the BFR are also indicated.
Figure 9: Bit Flip Rate distribution for pp events at a luminosity of 5·10³² cm⁻²·s⁻¹ and for the two ‘hottest’ cells ( 82, 91 ) of the PS/SPD system. The SPD threshold was set as low as 0.3 MIP.
3. TRIG-PGA Failure Probability @ LHCb
•Minimum-bias pattern injection looped @ 40 MHz over 1 day : No TRIG-PGA failure
But always the same pattern … 40 s of LHC = 2 days of tests
Cost Below 10 %
Estimate failure from
simulation / Cost model
116 min @ 40 MHz simulated
in Lyon batch farm
1st : Bunches of 65 k cycles
→ No failure predicted
2nd : Extrapolate to 1 yr LHCb
‘non-stop’
Failure expected @ cost ~ 80 %
Figure 10: Cost probabilities and TRIG-PGA failure as simulated. The left figure shows the maximal cost distribution within runs of 65 k events. The right figure is the failure probability within 1 year of LHC operation for various assumed cost thresholds.
SHOULD NOT HAPPEN
in standard conditions
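The extrapolation from 65 k-cycle runs to one year of ‘non-stop’ running can be illustrated with a short calculation. It assumes the runs are statistically independent and that each has the same small probability of reaching the cost threshold; the per-run probability printed below is a placeholder, not a result of the simulation, and "65 k" is read here as 2¹⁶ cycles.

```python
import math

CLOCK_HZ = 40e6                      # nominal clock frequency
RUN_CYCLES = 65_536                  # "65 k cycles", read as 2**16 (assumption)
SECONDS_PER_YEAR = 365 * 24 * 3600   # one year of 'non-stop' running

def runs_per_year():
    """Number of consecutive 65 k-cycle runs in one year at 40 MHz."""
    return CLOCK_HZ * SECONDS_PER_YEAR / RUN_CYCLES

def failure_probability(p_run, n_runs):
    """Probability of at least one failure in n_runs independent runs.

    Computes 1 - (1 - p_run)**n_runs via log1p/expm1 so that very small
    p_run and very large n_runs do not lose precision.
    """
    if p_run >= 1.0:
        return 1.0
    return -math.expm1(n_runs * math.log1p(-p_run))

n = runs_per_year()
p_placeholder = 1e-12  # hypothetical per-run probability, for illustration only
print(f"{n:.3g} runs per year, P(failure in 1 yr) = {failure_probability(p_placeholder, n):.3g}")
```

Because a year contains roughly 2·10¹⁰ such runs, the per-run probability has to stay far below 10⁻¹⁰ for the yearly failure probability to remain negligible.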
4. CONCLUSION
THE PS FEBs ARE WORKING PROPERLY
and fit our needs
PS FEBs EXTENSIVELY TESTED
@ LPC & ongoing in Building 156
•Power-up issues @ SEQ ?
•TRIG-PGA ‘pathological behaviour’
No problem expected in standard LHCb conditions
Actel France @ LPC Monday, December 4th
→ Production Should Follow