FPGA Co-processor for the ALICE High Level Trigger

Gaute Grastveit, University of Bergen, Norway

H. Helstrup1, J. Lien1, V. Lindenstruth2, C. Loizides5, D. Roehrich3, B. Skaali4, T. Steinbeck2, K. Ullaland3, A. Vestbo3, T. Vik4, A. Wiebalck2
for the ALICE Collaboration

1 Bergen College, Norway
2 Kirchhoff Institute for Physics, University of Heidelberg, Germany
3 Department of Physics, University of Bergen, Norway
4 Department of Physics, University of Oslo, Norway
5 Institute of Nuclear Physics, University of Frankfurt, Germany

ALICE – A Large Ion Collider Experiment
TPC - Time Projection Chamber

Very High Data Rate
Pb-Pb central collisions
• Event rate: 200 Hz
• Event size: ~75 MByte
• => 15 GByte/s
• Max data rate to tape is 1.25 GByte/s
• Compression/selection is needed
• Conventional, lossless methods: factor 2

HLT functionality
• Compress
  – Reduce the amount of data required to encode the event as far as possible without losing physics information
• Trigger
  – Accept/reject events on the basis of the physics application
• Select
  – Select regions of interest within an event
  – Remove pile-up in p-p
• ...
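The data-rate argument above is plain arithmetic, and a short Python sketch makes the required reduction factor explicit. The variable names are illustrative, the event size is taken as 75 MByte, and decimal units (1 GByte = 1000 MByte) are an assumption.

```python
# Figures quoted on the slide; decimal units are an assumption of this sketch.
EVENT_RATE_HZ = 200
EVENT_SIZE_MBYTE = 75
TAPE_LIMIT_GBYTE_S = 1.25
LOSSLESS_FACTOR = 2

# Raw rate coming out of the detector for central Pb-Pb collisions.
input_rate_gbyte_s = EVENT_RATE_HZ * EVENT_SIZE_MBYTE / 1000

# Overall reduction needed to fit the rate to tape.
required_reduction = input_rate_gbyte_s / TAPE_LIMIT_GBYTE_S

# What is still missing after conventional lossless compression.
rate_after_lossless = input_rate_gbyte_s / LOSSLESS_FACTOR
remaining_reduction = rate_after_lossless / TAPE_LIMIT_GBYTE_S

print(f"input rate:          {input_rate_gbyte_s} GByte/s")
print(f"required reduction:  factor {required_reduction}")
print(f"after lossless only: {rate_after_lossless} GByte/s")
print(f"still missing:       factor {remaining_reduction}")
```

Under these assumptions the input rate is 15 GByte/s, so a factor 12 is needed in total, and lossless compression alone leaves a factor 6 to be covered by selection or stronger compression.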
Task: reconstruct the tracks of 20,000 charged particles (each producing 150 clusters) in the TPC
Time budget: 5 ms

The HLT setup
• Data are received in parallel
[Diagram: TPC FEE with ALTRO and buffer (8 events), RCU, DDL, and RORC DDL receiver boards (RcvBd) with buffer > 1000 events, attached via PCI to HLT farm nodes with NIC; link labels: 216 x 320 MB/s and 216 x 100 MB/s]
• RCU – Readout Controller Unit
• DDL – Detector Data Link
• RORC – ReadOut Receiver Card
• PCI kernel in the FPGA
• FPGA will also be utilised for pattern recognition
• Reduces the number of CPUs needed

The HLT FPGA co-processor
• FPGA: APEX 20K400
• Next prototype: Altera Stratix FPGA
  – Large internal memory
  – DSP cores

Two Schemes for Finding Tracks
• Low occupancy (p-p, Pb-Pb outer padrows)
  – Conventional approach with (2d) cluster finder and track follower
• High occupancy (overlapping clusters)
  – Hough transform on raw data
  – Cluster analysis for deconvolution
  – (Kalman filter)
[Figure: high multiplicity picture]

Cluster Finder
[Figure: grid of ADC values; axis label: time]
• The numbers represent charge (ADC values).
• A vertical uninterrupted stack of numbers is called a sequence.
• The square shows the geometric centre of the sequence.
• Neighbouring sequences belong to the same cluster.
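The sequence and cluster definitions above can be sketched in a few lines of Python. This is a software illustration only, not the FPGA design: it assumes zero-suppressed data in which a sample of 0 means no charge, it treats two sequences as neighbouring when they sit on adjacent pads and their geometric centres differ by at most `match_distance`, and all function and parameter names are hypothetical.

```python
def find_sequences(samples):
    """Return (first_time, last_time, total_charge) for every
    uninterrupted run of non-zero ADC values on one pad."""
    sequences = []
    start = None
    for t, adc in enumerate(samples):
        if adc > 0:
            if start is None:
                start = t                      # a new sequence begins
        elif start is not None:
            sequences.append((start, t - 1, sum(samples[start:t])))
            start = None                       # the sequence ended at t - 1
    if start is not None:                      # sequence runs to the last sample
        sequences.append((start, len(samples) - 1, sum(samples[start:])))
    return sequences


def geometric_centre(sequence):
    """Midpoint in time between the first and last sample of a sequence."""
    first, last, _charge = sequence
    return (first + last) / 2


def find_clusters(pads, match_distance=2.0):
    """Group sequences on adjacent pads into clusters.

    pads: one list of ADC samples per pad, ordered by pad number.
    match_distance: assumed tolerance on the centre difference, in time bins.
    Returns a list of clusters, each a list of (pad, sequence) tuples.
    """
    finished = []
    previous = []                              # clusters seen on the previous pad
    for pad, samples in enumerate(pads):
        current = []                           # clusters seen on the current pad
        for seq in find_sequences(samples):
            centre = geometric_centre(seq)
            match_index = None
            for i, cluster in enumerate(previous):
                _pad, last_seq = cluster[-1]
                if abs(geometric_centre(last_seq) - centre) <= match_distance:
                    match_index = i
                    break
            if match_index is not None:
                cluster = previous.pop(match_index)   # continue existing cluster
                cluster.append((pad, seq))
                current.append(cluster)
            else:
                current.append([(pad, seq)])          # start a new cluster
        finished.extend(previous)              # not continued, hence finished
        previous = current
    finished.extend(previous)
    return finished


pads = [
    [0, 3, 7, 4, 0, 0, 0, 0],
    [0, 0, 5, 9, 2, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 6, 8],
]
clusters = find_clusters(pads)
for cluster in clusters:
    print(cluster)
```

With the example data, the sequences on pads 0 and 1 have centres one time bin apart and end up in one cluster, while the late sequence on pad 2 starts a second cluster. The sketch does no deconvolution, so overlapping clusters would be merged into one.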
Final mean value (weighted mean): $\text{mean} = \frac{\sum \text{charge} \cdot \text{scalevalue}}{\sum \text{charge}}$
[Figure axis label: pad]

FPGA implementation of a cluster finder - the algorithm
• Calculate the mean for every sequence
• Sequences on adjacent pads with similar means are merged
• Two lists of sequences are used:
  – one for clusters on the previous pad
  – one for clusters on the current pad
• Clusters are removed from the search range when a match is found or when we know the cluster is finished
• Clusters are inserted in the input range after merging or when we start a new cluster
[Diagram: memory of clusters, with pointers begin, end and insert; search range = previous pad, input range = current pad]

Block Diagram, Verification
[Block diagram: testbench around the top structure with RAM (lpm), Decoder, FIFO (lpm) and Merger; sequences (seq) pass between the stages and clusters leave the Merger. Files: charges, VHDL clusters, C++ clusters; a C++ model runs alongside the VHDL design]
• A C++ program compares the results

Relative Scales
As before, the mean is calculated by: $\text{mean} = \frac{\sum \text{charge} \cdot \text{scalevalue}}{\sum \text{charge}}$
+ Smaller numbers, only multiplies by <11
- Multiplication can't be done until merging takes place
Alternative (absolute):
[Diagram: Decoder, FIFO (lpm), Pre_Calc (2 mult, 1 add), Merger]

Deconvolution
• Simplified implementation, almost for free
  – splits at minima in both directions (time and pad)
[Diagram: on/off switch at the Merger]

Merger
Goals
• Spend few clock cycles per sequence
• Use few logic elements
• High clock speed
[Pie chart: clock cycles spent in the different states. idle: 30 %; other slices: 22 %, 11 %, 11 %, 11 %, 6 %, 5 %, 4 %, 0 %]
[State diagram: states idle, calc_dist, merge_mult, merge_add, merge_store, insert_seq, send_one, send_many, send_all; transition conditions: new data, next pad, new row or skip pad, new search range, empty, old is above, old is below, within match distance]

Cluster Finder Performance
• Synthesized on Altera APEX
• Uses 1800 logic elements (11%)
• Memory usage: 16*80 + 64*112 = 8448 bits (4%)
• Circuit runs at 33 MHz

Outlook
Implementation of Hough transformation
[Block diagram: Detector Data Link, Data Format Decoder, XYZ Transformer, ABE Transformer, Histogram 1 ... Histogram N, Find Maxima, plus a 10-to-8 Bit Converter for the ADC count. Data representations along the chain: back-linked list (ALTRO sequences), TPC coordinates (Padrow, Pad, Time), local coordinates (X, Y, Z), (A, B, E), parameter space (k, phi, eta-index)]

Conclusion
• We have demonstrated the feasibility of a real-time cluster finder implemented in an FPGA
• Firmware implementation of a Hough transform looks promising

Transparency replacements from now on

ALICE – A Large Ion Collider Experiment
TPC - Time Projection Chamber
• 18 sectors on each side; each sector is read out in 6 subsectors
• Total is ca. 570,000 pads
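The geometry figures above are consistent with the factor 216 quoted for the parallel links in the HLT setup, assuming one link per subsector. A quick check in Python, with illustrative variable names; the pad count is approximate on the slide, so the per-subsector figure is only an average.

```python
SIDES = 2                      # the TPC is read out on both sides
SECTORS_PER_SIDE = 18
SUBSECTORS_PER_SECTOR = 6
TOTAL_PADS = 570_000           # "ca." on the slide, so this is approximate

# Number of independently read out subsectors.
readout_partitions = SIDES * SECTORS_PER_SIDE * SUBSECTORS_PER_SECTOR

# Average only; the real subsectors need not have equal pad counts.
average_pads_per_partition = TOTAL_PADS / readout_partitions

print(readout_partitions)                 # 216
print(round(average_pads_per_partition))  # 2639
```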