

# Performance of the AMBFTK board for the FastTracker processor for the ATLAS detector upgrade

To cite this article: F Alberti et al 2013 JINST 8 C01040






RECEIVED: November 18, 2012 ACCEPTED: December 9, 2012 PUBLISHED: January 24, 2013

TOPICAL WORKSHOP ON ELECTRONICS FOR PARTICLE PHYSICS 2012, 17–21 SEPTEMBER 2012, OXFORD, U.K.


F. Alberti,<sup>a</sup> A. Andreani,<sup>b</sup> A. Annovi,<sup>c</sup> M. Beretta,<sup>c</sup> M. Citterio,<sup>a</sup> F. Crescioli,<sup>d</sup> M. Dell'Orso,<sup>e,f</sup> P. Giannetti,<sup>f</sup> A. Lanza,<sup>g</sup> V. Liberali,<sup>a,b,1</sup> D. Magalotti,<sup>h</sup> C. Meroni,<sup>a</sup> M. Piendibene,<sup>e,f</sup> I. Sacco,<sup>i</sup> A. Stabile<sup>b</sup> and G. Volpi<sup>e,f</sup>

<sup>a</sup>INFN – Sezione di Milano, Via G. Celoria 16, 20133 Milano, Italy

<sup>b</sup>Università degli Studi di Milano, Via G. Celoria 16, 20133 Milano, Italy

<sup>c</sup>INFN – Laboratori Nazionali di Frascati (LNF), Via E. Fermi 40, 00044 Frascati, Italy

<sup>d</sup>Laboratoire de Physique Nucléaire et de Hautes Energies (LPNHE), 4 place Jussieu, 75252 Paris, France

<sup>e</sup>Università degli Studi di Pisa, Largo B. Pontecorvo 3, 56127 Pisa, Italy

<sup>f</sup>INFN – Sezione di Pisa, Largo B. Pontecorvo 3, 56127 Pisa, Italy

<sup>g</sup>INFN – Sezione di Pavia, Via A. Bassi 6, 27100 Pavia, Italy

<sup>h</sup>INFN – Sezione di Perugia, Via A. Pascoli, 06123 Perugia, Italy

<sup>i</sup>Institut für Technische Informatik der Universität Heidelberg, 68135 Mannheim, Germany

E-mail: valentino.liberali@unimi.it

<sup>1</sup>Corresponding author.

ABSTRACT: Modern experiments at hadron colliders search for extremely rare processes hidden in a very large background. As the experiment complexity and the accelerator backgrounds and luminosity increase, we need increasingly complex and exclusive selections. The FastTracker (FTK) processor for the ATLAS experiment offers extremely powerful, very compact and low-power processing units for the future, which are essential for increased efficiency and purity in the Level 2 trigger selection through the intensive use of tracking. Pattern recognition is performed with Associative Memories (AM). The AMBFTK board and the AMchip04 integrated circuit have been designed specifically for this purpose. We report on the preliminary test results of the first prototypes of the AMBFTK board and of the AMchip04.

KEYWORDS: Trigger concepts and systems (hardware and software); Digital electronic circuits; Trigger algorithms

**Contents**

- 1 Introduction
- 2 The FTK AM board
- 3 The AM chip
   - 3.1 AMchip04 test results
- 4 Conclusion

## 1 Introduction

The FastTracker (FTK) processor [1] is designed to process information from the tracking detectors of the ATLAS experiment [2]. FTK provides massive computing power to minimize the on-line execution time of complex tracking algorithms. The time-consuming pattern recognition problem, generally referred to as the "combinatorial challenge", can be solved by the Associative Memory (AM) technology [3], which exploits parallelism to the maximum level: it compares the clusters found in the event ("hits") to all pre-calculated "expectations" or "patterns" at once (pattern matching), searching for candidate tracks called "roads". This approach reduces the typical exponential complexity of CPU-based algorithms to a linear problem.

**Figure 1**. Architecture of the FastTracker processor.

Figure 1 shows the architecture of the FTK processor for the pixel and microstrip (SCT) trackers of the ATLAS detector [2]. The pixel and SCT data are transmitted from the front-end ReadOut Drivers (RODs) to the Data Formatters (DFs), which perform cluster finding. The DFs organize the detector data into the FTK tower structure for output to the core crates, taking the needed overlap into account. The cluster centroids in each logical layer are sent to the Data Organizers (DOs). The barrel layers and the forward disks are grouped into logical layers so that there are 11 layers over the full rapidity range. The DO boards are smart databases, where full resolution hits are stored in a format that allows fast access based on the pattern recognition road identifier, and then retrieved when the AM finds roads with the requisite number of hits. In addition to storing hits at full resolution, the DO also converts them to a coarser resolution, referred to as super-strips (SS), appropriate for pattern recognition in the AM. The AM boards contain a very large number of preloaded patterns, corresponding to the possible combinations for real tracks passing through a SS in each detector layer. The AM is a massively parallel system, as it compares each hit with all patterns nearly simultaneously. When a pattern has been found with the requisite number of hit layers, it is labeled as a road, and the AM sends the road back to the DOs. The DOs immediately fetch the associated full resolution hits and send them, together with the road, to the Track Fitter (TF). Because each road is quite narrow, the TF can provide high resolution helix parameters using the average parameters across the relevant tracking modules and applying corrections that are linear in the actual hit position in each layer. Fitting a track is thus extremely fast since it consists of a series of multiply-and-accumulate steps.
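The multiply-and-accumulate structure of the linear fit can be illustrated with a minimal Python sketch, in which each track parameter is evaluated as $p_i = q_i + \sum_j C_{ij} x_j$, where $x_j$ are the hit coordinates in a road. The function name `linear_fit`, the coefficient matrix, the offsets and the toy numbers are assumptions introduced here for illustration only; they are not the FTK fit constants.

```python
from typing import List, Sequence


def linear_fit(hits: Sequence[float],
               coeffs: Sequence[Sequence[float]],
               offsets: Sequence[float]) -> List[float]:
    """Return p_i = q_i + sum_j C_ij * x_j for every track parameter i."""
    params = []
    for row, q in zip(coeffs, offsets):
        acc = q
        for c, x in zip(row, hits):
            acc += c * x  # one multiply-and-accumulate step
        params.append(acc)
    return params


# Toy example: 3 hit coordinates, 2 parameters (all constants are made up).
hits = [1.0, 2.0, 3.0]
coeffs = [[0.5, 0.0, -0.5],
          [1.0, 1.0, 1.0]]
offsets = [0.1, 0.0]
params = linear_fit(hits, coeffs, offsets)
print(params)
```

The cost of such a fit grows only with the number of hit coordinates times the number of parameters, which is why it is well suited to a fast hardware implementation.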

The following sections briefly summarize the design characteristics and report the performance of the AMBFTK (section 2) and of the AMchip04 (section 3).

## 2 The FTK AM board

The AMBFTK [4] is a 9U VME board that contains the Associative Memory chips (AMchip04 [5]). It is an essential part of the FTK core, whose computing power is such that a few hundred such boards will enable pattern recognition to be carried out when the LHC instantaneous luminosity increases (up to $5 \times 10^{34}\,\text{cm}^{-2}\,\text{s}^{-1}$), with an event input rate of 100 kHz and a latency of approximately one hundred microseconds.

Each of the AM chips will contain 50k to 100k patterns. Hits entering the system are simultaneously compared to all stored patterns. The AM chips are grouped in sets of 32 chips on a local associative memory board (LAMB), 16 on the top side and 16 on the bottom side of the LAMB. Each AM board contains four LAMBs. The AMBFTK is interfaced to an auxiliary board through a high speed connector (ERmet ZD).

The data traffic on the AMBFTK is challenging: a huge number of "hits" must be distributed at high rate with very large fan-out to all patterns (10 million patterns will be contained in the 128 chips on a single board), and a huge number of roads must be collected and sent back to the FTK post-pattern-recognition functions.
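The pattern capacity quoted above follows directly from the board organization. The short Python check below uses only numbers given in the text: 32 chips per LAMB, four LAMBs per board, 50k to 100k patterns per chip, and the 80 000 patterns forecast for the final chip in section 3. The variable names are ours.

```python
CHIPS_PER_LAMB = 32          # 16 on the top side + 16 on the bottom side
LAMBS_PER_BOARD = 4
PATTERNS_PER_CHIP = 80_000   # forecast for the final chip (section 3)

chips_per_board = CHIPS_PER_LAMB * LAMBS_PER_BOARD
patterns_per_board = chips_per_board * PATTERNS_PER_CHIP

# Range corresponding to the quoted 50k to 100k patterns per chip.
min_patterns = chips_per_board * 50_000
max_patterns = chips_per_board * 100_000

print(f"{chips_per_board} chips per board")
print(f"{patterns_per_board / 1e6:.2f} million patterns per board")
print(f"range: {min_patterns / 1e6:.1f} to {max_patterns / 1e6:.1f} million")
```

With 80 000 patterns per chip the board holds about 10.2 million patterns, consistent with the "10 million" figure in the text.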

**Figure 2**. Layout of the AMBFTK board, highlighting LAMB position (yellow), input serial links (red), and output serial links (blue).

Figure 2 shows the AMBFTK layout, highlighting the position of one LAMB (in yellow), the 12 input serial links (in red), which are the hit paths from the high speed connector to the LAMBs, and the 16 output serial links (in blue), which are the road paths from the LAMBs to the high speed connector. The serial link data rate is 2 Gbit/s, which corresponds to 20-bit words (16 data bits + 4 redundancy bits) at 100 MHz. Hits can be downloaded through VME into the AMBFTK input FIFOs. The new board prototype posed significant technological challenges owing to the high density of chips populating both sides, the use of advanced packages, and the high frequency serial links.
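The serial link figures can be cross-checked with a few lines of Python. The line rate and word format are taken from the text; the aggregate payload bandwidths are derived quantities that assume all links run simultaneously at the full word rate, which the text does not state explicitly.

```python
WORD_BITS = 20             # 16 data bits + 4 redundancy bits
DATA_BITS = 16
WORD_RATE_HZ = 100_000_000  # 100 MHz word rate
INPUT_LINKS = 12
OUTPUT_LINKS = 16

line_rate_bps = WORD_BITS * WORD_RATE_HZ      # raw rate on each link
payload_rate_bps = DATA_BITS * WORD_RATE_HZ   # useful data on each link

# Derived, assuming every link is busy at the full word rate.
input_payload_bps = INPUT_LINKS * payload_rate_bps
output_payload_bps = OUTPUT_LINKS * payload_rate_bps

print(f"line rate per link:    {line_rate_bps / 1e9:.1f} Gbit/s")
print(f"payload per link:      {payload_rate_bps / 1e9:.1f} Gbit/s")
print(f"input payload (12):    {input_payload_bps / 1e9:.1f} Gbit/s")
print(f"output payload (16):   {output_payload_bps / 1e9:.1f} Gbit/s")
```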

Finally, it is worth pointing out that the number of AM chips leads to large current consumption. Since each LAMB is powered through a 1.2 V, 25 A voltage regulator, the maximum current drawn by a single AM chip must not exceed 780 mA.
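The per-chip limit is simply the regulator current shared among the 32 AM chips of one LAMB. The check below assumes, as the text implies, that the regulator supplies only those 32 chips and that no additional headroom is reserved; the variable names are ours.

```python
REGULATOR_CURRENT_A = 25.0  # 1.2 V regulator rating per LAMB
CHIPS_PER_LAMB = 32

max_current_per_chip_ma = REGULATOR_CURRENT_A / CHIPS_PER_LAMB * 1000.0
print(f"maximum current per AM chip: {max_current_per_chip_ma:.0f} mA")
```

The result, about 781 mA, agrees with the 780 mA limit quoted above.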

The FPGA firmware has been successfully tested, and the serial links through the high speed connector have been verified up to 2 GHz.

## 3 The AM chip

The AMchip04 represents an intermediate step towards the final AM chip design. It is a test chip with a small area (14 mm<sup>2</sup>), and it contains about one tenth of the patterns forecast for the final chip (8 000 patterns instead of 80 000).

The AMchip04 design provides a big improvement with respect to the existing AMchip03 [6]. A new full-custom AM cell has been specifically designed for the AMchip04, to reduce area and to limit power consumption. The new chip, with extremely high pattern density, also contains new functional elements [7] and will be faster by at least a factor of 2 than the previous version. Table 1 compares the characteristics of different AMchip versions.

The AM structure is illustrated in figure 3. All patterns are compared in parallel with the incoming data (HIT); for each pattern, one flip-flop (FF) per layer stores the layer match; and the AM readout is based on a modified Fischer tree [8].
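The matching logic described above can be modelled in software as follows. This is a behavioural sketch only: the class `AssociativeMemory`, its method names, the toy pattern bank and the threshold are assumptions made for illustration, the parallel comparison is emulated by a loop, and the Fischer-tree readout is replaced by a plain list of matching pattern indices.

```python
from typing import List, Sequence, Tuple

Pattern = Tuple[int, ...]  # one super-strip identifier per layer


class AssociativeMemory:
    """Behavioural model: one match flip-flop per layer for each pattern."""

    def __init__(self, patterns: Sequence[Pattern], min_layers: int) -> None:
        self.patterns = list(patterns)
        self.min_layers = min_layers  # requisite number of matched layers
        self.reset()

    def reset(self) -> None:
        """Clear all match flip-flops (start of a new event)."""
        self.match_ff = [[False] * len(p) for p in self.patterns]

    def hit(self, layer: int, superstrip: int) -> None:
        """Compare one incoming hit with every stored pattern.

        The hardware performs these comparisons in parallel; the loop
        below only emulates that behaviour.
        """
        for pattern, flags in zip(self.patterns, self.match_ff):
            if pattern[layer] == superstrip:
                flags[layer] = True

    def roads(self) -> List[int]:
        """Return the indices of patterns with enough matched layers."""
        return [i for i, flags in enumerate(self.match_ff)
                if sum(flags) >= self.min_layers]


# Toy bank with 4 layers; a road needs at least 3 matched layers.
am = AssociativeMemory([(1, 2, 3, 4), (1, 5, 3, 7), (9, 9, 9, 9)],
                       min_layers=3)
for layer, superstrip in [(0, 1), (1, 2), (2, 3), (3, 7)]:
    am.hit(layer, superstrip)
roads = am.roads()
print(roads)
```

In this toy event the first two patterns each match three of the four layers and are reported as roads, while the third pattern matches none.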

| Vers. | Design approach         | CMOS Tech. | Area               | Patterns | Layers |
|-------|-------------------------|------------|--------------------|----------|--------|
| 1     | Full custom             | 700 nm     |                    | 128      | 6      |
| 2     | FPGA                    | 350 nm     |                    | 128      | 6      |
| 3     | STD cells               | 180 nm     | $100 \text{ mm}^2$ | 5 k      | 6      |
| 4     | STD cells + Full custom | 65 nm      | $14 \text{ mm}^2$  | 8 k      | 8      |

**Table 1**. Comparison between different versions of the AMchip.



**Figure 3**. Structure of the Associative Memory array.



**Figure 4**. Schematic diagram and layout of a single AM layer.

Figure 4 shows a single layer made of 18 single-bit AM cells: 4 NAND-based and 14 NOR-based cells [9]. To save power during comparison operations, we have combined two different match line driving schemes: a current race scheme, and a selective precharge scheme.



**Figure 5**. Test of the AMchip04 prototype: (a) testbed; (b) part of the report with results.

### 3.1 AMchip04 test results

The first silicon prototype of the AMchip04 has been tested and it is fully functional. We found 9 defective chips in a batch of 60 devices, corresponding to a yield equal to 85 %.
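The quoted yield follows from the chip counts given above; the short check below uses only those two numbers, with variable names of our choosing.

```python
TESTED_CHIPS = 60
DEFECTIVE_CHIPS = 9

good_chips = TESTED_CHIPS - DEFECTIVE_CHIPS
yield_fraction = good_chips / TESTED_CHIPS
print(f"{good_chips} good chips out of {TESTED_CHIPS}: "
      f"yield = {yield_fraction:.0%}")
```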

Figure 5(a) shows the test setup. A Xilinx demo board has been used as pattern generator, with the Virtex FPGA configured as a MicroBlaze RISC processor with a Linux kernel. The output analyzer is an Agilent 16902B Logic Analysis System.

The test was performed by storing 8 000 different patterns and by applying a sequence of 4 000 different matching patterns at the input. This corresponds to the worst case condition when every input pattern matches. We observed excellent agreement between simulated and measured parameters: at 50 MHz clock frequency, all patterns were matched correctly. Figure 5(b) shows a small portion of the output test report, with prototype output data matching simulated data.

The measured current consumption is 133 mA (rms) at 50 MHz clock frequency: 50 mA for the I/O ring, and 83 mA for the core. The measurements are consistent with circuit simulations.
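The measured current can be compared with the per-chip limit of 780 mA stated in section 2. The sketch below uses only figures from the text; it applies to the test chip at 50 MHz with 8 000 patterns and makes no attempt to extrapolate to the final chip, whose pattern count is about ten times larger.

```python
IO_RING_MA = 50   # measured, I/O ring
CORE_MA = 83      # measured, core
BUDGET_MA = 780   # per-chip limit set by the LAMB regulator (section 2)

total_ma = IO_RING_MA + CORE_MA
margin_ma = BUDGET_MA - total_ma
budget_fraction = total_ma / BUDGET_MA

print(f"total: {total_ma} mA, margin: {margin_ma} mA, "
      f"{budget_fraction:.0%} of the per-chip budget")
```

The prototype therefore uses roughly one sixth of the available current under these test conditions.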

## 4 Conclusion

The design of the first prototype of the Processing Unit for the FTK processor had to face the most challenging aspects of this technology: a huge number of detector clusters ("hits") distributed at high rate with large fan-out to all patterns (10 million patterns will be located on 128 chips placed on a single board) and the large number of roads collected and sent back to the FTK post-pattern-recognition functions.

The network of high speed serial links used to solve the data distribution problem has been experimentally verified.

The AMchip04 prototypes were successfully tested, and their performance and current consumption are fully compatible with the AMBFTK. However, a further reduction in current consumption will be mandatory in the design of the final AM chip.

## References

- [1] A. Andreani et al., *The FastTracker real time processor and its impact on muon isolation, tau and b-jet online selections at ATLAS*, *IEEE Trans. Nucl. Sci.* **59** (2012) 348.
- [2] ATLAS collaboration, *The ATLAS experiment at the CERN Large Hadron Collider*, 2008 *JINST* **3** S08003.
- [3] M. Dell'Orso and L. Ristori, *VLSI structures for track finding*, *Nucl. Instrum. Meth.* **A 278** (1989) 436.
- [4] A. Andreani et al., *The AMchip04 and the processing unit prototype for the FastTracker*, 2012 *JINST* **7** C08007.
- [5] A. Annovi et al., *Associative memory design for the fast track processor (FTK) at ATLAS*, in Proc. IEEE Nuclear Science Symposium and Medical Imaging Conference, Valencia, Spain (2011), pg. 141.
- [6] A. Annovi et al., *A VLSI processor for fast track finding based on content addressable memories*, *IEEE Trans. Nucl. Sci.* **53** (2006) 2428.
- [7] A. Annovi et al., *A new variable-resolution associative memory for high energy physics*, in Proc. IEEE International Conference on Advancements in Nuclear Instrumentation Measurement Methods and their Applications (ANIMMA), Ghent, Belgium (2011).
- [8] P. Fischer, *First implementation of the MEPHISTO binary readout architecture for strip detectors*, *Nucl. Instrum. Meth.* **A 461** (2001) 499.
- [9] K. Pagiamtzis and A. Sheikholeslami, *Content-addressable memory (CAM) circuits and architectures: a tutorial and survey*, *IEEE J. Solid State Circ.* **41** (2006) 712.