Design of a hardware track finder (Fast Tracker) for the ATLAS trigger

Abstract: The use of tracking information at the trigger level in the LHC Run II period is crucial for the trigger and data acquisition system, and will become even more so as the number of simultaneous collisions occurring at every bunch crossing increases in Run III. The Fast TracKer (FTK) is part of the ATLAS trigger upgrade project; it is a hardware processor that will provide, for every Level-1 accepted event (100 kHz) and within 100 µs, full tracking information for tracks with momentum as low as 1 GeV. By providing fast, extensive access to tracking information, with resolution comparable to the offline reconstruction, FTK will help in the precise detection of primary and secondary vertices, ensuring robust selections and improving the trigger performance.

Keywords: Trigger concepts and systems (hardware and software); Pattern recognition, cluster finding, calibration and fitting methods; Trigger algorithms; Data reduction methods

Introduction
The Large Hadron Collider (LHC) in Run I, using only a fraction of its full potential, was remarkably successful: it delivered the discovery of the Higgs boson [2,3] and strong limits on new physics phenomena. After a shutdown period of almost 2 years, the LHC will be able to provide 13 TeV collisions, almost twice the energy of the previous run, and is expected to deliver an integrated luminosity of 40-60 fb⁻¹ per year, thereby increasing the discovery potential of the experiments. The greater instantaneous luminosity will raise the average number of simultaneous collisions per bunch crossing (pileup) to as many as 80. To achieve the required online data reduction in the trigger and data acquisition system (TDAQ), the LHC experiments are expected to make greater use of silicon detector information, reconstructing track trajectories close to the interaction point and thereby distinguishing the contributions of the individual pileup collisions. For this reason, the ATLAS experiment [1] has decided to include, within the existing multilevel trigger architecture, an electronic system designed to perform full track reconstruction from the hits observed in the Inner Detector (ID). The Fast TracKer (FTK) [4] processor will receive the ID data for each event accepted by Level-1, at up to 100 kHz, and will reconstruct charged-particle tracks with p_T > 1 GeV within the full tracker acceptance. This is expected to be of great importance for the high level trigger (HLT) computing farm: it will free computing resources and make the HLT more efficient on event topologies that are difficult to identify, while maintaining a large rejection of the backgrounds.

FTK challenges
For every event passing the Level-1 trigger, FTK performs a hardware-based track reconstruction using hit information from all channels of the ATLAS silicon detectors. The resulting tracks are sent to the HLT to be used in the software algorithms. To cope with event rates of up to 100 kHz, the tracking performed by FTK has to be several orders of magnitude faster than offline tracking. Hence, the data processing is parallelised as much as possible. The signals from the detector volume are split into 64 regions, so-called towers, which are processed independently. Further, the data volume is reduced as much as possible by a custom clustering algorithm that defines the "hits" used in later stages instead of the full pixel/strip information. In addition, the hit information is re-binned into coarse-resolution "superstrips" whenever appropriate. FTK performs the tracking in two steps. First, track candidates are identified by comparing the fired superstrips to predefined trajectories stored in memory. Each such "pattern" is a list of superstrips describing the trajectory of a simulated particle as it traverses the detector layers. These coarse-resolution track candidates (roads) seed a high-resolution track fit performed by FPGAs. Considering only the hits within a road significantly reduces the combinatorics and hence makes the fit itself much faster. The pattern matching is implemented in a custom associative memory (AM) chip designed to perform it at very high speed: the incoming data are compared simultaneously to all stored patterns. The parameters of the pattern matching can be adjusted. Narrow roads permit fast track fitting but require many patterns to be stored and searched in the AM. Wide roads, on the other hand, require fewer stored patterns, but the increased combinatorics within the matched roads slows down the track fitting.
This trade-off is optimised by implementing variable-resolution roads via ternary bits in the AM logic [5]. Furthermore, the number of matching layers is programmable.
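The pattern matching with ternary bits can be illustrated with a minimal software sketch. It is not the AM chip logic: the names `TernaryWord` and `match_roads`, the 4-bit superstrip IDs, and the sequential loop over patterns are assumptions made for illustration only (the hardware compares all patterns in parallel). A "don't care" bit widens a pattern to cover two neighbouring superstrips, and a road fires when at least `min_layers` layers match.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TernaryWord:
    """One layer of a stored pattern (hypothetical encoding)."""
    value: int    # superstrip ID bits
    dc_mask: int  # bits set to 1 are "don't care"

    def matches(self, ss_id: int) -> bool:
        care = ~self.dc_mask  # compare only the bits that are not "don't care"
        return (ss_id & care) == (self.value & care)


def match_roads(patterns, fired, min_layers):
    """Return the indices of patterns matched in at least min_layers layers.

    patterns: list of tuples of TernaryWord, one word per layer
    fired:    list of sets of fired superstrip IDs, one set per layer
    """
    roads = []
    for pid, pattern in enumerate(patterns):
        n_matched = sum(
            any(word.matches(ss) for ss in fired[layer])
            for layer, word in enumerate(pattern)
        )
        if n_matched >= min_layers:
            roads.append(pid)
    return roads


# Toy example with 8 layers: pattern 0 accepts superstrip 0b1010 or 0b1011
# in every layer (lowest bit is "don't care"); pattern 1 is fully specified.
patterns = [
    tuple(TernaryWord(0b1010, 0b0001) for _ in range(8)),
    tuple(TernaryWord(0b0100, 0b0000) for _ in range(8)),
]
fired = [{0b1011}] * 7 + [set()]  # the last layer has no hit

roads_one_missing = match_roads(patterns, fired, min_layers=7)
roads_all_layers = match_roads(patterns, fired, min_layers=8)
```

With one missing layer allowed, only pattern 0 fires; requiring all 8 layers yields no road. Setting more "don't care" bits makes a road wider, which trades fewer stored patterns for more hit combinations in the subsequent fit.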
The evaluation of the track parameters has been reduced to a set of scalar products. The same fitting formula, with different coefficients and numbers of coordinates, is used in the 8-layer stage (coarse pattern matching and fitting) as well as in the second-stage 12-layer fit. A missing layer is allowed in both stages.
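A fit built from scalar products can be sketched as follows, assuming a linear approximation in which each track parameter and each χ² component is a dot product of the hit coordinates with precomputed constants plus an offset. The function name `linear_fit`, the coefficient names `C`, `q`, `A`, `k`, and the toy numbers are illustrative assumptions and are not taken from the FTK firmware.

```python
def dot(a, b):
    """Scalar product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def linear_fit(coords, C, q, A, k):
    """Evaluate track parameters and chi2 using scalar products only.

    coords: hit coordinates of one candidate (one value per coordinate)
    C, q:   one row of coefficients and one offset per track parameter
    A, k:   one row of coefficients and one offset per chi2 component
    """
    params = [dot(row, coords) + q_i for row, q_i in zip(C, q)]
    chi2 = sum((dot(row, coords) + k_i) ** 2 for row, k_i in zip(A, k))
    return params, chi2


# Toy example with 3 coordinates, 2 parameters and 1 constraint.
coords = [1.0, 2.0, 3.0]
C = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
q = [0.5, -1.0]
A = [[1.0, -2.0, 1.0]]  # vanishes for equally spaced coordinates
k = [0.0]

params, chi2 = linear_fit(coords, C, q, A, k)
```

In such a scheme, moving from the 8-layer to the 12-layer fit only changes the length of `coords` and the constants; the arithmetic stays the same, which is what makes it well suited to FPGAs.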

The FTK hardware processing chain
All the functionalities described in the previous section are implemented in dedicated electronic boards and cards, designed to the VME and ATCA standards. The final system will have about 8000 AM chips and 2000 FPGAs, from different vendors and of different models. This huge computing power will be distributed over 32 data formatter (DF) boards, 128 associative memory boards-serial link processors (AMBSLP or AMB) with their auxiliary cards (AUX), 32 second stage boards (SSB), and 2 FTK to Level-2 interface cards (FLIC). Each board type is described in the following sub-sections.

Clustering reconstruction and data formatting
A complete sketch of the FTK system can be seen in figure 1. The entry point of the system is composed of 32 ATCA boards called Data Formatters (DFs, see figure 2) [6]. Each DF receives data from the ATLAS inner detector read-out drivers (RODs) through up to 4 daughter cards, the FTK input mezzanines (FTK_IM). Each FTK_IM (see figure 2b) receives up to 4 fibers, each with a data bandwidth of 2 Gbps. In total, the FTK_IMs will receive 380 links, equivalent to about 750 Gbps of raw ID data. The goal of the FTK_IM is to find clusters in the incoming data, achieving a major data reduction while at the same time improving the track reconstruction precision [7].
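The data reduction obtained by clustering can be illustrated with a deliberately simplified one-dimensional sketch. It is not the FTK_IM algorithm: the function `cluster_strips` and the choice of an unweighted centroid are assumptions for illustration. Adjacent fired strips are merged into one cluster, and each cluster is reported as a centroid with sub-strip precision plus a width.

```python
def cluster_strips(fired_strips):
    """Group adjacent fired strip indices into clusters.

    Returns a list of (centroid, width) tuples, one per cluster.
    """
    clusters = []
    current = []
    for strip in sorted(set(fired_strips)):
        # A gap in the strip numbering closes the current cluster.
        if current and strip != current[-1] + 1:
            clusters.append(current)
            current = []
        current.append(strip)
    if current:
        clusters.append(current)
    return [(sum(c) / len(c), len(c)) for c in clusters]


# Six fired strips are reduced to three clusters.
clusters = cluster_strips([10, 11, 12, 40, 41, 97])
```

Here six channel addresses become three hits, and the centroid of the two-strip cluster (40.5) lies between strips, which is the sense in which clustering both reduces the data and improves the position estimate.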
The DF board organizes the incoming clusters geometrically, arranging them in η-φ projective towers with dimensions δφ × δη ∼ 32° × 1.2 and in logical layers to be sent to the core crates. Each DF is expected to provide data to 4 core processors and 1 SSB board, equivalent to 2 FTK towers, and can forward data to other DF boards when clusters belong to towers not served by the current DF. The connection with the processing units uses optical links placed on the RTM module. The connection with other DF boards on the same

Processing Units
The central part of the FTK pipeline is the pair of AM and AUX cards, collectively called a Processing Unit (figure 3). The two boards perform the pattern matching and the first-stage fit, which are the most computationally intensive steps of the pipeline. Data are received by the AUX card through SFP connectors, already organized by layer. Two main functions are implemented on the card: the data organizer (DO) and the track fitter (TF). The DO is a smart database that organizes all clusters according to a coarse-resolution position identifier, the superstrip (SS). The SSs are sent to the AM board (also called AMBSLP) for the pattern matching. The TF receives the list of found roads from the AMBSLP, as well as the clusters associated with them from the DF. According to the SS content of each road, the clusters are retrieved by the DO; the packet of hits belonging to each road is then sent to the TF. The TF builds all combinations of clusters in a road, with 1 cluster per layer, evaluates the χ² of each, and sends all good candidate tracks to the next board, the SSB. The AUX computation is distributed over 6 identical Altera Arria V FPGAs: 4 are devoted to the track fitter, while the other 2 control the input and the output, respectively.
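The interplay of the DO and the TF can be sketched in software as a lookup keyed by superstrip followed by a loop over hit combinations. The class `DataOrganizer`, the function `fit_road`, the toy χ² function and the cut value are hypothetical and only illustrate the data flow; missing-layer handling is omitted for brevity.

```python
from collections import defaultdict
from itertools import product


class DataOrganizer:
    """Stores clusters indexed by (layer, superstrip ID)."""

    def __init__(self):
        self._by_ss = defaultdict(list)

    def store(self, layer, ss_id, cluster):
        self._by_ss[(layer, ss_id)].append(cluster)

    def fetch_road(self, road_ss_ids):
        """Return, for each layer, the clusters in the road's superstrip."""
        return [
            self._by_ss.get((layer, ss_id), [])
            for layer, ss_id in enumerate(road_ss_ids)
        ]


def fit_road(hits_per_layer, chi2_of, chi2_max):
    """Try every combination with one cluster per layer; keep the good ones."""
    good = []
    for combo in product(*hits_per_layer):
        chi2 = chi2_of(combo)
        if chi2 < chi2_max:
            good.append((combo, chi2))
    return good


# Toy example with 3 layers; clusters are plain coordinates.
organizer = DataOrganizer()
organizer.store(0, 5, 1.0)
organizer.store(1, 7, 2.0)
organizer.store(1, 7, 2.6)  # second cluster in the same superstrip
organizer.store(2, 9, 3.0)

hits = organizer.fetch_road([5, 7, 9])


def toy_chi2(c):
    # Vanishes when the three coordinates are equally spaced.
    return (c[0] - 2 * c[1] + c[2]) ** 2


candidates = fit_road(hits, toy_chi2, chi2_max=1.0)
```

The road contains 1 × 2 × 1 = 2 combinations, of which only one passes the cut. The number of combinations grows as the product of the cluster multiplicities per layer, which is why narrower roads make the fit faster.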
The AM board receives the SSs from the AUX and sends them to the AM chips. Internally, data are replicated to reach all the chips in the same clock cycle. The board is controlled by 4 FPGAs: 2 Xilinx Artix 7 FPGAs, which control the input and output logic, 1 Spartan 6 FPGA controlling the VME interface, and 1 Spartan 6 FPGA controlling the state of the board. The pattern matching is performed by 64 AM chips installed on 4 LAMBs.
Summarizing the data throughput: the AUX receives data at 6.4 Gbps from the DF and has a 6.4 Gbps data channel toward the SSB. The AUX and the AMBSLP are connected through P3, where high-speed serial links guarantee 12 Gbps from the AUX to the AMBSLP and 16 Gbps from the AMBSLP to the AUX.

Second Stage Board and interface to HLT
The track candidates coming from the AUX card do not exploit the full precision of the ATLAS ID because they do not use some of the layers. The SSB (shown in figure 4) improves the helix parameter resolution of the 8-layer track fits by performing 12-layer fits, and removes duplicate tracks. Each SSB receives the output of 4 AUX cards and, from the DF system, the stereo SCT hits and IBL hits for the 2 η-φ towers associated with those AUX cards. The maximum data rate expected into the SSB is 6.4 Gbps per AUX, for a total of about 25 Gbps. The board also shares track data with other SSBs for overlap removal and merges FTK data within a core crate for output to the FTK Level-2 Interface via two fiber-optic connections of 3 Gbps each.
The primary functions of the SSB are:
• Extrapolator, which uses 8-layer track information to compute the likely positions of hits in the other 4 layers for use in 12-layer track fitting.
• Track Fitter (TF), which determines the best-fit helix parameters from hits in roads using 12 silicon layers.
• Hit Warrior (HW), which removes duplicate tracks based on a requisite number of common hits and on χ².
In the current design, these functions are implemented in firmware loaded into separate Xilinx Virtex 7 FPGAs on each SSB.
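The duplicate removal performed by the Hit Warrior can be sketched as a greedy selection, assuming that among tracks sharing at least a given number of hits the one with the lowest χ² is kept. The function `hit_warrior`, the threshold `min_shared`, and the 4-layer toy tracks are illustrative assumptions and do not describe the actual firmware.

```python
def hit_warrior(tracks, min_shared):
    """Remove duplicate tracks, keeping the lowest-chi2 track of each group.

    tracks:     list of (hits, chi2); hits holds one hit ID per layer,
                or None for a layer without a hit
    min_shared: number of common hits above which two tracks are duplicates
    """
    kept = []
    # Visit tracks from best to worst chi2 so the best one is kept first.
    for hits, chi2 in sorted(tracks, key=lambda t: t[1]):
        is_duplicate = any(
            sum(a is not None and a == b for a, b in zip(hits, kept_hits))
            >= min_shared
            for kept_hits, _ in kept
        )
        if not is_duplicate:
            kept.append((hits, chi2))
    return kept


# Toy example with 4 layers: the first two tracks share 3 hits.
tracks = [
    ((1, 2, 3, 4), 2.0),
    ((1, 2, 3, 9), 0.5),
    ((5, 6, 7, 8), 1.0),
]
survivors = hit_warrior(tracks, min_shared=3)
```

The track with χ² = 2.0 is dropped because it shares three hits with a better track, while the unrelated track survives.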
Since the FTK towers have a generous overlap at their boundaries, a large number of duplicate tracks are found in these areas. To perform a cross-tower duplicate removal, the output tracks are first moved to the SSB of the next tower and then sent to the final board of the system, the FLIC.
The goal of the FLIC is to collect the reconstructed track information from the SSBs, reduce the data volume, and convert it into a format compatible with the HLT software. The FLIC system has a maximum total input and output bandwidth of 32 Gbps.

Stage of integration and commissioning schedule
The FTK system is at a very advanced stage of development. All boards have either received the green light for production or are already in production. The key component of the pattern matching, the AMChip06, has been submitted for production, and the first few thousand chips are expected to be released in November 2015. A first full slice test is being carried out, exercising the whole pipeline and the firmware of each board. The full slice is expected to be fully functional by the end of 2015. At the same time, the installation of a few boards in USA15, the ATLAS underground counting room, has started (1 DF and 4 IMs plus the FLIC). The DF is connected to the ATLAS data-taking system and the IMs received their first real data in August. A fully working system, able to reconstruct tracks in the whole barrel region, is expected early in 2016, with full inner detector coverage by the summer of 2016. The commissioning of the complete system (128 AM boards) is expected in 2018, but it can be brought forward if the luminosity provided by the LHC increases faster than expected.

Conclusions
The ATLAS FTK processor will be able to provide high-quality tracks to the HLT algorithms at the full Level-1 rate. Thanks to this information, the HLT will be more efficient in collecting events with τ leptons or b-jets [4]. The hardware is ready for production and a first complete slice is being tested. The first production of boards is expected to cover the barrel region, with |η| < 1, by spring 2016, and full η coverage by the end of 2016.