Parallel 4-Dimensional Cellular Automaton Track Finder for the CBM Experiment

The CBM experiment (FAIR/GSI, Darmstadt, Germany) will focus on the measurement of rare probes at interaction rates of up to 10 MHz with a data flow of up to 1 TB/s. It requires a novel read-out and data-acquisition concept with self-triggered electronics and free-streaming data. In this case resolving different collisions is not a trivial task, and event building must be performed in software online. That requires full online event reconstruction and selection not only in space, but also in time, so-called 4D event building and selection. This is the task of the First-Level Event Selection (FLES). The FLES reconstruction and selection package consists of several modules: track finding, track fitting, short-lived particle finding, event building and event selection. The Cellular Automaton (CA) track finder algorithm was adapted to time-slice-based reconstruction and included in the CBMROOT framework. In this article, we describe the modifications made to the algorithm, as well as the performance of the developed time-based approach.


Introduction
The goal of the CBM experiment research program [1] is to explore the QCD phase diagram in the region of high baryon densities using high-energy nucleus-nucleus collisions. The CBM detector is designed to measure rare diagnostic probes such as multi-strange hyperons, charmed particles and vector mesons decaying into lepton pairs with unprecedented precision and statistics. In order to perform measurements with the required precision, the experiment is being designed to operate at interaction rates of up to 10 MHz, which puts strong constraints on the data processing (Fig. 1). This requires very fast and radiation-hard detectors, and a novel data readout and analysis concept based on free-streaming front-end electronics and a high-performance computing cluster for online event selection.
Thus, the traditional latency-limited trigger architectures, typical for conventional experiments with a hardware trigger, are inapplicable in the case of CBM. Instead, the experiment will ship and collect time-stamped data into a readout buffer in the form of a time-slice of a certain time length, not in the form of isolated collisions, and deliver it to a large computer farm, where online event reconstruction and selection will be performed. In this case a fraction of the collisions inside a time-slice may overlap in time with each other. The reconstruction algorithms should therefore be modified in order to process data that are not associated with events. The association of the hit information with physical events must be performed in software and requires full online event reconstruction not only in space, but also in time, so-called 4-dimensional track reconstruction. In order to study the problem of event association and to develop proper algorithms, simulations must be performed which go beyond the traditional event-by-event processing available in most experimental simulation frameworks.

Time-slice concept
In order to switch from the traditional event-by-event processing to realistic time-based simulations, the concept of a time-slice should be introduced into the reconstruction chain. The beam at FAIR will constitute a free stream of particles without a bunch structure. Neglecting possible fluctuations in the beam intensity and, thus, a non-constant average collision rate, the CBM collision distribution in time can be described as a Poisson process. A dedicated procedure was introduced to group the Monte-Carlo hits delivered by the event-by-event transport code into time-slices. At first, a group of 100 simulated UrQMD Au+Au minimum bias events at 25 AGeV (GeV per nucleon) was taken. In the future the number of events in a time-slice can be changed easily and should be adjusted to the power of a compute node.
The start time of each collision was obtained with the Poisson distribution, assuming an average interaction rate of 10^7 Hz. A time stamp, assigned to each hit in the main tracking detector, the Silicon Tracking System (STS), was calculated as the sum of the start time of the event in which it was produced and the time shift due to the time of flight from the collision point to the hit position. In order to obtain the hit time measurement in the STS from this absolutely precise time stamp, the time stamp was smeared according to a Gaussian distribution with a σ-value of 5 ns, thus simulating the expected STS time resolution.
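The time-slice building procedure described above can be illustrated with a minimal Python sketch. It is not the CBMROOT implementation: the function name build_time_slice, the input layout (one list of per-hit times of flight per event) and the use of exponentially distributed gaps between consecutive collisions to realize the Poisson process are assumptions made for illustration only.

```python
import random

def build_time_slice(events, rate_hz=1.0e7, sigma_ns=5.0, seed=0):
    """Merge event-by-event hits into one time-ordered time-slice.

    events   : list of events, each a list of hit times of flight in ns
               (hypothetical input layout, one value per STS hit)
    rate_hz  : average interaction rate (10 MHz in the text)
    sigma_ns : assumed STS time resolution (5 ns in the text)
    Returns a list of (event_index, measured_time_ns), sorted by time.
    """
    rng = random.Random(seed)
    mean_gap_ns = 1.0e9 / rate_hz       # 100 ns between collisions at 10 MHz
    event_start = 0.0
    hits = []
    for event_index, tofs in enumerate(events):
        # Poisson process: gaps between collisions are exponential.
        event_start += rng.expovariate(1.0 / mean_gap_ns)
        for tof in tofs:
            true_time = event_start + tof                 # exact time stamp
            measured = rng.gauss(true_time, sigma_ns)     # detector smearing
            hits.append((event_index, measured))
    hits.sort(key=lambda h: h[1])       # free-streaming data are time-ordered
    return hits
```

After sorting, hits from neighbouring events are interleaved in time, which is the overlap the reconstruction has to resolve.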
As a direct outcome of this procedure, the distribution of hit time measurements in the STS detector for a time-slice of 100 minimum bias events at 10 MHz interaction rate is presented in Fig. 2. The distribution clearly shows that at this interaction rate events overlap each other significantly. The association of hits with events is no longer trivial. Space-time correlations must be employed, so that the tracking procedure takes into account not just the three spatial dimensions, but also the fourth dimension, time.

4-D CA track finder performance
The features of the Cellular Automaton (CA) method, namely the intrinsic parallelism and the ability to suppress large combinatorial enumeration, made the algorithm an appropriate solution for the track reconstruction in the Silicon Tracking System of CBM [2]. The CA track finder proposes a solid solution for the combinatorial search optimisation.
The method benefits from a drastic suppression of combinatorial enumeration by introducing a phase of building up short track segments, namely triplets, at an early stage before going into the main search. A triplet is defined as a group of 3 hit measurements on adjacent detector stations, potentially produced by the same particle. After that the method no longer works with the hits but with the created track segments. Neighbor relations are established between the segments according to the track model, and for each segment its possible position on a track is estimated. After this process a set of tree-like connections representing possible track candidates emerges. Then one starts with the segments with the largest position counter and follows the continuous connection tree of neighbors to collect the track segments into track candidates. In the last step one sorts the track candidates according to their length and χ² values and then selects the best tracks among them.
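The steps above (neighbor search, position counter, candidate collection) can be sketched on already-built triplets. This is a toy illustration, not the CBM algorithm: the function name find_track_candidates, the triplet layout and the neighbor rule (two triplets are neighbors if the second starts one station later and shares two hits with the first) are assumptions, and the final χ²-based selection is omitted.

```python
def find_track_candidates(triplets):
    """Collect triplets into track candidates (simplified CA sketch).

    triplets: list of (first_station, (hit0, hit1, hit2)) tuples
              (hypothetical layout; hits are plain identifiers).
    Returns a list of hit-id lists, longest chains first.
    """
    n = len(triplets)
    # 1. Neighbor relations: j continues i if it starts one station
    #    later and shares the two overlapping hits.
    neighbours = [[] for _ in range(n)]
    for i, (station_i, hits_i) in enumerate(triplets):
        for j, (station_j, hits_j) in enumerate(triplets):
            if station_j == station_i + 1 and hits_i[1:] == hits_j[:2]:
                neighbours[i].append(j)
    # 2. Position counter: number of triplets in the longest chain that
    #    starts at this triplet (filled from the last station backwards).
    level = [0] * n
    for i in sorted(range(n), key=lambda k: triplets[k][0], reverse=True):
        level[i] = 1 + max((level[j] for j in neighbours[i]), default=0)
    # 3. Start from the largest counters and follow the neighbors.
    used, candidates = set(), []
    for i in sorted(range(n), key=lambda k: level[k], reverse=True):
        if i in used:
            continue
        chain = [i]
        while True:
            last = chain[-1]
            nxt = [j for j in neighbours[last]
                   if j not in used and level[j] == level[last] - 1]
            if not nxt:
                break
            chain.append(nxt[0])
        used.update(chain)
        hits = list(triplets[chain[0]][1])
        hits += [triplets[k][1][2] for k in chain[1:]]
        candidates.append(hits)
    return candidates
```

Because the counters are known before the chains are followed, the search never enumerates hit combinations again; it only walks the neighbor links between segments.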
In addition to that, the method is intrinsically local with respect to data processing and, thus, can be run in parallel on modern many-core CPU/GPU computer architectures, which is particularly suitable for CBM. The CA track finder has been significantly modified to be both vectorized (using SIMD instructions) and parallelized (between CPU cores). The algorithm now shows strong scalability on many-core systems. A speed-up factor of 10.6 was achieved on an Intel Xeon E7-4860 CPU with 10 hyper-threaded physical cores [3].
Since resolving different physical events in the case of CBM is a non-trivial task, the CA track finder has been further modified to take time information into account. The algorithm was adjusted to take time-slices containing STS hits as input. Each STS hit contains two spatial coordinates x and y, measured at a certain z-position of the detector plane, as well as a time measurement t.
The STS hit time measurement information was used in the CA track finder algorithm to improve the speed and performance. Since the triplets are to be built from three hits potentially produced by the same particle, these hits should correlate not only in space, but also in time. Neglecting the time of flight between stations, the hits belonging to the same track should coincide in time measurement within the detector time precision. All combinations of hits whose time measurements differ from each other by more than the expected time of flight plus the STS time precision should be rejected.
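The time cut described above can be written as a simple predicate applied before a hit combination is accepted as a triplet. The sketch below is illustrative only: the function name hits_compatible_in_time, the window definition and the factor n_sigma are hypothetical tuning choices, since the text states the criterion but not its numerical values.

```python
def hits_compatible_in_time(times_ns, sigma_t_ns=5.0, max_tof_ns=0.0,
                            n_sigma=3.0):
    """Return True if the hits may belong to the same particle.

    times_ns   : measured times of the hits in a candidate combination
    sigma_t_ns : STS time resolution (5 ns in the text)
    max_tof_ns : expected time of flight between the stations involved
                 (0 reproduces the 'neglecting the time of flight' case)
    n_sigma    : hypothetical width of the accepted window in units of
                 the time resolution
    """
    window_ns = max_tof_ns + n_sigma * sigma_t_ns
    # The largest pairwise difference is max - min, so one comparison
    # covers all pairs of hits in the combination.
    return max(times_ns) - min(times_ns) <= window_ns
```

For example, hits measured at 100 ns, 104 ns and 98 ns pass the default 15 ns window, whereas replacing the last one by a hit at 130 ns rejects the combination before any track model is evaluated.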
The time-based CA track finder version was tested to reproduce the results of the event-by-event analysis. The reconstruction efficiency and speed of the event-by-event analysis, as well as those of the 4D CA track finder when reconstructing time-slices produced out of one Au+Au minimum bias event at 25 AGeV, are presented in Tab. 1. For evaluation purposes a reconstructed track is assigned to a generated particle if at least 70% of its hits have been caused by this particle. A generated particle is regarded as found if it has been assigned to at least one reconstructed track. If the particle is found more than once, all additionally reconstructed tracks are regarded as clones. A reconstructed track is called a ghost if it is not assigned to any generated particle according to the 70% criterion. As one can see, including the time and optimizing the 3D CA algorithm towards 4D reconstruction have made it possible to achieve a speed comparable to that of the event-by-event analysis. Moreover, the track reconstruction efficiency has improved after taking into account the STS time measurement, compared to the event-based performance. The efficiency improvement is present even in the extreme case of a 10 MHz interaction rate and can be explained by the presence of slow particles, which create random combinations of hits in the event-based approach. These random combinations are rejected in the case of time-slices due to the hit time measurement cut, thus improving the performance.
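The evaluation criteria above can be expressed compactly. The following sketch is an illustration under stated assumptions, not the evaluation code used for Tab. 1: the function name classify_tracks and the input layout (one Monte-Carlo particle id per hit, None for a fake hit) are hypothetical, tracks are assumed to be non-empty, and the efficiency is taken as the fraction of generated particles that are found.

```python
from collections import Counter

def classify_tracks(reco_tracks, n_generated, purity=0.7):
    """Count found particles, clones and ghosts.

    reco_tracks : list of tracks, each a non-empty list with the id of
                  the generated particle that caused each hit
                  (None marks a fake hit; hypothetical layout)
    n_generated : number of generated particles in the reference set
    purity      : minimal fraction of hits from one particle (70%)
    """
    matches = Counter()      # particle id -> number of assigned tracks
    ghosts = 0
    for hit_ids in reco_tracks:
        particle, n_hits = Counter(hit_ids).most_common(1)[0]
        if particle is not None and n_hits >= purity * len(hit_ids):
            matches[particle] += 1
        else:
            ghosts += 1      # not assigned to any generated particle
    found = len(matches)
    # Every track beyond the first one for a particle is a clone.
    clones = sum(n - 1 for n in matches.values())
    return {"efficiency": found / n_generated,
            "clones": clones,
            "ghosts": ghosts}
```

The function returns counts for clones and ghosts rather than rates, because the normalization of these rates is not specified in the text.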
The algorithm was included into the CBMROOT framework [4]. The simulation of detector response in the framework provides a time measurement, taking into account the anticipated behavior of the detector. For instance, a fake hit in this case is produced not only for the strips, which were accidentally fired simultaneously within a single event, but within a certain time interval. Thus, it puts the track finder in a more challenging condition.
The performance of the algorithm as included in the CBMROOT framework is presented in the last column of Tab. 1. It is comparable to the case of event-based analysis. The slightly higher clone level may be explained by the time measurement cut, which may be too strict for this case and may require additional tuning.
Residuals of the track parameters are determined as the difference between the reconstructed parameters and their true Monte-Carlo values. The normalized residuals (pulls) are determined as the residuals normalized by the estimated errors of the track parameters. In the ideal case these should be unbiased and Gaussian distributed with a width of 1.0. Thus the pull distributions provide a measure of the track fit quality.
The residuals and the pulls for all track parameters are calculated at the first hit of each track. The distributions of the residuals and the pulls for all track parameters in the CBM experiment, together with their Gaussian fits, are shown in Fig. 3. All distributions are unbiased, with pull widths close to 1.0, indicating the correctness of the fitting procedure.
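The residual and pull definitions above can be checked on toy data. The sketch below is a minimal illustration: the function name residual_and_pull and the toy numbers in the usage note are assumptions, and the sample mean and standard deviation stand in for the Gaussian fits of Fig. 3.

```python
import random
import statistics

def residual_and_pull(reconstructed, true_value, estimated_error):
    """Residual and pull of one track parameter.

    residual = reconstructed - true Monte-Carlo value
    pull     = residual / error estimated by the track fit
    """
    residual = reconstructed - true_value
    return residual, residual / estimated_error

def pull_summary(pulls):
    """Mean and width of a pull sample (ideal case: 0.0 and 1.0)."""
    return statistics.fmean(pulls), statistics.stdev(pulls)
```

As a usage example, smearing a true value of 0 with a Gaussian of width 0.2 and passing 0.2 as the estimated error gives a pull sample with a mean near 0 and a width near 1. Passing an underestimated error of 0.1 instead doubles the pull width, which is how an incorrect error estimate shows up in distributions of this kind.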

Conclusion
The First-Level Event Selection (FLES) package for the CBM experiment contains all reconstruction stages: track finding, track fitting, short-lived particle finding, event building and event selection. The Cellular Automaton track finder is used for the most time-consuming part of the reconstruction procedure. In order to process large time-slices, the CA track finder was vectorized using SIMD instructions and parallelized between CPU cores. The CA algorithm was further adapted to time-slice-based reconstruction, which is a requirement for event building in the case of CBM. The 4D CA track finder algorithm is now able to achieve an efficiency and timing performance comparable with the event-by-event analysis even at the extreme interaction rate of 10 MHz. Thus, the FLES package fulfills the experimental requirements and is ready for time-slice-based 4D reconstruction in the CBM experiment.