System design and prototyping of the CMS Level-1 Trigger at the High-Luminosity LHC

For the High-Luminosity LHC (HL-LHC) era, the trigger and data acquisition system of the CMS experiment will be entirely replaced. The HL-LHC CMS Level-1 Trigger system will consist of approximately 200 ATCA boards featuring Xilinx UltraScale+ FPGAs connected by 25 Gb/s optical links. These boards will process over 60 Tb/s of detector data within 9.5 μs of the collision to select up to 750 kHz of events for readout. In this paper, we summarise the current status of hardware tests, our progress on system integration tests, and the online software designed to control and monitor these boards.


System architecture
All of the CMS experiment's detector systems [1,2] will be completely replaced or significantly upgraded for the High-Luminosity LHC (HL-LHC) era, in order to cope with the increased luminosity of 7.5 × 10³⁴ cm⁻² s⁻¹. The Level-1 (L1) trigger selects which events will be read out for further selection by the High-Level Trigger. Despite the higher pile-up, the L1 trigger will need to have an extended physics acceptance, making decisions within 9.5 μs and accepting up to 750 kHz of events. It will receive data from the silicon tracker for the first time, in addition to higher granularity information from calorimeters and muon detectors.
The HL-LHC L1 trigger [3] will be implemented with Advanced Telecommunications Computing Architecture (ATCA) boards featuring UltraScale+ FPGAs and high-speed optical engines, connected by point-to-point links. It will be organised into several subsystems, as shown in figure 1. The Calorimeter, Muon and Global Track Trigger subsystems implement independent particle reconstruction and selection algorithms using data from separate detectors. The correlator combines information from all detectors using particle-flow algorithms, reconstructing electrons, photons, tau leptons, jets and energy sums with optimal performance. The Global Trigger (GT) applies a menu of up to approximately 1700 trigger paths to decide whether to accept an event for readout. Each path consists of kinematic and quality cuts applied to particles reconstructed by the upstream subsystems, or a neural network whose inputs are these particle candidates. The trigger decision is then sent from the GT to the Trigger Control and Distribution System (TCDS).
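As a purely illustrative sketch of what such a trigger path does, the Python snippet below shows a hypothetical di-muon path; the cut values, quality encoding and function names are examples invented for illustration, not taken from the actual GT menu.

```python
# Hypothetical sketch of a Global Trigger path: kinematic and quality cuts
# applied to particle candidates reconstructed by the upstream subsystems.
# Thresholds and the quality encoding are illustrative only.
from dataclasses import dataclass

@dataclass
class Candidate:
    pt: float      # transverse momentum [GeV]
    eta: float     # pseudorapidity
    quality: int   # quality flag from the upstream reconstruction

def double_muon_path(muons: list[Candidate]) -> bool:
    """Fire if at least two muons pass example pT, |eta| and quality cuts."""
    passing = [m for m in muons
               if m.pt > 15.0 and abs(m.eta) < 2.4 and m.quality >= 8]
    return len(passing) >= 2

def evaluate_menu(paths, muons) -> bool:
    """The event is accepted for readout if any path in the menu fires."""
    return any(path(muons) for path in paths)
```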
In contrast to the other subsystems, the scouting subsystem does not contribute to the experiment's readout decision. It receives a copy of all particle candidates that are sent to the GT, as well as those sent within the correlator. The scouting boards reorganise and compress this data before forwarding it to servers, which analyse the data and store derived results. Although the efficiencies and resolutions from the trigger algorithms are not as good as those from offline reconstruction, certain analyses will benefit from the reduced statistical uncertainty that results from processing every single bunch crossing. Notably, L1 scouting will uniquely allow for multi-bunch-crossing analyses and higher-statistics real-time diagnostics. The scouting input data from a few percent of the bunch crossings will also be stored, to support the development of new analyses.
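As a rough illustration of this statistical gain (an estimate that assumes, purely for the sake of the example, a process that scouting records at the full bunch-crossing rate of about 40 MHz but that would otherwise be limited by the 750 kHz readout rate), the relative statistical uncertainty of a simple counting measurement improves by up to

$$ \sqrt{\frac{40\ \mathrm{MHz}}{750\ \mathrm{kHz}}} \approx 7. $$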
Figure 1. The architecture of the HL-LHC CMS L1 trigger system. Boxes represent the L1 trigger subsystems and upstream systems, with lines indicating optical links between these components. The first row of boxes (labelled TP) represents the trigger primitive generators of the calorimeters (BC, HF, HGCAL), muon detectors (DT, RPC, CSC, GEM, iRPC) and silicon tracker (TF). The second row (labelled Local) shows subsystems performing additional regional reconstruction of data from the barrel calorimeter (BCT) and the muon detectors (Barrel Layer-1, OMTF, EMTF). The third row (labelled Global) shows subsystems that perform global reconstruction from independent detector systems. The fourth and fifth rows consist of the correlator (labelled Particle Flow Layer 1/2) and Global Trigger subsystems respectively. Reproduced from [3]. CC BY 4.0.

Hardware
The trigger algorithms will be implemented in the four generic data-processing ATCA boards shown in figure 2. Each of these boards features a VU13P FPGA (A2577 package) which processes the trigger data, with up to 124 optical inputs and up to 124 optical outputs, each running at 25 Gb/s and provided by either a combination of x4 and x12 Samtec FireFly parts [4] or QSFP (Quad Small Form-factor Pluggable) modules. In case of errors on the optical links, trigger data cannot be retransmitted between boards due to latency and FPGA resource constraints, and as a result the bit error rate (BER) on the optical links must be less than 10⁻¹². As such, the optical link performance has been extensively measured on all boards, including evaluation of the new x12 25G Samtec FireFly parts. These tests included: long-duration tests under nominal operating conditions; measuring the BER with optical attenuators, to verify that their behaviour as a function of Optical Modulation Amplitude (OMA) meets specifications; and measuring the BER as a function of the settings of the optical module's electrical interface.
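To put this limit in context (an illustrative back-of-the-envelope calculation, not a figure from the original specification), a BER at the 10⁻¹² limit on a 25 Gb/s link still corresponds to

$$ R_\mathrm{err} = \mathrm{BER} \times f_\mathrm{bit} = 10^{-12} \times 25 \times 10^{9}\ \mathrm{bit/s} = 0.025\ \mathrm{s^{-1}}, $$

i.e. roughly one bit error every 40 seconds per link, which is why the measured error rates are pushed well below this bound in the tests described here.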
There is a clear separation between the firmware that implements the trigger algorithm and the infrastructure firmware on each trigger board. This separation avoids code duplication where different algorithms are run on the same board in different parts of the system. The algorithm-independent infrastructure firmware and the associated software are delivered by the board development teams, and include components for clocking, slow control, fast control and feedback, capture and injection of test data, low-latency transfer of trigger data, and readout to the CMS DAQ boards.
Boards are synchronised, and debug data read out into the CMS event record, through the CMS DAQ and Timing Hub board (DTH-400) [5]. There is one DTH-400 in each ATCA crate, located in one of the hub slots. The non-hub boards in each crate are synchronised via 10 Gb/s point-to-point backplane links between the DTH-400 and each non-hub board; the DTH-400s in different crates are synchronised via optical links with the TCDS. Debug data for the CMS event record is sent to the DTH-400 over optical links using a custom protocol; the DTH-400 merges event fragments from multiple boards and sends these larger event fragments to the DAQ server farm using TCP/IP. In order to provide the buffer memory required for four 100 Gb/s TCP/IP streams, the DTH-400 features two VU35P FPGAs, each containing 4 GB of high-bandwidth memory. The scouting system is implemented with DAQ-800 boards, a readout-only sibling of the DTH-400, since TCP/IP is also used to stream L1 candidates from the scouting system to COTS servers.
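The sketch below gives a purely conceptual picture of this fragment merging; the event identifiers, fragment format and function shown are illustrative assumptions, not the actual DTH-400 firmware interface.

```python
# Conceptual sketch only: per-board event fragments are grouped by an event
# identifier and concatenated into larger fragments before being streamed to
# the DAQ farm. All names and formats here are hypothetical.
from collections import defaultdict

def merge_fragments(fragments):
    """Merge fragments from several boards into one larger fragment per event.

    `fragments` is an iterable of (event_id, board_id, payload) tuples.
    Returns a dict mapping event_id -> concatenated payload from all boards.
    """
    merged = defaultdict(bytearray)
    for event_id, board_id, payload in fragments:
        merged[event_id] += payload  # ordering and headers omitted for brevity
    return {event_id: bytes(buffer) for event_id, buffer in merged.items()}
```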
The final design of each board hosts a Xilinx Kria System-on-Module mezzanine, featuring a Zynq UltraScale+ MPSoC which runs a Linux operating system. This on-board computer handles control, monitoring and management tasks, including acting as the interface to off-board control and monitoring software. The Intelligent Platform Management Controller (IPMC), the shelf-management interface required of every board by the ATCA standard, is implemented either on a dedicated mezzanine card or in the Zynq.
Prototypes of these ATCA boards have undergone extensive tests, notably: signal integrity tests, verifying sufficient margin at FPGA transceivers through the built-in eye scan functionality, and demonstrating BERs of less than 10⁻¹⁴; power dissipation tests, verifying that the temperatures of FPGAs and optical modules remain below 100 °C and 50 °C respectively at the maximum expected load of 200 W; and functional tests of all components and interfaces, by operating slices of the final system in integration facilities. Given the positive results from these tests, the final production runs of boards will start in the next year.

System integration
In the final deployment, approximately 200 boards of 6 designs, implementing 20 different algorithms, will need to function as a single coherent system. Over the last few years significant progress has been made in demonstrating that we can operate vertical slices of this system. Having a clear boundary between algorithm and infrastructure firmware, along with capture and playback buffers at the I/O interfaces, has been essential to this progress, since it has allowed us to factorise activities: the development of data transmission over optical links and of each algorithm could proceed in parallel, with these elements only combined once each component passed its tests individually. The main integration facility is at CERN (see figure 3), with some tests also performed at other sites around the globe.
As mentioned in section 2, the protocol adopted for transferring LHC-synchronous trigger candidates over optical links between the boards cannot rely upon retransmission in case of errors, due to latency and FPGA resource constraints. In the R&D phase of the project, multiple custom link protocols were developed [3,6] to evaluate different mechanisms for ensuring that transmission errors are detected, and for minimising the duration of any resulting corruption of the trigger data stream. Merging together features from these previous protocols, a single common low-latency asynchronous link protocol has been defined. Implementations of this protocol have been extensively tested to verify their compatibility and robustness against errors. This included operating links under nominal conditions for many tens of thousands of link-hours, the injection of well-defined errors in firmware, and the induction of random errors on the link.
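The following is a highly simplified sketch of the general detect-and-flag idea on a fixed-latency link: each frame carries a checksum, and a failed check marks that frame's payload invalid for downstream algorithms instead of triggering retransmission. The frame layout and CRC choice are illustrative assumptions and do not reproduce the actual protocol definition.

```python
# Simplified sketch of error detection without retransmission on a
# fixed-latency link. Frame structure and CRC are illustrative only.
import zlib

def encode_frame(payload: bytes) -> bytes:
    """Append a CRC-32 so the receiver can detect transmission errors."""
    crc = zlib.crc32(payload)
    return payload + crc.to_bytes(4, "little")

def decode_frame(frame: bytes):
    """Return (payload, valid). On error the payload is forwarded but flagged
    invalid, so only the corrupted bunch crossings are lost and the latency
    stays fixed."""
    payload, received_crc = frame[:-4], int.from_bytes(frame[-4:], "little")
    valid = (zlib.crc32(payload) == received_crc)
    return payload, valid
```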
Tests involving algorithm firmware are performed in two stages. Firstly, algorithm firmware is tested in RTL simulation and on a single board. Monte Carlo event data is injected into the algorithm firmware via input buffers, and the outputs are then captured in another set of buffers. A bit-accurate C++ emulator of the algorithm logic is developed alongside the firmware; the output captured in hardware must match the output predicted by the emulator. Notably, the format of the algorithm inputs and outputs in these single-board tests exactly matches that sent to/from the link protocol engines in a multi-board system. Algorithm firmware for multiple boards from each subsystem has been verified in single-board tests, including the most complex algorithms in the correlator subsystem.
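A minimal sketch of this comparison step is shown below; the file format and helper names are hypothetical, standing in for the project's own capture and emulator dump formats.

```python
# Illustrative sketch: the output frames captured from hardware buffers must
# match, bit for bit, the frames predicted by the emulator for the same
# injected events. File layout (one hex word per line) is an assumption.

def load_frames(path):
    """Read one hexadecimal word per line from a capture or emulator dump."""
    with open(path) as f:
        return [int(line, 16) for line in f if line.strip()]

def compare(captured_path, emulated_path):
    captured = load_frames(captured_path)
    emulated = load_frames(emulated_path)
    mismatches = [(i, c, e)
                  for i, (c, e) in enumerate(zip(captured, emulated))
                  if c != e]
    if len(captured) != len(emulated) or mismatches:
        raise AssertionError(
            f"{len(mismatches)} mismatching words; "
            f"lengths {len(captured)} vs {len(emulated)}")
    print("Hardware output is bit-identical to the emulator prediction")
```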
The second stage of algorithm tests consists of multi-board slice tests. These are first performed in 2-board slices to test a specific interface, with some inputs on the downstream board received from links and others injected from local buffers, as shown in figure 3. Over 10 of the approximately 60 board-to-board algorithm interfaces have been tested to date, and some of these 2-board slices have now been chained together to form 3- and 4-board slices. Notably, a slice of prototype Phase-2 electronics has also processed real proton-proton data from a sector of the muon Drift Tube detector at the experiment site. Latency is measured in both single- and multi-board tests; current results indicate a system latency of 8.6 μs, below the target of 9.5 μs.

Online software
The L1 trigger system must be controlled and monitored coherently through software, with high operational efficiency. This software must have reliable and predictable behaviour under all circumstances, and must fulfil the trigger's operational requirements both in its initial commissioning period and for several years afterwards.
The online software for the current L1 trigger system was developed on a compressed timescale, primarily targeting support for the final system at the experiment site. For the HL-LHC upgrade, complex integration tests are already being performed in test stands containing several boards, and so the associated online software is being designed to be equally easy to use for running tests in our integration facility and labs as for operating the final system at the experiment site. It must be designed with sufficient flexibility to support a heterogeneous system of about 20 different board functions implemented on 6 hardware platforms, while minimising the fraction of board-/subsystem-specific code. Its design should also take advantage of new open-source industry-standard tools and libraries, for example by leveraging modern automation tools to streamline software/firmware development and system testing.
The HL-LHC trigger online software architecture is based on the proven framework + plugin approach that is used for controlling and monitoring the current L1 trigger system. Beyond the additional requirements and goals outlined above, a key difference that influenced its design is that in the upgraded system all boards host an on-board computer. As such, board- and subsystem-specific functionality is integrated into the framework directly on the boards. Mature versions of two key components of this design have been developed: the on-board application, HERD, and the off-board supervisor and user interface, Shep.
The common on-board application, HERD, provides a board-/subsystem-agnostic network API, allowing off-board services and user interfaces to invoke configuration procedures and retrieve monitoring data without any prior knowledge of the board/subsystem. This application is based on the SWATCH framework [7] created for the current L1 trigger system, which provides board- and subsystem-independent hardware descriptions, control primitives and monitoring primitives with fine granularity. The HERD executable loads a board-/subsystem-specific plugin, which registers concrete configuration procedures and monitoring data with the framework, associating these with firmware blocks and optical modules through the common abstract board model.
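The Python sketch below illustrates this framework + plugin pattern in schematic form; the class, method and metric names are hypothetical and do not reproduce the actual HERD API.

```python
# Schematic illustration of a board-/subsystem-specific plugin registering its
# capabilities with a common framework. All names here are hypothetical.

class ExampleBoardPlugin:
    """A board-specific plugin exposing its procedures and monitoring data."""

    def register(self, framework):
        # Concrete configuration procedures, exposed to off-board clients
        # through the board-agnostic network API.
        framework.add_command("configure_links", self.configure_links)
        framework.add_command("configure_algo", self.configure_algo)
        # Monitoring data associated with firmware blocks and optical modules.
        framework.add_metric("links.crc_error_count", self.read_crc_errors)
        framework.add_metric("optics.rx_power_dbm", self.read_rx_power)

    def configure_links(self, params):
        ...  # write registers of the link firmware blocks

    def configure_algo(self, params):
        ...  # load algorithm-specific parameters

    def read_crc_errors(self):
        ...  # read per-link error counters from the firmware

    def read_rx_power(self):
        ...  # read optical receive power from the optical modules
```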
The common off-board supervisor, Shep, provides uniform web-based and command-line user interfaces for controlling and monitoring all boards and algorithms. It queries the HERD instances running on the boards, as shown in figure 4, discovering the available configuration procedures and monitoring data at runtime. The command-line interface and backend server are implemented in Python; the frontend is implemented using the Vue.js framework [8]. This software is now routinely used for Serenity-Serenity link tests and slice tests, and has significantly simplified the running of these tests.
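A minimal sketch of this runtime discovery is given below, assuming purely for illustration a REST-like network API; the real HERD/Shep protocol, endpoint names and host names are not shown here.

```python
# Sketch of runtime discovery: an off-board client asks each board's HERD
# instance what it exposes, with no prior knowledge of the board/subsystem.
# Endpoint and host names are hypothetical.
import requests

def discover(board_host: str):
    commands = requests.get(f"http://{board_host}/api/commands").json()
    metrics = requests.get(f"http://{board_host}/api/metrics").json()
    return commands, metrics

for host in ("board-01.example.org", "board-02.example.org"):
    commands, metrics = discover(host)
    print(f"{host}: {len(commands)} commands, {len(metrics)} metrics")
```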

Plugins are under development for other boards. To speed up the algorithm development workflow, we have integrated this software into GitLab CI (Continuous Integration) pipelines, so that after algorithm firmware is pushed to the CERN GitLab repository it can be tested automatically in hardware with a single button click in a web browser. We have also created a plugin test suite with similar GitLab CI integration, allowing us to validate board plugins automatically with one button click. New features in the software itself are typically tested using a 'dummy' plugin that mimics responses from real hardware; this approach to testing significantly speeds up development of the common components, given that the limited number of prototype boards are already in high demand for other tests.
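The sketch below illustrates the dummy-plugin idea: it exposes the same kind of commands and metrics as a real board plugin but fabricates plausible responses, so the common software can be exercised without prototype hardware. The interface and metric names are hypothetical, matching the earlier plugin sketch rather than the real API.

```python
# Hypothetical 'dummy' plugin that mimics hardware responses for testing the
# common software without a prototype board. All names are illustrative.
import random

class DummyBoardPlugin:
    def register(self, framework):
        framework.add_command("configure_links", self.configure_links)
        framework.add_metric("fpga.temperature", self.read_temperature)

    def configure_links(self, params):
        return {"status": "done"}                # pretend configuration succeeded

    def read_temperature(self):
        return 45.0 + random.uniform(-2.0, 2.0)  # plausible value in degrees C
```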

Conclusions
In the CMS HL-LHC upgrades, the current L1 trigger will be replaced by a system of ATCA boards featuring UltraScale+ FPGAs and 25 Gb/s optics, in order to support the higher rate of trigger primitive data from subdetectors, and to increase the physics acceptance of the system in spite of higher pile-up. Prototypes of these boards have undergone extensive testing, and final production runs will start in the next year. Significant progress has been made on integration tests, including rigorous tests of the robustness of the link protocol, single-board tests of the most complex trigger algorithms, and several multi-board slice tests. Advanced development of online software is proving essential in maintaining the pace of these tests as we scale up to larger systems.

Figure 3. Left: integration test crate at CERN. Right: illustration of dataflow in a multi-board slice test.

Figure 4. Left: illustration of how the key ShepHERD components interact in a multi-board test. Right: status of key board components shown in the Shep GUI.