Phase 1 upgrade of the CMS drift tubes read-out system

In order to cope with up to two times the nominal LHC luminosity, the second level of the readout system of the CMS Drift Tubes (DT) electronics needs to be redesigned to minimize event processing time and remove present bottlenecks. The µROS boards are µTCA modules, which include a Xilinx Virtex-7 FPGA and are equipped with up to six 12-channel optical receivers for the 240 Mbps input links. Each board collects the information from up to 72 input links (3 DT sectors), requiring a total of 25 boards. The design of the system and the first validation tests will be described.


Introduction
The Drift Tubes (DT) system is part of the Muon subdetector of the Compact Muon Solenoid (CMS) experiment [1]. The DT readout system is organized in several layers [2]. In each of the 172 200 drift cells, the charge generated by the energy deposition from the passing muons is collected at the anode and shaped as a pulse by the front-end electronics. The ReadOut Board (ROB), based on CERN's HPTDC (High Performance Time-to-Digital Converter), digitizes the timestamps of the pulses' leading edges from 128 of these channels, stores them in memory, and delivers them through a 240 Mbps link when they match the search window determined by the arrival of a Level-1 Accept (L1A) trigger signal. The information from the 1500 ROBs is received by 60 ROS (Read-Out Server) boards. Each ROS receives 25 links (one sector), and performs data merging and quality monitoring. The 5 DDU boards collect data from 12 ROS boards each (one detector wheel) and deliver it to the DAQ system. A schematic view of this chain is shown in figure 1.
The ROS was originally located in the underground experimental cavern (UXC). It was relocated during Long Shutdown 1 (LS1, 2013-2014) to the underground service cavern (USC) [3], in preparation for the upgrade described in this paper. In USC, the electronic systems are more accessible for maintenance, and are subject to looser design constraints, as they are not exposed to radiation or high magnetic field.

Motivation
Simulations show that the ROS processing time is currently the most severe bottleneck in the readout chain [4]. Figure 2 shows a simulation of the maximum sustainable L1A trigger rate as a function of the number of hits in the event, for three different hit distribution scenarios. As can be seen, the ROS design is very efficient in dealing with individual noisy drift chambers, but is more affected by the presence of evenly distributed background noise or multiple muons. The ROS currently performs well, but it will not be able to deliver the performance needed to cope with the expected twofold increase in LHC luminosity in the coming years [5]. Consequently, it was decided to replace the ROS with a new system capable of delivering the performance required for future operation.

System architecture
The CMS experiment has chosen to use the µTCA architecture for the Phase-1 upgrades. The slot intended for a redundant MCH (MicroTCA Carrier Hub) is populated, instead, with the so-called AMC13 board, developed by the Boston University CMS Group [6]. The AMC13 receives timing data from TCDS (Timing and Control Distribution System) and delivers payload data to the DAQ, thus acting as an interface for the subsystem's data processor boards, which populate the AMC (Advanced Mezzanine Card) slots.
For the DT subdetector, both the readout and trigger systems needed to upgrade their second-level electronics. Because of the similar requirements for the two systems, a versatile data receiver and processor board (TM7) was designed to be used in both the trigger (TwinMux) and the readout

JINST 12 C03070
(µROS) second-level electronics upgrade [7]. Different firmware versions implement the required functionality for each system. A picture of the TM7 is shown in figure 3. It is a single-slot, double-width, full-height AMC designed around a Xilinx Virtex-7 FPGA. It includes optical transceivers for slow-speed inputs and high-speed output data transmission up to 13 Gbps. The trigger upgrade (5 crates, 60 TwinMux boards) was already installed during LS1 and is currently in operation. The AMC13 board interfaces directly to the DAQ, and thus the µROS system will replace both the ROS and DDU systems. Each TM7 board has six 12-fibre MTP receivers, for a maximum of 72 inputs, while each DT sector consists of 25 channels. In order to simplify signal distribution and optimize hardware, each wheel's data will be processed by 5 µROS boards: 4 of them receive 3 almost-complete sectors each (24 channels per sector), and the 5th TM7 receives the 25th channel of each of the 12 sectors (performing the so-called µROSv2 role). These boards will be distributed in 3 µTCA crates: one for the central wheel (with 5 TM7s), one for the two negative wheels and one for the two positive wheels (with 10 TM7s each). In total, 25 µROS boards will be necessary for operation.
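The channel distribution within one wheel can be checked with a short sketch. The board and link numbering below is hypothetical (the text does not say which sectors go to which board, nor how links are indexed); only the counts (12 sectors, 25 links per sector, 72 inputs per board, 5 boards per wheel) come from the description above.

```python
# Illustrative sketch of the per-wheel link-to-board mapping.
# Numbering conventions are assumptions made for this example only.
SECTORS_PER_WHEEL = 12
LINKS_PER_SECTOR = 25
BOARDS_PER_WHEEL = 5
WHEELS = 5
MAX_INPUTS_PER_BOARD = 72  # six 12-fibre receivers


def board_for_link(sector, link):
    """Return the board index (0-4) within a wheel for one ROB link.

    sector: 0..11, link: 0..24 (hypothetical zero-based numbering).
    Boards 0-3 take links 0-23 of three consecutive sectors each;
    board 4 takes the 25th link (index 24) of every sector.
    """
    if not (0 <= sector < SECTORS_PER_WHEEL and 0 <= link < LINKS_PER_SECTOR):
        raise ValueError("sector or link out of range")
    if link == LINKS_PER_SECTOR - 1:
        return BOARDS_PER_WHEEL - 1
    return sector // 3


def inputs_per_board():
    """Count how many links each of the 5 boards in a wheel receives."""
    counts = [0] * BOARDS_PER_WHEEL
    for sector in range(SECTORS_PER_WHEEL):
        for link in range(LINKS_PER_SECTOR):
            counts[board_for_link(sector, link)] += 1
    return counts


if __name__ == "__main__":
    counts = inputs_per_board()
    print("inputs per board:", counts)
    print("boards in total:", BOARDS_PER_WHEEL * WHEELS)
    print("links in total:", sum(counts) * WHEELS)
```

Under these assumptions, the four regular boards are filled to their 72-input limit, while the fifth board uses only 12 of its inputs.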
Under Phase-1 conditions, the expected bandwidth need is 160 bit per event per ROB. With the planned architecture, this will translate to 1.1 Gbps per µROS-AMC13 link, well under the 4 Gbps available bandwidth through the backplane. The AMC13's output bandwidth will reach 30 Gbps (24 Gbps available after 8B/10B encoding) with 3 active transceivers, which will be enough to deliver the resulting payload bandwidth of 9.6 Gbps for the positive and negative wheels' crates and 4.8 Gbps for the central wheel's crate.
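The bandwidth figures quoted above can be reproduced with simple arithmetic. The sketch below assumes an L1A rate of 100 kHz, which is not stated in the text; it is used here because it reproduces the quoted numbers. The 10 Gbps line rate per AMC13 transceiver is likewise inferred from the 30 Gbps total over 3 transceivers.

```python
# Back-of-the-envelope bandwidth budget; a sketch, not the authors' calculation.
L1A_RATE_HZ = 100e3            # ASSUMPTION: trigger rate, not given in the text
BITS_PER_EVENT_PER_ROB = 160   # from the text
ROBS_PER_SECTOR = 25
SECTORS_PER_WHEEL = 12


def payload_gbps(n_robs, l1a_rate_hz=L1A_RATE_HZ):
    """Payload bandwidth in Gbps generated by n_robs readout links."""
    return n_robs * BITS_PER_EVENT_PER_ROB * l1a_rate_hz / 1e9


def amc13_usable_gbps(n_transceivers=3, line_rate_gbps=10.0):
    """Usable AMC13 output bandwidth after 8B/10B encoding overhead."""
    return n_transceivers * line_rate_gbps * 8 / 10


if __name__ == "__main__":
    robs_per_wheel = ROBS_PER_SECTOR * SECTORS_PER_WHEEL
    print("fully loaded board (72 links):", payload_gbps(72), "Gbps")
    print("two-wheel crate:", payload_gbps(2 * robs_per_wheel), "Gbps")
    print("central-wheel crate:", payload_gbps(robs_per_wheel), "Gbps")
    print("AMC13 usable output:", amc13_usable_gbps(), "Gbps")
```

With this assumed trigger rate, a fully loaded 72-link board produces roughly 1.15 Gbps, consistent with the quoted 1.1 Gbps per µROS-AMC13 link.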

Hardware
The production of µROS boards (plus spares) and the procurement of the different system components are ongoing and due to be completed by the end of 2016. A development µTCA crate has been installed in the CMS service cavern, equipped with one TM7 spare board from the TwinMux system. An optical fibre splitter has been installed to feed live collision data from the chambers and allow development and validation of firmware (FW) and software (SW) before the year-end technical stop.

Firmware
The firmware development is very well advanced. The blocks for slow control, TTC, and flash reprogramming through Internet Protocol (IP) have been developed for the TwinMux system and thus are already available.
The FW module for deserialization of the ROB links has already been prototyped and tested on the split channels, showing excellent results. The input stream carries data at 240 Mbps using National Semiconductor's DS92LV1021 protocol. The receiver samples data at 1.2 Gbps (5× oversampling), applies a majority filter to the 3 central samples of each bit, and reassembles the original 32-bit data word. A bit error rate test has been conducted on 27 channels over the course of 37 hours, from which we can conclude that, for optimized channels, BER < 10⁻¹³. During this time, the LHC went through various cycles of injection, ramp-up, physics, and beam dump, with their respective clock frequency shifts, and the deserializer firmware stayed locked to the incoming signal throughout, thanks to the dynamic phase adjustment of the sampling.
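The oversampling scheme can be illustrated in software. The following is a minimal behavioural sketch, not the firmware itself: it assumes the sampling phase is already known and fixed (the real design adjusts it dynamically), and it omits word alignment and the reassembly into 32-bit words. The function names are hypothetical.

```python
# Behavioural model of 5x oversampling with a majority vote over the
# 3 central samples of each bit. Phase tracking is NOT modelled.
OVERSAMPLING = 5


def oversample(bits):
    """Repeat every bit OVERSAMPLING times, as an ideal sampler would see it."""
    return [b for b in bits for _ in range(OVERSAMPLING)]


def majority_decode(samples, phase=0):
    """Recover one bit per group of 5 samples.

    samples: sequence of 0/1 values taken at 5x the bit rate.
    phase:   index of the first sample of the first complete bit
             (assumed known here; an assumption of this sketch).
    """
    bits = []
    last_start = len(samples) - OVERSAMPLING
    for start in range(phase, last_start + 1, OVERSAMPLING):
        central = samples[start + 1:start + 4]  # skip first and last sample
        bits.append(1 if sum(central) >= 2 else 0)
    return bits


if __name__ == "__main__":
    sent = [1, 0, 1, 1, 0]
    stream = oversample(sent)
    stream[7] ^= 1  # corrupt one central sample of the second bit
    stream[5] ^= 1  # corrupt an edge sample of the second bit
    print(majority_decode(stream) == sent)
```

Ignoring the first and last sample of each bit makes the decision insensitive to jitter around the bit transitions, while the vote over the remaining three samples tolerates a single corrupted sample per bit.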
The readout link has already been used to transfer real collision data to the AMC13, from where it has been read through IP. With this data, a histogram of pulse leading-edge times as digitized by the ROB over many events (called the DT chamber timebox [8]) has been plotted in figure 4. It is consistent with the expected behaviour, with a central part with more hits that corresponds to the 400 ns drift time of the drift cells, and a baseline background inside the 1250 ns ROB readout window that corresponds to data from other collisions. The merging and event-building block is under development, although no significant challenge is expected. Compared to the current ROS, the ratio of programmable logic cells to channels has risen by a factor of 15, and the Virtex-7 architecture allows its logic resources to be clocked at much higher frequencies than the Spartan-IIe. The required increase in processing performance will not consume all of the available processing power, and thus the µROS processing time will not be the limiting factor of its performance.

Summary
The present CMS DT readout electronic system cannot perform satisfactorily under the increased luminosity projected for the LHC. To solve this, a new system has been designed to replace the current DT readout electronics at USC (ROS, DDU). This new µROS system consists of 25 TM7 boards distributed in 3 µTCA crates. It will receive the 1500 readout links from the ROB boards at UXC, process the data in Virtex-7 FPGAs, and deliver it to the DAQ system through the AMC13 boards.
The procurement of the system components and the manufacturing of the TM7 boards are under way, and the development of firmware and software is already very advanced.