CMS Outer Tracker Phase-2 Upgrade on-module powering

The CMS Tracker Phase 2 Upgrade for the High Luminosity Large Hadron Collider will use two module types, which contain different sensor configurations and custom ASICs. Guaranteeing the power integrity of all modules for the full lifetime of the detector is crucial for the detector performance. This article describes the historical evolution of the powering architecture, the problems encountered, and the solutions implemented for the two types of modules, as well as the final on-module powering strategy along with the data and modelling that motivated it.


The CMS Phase-2 Upgrade Outer Tracker modules
A major upgrade of the Compact Muon Solenoid (CMS) experiment is required to cope with the conditions expected at the High Luminosity Large Hadron Collider [1]. The development of the new silicon modules for the CMS Outer Tracker (OT) aims to deliver almost ten times higher granularity, lower mass (by a factor of two) and higher data rates, as well as local track segments (stubs) at 40 MHz. The rejection of low momentum tracks for the L1 track trigger is performed in the front-end electronics by locally correlating the signals from a pair of sensors. Two module types (2S and PS) will be used in the OT (figure 1). The 2S modules contain a double strip sensor configuration with an active area of (10 × 10) cm², wire-bonded to two front-end hybrids (FEH) [2] that are powered and controlled by a service hybrid (SEH) [3]. The PS modules contain a macro-pixel sensor bump-bonded to 16 readout application-specific integrated circuits (ASICs) and a strip sensor, both with a size of (5 × 10) cm². Wire-bonds connect the macro-pixel ASICs and the strip sensor to two FEHs interconnected with a power hybrid (POH) on one side and with an optical readout hybrid (ROH) on the opposite side.

On-module powering requirement
The two module types of the OT host a variety of ASICs that perform different functions. Excluding those responsible for powering, there are 35 and 20 ASICs powered on the PS and 2S modules, respectively. Together with the VTRx+ (the module responsible for electro-optical translation), which is plugged on both modules, the PS and 2S module active elements consume about 6.5 W and 4 W, respectively, when configured for data taking.
The main requirement for the on-module powering implementation for the two module types is to provide appropriate voltages for all the ASICs. This requirement needs to be met for different consumption conditions, for all the modules to be built (more than 13 000 for the two types combined) and for the detector lifetime of 10 years. A total ionizing dose (TID) of up to 77 Mrad is expected for a subset of the modules and the operating temperature can range from −40 °C to +50 °C. There are many constraints when trying to achieve this requirement:
• The ASIC technology and the subsequent testing define strict limits for optimal operation. The allowed voltage windows range from 250 mV wide down to just 120 mV for the different ASICs. Operating an ASIC below its minimum voltage can degrade its digital and/or analog performance, while a voltage above the maximum can accelerate its aging or even cause catastrophic failure.
• The power consumption has to be minimized in order to reduce the load on the cooling and external powering systems and guarantee low temperature operation for the sensors.
• The location of the ASICs and the system geometry are fixed by the application (front-end ASICs for example have to be close to the respective silicon sensor channels).
• The circuits that host these ASICs must be flexible and require track widths and spacings as low as 45 μm, which limits the number of copper layers and their thickness, complicating the power integrity of the system.
• The temperature, TID and various other parameters will vary over time and across different locations for the different ASICs within the detector's modules. Consequently, any adopted solution has to work throughout the entire operating range.

Powering strategy and the prototyping phase
Both the 2S and the PS modules employ a two-stage DC-DC conversion to increase the total powering system efficiency. The powering implementation foreseen during the drafting of the module designs was based on a bus-like topology for both power and ground (GND) of the 2S module [4]. For the PS module, it was a bus topology for the power and a ring topology for the GND. The first stage relies on a buck converter operating at 2.5 MHz (the bPOL12), converting from up to 12 V down to 2.55 V. The second stage is based on a buck converter operating at 4 MHz (the bPOL2V5) to output 1.25 V for the 2S module, and on a pair of such converters delivering 1.25 V and 1.05 V for the PS module. Rudimentary calculations accounting for the voltage drops across the module and the ASICs' electrical specifications led to those output voltages; the nets are hence called 2V55, 1V25 and 1V05.
Power integrity simulations were performed to guide the prototype design of the individual hybrid circuits in order to keep current densities below a safe limit (approximately 50 A/mm², deduced from IPC-2221 guidelines [5]) and to estimate the voltage drops expected in normal operation within a module. The simulations showed that the powering of the PS module was marginal on the 1V25 net. As a consequence, it was decided to apply two changes to the module architecture: to interconnect the 1V25 net around the module to form a ring, and to introduce a powering tail to lower the voltage drops on the 1V25 and the GND (figure 2). Finally, during the prototyping activities, the original design of the POH was found to have insufficient performance because of higher than anticipated voltage drops, which triggered a complete re-design of the circuit and the implementation of remote sensing. With this method, the DC-DC converters are set to track the voltage at the output connectors of the powering circuit instead of the output of the main air-core coils. The apparent voltage drops on the powering circuit are then reduced to just the ones on the GND planes, which cannot be compensated for.
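To illustrate the two quantities tracked by such simulations, the following minimal Python sketch computes the average current density and the DC voltage drop of a single rectangular copper trace. The trace geometry and the current are hypothetical values chosen for illustration, and the uniform-current model is a strong simplification of a real power integrity simulation; only the 50 A/mm² limit is taken from the text.

```python
# Illustrative only: the trace geometry and current below are hypothetical.
J_LIMIT_A_PER_MM2 = 50.0   # approximate safe limit quoted in the text
RHO_CU_OHM_MM = 1.7e-5     # copper resistivity near room temperature (ohm*mm)


def current_density(current_a, width_mm, thickness_mm):
    """Average current density (A/mm^2) in a rectangular trace cross-section."""
    return current_a / (width_mm * thickness_mm)


def trace_drop_mv(current_a, length_mm, width_mm, thickness_mm):
    """DC voltage drop (mV) along a uniform rectangular copper trace."""
    resistance_ohm = RHO_CU_OHM_MM * length_mm / (width_mm * thickness_mm)
    return 1e3 * current_a * resistance_ohm


# Hypothetical case: 1 A through a 2 mm wide, 18 um thick, 50 mm long trace.
j = current_density(1.0, 2.0, 0.018)
drop = trace_drop_mv(1.0, 50.0, 2.0, 0.018)
print(f"J = {j:.1f} A/mm^2 (limit {J_LIMIT_A_PER_MM2}), drop = {drop:.1f} mV")
```

Because the drop scales inversely with the copper cross-section, thin and narrow traces on flexible circuits quickly become the limiting factor.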

Powering strategy validation
The final prototype hybrids became available in early 2022, and with them came the task of building and thoroughly testing modules before launching the hybrids' production. The OT modules are complex objects and required a lot of attention to solve various issues with noise [6], as well as to qualify the modules electrically and verify their performance for particle detection and track stub finding [7]. In parallel with these efforts, we had to evaluate the power integrity of the prototype modules and prove that the systems' powering strategy was sound for the production. This meant proving that an output voltage could be selected for every DC-DC converter of the two module types such that the requirement described in section 2 is met.

Power integrity system considerations
As previously described, there are three powering nets in the PS and two in the 2S that require analysis. It is useful at this point to break this analysis into two parts: the source and the load.
Starting with the source, as is common in DC-DC converters, for both bPOL12 and bPOL2V5 a resistor divider is used to feed the output voltage back to the input of the internal error amplifier [8,9]. The output voltage is then determined by a simple equation in V_ref and the resistor ratio R_1/R_2, where V_ref is a factor affected mostly by the reference voltage internal to the converter but also by other factors internal to the converter. In the following analysis this parameter is used as a proxy for all these effects and can be thought of as an "effective" reference voltage. We can probe the distribution of output voltages from different devices under the possible operating conditions by looking at how each term of the V_out equation depends on temperature (t), TID (i), aging (a) and process variation (v):
• Effects on the resistors R_1 and R_2: If both resistors are from the same resistor family and the same lot, and are well-coupled thermally, one can assume that the first three effects are proportional to the nominal R, which means that they cancel out in the ratio R_1/R_2. In that case only the process variation remains, R = R(v). For most resistors in industry, the distribution of resistance values due to process variation in a given resistor population is not known, but the limits are typically ±1% or ±0.1%, depending on the resistor grade.
• Effects on the effective V_ref: Assuming that aging and other complex effects (e.g. irradiation causing an efficiency drop, which causes a higher temperature, which in turn causes a higher effective V_ref) are not significant, one can express V_ref in terms of A, B and Γ, which are functions describing its dependence on temperature, TID and process variation, respectively.
In the worst case, all effects combine to set the limits of the expected output voltage for a random device, with random resistors, operating within the expected operating conditions. The effects of temperature (A(t)) and TID (B(i)) are known from single chip tests. The process variation Γ(v) is minimized in the bPOL2V5s since they are precisely adjusted during testing; as a result, the effective process variation is equal to the trimming precision (±1 mV). For the bPOL12, Γ(v) is known from the ASIC production testing as a normal distribution N(μ, σ²) with mean μ and standard deviation σ. The limits of Γ(v) can then be estimated as μ ± 3σ. With these in mind, for the case of the bPOL12, the limits of V_out are obtained by combining the extremes of A(t) and B(i) over the operating range ("o.r.") expected in our application for temperature and TID with the limits μ ± 3σ of Γ(v). This type of analysis leads to an estimated spread of 350 mV for the bPOL12 and 70 mV for the bPOL2V5 outputs for resistors with 1% tolerance, varying slightly depending on the exact nominal output voltages selected.
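The source-side analysis can be sketched in a few lines of Python. The sketch assumes the common feedback-divider relation V_out = V_ref (1 + R_1/R_2) and a multiplicative combination of A(t), B(i) and Γ(v); both are assumptions made here for illustration, not forms confirmed by the text. The nominal reference voltage, resistor values and factor ranges are hypothetical, so the printed spreads are not the 350 mV and 70 mV figures quoted above.

```python
def vout(vref, r1, r2):
    """Assumed divider relation: V_out = V_ref * (1 + R1/R2)."""
    return vref * (1.0 + r1 / r2)


def vref_limits(vref_nom, a_range, b_range, mu, sigma):
    """Limits of the effective V_ref, assuming multiplicative factors."""
    a_min, a_max = a_range      # extremes of A(t) over the operating range
    b_min, b_max = b_range      # extremes of B(i) over the operating range
    lo = vref_nom * a_min * b_min * (mu - 3.0 * sigma)
    hi = vref_nom * a_max * b_max * (mu + 3.0 * sigma)
    return lo, hi


def vout_limits(vref_lo, vref_hi, r1_nom, r2_nom, tol):
    """Worst-case V_out limits for resistors within +/- tol of nominal."""
    lo = vout(vref_lo, r1_nom * (1.0 - tol), r2_nom * (1.0 + tol))
    hi = vout(vref_hi, r1_nom * (1.0 + tol), r2_nom * (1.0 - tol))
    return lo, hi


def spread(tol):
    """Total V_out spread (V) for the hypothetical converter below."""
    lo, hi = vout_limits(VREF_LO, VREF_HI, R1, R2, tol)
    return hi - lo


# Hypothetical converter: 0.6 V reference, divider chosen for 2.55 V nominal.
VREF_NOM, R1, R2 = 0.6, 32.5e3, 10.0e3
VREF_LO, VREF_HI = vref_limits(VREF_NOM, (0.99, 1.01), (0.98, 1.00), 1.0, 0.01)

for tol in (0.01, 0.001):
    print(f"resistor tolerance {tol:.1%}: spread = {1e3 * spread(tol):.0f} mV")
```

Comparing the two tolerances shows how much of the total spread is contributed by the resistors alone, which is the kind of comparison that motivates choosing a tighter resistor grade.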
Moving to the load, the effects are similar but the analysis is somewhat different. The voltage drop (sum of forward and return paths) developed on the powering network between a source (with an output V_o) and an ASIC, which we shall call V_drop, is a complex function involving all ASICs of the module and all power distribution network elements. In practice, however, it is only affected by a few parameters, V_drop = V_drop(t, i, v, c), where t: temperature; i: TID; v: ASIC, PCB and connector process variation; c: module ASIC configuration.
Similar to the source case, single chip tests give us information regarding the temperature and TID effects. For ASICs in 65 nm technology, which are the majority in the modules, an increasing TID and a decreasing temperature both lower the ASICs' consumption [10]. The powering network's resistance is lower at lower temperatures, as the resistance of copper drops. For these reasons, max(V_drop) occurs at the highest temperature and lowest TID within the operating range. The effects of process variation of ASICs, PCBs etc. and of the different module configurations are difficult to estimate from first principles, hence the strategy was to use data from prototype modules for these. The module data were gathered in two distinct activities:
• Voltage measurements for all the power nets on prototype modules built in different institutes at fixed baseline conditions: ASIC configuration ready for data taking, highest operating temperature and zero TID. From these measurements, the V_drop in the worst location of the module was computed for each power net. Identifying the worst location for each power net is best achieved through simulations. The location is defined by a simple rule: if the ASIC located there consistently operates above its minimum voltage, all other ASICs on the same net will also operate above their respective minimums. For the 1V25 net of the PS module, as seen in figure 2, this location does not coincide with the location of the lowest voltage (B) but switches between (C) and (A) depending on whether the powering tail is used, because different ASICs on the same power net have different minimum voltage requirements.
• Powering modules with benchtop power supplies, bypassing the on-board powering, to measure the module power consumption at different DC-DC converter output voltages and the V_drop in the worst location of the module at different ASIC configuration states.
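The worst-location rule from the first activity can be made concrete with a short Python sketch: the worst location is the one with the smallest margin between the local voltage and the minimum voltage of the ASIC placed there, which need not be the location with the lowest voltage. The labels A, B and C echo figure 2, but all voltages below are hypothetical.

```python
# Hypothetical local voltages and ASIC minimum voltages (V) on one power net.
locations = {
    "A": {"v_local": 1.190, "v_min": 1.150},
    "B": {"v_local": 1.170, "v_min": 1.100},
    "C": {"v_local": 1.180, "v_min": 1.150},
}


def margin(entry):
    """Headroom (V) between the local voltage and the ASIC minimum."""
    return entry["v_local"] - entry["v_min"]


def worst_location(locs):
    """Location with the smallest margin: if it is safe, all others are."""
    return min(locs, key=lambda name: margin(locs[name]))


def lowest_voltage_location(locs):
    """Location with the lowest absolute voltage on the net."""
    return min(locs, key=lambda name: locs[name]["v_local"])


print("lowest voltage at:", lowest_voltage_location(locations))
print("worst location:   ", worst_location(locations))
```

With these illustrative numbers the lowest voltage is found at B, while the smallest margin is at C because the ASIC there has a higher minimum voltage.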
From the first set of data, depending on the number of modules for which data are available and pass basic validation checks, one can use different statistical inference methods to estimate the maximum baseline V_drop of the population of all circuits to be built. It is worth noting that this estimation assumes that the process variation of circuits, ASICs etc. remains constant between prototype and production batches.
From the second activity, the real-world sensitivity of the module power consumption to changes in converter output voltage was measured and found to be significant. As an example, the PS module will consume 250 mW more for every 50 mV increase of the 1V25 voltage. To put this into perspective, this would manifest in the full detector as a 1.5 kW increase in consumption, showcasing the need to limit the converters' output voltage as much as possible. Furthermore, the peak consumption condition for the modules was investigated. For the 2S modules, the peak consumption happens during calibration of the analog front-ends and causes a 10% increase in the voltage drops for the worst location in the 1V25 net.
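The arithmetic behind these figures can be checked with a few lines of Python. The sketch uses only the numbers quoted above; the linear scaling with voltage and the module count derived from the two quoted figures are assumptions of this sketch.

```python
# Figures quoted in the text for the 1V25 net of the PS module.
DELTA_P_PER_MODULE_W = 0.250   # extra consumption per module
DELTA_V_V = 0.050              # corresponding increase of the 1V25 voltage
DETECTOR_DELTA_P_W = 1500.0    # extra consumption in the full detector

# Sensitivity per module, assuming a linear dependence on the voltage.
slope_w_per_v = DELTA_P_PER_MODULE_W / DELTA_V_V

# Number of PS modules implied by the two quoted figures (derived, not quoted).
implied_modules = DETECTOR_DELTA_P_W / DELTA_P_PER_MODULE_W


def extra_detector_power_w(delta_v_v, n_modules=implied_modules):
    """Extra detector consumption (W) for a given 1V25 voltage increase."""
    return slope_w_per_v * delta_v_v * n_modules


print(f"sensitivity: {slope_w_per_v:.1f} W/V per module")
print(f"implied number of PS modules: {implied_modules:.0f}")
print(f"extra power for +10 mV: {extra_detector_power_w(0.010):.0f} W")
```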
Using the estimated maximum baseline V_drop and accounting for ASIC configuration effects as measured above, we obtain max(V_drop) for each power net: the drop for the module with the most resistive powering network, at its worst location, under the most challenging operating conditions in terms of temperature, TID and configuration.
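A minimal Python sketch of this two-step estimate is given below. The text does not specify which statistical inference method was used, so the sketch assumes a simple "sample mean plus k standard deviations" bound purely for illustration. The measurement values are hypothetical; the 1.10 configuration factor corresponds to the 10% increase quoted for the 2S calibration state.

```python
import statistics


def max_baseline_drop(samples_mv, k=3.0):
    """Assumed upper bound on the baseline V_drop: mean + k * sample std."""
    return statistics.mean(samples_mv) + k * statistics.stdev(samples_mv)


def max_drop(samples_mv, config_factor=1.10, k=3.0):
    """Scale the baseline bound by the worst ASIC configuration factor."""
    return config_factor * max_baseline_drop(samples_mv, k)


# Hypothetical worst-location V_drop measurements (mV) on prototype modules.
samples = [48.0, 50.0, 52.0, 49.0, 51.0]

print(f"max baseline V_drop: {max_baseline_drop(samples):.1f} mV")
print(f"max V_drop incl. configuration: {max_drop(samples):.1f} mV")
```

With only a handful of prototype modules, a bound of this kind is sensitive to the choice of k, so the factor would need to be matched to the sample size and to the confidence level required.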

Final validation and voltage selection
Once the DC-DC converter output spread and the maximum voltage drop corresponding to the worst location on a module are estimated for each power net, the selection of all the DC-DC converter output voltages becomes simple. We must ensure that min(V_out) − max(V_drop) is higher than the minimum operating voltage of the ASIC at the worst location on the module, and that max(V_out) is lower than the maximum operating voltage of all the ASICs powered by that converter. The only adjustable parameter in these conditions is the nominal V_out, and we are further constrained by the commercial availability of the resistor values that set this nominal V_out.
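The two conditions translate directly into a window of acceptable nominal output voltages, as in the following Python sketch. It assumes a spread that is symmetric around the nominal value and independent of it (the text notes that the spread actually varies slightly with the nominal voltage). All numerical values and the list of candidate voltages are hypothetical.

```python
def vout_window(v_min_worst, v_max_lowest, max_drop, half_spread):
    """Allowed (low, high) nominal V_out in volts, or None if unsolvable.

    v_min_worst:  minimum voltage of the ASIC at the worst location
    v_max_lowest: lowest maximum voltage among the ASICs on the net
    """
    low = v_min_worst + max_drop + half_spread   # min(V_out) - max(V_drop) >= V_min
    high = v_max_lowest - half_spread            # max(V_out) <= V_max
    return (low, high) if low <= high else None


def usable_nominals(candidates, window):
    """Candidate nominal voltages (set by available resistors) in the window."""
    if window is None:
        return []
    low, high = window
    return [v for v in candidates if low <= v <= high]


# Hypothetical net: ASIC limits 1.10 V to 1.32 V, 60 mV maximum drop.
narrow = vout_window(1.10, 1.32, 0.060, 0.035)   # 70 mV total spread
wide = vout_window(1.10, 1.32, 0.060, 0.100)     # 200 mV total spread
candidates = [1.18, 1.20, 1.23, 1.25, 1.28, 1.30]

print("window with 70 mV spread: ", narrow, usable_nominals(candidates, narrow))
print("window with 200 mV spread:", wide, usable_nominals(candidates, wide))
```

When the function returns None, no nominal value satisfies both limits and either the spread or the voltage drop has to be reduced.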
Fortunately, at this stage the constraints generally lead to a system that is solvable, and resistors can be selected accordingly. When this is not the case, one can numerically compute the distributions of V_out and V_drop instead of working analytically with their limits only. The distributions can then guide the decision to, e.g., allow a small percentage of modules in which a few ASICs operate below the allowable minimum voltage at some point during their lifetime, statistically not compromising detector performance. This had to be done for the modules' 2V55 nets in our case.
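One way to compute such distributions numerically is a simple Monte Carlo, sketched below in Python. The sketch assumes independent normal distributions for V_out and V_drop, which is an illustrative choice and not necessarily the model used for the modules; all parameter values are hypothetical.

```python
import random


def fraction_below_min(n, vout_mu, vout_sigma, drop_mu, drop_sigma,
                       v_min, seed=0):
    """Fraction of simulated ASICs whose supply falls below v_min.

    Assumes independent normal distributions for V_out and V_drop (V).
    """
    rng = random.Random(seed)
    below = 0
    for _ in range(n):
        v_asic = rng.gauss(vout_mu, vout_sigma) - rng.gauss(drop_mu, drop_sigma)
        if v_asic < v_min:
            below += 1
    return below / n


# Hypothetical net: 1.25 V nominal, 50 mV mean drop, 1.17 V ASIC minimum.
frac = fraction_below_min(100_000, 1.25, 0.012, 0.050, 0.008, 1.17)
print(f"fraction of ASICs below minimum voltage: {frac:.4%}")
```

The resulting fraction can then be weighed against the detector performance requirements when deciding whether a marginal nominal voltage is acceptable.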
With this method, we could also identify the need for 0.1% precision resistors on all the power nets. We could prove that the extra powering tail that was designed and used in the prototype PS modules is no longer required, and we could select output voltages for all the DC-DC converters of both module types that satisfy our requirement from section 2. Finally, while working on the 2V55 nets, we realized that no nominal V_out would yield acceptable system performance. To address this, we first chose to retest and bin the bPOL12 converters into two groups based on their V_ref, effectively halving their expected output spread. Second, we verified that operating the bPOL2V5 converters on our circuits at input voltages of up to 2.7 V, higher than their official rating of 2.5 V, is safe. During the kick-off production, measurements will be used to verify our estimates for the voltage drops and the output voltage spreads and, if needed, output voltages can be adjusted for the rest of the production.
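The effect of binning on the process-variation limits can be sketched as follows in Python. The sketch splits the μ ± 3σ range of Γ(v) into equal-width bins, so each of two bins spans half of the original range. The values of μ and σ are hypothetical, and the idea that each bin would then be paired with its own divider resistors is an assumption of this sketch.

```python
def bin_limits(mu, sigma, n_bins=2):
    """Split the mu +/- 3 sigma range into n_bins equal-width intervals."""
    low, high = mu - 3.0 * sigma, mu + 3.0 * sigma
    width = (high - low) / n_bins
    return [(low + k * width, low + (k + 1) * width) for k in range(n_bins)]


# Hypothetical relative V_ref distribution: mean 1.0, standard deviation 1%.
MU, SIGMA = 1.0, 0.01
full_width = 6.0 * SIGMA
bins = bin_limits(MU, SIGMA)

for low, high in bins:
    print(f"bin [{low:.3f}, {high:.3f}] spans {high - low:.3f} "
          f"of the full {full_width:.3f}")
```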

Conclusion
The development of the on-module powering for the CMS Phase-2 OT was presented. The considerations described in section 4 should be taken into account early in a project, and a similar analysis performed, to avoid potentially severe consequences. The results show that the trends in ASIC and PCB design and manufacturing, together with the requirements on electronics for demanding areas such as the cores of high-energy physics experiments, are creating new challenges for power integrity, especially when producing thousands of circuits and modules as for the CMS OT.

Figure 1. Modules of the CMS OT: the 2S (left) and PS (right).

Figure 2. The PS module. The solid red line shows the topology of one of the three nets (1V25) in the original module design, and the dashed lines show the changes made before the final prototyping phase: the blue "zig-zag" tail and the interconnection between PS-FEH left and PS-ROH. A, B and C denote locations of interest for the 1V25 power net; see section 4.1 for details.