Performance of reconstruction and identification of $\tau$ leptons decaying to hadrons and $\nu_\tau$ in pp collisions at $\sqrt{s}=$ 13 TeV

The algorithm developed by the CMS Collaboration to reconstruct and identify $\tau$ leptons produced in proton-proton collisions at $\sqrt{s}=$ 7 and 8 TeV, via their decays to hadrons and a neutrino, has been significantly improved. The changes include a revised reconstruction of $\pi^0$ candidates, and improvements in multivariate discriminants to separate $\tau$ leptons from jets and electrons. The algorithm is extended to reconstruct $\tau$ leptons in highly Lorentz-boosted pair production, and in the high-level trigger. The performance of the algorithm is studied using proton-proton collisions recorded during 2016 at $\sqrt{s}=$ 13 TeV, corresponding to an integrated luminosity of 35.9 fb$^{-1}$. The performance is evaluated in terms of the efficiency for a genuine $\tau$ lepton to pass the identification criteria and of the probabilities for jets, electrons, and muons to be misidentified as $\tau$ leptons. The results are found to be very close to those expected from Monte Carlo simulation.


Introduction
Searches for new phenomena that consider signatures with τ leptons have gained great interest in proton-proton (pp) collisions at the CERN LHC. The most prominent one among these is the decay of Higgs bosons (H) to pairs of τ leptons, which constitutes an especially sensitive channel for probing Higgs boson couplings to fermions. The observation of the standard model (SM) Higgs boson decaying to a pair of τ leptons has recently been reported [1,2]. Moreover, searches with τ leptons in the final state have high sensitivity to the production of both neutral and charged Higgs bosons expected in the minimal supersymmetric standard model (MSSM) [3,4], in which enhancements in the couplings to τ leptons can be substantial at large tanβ, where tanβ is the ratio of vacuum expectation values of the two Higgs doublets in the MSSM. Examples of such searches can be found in Refs. [5][6][7]. In addition, searches for particles beyond the SM, such as new or heavy Higgs bosons [8][9][10][11], leptoquarks [12], supersymmetric particles [13][14][15][16], or gauge bosons [17][18][19] benefit significantly from any improvements made in τ lepton reconstruction and identification.
The τ lepton, with a mass of m τ = 1776.86 ± 0.12 MeV [20], is the only lepton sufficiently massive to decay into hadrons and a neutrino. About one third of the time, τ leptons decay into an electron or a muon, and two neutrinos. The neutrinos escape undetected, but the e and µ are reconstructed and identified through the usual techniques available for such leptons [21,22]. These decay final states are denoted as τ e and τ µ , respectively. Almost all the remaining decay final states of τ leptons contain hadrons, typically with a combination of charged and neutral mesons, and a ν τ .
The decays of τ leptons into hadrons and neutrinos, denoted by τ h , are reconstructed and identified using the hadrons-plus-strips (HPS) algorithm [23,24], which was developed and used in CMS when the LHC operated at √ s = 7 and 8 TeV. The HPS algorithm reconstructs the τ h modes by combining information from charged hadrons, which are reconstructed using their associated tracks in the inner tracker, and π 0 candidates, obtained by clustering photon and electron candidates from photon conversions in rectangular regions of pseudorapidity and azimuth, η×φ regions, called "strips". The major challenge in the identification of τ h is to distinguish these objects from quark and gluon jets, which are copiously produced in pp collisions. The primary method for reducing backgrounds from jets misidentified as τ h candidates exploits the fact that there are fewer particles present in τ h decays, and that their energies are deposited in narrow regions of (η, φ) compared to those from energetic quark or gluon jets. In certain analyses, the misidentification (MisID) of electrons or muons as τ h candidates can also constitute a sizeable background.
The τ h identification algorithm, improved for analyzing data at √ s = 13 TeV, contains several new features: the strip reconstruction has been modified into a so-called dynamic strip reconstruction, which adjusts the size of a strip dynamically so as to collect the π 0 decay products more effectively; the multivariate discriminants used to separate τ h candidates from jets and electrons have been improved; and the algorithm has been extended to reconstruct τ h pairs produced in highly Lorentz-boosted topologies and to run in the high-level trigger.

This paper is organized as follows. After a brief introduction of the CMS detector in Section 2, we discuss the data and the event simulations used to evaluate the performance of the HPS algorithm in Section 3. The reconstruction and identification of physics objects (other than τ h ) is briefly described in Section 4. Section 5 describes the HPS algorithm used for 13 TeV data and its simulation. The extended version of the algorithm used to reconstruct τ h pairs produced in topologies with high Lorentz boosts is presented in Section 6, while the specialized version developed for trigger purposes is discussed in Section 7. The selection of events used to evaluate the performance of the τ h reconstruction algorithm, as well as the systematic uncertainties common to all measurements, are discussed in Section 8. The performance of the improved algorithm evaluated in selected data samples is given thereafter: Section 9 describes the τ h identification efficiency, while Sections 10 and 11 summarize the jet → τ h and e/µ → τ h misidentification probabilities, respectively. The τ h energy scale is discussed in Section 12. Finally, Section 13 presents the performance of τ h identification in the high-level trigger, and a brief summary in Section 14 concludes this paper.

The CMS detector
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. A silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter (HCAL), each composed of a barrel and two endcap sections, reside within the field of the solenoid. Extensive forward calorimetry complements the coverage provided by the barrel and endcaps. Muons are measured in gas-ionization detectors embedded in the steel flux-return yoke outside the solenoid.
The CMS tracker is a cylindrical detector, constructed from 1 440 silicon-pixel and 15 148 silicon-strip detector modules that cover the range of |η| < 2.5. Tracks of charged hadrons are reconstructed with typical efficiencies of 80-90%, depending on transverse momentum (p T ) and η [25,26]. The silicon tracker presents a significant amount of material in front of the ECAL, mostly due to the mechanical structure, the associated services, and the cooling system. A minimum of 0.4 radiation lengths (X 0 ) of material is present at |η| ≈ 0, which rises to ≈2.0 X 0 at |η| ≈ 1.4, and decreases to ≈1.3 X 0 at |η| ≈ 2.5. Photons originating from π 0 decays therefore have a high probability to convert into e + e − pairs within the volume of the tracker.
The ECAL is a homogeneous and hermetic calorimeter made of PbWO 4 scintillating crystals.
It is composed of a central barrel, covering the region |η| < 1.48, and two endcaps, covering 1.48 < |η| < 3.0. The small radiation length (X 0 = 0.89 cm) and small Molière radius (2.3 cm) of the PbWO 4 crystals provide a compact calorimeter with excellent two-shower separation. The ECAL is >25 X 0 thick.
The HCAL is a sampling calorimeter made of brass and plastic scintillator, with a coverage up to |η| = 3.0. The scintillation light is converted by wavelength-shifting fibres and channelled to photodetectors via clear fibres. The thickness of the HCAL is in the range 7-11 interaction lengths, depending on η.
The muon detection system is made up of four planes of gas-ionization detectors, where each plane consists of several layers of aluminium drift tubes (DTs) in the barrel region and cathode strip chambers (CSCs) in the endcap region, complemented by resistive-plate chambers (RPCs) that are used only in the trigger.
A two-tiered trigger system [27] is employed to select interesting events from the LHC bunch crossing rate of up to 40 MHz. The first level (L1), composed of custom-made hardware processors, uses information from the calorimeters and muon detectors to select events at a rate of ≈100 kHz, within a fixed time interval of less than 4 µs. The second level, known as the high-level trigger (HLT), consists of a farm of processors running a version of the full event reconstruction software, optimized for fast processing, and reduces the event rate to ≈1 kHz before data storage.
A more detailed description of the CMS detector, together with a definition of the coordinate system and kinematic variables, can be found in Ref. [28].

Data and simulated events
The performance of the τ h reconstruction and identification algorithms is evaluated in pp collisions recorded by CMS during 2016 at √ s = 13 TeV, corresponding to an integrated luminosity of 35.9 fb −1 . The Monte Carlo (MC) simulated signal samples contain H → ττ, Z′ → ℓℓ, W′ → ℓν, and Z/γ * → ℓℓ events, where ℓ refers to e, µ, or τ leptons. Simulated signal contributions from H → ττ, Z′ → ℓℓ (with masses up to 4 TeV), W′ → ℓν (with masses up to 5.8 TeV), and MSSM H → ττ (with masses up to 3.2 TeV) are used to optimize the identification of τ h candidates over a wide range of their p T values. The H → ττ events are generated at next-to-leading order (NLO) in perturbative quantum chromodynamics (QCD) using POWHEG v2 [29-33], while Z′ and W′ boson events are generated using leading-order (LO) PYTHIA 8.212 [34]. In simulation, a reconstructed τ h candidate is taken as matched to the generated τ h when the two objects lie within a cone of ∆R = √((∆η)² + (∆φ)²) < 0.3, where ∆η and ∆φ are the differences in η and φ between the reconstructed and generated candidates.
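The generator-level matching criterion can be sketched in a few lines of Python. The dictionary-based particle representation is an illustrative assumption of this sketch; the ∆R < 0.3 requirement follows the text.

```python
import math

def delta_phi(phi1, phi2):
    """Difference in azimuth, wrapped into (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    return dphi - 2.0 * math.pi if dphi > math.pi else dphi

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance Delta R = sqrt((Delta eta)^2 + (Delta phi)^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))

def is_matched(reco, gen, cone=0.3):
    """A reconstructed tau_h is matched to a generated tau_h if Delta R < 0.3."""
    return delta_r(reco["eta"], reco["phi"], gen["eta"], gen["phi"]) < cone
```

The azimuthal wrap-around is the usual subtlety: two candidates at φ ≈ +π and φ ≈ −π are angularly close, which the modular arithmetic above handles.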
The W+jets and Z/γ * → ℓℓ events are generated at LO in perturbative QCD using MADGRAPH5_aMC@NLO v2.2.2 [35] with the MLM jet merging scheme [36], while the single top quark and tt events are generated at NLO in perturbative QCD using POWHEG [37-39]. The diboson WW, WZ, and ZZ events are generated at NLO using MADGRAPH5_aMC@NLO with the FXFX jet merging scheme [40] or POWHEG [41], while events comprised uniquely of jets produced through the strong interaction, referred to as QCD multijet events, are generated at LO with PYTHIA. The PYTHIA generator, with the CUETP8M1 underlying-event tune [42], is used to model the parton shower and hadronization processes, as well as τ lepton decays in all events. The Z/γ * → ℓℓ and W+jets samples are normalized according to cross sections computed at next-to-next-to-leading order (NNLO) accuracy in perturbative QCD [43-47], while the tt sample is normalized to the cross section computed at NNLO supplemented by soft-gluon resummation with next-to-next-to-leading logarithmic accuracy [48,49]. The cross sections for single top quark and diboson production are computed at NLO accuracy in perturbative QCD [50]. The production of off-shell W bosons (m W > 200 GeV), with subsequent W → τν or W → µν decays, is simulated at LO with the PYTHIA generator. The differential cross section is reweighted as a function of the invariant mass of the W boson decay products, incorporating NNLO QCD and NLO electroweak corrections [46,51,52]. The NNPDF3.0 parton distribution functions [53] are used in all the calculations.
Additional pp collisions that temporally overlap the interactions of interest, referred to as pileup (PU), are generated using PYTHIA and overlaid on all MC events according to the luminosity profile of the analyzed data. The generated events are passed through a detailed simulation of the CMS detector based on GEANT4 [54], and are reconstructed using the same CMS reconstruction software as used for data.

Event reconstruction
The particles emerging from pp collisions, such as charged and neutral hadrons, photons, electrons, and muons, are reconstructed and identified by combining the information from the CMS subdetectors using a particle-flow (PF) algorithm [55]. These particles are further grouped to reconstruct higher-level objects, such as jets, missing transverse momentum, τ h candidates, and to quantify lepton isolation.
The trajectories of charged particles are reconstructed from their hits in the silicon tracker [26], and are referred to as tracks.
Electrons are reconstructed from their trajectories in the tracker and from clusters of energy deposition in the ECAL [21]. Electron identification relies on the energy distribution in the electromagnetic shower and on other observables based on tracker and calorimeter information. The selection criteria depend on the p T and |η| of the electron, and on a categorization according to observables sensitive to the amount of bremsstrahlung emitted along the trajectory in the tracker.
Muons are reconstructed by combining tracks reconstructed in both the inner tracker and the outer muon spectrometer [22]. The identification of muons is based on the quality criteria of reconstructed muon tracks, and through requirements of minimal energy deposition along the muon track in the calorimeters.
The isolation of individual electrons or muons (I e/µ rel ) is measured relative to their transverse momentum p e/µ T by summing the scalar p T values of charged hadrons, neutral hadrons, and photons in a cone of ∆R < 0.3 (0.4) around the direction of the electron (muon) at the interaction vertex:

I e/µ rel = [∑ p charged T + max(0, ∑ p neutral T + ∑ p γ T − p PU T )]/p e/µ T .

The primary pp interaction vertex is defined as the reconstructed vertex with the largest value of summed p 2 T of jets, clustered using all tracks assigned to the vertex, and of the associated missing transverse momentum, taken as the negative vector sum of the p T of those jets. To suppress the contribution from PU, the charged hadrons entering the sum are required to originate from the primary vertex. The neutral contribution to the isolation from PU (referred to as p PU T ) is estimated through a jet area method [56] for electrons. For muons, the p PU T contribution is estimated using the scalar p T sum of charged hadrons not originating from the primary vertex, scaled down by a factor of 0.5 to accommodate the assumed ratio of neutral to charged hadron production.
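As an illustration, the PU-corrected relative isolation described above can be sketched as follows. The function interface and the clipping of the neutral component at zero are assumptions of this sketch, not quoted definitions; the factor of 0.5 on the charged-PU sum for muons follows the text.

```python
def relative_isolation(pt_lepton, charged_pv, neutral_had, photons, charged_pu):
    """PU-corrected relative lepton isolation (sketch).

    pt_lepton  : transverse momentum of the electron or muon
    charged_pv : scalar pT sum of charged hadrons from the primary vertex
    neutral_had: scalar pT sum of neutral hadrons in the cone
    photons    : scalar pT sum of photons in the cone
    charged_pu : scalar pT sum of charged hadrons NOT from the primary vertex
    """
    pt_pu = 0.5 * charged_pu  # assumed neutral-to-charged production ratio
    neutral = max(0.0, neutral_had + photons - pt_pu)  # never over-subtract
    return (charged_pv + neutral) / pt_lepton
```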
Jets are clustered from PF particles using the infrared- and collinear-safe anti-k T algorithm [57,58] with a distance parameter of 0.4. The jet momentum is defined by the vectorial sum of all particle momenta in the jet. The simulation is found to reproduce jet p T within 5 to 10% of the true values over the whole p T spectrum and detector acceptance. To suppress contributions from PU, charged hadrons not originating from the primary vertex are discarded, and an offset correction is applied to account for the remaining PU contributions. Jet energy corrections are obtained from simulation to bring the measured jet response, on average, to that of particle-level jets, and are confirmed with in situ measurements through momentum balance in dijet, γ+jet, Z+jet, and multijet events [59]. The combined secondary vertex v2 (CSVv2) b tagging algorithm [60] with a medium working point (WP) is used to identify jets originating from b quarks. The working point corresponds to an identification efficiency of about 70% for b quark jets with p T > 30 GeV, and a probability of ≈1% for light-quark or gluon jets to be misidentified as b quark jets.
The missing transverse momentum vector is defined as the projection onto the plane perpendicular to the beams of the negative vector sum of the momenta of all reconstructed particles in an event. All corrections made to the momenta of jets are propagated to it. Its magnitude is referred to as p miss T .
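A minimal sketch of this definition, computing the negative vector sum of transverse momenta in the plane perpendicular to the beams (the particle representation is an illustrative assumption):

```python
import math

def missing_pt(particles):
    """p_T^miss: negative vector sum of the transverse momenta of all
    reconstructed particles; returns (px_miss, py_miss, magnitude)."""
    px = -sum(p["pt"] * math.cos(p["phi"]) for p in particles)
    py = -sum(p["pt"] * math.sin(p["phi"]) for p in particles)
    return px, py, math.hypot(px, py)
```

For a perfectly balanced event (e.g. two back-to-back particles of equal p T ) the magnitude vanishes, up to floating-point rounding.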

Reconstruction and identification of τ h
The basic features of the HPS algorithm are identical to those used during the previous data taking at √ s = 7 and 8 TeV [24], except for the improvements in π 0 reconstruction described below in Section 5.1.1. Sections 5.2, 5.3, and 5.4 discuss the discriminants used to distinguish reconstructed τ h candidates from jets, electrons, and muons, respectively.

The hadrons-plus-strips algorithm
Starting from the constituents of reconstructed jets, the HPS algorithm reconstructs the different decays of the τ lepton into hadrons. The final states include charged hadrons, as well as neutral pions, as shown in Table 1. The π 0 mesons promptly decay into pairs of photons, which have a high probability of converting into e + e − pairs as they traverse the tracker material. The large magnetic field of the CMS solenoid leads to a spatial separation of the e + e − pairs in the (φ, η) plane. To reconstruct the full energy of the neutral pions, the electron and photon candidates falling within a certain region of ∆η×∆φ are clustered together, with the resulting object referred to as a "strip". The strip momentum is defined by the vectorial sum of all its constituent momenta. The procedure is described in Section 5.1.1, together with the improvements introduced to the previous algorithm.
Charged particles used in the reconstruction of τ h candidates are required to have p T > 0.5 GeV, and must be compatible with originating from the primary vertex of the event, where the criterion on the transverse impact parameter is not highly restrictive (d xy < 0.1 cm), to minimize the rejection of genuine τ leptons with long lifetimes. The requirement of p T > 0.5 GeV on the charged particles ensures that the corresponding tracks have sufficient quality, and pass a minimal requirement on the number of layers with hits in the tracking detector.
Based on the set of charged particles and strips contained in a jet, the HPS algorithm generates all possible combinations of hadrons for the following decay modes: h ± , h ± π 0 , h ± π 0 π 0 , and h ± h ∓ h ± . The reconstructed mass of the "visible" hadronic constituents of the τ h candidate (i.e., the decay products, excluding neutrinos) is required to be compatible with the ρ(770) resonance in the h ± π 0 mode, or with the a 1 (1260) resonance in the h ± π 0 π 0 and h ± h ∓ h ± modes, as discussed in Section 5.1.2. The h ± π 0 and h ± π 0 π 0 modes are consolidated into the h ± π 0 mode and analyzed together. The combinations of charged particles and strips considered by the HPS algorithm represent all the hadronic τ lepton decay modes in Table 1, except h ± h ∓ h ± π 0 , which is not considered in the current version of the algorithm because of its greater contamination by jets. The τ h candidates with charge other than ±1 are rejected, as are those with charged particles or strips outside the signal cone, defined by R sig = (3.0 GeV)/p T , where p T is that of the hadronic system, with the cone size limited to the range 0.05-0.10. Finally, only the τ h candidate with the largest p T is kept for further analysis, resulting in a single τ h candidate per jet.
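The signal-cone definition and the final candidate selection can be sketched as follows. The cone formula and its clamp to 0.05-0.10, the |charge| = 1 requirement, and the highest-p T choice follow the text; the dictionary layout of candidates is an illustrative assumption.

```python
def signal_cone(pt_tau):
    """Signal cone radius R_sig = (3.0 GeV)/pT, limited to [0.05, 0.10]."""
    return min(max(3.0 / pt_tau, 0.05), 0.10)

def select_candidate(candidates):
    """Keep only candidates with |charge| = 1 and return the highest-pT one,
    yielding a single tau_h candidate per jet (None if no candidate survives)."""
    good = [c for c in candidates if abs(c["charge"]) == 1]
    return max(good, key=lambda c: c["pt"]) if good else None
```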

Dynamic strip reconstruction
Photon and electron constituents of jets, which seed the τ h reconstruction, are clustered into ∆η×∆φ strips, used to collect all energy depositions in the ECAL that arise from neutral pions produced in τ h decays. In the previous version of the HPS algorithm [24], the size of the ∆η×∆φ window was set to a fixed value of 0.05×0.20 in the (η, φ) plane. However, this fixed strip size is not always adequate to contain all electrons and photons originating from τ h decays, meaning that some of the particles from the τ lepton decay can contribute to the isolation region and thereby reduce the isolation efficiency for genuine τ h candidates.
Our studies of τ h reconstruction have led to the following observations: 1. A charged pion from τ h decays undergoing nuclear interactions in the tracker material can produce secondary particles with lower p T . This can result in cascades of low-p T electrons and photons that can appear outside of the strip window, and affect the isolation of a τ h candidate, despite these particles originating from remnants of the τ h decay.
2. Photons from π 0 decays have a large probability to convert into e + e − pairs and, after multiple scattering and bremsstrahlung, some of the remaining electrons and photons can end up outside a fixed size window, also affecting the isolation.
Naively, these decay products can be integrated into the strip by suitably increasing its size. Conversely, if the τ h has large p T , the decay products tend to be boosted in the direction of the τ h candidate momentum. In this case, a smaller than previously considered strip size can reduce background contributions to that strip, while taking full account of all decay products.
Based on these considerations, the strip clustering of the HPS algorithm has been changed as follows: 1. The electron or photon (e/γ) with the highest p T not yet included in any strip is used to seed a new strip, with initial position set to the η and φ values of the new e/γ seed.
2. The next e/γ candidate, in order of decreasing p T , that falls within the ∆η×∆φ window around the current strip position is merged into the strip, and the strip position and momentum are recalculated. The window size depends on the p T of the strip and of the e/γ candidate through dimensionless functions f and g, which are determined from single τ lepton events generated in MC with a uniform p T spectrum. The quantity used in their determination is ∆m τ h , the change in the mass of the τ h candidate brought about by the addition of the e/γ candidate to its strip, computed from the four-momenta of the τ h candidate and of the strip.
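The clustering procedure above can be illustrated with a greedy sketch. The exact window functions f and g are determined from simulation in the paper and are not reproduced in this excerpt: the falling power-law forms and clamping ranges below are placeholders chosen only to illustrate that the window shrinks with increasing p T .

```python
# Placeholder window half-sizes: illustrative power laws, NOT the published
# parametrization of f and g (those are fitted to single-tau MC in the paper).
def f_eta(pt):
    return min(max(0.20 * pt ** -0.66, 0.05), 0.15)

def g_phi(pt):
    return min(max(0.35 * pt ** -0.71, 0.05), 0.30)

def cluster_strip(egammas):
    """Greedy dynamic-strip clustering sketch: seed with the highest-pT
    e/gamma, then repeatedly absorb the next candidate falling inside the
    current eta-phi window, recomputing the strip as the pT-weighted mean.
    (A full implementation would seed further strips from the leftovers.)"""
    cands = sorted(egammas, key=lambda c: c["pt"], reverse=True)
    strip = dict(cands.pop(0))  # running position and pT of the strip
    merged = True
    while merged and cands:
        merged = False
        for i, c in enumerate(cands):
            in_eta = abs(c["eta"] - strip["eta"]) < f_eta(strip["pt"]) + f_eta(c["pt"])
            in_phi = abs(c["phi"] - strip["phi"]) < g_phi(strip["pt"]) + g_phi(c["pt"])
            if in_eta and in_phi:
                tot = strip["pt"] + c["pt"]
                strip["eta"] = (strip["eta"] * strip["pt"] + c["eta"] * c["pt"]) / tot
                strip["phi"] = (strip["phi"] * strip["pt"] + c["phi"] * c["pt"]) / tot
                strip["pt"] = tot
                cands.pop(i)
                merged = True
                break
    return strip
```

Because the window is recomputed from the current strip p T , soft conversion debris is absorbed at low p T while the strip stays narrow for boosted candidates.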

Discrimination of τ h candidates against jets
Requiring τ h candidates to pass certain specific isolation requirements provides a strong handle for reducing the jet → τ h misidentification probability. The two τ h isolation discriminants developed previously [24], namely the isolation sum and the MVA-based discriminants, have now been reoptimized. A cone with ∆R = 0.5 was originally used in the definition of isolation for all event types. However, in processes with a high number of final-state objects, such as for Higgs boson production in association with top quarks (ttH), the isolation is affected by the presence of nearby objects. Studies using such ttH events with H → ττ decays led to the conclusion that a smaller isolation cone improves the τ h efficiency in such events. A smaller isolation cone of radius ∆R = 0.3 is therefore now used in these types of events.

Isolation sum discriminants
The isolation of τ h candidates is computed by summing the scalar p T of charged particles (∑ p charged T ) and photons (∑ p γ T ) reconstructed using the PF algorithm within the isolation cone centered on the direction of the τ h candidate. Charged-hadron and photon constituents of the τ h candidate are excluded from the p T sums, defining thereby the isolation as

I τ h = ∑ p charged T (d z < 0.2 cm) + max(0, ∑ p γ T − ∆β ∑ p charged T (d z > 0.2 cm)).

The contribution from PU is suppressed by requiring the charged particles in the first sum to originate from the production vertex of the τ h candidate within a distance of d z < 0.2 cm. The PU contribution to the p T sum of photons in the isolation cone is estimated by summing the scalar p T of charged particles not originating from the vertex of the τ h candidate (d z > 0.2 cm), but appearing within a cone of ∆R = 0.8 around the τ h direction, multiplied by a so-called ∆β factor, which accounts for the ratio of energies carried by charged hadrons and photons in inelastic pp collisions, as well as for the different cone sizes used to estimate the PU contributions.
Previously, an empirical factor of 0.46 was used for ∆β [24]. However, this is found to overestimate the PU contribution to the isolation in data taken in 2015 and 2016, and a new ∆β factor of 0.2 is therefore chosen. This value corresponds approximately to the ratio of neutral to charged pion production rates (0.5), corrected for the difference in size between the isolation cone (∆R = 0.5) and the cone used to compute the ∆β correction (∆R = 0.8): 0.5 × (0.5²/0.8²) ≈ 0.195.
The loose, medium, and tight working points of the isolation sum discriminants are defined by requiring I τ h to be less than 2.5, 1.5, or 0.8 GeV, respectively. These thresholds are chosen such that the resulting efficiencies for the three working points cover the range required for the analyses.
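The ∆β-corrected isolation sum and the fixed working-point thresholds quoted above can be sketched as follows; the max(0, ...) clipping of the photon component is an assumption of this sketch, while ∆β = 0.2 and the 2.5/1.5/0.8 GeV thresholds follow the text.

```python
def tau_isolation(charged_sig_vtx, photons, charged_pu, delta_beta=0.2):
    """tau_h isolation sum (sketch of the prose definition).

    charged_sig_vtx: scalar pT sum of charged particles from the tau_h vertex
                     (d_z < 0.2 cm), excluding tau_h constituents
    photons        : scalar pT sum of photons in the cone, excluding
                     tau_h constituents
    charged_pu     : scalar pT sum of charged particles with d_z > 0.2 cm
                     within Delta R = 0.8 (pileup estimate)
    """
    return charged_sig_vtx + max(0.0, photons - delta_beta * charged_pu)

def passes_wp(iso, wp):
    """Loose/medium/tight working points of the isolation-sum discriminant."""
    thresholds = {"loose": 2.5, "medium": 1.5, "tight": 0.8}  # GeV
    return iso < thresholds[wp]

# Cross-check of the quoted Delta beta value: neutral/charged ratio of 0.5,
# scaled by the cone-area ratio (0.5/0.8)^2.
assert abs(0.5 * (0.5**2 / 0.8**2) - 0.195) < 0.001
```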
In dynamic strip reconstruction, a photon candidate outside the signal cone can still contribute to the signal. This effectively increases the jet → τ h misidentification probability because of the decrease in the value of I τ h for misidentified τ h candidates. An additional handle is therefore exploited to reduce the jet → τ h misidentification probability: the scalar p T sum of e/γ candidates included in strips but located outside of the signal cone, denoted p strip, outer T . A reduction of about 20% in the jet → τ h misidentification probability is achieved by requiring p strip, outer T to be less than 10% of p τ h T , for similar values of efficiency.

A comparison of the expected performance of the isolation sum discriminant for the previous and current versions of the HPS algorithm is shown in Fig. 2. The efficiency is calculated for generated τ h candidates with p T > 20 GeV and |η| < 2.3, decaying to h ± , h ± π 0 , h ± π 0 π 0 , or h ± h ∓ h ± , and matched to a reconstructed τ h candidate with p T > 18 GeV. The misidentification probability is calculated for jets with p T > 20 GeV and |η| < 2.3 that are matched to a reconstructed τ h candidate with p T > 18 GeV. The different sources of improvement relative to the algorithm with fixed strip size are shown separately for ∆β = 0.46, for ∆β = 0.46 with p strip, outer T < 0.1 p τ h T , and for ∆β = 0.2 with p strip, outer T < 0.1 p τ h T . The signal process is modelled using MC events for H → ττ decays (for low-p T τ h ) and Z′ → ττ decays with m Z′ = 2 TeV (for high-p T τ h ). QCD multijet MC events are used as background, with jet p T values up to 100 and 1000 GeV, respectively, such that the p T coverage is similar to that in signal events. The improvement brought about by the dynamic strip reconstruction for high-p T τ leptons can be seen by comparing the two plots in Fig. 2. At low p T (Fig. 2, left), the performance for the medium and tight WPs improves slightly.
However, in the high-efficiency region, the misidentification probability starts to increase faster than the efficiency in the current algorithm. This is caused by choosing the working points of the algorithm through changes in the requirements on I τ h . To reach a higher efficiency, the requirement on I τ h is relaxed, which in turn leads to an increase in the misidentification probability. However, the p strip, outer T requirement prevents the efficiency from rising at a similar rate, leading thereby to the observed behaviour of the response in the high-efficiency region.
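The out-of-cone strip p T requirement can be sketched as follows; the candidate representation is an illustrative assumption, while the 10% threshold follows the text.

```python
import math

def pt_strip_outer(strip_egammas, tau, r_sig):
    """Scalar pT sum of e/gamma candidates that belong to strips but lie
    outside the signal cone of the tau_h candidate (sketch)."""
    total = 0.0
    for c in strip_egammas:
        deta = c["eta"] - tau["eta"]
        dphi = math.remainder(c["phi"] - tau["phi"], 2.0 * math.pi)
        if math.hypot(deta, dphi) > r_sig:
            total += c["pt"]
    return total

def passes_outer_strip_veto(strip_egammas, tau, r_sig, frac=0.10):
    """Require the out-of-cone strip pT to be below 10% of pT(tau_h)."""
    return pt_strip_outer(strip_egammas, tau, r_sig) < frac * tau["pt"]
```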

MVA-based discriminants
The MVA-based τ h identification discriminants combine the isolation and other differential variables sensitive to the τ lifetime, to provide the best possible discrimination between τ h decays and quark or gluon jets. A classifier based on boosted decision trees (BDT) is used to achieve a reduction in the jet → τ h misidentification probability. The MVA identification method and the variables used as input to the BDT are discussed in Ref. [24].
In addition to those discussed in Ref. [24], the following variables are included in the classifier to improve its performance: 1. differential variables such as p strip, outer T , and the p T -weighted ∆R, ∆η, and ∆φ (relative to the τ h axis) of photons and electrons in strips within or outside of the signal cone; 2. τ lifetime information, based on the signed three-dimensional impact parameter of the leading track of the τ h candidate and its significance (the impact parameter divided by its uncertainty); and 3. the multiplicity of photon and electron candidates with p T > 0.5 GeV in the signal and isolation cones.
The charged and neutral-particle isolation sums and the ∆β correction, as defined in Eq. (6), are used as separate variables in the BDT classifier, and correspond to the most powerful discriminating variables. Other significant variables are the two-and three-dimensional impact parameters of the leading track and their significances, as well as the flight length and its significance for the τ h candidates decaying into three charged hadrons and a neutrino. The multiplicity of photon and electron candidates in the jet seeding the τ h candidate is found to contribute to the decision of the BDT classifier at levels similar to those of the lifetime variables.
The BDT is trained using simulated τ h candidates selected with p T > 20 GeV and |η| < 2.3 in Z/γ * → ττ, H → ττ, Z′ → ττ, and W′ → τν events (with the mass ranges of H, Z′, and W′ detailed in Section 3). The QCD multijet, W+jets, and tt events are used to model quark and gluon jets. These events are reweighted to provide identical two-dimensional distributions in p T and η for τ h candidates in signal and in background sources, which makes the MVA training insensitive to differences in the p T and η distributions of τ leptons and jets in the training samples.
The working points of the MVA-isolation discriminant, corresponding to different τ h identification efficiencies, are defined through requirements on the BDT discriminant. For a given working point, the threshold on the BDT output is adjusted as a function of the p T of the τ h candidate to ensure uniform efficiency over p τ h T . The working points are chosen to have isolation efficiencies between 40 and 90%, in steps of 10%, for reconstructed τ h candidates.
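The p T -dependent threshold can be illustrated with a piecewise-linear interpolation between tabulated values; the binning and threshold values in the usage example below are hypothetical, not the published ones.

```python
import bisect

def bdt_threshold(pt, pts, thresholds):
    """Piecewise-linear interpolation of the per-working-point BDT threshold
    as a function of tau_h pT (sketch: bin edges and values are inputs)."""
    if pt <= pts[0]:
        return thresholds[0]
    if pt >= pts[-1]:
        return thresholds[-1]
    i = bisect.bisect_right(pts, pt)
    frac = (pt - pts[i - 1]) / (pts[i] - pts[i - 1])
    return thresholds[i - 1] + frac * (thresholds[i] - thresholds[i - 1])

def passes_mva_wp(bdt_score, pt, pts, thresholds):
    """Candidate passes the working point if its score exceeds the
    pT-dependent threshold."""
    return bdt_score > bdt_threshold(pt, pts, thresholds)
```

For example, with hypothetical bins `pts = [20, 100, 500]` GeV and thresholds `[0.80, 0.70, 0.60]`, a candidate at 60 GeV is compared against an interpolated threshold of 0.75.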
The expected jet → τ h misidentification probability is shown in Fig. 3, as a function of expected τ h identification efficiency. It demonstrates a reduction in the misidentification probability by a factor of 2 for MVA-based discriminants, at efficiencies similar to those obtained using isolation-sum discriminants. We compare two sets of MVA-based discriminants that were trained using MC samples that correspond to different conditions during data taking. The working points of the MVA-based discriminants are shifted relative to each other, but follow the same performance curve. This confirms the stability of the MVA-based discriminants. The expected τ h selection efficiencies and jet → τ h misidentification probabilities for low to medium p T , for the most commonly used working point (tight) of the training in 2016 are 49% and 0.21%, respectively. For high p T , the expected misidentification probability drops to 0.07%, while the τ h selection efficiency remains constant, as desired. Figure 4 shows the respective expected τ h identification efficiency (left) and the misidentification probability (right), as a function of p T of the generated τ h and of the reconstructed jet. The efficiency is computed from Z → ττ events, while the expected jet → τ h misidentification probability is computed for QCD multijet events with jet p T < 300 GeV.

Discrimination of τ leptons against electrons
Isolated electrons have a high probability of being misidentified as τ h candidates that decay to either h ± or h ± π 0 . In particular, electrons crossing the tracker material often emit bremsstrahlung photons that mimic neutral pions in the reconstruction. An improved version of the MVA electron discriminant used previously [24] is developed to reduce the e → τ h misidentification probability, while maintaining a high selection efficiency for genuine τ h decays over a wide p T range. The variables used as input for the BDT are identical to the ones described in Ref. [24], with the addition of the following photon-related variables: 1. the number of photons in any of the strips associated with the τ h candidate; 2. the p T -weighted root-mean-square of the distances in η and φ between all photons included in any strip and the leading track of the τ h candidate; and 3. the fraction of τ h energy carried away by photons.

Figure 4: Efficiency of τ h identification, estimated using simulated Z/γ * → ττ events (left), and the misidentification probability, estimated using simulated QCD multijet events (right), for the very loose, loose, medium, tight, very tight, and very-very tight WPs of the MVA-based τ h isolation algorithm. The efficiency and misidentification probability are shown as a function of the p T of the generated τ h and of the reconstructed jet, respectively. Vertical bars (often smaller than the symbol size) correspond to the statistical uncertainties (the 68% Clopper-Pearson intervals [61]), while horizontal bars indicate the bin widths.
These variables are computed separately for the photons inside and outside of the τ h signal cone to improve separation. The most sensitive variables are the fraction of energy carried by the photon candidates, the ratio of the energy deposited in the ECAL to the sum of energies deposited in the ECAL and HCAL, the ratio of the deposited energy in the ECAL relative to the momentum of the leading charged hadron, the m τ h , and the p T of the leading charged hadron.
The BDT is trained using the simulated events listed in Section 3, which contain genuine τ leptons and electrons. Reconstructed τ h candidates are considered as signal or background, depending on whether they are matched at the generator level to a τ h decay or to an electron. Different working points are defined according to the requirements on the BDT output and the corresponding efficiencies for a genuine τ h candidate to pass them. The expected efficiency of τ h reconstruction and the e → τ h misidentification probability are presented in Fig. 5. Both are found to be approximately uniform in p T , except for a dip at ≈45 GeV, whose depth increases with the tightening of the selection criteria. This is because the MC events used to model the e → τ h misidentification in the training of the MVA discriminant have electron p T distributions that peak at ≈45 GeV, since the sample is dominated by Z/γ * → ee and W → eν events.

Figure 5: Efficiency of τ h identification, estimated using simulated Z/γ * → ττ events (left), and the e → τ h misidentification probability, estimated using simulated Z/γ * → ee events (right), for the very loose, loose, medium, tight, and very tight WPs of the MVA-based electron discrimination algorithm. The efficiency is shown as a function of the p T of the reconstructed τ h candidate, while the misidentification probability is shown as a function of the generated electron p T . The efficiency is calculated for τ h candidates with a reconstructed decay mode that pass the loose WP of the isolation-sum discriminant, while the misidentification probability is calculated for generated electrons of p T > 20 GeV and |η| < 2.3, excluding the less sensitive detector region of 1.46 < |η| < 1.56 between the barrel and endcap ECAL regions. Vertical bars (often smaller than the symbol size) indicate the statistical uncertainties (the 68% Clopper-Pearson intervals), while horizontal bars indicate the bin widths.

Discrimination of τ leptons against muons
Muons have a high probability to be misreconstructed as τ h objects in the h ± decay mode. The discriminant against muons, developed previously [24], is based on vetoing τ h candidates when signals in the muon detector are found near the τ h direction. Two working points, corresponding to different τ h identification efficiencies and µ → τ h misidentification probabilities, are defined: 1. "against-µ loose": τ h candidates fail this working point when track segments in at least two muon detector planes are found to lie within a cone of size ∆R = 0.3 centered on the τ h direction, or when the energy deposited in the calorimeters, associated through the PF algorithm to the "leading" charged hadron of the τ h candidate, is <20% of its track momentum.
2. "against-µ tight": τ h candidates fail this working point when they fail the against-µ loose criteria, or when a hit is present in the CSC, DT, or RPC detectors located in the two outermost muon stations within a cone of size ∆R = 0.3 around the τ h direction.
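The two veto conditions above can be summarized in a short sketch; the inputs (number of matched muon-detector planes, linked calorimeter energy, outer-station hits) are assumed quantities, not the actual CMS event content:

```python
def fails_against_mu(n_matched_muon_planes, calo_energy, lead_track_p,
                     has_outer_station_hit, tight=False):
    """Sketch of the against-mu working points described above.
    Loose: veto if track segments are found in >= 2 muon detector planes
    within dR < 0.3 of the tau_h direction, or if the calorimeter energy
    linked to the leading charged hadron is < 20% of its track momentum.
    Tight: additionally veto if a hit is present in the two outermost
    muon stations (CSC/DT/RPC) within dR < 0.3."""
    loose_fail = (n_matched_muon_planes >= 2
                  or calo_energy < 0.2 * lead_track_p)
    if not tight:
        return loose_fail
    return loose_fail or has_outer_station_hit
```

A genuine muon typically leaves segments in several stations and deposits little calorimeter energy, so it fails both conditions, while a genuine τ h passes.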

Reconstruction of highly boosted τ lepton pairs
In events containing a (hypothetical) massive boson with large p T , e.g., a radion (R) decaying to a pair of Higgs bosons [62,63], with at least one of these decaying to a pair of τ leptons, the jets from the two τ leptons would be emitted very close to each other, thereby forming a single jet. The performance of the HPS algorithm in such topologies is poor, as it was designed to reconstruct only one τ h per jet. A dedicated version of the HPS algorithm was therefore recently developed to reconstruct two τ leptons with large momenta that typically originate from decays of large-momentum Z or Higgs bosons. This algorithm takes advantage of jet substructure techniques, as follows. A collection of "large-radius jets" is assembled from the PF candidates using the Cambridge-Aachen algorithm [64] with a distance parameter of 0.8 (CA8). Due to the large boosts, the emitted τ lepton decay products are expected to be contained within the same CA8 jet, when its p T exceeds 100 GeV. The algorithm proceeds by reversing the final step of the clustering algorithm for each given CA8 jet, to find two subjets sj 1 and sj 2 that can be expected to coincide with the two τ leptons from the decay of the boosted massive boson. To reduce the misidentification of jets arising from QCD multijet events, sj 1 and sj 2 must satisfy the following additional restrictions: 1. the p T of each subjet must be greater than 10 GeV, and 2. the mass of the heavier subjet must be less than 2/3 of the large-radius jet mass, where mass refers to the invariant mass of all jet constituents.
These requirements are obtained from an optimization of the reconstruction efficiency, while maintaining a reasonable misidentification probability. When these requirements cannot be met, the pair of subjets is discarded, and the procedure is repeated, treating the subjet with the larger mass as the initial jet that is then split into two new subjets. If the algorithm is unable to find two subjets satisfying the above criteria within a given CA8 jet, no τ h reconstruction is performed from this CA8 jet, and the algorithm moves on to the next such jet. When two subjets satisfying the requirements are found, they are passed to the HPS algorithm as seeds. At this stage, the algorithm does not differentiate between subjets arising from hadronic or leptonic τ decays. After reconstruction, the decay-mode criteria (Section 5.1.2) and the MVA-based isolation discriminants (Section 5.2.2) are applied to the reconstructed τ h candidate, taking into account only the PF candidates belonging to the subjet that seeds the τ h in the reconstruction and the isolation calculations. The decay-mode criteria are relaxed relative to those used in the standard HPS algorithm by accepting τ h candidates with two charged hadrons, and therefore an absolute charge different from unity. This relaxation recovers τ leptons decaying into three charged hadrons when one of the tracks is not reconstructed in the dense environment of a high-p T jet. If an electron or muon, reconstructed and identified through the usual techniques available for these leptons [21,22], is found within ∆R < 0.1 of a τ h candidate reconstructed from a subjet, the corresponding CA8 jet is considered to originate from a semileptonic τ lepton pair decay. Cases in which both τ leptons decay leptonically are not considered. Figure 6 compares the efficiency of the standard reconstruction with that of the algorithm for highly boosted τ lepton pairs in simulated R → HH → bbττ decays in the τ h τ h and τ µ τ h final states.
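The iterative declustering described above can be sketched as follows; the `uncluster` callable, which is assumed to undo the last Cambridge-Aachen clustering step, stands in for the actual jet-substructure interface:

```python
def find_tau_subjets(jet, uncluster):
    """Sketch of the declustering step for boosted tau pairs.
    `uncluster(jet)` is assumed to return the pair (sj1, sj2) from
    undoing the final clustering step, with .pt and .mass attributes,
    or None when the jet cannot be split further.
    Returns an accepted subjet pair, or None."""
    while True:
        pair = uncluster(jet)
        if pair is None:
            return None  # no valid pair: skip tau_h reconstruction
        sj1, sj2 = pair
        heavier = sj1 if sj1.mass >= sj2.mass else sj2
        # both subjets must have pT > 10 GeV, and the heavier subjet
        # must carry less than 2/3 of the large-radius jet mass
        if sj1.pt > 10.0 and sj2.pt > 10.0 and heavier.mass < (2.0 / 3.0) * jet.mass:
            return sj1, sj2
        # otherwise discard the pair and re-split the heavier subjet
        jet = heavier
```

The mass condition rejects asymmetric splittings typical of QCD jets, in which one subjet carries nearly all of the jet mass.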
In addition, the expected probability for large-radius jets to be misidentified as τ h pairs is shown for simulated QCD multijet events. While the efficiency in τ µ τ h events is computed just for the τ h candidate, it is computed once relative to one τ h candidate and once relative to both τ h candidates in τ h τ h events. The misidentification probability is calculated in τ h τ h final states for both τ h candidates. The τ h candidates are selected requiring p T > 20 GeV and |η| < 2.3, using the very loose WP of the MVA-based isolation.
The algorithm used for highly boosted events provides a considerably higher efficiency than the standard HPS algorithm for τ lepton pairs with p T greater than ≈0.5 TeV, at the cost of an increased misidentification probability. Since the contributions from background are highly suppressed at such high p T , and the misidentification rate remains of the order of 10 −4 , this algorithm can be used for searches in this kinematic regime.

Identification of τ h candidates in the high-level trigger
Several analyses are based on experimental signatures that include τ h signals, and therefore, along with the offline reconstruction discussed in Sections 5 and 6, we also employ dedicated τ h identification algorithms in the trigger system, at both L1 and HLT.
The L1 system went through a series of upgrades [65] in 2015 and 2016, and it is now based on more powerful, fully-programmable FPGA processors and µTCA logic boards. This allows more sophisticated τ h reconstruction and isolation algorithms at L1, the performance of which can be found in Ref. [66].
The HLT system uses the full-granularity information of all CMS subdetectors, and runs a version of the CMS reconstruction that is slightly different from that used offline, as the HLT decision is made within 150 ms, on average, a factor of 100 faster than the offline reconstruction. This is achieved using specialized, fast, or regional versions of reconstruction algorithms, and through the implementation of multistep selection logic, designed to reduce the number of events processed by the more complex, and therefore more time-consuming, subsequent steps. Both methods are exploited in the τ h reconstruction at the HLT.
The τ h HLT algorithm has three steps. The first step, referred to as Level 2 (L2), uses only the energy depositions in the calorimeter towers in regions around the L1 τ h objects with ∆R < 0.8. The depositions are clustered into narrow L2 τ h jets using the anti-k T algorithm with a distance parameter of 0.2. The only selection criterion required at L2 is a p T threshold.
In the second step, known as Level 2.5 (L2.5), a simple form of charged-particle isolation is implemented, using just the information from the pixel detector. Tracks are reconstructed from hits in the pixel detector around the L2 τ h jets (rectangular regions of ∆η×∆φ = 0.5×0.5), and  used to form vertices. If no vertex is found, the τ h jet is passed to the following step for more detailed scrutiny. If, on the other hand, at least one vertex is found, the one with highest ∑ p 2 T of its tracks is assumed to be the primary hard-scattering vertex in the event. Tracks originating from within d z < 0.1 cm of the hard-scattering vertex, in an annulus of 0.15 < ∆R < 0.4 centered on the τ h jet direction, and with at least three hits in the pixel detector, are used in the computation of the τ h jet isolation. An L2 τ h jet is considered isolated if the scalar sum of the p T of the associated pixel tracks ∑ p track T is less than 1.85 GeV.
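The L2.5 pixel-track isolation described above can be sketched as follows; the track dictionaries and their fields are assumptions for illustration, not the CMS event format:

```python
import math

def l2p5_isolated(pixel_tracks, tau_eta, tau_phi, vertex_z,
                  threshold=1.85):
    """Sketch of the L2.5 isolation: sum the pT of pixel tracks with
    >= 3 pixel hits and |dz| < 0.1 cm from the hard-scattering vertex,
    lying in an annulus 0.15 < dR < 0.4 around the L2 tau_h jet
    direction; the jet is isolated if the sum is below 1.85 GeV."""
    def delta_r(eta1, phi1, eta2, phi2):
        dphi = (phi1 - phi2 + math.pi) % (2 * math.pi) - math.pi
        return math.hypot(eta1 - eta2, dphi)

    iso_sum = 0.0
    for t in pixel_tracks:
        if t["n_pixel_hits"] < 3 or abs(t["z"] - vertex_z) >= 0.1:
            continue
        dr = delta_r(t["eta"], t["phi"], tau_eta, tau_phi)
        if 0.15 < dr < 0.4:
            iso_sum += t["pt"]
    return iso_sum < threshold
```

The inner radius of 0.15 excludes the τ h decay products themselves, so only genuinely nearby activity counts against the candidate.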
Finally, at Level 3 (L3), full track reconstruction, using both pixel and strip detectors, is executed using rectangular regions of size ∆η×∆φ = 0.5×0.5 around the L2 τ h jets, followed by the PF reconstruction. Both components are tuned specifically for the fast processing at HLT, as discussed in Ref. [55].
The L3 τ h algorithm starts with jets clustered from PF particles by the anti-k T algorithm using a distance parameter of 0.4. First, photons contained in a jet within a fixed ∆η×∆φ area of 0.05×0.2 are clustered into strips, and assigned the π 0 mass. A variable signal-cone size of ∆R L3 sig = (3.6 GeV)/p jet T , with ∆R L3 sig limited to the range of 0.08-0.12, and an isolation cone of ∆R = 0.4, are defined around the direction of the highest-p T charged hadron in the jet. The L3 τ h candidate is then constructed from the following constituents found within the signal cone: up to three charged hadrons, ordered in decreasing p T and assumed to be charged pions, and all the available π 0 candidates. To recover possible tracking inefficiencies, neutral hadrons within a distance of ∆R = 0.1 from the leading charged hadron are also considered as part of the τ h candidate. The vertex with the smallest d z relative to the track of the leading charged hadron is taken as the τ h production vertex. To maximize the HLT reconstruction efficiency, these identification criteria are chosen to be fairly inclusive, not requiring strict consistency with the τ h decay modes, and the signal and isolation cones are chosen to be, respectively, larger and smaller than the corresponding cones in the offline algorithm.
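The variable signal-cone size above is a simple clamped function of the jet p T ; a one-line sketch:

```python
def l3_signal_cone(jet_pt):
    """Variable L3 signal-cone size dR = (3.6 GeV) / pT(jet),
    clamped to the range [0.08, 0.12] as described in the text.
    jet_pt is in GeV."""
    return min(max(3.6 / jet_pt, 0.08), 0.12)
```

The cone shrinks for more boosted jets, reflecting the increasing collimation of the τ h decay products with p T .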
Two types of isolations were defined for L3 τ h candidates in 2016. First is the charged isolation (∑ p charged T ), computed by summing the scalar p T of charged hadrons (other than those constituting the L3 τ h candidate) with d z < 0.2 cm relative to the τ h vertex, located within the isolation cone; defining the loose, medium, and tight WPs through ∑ p charged T being smaller than 3.0, 2.0, and 1.5 GeV, respectively.
The second type is the combined isolation, I L3 τ , defined as I L3 τ = ∑ p charged T + max(0, ∑ p γ T − p PU T ), where ∑ p γ T is the sum of the scalar p T of photons within an annulus between the signal and isolation cones that do not belong to the signal strips, and p PU T is the neutral contribution to the isolation from PU, estimated using the jet area method [56]. The respective loose, medium, and tight WPs of the combined isolation require I L3 τ to be smaller than 3.0, 2.3, and 2.0 GeV. The absolute isolation cutoff values (for both isolation types) are often relaxed by a few percent, depending on the trigger, as a function of p τ h T , starting at values of about twice the trigger threshold. This relaxation increases the reconstruction efficiency for genuine τ h candidates, and is possible because the number of misidentified τ h candidates decreases with p T , thereby keeping the trigger rates under control.
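A minimal sketch of the combined isolation and its working points, assuming the combination takes the PU-subtracted form I = ∑p T charged + max(0, ∑p T γ − p T PU), consistent with the description of its ingredients above (the exact CMS combination may differ in detail):

```python
def combined_isolation(sum_pt_charged, sum_pt_gamma, pt_pu):
    """Sketch of the combined L3 isolation: charged sum plus the
    pileup-corrected photon sum, floored at zero (all values in GeV).
    The PU-subtracted form is an assumption based on the text."""
    return sum_pt_charged + max(0.0, sum_pt_gamma - pt_pu)

# Working-point thresholds quoted in the text (GeV)
WP_COMBINED = {"loose": 3.0, "medium": 2.3, "tight": 2.0}

def passes_wp(iso, wp):
    return iso < WP_COMBINED[wp]
```

Flooring the neutral term at zero prevents an over-subtraction of pileup from making a busy jet look isolated.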
Finally, the scalar p T sum of photons that are included in the strips of the L3 τ h candidate, but are located outside of its signal cone (R L3 sig ), is defined as for offline τ h candidates in Eq. (7). This variable was not used for τ h triggers in 2016, but is included in triggers during data taking in 2017.
The τ h reconstruction and identification algorithms described in this section are employed to define a set of triggers for data taking during 2016. The triggers and their performance are discussed in Section 13.

Event selection and systematic uncertainties
This section describes the selection requirements employed to define event samples used in the following measurements of the performance of τ h reconstruction and identification in data and simulation, as well as their related systematic uncertainties. Differences between data and simulated events in trigger, identification, and isolation efficiencies are taken into account through the reweighting of simulated events. In addition, the number of PU interactions in simulation is reweighted to match that measured in data.

The Z/γ * → ττ events
A sample of Z/γ * events decaying into eτ h or µτ h final states is selected by requiring at least one well-identified and isolated electron or muon, referred to as the "tag", and one τ h candidate passing loose preselection criteria, referred to as the "probe".
The events in the eτ h final state are required to pass an isolated single-electron trigger with p T > 25 GeV. Offline, the electron candidate is required to have p T > 26 GeV and |η| < 2.1, pass the tight WP of the MVA-based electron identification (with an average efficiency of 80%) [21, 67], and have I e rel < 0.1, as defined in Eq. (1). In the µτ h final state, events are required to pass an isolated single-muon trigger with p T > 22 GeV. Offline, the muon candidate is required to have p T > 23 GeV and |η| < 2.1, pass the medium identification WP [22], and have I µ rel < 0.15. The τ h candidate is preselected to have p T > 20 GeV and |η| < 2.3, to have no overlap with any global muon [22] with p T > 5 GeV, to pass the against-lepton discriminant selection requirements defined in Sections 5.3 and 5.4, and to have at least one charged hadron with p T > 5 GeV. The τ h and the electron or muon are required to be separated by at least ∆R = 0.5, and to carry opposite electric charges. If several eτ h or µτ h pairs in one event pass this set of selection criteria, the pair formed from the most isolated τ h and the most isolated electron or muon is selected. Events are rejected if they contain an additional electron or muon passing relaxed selection criteria: the electron must satisfy the very loose WP of the MVA-based identification (with an average efficiency of 95%), the muon must be reconstructed as a global muon, and either must have p T > 10 GeV and I e/µ rel < 0.3. To reduce the W+jets background contribution, the transverse mass of the electron or muon and p miss T , m T = √(2 p ℓ T p miss T [1 − cos ∆φ]), is required to be less than 40 GeV, where p ℓ T is the transverse momentum of the electron or muon and ∆φ is the difference in azimuthal angle between the electron or muon p T and p miss T .
In addition, a linear combination of the variables P miss ζ and P vis ζ , originally developed by the CDF experiment [68], namely D ζ = P miss ζ − 0.85 P vis ζ , is used to benefit from the fact that in Z/γ * → ττ events the p miss T from the neutrinos produced in τ decays typically forms a small angle with the visible τ decay products. Here, P miss ζ and P vis ζ denote the projections onto the ζ axis, defined as the bisector of the transverse directions of the visible decay products, of the p miss T and of the vectorial sum of the visible decay product p T , respectively. The value of D ζ is required to be greater than −25 GeV.
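The D ζ variable can be sketched as follows, assuming the usual CDF construction in which ζ is the bisector of the transverse directions of the two visible decay products; inputs are (px, py) tuples in GeV:

```python
import math

def d_zeta(vis1, vis2, met):
    """Sketch of D_zeta = P_zeta^miss - 0.85 * P_zeta^vis, with zeta
    taken as the bisector of the transverse directions of the two
    visible tau decay products (the standard CDF definition)."""
    def unit(v):
        n = math.hypot(v[0], v[1])
        return (v[0] / n, v[1] / n)

    u1, u2 = unit(vis1), unit(vis2)
    zeta = unit((u1[0] + u2[0], u1[1] + u2[1]))  # bisector direction

    vis = (vis1[0] + vis2[0], vis1[1] + vis2[1])
    p_zeta_vis = vis[0] * zeta[0] + vis[1] * zeta[1]
    p_zeta_miss = met[0] * zeta[0] + met[1] * zeta[1]
    return p_zeta_miss - 0.85 * p_zeta_vis
```

In genuine Z/γ * → ττ events the neutrinos fly close to the visible products, so P miss ζ is large and positive and D ζ peaks above the −25 GeV requirement, while in W+jets events the p miss T points away from the ζ axis.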

The µτ h final states in tt events
The tt → µτ h +jets events are selected in the same way as the Z/γ * → ττ → µτ h events, except for the requirements on m T and D ζ , which are not applied. The events are also required to have at least one b-tagged jet to enrich the content in tt events.

The Z/γ * → µµ events to constrain the Z/γ * → ℓℓ normalization
A high-purity sample of Z/γ * → µµ events is selected to constrain the normalization of the Drell-Yan (DY, qq → Z/γ * → ℓ + ℓ − ) events in the measurement of τ h efficiency through the tag-and-probe method [69], described in detail in Section 9.1. The events are required to have a pair of well-separated (∆R > 0.5), oppositely-charged muons. The leading (in p T ) muon is required to pass the same selection as used in the µτ h final states of Z/γ * events. The subleading muon is required to pass the same selection as the leading muon, except for the η requirement, which is relaxed to |η| < 2.4. The invariant mass of the dimuon pair is required to be within 60-120 GeV. Events are rejected if they contain an additional electron or muon passing the relaxed selection criteria.

Off-shell W → τν events
Here, we use events in which a virtual W boson that decays into a τ h and a ν is produced with small p T (and no accompanying hard jet). The p T of the τ h and the p miss T are expected to be well balanced in such events.
Events are required to pass a trigger where p miss T, noµ and H miss T, noµ are both greater than 110 GeV, with p miss T, noµ being the magnitude of p miss T computed using all particles in an event except muons, and H miss T, noµ being the magnitude of p miss T computed using jets with p T > 20 GeV, reconstructed from all particles except muons. Offline, events are required to have one τ h candidate with p T > 100 GeV, and p miss T > 120 GeV. To ensure back-to-back topologies between the τ h candidate and p miss T , we require ∆φ(τ h , p miss T ) > 2.8 rad. The event is discarded if it has at least one jet with p T > 30 GeV and |η| < 4.7, except the one corresponding to the τ h , or an additional electron or muon passing the relaxed selection criteria.

Off-shell W → µν events to constrain the W → τν normalization
This event sample is used to constrain the normalization of off-shell W boson production for m W > 200 GeV, used in the τ h efficiency measurement, as described in Section 9.3. Events are selected with an isolated single-muon trigger with p T > 22 GeV and |η| < 2.1. Offline, the muon candidate must have p T > 120 GeV and |η| < 2.1; it must also pass the medium identification WP, and have a relative isolation of less than 0.15. The event must also have p miss T > 120 GeV and ∆φ(µ, p miss T ) > 2.8 rad. The event is discarded if it has at least one jet with p T > 30 GeV and |η| < 4.7, or an additional electron or muon passing the relaxed selection criteria.

Events from W → µν+jet production
These events are triggered using a single isolated-muon trigger with p T > 24 GeV and |η| < 2.1. Offline, we require one well-identified and isolated muon with p T > 25 GeV. Events with additional electrons or muons passing the relaxed selection criteria are rejected. In addition, the transverse mass of the muon and p miss T is required to be greater than 60 GeV, to suppress events with genuine τ h candidates, in particular from Z/γ * bosons. Events should contain exactly one jet with p T > 20 GeV and |η| < 2.4, and there should be no additional jets (in |η| > 2.4) with p T > 20 GeV. To ensure that the W boson is balanced in p T with the jet, the following selections are applied: ∆φ(W, jet) > 2.4 rad, and the ratio of jet p T and W boson p T must be between 0.7 and 1.3, where the p T of the W boson is reconstructed from the vector sum of muon p T and p miss T .

The eµ final states in tt events
These events are triggered using a single isolated-muon trigger with p T > 24 GeV, and are required to have one well-identified and isolated electron and one well-identified and isolated muon both of p T > 26 GeV and |η| < 2.4. Events with additional electrons or muons passing the relaxed selection criteria are rejected.

The Z/γ * → ee, µµ events for measuring the e/µ → τ h misidentification probability

High-purity samples of Z/γ * → ee and Z/γ * → µµ events are selected for measuring the respective e → τ h and µ → τ h misidentification probabilities. As before, we require at least one well-identified, isolated electron or muon (tag) and one isolated τ h candidate (probe).
The Z/γ * → ee events are selected by requiring a single-electron trigger to have fired. Offline, the electron candidate must match the trigger object (within ∆R < 0.5), have p T > 26 GeV and |η| < 2.1, pass the most restrictive electron-identification criteria, and have I e rel < 0.1. The Z/γ * → µµ events are collected using a single isolated-muon trigger with p T > 24 GeV. Offline, the muon candidate must match the trigger object (within ∆R < 0.5), have p T > 26 GeV and |η| < 2.1, pass the medium muon-identification criteria, and have I µ rel < 0.15. The τ h candidate is required to satisfy p T > 20 GeV and |η| < 2.3, be reconstructed in one of the decay modes h ± , h ± π 0 , h ± π 0 π 0 , or h ± h ∓ h ± , and pass the tight WP of the MVA-based isolation discriminant described in Section 5.2.2. It must also be separated from the electron or muon by ∆R > 0.5, and have an electric charge opposite to that of the electron or muon. The τ h candidate must pass the loose WP of the against-µ discriminant described in Section 5.4 when selecting Z/γ * → ee events. The purity of the sample is increased by requiring the invariant mass of the tag-and-probe pair to be within 60-120 GeV for Z/γ * → ee events or 70-120 GeV for Z/γ * → µµ events.
The W+jets and tt backgrounds are reduced by requiring the selected events to have m T (of the tag electron or muon and p miss T ) not exceeding 30 GeV.

Systematic uncertainties affecting all studied final states
The generic systematic uncertainties affecting most of the measurements presented in Sections 9-12 are discussed in this section. Uncertainties specific to particular analyses are not covered here, but are discussed in the corresponding sections, as are any deviations from the values of the systematic uncertainties quoted here.
The uncertainty in the measured integrated luminosity is 2.5% [70], and affects the normalization of all processes modelled via MC simulation. The combination of trigger, identification, and isolation efficiencies for electrons and muons, measured using the tag-and-probe technique, results in normalization uncertainties of 2% that also affect the normalization of processes modelled in simulation. Uncertainties in the normalization of production cross sections [45-48, 50, 71, 72] or in the method used to extract the normalization of tt (3-10%), diboson (5-15%), and DY (2-4%) production, are also taken into account. Uncertainties in the τ h energy scale affect the distributions in simulated events that depend on E τ h , and range between 1.2% (as determined in Section 12) and 3% for high-p T τ h candidates. Furthermore, to account for statistical fluctuations caused by the limited number of simulated events, we use the "Barlow-Beeston light" approach [73,74], which assigns a single nuisance parameter per bin that rescales the total bin yield. Most of the analyses discussed in the following sections correct the simulated p T distributions of the Z/γ * boson in DY events and of the top quark in tt events to the spectra observed in data through measured weights. This reweighting corrects only the differential distributions without changing their normalization. Uncertainties in these weights are propagated through the analyses: the downward change by one standard deviation is computed as the difference between the weighted distribution and the unweighted one, while the upward change is computed as the difference between the distributions obtained with the nominal weight and with the square of that weight. Finally, the uncertainty related to the PU distribution is estimated by changing the minimum-bias pp cross section by ±5%.
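The up/down variations of the p T -reweighting described above amount to producing two alternative templates per event weight; a minimal per-event sketch:

```python
def pt_weight_variations(raw_yield, weight):
    """Sketch of the pT-reweighting uncertainty templates described in
    the text: the nominal template applies the weight once, the downward
    variation removes the weight entirely, and the upward variation
    applies the weight twice (i.e. squared). `raw_yield` is a per-event
    or per-bin yield before reweighting."""
    nominal = raw_yield * weight
    down = raw_yield              # weight removed
    up = raw_yield * weight ** 2  # weight applied twice
    return nominal, up, down
```

This construction makes the up and down templates symmetric around the nominal one in log-weight, so the nuisance parameter interpolates smoothly between them.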
A comprehensive overview of these uncertainties is given in Table 2.

Measurement of the τ h identification efficiency
The measurements of τ h reconstruction and identification efficiencies in data use approaches similar to those of Ref. [24], and provide data-to-simulation scale factors and their uncertainties that can be used to correct the simulated predictions in analyses. The efficiency is measured in different p τ h T regions: small p τ h T between 20 and ≈60 GeV, using the µτ h final state of Z/γ * → ττ events, as discussed in Section 9.1; intermediate p τ h T of up to ≈100 GeV, using the µτ h final states in tt events, as discussed in Section 9.2; and high p τ h T of >100 GeV, using a selection of highly virtual W bosons (m W > 200 GeV) decaying into τ leptons, as presented in Section 9.3. The data-to-simulation scale factors obtained through these measurements are combined, as described in Section 9.4, to extrapolate to higher-p τ h T regions not covered by these measurements. Finally, the identification efficiency for τ h candidates reconstructed using the algorithm dedicated to highly boosted τ lepton pairs is measured using the tag-and-probe method, as described in Section 9.5.

Using the tag-and-probe method in Z/γ * events
The τ h identification efficiency for p τ h T up to ≈60 GeV is estimated in µτ h final states of Z/γ * events, selected as described in Section 8.1. The events are subdivided into passing ("pass" region) and failing ("fail" region) categories, depending on whether the τ h candidate passes or fails the appropriate working point of the τ h isolation discriminant. The data-to-simulation scale factor for the τ h identification efficiency is extracted from a maximum likelihood fit of the invariant mass distribution of the reconstructed (visible) µτ h system, referred to as m vis . The expected SM contributions are fitted to the observed data simultaneously in both categories.
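The pass/fail logic of this extraction can be conveyed with an illustrative one-bin, closed-form version; the paper instead performs a binned maximum-likelihood fit to the full m vis distribution, so this is only a sketch of the idea, with assumed yield inputs:

```python
def tau_id_scale_factor(n_pass_data, sig_pass_mc, sig_fail_mc,
                        bkg_pass, bkg_fail):
    """Illustrative one-bin tag-and-probe extraction: the scale factor
    rescales the expected signal in the pass region, and the events it
    removes from the pass region migrate into the fail region, so the
    total signal yield is conserved."""
    sf = (n_pass_data - bkg_pass) / sig_pass_mc
    # events removed from the pass region by sf < 1 appear in fail:
    expected_fail = bkg_fail + sig_fail_mc + (1.0 - sf) * sig_pass_mc
    return sf, expected_fail
```

Fitting pass and fail regions simultaneously, as done in the paper, constrains the backgrounds with the fail region while the pass region drives the scale factor.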
The predictions for SM processes contributing to the distribution in m vis consist of a signal sample of Z/γ * → ττ → µτ h events, where the reconstructed τ h candidate is required to be matched to the generated one, and a set of backgrounds. All backgrounds except QCD multijet production are modelled using simulated m vis distributions. Diboson, single top quark, and tt samples are normalized to their theoretical cross sections. A sample of dimuon events, as described in Section 8.3, is used to constrain the normalization of the DY process, by including these events simultaneously in the fit, along with the events in the passing and failing categories. DY processes other than the Z/γ * → ττ → µτ h signal, in which the τ h candidates arise from misidentified electrons, muons, or jets, contribute to the background and are denoted as "other DY".
The normalization of the contribution from W+jets events is estimated using control samples in data. A data-to-simulation scale factor is estimated in a sample enriched in W+jets events, defined in a way similar to the signal sample, but without the D ζ requirement having been applied, and with m T > 80 GeV, where small contributions from other processes are subtracted from data, based on their estimated cross sections. The scale factor is then applied to the simulation of the W+jets events in the low-m T signal sample.
The distribution and normalization of the QCD multijet background are estimated from control samples in data. The distribution is extracted from a sample selected using the nominal selection criteria discussed previously, but requiring the µ and τ h candidates to have same-sign (SS) electric charges. All other processes contributing to this sample are estimated using the procedures detailed above, and are subtracted from the data. The normalization is controlled using the ratio of events found in two separate control samples requiring same- and opposite-sign (OS) charges for the µ and τ h candidates, respectively. Otherwise, both samples are defined in ways similar to that of the signal sample, but with an inverted muon isolation criterion.
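The SS-to-OS extrapolation described above reduces to a per-bin subtraction and scaling; a minimal sketch with assumed per-bin yield lists:

```python
def qcd_estimate(data_ss, mc_ss, os_ss_ratio):
    """Sketch of the same-sign control-region QCD multijet estimate:
    subtract the non-QCD simulated contributions from the same-sign
    data, floor negative bins at zero, then scale each bin by the
    measured OS/SS ratio. Inputs are per-bin yields (lists of floats)."""
    return [max(0.0, d - m) * os_ss_ratio for d, m in zip(data_ss, mc_ss)]
```

The OS/SS ratio itself is measured in the separate anti-isolated control samples mentioned in the text, so the estimate is fully data-driven.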
The following uncertainties are considered in addition to those outlined in Section 8.9: uncertainties in the W+jets background normalization, arising from a possible difference between the low- and high-m T regions and from the uncertainties in p miss T , which enters the computation of m T . The uncertainty in the yield of W+jets events is estimated to be about 10%. The uncertainty in the OS/SS scale factor, used in the estimation of the QCD multijet background, is ≈5%, mostly due to the limited number of events in the OS and SS control regions. The normalization of the DY process is extracted from the dimuon control region; an extrapolation uncertainty of 2% is used for the µτ h sample to account for the differences in lepton kinematics (mostly in p T ).
The results obtained for different working points of the MVA-based discriminant with ∆R = 0.5 are shown in Table 3. An uncertainty of 3.9% is added in quadrature to the one returned by the fit, to account for the uncertainty associated with the track reconstruction efficiency [26]. The scale factors obtained for the different working points of the isolation-sum discriminants are found to be close to 0.9, with uncertainties of 5%, and the scale factors obtained for the MVA-based discriminants trained using 2016 simulations, as well as for ∆R = 0.3, are found to be compatible with those presented in Table 3. The measured scale factors vary from 0.92 to 0.99, depending on the working point, with uncertainties of about 5%. The fitted distributions that maximize the likelihood for the tight WP of the MVA-based isolation are shown in Fig. 7. The scale factors are also measured in different ranges of p τ h T for the tight WP of the MVA-based isolation discriminant with ∆R = 0.5, and enter the extrapolation to high p τ h T , as discussed in Section 9.4.
The efficiencies for τ h candidates to pass the working points of the discriminants used to reject electrons and muons, described in Sections 5.3 and 5.4, respectively, are also measured in the µτ h final states of Z/γ * → ττ events, which are selected as described above. The τ h candidates are required to have p T > 20 GeV, |η| < 2.3, and to pass the tight WP of the MVA-based τ h isolation discriminant. The events are again subdivided into passing and failing categories, depending on whether the τ h candidate passes or fails the appropriate working points of the discriminants used against electrons or muons. The data-to-simulation scale factor is obtained from a maximum likelihood fit to the m vis distribution. The scale factors are compatible with unity to within the uncertainties in the measurements, which range between 1 and 3%.

Table 3: Data-to-simulation scale factors for different MVA-based isolation working points with ∆R = 0.5, measured using Z/γ * events. An uncertainty of 3.9% is added in quadrature to the uncertainty returned by the fit to account for the uncertainty in track reconstruction efficiency.

Using tt events
A sample of tt events with a muon and a τ h in the final state is used to measure the τ h identification efficiency for p τ h T up to 100 GeV. The selection requirements are described in Section 8.2. The selected τ h candidate must pass the appropriate working point of the τ h isolation discriminant. The distribution in m T of the muon and p miss T is used to determine the data-to-simulation scale factors.
Contributions to the m T distribution from Z/γ * → ττ, single top quark, diboson, and W+jets events are modelled using simulations normalized to theoretical cross sections. Background from QCD multijet production is determined as described in Section 9.1. The major background contribution is from tt events in which a jet is misidentified as a τ h candidate. Its distribution is taken from simulation, and a dedicated sample of events is selected to constrain the normalization of this background, as well as the probability for a jet to be misidentified as a τ h candidate. Events have to pass the same criteria as discussed in Section 8.2, but must also contain an additional isolated electron of electric charge opposite to that of the selected muon. This selects the eµ final state of tt events with an additional jet that can be misidentified as a τ h candidate. These eµ events are then subdivided into passing and failing categories, based on whether the τ h candidate passes the requirements of the τ h isolation discriminant. A simultaneous likelihood fit is performed to the m T distributions in all three samples, thereby constraining the tt contribution and the probability for jets to be identified as τ h candidates, as well as measuring the efficiency of the τ h identification relative to that expected in simulation.
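The structure of such a simultaneous fit, with the τ h efficiency scale factor and the jet → τ h normalization shared across categories, can be sketched with a toy binned Poisson likelihood (all yields below are invented for illustration; the actual CMS fit uses full m T templates and many nuisance parameters):

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative per-bin m_T yields (arbitrary numbers, not CMS data):
sig   = np.array([30.0, 80.0, 50.0, 10.0])   # genuine tau_h in mu+tau_h sample
fake  = np.array([20.0, 40.0, 30.0, 10.0])   # tt with jet -> tau_h
other = np.array([5.0, 10.0, 8.0, 2.0])      # remaining backgrounds
# e-mu control sample, split by whether the extra jet passes tau_h isolation;
# the same jet -> tau_h normalization parameter scales the passing category.
emu_pass = np.array([4.0, 8.0, 6.0, 2.0])
emu_fail = np.array([80.0, 160.0, 120.0, 40.0])

# Pseudo-data generated at the nominal model (sf = 1, r_fake = 1)
data_sig, data_pass, data_fail = sig + fake + other, emu_pass, emu_fail

def nll(params):
    """Poisson negative log-likelihood summed over the three samples.
    Simplification: the failing e-mu category is kept fixed; in the real
    fit it also constrains the tt normalization."""
    sf, r_fake = params
    total = 0.0
    for expec, obs in [
        (sf * sig + r_fake * fake + other, data_sig),
        (r_fake * emu_pass, data_pass),
        (emu_fail, data_fail),
    ]:
        total += np.sum(expec - obs * np.log(expec))
    return total

res = minimize(nll, x0=[0.9, 1.2], bounds=[(0.1, 2.0), (0.1, 3.0)])
sf_hat, r_hat = res.x  # both converge to ~1 for pseudo-data at the nominal model
```

Because the jet → τ h normalization is shared between the signal and control categories, the control sample pulls it towards its measured value, decoupling it from the efficiency scale factor.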
The systematic uncertainties are similar to those listed in Section 8.9, except for additional uncertainties related to the b tagging performance (3% effect on the normalization), and the cross section for Z/γ * +jet process (30%), given that the Z/γ * +b jet cross section is not well measured. A 3.9% uncertainty in the track reconstruction efficiency is added to the signal processes. The uncertainty in the jet → τ h misidentification probability is correlated between the signal and the control sample, where the τ h candidate passes the identification requirement. The eµ failing category is used to further constrain both the normalization for tt production as well as the uncertainty in b tagging performance. Figure 8 shows the fitted distributions in m T for the tight WP of the MVA-based isolation.
The measurement is repeated for different isolation working points of the MVA-based discriminant, as well as for the tight WP in different regions of p τ h T , and individually for each reconstructed decay mode. Although the mean value of the scale factor in the h ± h ∓ h ± decay mode is slightly below those of the other decay modes, no significant differences are observed between the three decay modes. The measured scale factors in different p τ h T regions enter the extrapolation outlined in Section 9.4, and Table 4 summarizes the results for the working points of the MVA-based isolation discriminants. The scale factors measured from the inclusive tt events are slightly lower than those from Z/γ * → ττ events. This is because the jet → τ h misidentification probability is slightly higher in simulation than in data, which pulls the τ h identification efficiency scale factor towards lower values in regions where the distributions of tt events with genuine τ h and with misidentified jet → τ h candidates become similar. This effect is, however, mitigated for the measurement in bins of p τ h T by constraining the normalization of the tt background with a jet misidentified as a τ h candidate, using the eµ passing sample, as discussed above.

Using off-shell W → τν events
The identification efficiency for τ h leptons with p T > 100 GeV is measured using a sample of events in which a highly virtual W boson (m W > 200 GeV) is produced at small p T (and often without an accompanying hard jet), and decays into a τ lepton and a ν τ . The signature for such events consists of a single τ h decay and p miss T balanced by the p τ h T . The selection requirements for the W → τν sample are described in Section 8.4. A large fraction of events selected in this channel originate from processes in which a jet is misidentified as a τ h candidate. The main processes contributing to this background are QCD multijet, Z/γ * → νν+jets, and W → ℓν+jets events.
The background from events in which a jet is misidentified as a τ h candidate is estimated using a control sample obtained by applying the same set of requirements as used in the selection of the W → τν events, except for the τ h isolation criterion, which is inverted. Events in this control sample are then extrapolated to the signal region using the ratio of the probabilities for a jet to pass and to fail the τ h isolation. The W → µν+1 jet and QCD dijet events are used to estimate the extrapolation factor. The method is verified with simulated samples of W → ℓν+jets and Z/γ * → νν+jets events.
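A minimal sketch of this pass/fail extrapolation, assuming a per-bin event count in the inverted-isolation control region and hypothetical pass/fail probabilities for the jets:

```python
import numpy as np

def fake_background_estimate(n_fail_control, pass_prob, fail_prob):
    """Extrapolate the jet -> tau_h background from the inverted-isolation
    control sample into the signal region, bin by bin, using the ratio of
    the probabilities for a jet to pass and to fail the tau_h isolation."""
    return np.asarray(n_fail_control) * (pass_prob / fail_prob)

# Hypothetical bin: 200 control events, 2% pass / 98% fail probability
est = fake_background_estimate([200.0], 0.02, 0.98)
```

The pass and fail probabilities must sum to unity per jet, so only their ratio (the extrapolation factor) enters the estimate.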
The study shows that the set of requirements outlined in Section 8.4 selects W → τν events with an invariant mass of the τν pair m τν ≡ m W > 200 GeV. A dedicated auxiliary sample of W → µν events is used to constrain the normalization of virtual W boson production with m W > 200 GeV. The W → µν events are selected as described in Section 8.5, and it is verified using MC simulation that the phase space covered by the W → µν and W → τν samples largely overlaps.
The signal is extracted using a simultaneous maximum likelihood fit to the m T distributions (computed from the p τ h T or p µ T and p miss T ) for the W → τν signal and W → µν control samples. This procedure minimizes the uncertainties related to the normalization of W boson events. The fit is performed using two freely floating parameters: the scale factor in the τ h identification efficiency, i.e., the ratio of the measured value of the τ h identification efficiency to the value predicted by simulation; and the normalization of W boson production with m W > 200 GeV relative to the theoretical prediction (r W ).
In addition to the uncertainties listed in Section 8.9, the following systematic uncertainties are taken into account in the fit: an uncertainty of 1% in the momentum scale of the muon, which also alters the differential distributions; the uncertainty in the p miss T energy scale, obtained by propagating the uncertainties in the jet energy scale and in the scale of the unclustered energy depositions; and the uncertainty in the extrapolation factor used in the estimation of the background from jets misidentified as τ h candidates. The backgrounds with genuine τ leptons in W → τν events are dominated by diboson events, which are estimated via MC simulation. The normalization of the diboson background is verified in dedicated control regions, indicating discrepancies of up to 30%. An uncertainty of 30% is therefore used in the normalization of backgrounds containing genuine τ leptons. Figure 9 shows the fitted m T distributions for the W → τν signal and W → µν control samples. The scale factor in the τ h identification efficiency, the parameter r W , and the correlation coefficient between the two quantities obtained from the fits are detailed in Table 5 for different working points of the MVA-based τ h isolation discriminants. The data-to-simulation scale factors range between 0.89 for the very tight WP and 0.96 for the loose WP. The fitted value of the W boson production cross section for m W > 200 GeV is consistent with theoretical predictions. The W boson sample normalization factor is anticorrelated with the scale factor for the τ h identification efficiency, as an increase in the W boson yield is compensated in the fit by a reduction in the scale factor. The correlation between the scale factor and r W increases with tighter τ h isolation, as expected, because of an increase in the purity of the signal region.

Figure 9: The m T distribution for selected W → τν (left) and W → µν (right) events after the maximum likelihood fit. The medium WP of the MVA-based isolation discriminant is applied to select W → τν events. The electroweak background contribution includes diboson and single top quark events. Vertical bars correspond to the statistical uncertainties in the data points (68% frequentist confidence intervals), while the shaded bands correspond to the quadratic sum of the statistical and systematic uncertainties after the fit.

We also measure the τ h identification efficiencies in bins of p τ h T , with the data-to-simulation scale factors extracted in a simultaneous fit to the m T distributions in four signal samples, corresponding to four bins of p τ h T , and of p µ T in the W → µν control sample. The results enter in the extrapolation of the scale factor to high p τ h T , as discussed in Section 9.4.

Extrapolation of the τ h identification efficiency to large p τ h T

To extrapolate the scale factors for the τ h identification efficiency to high p τ h T , a fit is performed to the values obtained in Sections 9.1, 9.2, and 9.3, as a function of p τ h T . These measurements cover a p τ h T range between 20 and ≈300 GeV, with the mean value in each p τ h T bin used as a representative number for that bin. Fits to a zeroth-order (constant) and a first-order polynomial are performed, without considering the uncertainty in the track reconstruction efficiency, as it is correlated among the individual measurements. Nevertheless, it is found to contribute very little to the overall uncertainty, with the exception of measurements at low p τ h T , where other uncertainties are small because of the large number of events and the high purity of the event samples. Although other correlations may exist between p τ h T bins in a single measurement, or between different measurements, all measurements entering the fit are assumed to be uncorrelated.
The fit to a first-order polynomial provides a smaller goodness-of-fit per degree of freedom, χ 2 /dof, than that to a constant, indicating that the scale factor for τ h identification efficiency may decrease with p τ h T ; but, given that the slope of the fitted first-order polynomial barely deviates from zero (by only about one standard deviation), the scale factor is compatible with being constant. As there are no indications that components of τ h reconstruction or identification behave abnormally at high p τ h T , a constant scale factor with an asymmetric uncertainty that increases with p τ h T is defined by adding in quadrature the uncertainty in the fit to a constant, and the difference between the fit to a first-order polynomial and to a constant for the downward deviation. In addition, this also takes into account the uncertainty in the efficiency of track reconstruction, yielding the total (asymmetric) uncertainty of +5% × p τ h T (TeV) and −35% × p τ h T (TeV). The fit to a constant using the combined uncertainty is shown in Fig. 10.
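The comparison of constant and first-order polynomial fits, and the construction of the downward deviation used in the asymmetric uncertainty, can be sketched as follows (the scale-factor values and uncertainties are illustrative, not the measured ones):

```python
import numpy as np

# Illustrative measurements: mean pT of each bin [GeV], scale factor, uncertainty
pt  = np.array([25.0, 35.0, 60.0, 120.0, 250.0])
sf  = np.array([0.97, 0.95, 0.96, 0.93, 0.90])
err = np.array([0.02, 0.02, 0.03, 0.05, 0.10])
w = 1.0 / err**2  # inverse-variance weights

# Weighted fit to a constant (the quoted central value)
c0 = np.sum(w * sf) / np.sum(w)
chi2_const = np.sum(w * (sf - c0) ** 2)          # dof = 4

# Weighted least-squares fit to a first-order polynomial
A = np.vstack([np.ones_like(pt), pt]).T * np.sqrt(w)[:, None]
coef, *_ = np.linalg.lstsq(A, sf * np.sqrt(w), rcond=None)
pred_lin = coef[0] + coef[1] * pt
chi2_lin = np.sum(w * (sf - pred_lin) ** 2)      # dof = 3

# Downward deviation of the linear fit from the constant, used as an
# extra pT-dependent contribution to the asymmetric uncertainty
down = np.maximum(c0 - pred_lin, 0.0)
```

With these invented inputs the linear fit necessarily has a smaller χ² (the constant is a special case of the line), and the size of `down` at high p T sets the scale of the asymmetric downward uncertainty.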

Using the tag-and-probe method in Z/γ * events for highly boosted τ lepton pairs
The identification efficiency for highly boosted τ lepton pairs in ℓτ h final states is measured using the same tag-and-probe method as described in Section 9.1. The selection is optimized to obtain a pure sample of τ leptons from the decay of high-p T Z bosons, where one τ lepton decays leptonically and the other into hadrons and a neutrino. As the trigger thresholds for nonisolated leptons are very high, too few events are available to reliably measure the identification efficiency for very high p T τ lepton pairs. Single isolated-lepton triggers with lower thresholds are therefore used to select eτ h and µτ h events. However, events in which a τ h is within the isolation area around a triggering lepton (∆R < 0.4) are not accessible in this measurement.
The selection requires one isolated electron or muon fulfilling tight identification criteria, and satisfying p T > 40 or >26 GeV, respectively. Furthermore, as discussed in Section 6, at least one τ h candidate must be reconstructed with p T > 20 GeV and |η| < 2.3, satisfying relaxed decay mode criteria. The ∆R between the selected lepton and the τ h candidate must be between 0.4 and 0.8, and the m T of the lepton and p miss T system must be <40 GeV. Moreover, p miss T must exceed 75 GeV, the scalar p T sum of all measured particles has to be greater than 200 GeV, and there cannot be any identified b jets in the event. If more than one eτ h or µτ h pair is present, the one with the largest p T is chosen for further analysis.

Figure 10: Fit of the measured scale factors in the τ h identification efficiency to a constant value, for the tight WP of the MVA-based isolation discriminant in Z/γ * , tt, and W events, as a function of p τ h T . The shaded band represents the uncertainties in the fit, where the result is combined with the difference obtained using a first-order polynomial instead of a constant for the downward deviations, which also contains an additional contribution from the uncertainty in the track reconstruction efficiency.
The contribution from DY events is modelled using MC simulation. It is split into a signal contribution, in which the reconstructed leptons are matched to generated τ leptons, and a contribution from misidentified Z boson decays. The distributions of the backgrounds from W+jets and tt production are also modelled using simulation, but their normalizations are obtained from dedicated control data samples. The control sample for W+jets production is defined by inverting the requirement on m T . The control sample for tt production is established by demanding at least one b-tagged jet.
The background from QCD multijet production is estimated from a sample selected in the same way as the signal, except for the requirement on p miss T , which is inverted to p miss T < 75 GeV. Contributions from other processes are subtracted based on simulation. The extrapolation factor from the sample with an inverted p miss T requirement to the signal region is obtained from the ratio of events in two other control samples, where the ∆R between the lepton and the τ h candidate is between 0.8 and 2.0, one which uses the nominal and the other an inverted p miss T requirement, respectively. Contributions from other processes are also subtracted from data using MC simulation in these two control regions.
The systematic uncertainties discussed in Section 8.9 are taken into account in the procedure, as are the additional uncertainties in the estimation of the QCD multijet background, which are dominated by the limited number of events in the control samples. Finally, the uncertainties in the normalization of background from tt and W+jets production are determined from their respective control samples, and amount to 3 and 13%, respectively.
The data-to-simulation scale factors are evaluated in the same way as outlined in Section 9.1.
The passing and failing events are defined by requiring the τ h to pass or fail a given working point of the MVA-based isolation discriminant. The scale factors for the six MVA-based working points are shown in Table 6. The values are compatible with unity, as well as with the scale factors obtained through the measurements described in Sections 9.1-9.3. The dependence of the scale factor on the ∆R between the τ h and the lepton is studied without revealing a significant effect. The fitted distributions corresponding to the medium isolation WP are shown in Fig. 11.

Table 6: Data-to-simulation scale factors for different working points of the MVA-based isolation discriminant, using highly boosted Z/γ * events decaying to τ lepton pairs.

Measurement of the jet → τ h misidentification probability

Using W+jets events

The probability to misidentify a quark or gluon jet as a τ h candidate is measured as a function of jet p T and η in a sample of W → µν+jet events, selected as described in Section 8.6. In addition to p jet T and η jet , the misidentification probability also depends on the parton flavour, as well as on whether the parton initiating the jet and the reconstructed τ h have the same or opposite charge. These factors cause differences of up to a factor of four between the misidentification probabilities for c quark and gluon jets, and up to a factor of two depending on whether the initiating parton has the same or opposite charge as the τ h candidate. The misidentification probabilities given in this section are therefore indicative, in that they are mainly valid for W → µν+jet events, which contain a large fraction of light-quark jets, and therefore have a relatively high misidentification probability.
The misidentification probability is given by the ratio of the number of jets that are identified as τ h candidates with p T > 20 GeV, |η| < 2.3, and passing any one of the working points of the discriminants described in Section 5.2, to the total number of jets with p T > 20 GeV and |η| < 2.3. It should be recognized that p jet T differs from p τ h T because the four-momentum of the jet is computed by summing the momenta of all its constituents, while the τ h four-momentum is computed only from the charged hadrons and photons used in the reconstruction of the specified decay mode of the τ h candidate. For p jet T < 300 GeV, the p τ h T constitutes on average only 40% of the jet p T . Furthermore, p jet T is subject to additional jet energy corrections, whereas p τ h T is not. In the measurement of the misidentification probability, backgrounds with genuine τ h are subtracted, based on the expectations from simulated events. The fraction of events with genuine τ h candidates in the sample passing the τ h identification criteria is well below 10% for τ h with p T < 100 GeV, but reaches up to 50% for p T ≈ 300 GeV. Furthermore, backgrounds with prompt electrons and muons giving rise to τ h candidates are also subtracted based on expectations from simulated events. To reject events from Z/γ * → µµ production, the loose WP of the against-µ discriminant described in Section 5.4 is applied to the reconstructed τ h candidates.
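A minimal sketch of this ratio, including the simulation-based subtraction of genuine-τ h and prompt-lepton backgrounds from the numerator (all counts below are hypothetical):

```python
import math

def misid_probability(n_pass, n_jets, n_genuine_tau, n_prompt_lep):
    """Jet -> tau_h misidentification probability in one (pT, eta) bin:
    jets identified as tau_h candidates, after subtracting genuine-tau
    and prompt-lepton backgrounds estimated from simulation, divided by
    all selected jets."""
    n_sub = n_pass - n_genuine_tau - n_prompt_lep
    p = n_sub / n_jets
    # crude binomial uncertainty on the background-subtracted numerator
    unc = math.sqrt(max(n_sub, 0.0) * (1.0 - p)) / n_jets
    return p, unc

# Hypothetical bin: 120 passing jets, of which 8 genuine taus and 2 leptons
p, unc = misid_probability(120.0, 10000.0, 8.0, 2.0)
```

A fuller treatment would also propagate the 30% normalization uncertainty on the subtracted genuine-τ h component, which dominates in the high-p T bins where the genuine-τ h fraction reaches 50%.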
The subtraction of backgrounds containing genuine τ h is subject to an uncertainty of 30%, leading to an uncertainty of up to 15% in the jet → τ h misidentification probability. Because of threshold effects, the jet energy scale also leads to a significant uncertainty, especially in the lowest bin of p jet T . Additional uncertainties are considered for the probabilities with which electrons are reconstructed as τ h candidates (a relative uncertainty of ≈100%), and with which muons are reconstructed as τ h candidates that pass the loose WP of the against-µ discriminant (50%). These lead to uncertainties in the measured misidentification probabilities of at most a few percent.
The observed and simulated jet → τ h misidentification probabilities for the loose, medium, and tight WPs of the MVA-based isolation discriminant are shown in Fig. 12, as a function of p jet T and η jet . The probabilities are observed to be almost constant as a function of η jet , while they decrease monotonically with increasing p jet T above ≈40 GeV, as the absolute isolation sum increases for quark- and gluon-initiated jets with increasing jet p T . The values of the misidentification probability as a function of p jet T range between 2.0 and 0.1% for the loose WP of the MVA-based isolation discriminant, and between 1.0 and less than 0.1% for the tight WP. The observed probabilities show a difference of 10-20% relative to expectations from MC simulation. This difference is well within the range of the misidentification probabilities obtained under variations of the parton shower models and underlying-event tunes, and reflects the precision with which the simulation models the atypical, narrow, low-multiplicity quark and gluon jets that are able to pass the τ h identification criteria.

Using eµ+jets events
The probability to misidentify quark and gluon jets as τ h candidates is also measured in the eµ final state of tt events, using the same methodology and uncertainties outlined in Section 10.1. The events are selected as described in Section 8.7, with the largest contributions coming from tt and single top quark events, in which the misidentified τ h candidates are dominated by b quark jets. The contribution from other processes is <10%. The observed and simulated jet → τ h misidentification probabilities for the loose, medium, and tight WPs of the MVA-based isolation discriminant are shown in Fig. 13, as a function of p jet T and η jet . The observed probabilities show a 10-20% difference relative to expectations from simulation, except in a few η jet bins where the differences are as large as 50%. The jet → τ h misidentification probabilities in eµ+jets events are found to be smaller than those for W+jet events because of the larger fraction of b quark jets. The b quark jets are typically less collimated than light-quark jets, and therefore have smaller probabilities to pass the τ h isolation discriminant selection requirements.

Measurement of the e → τ h probability
The e → τ h misidentification probability is obtained from data using a tag-and-probe method in Z/γ * → ee events selected as described in Section 8.8.
Depending on whether the probe passes or fails a given working point of the against-e discriminant, the event enters the passing or failing category, respectively. The e → τ h misidentification rate is then measured in a simultaneous fit to the number of Z/γ * → ee events in both categories. The m vis distribution in the range 60 < m vis < 120 GeV is used in the passing category, obtained from the templates for Z/γ * → ee signal and for the Z/γ * → ττ, W+jets, tt, single top quark, diboson (WW, WZ, ZZ), and QCD multijet backgrounds. In the failing category, the total number of events in the same range of m vis is used to constrain the normalization of the Z/γ * → ee process.
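Reduced to event counts, the tag-and-probe extraction amounts to the background-subtracted fraction of passing probes; a counting-only sketch (the actual measurement is a template fit to m vis, and all numbers here are invented):

```python
def tnp_misid_rate(n_pass, n_fail, bkg_pass=0.0, bkg_fail=0.0):
    """Tag-and-probe misidentification probability: background-subtracted
    passing probes divided by all background-subtracted probes
    (passing + failing)."""
    s_pass = n_pass - bkg_pass
    s_fail = n_fail - bkg_fail
    return s_pass / (s_pass + s_fail)

# Hypothetical counts: 150 passing probes (30 background),
# 60000 failing probes (5000 background)
rate = tnp_misid_rate(150.0, 60000.0, 30.0, 5000.0)
```

In the real measurement the two categories are fit simultaneously, so the large failing category constrains the Z/γ * → ee normalization that the small passing category alone could not determine.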
The differential templates for signal and all background distributions, except for QCD multijet production, are taken from MC simulation. The normalization is performed according to the cross section for the specific sample of events, with the exception of the W+jets background, which is obtained from data using an enriched sample of W+jets events with m T > 70 GeV. The scale factor between this sideband and the signal region is extracted from simulation. The differential distribution and normalization of the QCD multijet background are obtained from data in a control sample where the tag and the probe have the same sign (SS). The contributions from all other backgrounds are estimated using simulation, and are subtracted from the SS control sample in this procedure.
Systematic uncertainties are represented through nuisance parameters in the fit, and account for the effects listed in Section 8.9, as well as for the energy scale of tag electrons, which is changed by its uncertainty of ±1% in the barrel region (|η| < 1.46) and ±2.5% in the endcap regions (|η| > 1.56), with the difference in the m vis template considered as an uncertainty in the differential distribution. Similarly, the energy scales of the probe electrons and of the τ h are changed by ±1.5 and ±3%, respectively. The energy scales of the leptons have been measured using the method described in Ref. [75]. Uncertainties in the normalization of W+jets and QCD multijet production are dominated by the number of events in the relevant control regions, and each amount to 20%. Finally, an additional 3% uncertainty is associated with the Z/γ * → ee normalization because of the need to disentangle possible differences between the Z/γ * → ee and Z/γ * → ττ normalizations. Separate fits are used for probes in the barrel and in the endcap regions.
The fitted m vis distributions in the passing category are shown in Fig. 14 for the medium and very tight WPs of the against-e discriminant in the barrel region of the ECAL, while the e → τ h misidentification probabilities are displayed in Fig. 15. In the barrel region, the measured misidentification probabilities in data exceed those in the simulations. The difference between data and simulation increases for the tight and very tight WPs of the discriminant, and a similar trend is observed for the probes in the endcap regions. The observed misidentification probabilities range from ≈5% for the very loose WP to less than 0.1% for the very tight WP in the barrel region, while in the endcap regions, the probabilities are larger, ranging between 0.1 and 10%.

Measurement of the µ → τ h probability
The µ → τ h misidentification probability is also measured using a tag-and-probe method, following an approach similar to that used to measure the e → τ h misidentification probability discussed in Section 11.1. For this, we select Z/γ * → µµ events, as described in Section 8.8, and again divide these into two categories, depending on whether the probe passes or fails the specific working point of the against-µ discriminant. The number of Z/γ * → µµ signal events in each category is then extracted from a simultaneous maximum likelihood fit to the mass of the tag-and-probe pair, in the range 70 < m vis < 120 GeV. Separate fits are performed for probes in five |η| regions of <0.4, 0.4-0.8, 0.8-1.2, 1.2-1.7, and >1.7, corresponding to the geometry of the CMS muon spectrometer.
The normalization and distribution in m vis for signal and background processes are estimated as discussed in Section 11.1. Systematic uncertainties are also similar, except that those related to electrons are replaced by those appropriate for muons, such as the energy scale of the probe, which is changed by ±1.5 and ±3% for the misidentified µ → τ h and the genuine τ h candidates, respectively, with the resulting difference in the m vis template taken as an uncertainty in the differential distribution. The uncertainty in the energy scale of the tag muon is negligible compared with that of the τ h candidates, and is therefore neglected. Figure 16 shows the mass distribution of the µτ h pair after the maximum likelihood fit, for events where the probe muon is reconstructed as a τ h candidate and passes the loose or tight WPs of the against-µ discriminant. The probes in these distributions lie within |η| < 0.4. The µ → τ h misidentification probabilities are given for the loose and tight WPs of the against-µ discriminant in Fig. 17. For probes passing the WPs, the measured misidentification probabilities in data exceed the predictions, with the difference between data and simulation possibly increasing from small to large |η|. The observed trend is more significant for probes passing the tight WP. The observed misidentification probabilities for the loose WP are in the range of 0.1-0.5%, with the highest probability lying in the |η| range between 0.8 and 1.2, which corresponds to the transition between the barrel and endcap regions of the muon spectrometer. The probabilities for the tight WP range between 0.03 and 0.40%, with the highest value again falling in the same |η| region.

Figure 17: Probability for muons to pass the loose (left) and tight (right) WPs of the against-µ discriminant, as a function of the |η| of the probe. For each working point, the µ → τ h misidentification probability is defined as the fraction of probes passing that working point relative to the total number of probes. Vertical bars correspond to the statistical uncertainties for simulated data, and to the quadratic sum of the statistical and systematic uncertainties for observed data.

Measurement of the τ h energy scale
The correction to the τ h energy scale is defined by the deviation of the average reconstructed τ h energy from the generator-level energy of the visible τ h decay products. The corresponding data-to-simulation correction is obtained from a fit of the distributions of observables sensitive to the energy scale, using samples of eτ h and µτ h final states in Z/γ * events. The distributions sensitive to the energy scale are the τ h mass, m τ h , and the visible mass of the lepton-τ h system, m vis . These are fitted, separately for the h ± , h ± π 0 , and h ± h ∓ h ± decays, to extract the correction factors between data and simulation.
The eτ h and µτ h final states are selected as described in Section 8.1, except that the τ h candidates are required to pass the very tight WP of the MVA-based τ h isolation discriminant to further reduce backgrounds from jets misidentified as τ h candidates. Moreover, the requirement on m T is tightened to be less than 30 GeV, and the requirement on D ζ is removed. Finally, the τ h candidate must pass the tight against-e and loose against-µ WPs in the eτ h final state, or the very loose against-e and tight against-µ WPs in the µτ h final state. Templates for events in which the reconstructed τ h is matched to a generated τ h are obtained by changing the reconstructed τ h energy between −6% and +6% in steps of 0.1%, with the m vis and m τ h recomputed at each step.
The maximal energy shifts of ±6% are selected to be sufficiently far from the nominal value in the simulation that the true value lies between them. While m τ h displays higher sensitivity to the energy scale for the h ± π 0 and h ± h ∓ h ± decay modes, it cannot be used in the h ± decay mode, where only m vis is used. The backgrounds are modelled in the same way as described in Section 9.1, and the templates for processes in which there is no match between the reconstructed and generated τ h candidates are not changed as a function of the τ h energy scale.
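The template-scan logic, shifting the reconstructed energy in small steps and picking the shift that best matches the data, can be illustrated with a toy Poisson likelihood scan (a self-contained sketch with an invented energy spectrum, not the CMS templates):

```python
import numpy as np

rng = np.random.default_rng(1)
# Toy visible-energy sample for genuine tau_h events (arbitrary model)
true_scale = 0.97                       # the "unknown" shift present in the data
energies = rng.normal(60.0, 15.0, 50000)
bins = np.linspace(20.0, 120.0, 26)     # 25 bins of 4 GeV
data_hist, _ = np.histogram(energies * true_scale, bins=bins)

def nll(scale):
    """Poisson negative log-likelihood of the data against a template
    built by shifting the reconstructed energy by the given scale."""
    tmpl, _ = np.histogram(energies * scale, bins=bins)
    expec = np.clip(tmpl.astype(float), 1e-3, None)  # avoid log(0)
    return np.sum(expec - data_hist * np.log(expec))

# Scan shifts between -6% and +6% in 0.1% steps, as in the measurement
scales = 1.0 + np.arange(-0.06, 0.0601, 0.001)
best = scales[np.argmin([nll(s) for s in scales])]  # recovers ~0.97
```

In the real fit the no-match background templates stay fixed while only the matched-τ h templates move with the scan parameter, and nuisance parameters are profiled at each step.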
For illustration, the m τ h templates corresponding to no shift, and to shifts in the τ h energy scale of −6 and +6%, are shown in Fig. 18 for events selected in the h ± π 0 decay mode. The data are compared to predictions for these three energy scales.

Figure 18: The distributions in m τ h for µτ h events in the h ± π 0 decay channel. The data are compared to predictions with different shifts applied to the τ h energy scale: 0% (upper), −6% (lower left), and +6% (lower right). The electroweak background includes contributions from W+jets (dominating), diboson, and single top quark events. Vertical bars (smaller than the symbol size) correspond to the statistical uncertainty in the data points (68% frequentist confidence intervals), while the shaded bands provide the expected systematic uncertainties.
A likelihood ratio method is used to extract the τ h energy scale for each decay mode. In addition to those listed in Section 8.9, the following sources of systematic uncertainties are considered: uncertainties in the identification of τ h candidates, determined in Section 9, are split into those that are uncorrelated (≈2%) and correlated (≈4.5%) between the eτ h and µτ h final states. The rates for electrons, muons, and jets misidentified as τ h candidates have uncertainties of 12, 25, and 20%, respectively. Moreover, uncertainties in the energy scale of electrons (1% in the barrel and 2.5% in the endcaps) and muons (5%) identified as τ h are taken into account in their differential distributions. The results obtained from fits to m vis and m τ h distributions for each decay mode in the eτ h and µτ h final states are found to be compatible with each other, and their combination is given in Table 7. The measurement is limited by systematic rather than statistical uncertainties. Table 7: The data-to-simulation correction for the τ h energy scale from the combination of measurements performed in the eτ h and µτ h final states separately using m τ h and m vis distributions. The correction is relative to the reconstructed energy from simulation, expressed in %.

Additional studies performed using the µτ h final state are carried out to assess the stability of the measurement. To gauge the impact of fluctuations caused by the limited number of MC events relative to the data, the simulated events used to model Z/γ * decays are split into four samples of equal size, and the measurement is performed using each of these four subsamples. The resulting fluctuations in the measured τ h energy scale are up to 1%. Similarly, the effect of the contamination from backgrounds that arise from misidentification of the τ h is checked by changing the selection criteria, and found to be 0.5%. The choice of binning is investigated by changing the number of bins up and down by a factor of 2; the results are compatible to within 1%. Finally, the effect of the fit range is evaluated for the m vis template by increasing it by 10 GeV in either direction, resulting in changes compatible with the original measurement to within 0.5%. Although these checks do not guarantee that similar levels of fluctuation exist in the original measurement (especially the assessment of the limited number of MC events), an additional uncertainty of 1.0% is added in quadrature to the uncertainty detailed in Table 7, to reflect our limited knowledge of the true fluctuations. This results in a total uncertainty of <1.2%.

Performance of τ_h identification in the high-level trigger
The τ_h reconstruction and identification algorithm described in Section 7 for the HLT was used to define a set of triggers for 2016 data taking. These triggers cover all final states of interest, namely, τ lepton pair production in τ_eτ_h, τ_µτ_h, and τ_hτ_h decays, τ_h produced in association with p_T^miss (τ_h+p_T^miss), and single τ_h with large p_T. There are two types of HLT decision trees that use τ_h candidates, aimed at two different classes of final states: those that include objects other than τ_h candidates in the event, e.g., eτ_h, µτ_h, and τ_h+p_T^miss, and those that include only τ_h candidates, e.g., τ_hτ_h. The first type of trigger is based on L1 seeds that require the presence of an electron, a muon, or large p_T^miss, possibly together with a τ_h candidate. These triggers also apply the corresponding e, µ, or p_T^miss selections in the HLT, thereby greatly reducing the event rates processed at later stages. This allows τ_h candidates to be reconstructed directly with the resource-intensive L3 step, wherein the PF sequence underpinning τ_h reconstruction is run over the full detector acceptance. In the second type of trigger, only τ_h candidates are required at L1, without additional lepton or p_T^miss selections. At the HLT, since the L3 step would be too time consuming to run at the L1 output rates, the L2 and L2.5 filtering steps are executed first. The efficiency of the L2 and L2.5 filters is >95% per τ_h candidate. In addition, this class of triggers runs HLT τ_h reconstruction only in regions of the detector centered on the directions of the L1 τ_h candidates, thereby further reducing the processing time.
The triggers for τ lepton pair production are aimed mainly at efficiently selecting SM H → ττ decays, requiring p_T thresholds of 20-25 GeV for the τ_e or τ_µ and 30-35 GeV for the τ_h. In addition, at an instantaneous luminosity of L = 1.4 × 10^34 cm^-2 s^-1 and a PU close to 40 interactions per bunch crossing, typical for pp collisions in late 2016, the trigger rates were required not to exceed about 10-15 Hz for the eτ_h or µτ_h triggers and 50-65 Hz for the τ_hτ_h triggers.
The µτ_h trigger is constructed as follows. First, we require the presence of a muon candidate with p_T > 18 GeV at L1. Then, an isolated muon, seeded by the L1 candidate, with p_T > 19 GeV is selected at the HLT. Subsequently, an unseeded L3 τ_h candidate is selected with p_T > 20 GeV that passes the loose charged-particle isolation WP. The isolation requirement is relaxed linearly by 10%/GeV for p_T^{τ_h} > 50 GeV. Finally, the L3 τ_h candidate must be separated from the muon by ∆R > 0.3. At L = 1.4 × 10^34 cm^-2 s^-1, the rate of the µτ_h trigger is ≈20 Hz.
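The linear relaxation of the isolation requirement above threshold can be sketched as follows; the absolute isolation cut `iso_cut_gev` is a hypothetical placeholder, since the numerical WP values are not quoted in the text:

```python
def relaxed_iso_threshold(pt_gev, iso_cut_gev, slope=0.10, pt0_gev=50.0):
    """Isolation threshold relaxed linearly by `slope` (fraction per GeV)
    above pt0_gev, as for the muon+tau_h trigger (10%/GeV above 50 GeV).
    Below pt0_gev the nominal working-point threshold applies unchanged."""
    if pt_gev <= pt0_gev:
        return iso_cut_gev
    return iso_cut_gev * (1.0 + slope * (pt_gev - pt0_gev))

# A tau_h candidate at 60 GeV gets a threshold twice as loose as at
# 50 GeV for a 10%/GeV slope.
print(relaxed_iso_threshold(60.0, iso_cut_gev=2.0))  # 4.0
```

The other triggers described in this section follow the same scheme with different slopes and onset points (e.g., 6%/GeV above 73 GeV for the τ_hτ_h trigger).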
To adapt to the different instantaneous luminosities delivered by the LHC in 2016, ranging from ≈3 × 10^33 cm^-2 s^-1 to 1.4 × 10^34 cm^-2 s^-1, and to provide the highest possible efficiency within the limited rate budget, two variants of the eτ_h trigger were developed. The first is similar to the µτ_h trigger: an isolated electromagnetic (e or γ) object with p_T > 22 GeV is required at L1 and is used to initiate the reconstruction of an isolated electron at the HLT, which is required to have p_T > 24 GeV. An unseeded L3 τ_h candidate, not overlapping with the electron, is required to have p_T > 20 GeV and to pass the loose charged-particle isolation WP (linearly relaxed by 10%/GeV for p_T^{τ_h} > 50 GeV). This trigger covered instantaneous luminosities of up to 9 × 10^33 cm^-2 s^-1.
The second, more stringent version of the eτ_h trigger adds the requirement of an L1 τ_h candidate accompanying the L1 electromagnetic object. The p_T threshold on the L1 τ_h was initially set to 20 GeV and, as the instantaneous luminosity increased, was raised to 26 GeV, with the L1 isolation condition eventually applied as well. In the latter configuration, the p_T threshold for the L3 τ_h candidate at the HLT was raised to 30 GeV. Over the ranges of instantaneous luminosity for which the eτ_h triggers were designed, their rates remained below 15 Hz.
The τ_hτ_h triggers require a pair of isolated L1 τ_h candidates with p_T above a threshold in the range 28-36 GeV. The threshold is dynamically adjusted to maintain a constant rate of events passing L1, independent of the instantaneous luminosity. Even after satisfying the L1 requirements, the event rate is still too high to run the L3 τ_h reconstruction. The L3 reconstruction is therefore run only if at least two τ_h candidates pass the L2 and L2.5 stages, as discussed in Section 7. At L3, the candidates must have p_T > 35 GeV and pass the medium WP of the charged isolation (replaced by the combined isolation at L > 1.3 × 10^34 cm^-2 s^-1). The isolation is linearly relaxed by 6%/GeV for p_T^{τ_h} > 73 GeV. Two such candidates must be present in the event, separated by ∆R > 0.5. At L = 1.4 × 10^34 cm^-2 s^-1, the rate of the τ_hτ_h triggers was below 60 Hz.
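The dynamic L1 threshold adjustment can be illustrated schematically: given a rate-versus-threshold estimate, the lowest threshold in the allowed 28-36 GeV window whose predicted rate fits the budget is chosen. The rate model below is invented purely for illustration; the actual L1 implementation is not described at this level of detail in the text.

```python
def choose_l1_threshold(rate_at_threshold, budget_hz,
                        thresholds_gev=range(28, 37)):
    """Return the lowest allowed L1 p_T threshold whose predicted rate
    stays within the budget; fall back to the highest threshold."""
    for thr in thresholds_gev:
        if rate_at_threshold(thr) <= budget_hz:
            return thr
    return max(thresholds_gev)

# Toy, steeply falling rate model (illustrative numbers only).
toy_rate = lambda thr_gev: 5e4 * (28.0 / thr_gev) ** 6

print(choose_l1_threshold(toy_rate, budget_hz=4e4))
```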
The benchmark process that guided the design of the τ_h+p_T^miss trigger is the decay of a charged resonance, X± → τν, e.g., for X± = H± or W′, with a mass m_X > 200 GeV. At L1, this trigger requires p_T^miss in excess of 80-100 GeV, again with the threshold dynamically adjusted as a function of the instantaneous luminosity to keep the rate of events passing L1 constant. At the HLT, the selected events must further satisfy p_T^miss > 90 GeV. After this, the L3 τ_h reconstruction step is executed, and the event is saved if an L3 τ_h candidate with p_T > 50 GeV is found that passes the loose WP of the charged isolation (relaxed by 6%/GeV for p_T^{τ_h} > 100 GeV) and has a leading charged hadron with p_T > 30 GeV. At L = 1.4 × 10^34 cm^-2 s^-1, the rate of the τ_h+p_T^miss trigger is about 20 Hz.
Finally, a high-p_T single-τ_h trigger was developed for searches for high-mass resonances decaying into at least one τ lepton, for example a W′, an H±, or the heavy A or H boson of the MSSM. This trigger was designed to cover portions of the phase space not covered by the more usual cross-triggers (τ_hτ_h, τ_h+p_T^miss, eτ_h, and µτ_h), e.g., H± events with an energetic τ_h but small p_T^miss. The trigger requires an isolated L1 τ_h candidate with p_T > 120 GeV. The τ_h reconstruction at the HLT proceeds through the L2, L2.5, and L3 steps. The L3 step requires one τ_h candidate with p_T > 140 GeV and a leading charged hadron with p_T > 50 GeV. The L3 τ_h candidate must also pass the tight WP of the charged isolation, which is linearly relaxed by 2%/GeV for p_T^{τ_h} > 275 GeV and dropped altogether for p_T^{τ_h} > 500 GeV. A rate of about 30 Hz was allocated to this trigger.
The basic features of the triggers with τ_h candidates used to record pp collisions in 2016 are summarized in Table 8. The efficiencies of the τ_h part of the triggers listed in Table 8 are measured with the tag-and-probe technique as a function of the offline-reconstructed p_T^{τ_h}, using data enriched in τ_h leptons from Z/γ* → ττ → µτ_h decays. To single out this sample, the selections for the µτ_h final state described in Section 8.1, together with the requirement m_T < 30 GeV and the additional condition 40 < m_vis < 80 GeV, are applied to data previously collected with single-muon triggers. Furthermore, to provide an efficiency measurement specific to the selections used in H → ττ analyses, the τ_h candidates must pass the tight WP of the MVA-based isolation discriminant. The residual contamination from other objects misidentified as τ_h is subtracted statistically using SS events passing the same selections. The purity of the final sample exceeds 95%.

Table 8: Triggers with τ_h candidates used to record pp collisions in 2016: the final state (Channel), the HLT p_T thresholds and τ_h isolation working point, the L1 p_T thresholds, the peak instantaneous luminosity (L_peak) in the period of operation as main trigger, and the integrated luminosity (∫L) collected with the trigger. The τ_hτ_h and τ_h+p_T^miss triggers are seeded by sets of L1 triggers with thresholds dynamically adjusted as a function of the instantaneous luminosity to maintain a constant L1 rate, given by the ranges in p_T. The trigger p_T thresholds and isolation criteria were successively tightened over the data-taking period to keep the rate of events passing the HLT approximately constant with increasing instantaneous luminosity.

To measure the efficiency of the τ_h part of these triggers, special µτ_h triggers were put in place.
The special triggers have one part that is required to match the nominal single-muon trigger used to select events; the other part is required to pass the τ_h trigger identification of the trigger of interest.
The passing τ_h probes are those for which the trigger is satisfied and its τ_h part geometrically matches (∆R < 0.5) the selected offline τ_h. The efficiency of the τ_h part of the µτ_h and τ_hτ_h triggers, measured in collision data and compared with the DY simulation, is shown in Fig. 19. For the τ_hτ_h trigger, we use only the portion of the 2016 data that contains the trigger employing the combined isolation. In both cases, the simulation agrees well with the data. The data-to-simulation agreement is similar for the other triggers discussed in this section.

Figure 19: Single-τ_h efficiency of the µτ_h (left) and τ_hτ_h (right) triggers. The efficiency is computed per single τ_h, using the tag-and-probe method, as a function of the offline-reconstructed p_T^{τ_h}. Observed data are compared with simulated Z/γ* → ττ events selected with the same procedure. Vertical bars correspond to the statistical uncertainties. The data points in the right plot are fitted with a cumulative (integral) distribution of the Crystal Ball function [76].

Figure 19 shows that the nominal p_T threshold of the τ_h triggers corresponds to an efficiency of 50%, as expected for trigger and offline objects with the same energy scale. The slow turn-on originates from two effects: in the p_T range above about twice the trigger threshold, it is caused by the relaxed isolation selection applied at the HLT but not in the offline selection; in the range just above the trigger threshold, it is caused by an asymmetric energy response of the HLT τ_h candidate relative to its offline counterpart. The asymmetry is due to a more inclusive selection of constituents of the τ_h candidate at the HLT than offline.
The second effect is clearly visible in the µτ_h trigger, with its unseeded L3 τ_h reconstruction, while for the τ_hτ_h trigger it is smeared out by the resolution of the L1 and L2 τ_h candidates (relative to offline), which is much worse than the resolution of the L3 candidates.
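The fit function used for the right panel of Fig. 19, the cumulative distribution of the Crystal Ball function, can be sketched directly from the standard Crystal Ball definition (Gaussian core with a power-law tail on the low side). The parameter values in `turnon` are placeholders chosen for illustration; the fitted values are not quoted in the text.

```python
import math

def crystal_ball_cdf(x, beta, m, loc=0.0, scale=1.0):
    """CDF of the Crystal Ball function: Gaussian core with a
    power-law tail on the low side (requires m > 1)."""
    t = (x - loc) / scale
    a = (m / beta) ** m * math.exp(-0.5 * beta * beta)
    b = m / beta - beta
    # Total probability mass of the tail and of the Gaussian core.
    tail = m / (beta * (m - 1.0)) * math.exp(-0.5 * beta * beta)
    core = math.sqrt(math.pi / 2.0) * (1.0 + math.erf(beta / math.sqrt(2.0)))
    norm = 1.0 / (tail + core)
    if t <= -beta:
        # Integral of a*(b - u)^(-m) from -infinity up to t.
        return norm * a * (b - t) ** (1.0 - m) / (m - 1.0)
    gauss = math.sqrt(math.pi / 2.0) * (
        math.erf(t / math.sqrt(2.0)) + math.erf(beta / math.sqrt(2.0)))
    return norm * (tail + gauss)

def turnon(pt_gev, plateau=0.9, beta=1.0, m=2.0, loc=35.0, scale=5.0):
    """Turn-on curve: plateau times the cumulative Crystal Ball shape.
    All parameter values here are illustrative, not the fitted ones."""
    return plateau * crystal_ball_cdf(pt_gev, beta, m, loc=loc, scale=scale)
```

In a fit, `plateau`, `loc`, `scale`, `beta`, and `m` would be floated against the measured per-bin efficiencies.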

Summary
The "hadron-plus-strips" algorithm developed at the CMS experiment to reconstruct and identify τ → hadrons + ν_τ decays in proton-proton collisions at √s = 7 and 8 TeV, as presented in Ref. [24], has been improved. The changes include a dynamic strip reconstruction, the reconstruction of highly boosted τ lepton pairs, and the introduction of additional variables in the multivariate-analysis discriminants used to reject jets and electrons. The isolation discriminants have also been optimized to cope with the large pileup of events in √s = 13 TeV proton-proton runs.
The performance of the improved algorithm has been measured using 35.9 fb^-1 of data recorded during 2016 at √s = 13 TeV. The τ_h identification efficiency in data at low, intermediate, and high transverse momenta, as well as for highly Lorentz-boosted τ lepton pairs, is similar to that expected from Monte Carlo simulation, while differences of 10-20% are found between data and simulation for the jet → τ_h misidentification probability. The e → τ_h and µ → τ_h misidentification probabilities are smaller than those of the previous version of the algorithm under the same running conditions, while a high efficiency for the selection of genuine τ_h candidates is maintained. The corresponding data-to-simulation scale factors have also been determined. The energy scale of τ_h candidates is measured, and its response relative to Monte Carlo simulation is found to be close to unity. Finally, a specialized τ_h reconstruction and identification algorithm has been used in the high-level trigger, and its performance has been presented.

[13] CMS Collaboration, "Search for electroweak production of charginos in final states with two τ leptons in pp collisions at √s = 8 TeV", JHEP 04 (2017) 018, doi:10.1007/JHEP04(2017)018, arXiv:1610.04870.